Introduction

The non-mineralized part of the bone consists of the red and yellow bone marrow, which serve endocrine and hematopoietic functions and differ in composition and vascularization [1, 2]. The quantitative composition of bone marrow has been assessed using single-voxel proton magnetic resonance spectroscopy (MRS) and chemical shift encoding-based water-fat magnetic resonance imaging (MRI). Both allow surrogate parameters such as the proton density fat fraction (PDFF) to be calculated, and MRS additionally allows the chemical structure of fatty acids and their magnitude to be identified [3,4,5].

With aging and with pathophysiological changes associated with endocrine or metabolic diseases such as osteoporosis and type 2 diabetes, vertebral bone marrow alterations with shifts towards greater bone marrow adiposity and lower unsaturation levels have been observed [6,7,8]. Decreasing bone mineral density (BMD) is a clinical problem that shows a characteristic distribution pattern in the elderly and in people suffering from osteoporosis, but it can also result from neoplastic diseases [9]. To evaluate the fracture risk associated with bone mineral loss, dual-energy X-ray absorptiometry (DXA) is the preferred technique in a clinical setting. However, this method has inherent limitations. DXA is prone to confounding effects such as local excessive fat tissue deposition and assesses local density inhomogeneities inadequately [10, 11]. Moreover, it has been shown that a high percentage of vertebral fractures occur in individuals not classified as osteoporotic or osteopenic according to DXA measurements [12]. MRI offers the possibility to investigate spatial changes in bone marrow composition by detecting fat content and depicting tissue structure quantitatively [13, 14]. Thus, it enables the investigator to quantify subtle changes in vertebral bone marrow composition and to stratify the potential fracture risk [13].

In the past, PDFF has been shown to be a valid biomarker for evaluating fatty infiltration of vertebral bone marrow, with an increase from the cervical to the lumbar spine in children and adults [4, 15]. Baum et al. [16] reported an accelerated fatty conversion of the vertebral bone marrow in females compared with males with increasing age, particularly evident after menopause. Ruschke et al. [15] showed that detectable changes in vertebral bone marrow fat content, and thus an increase in PDFF, occur not only in the adult vertebral column but can also be visualized during childhood. Relative age-related PDFF changes showed an anatomical variation with the most pronounced changes at lower lumbar vertebral levels in both sexes [4]. Furthermore, PDFF shows specific inter- and intraindividual distribution patterns that depend on hormonal changes (e.g., induced by menopause) and on the topography [3, 13, 17].

To detect structural changes qualitatively, in terms of the signal loss that appears in age- or malignancy-associated bone marrow alterations, conventional anatomic imaging using T1- or T2-weighted sequences has been implemented in standard clinical examination protocols [18, 19]. More advanced methods that allow for a quantitative evaluation, such as PDFF calculation, are on the verge of being transferred into clinical use because of growing evidence that PDFF serves as a valid biomarker for muscular and skeletal fat accumulation [3, 13, 14, 20]. Still, the heterogeneity of the vertebral bone marrow has not been analyzed using texture analysis based on chemical shift encoding-based water-fat MRI.

The texture of an image can be defined as the spatial arrangement of pixels with different intensities [21]. Texture measures can quantify the gray level variations reflecting repetitive patterns and uniformity in the image pixels, e.g., by using Haralick’s gray level co-occurrence matrix (GLCM) [22]. These texture parameters have been used for trabecular bone microstructure analysis in computed tomography (CT) scans and can similarly be applied to MR-based PDFF maps [21]. Additionally, certain texture features were shown to be of diagnostic help in identifying soft tissue malignancies in mammograms or fracture risk in CT [23, 24]. Transferring the concept of using texture parameters for diagnostic gain to MRI, a radiation-free modality, is therefore a logical step of high clinical significance, with potential implementation in risk stratification for fractures or tumor relapse.

The purpose of this feasibility study was to investigate the spatial heterogeneity of the lumbar vertebral bone marrow by using texture analysis in PDFF maps derived from chemical shift encoding–based water-fat MRI in pre- and postmenopausal women and to compare the performance of the different parameters in differentiating the two groups.

Materials and methods

Subjects

The study was approved by the local institutional committee for human research. All subjects gave written informed consent before participation in the study.

Healthy pre- and postmenopausal women were included in this study. Exclusion criteria were history of pathological bone changes such as hematological or metabolic bone disorders aside from osteoporosis, history of diabetes, and contraindications for MR imaging. In total, 15 pre- and 26 postmenopausal women were recruited. Subjects included in this study received no antiresorptive medication like bisphosphonates and denosumab. According to medically indicated DXA measurements, 17 postmenopausal women had normal BMD values, five were in the osteopenic range, and four were classified as osteoporotic.

MR imaging

All subjects underwent 3T MRI (Ingenia, Philips Healthcare, Best, The Netherlands). An eight-echo 3D spoiled gradient echo sequence was used for chemical shift encoding–based water-fat separation at the lumbar spine using the built-in-the-table posterior coil elements (12-channel array). The sequence acquired the eight echoes in a single TR using non-flyback (bipolar) readout gradients and the following imaging parameters: TR/TE1/ΔTE = 11/1.4/1.1 ms, FOV = 220 × 220 × 80 mm³, acquisition matrix size = 124 × 121, acquisition voxel size = 1.8 × 1.8 × 4.0 mm³, receiver bandwidth = 1527 Hz/pixel, frequency direction = A/P (to minimize breathing artifacts), 1 average, and scan time = 1 min and 17 s. A flip angle of 3° was used to minimize T1-bias effects.

Vertebral bone marrow fat quantification

The gradient echo imaging data were processed online using the fat quantification routine of the MR vendor. The routine procedure first performs a phase error correction and then a complex-based water-fat decomposition using a pre-calibrated seven-peak fat spectrum and a single T2* to model the signal variation with echo time. The imaging-based proton density fat fraction (PDFF) map was computed as the ratio of the fat signal over the sum of fat and water signals. The vertebral bodies L1 to L5 were included in the analysis and manually segmented by a radiologist (Fig. 1). The posterior elements and sclerotic changes of the endplates were excluded. Segmentation was performed on the PDFF maps by using the free open-source software Medical Imaging Interaction Toolkit (MITK, developed by the Division of Medical and Biological Informatics, German Cancer Research Center, Heidelberg, Germany; www.mitk.org).
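The PDFF definition given above (fat signal over the sum of fat and water signals) can be illustrated with a minimal Python sketch. This is not the vendor routine: the function names `pdff_percent` and `mean_pdff`, the flat-list data layout, and the handling of zero-signal voxels are assumptions made for illustration, and the numbers are made-up toy values rather than study data.

```python
def pdff_percent(fat, water):
    """Voxel-wise PDFF in percent: fat / (fat + water) * 100."""
    total = fat + water
    if total == 0:
        # Assumption: voxels without any signal are reported as 0 %.
        return 0.0
    return 100.0 * fat / total


def mean_pdff(fat_map, water_map, mask):
    """Mean PDFF (%) over the voxels selected by a segmentation mask.

    fat_map, water_map and mask are flat sequences of equal length;
    a truthy mask entry marks a voxel inside the segmented vertebral body.
    """
    values = [pdff_percent(f, w)
              for f, w, m in zip(fat_map, water_map, mask) if m]
    if not values:
        raise ValueError("mask selects no voxels")
    return sum(values) / len(values)


# Toy signals (hypothetical): two voxels inside the mask, two outside.
fat = [30.0, 50.0, 10.0, 0.0]
water = [70.0, 50.0, 90.0, 0.0]
roi = [1, 1, 0, 0]
print(mean_pdff(fat, water, roi))  # mean of 30 % and 50 %
```

In practice the separated fat and water images already include the phase error correction, multi-peak fat spectrum, and T2* modeling described above; the sketch only covers the final ratio and the averaging over a segmented region.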

Fig. 1

Representative segmentation of lumbar vertebral bodies 1 to 5 in the PDFF map of a 22-year-old woman

Texture analysis

Texture analysis was performed on vertebral bodies (L1 to L5) using GLCM [13]. Initially, gray level quantization was performed to prevent sparseness by normalizing the image intensities using the maximum gray level present in an image. The GLCM metrics were obtained from 16-bit images [25,26,27]. The statistical moments (variance, skewness, and kurtosis) and second-order GLCM features (energy, entropy, contrast, homogeneity, correlation, sum average, variance, and dissimilarity) were determined. The features quantify smoothness, roughness, and heterogeneity in an image. GLCM computes the joint probability of two adjacent voxel intensities at a given offset d = (dx, dy, dz) and angular direction θ (0°, 45°, 90°, or 135°) [21, 22]. Here, dx and dy denote the displacements along the x and y axes, and dz denotes the displacement along the z axis [28]. The co-occurrence probabilities of voxel intensities were computed from 26 neighbors, aligned in 13 directions. Averaging each feature over the 13 directions ensures rotation invariance [28]. The uniform gray level quantization and texture analysis were performed using MATLAB 2017 (MathWorks Inc., Natick, MA, USA).
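The GLCM procedure described above can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the MATLAB code used in the study: the names `directions_13`, `quantize`, and `glcm_features`, the `levels` parameter, the rounding rule in the quantization, and the symmetric counting of voxel pairs are all hypothetical choices. Contrast and dissimilarity use the common GLCM definitions (sum of (i − j)² p(i, j) and sum of |i − j| p(i, j)), which may differ from the exact formulas applied by the authors, and restriction to a segmented region is omitted for brevity.

```python
from itertools import product


def directions_13():
    """One offset (dz, dy, dx) per opposite pair of the 26 neighbors."""
    dirs = []
    for d in product((-1, 0, 1), repeat=3):
        first_nonzero = next((c for c in d if c != 0), 0)
        if first_nonzero > 0:
            dirs.append(d)
    return dirs


def quantize(volume, levels):
    """Uniformly rescale intensities to 0..levels-1 using the volume maximum."""
    vmax = max(v for plane in volume for row in plane for v in row)
    if vmax == 0:
        return [[[0 for _ in row] for row in plane] for plane in volume]
    return [[[int(v / vmax * (levels - 1) + 0.5) for v in row]
             for row in plane] for plane in volume]


def glcm_features(volume, levels=8):
    """Direction-averaged GLCM contrast and dissimilarity of volume[z][y][x]."""
    q = quantize(volume, levels)
    nz, ny, nx = len(q), len(q[0]), len(q[0][0])
    contrast_sum = dissim_sum = 0.0
    used = 0
    for dz, dy, dx in directions_13():
        counts = {}
        total = 0
        for z, y, x in product(range(nz), range(ny), range(nx)):
            z2, y2, x2 = z + dz, y + dy, x + dx
            if 0 <= z2 < nz and 0 <= y2 < ny and 0 <= x2 < nx:
                i, j = q[z][y][x], q[z2][y2][x2]
                # Symmetric GLCM: count each voxel pair in both orders.
                counts[(i, j)] = counts.get((i, j), 0) + 1
                counts[(j, i)] = counts.get((j, i), 0) + 1
                total += 2
        if total == 0:
            continue  # no voxel pairs along this direction (e.g., one slice)
        contrast_sum += sum((i - j) ** 2 * n
                            for (i, j), n in counts.items()) / total
        dissim_sum += sum(abs(i - j) * n
                          for (i, j), n in counts.items()) / total
        used += 1
    if used == 0:
        raise ValueError("volume too small to form any voxel pair")
    return {"contrast": contrast_sum / used,
            "dissimilarity": dissim_sum / used}


# Toy volumes (hypothetical): a uniform block and a 3D checkerboard.
uniform = [[[5, 5], [5, 5]], [[5, 5], [5, 5]]]
checker = [[[(x + y + z) % 2 for x in range(2)] for y in range(2)]
           for z in range(2)]
print(glcm_features(uniform, levels=4))
print(glcm_features(checker, levels=2))
```

The uniform block yields zero contrast and dissimilarity, whereas the checkerboard yields nonzero values, which mirrors the interpretation of these features as measures of heterogeneity. Absolute feature values scale strongly with the number of gray levels, so values from this sketch are not comparable with those reported for 16-bit images.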

Statistical analysis

The statistical analyses were performed with SPSS (SPSS Inc., Chicago, IL, USA). All tests were done using a two-sided 0.05 level of significance.

The Kolmogorov-Smirnov test indicated that the majority of parameters were not normally distributed. Mean and standard deviation (SD) of PDFF and texture parameters averaged over L1 to L5 were computed for pre- and postmenopausal women and compared using the Mann-Whitney test. Furthermore, receiver operating characteristic (ROC) analyses were performed to assess the performance of PDFF and texture parameters averaged over L1 to L5 in differentiating pre- and postmenopausal women, reported as area under the curve (AUC) values. AUC values were compared by the method proposed by Hanley and McNeil [29]. Friedman tests with the vertebral level (L1 to L5) as the grouping variable were performed for PDFF and the texture parameters to test whether significant differences existed between the five vertebral levels. This was done in pre- and postmenopausal women separately.
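The AUC analysis can be illustrated with a short Python sketch. It is a sketch under assumptions and not the SPSS procedure: the AUC is computed through its equivalence with the Mann-Whitney statistic, the standard error uses the Hanley and McNeil approximation, and the comparison of two AUCs derived from the same subjects uses a z test with a correlation term `r`. The value of `r`, like all function names, is hypothetical, since the text does not state which correlation was used.

```python
import math


def auc_mann_whitney(neg, pos):
    """AUC = P(positive > negative), counting ties as one half."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of an AUC (Hanley-McNeil approximation)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    return math.sqrt(max(var, 0.0))


def compare_aucs(auc1, auc2, n_pos, n_neg, r=0.0):
    """z statistic and two-sided p value for two AUCs from the same subjects.

    r is the assumed correlation between the two AUC estimates.
    """
    se1 = hanley_mcneil_se(auc1, n_pos, n_neg)
    se2 = hanley_mcneil_se(auc2, n_pos, n_neg)
    denom = math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2)
    if denom == 0:
        raise ValueError("standard error of the difference is zero")
    z = (auc1 - auc2) / denom
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Group sizes from the study (26 post-, 15 premenopausal women);
# r = 0.5 is a hypothetical value chosen only for illustration.
z, p = compare_aucs(0.97, 0.96, n_pos=26, n_neg=15, r=0.5)
print(z, p)
```

The pairwise definition in `auc_mann_whitney` makes explicit why the Mann-Whitney test and the ROC analysis are closely related: both rest on how often a value from one group exceeds a value from the other.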

Spearman correlation coefficients r were computed to investigate the association of PDFF and texture parameters with age and BMI. Linear regression models were used to adjust the differences in textural parameters between pre- and postmenopausal women for mean PDFF and age.
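As an illustration of the correlation analysis, Spearman's coefficient can be computed as the Pearson correlation of the ranked data, with tied values receiving their average rank. The following Python sketch makes this explicit; the function names are hypothetical and the example values are made up, not taken from the study.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up ages (years) and PDFF values (%) for five hypothetical subjects.
age = [25, 31, 48, 62, 70]
pdff = [24.0, 29.5, 27.0, 45.0, 51.5]
print(spearman_rho(age, pdff))
```

Because only ranks enter the calculation, the coefficient is insensitive to the non-normal distributions noted above and to any monotonic rescaling of the texture features.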

Results

Study population

Our study population consisted of 15 premenopausal and 26 postmenopausal women. Subjects in the premenopausal group were aged 30 ± 7 years and those in the postmenopausal group 65 ± 7 years. The two groups did not differ significantly in BMI (p > 0.05; Table 1).

Table 1 Subject characteristics (age and BMI), PDFF values, and texture features averaged over L1 to L5 in pre- and postmenopausal women. Parameters were compared between the two groups with the Mann-Whitney test (p values) and receiver operating characteristic analysis (area under the curve (AUC)). Status: 0 premenopausal, 1 postmenopausal

PDFF measurements

Mean PDFF values averaged over L1 to L5 showed statistically significant differences between pre- and postmenopausal women (27.76 ± 7.31% versus 49.37 ± 8.14%; p < 0.001; AUC = 0.97; Table 1). PDFF increased significantly (p < 0.001) from L1 to L5 in both groups (Table 2).

Table 2 PDFF values and texture features in L1 to L5 in pre- and postmenopausal women. Differences between vertebral levels L1 to L5 were evaluated using the Friedman test in pre- and postmenopausal women separately

Texture analysis

Eleven texture features were computed in pre- and postmenopausal women as shown in Table 1. Contrast and dissimilarity differentiated pre- and postmenopausal women best, with AUCs of 0.97 and 0.96, respectively (Fig. 2). Global kurtosis, correlation, and sum average showed no significant difference between pre- and postmenopausal women (p > 0.05). In contrast to PDFF, all texture features except for correlation showed no statistically significant anatomical variations from L1 to L5 (p > 0.05; Table 2). However, correlation showed no consistent trend from L1 to L5 (Table 2). No significant differences (p > 0.05) could be detected between the AUC based on mean PDFF values and the AUCs based on the texture features contrast and dissimilarity.

Fig. 2

Representative color-coded PDFF maps of the lumbar vertebral bone marrow of a premenopausal woman (age 22 years; mean PDFF 26.6%; contrast 125,518, dissimilarity 82.97) (a) and a postmenopausal woman (age 71 years; mean PDFF 42.7%; contrast 183,113, dissimilarity 108.75) (b)

Correlations

Significant correlations were detected between age and PDFF (r = 0.703, p < 0.0001), contrast (r = 0.626, p < 0.0001), and dissimilarity (r = 0.470, p < 0.0001) (Table 3 and Fig. 3). BMI showed no significant correlations with age, PDFF, or texture parameters (p > 0.05; Table 3 and Fig. 3).

Table 3 Correlation between subject characteristics (age and BMI), PDFF values, and texture parameters (contrast and dissimilarity) in pre- and postmenopausal women. Parameters were compared with Spearman’s rho test
Fig. 3

PDFF, contrast, and dissimilarity are plotted against BMI (a, c, e) and age (b, d, f). PDFF, contrast, and dissimilarity correlate significantly with age

Linear regression

After adjusting for PDFF as a control variable, contrast (p = 0.011) and dissimilarity (p = 0.009) showed significant differences between pre- and postmenopausal women. Adjusting for age resulted in no significant differences in PDFF, dissimilarity, and contrast between the two groups (p > 0.05).

Discussion

This study demonstrated that postmenopausal women had not only an increased lumbar vertebral bone marrow PDFF, but also a greater bone marrow heterogeneity as assessed by texture analysis in PDFF maps compared with premenopausal women.

Texture analysis was first described by Haralick in the 1970s as a general tool for classifying image features, e.g., in photographic or satellite images, introducing 28 parameters including contrast, correlation, and entropy [22]. Since then, a large and increasing number of researchers have used these texture features for medical imaging analysis, e.g., in CT, MRI, FDG-PET, and ultrasound [28, 30,31,32]. Besides its recent manifold use in oncological imaging for tissue entity discrimination, characterization, and treatment response monitoring, texture analysis has also been described as a reproducible tool to quantitatively assess paraspinal fatty infiltration in MRI [33,34,35,36]. In these and other studies, texture heterogeneity was described to be associated with therapy response and clinical outcome [35]. Besides MRI, texture analysis has been investigated in mammography and CT to assess its capability to contribute to computer-aided cancer diagnosis or bone quality measurements [23, 24].

In this feasibility study, we showed that bone marrow heterogeneity, analogously to PDFF, is significantly greater in postmenopausal women. In contrast to PDFF, which increased from L1 to L5, bone marrow heterogeneity remained constant across these levels. From a pathophysiologic point of view, one possible explanation for this finding is the transformation pattern from red to yellow bone marrow, which starts from solitary foci [1, 37]. The temporal discrepancy regarding the starting point of vertebral bone marrow fatty conversion, beginning at L5 and visualized by increasing PDFF values from the cervical to the lumbar spine, has been reported previously [15]. The results of texture analysis presented in this study imply a rather homogeneous spatial fatty bone marrow conversion at different vertebral levels, despite the differing times at which the structural changes begin. Although experimental preclinical studies have repeatedly demonstrated a negative correlation between bone marrow fat and trabecular structure in animal models, the spatial replacement pattern in humans remains unclear; the present study gives a hint towards anatomically homogeneous bone marrow changes [1, 38].

Dissimilarity and contrast outperformed the other Haralick texture features calculated from GLCM and showed comparable discrimination power to PDFF in differentiating between pre- and postmenopausal women (AUC = 0.97 for contrast, AUC = 0.96 for dissimilarity, and AUC = 0.97 for PDFF, respectively). Neither the AUC values for dissimilarity and PDFF nor the AUC values for contrast and PDFF showed significant differences regarding the differentiation of the two groups.

To our knowledge, this study is the first to demonstrate increased bone marrow heterogeneity in postmenopausal women by use of texture analysis. Structural musculoskeletal changes such as increasing bone marrow adiposity due to aging, hormonal changes, and endocrine or metabolic diseases have been described extensively in the past [5, 6]. Baum et al. [3, 4] visualized these changes in the lumbar spine using chemical shift encoding–based water-fat MRI and showed increasing bone marrow adiposity from L1 to L5 as well as in the postmenopausal period. Texture parameters (except correlation) showed no significant anatomical variation across all scanned subjects. However, besides dissimilarity and contrast, other texture metrics such as skewness, kurtosis, energy, and entropy were also higher after menopause, with entropy showing the best result among them in differentiating post- from premenopausal women (AUC = 0.79). It therefore stands to reason that texture analysis of water-fat MRI may offer similar diagnostic capabilities in the clinic, with the additional benefit of being radiation free.

The texture feature “contrast” gives elements with similar gray level values a low weight, whereas elements with differing gray levels are given a high weight [22]. Texture “dissimilarity” is evaluated with the Kullback-Leibler divergence and can roughly be described as a measure of how different the gray levels of two elements appear. “Entropy” is a measure of randomness in the pixel distribution and may depict clinically relevant vertebral micro-architectural alterations. Other groups also investigated the reliability of different texture parameters and showed that features such as kurtosis, skewness, and uniformity gave good results for diagnosis and monitoring in cancer imaging [34, 35]. The described texture features are inherently dependent on imaging properties such as resolution, noise, and scan parameters (repetition time, echo time, and receiver bandwidth) [39]. To ensure comparable signal-to-noise ratios across scans, similar scan parameters and MRI protocols should therefore be used.

This study has several limitations. First, regarding methodology, only 11 texture features were investigated. Further parameters that showed good and reproducible results in other studies, e.g., uniformity, could be added to the texture feature pool. Second, the postmenopausal group was unevenly distributed across women classified as having normal, osteopenic, and osteoporotic BMD. In a subsequent study, postmenopausal women evenly distributed across these three subgroups according to DXA measurements could be scanned, so that textural features could be adequately compared within and across the groups. This would be important for diagnostic imaging and a step towards acquiring benchmark and threshold values for disease entity differentiation in MRI. In a further step, other diseases known to be associated with structural changes in vertebral bone marrow composition and increasing PDFF values, such as type 2 diabetes mellitus, could be investigated, following the hints the present study gives concerning the connection between texture features, morphology, and pathophysiology [3, 14, 16].

In conclusion, this study shows that texture features, namely dissimilarity and contrast, derived from chemical shift encoding–based water-fat MRI can be used to describe the spatial heterogeneity of vertebral bone marrow in pre- and postmenopausal women. Compared with established parameters such as PDFF, these parameters might offer additional insight into vertebral bone marrow alterations due to aging or hormonal changes and shed light on osseous pathologic processes.