Thigh muscle segmentation of chemical shift encoding-based water-fat magnetic resonance images: The reference database MyoSegmenTUM

Magnetic resonance imaging (MRI) can non-invasively assess muscle anatomy, exercise effects and pathologies with different underlying causes such as neuromuscular diseases (NMD). Quantitative MRI including fat fraction mapping using chemical shift encoding-based water-fat MRI has emerged for reliable determination of muscle volume and fat composition. The data analysis of water-fat images requires segmentation of the different muscles, which in the past has mainly been performed manually and is a very time-consuming process, currently limiting the clinical applicability. Automating the segmentation process would lead to a more time-efficient analysis. In the present work, the manually segmented thigh magnetic resonance imaging database MyoSegmenTUM is presented. It hosts water-fat MR images of both thighs of 15 healthy subjects and 4 patients with NMD with a voxel size of 3.2 × 2 × 4 mm³ together with the corresponding segmentation masks for four functional muscle groups: quadriceps femoris, sartorius, gracilis and hamstrings. The database is freely accessible online at https://osf.io/svwa7/?view_only=c2c980c17b3a40fca35d088a3cdd83e2. The database is mainly meant as ground truth which can be used as a training and test dataset for automatic muscle segmentation algorithms. The segmentation allows extraction of muscle cross sectional area (CSA) and volume. Proton density fat fraction (PDFF) of the defined muscle groups from the corresponding images and quadriceps muscle strength measurements/neurological muscle strength rating can be used for benchmarking purposes.


Introduction
The non-invasive evaluation of muscle tissue with magnetic resonance imaging (MRI) has recently gained considerable interest for assessing muscle strength, neuromuscular diseases (NMD), musculoskeletal disorders, metabolic diseases and aging effects. An increased signal on T2-weighted MR images is observed after exercise [1][2][3][4][5], and training effects have been shown in MR images [6,7]. In NMD the two main characteristic pathologies are acute edematous muscle alterations and fatty infiltration of chronically affected muscles [8][9][10]. In other musculoskeletal disorders such as osteoarthritis, muscle proton density fat fraction (PDFF) has been related to symptomatic and structural severity [11]. Regional distribution of intramuscular adipose tissue has also been shown to be different in patients with metabolic diseases such as type 2 diabetes [12].
Conventional magnetic resonance imaging of muscles includes T1-weighted and T2-weighted Short Tau Inversion Recovery (STIR) sequences. Based on such images, a qualitative assessment of the described changes can be performed and semi-quantitative rating scales for changes in the muscle tissue can be applied [13][14][15][16]. However, such an analysis is highly dependent on the subjective judgment of the reader, and a longitudinal evaluation of training effects or disease progression remains challenging. To allow a more objective analysis there is an emerging need for quantitative MR data.
Muscle hypertrophy or atrophy can be assessed by the cross sectional area (CSA) of the muscle. The CSA can be calculated based on anatomical images of the musculature. Using quantitative chemical shift encoding-based water-fat MRI, fatty infiltration in muscle tissue can additionally be evaluated [17][18][19].
PDFF and CSA of muscle groups and individual muscles can be related to biometrically measured muscle strength [20][21][22]. In patients with NMD characteristic patterns of fatty infiltration especially in the thigh have been defined [23] representing a promising approach for diagnosis of the underlying disease using MR images.
For the subsequent analysis of the quantitative MR data that allows the investigation of single muscle groups, regions of interest (ROIs) have to be defined through delineation of the muscle contours [24]. Despite recent advances in semi-automatic or automatic segmentation methods, their application in clinical settings is still difficult [25], and the segmentation often needs to be performed mainly manually for every single muscle or muscle group. Manual segmentation of skeletal muscle is a very time-consuming process and a bottleneck in the widespread clinical application of quantitative skeletal muscle MRI. A reliable, robust and fast automated or semi-automated way of muscle segmentation would be highly beneficial for the analysis of quantitative muscle MR images. An automatic muscle segmentation would also improve multi-parametric analysis of different biomarkers because the defined ROIs could be easily applied to other sequences covering the same anatomical region. Already existing methods would highly benefit from further evaluation and, to the best of our knowledge, have so far only been applied to a ground truth of healthy subjects [25][26][27][28][29].
Particularly the thigh is a region of high interest for quantitative muscle MRI because of its technical, anatomical and functional advantages. The thigh is a region of good magnetic field homogeneity with relatively low motion artefacts. Patients with NMD show the disease characteristic patterns mainly in the thigh and lower leg [23] and it is possible to perform muscle selective exercise in this region.
In the present work, a database offering free online access to manually segmented thigh muscles of healthy volunteers and NMD patients is reported. A ground truth MR image database is introduced aiming to facilitate access to manually segmented images and to become an essential tool as training or test dataset in developing automatic segmentation methods for thigh muscles. Volumetric information for the segmented muscle groups as well as strength measurements are also provided.

Subjects
The study was approved by the institutional Committee for Human Research (Ethikkommission der Fakultaet fuer Medizin der TU Muenchen). All subjects gave written informed consent before participation in the study.

MR imaging
The bilateral thigh muscles were scanned on a 3 Tesla system (Philips, Ingenia, Best, Netherlands). Scanning was performed in 2 (healthy volunteers) or 3 (patients) consecutive axial stacks to cover bilaterally the whole thigh from the hip down to the cranial edge of the patella. The built-in 12-channel posterior coil was used together with a 16-channel anterior coil, which was placed on top of the hip and thigh region to ensure the best signal quality for the scanned muscle groups.
For the healthy volunteers HV001 to HV011 a six-echo 3D spoiled gradient echo sequence was used for chemical shift encoding-based water-fat separation. The sequence acquired the six echoes in a single TR using non-flyback (bipolar) read-out gradients and the following imaging parameters: TR/TE_min/ΔTE = 10/1.04/0.8 ms, FOV = 300 × 525 mm², acquisition matrix = 96 × 263, acquired slice thickness = 4 mm, reconstructed matrix size = 560 × 560, voxel size = 3.2 × 2 × 4 mm³, number of slices = 65, receiver bandwidth = 2345 Hz/pixel, frequency direction = A/P (to minimize breathing artifacts), SENSE in L/R direction with reduction factor R = 2, N_avg = 1, scan time = 1 min and 48 s per stack. A flip angle of 3° was used to minimize T1-bias effects.
For the healthy volunteers HV012 to HV015 a six-echo 3D spoiled gradient echo sequence was used for chemical shift encoding-based water-fat separation. The sequence acquired the six echoes in a single TR using non-flyback (bipolar) read-out gradients and the following imaging parameters: TR/TE_min/ΔTE = 6.4/1.1/0.8 ms, FOV = 220 × 400 mm², acquisition matrix = 68 × 150, acquired slice thickness = 4 mm, reconstructed matrix size = 432 × 432, voxel size = 3.2 × 2.2 × 4 mm³, number of slices = 63, receiver bandwidth = 2484 Hz/pixel, frequency direction = A/P (to minimize breathing artifacts), N_avg = 1, scan time = 1 min and 25 s per stack. A flip angle of 3° was used to minimize T1-bias effects.
For the patients, a six-echo 3D spoiled gradient echo sequence was used for chemical shift encoding-based water-fat separation. The sequence acquired the six echoes in a single TR using non-flyback (bipolar) read-out gradients and the following imaging parameters: TR/TE_min/ΔTE = 10/1.04/0.8 ms, FOV = 262 × 424 mm², acquisition matrix = 84 × 211, acquired slice thickness = 8 mm, reconstructed matrix size = 512 × 512, voxel size = 3.2 × 2 × 4 mm³, number of slices = 30, receiver bandwidth = 2325 Hz/pixel, frequency direction = A/P (to minimize breathing artifacts), SENSE in L/R direction with reduction factor R = 2, N_avg = 1, scan time = 20 s per stack. A flip angle of 3° was used to minimize T1-bias effects.
The gradient echo imaging data were processed online using the multi-echo mDIXON fat quantification method provided by the manufacturer. Specifically, a complex-based water-fat decomposition was performed using a single T2* correction and a pre-calibrated fat spectrum, accounting for the presence of the multiple peaks in the fat spectrum. A seven-peak fat spectrum model was employed. The imaging-based proton density fat fraction (PDFF) map was computed as the ratio of the fat signal over the sum of fat and water signals.
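The final step described above, the ratio of the fat signal over the sum of fat and water signals, can be illustrated with a minimal sketch. It does not reproduce the manufacturer's complex-based decomposition or the T2* correction; it only assumes that separated water and fat signal values are already available per voxel. The function name `pdff_percent`, the percent scaling, the zero-signal handling and the toy values are assumptions made for illustration.

```python
def pdff_percent(water, fat, eps=1e-12):
    """Voxel-wise proton density fat fraction in percent: 100 * F / (W + F).

    `water` and `fat` are flat sequences of separated signal values,
    one entry per voxel (an assumed input layout).
    """
    if len(water) != len(fat):
        raise ValueError("water and fat must have the same number of voxels")
    result = []
    for w, f in zip(water, fat):
        total = w + f
        # Voxels without signal (e.g. air) are set to 0 to avoid division by zero.
        result.append(100.0 * f / total if total > eps else 0.0)
    return result


# Toy values for four voxels (hypothetical, not taken from the database).
water = [90.0, 50.0, 0.0, 10.0]
fat = [10.0, 50.0, 0.0, 30.0]
print(pdff_percent(water, fat))  # [10.0, 50.0, 0.0, 75.0]
```

Whether the deposited PDFF maps are scaled in percent or as a fraction should be checked in the image headers before comparing values.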
Axial water images, axial fat images and axial PDFF maps of each stack were stored as separate datasets for each subject as a *.dcm file and a *.nii file (https://nifti.nimh.nih.gov/nifti-1).

MR image segmentation
Muscle segmentation was performed by manually drawing regions of interest (ROIs) on the PDFF maps using the open access image viewer software MITK (German Cancer Research Center, Division of Medical and Biological Informatics, Medical Imaging Interaction Toolkit, Heidelberg, Germany). The ROIs delineated the following clinically relevant muscle groups: quadriceps femoris muscle, sartorius muscle, gracilis muscle and hamstring muscles. The ROIs were placed in each muscle group with a margin of approximately 2 mm to their outer contour to avoid the accidental inclusion of subcutaneous fat and the muscle-fat interface, as previously reported [20]. The ROIs extend from the cranial beginning of the muscle groups down to the muscle-tendon transition at the knee. The segmentations were performed by one operator (2 years of experience in imaging of NMD patients) in the images of all 19 subjects, including those acquired for reproducibility purposes, and reviewed by a board certified radiologist (10 years of experience in musculoskeletal radiology). The average segmentation time was approximately 6 hours for one subject. The manual segmentation of each muscle group is available as a binary mask, in which pixels with an intensity value of 1 correspond to muscle tissue and pixels with a value of 0 to the background. Each mask of each image stack was stored as a separate *.mha file. Consequently, each subject dataset has 16 to 24 corresponding segmentation masks including both legs and all stacks.
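The range of 16 to 24 masks per subject follows from four muscle groups, two legs and two or three stacks. The following sketch makes that arithmetic explicit; the identifiers and the label strings are illustrative assumptions and do not reflect the file naming used in the database.

```python
from itertools import product

# Labels are illustrative; the actual file names in the database may differ.
MUSCLE_GROUPS = ("quadriceps femoris", "sartorius", "gracilis", "hamstrings")
LEGS = ("left", "right")


def expected_masks(n_stacks):
    """List every (muscle group, leg, stack index) combination for one subject."""
    return list(product(MUSCLE_GROUPS, LEGS, range(1, n_stacks + 1)))


print(len(expected_masks(2)))  # healthy volunteers, 2 stacks: 16
print(len(expected_masks(3)))  # patients, 3 stacks: 24
```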

Muscle strength measurements/Neurological muscle strength rating
Right quadriceps muscle maximum isometric torque [Nm] produced by knee extension at 60° and 90° knee flexion angle was obtained in healthy volunteers (HV001 to HV011) by using a rotational dynamometer (Isomed 2000, D&R Fertsl GmbH, Hemau, Germany). In HV012 to HV015 isometric muscle strength measurements were performed bilaterally at 60°. The subjects were seated in an upright position (90° hip flexion) and carefully fastened with safety belts to avoid any kind of additional movement. The aim was to generate the individual maximum isometric torque in the quadriceps muscle at 60° and 90° knee flexion angle. The subjects performed three repetitions with maximum isometric muscle activity and full recovery in between, and the highest value at each angle was used for the data analysis as previously reported [20].
Neurological muscle strength rating was performed by a board certified neurologist bilaterally in the thigh of all 4 patients for knee flexion and knee extension using the Medical Research Council (MRC) score [30]: 0/5 no contraction; 1/5 muscle flicker, but no movement; 2/5 movement possible, but not against gravity; 3/5 movement possible against gravity, but not against resistance by the examiner; 4/5 movement possible against some resistance by the examiner; 5/5 normal strength.

Results
The manually segmented thigh magnetic resonance imaging database is available online at https://osf.io/svwa7/?view_only=c2c980c17b3a40fca35d088a3cdd83e2. It includes gender, age, weight and height for each subject and quadriceps muscle strength measurements of the healthy volunteers as well as the neurological muscle strength rating (MRC score) of the patients (Tables 1 and 2). Axial water images, axial fat images and axial PDFF maps of each stack were deposited as separate datasets for each subject as *.dcm file and *.nii file. The segmentation masks of the four muscle groups of both legs in all stacks were deposited as *.mha files.
[Figure: ... showing the characteristic pathological involvement patterns of these diseases. On the left leg, the four different muscle segmentation masks are highlighted.]
In Tables 3 to 6 the PDFF and the volume of all muscle groups of all healthy volunteers and patients are summarized. Each table contains the bilateral values for one muscle group.
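Values such as those summarized in Tables 3 to 6 can in principle be derived from a binary mask and the matching PDFF map. The sketch below assumes both have been flattened to equally long sequences and that the voxel spacing is taken from the image header; the function names and toy numbers are hypothetical, and the exact procedure used to compute the table values is not described in this section.

```python
def muscle_volume_cm3(mask, voxel_size_mm):
    """Volume of a binary mask: number of voxels with value 1 times voxel volume."""
    dx, dy, dz = voxel_size_mm
    n_voxels = sum(1 for value in mask if value == 1)
    return n_voxels * dx * dy * dz / 1000.0  # mm^3 to cm^3


def mean_pdff(pdff, mask):
    """Mean PDFF over all voxels where the mask equals 1."""
    if len(pdff) != len(mask):
        raise ValueError("PDFF map and mask must have the same number of voxels")
    values = [p for p, m in zip(pdff, mask) if m == 1]
    if not values:
        raise ValueError("mask contains no muscle voxels")
    return sum(values) / len(values)


# Toy data: four voxels, three of them inside the mask (hypothetical values).
mask = [1, 1, 0, 1]
pdff = [4.0, 6.0, 90.0, 8.0]
print(muscle_volume_cm3(mask, (2.0, 2.0, 4.0)))  # 3 voxels * 16 mm^3 = 0.048 cm^3
print(mean_pdff(pdff, mask))  # 6.0
```

The spacing passed as `voxel_size_mm` should be the spacing of the reconstructed image that the mask was drawn on, which may differ from the acquired voxel size listed in the imaging parameters.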

Discussion
In the present work a database for manually segmented thigh muscles in MR images of healthy volunteers and patients with neuromuscular diseases is presented.
The database offers access to axial water images, axial fat images and axial PDFF maps of the thigh as well as the corresponding segmentation masks for four functional muscle groups. Therefore, the database offers free access to training or test datasets for automatic segmentation algorithms. Intensity information from the different water and fat contrasts can be taken into account in a joint multivariate analysis process to separate muscle and fat voxels [31][32][33]. Furthermore, the open-access database could be used to benchmark different segmentation methods in a comparable way. The data is available in two (healthy volunteers) and three (NMD patients) consecutive stacks. The stacks were acquired without a gap and consequently could easily be merged by a potential user of the database. However, the separated stacks are [...]
Approximately 80% of the presented images are from healthy volunteers. The healthy muscle tissue and its allocation in the thigh represent the ground truth for automatic segmentation algorithms. The other images give an insight into typical presentations of three different muscle diseases (DM2, LGMD2A and ALS), which are "extreme" cases that are useful for understanding the limitations of an applied automatic segmentation method.
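Merging the gap-free stacks mentioned above amounts to concatenating them along the slice axis. The sketch below assumes that each stack is given as a list of 2D slices, that all stacks share the same in-plane matrix, and that they are passed in a consistent anatomical order (for example cranial to caudal); these assumptions, and the function name `merge_stacks`, are illustrative and should be checked against the image headers.

```python
def merge_stacks(stacks):
    """Concatenate gap-free axial stacks along the slice axis.

    Each stack is a list of slices; each slice is a list of rows.
    """
    if not stacks or not stacks[0]:
        raise ValueError("at least one non-empty stack is required")
    n_rows = len(stacks[0][0])
    n_cols = len(stacks[0][0][0])
    merged = []
    for stack in stacks:
        for image_slice in stack:
            # All slices must share the same in-plane matrix to be stacked.
            if len(image_slice) != n_rows or any(len(row) != n_cols for row in image_slice):
                raise ValueError("all slices must share the same in-plane matrix")
            merged.append(image_slice)
    return merged


# Two toy stacks with two 2x2 slices each (hypothetical values).
upper = [[[0, 1], [1, 0]], [[1, 1], [0, 0]]]
lower = [[[0, 0], [1, 1]], [[1, 0], [0, 1]]]
print(len(merge_stacks([upper, lower])))  # 4 slices in total
```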
The calculated mean PDFF and CSA illustrate, by way of example, the application of quantitative MRI data in muscle imaging. Tables 3 to 6 can serve as the ground truth when performing benchmark tests with newly developed computer vision or machine learning algorithms and may help to evaluate their performance.
The quantitative data in Tables 3 to 6 can only be extracted after the definition of specific ROIs. As there are increasing efforts towards quantification in muscle MRI to assess acute and chronic changes in the muscle tissue, there is a high need for a time-efficient way of segmenting the muscle groups or individual muscles. An automatic analysis of quantitative MRI data enables the analysis of big data. This may allow the application of quantitative water-fat imaging in clinical practice, resulting in better monitoring of disease progression and therapy effectiveness. It could foster the development of fully automated diagnostic procedures by computationally segmenting and analyzing PDFF maps, followed by an automatic diagnostic process using the characteristic patterns of NMD [23] in axial MR images, thus helping to identify the correct disease. The present database offers the possibility to work with whole muscle volumes in the thigh. It is known that quantitative values such as the PDFF and water T2 can differ throughout the muscle volume, with the muscle being affected heterogeneously along the proximodistal axis [34]. Therefore, a segmentation of the whole muscle volume is essential and offers new insights into muscle physiology and pathology.
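For benchmarking an automatic segmentation against the manual masks, an overlap measure is typically needed in addition to comparing PDFF and volume values. The source does not prescribe a metric; the sketch below uses the Dice coefficient as one common choice, applied to flattened binary masks. The function name and toy masks are assumptions for illustration.

```python
def dice_coefficient(predicted, reference):
    """Dice overlap 2*|A and B| / (|A| + |B|) between two flattened binary masks."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(predicted, reference) if p == 1 and r == 1)
    size_sum = sum(1 for p in predicted if p == 1) + sum(1 for r in reference if r == 1)
    if size_sum == 0:
        # Both masks empty: treated here as perfect agreement (a convention, not a rule).
        return 1.0
    return 2.0 * overlap / size_sum


# Toy masks (hypothetical): 2 voxels overlap, each mask contains 3 voxels.
reference = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
print(round(dice_coefficient(predicted, reference), 3))  # 0.667
```

Because the manual ROIs were drawn with a margin inside the muscle contour, an automatic method that segments the exact muscle border would need a matching erosion step before such a comparison is meaningful.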
The relatively low number of datasets can be seen as a limitation of the present database, particularly in the context of traditional machine learning ground truth databases. However, the database can be extended by more manually segmented muscle imaging data. It is planned to gradually increase the number of datasets by adding datasets of healthy thigh musculature and of patients with various NMD showing less and more extreme pathologies of the muscle tissue. The amount of training data required for automatic segmentation algorithms based on neural networks depends strongly on the desired performance of the segmentation tool and on how the available data is altered (mirrored, deformed) to artificially increase the number of training sets. However, the provided database could be seen as a good starting point for the development of automatic algorithms, and the planned extension should provide enough datasets for sufficient training and testing of the segmentation algorithms.
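Mirroring, one of the alterations mentioned above, can be sketched for a single axial slice as follows. The sketch assumes that image columns run along the left-right direction and that the image and its mask are flipped together; the function names and toy data are hypothetical.

```python
def mirror_left_right(slice_2d):
    """Return a copy of a 2D slice with the column order reversed."""
    return [list(reversed(row)) for row in slice_2d]


def mirror_pair(image_slice, mask_slice):
    """Flip an image slice and its mask together so they stay aligned."""
    return mirror_left_right(image_slice), mirror_left_right(mask_slice)


# Toy PDFF slice and mask (hypothetical values).
image = [[5.0, 7.0, 80.0], [6.0, 9.0, 85.0]]
mask = [[1, 1, 0], [1, 0, 0]]
flipped_image, flipped_mask = mirror_pair(image, mask)
print(flipped_image)  # [[80.0, 7.0, 5.0], [85.0, 9.0, 6.0]]
print(flipped_mask)   # [[0, 1, 1], [0, 0, 1]]
```

After mirroring a bilateral image, a mask that belonged to the left leg lies on the right side of the image, so side-specific labels have to be swapped accordingly.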
To obtain accurate PDFF results, the presented segmentation was placed slightly inside the contour of each muscle group to exclude subcutaneous fat and the muscle-fat interface. This might be considered a second limitation, as an automatic segmentation will have to be performed at the exact muscular border and eroded in a subsequent step for the evaluation of PDFF.
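The erosion step mentioned above can be sketched as repeated binary erosion of a 2D mask until the approximate 2 mm margin is reached. The sketch assumes isotropic in-plane pixels, a 4-connected neighbourhood and that pixels outside the image count as background; these choices, and the function names, are assumptions and not the procedure used for the manual ROIs.

```python
import math


def erode_once(mask):
    """One step of binary erosion with a 4-connected (cross-shaped) neighbourhood."""
    n_rows, n_cols = len(mask), len(mask[0])
    eroded = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            if mask[r][c] != 1:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            # A pixel survives only if all four neighbours are inside the image and set.
            if all(0 <= rr < n_rows and 0 <= cc < n_cols and mask[rr][cc] == 1
                   for rr, cc in neighbours):
                eroded[r][c] = 1
    return eroded


def erode_by_margin(mask, margin_mm, pixel_size_mm):
    """Erode a mask by roughly `margin_mm`, rounded up to whole pixels."""
    steps = math.ceil(margin_mm / pixel_size_mm)
    for _ in range(steps):
        mask = erode_once(mask)
    return mask


# Toy 5x5 mask filled with ones (hypothetical), eroded by 2 mm at 1 mm pixels.
full = [[1] * 5 for _ in range(5)]
for row in erode_by_margin(full, margin_mm=2.0, pixel_size_mm=1.0):
    print(row)  # only the centre pixel remains set
```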

Conclusion
A database (MyoSegmenTUM) of MR images of the thigh muscles of healthy volunteers and patients with neuromuscular diseases was presented together with the corresponding manual segmentation masks. The database offers training and test datasets for the development of automatic muscle segmentation algorithms, which are highly needed to exploit the full potential of quantitative muscle MRI in the future diagnosis and treatment of muscle pathologies.