The complementary use of muscle ultrasound and MRI in FSHD: Early versus later disease stage follow-up



Introduction
Muscle imaging can complement clinical examination in neuromuscular disorders. It helps characterize structural muscle changes and can identify patterns of muscle involvement that aid the diagnostic process (Simon et al., 2016). Muscle imaging can also be used as a biomarker to complement patient assessment in clinical trials (Dahlqvist et al., 2020b). In some muscular dystrophies, abnormalities on muscle imaging are known to precede muscle weakness and functional disabilities, correlate strongly with clinical outcome, and are highly sensitive to change, even in slowly progressive dystrophies (van der Plas et al., 2021; Wang et al., 2021). This contrasts with many of the available clinical outcome measures, which have a low sensitivity to change within the one- to two-year timeframe of a clinical trial. Therefore, studies of slowly progressive muscular dystrophies often use imaging modalities as screening tools to select participants prior to study inclusion, and as surrogate endpoints to evaluate progression of muscle abnormalities and assess possible intervention effects (Adminstration, 2021; Fulcrum-Therapeutics, 2022; Garibaldi et al., 2022; Hamel and Tawil, 2018).
Facioscapulohumeral muscular dystrophy (FSHD) is one of the most prevalent muscular dystrophies in adulthood, with reported prevalence rates ranging from 6 to 12 per 100,000 (Deenen et al., 2014; Flanigan et al., 2001). An improved understanding of the pathogenic mechanism causing FSHD has led to the development of possible targeted therapies (Mul, 2022). Human clinical trials are now being conducted, and more are expected in the upcoming years. Several imaging modalities have been studied for use as possible diagnostic, prognostic, monitoring and response biomarkers in clinical trials in FSHD (Adminstration, 2021; Monforte et al., 2023). The most commonly used imaging techniques for muscle disorders are MRI and ultrasound (Simon et al., 2016). Muscle MRI allows for imaging of large (or whole) body regions, assesses superficial and deep muscles, and can identify muscle edema. MRI acquisition may be uncomfortable for intermediately to severely affected patients, in particular those with spinal deformities. Muscle ultrasound can be performed at the bedside or with a patient seated in a wheelchair, captures the superficial muscle layers with high resolution and can access smaller body regions such as the face and diaphragm muscles. Both techniques require dedicated protocols and operator expertise.
Muscle imaging with MRI and ultrasound both capture structural changes in the muscle tissue. The techniques differ fundamentally in the way the images are acquired (Dahlqvist et al., 2020b; Wijntjes and van Alfen, 2021). To evaluate muscle abnormalities using MRI, different pulse sequences can be used. Chemical-shift-based water-fat separation or 'Dixon' sequences are commonly used for the quantification of muscle fatty replacement, whereas T2-weighted TIRM/STIR or water T2 mapping sequences are used for the evaluation of muscle edema (Dahlqvist et al., 2020b; Monforte et al., 2023). Muscle ultrasound, on the other hand, probes muscle abnormalities using sound waves that reflect from tissue transitions. More of these transitions are present in dystrophic muscle as muscle fibers are replaced by fat or fibrosis (Wijntjes and van Alfen, 2021). Dystrophic muscles therefore show an increased echogenicity, along with a disruption of the normal tissue architecture (Pillen et al., 2009; Reimers et al., 1993). Of note, a completely fat-replaced muscle will have few tissue transitions left and will therefore become hypoechoic again, resembling subcutaneous fat. Consequently, the echogenicity of completely fat-replaced muscles will eventually decrease, following a parabolic curve (Wijntjes and van Alfen, 2021). Also, echogenicity is device-dependent, which complicates multi-center standardization of ultrasound.
Because of the inherent differences in image acquisition and clinical application between muscle ultrasound and MRI, knowledge of how these methods relate to one another can be very helpful. Few studies have described the cross-sectional relationships between muscle MRI and muscle ultrasound in FSHD (Fionda et al., 2023; Janssen et al., 2014; Mul et al., 2018). Their results suggest that ultrasound detects muscle changes in muscles that still appear normal on MR images during earlier disease stages, whereas MRI is better suited to detect later disease stages with extensive fatty replacement of muscle tissue.
In this study we provide the five-year follow-up data of muscle ultrasound and muscle MRI outcomes, and the corresponding clinical outcomes. The main aim is to assess how the two techniques complement one another, to optimize their use as imaging biomarkers in future clinical trials in FSHD.

Study population
We invited all FSHD patients who underwent both MRI and ultrasound scanning during the baseline visit of our observational 2014-2015 cohort study to participate in the follow-up study (n = 27) (Mul et al., 2018). All patients were 18 years or older and had genetically confirmed FSHD 1 or FSHD 2. Asymptomatic patients, defined as patients without symptoms of muscle weakness but with muscle FSHD signs on examination, and non-penetrant gene carriers, defined as persons without symptoms of muscle weakness on history and without muscle FSHD signs on examination, were also included in the study (Wohlgemuth et al., 2018). MRI and ultrasound were performed on the same day at the Radiology and Clinical Neurophysiology departments, respectively, at Radboud university medical center in Nijmegen. Follow-up visits were scheduled between 2019 and 2020.
This study was performed in line with the principles of the Declaration of Helsinki and the study protocol was approved by the regional medical ethics committee (METC Oost-Nederland 2018-5035). All participants gave written informed consent prior to participating.

MRI acquisition and analysis
The MRI scan protocol and quantitative analysis methods have been described previously (Mul et al., 2018; Mul et al., 2017; Vincenten et al., 2023). At the time of the baseline study, whole-body protocols were technically difficult and considered too lengthy; therefore, a lower extremity imaging protocol was adopted instead. All MR images were acquired on a 3 T TIM Trio scanner at baseline, and a 3 T Prisma at follow-up (Siemens Healthineers, Erlangen, Germany). The changes caused by this system upgrade were corrected for, as previously described (Vincenten et al., 2023).
Briefly, after localizers, an axial 2D two-point Dixon sequence was applied in both the upper and the lower legs with the following parameters: repetition time = 10 ms, echo times = 2.45 and 3.68 ms, flip angle = 3°, field of view = 271 × 435 mm, matrix size = 200 × 320, 5 mm slice thickness and 144 slices. This was followed by a co-localized TIRM sequence with: repetition time = 4,000 ms, echo time = 40 ms, flip angle = 150°, field of view = 271 × 435 mm, matrix size = 160 × 256, 72-slice stack with a 5 mm gap, and an inversion time = 220 ms to null fat.
Predefined anatomical landmarks were used to determine which slices of the MRI dataset were to be analyzed. Considering the observed distal-to-proximal gradient in fat replacement along muscles of FSHD patients and the time-consuming manual segmentation method we used, we selected one distal slice of each muscle (Heskamp et al., 2022). In the upper leg, we selected slices at 2/3 along a line joining the anterior superior iliac spine and the proximal edge of the patella. In the lower leg, we selected slices at 1/3 along a line spanning the distal edge of the patella and the lateral malleolus, except for the peroneus tertius, for which we used the slice at 4/5 of the distance between these landmarks (Mul et al., 2017). The selected locations were marked on the skin using fixed fish-oil capsules, which appear bright on MR images. These fish-oil capsules were used to obtain the ultrasound images at the exact level of the MR images. Two authors visually verified the selected slices to ensure consistency between baseline and follow-up scans.
Pixelwise fat-fraction (FF) maps were calculated from scanner-reconstructed Dixon fat and water images using MATLAB (version R2022a, MathWorks, Massachusetts, USA). Regions of interest (ROIs) for each muscle were then drawn on each fat fraction map by one author (S.V., with 8 years of experience in assessing MR images of FSHD patients) using Fiji software (ImageJ, NIH, Wisconsin, USA) (Schindelin et al., 2012). The following muscles were examined bilaterally: rectus femoris (RF), vastus lateralis (VL), the medial head of the gastrocnemius (GM), tibialis anterior (TA) and peroneus tertius combined with the extensor digitorum longus (PT). Muscle FFs were then calculated per ROI. A muscle with a fat fraction below 10% was considered normal, whereas a muscle with a fat fraction of > 60% was considered severely affected. An increase in fat fraction of ≥10% was considered large (Heskamp et al., 2022; Mercuri et al., 2007). TIRM images were visually assessed by two authors (S.V. and K.M., the latter having 9 years of experience in assessing MR images of FSHD patients) for the presence of signal hyperintensities indicating edema, and each individual muscle was scored as either TIRM positive or TIRM negative.
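As an illustration of the fat-fraction step, the following minimal sketch assumes the commonly used two-point Dixon definition FF = F / (F + W), applied pixelwise, followed by averaging within an ROI mask. The function names, the toy pixel values and the "affected" label for the 10-60% range are assumptions made for this example; the study itself used MATLAB and Fiji.

```python
def fat_fraction_map(fat, water):
    """Return a pixelwise fat-fraction map in percent: 100 * F / (F + W)."""
    ff_map = []
    for fat_row, water_row in zip(fat, water):
        row = []
        for f, w in zip(fat_row, water_row):
            total = f + w
            # Pixels without signal are set to 0 to avoid division by zero.
            row.append(100.0 * f / total if total > 0 else 0.0)
        ff_map.append(row)
    return ff_map


def roi_mean_ff(ff_map, roi_mask):
    """Mean fat fraction over the pixels where the ROI mask is truthy."""
    values = [
        ff
        for ff_row, mask_row in zip(ff_map, roi_mask)
        for ff, inside in zip(ff_row, mask_row)
        if inside
    ]
    return sum(values) / len(values)


def classify_ff(ff_percent):
    """Apply the cut-offs from the text: <10% normal, >60% severely affected."""
    if ff_percent < 10.0:
        return "normal"
    if ff_percent > 60.0:
        return "severely affected"
    return "affected"  # label for the 10-60% range is an assumption


# Toy 2 x 2 fat and water images (arbitrary signal units, not study data).
fat = [[10, 0], [30, 70]]
water = [[90, 0], [70, 30]]
mask = [[1, 0], [1, 1]]

ff_map = fat_fraction_map(fat, water)
print(classify_ff(roi_mean_ff(ff_map, mask)))
```

Real Dixon data would normally be handled as arrays rather than nested lists; plain lists keep the sketch dependency-free.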
To assess lower extremity fat replacement for each patient and to compare MRI outcomes to clinical outcome measures and ultrasound outcomes, an additional FF variable was calculated: the MRI FF compound score (FF-CoS). The FF-CoS was calculated by averaging the FF of all (left and right leg) muscles combined. The ΔFF-CoS was then calculated by subtracting the FF-CoS at baseline from the FF-CoS at follow-up. The FF-CoS was not corrected for cross-sectional area (CSA), even though CSAs based on MRI ROIs were available for all assessed muscles, because it needed to be compared to the ultrasound compound score, and CSAs based on ultrasound ROIs could not be reliably assessed, for instance due to attenuation.
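The compound score and its change over time can be sketched as follows. The muscle keys and FF values are invented for illustration, and the helper names are hypothetical.

```python
def ff_compound_score(muscle_ffs):
    """FF-CoS: unweighted mean FF (in percent) over all assessed muscles of both legs."""
    return sum(muscle_ffs.values()) / len(muscle_ffs)


def delta_ff_cos(baseline, follow_up):
    """Change in FF-CoS: follow-up minus baseline."""
    return ff_compound_score(follow_up) - ff_compound_score(baseline)


# Illustrative values only (not study data).
baseline = {"RF_left": 10.0, "RF_right": 20.0}
follow_up = {"RF_left": 14.0, "RF_right": 22.0}
print(delta_ff_cos(baseline, follow_up))
```

Because no CSA weighting is applied, a small and a large muscle contribute equally to the score, which matches the unweighted averaging described above.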

Muscle ultrasound acquisition and analysis
The ultrasound scanning protocol and analysis methods have been described in detail elsewhere (Mul et al., 2018). To match the MRI protocol, only lower extremity muscles were included in this scanning protocol. Muscle ultrasound was performed with an Esaote MyLabTwice ultrasound scanner (Esaote SpA, Genoa, Italy) using an LA533 3-13 MHz linear transducer. A preset was used for system settings and all further settings were kept constant. For most muscles, a 4-cm depth preset was chosen, with the focal area fixed in the lower edge of the image to avoid an uneven grayscale (or 'echogenicity') distribution on this ultrasound system. A gain setting of 50% was used for all muscles. The same leg muscles were examined as in the MRI protocol. All muscles were imaged at a fixed anatomical location matching the MRI protocol in the transverse plane. The transducer was held perpendicular to the underlying bone or fascia to create optimal image reflections. Each muscle was measured three consecutive times. All muscle ultrasound scans were performed by a trained technician during the baseline visit and by a trained physician (S.V.) during the follow-up visit.
The severity of muscle abnormality was visually graded by a neurologist with more than 20 years of ultrasound experience (N.v.A.) using the semi-quantitative Heckmatt scale (Heckmatt et al., 1982). The Heckmatt scale ranges from 1 (normal muscle) to 4 (severely abnormal muscle with absent bone reflection). In addition, a quantitative analysis of the mean muscle echogenicity was performed for each muscle (Wijntjes and van Alfen, 2021). An ROI was drawn in each image to include the maximum muscle area without artifacts; in case of visible attenuation, only the top 1/3 of the muscle was annotated. Subsequently, the mean pixel gray level was calculated using the 'histogram' function in a custom graphics software program called 'QUMIA' and averaged over the three images sampled from each muscle (Wijntjes and van Alfen, 2021). These average muscle echogenicities were then compared to muscle- and device-specific reference values from healthy controls, corrected for sex, age and weight, and subsequently converted to echogenicity z-scores (EZ-scores), as previously reported (Goselink et al., 2020). A muscle with an EZ-score below 2.0 is considered normal (Wijntjes et al., 2022). A change in EZ-score of ≥1.0 was considered comparable in size to an FF change of ≥10%, because 1.0 represents approximately 10% of the range of EZ-scores.
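The conversion from gray level to EZ-score can be sketched as a standard z-score. In this sketch the reference mean and standard deviation are passed in directly and are purely hypothetical; in the study they come from muscle- and device-specific reference data corrected for sex, age and weight, which are not reproduced here.

```python
def mean_echogenicity(gray_levels):
    """Average the mean gray level over the repeated images of one muscle."""
    return sum(gray_levels) / len(gray_levels)


def ez_score(mean_gray, ref_mean, ref_sd):
    """Echogenicity z-score: distance from the reference mean in reference SDs."""
    return (mean_gray - ref_mean) / ref_sd


def is_normal_ez(ez):
    """An EZ-score below 2.0 is considered normal."""
    return ez < 2.0


# Three repeated measurements of one muscle; reference values are hypothetical.
gray = mean_echogenicity([58.0, 60.0, 62.0])
ez = ez_score(gray, ref_mean=50.0, ref_sd=5.0)
print(ez, is_normal_ez(ez))
```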
To assess lower extremity muscle echogenicity for each participant and to compare ultrasound outcomes to clinical outcomes, two additional ultrasound variables were calculated: 1) the Heckmatt sum-score (Heckmatt-SS), calculated by summing the Heckmatt scores of all evaluated muscles for each participant; and 2) the compound echogenicity z-score (EZ-CoS), calculated by averaging the echogenicity z-scores of all evaluated muscles for each participant.

Clinical outcome measures
Manual muscle testing using the Medical Research Council (MRC) grading was performed in the knee flexors, knee extensors, ankle dorsiflexors and ankle plantar flexors. An MRC sum score was calculated (MRC-SS). A functional motor assessment of all patients was performed using the Motor Function Measure (MFM) (Bérard et al., 2005). The MFM consists of 32 items that are all scored using a 4-point scale, resulting in a maximum score of 96. The total score is expressed as a percentage of this maximum score and ranges from 0 to 100%, in which 100% is the highest score possible. FSHD disease severity was assessed using 2 scoring systems: 1) the FSHD-Clinical Severity Score (FSHD-CSS or 'Ricci score'), an 11-grade scale in which 0 indicates no muscle weakness and 10 indicates wheelchair dependence (Ricci et al., 1999); and 2) the FSHD clinical score (FSHD-CS), a sum score ranging from 0 (no symptoms) to 15 (highest disease severity) that scores the symptoms of FSHD patients in different body regions (Lamperti et al., 2010). All these clinical outcome measures were performed during both baseline and follow-up visits by two authors (S.V. and K.M.).
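The MFM percentage follows directly from the numbers given above (32 items, maximum total of 96). The sketch below assumes each item is scored 0 to 3, which is what a 4-point scale with a maximum of 96 implies; the function name is hypothetical.

```python
def mfm_percent(item_scores):
    """Express the MFM total as a percentage of the maximum score (32 items x 3 = 96)."""
    if len(item_scores) != 32:
        raise ValueError("the MFM has 32 items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item is scored 0, 1, 2 or 3")
    return 100.0 * sum(item_scores) / 96


# Illustrative participant: half of the items at the maximum, half at zero.
print(mfm_percent([3] * 16 + [0] * 16))
```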

Statistical analysis
Data analysis was performed using IBM SPSS Statistics 27 (IBM Corp, Armonk, NY, USA). We summarized non-normally distributed data as medians with quartiles and interquartile range (IQR), and normally distributed data as means with standard deviations (SD). To analyze the correlations between the muscle MRI results, muscle ultrasound results, and the clinical outcome measures used, we performed Spearman's rho analyses. We used pairwise deletion for any missing data in the correlation analyses. The Wilcoxon signed rank test was performed to test the differences between baseline and follow-up results for skewed data. Statistical significance was defined as p less than 0.05. Correction for multiple testing was done using the Bonferroni method, and the corresponding p-value threshold was set to 0.05/(number of hypotheses tested), varying from 0.01 to 0.003.
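The Bonferroni thresholds quoted above can be reproduced with simple arithmetic. The numbers of hypotheses used below (5 and 17) are assumptions chosen only because they yield thresholds of 0.01 and approximately 0.003; the source does not state the actual counts.

```python
def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def is_significant(p_value, n_tests, alpha=0.05):
    """True when p falls below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_alpha(n_tests, alpha)


# Assumed family sizes, for illustration only.
for n_tests in (5, 17):
    print(n_tests, round(bonferroni_alpha(n_tests), 3))
```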

Participants
Twenty of the 27 FSHD patients from the baseline study participated in this follow-up study. Of the remaining 7 patients from the baseline study, 1 had died, 2 participated in the follow-up study without MRI and/or ultrasound examination and 4 were lost to follow-up. Follow-up visits were planned at a mean (SD) of 4.9 (0.3) years after the baseline visit. Patient characteristics can be found in Table 1.

MRI fat fraction
The median [IQR] baseline MRI FF compound score (FF-CoS) was 17.8% [25.5] and the median [IQR] change in FF-CoS during follow-up was +2.8% [3.8] (p < 0.001) (Fig. 1; Table 2). Changes in FF-CoS correlated with the baseline FSHD-CSS and FSHD-CS (ρ = 0.560-0.650, p < 0.05), indicating that participants with higher baseline clinical severity scores had larger increases in FF-CoS over time. Changes in FF-CoS did not correlate with sex, age, duration of symptoms or D4Z4 repeat size (p ranging from 0.15 to 0.68).

Ultrasound variables
The median [IQR] 2). Median baseline and follow-up sum or compound ultrasound scores can be found in Supplementary Tables 1A and 1B, respectively; median baseline and follow-up Heckmatt and EZ-scores per muscle can be found in Supplementary Table 2.
The distribution of EZ-scores across all evaluated muscles shifted toward a higher number of (severely) affected muscles (in other words, higher EZ-scores) at follow-up (Fig. 1). Changes in sum or compound ultrasound variables did not correlate with sex, age, duration of symptoms, repeat size or baseline clinical outcome measures (p ranging from 0.050 to 0.964).

Relation between muscle ultrasound and muscle MRI
The Heckmatt-SS and EZ-CoS correlated highly and significantly with FF-CoS at baseline and follow-up (ρ between 0.86 and 0.87, p < 0.001; Table 3A). Similar to the results of the baseline cohort, the muscles with the highest MRI FFs often showed ultrasound EZ-scores decreasing towards normal (Fig. 2; see Fig. 3B for examples of matching muscle MRI and ultrasound images). The parabolic vertex, or the turning point of the correlation between FF and EZ-scores, seemed to be muscle dependent, with the lowest FF value for this point in the VL (at an FF of approximately 35%), and the highest value in the RF (at an FF of approximately 55%) (Fig. 2).
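As a sketch of how such a turning point can be read off a fitted curve, assume that each regression model is a second-order polynomial in FF. The coefficient symbols a, b and c are illustrative notation and do not appear in the original analysis.

```latex
\begin{align}
\mathrm{EZ}(\mathrm{FF}) &= a\,\mathrm{FF}^{2} + b\,\mathrm{FF} + c, \qquad a < 0 \\
\frac{d\,\mathrm{EZ}}{d\,\mathrm{FF}} &= 2a\,\mathrm{FF} + b = 0
\quad\Longrightarrow\quad
\mathrm{FF}_{\mathrm{vertex}} = -\frac{b}{2a}
\end{align}
```

Under this assumption, a muscle-dependent vertex simply reflects different fitted values of a and b per muscle.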
The 198 muscles reviewed in this study were divided into 4 groups based on whether their FF and EZ-scores matched or differed: 2 groups with matching FF and EZ-score at baseline (n = 167, 84%) and 2 groups showing differences between FF and EZ-scores (n = 31, 16%) (Fig. 3A). The grouping was based on FF and EZ-scores because of the discrepancies noted between the two methods in an earlier study comparing the two techniques (Mul et al., 2018). Each group is described in more detail below, with example muscle MRI and ultrasound images from all groups shown in Fig. 3B. For all 4 groups, median baseline MRI FF, ultrasound EZ-/Heckmatt-scores and median changes in MRI FF, ultrasound EZ-/Heckmatt-scores over 5 years' time can be found in Table 4.
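The grouping rule can be written down compactly using the normal cut-offs stated in the methods (FF below 10%, EZ-score below 2.0). The function name is hypothetical; the example values are the four baseline muscles reported in the caption of Fig. 3B.

```python
FF_NORMAL_BELOW = 10.0  # percent
EZ_NORMAL_BELOW = 2.0   # z-score


def imaging_group(ff_percent, ez):
    """Assign a muscle to one of the four baseline imaging groups."""
    ff_normal = ff_percent < FF_NORMAL_BELOW
    ez_normal = ez < EZ_NORMAL_BELOW
    if ff_normal and ez_normal:
        return 1  # matching: both normal
    if ff_normal:
        return 2  # different: EZ-score increased, FF normal
    if not ez_normal:
        return 3  # matching: both increased
    return 4      # different: EZ-score normal, FF increased


# Baseline (FF %, EZ-score) pairs taken from the Fig. 3B caption.
examples = [(5.4, 1.7), (7.5, 7.5), (51.2, 2.0), (17.9, 1.1)]
print([imaging_group(ff, ez) for ff, ez in examples])
```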

Group 1: Matching imaging outcomes, echogenicity z-score and fat fraction both in the normal range
Of all muscles studied (n = 198), 128 (64.6%) had an FF and EZ-score in normal ranges at baseline.
At five-year follow-up, those 128 muscles showed the following changes in FF, EZ-scores and TIRM positivity. A large increase in FF of ≥10% (ranging from +11.0% to +36.6%), leading to an FF in abnormal ranges, was found in 9/128 muscles (7%). The remaining muscles (93%) showed a relatively stable FF over time (ranging between −4.0% and +9.7%).
Five of the 128 muscles with normal values at baseline were TIRM positive at baseline, of which 4 showed a large increase in FF at follow-up of > 10% (80%). One of these 4 muscles showed a large increase of EZ-score at follow-up (+4.6; 20%). Twenty-one of the 128 muscles were TIRM positive at follow-up (16.4%).
Table 3. All correlations are corrected for multiple testing using the Bonferroni method; statistical significance is indicated with a *.

Group 2: Different imaging outcomes, echogenicity z-score increased with a normal fat fraction
Nine muscles (5%) showed a normal FF on MRI images, but an increased ultrasound EZ-score at baseline.
None of the 9 muscles was TIRM positive at baseline, but 4 were at follow-up (44.4%), including the 2 that showed a large increase in FF at follow-up.

Group 3: Matching imaging outcomes, both echogenicity z-score and fat fraction increased
Thirty-nine muscles had both an FF and EZ-score within abnormal ranges at baseline (19.7%).
Eight of these 39 muscles were TIRM positive at baseline (20.5%), of which two showed a large increase in FF at follow-up of > 10% (20%). A total of 12 of these 39 muscles were TIRM positive at follow-up (30.8%).

Group 4: Different imaging outcomes, echogenicity z-score normal with an increased fat fraction
Twenty-two muscles had a normal ultrasound EZ-score but showed an increased MRI FF at baseline (11%).
A large increase in EZ-score of ≥1.0 (ranging from +1.1 to +5.4) was found in 8/22 muscles (36.4%). In all 8 muscles this led to an abnormal EZ-score at follow-up and thus matching ultrasound and MRI results. A large decrease in EZ-score over time (ranging from −4.9 to −1.1) was seen in 5 of these 22 muscles (22.7%), whereas the remaining 9 muscles showed stable EZ-scores at follow-up (ranging from −0.5 to +0.6).
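The three change categories used above can be sketched as a simple threshold rule. The methods define only the ≥1.0 cut-off for a large change; treating a decrease of 1.0 or more as "large" by symmetry is an assumption of this sketch, as is the function name.

```python
def classify_ez_change(delta_ez, threshold=1.0):
    """Label an EZ-score change between baseline and follow-up."""
    if delta_ez >= threshold:
        return "large increase"
    if delta_ez <= -threshold:  # symmetric cut-off is an assumption
        return "large decrease"
    return "stable"


# Boundary values of the ranges reported for the 22 muscles in this group.
for delta in (1.1, 5.4, -4.9, -1.1, -0.5, 0.6):
    print(delta, classify_ez_change(delta))
```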

Heckmatt data
The median [IQR] change in Heckmatt score of muscles in all the aforementioned groups was 0 [0] (Table 4). Matching Heckmatt scores per group of all assessed muscles can be found in Supplementary Results.

Correlations between muscle imaging variables and clinical outcome measures
All clinical outcome measures and imaging sum or compound scores (Heckmatt-SS, EZ-CoS and FF-CoS) showed strong cross-sectional correlations at baseline and at follow-up (ρ between 0.64 and 0.92; p < 0.05) (Table 3, panels B-D). Changes in all clinical outcome measures except the FSHD-CS also correlated strongly with all baseline imaging sum or compound scores (ρ between 0.52 and 0.78; p < 0.05) (Table 3E): the higher the imaging sum or compound score at baseline, the larger the change in clinical outcome measure. The largest changes in MFM were seen in participants with a baseline Heckmatt-SS of 22 or higher, an EZ-CoS of 0.8 or higher, or an FF-CoS of 30 or higher.

Asymptomatic/non-penetrant participants
All muscles of the one asymptomatic and one non-penetrant participant in our cohort had a normal FF and normal EZ-scores at baseline and five-year follow-up. Only one of the muscles of the asymptomatic participant showed a large increase in EZ-score of 1.2 over time, but the EZ-score at follow-up remained normal (−0.3). This muscle also showed a stable FF over time (baseline FF 5.7; follow-up FF 5.2). All muscles from the non-penetrant participant had stable FF and EZ-scores at five-year follow-up. No TIRM positive muscles were found in these two participants. Of note, excluding these two participants from the analyses in 3.2 and 3.3 slightly increased the compound FF and EZ changes over time, but did not otherwise change the outcome.

Discussion
This study in FSHD has shown that in muscles that appear healthy at baseline, muscle ultrasound more often shows large increases in echogenicity z-score (EZ-score) than MRI does in fat fraction (FF) at follow-up, suggesting that in these muscles muscle ultrasound better captures deterioration over time. Muscles that only showed subtle early abnormalities on ultrasound images at baseline often showed progressing muscle pathology at follow-up, with both muscle edema and fat replacement detected by MRI. At the other end of the clinical spectrum, normal EZ-scores were found in intermediately to severely fat-replaced muscles with high FF, indicating that muscle MRI better identified these muscles. Intermediately to severely affected muscles also showed varying EZ-score progression with stable or increasing FF, showing that MRI was also better at detecting deterioration in these muscles, even though the changes we found over time were small (Fig. 3).
We will elaborate on our imaging findings in the context of progressing muscle pathology in FSHD as described by several muscle biopsy studies and visualized in Fig. 4 (Lassche et al., 2020; Neuromuscular, 2022; Statland et al., 2015). Imaging and biopsy studies agree that muscle edema (or inflammation) is most evident in early to intermediate disease stages, while fat replacement is mostly found in intermediate to late disease stages. Muscle fibrosis has been studied less because it is difficult to visualize using conventional MRI techniques (Carlier et al., 2016; Marty et al., 2023), and ultrasound cannot distinguish it from other muscle changes. However, several muscle biopsy studies have highlighted the relevance of fibrosis in FSHD muscle pathology (Bosnakovski et al., 2022; Di Pietro et al., 2022; Serra and Wagner, 2020) and showed that the first signs of fibrosis are found together with early muscle changes that precede fat replacement, such as increased muscle fiber size variation, internal nuclei and the first inflammatory infiltrates (Fig. 4) (Lassche et al., 2020; Neuromuscular, 2022; Statland et al., 2015). These early muscle changes increase the number of tissue reflections in muscles and are likely the cause of the subtle structural muscle abnormalities that we found only on the ultrasound images. Our results show that 2 out of the 9 muscles with only subtle ultrasound abnormalities at baseline showed large increases in fat replacement at follow-up (22%). This risk of deterioration is similar to the risk of muscles that showed abnormalities on both ultrasound and MRI at baseline (23%), while the muscles that appeared normal at baseline on both imaging modalities had a relatively large chance of remaining normal over time (93%). This suggests that these early muscle changes, likely including muscle fibrosis, are the starting point of further pathophysiological changes in FSHD muscles and may indicate that targeting these early processes can influence subsequent muscle inflammation and fatty replacement (Bosnakovski et al., 2017; Bosnakovski et al., 2022; Bosnakovski et al., 2020; Lassche et al., 2020).
Table 4. Muscle subgroup characteristics, ranked by matching and different baseline MRI fat fraction (FF) and ultrasound echogenicity z-scores (EZ-scores) (as described in the text in section 3.4 and visualized in Fig. 3).
Our results also substantiate the evidence that TIRM positivity is a risk factor for faster fat replacement in FSHD (Dahlqvist et al., 2020a; Ferguson et al., 2018; Vincenten et al., 2023). In our cohort, 38% of all TIRM positive muscles had FF increases > 10% over time. That risk of deterioration seems to be higher than the risk of muscles showing early muscle changes only on muscle ultrasound images in our cohort (22%). Considering that the muscles showing early muscle changes were TIRM negative, these early changes and TIRM positivity seem to be independent risk factors for muscle deterioration and could be of complementary prognostic use for the selection of relevant muscles and/or participants in future clinical trials.
All muscles showed either stable or increasing FFs at follow-up, indicating that FF is an excellent measure of disease progression once muscle fat replacement has begun. This is in line with recent literature (Leung, 2018; Wang et al., 2021) and suggests that FF is useful as a monitoring biomarker in FSHD. While most muscles with an increased FF at baseline also showed stable (44%) or increasing EZ-scores (23%) at follow-up, a considerable number of muscles (33%) showed decreasing EZ-scores over time. This finding fits the expected inverted U-shaped relation between fat replacement and echogenicity (Mul et al., 2018), which is caused by progressing muscle pathology in FSHD (Fig. 4). This finding probably also explains some of the lack of sensitivity to change over time of the EZ-score in our study. This is different from what two other similarly sized longitudinal cohort studies found during the shorter time course of one year (Dijkstra et al., 2021; Goselink et al., 2020). These two studies contained only or mostly children with FSHD, making it likely that they observed more early-stage muscle changes and fewer end-stage affected muscles on the descending limb of the inverted U. These earlier studies also used a different protocol that included upper-extremity and trunk muscles, and found that the muscles most sensitive to change were the rectus femoris, rectus abdominis and trapezius muscles (Goselink et al., 2020). These muscles are often affected early in the disease course of FSHD (Leung et al., 2015). Our protocol included only lower extremity muscles, which are mostly (with the exception of the rectus femoris) affected at later disease stages, and most participants were in the intermediate-to-late disease stage. This suggests that quantitative muscle ultrasound is best used to track disease progression in children and in patients in the early disease stages (even though individual muscles can be extensively fat replaced relatively early in the disease).
Different muscles were found to have different shapes, i.e. slopes, of their inverted-U relation between fat replacement and echogenicity, suggesting that some muscles, once they start degenerating, go through this process relatively quickly, while others may be more resistant to pathologic change. It also suggests that some muscles are more severely affected by fat replacement or are more rapidly replaced by fat, whereas others are more affected by muscle fibrosis. This is of consequence for clinical trial design: if, for example, an intervention mainly inhibits muscle fibrosis, muscles that are known to be more affected by fibrosis should be prioritized in the imaging protocol. Notably, our Heckmatt results did show small but significant increases over five years, and a strong correlation to the MR FF outcome in late disease stages (Fig. 3A). This may suggest that Heckmatt grading is the optimal scoring system for muscle ultrasound, but the scale also has serious disadvantages: it is an ordinal scale with only four grade options and clinimetric limitations (Wijntjes, unpublished data), which means that muscle changes may be under- or overestimated when Heckmatt grading is used longitudinally (Goselink et al., 2020). We therefore recommend always using both visual evaluation and quantified EZ-scores in longitudinal studies or trials to optimize the information gathered from muscle ultrasound.
Fig. 4 (caption, partially recovered): (Lassche et al., 2020; Neuromuscular, 2022; Statland et al., 2015). The added dashed lines are the same lines as visualized in Fig. 3.
Our results confirm that changes in clinical outcome measures were related to baseline ultrasound sum or compound scores, indicating that ultrasound outcome seems to predict clinical deterioration and may be suitable as a prognostic biomarker. The use of ultrasound as a prognostic biomarker is likely better reserved for the early to intermediate disease stages, because of the parabolic clinical progression rate in FSHD: the more severely affected a patient gets, the more the progression rate decreases. Similar to the longitudinal results from our quantitative MRI study in FSHD (Vincenten et al., 2023), participants with certain baseline ultrasound results seem most prone to clinical deterioration: a Heckmatt-SS of ≥22 or an EZ-CoS of ≥0.8. Muscle ultrasound has already proven its diagnostic use for FSHD (Boon et al., 2021; Vincenten, 2023), but its prognostic use would establish ultrasound as a screening tool for patient inclusion in clinical trials.
The most important limitation of our study is the small size of our cohort, though most muscle ultrasound studies in FSHD had similar cohort sizes and we were able to include the full spectrum of disease severity. Furthermore, our muscle selection was limited to lower extremity muscles. It remains unclear if our findings can be generalized to upper extremity and truncal muscles, which is relevant considering that these muscles are often affected earlier than lower extremity muscles in FSHD. Also, we would reconsider selecting the peroneus tertius muscle, because it is rarely studied in FSHD research. These limitations need to be addressed in future research.
In conclusion, our results indicate that muscle ultrasound can be advantageous in early disease stages in FSHD and that its outcome may predict an acceleration of the disease progression, while current muscle MRI protocols are best suited for the intermediate and late disease stages. Multicenter large cohort data are needed to confirm our results.

Fig. 1. Distribution of all examined muscles at baseline and five-year follow-up (n = absolute number of muscles assessed and '%' = relative number of muscles assessed) for: A. Ultrasound Heckmatt scores; B. Ultrasound echogenicity z-scores (EZ-scores); C. MRI fat fraction (FF). The changes between baseline and follow-up are most evident in the FF distribution map, as expected based on Wilcoxon analysis (Results section 3.2).

Fig. 2. Correlation between MRI fat fraction (FF) and ultrasound echogenicity z-scores (EZ-scores) showing the parabolic curve for multiple muscles at baseline and follow-up: rectus femoris (A), gastrocnemius medial head (B), vastus lateralis (C) and tibialis anterior (D). The peroneus tertius muscle showed a curve similar to the tibialis anterior. The dotted lines visualize the parabolic relations (regression models) between FF and EZ-score for each muscle of the right and left leg, crossed by the dashed line in the vertex or turning point of this relation. Regression model R² values for right and left leg for each muscle: A. Rectus femoris: left leg R² = 0.49, right leg R² = 0.66; B. Gastrocnemius medial head: left leg R² = 0.46, right leg R² = 0.57; C. Tibialis anterior: left leg R² = 0.43, right leg R² = 0.56; D. Vastus lateralis: left leg R² = 0.50, right leg R² = 0.42.
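
To illustrate the kind of parabolic regression described in this legend, the following minimal sketch fits a second-order polynomial to paired FF and EZ-score values and locates the vertex (turning point) of the fitted curve. It is not the analysis pipeline used in this study: the data are synthetic, and the function names (`fit_quadratic`, `vertex`, `r_squared`) are hypothetical and chosen for illustration only.

```python
# Illustrative sketch only: synthetic values, not data from this study.
# All function and variable names are hypothetical.

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadratic(x, y):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    n = len(x)
    s1 = sum(x)
    s2 = sum(v ** 2 for v in x)
    s3 = sum(v ** 3 for v in x)
    s4 = sum(v ** 4 for v in x)
    t0 = sum(y)
    t1 = sum(u * v for u, v in zip(x, y))
    t2 = sum(u ** 2 * v for u, v in zip(x, y))
    # Normal equations of the least-squares problem: m * [a, b, c] = rhs
    m = [[s4, s3, s2], [s3, s2, s1], [s2, s1, n]]
    rhs = [t2, t1, t0]
    d = det3(m)
    if d == 0:
        raise ValueError("need at least three distinct x values")
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column by the right-hand side
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        coeffs.append(det3(mc) / d)
    return tuple(coeffs)


def vertex(a, b, c):
    """Turning point (x, y) of the parabola y = a*x**2 + b*x + c."""
    if a == 0:
        raise ValueError("not a parabola (a == 0)")
    return -b / (2 * a), c - b ** 2 / (4 * a)


def r_squared(x, y, coeffs):
    """Coefficient of determination of the fitted parabola."""
    a, b, c = coeffs
    mean_y = sum(y) / len(y)
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((v - (a * u ** 2 + b * u + c)) ** 2 for u, v in zip(x, y))
    if ss_tot == 0:
        raise ValueError("y has no variance")
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic inverted-U relation between FF (%) and EZ-score
    ff = [0, 10, 20, 30, 40, 50, 60, 70, 80]
    ez = [6 - 0.005 * (v - 40) ** 2 for v in ff]
    coeffs = fit_quadratic(ff, ez)
    print("vertex (FF, EZ-score):", vertex(*coeffs))
    print("R squared:", r_squared(ff, ez, coeffs))
```

In such a sketch, a negative leading coefficient corresponds to an inverted-U shape, and the FF value at the vertex marks the point beyond which further fat replacement is associated with a lower rather than a higher EZ-score. Fitting each muscle and leg separately, as in the legend above, would yield one vertex and one R² value per curve.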

Fig. 3. A. Schematic visualizing the hypothesized degree of muscle abnormality in lines as quantified using different imaging techniques for progressing disease stages: muscle MRI fat fraction (FF) and semi-quantitative muscle ultrasound Heckmatt as well as quantitative ultrasound echogenicity z-scores (EZ-scores). Below the figure, all muscles are divided into four groups, based on matching or different baseline imaging FF, Heckmatt and EZ-scores (matching the text in the Results section, 3.4). Group numbers correspond to the text in 3.4, Table 4 and Fig. 3B. The bars in the figure represent the relative median values (in percentages of abnormality) of the baseline results of all different imaging techniques for each group in this cohort. B. Example muscle images showing one muscle from each of the four groups based on matching or different MRI FF and ultrasound EZ-scores (matching Fig. 3A and the text in the Results section, 3.4). Visualized here are both baseline and follow-up images of muscle ultrasound and chemical-shift-based water-fat-separation or 'Dixon' muscle MRI FF maps. Group 1, right medial head of the gastrocnemius: Normal EZ-score, Heckmatt and FF at baseline (EZ-score 1.7; Heckmatt 1; FF 5.4%) and follow-up (EZ-score 1.3; Heckmatt 1; FF 6.5%). Group 2, left medial head of the gastrocnemius: Abnormal EZ-score (7.5) and Heckmatt (3), but normal FF (7.5%) at baseline; Abnormal EZ-score (3.2), Heckmatt (3) and increasingly abnormal FF (21.3%) at follow-up. Group 3, right tibialis anterior: Abnormal EZ-score (2.0), Heckmatt (3) and FF (51.2%) at baseline; Decreasing EZ-score (1.6) and stable Heckmatt (3) and FF (50.5%) at follow-up. Group 4, left vastus lateralis: Abnormal FF (17.9%) and Heckmatt (2), but normal EZ-score (1.1) at baseline; Increasing FF (29.6%), Heckmatt (3) and EZ-score (3.5) at follow-up.

Fig. 4. Disease progression in FSHD based on muscle biopsy pathology results, schematically visualized. Listed are characteristics of normal muscle tissue, early stage disease pathology and late stage disease pathology (Lassche et al., 2020; Neuromuscular, 2022; Statland et al., 2015). The added dashed lines are the same lines as visualized in Fig. 3, representing the hypothesized degree of muscle abnormality as quantified using different imaging techniques (here: ultrasound echogenicity z-score (EZ-score) in black and MRI fat fraction (FF) in dark grey) for progressing disease stages. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Table 1
Patient characteristics at follow-up.

Data are expressed as mean (standard deviation) (range) or median [interquartile range] (range), except where noted.

Table 2
Change in muscle ultrasound and muscle MRI sum-/compound scores and clinical outcome measures.