Repeatability and reproducibility of longitudinal relaxation rate in 12 small-animal MRI systems

Background: Many translational MR biomarkers derive from measurements of the water proton longitudinal relaxation rate R1, but evidence for between-site reproducibility of R1 in small-animal MRI is lacking.
Objective: To assess R1 repeatability and multi-site reproducibility in phantoms for preclinical MRI.
Methods: R1 was measured by saturation recovery in 2% agarose phantoms with five nickel chloride concentrations in 12 magnets at 5 field strengths in 11 centres on two different occasions within 1–13 days. R1 was analysed in three different regions of interest, giving 360 measurements in total. Root-mean-square repeatability and reproducibility coefficients of variation (CoV) were calculated. Propagation of reproducibility errors into 21 translational MR measurements and biomarkers was estimated. Relaxivities were calculated. Dynamic signal stability was also measured.
Results: CoV for day-to-day repeatability (N = 180 regions of interest) was 2.34% and for between-centre reproducibility (N = 9 centres) was 1.43%. Mostly, these do not propagate to biologically significant between-centre error, although a few R1-based MR biomarkers were found to be quite sensitive even to such small errors in R1, notably in myocardial fibrosis, in white matter, and in oxygen-enhanced MRI. The relaxivity of aqueous Ni2+ in 2% agarose varied between 0.66 s−1 mM−1 at 3 T and 0.94 s−1 mM−1 at 11.7 T.
Interpretation: While several factors affect the reproducibility of R1-based MR biomarkers measured preclinically, between-centre propagation of errors arising from intrinsic equipment irreproducibility should in most cases be small. However, in a few specific cases exceptional efforts might be required to ensure R1-reproducibility.


Introduction
Many useful MR biomarkers derive from measurements of the water proton longitudinal relaxation time T 1 , or alternatively the relaxation rate R 1 ≡ T 1 −1 . Errors in R 1 [1] are common, will propagate, and may damage the reproducibility and accuracy of the resulting MR biomarkers. Although considerable effort has been devoted to measuring and assuring the accuracy of R 1 in clinical MR systems [2][3][4][5], there is little evidence for the cross-site reproducibility of R 1 measurements in MR systems designed for small-animal research. The lack of standardisation in preclinical imaging has been recognised as an important problem [6,7] which in the worst case could invalidate the findings from animal studies, or confound meta-analyses and translation.
Reproducibility in a valid phantom is an important and ethical prerequisite for reproducible values in vivo. Poor technical validation has been a major impediment to clinical translation of MR biomarkers [8]. An ideal R 1 phantom should be traceable [2]; resist biological, chemical and physical deterioration; perform effectively over a range of temperatures convenient and relevant for the users; cover the parameter range expected in subsequent studies; not exhibit physiologically unrepresentative MR characteristics such as radiation damping, convection, unphysiologic T 2 , excessive self-diffusion, off-resonance chemical shifts, standing waves, or abrupt boundaries; interrogate the entire volume subsequently to be occupied by body parts being imaged; have dimensions suitable for the subject subsequently to be imaged (in this case rats and mice); be convenient for the intended users; and be cost-effective for the intended users. To meet these criteria, nickel agarose phantoms following the design of Christoffersson et al. [9] were used.
Two distinct general approaches to MR standardisation have previously been employed. In the first, which we term "centrally-led", a central organisation, often independent of the participating sites, is accountable for overall measurement accuracy and reproducibility. They mandate the phantom and acquisition protocol and analyse centrally. They may perform set-up and training at each participating site, instruct sites to repeat aberrant measurements, or even expel sites who cannot achieve the required accuracy. Centrally-led standardisation is common in clinical trials performed to ICH GCP [10,11], or where the MR measurement is regulated as a companion diagnostic [12]. In the second approach, which we term "institution-led", each investigator is accountable for measurement accuracy in their own centre. They are responsible for their own acquisition and analysis, and for compliance with any guidelines for their chosen phantom. "Institution-led" standardisation is common in academic research and in single-centre studies. Although we expect "centrally-led" standardisation to provide better reproducibility than "institution-led" standardisation, in this work we modelled "institution-led" standardisation as this is more representative of practice in preclinical MR. The study was performed within an international consortium of imaging centres participating in the validation of imaging biomarkers [13], and developing reliable preclinical MR assays which would give comparable results in different laboratories. The aim of this work was to assess the repeatability and reproducibility of R 1 in a realistic rodent MR protocol. Simple simulations were performed in order to compare the likely propagation of reproducibility errors into a broad range of R 1 -derived MR biomarkers.

Preclinical phantom
Batches of 2% agarose with nickel chloride concentrations respectively of 0.50, 1.04, 2.02, 4.08 and 8.05 mM, with 0.05% sodium azide, were prepared centrally in Berlin and used to create identical phantoms (Supplementary Fig. S3.1) which were distributed to the participating laboratories. The phantoms were prepared and authenticated (supplementary material S3) in July 2017, shipped in August 2017, and the measurements were performed between December 2017 and February 2018.

MR methods
Thirteen centres involved in an international consortium for the validation of imaging biomarkers for drug safety assessment [13] were invited to participate. Where centres had access to more than one MR system, they were invited to submit data from multiple MR systems. Eleven centres agreed to participate, one of which (G) provided data from two different magnets (G1 and G2): in the analyses, G1 and G2 were treated as if from two different centres. Details of the 12 MR systems are given in Table 1. Eleven of the 12 MR systems (all except B) were in laboratories which regularly and routinely measure MR biomarkers in rodents, intending to translate their findings to create diagnostics or therapeutics to improve human health. Although the use of any particular manufacturer's equipment was not mandated, all participating centres elected to employ Bruker Avance/ParaVision systems. An "institution-led" approach to standardisation was adopted. Pilot studies were performed only in centres B and G. No site training was performed, no quality control was imposed, nor were sites permitted to repeat their measurements to eliminate apparent outliers. Region-of-Interest (RoI) definition and T 1 calculation were performed locally.
Centres were asked to measure R 1 by saturation recovery using a standard RARE sequence. (Additional measurements using an investigational fast steady-state free-precession (FISP) sequence designed for the consortium's in vivo needs will be reported elsewhere). In an attempt to provide temperature stability and minimise susceptibility artefacts, each phantom was embedded in a cucumber (Supplementary Figs. S3.2 and S3.3). Centres were instructed to "allow the five cucumbered phantoms to come to thermal equilibrium in the magnet bore…[and] measure the temperature of the cucumber flesh in several places and verify thermal equilibrium has been reached." The temperature of the cucumber flesh around the phantom was measured before and after each acquisition. The entire protocol was run in each centre on two separate days, mean 2.7 days apart (range 1-13).
In ParaVision, the standard RARE T 1 saturation-recovery measurement method "T1map_RARE protocol" (Rat/Head/Relaxometry) was invoked. All images were coronal with 58 × 58 mm field of view, 128 × 128 matrix, with a π/2 for 1.16 mm slice selection followed by a π train with RARE factor 8, effective echo time 30 ms, echo spacing 7.5 ms. Signal averaging was not employed and 5 dummy scans were used. Saturation recovery experiments used repetition times TR of 5500, 2000, 1200, 750, 500, 300, 200 and 100 ms giving a scan time of 169 s, not including the dummy scans. Next, a "dynamic-no-enhancement" (DNE) stability series to simulate dynamic contrast-enhanced MRI was run for 5 min (approximately 34 images) with repeated acquisition using the same parameters but with TR fixed at 500 ms.
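The quoted scan time can be checked from the protocol parameters. The sketch below assumes a simple timing model that is not stated in the text: each image needs matrix / RARE factor excitations, and each excitation takes one TR. The function name is illustrative only.

```python
# Timing check for the saturation-recovery series, using the parameters quoted above.
TR_MS = [5500, 2000, 1200, 750, 500, 300, 200, 100]  # repetition times, ms
MATRIX_PHASE = 128  # phase-encode lines per image
RARE_FACTOR = 8     # k-space lines acquired per excitation


def saturation_recovery_scan_time_s(tr_ms, n_phase, rare_factor, n_averages=1):
    """Total acquisition time in seconds, excluding dummy scans.

    Assumed model: excitations per image = n_phase / rare_factor,
    and each excitation occupies exactly one TR.
    """
    excitations_per_image = n_phase // rare_factor
    return sum(tr_ms) * excitations_per_image * n_averages / 1000.0


scan_time = saturation_recovery_scan_time_s(TR_MS, MATRIX_PHASE, RARE_FACTOR)
print(f"{scan_time:.1f} s")  # prints "168.8 s", consistent with the quoted 169 s
```

Under this model the eight TR values sum to 10.55 s and each image needs 16 excitations, which reproduces the stated scan time to within rounding.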

Analyses
Each centre conducted measurements independently and was blinded to findings from the other centres until their own results had been submitted. At each centre, T 1 values were obtained using a 2-parameter fit in ParaVision from circular 25 mm 2 RoIs, i.e. 29 μl volumes, approximately 120 voxels, at three RoI positions. These were: at the isocentre; radially at the edge of the phantom 10 mm from isocentre; and axially at the end of the phantom 12-20 mm from isocentre, denoted respectively by (X,Y,Z) = (0,0,0), (10,0,0) and (0,0,12) mm. The 2-parameter fit assumed zero longitudinal magnetisation at the mid-point of the eighth echo. The resulting T 1 values and standard deviation of the fit for each RoI, together with the mean and standard deviation DNE signal for (X,Y,Z) = (0,0,0), were submitted to the core lab in Manchester for further analysis.
At the core lab, root-mean-square (rms) within-centre R 1 repeatabilities and between-centre reproducibilities were calculated using Microsoft Excel. Each calculation was performed both using absolute units (i.e. standard deviations with units s −1 ), and using coefficients of variation (CoV, dimensionless, presented as percentages). This was done because absolute R 1 units (s −1 ) propagate to absolute concentration of relaxive substance and in some instances to absolute biomarker value, while coefficients of variation may be more relevant when biomarker change is considered. Post-hoc tests of significance were made for "effect of day" using Student's t-test, and for "effect of RoI position" by analysis of variance. No correction for multiple comparisons was made but p < 0.01 was considered significant. For each centre, weighted mean R 1 values were calculated for each of the five phantoms and regressed linearly against nickel concentration:

$$R_{1} = r_{1,B_0}\,[\mathrm{Ni}^{2+}] + R_{1,[\mathrm{Ni}]=0,B_0} + \varepsilon \qquad (1)$$

where r 1, B 0 /s −1 • mM −1 is the longitudinal relaxivity of aqueous Ni 2+ in 2% agarose at field B 0 , R 1, [Ni]=0, B 0 /s −1 is the longitudinal relaxation rate of 2% agarose at field B 0 , and ε is a normally-distributed error term assumed to subsume inter alia any temperature effects.
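The calculations described above were done in Excel and the exact formulae are not given, so the following is only a minimal sketch under stated assumptions: repeatability is taken from day 1/day 2 pairs using the within-pair standard deviation |x1 − x2|/√2, CoV is that SD divided by the pair mean, and relaxivity is the slope of an ordinary (unweighted) least-squares line of R 1 against [Ni 2+ ]. The function names and the R 1 values are hypothetical; only the five concentrations come from the text.

```python
import math


def rms(values):
    """Root mean square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def paired_repeatability(day1, day2):
    """Return (rms SD, rms CoV) for paired repeat measurements.

    Assumption: for two repeats the within-pair SD is |x1 - x2| / sqrt(2),
    and the CoV is that SD divided by the pair mean.
    """
    sds = [abs(a - b) / math.sqrt(2) for a, b in zip(day1, day2)]
    covs = [sd / ((a + b) / 2) for sd, a, b in zip(sds, day1, day2)]
    return rms(sds), rms(covs)


def linear_fit(x, y):
    """Ordinary least-squares (slope, intercept) of y against x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Nickel concentrations from the phantom description, mM.
ni_mM = [0.50, 1.04, 2.02, 4.08, 8.05]
# Hypothetical R1 values (s^-1): 0.5 s^-1 agarose background, 0.8 s^-1 mM^-1 relaxivity.
r1_day1 = [0.5 + 0.8 * c for c in ni_mM]
# Hypothetical second day, 2% higher throughout.
r1_day2 = [1.02 * v for v in r1_day1]

sd_rms, cov_rms = paired_repeatability(r1_day1, r1_day2)
relaxivity, r1_zero = linear_fit(ni_mM, r1_day1)
print(f"rms CoV = {100 * cov_rms:.2f}%, r1 = {relaxivity:.3f} s^-1 mM^-1, "
      f"intercept = {r1_zero:.3f} s^-1")
```

In this form the slope and intercept correspond to r 1,B 0 and R 1,[Ni]=0,B 0 respectively; a weighted regression, as implied by the weighted means in the text, would replace the plain sums with weighted sums.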

Cross-validation
Our "institution-led" study design required each centre to derive its own T 1 values. Since centres elected to use the proprietary ParaVision software, a small supplementary study was also performed using an alternative analysis to verify values. Data from one centre were reanalysed. Centre A's data were considered a good test set because they submitted data with both high and low fit errors. For each of the 10 RARE data sets (5 phantoms × 2 days), and for the same three RoIs used in the primary analysis, signal mean and standard deviation were retrieved for each TR value. R 1 was calculated using "R" [14] using four expressions of the form: For three-parameter fits, Minf, M0 and R1 were fitted, while for two-parameter fits M0 was set to zero. For weighted fits, each RoI value y was weighted by w, the inverse of the variance in y, while for unweighted fits w was set to unity. For each of the 30 data sets, each of the four estimates of R 1 from "R", R 1 R , was compared with the reciprocal T 1 from Paravision, R 1 PV . In each of the four cases:

Illustrative simulations
Error propagation associated with two standard deviations of R 1 reproducibility was estimated for a range of derived measurements and biomarkers, using representative relaxivities and other parameters from the literature. This is conservative as it does not fully eliminate repeatability error. Three general cases were considered: firstly, native R 1 (or T 1 ) used as a biomarker; secondly, concentration of endogenous or exogenous paramagnetic substance used as biomarker; and thirdly, biomarkers derived from compartmental models. For Dynamic Contrast-Enhanced (DCE) MRI, the error in precontrast R 1 was propagated into the biomarkers for four preclinical case-studies. Representative 'true' values of kinetic parameters, pre-contrast R 1 values, and appropriate tracer kinetic models were chosen from literature to estimate contrast agent concentration uptake in each tissue type. Simulation parameters are provided in Supplementary Material.
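For the first two cases, the propagation is simple enough to sketch directly. The example below is illustrative only and is not the simulation code used in this work: it assumes that a concentration error is the R 1 error divided by the relaxivity, and that native R 1 carries the CoV unchanged. The R 1 reproducibility values (0.031 s −1 and 1.43%) are those reported in the Results; the relaxivities are hypothetical round numbers, and the function name is invented for illustration.

```python
# Reported between-centre reproducibility (all RoIs), taken from the Results.
R1_REPRODUCIBILITY_SD = 0.031   # s^-1
R1_REPRODUCIBILITY_COV = 1.43   # percent
N_SD = 2                        # errors are propagated at two standard deviations


def concentration_error_uM(r1_sd, relaxivity, n_sd=N_SD):
    """Concentration error in micromolar for a substance of given relaxivity.

    Assumes the linear relation delta_C = delta_R1 / r1, with relaxivity
    in s^-1 mM^-1 and r1_sd in s^-1.
    """
    return n_sd * r1_sd / relaxivity * 1000.0


# Case 1: native R1 used directly as the biomarker.
native_r1_error_pct = N_SD * R1_REPRODUCIBILITY_COV

# Case 2: concentration of a paramagnetic substance.
# Relaxivities below are hypothetical values chosen only to span a wide range.
for label, relaxivity in [("low-relaxivity substance", 0.01),
                          ("typical contrast agent (assumed)", 4.0)]:
    err = concentration_error_uM(R1_REPRODUCIBILITY_SD, relaxivity)
    print(f"{label}: r1 = {relaxivity} s^-1 mM^-1 -> error of about {err:.1f} uM")

print(f"native R1: error of about {native_r1_error_pct:.2f}%")
```

The inverse dependence on relaxivity is the point of the sketch: the same R 1 error is a far larger concentration error for a weakly relaxive substance than for a strongly relaxive one.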

Results
Each centre was requested to submit 30 R 1 measurements (5 phantoms × 3 locations × 2 days), the results of 10 DNE runs (5 phantoms × 2 days), and the 10 associated temperature measurements (5 phantoms × 2 days). The quality of the exponential fits for the 8 TR values was generally good, although in 15/360 cases the fit error was worse than 5% (9 cases in centre G2, 3 cases in centre A and 3 cases in centre L) (see Fig. 1). All these outliers were included in the analysis and not eliminated. One centre (J) did not provide DNE or temperature measurements in a suitable format, so its results were omitted from any analyses that needed those data. For the other centres, temperatures were recorded to ±0.1°C: the mean was 19.3°C (SD 1.3), the mean deviation in temperature between day 1 and day 2 was 0.65°C, and the worst deviation 5°C (centre B, 0.5 mM phantom). Table 2 provides mean values. Fig. 2 shows the field dependence of r 1 from this work, with additional data points added from the literature [3,9,15,16]. Table 3 shows repeatability and reproducibility. Day-to-day repeatability ranged from 0.025 s −1 (centre D) to 0.097 s −1 (centre A): day-to-day repeatability CoV ranged from 0.76% (centre F) to 5.48% (centre L). In exploratory analysis, the day-to-day repeatability of 2.34% was not markedly improved either if measurements were

Fig. 1. R 1 measurements (logarithmic axis) for each of centres A-L. Each centre made measurements on five 2% agarose phantoms with different Ni 2+ concentrations. The six horizontal lines represent R 1 values calculated from the field-dependent relaxivities as explained in Table 2. There are two groups of three data points for each phantom at each centre representing, respectively, days 1 and 2, and RoIs (X,Y,Z) = (0,0,0), (10,0,0) and (0,0,12). Error bars are T 1 fit errors from ParaVision.

Table 2
Relaxation rates R 1 and relaxivities r 1 . At each centre R 1 (measured) represents the weighted mean of the six measurements (2 days × 3 positions), while R 1 (fitted, 0.00 mM) and r 1 are respectively the intercept and slope of a linear regression of R 1 against [Ni 2+ ]. At 4.7 T and 7 T, where measurements were made at multiple centres, the SD is also given.

For day-to-day repeatability, 2 centres (B, L) showed a statistically significant effect of day, and 4 centres (D, E, G2, K) showed a statistically significant effect of RoI position. Dynamic (DNE) signal stability CoV varied between 0.30% (centre C) and 2.1% (centre L), and in exploratory analyses was not found to be associated with B 0 , nor with repeatability, nor with the T 1 fit error. Between-centre reproducibility of R 1 was measured for the 5 phantoms at both 4.7 T and 7 T. Least reproducible, on a CoV basis, was the 0.5 mM phantom at 4.7 T (2.94%, N = 4 centres) or, on an absolute units basis, the 8 mM phantom at 7 T (0.065 s −1 , N = 5 centres). In exploratory analysis, reproducibility was not improved if measurements were restricted to the isocentre (all-RoIs rms reproducibility was 0.031 s −1 or 1.4% while isocentre rms reproducibility was 0.064 s −1 or 1.6%). A measure of the linearity of R 1 as a biomarker over the range 0.8-8 s −1 was obtained from the relaxivity Eq. (1): the rms standard error of r 1,B 0 was 0.6% (range 0.2% in centre B, to 1.7% in centre L, N = 12 centres).

Comparison of analysis algorithms
R 1 values for centre A derived from two-parameter fits performed in "R" and in ParaVision were close: mean differences were 0.024% for an unweighted fit and 0.26% for a weighted fit. When three-parameter fits performed in "R" were compared with two-parameter fits performed in ParaVision, disagreement was greater: 1.67% for an unweighted fit and 1.74% for a weighted fit. Bland-Altman style plots are provided in Supplementary Fig. S5.
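To illustrate how a two-parameter and a three-parameter fit can disagree, the sketch below fits both models to synthetic, noise-free data. It is neither the "R" code used for the cross-validation nor the ParaVision algorithm. It assumes the conventional saturation-recovery model S(TR) = Minf − (Minf − M0)·exp(−R1·TR), uses unweighted least squares, and solves the amplitudes linearly at each trial R1 before a one-dimensional search over R1. The signal values, the residual magnetisation of 2% and all function names are hypothetical.

```python
import math

TR_S = [5.5, 2.0, 1.2, 0.75, 0.5, 0.3, 0.2, 0.1]  # protocol repetition times, s


def model(tr, m_inf, m0, r1):
    """Assumed saturation-recovery signal: M0 at TR = 0, recovering to Minf."""
    return m_inf - (m_inf - m0) * math.exp(-r1 * tr)


def _sse(tr_s, y, r1, three_param):
    """Sum of squared residuals with amplitudes solved linearly at fixed R1."""
    e = [math.exp(-r1 * t) for t in tr_s]
    if three_param:
        # y = a + b*e, with a = Minf and b = M0 - Minf (2x2 normal equations).
        n, se, see = len(y), sum(e), sum(v * v for v in e)
        sy, sye = sum(y), sum(yi * ei for yi, ei in zip(y, e))
        det = n * see - se * se
        a = (sy * see - se * sye) / det
        b = (n * sye - se * sy) / det
        m_inf, m0 = a, a + b
    else:
        # y = Minf*(1 - e), with M0 fixed at zero.
        g = [1.0 - v for v in e]
        m_inf = sum(yi * gi for yi, gi in zip(y, g)) / sum(gi * gi for gi in g)
        m0 = 0.0
    return sum((yi - (m_inf - (m_inf - m0) * ei)) ** 2 for yi, ei in zip(y, e))


def fit_r1(tr_s, y, three_param=False, lo=0.05, hi=50.0, n_grid=400):
    """Unweighted least-squares R1 (s^-1): log-grid scan, then golden-section refinement."""
    grid = [lo * (hi / lo) ** (i / n_grid) for i in range(n_grid + 1)]
    sse = [_sse(tr_s, y, r, three_param) for r in grid]
    k = min(range(len(grid)), key=sse.__getitem__)
    a, b = grid[max(k - 1, 0)], grid[min(k + 1, n_grid)]
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - phi * (b - a), a + phi * (b - a)
    fc, fd = _sse(tr_s, y, c, three_param), _sse(tr_s, y, d, three_param)
    for _ in range(80):
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - phi * (b - a)
            fc = _sse(tr_s, y, c, three_param)
        else:
            a, c, fc = c, d, fd
            d = a + phi * (b - a)
            fd = _sse(tr_s, y, d, three_param)
    return 0.5 * (a + b)


# Hypothetical phantom: R1 = 1.5 s^-1, Minf = 1000 (arbitrary units).
true_r1, m_inf = 1.5, 1000.0
ideal = [model(t, m_inf, 0.0, true_r1) for t in TR_S]       # perfect saturation
imperfect = [model(t, m_inf, 20.0, true_r1) for t in TR_S]  # 2% residual magnetisation

r1_2p = fit_r1(TR_S, imperfect)
r1_3p = fit_r1(TR_S, imperfect, three_param=True)
print(f"2-parameter: {r1_2p:.4f} s^-1, 3-parameter: {r1_3p:.4f} s^-1, "
      f"difference: {100 * (r1_2p - r1_3p) / r1_3p:.2f}%")
```

When the magnetisation at TR = 0 is exactly zero the two fits agree; when it is not, the two-parameter fit absorbs the offset into R1, which is one plausible (but here unverified) contributor to percent-level disagreement between the two models.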

Illustrative propagation to irreproducibility in biomarker values
Illustrative between-centre irreproducibility expected from two standard deviations of the observed R 1 reproducibility, for a range of derived measurements and biomarkers, is given in Table 4.
For measurements of concentration of substance, the propagated irreproducibility naturally varies with relaxivity, while for "derived" biomarkers the propagated irreproducibilities were generally ≤10%.

Discussion
In this work we addressed the repeatability and reproducibility of R 1 in MR systems designed and employed for translational in vivo research. We prefer to work with R 1 rather than T 1 , since from a metrology perspective [17], R 1 is a ratio variable while T 1 is merely an interval variable. No single method for measuring R 1 is optimal for all in vivo studies. The most accurate methods (e.g. inversion recovery with long TR and short TE readout) are neither fast nor efficient. In vivo studies involve complex tradeoffs between accuracy, speed, spatial resolution, field of view, need for fat suppression, sensitivity to inflow, sensitivity to motion artefact, biexponential decay, and other confounding behaviours of tissue magnetisation such as T 2 and magnetisation transfer. Moreover, even after a specific method is chosen, errors can be very sensitive to pulse sequence parameters such as choice of delays and nutation angles, spoiling and refocussing strategies, mis-set pulses and so on. In this study we elected to use a RARE saturation recovery technique covering the entire field of view, as this is fairly robust and efficient: our findings may not be directly translatable to other commonly used techniques such as Variable Flip Angle [1,18,19] or Look-Locker [1,20,21] which are vulnerable to different confounds, or even to other saturation-recovery techniques with different pulse sequence parameters.

Repeatability and reproducibility
Previous work in preclinical MR systems has addressed the between-centre reproducibility of apparent diffusion coefficients [22] and volumetrics [23], but there is little evidence on relaxation rates. Clinical MR systems are designed, maintained and operated under Medical Device regulations, but these engineering and regulatory constraints do not apply to preclinical systems, so their reproducibility might differ from clinical reproducibility.
Repeatability [24,25] (ISO 3534:2:3.3.5) refers to the similarity between measurements over a short interval made using the same test object in the same equipment operated by the same investigator. Repeatability is particularly important when the same MR biomarker is measured on successive occasions in the same human or animal, for example before and after treatment. Repeatability depends on signal-to-noise ratio and on factors such as motion artefact, for which phantoms

Table 3
Repeatability and reproducibility. CoV: coefficient of variation; rms: root mean square. The DNE row shows signal stability for a "dynamic-no-enhancement" (DNE) run of T 1 -weighted (T1W) acquisitions.

[26] to address the perceived "reproducibility crisis" [27] in translational medicine [28]. In this work, relevant values of R 1 reproducibility and repeatability were small, and there was no obvious factor (such as temperature, B 0 , R 1 , or centre) that made any one set of measurements worse. Indeed, the error in the exponential fit of signal intensity against TR was numerically the largest error. Several between-centre studies of T 1 or R 1 reproducibility have been published for clinical equipment [4,5,29,30]: our CoV of 1.43% compares favourably with CoVs recently reported for inversion recovery phantom protocols in clinical systems of 5.5%-8.2% [29]. The relaxivity of Ni(H 2 O) 6 2+ arises because two of the 3d nickel orbitals are half-filled, creating a high-spin triplet state with two unpaired electrons. At lower fields, below 1 T, populations of the three electron spin states are almost independent of B 0 , as the Zeeman splittings are dominated by spin-orbit coupling (zero field splitting) and not by the applied field B 0 . Above 2 T, the Zeeman splittings increase linearly with B 0 . The relaxivity occurs through proton-electron dipolar mechanisms, with the relevant spectral density being the longitudinal relaxation rate R 1,e of the nickel electrons [31]. At low B 0 , R 1,e depends on fluctuations of the zero field splitting which are independent of B 0 , and previous investigators, working at relatively low fields, reported little field dependence for nickel agarose water proton T 1 values [15]. However, our data, taken together with previous work (Fig. 2), clearly show a modest increase in relaxivity over the range 0.1-11.7 T.

Implications for translational research
Repeatability errors (same subject, same device) have previously been extensively studied. Good repeatability in phantoms is a necessary, but not sufficient, condition for good repeatability in vivo, because phantoms seldom model physiologic variability. Reproducibility errors (between centres), by contrast, are much less studied, yet are critically important in translating from single-centre to multi-centre use. Since physiologic variability is largely absorbed in the repeatability error, phantoms can be very informative about reproducibility.

Table 4
Propagation of errors using Table 3 reproducibility, with plausible or representative values for a range of important measurements and biomarkers. Actual error propagation varies widely between applications: the values here should therefore be regarded as indicative, but not as a substitute for a thorough analysis of error propagation in any particular setting.
A second class of imaging biomarkers attaches a specific interpretation to the observed longitudinal relaxation, for example in arterial spin labelling [51,52] or in MR thermometry [53][54][55]. Thirdly, R 1 is commonly used to determine the spatially resolved in vivo concentration of an exogenous or endogenous paramagnetic substance of known relaxivity. Relaxivity can be field-, tissue- and temperature-dependent, and varies over many orders of magnitude between relaxive substances: from < 10 −2 s −1 mM −1 for deoxyhaemoglobin monomer [56] to > 10 3 s −1 mM −1 reported for certain investigational polymetallated contrast agents [57]. R 1 errors propagate to low micromolar errors in typical gadolinium- or manganese-based contrast agents. However, propagation of errors may be more significant for techniques based on lower-relaxivity substances. For example in oxygen-enhanced MRI, which measures hyperoxia-induced changes in deoxyhaemoglobin and dissolved oxygen concentration via change in R 1 [45,58], meticulous standardisation is warranted. From Table 4, error propagation might also be important for studies of therapeutic nitroxyls and perhaps for thermometry.
Finally, there are many biomarkers derived indirectly from contrast agent concentration, using a physiologic model. These include measures of perfusion and permeability in tumours, infarcts, synovitis or lung disease; myocardial extracellular volume; cartilage fixed charge density in osteoarthritis; and liver transporter function in toxicology. All these biomarkers are also measured in animal models, often aiming to assist the design and interpretation of clinical studies, so it is important to understand the validity of these measurements in preclinical systems. Table 4 includes a representative selection of such MR biomarkers, with simple assessments of how instrumentation-derived irreproducibility in R 1 might propagate. For example, the measured between-centre uncertainty in precontrast R 1 translates to at most 10% between-centre uncertainty in the biomarkers derived from DCE-MRI (Table 4). This error is smaller than the typical day-to-day repeatability error, and in itself would have little effect on the interpretation of change in parameters such as K trans , because treatment effects are typically much greater than 10% [59].
A realistic assessment of propagation of errors is complex and beyond the scope of this work. In particular, in compartmental models, reproducibility errors and repeatability errors are not completely independent. We omitted from consideration terms which primarily affect repeatability, such as error cancellation with post-contrast R 1 , additional R 1 errors that arise in the presence of contrast agent (e.g. signal saturation, limited water exchange), and in vivo effects (e.g. inflow, breathing motion, bolus dispersion, partial volume). Nevertheless, Table 4 provides comparative order-of-magnitude assessments to highlight cases in which the variance seen in our study might be important. With this caveat, in myocardial fibrosis, in normal-appearing multiple sclerosis white matter, and in oxygen-enhanced MRI, R 1 -based MR biomarkers would be quite sensitive even to such small errors in R 1 unless additional acquisition and analysis methods are designed to reduce the impact of error propagation. An example of this is the use of dynamic time series in OE-MRI that determine ∆R 1 (t) by referencing the time-varying signal to a baseline R 1 measurement, thereby reducing the degrees of freedom in the measurement and subsequent error propagation [60]. Similar approaches have been common in DCE-MRI for many years.

Study limitations
(1) This study was performed using only one vendor's equipment, Bruker Avance I, II or III systems running ParaVision 5 or 6, representing a typical range of equipment for preclinical MR biomarker research at the time when the study was performed (2017-18). The findings may not be translatable to other vendors' equipment.
(2) Only one pulse sequence (saturation recovery with RARE readout) was employed. This was chosen [61] in a compromise between accuracy and speed. However, the assumption of zero longitudinal magnetisation at the mid-point of the eighth echo may be invalid if B 1 is imperfect, and the findings may not be translatable to other sequences with different B 1 sensitivity.
(3) The accuracy of our data was not verified using an external standard, such as spectroscopic inversion-recovery.
(4) A common problem for MR phantoms is temperature dependence. In addition to ambient room temperature, heat is imparted to the phantom from the shims during the working day, from the pulsed gradients, and directly from radiofrequency power deposited by the pulse. Data at 1.5 T [4] and 2.35 T [15] show R 1 temperature dependencies in the range −1.3%/°C to +0.7%/°C; data at 0.08 T [16] show an r 1 temperature dependence of 0.006 s −1 mM −1 /°C. Although temperatures were measured in this study, no direct measurement was made of the agarose temperature itself during MR data acquisition, and exploratory analyses did not reveal temperature as a confound.
(5) In order to address the question of reproducibility in normal academic practice, our study modelled "institution-led" standardisation. No site training was performed, no quality control was imposed, nor were sites permitted to repeat their measurements to eliminate apparent outliers. We did not verify that all scanners were performing optimally, and indeed SNR estimated from the DNE runs did not show the anticipated variation with B 0 or coil design. RoIs and T 1 calculations were performed locally. Possibly, "centrally-led" standardisation rigorously imposed by a core lab might further improve reproducibility.
(6) No phantom study can fully model the in vivo measurement. Nevertheless, a well-designed phantom study sets a lower limit on the error to be expected from measurements in living animals.
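To put the temperature dependence in limitation (4) in context, the literature coefficients can be combined with the temperature deviations recorded in this study. This is an indicative sketch only: it assumes that R 1 changes linearly with temperature, and that coefficients reported at 1.5 T and 2.35 T can be carried over to the higher fields used here, which has not been verified. The function name is illustrative.

```python
# Literature range of R1 temperature coefficients quoted in limitation (4), %/degC.
R1_TEMP_COEFF_PCT_PER_C = (-1.3, 0.7)

# Day 1 to day 2 temperature deviations reported in the Results, degC.
MEAN_DEVIATION_C = 0.65
WORST_DEVIATION_C = 5.0


def r1_change_pct(delta_t_c, coeff_pct_per_c):
    """Percentage change in R1 for a temperature change, assuming linearity."""
    return coeff_pct_per_c * delta_t_c


for label, delta_t in [("mean", MEAN_DEVIATION_C), ("worst", WORST_DEVIATION_C)]:
    low, high = (r1_change_pct(delta_t, c) for c in R1_TEMP_COEFF_PCT_PER_C)
    print(f"{label} deviation {delta_t} degC: R1 change between {low:+.2f}% and {high:+.2f}%")
```

Under these assumptions the mean recorded deviation would correspond to an R 1 change of less than 1%, whereas the single worst deviation would correspond to several percent, which is why the absence of a direct agarose temperature measurement is listed as a limitation.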

Conclusions
Using nickel agarose phantoms in typical preclinical MR systems, R 1 exhibited adequate reproducibility for most purposes. Reproducibility (and repeatability) of < 0.06 s −1 and < 2.4% was readily achieved. These small technical (instrumentation-derived) errors in R 1 measurement mostly do not contribute biologically significant errors into R 1 -based MR biomarkers. However, in a small number of very demanding applications, such as myocardial fibrosis, white matter, or oxygen-enhanced MRI, the accuracy of R 1 -based MR biomarkers would be quite sensitive even to such small errors in R 1 ; in these cases further work may be needed to adequately standardise R 1 data acquisition and analysis.

Conflicts of interest
CG, GS and SZ are employees of Bayer AG, a for-profit company providing MR contrast agents. PDH is an employee at Antaros Medical, a for-profit company providing MR biomarker services. SK and KS are employees of Bruker BioSpin MRI GmbH, a for-profit company which is the manufacturer of the MR systems used in the study. JCW receives compensation from Bioxydyn Ltd., a for-profit company providing MR biomarker services.