A medical device-grade T1 and ECV phantom for global T1 mapping quality assurance—the T1 Mapping and ECV Standardization in cardiovascular magnetic resonance (T1MES) program

Background: T1 mapping and extracellular volume (ECV) have the potential to guide patient care and serve as surrogate end-points in clinical trials, but measurements differ between cardiovascular magnetic resonance (CMR) scanners and pulse sequences. To help deliver T1 mapping to global clinical care, we developed a phantom-based quality assurance (QA) system for verification of measurement stability over time at individual sites, with further aims of generalization of results across sites, vendor systems, software versions and imaging sequences. We thus created T1MES: The T1 Mapping and ECV Standardization Program.

Methods: A design collaboration consisting of a specialist MRI small-medium enterprise, clinicians, physicists and national metrology institutes was formed. A phantom was designed covering clinically relevant ranges of T1 and T2 in blood and myocardium, pre- and post-contrast, for 1.5 T and 3 T. Reproducible mass manufacture was established. The device received regulatory clearance by the Food and Drug Administration (FDA) and Conformité Européenne (CE) marking.

Results: The T1MES phantom is an agarose gel-based phantom using nickel chloride as the paramagnetic relaxation modifier. It was reproducibly specified and mass-produced with a rigorously repeatable process. Each phantom contains nine differently-doped agarose gel tubes embedded in a gel/beads matrix. Phantoms were free of air bubbles and susceptibility artifacts at both field strengths and T1 maps were free from off-resonance artifacts. The incorporation of high-density polyethylene beads in the main gel fill was effective at flattening the B1 field. T1 and T2 values measured in T1MES showed coefficients of variation of 1 % or less between repeat scans, indicating good short-term reproducibility. Temperature dependency experiments confirmed that over the range 15–30 °C the short-T1 tubes were more stable with temperature than the long-T1 tubes. A batch of 69 phantoms was mass-produced, with random sampling of ten of these showing coefficients of variation for T1 of 0.64 ± 0.45 % and 0.49 ± 0.34 % at 1.5 T and 3 T respectively.

Conclusion: The T1MES program has developed a T1 mapping phantom to CE/FDA manufacturing standards. An initial 69 phantoms with a multi-vendor user manual are now being scanned fortnightly in centers worldwide. Future results will explore T1 mapping sequences, platform performance, stability and the potential for standardization.

Electronic supplementary material: The online version of this article (doi:10.1186/s12968-016-0280-z) contains supplementary material, which is available to authorized users.


Keywords: T 1 mapping, Standardization, Phantom

Background
Myocardial tissue characterisation by T 1 mapping and estimation of extracellular volume (ECV) by cardiovascular magnetic resonance (CMR) is playing an increasingly important role in the diagnosis and management of patients and in clinical trials [1]. T 1 mapping is available as three broad classes of sequences, on multiple platforms, at two field strengths. Factors influencing T 1 mapping stability and inter-sequence comparisons are well understood [1-4], but little is known about T 1 mapping delivery at a larger scale over many sites and there is no global quality assurance (QA) system.
The goal of the T1MES program (T1 Mapping and Extracellular volume Standardisation) was to construct an optimised phantom for QA of myocardial T 1 mapping, covering a relevant range of T 1 values with suitable T 2 values for the tissues modelled. The proposed QA consists of regular scans using fixed T1-mapping protocols identical to whatever fixed protocols are used in vivo at each participating site. We therefore aimed for a phantom design that would have stable T1 values for as long as possible. We also aimed for a phantom design avoiding temperature sensitivity of its T1 values as explained later in Methods. Such a QA system would form part of a system for optimal mapping precision and accuracy [2] within the increasingly known fundamental limitations of the T 1 mapping methods [5,6].
The T 1 Mapping and ECV Standardization (T1MES) program therefore aimed to:
1. Create a partnership of physicists, clinicians and national metrology institutes
2. Design phantom systems for 1.5 T and 3 T for any manufacturer/sequence reflecting T 1 values in myocardium and blood, pre- and post-gadolinium-based contrast agents (GBCA)
3. Reproducibly specify and mass produce phantoms with a rigorously repeatable process and to regulatory standards
4. Distribute them to global CMR sites with detailed instructions for fortnightly scanning
5. Publish full details of the formulation to encourage additional applications
6. Measure confounders (e.g. temperature dependency)
7. Analyse scans over 1 year to study the stability of T 1 measurements over time at each scanner, including a temperature correction model for T 1
8. Curate phantom data long-term in an open access repository available for reuse/analysis
9. Analyse the inter-site differences in T 1 values and explore the deliverability of a technique-independent 'T 1 /ECV Standard' through local calibration

To date we have achieved steps 1 to 6 of this process, namely the development, testing, certification, QA protocol and preliminary results of T1MES. This paper summarises these first 6 milestones.

Definitions
The term "phantom" refers to the complete test object (Fig. 1).
The term "tube" refers to each of the small bottles embedded within the phantom.
The "gel matrix" is the gel and bead mixture filling the phantom that surrounds all of the tubes.

Collaboration process
A design collaboration for developing and testing the T1MES phantom and its prototypes was established, consisting of clinicians, physicists, national metrology institutes (the US National Institute of Standards and Technology [NIST] and the German Physikalisch-Technische Bundesanstalt [PTB]) and a small-medium enterprise familiar with phantom production (Resonance Health [RH], Perth, Australia). Funding was secured, including a grant from the European Association of Cardiovascular Imaging. Time and expertise were provided free of charge by the partnership. To engage a global community with constrained funding, the phantoms were gifted (first come, first served) to centers with the proviso that they: a) scan them fortnightly for 1 year and upload the results; b) engage with the partnership to explore any unexpected results; c) do not do anything that could potentially compromise (a) or (b) (e.g. deconstruct the phantom object); and d) give proper reference to the T1MES project if they use the phantoms for other purposes.

Phantom design
The design process involved several prototype iterations (models A-D) before the final mass production of E-models. Some aspects of the prototype A to D models, such as artefacts, that guided the final E-model design are described in Methods and in Fig. 2, with a timeline in Fig. 3. At the very least, the initial A-D models were needed to achieve reasonable T 1 and T 2 values without deleterious imaging artefacts, especially as imaging was conducted remotely from the manufacturer.
The range of T 1 and T 2 values in the phantom aims to cover typical native and post-GBCA values in both myocardium and blood. The especially wide range of post-GBCA T 1 (due to variable practice regarding dose, wash-out delays etc., and of course also disease) requires several tubes to cover it. From a review of published values and our own experience, we selected the values listed. Whatever rationale is adopted, with a limited number of tubes there will inevitably be gaps. T 1 is generally longer at 3 T than at 1.5 T. Initially we aimed to design a single phantom for both 1.5 T and 3 T, containing a sufficient number of tubes to cover the needed T 1 ranges in blood and myocardium, with suitable T 2 values, pre- and post-GBCA at both field strengths. However, the frequency dispersion (i.e. B 0 field dependence) of relaxation times in the phantoms differed strongly from that of myocardium and blood, particularly for the long pre-GBCA tubes, requiring a total of 13 different tubes for 1.5 T and 3 T. Fitting 13 tubes into a single phantom would either have made the object 'large' (in relation to the B 1 distortion at 3 T discussed below) or would have required the use of smaller calibre tubes. The following considerations therefore justify our construction of 'field-specific' phantoms:
- Tubes had to be a minimum of 20 mm in diameter so that regions of interest (arbitrarily set to 13 mm) would exclude in-plane imaging artifacts at the boundaries between tubes. These artifacts relate to the use of clinical T 1 mapping protocols with coarse image resolution, mostly based on single-shot imaging (e.g. Gibbs artifact at the edge of tubes [Fig. 2d], or the potential impact of filtering against it, applied differently by various protocol parameters). Altering protocols to optimise phantom scanning would be inconsistent with the aim of the project.
The true resolution achieved is further convoluted by the use of asymmetric frequency-encoded readouts for faster repetition time (TR) in balanced steady-state free precession (bSSFP) imaging or partial-phase-encode sampling for shorter total shot duration, and to some extent also by signal variation during the shot.
- Embedding tubes into a gel-filled phantom is important for three reasons: 1) to permit sufficient signal for scanner calibrations; 2) to minimise B 0 and B 1 field distortions local to each tube; and 3) for greater thermal stability. However, embedding all 13 tubes (to cover 1.5 T and 3 T values) into a single phantom (whether water or water-based gel-filled) would have increased its overall dimensions, making it harder to make (our tests and others [7,8] show that B 1 homogeneity across large ROIs could not be achieved, especially at 3 T). Alternative oil-based phantoms have a smaller dielectric permittivity, useful for weaker radiofrequency (RF) displacement current distortion of B 1 , but the chemical shift of the matrix fill would require embedded tubes also to use oil-based chemistry (as in diffusion phantoms). Alkanes or similar [9] could not deliver the required range of T 1 and T 2 (written as T 1 |T 2 ) and a predominantly single-peak nuclear magnetic resonance (NMR) spectrum, with the required temperature stability. By using separate water-based gel-filled phantoms for 1.5 T and 3 T, with the known high permittivity of water, at a size large enough to fit the needed tubes there was still significant B 1 distortion (a range of different flip angles achieved for a prescribed nominal flip angle), but we were able to counteract it using a method described later.
- This project aims to provide quality assurance for clinically used T 1 protocols without adapting them to the phantom (e.g. no switching to spoiled-gradient echo, or using shorter TR, no alterations of resolution or field of view etc.; see Additional file 1). Clinical T 1 mapping protocols are sensitive to off-resonance effects for various well-known reasons. Therefore, B 0 distortion near any of the tubes needed to be minimised (tests showed that tube alignment with the B 0 direction was best; data not shown).

Fig. 2 (caption; the text for panel a and the start of panel b is missing) [...] 'cat's head' artifact of air bubbles trapped in the paramagnetically doped aqueous tubes. Significant off-resonance artifact is also noticeable in the central tubes. c Another coronal image through A-model but with larger gaps between tubes, showing the combined effect of motion artifact (due to the aqueous fill) and B 0 distortion. d Transverse image of C-model attempting to use narrower tubes to pack 12 instead of 9, but significant Gibbs artifact can be seen in each tube. e Transverse image of C-model showing three small dark circular artifacts (12, 3 and 9 o'clock positions) caused by glue used to stabilize the tube arrangement. We subsequently switched to silicone-based glues that were less likely to trap air bubbles and were artifact-free. f Severe stabilisation artifact appearing as a thick dark band around the border of a D-model; here the phantom was scanned immediately after being received from the courier company and the bottle was still very cold from the transportation. Additionally, susceptibility artifacts can be seen as thin linear bands spoiling some of the tubes (9 and 3 o'clock). g Significant image intensity inhomogeneity during a D-model test session on a GE scanner caused by accidental omission of the folded blanket intended to separate the phantom bottle from the anterior chest coil. h Curved tube artifact and dark rings arising from ink printed onto the sides of digestive tubes (images courtesy of K. E. Keenan and NIST). i Coronal bSSFP localiser image and (j) typical T 1 map of a final 3 T T1MES phantom obtained by MOLLI using a bSSFP readout on a Siemens 3 T Skyra scanner. bSSFP = balanced steady-state free precession; MOLLI = modified Look-Locker inversion recovery. Other abbreviation as in Fig. 1
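The tube-diameter consideration in the list above can be made concrete with a small calculation. The sketch below assumes a centred circular ROI and treats the 20 mm figure as the usable gel diameter. The pixel size is a hypothetical value chosen only for illustration and is not a T1MES protocol parameter.

```python
def roi_edge_margin(tube_diameter_mm, roi_diameter_mm, pixel_size_mm):
    """Return the gap between a centred ROI and the tube wall, in mm and in pixels."""
    if roi_diameter_mm > tube_diameter_mm:
        raise ValueError("ROI cannot be larger than the tube")
    if pixel_size_mm <= 0:
        raise ValueError("pixel size must be positive")
    margin_mm = (tube_diameter_mm - roi_diameter_mm) / 2.0
    return margin_mm, margin_mm / pixel_size_mm


# 20 mm tube and 13 mm ROI are from the text; 1.5 mm pixels are an assumption.
margin_mm, margin_px = roi_edge_margin(20.0, 13.0, pixel_size_mm=1.5)
print(f"edge margin: {margin_mm:.1f} mm = {margin_px:.1f} pixels")
```

A coarser in-plane resolution leaves fewer pixels between the ROI and the tube edge, which is why narrower tubes were more exposed to edge ringing.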

Phantom materials
All materials proposed for phantoms to date suffer different deficiencies. We adopted the most suitable formulation known, which is paramagnetically doped agarose or carrageenan gel [10,11]. Some of the main design aspects are listed in Table 1. Agarose or similar gel phantoms are widely used in MR research but less often in commercial phantoms, probably because of long-term stability issues discussed later. Gels permit independent variation of T 1 |T 2 and they avoid fluid movement within the image slice during long inversion recovery (IR) times that could potentially introduce uncertainty in the T 1 * to T 1 conversion [12]. A more concentrated gelling agent mainly shortens T 2 ; a higher paramagnetic ion concentration mainly shortens T 1 [11,13]. The two effects are not independent but can be modelled [14], enabling design of mixtures with any required T 1 |T 2 combination. We did not include sodium chloride (NaCl) (see B 1 uniformity section below). Gel choices include carrageenan, gelatin, agar-agar, polyvinyl alcohol, silicone and polyacrylamide. Some have undesirable NMR spectral properties. The paramagnetic ion choice [15] includes copper, cobalt, iron, manganese (Mn 2+ ), gadolinium and nickel (Ni 2+ ). Due to the individual T 1 |T 2 relaxivities of the various ions, no currently known ionic mixture in water can deliver the native myocardial T 1 |T 2 combination (which requires a relatively high T 1 with a short T 2 ). Ni 2+ was our first choice as the paramagnetic relaxation modifier as it is less temperature and frequency dependent than other ions [13,16] and because nickel chloride (NiCl 2 ) agarose gel phantoms have been shown to be stable over a 1-year period [17].

Characterization of T 1 and T 2 dependence on agarose and nickel
To achieve the required T 1 |T 2 tube values we characterised the relation between T 1 |T 2 , agarose and NiCl 2 concentrations. We made a wide variety of test mixtures as follows: we dissolved at 95°C for 2.5 h, 135 different concentrations of NiCl 2 , water and agarose, each in a separate 50 ml digestive tube. Using a preheated serological pipette, samples were transferred into preheated NMR tubes (to prevent instant setting of the gel while flowing down the tube), allowed to set and analysed at a measuring temperature of 22°C with a 1.4 T Bruker Minispec mq60 (60 MHz) relaxometer (Perth, Western Australia). Exponential fitting was done and T 1 and T 2 recorded. Based on these results we calibrated the equations [14] modelling the relationship between ingredients and T 1 |T 2 relaxation times (omitting saline). The model assumes a linear relation between the ingredients and the relaxation rates (R 1 ,R 2 ) = (1/T 1 ,1/T 2 ). Using this the ingredients for any required T 1 |T 2 tube could be calculated. The model was tested for the set of 13 unique T 1 |T 2 combinations desired for the 1.5 T and 3 T phantoms. Some iterations (models A through D, Fig. 3) were required to derive from the model (based on a non-imaging 60 MHz relaxometer) tube values applicable to clinical 1.5 T and 3 T MR systems described later.
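The linear relaxation-rate model described above can be inverted to obtain a recipe for a target T 1 |T 2 pair. The following is a minimal sketch of that idea only. Every coefficient, the chosen units and the function names are hypothetical placeholders, not the calibrated T1MES values or the equations of reference [14].

```python
# Hypothetical calibration (illustrative numbers only, NOT the fitted T1MES values).
# Units assumed here: relaxation rates in 1/s, agarose in % w/v, NiCl2 in mM.
R1_WATER, R2_WATER = 0.3, 0.5   # rates with no agarose and no nickel
A1, N1 = 0.02, 0.65             # R1 change per unit agarose, per unit NiCl2
A2, N2 = 8.00, 0.70             # R2 change per unit agarose, per unit NiCl2


def predict_t1_t2(agarose, nickel):
    """Forward model: ingredient concentrations -> (T1, T2) in ms."""
    r1 = R1_WATER + A1 * agarose + N1 * nickel
    r2 = R2_WATER + A2 * agarose + N2 * nickel
    return 1000.0 / r1, 1000.0 / r2


def recipe_for(t1_ms, t2_ms):
    """Inverse model: solve the 2x2 linear system for (agarose, nickel)."""
    d1 = 1000.0 / t1_ms - R1_WATER
    d2 = 1000.0 / t2_ms - R2_WATER
    det = A1 * N2 - N1 * A2
    agarose = (d1 * N2 - N1 * d2) / det   # Cramer's rule
    nickel = (A1 * d2 - A2 * d1) / det
    if agarose < 0 or nickel < 0:
        raise ValueError("target T1|T2 is not reachable with this model")
    return agarose, nickel


agarose, nickel = recipe_for(1000.0, 50.0)
print(agarose, nickel, predict_t1_t2(agarose, nickel))
```

Because agarose dominates R 2 and nickel dominates R 1 in such a model, the system is well conditioned, but some T 1 |T 2 targets still fall outside the reachable region and are rejected.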

B 0 uniformity
The approximately cuboid outer body of the T1MES NiCl 2 -agarose gel phantom (Fig. 1a) consisted of a short, hollow, wide-necked and leakproof brown-transparent polyvinyl chloride bottle with a melting temperature of 140°C (Series #310-73353, Kautex Textron GmbH & Co. KG, Bonn, Germany). The adopted shape is more ellipsoidal than many of the shapes rejected in our tests, consistent with basic magnetostatics (sphere of Lorentz) at 1.5 T and 3 T. The B 0 distortion by the phantom arises from electronic diamagnetism and is not significantly affected by the paramagnetic ion concentrations used. Adding sufficient paramagnetic material to cancel the diamagnetism and flatten B 0 would excessively shorten the relaxation times. The final body shape gave sufficient B 0 uniformity for T 1 mapping over only a small region approximately halfway along its length when aligned coaxially with B 0 . Regions towards the cap and base of this object were subject to off-resonance errors [18]. The tubes inside the phantom were therefore not fixed directly down to the base of the main bottle. A 20 mm layer of non-coloured (non-saturated) polystyrene resin (Diggers Casting and Embedding Resin 500GM, #FIE00506-9311052000759, Recochem Inc. Perth, Western Australia) was first set hard in the base of the main bottle, and the tubes were adhered to the top of this layer, so that the tubes occupied the middle of the phantom in the cap-to-base direction, where the B 0 field is optimally uniform. B 0 uniformity was mapped to evaluate this cause of distorted T 1 estimates, using a multi-echo gradient echo sequence based on the phase difference between known echo times [19]. A frequency range of +/−50 Hz across the phantom was regarded as acceptable based on published T 1 -mapping sensitivity to off-resonance [18].

Table 1 Main design aspects, construction topics and constraints

B 0 uniformity: The ideal phantom would be uniform and ellipsoidal to avoid susceptibility-induced magnetostatic field perturbation. Such a phantom would permit sphere of Lorentz uniformity but this is not easily mass produced. Many phantoms are cylindrical with the long axis along the static field, B 0 , but there is usually off-resonance at the z-ends of such objects [7]. An outer phantom body with a smooth surface and soft rounded edges, placed inside B 0 , still distorts some of the imposed magnetic field lines at its z-ends, so we prescribed scanning halfway along the length of the bottle.

Long-term gel stability and risk of moulding: Phantoms with long-term stability could assure the stability of methods applied to patients against scanner alterations and across multiple centers. Moulding was prevented by aseptic manufacturing, the toxicity of Ni 2+ ions, and the absence of nutrients in the type of agarose used. Tap water might contain microbial contamination and metal ions, so high purity water was used. The main risk is from contraction of gel on loss of water, leading to gaps and water condensation, but NiCl 2 -doped agarose gel phantoms can be stable over a 1-year period [17].

Seal, leakages, air trapping for aqueous fill: Air pockets in the agarose gel phantom will give rise to susceptibility artifacts on account of the large mismatch in static magnetic susceptibility between air and surrounding gel, producing a local distortion in magnetic field strength. The main phantom was sealed by a black polypropylene screw cap fitted with a polyethylene foam insert. Each internal digestive tube was sealed by a tight screw cap. Gel preparation with warm, degassed water reduced air bubble formation. Note the tube "base-upward" setting procedure and subsequent "top-up" of the contracted gel in each tube after setting, described in the text.

Adjustments of B 0 and reference frequency: Adjustments of B 0 and scanner reference frequency over the phantom have the ability to impact T 1 estimates. We specified a constant shim volume for each scan. This is manufacturer-dependent (see the T1MES manual [23]). Consistency between repeat scans is the main point.

Gel diamagnetism: In the T1MES model system, because the impact of the paramagnetic ions is so small, we can conceptually treat the main phantom box as if it had no tubes, as if it were just filled with uniform gel throughout. The T1MES system has partly paramagnetic and partly diamagnetic constituents, but the impact of the paramagnetic Ni 2+ ions is small, around 10 % (because concentrations are small), so the overall interaction is diamagnetic, considering the ~9 parts per million diamagnetism of most tissues relative to air from Lenz electronic diamagnetism.

Gibbs artifact ringing and other in-plane effects: Truncation artifacts appear as lines of alternating brightness and darkness in the read-out and phase-encode directions. Some effects also arise from asymmetric readout and ky coverage. Large diameter digestive tubes were used to house the 9 doped agarose solutions, so that central regions of each tube are sufficiently distant (a number of pixels away) from regions impacted by artifacts from abrupt signal intensity transitions at the tube edges.

T 1 |T 2 ranges (blood/myocardium, pre/post-GBCA): The T 1 |T 2 values were carefully modelled for native and post-gadolinium-based contrast agent, blood and myocardium.

Tube arrangement: The phantom corners are more prone to inhomogeneities of the B 0 and B 1 magnetic fields. Longer T 1 tubes were placed nearer the middle of the phantom layout and avoided the corners.

Abbreviations: Cu 2+ copper ions, Mn 2+ manganese ions, Ni 2+ nickel ions, NiCl 2 nickel chloride

B 1 uniformity
The dielectric permittivity of water [20,21] is complex, but fundamentally the electric dipole moment of the water molecule rotates in the oscillating electric field associated with the RF B 1 field, giving rise to displacement current. Sucrose or other large nonionic molecules can reduce water permittivity, by in effect diluting the problematic water molecules. However, the spectral contribution of such molecules at the high concentrations required is a severe complication. An alternative approach often described in phantom literature is the addition of sodium chloride or similar simple ionic solutes (n.b. not to be confused with high permittivity of powdered titanates, suspended in deuterated water).
This tackles the problem from a different direction as it leaves the permittivity unchanged but increases the conductivity (σ) instead, to reduce ωε/σ, i.e. the ratio of displacement current to conduction current. Adding NaCl to the T1MES phantom acted on B 1 distortion at a shallower depth in the T1MES phantom and did not cancel the overall B 1 curvature at any NaCl concentration tested.
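The ratio ωε/σ mentioned above is easy to evaluate numerically. The sketch below does so at the proton Larmor frequency for a given field strength. The relative permittivity and conductivity passed in are assumed, water-like illustrative values and were not measured on the T1MES gel.

```python
import math

EPS0_F_PER_M = 8.8541878128e-12      # vacuum permittivity
PROTON_MHZ_PER_TESLA = 42.577        # proton gyromagnetic ratio / (2*pi)


def displacement_to_conduction_ratio(b0_tesla, rel_permittivity, conductivity_s_per_m):
    """Return omega * epsilon / sigma at the proton Larmor frequency for b0_tesla."""
    if conductivity_s_per_m <= 0:
        raise ValueError("conductivity must be positive")
    omega = 2.0 * math.pi * PROTON_MHZ_PER_TESLA * 1e6 * b0_tesla
    epsilon = rel_permittivity * EPS0_F_PER_M
    return omega * epsilon / conductivity_s_per_m


# Assumed illustrative material values (not from the paper).
for b0 in (1.5, 3.0):
    ratio = displacement_to_conduction_ratio(b0, rel_permittivity=78.0,
                                             conductivity_s_per_m=0.5)
    print(f"{b0} T: omega*eps/sigma = {ratio:.2f}")
```

The ratio scales linearly with field strength and inversely with conductivity, so raising σ with salt lowers it without touching the permittivity, which is the distinction drawn in the text between the NaCl and sucrose approaches.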
In this work, deriving from the sucrose approach, we hypothesised that mixing plastic beads into the matrix gel might also effectively dilute the dielectric permittivity of water and lead to improved B 1 uniformity without directly altering the outer matrix gel T 1 |T 2 values (see Table 2, 846 ms |141 ms). Our choice of outer matrix gel T 1 |T 2 values was informed by tests looking at different outer matrix gel T 1 |T 2 combinations (data not shown) and their impact on bSSFP-stabilisation artifacts at both field strengths. Two different kinds of plastic bead were evaluated: highly monosized microbeads composed of crosslinked poly methyl-methacrylate (PMMA) polymer (6 μm, Spheromers, Microbeads AS, Norway) and high-density polyethylene (HDPE) beads of oblate spheroidal form (3 mm polar axis by 4.2 mm equatorial diameter) consisting of smooth, semi-translucent, colourless HDPE with a melt index >60°C (HDPE Marlex HHM 5502 BN, Chevron Phillips Chemical Company LP, Texas, USA). It is important to control the supply of HDPE pellets to ensure that they have not been reground, reblended or otherwise modified. The two different plastic bead versions of T1MES matrix gel were compared to the use of sucrose or sodium chloride. The formulations tested were: (1) 800 g sucrose and 50 g NaCl, added separately and in combination to 1050 ml of Ni 2+ -doped gelling solution; (2) 5 g NaCl added to 1000 ml of distilled water containing NiCl 2 and MnCl 2 (T 1 ≈ 600 ms, T 2 ≈ 170 ms); (3) 1 g, 4 g, 6.5 g, 11.5 g, 14 g, 19 g or 21.5 g NaCl added to 2534 ml of distilled water. B 1 homogeneity was evaluated by flip angle (FA) maps derived by the double angle method using FAs of 60° and 120° (θ1, 2*θ1) by long TR (8 s) scanning using a 4 ms duration sinc (−3π to +3π) slice excitation width to minimise error due to FA variation through the slice.
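The double angle method used above rests on a simple relation. With a TR long enough for full relaxation, the signal is proportional to sin θ at the first flip angle and to sin 2θ = 2 sin θ cos θ at the second, so the achieved angle is θ = arccos(S 2θ / (2 S θ )). A minimal per-pixel sketch follows. It assumes noise-free magnitude signals, and the function names and simulated signals are illustrative only.

```python
import math


def double_angle_flip_deg(signal_theta, signal_2theta):
    """Achieved flip angle (degrees) from signals acquired at theta and 2*theta."""
    if signal_theta <= 0:
        raise ValueError("signal at theta must be positive")
    ratio = signal_2theta / (2.0 * signal_theta)
    ratio = max(-1.0, min(1.0, ratio))  # guard against noise pushing |ratio| above 1
    return math.degrees(math.acos(ratio))


def relative_b1(signal_theta, signal_2theta, nominal_deg=60.0):
    """Achieved flip angle as a fraction of the prescribed (nominal) one."""
    return double_angle_flip_deg(signal_theta, signal_2theta) / nominal_deg


# Simulated pixel where only 90 % of the nominal 60 degree flip angle is achieved.
actual = math.radians(54.0)
s1, s2 = math.sin(actual), math.sin(2.0 * actual)
print(double_angle_flip_deg(s1, s2), relative_b1(s1, s2))
```

Applying the same function to every pixel of the two images gives the FA map, and its spread across the phantom is the B 1 non-uniformity being assessed.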

Temperature dependence of T 1 and T 2
Temperature dependency experiments on T 1 |T 2 values [15] were carried out at various stages:
Test 1: Performed at the PTB laboratory in June 2015 on a 3 T prototype-D (whole phantom with 9 tubes) across 17 temperatures between 14.9°C and 32.0°C for T 1 and across 6 temperatures between 15.6°C and 31.1°C for T 2 . Each measurement was repeated twice (with a 2 day gap) and made using a 3 T Siemens Magnetom Verio system (VB17) and a 12-channel head coil.
Test 2: Performed at the NIST laboratory in November 2015 on six loose tubes from the final production run of E-model phantoms. T 1 |T 2 were measured at 9.9, 17.1, 20.1, 23.1 and 30.1°C on an Agilent 1.5 T small bore scanner in a temperature-controlled environment. Temperatures were measured using a fiber optic probe.

One of the final E-model phantoms for 3 T was tested for short-term repeatability of T 1 |T 2 values using a Siemens 3 T Skyra at Royal Brompton Hospital in November 2015. This test was performed by removing and repositioning the receiver coil, phantom and its supports on each of ten runs, incurring full readjustment of all scanner setup procedures before each run. The acquired data comprised ten runs, each containing two repeated T 1 maps, performed at 20.3 ± 0.5°C. An extension of this work showed that the temperature increase of the T1MES phantom caused by specific absorption rate (SAR) deposition during imaging for repeated T 1 maps was negligible.
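Repeatability across runs such as these is commonly summarised as a coefficient of variation (CoV), the sample standard deviation divided by the mean. The sketch below shows that calculation for one tube. The T 1 values are invented purely for illustration and are not T1MES measurements.

```python
from statistics import mean, stdev


def cov_percent(values):
    """Coefficient of variation in percent, using the sample standard deviation."""
    if len(values) < 2:
        raise ValueError("need at least two repeat measurements")
    return 100.0 * stdev(values) / mean(values)


# Hypothetical T1 values (ms) for one tube over ten repositioned runs.
t1_runs_ms = [1090.0, 1092.5, 1088.0, 1091.0, 1089.5,
              1093.0, 1090.5, 1087.5, 1091.5, 1090.0]
print(f"CoV = {cov_percent(t1_runs_ms):.2f} %")
```

Computing this per tube and then averaging across tubes would give a single summary figure per phantom. Whether the reported figures were pooled in exactly that way is not stated in this section.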

Detailed construction of phantoms
Some of the detailed construction topics and constraints are listed in Table 1.
Mass production was from large batches of 14 solutions (13 tubes + outer matrix gel, Table 2) from which all the tubes and outer containers were filled accordingly. The mass production required some caution against deterioration of the agarose/NiCl 2 mixtures if kept at high temperatures for periods exceeding around 8 h. The production of all copies of each tube therefore had to be completed within a single working day and as rapidly as possible. Deterioration was noted as a change of agarose gel colour from colourless to faint yellow. Microwave oven heating for initial agarose dissolution was followed by further magnetically stirred heating and adjustments (based on relaxometry of samples from the mixture). Stirring was essential for uniform gel production into all copies of each tube. Each of the nine tubes is filled with a differently doped agarose gel and contains minimal air gaps. Agarose gel contracts as it sets solid, contracting more in stronger agarose mixtures. By "topping up" with more gel the space left by contraction after the initial fill had set in each tube, the air gap can be minimised. Further, by cooling the tubes from the base (by standing them in approximately 2 cm depth of cold water), the gel solidified from the base upward so that contraction left a gap at the top of the tube for adding the "top-up". This practical step was essential to avoid the mid-gel contraction gaps that otherwise form when the gel is allowed to set naturally and solidifies earlier along the tube sidewalls. Such mid-gel gaps tend to cause a tear down the middle of the gel-filled tube, making it unusable for ROI placement in images. The dissolving and solidifying temperatures of agarose gel show hysteresis, dissolving fully only near boiling point but requiring cooling to around 45°C for solidification. The hysteresis assists practically, for example when pouring molten gel around the HDPE beads needed for the main matrix fill.
Of the 18 tubes used in the 1.5 T and 3 T phantoms (9 per phantom), 4 are 1.5 T specific, 4 are 3 T specific (because tissue native T 1 is longer at 3 T) and five (the post-GBCA tubes) are common to both field strengths (Fig. 4). Although some difference in post-GBCA T 1 values does occur between 3 T and 1.5 T, this difference is absorbed within the very wide range of GBCA doses, post-GBCA times, GBCA types etc. in clinical use. Therefore 13 individual recipes were made. The 9 tubes in each field-specific phantom generate 9 different T 1 |T 2 combinations (Fig. 5) modelled to cover the physiological range of native and post-GBCA, blood and myocardium in health and disease. There was no macromolecular addition and no attempt to model magnetisation transfer [22].
After pouring in the resin base, leaving this to set, and adhering the 9 filled tubes on top of this base using an uncoloured hotmelt based on an ethylene vinyl acetate and polypropylene mixture, typically applied from a "hot glue gun", we packed the compact HDPE pellets into the bottle and then poured in the agarose/NiCl 2 mixture (typically at a temperature of ~50-60°C), taking care to avoid air pockets forming in the matrix gel fill.

Fig. 4 (caption; the opening text is missing) [...] Table 2) are explained in the figure. Slow scan reference data for T 1 |T 2 is displayed in green (for T 1 by slow IRSE and for T 2 by slow SE, RR interval 900 ms at 21 ± 2°C), T 1 values shown in orange represent the mean value per tube derived from tests on five of the E-model phantoms (using a 5(3)3 256-matrix RR = 900 ms at 21 ± 2°C variant of MOLLI adapted for native T 1 mapping; Siemens WIP 448B at 1.5 T and WIP 780B at 3 T), and in blue are T 1 |T 2 values obtained by the manufacturer in Australia using a 1.4 T Bruker minispec relaxometer at 22°C. Tube arrangement is such that long T 1 tubes potentially suffering from more artifacts are kept towards the middle of the phantom and away from the corners. GBCA = gadolinium-based contrast agents; IRSE = inversion recovery spin echo; myo = myocardium; RR = inter-beat interval; SE = spin echo. All T 1 |T 2 values are stated in ms. Other abbreviation as in Fig. 2
The T1MES phantom has a volume of 2 l, inner length of 187 mm and inner body cross section 122 mm by 122 mm. The labels show an isocenter cross mark, the correct orientation for positioning it under an anterior chest coil, and a serial number and date of manufacture. Also attached to the outside of the phantom is a liquid crystal display (LCD) thermometer of 1°C resolution. Notably some pigments used on plastic tubes distort the magnetic field [12] (Fig. 2h), so all components were tested carefully, rigorously sourced and documented to avoid unexpected changes which could affect future production batches. Even with the efforts to optimise B 0 and B 1 uniformity, some T 1 |T 2 combinations are more sensitive to off-resonance errors so these tubes were placed centrally in the phantom avoiding corner locations of greater B 0 /B 1 error (explaining the otherwise somewhat counterintuitive ordering of tubes according to their T 1 values).
Production of one phantom took on average 5 h (distributed over batch production not serial manufacture). As the phantom build was all by manual labour and not automated, it took 3 weeks and four full-time members, 340 h in total to produce the 69 phantoms in this batch.

Prototype and production batch testing and quality control
Reproducible manufacturing was established for all tubes. Three prototypes (models A to C) had unsatisfactory B 0 and B 1 uniformities before the satisfactory model-D design. Between June and August 2015, 10 D-model phantoms (five for each of 1.5 T and 3 T) were characterized at ten experienced CMR centers for artifacts and for initial verification of the tube T 1 |T 2 values. In September 2015, the final batch of artifact-free (Fig. 2i, j) T1MES phantoms (E-models) were mass-manufactured and shipped to CMR centers worldwide.
All aspects of phantom production conducted at the RH laboratory were performed in accordance with their certified quality management system, including the recruitment and training of staff and the quality control checks of final phantoms. Prior to mass manufacturing, extensive experiments were done to set up the standard operating procedures and working instructions that ensure final phantom integrity. Quality control was ensured at three levels: operator level (e.g. careful choice of materials), engineering level (e.g. the responsible process engineer conducted in-production tests, measurements and inspections, such as checks for bubbles in the tubes and bottle seals, and initiated improvement activities based on the outcome of this analysis) and management level (e.g. by facilitating training and identifying better measurement or production equipment that could be used for future batches). Operator-level quality control evaluated phantoms in real time during the production process through visual inspection to ensure production ran smoothly, predictably, and to the required standards (e.g. by ensuring a flat resin surface, correctly sealed tubes, tight bead packing of the outer matrix gel, etc.). Overall phantom integrity was also visually checked for any production defects prior to shipment (e.g. precise alignment of the isocenter cross label correctly offset from the upper surface of the resin base, no distortion of the outer bottle due to excessively hot gel, etc.).
Phantom calibration and validation have limitations as phantoms do not fully model tissue (see Discussion). Nonetheless, 'ground truth' values in phantoms were measured using slow scanning 'gold standard' sequences that have previously demonstrated accuracy in phantom work. Of the 69 final E-model phantoms, 10 (14.5 %; 5 at each of 1.5 T and 3 T) underwent 'gold standard' slow T 1 measurements by IRSE (8 TIs from 25-3200 ms) and T 2 measurements by slow SE (8 TEs from 10-640 ms) at a single center (Royal Brompton Hospital; Siemens, 1.5 T Aera and 3 T Skyra; Fig. 6). These slow T 1 |T 2 measurements were performed only once and the results were used as 'ground truth' for the subsequent measurements. In addition, all tubes were relaxometer-certified pre-assembly.
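As an illustration of how a T 1 value can be extracted from slow IRSE data, the following minimal sketch fits the commonly used three-parameter model S(TI) = A - B·exp(-TI/T 1) by scanning candidate T 1 values and solving for A and B in closed form at each one. It is not the fitting code used in T1MES. The inversion times listed (doubling from 25 to 3200 ms) are an assumption, since only the range and number of TIs are stated, and the data are assumed to be signed (polarity-restored) rather than magnitude-only.

```python
import math

# Assumed TI spacing: only "8 TIs from 25-3200 ms" is stated in the text.
ASSUMED_TIS_MS = [25, 50, 100, 200, 400, 800, 1600, 3200]


def fit_t1_inversion_recovery(tis_ms, signals, t1_min_ms=50, t1_max_ms=4000):
    """Fit S(TI) = A - B*exp(-TI/T1) to signed inversion-recovery data.

    For each candidate T1 (1 ms grid) the model is linear in A and B, so
    they are found by ordinary least squares; the T1 with the smallest
    residual sum of squares is returned together with A and B.
    """
    n = len(tis_ms)
    mean_s = sum(signals) / n
    best = None
    for t1 in range(t1_min_ms, t1_max_ms + 1):
        e = [math.exp(-ti / t1) for ti in tis_ms]
        mean_e = sum(e) / n
        sxx = sum((x - mean_e) ** 2 for x in e)
        sxy = sum((x - mean_e) * (y - mean_s) for x, y in zip(e, signals))
        slope = sxy / sxx                  # slope equals -B
        intercept = mean_s - slope * mean_e  # intercept equals A
        sse = sum((y - (intercept + slope * x)) ** 2
                  for x, y in zip(e, signals))
        if best is None or sse < best[0]:
            best = (sse, float(t1), intercept, -slope)
    _, t1_fit, a_fit, b_fit = best
    return t1_fit, a_fit, b_fit


if __name__ == "__main__":
    # Synthetic noiseless example with a hypothetical tube T1 of 1000 ms.
    demo = [100.0 - 200.0 * math.exp(-ti / 1000.0) for ti in ASSUMED_TIS_MS]
    print(fit_t1_inversion_recovery(ASSUMED_TIS_MS, demo))
```

A grid search is used here only to keep the sketch dependency-free; a nonlinear least-squares routine would normally be preferred for real data with noise.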

Scanning protocol for T1MES
A fundamental aspect of T1MES was to invite each site to submit phantom data with whichever T 1 mapping sequence they were using clinically. We did not prespecify any aspect of the T 1 mapping sequence. We asked only that sites carefully replicate the phantom position and setup, use their clinical parameters without alteration, and keep every parameter of the chosen T 1 mapping method unchanged while supplying T1MES repeat scans, i.e. stick to a fixed protocol (as specified in the JCMR Consensus Guidelines for T 1 /ECV). If changes were inevitable, for example due to scanner upgrades, a method of informing T1MES has been implemented and is described in the manual (Additional file 1). Instructions for adjustment and sizing of the shim volume did need to be vendor-specific and these are explained in the appendix section of the T1MES user manual circulated to all participants.
At all participating T1MES sites, the final phantom is currently scheduled for fortnightly scanning for 1 year using a fixed protocol for inter-scan test-retest analysis. Some centers are additionally scanning the phantom using the same sequence at the same position, providing data necessary for short-term intra-scan test-retest analysis. Results from this longitudinal data collection are expected to be published in 2017. The T1MES user manual and QA protocol [23] stipulate that the T1MES phantom be kept in the MR magnet room (for stability and also so that its internal temperature will match that displayed by the surface LCD label) and imaged every 2 weeks for 1 year using a consistent coil and phantom arrangement. The T1MES user manual emphasises that image parameters be kept unchanged for serial scans except for automatic adjustments of FA and reference frequency. The user manual specifies the range of acceptable positioning of the phantom in the scanner aligned with the main magnetic field. The phantom is scanned axially halfway along the length of the 9 internal tubes, corresponding to halfway along the length of the main bottle, imaging only that slice, to avoid z-end B 0 distortion. To ensure consistent adjustments of B 0 and scanner reference frequency over the phantom at each repeat scan, the shim volume (also referred to as adjustments volume, adjust region, shim region, shim box) is identically sized and positioned on the phantom bottle for each scan (see Additional file 1). The scan protocol is kept identical for serial scans at each center. Centers were requested to use the same standard anterior chest coil each time.
The minimum fortnightly contribution to T1MES consists of conventional CMR scans: A) the initial localizers; B) at least one T 1 mapping sequence with the simulated electrocardiogram set at 67 beats per minute (inter-beat [RR] interval 900 ms). The T1MES QA program generates three main types of multicenter data: 1) raw data pertaining to long reference scans for T 1 (IRSE) and T 2 (SE) that we reconstruct on receipt; 2) raw T 1 mapping data from some specific centers without the ability to reconstruct their own maps locally, which we reconstruct on receipt; 3) reconstructed T 1 |T 2 maps (majority of sites). T 1 |T 2 values were taken as mean values from circular ROIs of fixed diameter in each of the nine tubes in pixel-wise maps.
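The ROI measurement described above can be sketched as follows. This is a minimal illustration only, not the T1MES analysis tool: the function name, the pixel-coordinate convention and the example map are all hypothetical, and the map is represented as a plain list of rows.

```python
def circular_roi_mean(pixel_map, center_row, center_col, radius_px):
    """Mean of all pixels whose centers lie within a circular ROI.

    pixel_map is a list of rows (e.g. a T1 map in ms); the ROI is the set
    of pixels at Euclidean distance <= radius_px from (center_row, center_col).
    """
    values = []
    for r, row in enumerate(pixel_map):
        for c, value in enumerate(row):
            if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius_px ** 2:
                values.append(value)
    if not values:
        raise ValueError("ROI contains no pixels")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical 5x5 map: a small 'tube' of 1000 ms pixels on a 0 background.
    demo_map = [[0.0] * 5 for _ in range(5)]
    for r, c in [(2, 2), (1, 2), (3, 2), (2, 1), (2, 3)]:
        demo_map[r][c] = 1000.0
    print(circular_roi_mean(demo_map, 2, 2, 1))
```

Using the same radius for every tube and every repeat scan keeps the measurement comparable between sessions, which is the point of a fixed-diameter ROI.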
Within the network are sites using identical magnets, coils and protocols providing an opportunity for a wide range of inter-sequence and inter-site analyses (scheduled for 2017).

Statistics
Statistical analysis was performed in the R programming language (version 3.0.1, The R Foundation for Statistical Computing). Descriptive data are expressed as mean ± standard deviation except where otherwise stated. Distribution of data was assessed on histograms and using the Shapiro-Wilk test. The coefficient of variation (CoV) between repeated scans was calculated as a measure of reproducibility. To define the model that describes the relation between ingredients and relaxation rates (R 1 |R 2 ), the fitted parameters were found by fitting a surface for both T 1 and T 2 using the MATLAB (The MathWorks Inc., Natick, MA, USA, R2012b) curve-fitting tool and the linear least-squares approach. The analysis of incoming T1MES datasets is carried out using a MATLAB graphical user interface. From the data, mean T 1 and T 2 values were measured from each of the nine contrast tubes. Using the ROI measurement tool in MATLAB, mean signal intensity of the central 50 % area of each of the nine tubes was calculated.
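For reference, the CoV between repeated scans is the standard deviation divided by the mean, expressed as a percentage. The sketch below assumes the sample (n - 1) standard deviation, which the text does not specify, and uses made-up T 1 values purely for illustration.

```python
import statistics


def cov_percent(values):
    """Coefficient of variation in percent: 100 * sample SD / mean.

    Assumes the sample (n - 1) standard deviation; the text does not
    state which estimator was used.
    """
    if len(values) < 2:
        raise ValueError("at least two repeated measurements are required")
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


if __name__ == "__main__":
    # Hypothetical repeated T1 measurements (ms) for a single tube.
    print(cov_percent([1000.0, 1010.0, 990.0]))
```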

Model predictions of T 1 and T 2
Linear models for the longitudinal and transverse relaxation rates R 1 |R 2 in terms of the ingredients agarose and NiCl 2 can be written, following similar work previously published [14], as:
$$R_x = a_x + b_x \, C_{w,\mathrm{agarose}} + c_x \, C_{\mathrm{Ni}^{2+}}$$
where x = 1, 2, C w,agarose and C Ni 2+ are the weight and molar concentration of agarose and Ni 2+ , respectively, and a x , b x and c x are found by surface fitting (Fig. 5). From these relationships and replacing relaxation rate R x by relaxation time T x we calculated the required agarose % (by weight) and Ni 2+ concentrations (equal to added molar concentration of NiCl 2 .6H 2 O as it is highly dissociated) for each of the 13 tube stock solutions as shown in Table 2.
The presented model was accurate within the root-mean-square errors (RMSE) in Fig. 5 caption over the range T 1 = 300-1900 ms and T 2 = 40-300 ms that cover the range of relaxation times expected in healthy and diseased myocardium pre- and post-GBCA.
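To illustrate how target relaxation times translate into a recipe, the sketch below assumes each rate has the linear form R x = a x + b x ·C w,agarose + c x ·C Ni 2+ and solves the resulting pair of linear equations for the two concentrations. The coefficient values are placeholders chosen only to make the example run; they are not the fitted values from Fig. 5, and the units (s^-1, % agarose by weight, mM Ni 2+) are likewise assumptions.

```python
from typing import NamedTuple


class LinearRateModel(NamedTuple):
    """Assumed model: R = a + b * agarose_pct + c * ni_mM (R in 1/s)."""
    a: float
    b: float
    c: float

    def rate(self, agarose_pct, ni_mM):
        return self.a + self.b * agarose_pct + self.c * ni_mM


# Placeholder coefficients for illustration only (NOT the published fit).
EXAMPLE_R1 = LinearRateModel(a=0.4, b=0.05, c=0.6)
EXAMPLE_R2 = LinearRateModel(a=0.5, b=10.0, c=0.7)


def required_concentrations(t1_ms, t2_ms, model_r1, model_r2):
    """Solve the two linear rate equations for (agarose %, Ni2+ mM)."""
    rhs1 = 1000.0 / t1_ms - model_r1.a  # target R1 minus intercept
    rhs2 = 1000.0 / t2_ms - model_r2.a  # target R2 minus intercept
    det = model_r1.b * model_r2.c - model_r1.c * model_r2.b
    if abs(det) < 1e-12:
        raise ValueError("rate models are not independent")
    # Cramer's rule for the 2x2 system.
    agarose_pct = (rhs1 * model_r2.c - model_r1.c * rhs2) / det
    ni_mM = (model_r1.b * rhs2 - model_r2.b * rhs1) / det
    return agarose_pct, ni_mM


if __name__ == "__main__":
    # Round trip: predict T1/T2 for a known recipe, then recover the recipe.
    t1 = 1000.0 / EXAMPLE_R1.rate(2.0, 1.5)
    t2 = 1000.0 / EXAMPLE_R2.rate(2.0, 1.5)
    print(required_concentrations(t1, t2, EXAMPLE_R1, EXAMPLE_R2))
```

Because agarose mainly shortens T 2 while Ni 2+ mainly shortens T 1, the two equations are well conditioned in practice, which is what makes independent targeting of T 1 and T 2 feasible with two ingredients.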

Reference T 1 and T 2 values
Comparison of 'gold standard' T 1 and T 2 values (Fig. 6) between the ten E-model phantoms tested confirmed reproducibility of manufacturing. Across the 9 tubes, CoV for T 1 ranged from 0.17 to 1.25 % at 1.5 T and 0.08 to 1.0 % at 3 T, while CoV for T 2 ranged from 0.74 to 2.12 % at 1.5 T and 0.40 to 1.72 % at 3 T.

B 0 uniformity
Final phantoms were free of air bubbles and susceptibility artifacts at both field strengths. T 1 maps were obtained in the specified mid-phantom slice at the specified scan setup, and were free from off-resonance artifacts (Fig. 2i, j). Provided the bottle was placed coaxial with z-axis, imaged as a transverse slice halfway along, and with the use of shimming as specified in the T1MES manual, B 0 uniformity was delivered (Fig. 7a) to within ±30 Hz at 3 T.
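To put the ±30 Hz figure in context, an off-resonance value in Hz can be converted to parts per million by dividing by the proton Larmor frequency in MHz. The sketch below assumes a proton gyromagnetic ratio of 42.577 MHz/T and a nominal field of exactly 3 T; actual scanner field strengths differ slightly from nominal, so the result is approximate.

```python
# Proton gyromagnetic ratio divided by 2*pi, in MHz per tesla.
GAMMA_BAR_MHZ_PER_T = 42.577


def off_resonance_hz_to_ppm(delta_hz, field_t):
    """Convert a frequency offset in Hz to ppm of the Larmor frequency.

    Hz divided by MHz gives parts per million directly.
    """
    larmor_mhz = GAMMA_BAR_MHZ_PER_T * field_t
    return delta_hz / larmor_mhz


if __name__ == "__main__":
    # 30 Hz at a nominal 3 T is roughly a quarter of a ppm.
    print(off_resonance_hz_to_ppm(30.0, 3.0))
```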

B 1 uniformity
The compact HDPE beads (~1 kg of compact pellets per phantom bottle) adequately flattened the B 1 field at 3 T (Fig. 7b), compared to the PMMA microbeads, sucrose and sodium chloride. The HDPE beads cause a speckle of dark regions in the gel matrix as they generate no MR signal that is normally detectable. The beads are expected to have similar diamagnetism to the gel so they have no impact on the B 0 field.

Temperature dependency experiments
Collectively the results (Fig. 8) by slow SE scanning methods show that over the range 15-30°C the short-T 1 tubes are more stable with temperature than the long-T 1 tubes where T 1 increased more strongly with temperature. T 2 values also change significantly with temperature (Fig. 8b), decreasing as temperature increases.

Short-term reproducibility
Test 1: Six loose tubes as used in the 1.5 T E-model (Fig. 9) showed a CoV of ≤1 % for both T 1 and T 2 reproducibility. Tube B with the longest T 1 and T 2 showed the greatest variability between repeated scans. Test 2: Test-retest evaluation of one of the final phantoms for 3 T by cardiac T 1 mapping, including complete repositioning and readjustments, also gave a short-term repeatability CoV for T 1 ≤1 % (Table 3 detailing results for 3 T). For T 2 measured by fast T 2 -prepared single-shot methods, the CoV was usually below 1 % with an exceptionally large 4.1 % in the tube B with longest T 1 .

Production, distribution and initiation of trial
On 1st September 2015 the E-model T1MES phantoms (batch numbers TTP15-001 and TTP30-001 for 1.5 T and 3 T respectively) received regulatory clearance by the Food and Drug Administration (FDA) and Conformité Européene (CE) marking as a Class I Medical Device (GMDN 40636). This initial experience of mass manufacturing phantoms was not entirely successful and important quality control lessons have been learnt: for example, two different fill solutions for tubes were accidentally mislabelled initially and had to be discarded and remade; individual tubes with visible bubbles on inspection had to be corrected with appropriate procedures; any solution stock with T 1 and/or T 2 not falling within ±3 % of our pre-specified targeted range had to be adjusted.
A total of 75 multi-vendor CMR scanners (four systems: Siemens, Philips, General Electric [GE] and Agilent) across five continents (Table 4), are currently using T1MES phantoms for their local T 1 mapping QA as part of the international T1MES program. This amounts to an initial 53 individual CMR centers and 69 devices, with six centers using the same field-specific phantom for QA scans on more than one local machine.

Discussion
Results obtained thus far demonstrate that: 1) mass production of phantoms to regulatory standards and in accordance with a rigorously repeatable process is feasible, 2) based on the sequences used, T 1 |T 2 times in gels are highly reproducible in the short-term, 3) a significant temperature dependency of measured T 1 |T 2 values exists in tubes with longer T 1 values that will require the use of a correction model.
The T1MES program seeks to advance the field of quantitative CMR relaxometry and the use of imaging biomarkers like T 1 mapping and ECV in clinical trials and clinical practice. Our aim was to collaborate with industry, with leading CMR academics and clinical centers with an interest in T 1 mapping, so as to develop and test a multicenter QA infrastructure, to protect normal reference data at centers and also potentially to improve consistency of T 1 mapping and ECV results across imaging platforms, clinical sites, and over time. Key to the achievement of accurate and reproducible T 1 mapping/ ECV results in CMR is the accelerated development and adoption of rigorous hardware and software standards.
However, this proposal is subject to a further limitation: the phantoms do not model other aspects of tissue, particularly magnetisation transfer in myocardium [22], nor do they address the mapping techniques' ability to discriminate T 1 values between adjacent regions of interest (the clinical challenge of discriminating tissue T 1 values in adjacent myocardial segments). For example, the signal-to-noise ratio in the phantoms is unrealistically high as the surface coils are typically nearer; evaluating such an ability is beyond the scope of T1MES. The only realistic aim may prove to be that of providing individual (or genuinely identical) centers with a QA phantom that could protect normal reference data and assure (or even permit correction of changes in) stability of protocols during a long study.
The 1-year study, now running, is expected also to give information about gel stability. It seems reasonable to expect sudden steps in T 1 values from genuine changes in the acquisition, or scatter from any remaining uncontrolled parameters or imperfect temperature correction, but there would be a gradual monotonic drift as the gel water content changes. Agarose gel is inherently unstable even within a sealed tube, because the gel contracts as water leaves it, appearing as excess water (as droplets) in the gap left by the contraction, often visible on the inner wall of the tube. Note that this effect can occur within well-sealed tubes. It is unrelated to contamination because agarose without added nutrients does not support mould growth.
Fig. 7 B 0 and B 1 field homogeneity. a B 0 field homogeneity across the nine phantom compartments as a measure of off-resonance in Hz at 3 T (single E-model phantom results). These are extremely small shifts in frequency (30 Hz = 0.25 ppm) at 3 T and should not be regarded as significantly different between the tubes. b Diagonal profile of the B 1 field (as per green discontinuous line in the inset) comparing relative flip angles on a Siemens 3 T system. Variance of B 1 was smallest across the 9 compartments with CoV 1.54 % for HDPE beads consisting of smooth, semi-translucent, colourless compact discs (as colouring in plastics has the potential to distort the B 0 magnetic field [12], see Fig. 2h) with a melt index >60°C. We chose pellets that had not been reground, reblended or made from composite material for this purpose. Highly monosized microbeads measured 6 μm and were composed of crosslinked PMMA polymer. Neither microbeads, sucrose nor NaCl were comparably effective in flattening the B 1 field. PMMA = poly methyl methacrylate. Other abbreviation as in Fig. 4
Over time, this shrinkage may also occur in the matrix fill leading to air-gaps and B 0 distortion, potentially occurring near the tubes making a possible contribution to an apparent drift in T 1 values over time. For the first time, the 1-year study will give large-scale initial data on the durability of this type of phantom. At study end, we aim to recall approximately 10 % of the phantoms which will be inspected for flaws in the gel using high-resolution 3D imaging, with collection also of long reference T 1 |T 2 data as gel drying with shrinkage and condensation into the gap is known to occur even within a sealed tube. Centers are free to keep and use the T1MES phantoms after the 1-year study ends. There is no provision for return shipment to the coordinating site, nor any knowledge of how long the gels will remain usable.
The field and temperature dependence of T 1 for phantoms containing Ni 2+ is much smaller than for those containing other paramagnetic ions like Cu 2+ . As T 1 increases above 500 ms (in tubes with a low concentration of Ni 2+ ), the tube's T 1 becomes more temperature-sensitive as it is increasingly dominated by the temperature-sensitive T 1 of water in the gel [24,25]. Therefore temperature monitoring of each fortnightly session is essential. Our results enable us to integrate a temperature-correction model into our multicenter T1MES analysis, which will be published at the end of the project. The temperature sensitivity of T 1 revealed in the present work may not be a concern for clinical T 1 mapping in healthy volunteers (as the human body is homeothermic, with a temperature of 37°C) but it may be a concern for hypothermic or febrile patients. Furthermore, T 2 temperature dependence could also impact measured T 1 as some fast T 1 methods have considerable T 2 sensitivity.

Conclusion
We report on the establishment of a collaboration to develop CMR phantoms to CE/FDA standards and an initial multicenter repeat scanning program aiming for global QA of T 1 and ECV protocols. A rigorous and reproducible manufacturing process for the phantoms has been established. The temperature sensitivity, short-term stability and inter-phantom consistency have all been assessed in support of the main project. An initial 69 phantoms with a multi-vendor user manual are now being scanned fortnightly in centers worldwide, permitting the academic exploration of T 1 mapping sequences, platform performance and stability over a year.
[...] (Test 1 in methods) performed on a D-model whole phantom (tube nomenclature differed from that used in E-models) comparing the stability of T 1 (a) and T 2 (b) values between two repeat experiments (2 days apart) at various temperatures between 15°C and 32°C on a 3 T Siemens Verio system. Whiskers represent mean ± standard error. (c) Temperature dependency experiment (Test 2 in methods) comparing T 1 |T 2 values in tubes A, B, C, D, E and I (middle right insert) from a final E-model phantom across five temperatures
Fig. 9 Short-term reproducibility. Short-term reproducibility (three runs) at the NIST laboratory (Test 1 in methods) for phantom T 1 values in six loose tubes (top left insert) from a final E-model phantom showing CoV of 1 % or less. Tube B with the longest T 1 |T 2 showed the greatest variability between reads. CoV = coefficient of variation