Minimum Reporting Standards for in vivo Magnetic Resonance Spectroscopy (MRSinMRS): Experts' consensus recommendations

The translation of MRS to clinical practice has been impeded by the lack of technical standardization. There are multiple methods of acquisition, post‐processing, and analysis whose details greatly impact the interpretation of the results. These details are often not fully reported, making it difficult to assess MRS studies on a standardized basis. This hampers the reviewing of manuscripts, limits the reproducibility of study results, and complicates meta‐analysis of the literature. In this paper a consensus group of MRS experts provides minimum guidelines for the reporting of MRS methods and results, including the standardized description of MRS hardware, data acquisition, analysis, and quality assessment. This consensus statement describes each of these requirements in detail and includes a checklist to assist authors and journal reviewers and to provide a practical way for journal editors to ensure that MRS studies are reported in full.


| INTRODUCTION
Despite over 30 years of development and thousands of papers describing the use of in vivo MRS for non-invasive research in health and disease, including diagnosis and treatment monitoring across a broad range of human conditions, MRS has yet to reach full clinical acceptance. 1 While there remain several important technical issues, 2 one of the major problems is the lack of standards for reporting MRS studies. The importance of such standards can be described on several levels. First, there is increasing concern in the general scientific community over the lack of rigor and reproducibility of scientific studies. 3 Details of MRS methodologies need to be fully reported for readers to critically evaluate the quality of the published results and to reproduce the experiments. Second, recent meta-analyses and evidence-based reviews of MRS 4 have noted the lack of detail in peer-reviewed publications, which makes it difficult to compare study results. Third, the lack of reporting guidelines for MRS means that new researchers in the field find limited guidance on practice. As a result, MRS studies are sometimes conducted using inappropriate or incorrect methods that may lead to erroneous and/or inconsistent conclusions. Finally, MRS is a versatile method that finds application across fields where there may be insufficient peer expertise to provide critical technical evaluation of methods and analysis. A core set of standards for the rigorous reporting of MRS studies will help to ensure that MRS studies can be adequately reviewed to standards accepted by the specialist MRS community.
This lack of consistency in reporting was highlighted in a recent meta-analysis of MRS studies in chronic pain, 5 leading those authors to propose a minimum quality assessment guide, MRS-Q. 5 To this end, an expanded set of guidelines for minimum and recommended reporting requirements is presented in this paper. The origin of these guidelines was a panel at the 2016 International Society of Magnetic Resonance in Medicine (ISMRM) workshop "MR spectroscopy: from current best practice to latest frontiers." These minimum and recommended requirements were then reviewed and amended by authors selected from the ISMRM Magnetic Resonance Spectroscopy Study Group who have reviewed at least 10 MRS-focused papers for the following journals: Magnetic Resonance in Medicine, NMR in Biomedicine, Journal of Magnetic Resonance, Radiology, and Magnetic Resonance Materials in Physics, Biology, and Medicine. These are well-established peer-reviewed specialist journals that have focused on MRS-related topics, which ensures that the authors are considered experts in the technical aspects of MRS and experienced in its scientific use. Recognizing the need to include input from less experienced authors, we also included two trainees as authors to review and edit the manuscript to ensure it was clear to authors new to the field. We then formed the Experts Working Group on reporting standards for MRS, comprising collaborators with more than 5 years of experience in MRS methodology and application who either have served for many years as reviewers for the main MRS journals or are editors of those journals, and who support the paper's recommendations. This follows the same pathway to consensus as the other consensus papers in this special issue. 6-17 These consensus papers provide context for the recommendations made here; as indicated throughout the paper, authors new to the field should consult them for further details.
In order to facilitate implementation of these guidelines, a checklist of minimum requirements for the publication of MRS studies was developed (Table 1). Authors would complete the table and submit it to the journal, in addition to their manuscript, for review, or use the table to check whether all essential parameters have been listed in the Methods part of the manuscript, with the table subsequently to be included as an appendix to the article. For single site or nucleus studies the first column should be used, and for multisite or multisequence studies it is recommended to complete additional columns as appropriate. Likewise, in the appendices of this paper, several examples have been provided to illustrate how this table should be completed. The model follows checklists such as STARD, 18 CONSORT, 19 PRISMA, 20 and STROBE. 21 This will enable editors, reviewers, and ultimately readers to be sure of the MRS methodology employed in particular studies, and to ensure that sufficient details are available to those intending to reproduce or extend the studies or use the results for meta-analyses. The checklist (Table 1) will also help to standardize the presentation of MRS information and provide journals less familiar with MRS with a systematic way of certifying the methods used.

| REPORTING GUIDELINES
Below we set out in five sections the important pieces of information about an MRS study that are to be considered as either requirements or recommendations, along with reasons why these are considered important. A more in-depth description of terminology and abbreviations to be used can be found in the work of Kreis et al, 11 while a fuller discussion of several concepts is to be found in other consensus papers in this special issue. 6
c. Model. Hardware differences exist depending on the model of the scanner, for example the bore size and gradient hardware, which impact on B0 homogeneity and echo-planar spectroscopic imaging performance, respectively. 22 The software version is often omitted, but should also be given whenever possible, as some special features such as frequency correction and shimming algorithms may differ between software versions.
d. RF coils. The RF coil information should include the nuclei the coil is tuned to so that it is clear which nuclei are observed. For double-tuned coils, both nuclei should be indicated with a forward slash in between. As the coil design can have a major impact on the data acquired, it is important to include all the relevant details of the coil such as whether a single coil was used for transmit and receive and/or the number of channels for phased array receive and transmit coils. If it is not a standard manufacturer product coil, further details such as the design of the coil should be included, along with a reference to a previous publication that may provide more detail.
e. Additional hardware. Finally, details should be included of any additional hardware used, such as shim/gradient inserts, dielectric pads, or any other modification of the hardware used for data acquisition.
ii. Location of volume(s) of interest (VOI(s)), eg posterior cingulate gyrus, M. tibialis anterior, internal capsule of prostate, etc. A figure that displays the VOI on anatomic images is recommended.
iii. Nominal VOI size [cm3
f. Water suppression method (and any other suppression methods used, eg lipid suppression, outer volume suppression).
g. Shimming method, 10 reference peak used for assessing shim performance, and thresholds for "acceptance of shim" chosen.
h. Triggering method, if used (respiratory, peripheral, cardiac triggering, including device used and delays).
i. Frequency and motion correction methods, if used (prospective or retrospective, external tracker or navigator method).
Rationale:
a. Pulse sequence. The pulse sequence dictates the parameters that need to be described under additional sequence parameters. Citing the original article that first introduced and described this sequence in detail is recommended, along with outlining important deviations from the original sequence and stating whether the sequence is vendor supplied or customized.
b. Location. The voxel location is the anatomical position of the VOI selected for single-voxel spectroscopy or the excitation or selection volume in MRSI methods. It should be described in the checklist table in brief and in the manuscript be either shown in a figure or described in detail with anatomical landmarks. It is important to address concerns regarding regional specificity of results and possible tissue-specific effects (ie for brain gray matter, white matter, and cerebrospinal fluid content).
c. VOI size. The VOI size must include the dimensions along the right-left, anterior-posterior, and superior-inferior directions with anatomical referencing if relevant. For MRSI, this should be the excitation volume. Regional analyses for MRSI can use signal averaging over multiple voxels, which should also be described in detail if used.
It is important to show example spectra obtained from these regions to allow the reviewer and reader to assess the quality of the data. The spectra should be representative, and, if possible, visualize the studied effect by comparison with a reference spectrum (eg healthy subject/tissue versus patient/affected tissue; physiological conditions such as rest versus end of exercise for muscle). See also the quality assurance section. In recognition of the limited space for figures in some journals, a figure containing the VOI and corresponding MR spectra could be placed in the appendix or supplemental data. See Figure 1 for an example.
d. Timing parameters. The timing parameters (echo time, TE, and repetition time, TR) are considered essential parameters as these will affect the way spectra appear. TE and TR lead to differential T2 and T1 relaxation effects, and the size of these effects differs between metabolites. The importance of this is best illustrated by considering total creatine (tCr) and total choline (tCho). The methyl signal of tCr is a commonly utilized internal reference peak; however, its T2 relaxation constant is shorter than that for tCho. 23,24 This means that for studies with long TE (eg 144 ms) the tCho/tCr peak height or area ratio will be larger than for studies with shorter TE (eg 30 ms). This can lead to a misinterpretation of differences between two different studies if the T2 relaxation difference is not considered. For STEAM sequences, the mixing time (TM; the time between the second and third 90° RF pulses) will affect the evolution of multiquantum coherence, 25 and so may impact quantification even if the effective TE is the same between two studies. Similar effects can be seen when the timings of standard vendor-implemented sequences are changed for other acquisition schemes, and so information on any such changes should always be provided.
f. Additional sequence parameters. These will be determined by the sequence used for acquisition. For most methods, however, it would be appropriate to describe the spectral width in hertz and the number of data points acquired. If any frequency offset is used, it should also be described.
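The TE dependence of the tCho/tCr ratio described under timing parameters can be made concrete with a short numerical sketch. It assumes mono-exponential T2 decay and ignores T1 weighting and J-evolution; the two T2 values are hypothetical placeholders, chosen only so that tCr decays faster than tCho, and are not measured values from this paper or its references.

```python
import math

def t2_weight(te_ms, t2_ms):
    """Mono-exponential T2 attenuation of a signal at echo time te_ms."""
    return math.exp(-te_ms / t2_ms)

def apparent_ratio(true_ratio, te_ms, t2_num_ms, t2_den_ms):
    """Apparent numerator/denominator signal ratio observed at te_ms."""
    return true_ratio * t2_weight(te_ms, t2_num_ms) / t2_weight(te_ms, t2_den_ms)

# Hypothetical T2 values in ms (illustration only, not literature values)
T2_TCHO_MS = 250.0
T2_TCR_MS = 150.0

# Same underlying tCho/tCr ratio (1.0), observed at two echo times
short_te = apparent_ratio(1.0, 30.0, T2_TCHO_MS, T2_TCR_MS)
long_te = apparent_ratio(1.0, 144.0, T2_TCHO_MS, T2_TCR_MS)

print(f"apparent tCho/tCr at TE = 30 ms:  {short_te:.3f}")
print(f"apparent tCho/tCr at TE = 144 ms: {long_te:.3f}")
```

Under these assumptions the apparent ratio grows with TE although the underlying ratio is unchanged, which is why two studies can only be compared when TE (and any relaxation correction) is reported.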
For MRSI methods, the necessary details include the FOV and matrix size so that the nominal volume of MRS voxels can be determined.
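As a minimal sketch of why FOV and matrix size are sufficient to recover the nominal voxel volume, the calculation can be written as follows. The function name and the example FOV and matrix values are illustrative assumptions; an unencoded dimension is represented here by a matrix size of 1 with the slab thickness as its FOV.

```python
from math import prod

def nominal_voxel_volume_cm3(fov_mm, matrix):
    """Nominal MRSI voxel volume in cm3 from per-axis FOV (mm) and matrix size."""
    if len(fov_mm) != len(matrix):
        raise ValueError("fov_mm and matrix must have the same number of axes")
    voxel_mm = [f / n for f, n in zip(fov_mm, matrix)]  # voxel edge per axis
    return prod(voxel_mm) / 1000.0  # 1 cm3 = 1000 mm3

# Hypothetical 2D MRSI acquisition: 160 x 160 mm FOV, 16 x 16 matrix,
# 15 mm slab thickness (not phase encoded, hence matrix size 1)
volume = nominal_voxel_volume_cm3((160.0, 160.0, 15.0), (16, 16, 1))
print(f"nominal voxel volume: {volume:.2f} cm3")
```

This gives the nominal volume only; the effective volume after k-space weighting or filtering is larger and is not captured by this calculation.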
Acceleration methods (such as parallel imaging, compressed sensing, or spatial-spectral encoding) can be used to reduce the scan times required for MRSI methods and should be described with details of the method and parameters used. 26 Similarly, k-space weighting of the acquisition should also be described, such as whether full or elliptical k-space sampling is used or retrospective filters used (eg Hamming), and any k-space zero-filling factors applied. These factors will impact the effective, ie resultant, voxel volume, which should also be stated if known. 14 Additional modifications to the default settings of the pulse sequence should be described. For example, if a frequency offset for excitation of water-suppressed scans is used to address chemical shift differences between the water reference and metabolite scans, this should be specified, either as offset frequency from water or as chemical shift value in parts per million (ppm) on the standard MRS frequency axis.
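The two equivalent ways of reporting an excitation frequency offset mentioned above (hertz relative to water, or ppm on the chemical shift axis) can be converted into one another as sketched below. The water position of 4.7 ppm, the 1H frequency of 127.7 MHz for a nominal 3 T system, and the sign convention (a positive offset means a higher frequency and therefore a higher ppm value) are assumptions made for this illustration and should be replaced by the values and convention of the actual scanner.

```python
WATER_PPM = 4.7  # assumed position of water on the chemical shift axis

def offset_hz_to_ppm(offset_hz, f0_mhz, ref_ppm=WATER_PPM):
    """Chemical shift (ppm) corresponding to an offset in Hz from the reference."""
    return ref_ppm + offset_hz / f0_mhz

def ppm_to_offset_hz(ppm, f0_mhz, ref_ppm=WATER_PPM):
    """Offset in Hz from the reference for a given chemical shift (ppm)."""
    return (ppm - ref_ppm) * f0_mhz

F0_MHZ = 127.7  # assumed 1H frequency at a nominal 3 T

# Centering excitation at 2.7 ppm instead of on water
offset = ppm_to_offset_hz(2.7, F0_MHZ)
print(f"offset from water: {offset:.1f} Hz")
print(f"back-converted:    {offset_hz_to_ppm(offset, F0_MHZ):.2f} ppm")
```

Because the conversion depends on the field strength and on the assumed water position, reporting either value together with the field strength keeps the description unambiguous.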
Reporting of sub-echo times (TE1, TE2), if known, is also recommended.
FIGURE 1 Representative spectrum and voxel location. Representative PRESS spectrum from the posterior cingulate acquired on a 3 T Philips Ingenia. In this figure the raw data from one participant are shown; however, mean data with SDs, multiple data sets, and fitted data may also be shown, as long as the raw data are presented in a fashion that allows an assessment of data quality. The chemical shift axis is labeled in ppm units. Data were collected in accordance with the WMA Declaration of Helsinki.
g. Water and fat suppression. Water suppression is a key element of the data acquisition in proton MRS (1H MRS), as both the method used and the degree of water suppression can greatly influence the spectral quality and analysis of the data. The type of water suppression used should be specified if specific water suppression methods are selected. If the authors used the default water suppression method for their choice of pulse sequence, it is acceptable to report "Standard," as manufacturers often do not specify which water suppression method is used. If there are parameters related to water suppression such as "weak water suppression" as specified on Siemens systems, or the bandwidth of the water suppression pulses, these should be listed. For further details see the consensus paper by Tkáč et al 6 in this special issue. As with water, fat suppression techniques may also impact data quality, and if used, specifics should also be listed (eg frequency offset, number and location of outer voxel suppression bands, bandwidth).
h. Shimming method. Similarly, different shimming methods may be selected at the time of acquisition. In most cases, authors will utilize the vendor-provided automated shimming, which usually involves the use of a gradient echo field map to optimize the B0 field homogeneity, but may employ other methods (eg "pencil beam" VOI in Philips). If a vendor-supplied methodology is used, the authors should state this, state whether first, second, or third order shims were used, and describe whether or not the resulting linewidth was measured and used for quality assurance. Ideally, studies should measure the linewidth (to be reported as full-width at half-maximum, FWHM) for the unsuppressed water resonance or a specific metabolite peak 11 in each examination and report the threshold at which shimming was considered acceptably achieved, and how this was assessed (eg system reported results for shim, phase or magnitude spectrum, or other
16 If more than one software package was used, they should all be listed along with the aspects of the analysis for which they were used.
b. Deviations in processing steps. Any automatic and manual processing steps deviating from a software package's default analysis have to be listed: for example, changes to phasing, frequency alignment, eddy current corrections. For phased array coils, any alterations to coil combination should be described. In addition, it should be described if these changes are performed on single acquisitions before averaging. References should be provided that describe the methodology and its specifics rather than publications that simply utilize the method.
c. Quantitative output measures. The output measure of the spectral analysis should be described. There are three main ways that MRS metabolite concentrations can be described.
First, MRS results are often reported as a ratio of the primary metabolite to another. This can be done using the ratio of the peak area measurements, or ratio of relative concentrations, which accounts for the number of resonant nuclei in each compound. It is important to indicate which metabolite is used as the denominator.
The second method is to report the metabolites as "institutional units," which is the signal reported by the software, normalized such that measures at different time points or from different subjects can be compared. This normalization usually stops short of all steps required to report conventional concentration estimates. The most basic measure is based on the peak height of the metabolite, but this is greatly influenced by the linewidth, and therefore reporting the area under the curve (or the equivalent measure for time-domain fitting) is recommended. In either case the baseline fitting method should be described. A common approach for normalization to institutional units in 1H MRS is to take the ratio of the fitted metabolite signal to the fit of the unsuppressed water resonance.
Finally, metabolite concentrations can be expressed in "absolute units" (standard chemical units, such as millimoles per kilogram wet weight, molar, or molal) using conversion methods that usually rely on multiple assumptions (eg an assumed tissue content for the reference component). In order to provide estimates of metabolite concentration, contributions to the signal from different tissue compartments should be considered. For the brain this means that gray matter, white matter, and cerebrospinal fluid volumes calculated via voxel segmentation should be reported as appropriate, especially if water is used as the internal reference. If relaxation correction is applied, listing the T2 and T1 values used (and/or a suitable reference) is necessary.
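The difference between the two forms of ratio described under the first output measure (ratio of peak areas versus ratio of relative concentrations) can be sketched as follows. The peak areas are hypothetical, and the sketch assumes that both signals carry the same relaxation weighting; the proton counts used (nine for the tCho trimethyl singlet, three for the tCr methyl singlet) reflect the number of resonant nuclei contributing to each peak.

```python
def area_ratio(area_num, area_den):
    """Ratio of fitted peak areas (numerator metabolite / denominator metabolite)."""
    return area_num / area_den

def concentration_ratio(area_num, nuclei_num, area_den, nuclei_den):
    """Ratio of relative concentrations: each area is divided by the number
    of resonant nuclei contributing to that peak before taking the ratio."""
    return (area_num / nuclei_num) / (area_den / nuclei_den)

# Hypothetical fitted peak areas in arbitrary units
TCHO_AREA, TCHO_PROTONS = 3.0, 9  # trimethyl singlet
TCR_AREA, TCR_PROTONS = 4.0, 3    # methyl singlet

print(f"tCho/tCr area ratio:          {area_ratio(TCHO_AREA, TCR_AREA):.2f}")
print(f"tCho/tCr concentration ratio: "
      f"{concentration_ratio(TCHO_AREA, TCHO_PROTONS, TCR_AREA, TCR_PROTONS):.2f}")
```

The two numbers differ by a factor of three in this example, so a reported "tCho/tCr" value cannot be interpreted unless the type of ratio and the denominator are stated.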
d. Quantification references and assumptions, and model fitting. Some software packages utilize model-fitting methods for spectral analysis. In those cases, the models used should be described in detail, as the number of metabolites used can greatly impact the result; for example, it should be stated whether the models were simulated, and if so using what software (eg VESPA, 36 GAMMA, 37 FID-A, 32 NMR-PROBE or NMRSCOPE in jMRUI, 28 MARRS, etc). The basis set used should be described either as the "default" basis set provided with the software or, if it was modified, by listing which metabolites were included in the basis set. In addition, the fitting model also has to be specified in terms of implemented parameter relations and constraints. This must be spelled out in full if deviating from default parameter sets for the specific versions of the fit packages or quoted literature reference.
Moreover, for brain 1H MRS spectra, information on how the macromolecule signals were handled in the fitting procedure is mandatory. This can be done either by using a spectrum of macromolecules acquired in vivo, or by a mathematical approach, which is usually incorporated in the software package. When the mathematical approach is used, details of how it was done also need to be mentioned, ie for QUEST in jMRUI
Linewidths are typically measured as the FWHM of the fitted resonance. For 1H MRS, this may be done using the water resonance and determined either at the time of acquisition during the pre-scan shimming routine or post hoc through a spectral analysis of the water spectrum.
These values should be reported to ensure that spectra are of adequate quality to analyze. In non-1H MRS, usually the most prominent singlet resonance in the spectrum is used to measure linewidth (eg PCr). Linewidths can also be obtained from the output of fitting packages, where they would usually indicate the linewidth of specific metabolite signals. It is important to specify the origin of the linewidth indicated.
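One possible post hoc way of measuring a linewidth directly from a spectrum is sketched below; it is not the method of any particular software package. The function name is hypothetical, and the sketch assumes a single positive, baseline-free peak on an ascending frequency axis in hertz, with the half-maximum crossings located by linear interpolation between neighboring points.

```python
def fwhm_hz(freq_hz, spectrum):
    """Estimate the FWHM (Hz) of the tallest peak in a real-valued spectrum."""
    peak_idx = max(range(len(spectrum)), key=spectrum.__getitem__)
    half = spectrum[peak_idx] / 2.0

    def crossing(step):
        # Walk away from the peak while the next point is still above half maximum
        i = peak_idx
        while 0 <= i + step < len(spectrum) and spectrum[i + step] > half:
            i += step
        j = i + step
        if not 0 <= j < len(spectrum):
            raise ValueError("peak does not fall below half maximum on this axis")
        # Linear interpolation between point i (above half) and point j (at or below)
        frac = (spectrum[i] - half) / (spectrum[i] - spectrum[j])
        return freq_hz[i] + frac * (freq_hz[j] - freq_hz[i])

    return crossing(+1) - crossing(-1)

# Synthetic Lorentzian line with a true FWHM of 8 Hz, sampled every 0.1 Hz
TRUE_FWHM_HZ = 8.0
freqs = [i * 0.1 - 50.0 for i in range(1001)]
line = [1.0 / (1.0 + (2.0 * f / TRUE_FWHM_HZ) ** 2) for f in freqs]

estimate = fwhm_hz(freqs, line)
print(f"estimated FWHM: {estimate:.2f} Hz")
```

A value obtained this way from the displayed spectrum can differ from the linewidth reported by a fitting package or by the scanner's shim routine, which is one reason the origin of the reported linewidth needs to be stated.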
b. Data exclusion criteria. The data exclusion criteria should specifically provide the thresholds for which data were excluded, whether they were based on SNR, linewidth, and/or other quality measures, and the specifics of this measure, as this can bias the overall analysis of the study data. 18 For example, "subjects were excluded if the SNR of tCr was less than 5 or the FWHM was greater than 12 Hz." Note that, to avoid bias, 38 if CRLBs are used as exclusion criteria they should not be in the form of percentage values of a metabolite of interest that can have a small value in an individual subject, but rather be formulated in absolute concentration units (or relative to a stable reference metabolite). It is also important to describe how many subjects or voxels per subject cohort were eliminated based on the specified criteria.
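The example criteria quoted above (SNR of tCr below 5, or FWHM above 12 Hz) can be applied and reported as in the following sketch. The subject identifiers and quality values are invented for illustration, and the helper names are hypothetical. The second function shows the conversion of a relative CRLB into absolute units, which is the form recommended above for exclusion thresholds.

```python
SNR_MIN = 5.0        # tCr SNR threshold from the example in the text
FWHM_MAX_HZ = 12.0   # linewidth threshold from the example in the text

def exclusion_reasons(snr_tcr, fwhm_hz):
    """Return the list of criteria a spectrum fails (empty list means keep)."""
    reasons = []
    if snr_tcr < SNR_MIN:
        reasons.append(f"SNR {snr_tcr:g} < {SNR_MIN:g}")
    if fwhm_hz > FWHM_MAX_HZ:
        reasons.append(f"FWHM {fwhm_hz:g} Hz > {FWHM_MAX_HZ:g} Hz")
    return reasons

def absolute_crlb(concentration, crlb_percent):
    """Convert a relative CRLB (%) into the units of the concentration estimate."""
    return concentration * crlb_percent / 100.0

# Hypothetical per-subject quality measures: (id, SNR of tCr, FWHM in Hz)
subjects = [("S01", 12.0, 7.5), ("S02", 4.2, 9.0), ("S03", 15.0, 13.1)]

excluded = {}
for subject_id, snr_tcr, fwhm in subjects:
    reasons = exclusion_reasons(snr_tcr, fwhm)
    if reasons:
        excluded[subject_id] = reasons

print(f"excluded {len(excluded)} of {len(subjects)} subjects: {excluded}")

# The same absolute uncertainty gives very different relative CRLBs
print(absolute_crlb(10.0, 5.0), absolute_crlb(1.0, 50.0))
```

Recording the reason for each exclusion alongside the count makes it straightforward to report how many subjects or voxels were eliminated under each criterion.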
c. Quality measures of model fitting. Additional measures of goodness of fit, or fit error, should be reported where applicable (eg CRLB for lower bound of the fit error, or SD). If reproducibility or repeatability 11 of a measure has been shown, it is recommended to report it to demonstrate the robustness of single measures.
d. Representative spectrum. Finally, one of the most important methods of quality control is visual inspection of the MR spectrum by experienced users or MRS experts (note: in MRSI visual inspection of metabolic maps becomes equally important 14 ). Sample spectra are required so that both reviewers and readers can assess the quality and interpretation of the MRS data (see the 'Acquisition' section for the selection criteria for representative spectra). While a single spectrum may of course not reflect the quality of all of the data, it does provide a general assessment.
In contrast, selecting a single spectrum from thousands of spectra in MRSI may not reflect the overall MRSI data quality, and so maps, with details of how they are scaled and exemplar spectra displayed, are more appropriate. The requirements for this spectrum include the following.
(1) The raw spectrum must be shown, not the fitted data alone, as the fitted data do not reflect the SNR or potential systematic artifacts. If the baseline is calculated, it is recommended to show it. Fitted data are recommended to be shown in addition as an overlay to reflect quality of model fitting.
(2) The x axis or chemical shift axis should be displayed with units in ppm.
(3) If the spectrum is apodized for display purposes, the apodization parameters should be given in the figure caption.
An example is shown in Figure 1. As described above, spectra can be included in a figure that also presents the voxel location to meet space and figure constraints for specific journals. In addition, for a more complete representation of spectral quality in the study, the average spectrum across all data points, with SDs around this average, may be displayed for each studied cohort. In the checklist, the figure number should be given so that the figure can be easily referenced; this also serves to indicate its presence.
The details described above should be included in the text of the manuscript or in a supplemental methods section. The MRSinMRS checklist (Table 1) is intended to be a reference for the author and the reviewer as well as the reader, and while it should be included as supplemental material it is not intended to replace the manuscript text. Example checklists based on information from existing publications are provided in the appendices of this paper (Appendix 1, Appendix 2, Appendix 3, Appendix 4) to provide guidance to authors as to the details that should be included in the checklist. Those items in the checklist that are in italics are details that should have been but were not included in these publications, illustrating further the value of including the checklist to ensure that all important details are included in the manuscript.

| CONCLUSION
These minimum reporting guidelines for MRS should allow the field to improve the rigor and critical examination of reported results, improve the reproducibility and comparability of studies, and provide new entrants to the field with detailed guidance as to reporting practices. To assist authors in reporting and reviewers in assessing these essential and recommended parameters, we have provided a simplified checklist (Table 1). It is hoped that this checklist will facilitate writing for authors, improve analysis for journal reviewers, and provide an easy way for journal editors to ensure that MRS studies are reported in full. In addition, reporting requirements, if checked early, encourage researchers to consider these aspects ahead of time, hopefully before data collection has commenced. While it is preferred that details of the MRS acquisition and analysis are included in the main text, the MRS reporting checklist can also be provided as part of the appendix or supplementary material of a submission and used in the review process, as with many other manuscript checklists such as PRISMA, STARD, CONSORT, and STROBE. (Researchers may also find the MRS-Q V1 form at Open Science Framework (https://osf.io/8s7j9/) useful.) Adherence to these minimum requirements and recommended guidelines is expected to ensure that all MRS papers provide the necessary information to reproduce studies as well as provide a basis for comparison for the evaluation of the studies across clinical domains. As with initiatives in other fields of biological and clinical research, it is expected that this will improve reproducibility and validity, and strengthen the field going forward.

SUPPORTING INFORMATION
Additional supporting information may be found online in the Supporting Information section at the end of this article.
h. Shimming method, reference peak, and thresholds for "acceptance of shim" chosen
1st and 2nd order, vendor standard method (DESS sequence in "advanced shim" mode until convergence), line-width of PCr peak was evaluated post hoc
i. Triggering or motion correction method (respiratory, peripheral, cardiac triggering, incl. device used and delays)
Subjects were instructed to push the pedal only during times without RF excitation or signal reception, cued by gradient noise. Adherence to the protocol was inspected via data from the force sensors.

Data analysis methods and outputs
a. Analysis software
31P MR spectroscopy data were processed from raw data exported from the scanner using in-house developed Python scripts (http://www.python.org) for phasing and channel combination. Signals were phased to the highest peak magnitude of PCr in the frequency domain after 7 Hz Lorentzian apodization and 4 × zero-filling. The channel combination was then performed by weighted averaging of the raw data (that is, without apodization and zero-filling). Weights were calculated as proportional to signal, averaged over four resting spectra (excluding the fully relaxed spectrum). Spectra were then fitted in AMARES, as implemented in jMRUI,
d. Quantification references and assumptions, fitting model assumptions
Quantification relative to total 31P signal, which was assumed to be constant. End-exercise PCr depletion relative to post-exercise asymptotic value of monoexponential fit of recovery

Data quality
a. Reported variables (SNR, linewidth (with reference peaks))
SNR was calculated using the partially saturated resting spectra of each time series by dividing the PCr peak amplitude by the SD of the signal in a region containing only noise, 15 ppm off-center across 1/16 of the total bandwidth. Linewidths were taken from the AMARES fit of the PCr peak.
b. Data exclusion criteria
>10% changes of sum of total 31
Items listed in italics are details that were not included in the paper that served as the source for this example.
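The SNR definition reported under "Data quality" in the example above (peak amplitude divided by the SD of the signal in a noise-only region) can be sketched as follows. This is not the script used in the source study: the function name, the synthetic spectrum, and the choice of the sample SD are assumptions made for illustration, and the noise region limits must be chosen so that the region contains no signal.

```python
import statistics

def snr_from_noise_region(ppm_axis, spectrum, peak_amplitude, noise_range_ppm):
    """Peak amplitude divided by the sample SD of the points in noise_range_ppm."""
    low, high = sorted(noise_range_ppm)
    noise = [s for p, s in zip(ppm_axis, spectrum) if low <= p <= high]
    if len(noise) < 2:
        raise ValueError("noise region must contain at least two points")
    return peak_amplitude / statistics.stdev(noise)

# Synthetic spectrum: alternating +1/-1 "noise" with one peak of amplitude 100
ppm_axis = [i * 0.5 - 20.0 for i in range(81)]
spectrum = [(-1.0) ** i for i in range(81)]
spectrum[40] = 100.0  # peak at 0 ppm

example_snr = snr_from_noise_region(ppm_axis, spectrum, 100.0, (13.0, 17.0))
print(f"SNR: {example_snr:.1f}")
```

Because the result depends on where the noise region is placed, on its width, and on whether amplitude or area is used for the peak, these choices belong in the reported SNR definition.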