Unbiased measurements of reconstruction fidelity of sparsely sampled magnetic resonance spectra

The application of sparse-sampling techniques to NMR data acquisition would benefit from reliable quality measurements for reconstructed spectra. We introduce a pair of noise-normalized measurements, Rnoise/work and Rnoise/free, for differentiating inadequate modelling from overfitting. While Rnoise/work and Rnoise/free can be used jointly for methods that do not enforce exact agreement between the back-calculated time domain and the original sparse data, the cross-validation measure Rnoise/free is applicable to all reconstruction algorithms. We show that the fidelity of reconstruction is sensitive to changes in Rnoise/free and that model overfitting results in elevated Rnoise/free and reduced spectral quality.

The practical aspect of all NUS reconstruction is that the problem is underdetermined, so a model for the solution has to be imposed. The problem hence becomes non-linear. In NUS-reconstructed spectra this manifests as (a) reconstructed noise that is non-Gaussian (non-white), and (b) an absolute attenuation of the signal intensities that depends on the intrinsic noise and the sparsity. Notably, the precision of the signal height is still Gaussian distributed.
Due to the attenuation, the relative signal heights are perturbed in favor of strong signals. This effect is documented (Hyberts et al., J Biomol NMR (2013) 55:167-178). These effects are even more pronounced when compensating for "overfitting" (i.e., retaining the Lagrange λ at a finite value). Spectra in which such compensation is applied are hence possibly less useful, as they essentially add the width of the precision to the attenuation, reducing the signal heights further and aggravating the problem with relative signal heights.
Please also note that the above-mentioned publication describes a procedure for estimating the sensitivity of NMR spectra. This procedure is then also applied to NUS simulations. As this is done with hmsIST run to convergence (i.e., considered to be "overfitted"), artifacts would degrade the results if they were serious. The procedure evaluates whether the largest point in the spectrum is at the position of the known simulated signal. If it is elsewhere (i.e., the procedure picks up an artifact), the detection probability is lowered. It is, however, found that the detection probability is essentially the same for non-weighted sampling (SSW=0 in figure 4A) as for the uniformly sampled case. Hence it is unlikely that artifacts are a severe issue. That is, the problem of "overfitting" in NUS reconstruction may be considered minor.
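The detection test described above (is the largest spectral point at the known signal position?) can be sketched as a small Monte Carlo loop. This is only an illustration under simplifying assumptions: Gaussian noise is added directly in the frequency domain, no NUS reconstruction is performed, and the function name, spectrum size and signal position are hypothetical rather than taken from the cited publication.

```python
import random


def detection_probability(signal_height, n_points=128, n_trials=1000, seed=0):
    """Fraction of trials in which the tallest spectral point is the true signal.

    Assumption: unit-variance Gaussian noise at every point and a single
    on-grid signal; this is a toy stand-in, not the published procedure.
    """
    rng = random.Random(seed)
    true_index = n_points // 2  # hypothetical position of the simulated signal
    hits = 0
    for _ in range(n_trials):
        spectrum = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        spectrum[true_index] += signal_height
        # A trial counts as a detection only if the maximum sits on the signal;
        # a maximum anywhere else means noise (or an artifact) was picked.
        peak_index = max(range(n_points), key=spectrum.__getitem__)
        if peak_index == true_index:
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    for height in (0.0, 2.0, 4.0, 6.0, 8.0):
        print(height, detection_probability(height))
```

To apply the same test to NUS data, each simulated time-domain signal would be reconstructed first and the resulting spectrum passed through the identical "largest point" check.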
It is true that one can state that FM is a special case of IST (or rather the other way around, as FM is able to use target functions other than the minimum ℓ1-norm), but only when both have converged. During the process there is, however, a distinction, which Reviewer #1 alludes to: the measured data points are kept static throughout the FM minimization, but not throughout the IST process. The curves that are produced over the iterations of IST would hence most likely be very different from those created by FM with different numbers of iterations.
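The distinction can be made concrete with a toy thresholding loop that records, at every iteration, how far the back-calculated time domain is from the measured points. This is a generic sketch, not the hmsIST implementation: the naive DFT, the threshold rule (`scale` times the current residual peak), the test signal and all function names are assumptions made for illustration.

```python
import cmath
import math
import random


def dft(values, sign):
    """Naive O(N^2) DFT. sign=-1: time -> spectrum; sign=+1: unnormalised inverse."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * j * k / n) for k, v in enumerate(values))
        for j in range(n)
    ]


def ist_residual_norms(measured, sampled, n_iter=150, scale=0.9):
    """Run a thresholding loop and record the misfit at the sampled points.

    measured: full-length time-domain list, zero at unsampled indices.
    sampled:  set of indices that were actually acquired.
    """
    n = len(measured)
    spectrum = [0j] * n
    residual = list(measured)
    norms = []
    for _ in range(n_iter):
        norms.append(math.sqrt(sum(abs(r) ** 2 for r in residual)))
        res_spec = dft(residual, -1)
        peak = max(abs(s) for s in res_spec)
        if peak == 0.0:
            break
        threshold = scale * peak
        # Move only the part of each point that exceeds the threshold into the model.
        for j, s in enumerate(res_spec):
            mag = abs(s)
            if mag > threshold:
                spectrum[j] += s * (1.0 - threshold / mag)
        # Back-calculate the time domain and compare at the sampled points only.
        model = [v / n for v in dft(spectrum, +1)]
        residual = [measured[k] - model[k] if k in sampled else 0j for k in range(n)]
    return spectrum, norms


def make_example(n=32, n_sampled=16, seed=1):
    """Two noiseless on-grid signals observed on a random sampling schedule."""
    rng = random.Random(seed)
    sampled = set(rng.sample(range(n), n_sampled))
    full = [
        cmath.exp(2j * math.pi * 5 * k / n) + 0.4 * cmath.exp(2j * math.pi * 12 * k / n)
        for k in range(n)
    ]
    measured = [full[k] if k in sampled else 0j for k in range(n)]
    return measured, sampled


if __name__ == "__main__":
    measured, sampled = make_example()
    _, norms = ist_residual_norms(measured, sampled)
    for i in (0, 10, 50, len(norms) - 1):
        print(i, norms[i])
```

In this sketch the misfit at the sampled points starts out non-zero and shrinks with the iteration count, so a spectrum taken from an early iteration does not reproduce the measured data exactly. A scheme that holds the measured points fixed, as described for FM, would have zero misfit at those points at every step by construction.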
Remark 2 to previous Reviewer 2 and response: I am not convinced that Rnoise/work is as much a universal measurement of the reconstruction quality as it is a measurement of the spectral situation. Two additional situations that I would appreciate being investigated are: 1) As IST can very well be applied to traditional uniformly sampled spectra, it would be very interesting to see if there is a "quality" cutoff in similar manner with finite iterations. It would evaluate the universality of the definition of these parameters.
2) On page 5 of the manuscript it is stated "In NMR, however, each position on the directly observed dimension of a spectrum constitutes an independent reconstruction problem...". Indeed, this is the common reconstruction procedure, as it allows for a parallel approach, i.e., it shortens the reconstruction time by farming out the process to several independent computations. However, as this manuscript claims universality and a measure of quality, there is no problem in considering the observed dimension to be part of the reconstruction. This, as I see it, will alter the calibration of the noise.
I hear the authors' desire to provide measurements of goodness regarding reconstructions of non-uniformly sampled (NUS) NMR spectra. The issue of "overfitting" is, however, not a novel concern, but has been a quest of Jeffrey Hoch et al. for some time. Hoch et al.'s approach ought to be acknowledged within the manuscript and not just via references! (See e.g. Paramasivam et al., J Phys Chem B. 2012 Jun 28;116(25):7416-7427.)
In response to the reviewer's comment, the following sentence is added to the revised manuscript: "The issue of model overfitting has been raised by Hoch and colleagues 14,18, though no algorithm-independent measurements for such effects have been reported."
The deeper question is that of "correctness" and usefulness. Here I would like to allude to the so-called "Milkmaid problem", often associated with the introduction of Lagrange multipliers (e.g., http://www2.sjs.org/raulston/mvc.10/lagrange.intro.htm). The addition of λ lets us solve the problem in an efficient way; yet at the end of the solution, λ is always eliminated.
However, suppose we change the Milkmaid problem somewhat by adding that the water level of the river tends to change, and hence floods the meadow in various unpredictable ways. Then the question is which of the following solutions is "correct": (i) assuming the maximum height of the river, (ii) the average height of the river, or (iii) the average height of the river plus one standard deviation. Solutions (i) and (iii) retain the Lagrange λ at a finite value.
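One possible way to write the analogy down for an ℓ1-norm reconstruction is sketched here. The notation is assumed for illustration and does not come from the manuscript: s is the spectrum, d the measured time-domain points, and A the operator that back-calculates those points from s.

```latex
% Exact agreement with the measured points: the multiplier is only a device
% for finding the solution and does not appear in the final answer.
\min_{s} \; \lVert s \rVert_1 \quad \text{subject to} \quad A s = d

% Agreement only to within a tolerance: the multiplier is kept at a finite
% value and trades data misfit against the norm of the spectrum.
\min_{s} \; \tfrac{1}{2} \lVert A s - d \rVert_2^2 + \lambda \lVert s \rVert_1 ,
\qquad \lambda > 0
```

In this form the limit λ → 0⁺ recovers the exactly constrained problem, while any finite λ leaves a non-zero misfit at the measured points and shrinks the reconstructed intensities.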

There are two points we would like to emphasize here. First, with the same input data, different reconstruction algorithms have yielded strikingly different reconstruction spectra. Which one is the best representation of the real spectrum? Is it the one with the largest number of signals, the one with the smallest number of signals, or the one in between? How do we reach such a conclusion? On what basis do we reach such a conclusion?
Second, we would like to ask the reviewer to focus on the boxed areas in the left panel.
Most of these signals are missing in the maximum entropy reconstruction (ME), but are present in the reconstructions by MDD and SSA/FT. Yet, the latter two algorithms do not yield consistent signals either. Some of the reconstructed signals in the boxed regions are artifacts (based on comparison with a fully sampled reference spectrum). Can the reviewer tell which ones are genuine signals and which ones are artifacts in the absence of a fully sampled reference spectrum? We cannot.
The above examples illustrate that (1) the quality of reconstructed NMR spectra can vary greatly even with the same input data; (2) reconstruction artifacts can cause serious issues for NMR data interpretation; and (3) it is absolutely necessary to establish a fidelity measurement independent of the reconstruction algorithm for detection of inadequate modeling as well as overfitting, which is the focus of this study.
1) As IST can very well be applied to traditional uniformly sampled spectra, it would be very interesting to see if there is a "quality" cutoff in similar manner with finite iterations. It would evaluate the universality of the definition of these parameters.
The calculation of the noise free