Introduction

Mathematical modeling of biophysical effects has played a key role in the establishment, development, and routine clinical interpretation of magnetic resonance imaging (MRI)1. While proton spins, the primary source of signal in MRI, are governed by non-intuitive and complex quantum mechanics2, their ensemble and net magnetization can be well described as a classical mechanic physics system, formulated as a set of coupled differential equations3. Based on these well-validated Bloch equations, imaging scientists were able to simulate offline the interactions between the tissue magnetic properties and the scanner-generated radio-frequency (RF) irradiation. This capability has enabled the development of new pulse sequences and image acquisition strategies aimed for enhancing a variety of clinically meaningful tissue characteristics4,5.

Chemical exchange saturation transfer (CEST) MRI is an increasingly studied molecular imaging technique, capable of detecting millimolar concentrations of mobile proteins, peptides, and metabolites, via the exchange mechanism between the compound of interest labile protons and the water protons6. CEST imaging has been demonstrated as a promising tool for a variety of biomedical tasks including early stroke characterization7, cancer detection and grading8,9,10, kidney disease evaluation11,12, reporter gene imaging13,14,15, and metabolism characterization in neurological disorders16. Semisolid magnetization transfer (MT) imaging is another molecular MRI approach, where the data is acquired very similarly to CEST, yet it provides information on semisolid components, such as membranes or myelin sheets17. The semisolid MT contrast has been used for characterizing white-matter disease (such as multiple sclerosis), assessing the treatment response to cancer18, and increasing the quality of magnetic resonance angiography (MRA) images19.

Similar to conventional water-proton-based MRI, the progress of CEST and semisolid MT imaging was intertwined with the ability to formulate, explore, and solve the underlying mathematical model. Notably, such multi-proton-pool scenarios necessitate a substantial expansion of the Bloch model, which needs to account for the additional interplay between the exchangeable proton pools, the water pool, and the acquisition parameters. To meet these requirements, the expanded Bloch–McConnell (BM) model was developed20 and later served as a powerful tool for a variety of molecular imaging applications, such as optimizing the design of paramagnetic CEST contrast agents21, uncovering iopamidol’s multi-site exchange properties and pH sensitivity22, and the determination of the exchange parameters of various CEST compounds23,24.

By fitting experimentally acquired data to the BM equations, researchers were able to reconstruct quantitative medical images and map clinically useful biophysical properties such as pH25 and temperature26. While in steady-state CEST cases, the parameter fitting can be performed via analytical solutions of the BM equations27, this approach mandates a lengthy acquisition, which requires the use of multiple saturation powers, potentially resulting in impractical scan times (e.g., more than an hour28). On the other hand, applying accurate BM fitting for a clinically relevant, non-steady-state, and faster acquisition scheme requires using the numerical model solution, leading to exceedingly long reconstruction times (on the order of hours29).

Recently, several deep-learning-based strategies were developed to accelerate the molecular MRI pipeline. CEST Z-spectrum data were converted into phosphocreatine concentration maps using a neural network (NN) trained with BM-simulated signals29. The extraction of amide proton volume fraction and exchange rates, as well as semisolid MT parameters, was successfully performed in-vivo, using pseudo-random MR-fingerprinting (MRF) acquisition schemes, empowered by deep reconstruction strategies (typically trained using numerical dictionaries containing millions of entries)30,31,32,33.

However, all the above works rely on a computationally demanding generation of synthetic BM-based signals, requiring from hours to days, depending on the complexity of the imaging scenario addressed. This exhaustive and mandatory process must be repeated for every change in the acquisition protocol parameters or hardware available (e.g., modifying the recovery time, flip angle, duty cycle, or B\(_0\) field strength), and for every new biological scenario of interest (e.g., where the metabolite of interest or relaxation properties may vary). This restriction severely limits the generalizability and utilization of quantitative CEST/semisolid MT techniques for the broad spectrum of molecular imaging applications. In addition, the lengthy BM simulation times hinder the efficient optimization of any semisolid MT/CEST acquisition protocols, as such efforts mandate an accurate simulation of a huge number of BM signals, which become impractical for the Guassian saturation pulse trains typically required for clinical applications and for super-Lorentzian lineshapes.

Here, we designed a deep-learning-based system that can capture the nonlinear relations embedded in the molecular MRI Bloch–McConnell model, enabling a rapid and accurate generation of biologically realistic synthetic data. The system was designed to allow robustness and flexibility for various hardware and biological conditions while accommodating any acquisition protocol length. Validation was performed in-silico, in-vitro, and in-vivo for both classical CEST-weighted imaging and quantitative semisolid MT/CEST MRF.

Results

Dynamic simulation of molecular MRI signals

A fully connected NN was designed to receive tissue and scanner parameters of interest and rapidly generate the corresponding semisolid-MT or CEST signal (Fig. 1a). Each inference cycle was designed to output a single time-evolution signal element, which was then stored and served as input for the next cycle. This configuration enabled a synthesis unrestricted by acquisition protocol length. The accuracy of the rapid signal generator was initially examined in-silico, using six practical imaging scenarios34, involving the following molecular targets: L-arginine, semisolid MT, aliphatic relayed nuclear Overhauser enhancement (rNOE), amine, iohexol, and amide. Representative pairs of NN-predicted and ground truth signals (obtained via traditional numerical solution of the BM equations) are shown in Fig. 2. An excellent agreement between the NN-generated and the ground-truth signals was observed for the entire test set (Fig. 3), with a normalized root-mean-squared error (NRMSE) smaller than 3%, and a significant correlation across all cases (Pearson’s r > 0.99, p < 0.0001). The signal generation time was 63–92% shorter compared to the reference consensus signal generator31,35 when using the exact same desktop computer and CPUs, and 78–96% shorter when employing a GPU. Exact generation times for all imaging scenarios are availalble in Supplementary Information Table 1. To investigate the adaptive NN performance as a function of the signal element iteration, we calculated the NRMSE across different iteration steps for all imaging scenarios presented in Fig. 2 (see Supplementary Information Figure S1). No evident trend was obtained between the iteration step and the error.

Table 1 Tissue and scanner parameter range used for dynamic NN training.
Figure 1
figure 1

Network architectures. (a) A dynamic fully connected NN receives tissue parameters, scanner parameters, and the previous inference cycle signal S\(_i\). The output is the next time-evolution signal element S\(_{i+1}\). The inference cycle continues according to the user’s desired signal acquisition length N. (b) Application Optimized network. The input of the network are tissue and scanner parameters. The output is the entire magnetic signal inferred at once, according to a particular value N used during the training. T\(_{1w}\)—water longitudinal relaxation, T\(_{2w}\)—water transverse relaxation, T\(_{1s}\)—solute longitudinal relaxation, T\(_{2s}\)—solute transverse relaxation, f\(_{si}\)—proton volume fraction for solute i, k\(_{swi}\)—proton exchange rate for solute i, \(\omega _b\)—chemical shift, T\(_p\)—saturation pulse duration, T\(_{rec}\)—recovery time, B\(_0\)—main magnetic field, FA—flip angle, \(\omega _{rf}\)—saturation pulse frequency offset, B\(_1\)—saturation pulse power, n\(_p\)—number of pulses in the saturation pulse train.

Figure 2
figure 2

In-silico comparison between dynamic-NN-generated semisolid MT/CEST signals, and their ground-truth counterparts. Each subfigure describes a representative test-set signal trajectory taken from a different molecular imaging scenario34, all generated using a single NN (Fig. 1a). In all cases, a series of varied saturation pulse powers (B\(_1\)) was applied, except for the In-vivo MT scenario where both the B\(_1\) and the saturaion pulse frequency offset (\(\omega _{rf}\)) were simultaneously varying. Note the excellent agreement between the reference ground-truth standard and the NN-generated trajectories.

Figure 3
figure 3

Statistical analysis of the in-silico signal generation experiment performed using the dynamic NN. An excellent agreement was obtained between the NN-predicted signal trajectories and their ground truth counterparts (Pearson’s r > 0.99, p < 0.0001, normalized root mean square error (NRMSE) < 3%) for all six examined imaging scenarios.

In-vitro CEST MRF based on NN-generated signals

To further validate the applicability of deep signal generation for standard imaging applications, an in-vitro phantom study was conducted. A 50 mM L-arginine phantom, composed of three vials titrated to different pH levels (yielding different proton exchange rates), was scanned at 9.4T. A dictionary of 665,873 simulated signals (Fig. 3—top left), corresponding to various combinations of the water pool and the amine proton exchange parameters36,37 was generated using the NN-based and the standard reference method. A traditional dot-product-based MRF reconstruction was then performed to quantify the L-arginine properties using the two dictionaries. The resulting proton exchange rate (Fig. 4b,f) and concentration (Fig. 4a,e) maps obtained using both methods were in excellent agreement (Supplementary Information Table 2), demonstrating an NRMSE < 2% and a structural similarity index (SSIM) > 0.90.

Table 2 Tissue and scanner parameter range used for application-optimized NN training.
Figure 4
figure 4

Magnetic resonance fingerprinting quantitative reconstruction of in-vitro CEST and in-vivo semisolid MT data using dynamic-NN-generated signals. (a,b,e,f) L-arginine concentration and proton exchange rate maps were obtained using dynamic-NN-generated signals (top) and traditional numerical solutions of the BM equations (bottom). (c,d,g,h) In vivo semisolid MT proton volume fraction and exchange rate maps from a wild-type mouse, obtained using dynamic-NN-based signals (top) and traditional numerical solution of the BM equations (bottom).

In vivo semisolid MT MRF based on NN-generated data

As another means to assess the NN-based signal generator under realistic settings, a semisolid MT brain imaging experiment was performed in a wild-type mouse scanned at 9.4T. A dictionary composed of 26,400 signals representing various tissue parameter combinations was generated using the same dynamic NN used in the L-arginine phantom study. The resulting dictionary was used for dot-product MRF reconstruction of the semisolid MT proton volume fraction and exchange rates across the mouse’s brain and compared to the corresponding images reconstructed using a traditional dictionary synthesis. The resulting exchange parameter maps generated using both methods (Fig. 4c,d,g,h) were in excellent agreement (Supplementary Information Fig. S2), demonstrating an NRMSE < 4% and a structural similarity index (SSIM) > 0.98.

Application optimized ultrafast signal generation

For applications focused on a single exchangeable proton pool or a particular pathology of interest, the optimization of the signal-to-noise ratio, or parameter quantification ability is highly desirable, but the flexibility to accommodate a variety of imaging cases and various acquisition schedule lengths can be relaxed. To accommodate these situations, we have designed another NN-based molecular signal generation architecture (Fig. 1b), where the dynamic component is absent, and the entire signal trajectory (including all its time-evolution components) is generated simultaneously. While such architecture can only enable a fixed-size signal generation, it inherently offers increased acceleration compared to the dynamic NN (Fig. 1a), requiring only a single inference cycle. The system was validated using a challenging imaging scenario, where CEST signals from the human brain need to be synthesized while considering the simultaneous effects from seven proton pools (amide, guanidine, amine, OH, NOE, semisolid MT, and water)17. The simulation was performed using a clinically relevant saturation pulse train, which is particularly lengthy to simulate using standard signal generators. A comparison between a representative NN-generated Z-spectrum from the above-described case and its traditionally simulated counterpart is shown in Fig. 5a. The excellent agreement between the entire test-set results obtained with both methods is shown in Fig. 5b (Pearson’s r > 0.99, p-value < 0.0001, NRMSE < 1%). Notably, the test-set generation time was shortened from 25.2 hr (90,719 s, Supporting Information Table 1) using the reference method to 1.6 s using the application optimized NN, implemented using the exact same (non-GPU) hardware.

Figure 5
figure 5

In-silico generation of a seven-pool human brain Z-spectrum data, acquired with a Sinc-Gaussian saturation pulse train, using an application optimized NN. (a) A representative brain CEST signal generated using the NN-predicted (blue) or the reference standard method (black). (b) Statistical analysis of the entire test set data, demonstrating an excellent agreement between the NN-predicted and ground truth reference values (Pearson’s r > 0.99, p < 0.0001, normalized root mean square error = 0.0003).

Discussion

The BM equations have played a vital role in the progress of semisolid MT and CEST imaging. Their solution provides offline insights on protocol design, field strength dependencies38, and predicted compound-related effects, and is being routinely used in a variety of imaging tasks, such as traditional Bloch fitting39 and semisolid-MT/CEST MRF32. While various deep-learning-based strategies were recently proposed for accelerating/quantifying molecular semisolid MT/CEST imaging29,30,34,40,41,42,43,44, most of these computational approaches require an exhaustive simulation of the underlying BM model that needs to be repeated for millions of different tissue parameter combinations. The proposed NN-based signal generator is expected to complement these quantitative CEST efforts, providing a drastic acceleration at a different point in the imaging pipeline—the dictionary generation step. Moreover, the same signal generator could be used for accelerating classical Bloch-fitting of traditional CEST data. Notably, the dynamic nature of the suggested framework (Fig. 1a) offers a robust framework for accommodating any signal length and a wide variety of tissue and scanner parameters while circumventing the need to re-synthesize data for a newly examined imaging protocol and biological scenario. This characteristic is expected to enable rapid comparison between the encoding capability of various-length quantitative CEST protocols, assist in choosing the optimal field strength for a given application, and improve the contrast to noise-ratio in semisolid MT or CEST-weighted imaging.

While analytical solutions for semisolid MT and CEST-weighted signals were previously derived27,38,45,46, they rely on several inherent assumptions that yield inaccuracy at particular exchange regimes and are incompatible with many saturation pulse trains and pulse shapes. As numerically solving the BM equations in such cases is extremely computationally demanding, the deep signal generator becomes particularly attractive, as its inference time is agnostic to the pulse shape, saturation pulse train characteristics, and the macromolecular Lorentzian/super-Lorentzian absorption lineshape. Specifically, a 56,699-fold acceleration was demonstrated here (Fig. 5) to simulate a seven-pool CEST Z-spectrum acquisition using a sinc-Gaussian pulse trains (without a GPU).

The choice of the fully connected architecture for the deep signal generator was based on its previous success in capturing semisolid MT/CEST signal dynamics30 and as a means to minimize the model complexity, training, and inference time. Future work could nonetheless explore more sophisticated recurrent-based architectures47, which may further improve the system’s accuracy (though with a potential penalty in computational time).

A single dynamic model (with a fixed set of optimized neural network weights) was able to accurately generate semisolid MT/CEST signals (NRMSE < 3%, Pearson’s r > 0.99, p < 0.0001) for six different imaging scenarios (Figs. 2 and 3). The same model could be expanded and trained to accommodate additional pulse-sequence diagrams and various readout patterns used by the MRI community. This process is expected to be relatively straightforward via the use of the open-source pulseq standard format sequences31. Similarly, the same model could be used to generate traditional water pool T\(_1\) and T\(_2\) signal dictionaries for conventional MRF48 while taking into consideration the magnetization transfer effect on the measured relaxivity49,50.

This work has used a Pulseq-CEST-based rapid numerical solver of the Bloch–McConnell equations as the ground-truth reference31,32. While Pulseq-CEST has been used in various previous reports30,32,35, there is still an ongoing effort by the research community to compare different codes and reach a consensus on CEST simulation51,52. The deep learning approach described in the manuscript can easily be adapted and re-trained using any other simulator-based signals, to accommodate a future consensus.

To assess if the network performance are dependent on the tissue and acquisition parameters, we have separately calculated the NRMSE as a function of water relaxivity, proton volume fraction and exchange rate, and the saturation pulse power, for a representative imaging scenario (L-arginine, Figs. 2 and 3). As shown in Supplementary Information Fig. S3, while a very mild dependency was demonstrated for the water relaxivity, an increased error was demonstrated for the slowest proton exchange rates, the close to zero saturation pulse powers and the close to zero proton volume fractions. This can be explained by the relatively low CEST/semisolid MT signal associated with a very slow exchange, a highly diluted CEST compound, and inefficient saturation powers. This limitation could potentially be mitigated, by intentionally generating additional training signals from this challenging SNR regime.

As demonstrated in Figs. 3 and 5b, the application optimized NN has demonstrated increased correlation between the NN-predicted signals and the ground truth, compared to the adaptive NN and faster inference times (Supporting Information Table 1). To further compare the outputs for the two networks, we evaluated their performance on the same representative example, the amine imaging scenario described in Figs. 2 and 3. As shown in Supplementary Information Fig. S4, while both networks demonstrate a significant agreement with the ground-truth reference (pearson’s r > 0.99, p < 0.0001) , the NRMSE obtained by the application optimized network is lower (0.59% compared to 1.1%) and the inference time shorter (0.82 s compared to 1.42 s). However, the network is only suitable for a pre-determined acquisition protocol length, and is mostly relevant for a researcher/physician with a research focus on a particular imaging application (e.g., brain amide proton transfer CEST weighted imaging using a fixed full-length Z-spectrum). The adaptive NN, on the other hand, offers a wide applicability to a variety of imaging scenarios (Fig. 2), while retaining sufficiently low error rates and practically reasonable inference times.

Conclusion

A deep learning-based framework for dynamic and rapid generation of semisolid molecular MT and CEST MRI signals was developed, demonstrating broad applicability with various imaging scenarios and acquisition protocols.

Methods

Deep molecular signal generator architecture

Two deep learning models were designed to receive a set input of tissue and scan parameters and rapidly generate the corresponding semisolid MT or CEST signals. Both architectures consisted of a four-layers fully connected neural network, with 256 x 256 neurons in the two hidden layers and sigmoid activation functions. The dynamic network variant (Fig. 1a) was designed to accommodate a two-pool imaging scenario of various target compounds using a continuous wave saturation pulse scheme with a wide range of saturation and readout characteristics (Table 1), and a variety of field strengths (B\(_0\) = 3T, 4.7T, 7T, 9.4T, and 11.7T). The network operated in an iterative manner, where each inference cycle i yielded a single time-evolution signal element (S\(_i\) = |M\(_{xy_{i}}|\)). In the next cycle, the last estimated signal served as additional input for generating S\(_{i+1}\) until reaching the required signal length N. Training was performed using 104,190,000 random signal trajectories from the tissue and scanner parameter range described in Table 1. The data was split between training and validation sets in a ratio of 90/10%, employing an early-stopping approach to prevent over-fitting. A separate set of 1,694,183 signals from various imaging scenarios34 was synthesized as a test set.

The application-optimized network variant (Fig. 1b) was designed to obtain a further acceleration in signal generation time, with the cost of removing the flexibility for variable output signal acquisition lengths. The network was oriented to address a seven-proton-pool human brain imaging scenario at a variety of field strengths (B\(_0\) = 3T, 4.7T, 7T, 9.4T, and 11.7T), with various tissue parameter values enabled, based on the baseline properties described in17,53. A flexible CEST protocol, consisting of Sinc-Gaussian pulses, variable pulse duration, variable number of saturation pulses, recovery time, and B\(_0\) magnetic field was considered, and the output consisted of an N=34 long acquisition schedule, obtained in a single inference cycle. The training was performed using 5,443,200 random signal trajectories from the tissue and scanner parameter range described in Table 2, with a 90/10% split between training and validation data. A separate set of 259,200 signal trajectories (at B\(_{0}\) = 4.7T) was synthesized for the test phase.

For both network variants, the training was performed using signal dictionaries generated via a numerical reference solution of the Bloch–McConnell equations (see next section), employing the adaptive moment estimation (ADAM) optimizer, a mean square error loss, a learning rate of 0.0001, and a batch size of 2048. The NNs were realized using the TensorFlow framework, trained using a GeForce RTX 3060 TI graphic processor, and an Intel i5-11400 2.6 GHz 12-core CPU desktop. The total training time was 640 min and 35 min for the dynamic and application-optimized network variants, respectively. Inference times were calculated both with and without the use of the GPU (Supporting Information Table 1).

Reference standard molecular signal generator

Dictionaries of simulated signal intensity trajectories were generated using a state-of-the-art Bloch–McConnell equations numerical solver31,35, implemented in C\(^{++}\) according to the open-source pulseq-based standard54,55. To provide a fair comparison for the MRF application, requiring the simultaneous and exhaustive generation of a huge number of signal trajectories, we have activated the pulseq-signal generator in a parallel execution manner32 (see the code availability section). The simulations were implemented on an Intel i5-11400 2.6 GHz 12-core CPU desktop.

CEST phantom imaging

An in-vitro MRF imaging study was performed using an L-arginine phantom composed of three different vials with a 50 mM concentration and a pH titrated to 5.0, 5.5, and 6.0 pH (affecting the amine proton exchange rate)36,37. Imaging was conducted using a 9.4T MRI scanner (Bruker Biospin, Billerica, MA), a transmit/receive volume coil (Bruker Biospin, Billerica, MA), a field of view (FOV) of 32 x 32 mm\(^2\), a matrix of 64 x 64 pixels, and a 5 mm slice thickness. An in-house programmed, flexible semisolid MT/CEST-EPI protocol was used, employing a pseudo-random varied series of saturation pulse parameters, as described in36,37.

In vivo semisolid MT mouse imaging

All animal experiments and procedures were performed in accordance with the NIH Guide for the Care and Use of Laboratory Animals and were approved by the Institutional Animal Care and Use Committee of the Massachusetts General Hospital. The study is reported in accordance with ARRIVE guidelines. A single adult C57/BL6 wild-type male mouse (28 gr), was purchased from Jackson Laboratory for a proof of concept imaging experiment. It was anesthetized using 1–2% isoflurane and placed on an MRI cradle with ear and bite bars to secure the head. The respiration rate was monitored with a small animal physiological monitoring system (SA Instruments, Stony Brook, NY), and the temperature was maintained by blowing warm air in the bore of the magnet. A quadrature volume coil was used for RF transmission, and a mouse brain phased array surface coil was used for receive (Bruker Biospin, Billerica, MA). A field of view (FOV) of 19 x 19 mm\(^2\), a matrix of 64x64 pixels, and a 1 mm slice thickness were used. A pseudo-random semisolid-MT MRF protocol was implemented, where the saturation pulse power and frequency offset were varied between differently acquired raw images, as described in30.

Statistical analysis

Pearson’s correlation coefficients were calculated using the open-source SciPy scientific computing library for Python56. The structural similarity index (SSIM) was computed using the SSIM-python imaging library (PIL). Differences were considered significant at p < 0.05.