Nuts and bolts of 4D-MRI for radiotherapy

Magnetic resonance imaging (MRI) is increasingly being used in the radiotherapy workflow because of its superior soft tissue contrast and high flexibility in contrast. In addition to anatomical and functional imaging, MRI can also be used to characterize the physiologically induced motion of both the tumor and organs-at-risk. Respiratory-correlated 4D-MRI has gained large interest as an alternative to 4D-CT for the characterization of respiratory motion throughout the thorax and abdomen. These 4D-MRI data sets consist of three spatial dimensions and the respiratory phase or amplitude over the fourth dimension (opposed to time-resolved 4D-MRI that represents time in the fourth dimension). Over the last 15 years numerous methods have been presented in literature. This review article provides a comprehensive overview of the various 4D-MRI techniques, and describes the differences in MRI data acquisition and 4D data set generation from a methodological point of view. The current status and future perspective of these techniques are highlighted, and the requirements for safe introduction into the clinic (e.g. method validation) are discussed.

since 4D-CT only allows for transversal imaging, the imaging orientation in 4D-MRI can be chosen at will. Furthermore 4D-MRI does not use ionizing radiation, which may be especially beneficial for pediatric patients. Moreover, 4D-MRI can picture both tumor and OARs, which is often difficult in 4D-CT, due to its limited softtissue contrast (see figure 1 and supplementary video 1 (stacks.iop.org/PMB/63/21TR01/mmedia)). Unfortunately, 4D-MRI cannot readily be generated on a clinical MRI system, since currently none of the major MR vendors offer 4D-MRI as product implementation. To fill this gap, a considerable number of 4D-MRI methods have been published in the literature over the past 15 years. In general, these implementations acquire MR data over a longer period of time, to sample sufficient respiratory cycles, in combination with a respiratory surrogate signal that is used to sort the data into different respiratory phases or displacements.
This review focuses on 4D-MRI implementations for radiotherapy planning and treatment. Throughout this review, the term 4D-MRI refers to respiratory-correlated 4D-MRI, unless otherwise specified. In a respiratorycorrelated 4D-MRI the four dimensions indicate the three spatial dimensions and the respiratory phase. This is different from time-resolved 4D-MRI, where the fourth dimension is represented by time. In our search for 4D-MRI implementations for radiotherapy, we have found a total of 61 papers describing different techniques and focusing on different aspects of the implementation (up through March 2018). Especially over the last three years, a large increase in the number of papers describing different 4D-MRI techniques was observed (see table 1 at the end of this review). To put all these techniques into perspective, the review will categorize these techniques and describe the different considerations that have to be made when implementing 4D-MRI.
The main focus of this review is to: (1) Provide a detailed analysis of the intricacies of the MR physics behind the various acquisition and reconstruction strategies that are currently proposed for 4D-MRI generation.
(2) Describe and categorize the different 4D-MRI acquisitions published in literature.
(3) Highlight potential pitfalls for the acquisition and reconstruction of a 4D-MRI data set.
(4) Provide an outlook on future research and developments in the field of 4D-MRI. This review is divided into different sections. First, basic MR theory is described to understand the different implementations that are published. The second part focuses on the different acquisition strategies that are used throughout literature and how respiratory surrogate signals, which are required to generate a respiratorycorrelated 4D-MRI data set, are extracted. Next, the generation of such respiratory-correlated 4D-MRI data sets is described alongside the validation of the generated data sets. Finally, the clinical applications are discussed and an outlook is included where developments and future research are described.

MR signal theory
MRI physics is often considered challenging and more complex than other imaging modalities. Here, we will briefly introduce some of the basic principles of MR Image formation that are required to understand the methods described in the remainder of this paper. For a more comprehensive background, including signal formation fundamentals, the reader is referred to the excellent textbooks by Haacke et al (1999) and Bernstein et al (2004), in particular chapters 1, 4, 10, 14, 15, and 11, 13, 14, 17, respectively. Concepts like spin polarization, excitation, and signal detection are assumed to be known to the reader and are not covered here for conciseness. MRI experts may choose to skip this section.   in a cholangiocarcinoma patient. The 4D-MRI was acquired with a 3D golden-angle radial stack-of-stars sequence and reconstructed using the XD-GRASP (extra-dimensional golden-angle radial sparse parallel imaging) algorithm. Compared to 4D-CT, 4D-MRI demonstrates increased soft-tissue contrast for gross tumor volume delineation in pink (a), (b) and reduced stitching artifacts (white arrow) compared to 4D-CT, which is also observed in the zoomed images (blue rectangle). Table 1. Overview of the different 4D-MRI implementations published in literature. n/a = information not available; cine = first acquire all dynamics for one slice, then move on to next slice; sequential = first all slices for one dynamic, repeat for N dynamics; continuous = continuous scanning (mostly for 3D); bSSFP = balanced steady-state free precession; SPGR = Spoiled gradient echo; FLASH = fast low angle shot; T 2 -TSE = T 2 -turbo spin echo; HASTE = Half-Fourier acquisition single-shot turbo spin echo; GRE = gradient echo; methods that do not use a surrogate and are lacking a sorting domain are time-resolved 4D acquisition (generally low spatial resolution and/or coverage).

Authors
Region Surrogate XCAT phantom n/a n/a n/a n/a n/a n/a 1.0 × 1.0 mm 2 , 5 mm 512 × 512 mm 2 , 250 mm n/a In MRI, spatial encoding is different from most other imaging modalities as image formation happens in the physical (spatial) and frequency (Fourier) domain. After a spatially selective radiofrequency (RF) pulse the encoding is performed using the 'field gradients' of the MRI scanner, which introduce a spatially varying magnetic field. In effect, these field gradients allow local manipulation of the Larmor frequency of the hydrogen spins, which links the spatial encoding to the frequency domain.

Selective excitation
The process of image formation starts with a selective excitation RF pulse that restricts the excitation to a 2D plane or slab of interest within the patient. A selective excitation is achieved by applying a field gradient during the excitation pulse. This creates a range of resonance frequencies along the axis of the applied gradient. Since the excitation is only effective within the frequency band of the RF pulse, the excitation is restricted to the area in which the resonance frequency and RF frequency match. As a result, signal can only originate from this area. The gradient strength in combination with the frequency bandwidth of the RF pulse determines the thickness of the excited slice or slab. By changing the center frequency of the RF pulse one can alter the location of the excited slice along the applied gradient.

In-plane localization
Once the slice (or slab) is excited, spatial encoding along the remaining dimensions is performed to form an image. Again, the field gradients are used. By applying a linear field gradient the resonance frequency is varied linearly along that particular dimension. This introduces a phase difference between spins at different locations. The amount of phase dispersion is dependent on the amplitude and the duration of the gradient (i.e. the timeintegral of the gradient pulse). Since the MR signal is the sum of the entire spin ensemble, the detected signal is actually a measure of phase interference between spins at various spatial frequencies. The native domain, in which the signal is acquired, is therefore the spatial frequency domain, called k-space, where the k-number denotes the spatial frequency. Intuitively, the center of k-space encodes the low spatial resolution and includes information on bulk motion, while the peripheral k-space encodes the higher spatial frequencies (i.e. sharpens the anatomical structures) and holds information of spatially localized motion. In Cartesian imaging, the lines in k-space are acquired along a rectilinear grid. According to the Nyquist theorem the extent of k-space determines the resolution (Δx) of the image, whereas the distance between the k-space samples (Δk) determines the field-of-view (FOV). Once all k-lines are collected a simple 2D Fourier transform (2DFT) provides the reconstructed image.

2D versus 3D acquisitions
In order to collect volumetric imaging data one could repeat the process described above for multiple slice locations. For each slice the location is adjusted by changing the center frequency of the RF pulse. In theory one is free to choose the order in which the slices are collected in, but on most scanners the ordering is limited to sequential (one after another, in subsequent order: 1,2,3,…, n) or interleaved (first the even slices, followed by the odd slices: 2, 4, 6,…, n, 1, 2, 3 …, n − 1). Normally interleaved would be preferred to mitigate slice crosstalk caused by the imperfect slice excitation profile: due to imperfections in the slice excitation pulse, the slice excitation profile is never a perfect block pulse, but rather a rounded shape (i.e. a block pulse convolved by a Gaussian with a certain full width half maximum). As a result a portion of the spins experience excitation twice when adjacent slices are excited directly after one another. Interleaved scanning allows more time for relaxation before the adjacent slice is excited, thereby mitigating the effect of cross-talk. The thinner the slice, the more difficult it is to produce a rectangular slice profile. Moreover, signal-to-noise ratio (SNR) is linearly dependent on the slice thickness, where thinner slices produce lower SNR. Hence, the slice thickness in 2D multi-slice imaging is usually limited to 3-4 mm.
A solution that overcomes the problem of slice profile imperfections is to acquire a 3D volume instead of multiple 2D slices. In 3D acquisitions the entire volume (i.e. a thick slab) is excited with every RF pulse and a second phase encode axis is used to encode the third dimension. Effectively the slice encoding is thus replaced by Fourier encoding. The advantage of 3D versus 2D acquisitions is that 3D imaging allows isotropic high-resolution imaging, since the slice thickness is not limited by the slice excitation profile. Because the entire volume is excited with every RF pulse (and hence is contributing to the signal) the signal-to-noise (SNR) ratio is typically higher for an equivalent slice thickness. The disadvantage of 3D acquisitions is that the repetition time (T R ) is typically short, which may be limiting in terms of the contrast that is produced. Also, a 3D readout is longer than a 2D readout, so motion during image acquisition is more problematic when unaccounted for.

Cartesian versus non-Cartesian readout trajectories
During the acquisition, k-space is sampled via a path defined by the applied gradients. There are many trajectories possible. The appearance of the image is to a large extent determined by the design of the trajectory. In particular the vulnerability to image artifacts is strongly influenced by the readout trajectory.
Most clinical exams, sample k-space along a Cartesian trajectory: a single gradient, say G x , is applied, which samples the points in a straight line along the k x dimension (figure 2(a)). For each line, the phase encode gradient (G y ) moves the trajectory to a different k y position. A Cartesian readout is a very robust acquisition scheme, since errors in gradient timing, off-resonance effects, or other imperfections only have an effect along a single direction in k-space. The artifacts are therefore either small or predictable. This is the main reason why Cartesian readouts are often the method of choice in clinical exams. A disadvantage of Cartesian imaging is the fact that it produces strong, coherent, aliasing artifacts when the sampled data does not fulfill the Nyquist criteria (i.e. when the acquisition is undersampled), although to a certain extent this can be overcome by the use of parallel imaging techniques, such as SENSitivity Encoding (SENSE) and Generalized Autocalibrating Partially Parallel Acquisitions (GRAPPA) (Pruessmann et al 1999, Griswold et al 2002 (see section 2.5 for more details).
Non-Cartesian readouts are defined as readouts that do not follow the Cartesian grid. By applying a combination of gradients any path through k-space can be traversed. Examples that are offered by most vendors are radial and spiral figures 2(b) and (c). Non-Cartesian readouts are often used to overcome some of the limitations of Cartesian imaging. Radial readouts, for example, have the advantage that the center of k-space is sampled by every readout line, which makes it very robust against intraview motion (i.e. motion during acquisition). Spiral readouts provide very efficient k-space coverage and make optimal use of the gradient performance. Spiral readouts are therefore often used for fast imaging, such as real-time cardiac MRI (Yang et al 1998, Nayak et al 2005. Another benefit of non-Cartesian imaging is the fact that undersampling artifacts are incoherent in imaging space, which is beneficial for iterative reconstruction techniques, like compressed sensing that allow higher acceleration factors (R > 8) compared to the traditional parallel imaging techniques (R < 5) by acquiring less data points.

Image acceleration
Most 4D-MRI methods described in literature use standard acceleration methods such as partial fourier (PF) and parallel imaging to reduce scan time. In PF acquisitions a portion of k-space is deliberately not collected (often up to 3/8 or 7/16 of k-space). PF methods rely on the property of the Fourier transform that real functions have conjugate symmetry in k-space. The simplest PF reconstruction fills the missing lines with zeros, although this inherently results in considerable blurring in image space. Other methods synthesize the uncollected lines in k-space by exploiting conjugate phase symmetry in k-space. The assumption of conjugate phase symmetry, however, is only valid when the imaged object is real. In most cases, however, the object has a non-constant phase across the image. Most PF methods that synthesize the missing data in k-space therefore contain some form of phase correction based on the portion of the data that is symmetrically acquired around k = 0. The minimum amount of acquired data is therefore slightly more than 50% for 1D PF (typically around 65%). Higher acceleration factors can be achieved with parallel imaging. Parallel imaging methods use the spatially varying coil sensitivity profiles in multi-channel receiver coil arrays to subsample k-space (for example by skipping every other line) (Roemer et al 1990, Sodickson and Manning 1997, Pruessmann et al 1999, Griswold et al 2002. Various accelerated parallel imaging reconstructions have been proposed that either work in the image domain or in k-space. SENSE (Pruessmann et al 1999) operates in the image domain: normally skipping every other line in an acquisition would result in an aliased image. SENSE uses the fact that as long as the acquisition is subsampled with equidistant spacing the signal will alias in well-defined locations. The aliased signal in the acquired images can therefore be written as the sum of the signal from known locations in the unaliased image weighted by the coil sensitivities at these locations. Solving for this set of equations provides the de-aliased image. For this the coil sensitivity information is required, which is usually acquired in a separate scan. The k-space counterpart of SENSE is GRAPPA (Griswold et al 2002). GRAPPA aims to fill in the missing data lines in k-space before the data is Fourier transformed into image space. The assumption that is made is that each point in k-space can be synthesized by a linear combination of neighboring data points in the other coil images as long as their coil profiles are sufficiently different. This can be understood by considering that the signal in the image domain is modulated by the coil profile. The effect of this is identical to a convolution in k-space. Each coil therefore effectively samples a slightly shifted line in k-space, which implies that more than one k-space line could be reconstructed from the data if one knew the relationship between the different coils. GRAPPA works by finding the linear coefficients (weights) between the coils necessary to effectively deconvolve the coil profiles and reconstruct multiple lines from different coil signals. The different weights between the target points (the missing points in k-space) and the source points (the acquired neighboring data) are estimated by a reference scan, called the auto-calibration scan (ACS). This can either be a fully sampled central portion of the (undersampled) k-space or a separately acquired reference scan. The relationship between the target points is independent of the location in k-space, and therefore the weights that are estimated from the center of k-space can be applied to the missing data points throughout the rest of k-space. The amount of acceleration depends on the coil geometry and undersampling pattern. In general higher acceleration factors can be achieved for 3D acquisitions, as the undersampling (and thus resulting aliasing) can be divided over two dimensions. Typical acceleration factors for current clinically available multi-channel (e.g. 32 channels) arrays range between R = 3 for 2D and R = 6 (2 × 3) for 3D acquisitions. A more recent development is compressed sensing (CS) (Lustig et al 2007). CS MRI is an approach to image acceleration that is founded on the notion that all medical images can be compressed by finding an appropriate transform domain, in which the image can be sparsely represented (e.g. spatial finite differences or wavelet domain). This, together with a random sampling pattern in k-space allows images to be reconstructed from far fewer data than prescribed by the Nyquist-Shannon sampling theorem (Liu and Saloner 2014). CS has allowed up to 20-fold image acceleration, far higher than any parallel imaging technique (Lustig et al 2007). In CS reconstruction a constrained optimization problem is solved iteratively, in which sparsity is promoted in the transform domain, while maintaining data fidelity by comparing the measured k-space data to the Fourier transform of the reconstructed image. The iterative nature of the reconstruction results in reconstruction time which are considerably longer (up to several hours) than conventional linear reconstruction times.

Acquisition strategies for 4D-MRI
As described in section 2, different strategies exist to collect volumetric MRI data. The majority of publications that have appeared of the last few years have focused on developing optimal strategies to generate 4D-MRI data sets. An overview of the recent literature shows that the proposed 4D-MRI methods can be categorized into two distinct groups: multi-slice 2D acquisitions, and 3D acquisitions. Within each group variations in terms of acquisition order, k-space trajectory, or imaging contrast have been described in the literature.
About 40 papers have appeared since 2005 that describe the use of multi-slice 2D imaging to generate 4D-MRI. The use of 3D acquisitions has only been explored more recently. While motion compensation methods by retrospective gating have been investigated for just over a decade, actual 4D-MRI generation for the use of motion characterization has only been described since 2013 for 3D acquisitions. The interest in 3D acquisitions, however, is rising rapidly. In 2017 eight papers discussed 2D acquisitions, while seven papers were published on 3D imaging (see table 1).

2D multi-slice methods
The early work on 4D-MRI entirely consists of methods that use 2D multi-slice acquisitions. This is due to the fact that generation of a 4D-MRI data set is easier for 2D acquisitions, because sorting can be performed on the DICOM images (see section 5 for further details). As knowledge on MRI pulse programming or reconstruction is limited in most radiation therapy groups, a DICOM based method provides a practical alternative for the initial development of 4D-MRI.
The published 2D multi-slice approaches have mainly used T2 weighted Turbo spin echo (T 2 -TSE) or balanced steady-state free precession (bSSFP). T 2 -TSE is the main workhorse for delineation of many tumor sites and provides a T 2 weighted image (provided the relaxation time is large enough). BSSFP on the other hand is a very efficient short T R sequence that provides a T 2 /T 1 contrast (Scheffler 1999, Bieri and. This sequence is often used in cine MRI to maximize temporal resolution as it provides the highest signal per unit of time by re-using transverse magnetization from one T R to the next (Nayak et al 2005, Miller 2012). Multi-slice 2D acquisitions acquire the data on a slice-by-slice basis. As explained in section 2 the order in which the slices are acquired can be altered. The exact ordering has an effect on the efficiency of the 4D data collection as well as the possible imaging contrasts as we will see below. Basically the methods can be divided into two groups: prospective and retrospective methods of which the latter can again be subdivided into sequential and cine-mode acquisitions.

Retrospective methods
Sequential imaging is most comparable to a normal (static) image acquisition. In sequential imaging the same volume is acquired multiple times after which the data is retrospectively sorted correlated to the respiratory cycle. The slice ordering in this form of acquisition can either be ascending, descending, or interleaved. All retrospective methods, however, have the disadvantage that the sequence is agnostic of the respiratory waveform during the acquisition. The filling of the respiratory bins during the reordering process is therefore not always ensured (i.e. a certain slice may never be acquired at the right respiratory state). To mitigate the stochastic nature of the acquisition, cine-mode has also been introduced. In cine-mode, the same slice location is acquired multiple times until a complete respiratory cycle is sampled before moving on to the next slice location (Cai et al 2011. This approach reduces the risk of missing slices in the final 4D-MRI data set. The scan duration per slice location depends on the duration of the respiratory cycle. In the studies that have been published the required scan duration was either determined based on a pre-scan (Cai et al 2011) or set conservatively long , although in principle it would also be possible to feed back the respiratory signals from e.g. the respiratory bellows during the scan to ensure that the entire respiratory cycle is sampled for each slice location. Steadystate sequences are more suitable for cine-mode imaging, while sequences such as T 2 -TSE are more suitable for sequential imaging due to the requirement for a long T R .

Prospective methods
In prospective methods the slice ordering is determined on-the-fly based on the respiratory signal (Du et al 2015. Compared to retrospective reordering, prospective methods have improved acquisition efficiency, as they monitor what slices have been acquired and thus avoid re-acquiring slices. Furthermore one can ensure that all slices are collected for each respiratory state. The T R in prospectively triggered sequences, however, is no longer fixed, but determined by the patient's breathing frequency. To avoid fluctuation in magnetization as a result of changing repetition times care should be taken that complete relaxation has occurred before the same slice location (but for another respiratory state) is acquired again.
Slice orientation is another important consideration. Characteristic of multi-slice 2D acquisition is the high in-plane resolution in combination with a coarse through plane resolution due to the limited slice thickness. Delineations are mostly performed on transversal slices, since most conventional radiation treatment planning systems only accept transverse image, as a result of the fact that they use CT images, which only allow for transversal imaging. For delineation purposes, a transversal acquisition is the most natural orientation to acquire the images in. However, the main direction of (respiratory induced) motion also lies along the feet-head direction, which results in large amount of through plane motion for this orientation. Spins of moving tissue may therefore move from one slice to the next causing image artifacts. In abdominal and thoracic imaging a sagittal orientation is often optimal in order to minimize though plane motion, since left-right motion is generally the smallest and negligible (<2 mm) compared to the slice thickness .
As most 2D multi-slice methods use Cartesian readouts, standard image acceleration methods such as partial Fourier and parallel imaging are often employed to reduce scan time. Recently, the use of simultaneous multislice (SMS) has also been proposed. In SMS a multi-band RF pulse is used to excite multiple slices, which are simultaneously sampled. As a result the signal from two slices is overlapped, so parallel imaging reconstruction is used to separate the signals that originate from the different slices. To improve the parallel image reconstruction the phase of the multi-band RF pulse is used to shift the relative position of adjacent slices in image space (Breuer et al 2005, Stemkens et al 2016a. In a paper by Celicanin et al (2015), SMS was successfully used to acquire a 2D navigator signal simultaneously with the imaging slice. By changing the distance between two SMS slices the navigator slice could be kept at the same location throughout the experiment, while the imaging slice position was altered to collect the entire volume. The benefit of this approach is that the 2D navigator is collected at the exact same time as the imaging slices. It is, however, questionable whether the additional information of the 2D navigator outweighs the two-fold SMS acceleration factor that is sacrificed to acquire a full 2D navigator for each acquired imaging slice.
A more conventional method to extract respiratory surrogates from imaging data is the use of a separate 1D navigator acquisition, which allows simple extraction of the respiratory motion signal from the 1D projection via cross correlation (Nehrke and Manke 2000). Section 3.4 provides further details on the various surrogate signals that are available to correlate the imaging data to the respiratory waveform.

3D methods
The number of 4D-MRI methods that use 3D readouts is increasing. As a standard 3D readout takes multiple respiratory cycles to collect, the data has to be reordered before image reconstruction (i.e. in k-space). As the development often requires custom pulse sequences and reconstruction algorithms the development of these types of methods are mostly performed by established MR physics research groups. Most papers published so far differ in the readout that is used. The reason for this is two-fold: for one, the readout determines for a large extent the image quality, particularly in the presence of motion. Second, the additional dimension allows for more flexibility in terms of readout trajectory. To date, various 3D readout trajectories have been developed for 4D-MRI that include non-Cartesian, Cartesian, and hybrid readouts (see table 1).

Non-Cartesian readouts
The use of 3D radial, also known as 3D projection imaging or kooshball, has been explored by one group , Deng et al 2016, 2017, Park et al 2016. The readout consists of a large number of radial projections with different polar and azimuthal angles. A golden means ordering scheme is used (Chan et al 2009) to achieve a uniform distribution of sampled lines throughout the acquisition. The benefit of a full 3D readout is the benign point spread function, which allows for very high undersampling factors with benign (incoherent) image artifacts. This property makes radial a very suitable readout strategy for compressed sensing image acceleration. Further, the center of k-space is sampled by each line, which makes the readout very robust. The disadvantages of this readout are that oversampling the center of k-space requires time, which reduces the efficiency of the readout. Also, the reconstruction involves a 3D gridding step, which is computationally expensive. This particularly impacts on the reconstruction speed of iterative reconstructions (i.e. iterative SENSE or CS reconstructions).

Hybrid readouts
One way of mitigating these effects is the use of hybrid radial-Cartesian readouts (Buerger et al 2012, Stemkens et al 2015, 2017b, Rank et al 2016b, 2017, Freedman et al 2017, Mickevicius and Paulson 2017. These hybrid readouts sample k-space along a radial pattern in two dimensions, while the third is sampled on a Cartesian grid. Two examples that have been described in the more recent 4D-MRI literature are the radial stack-of-stars (SoS) (Stemkens et al 2015, 2017b, Rank et al 2016b, 2017, Freedman et al 2017, Mickevicius and Paulson 2017 and the radial phase-encoded readout (Buerger et al 2012). Figure 3(b) shows a diagram of the radial SoS. While maintaining the radial sampling pattern in the k x -k y plane, the readout uses Cartesian sampling along the k z -direction. Each shot thus consists of an angulated Cartesian readout in which all the k z -partitions are acquired. Similar to the 3D radial acquisition subsequent shots are typically rotated by the golden angle (Winkelmann et al 2007) to ensure quasi-uniform k-space coverage when the data is undersampled. The benefit of the Cartesian dimension is that the image reconstruction can be parallelized along that dimension and the reconstruction requires 2D rather than 3D gridding. Radial phase encoded MRI takes the opposite approach. Here, Cartesian sampling along readout direction (k x ) is combined with a Golden-order radial sampling scheme in the phaseencoding plane (k y -k z ) (Buerger et al 2013). The benefit of this technique is that typical corrections that are needed for radial acquisition (e.g. eddy-current-related phase errors) are not needed for this type of readout as the readout and phase encode directions are completely separated like in conventional Cartesian imaging.

Cartesian readouts
Cartesian readouts have also been investigated for the use in 4D-MRI. Examples include the rotating Cartesian k-space (ROCK) trajectory proposed by Han et al (2017) and the compressed sensing partial subsampling (ESPReSSO) scheme by Küstner et al (2017). These methods are similar to the Radial Phase Encode trajectory, but differ by the fact that the PE positions are positioned onto the rasterized (Cartesian) grid. For both methods the undersampling pattern is pseudo-randomly chosen to facilitate compressed sensing reconstruction and reduce coherent motion artifacts. Nearly all 3D based methods are self-navigated, which means that the respiratory surrogate signal is obtained directly from the image data, although other approaches such as pencil-beam navigators or respiratory bellows are also possible. Because the entire imaging volume is excited by every RF pulse in 3D acquisitions, the contrast is mostly limited to spoiled gradient echo (SPGR), ultra short echo time (UTE), or bSSFP sequences. 3D T 2 -TSE is possible, but so far has not been used as a native contrast for 3D based 4D-MRI, because of its prolonged T R to allow for T 1 relaxation and therefore significantly longer acquisition time compared to e.g. SPGR or bSSFP.

2D versus 3D methods compared
The implication of using 3D versus 2D imaging for 4D-MRI is substantial (see table 2). First, the time-interval between subsequent excitations of the same tissue is much shorter for 3D acquisitions than for 2D multi-slice imaging. As a result the longitudinal steady-state magnetization that is available for signal formation is lower. On the other hand, the fact that the signal is sampled from the entire volume, rather than a single slice, leads to an intrinsic SNR advantage of √N slices . The fact that the image is Fourier encoded along all three dimensions allows for imaging at high isotropic resolution. The longer readouts, however, make this type of sequence more prone to intra-view motion artifacts if not appropriately dealt with (this 'problem' has in fact spurred early work that focused on motion compensation by reordering of the data) (Bailes et al 1985, Buerger et al 2013. The short T R in 3D imaging makes GRE the most natural contrast for 3D acquisitions. This includes spoiled sequences, such as SPGR (or FLASH), that provide pure T 1 contrast as well as steady-state sequences, such bSSFP, that provide a T 2 /T 1 contrast. Depending on whether the 4D-MRI is used for delineation purposes or merely as a motion characterization method (since organ contrast can be limited) this may or may not be a limitation of the current 3D based 4D-MRI methods.
A benefit of 3D readouts versus 2D readouts is the increased flexibility of the additional dimension. This increased flexibility can be exploited to design trajectories that are better suited to acceleration techniques that allow for higher acceleration factors such as 2D GRAPPA and SENSE or compressed sensing. Moreover, the radial readouts used in most 3D based 4D-MRI methods sample the center of k-space with every shot, which ensures that the anatomy represents the average position of all sampled physiological states. A final, but very important implication for radiation therapy applications is the effect of gradient non-linearity. By design the field gradients of any MRI scanner are not perfectly linear over the entire field-of-view. If not taken into account these non-linearities would result in considerable geometric distortions with increasing severity away from isocenter. In principle the non-linearity is carefully mapped by the vendor. This information is subsequently used during reconstruction to correct the geometric distortions via interpolation in image space. The interpolation step, however, requires the acquisition of a contiguous volume and is easier to perform for a 3D acquisition than for multi-slice 2D acquisitions. Hence, some vendors only correct in-plane distortions for 2D acquired images, which severely limits the applicability for radiotherapy. This (uncorrected) slice distortion is referred to as potato chipping and could lead to local displacements of up to four cm when imaging at a distance of 20 cm away from isocenter (Borman et al 2018).

Surrogate signals
Three general approaches have been utilized to guide the reordering of 4D-MRI data: external surrogates, internal surrogates, and self-navigation. A subset of surrogate signals mirrors those of 4D-CT, while others are unique to MRI. Those that mirror 4D-CT include external surrogates such as pneumatic belts (bellows) ( (Remmert et al 2007). Challenges with external surrogates include signal saturation, gain resetting, synchronization of start and end times with pulse sequence acquisition, and loss of phase coherence with acquired k-space data (Tryggestad et al 2013, Stemkens et al 2015. In addition, external surrogate motion may not directly correlate with target or organs at risk (OAR) motion (Feng et al 2009, Goldstein et al 2010). Finally, logistics of setup and positioning time, battery life, and MR bore interference can also challenge use of external surrogates for 4D-MRI. The remaining approaches to guide 4D-MRI reconstruction are unique to MRI. Internal surrogate signals include pencil-beam navigators. Pencil beam navigators are fast 1D MRI acquisitions that utilize 2D radiofrequency pulses to selectively excite a column of spins (Ehman and Felmlee 1989). The navigators can be prescribed with arbitrary orientation over high contrast interfaces of moving structures closest to the target or structure of interest. The most common navigator prescription is the lung-liver interface (Stemkens et al 2015). However, navigators can also be placed at the kidney/perirenal fat interface. During acquisition, two sequences are executed: (i) the parent sequence consisting of pencil-beam navigator sampling, (ii) the child sequence consisting of the desired contrast weighting. Although the pencil-beam navigator signal may correlate with target or OAR motion, the parent/child relationship of the acquisition decreases scanning efficiency when pencil beam navigators are used (Buerger et al 2013). Furthermore, disruption of the imaging steady-state can occur, which can complicate pencil-beam navigator use with coherent and incoherent steady state free precession pulse sequences for 4D-MRI (Scheffler and Lehnhardt 2003). Lastly, for 3D acquisition the pencil-beam navigator can saturate part of the imaging volume, if placed inside the desired imaging volume.
Self-navigation (also often referred to as self-gating or self-guidance) is a process by which the acquired MRI data itself can be used to detect motion and/or correct for motion artifacts during image reconstruction. In terms of 4D-MRI, self-navigation is used to reorder the acquired data during 4D-MR reconstruction. Self-navigation has been performed in both the image and frequency domains. Image domain methods have included 2D image navigators (Von Siebenthal et al 2007, Celicanin et al 2015 and changes in body surface area (Cai et al 2011, Yang et al 2014. For 2D image navigators, 2D images are acquired over complete respiratory cycles in either a cine (Würslin et al 2013, Bernatowicz et al 2016 or sequential (Celicanin et al 2015) acquisition strategy. Image registration is then used to reorder the acquired images into respiratory bins. This method can be enhanced using SMS acquisitions (Celicanin et al 2015). Image registration of the 2D image navigator is then used to reorder the 2D images into different respiratory bins. Cross-correlation of 1D projections (Buerger et al 2013, Deng et al 2016, Han et al 2017, Mickevicius and Paulson 2017, or changes in centroid positions of 1D projections , obtained from Fourier transform along the k z direction in 3D acquisitions, is an additional image-domain self-navigation approach. Finally, self-navigation can also be based on the thermal noise variance of the receive coils during acquisition (Andreychenko et al 2017). Frequency domain self-navigation has also been performed to reshuffle 4D-MR data (Hui et al 2016. One frequency domain approach is based on variations in the amplitude of the k-space center (i.e. DC signal) of 3D acquisitions . Another method is based on the Fourier shift theorem, in which phase ramps in the frequency domain corresponds to shifts in the image domain (Hui et al 2016). Challenges with self-navigators include phase instability, eddy currents, and partial saturation bands (in the context of 2D image-based self-navigators). Advantages include increased efficiency and direct synchronization with the imaging data.

Generating a respiratory-correlated data set
Once MR data is acquired and a respiratory surrogate signal is extracted with good temporal correlation with the MR data, a 4D-MRI can be generated. Depending on the MR acquisition (i.e. the read-out), data are sorted in image-or k-space, using the respiratory information from the surrogate signal. If necessary, additional processing steps are done after sorting the data, primarily for k-space based methods.

Binning
Sorting of the data can be done in a number of ways, similar to sorting 4D-CT data. These data can be k-space data or images. In general, there are two ways of sorting the data; based on the amplitude of the respiration (amplitude binning), or based on the phase of the respiration (phase binning). In amplitude binning, data within a certain range of respiratory amplitudes is assigned to one of n equally spaced amplitude bins, based on the full respiratory signal (see figure 4(a)). For phase binning, each breathing cycle is divided into n equal time bins, in which all data is sorted (see figure 4(c)). The number of bins n can be chosen freely, but is generally between four and ten, since 4D-CT also generates ten bins. Overall, less bins requires less data, but introduces more intra-bin variability. Often the peaks and troughs of the respiratory signal are extracted, representing end-exhale andinhale point, to determine the respiratory phase. The majority of the published 4D-MRI (and 4D-CT) methods used phase-based sorting, since phase-based sorting intrinsically holds probabilistic information that is useful for radiotherapy planning, i.e. the time spent in each phase is equal for phase binning. Therefore, these data can readily be used to generate an ITV margin, which is the overlapping tumor position over all respiratory phases, or a mid-pos volume, which is the time-averaged position. Additionally, amplitude binning may require much longer acquisition times due to breathing variations, such as baseline drifts.
Exceptions are prospectively gated methods, which generally employ amplitude binning. These prospective methods acquire data (in general 2D images) at a given amplitude, based on surrogate signals, such as respiratory bellows (Rohlfing et al 2004, Hu et al 2013, Du et al 2015, or 1D MR navigators (Tokuda et al 2008. While the probabilistic information is also available for amplitude binning, since the amount of data for each bin is known after sorting, it can only be extracted by calculating the probability density function (PDF) and, if necessary, convolving the data with this PDF (Engelsman et al 2005). Alternatively, amplitude binning with variable bin width can be used, where the width of each amplitude bin varies such that the amount of data in each bin is equal (see figure 4(b)). Consequently, probabilistic information is included using such a sorting approach.
For 4D-CT it was concluded that amplitude sorting can significantly improve image quality and reduce artifacts (Fitzpatrick et al 2005, Rietzel et al 2005, Wink et al 2006, Abdelnour et al 2007, Lu et al 2006. Typical 4D-CT artifacts, such as blurring, duplicate structures, overlapping slices and incomplete structures (Yamamoto et al 2008) were significantly reduced when amplitude sorting was used. However, it was also reported that amplitude binning could result in insufficient data at specific couch positions when no data is acquired at certain amplitudes (Abdelnour et al 2007, Yamamoto et al 2008. Moreover, amplitude binning generally does not discriminate between exhale and inhale. Thus, hysteresis, the phenomena that respiration is different when inhaling compared to exhaling, cannot be captured. While this may be no problem when determining the maximum respiratory-induced motion, both the mid-position volume and ITV margin are different when including or excluding hysteresis effects. Nevertheless, hysteresis can also be characterized through amplitude binning, by dividing the respiratory cycle into ex-and inhale and sorting identical amplitudes into either ex-or inhale amplitude bins , Hu et al 2013.
The image quality and artifact level of the reconstructed 4D-MRI partly depend on the sorting method. The sorting method, number of bins, and variability of the respiration determine the variations that are present within each bin, i.e. the intra-bin variability. For both k-space and image-based methods, larger intra-bin variability results in increased artifacts and decreased image quality. Since different amplitude data are sorted into the same bin, the mean position of that bin is an average of all these amplitudes. This typically leads to a small underestimation of the motion when compared to 2D cine MR imaging.
A simulation where a cos 6 respiratory signal (which is commonly used to simulate respiratory motion (Lujan et al 1999)) is sorted into ten bins using different methods shows that the mean bin position and the intrabin amplitude variability (i.e. the variations in amplitude that are sorted into a single bin), shown by the dots and transparent area, respectively, vary between the different methods (see figure 4(d)). Consequently, the mid-position will be different, as well as the image quality after sorting. (d) Intra-bin amplitude variability after sorting a cos 6 respiratory signal with the shading representing ± one standard deviation. The black cross represents the mid-position for that sorting method.
Overall, different flavors of binning exist for 4D-MRI. While amplitude binning may give smaller intra-bin amplitude variations, phase binning inherently contains probabilistic information that may be useful in radiotherapy treatment planning.

Image-versus k-space based sorting
In general, 2D acquired data is first reconstructed into images before sorting in the image domain. 3D acquisitions, on the other hand, are sorted in k-space and subsequently reconstructed. Consequently, image-based sorting has been used more in earlier publications, while k-space sorting gained popularity in recent years. For prospective methods, image data is already sorted into the different bins immediately after acquisition, so no additional sorting has to be performed.

Image-based sorting
In image-based sorting, 2D images (or 3D volumes) are first reconstructed (on the scanner) and subsequently sorted into the respiratory bins, based on the surrogate signal using amplitude or phase binning (see figure 5(a)). The liver-lung interface in the slice direction is often used to determine the accurateness of sorting, since a smooth liver top (i.e. diaphragm shape) is expected (see figure 5(c), left panel, for coronal view of sagittally acquired images). Irregular breathing may give rise to an irregular liver contour (see figure 5(c), middle panel), so-called 'stitching artifacts'. After sorting it may happen that at certain slice locations and bins, multiple image slices are present (oversampled). Either the first or last image can be used, or the image with the highest similarity with surrounding slices (Tryggestad et al 2013). Conversely, it can also happen that no image is assigned at a certain location and bin, leading to black bands in the reconstructed volumes (see figure 5(c), right panel). This can Figure 5. Workflow comparison. Pipeline for image-(a) and k-space-based (b) sorting. For image-based sorting, images are first reconstructed before sorting into the different bins. For k-space-based sorting, k-space is sorted into the different bins and subsequently reconstructed. (c) The contour of the liver is a good landmark to assess the sorting process. In the left panel a smooth liver contour is seen as expected, but due to irregular breathing, an irregular contour can be observed in the other panels. Moreover, missing slices are observed in the right panel. (d) For k-space based sorting, irregular breathing results in more blurring, but a smooth liver contour is preserved. happen when this slice location was insufficiently sampled, or, for amplitude binning, this respiratory amplitude was not encountered during sampling. The missing image can be extracted by interpolating from different slice locations or different phases, or by taking the slice from an adjacent phase (Cai et al 2011, Cai et al 2015, Liu et al 2015a, Yang et al 2014. A longer acquisition minimizes the risk of missing slices, but increases overall scan time. If 2D data is acquired in cine-mode, at least one respiratory cycle (5-10 s) has to be acquired per slice location to sample sufficient data.

K-space sorting
In k-space sorting, parts of k-space are sorted into different phase volumes and subsequently reconstructed to image space (see figure 5(b)). This can be a single k-space line, k-space segments or an entire k-space, given that the respiratory amplitude/phase for that part of k-space is known. Most k-space-based sorting methods use 3D non-Cartesian MR sampling (Buerger et al 2012, Stemkens et al 2015, Yue et al 2015a, Deng et al 2016, 2017, Rank et al 2016a, 2017, Freedman et al 2017, Mickevicius and Paulson 2017, while some used Cartesian sampling in 2D (Liu et al 2015b) or 3D (Stemkens et al 2015. An advantage of k-space sorting, and non-Cartesian sampling in particular, is that missing data does not lead to black lines in the reconstructed volumes, but to an increase of undersampling artifacts, which manifest as aliasing in Cartesian and streaking in radial sampling. Moreover, if multiple identical k-space segments are sorted into a certain bin, complex averaging can be used to increase SNR. After sorting, data is transformed to image space using an inverse FFT. For non-Cartesian sampling this is preceded by a gridding step which convolves data onto a Cartesian grid (Fessler et al 2003). Since all data within a bin contribute equally, the reconstructed volumes represent the average position of the anatomy during data acquisition of the k-space segments that are within that bin (see figure 5(d)).

Comparison
K-space-and image-based sorting both have advantages and disadvantages. Advantages of k-space sorting are that undersampling is less problematic, the temporal resolution of a segment that is sorted can be much higher (e.g. a single k-space line), and it can be used with 3D acquisitions. Additionally, for non-Cartesian k-space sorting methods, irregular breathing leads to increased blurring in the reconstructed images, while for imagebased sorting methods this generally results in an irregular liver contour (see figures 5(c) and (d)). However, k-space-based methods are generally more difficult to implement and reconstruction may take longer, as the entire reconstruction is usually performed offline. So far no prospective methods have been proposed with k-space sorting. Image-based methods are generally based on 2D acquisitions that provide more freedom in choice of contrast (see section 3).

Compressed sensing and its effect on motion smoothness
While for image-based methods no additional steps have to be taken after sorting (except for filling in missing slices), 3D k-space-based methods often employ additional reconstruction steps. In recent years, a trend towards shorter acquisitions is seen to reduce acquisition time. Consequently, undersampling artifacts increase in the reconstructed 4D-MRI. To minimize these undersampling artifacts iterative reconstructions were proposed for non-Cartesian 3D acquisitions. Examples are non-Cartesian iterative SENSE (Pruessmann et al 2001) in Buerger et al (2012), or compressed sensing (CS) (Lustig et al 2007) with different regularization terms in Freedman et al (2017), Mickevicius and Paulson (2017), Rank et al (2017) and Stemkens et al (2017b). CS reconstructions exploit image sparsity in one or more domains (spatial or temporal). For 4D-MRI, total variation (TV) sparsity in the temporal domain were suggested (Mickevicius and Paulson 2017) as well as the motion-correction highdimensional TV in both spatial and temporal domain . Increasing the regularization using these sparsifying constraints results in smoother images with reduces artifacts, but may also underestimate the motion in the resulting 4D-MRI (Mickevicius and Paulson 2017). Essentially, the motion within the 4D-MRI is smoothed as a result of the CS reconstruction. Note that such iterative reconstructions are computationally intensive and take anywhere between a few minutes (Mickevicius and Paulson 2017) to several hours . Although CS is a valuable technique to minimize artifacts for severely accelerated 4D-MRI acquisitions, care should be taken when using CS and any motion underestimations should be accessed and quantified. Lastly, the same amount of data should be present in each bin, so these methods use amplitude binning with variable bin width, or phase-based binning.

Reconstruction challenges
Most 4D-MRI techniques described in the literature perform the reconstruction (i.e. surrogate extraction, sorting and image reconstruction if necessary) offline on a separate PC. Exceptions are prospective methods that have to keep track of the acquired data on the scanner and are therefore generally reconstructed on the scanner. In order to use 4D-MRI in a clinical setting, a robust reconstruction is required that generates the 4D-MRI automatically and sends the 4D-MRI to a picture archiving and communication system (PACS) or other DICOM server. This way the 4D-MRI is included in the clinical process and is accessible for viewing to clinicians. Ideally, an online implementation is employed where the 4D-MRI is immediately reconstructed on the scanner, similar to prospective methods. However, this is often difficult, since non-standard reconstruction steps are used. This is something that needs to be picked up by the vendors. Several vendors are already introducing CS type of reconstructions on the market, so the step towards 4D-MRI implementations could be relatively small. Alternatively, large departments may choose to set up their own reconstruction pipeline where acquired data is (automatically) reconstructed on an offline server and pushed to PACS. An advantage of such an infrastructure is that computationally extensive reconstructions, such as CS, can run overnight. Having a robust and automatic 4D-MRI reconstruction pipeline is essential for clinical introduction and larger patient studies.

Current and potential clinical application of 4D-MRI
4D-MRI can be integrated into the clinical radiotherapy workflow at different steps. During MR simulation, 4D-MRI can be used to construct high-quality, static, reference images for treatment plan generation, including mid-ventilation (van de Lindt et al 2016), mid-position (Wolthaus et al 2008b, Kontaxis et al 2017, or alternate respiratory phase images. Leveraging its soft tissue contrast benefits, target and organ at risk (OAR) delineation accuracy may improve with 4D-MRI compared to 4D-CT (Hu et al 2013, Yang et al 2015c. Target and OAR motion models for margin assessment and construction of motion structures (e.g. internal target volumes or planning risk volumes) can be determined from 4D-MRI that may not suffer from motion underestimation and stitching artifacts as in 4D-CT (Yang et al 2015c). 4D-MRI can also be used to establish 4D deformation vector fields (DVFs), which could be used for example, as vehicles to align additional MR image contrasts, obtained during MR simulation, to specific respiratory phases or motion states of reference images (Freedman et al 2017). This may be particularly important, in that it enables MR images to be acquired at a particular respiratory phase more suitable for acquisition (e.g. inhalation breath hold) and then warped to a particular respiratory phase more suitable for treatment (e.g. end-expiration). The motion models, coupled with review of the 4D-MR surrogate or self-navigator signal, can be used to determine whether a particular patient may be a good candidate for respiratory gated radiotherapy delivery and, if so, selection of gating windows.
4D-MRI can also be integrated for pre-beam, beam-on, and post-beam phases of in-room, hybrid MRguided radiotherapy devices (Fallone 2014, Keall et al 2014, Mutic and Dempsey 2014, Lagendijk et al 2016. During the pre-beam phase of treatment, 4D-MRI can be used to construct high-quality reference images capturing the anatomy and motion of the day. This information could be used in clinical decision support for evaluating whether plan adaptation is required prior to delivery. Moreover, 4D-MRI can be used to generate synthetic (4D-)CT images for planning in conventional treatment planning or for daily plan optimization. In addition, motion models derived from pre-beam 4D-MRI can be used to determine the optimal alignment of real-time cine imaging planes (i.e. to capture the largest displacements) to monitor target and/or OAR motion during the beam-on phase of treatment. Real-time cine MR images acquired during beam-on can be used to drive the prebeam motion models to construct synthetic 3D image volumes at the temporal resolution of these real-time cine images (Stemkens et al 2016b, Harris et al 2018. The synthetic 3D volumes can be used, in combination with the synthetic (4D-)CT data, in the post-beam phase to reconstruct the dose delivered during the treatment fraction, providing a mechanism to perform 'truth-in-delivery' analysis (Stemkens et al 2017b). The reconstructed dose could then be accumulated and eventually fed back into the adaptive planning process for the next treatment fraction (Kontaxis et al 2015). The real-time cine MR imaging should be as fast a possible and cover at least the main direction of motion of the target. Using orthogonal cine-MRI, 3D motion of the target can be determined. Since the goal of this real-time imaging is to accurately pinpoint the target (and/or OAR) position, the image quality should be sufficient for the registration or tracking algorithm to determine the target position accurately. For the optical flow algorithm, it has been reported in literature that it performs accurately even when SNR is low (Roujol et al 2011), voxel size is large (5 × 5 × 5 mm 3 for 3D imaging) (Glitzner et al 2015), and radial undersampling streaking artifacts are present (Stemkens et al 2013). . However, each of these approaches suffers from confounding effects that obscures direct validation. Respiratory bellows signals may demonstrate loss of phase coherence with acquired data (Tryggestad et al 2013). Digital and physical phantoms may not accurately depict the complex motions observed in vivo. In addition, target contrast in dynamic motion phantoms is often markedly different, which almost certainly impacts image registration (Yue et al 2015b). Furthermore, due to contrast differences, visualization of dynamic phantom targets may require alternative pulse sequence parameters compared to those used in vivo.

Validation of 4D-MRI
Validation with cine MRI is typically performed by comparing sequential, cine MRI and 4D-MRI acquisitions obtained in the same scanning session. However, irregular breathing, baseline shifts, and organ filling between acquisitions may challenge validation. In addition, cine MRI is often acquired with a small number of 2D imaging slices, which may not fully characterize 3D motion and may underestimate displacements in cases with substantial through-plane motion. With improvements in SMS imaging, in which multiple 2D cine imaging slices can be acquired concurrently, these challenges may be reduced but still not eliminated.
Similar to cine MRI, separate 4D-CT and 4D-MRI acquisitions can challenge motion comparisons. However, unlike cine MRI, the patient must be completely repositioned between 4D-CT and 4D-MRI acquisitions, which can exacerbate dissimilarities. In addition, the low soft-tissue contrast of 4D-CT, coupled with its reduced contrast-to-noise ratio (Wolthaus et al 2008b), can complicate delineation of solid tumors used for motion comparison (Hu et al 2013, Tai et al 2013, Yang et al 2015c. Finally, it has been well established that 4D-CT may underestimate motion displacements, which can be further affected by stitching artifacts during 4D-CT reconstruction (Watkins et al 2010). Despite the abovementioned validation challenges, 4D-MRI motion amplitude agreements within 2 mm or less have been reported (Yue et al 2015b, Deng et al 2016, Mickevicius and Paulson 2017, Han et al 2017.

Future of 4D-MRI
The term 4D-MRI is an ambiguous term. A quick literature search reveals that the fourth dimension can either refer to time or a physiological dimension (e.g. the cardiac or respiratory cycle). Major applications that are dubbed 4D-MRI are: 4D cardiovascular imaging, 4D MR Angiography, and 4D Flow imaging. In nearly all cases the fourth dimension refers to the physiological dimension (i.e. the respiratory or cardiac cycle) and not time since the speed of the MRI acquisition is generally too slow to resolve these processes in time. Similarly, in 4D-MRI for radiation oncology, the topic of this review, '4D' refers to respiratory-correlated MRI, rather than time-resolved MRI. It is important to make a clear distinction between respiratory-correlated and time-resolved MRI, as the terminology is likely to shift at a certain point when time-resolved MRI does become feasible. The last two decades have mainly focused on hardware development and image acquisition, which together have enabled a 20-fold image acceleration (Sodickson and Manning 1997, Pruessmann et al 1999, 2001, Griswold et al 2002. Currently we see strong advances in image reconstruction (Lustig et al 2007, Hammernik et al 2018. It is anticipated that this will enable another 5-to 10-fold acceleration. Eventually, true 4D MRI, in which the physiological motions are actually resolved in time manner, will become possible.

Current technological developments
Current developments focus on enhancing the impact and applicability of 4D-MRI. One technological development, currently explored by several groups, is the use of motion models to fill the gap between pretreatment 4D-MRI and fast 2D imaging during treatment delivery. In each of the currently proposed methods, the volumetric information obtained by the respiratory correlated 4D-MRI is projected onto the fast incoming 2D imaging slices in order to create synthetic time-resolved 4D imaging (Harris et al 2016, Stemkens et al 2016b, Paganelli et al 2018. The generated volumetric time-series can then be used for dose accumulation mapping (Stemkens et al 2017b).
Another recent development is the extension of 4D-MRI to a fifth dimension. Here the fifth dimension can either be another physiological dimension (e.g. the cardiac cycle), or a functional contrast dimension (e.g. the dynamic signal enhancement curve in Dynamic Contrast Enhanced Imaging). Although intended as a motion compensation methods for radiological exams (Chandarana et al 2015, these methods could serve radiation oncology as well as a dual purpose sequence, in which functional imaging is combined with motion characterization (Stemkens et al 2017a).
Finally, we are likely to see the extension to multi-echo 4D-MRI in order to generate 4D synthetic CT data. Combining 4D imaging with synthetic CT generation will enhance the accuracy of current dose accumulation methods on the MR-Linac, which currently still use a bulk density assignment.

Limitations for clinical Implementation
The major reason that 4D-MRI is still only used in research setting is the fact that none of the MRI vendors currently offer 4D-MRI as a product, although some vendors are currently testing prototype implementations. Siemens, for example, is exploring a work-in-progress (WIP) 4D radial stack-of-stars acquisition, whereas Philips is investigating the use of multi-slice 2D MRI. Before these methods can be clinically introduced, however, a few refinements are needed. For methods that employ 3D readouts it would be desirable to have the flexibility to acquire T2w contrast as well as the current T1-weighted gradient echo contrast. Recent work has shown that hybrid acquisitions that provide both T1w and T2w contrast are indeed possible (Benkert et al 2018). For the multi-slice 2D acquisitions gradient non-linearity correction in all three dimensions (including through-plane) is a must for radiotherapy purposes. Further, a prospective stopping criterion, which automatically stops the acquisition as soon as the data sufficiency conditions are met, would help in reducing the overall scan time. For that same reason the ability to generate synthetic CT data, thereby bypassing the need for a separate acquisition, would allow the implementation of 4D-MRI into an MRI-only workflow. It has to be noted, however, that for anatomical sites, such as lung (for which PET imaging is a standard pre-treatment imaging modality) the CT obtained during the PET-CT acquisition will likely remain leading for quite some time.

Adaptive imaging
Respiratory motion is not the only type of motion that plays an important role in radiotherapy. Other types of motion are bulk motion, drifting, organ filling and bowel peristalsis in the abdomen, and cardiac pulsation in thoracic imaging. With recent data showing poorer survival due to increased heart dose (McWilliam et al 2017) it is more than likely that the increased imaging speed will be put to use to critically sample the heart pulsation as well in order to accurately monitor the dose to the heart. On the other hand, the slower types of motion, like bulk motion drift and peristalsis could already be tracked by sequences, such as the golden angle radial readout, which allow sliding window image reconstructions at various temporal resolutions (Winkelmann et al 2007). It is to be expected that these types of acquisitions, originally developed for radiological purposes, will be adapted to the requirements within radiotherapy (Bruijnen et al 2018).

Requirements to go real-time
The ongoing developments in image acceleration cultivate the desire of real-time volumetric imaging. Clinical acquisition times of fast 3D imaging are in the order of 20 s (a single breath hold imaging) with a resolution of 2 × 2 × 2 mm 3 . It is possible to further increase the imaging speed by reducing the resolution. Glitzner et al (2015) have demonstrated that a resolution of 5 × 5 × 5 mm 3 is feasible for image registration and tumor motion tracking. On the other hand CS has shown to allow image reconstructions of highly undersampled data enabling image acquisition with a resolution of 1.1 × 1.1 × 2 mm 3 at a 3.2 s time interval . Although highly successful in offline, retrospective, image applications, the long image reconstruction times (often in the order of minutes to hours) prohibit its online, real-time, use. Recently, however, deep learning (DL) image reconstruction has been introduced (Hyun et al 2017, Hammernik et al 2018, Han et al 2018, Qin et al 2017. DL has the potential to bypass the reconstruction time problem by shifting the computational burden to a pre-learning phase, while being extremely fast during the actual acquisition. Although some papers have been published on this, the application of real-time imaging has not yet been explored.

Conclusions
This review focused on the nuts and bolts of implementing and using respiratory-correlated 4D-MRI for radiotherapy purposes. Because of its high versatility we will most likely see more 4D-MRI implementations in the near future. It is expected that k-space-based 3D methods will be used more often moving forward, since their main disadvantages (i.e. long computation times and difficult reconstructions) are more likely to be solved in the foreseen future than the artifacts inherent to image-based multi-slice methods, such as potato chipping and missing data. Collaboration with MRI vendors who choose to implement solid 4D-MRI methods will eventually lead to true time-resolved 4D-MRI.