New acquisition techniques and their prospects for the achievable resolution of fMRI

This work reviews recent advances in technologies for functional magnetic resonance imaging (fMRI) of the human brain and highlights the push for higher functional specificity based on increased spatial resolution and specific MR contrasts to reveal previously undetectable functional properties of small-scale cortical structures. We discuss how the combination of MR hardware, advanced acquisition techniques and various MR contrast mechanisms have enabled recent advances in functional neuroimaging. However, these advanced fMRI practices have only been applied to a handful of neuroscience questions to date, with the majority of the neuroscience community still using conventional imaging techniques. We thus discuss upcoming challenges and possibilities for fMRI technology development in human neuroscience. We hope that readers interested in functional brain imaging acquire an understanding of current and novel developments and potential future applications, even if they don't have a background in MR physics or engineering. We summarize the capabilities of standard fMRI acquisition schemes with pointers to relevant literature and comprehensive reviews and introduce more recent developments.


Background
Understanding how the brain works is intriguing scientists and laymen alike. For a long time, information could only be obtained from brain samples, animal studies, or patients with neurological abnormalities. Gaining information about the living brain underlying human behaviour was impossible due to the need for non-invasive, low-risk experimental methods. Initially, approaches that measure electrophysiological signals accompanying neural processes, such as electroencephalography (EEG) and magnetoencephalography (MEG), have provided insights into the temporal characteristics of brain processes. The spatial resolution, however, remained comparatively low (Ferree et al., 2001;Im et al., 2007), although recent technical developments allow the discrimination of cortical laminae using MEG (Bonaiuto et al., 2018;Troebinger et al., 2014). Utilizing the differences in magnetic susceptibility of oxygenated and deoxygenated blood (Pauling and Coryell, 1936), Ogawa et al. (1990) termed the blood oxygenation level dependent (BOLD) effect 30 years ago, and showed that magnetic resonance imaging (MRI) is very sensitive to deoxyhemoglobin concentration. This critical finding enabled the acquisition of spatially resolved maps of human brain activity as used in most functional magnetic resonance imaging (fMRI) experiments today. Initially, fMRI images of the human brain were acquired with a spatial resolution of around 3 mm voxel size based on a 64 × 64 image matrix and an image size (field-of-view) of approximately 200 mm. This nominal spatial resolutionlimited by the MR acquisition techniqueallowed whole brain coverage in ∼ 2− 3 seconds, which was deemed sufficient as the timing of the hemodynamic response of the main BOLD response is relatively slow (∼ 5 s time-to-peak) (Norris, 2006). However, evidence from animal studies using highly resolved, Abbreviations: BOLD, blood oxygen level dependent; EPI, echo-planar imaging; fMRI, functional magnetic resonance imaging; MRI, magnetic resonance imaging; SNR, signal-to-noise ratio. ☆ This article is part of a Special Issue with the subject "pushing fMRI resolution".
oxygenation-sensitive optical methods demonstrated that the main BOLD response was due to later, flow-related effects localised in draining veins, and a much quicker, initial response would be localised closer to the neural activity (Chen et al., 2011;Grinvald et al., 1986;Vanzetta et al., 2004;Vanzetta and Grinvald, 2008). As such, one of the most challenging aspects in fMRI is to obtain high spatial specificity, considering that with standard approaches it seems we can only observe delayed and unspecific vascular responses.

The hunt for higher specificity
The opportunity of higher spatial specificity revealed through optical imaging served as a rationale to increase spatial and temporal resolution and capture either the early response or to increase spatial resolution sufficiently to exclude downstream draining vein effects. Fig. 1 illustrates the potential of oxygenation-based functional imaging of ocular dominance columns on the left, and the then available MRI-based methodology on the right. Prerequisite for increasing temporal and spatial resolution is improved scanner hardware in combination with novel acquisition and reconstruction techniques, which demanded substantial investments and resources for research in these fields. As this became available, the potential of fMRI was demonstrated early on with the example of ocular dominance columns . An alternative to increase spatial specificity is to manipulate the functional contrast by modifying the MR signal acquisition. Designing an MR sequence based on physiological parameters other than blood oxygenation would avoid the draining vein problem in the first place, and a range of contrasts and sequence implementations have been suggested and used in functional experiments . Another option is the use of a spin-echo sequence in place of a gradient-echo sequence, as simulations have shown that the sensitivity to large venous vessels is considerably reduced for spin-echo acquisitions (Boxerman et al., 1995). In this review we summarize the efforts and advances to push the boundaries to improve specificity for functional MRI of the human brain by focusing on the spatial resolution and functional contrast. The aspect of temporal resolution will be discussed in more detail in [this issue: Jonathan Polimeni and Laura Lewis: The biological timescales of fMRI: Nonlinearities of the BOLD response support the detection of surprisingly fast neuronal dynamics; Luca Vizioli: Probing temporal information in fast-TR fMRI data during attention modulations]. Further, this review exclusively discusses the developments for highresolution human fMRI. Nevertheless, high-resolution animal fMRI, and in particular the unique combination with other modalities such as optical imaging Silva, 2017;Wang et al., 2013) or electrophysiological recordings (Logothetis, 2003;Logothetis et al., 2001) constitute a key resource for the interpretation of fMRI signals acquired in humans (Goense et al., 2016;Kim and Ogawa, 2012;Poplawsky et al., 2019). Our targeted readership are scientists interested in functional brain imaging, who don't necessarily have a background in MR physics or engineering, but who would like to be informed about the current achievements. The reader should come to have an understanding of novel developments and potential future applications and appreciate how the interplay of improved hardware, advanced acquisition techniques and various MR contrast mechanisms led to the recent advances in functional neuroimaging. We start by summarizing the current status and by reviewing the capabilities of standard fMRI acquisition schemes with pointers to relevant literature and comprehensive reviews and then introduce more recent developments. We will briefly summarize the mode of action of each technique and highlight pivotal prerequisites and assumptions.

The workhorse of fMRI: Multi-slice 2D gradient-echo EPI
The acquisition scheme overwhelmingly used for fMRI is echo planar imaging (EPI). It was envisioned by Nobel Prize Laureate Peter Mansfield (Mansfield, 1977) many years before the necessary gradient hardware was actually available. The key idea is to combine multiple, very fast readouts to sample a whole slice 1 in a single RF excitation or shot in less than 100 ms. With a multi-slice implementation one can cover the whole brain with 3 mm spatial resolution within a few seconds, where each voxel would then contain 27 μl of tissue and 300 ′ 000 to 3 ′ 000 ′ 000 neurons (Abeles, 1991;Logothetis, 2008). Already employed in the early fMRI experiments (Bandettini et al., 1992;Kwong et al., 1992), multi-slice 2D EPI with 3 mm voxel size is still  Kwong et al. (1992). A baseline image acquired during darkness (Upper Left) was subtracted from subsequent images. Eight of these subtraction images are displayed, chosen when the image intensities reached a steady-state signal level, during darkness (OFF) and during 8-Hz photic stimulation (ON).
predominantly used even nowadays 2 as it allows a temporal resolution of 2− 3 seconds with full brain coverage and good temporal specificity. Unfortunately, it turns out that there are good reasons not to push beyond these parameters, as higher spatial resolutions lead to several challenges. To increase spatial resolution in fMRI one needs to sample higher spatial frequencies and acquire thinner slices (see Table 1 and Box 1). However, this prolongs acquisition time, reduces resolution through blurring, enhances geometric distortions through the longer sampling window, and reduces signal-to-noise ratio (SNR). To overcome these challenges and achieve higher spatial specificity, three strategies (or a combination thereof) need to be implemented: improved hardware, advanced acquisition and reconstruction techniques, and different functional contrasts.

Prerequisites for increased resolution in fMRI
2.1. MR hardware developments 2.1.1. Magnetic field strength The prospects of new acquisition techniques for fMRI depend considerably on the underlying hardware. For an in-depth discussion on the technological aspects of fMRI acquisitions, see Wald (2012). One key factor is the magnetic field strength, because higher field strength allows for higher SNR (Edelstein et al., 1986;Pohmann et al., 2016). This, in turn, yields the potential for structural and functional MRI at higher resolutions. This relationship also drives the continuous development of new magnets (Ugurbil, 2012). For high-resolution fMRI, the image SNR is the upper bound on the available time-course SNR (tSNR), which, in turn, determines the sensitivity to detect signal changes of interest (Triantafyllou et al., 2005). As illustrated in Fig. 2 (left), image SNR increases with voxel size and field strength. Thus, structural images can be acquired at higher resolution without SNR loss at higher field. The same applies to fMRI acquisitions using small voxel volumes, where the time-course SNR strongly depends on the available image SNR (Fig. 2, right). After nearly 20 years of development, MRI scanners with 7 Tesla (T) magnetic field strength are now cleared for clinical use in humans. 3 Current efforts for human imaging at even higher field strength include five MRI scanners at 9.4 T 4,5,6,7,8 , one at 10.5 T (He et al., 2020), and two at 11.7 T (Nowogrodzki, 2018). To move beyond that, initiatives around the world are developing new super-conducting materials and novel engineering solutions for signal transmission and reception (Bird et al., 2015;Budinger et al., 2016). First fMRI results from 9.4 T MRI scanners show their potential (Bause et al., 2016;Huber et al., 2018b;Kemper et al., 2018), although many challenges remain (He et al., 2020;Pohmann et al., 2016;Vaughan et al., 2006;Wiggins et al., 2019). Currently, most sub-millimetre fMRI studies are either performed at 7 T, or with dedicated radiofrequency coils at 3 T Polimeni and Uludag, 2018).

Radiofrequency (RF) transmit coils
Radiofrequency (RF) transmit coils are used to transmit energy into the tissue to generate the MR signal. To confine tissue heating to safe limits, the specific absorption rate (SAR) has been introduced, which constrains the maximum energy that can be deposited per unit time and tissue volume (see Fiedler et al. (2018) for a current overview). With increasing field strength the wavelength of the electromagnetic waves that are used to excite the spins decreases (Glover et al., 1985;Schick, 2005). At ultra-high field, the resulting inhomogeneities in the transmit (B 1 + ) field lead to significant non-uniformity in the MR signal and the corresponding image intensities (Padormo et al., 2016;Poser and Setsompop, 2018). At the same time, the power required to generate radiofrequency pulse increases with the main magnetic field strength (Ibrahim et al., 2009;. Consequently, in some parts of the brain the flip angles are too low to achieve the desired excitation for gradient-echo fMRI, leading to a lack of signal or loss of contrast. Similarly, the even higher flip angles required for refocussing or inversion make spin-echo, arterial spin labelling and vascular space occupancy fMRI particularly challenging at ultra-high field. To generate more homogenous transmit fields and thereby more efficient radio-frequency power deposition, parallel transmission (pTx) (Katscher et al., 2003;Zhu, 2004), which utilizes multiple radiofrequency transmit coils, has been developed (for a comprehensive review see Padormo et al. (2016), with current updates provided in Deniz (2019)). Recently, particular emphasis was placed on the practical implementation of parallel transmission, and 'universal pulses' (Gras et al., 2017) were designed to facilitate simpler workflows. Accordingly, first studies using this technique report increased time-course SNR and contrast-to-noise ratio using parallel transmission in fMRI Le Ster et al., 2019;Wu et al., 2019) (but see also earlier examples of parallel transmission applied to simultaneous multislice pulses  or spin-echo fMRI (De Martino et al., 2012)). The greater control of the transmit field through parallel transmission also offers selective or inner volume excitation (Mooiweer et al., 2017). When a sufficiently small volume of interest is selected, higher resolution (Deniz, 2019;Mooiweer et al., 2017) or higher time-course SNR (van der Zwaag et al., 2018) can then be achieved within the same sampling window.

Radiofrequency (RF) receive coils
Radiofrequency (RF) receive coils are the sensors of the MR system that pick-up the radiofrequency signals from which the images are ultimately reconstructed. Their performance considerably impacts the possible image resolution in two ways (Wald, 2012). First, the coil electronics, such as amplifiers and cables, determine the available image SNR. Array coils, which contain many small coil loops and are mostly used today, increase image SNR (Roemer et al., 1990) and, subsequently, time-course SNR, especially for small voxel volumes (Triantafyllou et al., 2011). Smaller distances between coil and sample also increase image SNR, which is utilized in surface coils mounted close to the region of interest (Hendriks et al., 2020). Second, the coil geometry determines the overall acceleration capabilities and associated noise penalties (Pruessmann et al., 1999). Note that parallel imaging, which is the most important acceleration technique for high-resolution fMRI (see section 2.2), became only possible with the development of array coils. Despite their potential (Hendriks et al., 2019;Wiggins et al., 2009), the development of dedicated radiofrequency receive coils is challenging and efforts are limited to few labs in the world (Wald, 2012). One must be aware that radiofrequency coil arrays provide drastically varying performance depending on the brain region and acceleration 2 To obtain a rough estimate of the currently used spatial resolution in fMRI studies, we surveyed all volumes (194 -203) published in the second half of 2019 in the journal NeuroImage. We searched for functional magnetic resonance imaging studies on humans where original data were acquired (no public data sets). In total, 80 studies performed at 3 Tesla magnetic field strength fulfilled these criteria. We excluded six studies performed at 1.5 Tesla, six studies performed at 7 Tesla, two studies that did not report the field strength and one study that did not report the resolution from our analysis. In summary, the median voxel volume was 27 μl, corresponding to a voxel size of 3x3x3mm 3 , and 50% of the studies used a voxel volume between 25.1 μl (2.9 3 mm 3 ) and 39.2 μl (3.4 3 mm 3 ).

Imaging gradients
The desire to acquire EPI images, and thereby speed up the imaging process including but not limited to the brain, initiated the development of new gradient coils in the late 80 ′ s and early 90 ′ s (Cohen and Schmitt, 2012). The availability of EPI-capable MRI scanners was pivotal for the first fMRI studies (for a review see Blamire (2012)). Recent developments in gradient hardware are driven from diffusion imaging as part of the Human Connectome Project Setsompop et al., 2013) and short-T 2 imaging (Froidevaux et al., 2020;Weiger et al., 2018). They allow sampling of higher spatial frequencies in a shorter time, and thus enable increased resolution. For example, commercial gradients in the early 90 s, i.e., at the time of the first fMRI experiments (Bandettini et al., 1992), had a gradient strength 10 m T/m and a slew rate 20 m T/m/ms, necessitating the building of dedicated gradients, which could deliver 25 m T/m and 400 m T/m/s (Wong, 2012). Otherwise, the sampling would be too slow to acquire an image before the signal has decayed. Current clinical systems employ gradients with gradient strength between 45 up to 80 m T/m and slew rates around 200 m T/m/s 9,10,11 . The gradient system developed in the Human Connectome Project exceeds the gradient strength by nearly one order of magnitude (300 m T/m) . However, today's gradient performance is mostly limited through physiological constrains in the form of peripheral nerve stimulation (Mansfield and Harvey, 1993;Schenck, 2005;Wong, 2012). To overcome these limits, gradient coil designs which inherently induce less peripheral nerve stimulation are being revisited (Wong, 2012). Alternatively, sophisticated modelling that accounts for the stimulation limits of the human body will help to further improve gradient coil design (Davids et al., 2020(Davids et al., , 2019.

Data management
Good research data management practices are becoming part of the MRI community (Borghi and Van Gulick, 2018). Both, concerns about the reproducibility  and the tremendous potential of re-using data (Van Horn and Gazzaniga, 2013) have driven these developments. This notion was further advanced through multiple large data acquisition and sharing initiatives (Marcus et al., 2013;Mueller et al., 2005;Sudlow et al., 2015), whose data bases contain several thousand participants. Subsequently, guidelines for reporting , storing (Gorgolewski et al., 2016), and sharing (Wilkinson et al., 2016) of fMRI data were developed in recent years. As fMRI resolution increases, so does the size of these data sets. For instance, a 30-minute fMRI experiment with a repetition time of 2 s, a voxel volume of 27 μl (3 mm x 3 mm x 3 mm) and a field-of-view of 210 mm x 210 mm x 105 mm providing whole brain coverage contains 154 million (10 6 ) data points. The same experiment with a hypothetical voxel volume of 125 nl (0.5 mm x 0.5 mm x 0.5 mm) and a repetition time of 1 s would contain 64 billion (10 9 ) data points, which is a 410-fold increase. Thus, the development of acquisition techniques that can achieve these high resolutions needs to be accompanied by appropriate developments in hardware and compute infrastructure to facilitate image

Box 1 Resolution in fMRI.
Nominal resolution In general, image resolution is defined as the smallest resolvable distance between two objects, or how close two objects or features within an object can be, while they can still be resolved (Brown et al., 2014). What is usually reported in fMRI studies, however, is the nominal resolution or voxel size, which is the size of the field-of-view (FOV) divided by the size of the image sampling grid. The effective resolution, which is based on the definition above, is lower than the nominal resolutions due to the image acquisition process (Yacoub and Wald, 2018). Effective resolution Recall that image information in MRI is collected in Fourier space, and thus resolution is limited by the highest sampled spatial frequency (Brown et al., 2014). Hence, a point source in the object is reconstructed as the point source convolved with a sinc function. In addition, the MR signal decays exponentially during the sampling window (characterised by the transversal relaxation time T 2 *), which adds further spatial blurring to the image (Farzaneh et al., 1990). However, differences in the T 2 *-based signal decay are also the basis for the most common fMRI contrast (Ogawa et al., 1993). Further reduction in image resolution is caused by geometric distortions due to magnetic field inhomogeneities and gradient non-linearities (Jezzard and Clare, 1999). Image distortions due to magnetic field inhomogeneities are a major impediment for high-resolution fMRI, because these distortions increase with magnetic field strength and with the time it takes to collect the data in Fourier space. Thus, the effective spatial resolution of an fMRI image can be considerably lower than its nominal resolution, and various countermeasures during image acquisition and reconstruction are required to obtain the desired resolution (Marques and Norris, 2018;Polimeni et al., 2018;Poser and Setsompop, 2018). Spatial specificity Spatial specificity describes how much the generated activation maps are a genuine representation of the underlying neural activity (Logothetis, 2008). This depends on the temporal and spatial resolution of the acquisition, as well as the functional contrast Polimeni et al., 2018). This is one of the main aims in high resolution fMRI and based on functional experiments using optical methods this should in principle be feasible. product/HC781271/ingenia-30t-cx-mr-system/specifications (accessed 11.11.2020). 11 Magnetom Prisma, 2020. Siemens Healthineers. https://www.siemens-hea lthineers.com/en-au/magnetic-resonance-imaging/3t-mri-scanner/magnetomprisma/features (accessed 8.5.2020).

Image acquisition and reconstruction techniques
The installation of higher magnetic field scanners and the development of advanced radiofrequency coil concepts were critical prerequisites for increasing the spatial resolution in fMRI. Parallel to the improvement in MR hardware was an accompanying development of acquisition and reconstruction techniques that otherwise would not have been possible. Techniques to increase resolution usually aim to either acquire the image information faster (using stronger gradients, see section 2.1.4) or only parts of the information at a time, with the missing pieces being recovered during image reconstruction. Thereby, higher spatial frequencies can be sampled before the signal is decayed, enabling the reconstruction of high-resolution images. One of the earliest developments was partial Fourier imaging (Feinberg et al., 1986), which exploits known symmetries of the Fourier data space. In theory, a reduction of up to 50 % should be possible (Feinberg et al., 1986;Jesmanowicz et al., 1998), but in combination with the T 2 * signal decay this can lead to profound image blurring 12 (Huber et al., 2015). Central to the overall quality of the final image is the reconstruction algorithm that is used to fill the missing samples, with iterative methods such as POCS resulting in fewer reconstruction artefacts (Haacke et al., 1991;McGibney et al., 1993). A second widely used technique, called parallel imaging, allows for 50 % (Kirilina et al., 2016;Lutti et al., 2013;Todd et al., 2016) and more (Feinberg et al., 2010;Moeller et al., 2010;Poser et al., 2010) to be omitted during sampling. During image reconstruction, the spatial profiles of the radiofrequency receive coilsvia sensitivity maps (Pruessmann et al., 1999) or autocalibration data (Griswold et al., 2002) are then utilized to create remarkably high quality images and time series, considering that 50 % or more of the data has not been sampled. However, the reduced number of samples and the limited information that can be extracted from the profiles of the radiofrequency coils can lead to a substantial reduction in image SNR of at least 30 % and often more 13 (Pruessmann et al., 1999). Further, to achieve a high isotropic resolution the slice thickness needs to be reduced. Thus, more slices are required to achieve the same coverage, increasing the overall acquisition time.
The same principle can be used by partially sampling data points not just within on slice, but also across multiple slices either in a simultaneous multislice 2D or a 3D acquisition scheme, again with the dependency on receive coil arrays and their sensitivity informationfor a comprehensive technical review see Poser and Setsompop (2018). In the 2D scheme, multiple slices are simultaneously excited and acquiredknown as simultaneous multislice (SMS) or multiband acquisition (Larkman et al., 2001;Moeller et al., 2010). The disentangling of simultaneously acquired slices using the CAIPIRINHA reconstruction technique (Breuer et al., 2006(Breuer et al., , 2005 and its blipped-CAIPIRINHA implementation for EPI (Setsompop et al., 2012) have helped reduce EPI acquisition times by up to a factor of 16 (Chen et al., 2015). In the ideal case, i.e., no reconstruction (g-factor) noise and the same acquisition time, this comes with no penalty in image SNR. While SMS is not directly suited to increase resolution in fMRI, it allows to sample faster at the same resolution or sample at a higher resolution in the same time (in combination with other means). Other, more unconventional approaches to improve temporal resolution include MR-encephalography (MREG) (Hennig et al., 2007) and Inverse Imaging (InI) (Lin et al., 2006) making use of reference images in the reconstruction process. Prior information is also used in EPIK (Jones et al., 1993;Van Vaals et al., 1993), where parts of the k-space are reused for more time points or assumptions on the periodicity of the stimulus/BOLD response are made (UNFOLD) (Madore et al., 1999). The potential of utilizing newly developed machine learning approaches (Knoll et al., 2020) to reduce reconstruction time and improve image fidelity in these heavily undersampled cases remains to be explored.

Higher spatial resolution
In general, the advanced image acquisition and reconstruction techniques discussed above can be used to markedly increase spatial resolution when applied to EPI. For example, both Lutti et al. (2013) and Todd et al. (2016) exemplify how the voxel volume can be reduced by 88 % (from 3 3 mm 3 to 1.5 3 mm 3 ) using vendor-provided hardware and sequences at 3 T magnetic field strengthwhich was still on the wish-list in 2012 (Wald, 2012). Increasing the spatial resolution in fMRI to this level reduces signal loss, dilution with un-activated tissue, such as white matter, cerebrospinal fluid, and neighbouring gyri, and noise contamination from the cerebrospinal fluid. In combination with the improved BOLD specificity, the overall contrast-to-noise ratio thus increases (Wald, 2012). However, investigations into the benefit of sophisticated MR sequences providing a spatial resolution of around 2 mm and a repetition time (TR) ⋅g . In addition, a coildependent geometry (g) -factor, which varies with voxel location, further reduces the image SNR (Pruessmann et al., 1999).
of 1 s showed that marked improvements at the single-subject level did not translate to the group level (Kirilina et al., 2016). While we can only speculate here, first investigations show that individual variability of brain organisation (Braga and Buckner, 2017;Gordon et al., 2015) and limitations in classical volume-based analysis (Coalson et al., 2018) might be the limiting factor for group analysis using higher-resolution images. Substantially larger reductions in voxel volume (up to 98 % at 0.8 3 mm 3 ) have been achieved at 3 T for dedicated applications using specialized hardware, different acquisition techniques, and long scan durations Puckett et al., 2016;Ress et al., 2007;Voit and Frahm, 2005). However, most fMRI studies with sub-millimetre resolutions, i.e., voxel volumes below 1 μl, are performed at 7 T. This is because the SNR in MRI scales with the field strength and voxel volume (Edelstein et al., 1986;Pohmann et al., 2016), such that an SNR loss through smaller voxels can be compensated by higher field strength (Fig. 2). Accordingly, a 'mesoscopic' resolutiondefined here as a voxel volume between 100 nl and 500 nlhas been achieved a number of times using very distinct acquisition techniques and dedicated hardware (Duong et al., 2002;Feinberg et al., 2018;Fracasso et al., 2018;Heidemann et al., 2012;Koopmans et al., 2010;Olman et al., 2012). Duong et al. (2002), Heidemann et al. (2012) and Feinberg et al. (2018) used variants of a 2D EPI. Duong et al. (2002) used a single-slice 2D spin-echo EPI in combination with a surface coil to achieve a voxel volume of 500 nl. Heidemann et al. (2012) utilized additional radiofrequency pulses to supress signal from outside the region of interest (Pfeuffer et al., 2002b). Thus, a larger number of samples could be omitted, which allowed to sample higher spatial frequencies, and a voxel volume of 275 nl was achieved. In the recent study by Feinberg et al. (2018), a dedicated surface coil and simultaneous multislice EPI was used to image ocular dominance columns at 125 nl voxel volume. Koopmans et al. (2010) utilized a slow, but more accurate, (non-EPI) gradient-echo imaging technique and a custom-made surface coil array for optimal undersampling (Barth and Norris, 2007) to perform depth-dependent analyses at a voxel volume of 421 nl. While this technique allows relatively high spatial resolution, it is limited in terms of coverage, speed and potentially effected by so-called in-flow effects. Initially, these non-EPI acquisitions were limited to a single slice, but attempts were made to acquire multiple slices (Loenneker et al., 1996) or speed up the acquisition using echo-shifting techniques with a 2D (Liu et al., 1993) or 3D readout (van Gelderen et al., 1995). Fracasso et al. (2018) investigated depth-dependent positive and negative BOLD responses using a 3D EPI acquisition scheme (Poser et al., 2010) in combination with a custom-built surface radio-frequency coil at a voxel volume of 166 nl. For high-resolution fMRI, 3D EPI offers a significant increase in image SNR (∼ ̅̅̅̅̅̅̅̅̅̅̅ ̅ N Slices √ ) compared with 2D EPI, while using the same original principles (Mansfield, 1977). While in 2D EPI one slice after the other is acquired, a whole 3D volume is excited in 3D EPI. Note that the same acceleration and reconstruction techniques (Breuer et al., 2006(Breuer et al., , 2005Griswold et al., 2002;McGibney et al., 1993;Pruessmann et al., 1999) can be used to increase the resolution Zahneisen et al., 2015). Additional advantages of 3D EPI are its lower power requirements and its sharper slice profiles. Thus, 3D EPI has been shown to be advantageous over 2D EPI for applications with high spatial (Jorge et al., 2013;Kirilina et al., 2016;Poser et al., 2010;Reynaud et al., 2017;van der Zwaag et al., 2012) and high temporal resolution (Stirnberg et al., 2017). Olman et al. (2012) also utilized a 3D acquisition, but based on a gradient-and-spin-echo sequence (see also section 4.2). Using a dedicated surface coil, they could measure fMRI responses at different cortical depths with 343 nl voxel volume. If gradients and acceleration are at their maximum and longer sampling times are not possible due to the associated T 2 * signal decay, the resolution can be further increased using a multi-shot or in-plane segmentation approach (Butts et al., 1994). Therein, the data are no longer sampled in one shot, i.e. after one radiofrequency excitation, as was the initial idea behind echo planar imaging, but in a number of shots representing a hybrid approach between EPI and classical sampling schemes 14 . This technique was particularly popular when parallel imaging was not yet available (Cheng et al., 2001;Goodyear and Menon, 2001;Menon and Goodyear, 1999;Yacoub et al., 2001). Despite the associated long repetition times of several seconds and the increased susceptibility to image artefacts (Reeder et al., 1997), even today in-plane segmentation is a valuable tool to achieve some of the highest resolution of 216 nl to 125 nl in fMRI (Berman et al., 2020;Fracasso et al., 2018;Huber et al., 2020a;Stirnberg and Stöcker, 2020). The examples discussed above highlight the small voxel volumes that can be achieved with various techniques. However, the substantial reduction in SNR and contrast-to-noise ratio at these resolutions, especially when surface coils are not available, means that in practice 2D and 3D EPI sequences with more moderate voxel volumes between 1 μl (1 3 mm 3 ) and 500 nl (0.8 3 mm 3 ) are used (Finn et al., 2019;Huber et al., 2020b;Koizumi et al., 2019;Lawrence et al., 2019;Lewis et al., 2018;Moerel et al., 2019;Nasr et al., 2016;Puckett et al., 2017;Qian et al., 2020). Spiral acquisitions (Barth et al., 1999;Engel et al., 2018;Glover, 2012;Graedel et al., 2020;Kashyap et al., 2020;Kasper et al., 2020;Liberman and Poser, 2019;Ress et al., 2007) and radial readout schemes (Graedel et al., 2017;Lee et al., 2010) hold promise to improve resolution further by optimising the efficiency of the readout. But moving away from classical acquisition schemes poses significant challenges to image quality and robustness of time series especially at sub-millimetre resolution, which require advanced acquisition and reconstruction schemes Graedel et al., 2020;Kashyap et al., 2020;Kasper et al., 2018). Traditionally, non-EPI acquisition techniques, such as variants of GRE acquisitions (spoiled, non-spoiled, balanced, non-balanced), have also shown their value, and while one has to compromise on coverage and/or speed, the image SNR and quality can be high (Báez-Yánez et al., 2017;Barth et al., 2010;Malekian et al., 2018;Miller, 2012;Scheffler and Ehses, 2016). While not the focus of this review, it is important to note that not only the data acquisition, but also the data analysis greatly impacts the spatial specificity. The data acquired in fMRI studies are subject to complex analysis pipelines, in which they are interpolated , projected (Operto et al., 2008), pooled (Polimeni et al., 2018), cleaned (Murphy andBright, 2017) and smoothed (Blazejewska et al., 2019). This can drastically reduce the effective resolution of the data after the analysis. Thus, when acquiring high-resolution fMRI data, additional steps during the analysis are necessary to preserve and utilize the full resolution (see, for example, the comprehensive guideline paper by Polimeni et al. (2018)). This might also include the numerous fMRI signal models that are currently developed Heinzle et al., 2016;Markuerkiaga et al., 2016) to account for the vascular effects in depth-dependent fMRI analysis.

Background
Commonly used contrasts in fMRI measure a physiological, i.e. vascular and/or hemodynamic response to a neural event. For an in-depth discussion on the cellular and physiological mechanisms behind the translation of neural activity into a vascular or hemodynamic response see Iadecola (2017) and Logothetis (2008). The most commonly used contrast mechanism in fMRI is the BOLD effect (Ogawa et al., 1990), which is based on the differing magnetic properties of oxygenated and de-oxygenated haemoglobin in the red blood cells (Pauling and Coryell, 1936). Physiologically, changes in blood oxygenation level are driven by changes in cerebral blood volume (CBV), cerebral blood flow (CBF), and the cerebral metabolic rate of oxygen (CMR O2 ) (Buxton et al., 2004). While BOLD fMRI can provide images of brain activation with high spatial specificity, the later and stronger, but unspecific response can diminish this (see Fig. 1). Thus, MR sequences sensitive to the other physiological variables underlying the BOLD effect were developed. The aim was to improve spatial specificity, without having to increase the spatial resolution. In the following, we will discuss the acquisition techniques for each contrast mechanism focussing on their spatial and temporal resolution, time-course SNR and time-course SNR efficiency, sensitivity and contrast-to-noise ratio.

Blood-oxygenation-level-dependent (BOLD) contrasts
BOLD-based fMRI is commonly performed using a gradient-echo sequence. The T 2 *-contrast of a gradient-echo makes it intrinsically sensitive to the BOLD effect, and numerous studies on maximizing this sensitivity in different brain regions are available (Deichmann et al., 2003Fera et al., 2004;Gorno-Tempini et al., 2002;Poser et al., 2006;Poser and Norris, 2009;Posse et al., 1999;Robinson et al., 2004Robinson et al., , 2008Volz et al., 2019Volz et al., , 2009Weiskopf et al., 2007Weiskopf et al., , 2006. Gradient-echo BOLD fMRI provides highest sensitivity and time-course SNR efficiency (Koopmans and Yacoub, 2019), and is commonly used to push the spatial resolution in functional maps Kashyap et al., 2018;Polimeni et al., 2010). Fig. 3 illustrates the application of 7 T gradient-echo BOLD fMRI to investigate detailed characteristics of the visual system (Nasr et al., 2016). The effect size, i. e., the difference in MR signal during activation compared to baseline levels, is highest for gradient-echo BOLD. The effect size increases with field strength (Gati et al., 1997;Okada et al., 2005;van der Zwaag et al., 2009), and at 7 T, signal changes of 10 % and above have been reported van der Zwaag et al., 2009;Yacoub et al., 2001). For an extensive discussion of the potential of high-resolution gradient-echo BOLD fMRI for neuroimaging, see the recent articles by De , Dumoulin et al. (2018) and Schluppeck et al. (2018). As was realized early on (Menon et al., 1995;Turner, 2002), changes in blood-oxygenation level can not only be observed in capillaries, but also in draining veins further away from the site of neural activity (see Koopmans and Yacoub (2019) for a summary). For depth-dependent analyses using gradient-echo data, these veins draining blood towards the pial surface have been shown to distort the estimated profiles (Kay et al., 2019;Moerel et al., 2018;Polimeni et al., 2010). Thus, T 2 -weighted BOLD fMRI using a spin-echo sequence was developed, which was estimated to contain 2− 3 times less contributions from draining veins than T 2 *-weighted gradient-echo fMRI at 7 T (Uludag et al., 2009). Spin-echo sequences utilize an additional, strong radio-frequency pulse to reduce contributions from large veins to the fMRI signal (Ogawa et al., 1993). However, this additional pulse increases the acquisition time, limits the coverage (Duong et al., 2002;Pfeuffer et al., 2002b), and poses challenging power requirements especially at 7 T (Koopmans et al., 2012), resulting in significantly reduced time-course SNR (Boyacioglu et al., 2014;Harmer et al., 2012;Olman et al., 2010;Rua et al., 2017). Additionally, the signal change is 50% lower compared with gradient-echo acquisitions (Harmer et al., 2012). This makes spin-echo acquisitions extremely inefficient compared with gradient-echo acquisitions (Koopmans and Yacoub, 2019). Notably, to achieve the fast acquisition speeds required for fMRI, spin-echo acquisitions are combined with the same data acquisition module as gradient-echo acquisitions. This re-introduces a T 2 *-weighting, which increases with resolution, and reduces the spatial specificity of spin-echo fMRI. Nevertheless, if these issues can be mitigated by using specialized hardware, i.e., surface coils, and a very small coverage (a single slice in one case), spin-echo fMRI has been successfully implemented for highresolution acquisitions of 0.5 μl-5.8 μl voxel volume (Duong et al., 2002;Olman et al., 2018Olman et al., , 2010Yacoub et al., 2007Yacoub et al., , 2005Yacoub et al., , 2003. Otherwise, the strong dependence of the contrast-to-noise ratio on the image SNR (Olman et al., 2018), and the two times larger effect size and 3.5 times higher contrast-to-noise ratio in gradient-echo BOLD fMRI (Harmer et al., 2012) diminish the advantages of spin-echo fMRI in more conventional settings (Boyacioglu et al., 2014;Rua et al., 2017). An alternative sequence called 3D-gradient-and-spin-echo (3D-GRASE) ( Feinberg and Oshio, 1991), which provides a mix of T 2 -and T 2 *-weighting, has shown higher sensitivity and time-course SNR than classical spin-echo acquisition (Kemper et al., 2015). However, the uptake has been slow in the community, probably owing to the complex MR signal characteristics (Koopmans and Yacoub, 2019). Nevertheless, 3D-gradient-and-spin-echo acquisitions can provide higher specificity than gradient-echo acquisitions at the cost of smaller coverage and lower sensitivity Muckli et al., 2015). In particular Fig. 3. Gradient-echo BOLD based activation maps depicting the well-known pattern of ocular dominance columns in V1 (left) and colour-selective stripes in V2 and V3. Adapted from Nasr et al. (2016). for depth-dependent analysis, a 3D-gradient-and-spin-echo acquisition has shown significantly reduced bias towards the pial surface and similar sensitivity and specificity as vascular-space-occupancy imaging (Beckett et al., 2020). If small regions of interest can be selected and additional hardware is available, voxel volumes of 0.3 μl (Olman et al., 2012) and 0.5 μl  have been successfully employed for depth-dependent analysis using a 3D-gradient-and-spin-echo acquisition.

Cerebral-blood-flow (CBF) contrast
Arterial-spin-labelling (ASL) sequences are designed to measure blood water spins that diffuse into the tissue at the capillary level, reflecting cerebral blood flow (CBF) close to the locus of neural activation (Calamante et al., 1999). To this end, blood water spins in the neck region are magnetically labelled, and then travel through the arteries into the tissue. The labelled spins then cause a reduction in MR signal compared to an unlabelled control image. The labelling of blood water spins in the neck region is challenging, especially at 7 T (Alsop et al., 2015). To achieve sufficient labelling efficiency and to not exceed the maximum radiofrequency power, long acquisition times are necessary, which dramatically reduce the time-course SNR efficiency. Dedicated radiofrequency pulses and hardware are in development to optimize labelling efficiency (Jezzard et al., 2018). For image formation, arterial-spin-labelling sequences use the same methods as BOLD fMRI, i.e., gradient-echo, spin-echo or 3D-gradientand-spin-echo acquisitions (Bause et al., 2016;Feinberg and Günther, 2009;Ivanov et al., 2017;Pfeuffer et al., 2002a;Schwarzbauer, 2000;Vidorreta et al., 2013;Wong et al., 2000;Yang et al., 1998). In particular, spiral gradient-echo acquisitions were investigated for arterial spin labelling because of their increased signal levels Nielsen and Hernandez-Garcia, 2013;Perthen et al., 2008;Yang et al., 2002). Thus, functional arterial spin labelling could theoretically be performed at the same nominal resolution as BOLD fMRI. However, the relative MR signal change is only between 0.2 % and 0.6 % , and time-course SNR values below 0.5 are reported for a variety of arterial-spin labelling sequences . Consequently, high-resolution arterial-spin-labelling studies are performed at comparably large voxel volumes (1.2 μl-9 μl) (Bause et al., 2016;Duong et al., 2002;Ivanov et al., 2017;Pfeuffer et al., 2002a;Zuo et al., 2013). One recent conference contribution presented data with a voxel volume of 0.35 μl and time-course SNR values of around 0.6 , indicating the potential for depth-dependent analysis of arterial-spin-labelling data . Note that the simultaneously acquired perfusion and BOLD contrasts from arterial-spin-labelling sequences can be used to estimate changes in the cerebral metabolic rate of oxygen (Davis et al., 1998).

Cerebral-blood-volume (CBV) contrast
Vascular-space-occupancy (VASO) is a non-invasive, cerebral blood volume-based contrast (Lu et al., 2013). The sequence variant developed by Huber et al. (2014) for vascular-space-occupancy fMRI at 7 T is currently a very successful non-BOLD contrast for high-resolution acquisitions, and in particularly for depth-dependent applications (Finn et al., 2019;Huber et al., 2020bHuber et al., , 2017Persichetti et al., 2020;Yu et al., 2019). Fig. 4 shows the activation maps derived from cerebral-blood-volume and BOLD images, with the corresponding depth-dependent profiles in the right panel. The fundamental idea is to null the signal from blood water spins using a radio-frequency pulse. An increase in cerebral blood volume, i.e., more nulled blood water spins, would thus result in a decrease in MR signal. Vascular-space-occupancy fMRI also uses a gradient-echo sequence for image formation and can therefore achieve the same high nominal resolutions. But because a gradient-echo or gradient-and-spin-echo (Beckett et al., 2020;Poser and Norris, 2011) is used, the intrinsic BOLD-contrast is still present and needs to be corrected. In its current implementation (Huber et al., 2014), the correction is achieved by acquiring BOLD-weighted reference images interleaved with the cerebral blood volume-weighted images. The increased specificity of cerebral blood volume-based acquisitions (Jin and Kim, 2008) though comes at a cost. The time-course SNR in vascular-space-occupancy fMRI is between 60 % and 70 % of that of BOLD (Huber et al., 2018a(Huber et al., , 2016, and the overall time-course SNR efficiency is also lower because of the twofold acquisition times for the same number of data points. The contrast-to-noise ratio for vascular-space-occupancy fMRI is therefore around 40 % to 70 % compared to gradient-echo BOLD . Note that both arterial-spin-labelling and vascular-space-occupancy contrasts rely on the blood flow, and thus pose stringent timing requirements on the acquisition. In turn, this limits the achievable acquisition time to a few seconds, or the maximum coverage at high-resolution (Hua et al., 2011;Huber et al., 2019). The most recent implementation of a cerebral blood volume-contrast, which is presented in this special issue (Huber et al., 2020a) and termed MAGEC-VASO, can overcome these limitations. Instead of aiming to null the signal from blood water spins, it is sufficient to introduce a different T1-weighting between blood water spins and the surrounding tissue. An increase in cerebral blood volume would then continue to induce a corresponding signal change in the MR image. Because the sampling of the data is no longer time-limited, it is possible to either achieve whole-brain coverage at submillimetre resolution or voxel volumes of 125 nl with the high specificity of a cerebral blood volume-contrast (Huber et al., 2020a).

Conclusion
We have reviewed how recent advances in MR hardware and acquisition and reconstruction techniques have increased spatial specificity for fMRI. This enabled the investigation of mesoscopic properties of human brain function and started to provide detailed insights into the smallscale organisation of the human brain, such as cortical columns and layers. This would not have been possible without the additional SNR afforded by higher magnetic field strength and dense coil arrays in combination with high-performance gradients and advanced acquisition techniques such as simultaneous multislice or 3D EPI. On the example of ocular dominance columns, one can see the slow, but steady progress that has been achieved (Cheng, 2012). While the potential to image these columns was shown in 1990 (Ts'o et al., 1990), it took nearly a decade for the first BOLD fMRI study to appear . Since then, numerous feasibility studies demonstrated the potential of ultra-high field fMRI to study the visual system (Cheng et al., 2001;Goodyear and Menon, 2001;Grinvald et al., 2000;Menezes de Oliveira et al., 2019;Voit and Frahm, 2005;Yacoub et al., 2008Yacoub et al., , 2007 leading up to applications driven by even more advanced neuroscientific questions (Nasr et al., 2016;Olman et al., 2018Olman et al., , 2012. So, what are the remaining challenges for new acquisition techniques and thus, what are the prospects for the achievable resolution of fMRI? One is the selection of the most optimal combination of hardware, acquisition technique and MR contrast, as well as stimulus design and analysis strategy. The appropriate MR contrast for the research question is crucial, i.e., either the superior sensitivity of gradient-echo BOLD or the excellent specificity of spin-echo BOLD and cerebral-blood-volume contrast. The particularities of the image acquisition process in fMRI mean that spatial and temporal resolution, image and time-course SNR, sensitivity, coverage, motion robustness, and image fidelity are all interrelated. When settling for a specific spatial resolution (or any other parameter), trade-offs in all other areas become necessary. It is thus worthwhile to identify those factors most critical for answering the research question before starting the fMRI image acquisition. An excellent guide for choosing the right sequence not only for functional but also for structural MRI has been provided by Marques and Norris (2018). Also, the pragmatic recommendations given in Koopmans and Yacoub (2019) can serve as a guideline: first, pilot a neuroscience study using a gradient-echo BOLD sequence with the lowest resolution at which the effect of interest would still be expected to be seen. Then verify if the effect of interest (for example the difference between two conditions) can be identified with these parameters. Only if this is not possible or if the results are uninterpretable, additional measures to increase resolution or remove vascular effects should be investigated. Continuous quality assurance that not only assesses the quality of the raw images, but also the ability to detect the effect of interest are crucial. However, this requires that the whole analysis pipeline is pre-definedand the effect of interest or a surrogate is available at the single-subject level. Additionally, the translation and dissemination of these developments needs to be considered. This issue is obviously linked to the previous one as a wide range of expertise, such as MR physics, biomedical engineering, physiology, neuroscience, biology, and psychology is necessary to perform these highest resolution fMRI studies. Many of the examples presented here employ minutely tailored acquisitions, which might not perform as well in other brain regions, with less experienced participants, or on other hardware. Given the huge success of large data acquisition initiatives (Marcus et al., 2013;Mueller et al., 2005;Sudlow et al., 2015), one might wonder if a similar model as in astronomy can be employed for neuroscience. The data acquisition could be performed in dedicated centres providing the necessary hardware, expertise and routine, while the neuroscientist predefines the experimental design and certain requirements for the data. While this approach is already the basis of many successful collaborations between MR physicists and neuroscientists, a purely academic system will not scale to the necessary size and throughput of participants required for neuroscientific studies. Another unresolved issue remains the performance of high-resolution group studies. In most of the publications discussed throughout, results from the single-subject level were presented. This is not only because the overall number of participants is low, but also because the anatomical variability at this level of detail cannot be overcome using standard techniques. New approaches include the alignment across participants in a common model space defined through auxiliary functional information (Haxby et al., 2011;Heinzle et al., 2011) or detailed anatomical information as, for example, in depth-dependent analyses . What can we expect from future fMRI acquisitions? Based on recent efforts one can speculate that acquisitions with 125 nl voxel volume (0.5 × 0.5 × 0.5 mm 3 voxel size) covering a part of the cortex within 500 ms -1000 ms second will become feasible. This potential resolution is based on recent publications at international conferences that show improved gradient performance and receive coil arrays, specialized acquisition techniques, and even higher magnetic field strengths than available now. However, this prediction has already been made more than a decade ago (Logothetis, 2008), which highlights the long timescales these developments require. These high resolutions require very specialized hardware and expertise and will likely become available first at major collaborative research centres. Additionally, the imaging community will hopefully be able to translate these developments using the most advanced equipment to a more standard settingas has already been achieved with the simultaneous multislice (SMS) acquisition technique developed on dedicated connectome scannersand enable progress on a broader basis.

Acknowledgements
MB acknowledges funding from ARC Future Fellowship grant FT140100865, ARC Training Centre grant IC170100035, ARC Discovery Project grant DP200103386, NHMRC-NIH BRAIN Initiative Collaborative Research Grant APP1117020, NIH grant 1R01MH111419. SB acknowledges funding from the NHMRC-NIH BRAIN Initiative Collaborative Research Grant APP1117020, NIH grant 1R01MH111419. We would also like to thank Avery Berman for discussions on neurovascular coupling, Tom Shaw for critical reading of the manuscript, Benedikt Poser for valuable feedback on the Radiofrequency transmit coil section, the three reviewers for their thorough and helpful comments, and Lars Kasper, from whom the '0.5 3 mm 3 in 0.5 s' statement originates.

Appendix A. The Peer Review Overview and Supplementary data
The Peer Review Overview and Supplementary data associated with this article can be found in the online version, at doi:https://doi.org/10.10 16/j.pneurobio.2020.101936.