Improvement of sensitivity and specificity for laminar BOLD fMRI with double spin-echo EPI in humans at 7 T

Mapping mesoscopic cortical functional units such as columns or laminae is increasingly pursued by ultra-high field (UHF) functional magnetic resonance imaging (fMRI). The most popular approach for high-resolution fMRI is currently gradient-echo (GE) blood oxygenation level-dependent (BOLD) fMRI. However, its spatial accuracy is reduced due to its sensitivity to draining vessels, including pial veins, whereas spin-echo (SE) BOLD signal is expected to have higher spatial accuracy, albeit with lower sensitivity than the GE-BOLD signal. Here, we introduce a new double spin-echo (dSE) echo-planar imaging (EPI) method to improve the sensitivity of SE-BOLD contrast by averaging two spin-echoes using three radiofrequency pulses. Human fMRI experiments were performed with slices perpendicular to the central sulcus between motor and sensory cortices at 7 T during fist-clenching with touching. First, we evaluated the feasibility of single-shot dSE-EPI for BOLD fMRI with 1.5 mm isotropic resolution and found that dSE-BOLD fMRI has higher signal-to-noise ratio (SNR), temporal SNR (tSNR), and higher functional sensitivity than conventional SE-BOLD fMRI. Second, to investigate the laminar specificity of dSE-BOLD fMRI, we implemented a multi-shot approach to achieve 0.8-mm isotropic resolution with sliding-window reconstruction. Unlike GE-BOLD fMRI, the cortical profile of dSE-BOLD fMRI peaked at ∼ 1.0 mm from the surface of the primary motor and sensory cortices, demonstrating an improvement of laminar specificity in humans over GE-BOLD fMRI. The proposed multi-shot dSE-EPI method is viable for high spatial resolution UHF-fMRI studies in the pursuit of resolving mesoscopic functional units.


Introduction
High spatio-temporal resolution functional magnetic resonance imaging (fMRI) permits the investigation of detailed functional units, such as the cortical columns or laminae. Considering the sub-millimeter size of human columns and laminae, not only does signal-to-noise ratio (SNR) become critical but also the spatial specificity of the signal. The spatial specificity of blood oxygenation level-dependent (BOLD) signal is affected by physiological and physical parameters, such as vasculature of the region-of-interest (ROI), magnetic field strength, and choice Abbreviations: ANOVA, one-way analysis of variance; BOLD, blood oxygenation level-dependent; CBV, cerebral blood volume; CNR, contrast-to-noise ratio; CSF, cerebrospinal fluid; dSE, double spin-echo; EPI, echo-planar imaging; ESP, echo spacing; FID, free-induction decay; FLASH, fast low angle shot; FLEET, fast low-angle excitation echo-planar technique; fMRI, functional magnetic resonance imaging; FOV, field of view; FPM, finite perturber method; GE, gradient-echo; GRAPPA, geneRalized Autocalibrating Partial Parallel Acquisition; GRASE, gradient and spin-echo; M1, primary motor cortex; MC, Monte Carlo; NR, number of repetitions; PINS, Power Independent Number of Slices; PSF, point spread function; RF, radiofrequency; ROI, region-of-interest; S1, primary sensory cortex; SAR, specific absorption rate; SD, standard deviation; SE, spin-echo; SEM, standard errors of the mean; STE, stimulated echoe; SNR, signal-to-noise ratio; TE, echo time; TR, repetition time; tSNR, temporal SNR; UHF, ultra-high field; VASO, vascular space occupancy. a 180°radiofrequency (RF) pulse can refocus the dephasing of the protons induced by susceptibility effects, resulting in the reduction of the extravascular BOLD effect around large vessels ( Boxerman et al., 1995 ;Ogawa et al., 1993 ;Weisskoff et al., 1994 ). Additionally, it is generally expected that SE-BOLD signal has a narrower spatial point spread function (PSF) when compared to that of GE-BOLD signal ( Norris, 2012 ;Yacoub et al., 2005 ). Despite these apparent benefits, SE-BOLD signal suffers from lower sensitivity than GE-BOLD signal, especially at higher spatial resolution ( Yacoub et al., 2005 ), which fundamentally limits the application of SE-BOLD signal for fMRI ( Boyacio ǧlu et al., 2014 ;Budde et al., 2014 ;Harmer et al., 2012 ). Specifically, pushing to higher spatial resolution generally comes at the cost of a reduced signal-tonoise of the BOLD signal change, leading to a lower contrast-to-noise ratio (CNR) ( Parkes et al., 2005 ).
Ultra-high field (UHF) MRI offers promising opportunities for fMRI applications, in particular for high-resolution fMRI due to the increased SNR and BOLD CNR Gati et al., 1997 ;Norris, 2003 ;Schick, 2005 ;Yacoub et al., 2001b ). Additionally, the intravascular BOLD contribution from blood vessels is reduced at high magnetic fields due to the shortened T 2 values of blood Lee et al., 1999 ). Thus, performing fMRI experiments at UHF is desirable to achieve high spatial resolution SE-BOLD signal with high spatial specificity and to minimize the intravascular BOLD contribution with the direct benefit of improved sensitivity over conventional magnetic fields ( ≤ 3 T). Many previous studies have shown that SE-BOLD signal acquisition suppressed the fMRI signal from large vessels near the cortical surface compared to that of GE-BOLD signal ( Budde et al., 2014 ;Harel et al., 2006 ;Harmer et al., 2012 ;Yacoub et al., 2007 ;Zhao et al., 2006Zhao et al., , 2004, and column studies in humans have been successfully conducted at 7 T with SE echo-planar imaging (EPI) acquisition ( Yacoub et al., , 2007 but still suffer from low sensitivity at high spatial resolution. Aside from conventional 2D SE-EPI preparation, the gradient and spin-echo (GRASE) techniques generate T 2 weighting by combining multiple SE refocusing pulses of a Carr-Purcell-Meiboom-Gill sequence with intervening EPI echo trains ( Feinberg et al., 2008 ;Kemper et al., 2015 ;Oshio and Feinberg, 1991 ). Sub-millimeter resolution fMRI with zoomed 3D-GRASE has been used for laminar and columnar imaging ( Beckett et al., 2020 ;De Martino et al., 2015Kemper et al., 2018 ;Muckli et al., 2015 ;Olman et al., 2012 ;Zimmermann et al., 2011 ) and demonstrated improved spatial specificity in layer studies. However, the 3D-GRASE sequence can contain stimulated echoes (STEs) with T 1 contrast ( Goerke et al., 2007 ), which can make the BOLD contrast mechanisms of GRASE more complex than those of conventional SE. Other SElike contrast mechanisms, i.e., steady-state free precession (SSFP), have also shown the potential for high-resolution fMRI ( Báez-Yánez et al., 2017 ;Goa et al., 2014 ;Miller, 2012 ;Scheffler et al., 2001 ;Scheffler and Ehses, 2016 ), but showed peak signal changes at the pial surface not expected from pure T 2 -contrast ( Goa et al., 2014 ). As an alternative method to improve the functional CNR, the multi-echo fMRI strategy has been successfully applied to GE-EPI by sampling multiple echo times in a single shot ( Kundu et al., 2017 ;Poser et al., 2006 ;Poser and Norris, 2009 ;Posse et al., 1999 ). To the best of our knowledge, the multiecho fMRI strategy has not been implemented in SE-BOLD fMRI.
In this work, we introduce a new method called double spin-echo (dSE) EPI to enhance the functional CNR of SE-BOLD contrast by averaging two separate spin-echoes using three RF pulses ( Counsell, 1993 ;Heid et al., 1993 ;Nelson et al., 1998 ;Sodickson and Cory, 1998 ). We demonstrated the feasibility of dSE-EPI by comparing sensitivity indexes such as SNR, temporal SNR (tSNR), and CNR between dSE-EPI and conventional SE-EPI, obtained from resting-state fMRI. Also, Monte Carlo (MC) simulation of dSE-BOLD signal is performed to support experimental measurements. To demonstrate the laminar specificity of the proposed dSE-BOLD fMRI, we have achieved 0.8 mm isotropic resolution by implementing a multi-shot approach with the use of sliding-window reconstruction. With multi-shot dSE-EPI, we directly compared the layer profiles of multi-shot dSE-BOLD and multi-shot GE-BOLD fMRI in the primary sensory and motor cortices during fist-clenching with touching. Multi-shot dSE-BOLD fMRI indeed yielded improved laminar specificity and is therefore an excellent tool for high spatial resolution UHF-fMRI studies to resolve mesoscopic functional units.

Participants and MRI system
Six healthy subjects (3 male and 3 female) aged 28-39 years old participated in this study, approved by the Institutional Review Board of Sungkyunkwan University. Procedures were fully explained to all subjects, and informed written consent was obtained before scanning in accordance with the Declaration of Helsinki. All measurements were performed on a 7-T scanner (MAGNETOM Terra, Siemens Healthineers, Erlangen, Germany) equipped with a single channel transmitter and a 32-channel receive head coil (NOVA Medical, Wilmington, MA). The movement of the subjects was minimized by tightly padding the space inside the coil with MR-compatible cushions.

Double spin-echo BOLD: signal characterization
MC simulations were performed using an in-house MATLAB (Math-Works, Natick, Massachusetts) script, as previously described ( Han et al., 2015 ), to characterize double spin-echo (dSE) signals. In short, threedimensional binary matrices for vascular structure were generated using randomly oriented cylinders. The susceptibility difference between blood and tissue for fully deoxygenated blood was set to Δ = 0.11 ppm in cgs units ( Jain et al., 2012 ;Spees et al., 2001 ). A susceptibility difference value of (1 -Y ) × 0.11 ppm = 0.011 ppm was used, with restingstate Y = 60% and activation Y = 70%. The finite perturber method (FPM) was used to calculate the magnetic field shift from the BOLD effect ( Pathak et al., 2008 ) resulting from an applied field strength of B 0 = 7 T. In the FPM, the magnetic field shift by an arbitrary structure is calculated as the sum of the magnetic field shifts by uniformly packed small perturbers within the structure. Initially, 2 × 10 5 proton spins outside of the cylinders were positioned randomly and moved in a random trajectory with a diffusion time step of 0.2 ms and a diffusion coefficient (D) of 10 − 5 cm 2 /s. The vessel radius was varied from 1 m to 100 m at a fixed V b of 3%, with V b representing the mean blood volume ( Báez-Yánez et al., 2017 ;Weber et al., 2008 ). Echo times (TEs) for SE were 30 ms, 45 ms, and 60 ms, and the TE for GE was 22 ms. True T 2 contribution to SE-BOLD signal only occurred at the TE. MC simulations were also performed with a TE of 45 ms by varying half of the EPI readout length from 6 ms to 22 ms to investigate the T 2 * effect on the SE-BOLD signal during the EPI readout. T 2 and T 2 * of baseline were set to gray matter values at 7 T of 50 ms ( Uluda ǧ et al., 2009 ) and 33 ms ( Peters et al., 2007 ), respectively, to calculate S activation /S baseline .

dSE EPI: pulse sequence and multi-shot segmented EPI for high spatial resolution
A schematic diagram of the dSE EPI sequence is shown in Fig. 1 A. In general, three RF pulses generate 5 echoes and 3 free-induction decays (FIDs). Spoiler gradients (purple-striped gradients) between 180°p ulses dephase unwanted FID signals arising from imperfect RF refocusing pulses. The polarity of the spoiler gradient pair between the second 180°pulse was reversed to preserve the two primary spin echoes but to exclude unwanted stimulated echo pathways. Rephasing gradient (green-striped gradient) was inserted after the first SE acquisition to ensure that the 0 th moment of phase-encoding gradient was zero before the second primary SE acquisition.
To overcome the spatial encoding limitations of single-shot EPI, especially in dSE-EPI, multi-shot segmented EPI was applied to achieve high spatial resolution fMRI with a TE of the T 2 value of gray matter Fig. 1. Double spin-echo EPI pulse sequence and three-shot approach for high-resolution fMRI. (A) A schematic diagram of double spin-echo (dSE) EPI sequence. dSE-EPI sequence generates two primary spin-echoes (red and blue colored). Spoiler gradients (purple striped) and rephasing gradient (green striped) were added; see descriptions in Section 2.2 . The terms 2 1 and 2 2 represent echo times (TE 1 and TE 2 ) of the first (red) and second (blue) primary SEs, respectively. Nav is a 3-echo navigator for N/2 ghost correction ( Heid, 1997 ). (B) Illustration of multi-shot segmented EPI with 3 shots. In-plane acceleration (R in-plane ) was 9 for each shot, and the effective R in-plane was 3. Illustration of sliding-window multi-shot data reconstruction. TR shot represents the temporal resolution for each shot. TR sw represents the temporal resolution from sliding-window reconstruction, whose temporal resolution was the same as TR shot , but the reconstructed signal was smoothed by three adjacent time frames.
(~50 ms). Segmented EPI can reduce readout duration and acceleration factors. Fig. 1 B shows an illustration of multi-shot segmented EPI with 3 shots. In-plane acceleration (R in-plane ) for each shot is 9, and the effective in-plane acceleration becomes 3 by merging 3 shots. In our implementation, multiple slices were first acquired at each shot's repetition time (TR shot ); thus TR shot between shots in each slice was the same. With this approach, the volume TR is 3 × TR shot and the total number of repetitions (NR) becomes NR original /3, where NR original is the number of repetitions counting all shots. However, multi-shot data reconstruction can be conducted using a "sliding-window " with 3 adjacent time frames, as shown in Fig. 1 B. For example, for the second time frame reconstruction, TR shot (2), TR shot (3), and TR shot (4) are used instead of using TR shot (4), TR shot (5), and TR shot (6). This approach resulted in temporal smoothing and temporal resolution with sliding-window (TR SW ) equal to TR shot , and the total number of time frames became NR original -2 instead of NR original /3. The acquired k-space data were reconstructed offline using an in-house MATLAB (MathWorks, Natick, Massachusetts) script. Nyquist ghost correction and phase offset correction were performed independently for each shot, and conventional Generalized Autocalibrating Partial Parallel Acquisition (GRAPPA) reconstruction was performed after merging three shots.

Overall experimental design
To investigate the properties of dSE-BOLD fMRI, we chose the region of the human primary motor (M1) and sensory (S1) cortices proximal to the central sulcus, which have been previously investigated by fMRI ( Budde et al., 2014 ;Huber et al., 2020Huber et al., , 2017Huber et al., , 2015. Slices were aligned perpendicular to the participants' central sulcus between M1 and S1, as shown in Fig. 2 A. The anterior gray matter bank of the central sulcus is M1, and the posterior side of the central sulcus is S1, as shown in Fig. 2 C. Anatomical images were acquired by the fast low angle shot (FLASH) sequence and were co-registered with the functional data to determine the border between gray matter and cerebrospinal fluid (CSF). A 3.8-min unilateral (left hand) fist-clenching with a touching stimulation paradigm (initial 20 s resting and 8 blocks of alternating 6 s clenching and 20 s resting) was used to induce BOLD signal changes. With a temporal resolution of 2 s, a total of 114 images were obtained (i.e., 10 resting -[3 stimulation -10 resting] × 8 images).
Two separate dSE-BOLD fMRI studies were performed. First, the resting-state fMRI data were acquired at 1.5 mm isotropic resolution to evaluate the sensitivity gain in the dSE-EPI compared to conventional SE-EPI. This feasibility study utilized single-shot dSE-EPI. Second, the fMRI results from multi-shot dSE-BOLD were directly compared to those from multi-shot GE-BOLD at 0.8 mm isotropic resolution to further examine the spatial specificity of dSE-BOLD fMRI at a laminar resolution.

Experiment #1: characterization of dSE-BOLD fMRI
First, the dSE-EPI sequence was evaluated for signal intensity and tSNR on a representative subject by comparing conventional SE-EPI with the TE of SE-EPI matched to the TE 2 of dSE-EPI. The imaging parameters were as follows: 1.5 mm isotropic resolution, in-plane reduction Fig. 2. Experimental design and coregistration. (A) EPI slice orientation was determined by positioning the slices to be perpendicular to the central sulcus between M1 and S1. (B) The stimulation paradigm, initial 20 s resting and 8 blocks of alternating 6 s fist-clenching with touching and 20 s resting, had a total duration of 3.8 min. (C) Registration of the functional images to the anatomical images (0.8 × 0.8 × 1.6 mm 3 ). Images in the left column show anatomical FLASH, multi-shot GE-EPI, and multi-shot dSE-EPI, with the blue outline of the brain region delineated from the FLASH image. Images from the middle column were magnified images of motor and sensory cortices (red boxed region). CSF boundaries and gray matter boundaries of M1 and S1 were drawn with white, green, and yellow colors, respectively, from the FLASH image and overlaid on multi-shot EPI images.
Second, BOLD fMRI-generated images were obtained from singleshot dSE-EPI and conventional single-shot SE-EPI data acquisitions on six subjects. The imaging parameters were the same as the above experiment except for the TE of the conventional SE acquisition, which was 45 ms. A TE of 45 ms was determined by the average of two TEs from dSE-EPI (TE 1 and TE 2 were 30 ms and 60 ms), which is defined as the effective TE. The averaged dSE-EPI images were generated by averaging images from the two echoes at each TR. Resting-state fMRI was performed with a number of repetitions (NR) of 88 (2 s × 88 = 176 s) to determine SNR and tSNR.

Experiment #2: high-resolution multi-shot dSE-BOLD vs. multi-shot GE-BOLD
To demonstrate the improved specificity of dSE-BOLD compared to GE-BOLD acquisitions at 0.8 mm isotropic resolution, we investigated six subjects using the 3.8-min stimulation paradigm as shown in Fig. 2 B. The imaging parameters for multi-shot dSE-EPI were as follows: 0.8 × 0.8 mm 2 in-plane resolution, R in-plane = 9 (for each shot; effective R in-plane = 3 by 3 shots), FOV = 120 × 120 mm 2 , 16 slices (slice thickness = 0.8 mm), flip angle = 90°, partial Fourier = 6/8, ESP = 1.04 ms (1190 Hz/Px), TR = 2000 ms, and TE 1 and TE 2 = 30 ms and 60 ms so that the averaged TE of the dSE-EPI was 45 ms, matched with the T 2 value of gray matter (~50 ms). The total EPI readout length per shot was 10.40 ms. For multi-shot GE-EPI, imaging parameters are the same as in the multi-shot dSE-EPI except for TE = 22 ms, and the flip angle was 50°Two fMRI runs were performed for multi-shot GE fMRI, while multi-shot dSE fMRI experiments were repeated 10 times to ensure sufficient sensitivity. Anatomical images were acquired by a FLASH sequence with the same imaging parameters as the multi-shot dSE-EPI except for 620 Hz/Px, TEs = 3.3, 6.3, and 20 ms, and the flip angle was 50°to determine the border between gray matter and CSF and acquire a B 0 field map for distortion correction of the EPI images. Image reconstruction was performed by merging three-shot k -space data using the sliding-window approach, as explained in Section 2.3 .

Preprocessing and functional analysis
Functional images with GE and dSE contrasts were separately motion-corrected using SPM12 (Functional Imaging Laboratory, University College London, UK) with 6 motion parameters. All repeated runs were concatenated and motion-corrected. The motion-corrected functional images were co-registered onto the anatomical FLASH images using FLIRT from the FSL package ( http://www.fmrib.ox.ac.uk/fsl ). In the anatomical images, CSF, gray matter, and white matter were visu- 2.11 ± 0.24 2.45 ± 0.33 3.29 ± 0.44 2.89 ± 0.34 1.14 ± 0.03 < 0.001 * Statistical analyses for sensitivity-related parameters were performed using ANOVA. Averaged dSE-BOLD fMRI has a statistically higher sensitivity than the other cases.
ally identified, which enabled the generation of a layer mask registered onto the functional images. The distortion of the EPI images was corrected with the B 0 field map, performed with PRELUDE and FUGUE from the FSL package ( http://www.fmrib.ox.ac.uk/fsl ). All other data analyses were performed using custom-written MATLAB scripts (Math-Works, Natick, MA, USA). Statistical analysis was done using FSL FEAT ( Worsley, 2012 ) from the FSL package using standard parameter settings without spatial smoothing. Functional maps were generated by thresholding using a z-value of 1.5. The percentage signal change was calculated on the average 8 blocks of stimulation and rest; the rest signal was averaged over 10 s before the onset of the task, whereas the evoked signal was averaged over 6 s, excluding the initial 4 s right after the stimulus onset to minimize the influence of transient changes.

Experiment #1: characterization of dSE-BOLD fMRI
To calculate statistical indices (i.e., SNR, tSNR, and CNR), gray matter ROIs in M1 and S1 were drawn across 3 slices from conventional SE resting-state fMRI data ( Supplementary Fig. S1). The noise area was defined as regions in the four corners of the image, and its standard deviation ( ) (considered as noise) was determined by calculating the mean standard deviation across time frames (Supplementary Fig. S1). The signal component (S) was determined as the mean intensity within the selected gray matter ROIs ( Supplementary Fig. S1); then, the mean SNR was calculated using a corrected SNR equation assuming Rician noise distribution ( Gudbjartsson and Patz, 1995 ). The voxel-wise tSNR within the selected ROIs was calculated from 83 images, excluding the first 5 images, by dividing the temporal mean of the time series by the temporal standard deviation. Mean tSNR was calculated by averaging voxel-wise tSNR within the ROI. The relative CNR ( Kundu et al., 2017 ;Poser et al., 2006 ;Poser and Norris, 2009 ;Posse et al., 1999 ) was calculated from the selected ROIs. The relative CNR of conventional SE-EPI follows from Eq. (1) : where is the mean of time series standard deviation of image noise of SE, is the averaged signal, and is the echo time of SE. The relative CNR of averaged dSE-EPI follows from Eq. (2) : where 1 and 2 are the average signal at TE 1 and TE 2 , respectively; TE 1 = 30 ms and TE 2 = 60 ms.
The SNR, tSNR, and CNR ratios were calculated by dividing the average value of dSE-EPI by that of conventional SE-EPI. Statistical analysis was assessed using one-way analysis of variance (ANOVA). All values in Table 1 are listed as mean ± standard deviation (SD).

Experiment #2: high-resolution multi-shot dSE-BOLD vs. multi-shot GE-BOLD
Functional images were downsampled along the slice direction to enhance the sensitivity, resulting in 0.8 × 0.8 × 1.6 mm 3 resolution. As the slices were placed orthogonally to the laminar direction and the cortical thickness is approximately constant across slices, downsampling increases CNR without reducing laminar resolution ( Huber et al., 2015 ).
Anatomical FLASH images were used to create white matter and CSF borders for the laminar analysis, as shown in Fig. 2 C. To create masks of cortical depths, we upsampled the images 4 times with cubic interpolation (200 m in-plane resolution), and the gray matter boundaries with white matter and CSF were manually delineated on the anterior and posterior bank of the central sulcus, including the hand knob region of M1 and S1 depicted as green and yellow lines, respectively, in the middle column of Fig. 2 C. For quantitative analyses, layers were drawn in both M1 and S1 areas where activation responding to the stimulation was observed. Since the thickness of human M1 is approximately 4 mm ( Fischl and Dale, 2000 ) and the thickness of S1 is about half of the thickness of M1 ( Huber et al., 2015 ), the cortex was divided into 10 and 20 equidistant depths for S1 (2 mm) and M1 (4 mm) areas, respectively, and we used the software suite LAYNII to determine the cortical depths ( Huber et al., 2017, Huber et al., 2021. After generating layers for each slice, 5 slice-averaged cortical profiles were computed to improve the statistical power and minimize the bias of selecting single slices. To investigate the cortical depth-dependent time course of fMRI activations, three cortical depths, 0-0.75 mm (top layers), 0.75-1.50 mm (middle layers), and 1.50-2.25 mm (bottom layers), were determined in M1 and S1, respectively.

Characterization of dSE-BOLD fMRI
MC simulation was performed using parameters closely matched with Experiment #1 to obtain insight into the signal sources of dSE-BOLD measurements. First, BOLD signal amplitude as a function of vessel size obtained by averaging two echoes (with TE 1 = 30 ms and TE 2 = 60 ms) from a dSE was similar to that of a single echo (TE = 45 ms) ( Supplementary Fig. S2A). This demonstrates that the effective TE of the averaged dSE is similar to the mean of two TEs. The SE contrast peaked at a ~4-m vessel radius and decreased dramatically for larger vessels. Second, the effect of T 2 * contamination to SE-BOLD signals was simulated by varying EPI readout length. As EPI readout length increased, the overall sensitivity increased, including that of large vessels, but still peaked for small vessels (Supplementary Fig. S2B).
The quality of the second spin-echo (TE 2 = 60 ms) image from the dSE-EPI sequence was compared with that of conventional SE-EPI with a TE of 60 ms ( Supplementary Fig. S3) to ensure the proper implementation of the dSE-EPI sequence at 7 T. Signal intensity and tSNR were comparable between images using the two acquisitions, indicating that the dSE sequence acquires robust signal also at the second echo.
The fMRI sensitivity gain in the dSE-EPI sequence was investigated by comparing the average of the dSE-BOLD fMRI images (TE 1 of 30 ms and TE 2 of 60 ms) with the images obtained by the conventional SE-EPI sequence (TE of 45 ms). SNR and tSNR images are illustrated on slices from two subjects in Fig. 3 A and 3 B, respectively. The average of two echoes (TE ~45 ms) from dSE clearly had higher SNR and tSNR than conventional SE (TE = 45 ms). Table 1 shows the SNR, tSNR, and relative CNR across six subjects. All sensitivity-related parameters from the averaged dSE-EPI were higher than those from conventional SE-EPI. Fig. 3. Characterization of double spin-echo EPI. dSE-EPI vs. conventional SE-EPI fMRI. SNR (A) and tSNR (B) from two subjects are shown for the first primary SE of dSE-EPI (first row), second primary SE of dSE-EPI (second row), averaged dSE-EPI (third row), and conventional SE-EPI (fourth row). The averaged dSE-EPI showed higher SNR and tSNR than conventional SE-EPI.
The SNR and tSNR of averaged dSE were higher by 20% and 24% compared to the values using the conventional single-echo SE data with the same TE. The relative CNR improvement by the addition of two-echoes was 14%, which agrees with the previous result at 7 T ( Poser and Norris, 2009 ).

High-resolution multi-shot dSE-BOLD signal vs. multi-shot GE-BOLD signal
The accuracy of the co-registration between anatomical FLASH images and their corresponding EPI images is illustrated in Fig. 2 C. The outline of the brain area determined in the anatomical image was wellmatched with multi-shot GE-EPI and multi-shot SE-EPI images. Good anatomical agreement in the position and shape of motor and sensory areas confirmed the accuracy of distortion correction and co-registration among different images.
The spatial specificity of BOLD signals in the parenchyma was evaluated for multi-shot GE-BOLD and multi-shot dSE-BOLD fMRI from 6 subjects during the fist-clenching with touching task. First, z-score maps of multi-shot GE-BOLD and multi-shot dSE-BOLD signals are compared in representative slices (0.8 × 0.8 × 1.6 mm 3 ) from six subjects in Fig. 4 . Anatomical images obtained from FLASH sequences with a TE of 3.3 ms clearly visualized CSF, gray matter, and white matter. Layers for M1 and S1 were drawn and overlaid onto anatomical images (second row). The anterior and posterior gray matter bank of the central sulcus corresponds to M1 and S1, respectively. For the case of GE-BOLD signal (the third row), the z-score values peaked at or outside the cortical surface near high-intensity CSF areas. However, for the case of dSE-BOLD signal (the fourth row), without significant activation near the CSF areas, M1 and S1 activation areas were clearly distinguished. Similar differences between GE-BOLD and dSE-BOLD signals were consistently observed across all participants. Composite maps were generated by z-score maps from GE-BOLD and dSE-BOLD fMRIs (fifth row) to visualize the sim-ilarity and difference between the two maps. In the composite maps, activated voxels only in GE-BOLD (red voxels) were mostly confined to the CSF region, marked by two white lines. However, common active voxels in both GE-BOLD and dSE-BOLD signal (green voxels) overlapped with the gray matter region, which demonstrated that dSE-BOLD signal showed better specificity in gray matter and suppressed responses in the CSF region.
Cortical layers were drawn in five slices with 1.6 mm thickness in each subject (see Supplementary Fig. S4 for a representative example). The cortical depth profiles of GE-BOLD and dSE-BOLD fMRI were obtained by averaging signals from five slices in each subject and are plotted in Fig. 5 A and Fig. 5 B, respectively. Group-averaged cortical profiles (solid lines) and cortical depth profiles of all six participants (dashed lines) were plotted to examine consistency. The functional response of GE-BOLD fMRI ( Fig. 5 A) was highest at the cortical surface, where the concentration of draining vessels is high and decreased monotonically with increasing cortical depth. On the other hand, for the case of dSE-BOLD fMRI ( Fig. 5 B), the percent signal change was reduced at the cortical surface and peaked in the gray matter for both M1 and S1 (green arrows). Specifically, dSE-BOLD response peaked at around 1.0 mm depth from the surface of the cortex in both M1 and S1. The SE-BOLD peak in M1 is located at the upper cortical lamina (~0.8 mm), which well agrees with the corresponding cerebral blood volume (CBV)-weighted vascular space occupancy (VASO) response ( Huber et al., 2015 ). The SE-BOLD signal peak in S1 is located at the middle cortical layer (~1.0 mm), which also agrees with VASO fMRI ( Yu et al., 2019 ). Note that the lateral side of the hand knob in the M1 area (BA4a) has previously shown double-peak features ( Huber et al., 2017 ). However, we did not observe double-peak features in M1, possibly due to averaging of multiple slices.
Three cortical depths were chosen for top layers (0-0.75 mm), middle layers (0.75-1.50 mm), and bottom layers (1.50-2.25 mm) to further characterize cortical depth-dependent dynamic BOLD responses. The percent change in fMRI time courses with respect to the three Fig. 4. High-resolution GE-BOLD and dSE-BOLD fMRI maps of six individual subjects. For illustration, one slice was selected for each subject. Based on the anatomical images from FLASH (first row), cortical depth ROIs (colored contours), as well as the boundary of CSF area (two white lines in the second row), were drawn. Z-score thresholds of 2.5 and 1.5 were used for GE-(third row) and dSE-fMRI maps (fourth row), respectively. To visualize the spatial overlap between dSE-and GE-fMRI maps, a composite map was generated from the z-score maps of GE-BOLD and dSE-BOLD fMRIs (fifth row). Blue, red, and green voxels represent activation represented by SE only, GE only, and the overlap of activation quantified by SE and GE, respectively.

Fig. 5.
Cortical profiles of GE-and dSE-BOLD fMRI in the primary motor and sensory cortices. Cortical profiles of GE-(A) and dSE-BOLD fMRI (B) in M1 and S1 were obtained from individual subjects. For each subject's data, the cortical depth profiles from five adjacent 1.6-mmthick slices were averaged (see Supplementary  Fig. S4). Averaged cortical profiles were plotted with solid lines and error bars representing standard errors of the mean (SEM). The GE-BOLD signal peaks in the vicinity of the cortical surface and decreases with cortical depth, whereas the dSE-BOLD response has its peak at ~1.0 mm depth within both M1 and S1 gray matter.
cortical depths (top, middle, and bottom) of M1 and S1 are shown in Fig. 6 . For the case of GE-BOLD signal in both M1 and S1 areas ( Fig. 6 A-1 and 6B-1), the top cortical layers had the highest responses, but in dSE-BOLD fMRI, the middle cortical layers had similar or higher responses compared to that of the top layers ( Fig. 6 A-2 and  6B-2).

Discussion
We demonstrated the successful implementation of dSE-EPI sequences at 7 T with 0.8 mm fMRI isotropic resolution using multi-shot approach with sliding-window reconstruction. This sequence increased the sensitivity for SE-BOLD signals when two primary SEs were aver- Fig. 6. Cortical depth-dependent SE-BOLD vs. GE-BOLD fMRI time courses. Time courses of GE-(left) and SE-BOLD fMRI (right) are represented from three cortical depths in M1 (A) and S1 (B) , respectively. Three cortical depths were chosen for 0-0.75 mm (top cortical layer; red), 0.75-1.50 mm (middle cortical layer; green), and 1.50-2.25 mm (bottom cortical layer; blue). Error bars represent standard errors of the mean (SEM).
aged. The peak activity of dSE-BOLD fMRI was located around 1.0 mm depth from the surface of both the S1 and M1 cortices, demonstrating the high spatial specificity of dSE-BOLD fMRI in resolving mesoscopic functional units in humans.

Comparison with previous SE-BOLD fMRI studies
The SE-BOLD signal has previously shown better specificity than GE-BOLD signal in animal studies ( Harel et al., 2006 ;Lee et al., 1999 ;Uluda ğ and Blinder, 2018 ;Zhao et al., 2006Zhao et al., , 2004 and in simulation data ( Uluda ǧ et al., 2009 ). In animal studies, SE-BOLD fMRI peaks within the gray matter, whereas GE-BOLD fMRI peaks at the surface of the cortex. In most human studies, biophysical mechanisms of SE-BOLD fMRI have been examined; however, high-resolution fMRI studies with sub-millimeter resolution are limited due to insufficient sensitivity at high spatial resolution ( Budde et al., 2014 ;Harmer et al., 2012 ;Yacoub et al., 2007Yacoub et al., , 2005. Recently, GE-BOLD and SE-BOLD fMRI data with 1 mm isotropic resolution were acquired in humans at 9.4 T ( Budde et al., 2014 ). Although large venous effects in SE-BOLD signal are significantly reduced compared to the GE-BOLD signal, the improvement of laminar BOLD specificity had not been demonstrated in this study. In contrast, our SE-BOLD fMRI data showed a clear reduction of pial vessel contributions between M1 and S1, resulting in two distinct cortical peaks, each in M1 and S1. The major difference between 9.4 T ( Budde et al., 2014 ) and our 7 T studies is likely to be different readout time and magnetic field strength: 33.8 ms at 9.4 T and 10.4 ms at 7 T. The T 2 * contribution to SE-BOLD measurements is larger at higher magnetic fields and at longer readout times, which induce T 2 * blurring ( Farzaneh et al., 1990 ). The shorter readout time and lower field strength adopted in our studies, relative to the 9.4-T study ( Budde et al., 2014 ), reduce the T 2 * SE-BOLD fMRI contamination from large vessels, consequently enabling the improved separation of M1 and S1 responses.
In our studies with a TR of 2 s for both sequences, averaging dSE-EPI data increased tSNR by about 20% compared to conventional singleecho BOLD with the same echo time. However, this may not hold when the shortest TR is independently utilized for SE-and dSE-BOLD fMRI. Specifically, using the same MR parameters as in Experiment #1 (except TR), the minimum TRs for SE-EPI and dSE-EPI were 1340 ms and 1860 ms considering specific absorption rate (SAR) limitation, respectively. Our preliminary data with shortened TRs suggest that tSNR gain by using dSE-EPI still holds (data not shown).

Approaches for high-resolution SE BOLD contrast
In this study, multi-shot segmented EPI was used to achieve high spatial resolution (0.8 mm isotropic resolution). In general, high spatial resolution is achieved by increasing the number of encodings in k-space. However, this results in increased geometric distortion ( Jezzard and Balaban, 1995 ) and T 2 * blurring ( Farzaneh et al., 1990 ) due to the increase of the readout duration and the minimum achievable echo time. One strategy for reducing the readout duration is zoomed imaging. Zoomed imaging is achieved by restricting the phase-encoded FOV, which can reduce the required number of phase-encoding lines ( Feinberg et al., 1985 ;Heidemann et al., 2012 ;Pfeuffer et al., 2002 ). Our adopted strategy for reducing the readout duration was multi-shot 2D-EPI ( Cheng et al., 2001 ;Goense et al., 2012 ;Goodyear and Menon, 2001 ;Kim et al., 1996 ;Menon and Goodyear, 1999 ;Yacoub et al., 2001a ). Multi-shot acquisition can reduce the required number of encoding lines for each shot, resulting in reduced distortion and spatial blurring ( Butts et al., 1994 ). On the other hand, there are several disadvantages of multi-shot segmented EPI. Multi-shot EPI increases the volume TR depending on the number of shots, which reduces statistical efficiency , and shot-to-shot variations result in intermittent ghosting artifacts that corrupt the fMRI signal as a function of time ( Reeder et al., 1997 ). This can be reduced by using shot-wise navigator echo correction ( Kim et al., 1996 ). In this study, sliding-window reconstruction was used to maintain similarity between effective TR and shot TR, albeit at the cost of temporal smoothing. Additionally, the implementation of fast low-angle excitation echo-planar technique (FLEET) segment ordering (multi-shot loop at each slice) could have helped by reducing shot-to-shot variations due to a short time difference between shots in each slice and alleviate SNR penalty by implementing variable flip angles ( Berman et al., 2021 ).
In SE-EPI, the signal depends on both T 2 and T 2 * according to where t represents the sampling time ranging from (TE -readout time/2) to (TE + readout time/2). Due to the long readout times in EPI, especially at high spatial resolution, there is considerable T 2 * dependence. A readout length of about 1.25 × T 2 * ( Goense and Logothetis, 2006 ) is often used for SE-EPI to avoid the degradation of image resolution due to T 2 * broadening . For multi-shot dSE-EPI (Experiment #2), the EPI readout per shot was 10.40 ms, which corresponds to previously determined optimal values for a readout window of about 0.5 × T 2 * ( Goense and Logothetis, 2006 ). MC simulation also demonstrates that the EPI readout length of ~12 ms reduces T 2 * contamination significantly, as shown in Supplementary Fig. S2B (blue line corresponds to the condition of Experiment #2). For this reason, we observed a clear distinction between M1 and S1 activation areas with successful suppression of large vessel effect near CSF areas by optimally reducing T 2 * contamination, as shown in Figs. 5 and 6 . Further studies may be necessary to examine the dependency of SE-BOLD fMRI on EPI readout length.
In our 7 T dSE-BOLD studies with a TE of 45 ms, the cortical profile showed a peak at the middle of the cortex. Regarding SE-BOLD data with a TE of 30 ms at 7 T, the intravascular contribution was expected to be 20% of the total BOLD responses . When the intravascular contribution is dominant (such as at the experimental conditions of 3 T or short TE), the SE-BOLD peak occurs at the cortical surface, which is shown in 9.4 T SE-BOLD data with a TE of 16 ms ( Jin et al., 2006 ). Our cortical profile of segmented dSE-BOLD data indicated the intravascular contribution and T 2 * contamination to be minimal.

Comparison of fMRI methodologies in varying cortical depths
In animal studies, it has been shown that CBV results in BOLD peaks in deeper cortical layers closer to the neural activation site ( Harel et al., 2006 ;Lu et al., 2004 ;Poplawsky et al., 2015 ;Zhao et al., 2006 ). For human applications, noninvasive CBV-weighted VASO ( Huber et al., 2014 ;Lu et al., 2013 ) has been successfully used for laminar fMRI studies ( Huber et al., , 2018( Huber et al., , 2017( Huber et al., , 2015. For example, the task fMRI (finger tapping) comparison between GE-BOLD signal and VASO response was performed with an in-plane resolution of 0.78 mm and a slice thickness of ~1.5 mm ( Huber et al., 2015 ), which was similar to the parameters of Experiment #2. Since dSE-BOLD fMRI is readily available and can be used as an alternative approach to laminar fMRI, it would be worthwhile to compare it with VASO fMRI, which we plan to do in the near future.
Although GE-BOLD fMRI was averaged 5 times less than the dSE-BOLD runs (Experiment #2), its functional sensitivity was higher than that of the averaged dSE-BOLD and was even higher when an optimal TR was used for GE-BOLD fMRI. However, this high GE-BOLD fMRI sensitivity is partly due to the contributions of non-specific draining vessels, including pial veins (see Fig. 6 ). It is crucial to remove non-specific contributions to improve spatial fidelity of the GE-BOLD response. Two different approaches were adopted, differential analysis and model-basis deconvolution. The differential analysis of cortical profiles induced by multiple paradigms can remove the common responses and enhance layer-specific contrasts ( Olman et al., 2012 ). However, susceptibility effects induced by pial veins on laminar profiles may remain. These effects are dependent on vessel size and orientation relative to B 0 and extend to the cortex, especially in the upper cortical layers. Thus, only if the sus-ceptibility changes of pial vessels are identical for two different stimuli, then the differential analysis will remove the common non-specific contributions.
Recently, model-based GE-BOLD signal deconvolution method has been shown to improve the specificity of the cortical layer profile. With assumptions of the biophysical properties of the layer-dependent vascular architecture, the unwanted venous drainage effect in GE-BOLD can be accurately estimated and mitigated from ROI-specific GE-BOLD layer profiles ( Markuerkiaga et al., 2016 ;Marquardt et al., 2020 ). The modelbased deconvoluted cortical profile is sensitive to the weighting factor of draining from deeper to upper layers, which can be subject specific. However, one advantage of dSE-BOLD signal acquisition compared to GE BOLD signal after spatial deconvolution is the reduced sensitivity to pial vessels, as the deconvolution currently only corrects for intracortical draining vein effects. Thus, dSE-BOLD signal approach can help in the future to determine the validity and limitations of spatial deconvolution approaches for GE-BOLD signal approaches.
SE-BOLD fMRI contrasts without large vessel contributions may be viable for layer studies since they are readily available and insensitive to parameter optimizations. However, although dSE-EPI improves CNR compared to conventional SE-EPI, its functional sensitivity is still low in ultrahigh-resolution fMRI studies relative to GE-EPI, requiring extensive signal averaging. The low functional sensitivity is partly due to the use of a long TR value to additionally accommodate 180°pulses with SAR consideration. Further sensitivity gain may be achieved by further optimizing the pulse sequence and imaging parameters (e.g., shortening TR).
Since the BOLD signal is sensitive to baseline CBV, the maximum SE-BOLD response at the middle of the cortex can be partly due to the higher baseline microvascular volume compared to that of the surrounding layers. The baseline CBV bias may be removed by the normalization of global stimuli (e.g., using breath-hold, hypercapnia).

Limitations of this work and future work
In general, obtaining high spatiotemporal SE-fMRI resolution is difficult, mainly due to its low sensitivity, long TE and TR, and high SAR from high flip angle pulses. Moreover, for a given run duration, the lengthened effective TR and increased SAR in dSE-EPI due to the additional 180°pulse result in narrower slice coverage than conventional SE. As an option to increase the slice coverage and reduce SAR limitation, Power Independent Number of Slices (PINS) pulses ( Boyacio ğlu et al., 2014 ;Eichner et al., 2014 ;Koopmans et al., 2012 ;Norris et al., 2011 ) can be implemented in the dSE-EPI sequence. Also, complex-encoded generalized slice dithered enhanced resolution (cgSlider) ( Han et al., 2020 ) may further increase the slice coverage.
There are several methods to utilize multi echoes: simple echo summation (used in this study), a weighted summation from fitted T 2 * , and CNR-weighted summation ( Kundu et al., 2017 ;Poser et al., 2006 ;Poser and Norris, 2009 ;Posse et al., 1999 ). In CNR-weighted summation, CNR weights are calculated by the following Eq. (3) , where is n th echo's TE, and is n th echo's averaged SNR, calculated in Experiment #1. The mean w 1 and w 2 across six subjects were 0.43 and 0.57, whose difference was not significant. For this reason, we implemented the simple echo summation approach.

Conclusions
SE-BOLD fMRI sensitivity was increased with the newly adopted dSE-EPI sequence at 7 T. Using multi-shot dSE-BOLD fMRI with 0.8-mm isotropic resolution, we demonstrated high spatial specificity by observing the peak response at ~1.0-mm depth from the CSF border both in the primary motor cortex and the primary sensory cortex in humans.
Thus, the proposed dSE-BOLD fMRI technique may play an important role in investigating mesoscopic cortical circuits with high specificity at UHF.

Data availability
The unprocessed laminar fMRI data are available from the corresponding author upon request.

Data and code availability statement
All fMRI data analyzed in our study will be accessed upon the request with a reasonable motivation to the primary authors. Relevant analytical software is released to the code repository of our center (ibscnir.github.io/shhan).