Superconducting receiver arrays for magnetic resonance imaging

Superconducting QUantum-Interference Devices (SQUIDs) make magnetic resonance imaging (MRI) possible in ultra-low microtesla-range magnetic fields. In this work, we investigate the design parameters affecting the signal and noise performance of SQUID-based sensors and multichannel magnetometers for MRI of the brain. Besides sensor intrinsics, various noise sources along with the size, geometry and number of superconducting detector coils are important factors affecting the image quality. We derive figures of merit based on optimal combination of multichannel data, analyze different sensor array designs, and provide tools for understanding the signal detection and the different noise mechanisms. The work forms a guide to making design decisions for both imagingand sensor-oriented readers.

Magnetic resonance imaging (MRI) is a widely used imaging method in clinical applications and research. It is based on measuring the magnetic signal resulting from * koos.zevenhoven@aalto.fi nuclear magnetic resonance (NMR) of 1 1 H nuclei (protons). In NMR, the magnetization rotates around an applied magnetic field B at the proton Larmor frequency f L , which is proportional to B [1]. This behavior of the magnetization is often referred to as precession due to the direct connection to the quantum mechanical precession of nuclear spin angular momentum.
Conventionally, the magnetic precession signal has been detected using induction coils. The voltage induced in a coil by an oscillating magnetic field is proportional to the frequency of the oscillation, leading to vanishing signal amplitudes as f L approaches zero. Today, clinical MRI scanners indeed use a high main static field B 0 ; typically B 0 = 3 T, corresponding to a frequency f 0 = 128 MHz. However, when the signal is detected using magnetic field (or flux) sensors with a frequencyindependent response, this need for high frequencies disappears. Combined with the so-called prepolarization technique for signal enhancement, highly sensitive magnetic field detectors, typically those based on superconducting quantum-interference devices (SQUIDs), provide an NMR signal-to-noise ratio (SNR) that is independent of B 0 [2]. In recent years, there has been growing interest in ultra-low-field (ULF) MRI, usually measured in a field on the order of Earth's magnetic field (B 0 ∼ 10-100 µT).
A number of ULF-MRI-specific imaging techniques have emerged, including rotary-scanning acquisition (RSA) [4], temperature mapping [5], signal-enhancing dynamic nuclear polarization [6,7], imaging of electric current density (CDI) [8][9][10], and making use of significant differences in NMR relaxation mechanisms at ULF compared to tesla-range fields [11][12][13]. Several groups have also investigated possibilities to directly detect changes in the NMR signal due to neural currents in the brain [14][15][16][17] and electrical activation of the heart [18]. A further notable field of research now focuses on combining ULF MRI with magnetoencephalography (MEG). In MEG, an array of typically ∼ 100 sensors [19][20][21] is arranged in a helmet-shaped configuration around the head (see Fig. 1) to measure the weak magnetic fields produced by electrical activity in the brain [22,23]. SQUID sensors tailored for ULF MRI can typically also arXiv:1805.08867v2 [physics.ins-det] 15 Jan 2020 Figure 1. Helmet-type sensor array geometries consisting of (a) triple-sensor modules at 102 positions similar to standard Elekta/Neuromag MEG configurations and (b) an array with larger overlapping pickup coils for increased perfomance. Magnetometers are marked in green and gradiometers in red or blue; see Sec. II B for descriptions of pickup coils. (The sample head shape is from MNE-Python [3].) be used for MEG, and performing MEG and MRI with the same device can significantly improve the precision of localizing brain activity [24][25][26][27][28].
In typical early ULF-MRI setups [2], the signal was detected by a single dc SQUID coupled to a superconducting pickup coil wound in a gradiometric configuration that rejects noise from distant sources. In this case, the maximum size of the imaging field of view (FOV) is roughly given by the diameter of the pickup coil. With large diameters such as 60 mm, field sensitivities better than 1 fT/ √ Hz have been achieved with a reasonable FOV. A large coil size, however, does have its drawbacks, including issues such as high inductance and increased requirements in dynamic range. Therefore, the most straightforward way to increase the available FOV and the SNR is to use an array of sensors. In addition, as is well known in the context of MEG [20,29,30], a multi-channel measurement allows forming so-called software gradiometers and more advanced signal processing techniques to reduce noise that can be optimized separately for different noise environments. In ULF-MRI, this can even be done individually for each voxel (volume element) position within the imaging target, as will be shown later. While single-channel systems are still common, several groups have already been using arrays of sensors.
Also in conventional MRI, so-called parallel MRI is performed using an array of tens of induction coils, allowing full reconstruction of images from a reduced number of data acquisitions [31,32]. There are studies on designing arrays of induction coils for parallel MRI [33] with an emphasis on minimizing artefacts caused by the reduced number of acquisitions. At the kHz frequencies of ULF MRI, the dominant noise mechanisms are significantly different, and one needs to consider, for instance, electromagnetic interference from power lines and electrical equipment, thermal noise from the radiation shield of the cryostat required for operating the superconducting sensors, as well as noise and transients from other parts of the ULF MRI system structure and electronics [34]. Studies on the design of arrays for MEG [20,35,36], which mainly focus on the accuracy of localizing brain activity, are also not applicable to ULF MRI. In terms of single-sensor ULF-MRI signals, there are existing studies of the depth sensitivity [37] and SNR as a function of frequency with different detector types [38].
Previously, in Ref. [39], we presented approaches for quantitative comparison of sensor arrays in terms of the combined performance of the sensors, the results indicating that the optimum sensor for ULF MRI of the brain would be somewhat larger than typical MEG sensors. Extending and refining those studies, we aim to provide a fairly general study of the optimization of ULF-MRI array performance, with special attention to SNR and imaging the human head.
We begin by defining relevant quantities and reviewing basic principles of ULF MRI in Sec. II. Then, we analyze the effects of sensor geometry and size with different noise mechanisms (Sec. III), advancing to sensor arrays (Sec. IV). Finally, we show computed estimations of array SNR as functions of pickup size and number, and provide more detailed comparison of spatial SNR profiles with different array designs (Secs. V and VI).

A. Signal model and single-channel SNR
In contrast to conventional MRI, where the tesla-range main field is static and accounts for both polarizing the sample and for the main readout field, ULF MRI employs switchable fields. Dedicated electronics [34] are able to ramp on and off even the main field B 0 with an ultra-high effective dynamic range. An additional pulsed prepolarizing field B p magnetizes the target before signal acquisition. Typically, a dedicated coil is used to generate B p (B p ∼ 10-100 mT) in some direction to cause the proton bulk magnetization M ( r ) to relax with a longitudinal relaxation time constant T 1 towards its equilibrium value corresponding to B p . After a polarizing time on the order of seconds or less, B p is switched off-adiabatically, in terms of spin dynamics-so that M turns to the direction of the remaining magnetic field, typically B 0 , while keeping most of its magnitude.
Next, say at time t = 0, a short excitation pulse B 1 is applied which flips M away from B 0 , typically by 90 • , bringing M into precession around the magnetic field at positions r throughout the sample. While rotating, M ( r ) decays towards its equilibrium value corresponding to the applied magnetic field in which the magnetization precesses. This field, B L , may sometimes simply be a uniform B 0 , but for spatial encoding and other purposes, different non-uniform magnetic fields ∆ B( r, t) are additionally applied to affect the precession before or during acquisitions. The encoding is taken into account in the subsequent image reconstruction.
The ULF MRI signal can be modeled to a high accuracy given the absence of unstable distortions common at high frequencies and high field strengths. To obtain a model for image formation, we begin by examining M at a single point. If the z axis is set parallel to the total precession field B L , then the xy (transverse) components of M account for the precession. Assuming, for now, a static B L , and omitting the decay for simplicity, the transverse magnetization M xy = M xy (t) can be written as where ω = 2πf L is the precession angular frequency, e ♥ is the unit vector along the ♥ axis (♥ = x, y, z), and φ 0 is the initial phase, which sometimes contains useful information.
In an infinitesimal volume dV at position r in the sample, the magnetic dipole moment of protons in the volume is M ( r ) dV . It is straightforward to show that the rotating components of this magnetic dipole are seen by any magnetic field or flux sensor as a sinusoidal signal dψ s = |β| cos(ωt + φ 0 + φ s )M xy dV . Here |β| = |β( r )| is the peak sensitivity of the sensor to a unit dipole at r that precesses in the xy plane, and φ s = φ s ( r ) is a phase shift depending on the relative positioning of the sensor and the dipole. To obtain the total sensor signal ψ s , dψ s is integrated over all space: where φ( r, t) = t 0 ω( r, t ) dt + φ 0 ( r ) + φ s ( r ) .
Here, we have noted that the magnetic field can vary in both space and time and therefore ω = ω( r, t) = γB( r, t), where γ is the gyromagnetic ratio; γ/2π = 42.58 MHz/T for a proton. For convenience, the signal given by Eq. (2) can be demodulated at the angular Larmor frequency ω 0 = 2πf 0 corresponding to B 0 ; using the quadrature component of the phase sensitive detection as the imaginary part, one obtains a complex-valued signal where * denotes the complex conjugate, m( r ) = M xy ( r )e −iφ0( r ) is the uniform-sensitivity image, ∆ω = ω − ω 0 , and we define as the single-channel complex sensitivity profile. Besides geometry, β generally also depends on the direction of the precession field; β = β BL ( r ).
After acquiring enough data of the form of Eq. (3), the image can be reconstructed-in the simplest case using only one sensor, or using multiple sensors, each having its own sensitivity profile β. As a simplified model for understanding image formation, ideal Fourier encoding turns Eq. (3) into the 3-D Fourier transform of the sensitivity-weighted complex image β * m = (β * m)( r ). In reality, however, the inverse Fourier transform only provides an approximate reconstruction, and more sophisticated techniques should be used instead [40].
Here, we do not assume a specific spatial encoding scheme. Notably, however, the sensitivity profile is indistinguishable from m based on the signal [Eq. (3)]. In other words, the spatial variation of β * affects the acquired data in the same way as a similar variation of the actual image would, regardless of the spatial encoding sequence in ∆ω.
Consider a small voxel of centered at r. The contribution of the voxel to the signal in Eq. (3) is proportional to an effective voxel volume V . Due to measurement noise, the voxel value becomes V β * m + ξ, where ξ is a random complex noise term. If β is known, the intensitycorrected voxel of a real-valued image from a single sensor is given by where s( r ) = V β * ( r ) is the sensitivity of the sensor to m in the given voxel. Assuming that the distribution of ξ = |ξ|e iφ ξ is independent of the phase φ ξ , the standard deviation σ of Re ξe iφs is independent of φ s and proportional to σ s , the standard deviation of the noise in the relevant frequency band of the original sensor signal. The precision of a voxel value can be described by the (amplitude) SNR of the voxel value. The voxel SNR is defined as the correct voxel value m( r ) divided by the standard deviation of the random error and can be written as where the last expression incorporates that m ∝ B p , and that σ is inversely proportional to the square root of the total signal acquisition time, which is proportional to the total MRI scanning time T tot . It should be recognized, however, that σ also depends heavily on factors not visible in Eq. (6), such as the imaging sequence. Ultimately, the ability to distinguish between different types of tissue depends on the contrast-to-noise ratio (CNR), which can be defined as the SNR of the difference between image values corresponding to two tissues. A better CNR can be achieved by improving either the SNR or the contrast, which both strongly depend also on the imaging sequence.

B. SQUIDs, pickup coils and detection
SQUIDs are based on superconductivity, the phenomenon where the electrical resistivity of a material completely vanishes below a critical temperature T c [41]. A commonly used material is niobium (Nb), which has T c = 9.2 K. It is usually cooled by immersion in a liquid helium bath that boils at 4.2 K in atmospheric pressure.
SQUIDs can be divided into two categories, rf and dc SQUIDs, of which the latter is typically used for biomagnetic signals as well as for ULF MRI [19,27]. The dc SQUID is a superconducting loop interrupted by two weak links, or Josephson junctions; see Fig. 2(a). With suitable shunting and biasing to set the electrical operating point, the current or voltage across the SQUID can be configured to exhibit an oscillatory dependence on the magnetic flux going through the loop-analogously to the well known double-slit interference of waves.
A linear response to magnetic flux is obtained by operating the SQUID in a flux-locked loop (FLL), where an electronic control circuit aims to keep the flux constant by applying negative flux feedback via an additional feedback coil.
To avoid harmful resonances and to achieve low noise, the SQUID loop itself is usually made small. The signal is coupled to it using a larger pickup coil connected to the SQUID via an input circuit to achieve high sensitivity. An input circuit may simply consist of a pickup coil and an input coil in series, forming a continuous superconducting path which, by physical nature, conserves the flux through itself, and feeds the SQUID according to the signal received by the pickup coil, as explained in Sec. III A along with more sophisticated input circuits.
Different types of responses to magnetic fields can be achieved by varying the pickup coil geometry. Fig. 2(b-g) schematically depicts some popular types. The simplest case is just a single loop, a magnetometer, which in a homogeneous field responds linearly to the field component perpendicular to the plane of the loop (b). Two loops of the same size and orientation, but wound in opposite directions, can be used to form a gradiometer. The resulting signal is that of one loop subtracted from that of the other. It can be used to approximate a derivative of the field component with respect to the direction in which the loops are displaced (by distance b, called the baseline). Typical examples are the planar gradiometer (c) and the axial gradiometer (d). By using more loops, one can measure higher-order derivatives. Some ULF-MRI implementations [2,42] use second-order axial gradiometers (e). If a source is close to one loop of a longbaseline gradiometer, that 'pickup loop' can be thought of as a magnetometer, while the additional loops suppress noise from MRI coils or distant sources. However, adding loops also increases the inductance L p . Before a more detailed theoretical discussion regarding L p and SQUID noise scaling, we study the detection of the MRI signal by the pickup coils.

C. Sensitivity patterns and signal scaling
The magnetic flux Φ picked up by a coil made of a thin superconductor is given by the integral of the magnetic field B over a surface S bound by the coil path ∂S, Here, the line integral form was obtained by writing B in terms of the vector potential A as B = ∇ × A, and applying Stokes's theorem.
As explained in Sec. II A, the signal in MRI arises from spinning magnetic dipoles. The quasi-static approximation holds well at signal frequencies, providing a vector potential for a dipole m positioned at r as where µ is the permeability of the medium, assumed to be that of vacuum; µ = µ 0 . Substituting this into Eq. (7) and rearranging the resulting scalar triple product leads to where the expression for the sensor field B s is the Biot-Savart formula for the magnetic field at r caused by a hypothetical unit current in the pickup coil, as required by reciprocity.
The sensor field B s is closely related to the complex sensitivity pattern β introduced in Sec. II A. In an applied field B L = B L e z , the magnetization precesses in the xy plane, and β can in fact be written as For arbitrary B = B L e L , we have We choose to define the measured signal as the flux through the pickup coil-a convention that appears throughout this paper. The measurement noise is considered accordingly, as flux noise. This contrasts looking at magnetic-field signals and noise, as is often seen in the literature. Working with magnetic flux signals allows for direct comparison of different pickup coil types. Moreover, the approximation that magnetometer and gradiometer pickups respond to the field and its derivatives, respectively, is not always valid.
The signal often scales as simple power laws R α with the pickup coil size R (or radius, for circular coils). When the distance l from the coil to the signal source is large compared to R, a magnetometer sees a flux Φ ∝ BR 2 , giving an amplitude scaling exponent α = 2. When scaling a gradiometer, however, also the baseline b is proportional to R. This leads to α = 3 for a first-order gradiometer, or α = 2 + k for one of k th order. Conversely, the signal scales with the distance as l −α−1 , as is verified by writing the explicit forms of the field and its derivatives. The additional −1 in the exponent reflects the dipolar nature of the measured field (−2 for quadrupoles etc.). For some cases, the detected flux can be calculated analytically using Eq. (8). First, as a simple example, consider a dipole at the origin, and a circular magnetometer pickup loop of radius R parallel to the xy plane at z = l, centered on the z axis. The integral in Eq. (8) is easily integrated in cylindrical coordinates to give If the dipole precesses in, for instance, the xz plane, the corresponding sensitivity is |β| = B s . Instead, if precession takes place in the xy plane, the sensitivity vanishes; |β| = 0, and no signal is received. In this case, moving the pickup loop away from the z axis would cause a signal to appear. These extreme cases show that even the absolute value of a single-channel sensitivity is strongly dependent on the sensor orientation with respect to the source and the magnetic field, as is also seen in Fig. 3. Another notable property of the sensitivity |β| = B s from Eq. (11) is that if l is fixed, there is a value of R above which the sensitivity starts to decrease, i.e., part of the flux going through the loop comes back at the edges canceling a portion of the signal. By requiring ∂B s /∂R to vanish, one obtains R = l √ 2, the loop radius that gives the maximum signal. Interestingly, however, if instead of the perpendicular (z) distance, l is taken as the closest distance to the pickup-coil winding, then the coil is on a spherical surface of radius R a = l. Now, based on Pythagoras's theorem, R 2 + l 2 in Eq. (11) is replaced with l 2 . In other words, the sensor field is simply B s = e z µR 2 /2l 3 , so the scaling of α = 2 happens to be the same as for distant sources in this simple case.
Importantly, however, the noise mechanisms also depend on R, and moreover, the situation is complicated by the presence of multiple sensors. These matters are discussed in Secs. III-IV.

III. NOISE MECHANISMS AND SCALING
The signal from each measurement channel, corresponding to a pickup coil in the sensor array, contains flux noise that can originate from various sources. Examples of noise sources are the sensor itself, noise in electronics that drives MRI coils, cryostat noise, magnetic noise due to thermal motion of particles in other parts of the measurement device and in the sample, noise from other sensors, as well as environmental noise. This section is devoted to examining the various noise mechanisms and how the noise can be dealt with. Unless stated otherwise, noise is considered a random signal with zero average. We use amplitude scaling exponents α to characterize the dependence of noise on pickup-coil size and type.

A. Flux coupling and SQUID noise
For estimates of SQUID sensor noise as a function of pickup coil size, a model for the sensor is needed. As explained in Sec. II B, the signal is coupled into the SQUID loop via an input circuit. In general, the input circuit may consist of a sequence of one or more allsuperconductor closed circuits connected by intermediate transformers. Via inductance matching and coupling optimization, these circuits are designed to efficiently couple the flux signal into the SQUID loop.
Intermediate transformers can be useful for optimal coupling of a large pickup coil to a SQUID-coupled input coil, as analyzed e.g. in Ref. [43]. To further understand the concept, consider a two-stage input circuit where a pickup coil (L p ) is connected to a transmitting inductor L 1 to form a closed superconducting path; see Fig. 4. Ideally, the distance between the two coils is fairly small in order to avoid signal loss due to parasitic inductances of the connecting traces or wiring. The total inductance of this flux-coupling circuit by itself is L p + L 1 . The primary is coupled to a secondary inductor L 2 with mutual inductance M 12 . As the magnetic flux picked up in L p changes by ∆Φ p , there is a corresponding change ∆J 1 in the supercurrent flowing in the circuit such that the flux through the closed path remains constant. This passes the flux signal onwards to L 2 which forms another fluxtransfer circuit together with the input coil L i , which couples inductively into the SQUID.
Superconductivity has two important effects on the transmission of flux into the next circuit. First, the presence of superconducting material close to a coil tends to reduce the coil inductance because of the Meissner effect: the magnetic flux is expelled and the material acts as a perfect diamagnet. This effect is included in the given inductances L p and L 1 . The other effect emerges when the flux is transmitted into another closed superconducting circuit, such as via M 12 . This is because the transmitting coil is subject to the counteracting flux M 2 12 ∆J 1 /(L 2 +L i ) from the receiving coil of the other circuit. Now current ∆J 1 only generates a flux [L 1 − M 2 12 /(L 2 + L i )]∆J 1 in L 1 . Closing the secondary circuit thus changes the inductance from L 1 to where the last form is obtained by expressing the mutual inductance in terms of the coupling constant k 12 (|k 12 | < 1) as M 12 = k 12 √ L 1 L 2 . Note that we do not include a counteracting flux from the SQUID inductance L S back into L i , i.e., no screening from the biased SQUID loop. However, like other inductances, L i does include the effect of the presence of the nearby superconductors through the Meissner effect.
The change of flux though the dc SQUID loop is now obtained as or, with M iS = k iS √ L i L S and defining χ 1 and χ 2 such that L 1 = χ 1 L p and L 2 = χ 2 L i , we have For a given pickup coil, χ 1 and χ 2 can usually be chosen to maximize the flux seen by the SQUID. While the function in Eq. (15) is monotonous in k 12 , there is a single maximum with respect to parameters χ 1 , χ 2 > 0. Noting the symmetry, we must have χ 1 = χ 2 =: χ, and the factor in Eq. (15) becomes k 12 χ/[χ 2 (1−k 2 12 )+2χ+1], which is maximized at χ = 1/ 1 − k 2 12 . At the optimum, the coupled flux is given by Notably, with a k 12 ≈ 1, the coupling corresponds to a perfectly matched single flux-coupling circuit [41]. Already at k 12 = 0.8, 50% of the theoretical maximum is achieved, while matching without an intermediate transformer may cause practical difficulties or parasitic resonances. When referred to SQUID flux Φ S , the noise in the measured SQUID voltage in the flux-locked loop corresponds to a noise spectral density S ΦS (f ) at frequency f . As the signal transfer from the pickup coil to the SQUID is given by Eqs. (15), the equivalent flux resolution referred to the signal through the pickup coil can be written as Due to resonance effects and thermal flux jumps, L S needs to be kept small [41]. The flexibility of intermediate transformers allows the same model to estimate noise levels with a wide range of pickup coil inductances L p . In general, the inductance of a coil with a given shape scales as the linear dimensions, or radius R, of the coil. If the wire thickness is not scaled accordingly, there will be an extra logarithmic term [44]. Even then, within a range small enough, the dependence is roughly S

1/2
Φp ∝ R α with α = 1/2. The case of a magnetometer loop in a homogeneous field then still has a field resolution S

B. Thermal magnetic noise from conductors
Electric noise due to the thermal motion of charge carriers in a conducting medium is called Johnson-Nyquist noise [45,46]. According to Ampère's law ∇ × B = µ 0 J, the noise currents in the current density J also produce a magnetic field which may interfere with the measurement. In this view, devices should be designed in such a way that the amount of conducting materials in the vicinity of the sensors is small. However, there is a lower limit set by the conducting sample-the head. Estimations of the sample noise [38] have given noise levels below 0.1 fT/ √ Hz, consistent with a recent experimental result of 55 aT/ √ Hz [47]. Other noise sources still exceed those values by more than an order of magnitude.
More restrictingly, it is difficult to avoid metals in most applications.
To keep the SQUID sensors in the superconducting state, the array is kept in a helmet-bottom cryostat filled with liquid helium at 4.2 K. The thermal superinsulation of a cryostat usually involves a vacuum as well as layers of aluminized film to suppress heat transfer by radiation [41]. The magnetic noise from the superinsulation can be reduced by breaking the conducting materials into small isolated patches. Seton et al. [48] used aluminiumcoated polyester textile, which efficiently breaks up current paths in all directions. By using very small patches, one can decrease the field noise at the sensors by orders of magnitude, although with increased He boil-off [49].
To look at the thermal noise from the insulation layers in some more detail, consider first a thin slab with conductivity σ on the xy plane at temperature T . Johnson-Nyquist currents in the conductor produce a magnetic field B(x, y, z, t) outside the film. For an infinite (large) slab, the magnitude of the resulting field noise depends, besides the frequency, only on z, the distance from the slab (assume z > 0). At low frequencies, the spectral densities S Bα (α = x, y, z) corresponding to Cartesian field noise components are then given by [50] S 1/2 where d is the thickness of the slab and k B the Boltzmann constant.
The infinite slab is a good approximation when using a flat-bottom cryostat or when the radius of curvature of the cryostat wall is large compared to individual pickup loops. Consider a magnetometer pickup loop with area A placed parallel to the conducting films in the insulationto measure the z component of the magnetic field, B z . The coupled noise flux is the integral of B z over the loop area. If the loop is small, the noise couples to the pickup circuit as S Bz A. A coil of size R then sees a flux noise proportional to S 1/2 Bz R 2 , that is, α = 2. Instead, if the pickup coil is large, the situation is quite different. The instantaneous magnetic field depends on all coordinates and varies significantly over the large coil area. Consider the noise field at two points in the plane of the coil. The fields at the two points are nearly equal if the points are close to each other. However, if the points are separated by a distance larger than a correlation length λ c (z), the fields are uncorrelated. Therefore, if R λ c , the coupled flux is roughly a sum of A/λ 2 c uncorrelated terms from regions in which the field is correlated. Each term has a standard deviation of order S

1/2
Bz λ 2 c . The spectral density of the cryostat noise is then Most importantly, the flux noise amplitude S 1/2 Φ,c is directly proportional to the coil size R, and we now have α = 1. Still, the noise increases to a higher power of R than the sensor noise, which according to section III A scales as √ R and hence dominates in small pickup coils. For a continuous film, the correlation length λ c can be estimated from data in Ref. [51] to be around several times z. The correlation at distances smaller than λ c is due to two reasons. First, the magnetic field due to a small current element in the conductor is spread in space according to the Biot-Savart law. Second, the noise currents in elements close to each other are themselves correlated. The latter effect is broken down when the film is divided into small patches; only very small current loops can occur, and the noise field starts to resemble that of Gaussian uncorrelated magnetic point dipoles throughout the surface. In this case, Eq. (18) is no longer valid, but the approximate relation of Eq. (19) still holds-now with a smaller λ c .
The magnetometer case is easily extended to first-order planar gradiometers parallel to the superinsulation layers [ Fig. 2(b, f)]. For a very small baseline, b λ c , the field noise is effectively homogeneous and thus cancels out. However, when b λ c , the spectral density of the noise power is twice that of a single loop.

C. MRI electronics, coils and other noise sources
As explained in Sec. II A, MRI makes heavy use of applied magnetic fields. The fields are generated with dedicated current sources, or amplifiers, to feed currents into coils wound in different geometries. As opposed to applying static fields, a major challenge arises from the need for oscillating pulses and the desire to quickly switch on and off all fields, including not only readout gradients but also the main field B 0 , which requires an ultra-high dynamic range to avoid excess noise. Switching of B 0 enables full 3-D field mapping for imaging of small electric currents in volume [34]. Noise in the coil currents can be a major concern in the instrumentation. The contribution from B 0 ideally scales with pickup coil size as R α , α = 2 for a magnetometer, and noise in linear gradients essentially scales as α = 2 in magnetometers as well as fixed-baseline gradiometers. With b ∝ R, first-order gradiometers experience noise from linear gradient coils according to α = 3.
MRI coils themselves also produce Johnson-Nyquist noise. In particular, the polarizing coil is often close to the sensors and made of thick wires as it should be able to produce relatively high fields. This allows thermal electrons to form current loops that generate field noise with complicated spatial characteristics, which is detrimental to image quality and should be eliminated. Another approach is to use litz wire, which is composed of thin wires individually coated with an insulating layer. This prevents significant noise currents perpendicular to the wire and eliminates large current loops. However, efficient uniform cooling of litz wire is problematic, leading to larger coil diameters. Increasing the coil size, however, significantly increases harmful transients in the system as well as the power and cooling requirements [52]. Instead, we have had promising results with thin custom-made superconducting filament wire and DynaCAN (Dynamical Coupling for Additional dimeNsions) in-sequence degaussing waveforms to solve the problem of trapped flux [52,53]; optimized oscillations at the end of a pulse can expel the flux from the superconductor. Such coils contain much less metal, and significantly reduce the size of current loops that can generate magnetic noise.
A significant amount of noise also originates from more distant locations. Power lines and electric devices, for instance, are sources that often can not be removed. Indeed, magnetically shielded rooms (MSRs) effectively attenuate such magnetic interference. However, pulsed magnetic fields inside the shielded room induce eddy currents exceeding 1 kA in conductive MSR walls [54], leading to strong magnetic field transients that not only saturate the SQUID readout, but also seriously interfere with the nuclear spin dynamics in the imaging field of view. Even a serious eddy current problem can again be solved with a DynaCAN approach where optimized current waveforms are applied in additional coil windings to couple to the complexity of the transient [55].
Noise from distant sources typically scales with the pickup coil size with an exponent at least as large as the signal from far-away sources: α = 2 + k for a k th -order gradiometer (see Sec. II B). Although the noise detected by gradiometers scales to a higher power than with magnetometers (k = 0), gradiometers have the advantage that they, in principle, do not respond to a uniform field. For a higher-order gradiometer that is not too large, the environmental noise is nearly uniform in space, and therefore effectively suppressed by the pickup coil geometry. Gradiometers with relatively long baselines can also be seen as magnetometers when the source is close to one of the loops. Still, they function as gradiometers from the perspective of distant noise sources. A similar result applies for so-called software gradiometers, which can, for example, be formed by afterwards taking the difference of the signals of two parallel magnetometers. However, in Sec. IV A, a more sophisticated technique is described for minimizing noise in the combination of multiple channels. At very low system noise levels, other significant noise mechanisms include noise due to dielectric losses. Electrical activity in the brain can also be seen as a source of noise. This noise, however, is strongest at frequencies well below 1 kHz. Using Larmor frequencies in the kHz range may therefore be sufficient for spectral separation of brain noise from MRI.
The amplitude scaling exponents α for signal and noise are summarized in Table I. The notation in later sections refers to the scaling of flux signal and noise in terms of α s and α n , respectively. For a single sensor, the SNR scaling R δ is given by δ = α s − α n . Table I. Amplitude scaling exponents α for the flux noise standard deviation σ ∝ R α as well as the signal, given different pickup-coil geometries and noise mechanisms.

A. Combining data from multiple channels
It is common to work with absolute values of the complex images to eliminate phase shifts. Images from multiple channels can then be combined by summing the squares and taking the square root. This procedure, however, causes asymmetry in the noise distribution and loses information that can be used for improved combination of the data. If the sensor array and the correlations of noise between different sensors are known, the multi-channel data can be combined more effectively.
In the following, we show that, where multiple sensors can form a software gradiometer, an array of N sensors can form an N th -order combination optimized to give the best SNR for each voxel.
To follow the derivation in Ref. [39], consider a voxel centered at r, and N sensors indexed by j = 1, 2, ..., N . Based on Sec. II A, each sensor has a unit magnetization image s j ( r ) = β * j ( r )V , where β j and V are the sensitivity profile and voxel volume, respectively. The absolute value |s j | gives the sensed signal amplitude caused by a unit magnetization in the voxel, precessing perpendicular to B L . The complex phase represents the phase shift in the signal due to the geometry. To study the performance of the array only, we set V to unity.
where † denotes the conjugate transpose. Requiring that the outcome is sensitivity-corrected sets a condition on the coefficient vector a = [a 1 , ..., a N ] . In the absence of noise, a unit source magnetization gives v j = s j ( r ). The final voxel value u should represent the source, which leads to the condition Below, we show how a = [a 1 , ..., a N ] should be chosen in order to maximize the SNR in the final image given the sensor array and noise properties. The single-sensor image values v i can be written in the form v j = w j + ξ j where w j is the 'pure' signal and ξ j is the noise. The noise terms ξ j can be modeled as random variables, which, for unbiased data, have zero expectation: E(ξ j ) = 0. If there is a bias, it can be measured and subtracted from the signals before this step. The expectation of the final value of this voxel is then The noise in the voxel is quantified by the variance of u.
Eqs. (20) and (22) yield u = E(u) + a † ξ, leading to where Σ = E(ξξ † ) identifies as the noise covariance matrix. For simple cases, Σ is the same for all voxels. However, it may vary between voxels if, for instance, the voxels are of different sizes. Now, the task is to minimize the noise a † Σa subject to the constraint in Eq. (21). The Lagrange multiplier method turns the problem into finding the minimum of with respect to a, while still requiring that Eq. (21) holds. From the constraint it follows that a † s is real, so it may be replaced by (a † s + s † a)/2 in Eq. (24). By 'completing the square' in Eq. (24), one obtains whereã satisfies Since Σ, being a covariance matrix, is positive (semi)definite, the minimum of L is found at a =ã. Further, Σ is always invertible, as the contrary would imply that some non-trivial linear combination of the signals would contain zero noise. Multiplying Eq. (26) by s † Σ -1 from the left and using Eq. (21) leads to λ = −2/s † Σ -1 s. When this expression for λ is put back into Eq. (26), the optimal choice for the coefficient vector a =ã is obtained as Similar to Eq. (7) of Ref. [56], Eqs. (23) and (27) reveal the final noise variance σ 2 fin for the given voxel position, In the above derivation, we assumed little about how the individual single-sensor data were acquired. In fact, the only significant requirement was that the sensitivities s i are well defined and accessible. As discussed previously, the signal can be modeled to high accuracy at ULF (see Sec. II A).

B. Figures of merit and scaling for arrays
Given the N th -order combination from Eqs. (20) and (27), the contribution of the sensor array to the voxelwise image SNR is given by Eq. (28). We define the array-sensitivity-to-noise ratio aSNR as When each sensor in the array sees an equal flux noise level σ, the aSNR 1/2 takes the form where X = Σ/σ 2 is the dimensionless noise correlation matrix. We refer to the quantity √ s † X -1 s as the array sensitivity, which for weak correlation is given approximately as ||s|| 2 . Scaling law exponents for the array sensitivity are denoted by α a , and for the aSNR by δ = α a − α n .

C. Correlation of noise between sensors
As already seen in Secs. IV A and IV B, the aSNR is affected by the correlation of random noise between different single-sensor channels. There are two main reasons for such correlations. First, a noise source that is not an intrinsic part of a sensor can directly couple to many sensors. For instance, thermal noise in conductors close to the sensors may result in such correlated noise (see Sec. III B). Second, the pickups of the sensors themselves are coupled to each other through their mutual inductances. This cross-coupling increases noise correlation and may also affect the sensitivity profiles via signal cross-talk.
To see the effect of noise correlation on the image SNR, consider a noise covariance matrix of the form where I is the identity matrix and C contains the correlations between channels (the off-diagonal elements of X). In words, each channel has a noise variance of σ 2 and channels p and q have correlation C pq = E(ξ p ξ * q )/σ 2 . Assume further that absolute values of the correlations C pq are substantially smaller than one.
To first order in C, the inverse of Σ is obtained as Σ −1 ≈ σ −2 (I − C). The SNR in the final image, according to Eq. (28), is then proportional to σ −1 fin , with Clearly, the effect of correlations on the image SNR is governed by the sum in Eq. (32). Assume that the dominant terms in the sum correspond to adjacent sensors p and q. For voxels not too close to the sensors, the sensitivities s p and s q are similar, and therefore s * p s q ≈ |s p | 2 ≈ |s q | 2 . Further, if the noise correlation between the adjacent sensors is positive, one has Re (s * p s q C pq ) > 0. This leads to the conclusion that the noise correlation tends to decrease the image SNR.
While the assumptions made in the above discussion may not always be exactly correct, the result is an indication that the correlation of noise between adjacent sensors is usually harmful-even if it is taken into account in reconstruction. Moreover, the actions taken in order to reduce noise correlation are often such that the noise variances decrease as well. For instance, eliminating a noise source from the vicinity of the sensor array does exactly that.
Correlation can also be reduced by minimizing the inter-sensor cross-talk, for instance by designing a sensor array with low mutual inductances between pickup coils. If the mutual inductances are non-zero, the cross-talk can be dramatically reduced by coupling the feedback of the SQUID flux-locked loop to the pickup circuit instead of more directly into the SQUID loop [41]. This way, the supercurrent in the pickup coil stays close to zero at all times. In theory, the cross-talk of the flux signals can be completely eliminated by this method.
Correlated noise originating from sources far from the subject's head and the sensor array can also be attenuated by signal processing methods prior to image reconstruction. The signal space separation method (SSS) was developed at Elekta Neuromag Oy [30] (now ME-GIN) for use with 'whole-head' MEG sensor arrays. The SSS method can distinguish between signals from inside the sensor helmet and those produced by distant sources. Now, the strong noise correlation is in fact exploited to significantly improve the SNR. Similar methods may be applicable to ULF MRI as well. To help such methods, additional sensors can be placed outside the helmet arrangement to provide an improved noise reference.
For sensor array comparisons, we assume that all measures have been taken to reduce correlated noise before image reconstruction. The details of the remaining noise correlation depend on many, generally unknown aspects. Therefore, we set C = 0 in Eq. (31) for a slightly optimistic estimate, i.e., sensor noises are uncorrelated, each having variance σ 2 .

D. Filling the array
In this section, we use general scaling arguments to provide estimations of how the whole sensor array performs as a function of the pickup coil size. Consider a surface, for instance, of the shape of a helmet, and a voxel at a distance l from the surface. The surface is filled with N pickup coils of radius R to measure the field perpendicular to the surface. We assume the pickup coils are positioned either next to each other or in such a way that their areas overlap by a given fraction (see Fig. 1). The number of sensors that fit the surface is then proportional to R −2 .
Take, at first, a voxel far from the sensors; l R. Now, the signal from the voxel is spread over many sensors. For Σ = σ 2 I, the aSNR is proportional to s 2 /σ. Assume that s j ∝ R αs and σ ∝ R αn , which leads to s 2 ∝ √ N R αs ∝ R αs−1 , and finally, Here we thus have array sensitivity scaling according to α a = α s − 1, as opposed to α a = α s when N is fixed. Recall from Sec. II B that the flux sensitivities scale as R αs with α s = 2 for magnetometers and α s = 3 for first-order planar gradiometers, given that l R. Assuming, for instance, optimally matched input circuits, the intrinsic flux noise of the sensor in both cases has a power law behavior with exponent α n = 1/2 (see Sec. III B), which yields δ = 0.5 and δ = 1.5. This is clearly in favor of using larger pickup coils. Especially for larger R, however, the cryostat noise may become dominant, and one has α n ≈ 1. Now, magnetometer arrays have δ ≈ 0, i.e., the coils size does not affect the SNR. Still, gradiometer arrays perform better with larger R (α a ≈ 1).
In the perhaps unfortunate case that noise sources far from the sensors are dominant, the noise behaves like the signal, that is, α s = α n and δ = −1. Unlike in the other cases, a higher SNR would be reached by decreasing the pickup coil size. However, such noise conditions are not realistic in the low-correlation limit. Instead, one should aim to suppress the external noise by improving the system design or by signal processing.
The breakdown of the assumption of l R needs some attention. If the voxel of interest is close to the sensor array, the image value is formed almost exclusively by the closest pickup-loop. Now, for non-overlapping pickups, the results for single sensors (α a = α s ) are applicable, and the optimum magnetometer size is R ≈ l. But then, if the voxel is far from the array (deep in the head), and R is increased to the order of l, it is more difficult to draw conclusions. We therefore extend this discussion in Secs. V and VI by a computational study.

V. METHODS FOR NUMERICAL STUDY
In order to be able to compare the performance of different sensor configurations, we used 3-D computer mod-els of sensor arrays and calculated their sensitivities to signals from different locations in the sample.
The sensitivities of single pickup coils were calculated using B s from Eq. (8). Evaluating the line integral required the coil path ∂S to be discretized. The number of discretization points could be kept small by analytically integrating Eq. (34) over n straight line segments between consecutive discretization points r j and r j+1 (the end point r n = r 0 ): As shown in Appendix A, this integrates exactly to where a j = r j − r. Besides reducing computational complexity and increasing accuracy, this result allowed exact computation for polygonal coils. For a precession field B L = B L e L , the single-sensor sensitivities were obtained from Eq. (10) and the arraysensitivity and aSNR maps were computed according to Sec. IV B. The normalization of the values computed here is somewhat arbitrary; the real image SNR depends on a host of details that are not known at this point (see Sec. II A). However, the results can be used for studying array sensitivity patterns and-with noise levels scaled according to estimated coil inductances-for comparing different possible array setups.

VI. RESULTS
Numerical calculations were performed for simple spherical sensor arrays (Sec. VI A) as well as for realistic configurations (Sec. VI B), e.g., of the shape of a helmet. The former were used for studying scaling behavior of array sensitivities with sensor size and number, extending the discussion in Sec. IV D. The latter were used for comparing array sensitivity patterns of different potential designs.

A. Effects of size and number
A sensor array model was built by filling the surface of a sphere of radius 10 cm (see Fig. 5) with N magnetometers or N/2 planar units of two orthogonal planar firstorder gradiometers. Combining one of the magnetometers with one of the gradiometer units would thus give a sensing unit similar to those of the Elekta/Neuromag MEG system, though circular (radius R). All sensors were oriented to measure the radial component of the field. A spherical surface of radius 6 cm was chosen to represent the cerebral cortex. The cortex surface was thus at distance 4 cm from the sensor shell. In addition, the center of the sphere was considered to represent deep parts of the brain.
The data in Fig. 6 show the dependence of the array sensitivity on R. Note that the number of sensors is approximately proportional to R −2 . The largest coil size R = 10 cm corresponds to one magnetometer or gradiometer unit on each of the six faces of a cube. The solid lines correspond to the scaling of the sensitivity as R αa , α a = α s − 1. For smaller R, the scaling laws from Sec. IV D hold in all cases, and particularly well for gradiometers and deep sources. The scaling law fails most notably with the magnetometer array at the cortex. Indeed, the sensitivity starts to decrease with R when R is very large, as was shown for a special case in Sec. II B.
The error bars in Fig. 6 correspond to the minimum and maximum value of the sensitivity at the cortex while the data symbols correspond to the average value. Despite the strong orientational dependence of single sensors (see Sec. II B), the array sensitivities are fairly uniform at the cortex. Only at large R do the orientational effects emerge. Figure 7 shows a different dataset on how the array sensitivity changes with how densely the sensors are packed into the array. In this case, a varying number of magnetometer coils or gradiometer units with fixed radius R = 1.44 cm was distributed on the spherical shell. The aSNR of voxels at the center scales as √ N to an excellent accuracy. While the average sensitivity at points on the cortex also obeys √ N scaling remarkably well, the uniformity drops dramatically when N is lowered below roughly 30 sensors. Closer to the sensors, e.g. on the scalp, this effect is even more pronounced.    Other slices, however, displayed similar performance at the cortex. Also changing the direction of the precession field B L had only a minor effect on the SNR in the region of interest. In all cases shown here, B L was parallel to the y axis, which is perpendicular to the visualization plane. Note that this contrasts MRI convention, where the B L direction is considered fixed and always along the z axis.
In most cases, the sensors are arranged on a helmet surface at 102 positions as in the Elekta/Neuromag system. Again, magnetometers and planar double-gradiometer units are considered separately (here, R = 1.25 cm, resembling conventional MEG sensors). The same flux noise level was assumed for magnetometers and planar gradiometers of the same size. In addition, we consider arrays with axial gradiometers as well as radially oriented planar gradiometers, both cases having k = 1, b = 4 cm and R = 1.25 cm. Configurations with 102 overlapping units with R = 2.5 cm are also considered, as well as the existing Los Alamos 7-channel coil geometry [42] and the single large second-order gradiometer at UC Berkeley [2] (see figure caption). For long-baseline gradiometers with k = 1, L p was estimated to be twice that of a single loop, and six times for k = 2.
With planar sensor units of R = 1.25 cm [ Fig. 8(a-b)], the aSNR for 102 magnetometers is three times that of 204 gradiometers at the cerebral cortex. At the center of the head, the difference is almost a whole order of magnitude in favor of the magnetometers. Therefore, the small gradiometers bring little improvement to the image SNR if the magnetometers are in use. However, as shown previously, especially gradiometer performance improves steeply with coil size. Allowing the coils to overlap with R = 2.5 cm [ Fig. 8(g-h)] leads to a vastly improved aSNR, especially with gradiometers, but also with magnetometers.
Gradiometers with long baselines provide somewhat magnetometer-like sensitivity patterns while rejecting external noise. However, their aSNR performance is inferior to magnetetometers because of their larger inductance, yielding higher flux noise when the sensor noise dominates; see Sec. III A. Helmet arrays of magnetometers can provide a similar aSNR in the deepest parts of the brain as the Berkeley gradiometer provides at a small area on the scalp.

VII. CONCLUSIONS AND OUTLOOK
Extending Ref. [39], we analyzed a variety of factors that affect the noise and sensitivity of a SQUID-based sensor array for ULF MRI of the brain. Many of the principles, however, apply to non-SQUID arrays as well.
We also derived numerical means for studying and comparing the SNR performances of any given sensor array designs.
Signal-and noise-scaling arguments and calculations showed that filling a sensor array with a huge number of tiny sensors is usually not advantageous. Larger pickup coil sizes give a better image SNR at the center of the head and, up to some point, also at closer sources such as the cerebral cortex. This is true even if the number of sensors needs to be decreased due to the limited area available for the array. However, the average voxel SNR is proportional to the square root of the number of sensors.
Several possible array designs were compared, including existing arrays designed for MEG and ULF MRI. The results are mostly in favor of magnetometers and large first-order gradiometers. While typically having inferior SNR, gradiometers do have the advantage of rejecting external fields, reducing also transient issues due to pulsed fields [52]. An especially dramatic difference was found when comparing a magnetometer-filled helmet with a single larger gradiometer.
In general, using an array of sensors relaxes the dynamic range requirements for sensor readout. Splitting a large loop into smaller ones further allows interference rejection based on correlation, while also increasing the SNR close to the center of the loop. An array of many sensors also solves the single-sensor problem of 'blind angles'.
Our initial analysis of overlapping magnetometer and gradiometer coils gave promising results. Implementing such arrays, however, poses challenges. Practical considerations include how to fabricate such an array and what materials to use. For instance, wire-wound Type-I superconducting pickup coils have shown some favorable properties [57,58] in pulsed systems, and exploiting the dynamics of superconductor-penetrating flux [52,53,59] has been promising. However, existing techniques are not suitable for helmet configurations with overlapping coils. In addition, careful design work should be conducted to minimize mutual inductances and other coupling issues. Further significant improvements could be achieved by placing the sensors closer to the scalp, but that would require dramatic advancements in cryostat technology, and was not studied here.
Here, we only considered the contribution of the sensor array to the imaging performance. Other things to consider are the polarizing technique as well as the ability of the instrumentation to apply more sophisticated sequences and reconstruction techniques, while preserving low system noise. A class of techniques enabled by multichannel magnetometers is accelerated parallel MRI [32]. However, the so-called geometry factor should be taken into account [60] if large parallel acceleration factors are pursued.