Spectral μCT with an energy resolving and interpolating pixel detector

A main challenge in x-ray μCT with laboratory radiation derives from the broad spectral content, which in contrast to monochromatic synchrotron radiation gives rise to reconstruction artifacts and impedes quantitative reconstruction. Due to the low spectral brightness of these sources, monochromatization is unfavorable and parallel recording of a broad bandpath is practically indispensable. While conventional CT sums up all spectral components into a single detector value, spectral CT discriminates the data in several spectral bins. Here we show that a new generation of charge integrating and interpolating pixel detectors is ideally suited to implement spectral CT with a resolution in the range of 10 μm. We find that the information contained in several photon energy bins largely facilitates automated classification of materials, as demonstrated for of a mouse cochlea. Bones, soft tissues, background and metal implant materials are discriminated automatically. Importantly, this includes taking a better account of phase contrast effects, based on tailoring reconstruction parameters to specific energy bins. © 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement


Introduction
Spectral x-ray computed tomography (CT), also known as color-CT, extends conventional 'grayscale' x-ray imaging representing the x-ray attenuation coefficient at a single photon energy E, by two or more energy bins up to an entire spectral dimension [1][2][3]. Energy-binned x-ray data recorded for each projection angle can be reconstructed independently or jointly using advanced regularized iterative reconstruction methods [4], in order to gather more information about the material composition. The simplest case of dual energy recordings can be implemented either with two sources and detection devices operated in parallel (dual-source), with two voltage settings (dual-kVp) or with energy-resolving detectors, which is the preferred option for multi-E recordings. While dual energy CT and color-CT is rather advanced in medical CT and to some extent also in macroscopic non-destructive testing, it has not yet fully reached the realm of µCT, i.e. high resolution CT with voxel sizes in the range of a few microns or below. This may be due to the fact that many µCT applications are carried out with monochromatic synchrotron radiation, and that imaging an object at several E, i.e. full field x-ray spectro-microscopy is well established in a sequential recording mode [5]. In translating spectro-microscopy and hence 'chemical contrast' to laboratory µCT, and thereby make it more broadly accessible, one of the main challenges is the development of suitable energy-resolving detectors, recording several energy bins in parallel. Due to the smaller pixel size required, this problem is accentuated with respect to macroscopic CT, and due to the notoriously low brilliance of laboratory sources, sequential measurements with monochromatic radiation is highly unfavorable.
The advent of direct conversion hybrid pixel detectors with Charge Integrating (CI) readout has created unique opportunities to implement spectral CT, with photon energy resolution limited only by their electronic noise. Unfortunately, charge sharing, due to the diffusion of the charge cloud produced by x-rays drifting to the collection electrode under the influence of the applied electric field, imposes a physical limit to the minimal useful pixel size to a few tens of microns [6], which is particularly detrimental for µCT applications. However, CI pixel detectors operated in the single photon regime, i.e. with single detected photons separated in time and space, can overcome this limit since the charge collected by neighbouring pixels can be summed up to obtain the photon energy (clustering) and its distribution can be analyzed to extract the point of incidence with a resolution better or equal to the pixel pitch. This has been exploited by the MÖNCH detector [7,8], a CI pixel detector with 25 µm pixel pitch, developed at the Paul-Scherrer institute. With the parameters of the MÖNCH, photons with energies between 10 keV and 20 keV will generally produce charge clouds larger than a single pixel but contained in 2×2 pixels. From these 4 charge-values, not only the photon energy (total charge) and the interaction pixel (maximum charge) can be extracted, but the interaction point of the photon can even be extracted with sub-pixel resolution of easily a factor of 5 in both dimensions by exploiting centroiding methods (interpolation).
In this work, we demonstrate that the MÖNCH detector, or more generally equivalent detector technology, can enhance laboratory µCT applications, in particular of biological specimen relevant for pre-clinical research. By providing simultaneous energy-resolved detection on a single-photon level, material identification and image segmentation is significantly improved and facilitated. We illustrate this with a tomography example at the scale of small animal organs, notably a mouse cochlea [9,10], which we have studied before by conventional grayscale CT. As in [9], in this work we use a liquid-metal jet anode [11] to reach sufficient resolution. The cochlea with its intricate three dimensional (3d) structure is indeed a perfect example for the benefits of virtual 3d histology. µCT can be instrumental in studies of malformations caused by genetic defects, in guiding new treatments, and for the development of cochlear implants [12,13], including novel optogenetic implants [14]. For this purpose, implant materials have to be distinguished from soft tissues (nerve, membranes) as well as from surrounding bone, which is a tedious task when performed manually. We show that spectral CT can not only be used to attribute certain features of the object, from a set of pre-defined materials, but that it can also serve to distinguish regions in the object by classification in a higher dimensional contrast space, without prior knowledge. Based on the experimental measurements we demonstrate the benefit of spectral imaging for automatic segmentation.

Sub-pixel sensitivity of the MÖNCH
For sub-pixel resolution and photon-energy information, the detector has to be operated in single-photon mode such that the detected photons are spatially or temporally well separated. The photon flux must be low enough such that no more than a single photon is absorbed per 3×3-cluster of the sensor per frame. Hence, the number of photons that can be detected per time interval ∆t per pixel must not exceed where f is the detector frame rate. However, due to the stochastic nature of photon detection, pileup of more than one photon per 3×3-cluster and frame is likely to happen if the mean flux is close to the threshold on the right hand side of (1). This should be avoided by using a factor of 10 lower flux than the one given in (1). Note that interpolation further decreases the photon statistics per interpolated pixel since all the available photons have to be distributed amongst the pixels of the interpolated image.
Here we give a brief overview of the sub-pixel photon position finding algorithm which is used to interpolate the images. In this study, we use an improved version of the position finding algorithm described in Ref. [15], which is based on a two-dimensional generalization of the η-algorithm [16]. For a model of the charge diffusion inside the sensor see Ref. [17] and the references therein. The first step is to identify charge clusters caused by photon hits as described in [18], producing a list of 2×2-clusters. Each cluster is further analyzed separately to find the center of the charge cloud with sub-pixel resolution in the following way.
Let p be the pixel pitch and let (x, y) be a Cartesian coordinate system centered in the center of the 2×2-cluster and let (x 0 , y 0 ) denote the center of the charge distribution. Clearly (x 0 , y 0 ) lies within |x 0 | ≤ p/2, |y 0 | ≤ p/2 (black dashed square in Fig. 1), otherwise a different 2×2-cluster would have been identified. Let Q 1,1 , Q 1,2 , Q 2,1 , and Q 2,2 be the charges in the 4 pixels (see Fig. 1) and let Q = i,j Q i,j be the total charge. Define It is easy to see that η x ≥ 0.5 if and only if x 0 ≥ 0 and likewise η y ≥ 0.5 if and only if y 0 ≥ 0. Thus (η x , η y ) can be used without further steps for two-fold interpolation. Higher resolution requires more information about the functional dependence of (η x , η y ) on (x 0 , y 0 ) or, more formally, on the map The properties and shape of Ψ −1 depend on the shape of the charge distribution and on the local charge collection efficiency within the pixels. Depending on the spatial diameter of the charge distribution with respect to the pixel pitch p, the map Ψ −1 will either be surjective, injective, or both. Difficulties arise from the fact that there is no simple way to measure Ψ −1 . Assume that Ψ −1 is invertible and define Ψ := Ψ −1 −1 to be its inverse. Such a map Ψ can also be defined for non-invertible Ψ −1 but we omit this here for brevity. Information about Ψ can be inferred by measuring a flat-field, i.e. shining light uniformly onto the sensor, and computing a 2-dimensional histogram of (η x , η y ) to estimate the probability density f 0 (η x , η y ). Using that (x 0 , y 0 ) is distributed uniformly within [−p/2, p/2] 2 , i.e. its probability density is ρ(x 0 , y 0 ) = 1/p 2 , it follows that where denotes the Jacobian determinant in terms of the two components (Ψ 1 , Ψ 2 ) = Ψ. The key problem is to estimate the map Ψ from a known probability density f 0 . More formally, it comes down to finding a continuously differentiable map Ψ : [0, 1] 2 → [−p/2, p/2] 2 , that solves where we absorbed the constant factor p 2 into f 0 . Such a solution always exists (given f 0 is sufficiently well-behaved) but is generally not unique [19]. In the simplest case Ψ 1 is independent of η y and likewise Ψ 2 is independent of η x . In this case the second term on the right hand side of (5) vanishes and a (unique) solution can be found using the marginal distributions ) dη x and solving the first order ODEs Ψ 1 = f 0X and Ψ 2 = f 0Y . However, the function f 0 does not factorize in general as can also be seen from the measured f 0 shown in Fig. 1(b,c). This non-factorizability can be caused by "dead regions" in the local charge-collection efficiency as well as by some charge-clouds having tails that extend beyond the 2×2-cluster. The simple approach can nevertheless be used to obtain an approximate solution for reasonable results [18]. A more involved approach using a custom iterative optimization procedure was proposed in Ref. [15]. However, their method is computationally very expensive. generates a charge cloud (blue) that is shared among the pixels in a 2×2-cluster size. These 4 charges Q i,j are used to compute coordinates (η x , η y ). A non-linear transformation Ψ is used to reconstruct the spatial coordinates (x 0 , y 0 ) from (η x , η y ). (b,c) Histogram of (η x , η y ) measured by homogeneous illumination with photons of energies 7 keV to 11.2 keV and 22 keV to 26.5 keV, respectively. The lines indicate the corresponding coordinate transformation Ψ (iso-lines in the image) calculated with the algorithm described in appendix A.
In this work, we propose to use a method described in Ref. [19], which greatly simplifies for the problem at hand. It consists of first computing a special solution of (6) and (optionally) refining this solution in an iterative procedure to obtain a unique "optimal" solution. The solution is optimal in the sense that it minimizes the integral and thus penalizes the distance the map Ψ moves each bit of material. Such a map is hence called an optimal transport map. The method is described in appendix A. Each iteration step consists only of a few fast Fourier transforms and point-wise operations. The method usually converges in less than 50 iterations on a 100×100 grid. Note however, that an optimal transport map is curl-free, while the physical solution will not necessarily be curl-free. Nevertheless, since the curl-free solution inherits the symmetries from the probability density f 0 it should be a good approximation to the physical solution for the purpose of interpolation.

Experiment
The experiments were performed at the Institut fär Röntgenphysik, Göttingen, using the JuLiA µCT facility and the local computing infrastructure. A sketch of the laboratory setup is shown in Fig. 2(a). The liquid-metal jet X-ray source (Excillum, Sweden) with a Galinstan (GaInZn alloy) anode was operated at 70 kV acceleration voltage, 18 W electron power, and a source spot of ≈10 µm in diameter. A 12.5 µm-thick iron filter was used to attenuate the lower energies that dominate the spectrum. Sample and detector were placed at z 01 = 134 cm and z 02 = 171 cm downstream from the source, respectively, resulting in a geometric magnification of M = 1.28 and a maximum divergence of 3 mrad. A dry cochlea of a mouse with an optical implant [20] that has been imaged previously with other methods was used as sample. The implant consists of GaN µLEDs (62×62×4.5 µm 3 ) arranged on a 230 µm-wide polyimide-strip and connected with AuTi and Pt tracks. The whole implant is dip-coated in silicone.
The MÖNCH03 detector was operated with 0.5 ms counting time and, including the read-out (dead) time of 0.25 ms, a total frame period of 0.75 ms. In total 721 tomographic projection (angles) with 3.5×10 5 frames each were acquired.
The reconstruction of the 3d-volumes from the detector data was performed in a multi-step process. First, several hitmaps for different pixel-resolutions and photon energy windows were computed for each tomographic projection using in-house C++-code, making significant use of the Eigen library [21]. More information about this procedure is given in appendix A. The result of this first step are several independent tomographic data sets for different energies (colors) and resolutions. One high-resolution data set with an interpolation factor 4 and photon energy range 7 keV to 30 keV, and 4 energy-resolved data sets with a smaller interpolation factor 2 and energy ranges 7 keV to 11.2 keV, 11.2 keV to 16.6 keV, 16.6 keV to 22 keV, and 22 keV to 26.5 keV, were generated. The energy ranges and resolutions were chosen such that the photon statistics for each data set are approximately the same, resulting in 100 to 200 photons per pixel and tomographic angle. Then, these data sets are fed independently into the well-established data processing pipeline (see for example Ref. [22]) using the HoloTomoToolbox [23]. After empty-beam correction, the dead pixels were treated by replacing them with the corresponding values of the 180°-rotated sample. To utilize the strong "edge enhancement" due to phase contrast, the corrected images were then filtered with the BAC-algorithm [24]. The FDK-algorithm provided by the ASTRA-Toolbox [25] was used for tomographic reconstruction.
The high-resolution reconstruction (full photon energy bandwidth) was segmented manually in Avizo. Independently automatic segmentations were computed using the Fuzzy Local Information C-Means (FLICM) algorithm [26]. These segmentations were benchmarked with the manual segmentation, which was considered "ground truth". More details are given in section 5. On the other hand, the MÖNCH03 shows peaks for the three pairs, not resolving the individual lines within the pairs. Due to the rapidly decreasing absorption cross section of the 300 µm-thick silicon sensor for higher energies, the second peak is significantly lower than the first and the third peak is barely visible. The energy resolution of the MÖNCH03 was determined to be approximately 1.3 keV FWHM by convolving the AmpTek spectrum with Gaussians of varying widths and optimizing for the best agreement of the peak shapes with the MÖNCH03 spectrum. Due to 2×2 clustering, this value should ideally be a factor of two larger than the electronic noise of a single pixel (2 × 425 eV = 850 eV FWHM). Additionally, gain mismatches between pixels, Fano noise, and the Compton background due to the high-energy tail of the bremsstrahlung spectrum impair the energy resolution. The plot also shows the different energy bins (colors) which were used for the interpolation -labeled here as red, yellow, green, and blue. Figure 2(c) shows an empty-beam corrected image from the high-resolution data set. The black dots on the left are the result of dead pixels in the MÖNCH03 prototype. Similarly, (d-g) show the same image for the 4 different energy ranges. The absorption rapidly decreases with increasing photon energy, as expected, but most regions of the cochlea are always in a transmission range between ∼80 % and ∼10 %. For the given geometric magnification of image recording, the effective pixel size due to the demagnification of the (physical) pixels is approximately 20 µm; thus the two-fold and 4-fold interpolated images result in effective pixel sizes of 10 µm and 5 µm, respectively. The source diameter of10 µm (again for the given M) results in an effective diameter of the source below 3 µm and should hence have negligible impact on the system resolution. In the inset of Fig. 2(c) a thin hair spanning 1-2 pixels in diameter is visible, giving an upper bound for the resolved feature size. The Fourier-ring-correlation (FRC) [27] of 2 independent images of the same tomographic projection, each having 2/5 of the photon fluence of the full image, puts all spatial frequencies well above the 1/2-bit-threshold [28], indicating in fact a pixel-limited resolution even in the 4-fold interpolated images. It should be noted however, that we can not exclude that the FRC-result is affected by the interpolation procedure itself (resulting in spurious correlations at high spatial frequencies). A closer look (high-resolution images online) to the inset of Fig. 2(c) in fact reveals a faint checkerboard pattern at the base of the sample, that is neither present in the background nor in the 2-times interpolated images. Figure 3(b-f) shows slices through the reconstructed volumes (attenuation coefficients) for different photon energy ranges. The change of contrast and decrease of overall absorption with increasing photon energy is evident. (c-f) 2-fold interpolation, photon energy range as shown in Fig. 2(a). The gray-values encode the effective attenuation coefficient per voxel in the corresponding photon energy range.

Results
Next, we explore the capability to classify different materials based on multi-E channels. To this end, we first performed manual segmentation in the high-resolution data set. We segmented background, bone, membrane (representative of all soft unmineralized tissue), silicone, and implant, distinguishing between conducting tracks/wires and the bulk µLED consisting mainly of GaN [20]. A 3d rendering of the segmented volume is shown in Fig. 3(a).
The manual segmentation was used to extract the attenuation coefficients of the different materials for the used energy ranges. Figure 4(a) shows the measured mean attenuation coefficient for membrane (green diamonds) and bone (light gray triangles) together with the theoretical curves [29] (dashed lines) for Compact Bone (ICRU) and Soft Tissue (IRCP). The vertical error bars mark 1σ of the experimental values. The experimental values follow the expected decay without absorption edges and are quantitatively in good agreement.  shows the the corresponding values for the voxels classified as LED-chip (blue disks) and wire (golden stars), respectively. The attenuation coefficient of the chip shows a slight decrease in the lowest energy bin. This can be partially attributed to the fact that the gallium Kα and Kβ emission lines (which make up the majority of this lowest energy bin) are below the gallium K absorption edge (blue dashed line) at 10.37 keV. Thus the gallium emission is less attenuated in the chip than photon energies above the edge. The blue dashed line corresponds to the attenuation expected for GaN with 1/10 volume density within the voxels. This fractional volume density qualitatively reflects that the 4.5 µm-thick GaN chips are significantly thinner than a voxel. The conducting tracks consist mainly of gold, having 3 L absorption edges between 11.29 keV and 14.35 keV. This could explain that the attenuation coefficient of the voxels containing the wire decreases only slowly with increasing energy. The tracks are however significantly thinner than the voxel edge length and thus only occupy a fraction of the voxel volume. The plotted curve for gold corresponds to 1/100 volume filling (golden dashed line), and may serve as a reference. Figure 5 shows the distribution functions of the attenuation coefficients as violin plots for the different manually segmented components (material classes), namely for (a) bone, soft tissue and implant wire, and for (b) the implant chip. Conventional µCT measures the summed distribution functions without energy separation. The summed distribution functions are shown on the right of the vertical dashed line. While the mean clearly differs between material classes, the substantial width of the distribution and their tails result in significant overlap impeding unambiguous segmentation only based on gray value. This situation is considerably improved if the information of all E-bins is taken into account, providing a higher diversity in the data and hence a better starting point to differentiate material components. Interestingly, even the means of the different components vary non-monotonously with respect to each other with E, clearly showing that the classes have different elemental compositions. The width of the distribution of the attenuation coefficients within each channel is still considerably large. For biological materials, this can be expected due to the native spread in mass density and composition within each material class. This intrinsic distribution of values is highly interesting as a result. However, while 'bone' may be a rather well defined class, we may question the validity of the chosen class, for example for 'membrane', which regroups several distinct soft tissues resulting in an artificial spread. Further, the sampling and statistical errors broaden the measured histograms. The attenuation values of boundary voxels, for example or all other poorly sampled structures represent (in the best case) averages of partial volumes. Also, the segmentation was done by hand and is not perfect such that some voxels may have been falsely classified. We believe the partial volume effect to be dominant for the classes 'chip' and 'wire'. Finally, photon noise limits the achievable accuracy, however, at values much smaller than the measured width of the histograms.

Automatic segmentation
A variation of "Fuzzy c-means clustering" called FLICM [26] was employed to automatically segment the reconstructed volume into classes of different materials. Further information about the algorithm is given in appendix B. The number of clusters c = 4 corresponds to the number of materials in the reconstructed volume that could be reliably distinguished by the algorithm. The algorithm was applied independently to 2 data sets: a 1-channel volume using all photons in the E-range 7 keV to 30 keV, and a 3-channel volume with the E-ranges 7 keV to 11 keV, 11 keV to 20 keV, and 20 keV to 30 keV, respectively. In both cases the total number of photons were identical and the interpolation factor was 2.
Three bins instead of four bins were used, because the relative positions of the distribution functions in the bins 11.2 keV to 16.6 keV and 16.6 keV to 22 keV (see Fig. 5) was found to be quite similar. Furthermore, the number of bins n had to be compromised against the noise level in each energy-channel which of course increases with n. For practical reasons, it should also be noted that computational time for the FLICM algorithms also increases with n.
As pre-processing, each of the n features (multi-channel gray values) were normalized to have zero mean and unit variance. Afterwards the cluster centroids were initialized with a random voxel of each class defined in the manual segmentation. The parameter m (fuzziness) was optimized in the interval [1.5, 3.0]. Figures 6 and 7 show a slice of the automated segmentation in a region of interest around the implant. Comparing the pixels classified as bone (blue) and soft tissue (green) for the 1-channel and the 3-channel segmentation in Fig. 6(c) and (d), respectively, one immediately appreciates the benefits of a vector of gray levels based on E discrimination. In particular, the number of background pixels which are falsely classified as soft tissue is significantly higher without E discrimination. This is also the case for the region of-interest around the implant, see Fig. 7, where the success of the classification is illustrated. Comparing the upper row for the 3-channel dataset with conventional single-channel gray value in the lower row, one again clearly can appreciate the advantage of photon energy discrimination. Figure 7 shows a high amount of false positives for membrane and silicone segmentation. This is likely caused by the high degree of noise in the data and the low overall gray values for membrane and silicone in all E-channels (see 5). However Fig. 7(b,e) shows clear improvements using the 3-channel data set. Correct segmentation of the chip is challenging due to its small dimensions, very high absorption, and the (sub-voxel) layer structure of its internal compounds leading to inconsistent absorption values. Beam hardening (over-proportional absorption of lower energy photons) causes additional artifacts in particular in the full energy reconstruction. Automatic segmentation of chip and wire material is still inaccurate, as shown in Fig. 7(c,f). Using three energy channels improves the segmentation of the chip nevertheless. Note that the classification shown in Fig. 7 should be taken with caution, since the manual segmentation that is used as a reference is not perfect and generated only from the full-energy data set. In particular

Discussion, conclusion and outlook
We first briefly discuss the two main capabilities of the detector exploited here, interpolation and energy-resolution. The 4-fold interpolated images reproduced physical features of the size of single pixels and a FRC-based criterion suggests pixel-limited resolution, demonstrating that 4-fold interpolation without loss of resolution is indeed feasible with the MÖNCH. Here the FRC-based resolution limit should be taken with care as the applicability of the criterion is not entirely obvious for this kind of interpolated images and needs to be studied in more detail.
The interpolated images exhibit a checkerboard pattern in homogeneous areas of medium absorption such as the base of the sample. This can be understood as follows: The energydependence of the attenuation coefficient causes the photon-energy distribution behind a partially absorbing object to be shifted towards the higher energies, which is known as beam hardening. The map Ψ that is used in interpolation however, is computed for a photon-energy distribution without absorption, and hence incorrect in regions of the image that are affected by beam hardening. This incorrect Ψ causes artifacts in the interpolated image which manifest themselves as a periodic "checkerboard" pattern in homogeneous areas. Computing Ψ in finer E-bins and using this information for the interpolation could suppress these artifacts and improve the reconstruction.
With respect to the interpolation algorithm, the question arises whether the physical solution of (6) can be extracted from the solution space without modeling the charge distribution and charge collection efficiency. While we cannot give a definite answer yet, the algorithm proposed here allows to solve (6) both in a more exact and faster manner than the previously proposed methods. Future work will be directed at benchmarking the interpolation results by comparison with data from high resolution detectors. At this point we have no indication that the results suffer from flawed interpolation aside from the mentioned (minor) artifact.
With respect to the energy resolution, we note that the comparison of the integrated MÖNCH spectrum to the spectrum measured with a high resolution CdTe-solid-state spectrometer (AmpTek), gives evidence of the moderate energy resolution of the MÖNCH and the reduced sensitivity at higher E due to the sensor properties. Although we did not achieve the theoretical energy resolution of 0.85 keV, this does not impede the exploitation of three or four energy bins to reliably distinguish (groups of) characteristic emission lines, which were chosen with respect to the given spectrum and the object's optical properties. The present energy resolution already significantly facilitates the identification of the implant material, as well as automatic classification of background, soft tissue (membranes), and mineralized tissue (bone). In future works, a more careful energy calibration of the individual pixels is likely to improve the energy resolution. Note that the energy channels can flexibly be chosen a posteriori and flexibly adapted to the sample under study, as exploited for the FLICM-segmentation.
Interesting extensions of this work would involve detailed numerical simulations to gain a quantitative understanding of the experimental and segmentation parameters, in particular the optimum number n of energy bins. The number of segmentation classes, and the optimum design of the emission spectrum, for example using anode alloys with taylored emission lines could also be studied using well controlled phantoms and by simulation. Further, it would be of interest to better quantify the information gain, for example based on a Bayesian approach. Quantification of the difference in histograms by Kullback-Leibler distance or Wasserstein metric could also help to better understand how and if energy discrimination results in information gain for a particular measurement as a function of energy bins n .
The fact that the MÖNCH can work only under low flux illumination appears to be the most limiting experimental factor. However, the high photon sensitivity together with its high resolution, makes the detector a perfect match for a nano-focus source, where fluence is intrinsically reduced. For this reason we will use it in future work for tomography at our nano-focus source [30], where interpolating and energy resolving detection technology is expected to result in significant progress. A first example for a nearly pure phase object and the corresponding image quality obtained by 5-fold interpolation is shown in Fig. 8.

Appendix A. Interpolation
Here we describe the algorithm to solve (6). For brevity we consider the equivalent rescaled problem of finding a diffeomorphism u : Ω → Ω with Ω = [0, 1] 2 , that solves for some given positive and bounded function f 0 : Ω → R >0 . Problem (8) is related to (6) by a shift and rescaling, Ψ = pu − (p/2, p/2), where Ψ and p were defined in section 2. An initial solution of (8) can be computed as follows. Define a (x) by and b(x, y) = 1 a (x) Then u 0 (x, y) = (a(x), b(x, y)) (12) solves (8) as can be easily checked.
The true physical solution u must have certain symmetries on Ω which it inherits from the symmetries of the pixels and the charge distribution. Since the charge distribution is isotropic on average and the pixels are quadratic, i.e. have a D 4 symmetry, any physical u must satisfy u(P y x) = P y u(x), (13b) where P x and P y denote reflection at x = 1/2 and y = 1/2, respectively, and ⊥ denotes rotation by 90°. As a consequence the function f 0 again has a D 4 symmetry, i.e.
f 0 (P x x) = f 0 (P y x) = f 0 (x ⊥ ) = f 0 (x), which is very well satisfied for the experimental data as can be seen in Fig. 1(b,c).
The solution (12) satisfies (13a) and (13b) as can be easily checked using (14), but is by construction not symmetric under exchange of x and y and thus violates (13c). Also, the order of integration was chosen arbitrarily. It is therefore desirable to impose additional constraints to obtain a unique solution reflecting the physical symmetries. One possibility is to minimize the integral (7) on the solution space of (8). Such an optimal transport map can be computed from an initial solution such as (12) by a gradient-descent algorithm [19,31] with update step where χ l is the divergence-free part of u l , f 0 is the measured probability density, Du l is the Jacobian matrix, and γ l >0 is an appropriate (possibly constant) step size. The update step utilizes the Helmholtz decomposition of the function u into its curl-free and divergence-free parts, respectively: u = ∇w + χ.
In R 2 , the divergence-free part can be expressed as χ = ∇ ⊥ h for some function h : Ω → R. Here ⊥ denotes rotation by 90°, i.e. u ⊥ = (−u 2 , u 1 ) and ∇ ⊥ h = (−∂ y h, ∂ x h). The function h can be obtained by solving the Dirichlet-type boundary problem on Ω = [0, 1] 2 ∆h = −div(u ⊥ ), From (13a) and (13b) it follows that the right hand side of (16), div(u ⊥ ), is an odd function in x and y that vanishes at the boundary. It can thus be expressed in a sine basis with zero offset, with coefficients b k x ,k y ∈ R. Expanding h in a Fourier series and comparing the coefficients yields b k x ,k y (2πk x ) 2 + (2πk y ) 2 sin(2πk x x) sin(2πk y y).
This solution (18) can be easily computed numerically on a finite 2d grid using fast Fourier transforms F , setting the zero frequency component to 0. Using (19) to compute h l from u l , the complete update step of the gradient-descent scheme (15) can be written as

Appendix B. Fuzzy Local Information C-Means
Here we briefly describe the Fuzzy Local Information C-Means (FLICM) algorithm [26]. It is a variation of the fuzzy c-means (FCM) algorithm [32] for image segmentation using local information.
FCM is a clustering algorithm that assigns each item (here: voxel) to a predefined number of clusters c. Thereby the algorithm uses the so called membership matrix u ik that indicates the degree of each voxel i belonging to a cluster k. In relation to (volumetric) image segmentation the matrix u ik is computed using similarity measures between each voxels grayvalues (or multi-channel values) x i and the centroids of each cluster v k .