Characterization and modeling of acousto-optic signal strengths in highly scattering media.

Ultrasound optical tomography (UOT) is an imaging technique based on the acousto-optic effect that can perform optical imaging with ultrasound resolution inside turbid media, and is thus interesting for biomedical applications, e.g. for assessing tissue blood oxygenation. In this paper, we present near-background-free measurements of UOT signal strengths using slow light filter signal detection. We carefully analyze each part of our experimental setup and match measured signal strengths with calculations based on diffusion theory. This agreement between experiment and theory allows us to assert the deep tissue imaging potential of ∼5 cm for UOT of real human tissues predicted by previous theoretical studies [Biomed. Opt. Express 8, 4523 (2017)] with greater confidence, and indicates that future theoretical analysis of optimized UOT systems can be expected to be reliable.


Introduction
Optical imaging is a tool used in medicine for diagnosis and monitoring of health status of superficial biological tissue, since light provides specific contrast between different tissue types. Biological tissues, however, strongly scatter light, making noninvasive spatially resolved optical imaging at larger depths, for example of deep-lying organs in humans, extremely challenging. To overcome the problem of poor resolution at larger imaging depths due to strong scattering, an imaging technique that combines light and ultrasound, known as ultrasound optical tomography (UOT), has been investigated since the 1990s [1,2]. In UOT, a small fraction of laser light illuminating tissue becomes frequency-shifted, or "tagged", in the interaction with an ultrasonic pulse focused to a small spatial region inside the tissue. This allows for optical measurements with ultrasound resolution, provided that the information carrying tagged photons can be discriminated from the untagged background photons. Different methods for detecting the tagged photons have been developed, such as Fabry-Perot cavities [3], laser speckle contrast [4], spectral hole-burning filters [5][6][7][8][9], heterodyne holography [10] and photorefractive detection [11,12]. Of these detection methods, spectral hole-burning filters tailored in the absorption profile of rare-earth-ion-doped crystals using optical pumping schemes are a particularly interesting candidate because of the possibility of creating extremely narrowband spectral filters with a large étendue [13]. The method is also insensitive to speckle decorrelation, since the transmission in the frequency selective spectral hole-burning filters does not depend on the phase of the light field, and is therefore insensitive to tissue movement. Spectral hole-burning filters suppress the untagged background light due to the absorption by the ions outside the spectral hole, while almost completely transmitting the tagged UOT signal. 
Additionally, the light inside the spectral hole can be slowed down by several orders of magnitude compared with the background because of strong dispersion within the spectral hole, allowing for further background suppression using time gating. We will refer to spectral hole-burning filters incorporating the slow light effect as slow light filters. UOT using spectral hole-burning filters based on Tm³⁺:YAG crystals operating at ∼800 nm has been demonstrated by the authors of Refs. [5–8], where e.g. Xu et al. imaged absorbers embedded in a 3.2 cm thick chicken breast. This filter wavelength is highly relevant for medical imaging since it is within the tissue optical window (∼650–900 nm), where the penetration depth of light is maximal. Slow light filters based on Pr³⁺:Y₂SiO₅ crystals have been used to detect ultrasound tagged photons through a 9 cm thick highly scattering (µ_s = 10 cm⁻¹) phantom [9]. High absorption by blood at the 606 nm operating wavelength of Pr³⁺:Y₂SiO₅ filters, however, greatly limits their usability for UOT of real biological tissues. For a more suitable wavelength, simulations have shown that UOT using slow light filters could potentially perform fast (250 ms) imaging of small differences in absorption inside real biological tissue at depths of ∼5 cm in a reflection mode setup [14,15]. Such a medical imaging technique would offer interesting diagnostic possibilities, e.g. potentially opening up for real-time imaging of oxygenation level at the frontal part of the heart or other deep-lying organs. This could have a large impact on medical diagnosis, e.g. as a triage tool in the emergency ward, especially considering that ischemic heart disease is the leading cause of death worldwide [16].
In this paper we present near-background-free measurements of acousto-optic signal strengths from highly scattering phantoms, enabled by slow light filters based on Pr³⁺:Y₂SiO₅ crystals. Each separate part of the experiment (the losses in the setup, the detector quantum efficiency, the slow light filter performance, the ultrasound field, and the optical properties of the phantoms) is thoroughly analyzed, giving us good control over the relevant setup parameters. Simulations using diffusion theory are performed and matched with measured signal strengths. The agreement between simulations and experiment enables more reliable assessment of the imaging potential of future UOT systems.

Ultrasound optical tomography using slow light filters
Several rare-earth-ion-doped materials offer unique properties when cooled to liquid helium temperatures, such as high absorption coefficients [17], extremely narrow optical transition linewidths [18], and exceptionally long-lived hyperfine levels in the ground state [19,20], making them interesting for filtering applications. By tuning a laser within an inhomogeneously broadened absorption line in such a material, ions within a narrow frequency interval can be transferred to, and stored in, long-lived non-resonant shelving states, e.g. nuclear spin states. This creates a semi-permanent decrease in the absorption of the material at the laser frequency, called a persistent spectral hole, and the process is known as spectral hole-burning. Such frequency selective bleaching of the transition thus offers the possibility of spectrally tailoring a transmission window (or bandpass filter) in the absorption profile of the material using laser pulses. The tailoring of the filter into the crystal absorption will occasionally be referred to as burning the filter in this paper. These filters can have a very steep frequency cutoff and a high peak transmission [Fig. 1(a)]. Furthermore, there is a rapid increase in refractive index n with frequency f within the filter passband, as shown in Fig. 1(a). The group velocity v_g of a light pulse is given by:

$$v_g = \frac{c}{n + f\,\frac{\partial n}{\partial f}}, \tag{1}$$

where c is the speed of light in vacuum. For light with a frequency within the filter passband, the f ∂n/∂f term in Eq. (1) can be on the order of 10⁴ or larger in Pr³⁺:Y₂SiO₅, thus resulting in a greatly reduced speed of light. For light with a frequency outside the passband, the f ∂n/∂f term in Eq. (1) is close to zero, and the light thus propagates through the material at a speed close to c/n, i.e., of the order of c.
The group velocity for light with a frequency matching the center frequency of a filter where all the ions within a square region with width Γ have been removed scales as:

$$v_g \propto \frac{\Gamma}{\alpha}, \tag{2}$$

where α is the absorption coefficient of the material outside the square filter. A more narrowband filter and/or a higher absorption thus gives a lower speed of light inside the material. For UOT, both the absorptive part of the filter and the slow light delay can be used to select the desired UOT signal and minimize the interfering background light. A simplified drawing of our UOT scheme using slow light filters is shown in Fig. 1. First, a spectral filter is created in the absorption profile of the ions at frequency f_s = f_c + f_US, corresponding to the laser carrier frequency f_c shifted by the ultrasound frequency f_US [Fig. 1(a)]. A short ∼1 µs ultrasound pulse is then sent into the tissue. When the ultrasound pulse reaches the region where the optical properties of the tissue are to be measured, a short ∼1 µs laser pulse is sent into the tissue. Since the speed of light is much greater than the speed of sound, the ultrasound pulse is effectively stationary while the light pulse propagates inside the tissue. Since the laser pulse is strongly scattered by the tissue, it will fill a large part of the tissue volume. A small fraction of the light will pass through the volume occupied by the ultrasound pulse, which will generate sidebands to the carrier [Fig. 1(c)]. This sideband light is often referred to as tagged photons. The frequency of the first order positive sideband matches the center frequency of the spectral filter, and it will thus be transmitted through the crystal, while the untagged laser background at frequency f_c outside the filter passband is absorbed by the material. The tagged light also arrives at the detector later in time than the untagged background light because of the slow light delay inside the material for light at the filter center frequency [Fig. 1(d)].
Untagged background light that leaks through the absorptive part of the filter can thus be further suppressed with time gating, since this light is not significantly slowed down by the material. An image with optical contrast and ultrasound resolution can therefore be obtained by measuring the tagged photon intensity as a function of the position of the ultrasound pulse.
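As a rough numerical illustration of the slow light effect, the sketch below converts a pulse delay into a group velocity and a group index (the denominator n + f ∂n/∂f of the group velocity expression). It assumes that the whole delay accrues inside the crystal, and uses the 12 mm crystal length and the 5.6 µs delay reported in the later sections of this paper; the function names are illustrative only.

```python
# Sketch: group velocity and group index from a measured slow-light delay.
# Assumption: the delay equals L / v_g, i.e. the ordinary transit time L/c
# through the crystal is negligible in comparison.

C_VACUUM = 299_792_458.0  # speed of light in vacuum (m/s)


def group_velocity(crystal_length_m: float, delay_s: float) -> float:
    """Group velocity inside the crystal for a given slow-light delay."""
    return crystal_length_m / delay_s


def group_index(v_g: float) -> float:
    """Group index n_g = c / v_g, which corresponds to n + f * dn/df."""
    return C_VACUUM / v_g


v_g = group_velocity(12e-3, 5.6e-6)  # 12 mm crystal, 5.6 us delay
n_g = group_index(v_g)
print(f"v_g = {v_g:.0f} m/s, n_g = {n_g:.2e}")
```

With these inputs the group velocity comes out slightly above 2000 m/s, in line with the value quoted in the results, and the group index is of the order of 10⁵.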

Preparation for experiment
To accurately compare measured and simulated UOT signal strengths, we ensured that each part of the experiment was well-controlled. This section briefly describes preparatory experiments that were carried out before the UOT experiments. These include the preparation of tissue-mimicking phantoms with known absorption coefficient (µ_a) and reduced scattering coefficient (µ_s), the ultrasound field characterization, and measurements of the slow light filter performance and the signal detector sensitivity. A more detailed description can be found in the appendices.

Tissue phantoms
Typical values of µ_s and µ_a for muscle tissue at wavelengths within the tissue optical window are ∼5 cm⁻¹ and ∼0.2 cm⁻¹, respectively [21]. The tissue-mimicking solid phantoms used in the UOT experiments were made from deionized water, highly purified agar, and Intralipid-20%, and had µ_s = 6.1 cm⁻¹ and µ_a = 0.008 cm⁻¹ at 606 nm. The lower µ_a was chosen to allow for measurements through thicker phantoms than otherwise possible in our current setup, which has high losses from the phantom output to the detector, see Sec. 4.1. Having thicker phantoms enables more accurate comparisons between theory and experiments. The optical properties of the phantoms were determined using photon time-of-flight (PTOF) spectroscopy [22]. Measurements on the pure phantom constituents (India ink and Intralipid-20%) diluted in water support that the optical properties measured using the PTOF system are reliable. We therefore conclude that we have control over the optical properties of the phantoms. For further details regarding the tissue phantoms, see Appendix A.

Ultrasound field
Our ultrasound source was an EPIQ 7 with the X5-1 matrix transducer (Philips Medical Systems, Bothell, WA, USA) delivering pulses with a 1.6 MHz center frequency at a 1.25 kHz repetition rate. The ultrasound focus was positioned 3.5 cm from the transducer in the UOT experiments. The ultrasound pressure distribution for a focus placed 3.5 cm from the transducer was therefore measured, giving a lateral and axial focal size of 4 × 4 mm² and 2 mm, respectively, at the −6 dB intensity point (half pressure). We use these measured dimensions when defining the ultrasound pulse volume, but note that different conventions exist for which pressure drop defines the volume. The peak compression and rarefaction pressures at the focus were measured to be 4.3 and 2.0 MPa, respectively. The frequency bandwidth of the pulses at half pressure was measured to be ∼0.6 MHz. Further details regarding the ultrasound field characterization can be found in Appendix B. The ultrasound bandwidth in combination with the laser probe pulse bandwidth (∼0.4 MHz for a 1.0 µs long transform-limited Gaussian pulse) will set the bandwidth of the tagged photons. The slow light filter bandwidth should preferably not be much narrower than the bandwidth of the tagged photons to avoid signal attenuation. We estimate that a filter bandwidth slightly above 1 MHz should not significantly cut the tagged photons.
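The bandwidth figures quoted above can be checked with a short sketch. The laser bandwidth follows from the time-bandwidth product of a transform-limited Gaussian pulse (2 ln 2/π ≈ 0.441 for intensity FWHM values). Combining the laser and ultrasound widths in quadrature is an assumption made here for illustration (it holds for Gaussian spectra) and is not a procedure stated in the text.

```python
import math


def gaussian_fwhm_bandwidth_hz(pulse_fwhm_s: float) -> float:
    """Spectral FWHM of a transform-limited Gaussian pulse (intensity FWHM)."""
    time_bandwidth_product = 2.0 * math.log(2.0) / math.pi  # ~0.441
    return time_bandwidth_product / pulse_fwhm_s


laser_bw = gaussian_fwhm_bandwidth_hz(1.0e-6)  # 1.0 us probe pulse
ultrasound_bw = 0.6e6                          # measured value from the text (Hz)

# Assumption: both spectra are roughly Gaussian, so the widths add in quadrature.
tagged_bw = math.hypot(laser_bw, ultrasound_bw)

print(f"laser bandwidth:  {laser_bw / 1e6:.2f} MHz")
print(f"tagged bandwidth: {tagged_bw / 1e6:.2f} MHz (quadrature estimate)")
```

Under this assumption the tagged-photon bandwidth stays below 1 MHz, consistent with the estimate that a filter slightly wider than 1 MHz should not cut the signal significantly.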

Slow light filter preparation
The rare-earth-ion-doped filter material used in the measurements was a 12 × 10 × 10 mm³ (crystal axes b × D₁ × D₂) Pr³⁺:Y₂SiO₅ crystal with a 0.05% doping concentration. For this material, narrowband, high suppression filters can be created for the ³H₄–¹D₂ transition at 606 nm [13].
The aim was to tailor a 1 MHz wide transmission window with a sharp frequency cutoff and a high suppression ratio for frequency separations of 1.6 MHz. Furthermore, it is desirable to efficiently burn the whole crystal volume, allowing the entire crystal to function as a spectral filter, thus increasing the filter étendue. Spectral hole-burning measurements were therefore carried out to characterize the filter performance before moving on to UOT experiments. The crystal was burnt and probed using a 10 mm diameter collimated beam. The absorption coefficient of the ³H₄–¹D₂ transition is strongly polarization dependent [13,23,24]. A large optical depth can only be obtained for light polarized along the D₂ axis of the crystal. To get a large filter suppression ratio, the burning and probing beams were set up to propagate along the b axis of the crystal, with the polarization aligned along the D₂ axis using a half-wave plate and a polarizer. The crystal was kept at 2.2 K in a liquid bath cryostat. The filter burn sequence was very similar to the one described in Ref. [13], and employs a series of frequency chirped pulses, designed to optimize transfer efficiency within a desired frequency interval while minimizing unwanted excitation. The beginning and end of each pulse are the first and last halves of a complex hyperbolic secant pulse, with a linear frequency scan at constant intensity in between, as described in Ref. [25]. Following their notation, the width of the complex hyperbolic secant edge function, T_e, was 4.5 µs for our pulses. The linear scan had a duration and chirp rate of T_ch = 92 µs and κ = 10 kHz/µs, respectively. The total pulse length was 140 µs and the pulses were separated in time by 200 µs. By using pulses with a power of ∼10 mW, resulting in a low Rabi frequency (estimated to be ∼1 kHz for the stronger transitions), the effect of instantaneous spectral diffusion [26] was minimized.
With a chirp rate of 10 kHz/µs the adiabatic condition is not satisfied for our pulses, and as a result only a small part of the population is transferred with each pulse. However, complete transfer in a single pulse is not needed, since the pulse is repeated many times; in total, 2000 pulses were used. After creating the spectral filter, a frequency scanning pulse was used to read out the filter structure. The filter bandwidth was measured to be about 1 MHz. By probing the filter at different wait times after burning, it was determined that filter decay will have very little effect on the UOT measurements, provided that the probing time is <0.5 s. Even shorter probing times of 80 ms were used in the UOT measurements to be on the safe side. Much longer filter lifetimes are possible if a magnetic field is added [27], but this was not deemed necessary for the purpose of the experiments presented in this paper. By comparing the transmission of Gaussian probe pulses with a duration of 1.0 µs at full-width-at-half-maximum (FWHM), sent either at the center of the filter or shifted ∼1 nm outside the inhomogeneous absorption profile, the filter transmission, T_filter, was estimated to be ∼60%. The filter contrast for a collimated beam was measured to be ∼45 dB, i.e., similar to Ref. [13]. In the UOT experiments presented in Sec. 4, however, a lower filter contrast of ∼30 dB was measured. The decreased filter contrast may e.g. be due to a lower optical polarization purity or scattered light leaking around the filter crystal.

Detector sensitivity
A photomultiplier tube (Hamamatsu, R943-02) with a 10 × 10 mm² photocathode effective area was used for UOT signal detection. The anode radiant sensitivity (A/W) was measured under conditions mimicking the UOT experiments. Short laser pulses (1.0 µs) were sent onto a 50:50 beamsplitter. The reflected power was measured with a calibrated photodetector. The transmitted light was attenuated with carefully calibrated neutral density filters, giving ∼7000 photons/pulse that were sent onto the photocathode. The beam was 10 mm in diameter and had a top-hat intensity profile, thus filling most of the photocathode effective area. Under these conditions, an anode radiant sensitivity of 5.1 × 10⁴ A/W for a 2000 V anode to cathode voltage was measured. This is significantly lower than specified in the data sheet. The detector may thus have suffered sensitivity degradation due to overexposure, which is not uncommon for GaAs(Cs) photocathodes [28]. We also speculate that it could be due to a lower sensitivity near the detector edges than specified in the data sheet. Assuming the decreased sensitivity is only due to photocathode degradation, and not a decrease in the gain of the photomultiplier, a detector quantum efficiency (QE_det) of 1.7% is obtained, as opposed to the 14% specified in the data sheet. The experimentally measured anode sensitivity and the estimated degraded quantum efficiency were used when analyzing the data from the UOT experiment.
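The relation between the quoted quantum efficiency and the anode radiant sensitivity can be sketched as follows. The sketch uses the standard photomultiplier relations (cathode radiant sensitivity = QE · e/E_photon, anode sensitivity = cathode sensitivity × gain) and assumes unit collection efficiency. The gain printed at the end is only the value implied by the two numbers quoted above; it is not a figure stated in the text. The calibration pulse is also treated as rectangular, which is a simplification.

```python
H_PLANCK = 6.62607015e-34    # Planck constant (J s)
C_VACUUM = 299_792_458.0     # speed of light in vacuum (m/s)
E_CHARGE = 1.602176634e-19   # elementary charge (C)


def photon_energy_j(wavelength_m: float) -> float:
    """Energy of one photon at the given vacuum wavelength."""
    return H_PLANCK * C_VACUUM / wavelength_m


def cathode_sensitivity_a_per_w(qe: float, wavelength_m: float) -> float:
    """Cathode radiant sensitivity (A/W) for a given quantum efficiency."""
    return qe * E_CHARGE / photon_energy_j(wavelength_m)


wavelength = 606e-9                       # operating wavelength (m)
e_photon = photon_energy_j(wavelength)

# Calibration pulse: ~7000 photons in 1.0 us (rectangular pulse assumed).
pulse_energy = 7000 * e_photon            # J
pulse_power = pulse_energy / 1.0e-6       # W

# Gain implied by QE_det = 1.7% and the measured 5.1e4 A/W anode sensitivity.
s_cathode = cathode_sensitivity_a_per_w(0.017, wavelength)
implied_gain = 5.1e4 / s_cathode

print(f"photon energy:       {e_photon:.3e} J")
print(f"calibration power:   {pulse_power * 1e9:.2f} nW")
print(f"cathode sensitivity: {s_cathode * 1e3:.2f} mA/W")
print(f"implied gain:        {implied_gain:.2e}")
```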

Ultrasound optical tomography experiment
This section describes the UOT experiment. The experimental setup is first outlined. Measured signal strengths are thereafter presented and compared with simulations.

Experimental setup
Figure 2 shows the UOT setup. The laser system was a dye laser stabilized to a reference cavity using the Pound-Drever-Hall technique, providing sub-kHz linewidth light at 606 nm. Acousto-optic modulators were used for pulse shaping. A low reflectivity beamsplitter was used to pick off a small fraction of the light to a reference detector. A motorized flip mirror was used to switch between two beam paths, denoted burn and probe in Fig. 2. Each experimental cycle started by preparing the filter by sending the laser burn pulses described in Sec. 3.3 through the burn beam path. The burning beam was expanded to 10 mm in diameter and propagated along the crystal b axis, with the polarization aligned along the D₂ axis using a half-wave plate and a polarizer. The desired number of laser probe pulses was thereafter sent to the phantom through the probe beam path. During the probing window, up to 100 laser and ultrasound probe pulses were delivered at a 1.25 kHz repetition rate. The laser probe pulses had a FWHM of 1.0 µs, a peak power of ∼25 mW, and a 1.0 mm beam diameter. The probe pulses were sent either at the filter center frequency or shifted 1.6 MHz outside. The ultrasound transducer was mounted above the phantom. The ultrasound focus was set 3.5 cm below the transducer, and each ultrasound pulse was timed to be in the center of the phantom when the laser pulse passed through. The characteristics of the ultrasound pulses are described in Sec. 3.2 and Appendix B. The phantom used for the experiment had a height and width of 7.0 cm, while the thickness (parallel to the probe beam) was varied between 2.5 and 6.8 cm.

A liquid light guide for the visible wavelength range manufactured by Rofin Australia Pty Ltd, with an aperture area (A_LG) of 0.79 cm² and a numerical aperture (NA) of 0.59, was used to collect diffuse light leaving the phantom. Lenses were used after the light guide to direct the light through the crystal mounted inside the cryostat at a temperature of 2.2 K. A polarizer ensured that the unpolarized light exiting the light guide was polarized along the D₂ axis of the crystal. Photons were detected after the cryostat using a photomultiplier tube (PMT). After probing, the filter was erased using laser pulses. The experimental cycle was thereafter repeated, starting with recreating the filter. Frequently recreating the spectral filter ensures that the filter performs very similarly for each probe pulse, as the filter decays over time and will be slowly modified by the probe pulses themselves. It is, however, likely that longer probing times with more probe pulses could have been used before recreating the filter, since no change of the filter performance was observed during the probing in the measurements, but this was not further investigated.

The transmission from the output of the light guide to the PMT (T_setup) was measured to be ∼0.05%. This was measured with the laser tuned away from the inhomogeneous profile of the ions, meaning that the residual intensity suppression for frequencies inside the filter (measured to be 2.2 dB, i.e., a filter transmission T_filter = 60%, see Sec. 3.3) is not included. The total setup efficiency, η_tot, will here be defined as:

$$\eta_{\mathrm{tot}} = \eta_{\mathrm{LG}}\, T_{\mathrm{setup}}\, T_{\mathrm{filter}}\, QE_{\mathrm{det}}, \tag{3}$$

where η_LG is the collection efficiency of the light guide, calculated based on [29]. For our current setup η_tot = 1 ppm. We believe that this efficiency can be significantly increased, to η_tot = 0.6%, in a future setup. Setup improvements are discussed in Sec. 5.1. Table 1 summarizes the experimental setup parameters, both for the setup used in the experiments of this paper and for a more optimized setup.
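The efficiency budget can be cross-checked with a few lines of arithmetic. The sketch below takes the total efficiency to be the product of the light guide collection efficiency, the setup transmission, the filter transmission, and the detector quantum efficiency. Since η_LG is not quoted in the running text, it is inferred here from η_tot = 1 ppm. The future-setup values (20% transmission, ∼100% filter transmission, 15% quantum efficiency) are those assumed in the discussion section, and the sketch assumes η_LG is unchanged because the same light guide is used.

```python
def total_efficiency(eta_lg: float, t_setup: float,
                     t_filter: float, qe_det: float) -> float:
    """Total setup efficiency as a product of the individual factors."""
    return eta_lg * t_setup * t_filter * qe_det


# Current setup, values from the text.
t_setup_now = 0.05e-2    # light guide output -> PMT transmission (0.05%)
t_filter_now = 0.60      # in-band filter transmission (2.2 dB loss)
qe_now = 0.017           # degraded detector quantum efficiency
eta_tot_now = 1.0e-6     # quoted total efficiency (1 ppm)

# Light guide collection efficiency implied by the quoted 1 ppm (inferred).
eta_lg = eta_tot_now / (t_setup_now * t_filter_now * qe_now)

# Future setup, assuming the same light guide.
eta_tot_future = total_efficiency(eta_lg, t_setup=0.20, t_filter=1.0, qe_det=0.15)

print(f"implied eta_LG:  {eta_lg:.2f}")
print(f"future eta_tot:  {eta_tot_future * 100:.2f} %")
print(f"filter loss check: 10**(-2.2/10) = {10 ** (-2.2 / 10):.2f}")
```

The implied η_LG of about 0.2 reproduces the 0.6% quoted for the improved setup, which suggests that the individual numbers in the text are mutually consistent.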

Results: characterization of signal strengths
Measurements were performed on a phantom that was successively cut thinner in steps of ∼0.5 cm. For each phantom thickness, the optical power at the signal detector was recorded with the probe beam shifted by either 0.0 or 1.6 MHz, such that either the laser carrier or the tagged photons matched the filter center frequency, respectively. This was done both with and without ultrasound modulation. Typical examples of such measurements for tagged photons for 3.5 and 6.8 cm thick phantoms are shown in Fig. 3(a) and Fig. 3(b), respectively. The zero time corresponds to the arrival time of the probe pulse at the reference detector and the traces are obtained by averaging 1000 shots. The traces clearly show that photons tagged via the acousto-optic effect inside the turbid phantoms are detected. The photons arriving at the signal detector close to time zero are either untagged photons not fully suppressed by the absorptive part of the filter, or photons finding their way into the signal detector in some other way, e.g. by leaking around the crystal. However, the slow light delay is 5.6 µs, corresponding to a speed of light inside the crystal of ∼2000 m/s, giving an almost complete separation in time between the tagged and untagged photons. Figure 3 thus shows the advantage of slow light filters compared with purely absorptive spectral hole-burning filters in terms of improved UOT filter contrast. Consequently, a filter contrast of about 30 dB was measured with slow light filtering. Slow light filters are thus excellent for characterizing the absolute number of acousto-optically tagged photons incident on the detector, since they allow for near-background-free measurements by integrating traces such as those in Fig. 3 only over the time interval where the tagged photons appear. For the results presented in this paper, the time window 3.0–15 µs was used in the analysis. The trailing oscillating tail of the pulse visible in Fig. 3 is due to the bandwidth of the tagged photons being slightly broader than the filter bandwidth [30,31]. These oscillations were much more pronounced if a higher intensity pulse was sent at the filter center frequency.

Figure 4 shows the number of carrier and tagged photons incident on the detector per probe pulse for phantom thicknesses between 2.5 and 6.8 cm. For each measurement the laser probe power was recorded and used to compensate for laser power fluctuations around the 25 mW peak probe power. By recording signal traces without ultrasound modulation, a small background was removed in the data analysis for tagged photons. The vertical error bars represent 2 standard errors. The horizontal error bars represent ±1.0 mm, which is our best estimate of how accurately we cut the phantom thicknesses. The measurements are compared with simulations using the 1D diffusion approximation with extrapolated boundary conditions for a slab geometry, using the parameters in Table 1. In the simulations, the flux of either carrier or +1st order tagged photons per unit area across the phantom surface at the light guide position is calculated. The carrier flux is modeled as described in Ref. [32]. The tagged flux is modeled with an approach similar to Refs. [14,15]. Namely, the fluence rate (W/cm²) due to a narrow laser beam incident on a slab is first calculated, using diffusion theory, at the position of an ultrasound focus located at a depth of half the slab thickness. This fluence rate is multiplied by a tagging factor, K, determining the power of an isotropic point source of ultrasound tagged light (into the +1st order) at the focus. Diffusion theory is thereafter used to calculate the +1st order ultrasound tagged flux from this point source across the phantom boundary at the position of the light guide. The optical power incident on the light guide is calculated and converted into the number of photons per probe pulse, using the known photon energy and probe pulse length.
The transmission of the setup is thereafter accounted for to obtain the number of carrier or tagged photons incident on the detector, i.e., as given by Eq. (3) but with the detector quantum efficiency excluded. For further details regarding the simulations, see Appendix C. As seen in Fig. 4, the number of carrier photons incident on the detector agrees well with simulations. Note that this simulation does not contain any fitting parameters. For tagged photons, choosing K = 0.026 cm² in the simulation yields good agreement between measurement and simulation (Fig. 4). This value is lower than the K = 0.10 cm² used in Refs. [14,15]. In general, K depends on both ultrasound pressure and focus volume, and potentially also on other parameters such as the light and ultrasound frequency. A rough comparison should, however, be feasible due to the similar ultrasound focus volumes of 3 × 3 × 3 mm³ in Refs. [14,15] and 4 × 4 × 2 mm³ in the experiment presented in this paper. For the chosen size of the ultrasound focus, K cannot be significantly increased in our experiment by increasing the ultrasound pressure without exceeding the medical safety limit. The experimental measurements thus indicate that a slightly too high K may have been used in previous simulations. We tested using K = 0.026 cm² in the codes of Refs. [14,15] and found that it decreases the predicted UOT imaging depth by ∼0.4 cm. A better understanding of how K depends on ultrasound pressure and focus volume would, however, be desirable to improve UOT simulations based on diffusion theory, since this would allow for more accurate calculations regarding the trade-off between imaging depth and spatial resolution. Possible experimental studies include characterization of acousto-optic signal strengths as a function of ultrasound pressure, frequency and focus volume.

Fig. 4. Comparison between measured and simulated carrier and +1st order tagged photon numbers. Carrier measurement corresponds to photons recorded with the slow light filter at the carrier frequency and no ultrasound modulation, while tagged measurement corresponds to photons recorded with the slow light filter at the +1st order sideband frequency with ultrasound modulation. The phantoms have µ_s = 6.1 cm⁻¹ and µ_a = 0.008 cm⁻¹. All values are obtained by averaging 1000 probe pulses. The vertical and horizontal error bars represent 2 standard errors and our estimated phantom thickness accuracy, respectively. Note that the data presented are photons incident on the detector. Multiplying by our measured degraded quantum efficiency gives the number of detected photons, i.e., photons generating a photoelectron.
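The two-step structure of the tagged-photon model (fluence rate at the focus, multiplication by K, then propagation from a point source at the focus) can be illustrated with a deliberately simplified sketch. It uses the continuous-wave diffusion Green's function for an infinite homogeneous medium, which is an assumption made here for brevity. It is not the slab model with extrapolated boundary conditions of Appendix C, and it returns a fluence rate rather than the flux across the boundary, so it should not be expected to reproduce Fig. 4 quantitatively. All function names are illustrative.

```python
import math


def diffusion_constants(mu_a: float, mu_s_reduced: float):
    """Diffusion coefficient D (cm) and effective attenuation mu_eff (1/cm)."""
    d = 1.0 / (3.0 * (mu_a + mu_s_reduced))
    mu_eff = math.sqrt(mu_a / d)
    return d, mu_eff


def fluence_rate(power_w: float, r_cm: float,
                 mu_a: float, mu_s_reduced: float) -> float:
    """Infinite-medium CW fluence rate (W/cm^2) at distance r from an
    isotropic point source of the given power."""
    d, mu_eff = diffusion_constants(mu_a, mu_s_reduced)
    return power_w * math.exp(-mu_eff * r_cm) / (4.0 * math.pi * d * r_cm)


def tagged_fluence_rate(power_w: float, thickness_cm: float, k_cm2: float,
                        mu_a: float, mu_s_reduced: float) -> float:
    """Tagged fluence rate at the exit position for a focus placed halfway."""
    half = thickness_cm / 2.0
    phi_focus = fluence_rate(power_w, half, mu_a, mu_s_reduced)  # W/cm^2
    tagged_power = k_cm2 * phi_focus                             # W
    return fluence_rate(tagged_power, half, mu_a, mu_s_reduced)


# Phantom properties and tagging factor from the text.
MU_A, MU_S, K = 0.008, 6.1, 0.026
d_phantom, mu_eff_phantom = diffusion_constants(MU_A, MU_S)
print(f"D = {d_phantom:.4f} cm, mu_eff = {mu_eff_phantom:.3f} 1/cm")

for thickness in (2.5, 3.5, 6.8):
    carrier = fluence_rate(25e-3, thickness, MU_A, MU_S)
    tagged = tagged_fluence_rate(25e-3, thickness, K, MU_A, MU_S)
    print(f"{thickness:.1f} cm: tagged/carrier = {tagged / carrier:.3f}")
```

In this simplified geometry the exponential attenuation is identical for carrier and tagged light (both travel the full thickness in total), so the tagged-to-carrier ratio reduces to K/(π D d) and falls off only inversely with the thickness d.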

Discussion and outlook
In this section we present an analysis of the large improvement potential of our current setup, and calculate the number of photons that would be incident on the detector for an optimized setup in real biological tissues, using our experimentally validated simulation. A slow light filter wavelength of 800 nm is used, which is possible with a thulium-doped filter material. We acknowledge that UOT performance for an optimized setup has previously been estimated based on experimental measurements [9], and some similarities in the analysis exist. Furthermore, some assumptions from Refs. [14,15] will be used, namely, that laser probe pulses are delivered at a repetition rate of 25 kHz over a 1.0 cm² tissue surface area, and that the signal from each tissue region is averaged 200 times, allowing for measurements of 30 tissue voxels in <250 ms. The ultrasound tagging efficiency will be based on our experimental data, i.e., K = 0.026 cm² is used. It will be assumed that a liquid light guide identical to the one used in this experiment is also used in the optimized setup.

Analysis of system improvement potential
The transmission from the light guide to the signal detector in our current setup is, as mentioned in Sec. 4.1, very low (∼0.05%). A major contributing factor to the low transmission is the cryostat, which, with its large size and small sample windows, is not designed for applications requiring a high étendue, and cuts a considerable fraction of the signal light. Using a small cryostat where the crystal is mounted very close to large sample windows should allow a substantial fraction of the light exiting the liquid light guide to be directed through the crystal using lenses. Furthermore, an anti-reflection coating can be applied to the crystal surfaces and the cryostat windows, minimizing reflections. By using a slow light filter material with high absorption regardless of the light polarization, the need for polarizers is eliminated, resulting in more than a doubling of the signal compared with the current setup. Furthermore, the crystal can be burnt at an angle relative to the probe beam, so that the beamsplitter in our current setup is no longer needed, resulting in a further increased optical throughput. We estimate that a transmission of ∼20% from the output of the light guide to the signal detector should be possible in a future setup. In the experimental setup presented in this paper, a single slow light filter is used at the frequency of the +1st order sideband. The possibility of creating two spectral hole-burning filters for tagged photons shifted towards both higher and lower frequencies has been demonstrated experimentally using a thulium-doped material [8]. It will be assumed that the improved setup uses filters that select both the first order positive and negative sidebands, which doubles the number of tagged photons incident on the detector. A filter transmission of ∼100% will be assumed, although a filter material suitable for UOT capable of such performance is still to be discovered.
In our experiment, the laser pulses were delivered to the phantom at a repetition rate of 1.25 kHz, had a temporal duration of 1.0 µs, and a peak power of ∼25 mW. A future UOT system will likely operate at a higher repetition rate for faster imaging rates. If 1.0 µs long laser pulses are delivered at a repetition rate of 25 kHz over a 1.0 cm² tissue area, the peak laser power can be increased to 8.0 W and still be within the medical safety limits of 200 mW/cm² for average radiation and 20 mJ/cm² for short pulses [33]. GaAs detectors can have quantum efficiencies of ∼15% at 800 nm, a value that will be used for the improved setup. The PMT detector is chosen over an avalanche photodiode due to the larger detector area of the PMT, since the signal collection efficiency is proportional to the detector area. The experimental parameters of the improved setup are summarized in Table 1.
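The 8.0 W peak power can be checked directly against the two exposure limits quoted above. The sketch below treats each pulse as rectangular with a 1.0 µs duration, which is a simplifying assumption (a Gaussian pulse with the same FWHM carries slightly more energy); the function names are illustrative.

```python
def average_irradiance_w_per_cm2(peak_power_w: float, pulse_s: float,
                                 rep_rate_hz: float, area_cm2: float) -> float:
    """Time-averaged irradiance for a train of rectangular pulses."""
    return peak_power_w * pulse_s * rep_rate_hz / area_cm2


def pulse_fluence_j_per_cm2(peak_power_w: float, pulse_s: float,
                            area_cm2: float) -> float:
    """Energy per unit area delivered by a single rectangular pulse."""
    return peak_power_w * pulse_s / area_cm2


AVG_LIMIT = 200e-3    # W/cm^2, average-radiation limit quoted in the text
PULSE_LIMIT = 20e-3   # J/cm^2, short-pulse limit quoted in the text

avg = average_irradiance_w_per_cm2(8.0, 1.0e-6, 25e3, 1.0)
fluence = pulse_fluence_j_per_cm2(8.0, 1.0e-6, 1.0)

print(f"average irradiance: {avg * 1e3:.0f} mW/cm^2 (limit {AVG_LIMIT * 1e3:.0f})")
print(f"single-pulse fluence: {fluence * 1e6:.1f} uJ/cm^2 (limit {PULSE_LIMIT * 1e3:.0f} mJ/cm^2)")
```

The average irradiance sits exactly at the 200 mW/cm² limit, so it is the average limit, not the single-pulse limit, that constrains the peak power in this configuration.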

Signal strengths for an optimized setup
Two different tissue types are considered, namely muscle and breast. For breast tissue, µ_a = 0.05 cm⁻¹ and µ_s = 11 cm⁻¹ are used [34], while for muscle tissue µ_a = 0.2 cm⁻¹ [14] and µ_s = 5 cm⁻¹ are used, where µ_s is based on averaged values from [21]. As before, the ultrasound focus, and thus the measurement volume, is located halfway through the tissue slab. The simulation results are presented in Fig. 5. We define the shot-noise limited signal-to-noise ratio (SNR) as $\mathrm{SNR} = \sqrt{QE_\mathrm{det} N_\mathrm{phot} N_\mathrm{avg}}$, where $N_\mathrm{phot}$ is the number of photons per probe pulse incident on the detector and $N_\mathrm{avg}$ is the number of probe averages. With, e.g., $QE_\mathrm{det}$ = 15% and $N_\mathrm{avg}$ = 200, an SNR ≈ 55 can be achieved when 100 photons per probe pulse are incident on the detector. Such signal levels are possible at ultrasound depths of 4.6 cm in muscle tissue and 6.0 cm in breast tissue. Although the relevant signal for UOT is the difference in the number of tagged photons from different regions inside the tissue, the possibility of detecting acousto-optical signals with SNR ≈ 55 from a single region at a tissue depth of ∼5 cm seems promising to us.
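As a quick numerical check of the quoted SNR, the following sketch evaluates the shot-noise expression, taken here as the square root of the number of detected photons (quantum efficiency times photons per pulse times number of averages). The function name is illustrative.

```python
import math

def shot_noise_snr(qe_det, n_phot, n_avg):
    """Shot-noise-limited SNR = sqrt(QE_det * N_phot * N_avg)."""
    return math.sqrt(qe_det * n_phot * n_avg)

# Values quoted in the text: 15% quantum efficiency, 100 photons per pulse, 200 averages.
snr = shot_noise_snr(qe_det=0.15, n_phot=100, n_avg=200)
print(f"SNR = {snr:.1f}")
```

The result, about 54.8, rounds to the SNR ≈ 55 stated above.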

Conclusion
In this paper, the results of a UOT experiment using slow light filters programmed in the absorption profile of Pr³⁺:Y₂SiO₅ crystals to discriminate between tagged and untagged photons have been presented. We have analyzed the losses in our experimental setup and the optical properties of our tissue-mimicking phantoms, characterized the pulses delivered by our ultrasound scanner, and measured the performance of our slow light filters and the sensitivity of our detector, giving us good control of each step of the experiment. This allowed for matching measured acousto-optical signal strengths with simulations based on diffusion theory for the particular ultrasound focus volume used in the experiment. The experimentally verified model for calculating signal strengths in highly turbid media was used to calculate signal strengths under more realistic conditions in real human tissues, indicating signal strengths of ∼100 photons for ultrasound depths of ∼5 cm. Although contrast measurements on tissue phantoms with absorbing inclusions would be required to fully gauge the imaging potential of UOT and validate simulations such as [14,15], our improved understanding of the components of the UOT signal allows us to assert the high potential for deep imaging with greater confidence. In addition, it provides an important framework for future comparisons between experiments and theory.

Appendix A: Phantom preparation and spectroscopy
This appendix provides a more detailed description of the preparation and optical characterization of the tissue-mimicking solid phantoms used in the UOT experiment. Photon time-of-flight (PTOF) spectroscopy was used as the main tool to determine µ_s and µ_a of the solid phantoms, following the procedure described in detail in [22]. The PTOF system has previously been characterized to provide an uncertainty of the measured optical properties of 2.5% for absorption and 10% for scattering [35]. To recheck the accuracy of our measurements and to minimize the risk of handling errors, we conducted collimated beam absorption measurements on solutions of India ink diluted in water. Such solutions were then added to phantoms made of water and Intralipid, and the absorption coefficient was measured again, now using PTOF, and compared with the collimated beam measurements. We also prepared mixtures with controlled scattering based on Ref. [36] using only Intralipid and water, and compared these with the reduced scattering coefficient values obtained by PTOF. Furthermore, an experiment was carried out where light was transmitted into and collected after propagating through liquid phantoms with a wide range of optical properties. The transmitted powers were compared with Monte Carlo simulations.

A.1 Preparation of tissue phantoms
Liquid tissue phantoms were prepared for characterization measurements. These phantoms were made as mixed water dilutions of India ink (Royal Talens) for adding absorption and Intralipid-20% (Fresenius Kabi AB) for controlling scattering.
A solid phantom was made for the UOT measurements. The solid phantom recipe was based on the work of Cubeddu et al. [37]. Phantoms with µ_s ≈ 6 cm⁻¹ and µ_a ≈ 0.01 cm⁻¹ at 606 nm (measured with PTOF) were prepared by mixing 4 g of agar powder (A7921, SIGMA) and 390 ml deionized water in a glass beaker. The solution was heated to ∼95 °C on a hot plate with magnetic stirring and kept at this temperature for about 1 hour. The solution was thereafter cooled to ∼45 °C and 10 ml Intralipid-20% (Fresenius Kabi AB) was added. It was stirred for another 30 minutes at 45 °C, before being transferred to a container and stored in a refrigerator for solidification. As also observed by the authors of Ref. [37], adding agar to a water-Intralipid phantom caused µ_s to decrease.

A.2 Collimated transmission spectroscopy
For collimated transmission spectroscopy measurements, a multi-mode fiber (Thorlabs M41L02) was used to guide light from a tungsten halogen light source (HL-2000-FHSA) into a temperature controlled cuvette holder (QPOD 2e). The sample was contained in a cuvette sitting in the middle of the holder. The transmitted light was detected by a spectrometer (Ocean Optics QE65000) in a 400 − 1100 nm wavelength range.
Measurements were carried out on homogeneously mixed water dilutions of India ink (Royal Talens). The dependence of µ a on ink concentration was extracted and used as a reference to compare with PTOF values.

A.3 Photon time-of-flight spectroscopy
For characterization of the tissue phantoms, photon time-of-flight measurements were conducted as in Ref. [22]. Briefly, short (6 ps) optical pulses were generated by a photonic crystal fiber supercontinuum source at a rate of 80 MHz. An acousto-optical tuneable filter selected a narrow-band (∼3 nm) region of the broad pulse spectrum at around 606 nm. The pulses were sent into the turbid medium through an optical fiber. Diffuse light was collected by a second identical fiber after propagating in the medium, and guided to a single-photon avalanche diode detector. The photon time-of-flight distribution was retrieved using time-correlated single-photon counting (SPC-130, Becker & Hickl, Germany). Attenuation filters were employed to keep the count rate below 1 megacount per second, to avoid pile-up in the measurements. The data was compared with a white Monte Carlo model [38] to extract the optical properties of the turbid medium.
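To illustrate why a count rate below 1 megacount per second keeps pile-up small at an 80 MHz pulse rate, the sketch below computes the mean number of detected photons per excitation pulse and, assuming Poisson statistics (an assumption made here, not a statement from the paper), the probability of two or more photons arriving in the same period.

```python
import math

rep_rate_hz = 80e6       # supercontinuum pulse repetition rate
count_rate_hz = 1e6      # upper bound on the detected count rate

# Mean number of detected photons per excitation period.
mean_counts_per_pulse = count_rate_hz / rep_rate_hz

# Assumed Poisson statistics: probability of >= 2 photons in one period,
# which is the situation that distorts a TCSPC histogram (pile-up).
p_multi = 1.0 - math.exp(-mean_counts_per_pulse) * (1.0 + mean_counts_per_pulse)

print(f"mean counts per pulse = {mean_counts_per_pulse:.4f}")
print(f"P(two or more)        = {p_multi:.2e}")
```

At this count rate only about 1.25% of the excitation periods contain a detected photon, so multi-photon events are rare.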

A.4 Monte Carlo validated optical spectroscopy
To further verify that we can mix liquid phantoms with specific optical properties, a simple experiment was carried out. Two fibers were inserted at a fixed distance and depth into a liquid phantom of water, India ink and Intralipid-20%. One fiber transmitted 635 nm light into the phantom and the other collected light, guiding it to an optical power meter. Initially, the phantom had an estimated µ_a = 0.023 cm⁻¹ and µ_s = 2.8 cm⁻¹ based on collimated transmission spectroscopy and Ref. [36], respectively. Ink was thereafter added stepwise to increase µ_a according to the calibration curves obtained with collimated transmission spectroscopy, while µ_s was assumed to remain constant. The output power was recorded for each ink concentration and normalized against the maximal output power. Similar measurements were performed when instead increasing µ_s stepwise by adding Intralipid-20%. In this case, µ_s was calculated from the Intralipid-20% concentration based on Ref. [36]. Throughout this measurement series, µ_a was assumed to be 0.01 cm⁻¹, i.e., the approximate value of an Intralipid-water mix. The measurements were compared with Monte Carlo simulations using an anisotropy factor g = 0.7.

A.5 Results
The dependence of µ_a on ink concentration was characterized with PTOF and compared with collimated transmission spectroscopy measurements (Sec. A.2). As shown in Fig. 6(a), the values of µ_a measured using PTOF and collimated transmission spectroscopy agree. Similarly, PTOF measurements on water dilutions of Intralipid-20% (Fresenius Kabi AB) were carried out at 606 nm. The measured µ_s were compared with reference values calculated based on Ref. [36] using the known Intralipid-20% concentration. As seen in Fig. 6(b), there is an overall good agreement. The PTOF spectrometer, however, appears to measure slightly higher µ_s compared with the reference values. The PTOF system is not reliable for µ_s < 2 cm⁻¹, and this data is therefore not included. The two-fiber measurements of Sec. A.4 show good agreement with the normalized output powers predicted by Monte Carlo simulations [Fig. 7(a) and Fig. 7(b)].

Fig. 7. Normalized output power of the two-fiber setup when (a) incrementally increasing µ_a following our calibration curves obtained from collimated transmission spectroscopy, and (b) incrementally increasing µ_s based on Ref. [36]. The black dots correspond to measurements, with error bars representing 1 standard error, and the red lines are Monte Carlo simulations. An anisotropy factor g = 0.7 was used in the simulations.

Appendix B: Ultrasound field characterization
The ultrasound source was an EPIQ7 machine (Philips Medical Systems, Bothell, WA, USA) with the X5-1 matrix transducer. The machine was used in the pulse wave Doppler setting at 0 dB attenuation, with a 2 mm sample volume, the scale set to minimum, and the focus normal to the transducer surface at a depth of 3.5 cm.
The ultrasound field was characterized by mounting the ultrasound transducer and a needle hydrophone (Precision Acoustics, 0.2 mm aperture) in a water tank. The hydrophone was attached to a computer controlled translation stage that could be moved transverse to the ultrasound propagation (x and y directions) and longitudinally along the ultrasound propagation (z direction). The transducer was mounted at a fixed position. The inner walls of the water tank were partly covered with a sound absorbing material to minimize ultrasound reflections. The transverse pressure distribution of the ultrasonic focus was measured at z = 3.5 cm from the transducer by scanning the position of the hydrophone along x and y with a 0.5 mm step size, giving a transverse focus size of 4 × 4 mm² at half pressure, see Fig. 8(a). The peak compression pressure $P_+$ and peak rarefaction pressure $P_-$ at the ultrasound focus were measured to be 4.3 and 2.0 MPa, respectively, with center frequency $f_\mathrm{US}$ = 1.6 MHz. This yields a mechanical index $\mathrm{MI} = P_- / \sqrt{f_\mathrm{US}}$ of 1.6 (with $P_-$ in MPa and $f_\mathrm{US}$ in MHz), which is below the medical safety limit of 1.9 [39].
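The mechanical index quoted above can be reproduced with the standard definition, the peak rarefaction pressure in MPa divided by the square root of the center frequency in MHz. The sketch below is only a numerical check; the variable names are illustrative.

```python
import math

p_minus_mpa = 2.0   # peak rarefaction pressure at the focus (MPa)
f_us_mhz = 1.6      # ultrasound center frequency (MHz)
mi_limit = 1.9      # safety limit quoted in the text

# Standard definition: MI = P_- [MPa] / sqrt(f [MHz]).
mechanical_index = p_minus_mpa / math.sqrt(f_us_mhz)

print(f"MI = {mechanical_index:.2f} (limit {mi_limit})")
```

The value of about 1.58 rounds to the 1.6 stated in the text and lies below the limit of 1.9.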
The longitudinal shape of the ultrasound pulse is shown in Figs. 8(b) and 8(c), implying an axial resolution of 2 mm, which agrees with the information on the ultrasound machine monitor. In Fig. 8(e), an FFT of the measured waveform gives a FWHM of 0.6 MHz around the ultrasound center frequency. Because our tissue phantoms mostly consist of deionized water, the ultrasound properties measured in the water tank setup are assumed to be similar to the ultrasound properties inside the phantoms.

Appendix C: Modeling signal strengths using the diffusion equation
To evaluate the data, a model similar to the one in Ref. [14] was used. The fluence rate $\Phi$ at position $\mathbf{r}$ and time $t$ in a slab is modeled, analogously to Ref. [32], as a sum of point sources and drains $\Phi_i$:

$$\Phi(\mathbf{r}, t) = \sum_{i=1}^{N} \Phi_i(\mathbf{r}, t) = \frac{P(t)}{4\pi D} \sum_{i=1}^{N} \sigma_i \frac{\exp\left(-\mu_\mathrm{eff} |\mathbf{r} - \mathbf{r}_i|\right)}{|\mathbf{r} - \mathbf{r}_i|} . \qquad \text{(C1)}$$

Here $P(t)$ is the optical power injected into the phantom, $D = 1/[3(\mu_a + \mu_s)]$ is the diffusion coefficient, $\mu_\mathrm{eff} = \sqrt{\mu_a / D}$ is the effective attenuation coefficient, $\mathbf{r}_i$ is the position of point source $i$, and $N$ is the number of point sources used. $N = 6$ is used in the constructed model to achieve sufficient accuracy [32]. $\sigma_i$ is $+1$ or $-1$ depending on whether $\Phi_i$ is a source ($+1$) or a drain ($-1$) (see Fig. 9). The positions $\mathbf{r}_i$ are determined by mirroring an original point source $\Phi_0$ in a cascading fashion over the two extrapolated boundaries, where each mirror operation inverts the sign of $\sigma$. The extrapolated boundaries are positioned a distance $2AD$ from the real boundaries, where $A$ is determined by the refractive index mismatch across the boundary. For a phantom with refractive index 1.33 interfacing with air, $A = 2.35$ [40]. The original carrier point source is located a distance $1/\mu_s$ into the medium, the distance at which the beam has lost its pencil shape. The original sideband point source is located at the ultrasound focus. An overview of the experiment, the point source locations and their $\sigma_i$ (denoted by plus or minus signs) for the constructed fluence rate model can be seen in Fig. 9.
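The image-source construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the standard steady-state point-source solution of the diffusion equation with D = 1/[3(µ_a + µ_s)] and µ_eff = sqrt(µ_a/D), places the real boundaries at z = 0 and z = thickness, and picks the six sources as the original plus the five images generated by the index m = −1, 0, 1 (the paper does not state which six are used). The slab thickness and evaluation point in the example are hypothetical; lengths are in cm.

```python
import math

def diffusion_params(mu_a, mu_s):
    """Return D (cm) and mu_eff (1/cm) for coefficients given in 1/cm."""
    d = 1.0 / (3.0 * (mu_a + mu_s))
    return d, math.sqrt(mu_a / d)

def image_sources(z0, thickness, z_b):
    """(z_i, sigma_i) for the original source and five mirror images (N = 6).

    Extrapolated boundaries sit at z = -z_b and z = thickness + z_b.
    Which images to keep is an assumption of this sketch.
    """
    period = 2.0 * (thickness + 2.0 * z_b)
    sources = []
    for m in (-1, 0, 1):
        sources.append((z0 + m * period, +1))               # source
        sources.append((-z0 - 2.0 * z_b + m * period, -1))  # drain
    return sources

def fluence(rho, z, power, mu_a, mu_s, z0, thickness, a_factor=2.35):
    """Fluence rate (W/cm^2) at lateral distance rho and depth z from the source axis."""
    d, mu_eff = diffusion_params(mu_a, mu_s)
    z_b = 2.0 * a_factor * d   # extrapolated boundary distance 2AD
    total = 0.0
    for z_i, sigma in image_sources(z0, thickness, z_b):
        dist = math.hypot(rho, z - z_i)
        total += sigma * math.exp(-mu_eff * dist) / dist
    return power / (4.0 * math.pi * d) * total

# Phantom values from Appendix A; thickness and evaluation point are hypothetical.
mu_a, mu_s, thickness = 0.01, 6.0, 6.0
phi_mid = fluence(rho=0.5, z=3.0, power=25e-3, mu_a=mu_a, mu_s=mu_s,
                  z0=1.0 / mu_s, thickness=thickness)
print(f"fluence rate at slab center: {phi_mid:.3e} W/cm^2")
```

With this choice of images the sources and drains pair up exactly about the boundary at z = −z_b, so the fluence rate vanishes there, which is a convenient sanity check of the mirroring.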
Fig. 9. Depiction of the experiment with the corresponding diffusion model, where a phantom is illuminated with carrier light (illustrated by green filled lines) and ultrasound tagged light (illustrated by blue hollow lines). Both light fields are collected in a light guide for transport to the slow light filter and detection. The carrier sources are depicted with filled green plus signs and their drains with filled green minus signs. Similarly, the tagged sources and drains are illustrated with blue hollow plus and minus signs. Labels in the figure: experiment depiction, diffusion model, phantom tissue slab, transducer, US focus, light guide, extrapolated boundaries.
Depending on which field is simulated, $P(t)$ in Eq. (C1) is either equal to the injected power of the carrier, $P_c(t)$, or the generated sideband power, $P_s(t)$. $P_c(t)$ is modeled as a Gaussian pulse with width $\tau$ and height $P_\mathrm{peak}$. $P_s(t)$ depends on the ultrasound properties and the fluence rate of the carrier, $\Phi_c(\mathbf{r}, t)$, at the position of the ultrasound focus $\mathbf{r}_\mathrm{US}$. A simple way of quantitatively modeling $P_s(t)$ is

$$P_s(t) = K \Phi_c(\mathbf{r}_\mathrm{US}, t) , \qquad \text{(C2)}$$

where $K$ is denoted the tagging factor, an empirical value of the carrier to the +1st order sideband conversion efficiency, which depends on the size and pressure amplitude of the ultrasound focus and has the SI unit m$^2$. The optical power flux per unit area $\mathbf{J}$ is given by Fick's law as

$$\mathbf{J}(\mathbf{r}, t) = -D \nabla \Phi(\mathbf{r}, t) . \qquad \text{(C3)}$$

The flux out of the phantom, $J_\mathrm{out}(t)$, through the boundary with normal vector $\hat{\mathbf{n}}$ is given as

$$J_\mathrm{out}(t) = \mathbf{J}(\mathbf{r}, t) \cdot \hat{\mathbf{n}} . \qquad \text{(C4)}$$
The optical power incident on the detector is then the outbound flux collected by the light guide with aperture area $A_\mathrm{LG}$ and collection efficiency $\eta_\mathrm{LG}$, attenuated by the setup transmission $T_\mathrm{setup}$ and the filter transmission $T_\mathrm{filter}$. The collection efficiency is estimated as $\eta_\mathrm{LG} = (\mathrm{NA}/n)^2$, where $n$ is the refractive index of the scattering medium ($n = 1.33$ for the phantoms used) [29]. $T_\mathrm{setup}$ and $T_\mathrm{filter}$ are discussed in Sec. 4.1. The total power incident on the detector, $P_\mathrm{det}(t)$, is then given as where $\eta_\mathrm{tot}$ and $QE_\mathrm{det}$ are the same as in Eq. (3). $N_\mathrm{side}$ is the number of sidebands filtered by slow light filters. For the simulations in Sec. 4, $N_\mathrm{side} = 1$, and for the simulations of the improved setup of Sec. 5, $N_\mathrm{side} = 2$ is used. To estimate the number of photons hitting the detector, we simulate the peak outbound flux $J_\mathrm{out}^\mathrm{peak}$ for both the carrier and the sideband from the peak input power $P_\mathrm{peak}$. For a photon energy $E_\gamma$, the outbound pulse then contains $1.06\,\tau J_\mathrm{out}^\mathrm{peak} / E_\gamma$ photons per m$^2$. Note that this model is only valid for a $P(t)$ that varies slowly compared to the average time between scattering events and should not be used for very short pulses. This criterion can be expressed quantitatively as $P(t)$ varying by less than 1% over 1 ps for our phantoms. This is met by a Gaussian pulse with $\tau > 1$ µs.
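The factor 1.06 in the photon-count expression is consistent with interpreting τ as the full width at half maximum (FWHM) of the Gaussian pulse, since the time integral of a Gaussian with unit peak height equals sqrt(π / (4 ln 2)) ≈ 1.064 times its FWHM. The sketch below checks this factor and evaluates the photon count per unit area and the light guide collection efficiency. The FWHM interpretation, the peak flux and the numerical aperture in the example are assumptions for illustration, not values from the paper.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant (J s)
C_LIGHT = 2.99792458e8      # speed of light (m/s)

def gaussian_area_factor():
    """Time integral of a unit-height Gaussian divided by its FWHM."""
    return math.sqrt(math.pi / (4.0 * math.log(2.0)))

def photons_per_area(j_peak_w_m2, fwhm_s, wavelength_m):
    """Photons per m^2 in a Gaussian pulse with peak flux j_peak and FWHM tau."""
    photon_energy_j = H_PLANCK * C_LIGHT / wavelength_m
    return gaussian_area_factor() * fwhm_s * j_peak_w_m2 / photon_energy_j

def light_guide_collection(na, n_medium=1.33):
    """Collection efficiency estimate eta_LG = (NA / n)^2."""
    return (na / n_medium) ** 2

factor = gaussian_area_factor()
print(f"Gaussian area factor = {factor:.4f}")

# Hypothetical example: 1 uW/m^2 peak outbound flux, 1 us pulse, 606 nm light.
n_per_m2 = photons_per_area(j_peak_w_m2=1e-6, fwhm_s=1.0e-6, wavelength_m=606e-9)
print(f"photons per m^2      = {n_per_m2:.3e}")

# Hypothetical numerical aperture of 0.5 for the light guide.
print(f"eta_LG               = {light_guide_collection(0.5):.3f}")
```

Multiplying the photon count per unit area by the light guide aperture area and the transmission factors named in the text then gives the number of photons reaching the detector per pulse.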

Funding
The