A method for the geometric calibration of ultrasound transducer arrays with arbitrary geometries

Geometric calibration of ultrasound transducer arrays is critical to optimizing the performance of photoacoustic computed tomography (PACT) systems. We present a geometric calibration method that is applicable to a wide range of PACT systems. We obtain the speed of sound and point source locations using surrogate methods, which results in a linear problem in the transducer coordinates. We characterize the estimation error, which informs our choice of the point source arrangement. We demonstrate our method in a three-dimensional PACT system and show that our method improves the contrast-to-noise ratio, the size, and the spread of point source reconstructions by 80±19%, 19±3%, and 7±1%, respectively. We reconstruct the images of a healthy human breast before and after calibration and find that the calibrated image reveals vasculatures that were previously invisible. Our work introduces a method for geometric calibration in PACT and paves the way for improving PACT image quality.


Introduction
Photoacoustic computed tomography (PACT) [1][2][3] is an emerging hybrid medical imaging modality that combines the molecular specificity of optical imaging and the low tissue scattering property of ultrasound to provide deep tissue imaging with optical absorption contrast. A typical PACT system consists of a laser for light delivery to the target, an ultrasound transducer array for acoustic detection, and a data acquisition system for recording and digitizing the photoacoustic (PA) signals. The ultrasound transducer arrays used in PACT come in various geometries and have different operating frequencies and bandwidths [4][5][6][7][8]. Knowledge of the exact locations of the transducers in these arrays is crucial to reconstructing the high-contrast images that PACT is known to produce. However, due to manufacturing errors, the positions of the transducers in the manufactured array do not exactly match those in the design, which degrades the reconstructed image quality. Correcting these errors is essential for maximizing the potential of PACT systems.
The problem of position estimation has been studied extensively in fields such as global positioning systems (GPS) [9,10], wireless sensor networks [11][12][13], and microphone arrays [14][15][16]. It is typically formulated as estimating the position of an object given the times-of-arrival (ToAs) of waves (e.g., electromagnetic waves, or acoustic waves) from a few sources to the object. Generally, the positions of the sources and the wave propagation speed are assumed to be known. In the context of the geometric calibration of ultrasound transducer arrays, the major distinction is that the speed of sound in the medium is unknown. Ultrasound transducer position estimation with an unknown wave propagation speed has been studied in the context of ultrasound computed tomography [17,18] and underwater ultrasound imaging [19,20]. However, they also consider the element receive and transmit delays to be unknown, which leads to more involved solution strategies. In contrast, in PACT, the element receive delays can be assumed to be known since they can be found separately by diffusing laser light onto the array (which generates a strong PA signal at the instant of the light emission).
In the PACT literature, the problem of geometric calibration of transducer arrays has not been studied widely. In [21], the authors proposed a non-linear least-squares algorithm to solve for the transducer positions from ToA data collected by scanning a point source using a robotic gantry. However, they did not address the speed of sound estimation. In [8], the authors used an iterative method based on point source responses to simultaneously estimate the transducer positions, the point source positions, and the speed of sound in the medium. However, simultaneously approximating all three quantities leads to a scale ambiguity between the speed of sound and the coordinate system. Additionally, due to the non-convexity of the problem formulation, if the initial guesses of the unknowns are inaccurate, the algorithm can converge to a local minimum. Further, their method only calibrates the transducer coordinates in the radial direction and is therefore not applicable to arrays with arbitrary geometries. More recently, in [22], the authors proposed a global optimization algorithm to find the optimal location for each transducer in their 28-element transducer array by maximizing the sharpness of the reconstructed image. This method also suffers from convergence to local minima, and it scales poorly with the number of transducers. Moreover, it could lead to an unphysical situation, where different imaging targets result in different values of learned transducer coordinates. Finally, in [23], the authors circumvent the problem of geometric calibration by using deep learning-based frameworks that reduce the image artifacts resulting from the errors in the transducer positions. However, such methods suffer from a lack of interpretability [24] and result in the loss of linearity of the image reconstruction process.
In this work, we present a geometric calibration method that overcomes all the limitations stated above. We start with the point sourcebased formulation in [8] and reduce it to a linear system of equations in the transducer coordinates by using alternate methods to obtain the other unknown quantities in the formulation. In doing so, we overcome both the scale ambiguity between the unknowns as well as the non-convexity of the problem. Owing to the linearity of the resulting formulation, we can also derive error estimates for the estimated transducer locations. These are useful for determining the number and locations of the point sources needed to calibrate a transducer array within a given error tolerance.
The paper is structured as follows. In Section 2, we elucidate the importance of geometric calibration through numerical simulations and introduce our solution strategy. In Section 3, we apply our method to an experimental PACT system and show an improvement in the reconstructed image quality due to our method. In Section 4, we end with a discussion of our results.

Motivation and theory
Before presenting our method, we demonstrate the need for a sound geometric calibration method for PACT systems through numerical simulations.

Motivation for geometric calibration
Using the k-wave MATLAB package [25], we simulate a 512-element circular ultrasound transducer array with isotropic point transducers and a radius of 10 cm. Each element of the array has a Gaussian frequency response with a center frequency of 2 MHz and an ∼ 80 % one-way 6 dB bandwidth. Next, we perturb the x and y coordinates of each transducer with a uniformly distributed random variable in the range [ − 0.5λ 0 ,0.5λ 0 ], where λ 0 is the wavelength corresponding to the center frequency. This imitates the real-world situation where the actual transducer locations in an array do not exactly match the designed locations due to manufacturing errors. A schematic of this simulation setup is shown in Fig. 1(a).
We simulate the propagation of ultrasound waves due to an initial pressure distribution defined by a vessel-like numerical phantom (shown in Fig. 1(a)) and record the propagated waves at the perturbed transducer coordinates. Then, we reconstruct the images of the phantom using the designed coordinates and the perturbed coordinates, shown in Figs. 1(b) and (c), respectively. We can interpret the image obtained with the perturbed coordinates as the one obtained after geometric calibration (i.e., the calibrated image), and the image obtained with the designed coordinates as the uncalibrated one. Even for a maximum perturbation of 0.5λ 0 (or 375 μm), there is a significant degradation in the quality of the uncalibrated image ( Fig. 1(b)) compared to that of the calibrated one ( Fig. 1(c)) in terms of the sharpness of the reconstruction and the background artifacts.
To quantify this degradation, we compute the contrast-to-noise ratios (CNRs) of the two images in Figs. 1(b) and (c) to be 17 and 36, respectively, thus indicating a CNR reduction of as much as 50%. We also extract two line profiles from the images at locations "A" and "B"

Proposed method
Our method for geometric calibration is based on acquiring point source responses at various locations within the field of view (FOV) of the array. The ToA of the PA signal originating from a point source at x ′ = [x ′ , y ′ , z ′ ] and recorded by a transducer at x = [x, y, z] can be written as, where t denotes the ToA of the signal, c is the speed of sound in the medium (water in this case), and ||⋅|| denotes the Euclidian norm. While the objective of geometric calibration is to estimate the transducer locations, due to the problem formulation in Eq. (1), we end up with three unknownsthe transducer location, x, the point source location, x ′ , and the speed of sound, cthat are related in a non-convex fashion. In addition, this problem is ill-posed because scaling both the coordinate system and the speed of sound by a constant factor will result in the same ToAs.
To overcome these issues, in our approach, we obtain the point source locations and the speed of sound in water through surrogate methods. To obtain c, we leverage the fact that the variation of the speed of sound in water with temperature has been studied extensively in the literature [26][27][28][29]. We measure the water temperature accurately and infer the speed of sound from it. Next, instead of solving for the point source locations, we use a high-precision (3-axis) translation stage to move the point source to different locations within the FOV of the array. Thus, we have a coordinate system defined by the translation stage with the origin at the initial position of the stage. Having obtained the speed of sound and the point source locations, we reformulate the problem, so it becomes linear in the transducer coordinates. To do this, consider the ToA relations (Eq. (1)) for two point sources at x , and a transducer at x = [x, y, z], square them and take their difference, as shown below.
where d i = ct i , i = 1, 2 are the distances between the transducer and the two point-sources, respectively, and r (i,1) )/2, respectively, where (i, 1) and (i, 2) represent the indices corresponding to the i th pair of point sources out of the N c pairs. Finally, we estimate the transducer location using the This process is repeated for each transducer independently. A graphical illustration of the proposed method is shown in Fig. 2. Posing the problem as described above allows us to characterize the error in the estimated transducer positions in a straightforward manner, as shown in Appendix A. By doing so, we can systematically choose the number and locations of point source measurements needed to calibrate the transducers within a pre-determined error tolerance.

Methods
We experimentally demonstrate our method using the 3-dimensional (3D) PACT system described in [8]. The system consists of a hemispherical array housing with four arc-shaped 256-element ultrasound transducer arrays uniformly distributed along the azimuthal direction (see Fig. 3). Each transducer element has a center frequency of 2.25 MHz and an ∼ 98% one-way 6 dB bandwidth. The array is rotated by 90 • to achieve an ∼ 2π steradian solid angle coverage. The signals from each transducer are amplified and digitized by a one-to-one mapped pre-amplification and data acquisition system, and the digitized data are streamed to the computer via USB 3.0. Finally, we reconstruct the images from the raw PA signals using the universal back-projection (UBP) algorithm [30]. For the demonstration in this paper, we only consider one of the arcs. We operate the system in two configurations. In configuration #1, meant for point source imaging, we couple 532 nm light from a laser (IS8-2-L, Edgewave) to an optical fiber (FG050LGA, Thorlabs; core diameter: 50 μm) terminated with a light-absorbing material (carbon nanopowder), which acts as a point source for PACT. In configuration #2, meant for human breast imaging, we use a laser (LPY7875, Litron; pulse repetition frequency: 20 Hz, maximum pulse energy: ∼ 2.5 J) to deliver 1064 nm light to the tissue through an engineered diffuser (EDC 40, RPC Photonics Inc.) installed at the intersection of the four arcs to expand the beam. We ensure that the optical fluence on the tissue surface is within the American National Standards Institute (ANSI) safety limit at 1064 nm [31]. The two configurations are illustrated in Fig. 3.
For calibrating the array, we operated the system in configuration #1 and acquired 108 point-source responses using a high-precision 3-axis translation stage (PLS-85, Micos; bidirectional repeatability: 0.2 μm) in a 6 × 6 × 3 arrangement with a pitch of 0.254 mm. We used deionized water (resistivity: 2 MΩ⋅cm) in the experiment to ensure that we can accurately estimate the speed of sound. During the experiment, we measured the temperature of the water using a thermocouple (HH303, Omegaette) and inferred the speed of sound from it to be 1482.9 m/s. Note that the point source we used is not perfectly isotropic. However, this anisotropy does not affect our method as long as the signal-to-noise ratio (SNR) of the acquired data permits accurate ToA estimation.
There are several ways to estimate the ToAs of the point source signals. For instance, we can compute the noise statistics of a signal and estimate the ToA as the first instant when the signal exceeds a predefined amplitude threshold above the noise. However, the true firstarrival signal might be buried in noise, which leads to erroneous ToA estimates, especially in low SNR situations. Alternatively, we can experimentally acquire a reference PA signal with a known ToA (for e.g., by accurately measuring the distance between the source and the transducer), and use it to estimate the ToAs of the point source signals relative to the reference signal [32]. However, acquiring such a signal with a known ToA is challenging. Instead, we combine these two approaches of using noise statistics and leveraging the structure of an experimentally acquired signal. First, we find the maxima of the acquired signals, which is usually well above the noise. There is a delay between the maximum of the signal and the first-arrival due to the finite bandwidth of the transducers. To find this delay, we align the maxima of all the acquired signals and compute the average of the signals (this boosts the SNR). Then, we estimate the ToA of the averaged signal as the first instant when it exceeds a predefined amplitude threshold (three times the standard error of the noise in this case) and compute the delay between the ToA and the time when the maximum of the averaged signal occurs. Finally, we estimate the ToA of each individual signal by subtracting this delay from the time corresponding to the maximum of the signal.
We estimated the ToAs using this approach (see Supplementary Fig.   Fig. 2. Graphical illustration of our geometric calibration method.

Fig. 3.
A schematic of the 3D PACT system. The system consists of four arc-shaped 256-element ultrasound transducer arrays in a hemispherical array housing, which is filled with water for acoustic coupling. The array is rotated by 90 • to achieve a solid angle coverage of ∼ 2π steradian. The system is operated in two configurations. Configuration #1 for point source imaging: light from a 532 nm laser is coupled to an optical fiber which is terminated on the other end with an optically absorptive material, which acts as a point PA source. Configuration #2 for human breast imaging: light from a 1064 nm laser is delivered to the tissue through a diffuser (placed at the intersection of the arcs) to expand the beam.
1) and applied our calibration method to estimate the locations of the transducers. The designed and calibrated locations of the transducers and their relative shifts are plotted in Supplementary Fig. 2.

Comparison of the images reconstructed before and after calibration
Having obtained the calibrated coordinates, we proceed to evaluate the improvement in the reconstruction quality due to the geometric calibration using two data sets. The first one consists of point source responses recorded at five different locations within the FOV of the array. Note that these data are not part of the 108 point-sources used for the calibration. The second data set is obtained by imaging the breast of a healthy adult subject lying down in a prone position within a single breath-hold of 10 seconds (to minimize motion artifacts), and with the system being operated in configuration #2.

Point source reconstruction results
The reconstructed images of one of the five point-sources using the uncalibrated (designed) and calibrated coordinates are shown in Figs. 4 (a) and (b), respectively. The images are maximum amplitude projections (MAPs) of the reconstructed volume along the x, y, and z directions. From the images, we see a clear improvement in the calibrated image ( Fig. 4(b)) compared to the uncalibrated image ( Fig. 4(a)) in terms of the improved sharpness of the reconstruction and the suppressed artifacts in the background. We identify three locations in the MAPs (marked as points A, B, and C in Figs. 4(a) and (b)) where the difference between the two images is prominent and extract line profiles of the volumes at each of these locations in the direction perpendicular to their corresponding MAPs. The profiles at points A, B, and C are plotted in Figs. 4(c), (d), and (e), respectively, and they also show that the background in the calibrated image is lower than the uncalibrated one. Finally, to better appreciate the differences between the two images, we provide a video that toggles between the two images consecutively (Supplementary Video 1) and a video that shows the reconstructed 3D volumes of the uncalibrated and calibrated point sources (Supplementary Video 2).
To quantify the improvement in the reconstructed point source images, we compute their CNRs. Additionally, we compute two other metrics that characterize the resolution of the system. These metrics are based on the power-RMS width (RMS stands for root-mean-square) defined in Appendix A.2 of [33]. The power-RMS width of a complex-valued function, f(t), is the standard deviation of the probability density function given by |f(t)| 2 . We extend this definition to 3D by computing the covariance matrix, Σ V 2 ∈ R 3×3 , of the 3D distribution defined by the square of the reconstructed volume, V. From the covariance matrix, we define the following quantities, and where Det(⋅) and Tr(⋅) denote the determinant and the trace of a matrix, respectively. The size and the spread of the reconstructed point source are a measure of its volume, and its spread in the radial direction, respectively. The advantage of using these covariance matrix-based measures over conventional metrics (such as the full width at half maximum) is that they are axis-invariant and do not place assumptions on the polarity or the shape of the reconstruction (such as Gaussianity). The three metrics are computed for the reconstructions of all five pointsources and their mean and standard errors are reported in Table 1. We also report the relative improvement for each metric, defined as,

Relative improvement in metric
= |Metric of calibrated image − Metric of uncalibrated image| Metric of uncalibrated image .
From Table 1, we see that there is an (80 ± 19)% improvement in the CNR, a (19 ± 3)% improvement in the size, and a (7 ± 1)% improvement in the spread of the point source reconstruction. We also reconstruct the image of a simulated point source (see Supplementary Fig. 3). The size and spread of the simulated point source are 0.1 mm 3 and 0.81 mm, respectively, and they are very close to the size and spread of the calibrated point source reconstruction.

In-vivo reconstruction results
Next  Supplementary Fig. 4. Further, to quantify the improvement, we compute the CNRs at five points within the region of interest (the green dashed box) and compute their mean and standard error. These quantities are presented in Table 2 and they show that the CNR in this region improves by (25 ± 7)% due to the geometric calibration.

Discussion
In this paper, we have presented a method for the geometric calibration of the ultrasound transducer arrays used in PACT. The method is versatile in that it can be used for any ultrasound array, provided the point source measurements are made within the FOV of the array. The method also overcomes the ill-posedness and non-convexity of the original formulation in Eq. (1) by using surrogate methods to estimate the speed of sound and the point source locations, leading to a linear system of equations in the transducer coordinates. We applied our method to a 3D PACT system and showed that using the estimated transducer locations obtained from our method resulted in a significant improvement in the reconstructed point sources and the in vivo human breast image in terms of the CNR and the resolution. Our method would be particularly useful in situations where precisions in the transducer positions are difficult to control, such as when arrays are constructed using individual ultrasound transducers [7,22].
A notable advantage of our formulation is that it is linear in the transducer coordinates. In addition to simplifying the optimization, the linearity also allows for a straightforward characterization of the error in the estimated transducer coordinates, as shown in Appendix A. Characterizing the error is particularly important for practical considerations such as choosing the number and positions of point sources needed to calibrate an array within a given error tolerance. For instance, for the demonstration in Section 3, let the error tolerance be λ 0 /5, where λ 0 ≈ 0.67 mm is the wavelength corresponding to the center frequency of the array. For the point source arrangement that was used, as shown in Appendix A, we estimate the errors along the three coordinate axes defined by the three-axis translation stage as 0.03 mm, 0.03 mm, and 0.07 mm, respectively, which is well within our error tolerance. If our error tolerance is even lower, we can either increase the pitch between the point sources or increase the number of point source measurements to satisfy the requirement. The ability to systematically choose the point source arrangement based on an error tolerance distinguishes our method from the existing approaches in the literature [8,21,22].
In our method, we estimate the speed of sound in water by measuring the temperature of the water. To ensure that this estimate is accurate, a few points must be considered. Firstly, the speed of sound in water does not just depend on the temperature of the water but also its purity [34]. In our experiment, we used deionized water with a resistivity 2 of 2 MΩ⋅ cm to ensure that our speed of sound estimate is accurate. Secondly, since we assume that the speed is homogeneous, we must make sure that the water temperature is uniform and constant throughout the experiment. One way to do this is to start the experiment only after the water temperature has reached a steady state as monitored at several locations and at regular intervals in time.
Our method also requires an accurate estimate of the ToAs of the point source signals. While estimating the ToAs, it is crucial to account for any delays in the data acquisition pipeline such as the element receive delays. In PACT systems, these delays can be found by diffusing the laser light onto the array which generates a strong PA signal (termed the transducer surface signal) at the instant of laser emission. The ToA estimation approach from Section 3.1 can be used to estimate the firstarrival time of the transducer surface signal. This synchronizes the laser emission with the data acquisition system. It is important to note that the ToA estimation approaches described in Section 3.1 are valid only when the point source response does not change significantly within the measurement region. If it does, then the spatial impulse response of the transducers [37] has to be incorporated into these approaches for accurate ToA estimation.

Table 1
A quantitative comparison of the uncalibrated and calibrated point source reconstructions using three metrics: CNR, size, and spread. The reported quantities are the mean ± standard errors of the respective metrics for the reconstructed volumes of five different point sources. The size and spread of the point source reconstruction are defined in Eqs. (3) and (4) We concede that despite accounting for several factors in the estimation of the speed of sound and the ToAs, there may still be some error in these estimates. For instance, changes in the temperature that are smaller than the measurement resolution of 0.1 • C could result in some error in the estimated speed of sound. Similarly, since we define our ToA based on an amplitude threshold, we ignore the part of the point source response that occurs prior to this instant, which introduces some error in our ToA estimates. We account for such errors in our error analysis in Appendix A, where we assume that our speed of sound error is 0.3 m/s (based on the resolution of the temperature measurement) and the ToA estimation error is approximately 0.45 μs (based on the center frequency of the array). While our method is readily applicable to any ultrasound transducer array, a practical concern arises when working with 2-dimensional (2D) PACT systems (for example, a ring array, or a linear array). In this case, it is crucial to ensure that the point sources and the transducer array lie in the same plane. Otherwise, the ToAs of the point source responses acquired by the PACT system do not accurately reflect the true distances between the point sources, leading to erroneous results. To overcome this, if we perform 3D geometric calibration for a 2D array, then it is necessary to account for the changes in the spatial impulse response of the transducer and the decrease in the signal-to-noise ratio while estimating the ToAs for out-of-plane point sources.
In conclusion, we presented a method for the geometric calibration of the ultrasound transducer arrays used in PACT systems, demonstrated the method in a 3D PACT system, and discussed several practical considerations in implementing the method. We hope that our work will standardize the practice of geometric calibration in PACT and lead to improved image quality in PACT systems. A demo code for our method has been posted on GitHub. 3

Imaging protocols
All human imaging experiments were performed with the relevant guidelines and regulations approved by the Institutional Review Board of the California Institute of Technology (Caltech). The human experiments were performed in a dedicated imaging room. Written informed consent was obtained from all the participants according to the study protocols.

Data and code availability
The data that support the findings of this study are provided within the paper and its Supplementary materials. A demo code for the calibration method has been posted online at https://github.com/ karteekdhara98/PACT-geometric-calibration. The reconstruction algorithm and data processing methods can be found in the paper. The reconstruction code is not publicly available because it is proprietary and is used in licensed technologies.

Declaration of Competing interest
L.V.W. has a financial interest in Microphotoacoustics, Inc., Cal-PACT, LLC, and Union Photoacoustic Technologies, Ltd., which, however, did not support this work. The other authors declare no competing interests.

Data Availability
The reconstruction algorithm and the data that support the findings of this study are provided within the paper and its Supplementary materials. A demo code for our method has been posted online.

Appendix A. Error analysis
As described in Section 2.2, in our method, the transducer locations are estimated as x = ( To compute the error in x, we treat the elements of b as random variables whose mean is equal to their true value and standard deviation is equal to the experimental error in their measurement. Assuming that the high-precision translation stage has an accuracy much smaller than the wavelength of the transducers being calibrated, we can treat the matrix A as a deterministic quantity with negligible error. Let the covariance matrices of b and x be Σ b ∈ R Nc×Nc and Σx ∈ R 3×3 , respectively. Then, we have the relation, where A † = ( A T A ) − 1 A T is the pseudoinverse of A, and Σ b has the following form: Here, σ c and σ t are the uncertainties in the measurements of the speed of sound and the ToAs, respectively, c is the speed of sound, and t 0 is the ToA of a point source response at the transducer. Since the speed of sound is derived from temperature measurements, σ c depends on the error in the temperature measurement, T. Assuming that the temperature measurement is unbiased and has an uncertainty of σ T , we have σ c = ⃒ ⃒dc dT ⃒ ⃒ σ T . The uncertainty in the ToA, σ t , can be estimated as 1/f 0 , where f 0 is the center frequency of the array, because the bandwidth is usually given as a fraction of the center frequency. Finally, note that although the ToA from every point source is different, for simplicity, we only consider a representative ToA, t 0 , for calculating σ 0 . Having calculated Σ b using Eqs. (6) and (7), we can use Eq. (5) to compute Σx. The square roots of the diagonal elements of Σx give us the estimation error in the transducer positions along the three coordinate axes.
To illustrate how these quantities are computed in a practical situation, we estimate the error for the demonstration in Section 3. Note that this calculation is typically performed before the experiment. Firstly, the bidirectional repeatability of our translation stage (PLS-85, Micos) is 0.2 μm, which is much less than the wavelength corresponding to the center frequency of the array (670 μm). Therefore, matrix A can be treated as a deterministic quantity. Assuming that the water temperature is 20 • C, we infer the speed of sound to be, c = 1482.3 m/s, and dc dT ≈ 3 m/s. Since the readings from our thermocouple (HH303, Omegaette) are sufficiently precise, we determine that the maximum error in our temperature measurement is equal to the resolution of the thermocouple, i.e., σ T = 0.1 • C. Thus, σ c ≈ 0.3 m/s. The center frequency of the array is 2.25 MHz. Therefore, σ t = 1 2.25×10 6 ≈ 4.5 × 10 − 7 s. t 0 is estimated as the time taken for the acoustic wave to travel a distance of 13 cm (from the center of the array to the transducer), i.e., t 0 = 0.13 1482.3 ≈ 8.8 × 10 − 5 s. Substituting these values in Eq. (7), we get σ 0 ≈ 5 × 10 − 3 . Next, we construct Σ b for the point source arrangement in Section 3 using Eq. (6). Then, we substitute it into Eq. (5) to obtain the errors in the estimated transducer locations along the three coordinate axes as 0.03 mm, 0.03 mm, and 0.07 mm, respectively. We study the dependence of the estimation error on the number of point sources and the point source locations in Supplementary Fig. 5.