Regional motion correction for in vivo photoacoustic imaging in humans using interleaved ultrasound images

: In translation from preclinical to clinical studies using photoacoustic imaging, motion artifacts represent a major issue. In this study the feasibility of an in-house algorithm, referred to as intensity phase tracking (IPT), for regional motion correction of in vivo human photoacoustic (PA) images was demonstrated. The algorithm converts intensity to phase-information and performs 2D phase-tracking on interleaved ultrasound images. The radial artery in eight healthy volunteers was imaged using an ultra-high frequency photoacoustic system. PA images were motion corrected and evaluated based on PA image similarities. Both controlled measurements using a computerized stepping motor and free-hand measurements were evaluated. The results of the controlled measurements show that the tracking corresponded to 97 ± 6% of the actual movement. Overall, the mean square error between PA images decreased by 52 ± 15% and by 43 ± 19% when correcting for controlled- and free-hand induced motions, respectively. The results show that the proposed algorithm could be used for motion correction in photoacoustic imaging in humans.


Introduction
In photoacoustic (PA) imaging, pulsed laser light is emitted into the tissue where different chromophores absorb the photons. This causes thermoelastic expansions that generate broadband ultrasound pulses which are detected by an ultrasound transducer at the tissue surface. Thus, PA imaging combines optical contrast with ultrasound imaging and can provide spectral information that depend on molecular composition. Unlike pure optical techniques, PA imaging can offer spectral information at relatively high spatial resolution down to a depth of several centimeters.
PA imaging of blood vessel has drawn particular interest and encouraging results have been obtained in animal studies, showing detailed images of small blood vessels with high resolution [1]. The first attempt to image the human vasculature has been performed [2,3], particularly in inflammatory disease conditions in Raynaud's disease [4], and systemic sclerosis [5]. Recently, we reported the adaptation of the PA technique for use in temporal arteries in humans [6], and promising results have been shown by us and other for the diagnosis of skin cancer, including melanoma [7], basal cell carcinoma [8], and squamous cell carcinoma [9], in patients.
One issue in the translation from preclinical to clinical studies is the introduction of motion artifacts due to i) movements caused by the patient, ii) internal movements of e.g. arteries, and iii) movements of the manually operated ultrasound transducer. During acquisition of multiple frames, for averaging or for spectral imaging (transmission of several different wavelengths), motion artifacts arise from the fact that the same tissue region generates PA signals at different locations in each frame. As a result, distortions are introduced in the reconstructed PA spectral images.
Studies have been made to improve image quality in photoacoustic tomographic imaging through global motion correction [10][11][12][13]. However, the tissue motion could differ in different parts of the images which would leave purely global translational and/or rotational adjustments inadequate. In such cases a regional motion tracking technique would be preferable. Kircher et al. used an optical flow algorithm to motion correct blood-oxygenation measurements in a carotid artery [14]. Minjae et al. used a block-matching approach to correct regional motions in PA images of a graphite lead-injected chicken breast [15]. Mozaffarzadeh et al. used a feature-based image registration method to motion correction 2D PA profile images of the tooth and its periodontium [16]. Jeng et al. recently proposed a handheld PA system, where multiple individually controlled fibers are placed in a ring around the ultrasound transducer, that can achieve high frame rate (50 Hz) imaging. They used a two-pass speckle tracking approach to correct movements between PA images of a mouse leg muscle [17].
A complicating factor for motion correction in spectral PA imaging is the fact that succeeding frames are not comparable (due to difference in absorption between optical wavelengths) and optimal matching might be difficult. By instead matching interleaved ultrasound images this issue could be resolved, however the effect of such correction of PA images using regional motion tracking in vivo are very few [14,17]. More studies on this subject are needed. Specifically, the improvements caused by motion correction of in vivo PA images are ultimately the most interesting measure. However, in vivo images are not straight forward to evaluate in a general manner since no ground truth is known (pixel-wise) in the images. Further, in an in vivo setting without exogenous contrast agents and with a large number of vessels, there are multiple ultrasound sources (PA absorption in hemoglobin) that can create interference patterns (speckle) with nonnegligible amplitudes [18,19]. Hence, motion correction of in vivo images might not show the same improvement as of those on phantoms with point-wise inclusions/absorbers. Ultimately, an evaluation of motion correction in an in vivo setting is thus necessary.
In continuation from our previous in vivo studies on human tissue, there is a need for motion correction to optimize the evaluation of PA acquisitions containing motion artifacts. We have previously evaluated a regional motion tracking method on a tissue mimicking phantom that enables fast 2D tracking in entire images using ultrasound images [20]. The main purpose of this study was to demonstrate the feasibility of using this algorithm for motion correction of in vivo human PA images. For this purpose, the algorithm has been improved to handle the large frame to frame movements present in low frame rate PA imaging. Motion artifacts included both internal (artery) movements and transducer motions caused by a computerized stepping motor as well as by free-hand movements. Besides an evaluation of the motion tracking itself, efforts were made to evaluate the overall improvement caused by the motion correction in the PA images.

Data acquisition and imaging protocol
Eight healthy volunteers (ages 38-56), without any underlying medical conditions, underwent PA imaging of their forearms. The scanner used was a Vevo LAZR-X (FUJIFILM VisualSonics Inc., Toronto, CA) equipped with a linear array transducer (MX400, FUJIFILM VisualSonics Inc., Toronto, CA). The transducer had a center frequency of 30 MHz (22-40 MHz bandwidth) providing an axial resolution of 50 µm and a lateral resolution of 110 µm. A fiberoptic bundle connected to a 20 Hz tunable laser (generating nanosecond laser pulses) was coupled to the transducer. The laser light was focused through two planar light beams located on each side of the ultrasound array. Four laser pulses were used in the construction of each PA image, resulting in a PA frame rate of 5 Hz. Ultrasound B-Mode images were captured interleaved with the PA images, also at a frame rate of 5 Hz. Ultrasound gel was placed on the probe and complemented with an Aquaflex ultrasound gel pad (12 × 12 × 24 mm) in order to fill the gap between the transducer and the outer edge of the plastic holder that connected the optic cables. The gel pad was kept in place with plastic foil during the examination. Details on the setup of the equipment have been described previously [6].
In total 16 acquisitions were made on each volunteer, including eight computerized measurements and eight free-hand measurements. In the computerized measurements the transducer was placed in a stepping motor (VisualSonics Inc., Toronto, CA) that enabled controlled 1D movements of the transducer. Normally the transducer linear array is placed perpendicular to the 1D movement direction to enable 3D scanning, but in this study the transducer was placed with the array in parallel with the 1D movement direction in order to create a controlled in-plane motion within the images. An adjustable arm (Mounting Accessory, GCX Corporation, Petaluma, USA) was used to place the transducer (and stepping motor) in the right position on the subject. Figure 1 show the acquisition setup (except for the plastic foil, see Ref. [6]) using the free-hand and stepping motor configurations. The protocol, summarized in Table 1, was approved by the Ethics Committee at Lund University, Lund, Sweden. All participating subjects were given verbal and written information about the study and informed written consent was obtained from all subjects according to the Helsinki Declaration.
Based on the interest in evaluating the result in an in vivo setting, the rationale for choosing single wavelength acquisitions instead of spectral acquisitions (different wavelengths in each frame) was threefold: I. There is no ground truth. Since no ground truth exist in the PA response of the in vivo images, acquired spectral signatures cannot in themselves add value in the accuracy of the motion tracking.
II. Different wavelengths cannot be compared. Even if there is no ground truth, the effect of the motion tracking on the PA images can still be evaluated based on the assumption that the PA response should be the same for the same tissue segment. Motion between PA frames of equal wavelength can be compared while those of different wavelength cannot. III. The system has low frame rate. Although care was taken to minimize motion of the forearm itself, tens of micrometer (3D-) motions between frames, and more over time, was unavoidable. Using a low frame rate acquisition, the tissue decorrelation in an in vivo setting could cause blurring with difficulties in interpreting the results. Hence, to achieve as high frame rate as possible for the evaluation, single-wavelength acquisitions were chosen.
The motivation for using two different wavelengths in the protocol was to extend the evaluation of the motion tracking to also include another type of PA tissue response. The choice of 700 nm and 900 nm was a combination of creating different PA responses with regards to oxygenated and deoxygenated hemoglobin and utilizing the available wavelength range of the laser. Further, the choice of two acquisitions in each step was a consideration between having more independent images (they were not acquired at the exact same location) and not increasing the total acquisition time too much.
Photoacoustic and ultrasound data (both beamformed prior to exportation) was imported to MATLAB (The Mathworks Inc., Natick, MA, USA) where motion correction and analysis were performed.

Motion tracking
An intensity phase tracking (IPT) algorithm was used for motion tracking on the interleaved ultrasound B-Mode images [20]. To allow a wider usage, both in PA and in ultrasound applications, IPT was originally design for use on B-Mode data, which is far more available compared to RF-data. In this study, the IPT algorithm was modified to allow the larger between-frame motions created by the stepping motor or caused by the low frame rate during free-hand imaging. Motion tracking was done in all acquisitions with a kernel size of 1 × 1 mm. In short, IPT is based on the local phase variation of the amplitude information (amplitude converted to phase information). By combining phase differences between frames with the estimated local phase gradients the motions can be estimated. The modification to allow tracking of larger between-frame movements was based on the same measurement principle and implemented as an initial compensation for motions at a global level, before applying the regional tracking. The operations can be performed very efficiently, and the computational time is considerably lower compared to e.g. block matching approaches. A principal overview of the IPT algorithm can be viewed in Supplement 1, including calculations of the larger global motion as well as the local motions.
Once the motion tracking on the interleaved ultrasound images was done, the corresponding PA images could be motion corrected. This was done by keeping track of the motion of each individual pixel in the first frame throughout the cineloop. This meant that a pixel (y,x) in frame k in the motion corrected cineloop would be chosen as the pixel corresponding to (dY y,x , dX y,x ) in frame k in the original PA cineloop, where dY y,x and dX y,x are the cumulative measured vertical and horizontal movements from frame 1 to k. Note that the motion corrected cineloop was only valid and evaluated for the areas that stayed within the imaged window during the entire acquisition.

Evaluation
The performance of the motion tracking was evaluated by comparing the tracking results with the ground truth created by the stepping motor, which was 178 µm between each frame.
The improvement of the PA images caused by the motion correction was not straight forward to evaluate for several reasons: I. No ground truth. The exact combination of tissue chromophores, and thus the PA response, in different parts of the image was unknown. Also, the true spatial extension of expected responses, e.g. oxygenized hemoglobin around the radial artery, was unknown. Thus, the obtained PA signal cannot be compared to a ground truth. Nor can the image, objectively and with confidence, be divided into target and background.
II. Signal amplitude variation. Part of the obtained PA signal amplitude varies randomly in the image due to interference patterns and beamforming effects, similar to ultrasound speckle. The pattern is stable for stationary images but varies with transducer movements. This significantly affects the overall amplitude variation in areas with low PA signal strength, with impacts on the following measures: a) Contrast to noise ratio. To evaluate a contrast to noise ratio, the interference pattern must be treated as background/noise and need some spatial averaging. This average is not expected to change in relation to a target signal once averaging motion corrected frames. Further, as stated in reason (I) above, it is not obvious how to divide the image into target and background. The purpose of the motion tracking is not to enhance the contrast to noise ratio -but to align spatially misaligned PA images. A more suitable measure would be one that compares similarities between images.
b) Image similarity. Based on reasons (I) and (II) above, it would seem best to compare the corrected PA images to a reference image in order to evaluate how well the motion tracking has aligned the images. The first frame of the acquired PA image would seem like a good choice as reference. However, due to the interference pattern none of the other frames will match the first (or any other) frame exactly. On the contrary, areas with low PA strength (which could be a majority of the image) could have a large percentual difference in pixel-wise PA amplitude even if the motion correction is very accurate.
In an attempt to overcome the difficulties mentioned above and still quantify the improved stability in the in vivo PA images, we chose to compare each frame to an optimal reference frame. Due to the averaging of PA speckle (while preserving underlying amplitudes), the average of all motion corrected frames, was assumed to correspond to the most noise-and speckle-free PA response. Therefore, this was considered as the optimal reference. In this choice of reference frame, it was assumed that the motion tracking and corresponding PA image correction was working. Here, the purpose was not to evaluate the motion tracking explicitly but rather to quantify the improvement in PA image stability -knowing that even if the PA images were optimally motion corrected, the amplitude variation in the frames would still differ due to different interference patterns.
The stability in PA response (deviations from the assumed reference frame) was determined using the mean squared error (MSE) defined as where X n is the n:th out of all N pixels andX n is n:th pixel in the reference frame. Instead of presenting MSE values for motion corrected vs. original PA images, the ratio between them was used since this represented the improvement for the specific image. The suppression of noise (including interference pattern) by moving the transducer, causing an extension of the aperture, can be further evaluated through the relation between how much the transducer has been moved and the quality of in vivo PA images. A universal quality index (QI) [21], was used to compare individual frames to the reference, where the reference was an increasing number of averaged motion corrected images (increasing aperture). A ratio in QI between the motion corrected and original images was considered more relevant than the individual QI.
To avoid both very large and very low PA amplitudes as well as signals from the plastic foil, the selected region for evaluation in all subjects started 1 mm below the skin surface and extended to a depth of 6 mm. Areas outside this region were ignored.
Videos were created to allow visual interpretation of the tracking.

Results
The mean estimated motion in the interleaved ultrasound B-Mode images, acquired using the computerized stepping motor, was 172 ± 11 µm. This corresponded to 97 ± 6% of the actual movement caused by the stepping motor. The mean estimated (absolute) motion between frames in the free-hand acquired ultrasound images was 249 ± 107 µm. Figure 2 shows examples of the effect on PA images acquired using both the stepping motor and free-hand configurations with and without motion correction. Note that the PA responses from the artery only appear at the edge of it because all available photons are quickly absorbed by the large amount of hemoglobin. The last two rows of Fig. 2 show an example of the differences between the 700 nm and the 900 nm acquisition on the same tissue section with and without motion correction, using the stepping motor.
The overall mean square error in the PA images decreased by 52 ± 15% when correcting the motion caused by the stepping motor and by 43 ± 19% when correcting for the free-hand induced motions, compared to no motion correction, see Table 2. There was a difference in improvement between acquisitions from different wavelength. For the 700 nm acquisitions the MSE decreased by 61 ± 16% (stepping motor) and by 54 ± 17% (free-hand) and for the 900 nm acquisitions it decreased by 44 ± 10% (stepping motor) and by 33 ± 15% (free-hand). A video that demonstrates an example of the motion tracking and cumulative averaging of PA frames, with and without motion correction, can be viewed in Visualization 1. Figure 3 shows the last frame of Visualization 1, with the B-Mode image to the left, averaged PA frames without motion correction in the center and averaged PA frames with motion correction to the right. Note Fig. 2. Each row shows a B-Mode image (1 st column) with different overlaid photoacoustic images (2 nd -4 th column). In the 2 nd column the first acquired PA frame is overlaid. In the 3 rd column all PA frames are averaged and overlaid. The 4 th column shows the average of all motion corrected PA frames overlaid. The 1 st row shows an example when the stepping motor was used (wavelength=900 nm). The 2 nd row shows an example when the free-hand configuration was used (wavelength=900 nm). The 3 rd and 4 th rows show examples of the same artery but with different wavelengths (700 nm and 900 nm respectively). that, even though tracking was done in all pixels, for clarity only the red stars in the B-Mode image were used to visualize it. The universal QI increased by 244 ± 86% (from range 0.19-0.66 to range 0.61-0.84) when correcting the motion caused by the stepping motor and by 179 ± 36% (from range 0.32-0.63 to range 0.59-0.77) when correcting for the free-hand induced motions. Figure 4 shows the ratio between the QI for the motion corrected images and the QI for the original images versus an increasing number of frames averaged in the reference (used to calculate each QI), for the stepping motor and free-hand configurations. In both cases the quality improvement of the corrected PA images, compared to the original images, increased with the number of frames (increasing effective aperture).

Discussion
Motion artifacts caused by internal and/or external movements during PA acquisitions could result in incorrect spectral signatures or otherwise misleading information when combining acquisitions over several frames. Hence, motion correction is often necessary to enable analysis of such measurements. The purpose of this study was a) to show the feasibility of using an in-house regional motion tracking algorithm on in vivo PA images and b) to try to evaluate the improvement of the in vivo PA images caused by the motion correction. Hence the results of this study aimed to give credibility in both tracking performance and that the resulting motion corrected PA images were more useful.
The results show that the measured overall movements using motion tracking corresponded well to those induced by the motorized stepping motor. The result of the motion tracking was therefore considered satisfactory for the controlled motions. For the free-hand motions there was no ground truth. Visual assessment of the tracking performance showed that the tracking seemed satisfactory if the movement occurred within the image plane (few out-of-plane movements). In some cases, the in-plane motion was very large between subsequent frames and in these cases the motion tracking failed. This is probably the reason for the lesser improvement in these cases. However, such large movements will likely not occur under normal conditions (in this study the operator was instructed to induce motion on purpose). Overall, the results from the free-hand acquisitions showed that tracking can be used to correct motion artifact caused by regular use of a hand-held transducer.
Due to the lack of a ground truth, the motion correction of arterial wall movements was not quantified explicitly. However, the effect of this correction is still included in the overall MSE and QI measures. Further, in Fig. 2 and in Visualization 1, the averaged motion corrected PA signals are kept stationary compared to the first frame, thus giving a qualitatively satisfactory tracking result.
The tracking was performed on interleaved ultrasound B-Mode images to represent a general situation. Because in settings using consecutive frames with different wavelengths (spectral imaging), tracking in the PA images directly is not suitable due to the difference in tissue PA absorption. By transferring the measured movements to the corresponding PA images and rearranging those at a pixel level, a motion corrected PA sequence could be achieved. Resulting motion corrected PA images and videos show that the stronger PA signals originating from specific tissue segments are indeed kept in the same image location throughout the cineloop.
Using the average of all frames as a reference, the MSE decreased significantly when using motion correction. This improvement was not surprising, in fact the MSE was expected to decrease even more. However, in each frame there is a presence of speckle resulting from interference of pulses from multiple absorbers. The interference pattern varies between frames due to changes in tissue orientation compared to the laser-and ultrasound transducer aperture. As a result, there is a considerable variation in pixel-wise PA signal amplitude between frames, especially in areas where the interference pattern is the dominate source of signal strength. This, together with often large areas of relatively low PA amplitude, could explain why the MSE did not decrease even more when using motion tracking compared to not using it.
Motion in PA acquisitions might be used as an advantage considering that different interference patterns could be averaged during motion correction to enhance the image quality. To address this, each motion corrected image was evaluated against an increasing number of averaged frames as reference, using a quality index. The result show that an increase in transducer displacement indeed increase the image quality, see Fig. 4 and the lesser amount of speckle in the 4 th column compared to the 2 nd in Fig. 2. Not surprisingly, the rate of improvement decreases when adding more interference patterns no longer significantly change the averaged reference. More studies are needed to address if improvements in the quality would justify longer acquisition times, especially for spectral imaging using many different wavelengths.
Once comparing the differences between wavelengths, see last two rows in Fig. 2, the motion corrected PA sequences have more physical meaning. The location of the stronger PA signals in the images are kept at the same location and the difference in amplitude between different laser wavelengths gives meaningful interpretation as, for example, oxygenated hemoglobin in the artery. The difference in improvement of MSE for the 700 nm and 900 nm wavelength acquisitions might be explained by the fact that the 900 nm wavelength gave a background with, on average, lower PA amplitude and thus more significant influence of the variations in the interference pattern.
Overall, the motion corrected in vivo PA images seems considerably more meaningful than the original motion affected PA images when combining multiple frames. The use of multiple frames (spectral imaging) is often the preferred option since it gives more information on the tissue. In this study we chose single wavelength acquisitions in order to minimize the uncertainties with no ground truth during in vivo imaging. Still, we believe it is important to evaluate an in vivo setting, even if a ground truth is missing. Especially since studies on motion correction and its effect on in vivo PA images are currently very few.
The results of this study show that the proposed method can be used to motion correct in vivo PA images. This is an important result since integration of such motion correction is required in the translation towards in vivo PA imaging.