Improving needle visibility in LED-based photoacoustic imaging using deep learning with semi-synthetic datasets

Photoacoustic imaging has shown great potential for guiding minimally invasive procedures by accurate identification of critical tissue targets and invasive medical devices (such as metallic needles). The use of light emitting diodes (LEDs) as the excitation light sources accelerates its clinical translation owing to their high affordability and portability. However, needle visibility in LED-based photoacoustic imaging is compromised, primarily due to the low optical fluence of LEDs. In this work, we propose a deep learning framework based on U-Net to improve the visibility of clinical metallic needles with an LED-based photoacoustic and ultrasound imaging system. To address the complexity of capturing ground truth for real data and the poor realism of purely simulated data, this framework included the generation of semi-synthetic training datasets combining simulated data to represent features from the needles and in vivo measurements for the tissue background. Evaluation of the trained neural network was performed with needle insertions into blood-vessel-mimicking phantoms, pork joint tissue ex vivo and measurements on human volunteers. This deep learning-based framework substantially improved the needle visibility in photoacoustic imaging in vivo compared to conventional reconstruction by suppressing background noise and image artefacts, achieving 5.8 and 4.5 times improvements in terms of signal-to-noise ratio and the modified Hausdorff distance, respectively. Thus, the proposed framework could be helpful for reducing complications during percutaneous needle insertions by accurate identification of clinical needles in photoacoustic imaging.


Introduction
Ultrasound (US) imaging is widely used for guiding minimally invasive percutaneous procedures such as peripheral nerve blocks [1], tumour biopsy [2] and fetal blood sampling [3]. During these procedures, a metallic needle is inserted percutaneously into the body towards the target under real-time US guidance. Accurate and efficient identification of the target and the needle is of paramount importance to ensure the efficacy and safety of the procedure. Despite a number of prominent advantages of US imaging, such as its real-time imaging capability, high affordability and accessibility, it suffers from intrinsically low soft tissue contrast that sometimes results in insufficient visibility of critical tissue structures such as nerves and small blood vessels. Moreover, visibility of clinical needles with US imaging is strongly dependent on the insertion angle and depth of the needle. Clinical datasets with fine annotations are usually required for clinical applications but are difficult to obtain. Photoacoustic (PA) imaging has been of growing interest in the past two decades for its various potential preclinical and clinical applications, owing to its unique ability to resolve spectroscopic signatures of tissues at high spatial resolution and depths [21][22][23]. In recent years, several research groups have proposed the combination of US and PA imaging for guiding minimally invasive procedures, with the two modalities offering complementary information: US imaging provides tissue structural information, and PA imaging identifies critical tissue structures and invasive surgical devices such as metallic needles [24][25][26][27]. Recently, laser diodes (LDs) and light emitting diodes (LEDs) have shown promising results as alternatives to the solid-state lasers that are commonly used as PA excitation sources, owing to their favourable portability and affordability, which is advantageous for clinical translation [28][29][30].
Deep learning (DL) has been demonstrated as a powerful tool for signal and image processing, leading to remarkable successes in medical imaging [31][32][33][34]. DL-based approaches have been proposed for PA imaging enhancement, especially photoacoustic tomography [35], where they process raw channel data for image reconstruction [36][37][38] and enhancement [39,40] as well as reconstructed images for image segmentation or classification [41,42].
DL has recently been used by several research groups for improving the imaging quality of LED-based PA/US imaging systems. Anas et al. [43] used a combination of a convolutional neural network (CNN) and a recurrent neural network (RNN) to enhance the quality of PA images by leveraging both the spatial features and temporal information in repeated PA image acquisitions. Kuniyil Ajith Singh et al. [29] proposed a U-Net model to improve the SNR by training a neural network using PA images acquired by an improved PA imaging system with a higher laser energy and broadband ultrasound transducers. The pre-trained model was proven effective with LED-based PA images acquired from phantoms. Hariri et al. [44] proposed a multi-level wavelet-CNN to enhance noisy PA images associated with low fluence LED illumination by learning from PA images acquired with high fluence illumination sources. Enhancements were achieved on unseen in vivo data with improved image contrast. Most recently, Kalloor Joseph et al. [45] developed a generative adversarial network (GAN)-based framework for PA image reconstruction to mitigate the impact of the limited aperture and bandwidth of the ultrasound transducer. The proposed model was trained on simulated images from artificial blood vessels and tested on in vivo measurements of the human forearm. The proposed approach was able to remove artefacts caused by the limited bandwidth and detection view.
Although prominent attention has been given to improving the visualisation of tissue structures, notably, not much effort has been made to improve the visualisation of invasive medical devices in PA imaging. In this work, we proposed a DL-based framework to enhance the visibility of clinical needles with PA imaging for guiding minimally invasive procedures. As clinical needles have relatively simple geometries whilst background biological tissues such as blood vessels are complex, as opposed to using purely synthetic data [46][47][48][49], a hybrid method was proposed for generating semi-synthetic datasets [50]. The DL model was trained and validated using such semi-synthetic datasets and blind to the test data obtained from tissue-mimicking phantoms, ex vivo tissue and human fingers in vivo. The applicability of the proposed model on diverse in vivo image data was further assessed on PA video sequences and compared with the standard Hough Transform. To the best of our knowledge, this is the first work that exploits DL for improving needle visualisation with PA imaging as well as utilises semi-synthetic datasets for DL in PA imaging.

System description
A commercial LED-based PA/US imaging system (AcousticX, CYBERDYNE INC, Tsukuba, Japan) was used for acquiring experimental data. A detailed description of the system can be found elsewhere [51]. Briefly, PA excitation is provided by two LED arrays that sandwich a linear array US probe at a fixed angle. Each array consists of four rows of LEDs with 36 elements of 1 mm × 1 mm on each row. The LED arrays can be driven at different pulse repetition frequencies from 1 kHz to 4 kHz, and the maximum pulse energy from each array is 200 μJ. The LED pulse duration is controllable between 30 ns and 150 ns. In this study, a pulse width of 70 ns at 850 nm was selected for optimal energy efficiency [52,53]. The illumination area formed by the LED arrays was approximately a rectangle (50 mm × 7 mm), resulting in an optical fluence of 0.11 mJ/cm² at the maximum total pulse energy of 400 μJ from the two arrays. The US probe has 128 elements over a linear array distance of 40.3 mm, with an element pitch of 0.315 mm, a central frequency of 7 MHz, and a −6 dB fractional bandwidth of 80.9%.
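The fluence and aperture figures quoted above can be checked with simple arithmetic. The short sketch below is only an illustration: the function and variable names are hypothetical, and it assumes that the 400 μJ figure is the sum of the two 200 μJ arrays spread uniformly over the 50 mm × 7 mm rectangle.

```python
def optical_fluence_mj_per_cm2(pulse_energy_uj, width_mm, height_mm):
    """Average fluence = pulse energy / illuminated area, in mJ/cm^2."""
    energy_mj = pulse_energy_uj / 1000.0              # uJ -> mJ
    area_cm2 = (width_mm / 10.0) * (height_mm / 10.0)  # mm^2 -> cm^2
    return energy_mj / area_cm2


# Assumption: both LED arrays fire together at their maximum energy.
ENERGY_PER_ARRAY_UJ = 200.0
NUM_ARRAYS = 2
total_energy_uj = ENERGY_PER_ARRAY_UJ * NUM_ARRAYS

fluence = optical_fluence_mj_per_cm2(total_energy_uj, 50.0, 7.0)
aperture_mm = 128 * 0.315  # number of elements x element pitch

print(round(fluence, 2), "mJ/cm^2")   # 0.11 mJ/cm^2
print(round(aperture_mm, 1), "mm")    # 40.3 mm
```

Both values agree with the numbers stated in the system description, which supports reading the 400 μJ figure as the combined output of the two arrays.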
Radio-frequency (RF) data for PA and US imaging were collected simultaneously from 128 channels on the probe with sampling rates of 40 MHz and 20 MHz, respectively. Interleaved PA and US imaging can be performed in real-time with image reconstruction performed on a graphics processing unit (GPU). Meanwhile, a maximum of 1536 PA frames and 1536 US frames corresponding to a total duration of 20 s could be saved in memory at one time, available for offline reconstruction.

Semi-synthetic dataset generation
The process of semi-synthetic dataset generation comprised three main steps as shown in Fig. 1: (1) acquisition of in vivo data to account for background PA signals originated from biological tissue; (2) generation of synthetic sensor data from needles; (3) image reconstruction with raw channel data combining synthetic and measurement data.
In vivo data for background vasculature were collected by imaging the fingers of 13 healthy human volunteers using AcousticX. Experiments on human volunteers were approved by the King's College London Research Ethics Committee (study reference: HR-18/19-8881). For each measurement, a total of 1536 PA frames and 1536 US frames were saved to the hard drive of the system's workstation for offline reconstruction. The RF data of one PA or US frame was a 2D matrix with dimensions of 1024 × 128. The first 150 (out of 1024) time steps were zeroed to remove strong LED-induced noise that spanned across the upper 5 mm depth in PA and US images. Averaging over 128 frames was implemented for suppressing random noise in the background. Reconstructed PA and US images corresponded to a field-of-view of 40.3 mm (X) × 39.4 mm (Z) according to the geometry of the linear transducer and the total number of time steps (1024) at a 40 MHz sampling rate.
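The pre-processing described above (zeroing the early time steps and averaging over frames) can be sketched in a few lines. This is a minimal illustration on nested lists rather than the authors' implementation; the helper names are hypothetical, and the speed of sound of 1540 m/s is an assumption chosen because it reproduces the stated 39.4 mm imaging depth.

```python
def average_frames(frames):
    """Element-wise mean over equally sized 2D frames (time steps x channels)."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[t][c] for f in frames) / n for c in range(cols)]
            for t in range(rows)]


def zero_early_samples(frame, n_zero):
    """Return a copy of the frame with the first n_zero time steps set to zero."""
    return [[0.0] * len(row) if t < n_zero else list(row)
            for t, row in enumerate(frame)]


def depth_mm(n_samples, fs_hz, c_m_per_s=1540.0):
    """Depth reached after n_samples for one-way (PA) propagation.

    The speed of sound is an assumed value, not taken from the paper.
    """
    return n_samples / fs_hz * c_m_per_s * 1e3


# Toy example: two frames of 3 time steps x 2 channels (real data: 1024 x 128).
toy_frames = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[3.0, 4.0], [5.0, 6.0], [7.0, 8.0]],
]
averaged = average_frames(toy_frames)
cleaned = zero_early_samples(averaged, 1)

full_depth = depth_mm(1024, 40e6)   # about 39.4 mm
zeroed_depth = depth_mm(150, 40e6)  # about 5.8 mm under the assumed speed of sound
```

Because both operations are linear, the order in which zeroing and averaging are applied does not change the result.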
Simulations of the sensor data originating from the needle were performed using the k-Wave toolbox [54]. Initial pressure distribution maps were created by simulating the optical fluence distributions on the needle shaft using Monte Carlo simulations [55]. A 40.0 mm (X) × 40.0 mm (Z) region with a grid size of 0.1 mm was constructed to represent the background tissue. A uniform refractive index, optical scattering coefficient and anisotropy of 1.4, 10 mm⁻¹ and 0.9, respectively, were assigned to this region [56]. Three optical absorption coefficients of 1 mm⁻¹, 1.5 mm⁻¹ and 2 mm⁻¹ accounted for the variations in standard tissue. A homogeneous photon beam with a finite size of 38.4 mm was applied to the surface of the simulation area. Each simulation was run for around 10 min with approximately 100,000 photon packets.
A linear array of 128 US transducer elements (with a pitch of 0.315 mm over a total length of 40.3 mm) was assigned to a forward model in k-Wave to receive the PA signals generated from the initial pressure distribution maps. The US transducer was assigned a central frequency of 7 MHz and a fractional −6 dB bandwidth of 80.9% according to the specifications of AcousticX. RF data collected by the transducer elements were subsequently down-sampled to 40 MHz to match the sampling rate of the measured data. To account for the variations of needle insertions, simulations were conducted for clinically relevant needle insertion depths and angles, spanning from 5 mm to 25 mm with an increment of 5 mm, and from 20 degrees to 65 degrees with a step of 5 degrees, respectively.
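To make the simulated parameter ranges concrete, the sketch below enumerates the insertion depths, insertion angles and absorption coefficients given above. It assumes, purely for illustration, that the three parameters were fully crossed; the text does not state how many simulations were run, so the resulting count should not be read as the authors' number.

```python
from itertools import product

# Ranges taken from the text; range() end points are exclusive.
depths_mm = list(range(5, 30, 5))          # 5, 10, 15, 20, 25
angles_deg = list(range(20, 70, 5))        # 20, 25, ..., 65
absorption_per_mm = [1.0, 1.5, 2.0]        # optical absorption coefficients

# Assumption: every depth is combined with every angle and coefficient.
configs = list(product(depths_mm, angles_deg, absorption_per_mm))

print(len(depths_mm), len(angles_deg), len(configs))  # 5 10 150
```

Under this assumption there would be 150 distinct needle simulations, each of which can be paired with many in vivo background measurements to build the training set.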
To form a semi-synthetic image, the simulated RF data were normalised to the maximum amplitude of ex vivo needle signals collected by AcousticX. Subsequently, a pair of 2D data matrices (1024 × 128) consisting of RF data from a simulation on a needle and a measurement on a human finger were added to form a single 2D data matrix, which was then fed to a Fourier domain algorithm for image reconstruction [57]. The reconstructed images based on the semi-synthetic data were then interpolated to 578 × 565 pixels with a uniform pixel size of 70 μm × 70 μm. To facilitate network implementations, the images were cropped to 512 × 512 pixels by removing the corresponding rows from top to bottom and the same number of columns from left to centre and right to centre respectively.
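The combination step can be illustrated as follows. This is a minimal sketch under stated assumptions: `needle_peak` stands for the maximum amplitude of the measured ex vivo needle signals (its value is not given in the text), the function names are hypothetical, and the matrices are toy-sized stand-ins for the 1024 × 128 RF data.

```python
def peak_abs(rf):
    """Largest absolute sample value in a 2D RF matrix."""
    return max(abs(v) for row in rf for v in row)


def make_semi_synthetic(sim_rf, measured_rf, needle_peak):
    """Scale simulated needle RF data to a reference peak, then add the background.

    sim_rf and measured_rf must have the same shape (time steps x channels).
    """
    peak = peak_abs(sim_rf)
    if peak == 0:
        raise ValueError("simulated RF data contain no signal")
    scale = needle_peak / peak
    return [[s * scale + m for s, m in zip(s_row, m_row)]
            for s_row, m_row in zip(sim_rf, measured_rf)]


# Toy example with a hypothetical reference amplitude of 2.0.
sim = [[0.0, 2.0], [-4.0, 1.0]]
background = [[1.0, 1.0], [1.0, 1.0]]
combined = make_semi_synthetic(sim, background, needle_peak=2.0)
print(combined)  # [[1.0, 2.0], [-1.0, 1.5]]
```

Adding the data in the RF domain, before reconstruction, means that the needle and the background pass through the same reconstruction algorithm and share its artefacts.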
Finally, a total number of 2000 semi-synthetic images with substantial variations on both the needle and background were used for model training with the corresponding initial pressure distributions as the ground truths.

Acquisition of phantom, ex vivo and in vivo data for evaluation
Evaluation of the trained neural network was performed on PA images acquired with in-plane needle (20G, BD, USA) insertions into blood-vessel-mimicking phantoms, pork joint tissue ex vivo and human fingers in vivo (needle outside of tissue; see Supplementary Materials; Video S1). It is noted that the fingers with needle insertions were used to obtain representative real in vivo data, but there is no corresponding clinical scenario.
The blood-vessel-mimicking phantoms were created by affixing several carbon fibre bundles in a plastic box filled with 1% Intralipid dilution (Intralipid 20% emulsion, Scientific Laboratory Supplies, UK) that had an estimated optical reduced scattering coefficient of 0.96 mm⁻¹ at 850 nm [58]. The fibre bundles were randomly positioned to mimic different orientations of blood vessels. The acquired PA images were prepared following the pipeline used for processing the semi-synthetic data.

Network implementation
The network architecture implemented in this work was derived from the U-Net architecture proposed in Ref. [35]. In general, this model followed the original U-Net architecture [59] but had fewer scales and a reduced number of filters at each scale to accommodate a small input size. Experiments on the model capacity (see Supplementary Materials; Section 5) showed that the model in Fig. 2 was able to learn the regularities of the training data and generalise well to unseen data. In addition, the model was deliberately kept small and lightweight, which reduces its size and latency and is beneficial for real-time applications.
In Fig. 2, along the encoder path, each scale consisted of two convolutional layers followed by a 2 × 2 max pooling layer. Similarly, along the decoder path, each scale contained two convolutional layers, followed by a transposed convolutional layer with an up-sampling factor of 2. The model was trained using input pairs at a reduced resolution of 128 × 128 pixels, obtained by resizing from the initial size of 512 × 512 pixels via bicubic interpolation, which adapted well to the receptive field of the model. It was evaluated on real images with a size of 256 × 256 pixels (see 4. Discussion & Conclusions regarding the choice of the image resolution).
Our network was implemented in Python using PyTorch v1.2.0. The semi-synthetic dataset was randomly split into training, test, and validation sets with a ratio of 8:1:1. Training was performed for 5000 iterations with a batch size of 4 that minimised the mean square error (MSE) loss in the validation set using the ADAM optimiser [60] (initial learning rate: 0.001) and NVIDIA Tesla V100 GPUs. The CosineAnnealingLR learning rate scheme [61] was employed to steadily decrease the learning rate during the training.
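The training configuration can be illustrated without a deep learning framework. The sketch below evaluates the closed-form cosine annealing schedule and the 8:1:1 split sizes; it is not the authors' training script. The annealing period of 5000 iterations and the minimum learning rate of 0 are assumptions, since the text gives only the initial learning rate, and the function names are hypothetical.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max=1e-3, lr_min=0.0):
    """Closed-form cosine annealing without restarts.

    lr(t) = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t / T))
    """
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * step / total_steps))


def split_counts(n_items, ratios=(8, 1, 1)):
    """Number of items in each subset for an integer ratio split."""
    total = sum(ratios)
    return tuple(n_items * r // total for r in ratios)


TOTAL_ITERATIONS = 5000  # assumed to equal the annealing period

lr_start = cosine_annealing_lr(0, TOTAL_ITERATIONS)
lr_mid = cosine_annealing_lr(2500, TOTAL_ITERATIONS)
lr_end = cosine_annealing_lr(5000, TOTAL_ITERATIONS)

# 2000 semi-synthetic images split 8:1:1 (training : test : validation).
counts = split_counts(2000)
print(counts)  # (1600, 200, 200)
```

With these assumptions the learning rate falls smoothly from 0.001 to half that value at the midpoint and reaches zero at the final iteration.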

Post-processing
A post-processing algorithm based on maximum contour selection [62] was employed to further improve the outcomes of the trained neural network and to fit the needle trajectory. It was assumed that, for all the experiments, only in-plane placements with one single needle were performed. The isolated outliers in the outputs of the U-Net could be discriminated from the enhanced needle based on their differences in region size. Thus, the post-processing algorithm detected all contours in the outputs of the proposed model and retained the largest contour as the one from the needle by counting the number of pixels on each contour boundary.
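The idea of keeping only the largest detected structure can be sketched as below. Note that this is a simplified stand-in and not the contour-based method of Ref. [62]: it ranks 8-connected regions of a binary mask by their pixel count rather than by the number of pixels on each contour boundary, and the function name is hypothetical.

```python
from collections import deque


def keep_largest_component(mask):
    """Return a binary mask containing only the largest 8-connected region."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first search over one connected region.
            component, queue = [], deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                component.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < rows and 0 <= cc < cols
                                and mask[rr][cc] and not seen[rr][cc]):
                            seen[rr][cc] = True
                            queue.append((rr, cc))
            if len(component) > len(best):
                best = component
    out = [[0] * cols for _ in range(rows)]
    for r, c in best:
        out[r][c] = 1
    return out


# Toy output: a diagonal "needle" plus one isolated false positive at (1, 4).
toy_mask = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
filtered = keep_largest_component(toy_mask)
```

In the toy example the isolated pixel is removed while the four needle pixels are preserved, which mirrors the single-needle assumption stated above.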

Comparison method
As a further evaluation step considering the processing of multiple PA frames from video sequences during dynamic needle insertions, the performance of the trained neural network on needle identification was compared with the standard Hough Transform (SHT), which is a classical baseline method for line detection [63]. The SHT is designed to identify straight lines in images. It employs the parametric representation of a straight line, which is also called the Hesse normal form and can be expressed as:

$$\rho = x\cos\theta + y\sin\theta$$

where $\rho$ is the shortest distance from the origin to the line, and $\theta$ measures the angle between the $x$-axis and the perpendicular projection from the origin point to the line. Therefore, a straight line can be associated with a pair of parameters $(\rho, \theta)$, that is, a single point in Hough space, whereas each image point corresponds to a sinusoidal curve in Hough space. A few points on the same straight line will produce a set of sinusoidal curves that cross the same point $(\rho, \theta)$, which exactly represents that line. In this study, the SHT was implemented with a two-dimensional accumulator matrix whose columns and rows were indexed by the two line parameters ($\rho$ and $\theta$). For each point in the image, $\rho$ was calculated for each $\theta$, leading to an increment of the corresponding bin in the matrix. Finally, the potential straight lines in the image were extracted by selecting the local maxima from the accumulator matrix.
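The voting procedure of the SHT can be illustrated with a small accumulator. This is a minimal sketch, not the implementation used in the study: the 1 degree angular resolution, the 1 pixel resolution in the distance parameter, the accumulator layout (rows for distance, columns for angle) and the function names are all assumptions made for the example.

```python
import math


def hough_accumulator(points, width, height, n_theta=180):
    """Vote in (rho, theta) space for a list of (x, y) edge points."""
    diag = int(math.ceil(math.hypot(width, height)))  # largest possible |rho|
    thetas = [math.pi * k / n_theta for k in range(n_theta)]
    # Rows index rho in [-diag, diag]; columns index theta in [0, pi).
    acc = [[0] * n_theta for _ in range(2 * diag + 1)]
    for x, y in points:
        for k, theta in enumerate(thetas):
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[rho + diag][k] += 1
    return acc, thetas, diag


def strongest_line(acc, thetas, diag):
    """Return (rho, theta, votes) for the accumulator bin with the most votes."""
    votes, r, k = max((v, r, k)
                      for r, row in enumerate(acc)
                      for k, v in enumerate(row))
    return r - diag, thetas[k], votes


# Toy example: 20 points on the line y = x, which passes through the origin.
line_points = [(i, i) for i in range(20)]
acc, thetas, diag = hough_accumulator(line_points, width=32, height=32)
rho, theta, votes = strongest_line(acc, thetas, diag)
print(rho, votes)  # 0 20
```

For this line the normal direction is at 135 degrees, so the peak lies at a distance of 0 and within about 1 degree of 135 degrees; neighbouring angle bins can tie because of the coarse quantisation. This sensitivity to bin resolution is one reason why the hyperparameters need tuning in practice.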

Evaluation protocol and performance metrics
The needles in the acquired PA images from three different media (phantoms, pork joint tissue ex vivo, and human fingers) were manually labelled as line segments by an experienced observer. Each needle line segment was generated by connecting two points on the needle: the needle tip and the point on the needle shaft farthest from the tip that was visualised in a PA image. The needle tip had a good contrast on US images when it was surrounded by water in the liquid-based tissue-mimicking phantoms, but had poor visibility for ex vivo and in vivo measurements. Therefore, solid glass spheres (0-63 μm, Boud Minerals Limited, UK) were injected through the needle after being diluted with water to enhance the contrast of the tip, thus improving the accuracy and precision of manual labelling. For each medium, 20 representative measurements with different backgrounds and needle locations were used for metrics calculation.
To assess the accuracy of needle extraction using the proposed DL-based approach, a metric called the modified Hausdorff distance (MHD) was employed [64]. The MHD was adapted from the generalised Hausdorff distances proposed for object matching with improved discriminatory power and greater robustness to outliers. Consider two point sets $\mathcal{A} = \{a_1, \dots, a_{N_a}\}$ and $\mathcal{B} = \{b_1, \dots, b_{N_b}\}$. The distance between a point $a$ from $\mathcal{A}$ and the set of points $\mathcal{B}$ was defined as $d(a, \mathcal{B}) = \min_{b \in \mathcal{B}} \|a - b\|$. The directed distance measure $d(\mathcal{A}, \mathcal{B})$ was defined as:

$$d(\mathcal{A}, \mathcal{B}) = \frac{1}{N_a} \sum_{a \in \mathcal{A}} d(a, \mathcal{B})$$

Then, the directed distance measures $d(\mathcal{A}, \mathcal{B})$ and $d(\mathcal{B}, \mathcal{A})$ were combined in the following way, resulting in the definition of the MHD as:

$$\mathrm{MHD}(\mathcal{A}, \mathcal{B}) = \max\left(d(\mathcal{A}, \mathcal{B}),\ d(\mathcal{B}, \mathcal{A})\right)$$

The MHD was expressed in units of pixels, which could be converted to real distance using the pixel size of 70 μm per pixel. Signal-to-noise ratio (SNR) was also used to assess the performance of the needle enhancement and was defined as $\mathrm{SNR} = S / \sigma$, where $S$ is the mean amplitude over the needle region and $\sigma$ is the standard deviation of the background. The mean amplitude of the needle region was calculated by taking the average of the pixel values over the line segment. The background was defined as one of the largest rectangular regions that excluded the needle pixels.
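The two metrics can be written compactly as follows. This is a minimal sketch rather than the evaluation code used in the study: the function names are hypothetical, point sets are given as lists of (x, y) pixel coordinates, and the use of the population standard deviation for the background is an assumption, since the text does not say which estimator was used.

```python
import math
from statistics import mean, pstdev


def directed_distance(set_a, set_b):
    """Mean over points in set_a of the distance to the nearest point in set_b."""
    return sum(min(math.dist(a, b) for b in set_b) for a in set_a) / len(set_a)


def modified_hausdorff(set_a, set_b):
    """Modified Hausdorff distance: the larger of the two directed distances."""
    return max(directed_distance(set_a, set_b), directed_distance(set_b, set_a))


def snr(needle_values, background_values):
    """Mean needle amplitude divided by the background standard deviation."""
    return mean(needle_values) / pstdev(background_values)


# Two parallel three-point segments, one pixel apart.
labelled = [(0, 0), (1, 0), (2, 0)]
detected = [(0, 1), (1, 1), (2, 1)]
mhd_pixels = modified_hausdorff(labelled, detected)
mhd_mm = mhd_pixels * 0.070  # 70 um per pixel

# An outlier far from the reference raises only one directed distance.
mhd_outlier = modified_hausdorff([(0, 0)], [(0, 0), (3, 4)])

toy_snr = snr([10.0, 10.0], [1.0, 3.0])
print(mhd_pixels, mhd_outlier, toy_snr)  # 1.0 2.5 10.0
```

The outlier example shows why the MHD grows when false positives remain in the output, and why removing them in post-processing lowers the metric.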

Blood-vessel-mimicking phantoms
The results of imaging on blood-vessel-mimicking phantoms are shown in Fig. 3. Compared to conventional reconstructions, noticeable improvements in terms of removing background noise and artefacts can be observed in the outputs of the U-Net. The proposed model successfully identified the needle insertion without being perturbed by the background vessels that shared similar line-shape features. The false positives in the U-Net enhanced images were further suppressed by the post-processing. Besides, the proposed model demonstrated robustness to noise and strong artefacts (e.g., Fig. 3(d)). The composite images indicated that the proposed model was able to detect the needle insertion with good correspondence to the conventional reconstructions and the US images.
The performance of the proposed model was quantified with SNR and MHD (Table 1). Compared to the conventional reconstruction, the proposed model achieved a significant improvement in SNR by a factor of 8.3 (p < .0001). The MHD had large values as the noise level and artefacts increased. The proposed U-Net led to an initial 2.4 times decrease in MHD (from 63.2 ± 15.9 to 26.4 ± 23.3) and was further optimised by the successive post-processing to achieve the smallest MHD of 1.4 ± 1.3.

Pork joint tissue ex vivo
The proposed model was also applied to images acquired from needle insertions into ex vivo tissue with different insertion depths and angles, and significantly suppressed image artefacts and background noise. It also achieved a 4.8 times improvement in SNR compared to the conventional reconstruction (p < .00001) (Table 2). The MHD substantially decreased from 28.7 ± 16.3 to 4.5 ± 7.0 after post-processing the output of the U-Net (p < .00001).
It is worth noting that the performance of the proposed model was slightly degraded when visualising the region near the needle tip, which could be attributed to the large depths at around 2.5 cm (Fig. 4(b) and (d)). However, within smaller imaging depths, the proposed model was still effective for increasing the imaging speed by reducing the number of averages required to visualise the needle with a high SNR (see Supplementary Materials; Figure S4).

Table 1
Quantitative evaluation of the trained neural network using blood-vessel-mimicking phantoms. These performance metrics are expressed as mean ± standard deviations from 20 measurements acquired from different phantoms and needle positions.

Table 2
Quantitative evaluation of the trained neural network using ex vivo needle images. These performance metrics are expressed as mean ± standard deviations from 20 measurements acquired from different spatial locations of the ex vivo tissue and needle positions.

In vivo imaging
Fig. 5 shows the results of PA imaging with a 20G needle inserted between two fingers of a human volunteer immersed in a water tank (needle outside of the fingers). A main digital artery, which has a two-layered feature, is apparent on PA images (marked by hollow triangle wide arrows). A 22 s video consisting of 128 frames was saved during the needle insertions (see Supplementary Materials; Video S2). Fig. 5 shows four frames acquired at different time points with the conventional reconstructions and the overlays of PA and US images after the U-Net enhancement and SHT. The SHT was able to detect the needle, but resulted in excessive false positives. This is because the performance of the SHT is sensitive to the choice of hyperparameters, such as the detectable length of line segments and the searching resolution of $\rho$ and $\theta$. Fine-tuning of the hyperparameters based on observations of the needle on a frame-by-frame basis is not trivial. In comparison, the proposed model demonstrated good robustness and generalisation. The in vivo results demonstrated that it was insensitive to constantly changing lengths and angles of the needle, and to images with excessive noise and artefacts.
Quantitative results are summarised in Table 3. An average of 5.8 times improvement in SNR was observed in the U-Net enhancement versus conventional reconstruction at different time points during the insertion (p < .00001). For MHD, the proposed model outperformed the conventional reconstruction and the SHT with a 4.5- and 2.9-fold reduction, respectively (p < .00001 for the conventional reconstruction; p < .0001 for the SHT). The post-processing algorithm effectively suppressed the outliers in the output of the U-Net, leading to an MHD as small as 0.6.
To further evaluate the model performance, the needle detection rate was measured with three in vivo PA sequences (Table 4; Supplementary Materials; Video S3). For each sequence containing 128 frames, the number of frames with an identifiable needle was counted by the observer as the reference for calculation. It is worth noting that the proposed model enhanced almost all the frames that contained the needle, with true positive rates of 100%, 90.6%, and 97.0% for the three sequences, respectively.

Impact of the needle diameter
To further assess the generalisability of the proposed model, in-plane insertions of needles with different diameters (16G, 18G, 20G, BD, USA; 25G, 30G, Meso-relle, Italy) into pork joint tissue ex vivo were imaged with AcousticX. The results (see Supplementary Materials; Figure S5) were consistent with the previous results of the 20G needle. For the needles with small diameters, the acquired PA images suffered from lower SNRs. However, as expected, the proposed model yielded robust enhancement of the needles at different contrast levels. For SNR and MHD, the proposed model outperformed the conventional reconstruction with an average of 11.2 and 6.5 times improvement, respectively (see Supplementary Materials; Table S1).

Discussion
Previous works on DL in PA imaging mainly focused on improving the visualisation of the vasculature by denoising and artefacts removal. However, the performances of DL networks are highly dependent on the training dataset. Networks that are specifically trained to enhance the visualisation of vasculature usually have poor performance on visualising needles due to their different image features. In this work, we are the first to apply DL to specifically improve the needle visualisation with PA imaging for minimally invasive guidance. Considering the relatively simple geometries of clinical needles compared to vasculature, a prominent contribution of this work is that we developed a semi-synthetic approach to address the challenges associated with obtaining ground truth for in vivo data as well as the poor realism of purely simulated data.
According to our experimental results (not shown for the sake of brevity), the simulated optical fluence distribution had a minimal effect on the performance of the proposed model when evaluated on unseen real needle images, even with steep insertion angles. This is because the DL-based method was able to enhance the visualisation of the needle by learning its relatively simple spatial features, which remained largely consistent. In addition, we found that the inference performance of the trained model on the real images did not benefit from input data with a higher resolution than that currently used (128 × 128 pixels; see Supplementary Materials; Figure S7). The low resolution images performed sufficiently well considering the lightweight model and the simple features of the input data. Compared to a high resolution input, a low resolution input is also advantageous in terms of the computational costs; the inference time for one image was around 90 ms using one GPU (NVIDIA GeForce RTX 2070 Max-Q). Further reduction of the inference time for real-time applications could be realised by using more powerful GPUs.
During minimally invasive procedures, accurate and clear visualisation of the needle is essential for successful outcomes. Needle visibility has been greatly improved by PA imaging as compared to US imaging, but the image quality in terms of SNR with the LED light source is still sub-optimal due to the low pulse energy. Frame averaging is effective for reducing background noise, but it reduces the imaging speed and introduces movement artefacts. Further, blood vessels in the background with line-shaped structures similar to the needle can act as visual distractions when clinicians identify the needle trajectory. Finally, line artefacts above or beneath the needle shaft are often non-negligible and can lead to misinterpretation of the needle position. Therefore, in this work, PA images of needle insertions into different types of blood-vessel-mimicking phantoms, ex vivo tissue, and in vivo human fingers were acquired to evaluate the proposed model.
Qualitative results demonstrated that our proposed model was able to achieve substantial enhancement of the needle visualisation regarding noise suppression, artefacts removal, and needle detection. The enhancement was further quantified by the SNR and MHD. The performance of the proposed model was compared to the conventional reconstruction and the standard Hough transform on images acquired from blood-vessel-mimicking phantoms, ex vivo pork joint tissue, and human fingers, as shown in Supplementary Materials (Figure S1; Figure S2; Figure S3). For SNR, our proposed model achieved 8.3, 4.8, and 5.8 times enhancement for phantom, ex vivo, and in vivo data, respectively. The MHD, as a measure of the similarity of two objects, was employed for its robustness and discriminatory power. It was observed that the MHD had the smallest values with our proposed model compared to the conventional reconstruction and the SHT (1.4, 4.5, and 0.6 for phantom, ex vivo, and in vivo data, respectively). Additionally, the post-processing method based on maximum contour selection was shown to be effective in removing the false positives of the U-Net enhancement while preserving the needle pixels.
We also compared the proposed model with a conventional line detection algorithm, SHT, with in vivo video sequences. The SHT performed well in some cases with carefully chosen hyperparameters, but its performance was readily affected by errors from the preceding edge detection step and was sensitive to decision criteria such as the empirical values of $\rho$ and $\theta$ that are directly related to the detection efficiency. Fine-tuning of these hyperparameters is impractical for real-time applications where the effective length and angle of the needle placements could constantly vary in each frame. In contrast, our proposed model can efficiently improve the needle visualisation on a variety of PA images from in vivo measurements in near real-time.
Nonetheless, the DL-based enhancement was sensitive to the SNRs of the images. More importantly, the visibility of the needle, especially its tip, was still limited to a depth of around 1 cm with in vivo measurements. In the future, deep neural networks could be applied for real-time denoising [44] as an alternative to frame-to-frame averaging to improve the imaging depth. For needle tip visualisation, a fibre-optic US transmitter could be integrated within the needle cannula so that the needle tip can be unambiguously visualised in PA imaging with high SNRs [10]. Light-absorbing coatings based on elastomeric nanocomposites could also be applied to the needle shaft for enhancing its visualisation for guiding minimally invasive procedures [65].

Conclusions
In this work, we provided a DL-based framework for enhancing needle visualisation with PA imaging. The DL model was built using only semi-synthetic data generated by combining simulated data and in vivo measurements. Evaluation was performed on unseen real data acquired by inserting needles into blood-vessel-mimicking phantoms, ex vivo tissue and human fingers (needle outside tissue). Compared to the conventional reconstruction, the proposed framework substantially improved the needle visualisation with PA imaging. It also outperformed the standard Hough Transform on PA in vivo videos with improved robustness and generalisability. Therefore, our framework could be useful for guiding minimally invasive procedures that involve percutaneous needle insertions by accurate identification of clinical needles.

Declaration of competing interest
One or more of the authors of this paper have disclosed potential or pertinent conflicts of interest, which may include receipt of payment, either direct or indirect, institutional support, or association with an entity in the biomedical field which may be perceived to have potential conflict of interest with this work.
Dr. Sim West is a consultant anaesthetist at UCLH. He graduated from Sheffield in 2000, and completed his training in anaesthesia in North London, spending 2012 as the Smiths Medical Innovation Fellow. He was appointed to UCLH in 2013 and is lead for regional anaesthesia and the orthopaedic hub. His research interests include improving visualisation of needles, catheters and nerves.
Dr. Adrien Desjardins is a Professor in the Department of Medical Physics and Biomedical Engineering at the University College London, where he leads the Interventional Devices Group. His research interests are centred on the development of new imaging and sensing modalities to guide minimally invasive medical procedures. He has a particular interest in the application of photoacoustic imaging and optical ultrasound to guide interventional devices for diagnosis and therapy.