Performance evaluation of mesoscopic photoacoustic imaging

Photoacoustic mesoscopy visualises vascular architecture at high-resolution up to ~3 mm depth. Despite promise in preclinical and clinical imaging studies, with applications in oncology and dermatology, the accuracy and precision of photoacoustic mesoscopy is not well established. Here, we evaluate a commercial photoacoustic mesoscopy system for imaging vascular structures. Typical artefact types are first highlighted and limitations due to non-isotropic illumination and detection are evaluated with respect to rotation, angularity, and depth of the target. Then, using tailored phantoms and mouse models, we investigate system precision, showing coefficients of variation (COV) between repeated scans [short term (1 h): COV= 1.2%; long term (25 days): COV= 9.6%], from target repositioning (without: COV=1.2%, with: COV=4.1%), or from varying in vivo user experience (experienced: COV=15.9%, unexperienced: COV=20.2%). Our findings show robustness of the technique, but also underscore general challenges of limited-view photoacoustic systems in accurately imaging vessel-like structures, thereby guiding users when interpreting biologically-relevant information.


Introduction
Extending the range of macroscopic and microscopic photoacoustic imaging (PAI) approaches, raster-scanning photoacoustic mesoscopy has been recently introduced [1] for visualization of tissue structures up to 2-3 mm depth [2,3]. The innovative nature of mesoscopy lies in combining wide-bandwidth, high-frequency acoustic detectors with fast nanosecond-pulsed laser excitation, enabling three-dimensional imaging of optically absorbing molecules such as haemoglobin and melanin [4,5] at the mesoscopic scale (in-plane spatial resolution: ~20 µm) [2,3]. In the preclinical setting, photoacoustic mesoscopy has already been demonstrated for: longitudinal in vivo studies in several mouse models of cancer [6,7]; whole body imaging of zebrafish using a 360 • multi-orientation approach [8]; and gastrointestinal imaging [9]. In clinical studies, it has shown particular potential in skin imaging, revealing individual skin layers, benign nevi [10,11], hyperthermia effects [12], or pathophysiological biomarkers of inflammatory skin diseases [2,13]. The technique belongs to the group of acoustic resolution PA systems [14] which generate images with acoustic lateral resolution (tens to hundreds of microns) at relatively high penetration depth using wide-beam optical illumination and tight acoustic focusing. This contrasts optical resolution PAI systems which achieve optical lateral resolution (micron to submicron scale) at superficial depths using tight optical focusing.
Volumetric visualization of vasculature and skin layers is one of the key strengths and main applications of photoacoustic mesoscopy, however, the accuracy and precision for delineating targets of interest, such as vessel structures, has yet to be comprehensively evaluated. A well-known limitation of practical PAI systems lies in the limited angular coverage of their illumination and detection arrays, enabling only visualization of targets that are quasi-perpendicular to the direction of the transducer array [15]. Moreover, the non-isotropic illumination from the sides of the ultrasound detector array results in inefficient light delivery to the region of interest [16], thereby leading to limited penetration depth, particularly in specimens that exhibit high absorption. Finally, the limited view can cause in-plane or out-of-plane artefacts, reducing the clarity of the images [17][18][19][20]. In addition to these technical limitations impacting accuracy, precision-related variations in image data arise from temporal, positioning, or operator-dependent factors. Detailed precision studies have been performed for other commercially available PAI systems, including a preclinical tomography system [21] and a handheld clinical system [22,23] which have established relevant bounds on the reliability and biological relevance of the extracted photoacoustic data. Such studies are crucial to ensure data reproducibility and accurate data interpretation as the application of mesoscopic PAI in preclinical and clinical research expands.
Here, we conduct a detailed technical validation of a commercial photoacoustic mesoscopy instrument to assess both accuracy and precision of the system and use our findings to identify common artefacts and suggest approaches to maximize image quality. Imaging limitations of photoacoustic mesoscopy are evaluated by characterising common artefact types and systematically analysing signal variations resulting from the restricted angular coverage of the illumination and detection array. Precision is then assessed both using tailored phantoms and in vivo using mouse models to account for system variation arising over time, target repositioning and user experience. We conclude by outlining recommendations for data acquisition. By presenting this validation study for photoacoustic mesoscopy systems, we hope to guide users with similar PAI setups, assisting them in system validation, data handling and data interpretation.

Mesoscopic photoacoustic image acquisition
The photoacoustic mesoscopy system (RSOM Explorer P50, iThera Medical GmbH, München, Germany; Fig. 1A,B) has been described in detail elsewhere [24]. Briefly, laser light is generated by a 532-nm laser (pulses: 1 ns; ≤1 mJ/pulse) and delivered through a customized 2-arm fibre bundle (spot size: 3.5 ×5 mm). Photoacoustic signals are detected by a spherically focused LiNbO 3 detector (centre frequency: 50 MHz; bandwidth: 10-90 MHz; focal diameter: 3 mm; focal distance: 3 mm; f number: 1). The recorded data is amplified by a low noise amplifier of 63 dB gain. The scanning head is attached to two motorized stages and coupled to the sample surface by an interchangeable water-filled (2 mL) interface. For coupling of the object to the lower side of the interface, commercial ultrasound-gel (Aquasonic Clear, Parker Laboratories, Fairfield, NJ, USA) was used. The ultrasound gel was centrifuged to remove air bubbles and warmed before application. The interface was positioned on the object by moving the stage in x, y and z directions. Images were acquired over a field of view of ≤ 12 × 12 mm (step size, 20 µm). The acquisition of one image took approximately 7 min For phantom imaging, the phantom was placed underneath the transducer and aligned to the transducer direction in the position of interest. About 10 mL of ultrasound gel was used to couple the transducer surface to the phantom medium. Care was taken for the transducer interface not to touch the phantom surface.
For in vivo imaging, temporal variability with and without replacement was assessed over a time frame of 35 mins (n = 5 images) with an n = 5 per sample group. 'With replacement' is defined as full removal of the mouse from the heat pad including cleaning off the coupling ultrasound gel. For evaluation of the impact of operator experience, an operator with > 50 imaging runs experience in preclinical photoacoustic mesoscopy imaging was regarded as experienced, and an operator with < 20 imaging runs experience in preclinical photoacoustic mesoscopy imaging was regarded as inexperienced. An imaging run is defined as one full acquisition of one image, including independent positioning of the target within the system.

Base material
For systematic evaluation of artefacts using geometric phantoms, agar was chosen as the bulk phantom material to provide structural support for imaging targets due to its facile and fast preparation method. A 1.5% tissue-mimicking [25] agar mixture was prepared according to the protocol by Joseph et al. [21]. Briefly, intralipid (2.08 v/v%; Merck, 68890-65-3) was used to mimic tissue-like scattering conditions and Nigrosin (0.62 v/v% of Nigrosin stock solution [0.5 mg/mL Nigrosin in deionised water, Merck, 8005-03-6]) was added to mimic tissue-like absorption.
For evaluation of long-term system precision, copolymer-in-oil base material was chosen [26]. The material was prepared according to Hacker et al. [27]. Briefly, the phantom was composed of 30 w/v% high molecular weight SEBS (Sigma Aldrich 200557-250 G) and 8 w/v% LDPE (Alfa Aesar 43949.30) in mineral oil (Sigma Aldrich-330779-1 L), with 0.03 w/v% TiO 2 (Sigma Aldrich 232033-100 g) added to provide optical scattering and 0.0007 w/v% Nigrosin (Sigma Aldrich 211680-100 G) added to provide optical absorption. The acoustic properties were characterised using a through transmission substitution system (available at NPL, London, UK) [27], yielding a speed of sound of 1485.29 ± 0.19 m⋅s − 1 and an acoustic attenuation of 20.72 ± 0.08 dB⋅cm − 1 at 10 MHz.
The optical properties were determined using a custom-built doubleintegrating sphere (DIS) system [27], yielding a reduced scattering coefficient of 0.4 mm − 1 and an optical absorption coefficient of 0.01 mm − 1 at 532 nm. DIS system setup and measurement procedure are described elsewhere [27].

Phantom mould and inclusions
For phantom studies, a versatile, modular phantom mould was created by 3D-printing (Fig. 2). Phantom moulds were designed in Autodesk Fusion 360 (San Rafael, CA, USA) and printed using an Anet A6 Printer and polylactic acid (PLA PRO 1.75 mm PLA 3D Printer Filament 832-0232 (Yellow) and 832-0223 (White), RS Components, Corby, UK) as a base material. The phantom mould consists of two  Fig. 2A). The inner module acts as the actual testing module and allows the measurement of specific testing parameters. The outer module functions as a frame for the inner module and prevents leaking of the bulk material and enables a firm positioning of the phantom. Indentations in the corners of the outer module allow insertion of structural support blocks to control the distance between detector and phantom surface. Customized handles can be inserted on diagonal sides of the inner module to facilitate the insertion and removal of the inner module from the outer module. For this work, two inner modules have been developed. The first inner module, the string module (Fig. 2B), allows for the examination of basic technical parameters such as the sensitivity of the system to angularity or penetration depth through insertion of targets at different depths and angles. The second module, the tubing module (Fig. 2C), permits the analysis of liquid contrast agents by including tubing (Fine Bore Polythene Tubing 0.58 mm inner diameter, 0.96 mm outer diameter, Portex) at different depths and directions.
For evaluation of imaging artefacts, a dilution series of red ink (Cranfield Colours, Wales, UK) was used in the tubing module. For geometric sensitivity studies, the string module was deployed with redcoloured synthetic fibres (Smilco, Houston, TX, USA). These were chosen as imaging targets for the string module due to their similar size [28,29] to murine vessels and high absorption at 532 nm.

Animal handling
Procedures on small animals were performed under the authority of project (PE12C2B96) and personal (IA70F0365, I544913B4) licenses issued by the Home Office, UK. Studies were approved by the Cancer Research UK Cambridge Institute local animal welfare and ethical review bodies under compliance form numbers: CFSB2112, CFSB1567 and CFSB1745. All mice were housed in Tecniplast Green Line individually ventilated cages with APB6 bedding on a 12-h on/off light/dark cycle (7AM to 7PM) with 5R58 diet (PicoLab).
For in vivo imaging, mice were anaesthetised using 3% isoflurane delivered in 50% oxygen and 50% medical air. A mix of 50% oxygen and 50% medical air was used to provide the mice with a more realistic mix of gasses. The mix is also essential to keep the mice in a steady state if gasses are alternated between oxygen and air in oxygen-enhanced (OE) PAI [31]. If needed, the hair was removed around the area to be imaged by shaving and application of commercial hair removal cream. Mice were placed on a heat-pad maintained at 37 • C inside the system chamber. Respiratory rate was maintained between 70 and 80 bpm using isoflurane (~1.5-2% concentration) throughout image acquisition.

Image and statistical analysis
Imaging data was reconstructed using a beam-forming algorithm, which models the sensitivity field of the focused detector and generates 3-dimensional images [32][33][34]. The reconstructed images were analysed using MATLAB (v2020, MathWorks, Natick, MA, USA) and Fiji (v2.1.0) [35]. Statistical and regression analysis was performed using Prism (v9, Graphpad Software, San Diego, CA, USA). All data are shown as mean ± standard deviation (SD) unless otherwise stated. Coefficients of variation (COV) were calculated as the ratio of the SD to the mean, expressed  as percentage. For multiple comparisons, one-way ANOVA followed by Tukey's test was conducted. Correlation analysis between dorsal and lateral imaging positions was conducted using Spearman's correlation coefficient due to non-normal data distribution.

Phantoms
For phantom image analysis, a fixed-sized rectangular region of interest spanning the length of the string was placed around each string within the image. For each string, the line profiles perpendicular to the string were extracted within the region of interest. The minimum of each line profile was subtracted to account for the varying background signal intensity. A gaussian curve was fitted to the signal. The full width at half maximum (FWHM) and mean intensity value (taken as the half maximum of the signal intensity peak) were extracted from the fit for each line profile. Subsequently, the mean and SD of all FWHMs and signal intensities of all line profiles were calculated to achieve final values for each string.

In vivo
For the in vivo repeatability studies, the blood volume was chosen as a comparison metric, as it is a commonly used variable of interest in preclinical studies, and sometimes used as a precursor for further downstream analysis of morphological features [36]. An existing pipeline [36] was used to quantify the results. Briefly, the image data was filtered [37] in the Fourier domain in the XY plane to remove reflection lines using an respective algorithm available in the viewRSOM software (v2.3.5.2 iThera Medical GmbH, Germany). Reflection lines appear as periodically repeated straight horizontal signals that are caused by back-reflection of the laser on the ultrasound transducer surface, creating a piezoelectric signal that is reflected back and forth within the acoustic lens [37]. Details on the algorithm can be found elsewhere [37]. Afterwards, the data was reconstructed using a back projection algorithm with motion correction (voxel size of 20 × 20 × 4 µm 3 (X,Y,Z)) provided in viewRSOM software. Reconstructed images were subjected to a high-pass filter [21] to remove echo noise arising from intrinsic tissue structures, followed by a Wiener filter to remove stochastic noise. Afterwards, a slice-wise rolling ball background correction [38] was performed to achieve a homogenous background intensity. Segmentation was performed using a random forest classifier (ilastik v1.3.3 [39]). For the classifier, n = 20 in vivo images were used for training and n = 14 in vivo images were used for testing. Then, all segmented images were passed through a 3D median filter to smooth and remove impulse noises. The blood volume was quantified as the number of segmented voxels multiplied by the voxel size of 20 × 20 × 4 µm 3 (X,Y,Z).

Phantom studies illustrate illumination, shadow and reflection artefacts in photoacoustic mesoscopy
To initiate our characterisation of photoacoustic mesoscopy performance, we first sought to investigate dominant artefacts present in the images. Artefacts emerging from absorbing vascular structures can degrade image contrast and confound interpretation. While clutter artefacts [40][41][42] can be observed in the image background, affecting the overall imaging signal-to-noise ratio, three main classes of artefact can be seen to directly affect vessel analysis (Fig. 3).
First, illumination artefacts arise due to the limited top illumination of the field of view, leading to optical excitation and acoustic wave generation only in the upper part of absorbing structures that face the illumination and transducer array (Fig. 3 A). This results in inaccurate diameter estimations in the XZ plane compared to the XY plane [36]. For example, in the string phantom module, where the true diameter of each string is d act = 126 µm, the measurements from the reconstructed images agree only in the XY plane (d xy =122 ± 7.8 µm) and substantially underestimate the value in the XZ plane (d xz =27 ± 3.2 µm).
Second, shadow artefacts arise from obscuring objects, causing a signal loss in underlying objects due to strong optical or acoustic attenuation of the overlaying structure (Fig. 3B). Third, reflection artefacts occur due to presence of acoustic reflectors/scatterers or strong acoustic reverberations of the absorbing object itself [18,43,44], leading to a signal echo near the object of interest (Fig. 3 C). They can occur in-plane or out-of-plane depending on the position of the absorber/reflector [45]. Awareness of these artefacts is of particular importance as illumination artefacts can lead to inaccuracies in quantification of vessel size, but more importantly, shadow and reflection artefacts can be mistaken for real structures in the image plane.

The sensitivity of photoacoustic mesoscopy is orientation-dependent
We sought to systematically analyse how the geometric positioning of a vessel-like target in three-dimensional (3D) space affects the acquired signal. First, the impact of different target (string) depths on signal intensity and measured spatial dimensions was evaluated (Fig. 4 A,B). As expected, a significant signal loss occurred with increasing string depth (Fig. 4 C). However, depth did not impact the quantified FWHM in XY direction (p = 0.912; Fig. 4D) aligning with previous observations [46]. Furthermore, the impact of vertical and horizontal target rotation on the acquired signal was tested. For testing of the horizontal rotation, a phantom with strings in a star-shaped pattern (Fig. 4E,F) was imaged and the signal from each angled string quantified. To minimize inaccuracies in the target depth arising from the experimental preparation of stacking the strings, the phantom was rotated by 90 • between image acquisitions. The angle of the string in relation to the direction of illumination was found to significantly impact the quantified mean PAI signal (Fig. 4 G), with the string in line with the two illumination fibres having the highest signal intensity. Encouragingly, there was no significant difference in the calculated FWHM between 90 • and 0 • or in the strings between the rotation steps (Fig. 4H). For the vertical angles, a clear signal decay was found with increasing angle to the phantom surface (Fig. 4 I-K). At 24 • , the string could no longer be detected in the tissue-mimicking phantoms (Fig. 4 K). The size of the FWHM quantified from the MIPs remained stable for angles up to 20 • , but beyond that it increased in value and variability due to the lower signal (Fig. 4 L). These results highlight depth-and angle-related limitations of the system resulting from the limited angular coverage of the illumination and transducer array, affecting signal detection and quantification.

Photoacoustic mesoscopy shows high precision for repeated measurement in phantoms
The signal stability was assessed through photoacoustic mesoscopy phantom measurements over time (Table 2, Fig. 5). A mean COV of 1.2 ± 0.7% (Fig. 5A) was determined for repeated measurements in a single imaging session without repositioning of the phantom. With repositioning (removal from the system and immediate replacement), a slightly higher COV of 4.1 ± 2.4% was found (Fig. 5B) for short-term studies (without switching off the system in between each runs), which increased to 9.6% for long-term studies (25 days, with switching off the system between each run). No slopes were significantly non-zero (without repositioning: p = 0.1551; with repositioning [short term]: p = 0.2858, with repositioning [long term]: p = 0.3955). The COVs were similar across different depths (Fig. 5A,B), suggesting an excellent technical longitudinal repeatability for the system.

Photoacoustic mesoscopy shows greater variation during in vivo application
The temporal stability of photoacoustic mesoscopy measurements was further evaluated in vivo, where additional sources of variation can arise (Fig. 6 A). Contributions from motion were minimised by appropriate positioning of the animals within the mouse bed and by application of a motion correction algorithm during image reconstruction. Optimal image acquisition in photoacoustic mesoscopy depends on a variety of factors, such as (1) adequate coupling of the of the tissue of interest to the transducer interface, (2) intactness of the coupling foil, and (3) purity of the coupling medium. Importantly, compression of the transducer interface on the target tissue must be avoided as it can cause signal drop out by impeding blood flow (Fig. 6B,C).
The impact of the relative mouse positioning on the quantified blood volume was assessed for tumour imaging (Fig. 7 C). In our studies, the tumour position on the flank (a common location for preclinical oncology) allows for several different methods of animal positioning to be used, which will lead to different sub-volumes of the tumour being captured by the imaging system. A significant correlation (r = 0.76, p < 0.0001) in the quantified blood volume between dorsal and lateral positioning of the mouse was found (Fig. 7D). For the remainder of the studies, the leg position was used. Here, the influence of operator experience was also evaluated. With repositioning of the mouse (full removal of the mouse from the heat pad including cleaning off the ultrasound gel), mean COVs of 15.9 ± 6.3% and 20.2 ± 9.9% for the blood volume were found for an experienced (preclinical RSOM imaging experience > 50 imaging runs) and inexperienced (no significant preclinical RSOM imaging experience) user, respectively (Fig. 6D, Table 2). Taken together, our in vivo results suggest that photoacoustic mesoscopy provides a robust quantification of vascular data once users are experienced with the system.

Discussion
Photoacoustic mesoscopy has shown great potential to become a widely used tool in biomedical research and clinical healthcare, but technical validation is required to ensure reliable data interpretation, reproducibility of data acquisition, and comparability between different studies. Using custom phantoms and in vivo analyses, we systematically analyse the impact of different variation sources on precision and accuracy of photoacoustic mesoscopy. We conducted our studies representatively on a commercially available PA mesoscopy system, but the results may be extended to other acoustic resolution PAI systems with similar device configurations [14].
First, various artefact types that commonly appear in photoacoustic mesoscopy studies were exemplified and quantified. Accurate representation of vascular lumens is fundamentally limited in the photoacoustic mesoscopy geometry: since vessels are only illuminated from the top, images do not depict the full vessel volume (illumination artefact); signal loss also occurs for high angles of vessels relative to the detector, at increasing depth, or due to the presence of overlying structures (shadow artefact). The limited detection bandwidth of 10-90 MHz of the transducer translates to accurate representation of structures that are sized 12-120 µm [47]. Furthermore, reflection artefacts can occur that may be mistaken for real vessel structures. Several elaborate post-processing algorithms [36] have been suggested in the literature to minimize the impact of these artefact types. For reflection artefacts, signal processing methods include techniques based on singular value decomposition [48], short-lag spatial coherence [49][50][51], wavelet analysis [52] or convolutional neural networks [44]. Specialized techniques to reduce the occurrence of reflection artefacts have also been proposed employing e.g., multi-wavelength excitation [18], tissue deformation compensated averaging (DCA) [17], localized vibration tagging (LOVIT) [20,41], or ultrasound (either focused [43] or plane waves [40]) to mimic wavefields from photoacoustic sources for identification of reflections. Shadow and illumination artefact have also been targeted with post-processing algorithms, particular deep learning methods [53][54][55][56][57], or with other specialized techniques, e.g., based on spatial/temporal modulation of the optical absorption of the sample [58,59]. General applicability of the proposed methods may be limited due to differences system configurations or limiting assumptions of the Table 2 Coefficients of variation (COV) for temporal stability of the photoacoustic mesoscopy system. n refers to number of imaged mice/phantoms. PDX = Patient-derived xenograft. 0.04 ± 0.05 5 algorithms [52], but may still offer a way to improve image quality in particular contexts. Practical advices to maximise the information from an imaging object include rotating the object relatively to the probe [60] or vice versa [61]. Nonetheless, as the full range of photoacoustic mesoscopy artefacts cannot yet be automatically detected or corrected, understanding the resulting limitations on the acquired data is of utmost importance to prevent misinterpretation. Second, limitations of the system were analysed that arise from three geometric factors significantly affect the measured signal intensity: (1) depth, and (2) horizontal and (3) vertical rotations of targets. In the commercial system tested, light is delivered from two optical fibres spaced 180 • apart around the US transducer, leading to different light fluence at the sides orthogonal to this plane. Whilst these effects are minimal, and did not affect the quantification of target size, they significantly impact any associated measurements of signal amplitude. Illumination from four optical fibres placed at 90 • around the US transducer, such as in other setups [62], could help to mitigate this problem, but would increase system cost. The limited aperture of the US transducer [8] also enabled signal quantification only up to an angle of 24 • in our phantom, meaning structures angled more steeply could be missed. The relative impact of these factors also depends on the optical and acoustic properties of the surrounding medium, with higher signal loss in more attenuating media. As all these factors can significantly decrease accuracy of a measured structure, they should be considered when positioning a target within the system, or quantifying and evaluating acquired data.
After assessing accuracy-related factors, the precision of the system was evaluated. The system was found to be characterized by a high signal stability in both short-term and long-term phantom studies. Primary sources of short-term fluctuations are hardware instabilities [21]; arising from light source energy fluctuations (caused, e.g., by lack of temperature stabilization of laser cavity and tuning crystal; aging of the pump source; dirty or damaged optics; or occurrence of timing jitter) or ambient system noise (caused, e.g., by electronic interference; overheating of the control electronics). Higher signal fluctuations with target repositioning are primarily caused by variations in the (manually controlled) distance between target and illumination/detection array (Z position; the XY position was kept constant between the imaging runs). Variations in coupling media/foil may also impact signal quantification. Ensuring constant transducer-target distance, as well as clear, bubble-free coupling agents, intact coupling foil and, if working with phantoms, homogeneous base media, will minimize the impact of these variation factors. The found temporal stability is in line with other studies showing less than 10% variation in PAI systems [21,22]. The values reported here are higher than those found previously for a commercial tomography system by the same vendor [21] (COV-MSOT =2.8% vs COV RSOM =9.6%), which is likely due to the higher resolution of the mesoscopic system, leaving it more sensitive towards small changes in positioning, and the manual control of target-detector distance, as outlined above.
Finally, sources of variation in vivo were investigated considering time, tissue type, positioning and operator-dependent sources. As expected, a higher variability was found in vivo compared to in phantoms due to additional variations arising from motion and changes in vessel perfusion. When comparing healthy and tumour tissue, a higher variability was found in tumour tissue, which can be explained by the chaotic, tortuous and leaky nature of tumour vasculature [63]. Interestingly, a significant decrease in blood volume was seen in healthy vasculature over time. Similar observations have been made before in murine spleen and kidney at the macroscopic scale in a tomographic PAI system [21], owing to perfusion changes under anaesthesia. Here, the ear was chosen as the healthy tissue of interest, as it is easily accessible by the system and often used as a site of interest in PAI studies [64][65][66]. However, as a peripheral organ the ear is not directly heated by the heat pad during image acquisition, which may have led to temperature-related vasoconstriction and thereby reduction of measured blood volume over time. Future work should explore the impact of temperature and isoflurane-related sources in more detail. A high correlation was found between the quantified blood volumes measured at two different data acquisition positions, demonstrating robustness in assessment of specimen-specific blood parameters. Variability between different operators was assessed, with only slightly higher values for an unexperienced user, confirming observations from other studies on photoacoustic precision [22]. Key sources of variability arising from different user experience were identified as: variation in positioning of the mouse between the different data acquisition sessions; variations in application of the US gel on the imaging target (e.g., inclusion of air bubbles); variation in compression of the transducer interface on the imaging target; and variations in aligning the target of interest in the central field of view.
Taken together, these observations demonstrate good precision of photoacoustic mesoscopy measurements in preclinical oncology models, across time and with different user experience.
There are several general considerations for photoacoustic mesoscopy that lead to high image quality, including keeping the coupling media and foil clean, intact and free of bubbles. Sample and coupling media temperature should be matched and care should be taken to maintain a constant transducer-target distance across repeated imaging sessions (e.g., by aligning the target of interest to the 'focus-line' of the transducer in the pre-scan). As target intensities are impacted by vertical and horizontal rotation, and penetration depth, the same positioning of the imaging object with respect to the illumination and detection plane should be used. For optimal quantification, the target of interest should be as close to the scan head and as horizontal relative to the scan head as possible and it should be noted that accurate dimensions can only be measured in x and y. For in vivo studies, in addition to the usual considerations of hair removal, maintaining breathing rate and animal body temperature, and minimizing motion, the transducer interface should be adjusted to avoid over-or under-compression. Care should also be taken that the target of interest is centrally aligned in the field of view.  Several limitations remain which should be addressed in future work. Inter-scanner variability has not been assessed as part of this study due to the logistical challenges of such a study. Similarly, the impact of motion correction, image reconstruction algorithms and data postprocessing has not been investigated. First steps towards analysis of these computational variation sources in mesoscopic imaging have been taken elsewhere [36], but future work should explore these technical factors of variability in more depth for further user guidance.

Conclusion
In summary, we have conducted a detailed technical validation of a PA mesoscopy system by: outlining common artefact types; highlighting imaging limitations arising from the limited field of view; delineating precision; and analysing the impact of various sources of variation. By offering guidance on how to optimize data acquisition, we hope to assist future preclinical and clinical studies using similar PAI setups, thereby maximizing study accuracy and information retrieval.

Declaration of Competing Interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests Sarah Bohndiek reports a relationship with EPFL Center for Biomedical Imaging that includes: speaking and lecture fees. Sarah Bohndiek reports a relationship with PreXion Inc that includes: funding grants. Sarah Bohndiek reports a relationship with iThera Medical GmbH that includes: non-financial support.

Data availability
A DOI to the dataset will be provided in preparation for publication of the paper. Fig. 7. : Impact of time, positioning, tissue and operator-dependent variation sources on the signal stability of the mesoscopic system in vivo. A) Blood volume is shown over time in healthy (black, n = 6 ears, cell line model) and tumour tissue (green, n = 9 tumours, PDX) without replacement of the mouse. Blood volume has been normalised by dividing each value with the first data point in the time series. B) Dorsal (left) and lateral (right) positioning of the mouse are indicated (figure created with Biorender). C) Correlation of calculated blood volume between dorsal and lateral positioning of the mouse is shown (n = 161, PDX model, Spearman r = 0.7615, R 2 =0.7687). D) Blood volume is shown in tumour tissue with replacement of the mouse by two different operators (each n = 5, inexperienced operator=black, experienced operator=green, PDX and cell line models). Blood volume has been normalised by dividing each value with the first data point in the time series. All data is shown as mean ± SD. Lina Hacker received her bachelor's degree in Molecular Biomedicine at the University of Bonn (Germany) and her master's degree in Biomedical Engineering at RWTH Aachen (Germany). Currently, she is pursuing a PhD in Medical Sciences at the University of Cambridge focusing on the technical and biological validation of photoacoustic imaging biomarkers. Emma Brown completed her PhD in Medical Sciences at the University of Cambridge in 2022, using photoacoustic imaging to study tumour vascular evolution. She is currently undertaking postdoctoral research at Washington University in St Louis.

Thierry Lefebvre completed his MSc in Medical Radiation
Physics at McGill University in 2020 and is currently pursuing a PhD in Medical Sciences at the University of Cambridge. His research focuses on using photoacoustic imaging to study radiotherapy response and resistance.
Paul Sweeney completed his PhD in Applied Mathematics at University College London in 2017. After postdoctoral research at UCL he joined the VISIONLab in Cambridge as a Wellcome Trust Junior Interdisciplinary Fellow. His research uses machine learning methods for improved image segmentation to parameterise biophysical models that can predict how fluid and mass is transported through cancerous tissue.

Sarah
Bohndiek completed her PhD in Radiation Physics at University College London in 2008 and then worked in both the UK (at Cambridge) and the USA (at Stanford) as a postdoctoral fellow in molecular imaging. Since 2013, Sarah has been a Group Leader at the University of Cambridge, where she is jointly appointed in the Department of Physics and the Cancer Research UK Cambridge Institute. She was appointed as Full Professor of Biomedical Physics in 2020. Sarah was recently awarded the CRUK Future Leaders in Cancer Research Prize and SPIE Early Career Achievement Award in recognition of her innovation in biomedical optics.