Assessing the Quality of Perineal Auscultation for a Noninvasive Diagnosis of Urinary Bladder Outlet Obstruction

Introduction
In men the prostate generally increases in size with age, most often as a result of Benign Prostatic Enlargement (BPE). BPE may lead to Bladder Outlet Obstruction (BOO). Men with BOO generally have Lower Urinary Tract Symptoms (LUTS), including a weak urinary stream, frequent voiding (also at night) and residual urine in the bladder after voiding. The current standard method for diagnosing BOO is a pressure-flow study (PFS), which is part of a (video) urodynamic examination (VUDE). The pressure-flow study consists of a urinary flow measurement with simultaneous recording of the detrusor pressure, using a catheter inserted in the bladder and a second catheter in the rectum. This method is time-consuming, costly, uncomfortable for the patient and potentially harmful (it may lead to serious side effects). This poses a threshold for preoperative testing. To lower this threshold, easy-to-use noninvasive urodynamic testing methods have been developed, such as the Doppler flowmetry method [1], the condom-catheter method [2], the penile cuff method [3], bladder wall thickness measurement [4] and, most recently, perineal auscultation [5][6]. All of these methods have some drawbacks. Perineal auscultation records the sound generated by the urinary flow through the urethra with a perineal contact microphone. In a model of the urethra it was shown that the sound recorded downstream of an obstruction is related to the degree of obstruction [5]. The variability and repeatability of perineal auscultation have been studied in a male volunteer population [6]. That study found that the results differed significantly between volunteers.
The quality of the sound recording is of major importance for the clinical applicability of perineal auscultation. Disturbances of the recording, for example by movement of the microphone or noise from the surroundings, can affect the results and thereby the diagnosis. To establish whether a measurement may result in a correct diagnosis, objective quality measures are required. We studied two measures for assessing the quality of perineal auscultation in a clinical patient population, and compared these quality measures to the visual quality assessment of the measured traces by three independent experienced observers.

Data acquisition
Prior to the VUDE, a free flow measurement was performed using a rotating disc flow meter (Urodyn 1000, Dantec, Denmark). Simultaneously, perineal auscultation was performed using a microphone at the perineum, which was held in place by a modified jockstrap (Figure 1). To minimize noise, patients were asked to be quiet and to minimize movement during voiding. The perineal auscultatory signal was amplified using a specially developed battery-powered low-noise amplifier. The flow rate and auscultatory signal were both sampled at a frequency of 10 kHz and stored on a PC for offline analysis. More details about the setup have been described by Idzenga [6]. Analysis of the signals was done using a custom program written in Matlab® (The Mathworks, Inc., Natick, MA, USA). The flow rate signal was low-pass filtered at 20 Hz. The auscultatory signal was band-pass filtered between 25 and 4000 Hz.
Following the free flow measurement with simultaneous perineal auscultation, a VUDE was performed according to the International Continence Society standards [7]. The VUDE included two filling cystometries, each followed by a PFS.

Quality assessment
The quality of the perineal auscultation was assessed using two different approaches. The first was an independent visual scoring of the quality by three experienced observers and the second was signal analysis using two objective measures. The two approaches were performed independently of each other. In the first approach the flow and sound signals were visually inspected by the three observers, who were blinded to each other's scoring. The quality of the recordings was scored using a three-point scale: 'good' = 1, 'medium' = 0 and 'poor' = -1. Examples of sound signals of good, medium and poor quality are shown in Figure 2. Good quality signals had low noise levels and a sound amplitude that increased and decreased synchronously with the flow rate signal (Figure 2). Medium quality signals had increased noise levels but still showed a perceivable amplitude variation of the auscultatory signal synchronous with the flow rate signal (Figure 2, middle). All other measurements were scored as poor.
In the second approach, the quality of each auscultatory recording was quantified using two objective measures. The first measure was based on the normalized correlation coefficient (NCC) between the auscultatory signal and the flow rate signal. From each recording a time interval during voiding (flow rate higher than half the maximum flow rate) was manually selected. The correlation between the envelope of the selected auscultatory signal and the associated flow rate was calculated using normalized cross correlation, where 1 indicated maximum correlation and 0 no correlation. The envelope of the auscultatory signal was obtained by band-pass filtering the signal between 25 and 500 Hz, followed by a Hilbert transformation, rectification of the transformed signal and subsequent low-pass filtering at 20 Hz. The envelope was then shifted in steps of 0.1 s with respect to the flow rate signal and the NCC was recalculated at each step, covering all time delays between -2 and +2 s.
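The envelope extraction and lagged correlation described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the Matlab program used in the study. Several details are assumptions because the text does not specify them: the filters are implemented as zero-phase FFT masks (the original filter type and order are not given), "rectification" is taken to mean the magnitude of the analytic signal, and the correlation is computed after removing the mean of each segment. All function names (`fft_bandlimit`, `analytic_signal`, `sound_envelope`, `max_lagged_ncc`) are hypothetical.

```python
import numpy as np

FS = 10_000  # sampling frequency in Hz, as stated in the text


def fft_bandlimit(x, fs, f_lo, f_hi):
    """Zero-phase brick-wall filter keeping f_lo <= f <= f_hi.
    Assumption: stands in for the unspecified filters of the study."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


def analytic_signal(x):
    """Analytic signal via the FFT; its imaginary part is the Hilbert transform."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)


def sound_envelope(sound, fs=FS):
    """Band-pass 25-500 Hz, Hilbert magnitude, then low-pass at 20 Hz."""
    band = fft_bandlimit(np.asarray(sound, dtype=float), fs, 25.0, 500.0)
    rectified = np.abs(analytic_signal(band))  # assumed meaning of "rectification"
    return fft_bandlimit(rectified, fs, 0.0, 20.0)


def max_lagged_ncc(envelope, flow, fs=FS, max_lag_s=2.0, step_s=0.1):
    """Return (maximum NCC, lag in seconds) over lags of -2 s to +2 s in 0.1 s steps.
    A positive lag means the envelope is delayed relative to the flow rate."""
    envelope = np.asarray(envelope, dtype=float)
    flow = np.asarray(flow, dtype=float)
    if len(envelope) != len(flow):
        raise ValueError("envelope and flow must have the same length")
    max_lag = int(round(max_lag_s * fs))
    step = int(round(step_s * fs))
    best_ncc, best_lag_s = -np.inf, 0.0
    for lag in range(-max_lag, max_lag + 1, step):
        if lag >= 0:
            e, f = envelope[lag:], flow[:len(flow) - lag]
        else:
            e, f = envelope[:lag], flow[-lag:]
        if len(e) < 2:
            continue  # no usable overlap at this lag
        e = e - e.mean()  # assumption: zero-mean normalized correlation
        f = f - f.mean()
        denom = np.sqrt(np.sum(e ** 2) * np.sum(f ** 2))
        if denom == 0.0:
            continue  # a constant segment has no defined correlation
        ncc = float(np.sum(e * f) / denom)
        if ncc > best_ncc:
            best_ncc, best_lag_s = ncc, lag / fs
    return best_ncc, best_lag_s
```

In use, `sound_envelope` would be applied to the manually selected interval of the auscultatory signal and the result passed to `max_lagged_ncc` together with the flow rate over the same interval.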
The maximum correlation coefficient was selected and used to assess the quality of the measurement. The second measure was based on the signal-to-noise ratio (SNR) of the auscultatory signal. This SNR was calculated by manually selecting two intervals within the interval selected earlier for calculating the NCC: a voiding interval and a non-voiding interval (Figures 3a-3b). Selection of the voiding interval was computer assisted: the part of the flow rate signal higher than half the maximum flow rate was indicated automatically. The non-voiding interval was selected based on absence of flow and a low sound level. The SNR was calculated using the method previously described in detail by Idzenga [6]. In both the voiding and the non-voiding interval we calculated the power spectrum of the sound signal in a window of 0.8 s duration using the Fast Fourier Transform. The window was then moved by 0.2 s and the spectrum was calculated again. This procedure was repeated until the end of the selected time interval was reached. The spectra in the frequency range 100-500 Hz were averaged over the time windows, resulting in an average voiding and an average non-voiding spectrum. The SNR was calculated by dividing the area under the average voiding spectrum by the area under the average non-voiding spectrum (Figure 3c).
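The sliding-window SNR calculation described above can be sketched as follows, again in Python with NumPy rather than the Matlab program used in the study. The window length (0.8 s), step (0.2 s) and frequency band (100-500 Hz) come from the text. The absence of a taper window, the power normalization and the function names `mean_band_spectrum` and `spectral_snr` are assumptions made for illustration; the method of reference [6] may differ in these details.

```python
import numpy as np


def mean_band_spectrum(segment, fs, win_s=0.8, hop_s=0.2, band=(100.0, 500.0)):
    """Average power spectrum within `band` over sliding windows of a segment."""
    segment = np.asarray(segment, dtype=float)
    win = int(round(win_s * fs))
    hop = int(round(hop_s * fs))
    if len(segment) < win:
        raise ValueError("segment is shorter than one analysis window")
    freqs = np.fft.rfftfreq(win, d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    spectra = []
    for start in range(0, len(segment) - win + 1, hop):
        frame = segment[start:start + win]  # rectangular window (assumption)
        power = np.abs(np.fft.rfft(frame)) ** 2 / win
        spectra.append(power[in_band])
    return freqs[in_band], np.mean(spectra, axis=0)


def spectral_snr(voiding, non_voiding, fs):
    """Ratio of the areas under the average voiding and non-voiding spectra."""
    _, p_void = mean_band_spectrum(voiding, fs)
    _, p_rest = mean_band_spectrum(non_voiding, fs)
    # Both spectra share the same uniform frequency grid, so the bin width
    # cancels and the ratio of areas equals the ratio of the sums.
    return float(p_void.sum() / p_rest.sum())
```

A recording without a non-voiding interval cannot be processed this way, because the denominator spectrum cannot be estimated.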

Statistical analysis
The agreement between the three observers was expressed using the Weighted Kappa statistic [8], where 1 is perfect agreement, 0 is what would be expected by chance, and negative values indicate agreement less than chance. When the observers disagreed, we used the median of the three observations as the combined quality score. The visual quality scores were compared to the objective quality measures using the Kruskal-Wallis test. We used Receiver Operating Characteristic (ROC) analysis to determine the performance of NCC and SNR for quantification of the measurement quality.

Results
The median (min-max) age of the 74 patients included in the study was 68 (27-82) years. The recordings failed in 5 patients due to technical errors. In 9 recordings there was no non-voiding interval, making it impossible to calculate the SNR. The median (25%-75%) flow rate of the remaining 60 recordings was 11.9 (8.3-18.6) ml/s and the median (25%-75%) voided volume was 80 (50-152) ml. The scoring of the three observers and the interobserver agreement of the visual quality score are presented in Table 1. All three Weighted Kappa values, 0.73, 0.66 and 0.76, were significantly different from 0 (p < .05). The combined (median) quality scores of the three observers were distributed as follows: of the 60 recordings, 21 were scored as 'poor' quality, 23 as 'medium' quality and 16 as 'good' quality. The NCC and SNR for the three visual quality scores are depicted in Figure 4. The NCC for the good quality recordings was significantly different from that for the medium and poor quality recordings (Kruskal-Wallis, p < .05). The 25th and 75th percentiles of the NCC were 0.45 and 0.80 for the good quality, 0.10 and 0.45 for the medium quality and 0.07 and 0.45 for the poor quality recordings. The SNR was not significantly different (Kruskal-Wallis, p = 0.13) between the quality scores. The ROC curves for the two quality measures are depicted in Figure 5.
For the ROC analysis the poor and medium recordings were grouped, since the NCC values were significantly higher for the good quality recordings. The areas under the curve were 0.84 and 0.66 (both significantly different from the reference line, p < .05) for NCC and SNR, respectively.
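As an illustration of the statistics used in this analysis, the sketch below computes a weighted kappa between two observers, the median-based combined score, and the area under the ROC curve for a quality measure with the medium and poor recordings grouped. It is written in plain Python and is not the software used in the study. The linear disagreement weights are an assumption (the weighting scheme is not stated in the text), the AUC is computed with the rank-based (Mann-Whitney) formulation, and the function names are hypothetical.

```python
from collections import Counter
from statistics import median

CATEGORIES = (-1, 0, 1)  # 'poor', 'medium', 'good'


def weighted_kappa(rater_a, rater_b, categories=CATEGORIES):
    """Cohen's weighted kappa with linear disagreement weights (assumption).
    Undefined (division by zero) if both raters use one identical category."""
    n = len(rater_a)
    index = {c: k for k, c in enumerate(categories)}
    observed = Counter(zip(rater_a, rater_b))
    marg_a, marg_b = Counter(rater_a), Counter(rater_b)
    obs_dis = exp_dis = 0.0
    for a in categories:
        for b in categories:
            weight = abs(index[a] - index[b])
            obs_dis += weight * observed[(a, b)] / n
            exp_dis += weight * marg_a[a] * marg_b[b] / n ** 2
    return 1.0 - obs_dis / exp_dis


def combined_score(score_1, score_2, score_3):
    """Median of the three observers' scores for one recording."""
    return median([score_1, score_2, score_3])


def roc_auc(values, is_good):
    """Probability that a good recording has a higher value than a
    medium/poor one (ties count one half); equals the area under the ROC curve."""
    good = [v for v, g in zip(values, is_good) if g]
    rest = [v for v, g in zip(values, is_good) if not g]
    wins = sum((p > q) + 0.5 * (p == q) for p in good for q in rest)
    return wins / (len(good) * len(rest))
```

For example, with `is_good` defined as `combined_score(...) == 1` for each recording, `roc_auc(ncc_values, is_good)` would give the area under the curve for the NCC, where `ncc_values` is a hypothetical list holding one NCC per recording.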

Discussion
A good quality recording is essential for clinical use of perineal auscultation to noninvasively diagnose BOO in patients with LUTS. In this paper we tested two objective measures for quantifying the quality of perineal auscultation recordings. To test the validity of these quality measures we compared both to quality scores visually assessed by three experienced observers. The NCC best reflected the assessment by the three experienced observers. According to the commonly cited scale for the Weighted Kappa statistic [8,9], the experienced observers were in substantial agreement on scoring the quality of the recordings. This supports the use of their scoring as a 'gold standard' for validating the two quality measures. The quality scores were approximately evenly distributed over the patient population, i.e. each quality score was assigned to an approximately equal number of patients (Table 1), which makes this population suitable as a test population.
When comparing the two quality measures to the quality scores, the NCC was significantly higher for the good quality recordings than for the medium and poor quality recordings. This was not the case for the SNR, which makes the latter measure unsuitable for distinguishing good quality from medium or poor quality recordings. The area under the ROC curve was also higher for NCC than for SNR. This suggests that NCC best represents the quality scoring by the experienced observers. Computing the NCC for an auscultatory recording, however, is not completely objective. It contains a semi-objective element in the sense that for each perineal auscultatory recording a time interval was selected manually. This time interval defines the onset and end of voiding and was used to remove distortions caused by patient movement before or after voiding. The value 0.45 is both the 25th percentile of the NCC for good quality recordings and the 75th percentile for medium quality recordings; we therefore propose an NCC cutoff of 0.45 to identify good quality auscultatory recordings.
In conclusion, we developed quality criteria to assess the quality of electronic perineal auscultation in patients with LUTS. These quality measures can be used to automatically select auscultatory signals that are suitable for noninvasively diagnosing BOO. Our results suggest that, with our present improvised setup, at most 16 of the 60 patients could have been diagnosed noninvasively using perineal auscultation. Improvements in the setup seem necessary to increase this rate.