Evaluation of the Accuracy of Computer Automated Analysis of Esophageal 24-hour Impedance pH Studies

Background: Esophageal pH monitoring combined with multichannel intraluminal impedance (MII-pH) is now considered the most accurate method for detecting and characterizing gastroesophageal reflux (GER), with higher sensitivity and specificity for reflux than esophageal pH monitoring alone.
Aims: One potentially limiting factor in MII-pH testing is the time required to analyze the results. Automated interpretation software has been developed to reduce this time. In this study, we assessed the reliability of two 24-hour MII-pH analysis software packages compared with the interpretation provided by an expert.
Methods: We performed a retrospective review of 200 MII-pH studies done on patients with reflux symptoms between September 2009 and September 2014. The studies were split into two groups of 100 patients: one group was tested using MMS equipment and software, and the other using Sandhill Scientific equipment and software. All tracings were additionally analyzed by an expert, and the interpretations were compared.
Results: Our data indicated a strong correlation between the expert's analysis and both automated software packages for all positions, DeMeester score, reflux episodes, and symptom index (p<0.0001). For the overall classification of studies as normal or abnormal, the expert analysis and the software agreed in 95% of studies for the MMS software and 93% for the Sandhill software.
Conclusions: MII-pH data analysis software currently provides reliable diagnostic utility and is time-efficient, but it is advisable to have an experienced interpreting physician review the analysis before signing off the report, in order to catch problems such as probe malfunction.


Background
Esophageal pH monitoring in conjunction with Multichannel Intraluminal Impedance (MII-pH) is now considered the most accurate method for detection and characterization of gastroesophageal reflux (GER).
Compared with esophageal pH monitoring alone, MII-pH significantly increases the sensitivity and specificity of reflux detection [1]. Additionally, it identifies patients with symptoms related to nonacid reflux, which conventional pH monitoring does not detect [2,3].
The Porto consensus concluded that MII-pH monitoring is the only recording method that can achieve high sensitivity for detection of all types of reflux episodes [4]. The MII-pH catheter contains six impedance segments placed at different distances above the lower esophageal sphincter (LES). These allow the detection of reflux. The catheter also has a pH electrode that is used to identify the acidity of the refluxate [5,6].
Measurements of MII-pH monitoring have been shown in a prospective study to detect GERD with higher levels of specificity and positive predictive values than wireless pH monitoring [7]. The FDA has approved use of MII-pH to monitor reflux by detecting retrograde intraluminal bolus movement.
As a result, patients on acid-suppression therapy who have normal endoscopic findings and persistent GERD symptoms have an indication to undergo MII-pH monitoring to quantify reflux episodes, classify the type of reflux (i.e., acidic vs. nonacidic), and assess the relationship between persistent symptoms and MII-detected reflux [8].
In this study, we assessed the reliability of two different 24-hour MII-pH analysis software packages compared with the interpretation provided by an expert.
This information is important because there is concern in the community about the complexity of interpreting MII-pH tracings, the amount of time a gastroenterologist may have to spend learning to read and analyze them, and the reliability of the automatic analysis provided by the manufacturers' software.

Material and Methods
We performed a retrospective review of 200 consecutive MII-pH studies done on adult patients with typical or atypical reflux symptoms referred to our laboratory for MII-pH monitoring. These studies were ordered by physicians practicing in our institution or outside referring physicians.
Patients with dysphagia and history of gastric surgery were excluded because of the potential for esophageal dysmotility. Meal times were excluded from analysis. The study was approved by the Institutional Review Board of United Health Services Hospitals. The studies were done between September 2009 and September 2014.
The studies were split into two groups of 100 patients each: one group was tested using MMS equipment, and the other using Sandhill Scientific equipment. We performed the analysis using the corresponding software for each device: MMS version V 8.19h and Bioview analysis (Sandhill Scientific) version 5.5.4.1, respectively.
All patients were asked to fast for at least 4 to 6 hours before the procedure while continuing their usual medications, including acid-suppression therapy. The catheter design placed pH electrodes 5 cm above and 10 cm below the lower esophageal sphincter (LES), with impedance-measuring segments at 3 cm, 5 cm, 7 cm, 9 cm, 15 cm, and 17 cm above the LES.
All patients were provided with a diary to mark the time and content of meals, time and type of symptoms, time and type of medications, and recumbent and upright positions during the study period.
Symptoms and patient position were also recorded by pressing assigned buttons on the MII-pH monitor. The following day, the catheter was removed and data were downloaded for analysis. A reflux episode was defined by cephalad bolus movement as seen on MII. It was regarded as acid reflux if pH dropped below 4 and non-acid reflux if pH remained at 4 or above.
For patients on acid-suppression therapy, the threshold for an abnormal total number of reflux episodes was 48 episodes in 24 hours, an average of two reflux episodes per hour. All tracings were interpreted by the same expert, who had previously read more than 2,000 MII-pH studies.
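The classification rule and the episode threshold described above can be illustrated with a minimal Python sketch. This is not the logic of either commercial package; the `RefluxEpisode` record, the `nadir_ph` field, and the function names are hypothetical. The text gives 48 episodes in 24 hours as the threshold without saying whether exactly 48 counts as abnormal, so the sketch assumes that only counts above 48 are abnormal.

```python
from dataclasses import dataclass

ACID_PH_CUTOFF = 4.0        # from the text: acid reflux if pH drops below 4
ABNORMAL_EPISODES_24H = 48  # from the text: threshold on acid-suppression therapy


@dataclass
class RefluxEpisode:
    """One impedance-detected cephalad bolus movement (hypothetical record)."""
    nadir_ph: float  # lowest pH recorded during the episode


def classify_episode(episode: RefluxEpisode) -> str:
    """Return 'acid' if pH dropped below 4, otherwise 'non-acid'."""
    return "acid" if episode.nadir_ph < ACID_PH_CUTOFF else "non-acid"


def summarize(episodes, hours: float = 24.0) -> dict:
    """Count episodes by type and compare the total with the threshold."""
    acid = sum(1 for e in episodes if classify_episode(e) == "acid")
    total = len(episodes)
    return {
        "acid": acid,
        "non_acid": total - acid,
        "total": total,
        "per_hour": total / hours,
        # Assumption: strictly more than 48 episodes is treated as abnormal.
        "abnormal": total > ABNORMAL_EPISODES_24H,
    }


if __name__ == "__main__":
    # Made-up episodes for illustration only; not study data.
    demo = [RefluxEpisode(3.2), RefluxEpisode(5.1), RefluxEpisode(4.0)]
    print(summarize(demo))
```

A count of exactly 48 episodes over 24 hours corresponds to the two episodes per hour mentioned in the text.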
For the purpose of this study, a trainee with no prior experience in interpreting MII-pH tracings collected the data from the expert analysis, reset the tracings to their original status prior to modification by the expert, and applied automated analysis using the newer versions of the software. The trainee then collected the reflux data generated by the automated analysis. All data were entered into an Excel spreadsheet.
The basic concepts of esophageal impedance monitoring are similar to those of pH monitoring: esophageal data are acquired via a transnasally positioned probe connected to a recorder. Upon completion of data acquisition, the raw data are downloaded into dedicated software, here MMS version V 8.19h or Bioview analysis (Sandhill Scientific) version 5.5.4.1, which prepares a tracing and is capable of automatic analysis.
The automatic MII-pH analysis system assesses several metrics, including the duration, percentage, and number of reflux episodes, as well as the bolus clearance time (BCT) and acid (chemical) clearance time (ACT).
Because the MII-pH device records simultaneously at 6 or more esophageal sites, it can identify and characterize bolus events of GER, their duration, their approximate proximal extent, and their association with any symptom [9].
Symptom association is quantified using three different indices: the symptom sensitivity index (SSI), the symptom index (SI), and the symptom association probability (SAP) [10].
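As a rough illustration of two of these indices, the sketch below computes SI (the percentage of symptom events preceded by reflux) and SSI (the percentage of reflux episodes followed by a symptom) from event timestamps. The function name, the inputs in seconds, and the 2-minute association window are assumptions made for illustration, not settings taken from this study or from either software package. SAP, which relies on a contingency-table test over time windows, is not sketched here.

```python
def symptom_indices(reflux_times, symptom_times, window_s=120.0):
    """Return (SI, SSI) as percentages, or None where undefined.

    reflux_times / symptom_times: event onset times in seconds (hypothetical input).
    window_s: assumed association window; a symptom counts as reflux-related
    if a reflux episode began at most window_s seconds before it.
    """
    def linked(reflux_t, symptom_t):
        return 0.0 <= symptom_t - reflux_t <= window_s

    # Symptoms that follow at least one reflux episode within the window.
    related_symptoms = sum(
        1 for s in symptom_times if any(linked(r, s) for r in reflux_times)
    )
    # Reflux episodes followed by at least one symptom within the window.
    symptomatic_refluxes = sum(
        1 for r in reflux_times if any(linked(r, s) for s in symptom_times)
    )

    si = 100.0 * related_symptoms / len(symptom_times) if symptom_times else None
    ssi = 100.0 * symptomatic_refluxes / len(reflux_times) if reflux_times else None
    return si, ssi


if __name__ == "__main__":
    # Made-up timestamps for illustration only; not study data.
    refluxes = [100.0, 1000.0, 5000.0, 9000.0]
    symptoms = [150.0, 5060.0, 20000.0]
    print(symptom_indices(refluxes, symptoms))
```

In this made-up example, two of three symptoms follow a reflux episode and two of four reflux episodes are followed by a symptom.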
Prism software was used for statistical analysis and level of significance was set as p<0.05.

Results
A total of 200 studies were reviewed: 65% of patients were female, with a mean age of 48.6 years, and 35% were male, with a mean age of 46.3 years. A total of 100 studies were done using MMS equipment and software (MMS version V 8.19h), and 100 studies were done using Sandhill Scientific equipment and software (Bioview analysis version 5.5.4.1).
As shown in Figure 1, our data indicated a very strong correlation between the expert's analysis and both automated software packages for the upright position, supine position, DeMeester score, acid and non-acid reflux episodes, and symptom index. Figure 1 reports the Pearson r for each pair of expert and software measurements. Each correlation was statistically significant (p<0.0001). For the overall interpretation of a study as either normal or abnormal, the expert interpreter and the automated software agreed in 95% of studies for the MMS software and 93% of studies for the Sandhill software.
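For readers who wish to reproduce this type of comparison on their own data, the sketch below computes a Pearson r between paired expert and software measurements, and the percentage agreement on the overall normal/abnormal call. It is not the authors' analysis, which was performed in Prism; the function names and the example values are hypothetical and are not study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return sxy / math.sqrt(sxx * syy)


def percent_agreement(expert_labels, software_labels):
    """Percentage of studies given the same overall label by both readers."""
    if len(expert_labels) != len(software_labels) or not expert_labels:
        raise ValueError("need two non-empty label lists of equal length")
    matches = sum(1 for e, s in zip(expert_labels, software_labels) if e == s)
    return 100.0 * matches / len(expert_labels)


if __name__ == "__main__":
    # Made-up reflux episode counts per study; not study data.
    expert_counts = [30, 52, 41, 75, 18]
    software_counts = [28, 55, 40, 71, 20]
    print(pearson_r(expert_counts, software_counts))

    expert_calls = ["normal", "abnormal", "abnormal", "normal"]
    software_calls = ["normal", "abnormal", "normal", "normal"]
    print(percent_agreement(expert_calls, software_calls))
```

Note that percentage agreement does not correct for agreement expected by chance; a chance-corrected statistic would be a separate analysis not reported in this study.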

Discussion
Acid reflux (AR) is defined as a reflux event with a drop in pH to less than 4.0, a classic description used in esophageal pH monitoring. Non-acid reflux is defined as a reflux event in which esophageal pH remains at 4.0 or higher, and is further separated into weakly acidic reflux (WAR), with a pH between 4.0 and 7.0, and weakly alkaline reflux (AlkR), with an esophageal pH above 7.0. Data from previous MII-pH studies have demonstrated that non-acid reflux accounts for at least half of reflux episodes and correlates strongly with symptoms. The capabilities of MII-pH testing have been recognized, with many studies comparing its results with those of pH monitoring alone, especially for evaluating the temporal connection between GER and symptoms [11-14].
Two previously recognized drawbacks of MII-pH testing are the time required for an expert to analyze and interpret individual tests and the variation among experts' analyses. Because intra- and interobserver variability remains relatively high, even among experienced readers, a validated and refined automated analysis is needed for this clinical procedure. Such an analysis would support both reliability and reproducibility and significantly decrease the time needed for analysis [15].
Previous research has indicated that automatic MII-pH interpretation has difficulty recognizing GER during meals; hence, meals are frequently excluded in study designs. To thoroughly examine this difficulty, more work is needed that closely inspects the data generated before, during, and after meal times. Until then, meals should be specifically excluded prior to automated analysis. Symptom association plotting could provide an effective tool for analyzing this association around meal periods, especially by studying the number of symptoms, the different types of GER associated with symptoms, symptoms that occur in the absence of GER, and GER events that occur in the absence of symptoms [10]. A thorough analysis of such data may elucidate these relationships more clearly.
A few noteworthy factors may influence impedance data of the type gathered in this study. Baseline impedance has been shown to be lower in patients with esophagitis than in patients with non-erosive reflux disease. Additionally, proton pump inhibitor treatment outcomes have been shown to correlate with increased baseline impedance; however, baseline impedance also depends on the patient's age and on the number of impedance events [16,17].
The data from our study indicate that automatic MII-pH analysis programs can provide a quick and valid method of interpreting results, with consistency and high reproducibility. Our data indicate that both the MMS and Sandhill equipment and software provide statistically similar interpretations. Furthermore, our data show that the interpretations of both software packages correlate strongly with an expert's interpretation.
Having a valid, consistent, reproducible, and swift interpretation method for MII-pH analysis enables more frequent and broader application of this technique. The data from this study support the clinical strength of MII-pH software analysis and increase the potential clinical significance of this tool. Using this software interpretation, MII-pH analysis can be more confidently employed to provide important information in assessing GER, especially in the postprandial period and in patients with atypical or persistent symptoms [18]. It is prudent, of course, to have an expert interpreter review the software's interpretation, much as a cardiologist quickly reviews and, if necessary, edits an ECG machine's automated reading. The promising results of this study indicate that MII-pH analysis, with the use of these valid and quick software programs, may be a time- and cost-efficient clinical tool.
However, it is still very important that the physician responsible for interpreting the pH tracings is fully trained, because several frequent issues still require human input. One is catheter dysfunction, which sometimes requires excluding sections of the tracing from analysis. This is particularly true during the overnight period, when we sometimes see an inappropriate drop in pH to below 4 without associated reflux, frequently due to drying of the pH electrode. If not excluded, such a segment could erroneously elevate recumbent acid exposure time. Also, in cases of re-reflux, the automated analysis frequently identifies multiple consecutive reflux episodes as only one episode, thus artificially decreasing the total number of reflux episodes. Another possible pitfall arises in patients with achalasia in whom an MII-pH study is ordered because their symptoms sometimes mimic reflux. In these cases, the baseline impedance is very low due to retained fluid within the esophagus. Swallows in these patients can induce waves in the retained fluid that can mimic reflux on the impedance tracings. The interpreting physician needs to be experienced enough to identify this particular presentation and to recommend a manometry study, if not previously performed.
Ultimately, we feel that the current generation of automated MII-pH analysis software is advanced enough to provide guidance and to significantly shorten the time needed to analyze and interpret these tracings. It should also help provide consistency in interpretation. However, we discourage total reliance on the software, as this would significantly increase the risk of erroneous results.
Limitations of our study include its single-center sample, the inability to study all available impedance-pH software packages, and the inability to obtain multiple experts' readings of the same impedance-pH study for comparison with the different software packages.

Conclusion
Our study indicates a strong correlation between the interpretations provided by automatic software analysis and by an expert for 24-hour MII-pH monitoring, and the automated analysis is significantly less time-consuming. The two software packages, MMS and Bioview, are currently very reliable, but it is advisable to have an experienced interpreting physician review the analysis before signing off the report, in order to catch problems such as probe malfunction. MII-pH reports should include the methods, results, and type of the chosen analysis, an interpreted clinical history, and recommendations for further investigations or treatment. Interpretation and analysis of these studies should only be done by those with satisfactory training and expertise.

Ethical Approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. For this type of study (retrospective), formal consent is not required.