The Networked Analysis of Static Visual Field Evaluation: A Comparison between Medisoft and Humphrey Software Systems

Aim: This study compares and contrasts the integrity of original data once imported to Medisoft from the Humphrey perimeter. We aim to analyse the correlation between reliability indices, mean deviation and pattern standard deviation. Methods: Visual field results from Humphrey automated perimetry using the 24-2 SITA-FAST strategy were imported to Medisoft. A direct comparison was made from original hard copy printouts from the Humphrey perimeter to hard copy printouts of the Medisoft version of the results. All data was compared using Bland-Altman plots and unpaired t tests. Results: A total of 50 patients (23 male, 27 female) aged 51 to 91 (mean ± SD: 73.


INTRODUCTION
Electronic health record (EHR) systems have played an increasingly important role in the provision of healthcare in the UK over recent times, and they are now considered an essential technology within healthcare settings globally [1,2]. Electronic data allows for large amounts of patient information to be stored securely on a central database, with the facility to share the information among a team of care providers who are often spread across multiple locations [3,4]. Storing and sharing test results electronically can improve efficiency, speed up clinical communication, reduce the number of errors and assist clinicians with diagnosis and treatment [2,5].
Medisoft (Medisoft Ltd, Leeds, UK) is the UK's most popular EHR system within ophthalmology units, currently in use at over fifty NHS hospitals nationwide [6]. In practice, the software aggregates data from various diagnostic equipment and imaging systems, providing a central portal through which healthcare professionals can access EHRs. Medisoft allows for the pooling of patient information from all ophthalmic sub-specialties via a networking system linked to clinics, theatre and even patients' notes, once they have been scanned and uploaded to the system. This feature requires the system to interface with equipment and software produced by multiple manufacturers of ophthalmic systems and devices, ranging from biometry and optical coherence tomography to visual field instruments such as perimeters.
The Humphrey perimeter (Carl Zeiss Meditec, Dublin, CA) is commonly used in ophthalmology outpatient units to assess visual fields. Once assessments are complete the results of testing may be instantly received in either hardcopy format as a printout or viewed from the monitor linked to the individual perimeter [7]. In many clinics, these perimeters operate as isolated pieces of equipment with internal storage of data and direct printout of results. The patient data may also be extracted and pooled by independent software such as Medisoft, allowing it to be shared and accessible to other healthcare professionals at the same site through networked computers. This provision is not only advantageous to clinical staff but also to patients, as it encourages a holistic health care approach [8].
When raw data are imported for networking through software interfaces, there is the potential for them to be manipulated and reformatted because of different methods of information analysis. However, the magnitude of any potential changes to the display of visual field data is currently unknown. As a consequence of this uncertainty, clinicians currently use software interfaces such as Medisoft without accurate knowledge of how visual field results are reproduced [4,9]. If results were to differ using alternative software, it is likely that this would primarily relate to the different normative databases used by each software option. It is not possible to access information about these normative databases as these are normally patent protected. However, it is possible to speculate that differences would relate to the populations recruited for the normative database, the age range tested, the number of subjects within each age band and their individual test results. A literature search did not reveal any work verifying the accuracy of visual field results using data imported from the Humphrey perimeter through to the Medisoft EHR software, thus highlighting an important deficiency in current clinical practice.
We aim to explore this deficiency further and determine whether the data imported through Medisoft is directly interchangeable with the original raw data produced on the Humphrey perimeter. Specifically, we will compare and contrast the integrity of original data once imported to Medisoft from the Humphrey perimeter and consider the equivalence between both data types as assessed by the following outcomes: the level of correlation between reliability indices such as number of fixation losses, false positives and false negatives for each software system; the degree of correlation between mean deviation (MD) and pattern standard deviation (PSD) values for each software system; the relationship to the severity of visual loss; a comparison between groups analysing the level of difference between the decibel plot intensity values for each software system; and a comparison between groups analysing the difference in statistical significance of probability plot data for total deviation (TD) and pattern deviation (PD) from each software system.

Population
This study was conducted as a feasibility study, for which we recruited 50 patients. Patients were recruited from a previously compiled database of patients known to have attended glaucoma out-patient clinics and to have had their visual field assessed at Aintree University Hospital NHS Foundation Trust. All potential recruitment candidates in the database were vetted according to inclusion and exclusion criteria, and then graded as stage 0 through to stage 4 using the visual field results documented in their EHRs. Recruitment for each of the five grades continued until the target for that grade had been reached, at which point recruitment ceased for that grade but continued for any grades still below target.

Visual Field Results
All data was collected over the space of a month, with all visual field results taken from assessments undertaken within 3 years of the date of recruitment. Only one eye was used per patient: in cases where both eyes fulfilled the inclusion criteria, the eye with the most reduced visual field was selected to increase efficiency during the recruitment phase. The recruitment target was further sub-divided according to the level of visual impairment and glaucoma severity: 10 normal/ocular hypertension visual field results (stage 0), 10 mildly impaired results (stage 1), 10 moderately impaired results (stage 2), 10 markedly impaired results (stage 3) and 10 blinding visual fields (stage 4), as determined by Mills' GSS staging system [10]. A total of 50 results were extracted across all five stages of visual field loss and subsequently imported for review on the Medisoft software system. Patients fulfilling criteria for stage 5 were not included in the study as this category meant no perception of light, and therefore visual field assessment would be of limited or no value in this group of patients.
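The quota-based recruitment described above can be illustrated with a short sketch. It is not the authors' actual procedure or software; the function name `recruit_by_quota`, the `(patient_id, stage)` tuple layout and the demonstration data are hypothetical, and screening against the inclusion and exclusion criteria is assumed to have happened already.

```python
def recruit_by_quota(candidates, target_per_stage=10, stages=(0, 1, 2, 3, 4)):
    """Fill one quota per severity stage from already-screened candidates.

    candidates: iterable of (patient_id, stage) pairs in screening order
    (hypothetical layout). Stages outside `stages` (e.g. stage 5) are skipped.
    """
    recruited = {stage: [] for stage in stages}
    for patient_id, stage in candidates:
        if stage not in recruited:
            continue  # stage not recruited in this study design
        if len(recruited[stage]) < target_per_stage:
            recruited[stage].append(patient_id)
        # Stop once every stage has reached its target.
        if all(len(ids) == target_per_stage for ids in recruited.values()):
            break
    return recruited


# Hypothetical demonstration with a small quota of 2 per stage.
demo_candidates = [(i, i % 6) for i in range(100)]
demo = recruit_by_quota(demo_candidates, target_per_stage=2)
```

With the study's target of 10 per stage and five stages, a filled quota corresponds to the 50 results described above.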

Inclusion and Exclusion Criteria
Inclusion criteria were: (1) adult patients aged 18 years or older with a diagnosis of primary open angle glaucoma or ocular hypertension from the EHRs, (2) diagnosis made by a specialist glaucoma clinician, (3) Humphrey visual field present and accessible on the patient's electronic records, (4) visual field results networked to Medisoft ophthalmology for the same eye and test date, with readily accessible visual fields on the electronic records, (5) visual field results for central programmes.
Exclusion criteria were: (1) failing to fulfil the required severity staging system criterion, (2) poor reliability as determined by a large number of fixation losses, (3) a false positive rate of >10%, (4) a false negative rate of >10%, (5) absence of a full set of decibel, total deviation and pattern deviation plots with corresponding probability plots on both the Humphrey and Medisoft software field printouts, (6) visual field results other than central programmes.

Glaucoma Severity Staging Systems
Chronic open-angle glaucoma patients are required to have serial visual field assessments as part of their ongoing care to monitor disease progression [11]. For this reason we selected this group as a suitable sub-population of patients to recruit for the purposes of our study. A variety of different severity classification systems currently exist [10,12-15]. We decided to use the Glaucoma Severity Staging (GSS) system by Mills et al. (see Table 1) [10] as it utilises two of our main outcome measures, MD and PSD, and it offers an efficient method for quick classification of patients into groups based on field loss.

Visual Field Assessment Measures
Visual fields were assessed on a Humphrey Visual Field Analyser (HFA; model 740, Carl Zeiss Meditec, Dublin, CA) by experienced perimetrists. As the majority of patients in the outpatient clinic were attending for follow-up, they had experience in performing the test on a number of occasions. However, some of the recruited patients may have been tested for the first time. Results were printed first from the Humphrey perimeter in the "single field analysis" format and secondly from the Medisoft software system in the standard output format.
Various parameters were taken from the visual field printouts to enable interpretation and analysis; these included global indices such as MD and PSD. Demographic details and reliability measures such as fixation losses, false positive and false negative rates were also recorded. All analysis and comparisons were performed by the same researcher.
The MD is a measure of the average deviation of overall visual sensitivity, measured in decibels (dB), from that of an age-matched normal [16]. This value is low in normal health, and increases with abnormal visual field results as the severity of gross field loss increases [17]. The PSD is determined by the variation from the normal hill of vision and is also measured in dB. The TD is the deviation from normal values and is assessed using its probability plot, which shows the associated probability value for each point and its variation from the norm. The PD highlights localised defects in the visual field and is also interpreted by its probability plot. Finally, the main decibel plot provides a summary of the raw data produced during clinical testing and is usually interpreted diagrammatically through an accompanying grey-scale image. Individual single field analysis (visual fields measured at a single visit using Humphrey perimetry) was conducted to directly compare results of the Humphrey software with those of Medisoft, to detect variation in MD, PSD, and the total deviation and pattern deviation probability plots. The Humphrey and Medisoft MD values are reported as negative values and therefore the negative MD and PSD results were multiplied by -1 to yield a positive value for statistical analysis [18].
Bland-Altman plots were used to assess the extent of agreement between the two software outputs for the primary outcome measures of mean deviation and pattern standard deviation. Our clinically defined limits of agreement were ± 2 decibels, based on reported test-retest and intra-perimeter comparisons [7,19,20]. We would not consider that the results could be used interchangeably if the limits of agreement calculated from the data fell beyond this range [21]. Independent t-tests were used to determine the significance of parametric relationships between comparative fields of data.
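The agreement criterion above can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: it assumes the conventional Bland-Altman definition (mean difference ± 1.96 standard deviations of the paired differences), and the function name `bland_altman` and the paired values shown are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(humphrey, medisoft, clinical_limit=2.0):
    """Bias and 95% limits of agreement for paired values (in dB).

    Assumes the conventional definition: bias +/- 1.96 * SD of the
    paired differences (Humphrey minus Medisoft).
    """
    if len(humphrey) != len(medisoft):
        raise ValueError("paired measurements must have equal length")
    diffs = [h - m for h, m in zip(humphrey, medisoft)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    lower = bias - 1.96 * sd
    upper = bias + 1.96 * sd
    # Interchangeable only if both limits lie inside +/- clinical_limit.
    interchangeable = lower >= -clinical_limit and upper <= clinical_limit
    return {"bias": bias, "lower": lower, "upper": upper,
            "interchangeable": interchangeable}


# Hypothetical paired MD values (sign-flipped to positive dB).
humphrey_md = [1.2, 3.5, 7.9, 14.1, 22.4]
medisoft_md = [1.0, 3.1, 7.0, 13.0, 21.9]
result = bland_altman(humphrey_md, medisoft_md)
```

Under this sketch, two outputs would be judged interchangeable only when both calculated limits fall inside the ± 2 dB range defined above.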

Ethical Approval
The study did not require NHS research ethical approval as it was classed as a service evaluation project according to the guidance from the National Research Ethics Service. The study was designed and conducted solely to define or judge current care. It measured the current service without reference to a standard and involved an intervention (perimetry) in current use with analysis of existing data only. There was no allocation to the intervention (perimetry) as the clinical team and patient consented to the test as part of standard care before the service evaluation took place.

RESULTS AND DISCUSSION
A total of 50 patients (23 male, 27 female) aged 51 to 91 years (mean ± SD: 73.8 ± 11.36) had their visual fields assessed on a Humphrey perimeter (HFA; model 740, Carl Zeiss Meditec, Dublin, CA). Automated static visual field assessment was performed using a 24-2 SITA-FAST strategy. During the initial recruitment phase, 24 patients with suitable Humphrey visual field assessment results were identified for whom there was no corresponding Medisoft test present on the EHRs; these patients therefore had to be excluded from the study. A further 2 patients from the stage 4 group were excluded because, although they fulfilled all other criteria, the TD and PD decibel plot fields were not fully present. Table 2 outlines the results of a direct comparison between the data of both software types. Patient ID and demographic details were found to be identical, with no changes in the representation of this information. Fixation losses were also the same, both being expressed as fractions. Conversely, false negative rates and false positive rates were non-identical, as the Humphrey data was represented in percentage format and the Medisoft data was recorded as a fraction. These parameters were all converted to percentage format for the purposes of analysis. All MD values and PSD values were recorded in the same format; however, they were numerically different. Finally, all data plots were present and in the same format but with numerical differences for the TD and PD decibel plots. The raw data for the overall gross decibel plots was exactly the same across all groups.

Fixation Losses
All fixation losses for both software types were converted from fractions to percentages to allow statistical analysis, and directly compared against one another for any differences. Fixation losses were like-for-like for both software types in all but two patients, with no significant difference detected across all results (P > 0.50, unpaired t test). The two anomalous results occurred in Stage 3 patients, where fixation loss results were inconsistent between the two software outputs.
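The fraction-to-percentage conversion mentioned above amounts to simple arithmetic. The sketch below assumes the printout expresses the index as a string of the form "losses/trials"; that string format, the helper name `fraction_to_percent` and the example values are assumptions for illustration only.

```python
def fraction_to_percent(text):
    """Convert a reliability index such as '1/12' into a percentage."""
    numerator, denominator = (int(part) for part in text.strip().split("/"))
    if denominator <= 0:
        raise ValueError("number of trials must be positive")
    return 100.0 * numerator / denominator


# Hypothetical fixation-loss entries as they might appear on a printout.
examples = ["0/12", "1/12", "3/15"]
percentages = [round(fraction_to_percent(entry), 1) for entry in examples]
```

Expressing both outputs on the same percentage scale is what allows the like-for-like comparison reported in this section.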

False Negative Rates
The false negative rates for all Medisoft patient data were converted to percentages before analysis. Stage 0 patient data was not included as all rates were identical at 0%. On comparison of mean false negative rates between the two software outputs for stage 1 patients, a significant difference was detected (Fig. 2A). The mean difference between Humphrey and Medisoft software data was 2.40 ± 1.87 (P < 0.0005, unpaired t test).
The actual false negative rates in this group were below 5% for all Humphrey patient data, and recorded at 0% by Medisoft software. The percentage rates in Stage 2 were also all below 5%, and again recorded at 0% by Medisoft software; however, the difference in this group was not significant.

False Positive Rates
Analysis for false positive rates was not possible due to lack of quantitative data for this field on the Medisoft system.
Key: Y = yes (present), N = no (absent), I = identical field output, NI = non-identical field output. *Two patients excluded.

Comparison of Mean MD Values
No statistically significant difference in mean deviation values was observed between the software types across all groups (P > 0.50, unpaired t test). A general trend was observed for the mean MD value to be slightly overestimated by Medisoft in stage 0 patients. The data was slightly underestimated in comparison to the Humphrey software in all subsequent groups, with raw data collection values for this field showing the same trend (Fig. 3).
Bland-Altman comparison of all Humphrey and Medisoft MD values shows that the Humphrey values were on average the higher of the two, with a mean difference of 0.79 dB (Fig. 4). The 95% limits of agreement ranged from -1.01 to 2.59, which exceeded our clinically defined limit of ± 2 decibels. 47 of the 50 points fell within these limits, with three points exceeding the 95% level. A general trend shows that as the mean MD value increases, the severity of visual field loss increases in groups 0 and 1 only. The trend is inconsistent thereafter.

Comparison of Mean PSD Values
No statistically significant difference in PSD values was observed between the software types across all groups (P > 0.05, unpaired t test). However, a general trend showed the mean PSD value to be slightly overestimated by Medisoft in stage 0 patients and underestimated in all subsequent groups when compared to the Humphrey software (Fig. 5).
Bland-Altman comparison of Humphrey and Medisoft PSD values shows that the Humphrey values were on average the higher of the two, with a mean difference of 0.65 dB (Fig. 6). The 95% limits of agreement ranged from -0.52 to 1.81, which is within our clinically defined limit of ± 2 decibels. PSD values for all 50 patients were within these 95% limits. The graph shows a general trend of the difference increasing as the mean PSD value increases; however, the PSD value is not related to severity of loss according to the GSS staging system.

Total Deviation Probability Plot
The number of statistically significant probability plots for total deviation was counted for each patient on both data sets, and then compared. No difference was observed in the mean number of significant plots for all results (P > 0.05, unpaired t test). The mean number of significant plots was overestimated by Medisoft for patients in stages 0-2 (Fig. 7 A, B, C), and was more consistent with the Humphrey mean at the extremes of visual loss for patients grouped in stages 3 and 4 (Fig. 7 D, E).
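The unpaired comparison of counts used here can be sketched as below. The text does not state which variant of the unpaired t test was used, so the sketch assumes the classical equal-variance (pooled) Student's form; the function name `unpaired_t` and the counts are hypothetical, and the P value is left to be read from a t table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(sample_a, sample_b):
    """Student's unpaired t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / dof
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("both samples have zero variance; t is undefined")
    return (mean(sample_a) - mean(sample_b)) / std_err, dof


# Hypothetical counts of significant plots per patient for one stage.
humphrey_counts = [4, 6, 5, 7, 3]
medisoft_counts = [5, 7, 6, 8, 4]
t_stat, dof = unpaired_t(humphrey_counts, medisoft_counts)
```

For these made-up counts the statistic is well inside the two-sided 5% critical value for 8 degrees of freedom (about 2.31), which is the kind of result summarised above as P > 0.05.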

Pattern Deviation Probability Plot
The number of statistically significant probability plots for pattern deviation was counted for each patient on both data sets, and then compared. No difference was observed in the mean number of significant plots across all results (P > 0.05, unpaired t test). The mean number of significant plots was underestimated by Medisoft for patients in stages 0 and 1 (Fig. 8 A, B), and was more consistent with increasing field loss (Fig. 8 C, D, E).
The Humphrey perimeter produces a summary of the testing data and provides a clinical overview of the visual field result. A large amount of work has been undertaken in the field to determine the gold-standard protocol for visual field assessment, including intra-perimeter comparisons [18,22,23], the clinical value of performing static and kinetic field assessments [24-26], and different techniques [27,28]. There is, however, a scarcity of comparable works in the literature investigating the role and accuracy of visual field assessment data between different software systems.

Fig. 6. Bland-Altman plot for mean PSD values from both software types
On review of our data analysis, a range of differences and similarities were found when comparing the two software systems. The fixation loss reliability result was the same in all but two patients. This anomaly may be due to an error in detection of the original result, as both the Humphrey data plots suggested that the actual value was 0 losses. Further work would be required to confirm this. The false positive rate was reported for all Humphrey data as a percentage; for this same field Medisoft did not report any numerical value other than N/A. This output suggests that Medisoft applies a threshold below which no value is reported. We believe that this threshold is above the 10% rate, as all recruited patients had a Humphrey value of less than 10%. The false negative rate was analysed and demonstrated a statistically significant difference for patients grouped at Stage 1 severity. The reason for this was that all false negative values below 5% were interpreted as 0% once imported to Medisoft. The threshold value applied to this field is responsible for this finding.
The values on the gross decibel plot were noted to be identical on both sets of data. Given that this data field corresponded exactly between both software systems, we can assume that the raw data imported to Medisoft is accurate. However, the changes in MD and PSD values, as well as the differences in TD and PD probability plots, imply that they are the result of Medisoft's analysis of the data. The change in these values post-analysis also implies that Medisoft uses a different standard population to calculate secondary dependents such as MD and PSD values. In support of this, the Bland-Altman analysis for both these fields revealed higher mean bias values for Humphrey patient groups, with similar trends across stages of glaucoma severity. Furthermore, three patients had differences in MD between both results that exceeded our clinical cut-off of 2 dB difference. A key determinant of diagnostic accuracy is the reference standard used for the test itself, which ideally should be comparable to the gold-standard diagnostic test at the time [29]. The exact error of the estimate between two systems is not yet quantifiable [30]; thus further work is required to address this.
This study was conducted as a feasibility study to evaluate and analyse the potential for discrepancies between Humphrey and Medisoft software. As such, only 50 patients were recruited to the study, with small numbers of ten results in each category of visual field severity. Thus our findings must be interpreted with caution. Furthermore, we only considered one test strategy: the 24-2 central threshold programme. We recognise these as limitations of this study but are using our data to inform the planning of a larger, prospective study with a representative sample size across all categories of visual field severity, inclusive of multi-centre recruitment and considering various perimetry test strategies. We believe such a study is important to identify whether differences in representation of perimetry results become clinically and statistically significant in a large study cohort, as this is particularly relevant to interpretation of perimetry results when monitoring patients and making treatment decisions.

CONCLUSION
The findings of our study support the null hypothesis of our initial aims, suggesting that, overall, there is no clinically significant difference between the Humphrey and Medisoft software systems. However, this feasibility study had a small sample size and highlights the need for a future, larger and adequately powered study. Ultimately, if one system of reviewing results is selected over the other so that comparisons of results are always made from the same operating system, then the minor differences observed would not impact on patient care or management. The value in using a system such as Medisoft is that it offers the ability to assimilate data from a diverse range of ophthalmology instruments into a single integrated database, improving patient care and clinical outcomes.