A Virtual Microscope for Academic Medical Education: The Pate Project

Background: Whole-slide imaging (WSI) has become more prominent and continues to gain in importance in student teaching. Applications with different scope have been developed. Many of these applications have either technical or design shortcomings.
Objective: To design a survey to determine student expectations of WSI applications for teaching histological and pathological diagnosis. To develop a new WSI application based on the findings of the survey.
Methods: A total of 216 students were questioned about their experiences and expectations of WSI applications, as well as favorable and undesired features. The survey included 14 multiple choice and two essay questions. Based on the survey, we developed a new WSI application called Pate utilizing open source technologies.
Results: The survey sample included 216 students: 62.0% (134) women and 36.1% (78) men. Out of 216 students, 4 (1.9%) did not disclose their gender. The best-known preexisting WSI applications included Mainzer Histo Maps (199/216, 92.1%), Histoweb Tübingen (16/216, 7.4%), and Histonet Ulm (8/216, 3.7%). Desired features for the students were latitude in the slides (190/216, 88.0%), histological (191/216, 88.4%) and pathological (186/216, 86.1%) annotations, points of interest (181/216, 83.8%), background information (146/216, 67.6%), and auxiliary informational texts (113/216, 52.3%). By contrast, a discussion forum was far less important (9/216, 4.2%) for the students.
Conclusions: The survey revealed that the students appreciate a rich feature set, including WSI functionality, points of interest, auxiliary informational texts, and annotations. The development of Pate was significantly influenced by the findings of the survey. Although Pate currently has some issues with the Zoomify file format, it could be shown that Web technologies are capable of providing a high-performance WSI experience, as well as a rich feature set.
(Interact J Med Res 2015;4(2):e11) doi: 10.2196/ijmr.3495


Introduction
Background
Whole-slide imaging (WSI), also known as virtual microscopy, has become increasingly important in e-learning during the past decade. It is being used for general educational purposes, graduate education, pathology training, tutoring, and virtual workshops [1]. In many settings, WSI has already replaced the conventional microscope [1]. Recognizing the potential benefits of such applications for medical education, we conducted a survey to determine students' expectations of WSI applications for teaching purposes. Based on these findings, a WSI application for pathological specimens, the Pate application [2], has been developed. The primary goal of developing Pate was to provide a tool that enables students to improve their skills in identifying histomorphological and pathological features on virtualized histological slides. Furthermore, Pate makes it possible to explain the pathogenesis and pathophysiology of diseases on the basis of the morphological correlation.
As a basic characteristic of WSI, all these applications support latitude in the slide with a varying degree of usability. Some of the applications, such as ScanScope Images, Histologiekurs, and Mainzer Histo Maps, support annotations. Furthermore, Histologiekurs provides background information about the donor, informational texts about the specimen, and support for points of interest. However, none of the listed applications, except NYU Virtual Microscope (NYUVM), support small-screen devices or a touch interface. Several of these WSI applications use Adobe Flash to implement the client. This requires the user to install a browser plug-in before using the application. Moreover, many recent devices, such as Apple mobile devices, do not support those plug-ins. WSI applications using proprietary plug-ins include vMic, Mainzer Histo Maps, ScanScope Images, and VSlides. In contrast, there are also WSI applications that use Hypertext Markup Language 5 (HTML5) technologies. These include NYU Virtual Microscope and Virtuelle Pathologie Magdeburg. By using HTML5 technologies, these applications avoid the disadvantages of browser plug-ins.
Most applications, whether based on Flash, Silverlight, or HTML5, lack important features, such as points of interest (POI), advanced annotations, informational texts, or a map scale. They provide basic image presentation capabilities, but fail to support important features that WSI technology makes possible.

Demand Assessment
Due to the variety of options offered by new WSI technologies [10][11][12], a major prerequisite for a new WSI tool was to investigate which features would benefit students without compromising the usability of Pate. The users' opinions were therefore crucial in identifying and eliminating undesirable features. To achieve this goal, we conducted a survey that targeted functional needs and usability from the perspective of students as potential users of a new WSI tool. The questionnaire covered their expectations of Pate, their experiences with existing WSI tools, positive and negative features, and feature suggestions.

Slide Acquisition
As stated by Glatz-Krieger et al [8], the quality of virtual slides is defined by four crucial parameters, namely, the quality of the histological section, the completeness of the histological section, the quality of the scanned image, and the usability of the virtual slides. Of these parameters, the quality and completeness of the section should be guaranteed during the physical slide acquisition. These parameters are essential for high-quality slide imaging.

Image Quality
The image quality is highly influenced by optical focusing during slide scanning. For this, two main methods are currently available. The first method, z-stacking [13], stacks multiple planes with different focus settings and thereby emulates a physical microscope more closely [14]. This method also leads to higher memory consumption. However, the slide acquisition process is less complicated, since only the middle optical plane needs to be positioned near the mean focal plane of the glass slide.
The second method uses a single virtual focal plane that approximates the best focus throughout the whole glass slide. Because this procedure results in lower memory consumption, it was chosen for digitizing the histopathological slides for Pate. In order to ensure optimal results, we manually inspected the focal points automatically suggested by the scanner software and corrected them where necessary.

Usability
Features for optimal usability are smoothly scrolling images, short access times, orientation, and several options for magnification. Furthermore, a good user interface design is of major importance. To achieve this, we put special emphasis on mobile devices alongside the classical desktop, and therefore aimed to support small screens as well as a touch interface.

Resources
The Pate suite offers a family of differently scaled versions for each of the high-resolution images [15]. Thus, a user can conveniently choose the scale of interest while having fast response times due to the small bandwidth used during data transfer. This multiresolution representation can be obtained by a cascade of downsampling operations on a dyadic grid similar to the discrete Haar-wavelet decomposition of rasterized images [16]. At first sight, providing a family of images for different resolutions appears inefficient in terms of memory usage. As we will show, the storage size of the multiresolution representation of images used with tiles is bounded by merely 133% of the original image size.
Because each downsampling step on the dyadic grid halves both image dimensions, every level requires one-quarter (q=1/4) of the storage of the level before it. As a result, the overall size, M, can be written in terms of a finite geometric series, as displayed in Figure 1. The expression for M is bounded from above by the infinite geometric series (N→∞), which converges for all q satisfying 0≤|q|<1 [17]. For the choice q=1/4, the overall memory consumption relative to the original image size, S, is obtained as M/S<4/3. Thus, all resolutions can be stored with less than one-third of additional memory.
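The bound can be written out explicitly. The following is a sketch under the assumptions that S denotes the storage size of the full-resolution image, N the number of downsampling steps, and that each step reduces the storage by the constant factor q; the exact notation of Figure 1 may differ.

```latex
M \;=\; S \sum_{k=0}^{N} q^{k}
  \;=\; S \, \frac{1 - q^{N+1}}{1 - q}
  \;<\; \frac{S}{1 - q}
  \;=\; \frac{4}{3}\, S
  \qquad \text{for } q = \tfrac{1}{4}.
```

The strict inequality holds for every finite N because the omitted term q^{N+1} is positive.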

Student Survey for the Expectations of a Virtual Microscope
The student survey was designed in cooperation with the Center for Quality Assurance and Development of the Johannes Gutenberg-University Mainz, Germany. It featured 16 items with question types including multiple choice, free field, and single choice. A total of 216 students in the third year (fifth and sixth semesters) of medical education participated in the survey. Of these, 62.0% (134/216) were female, 36.1% (78/216) were male, and 1.9% (4/216) did not disclose their gender. At the time of the survey, all students were participating in either the course General Pathology or Special Pathology. Prior to this, in their second year of medical education, all students had already completed a histological course. The voluntary survey was conducted during the first lecture of the course and included all the students of the year.
The survey aimed to investigate students' expectations of a virtual microscope with a view to functional needs, such as useful features (eg, points of interest, annotations), and usability questions (eg, user-friendly handling). Furthermore, their experiences with already existing WSI applications were recorded. Besides establishing baseline data, including gender, age, and time-based Internet usage, items were included that covered preexisting WSI experiences by asking for prior WSI usage. In addition, students were required to specify which applications were already known. Furthermore, we asked for expectations by enumerating items, which could be chosen if considered important. Finally, a free-field item for the students' own suggestions completed the survey. The entire set of items is given in Table 1 (see Multimedia Appendix 1 for a detailed description of the variables).

Development of the Client
The client application was implemented using JavaScript, HTML5, and Cascading Style Sheets (CSS). As JavaScript implementations differ by browser engine [18], the frameworks jQuery 1.7.1 (jQuery Foundation), MochiKit 1.4.2 (Mochi Media), and Modernizr 2.5.3 were used to abstract JavaScript code from the browser implementation. OpenLayers 2.12 was employed to display image slide data.

Development of the Backend
To speed up application development, the backend was developed using the Web Server Gateway Interface (WSGI) abstraction layer. Python 2.7 (Python Software Foundation) was used as the programming language in combination with the TurboGears 2.2 framework. The database is hosted by a MySQL Server 5.1.66 (Oracle Corporation, Redwood Shores, CA).
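To illustrate what the WSGI layer provides, the following minimal sketch shows a plain WSGI callable that answers tile requests. It is not Pate's code: the URL scheme `/tiles/<slide>/<level>/<col>/<row>.jpg`, the names `TILES`, `parse_tile_path`, and `tile_app`, and the in-memory dictionary standing in for the database are all assumptions made for this example, and TurboGears is deliberately left out.

```python
"""Minimal WSGI tile endpoint (illustrative sketch, not Pate's code)."""

# Hypothetical in-memory stand-in for the tile table in the database.
# Key: (slide_id, level, column, row); value: encoded image bytes.
TILES = {(1, 0, 0, 0): b"\xff\xd8placeholder-image-bytes\xff\xd9"}


def parse_tile_path(path):
    """Parse '/tiles/<slide>/<level>/<col>/<row>.jpg' into four integers.

    Returns None if the path does not match the assumed scheme.
    """
    parts = path.strip("/").split("/")
    if len(parts) != 5 or parts[0] != "tiles" or not parts[4].endswith(".jpg"):
        return None
    try:
        return (int(parts[1]), int(parts[2]), int(parts[3]), int(parts[4][:-4]))
    except ValueError:
        return None


def tile_app(environ, start_response):
    """WSGI application: return the requested tile or a 404 response."""
    key = parse_tile_path(environ.get("PATH_INFO", ""))
    body = TILES.get(key) if key is not None else None
    if body is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"tile not found"]
    headers = [("Content-Type", "image/jpeg"), ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]
```

Because the callable follows the WSGI interface, it could be served for local experiments with `wsgiref.simple_server.make_server` from the Python standard library, or mounted behind any WSGI-capable Web server.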

Slide Acquisition
We decided to use a single, virtual plane of focus throughout the physical glass slide. Maintaining a good focus throughout the slide with this method proved to be more difficult than initially assumed. The autofocus functionality only provided a rough starting point, so the focal points had to be adjusted in nearly every glass slide to achieve optimal results.
The slides were converted into image files with the slide scanner NanoZoomer 2.0-HT (C9600-13) (Hamamatsu Photonics Deutschland GmbH, Herrsching am Ammersee, Germany), which produces high-quality scans. Unfortunately, the scanner software did not produce an open image format that we could use. We therefore had to convert the image files into a format we could utilize. In the meantime, a set of tools has been published to handle the Hamamatsu image format (NDPI) [19,20]. We decided to implement the Zoomify file format.

Image File Storage
Due to the small file size of approximately 20 KB per tile, the slide image data themselves were put into the database, since it would be more expensive to open a file handle for each tile than to query the database [19]. However, this created a backup problem. A common way to back up a database is to dump it at regular intervals and keep the dumps in a safe place. Because of the size of the image data, these dumps became huge. With the chosen image quality of the slides and slide dimensions, the file sizes in Pate amounted to approximately 1.5 GB to 2 GB per slide. It would have been inefficient to keep multiple backups of that size which contained mostly redundant data. As the dumps are text files, a revision control system (Git 1.8.3.2) was installed to dump the database regularly, to store the dumps efficiently by accounting only for the differences between revisions, and to permit recovery of backups from any point in time.
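A dump-and-commit cycle of this kind can be sketched as follows. This is an assumption-laden illustration rather than the script used for Pate: the function names, the database name, and the dump file name are hypothetical, and the choice of `mysqldump` options (`--skip-extended-insert`, `--hex-blob`, `--result-file`) is ours, selected because they produce line-oriented text that a revision control system can compare between revisions.

```python
"""Sketch of a dump-and-commit backup step (illustrative only)."""
import subprocess


def build_backup_commands(database, dump_file, message):
    """Return the commands for one backup cycle as argument lists."""
    dump = [
        "mysqldump",
        "--skip-extended-insert",  # one INSERT per row keeps the dump line-oriented
        "--hex-blob",              # write binary tile data as hexadecimal text
        "--result-file=" + dump_file,
        database,
    ]
    add = ["git", "add", dump_file]
    commit = ["git", "commit", "-m", message]
    return [dump, add, commit]


def run_backup(database, dump_file, message, repo_dir):
    """Run the commands in order inside repo_dir; stop at the first failure.

    Assumes database credentials are configured outside this script and
    that repo_dir is already an initialized Git repository.
    """
    for command in build_backup_commands(database, dump_file, message):
        subprocess.run(command, cwd=repo_dir, check=True)
```

A call such as `run_backup("pate", "pate.sql", "nightly dump", "/path/to/backup-repo")` would then be scheduled at regular intervals. Note that `git commit` exits with a nonzero status when the dump has not changed, so a production script would need to handle that case.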

Web Design
We commissioned Grüne Kommunikationsdesign, Bodenheim, Germany, for the task of Web design. The goals included design of a streamlined user experience, high usability on both mobile devices and desktops, as well as an appealing graphical interface.

Overview
The Web application, Pate, was developed according to the expectations of medical students. The analysis of the questionnaire revealed that 97.1% (198/204) of the students considered WSI as an important learning tool in the training of histopathological skills. Furthermore, the students assessed WSI as desirable for exam preparation.

Requested Features
A total of 83.8% (181/216) of the students accorded a high priority to points of interest as a prime feature of Pate. Annotations, both histological (191/216, 88.4%) and pathological (186/216, 86.1%), as well as auxiliary informational texts (113/216, 52.3%) were also evaluated positively. In contrast, a discussion forum seemed to have little importance for the students, since only 9 out of 216 students (4.2%) recommended this feature. Figure 2 shows the importance of WSI application features as rated by the students. Furthermore, in the free-field part of the survey, a quiz mode was suggested by the students.

Previously Known Whole-Slide Imaging Applications
One goal of the survey was to evaluate students' preexisting experiences with other virtual microscopes, including non-WSI systems. For this purpose, the survey included a multiple choice question listing the best-known applications. As expected, since all students had already completed a histological course that promoted this specific system, most students (199/216, 92.1%) already knew the Mainzer Histo Maps application [4], an online image collection of histological slides of different human organs. The second best-known application, Histoweb Tübingen [20], was far less familiar (16/216, 7.4%), followed by Histonet Ulm [21] (8/216, 3.7%). All other explicitly listed systems were known by fewer than 2% of the students. Of the students, 6.9% (15/216) were familiar with an application which was not listed. NYU Virtual Microscope was not part of the questionnaire since this application was not yet published at the time of our students' survey. Figure 3 shows the graphical results of students' familiarity with WSI applications.
Figure 3. Previously known WSI applications in relation to students' degree of familiarity with these applications.

Statistical Correlation and Descriptive Analysis of the Dataset
In order to reveal any correlations in the dataset of the questionnaire, we created a heat map of the Spearman rank correlation coefficient, ρ (denoted r in Figure 4), for every pair of variables in the questionnaire. The resulting figure shows all correlation coefficients and P values, where applicable. For some variables, however, a correlation cannot be computed because there is no variation in the dataset. This applies to the variables of two known WSI applications, as well as the usage of a mobile phone as a device to access Pate. Furthermore, most of the correlations are not statistically significant (P>.05). However, some clusters of moderate correlation (ρ<.6) remain. For example, there appears to be an association between wanted WSI features, as well as an association between some known WSI applications.
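For readers unfamiliar with the statistic, the following pure-Python sketch computes the Spearman coefficient for one pair of variables as the Pearson correlation of their ranks, with tied values sharing an average rank. It is not the analysis code used for Figure 4; the function names are our own, and returning `None` for a variable without variation is an assumption that mirrors the "not applicable" cells described above.

```python
"""Spearman rank correlation with tie handling (pure-Python sketch)."""
from math import sqrt


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Return Spearman's rho, or None if either variable has no variation."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return None  # correlation is undefined for a constant variable
    return cov / sqrt(var_x * var_y)
```

Applied to every pair of questionnaire variables, for example yes/no items coded as 0 and 1, such a function yields the matrix that a heat map visualizes. P values would require an additional significance test that is not shown here.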

The New Whole-Slide Imaging Tool, Pate
The results of this survey provided the basis for the development of a novel, user-friendly application built using modern Web technologies, such as HTML5, CSS, and JavaScript. These technologies provide a unified user experience across all major platforms, such as PCs, tablets, and mobile phones. For optimal use, a modern Web browser is recommended.
Pate contains 118 high-quality histopathological specimens from the major human pathological conditions, enriched by several specimens regarding cell-tissue interactions. The slides showing a full-thickness cartilage defect and a punch-biopsy skin wound demonstrate the potential benefits of WSI applications in biomaterial research as well. These slides are enriched using nondestructive annotations, as well as points of interest. Each slide in Pate can be shared by distributing its URL. This allows easy sharing of large images as soon as the slides are digitized. Utilizing modern Internet technologies, such as HTML5, CSS, and JavaScript, enables the user to view the image material with a modern Web browser; no proprietary plug-in is required. Pate supports devices with differing screen sizes, such as PCs and mobile phones, by utilizing responsive Web design methods (see Figure 5).

Performance of Pate
It is hard to determine the actual performance of the image-serving capabilities of Pate, because it depends on various factors, such as the performance of the Internet service provider (ISP) of the server, the bandwidth the ISP offers to the user, and specific routing conditions, among others [22]. Therefore, we chose a better-controlled test setup in which another computer in the same local area network as the server ran the tests. This allowed us to reduce the influence of poor network performance. We created a list of 50 randomly sampled URLs, each addressing a 256×256 pixel image tile served by the Pate image server. All caching had been disabled. We set up Siege 2.70 (Joe Dog Software) [23], a load-testing and benchmarking tool, in order to simulate highly concurrent transactions. One transaction comprises a complete HTTP request of one randomly sampled URL, the download of the image data, and the closing of the connection. The degree of concurrency determines how many transactions are performed in parallel. When one transaction is finished, the next one is started immediately. Siege instantiates one worker per unit of concurrency, and every worker performs 1000 sequentially executed transactions. For example, a concurrency level of 4 results in 4×1000=4000 transactions. This test was performed for concurrency levels ranging from 1 to 20. The server is powered by a dual-core Intel Xeon E5504 @ 2.00 GHz CPU, 4.6 GB of RAM, and multiple hard drives in a Redundant Array of Independent Disks (RAID) 5 system with a data retrieval capacity of up to 109 MB/s. The results are shown in Table 2 and illustrated in Figure 6. Table 4 depicts the median response time in ms, including the standard deviation, as well as the transactions per second, pixels per second, and total and individual pixels per worker per second.
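The arithmetic behind these quantities can be sketched as follows. The helper names are our own, and the conversion from response time to transaction rate assumes a single worker issuing requests back to back with no idle time, which only approximates what a load-testing tool reports.

```python
"""Benchmark arithmetic for tile serving (illustrative sketch)."""

TILE_SIDE = 256                 # pixels per tile edge, as in the test setup
TRANSACTIONS_PER_WORKER = 1000  # sequential transactions per worker


def total_transactions(concurrency):
    """Total number of transactions issued at a given concurrency level."""
    return concurrency * TRANSACTIONS_PER_WORKER


def megapixels_per_second(transactions_per_second):
    """Convert a tile transaction rate into delivered megapixels per second."""
    return transactions_per_second * TILE_SIDE * TILE_SIDE / 1e6


def sequential_rate(response_time_ms):
    """Approximate transactions per second for one back-to-back worker."""
    return 1000.0 / response_time_ms
```

For instance, `megapixels_per_second(sequential_rate(t))` gives a rough single-worker throughput for a typical response time `t` in milliseconds. Because a median is not a mean, plugging in a median response time yields an estimate rather than the measured throughput.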

Principal Findings
The goal of this project was to create a new WSI application for histopathological education according to students' demands and expectations. Therefore, a feature set was extracted from existing WSI applications, including histological and pathological annotations, points of interest, background information, latitude in the slide, teaching texts, and a discussion forum. Then, a questionnaire was designed to evaluate the actual needs of medical students, as well as their expectations and their experience with WSI applications. The results of the survey formed the basis of the feature set for Pate. From a technical point of view, a further goal of the development was to utilize established Web technologies, such as HTML5 and JavaScript, in order to support as many platforms and devices as possible without the requirement to install any kind of software in advance. In addition, we put an emphasis on supporting mobile platforms with small screens and touch interfaces. This resulted in a unique set of requirements for the development of Pate that was not covered by any other application. During the development, student feedback ensured that the desired features were integrated as intended by the results of the survey.
With the survey, we identified a high demand for a broad feature set, including annotations, POIs, and auxiliary informational texts. We were surprised to learn that, by contrast, the students placed little emphasis on a discussion forum that would permit direct contact with the staff. From the viewpoint of the organizer, this simplifies the maintenance of Pate, since moderating such a forum is time-consuming and personnel-intensive. Nevertheless, from the viewpoint of the teacher, it would be of interest to establish what problems the students are having in learning histopathological skills. Such a forum could open interesting perspectives for giving live information and reacting to the needs of the medical students. However, an open forum carries the risk of containing uncontrolled information, as well as incorrect data, which are not useful for an e-learning application. Therefore, the development of a discussion forum was abandoned. A statistical analysis of the survey dataset revealed no further insight besides a correlation of desired WSI features, as well as an association between some known WSI applications.
During development of Pate, a scripting language was used, which allowed a swift response to the students' demands and enabled a rapid application development process. By using a professional Web design, a user-friendly and intuitive Web frontend was created. In this context, we focused on supporting mobile devices, as well as conventional desktop computers.
Current limitations of Pate arise mainly as a result of the Zoomify file format used to store digital slides. This format utilizes tiles to limit the data which must be transferred to the client. This leads to approximately 200,000 files for a regular slide. The costs of retrieving a file handle for each file can be reduced by storing the image files in a database. However, storing the image data in the database complicates the backup. Commonly, time-stamped database dumps are used. This would lead to a volume of data which would be hard to handle. In Pate, this issue was solved by using a version control system. However, creating a backup would be much less difficult without the image files in the database.
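How the number of tile files grows with slide size can be estimated with a short sketch. It assumes a pyramid in which each level halves the previous one (rounding up) until the whole image fits into a single 256×256 tile; the rounding convention used by actual Zoomify converters may differ, and the function names are hypothetical.

```python
"""Estimate the number of tiles in a tiled image pyramid (sketch)."""
from math import ceil


def pyramid_levels(width, height, tile=256):
    """Return (width, height) per level, full resolution first.

    Each further level halves both dimensions (rounded up) until the
    image fits into a single tile.
    """
    levels = [(width, height)]
    while width > tile or height > tile:
        width, height = max(1, ceil(width / 2)), max(1, ceil(height / 2))
        levels.append((width, height))
    return levels


def tile_count(width, height, tile=256):
    """Total number of tiles over all pyramid levels."""
    return sum(
        ceil(w / tile) * ceil(h / tile)
        for w, h in pyramid_levels(width, height, tile)
    )
```

The full-resolution level dominates the total, so the count is close to four-thirds of the number of tiles at that level, in line with the geometric series argument for storage size.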
One of the design goals of Pate was to support mobile devices. These devices commonly have a slow Internet connection. Because image quality was the first priority when creating the Zoomify tiles, these have a mean size of approximately 20 KB. This can lead to prolonged loading times when using a slow Internet connection.
Finally, the new Web application, Pate, is modeled closely on the expectations of the students by providing a complete set of features, such as points of interest, annotations, and informational texts, which other WSI applications currently do not offer as a combined feature set.

Follow-Up Survey
As Pate's development has now sufficiently progressed, we will conduct a follow-up survey to determine whether the students' needs have changed and whether the feature set has been implemented in a satisfactory way. Furthermore, we will establish a constant feedback loop to be able to respond to new challenges promptly.

Quiz Mode
The survey revealed that many students would appreciate a quiz mode. Since Pate was not designed to offer tests of any kind, this feature would have to be implemented from scratch. Pate includes slide image data, meta-information, and regions of interest within each slide. Thus, views of POI regions could be generated, which the students would then be asked to diagnose.

Image Server
The image-serving capabilities of Pate are competitive. The median image retrieval time was measured as 18.8 ms (SD 4.3) while achieving 3.4 MP per second. Peak performance was reached at 15.4 MP per second. This puts Pate in the same performance category as other state-of-the-art image servers [24]. Most of the limitations of Pate are inherited from the Zoomify file format, with its prerendered image tiles stored in a database. A more convenient way to handle large image data would be to keep a single file per slide, stored outside the database, while still allowing the user to quickly retrieve any section at any magnification. The Tagged Image File Format (TIFF) supports storage of pyramidal image data. Therefore, one future goal is to adapt or develop an image server that reads pyramidal TIFF data server-side and delivers the requested image tiles to the client in the Zoomify file format. This approach avoids the need to redevelop the client and brings the benefit of a file format that stores slide image data in one large file. Furthermore, the image data sent to the client could be compressed according to the currently available bandwidth, thus providing a smoother experience for users with limited bandwidth, especially users on mobile networks.
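The core step of such a server is to translate a requested tile into a pixel region of the corresponding pyramid level, which can then be read from the single image file. The following sketch shows only that coordinate mapping; the function name, the argument order, and the zero-based column and row indices are assumptions, and reading the region from a pyramidal TIFF is deliberately left out.

```python
"""Map a tile request to a pixel region of a pyramid level (sketch)."""


def tile_region(level_width, level_height, col, row, tile=256):
    """Return (left, top, right, bottom) in level pixel coordinates.

    The region is clipped to the level bounds, so tiles at the right and
    bottom edges may be smaller than tile x tile pixels.
    """
    left, top = col * tile, row * tile
    if col < 0 or row < 0 or left >= level_width or top >= level_height:
        raise ValueError("requested tile lies outside the level")
    right = min(left + tile, level_width)
    bottom = min(top + tile, level_height)
    return (left, top, right, bottom)
```

Because the client keeps requesting tiles by level, column, and row, this mapping is the only place where the server needs to know how the image data are physically stored.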