Automated bacterial identification by angle resolved dark-field imaging

We propose and demonstrate a dark-field imaging technique capable of automated identification of individual bacteria. An 87-channel multispectral system capable of angular and spectral resolution was used to measure the scattering spectrum of various bacteria in culture smears. Spectra were compared between various species and between various preparations of the same species. A 15-channel system was then used to prove the viability of bacterial identification with a relatively simple microscope system. A simple classifier was able to identify four of six bacterial species with greater than 90% accuracy in bacterium-by-bacterium testing.


Introduction
Accurate identification of bacteria is vital for many applications such as infectious disease diagnosis, food contamination monitoring, and microbiological research. Common methods of bacterial identification include culture, PCR, and microscopy. Culture and PCR require a laboratory setting and significant resources to perform. In addition, bacterial culturing remains a time-consuming process, taking days (or, in the case of slow-growing bacteria, weeks) to achieve sufficient quantities of bacteria for reliable identification. While microscopy is a well-established method in microbiology, it relies on skilled microscopists to identify key morphological traits and stains that generally lack genus-level specificity. This skill requirement has restricted the adoption of microscopy in resource-limited environments such as field clinics for malaria and tuberculosis diagnosis [1]. Outside of maintained laboratories there is often a lack of the equipment, consumables, and experience needed to effectively detect and identify bacterial contaminants or pathogens using culture, PCR, or microscopy [2]. Accurate, on-site bacterial identification becomes particularly important for the diagnosis of diseases that require medical intervention such as drug administration, as patient follow-up is often difficult in resource-limited settings.
Scattering measurements have been used to identify bacteria and other cells for decades [3][4][5][6]. First proposed in the 1960s, flow cytometer systems capable of fine angular resolution scattering measurements have been shown to produce effective identification for certain bacterial species. In addition, the theory behind extracting particle information from angle-resolved scattering measurements is well explored [7][8][9]. Angle-resolved flow cytometry systems are inherently complex and expensive, requiring elaborate optics to achieve sufficient angular resolution. More recently, systems with fewer channels have been explored in conjunction with more sophisticated machine learning algorithms [10].
Dark-field (DF) microscopy has seen relatively little use in clinical microscopy applications [11][12][13]. DF has been used to image bacterial flagella [14] and is an established procedure for the diagnosis of syphilis infections [13]. DF imaging is an inherently scattering-based method that can be used to determine optical characteristics that are generally unobservable in bright-field. While DF microscopy is relatively simple to implement, the numerical aperture necessary to resolve individual bacteria makes the fine angular resolution necessary for scattering-based identification cumbersome.
Recent methods have combined DF microscopy with hyperspectral cameras for nanoparticle imaging in biological applications [15,16], and hyperspectral dark-field microscopes are commercially available. The present work explores the capabilities of a dark-field microscopy system capable of both angular and spectral resolution, and its potential for identifying bacterial species without exogenous contrast agents. This method will be referred to as multispectral angular-resolved dark-field imaging (MARDI). We present the scattering spectra (resolved in wavelength and angle) for various bacteria and identify which intra-species variations can affect the scattering spectra. We also present a simple computer algorithm for bacterial identification based on the scattering spectrum.

Microscope system
To create a MARDI microscope, a monochromator was inserted between the lamp and the illuminator optics of a transmission-mode upright microscope. The monochromator had a FWHM of 10 nm and was varied from 420 nm to 700 nm in 10 nm increments. The Abbe condenser (dry, NA = 0.9) was fitted with a carousel containing three DF aperture rings to produce different angles of illumination (shown in Fig. 1(a)). Light was collected with a dry 40x long working distance objective (NA = 0.45). Images were captured with an Andor Luca-R EMCCD camera. The electron-multiplier gain of the EMCCD was kept at a constant value for all experiments, which was within the linear operation range of the EMCCD. This optical configuration is shown in Fig. 1(b). The 87 grey-scale images (29 wavelengths times 3 DF rings) were normalized by imaging 2 µm silica spheres (n = 1.45) suspended in Cytoseal (Richard Allan, Kalamazoo, MI, n = 1.48) and normalizing to expected values calculated using Mie theory. The 2 µm silica spheres were chosen to simulate the size and shape of bacteria and were easily resolved by the microscope. A 15-channel MARDI system was also used. This system was identical to the previously described system except that the monochromator was omitted and the EMCCD camera was replaced by a Leica EC3 RGB camera. The 15 channels comprised three colors (RGB) at five illumination angles. Normalization for this system was performed by white balancing on a white target.
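As an illustration of the channel layout and the sphere-based normalization, the following Python sketch builds the 29 × 3 channel grid and derives per-channel gain factors. The function names and array shapes are assumptions made for illustration, and the expected Mie values are treated as a precomputed input rather than calculated here.

```python
import numpy as np

# 29 wavelengths (420 to 700 nm in 10 nm steps) times 3 dark-field rings = 87 channels.
WAVELENGTHS_NM = np.arange(420, 701, 10)
N_RINGS = 3


def normalization_factors(sphere_measured, sphere_expected):
    """Per-channel gain that maps measured sphere intensity onto its expected value.

    Both inputs have shape (n_rings, n_wavelengths). The expected values are
    assumed to come from a separate Mie calculation (not performed here).
    """
    sphere_measured = np.asarray(sphere_measured, dtype=float)
    sphere_expected = np.asarray(sphere_expected, dtype=float)
    if sphere_measured.shape != sphere_expected.shape:
        raise ValueError("measured and expected arrays must have the same shape")
    if np.any(sphere_measured <= 0):
        raise ValueError("measured sphere intensities must be positive")
    return sphere_expected / sphere_measured


def normalize_spectrum(raw, gains):
    """Apply the per-channel gains to a raw (n_rings, n_wavelengths) measurement."""
    return np.asarray(raw, dtype=float) * gains
```

Applying the gains to the sphere measurement itself should return the expected values, which gives a quick sanity check of a calibration run.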

Bacterial samples
Haemophilus influenzae type b (Hib) 10211 was cultured on chocolate agar and in tryptic soy broth (TSB; Fluka, St. Louis, MO) in the presence of 10% CO2 for 24 h. Pseudomonas aeruginosa 15442 and Staphylococcus aureus 6558 were cultured on tryptic soy agar (TSA; Fluka) and TSB for 12-24 hours. Mycobacterium smegmatis mc2155 was cultured in Middlebrook 7H9 broth (Fluka) supplemented with AODC (Becton Dickinson, Sparks, MD) and Middlebrook 7H10 agar (BD, Franklin Lakes, NJ) with AODC growth supplement and grown for 24-48 hours. Mycobacterium bovis BCG 19274 was grown in 7H9 broth and 7H11 agar supplemented with AODC, with sodium pyruvate substituted for glycerol, for 3-4 weeks. All incubations were performed at 37°C. All cultured bacterial strains were purchased from ATCC (Manassas, VA). Escherichia coli, additional S. aureus, and Streptococcus pneumoniae culture smears were purchased as prepared slides (Ward's Natural Science, Rochester, NY). H. influenzae, S. pneumoniae, P. aeruginosa, and S. aureus are species commonly found in sputum and are leading causes of bacterial pneumonia [17][18][19][20]. M. smegmatis and M. bovis BCG are closely related to Mycobacterium tuberculosis (M.tb).
To test the effect of culture growth stage, samples were prepared from a single culture of M. smegmatis at culture ages of 6, 12, 24, and 48 hours. Using plate counts and optical absorption spectroscopy, the 6- and 12-hour-old samples were found to be in log phase, the 24-hour-old sample in stationary phase, and the 48-hour-old sample in death phase. An additional step of centrifugation and rinsing was used to concentrate these samples, denoted by C6, C12, C24, and C48, respectively. The stationary phase culture smears (C24) were used to compare sample preparation methods to the standard preparation of M. smegmatis (with samples denoted by N). As a third sample preparation for M. smegmatis, cultures were prepared using the same culture medium and sample preparation process as M. bovis BCG (this involved a modified culture medium and the use of dispersants to break up bacterial agglomerations before smearing). These samples are denoted by SP.
To test the effect of cell wall and membrane on scattering, the outer membrane of Gram-negative P. aeruginosa was stripped using a five-minute exposure to 70% ethanol solution after smearing. Scattering spectra were measured on unstained samples, which were then Gram stained to confirm successful membrane stripping.

Image processing and classifier algorithm
Figure 2 shows a diagram of MARDI post-processing. Significant objects in the original image stacks were identified using background subtraction and a dynamic threshold. Identified objects were then pre-filtered for area A and compactness c = P²/A, where P is the perimeter. Pre-filtering was designed to screen out objects too small or too large to be bacteria. Because the same size filter was used for all samples, clusters of smaller bacteria regularly passed the pre-filter and were included in subsequent testing. Once thumbnails of significant objects were extracted from the original images, backgrounds were subtracted and the intensity was summed over all pixels belonging to the object. This resulted in a single 87-dimensional vector (29 wavelengths, three angles) characterizing each identified object, which was then normalized for overall brightness and background subtracted. This vector is referred to as the object's scattering spectrum x. In addition to the scattering spectrum, area, eccentricity, and compactness were recorded for each object. While algorithms exist for shape identification in microscope images, including M.tb identification [21,22], these were omitted from our process to isolate the discriminatory capabilities of the scattering spectrum.
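The pre-filtering and spectrum extraction steps of MARDI post-processing can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function names, the edge-counting perimeter estimate, the threshold parameters, and the choice to normalize brightness by dividing by the summed signal are all assumptions.

```python
import numpy as np


def area_and_perimeter(mask):
    """Area in pixels and perimeter as the number of exposed pixel edges.

    The edge-counting perimeter is an assumed estimator; the source does not
    state how P was measured.
    """
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    core = padded[1:-1, 1:-1]
    neighbours = (padded[:-2, 1:-1], padded[2:, 1:-1],
                  padded[1:-1, :-2], padded[1:-1, 2:])
    # An edge is exposed when an object pixel has a background 4-neighbour.
    perimeter = sum(int(np.sum(core & ~n)) for n in neighbours)
    return int(mask.sum()), perimeter


def passes_prefilter(mask, min_area, max_area, max_compactness):
    """Keep objects whose area A and compactness c = P**2 / A are in range."""
    area, perimeter = area_and_perimeter(mask)
    if area == 0:
        return False
    compactness = perimeter ** 2 / area
    return min_area <= area <= max_area and compactness <= max_compactness


def scattering_spectrum(stack, mask, background):
    """Sum background-subtracted intensity over the object in every channel.

    stack:      (n_channels, height, width) image stack
    mask:       (height, width) boolean mask of the object
    background: (n_channels,) per-pixel background level (assumed known)
    """
    stack = np.asarray(stack, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    background = np.asarray(background, dtype=float)
    summed = (stack[:, mask] - background[:, None]).sum(axis=1)
    total = summed.sum()
    if total <= 0:
        raise ValueError("object has no signal above background")
    # Assumed brightness normalization: channels sum to one.
    return summed / total
```

For an 87-channel stack the returned vector plays the role of the scattering spectrum x; the area and compactness values can be stored alongside it.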
To test the usefulness of the bacterial scattering spectra for species identification, a Bayes-Gaussian classifier was used. The classifier assigns each test vector a likelihood score d_j for the j-th class, which depends on P(w_j), the probability of the j-th class occurring. The classification decision is made by assigning the test vector to the class with the highest likelihood score. All classifier results reported in this work were produced using leave-one-out cross-validation. In all tests an uninformative prior was chosen, so P(w_j) was set equal for all classes. Sensitivity and positive predictive value (PPV) were used as performance metrics for the classifier. Sensitivity is defined as the likelihood of a bacterium of known class being accurately identified by the classifier. PPV is defined as the likelihood that a bacterium classified as a specific class was in fact a bacterium of that class. Principal component analysis (PCA), implemented using singular value decomposition, was used to visualize cluster separation. In all PCA tests an equal number of test vectors was used for each test class.
Fig. 3 caption (partial): ... Fig. 2. Instead of pixel-by-pixel PCA, significant objects were identified and their pixels averaged before extracting the scattering spectrum. This method was found to eliminate the effects of chromatic aberrations, thus producing consistent scattering spectra over full fields of view. The colors in Fig. 3(c) and 3(f) were then assigned based on the PCA scores of each object's scattering spectrum.
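The classifier and performance metrics described in this section can be sketched in Python as follows. The sketch assumes the textbook Gaussian discriminant d_j(x) = ln P(w_j) - (1/2) ln|C_j| - (1/2)(x - m_j)^T C_j^{-1} (x - m_j), where m_j and C_j are the class mean and covariance; the exact expression used by the authors is not shown in the text. With equal priors the ln P(w_j) term is constant and is dropped. The small ridge added to each covariance matrix is also an assumption, included only to keep the inversion stable.

```python
import numpy as np


def fit_gaussian_classes(X, y, ridge=1e-6):
    """Estimate mean, inverse covariance, and log-determinant for each class."""
    params = {}
    for label in np.unique(y):
        Xc = X[y == label]
        mean = Xc.mean(axis=0)
        # The ridge term is an assumed regularizer, not part of the source method.
        cov = np.atleast_2d(np.cov(Xc, rowvar=False)) + ridge * np.eye(X.shape[1])
        _, logdet = np.linalg.slogdet(cov)
        params[label] = (mean, np.linalg.inv(cov), logdet)
    return params


def classify(x, params):
    """Return the class with the highest score d_j, assuming equal priors."""
    best_label, best_score = None, -np.inf
    for label, (mean, inv_cov, logdet) in params.items():
        diff = x - mean
        score = -0.5 * logdet - 0.5 * diff @ inv_cov @ diff
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def leave_one_out_confusion(X, y):
    """Confusion matrix (rows: true class, columns: predicted class)."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    labels = np.unique(y)
    index = {label: i for i, label in enumerate(labels)}
    confusion = np.zeros((len(labels), len(labels)), dtype=int)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i  # hold out sample i
        params = fit_gaussian_classes(X[keep], y[keep])
        confusion[index[y[i]], index[classify(X[i], params)]] += 1
    return labels, confusion


def sensitivity_and_ppv(confusion):
    """Sensitivity = diagonal / row sum; PPV = diagonal / column sum."""
    diag = np.diag(confusion).astype(float)
    sensitivity = diag / confusion.sum(axis=1)
    predicted = confusion.sum(axis=0)
    # PPV is left as NaN for a class that was never predicted.
    ppv = np.divide(diag, predicted, out=np.full_like(diag, np.nan),
                    where=predicted > 0)
    return sensitivity, ppv
```

Refitting the class statistics for every held-out sample is slow for thousands of 87-dimensional spectra, but it keeps the sketch close to the definition of leave-one-out testing.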

Intra-species
M. smegmatis and S. aureus were used to study the variation of scattering spectra within populations of a single species. Figure 4 shows PCA score values for several sets of samples. In all cases each test group is composed of 100 bacteria chosen randomly from multiple slides. Figure 4(a) shows the first principal component scores for M. smegmatis samples of different culture ages. Note the poor separation of clusters. In Fig. 4(b) the first principal components are plotted for S. aureus from two lab culture slides and three purchased slides. Figure 4(c) shows the comparative effect of differing sample preparations.
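Principal component scores of the kind plotted in Fig. 4 can be computed with a short SVD-based routine. The Python sketch below is illustrative only: the function names are hypothetical, and the random draw of an equal number of spectra per group is one plausible way to balance the test classes.

```python
import numpy as np


def balanced_sample(X, y, n_per_class, seed=0):
    """Draw the same number of spectra from every class without replacement.

    Raises ValueError if a class holds fewer than n_per_class spectra.
    """
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X), np.asarray(y)
    picks = [rng.choice(np.flatnonzero(y == label), size=n_per_class, replace=False)
             for label in np.unique(y)]
    idx = np.concatenate(picks)
    return X[idx], y[idx]


def pca_scores(X, n_components=2):
    """Project mean-centered spectra onto the leading principal components.

    Returns the scores and the fraction of variance explained by each
    retained component. The sign of each component is arbitrary.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ Vt[:n_components].T
    explained = S ** 2 / np.sum(S ** 2)
    return scores, explained[:n_components]
```

Plotting the first column of the scores, grouped by sample label, would give a one-dimensional view of cluster separation similar in spirit to the score plots discussed here.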
A Bayes-Gaussian classifier was used to quantify the potential classification accuracy for distinguishing culture age and sample preparation methods. Results are shown in Tables 1 and 2, respectively. In these and all following tables, correct classifications are marked in gray on the diagonal and common misclassifications are marked in different shades of red according to their likelihood. Sensitivity and PPV are shown for each sample set. All classifier results shown in this paper were obtained using between N = 250 and N = 4576 individual bacteria from two to four different slides.
The way the sample was cultured and prepared can be seen to affect the scattering spectrum. Of particular interest is that M. smegmatis prepared using the M. bovis culture method (M. smegmatis SP) is much more often misclassified as M. bovis than it is misclassified as M. smegmatis cultured in the recommended way (M. smegmatis N). Note that the difference in culture method between M. smegmatis N and SP entailed only substituting sodium pyruvate for glycerol in the culture medium and the addition of surfactants. In contrast, the age of the culture when the sample was prepared was found to be relatively unimportant, although the 6-hour-old preparation (C6) was the easiest to distinguish.
The effect of cell membrane on the scattering spectrum was tested by stripping the outer membrane of Gram-negative P. aeruginosa. Outer membranes were successfully stripped as shown in Fig. 5(b) and 5(c). The score plot for stripped and non-stripped P. aeruginosa is shown in Fig. 5(a). Note the poor cluster separation, implying that the presence or lack of outer membrane has limited effect on bacterial scattering. Despite the poor cluster separation in Fig. 5, the Bayes-Gaussian classifier distinguished the two classes with 90% accuracy.
The age of the smear on the slide was found not to affect the scattering spectrum of the bacteria; no changes in spectrum were observed even after several months at room temperature.
Fig. 6. Scattering spectra of the eight tested bacterial species. The three dark-field aperture settings are denoted by colors corresponding to Fig. 1(b) (ring 1 = blue, ring 2 = green, ring 3 = red). Error bars show the standard error of the mean for sample sizes ranging from N = 346 for S. pneumoniae to N = 2172 for P. aeruginosa.

Interspecies
The scattering spectra of various bacterial species are shown in Fig. 6. As expected for particles of the size and dielectric properties of bacteria, the scattering spectra lack sharp features. The spectra do have distinguishing features, however, when averaged over thousands of bacteria. Figure 6 shows that specific scattering characteristics exist for bacterial species. These scattering characteristics enable individual bacteria to be identified. Some principal components are shown in Fig. 7. Figure 7(a) shows H. influenzae and S. aureus, a pair of species that are easily distinguished in classifier tests. Figure 7(b) shows two preparations of M. smegmatis (N and SP) with M. bovis BCG. Note that, as expected from the results in Table 2, M. smegmatis SP clusters more closely with M. bovis than with M. smegmatis N. Table 3 contains the Bayes-Gaussian classifier results for different bacterial species.
Certain bacterial species, including S. aureus and H. influenzae, were distinguished on a bacterium-by-bacterium basis with greater than 90% accuracy, even from a pool of seven possible species. Other species were much less discernible, such as S. pneumoniae, which was difficult to tell apart from the pool in general. The most common misclassification was between M. smegmatis and M. bovis, which were the most closely related species in the test.

15-Channel system
The results in Tables 1 through 3 were found utilizing an 87-dimensional observation of individual bacteria. Such a system is likely too slow and complex for use in realistic diagnostic situations. To show the potential for a simpler diagnostic device a 15-channel system utilizing an RGB camera for spectral resolution was tested.
In general, the 15-channel system gave more consistent and accurate results than the 87-channel system. This is attributed primarily to a simpler acquisition process leading to a reduction in systematic noise. For the 15-channel system, PCA results show cluster separation superior to the 87-channel system for most species (Fig. 8(a)). Table 4 also shows higher sensitivities than Table 3, with misclassifications concentrated in only two interspecies confusions (E. coli with P. aeruginosa and M. smegmatis with S. aureus). Interestingly, S. pneumoniae, which had the lowest sensitivity with the 87-channel system, had the highest sensitivity with the 15-channel system.
To test the robustness of the spectral features used in classification, sensitivity was monitored as Gaussian noise was added to the spectra used for training and testing the Bayes-Gaussian classifier. This had the additional effect of showing whether systematic noise was being used by the classifier to distinguish species. Gaussian noise was added multiplicatively because the most likely sources of systematic noise in the system are focus errors, white balance, and exposure, all of which introduce multiplicative noise to the scattering spectrum. Results are shown in Fig. 8(b). With the exception of M. smegmatis, all species maintain their sensitivity until at least 10% added noise. Sensitivity roll-off above 10% noise is due to increased misclassifications of M. smegmatis as S. aureus and of E. coli as P. aeruginosa.
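The noise robustness test can be illustrated with a short Python sketch. It assumes that the noise is drawn independently for every channel of every bacterium and that "10% noise" means a standard deviation of 0.1 relative to the signal; neither detail is stated in the text. The `evaluate` callback is hypothetical and stands in for whatever routine trains and tests the classifier.

```python
import numpy as np


def add_multiplicative_noise(spectra, noise_fraction, rng):
    """Scale each channel by (1 + noise_fraction * z), with z drawn from N(0, 1)."""
    spectra = np.asarray(spectra, dtype=float)
    return spectra * (1.0 + noise_fraction * rng.standard_normal(spectra.shape))


def sensitivity_vs_noise(spectra, labels, noise_fractions, evaluate, seed=0):
    """Re-run a user-supplied evaluation at each noise level.

    `evaluate(noisy_spectra, labels)` is a hypothetical callback that trains
    and tests a classifier on the noisy spectra and returns its sensitivities.
    """
    rng = np.random.default_rng(seed)
    return [evaluate(add_multiplicative_noise(spectra, fraction, rng), labels)
            for fraction in noise_fractions]
```

Because the same noisy spectra are passed to both training and testing, a classifier that relied on small systematic offsets between slides would be expected to lose sensitivity quickly as the noise fraction grows.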

Discussion
Different bacterial species were found to have unique scattering spectra. The origin of such unique spectral features can be inferred by referring to previous works. Differences in scattering profile can be due to a bacterium's size, shape, bulk refractive index, any pigment present, and internal variations of refractive index. The coarse angular resolution of the MARDI system strongly implies that the gathered light is mainly from the main lobe of the bacterium's scattering profile. This is a good indication that unique spectral features are likely due to the size, shape, and bulk refractive index of the bacteria [7,8,10]. This is confirmed in part by the result that stripping the outer membrane of P. aeruginosa had only minor effects on the scattering spectrum (Fig. 5). The cell wall structure has often been cited as the most likely source of internal index of refraction variation [3,9]. There was no clear correlation between bacterial shape or cell wall structure and scattering spectra. It should be noted, however, that the referenced works measured bacteria in aqueous suspension.
The data in Tables 3 and 4 and Fig. 8(b) show the results of computer classification algorithms that utilize only the scattering spectra of the test bacteria. Were a MARDI system to be used for bacterial identification applications, shape, size, and clustering pattern observations could be easily incorporated into the computer classification algorithm. Using shape and size observations could help add robustness to the algorithm for identifying bacteria with similar scattering spectra, such as M. smegmatis and S. aureus, which are significantly different in shape (see Fig. 3).

Conclusion and future work
The results of the previous sections must be considered in the context of potential applications for MARDI identification. While we have shown that scattering spectra can be used to identify bacterial species, scattering spectra were also shown to vary with the environmental conditions surrounding the bacteria (in this case the culture medium) and with how samples were prepared. Therefore, any diagnostic test would require a rigid process of sample preparation. In one foreseeable application, MARDI identification could be used in conjunction with cell culture to classify bacteria. This would ensure sufficient control over the bacteria's growth medium and sample preparation.
While MARDI combined with cell culture could potentially cut culture time and complexity, the greater potential of MARDI is perhaps best realized in using direct environmental or clinical samples. The development of a simplified MARDI system could enable the use of samples such as ground water, sputum, blood, or feces for fast on-site identification of pathogenic species, which is not currently possible with PCR, flow cytometry, or culture. This raises additional challenges for sample preparation to ensure that the resulting smear is relatively free of environmental debris, and it necessitates classification algorithms that account for environmental contaminants of the same order of size as bacteria. Such methods are currently under development for sputum samples for use in the diagnosis of TB and pneumonia.