Rapid differentiation of Moraxella bovoculi genotypes 1 and 2 using MALDI-TOF mass spectrometry profiles

Moraxella bovoculi is the bacterium most frequently isolated from the eyes of cattle with infectious bovine keratoconjunctivitis (IBK), also known as bovine pinkeye. Two distinct genotypes of M. bovoculi, genotype 1 and genotype 2, were characterized after whole genome sequencing showed a large degree of single nucleotide polymorphism (SNP) diversity within the species. To date, both genotypes have been isolated from the eyes of cattle without clinical signs of IBK, while only genotype 1 strains have been isolated from the eyes of cattle with clinical signs of IBK. We used 38 known genotype 1 strains and 26 known genotype 2 strains to assess the ability of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) to accurately genotype M. bovoculi strains using mass spectrum biomarkers. Mass spectrum data were analyzed with ClinProTools 3.0 software, and six models were developed that classify strain genotypes with accuracies ranging from 90.6% to 100%. Finally, using four of the most genotype-specific peaks from the six automated models that also exhibited high peak intensities, we developed a customized model (UNL assisted model) that had recognition capability, validation, and classification accuracies of 100% for genotype classification. Our results indicate that MALDI-TOF MS biomarkers can be used to accurately discriminate genotypes of M. bovoculi without the need for additional methods.


Introduction
Infectious bovine keratoconjunctivitis (IBK) is the most common ocular disease of cattle in the United States and worldwide (Brown et al., 1998;Killinger et al., 1977). This disease, often referred to simply as "pinkeye" by cattle producers and veterinarians, has been shown to have a significant economic impact on the beef industry due to increased labor and treatment costs as well as decreased weight gain in affected calves (Killinger et al., 1977;Thrift and Overfield, 1974). Moraxella bovis is capable of producing IBK experimentally and is regarded as the causative agent of IBK (Beard and Moore, 1994;Rogers et al., 1987). However, other bacteria and viruses are frequently isolated from the eyes of cattle with IBK lesions. These microbes may have a role in the pathogenesis of IBK in the absence of M. bovis and include Moraxella bovoculi, Mycoplasma bovoculi and infectious bovine rhinotracheitis virus (Angelos, 2010;Angelos et al., 2007b;George et al., 1988;Rosenbusch and Ostle, 1986).
A recent large-scale retrospective study found that among diagnostic laboratory submissions from cases of cattle with IBK throughout the United States, M. bovoculi was the only bacterium isolated in 64% of cases, while M. bovis was the only bacterium isolated in just 22% of cases (Loy and Brodersen, 2014). This is similar to a recent study showing detection of M. bovoculi using PCR-based methods in more than 75% of samples submitted to a veterinary diagnostic lab (Zheng et al., 2019). M. bovoculi possesses repeats-in-toxin (RTX) type toxins that are highly similar to those of M. bovis and lyse bovine cells in vitro, and also possesses pilin genes that may facilitate colonization of ocular conjunctiva (Angelos et al., 2007a;Cerny et al., 2006;Dickey et al., 2018). To date, attempts to experimentally reproduce IBK using M. bovoculi have been unsuccessful (Gould et al., 2013). In addition, vaccination of cattle using M. bovoculi antigens or autogenous bacterins to prevent IBK has shown mixed results in terms of efficacy (Angelos, 2010;O'Connor et al., 2019). Recently, the USDA granted conditional approval of a M. bovoculi bacterin for IBK prevention (USDA CVM code: 2A77.00, Addison Biological Laboratory, Inc. # 355), highlighting the potential importance of the contribution of these organisms to IBK.
One observation that complicates understanding the contributions of these organisms to IBK is the large amount of genomic diversity among circulating strains of M. bovoculi. Recently, an analysis of 246 M. bovoculi genomes highlighted > 127,000 SNPs, which were shown to represent two distinct genotypes, along with evidence for intraspecies recombination with Moraxella bovis RTX genes (Dickey et al., 2016;Dickey et al., 2018). Strains of both genotype 1 and genotype 2 were isolated from the eyes of cattle without clinical signs of IBK. Only genotype 1 strains were isolated from the eyes of cattle exhibiting clinical signs of IBK, although the strains were isolated prior to the discovery of genotype 2 and the isolation methods used would have preferentially selected for genotype 1 strains. More than 23,000 SNPs were shown to delineate the two genotypes. As genotype 1 strains of M. bovoculi are primarily associated with bovine IBK lesions to date, rapid differentiation of these two genotypes would be clinically relevant and useful to diagnostic laboratories.
https://doi.org/10.1016/j.mimet.2020.105942 Received 10 March 2020; Received in revised form 6 May 2020; Accepted 6 May 2020
Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is a commonly used technique for the rapid and accurate identification of bacteria in diagnostic and research settings (Clark et al., 2013;Sandalakis et al., 2017;Seng et al., 2009). MALDI-TOF MS has been successfully used to distinguish M. bovis and M. bovoculi isolates at the species level (Robbins et al., 2018). MALDI-TOF MS has increasingly been used to accurately classify isolates within a species into specific subtypes and/or genotypes using bioinformatics analysis to develop "models" that distinguish strains using biomarker peaks in mass spectrum profiles (Khot and Fisher, 2013;Loy and Clawson, 2017;Mani et al., 2017;Perez-Sancho et al., 2018).
Since genotype 1 M. bovoculi strains are more frequently associated with IBK lesions, being able to rapidly discriminate genotypes for outbreak investigations, downstream susceptibility testing, and autogenous vaccine formulation would be useful for veterinary diagnostic laboratories. Currently there are no methods besides genomic sequencing to determine the genotype of M. bovoculi. Therefore, we evaluated the ability of MALDI-TOF MS models to accurately classify M. bovoculi strains according to their genotype. Such a model would be portable and could be freely shared with other diagnostic laboratories using MALDI-TOF MS platforms with ClinProTools software. Additionally, defining genotype-specific peaks within a model would enable diagnostic laboratories with MALDI-TOF MS capability, but without access to ClinProTools software, to visually inspect spectra and manually assign genotypes.

Strains and bacterial culture conditions
A collection of M. bovoculi isolates that had previously been identified to the species level using PCR and MALDI-TOF MS techniques was used for this study (Loy and Brodersen, 2014;Robbins et al., 2018). In addition, the isolates had previously undergone whole genome sequencing and been classified as either genotype 1 or genotype 2, as well as by the presence or absence of an encoded repeats-in-toxin (RTX) element, the gene structure of which in some strains shows recombination with M. bovis RTX. The samples chosen for this study (n = 64) included 38 genotype 1 strains from 11 different states and 26 genotype 2 strains from the state of Nebraska (Table 1). From frozen stocks, bacteria were plated on tryptic soy agar with 5% sheep blood (Remel, Lenexa, KS) and incubated at 37°C in 5% CO2 for 24 h before being passed onto a blood agar plate and incubated an additional 24-48 h prior to MALDI-TOF MS analysis.

MALDI-TOF MS spectra
Spectra from MALDI-TOF MS analysis of strains used in this study were obtained using both the extraction method and the smear method as previously described and recommended by the manufacturer (Khot et al., 2012). Briefly, for the extraction method, 2-3 bacterial colonies from a 24-48 h incubated plate were suspended in 300 μl HPLC water before adding 900 μl of absolute ethanol. The solution was then centrifuged (2 min at 16,000 ×g) and the supernatant decanted prior to allowing the sample to air dry. Next, the pellet was dissolved in 25 μl of 70% formic acid and 25 μl of acetonitrile and centrifuged (2 min at 16,000 ×g). Finally, 1 μl of the sample was placed on the target plate and allowed to air dry before 1 μl of α-cyano-4-hydroxycinnamic acid matrix solution (Bruker Daltonics, Billerica, MA) was placed onto the target plate well. After the wells were dry and matrix crystallization had occurred, the plates were analyzed by MALDI-TOF MS using a MALDI Biotyper system (Bruker Daltonik) in positive linear mode with a mass range of 2-20 kDa m/z, a laser frequency of 60 Hz, and calibration using Bacterial Test Standard (Bruker Daltonik). Additional machine settings included ion source 1 voltage 20.00 kV, ion source 2 voltage 18.10 kV, lens voltage 6.05 kV and a pulsed ion extraction time of 170 ns. Spectra from 8 technical replicates were used to formulate a main spectrum profile (MSP) for each strain used in this study using Biotyper software (Bruker Daltonik).
The smear method of spectra acquisition involved transferring a single colony onto the target plate using a wooden applicator and allowing the smeared sample to air dry before applying 1 μl of α-cyano-4-hydroxycinnamic acid matrix solution to the well. After the wells were dry, the samples were analyzed as described above for the extraction method.

Model generation, validation, and classification
Bacterial strains from each genotype were randomly assigned to separate groups for model generation, external validation, and classification (Table 1). The model generation group spectra were input with genotypes known to the software to initially develop the models. External validation involved inputting spectra from different isolates, again with genotypes known to the software, as an initial test of model capability. Classification involved inputting spectra with genotypes unknown to the software. The accuracy of the models for the classification group spectra was manually calculated. Finally, a smear classification group for each genotype was used to test the models against novel spectra obtained with the smear method of sample preparation. Due to the limited number of genotype 2 strains available, the genotype 2 smear classification group included strains previously used within the model generation groups (Table 1).
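The grouping and manual accuracy calculation described above can be illustrated with a short Python sketch. It is not the procedure actually used in the study: the group sizes, strain identifiers, and function names (`assign_groups`, `accuracy`) are hypothetical, and the real group composition is given in Table 1.

```python
import random


def assign_groups(strain_ids, group_sizes, seed=0):
    """Randomly partition strain_ids into named groups of the given sizes.

    group_sizes maps a group name to the number of strains it receives;
    the sizes must add up to the number of strains supplied.
    """
    if sum(group_sizes.values()) != len(strain_ids):
        raise ValueError("group sizes must add up to the number of strains")
    shuffled = list(strain_ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    groups, start = {}, 0
    for name, size in group_sizes.items():
        groups[name] = shuffled[start:start + size]
        start += size
    return groups


def accuracy(known_genotypes, predicted_genotypes):
    """Fraction of spectra whose predicted genotype equals the known genotype."""
    if len(known_genotypes) != len(predicted_genotypes):
        raise ValueError("both lists must have the same length")
    matches = sum(k == p for k, p in zip(known_genotypes, predicted_genotypes))
    return matches / len(known_genotypes)


# Hypothetical example: 30 strains of one genotype split into three equal groups.
strains = [f"strain-{i:02d}" for i in range(30)]
sizes = {"generation": 10, "external_validation": 10, "classification": 10}
groups = assign_groups(strains, sizes)
```

The split would be run once per genotype so that every group contains strains of both genotypes.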
ClinProTools 3.0 software (Bruker Daltonik) was used to analyze spectra for the presence of potential genotype specific peaks. To generate the spectra analysis models, three classification algorithms were used including genetic algorithm (GA), quick classifier (QC), and support vector machine (SVM). Next, flexAnalysis software (Bruker Daltonik) was used to visually inspect the spectra from each strain to determine the presence or absence of the discriminatory peaks from all GA, QC, and SVM models. Peaks with substantial discriminatory power and peak intensity were incorporated into the UNL assisted model. The UNL assisted model was developed using the GA algorithm and manually determining the number of peaks used and the exact m/z of peaks within the model using the "Force Peak into Model" function within ClinProTools.

ClinProTools automated models
The six automated models developed in this study are summarized in Table 2. The GA algorithm does not compare all possible peak combinations but instead compares a number of random combinations and the final result is the combination that most accurately separates the generation spectra datasets. The random nature of the GA algorithm is evident by the differences in peaks used among the GA models. Therefore, three GA models (GA1, GA2, GA3) were developed to increase the likelihood of a GA model determining peaks with substantial discriminatory power that may be suitable to be included in an assisted model. As shown in Table 2, all six automated models were able to accurately distinguish between the two M. bovoculi genotypes.
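To make the idea of sampling peak combinations concrete, the following Python sketch scores randomly drawn peak subsets and keeps the best one. It is a plain random search with a nearest-centroid scoring rule, chosen here for brevity; it is not a full genetic algorithm and does not reproduce the ClinProTools implementation. The data layout (one dict of m/z to intensity per spectrum), the function names, and the toy values are all assumptions.

```python
import random


def combo_accuracy(spectra, labels, peaks):
    """Accuracy of a nearest-centroid rule restricted to the chosen peaks.

    spectra: list of dicts mapping peak m/z -> intensity.
    labels:  genotype label for each spectrum.
    """
    classes = sorted(set(labels))
    centroids = {}
    for c in classes:
        rows = [s for s, l in zip(spectra, labels) if l == c]
        centroids[c] = [sum(r[p] for r in rows) / len(rows) for p in peaks]
    correct = 0
    for s, l in zip(spectra, labels):
        def dist(c):
            return sum((s[p] - m) ** 2 for p, m in zip(peaks, centroids[c]))
        if min(classes, key=dist) == l:
            correct += 1
    return correct / len(spectra)


def random_peak_search(spectra, labels, candidate_peaks, n_peaks, n_trials, seed=0):
    """Try n_trials random peak combinations and return the best (combo, score)."""
    rng = random.Random(seed)
    best_combo, best_score = None, -1.0
    for _ in range(n_trials):
        combo = tuple(sorted(rng.sample(candidate_peaks, n_peaks)))
        score = combo_accuracy(spectra, labels, combo)
        if score > best_score:  # keep the first combination reaching the top score
            best_combo, best_score = combo, score
    return best_combo, best_score


# Toy data: only the peak at m/z 2000 separates the two groups.
toy_spectra = [
    {1000: 5, 2000: 10, 3000: 5}, {1000: 6, 2000: 9, 3000: 4},
    {1000: 5, 2000: 1, 3000: 5}, {1000: 6, 2000: 0, 3000: 4},
]
toy_labels = [1, 1, 2, 2]
combo, score = random_peak_search(toy_spectra, toy_labels, [1000, 2000, 3000], 2, 50)
```

Because only a sample of combinations is scored, different seeds can return different peak sets with the same accuracy, which mirrors the variation among GA1, GA2, and GA3.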

Peak visualization using flexAnalysis software
Spectra from 56 strains representing the combined library of generation (G), external validation (Ext. V), and classify (C) groups were manually examined using flexAnalysis software to look for the presence or absence of peaks used in each of the six automated models (Table 3). In addition, the "mass list" function within flexAnalysis was used for each spectrum to compare the relative intensities of the respective peaks. Peaks m/z 3492, 8580 and 9057 had 100% sensitivity and specificity between the two genotypes (Table 3).
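For readers tabulating peak presence by hand, the Python sketch below computes sensitivity and specificity for a single peak. It assumes the standard definitions (true positive rate and true negative rate) with one genotype treated as the "positive" class; the text does not state which genotype each peak marks, so that choice is left as a parameter. The function name and example data are hypothetical.

```python
def peak_sensitivity_specificity(presence, genotypes, target_genotype):
    """Return (sensitivity, specificity) of a peak as a marker of target_genotype.

    presence:  list of bools, True if the peak was seen in that strain's spectrum.
    genotypes: list of known genotype labels, in the same order.
    """
    tp = fn = tn = fp = 0
    for present, genotype in zip(presence, genotypes):
        if genotype == target_genotype:
            if present:
                tp += 1
            else:
                fn += 1
        else:
            if present:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one strain in and outside the target genotype")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical tally using the group sizes from Table 3 (30 and 26 strains):
# a peak seen in every strain of one genotype and in none of the other.
genotypes = [1] * 30 + [2] * 26
presence = [True] * 30 + [False] * 26
perfect = peak_sensitivity_specificity(presence, genotypes, target_genotype=1)
```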

UNL assisted model
Four mass peaks, m/z 6650, 8580, 9057, and 9971, were chosen to be included in the UNL assisted model. These peaks were chosen due to their high degree of discriminatory power as well as their high relative peak intensity. While peak m/z 3492 had 100% sensitivity and specificity, it was omitted from the assisted model because the average peak intensity was much lower (data not shown). Pseudo-gel representations and spectra views of the four peaks incorporated into the UNL assisted model are shown in Fig. 1. The UNL assisted model was 100% accurate in recognition capability, cross validation, external validation, classification and smear method classification (Table 2). Two dimensional peak distributions using different combinations of the four peaks used in the UNL assisted model show well-differentiated genotype specific strain clusters (Supplementary Fig. 1).
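A laboratory without ClinProTools could screen an exported mass list for the four m/z values listed above with something like the following Python sketch. The mass-list format (a list of `(m/z, intensity)` pairs), the function name, and the 800 ppm matching tolerance are assumptions rather than values from this study, and the sketch only reports which peaks are present; interpreting the pattern as a genotype still relies on Table 3 and Fig. 1.

```python
# m/z values of the UNL assisted model peaks as given in the text of this section.
UNL_PEAKS_MZ = (6650, 8580, 9057, 9971)


def find_peaks(mass_list, targets=UNL_PEAKS_MZ, tolerance_ppm=800.0):
    """Locate each target m/z in a mass list.

    mass_list: list of (mz, intensity) tuples exported from the spectrum.
    Returns a dict mapping each target to the most intense (mz, intensity)
    pair within the tolerance window, or None if no peak falls inside it.
    The default tolerance is an illustrative assumption.
    """
    hits = {}
    for target in targets:
        window = target * tolerance_ppm / 1e6  # window half-width in m/z units
        candidates = [(mz, inten) for mz, inten in mass_list
                      if abs(mz - target) <= window]
        hits[target] = max(candidates, key=lambda p: p[1]) if candidates else None
    return hits


# Hypothetical mass list with two of the four peaks present.
example_hits = find_peaks([(6651.2, 1200.0), (8579.0, 900.0), (9100.0, 50.0)])
```

Reporting the matched intensity alongside the m/z makes it easier to apply the same intensity considerations that led to m/z 3492 being left out of the assisted model.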

Discussion
Using ClinProTools software, we developed six distinct automated models that were able to accurately classify M. bovoculi strains according to known genotypes with accuracies ranging from 90.6% to 100%. Using the peaks derived from these six models, the custom UNL assisted model was developed and was 100% accurate across all strains using both the extraction method and smear method of bacterial preparation. Additionally, analyzing the spectra using the 2D peak distributions and gel views within ClinProTools showed that the peaks used in the UNL assisted model display large discriminatory power between genotypes.
While the extraction method has been shown to yield higher MALDI-TOF MS identification scores when compared to the smear method (Khot et al., 2012), it is also substantially more time consuming. Therefore, the speed of the smear method makes it an attractive diagnostic tool provided the resulting MALDI-TOF MS spectra allow for accurate model classification. As described above, using the smear method did not affect the accuracy of the UNL assisted model, as both the extraction and smear methods resulted in 100% genotype classification.
The potential importance of manually visualizing and inspecting spectra when generating MALDI-TOF MS classification models was highlighted in our case by the peak m/z 8580. This peak was incorporated into only one automated model (GA3) yet had sensitivity and specificity values of 100% as well as a large relative peak intensity. This is because, as mentioned above, the GA algorithm evaluates a fixed number of random peak combinations and chooses the best of those combinations for the final model. That is, the algorithm does not test all possible peak combinations. This saves computing time, as evaluating all possible peak combinations for an entire spectrum would take exponentially more time, yet the method still often results in an accurate model. This was the case in our study, as all three automated models using the GA algorithm (GA1, GA2, GA3) had 100% accuracy regardless of spectra group or bacterial preparation method. Despite the three automated GA models having 100% accuracy, it is likely that over time the UNL assisted model will prove more accurate as it incorporates only the "best of the best" peaks from the automated models. A much larger collection of bacterial strains with known genotypes would be required to test this hypothesis.
The potential effect of environmental conditions on MALDI-TOF MS results has been previously described (Karger et al., 2019). During the smear method model evaluation portion of this study, the relative air humidity appeared to initially affect the spectra and corresponding model results. Bruker Daltonik recommends an operating humidity range of 15-80%. The first attempt to analyze spectra obtained by the smear method was performed the first week of June in Nebraska during a period of high humidity. The spectra ID scores for these MALDI runs were abnormally low, and the GA, QC, and SVM models all had accuracies of approximately 50% for spectra genotype classification (data not shown). It was subsequently determined that unrelated samples run on the same machine that day were also resulting in uncharacteristically low MALDI ID scores. After a recalibration and laser adjustment to the instrument, the MALDI spectra obtained using the smear method resulted in library-based matching ID scores (Bruker Biotyper) that were substantially higher (Supplementary Table 1), and these spectra were used for the model classification results reported in this paper. The original smear method spectra and model classification results from the initial attempt are not shown here, as the data were determined to be unrepresentative of a properly calibrated and tuned mass spectrometer due to the humidity changes and subsequent inadequate laser operation. Our experience in this case highlights the potential effect environmental conditions can have on MALDI-TOF MS results and the importance of instrument adjustment and quality control.
The UNL assisted model developed in this study provides an accurate method of M. bovoculi genotyping that is both time and economically efficient as it can be performed without the need for nucleic acid based methods and uses data that is collected during bacterial identification. In addition, the characterization of peaks with high genotype discriminatory power allows for the manual inspection of spectra to determine genotype status without the need for additional software. The .XML files containing the ClinProTools models as well as the spectra collected and described in this study are freely available upon request by contacting the corresponding author. The UNL assisted model has potential to serve as a valuable tool for diagnosticians, cattle producers, veterinarians, and vaccine manufacturers in an effort to ensure that M. bovoculi strains included in autogenous pinkeye vaccines are genotype 1. Preferential inclusion of genotype 1 strains of M. bovoculi in an autogenous vaccine formulation may be preferred as genotype 1 M. bovoculi appear to be more highly associated with IBK than genotype 2 strains.
The models developed in this study allow rapid genotyping but do not differentiate RTX status, at least not within genotype 1 populations, given the absence of RTX in genotype 2 populations. The presence of RTX operons is regarded as a putative virulence factor among Moraxella spp. (Angelos et al., 2007a). The relative importance of RTX toxins in IBK pathogenesis is not clear, as even genotype 1 RTX- strains have been isolated from cases of IBK, albeit at a lesser rate than genotype 1 RTX+ strains. Development of a modeling system to classify isolates as either RTX+ or RTX- would require a larger pool of known genotype 1 RTX- strains than is currently available. Such a model would be advantageous in the future to assess the relative importance of RTX within the pathogenesis of IBK and also to evaluate RTX+ strains as potential components of autogenous IBK vaccine formulations.

Declaration of competing interest
None.

Acknowledgments
We would like to thank faculty and staff of the Nebraska Veterinary Diagnostic Center. This work was supported by the Nebraska Experiment Station with funding from the Animal Health and Disease Research (section 1433) capacity funding program (accession 1017646) through the USDA National Institute of Food and Agriculture. The use of product and company names is necessary to accurately report the methods and results; however, the United States Department of Agriculture (USDA) neither guarantees nor warrants the standard of the products, and the use of names by the USDA implies no approval of the product to the exclusion of others that may also be suitable. The USDA is an equal opportunity provider and employer.

Hille, et al. Journal of Microbiological Methods 173 (2020) 105942

Table 2 Spectra analysis results for each model developed in this study. Models GA1, GA2, GA3, QC1, QC2, and SVM were automatically generated using ClinProTools software. UNL assisted model was generated by manually forcing the chosen peaks from flexAnalysis visual inspection into a custom model using the GA algorithm and the "Force Peak into Model" function. Number in parentheses represents the relative weight assigned to each peak within the respective model. GA: genetic algorithm, QC: quick classifier, SVM: support vector machine.

Table 3 The presence of discriminating peaks within all genotype 1 (n = 30) and genotype 2 (n = 26) isolates used in the model generation, validation, and classification steps of this study.

Fig. 1. … (Fig. 1a). Spectra view of discriminating peaks m/z 6550 (1b), 8580 (1c), 9057 (1d), and 9971 (1e). Red line: average genotype 1 spectra. Green line: average genotype 2 spectra. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)