Genotyping Mycoplasma hyorhinis by multi-locus sequence typing and multiple-locus variable-number tandem-repeat analysis

Mycoplasma hyorhinis is a swine pathogen bacterium, which causes significant economic losses. The infection spreads through direct contact between the animals. Powerful genotyping methods like PCR based multi-locus sequence typing (MLST) and multiple-locus variable-number tandem-repeat analysis (MLVA) are necessary to monitor the infections and to conduct epidemiological investigations; hence supporting the control of the disease. The aims of the present study were to examine M. hyorhinis isolates originating mainly from Hungary with MLST and MLVA developed in the study, and to compare the results of the two typing methods. To characterize 39 M. hyorhinis isolates and the type strain (NCTC 10130), six house-keeping genes were selected for MLST and six tandem-repeat regions were chosen for MLVA. We were able to differentiate 31 sequence types and 37 genotypes within the 40 analyzed isolates by the MLST and the MLVA, respectively. With the combination of the two newly developed assays all examined isolates were distinguished with the exception of the ones originating from the same animal. The developed MLST assay provided a robust and high resolution phylogenetic tree, while the MLVA system is suitable for the differentiation of closely related isolates from the same farm, hence the assay is appropriate for epidemiologic studies.


Introduction
Mycoplasma hyorhinis, first described by Switzer in 1955(Switzer, 1955, is a swine pathogen bacterium distributed worldwide with an estimated occurrence around 50-70% in the herds (Hansen et al., 2010). As M. hyorhinis is a facultative pathogen bacterium, the manifestation of clinical signs are triggered by stress or presence of co-infections. Systemic infections occur mainly in 3-10 weeks old piglets, with typical symptoms like arthritis, polyserositis and less often otitis and conjunctivitis (Rovira et al., 2010). The pathogen is transmitted by direct contact, the piglets first make contact with the bacteria through gilts and sows, and then M. hyorhinis spreads between pigs when they are grouped together postweaning. In the past years, M. hyorhinis was detected more often in piglets showing these signs and became one of the main concerns of swine veterinarians dealing with post-weaning infections (Clavijo et al., 2017). MLST is suitable to detect mid-term evolutionary distances and to determine relationships between the strains based on point mutations in house-keeping genes (Maiden, 2006). In phylogenetic analyses, bootstrap method is used to assess confidence. Based on the work of Hillis and Bull (Hillis and Bull, 1993) internal branches (nodes) with bootstrap proportions J o u r n a l P r e -p r o o f above 70% represent true clades thus we consider these nodes well supported. The robustness of the phylogenetic tree is determined based on the bootstrap values of the nodes. An MLST system was published before for genotyping M. hyorhinis isolates (Tocqueville et al., 2014) and it was optimized by Trüeb et al. (2016). Currently, the MLST profiles of 133 M. hyorhinis strains are available at the PubMLST database (https://pubmlst.org/mhyorhinis/), which is established on the published MLST system (Tocqueville et al., 2014;Trüeb et al., 2016). The typing of these strains resulted in discriminating 104 sequence types (STs), but according to the bootstrap values of the resulted phylogenetic tree (Maximum Likelihood algorithm, Tamura-3 model with MegaX software; Kumar et al., 2018) the robustness of the tree is low. In the last year a modified MLST-s (surface) system was also published, which uses two house-keeping genes (pdhB, ung) and two surface-encoding protein genes (p95, mtlD) to increase the discriminatory power of the test (Clavijo et al., 2019).
MLVA discriminates strains based on the repeat number of tandem-repeat (TR) units in the genome. These regions usually harbor several mutations and have higher mutation rates than the house-keeping genes, therefore can be used for the analysis of short-term evolutionary changes and applicable in epidemiologic studies (Nadon et al., 2013). An MLVA system for M. hyorhinis was published before, where two variable-number tandem-repeat regions in hypothetical proteins (Mhr_0152, Mhr_0298) were analyzed (Dos Santos et al., 2015).
The aims of the present study were to characterize mainly Hungarian M. hyorhinis isolates with already published and novel MLST and MLVA systems and compare the methods by examining a number of recent clinical isolates.

Samples
J o u r n a l P r e -p r o o f with the consent of the owners or in slaughterhouses. The isolates were cultured as described before (Bekő et al., 2019a). DNA extraction was performed using the QIAamp DNA Mini Kit (Qiagen Inc., Valencia, CA, USA) according to the manufacturers' instructions for Gramnegative bacteria. All isolates were identified by polymerase chain reaction (PCR) targeting the p37 gene of M. hyorhinis (Caron et al., 2000;Assunção et al., 2005). The presence of other Mycoplasma species was excluded by a universal Mycoplasma PCR system targeting the 16S/23S rRNA intergenic spacer region in Mollicutes (Lauerman et al., 1995) (Kearse et al., 2012).

MLST assay
House-keeping genes suitable for MLST were selected based on the whole genome sequences of the clinical isolates and other publicly available M. hyorhinis whole genome sequences (M. hyorhinis type strain (NCTC 10130), HUB-1, GDL-1, SK76, DBS1050, MCLD and MDBK-IPV). At first, 17 genes were selected based on MLST alleles described in Mycoplasma sp. before (Manso-Silván et al., 2012;Dijkman et al., 2016;Ghanem and El-Gazzar, 2016;Bekő et al., 2019b) including the ones from the published M. hyorhinis MLST (Tocqueville et al., 2014;Trüeb et al., 2016). Genes were selected based on genome position (even distribution) and variability.  (Tocqueville et al., 2014;Trüeb et al., 2016) also, and MLST profiles were compared with sequences from the PubMLST database (https://pubmlst.org/mhyorhinis/). Gene fragments from dnaA, rpoB, gyrB, gltX, adk and gmk were amplified as described previously by Trüeb et al. (2016) and the PCR products were sequenced on an ABI Prism 3100 automated DNA sequencer (Applied Biosystems). The phylogenetic analysis was performed with Maximum Likelihood method, Tamura-3 model in the MegaX software (Kumar et al., 2018).  Stained gels were photographically documented (Kodak Inc., Rochester, NY, USA) and band sizes were estimated with the Kodak MI SE software package (Kodak). The estimated band sizes were converted to number of repeat units based on the formula in Table 3. The clustering analysis was performed with Neighbor-Joining method based on pairwise distances in the MegaX software (Kumar et al., 2018).

Sensitivity and specificity of the assays
In order to test the sensitivity of the developed assays, tenfold dilutions of the type strain (NCTC 10130) were used in the range of 10 8 -10 0 copy number/µl. Template copy number was calculated with the help of an online tool (Staroscik, 2004)

Comparison of the methods
The discriminatory power of the MLST systems were compared based on the Simpson's index of diversity (ID) and the variable nucleotides based on parsimony-informative sites. The Simpson's ID was determined using the online tool ComparingPartitions (http://www.comparingpartitions.info/). The variable nucleotides based on parsimonyinformative sites were determined for the genes of the MLST systems individually and as concatenated sequences in MegaX software (Kumar et al., 2018). Phylogenetic trees based on the MLST profiles of the 44 isolates used in this study and determined by the three MLST based J o u r n a l P r e -p r o o f systems were also compared. The diversity of the MLVA system described here was determined by calculating the Simpson's ID by the online tool ComparingPartitions.
In order to define the proportion of congruence between the MLST described here, the previously published MLST (Tocqueville et al., 2014;Trüeb et al., 2016), the MLST-s system (Clavijo et al., 2019) and the MLVA described here, the adjusted Rand coefficients were calculated by the online tool ComparingPartitions.

Results of the M. hyorhinis MLST assay
After preliminary sequence analysis of 17 chromosomal genes of M. hyorhinis six genes were selected for development of an MLST assay based on the diversity among the clinical isolates, the type strain (NCTC 10130) and other available strains (HUB-1, GDL-1, SK76, PCR assays was between 10 2 and 10 5 copies/reaction (Table 1)

J o u r n a l P r e -p r o o f
The isolates examined in this study showed high diversity with the previously published MLST scheme (Tocqueville et al., 2014;Trüeb et al., 2016)

Results of the M. hyorhinis MLVA assay
A total of 75 TRs were found in the M. hyorhinis type strain's whole genome sequence.
These were narrowed down to 23 amplified TRs based on length (>12 nucleotides) and lacking insertions or deletions in the repeat region. After preliminary amplifications, where some markers turned out to be monomorphic or presented multiple bands, six alleles (Mhr205, Mhr396, Mhr438, Mhr441, Mhr442, and Mhr444) were selected.  Figure 1). With the MLVA method, no correlation was found between sample source or integration and GT.

Comparison of the methods
The MLST described here has higher resolution (37 STs) than the previously described and ST115 on genes dnaA and rpoB. Nevertheless, the STs among the isolates from Bácsalmás determined by the here described system corresponded better with the available metadata (separating the isolates according to their year of isolation). On the phylogenetic tree based on the MLST-s system (Clavijo et al., 2019) the isolates from Bácsalmás were grouped together (ST49) except for the two latest isolates ST14 (MycSu141) and ST41 (MycSu152). According to the examined house-keeping genes no differences were found between the isolates from this farm, but four nucleotides differed on surface-encoding protein genes mtlD and p95 between ST49 and ST14, while ST14 and ST41 differed in one SNP on gene p95.
Similarly to the here described MLST the three isolates from Hajdúszoboszló were grouped into two STs (ST40 and ST88) based on one SNP on gene dnaA with the previously published MLST system (Tocqueville et al., 2014;Trüeb et al., 2016). However, there were isolates from other farms that were grouped with the isolates from Hajdúszoboszló, MycSu55 for ST40 and The two isolates from Jánoshalma showed similar results with all MLST systems, the two isolate represented ST124 and ST127 and differed in six SNPs on genes gyrB, adk, dnaA and rpoB with the previously published MLST (Tocqueville et al., 2014;Trüeb et al., 2016). While with the MLST-s (Clavijo et al., 2019) system the main differences were found on gene mtlD (30 SNPs), one nucleotide difference was found on gene p95 and one SNP was detected on a house-keeping gene phdB.
With the previously published MLST system (Tocqueville et al., 2014;Trüeb et al., 2016) two isolates from different years and different farms MycSu63, MycSu106 were grouped J o u r n a l P r e -p r o o f together in ST13, while they showed distinct profiles with the MLST-s system (Clavijo et al., 2019) and with the here described MLST. Two isolates from the same year but different farms (MycSu56, MycSu92) were also turned out to be similar and formed ST108 with the previously published MLST system (Tocqueville et al., 2014;Trüeb et al., 2016), but represented different STs with the MLST-s (Clavijo et al., 2019) system and differed with the here described system also.
Other isolates were grouped together with the MLST-s system but differed based on the other two MLST assays. MycSu64 and MycSu65 shared ST40 (MLST-s), while these two isolates were located on the same branch but differed in five nucleotides on genes rpoB and gltX with the here described MLST system, and were located on different branches of the phylogenetic tree based on the previously published MLST system (Tocqueville et al., 2014;Trüeb et al., 2016) differing in six nucleotides total on genes gyrB, gltX, gmk and rpoB.
MycSu56, MycSu57 and MycSu58 were grouped together in ST54 (MLST-s) and represented different STs (ST26, ST25, ST27 respectively), but located on the same branch with the MLST described here. MycSu56 differs from MycSu57 in two nucleotides on genes rpoB and gltX, and differs from MycSu58 in 2 SNPs on gltX. These isolates also represented three different STs with the previously published MLST system (Tocqueville et al., 2014;Trüeb et al., 2016), MycSu56 differs from MycSu57 in one nucleotide on gene gmk, and differs from MycSu58 in four SNPs on genes gyrB, dnaA and gltX. Two other isolates MycSu59 and MycSu60 were grouped together with the MLST-s system (ST67; Clavijo et al., 2019) but located on different branches and differs in 14 SNPs on genes lepA, rpoB, gltX, uvrA and valS with the here described MLST system. Similarly, four SNPs were detected between the two isolates on genes gyrB, adk and gltX with the previously published MLST system (Tocqueville et al., 2014;Trüeb et al., 2016).

J o u r n a l P r e -p r o o f
Interestingly, the isolates MycSu29 and MycSu101 from Bácsalmás, which showed identical MLVA GT and belonged to the same MLST STs based on the previously developed MLST (Tocqueville et al., 2014;Trüeb et al., 2016) and MLST-s (Clavijo et al., 2019) systems were differentiated by the developed MLST system, in accordance with the isolates epidemiologic background.
The adjusted Rand coefficients defining the proportion of congruence between the MLST systems were similar. Compared to the MLVA system, all MLST systems showed low congruence (Supplementary table 2).

Discussion
M. hyorhinis can cause significant economic losses in the swine industry due to reduced weight gain and increased treatment costs (Scheiber and Thacker, 2012). Therefore, better understanding of the bacteria with efficient genotyping tools is key to control and monitor the infections and to carry out epidemiological investigations. Here we described new, high resolution MLST and MLVA systems (Simpson's ID 0.985 and 0.995) to genotype M. hyorhinis isolates and to reveal the phylogenetic relations and epidemiological connections.
Limitations of the study were that all isolates originated from routine diagnostic samplings or from slaughterhouses from mainly Hungary and a short period of time (2014-2019); thus complex epidemiologic analysis could not be carried out. However, with the help of the developed assays high diversity was detected (Supplementary Figure 1), similarly to the previously published M. hyorhinis genotyping studies (Tocqueville et al., 2014;Trüeb et al., 2016;Clavijo et al., 2019) and to a previous study with Central-European M. hyopneumoniae isolates (Felde et al., 2018). Although general conclusions can not be determined due to the J o u r n a l P r e -p r o o f sampling frame of the study, several correlations were found among the analyzed clinical isolates and their genetic characteristics.
With the MLST system we were able to differentiate isolates from different farms, but not within farms which corresponds with the resolution of the assay (Maiden, 2006). From Bácsalmás (the herd with the highest isolate number) the isolates are separated by the year of isolation (ST2: isolates from 2014-2015, ST3: isolates from 2016-2019), so the difference can be due to an evolutional event, which is supported by the sole SNP in the rpoC gene fragment.
The two latest isolates from herd A (MycSu141, MycSu152) differ greatly from the other isolates from Bácsalmás and from each other, which indicates that new strains were introduced later to the herd. Introduction of new strains was also detected in case of Hajdúszoboszló (ST22 and ST30) and Jánoshalma (ST8 and ST13), which was confirmed by the previously published MLST and MLST-s systems also (Tocqueville et al., 2014;Trüeb et al., 2016;Clavijo et al., 2019). hyopneumoniae (Vranckx et al., 2011). The isolate from Beremend shared ST2 with two isolates from Bácsalmás (MycSu23, MycSu29), which can be explained by that the two farms, from where the isolates originated, belong to the same integration. However, we found no further evidence for correlation between ST and integration.
The novel and the previously published (Tocqueville et al., 2014;Trüeb et al., 2016;Clavijo et al., 2019) MLST schemes were compared based on the results of the phylogenetic analysis of the same 44 strains (Supplementary table 1). For the new MLST scheme the six most diverse gene fragments were selected from the originally analyzed 17 house-keeping genes, which J o u r n a l P r e -p r o o f helped to increase the number of parsimony-informative sites from 0.65 parsim% (Tocqueville et al., 2014;Trüeb et al., 2016) to 1.12 parsim% (MLST scheme described here). However, the MLST-s system (Clavijo et al., 2019) has a higher number of parsimony-informative sites (parsim% 8.12), due to the use of highly variable surface protein coding genes. Nevertheless, the MLST assay described here showed higher resolution (Simpson's ID 0.985) than the previously published MLST system (Simpson's ID 0.975; Tocqueville et al., 2014;Trüeb et al., 2016) and even than the MLST-s system (Simpson's ID 0.962;Clavijo et al., 2019). A possible explanation for the observed higher resolution of the here described MLST despite the lower number of parsimony-informative sites can be the stochastic nature of mutations. The surface protein coding genes used in the MLST-s system have higher mutation rates than the housekeeping genes used by the MLST; thus, there is higher chance that mutations in the surface protein coding genes result in homoplastic alleles (Estoup et al., 2002;Urwin and Maiden, 2003).
During the comparison of the novel and previously published MLST schemes and the MLST-s system the phylogenetic trees were compiled based on typing data of the same 44 isolates with the best DNA model according to MegaX software (Kumar et al., 2018). When the same 44 isolates were analyzed the dendrogram based on the MLST system described here turned out to be more robust than of the previously published system (Tocqueville et al., 2014;Trüeb et al., 2016). The strains showed high variability based on limited number of SNPs [SNPs/concatenated sequence: 32/2304 for the previously published system (Tocqueville et al., 2014;Trüeb et al., 2016) and 73/3560 for the here described system], with both MLST systems, which might limit the robustness of the constructed phylogenetic tree. The phylogenetic tree of the MLST-s system (Clavijo et al., 2019) showed the highest robustness of the three compared MLST based methods. However, the robustness and the resolution of this system depended on the elevated number of SNPs on the surface-encoding protein gene mltD (SNPs/concatenated J o u r n a l P r e -p r o o f sequence: 134/1441; SNPs/mtlD gene fragment: 110/537). House-keeping genes are under stabilizing selection for conservation of metabolic functions, therefore they can indicate genetic relationships more reliably than genes under positive selection, like surface protein coding genes (Urwin and Maiden, 2003); therefore the previously published and the novel MLST assays probably describe better the phylogenetic relations among the isolates compared to the MLST-s system, despite their lower robustness. The developed MLST system is well reproducible and robust, appropriate to establish midterm genetic relationships of the examined M. hyorhinis isolates. Moreover, the achieved high resolution of the developed MLST system together with detailed metadata of isolates from the same, clinically affected herd may enable the identification of the source of infection during epidemiologic investigations.
MLVA is applied to analyze tandem repeat regions, which have high mutation rates hence could detect short-term evolutional events in bacterial strains. The developed MLVA was able to differentiate isolates within farms, and within the same MLST STs. The detected high diversity of the strains is in accordance with the high genome variability of mycoplasmas (Razin, 1985). The GT/number of isolates proportion of the here described MLVA system was 92.5% (37 GT/40 isolate), higher than the MLVA system by Dos Santos et al. (2015), which was 9.69% (16 GT/165 isolate). By using more TR regions in the MLVA assay higher robustness and lower chance of homoplastic profiles can be achieved (Vergnaud and Pourcel, 2006). The established MLVA is suitable for the rapid and cost-effective fine-scale typing of the isolates, and applicable in clinical outbreaks to locate the source of M. hyorhinis infections.
The isolates from the same animal (MycSu32i, MycSu32s, MycSu32t) were grouped into the same MLST ST and MLVA GT, which presumes that the same M. hyorhinis strain caused the clinical signs in every affected tissue in this case. Two isolates (MycSu29 and MycSu101) from Bácsalmás showed the same MLVA GT, but were collected from different epidemics and J o u r n a l P r e -p r o o f represented different STs only with the developed MLST analysis. In this case, we suppose that the stochastic nature of mutations is responsible for the detection of the same MLVA GT in two unrelated isolates (Estoup et al., 2002). These two closely related isolates were differentiated based on only one SNP detected only by the developed MLST assay, which highlights the system's suitability for the fine-scale typing of the highly variable M. hyorhinis, enabling the better understanding of the evolution of this pathogen.
The described MLST and MLVA assays represent convenient, well reproducible and high resolution molecular tools and revealed high variability of the examined 39 mainly Hungarian M. hyorhinis isolates. While the MLVA assay was suitable for the differentiation of isolates within the same MLST ST and proved to be applicable for epidemiologic studies alone, the combined MLST and MLVA method is recommended to explore both mid-and short term relations of the isolates in phylogenetic studies.  The M. hyorhinis type strain is highlighted grey, the M. hyorhinis strains with available whole genomes are framed. Abbreviations: ST-sequence type, t-tonsilla, s-serosa, i-synovial fluid. Tables   Table 1 Primer sequences, genome positions and sensitivity of the multi-locus sequence typing assays Table 1 J o u r n a l P r e -p r o o f Mhr442  442,537 -442,591   TGG TTC ATC TAC AAG CGG  AGG   TGA ATC TGA TCC TGT TCC  AGT CTG   Mhr444  444,088 -444,479   ATA TTG GGT TTT TTG TTT  GTA ATT TCT T   ATC AGG AAC ATC TAC AAG  CGG AG   Table 3 Characteristics of the tandem repeat regions of the developed multiple-locus variable-number tandem-repeat analysis a The PCR product size is the size of the VNTR flanking region plus n*repeat size, with n being the number of repeats.