Comparative Genomic Analyses of Lactococcus garvieae Isolated from Bovine Mastitis in China

ABSTRACT Lactococcus garvieae is an emerging zoonotic pathogen, but there are few reports regarding bovine mastitis. The prevalence of L. garvieae represents an increasing disease threat and global public health risk. Thirty-nine L. garvieae isolates were obtained from 2,899 bovine clinical mastitis milk samples in 6 provinces of China from 2017 to 2021. Five clonal complexes were determined from 32 multilocus sequence types (MLSTs) of L. garvieae: sequence type 46 (ST46) was the predominant sequence type, and 13 novel MLSTs were identified. All isolates were resistant to chloramphenicol and clindamycin, but susceptible to penicillin, ampicillin, amoxicillin-clavulanic acid, imipenem, ceftiofur, enrofloxacin, and marbofloxacin. Based on genomic analyses, L. garvieae had 6,310 genes, including 1,015 core, 3,641 accessory, and 1,654 unique genes. All isolates had virulence genes coding for collagenase, fibronectin-binding protein, glyceraldehyde-3-phosphate dehydrogenase, superoxide dismutase, and NADH oxidase. Most isolates had lsaD and mdtA antimicrobial resistance (AMR) genes. Based on COG (Clusters of Orthologous Genes database) results, the functions of defense, transcription and replication, and recombination and repair were enhanced in unique genes, whereas functions of translation, ribosomal structure, and biogenesis were enhanced in core genes. The KEGG functional categories enriched in unique genes included human disease and membrane transport, whereas COG functional categories enriched in core genes included energy metabolism, nucleotide metabolism, and translation. No gene was significantly associated with host specificity. In addition, analysis of core genome single nucleotide polymorphisms (SNPs) implied potential host adaptation of some isolates in several sequence types. In conclusion, this study characterized L. garvieae isolated from mastitis and detected potential adaptations of L. garvieae to various hosts. IMPORTANCE This study provides important genomic insights into a bovine mastitis pathogen, Lactococcus garvieae. Comprehensive genomic analyses of L. garvieae from dairy farms have not been reported. This study is a detailed and comprehensive report of novel features of isolates of L. garvieae, an important but poorly characterized bacterium, recovered in the past 5 years in 6 Chinese provinces. We documented diverse genetic features, including predominant sequence type ST46 and 13 novel MLSTs. Lactococcus garvieae had 6,310 genes, including 1,015 core, 3,641 accessory, and 1,654 unique genes. All isolates had virulence genes coding for collagenase, fibronectin-binding protein, glyceraldehyde-3-phosphate dehydrogenase, superoxide dismutase, and NADH oxidase and resistance to chloramphenicol and clindamycin. Most isolates had lsaD and mdtA antimicrobial resistance genes. However, no gene was significantly associated with host specificity. This is the first report that characterized L. garvieae isolates from bovine mastitis and revealed potential host adaptations of L. garvieae to various hosts.

Lactococcus garvieae is a zoonotic pathogen reported to cause infections in fish (4) and humans (5)(6)(7)(8). It is also considered a minor pathogen for bovine mastitis, with transmission attributed to environmental reservoirs (2). There are limited reports regarding bovine mastitis caused by L. garvieae (9-15)-primarily descriptive studies of phenotypes or genotypes. However, detailed whole-genome characterization of L. garvieae associated with bovine mastitis is lacking.
Predominant strain types have been described for various mastitis pathogens (16)(17)(18)(19)(20). Elucidating population structure and diversity of mastitis pathogens informs evidence-based mastitis control programs that target those prevalent strain types.
Bacterial pathogenicity is primarily determined by virulence genes: some facilitate adhesion and invasion, whereas antimicrobial resistance (AMR), particularly multidrug resistance, is an important threat to public health (21). For L. garvieae, several virulence genes and antimicrobial resistance genes were identified using traditional methods (e.g., PCR) targeted at specific virulence genes (22)(23)(24). However, a comprehensive profiling of its virulence and antimicrobial resistance genes is lacking.
Host adaptation of bovine mastitis-associated pathogens has been reported for Staph. aureus (25) and Strep. agalactiae (26). Infections caused by L. garvieae in humans, fish, and cattle have been reported, but potential host adaptation of L. garvieae has not been studied. Therefore, with regard to L. garvieae, our objectives were to (i) resolve the population structure, (ii) identify virulence genes and antimicrobial resistance genes, and (iii) determine genes associated with host specificity.

RESULTS
A total of 39 L. garvieae isolates were derived from 2,899 clinical mastitis composite milk samples collected from 13 large dairy farms in 6 provinces in North China from April 2017 to September 2021. Detailed information of these isolates is provided in Table 1. In addition, 47 high-quality L. garvieae assemblies from NCBI were included in our survey (see Table S1 in the supplemental material).
Six enzyme-related virulence genes were detected in almost all isolates, although eno and srtA were exclusively absent from 2 isolates (MGBC116427 and UBA11300) and 4 isolates (A1, EP01, DCC43, and FDAARGOS_893), respectively. Regarding 16 genes encoding the capsule gene cluster, the number of genes present in isolates varied from 0 to 16. Both Lg2 and JJJN1 had all 16 genes, whereas 47 isolates had none.
Co-occurrence of virulence genes is visualized in Fig. 4, where each box includes a phi coefficient value. A phi value of 1.0 indicates a perfect positive association between the 2 variables, whereas values of .0.7 indicate a fair positive association. Most associations among virulence genes were very weak. However, there were strongly positive associations between 8 genes, including adhPsaA, srtA and 6 iron uptake genes (fecD, feoA, feoB, fepB, fepC, and fepD). In addition, toxin-related genes hly-I and -III were also strongly positively associated with each other and those 8 genes. Furthermore, fecC was strongly positively associated with hly-I, hly-III, and the 8 genes. LPxTG-4 was strongly positively associated with the adhesin gene, whereas fecB had strong associations with the 8 genes. However, there were no strong negative associations among virulence genes.
Pangenome analyses. The pangenome of 86 L. garvieae isolates tested in this study had 6,310 genes. The core genome (shared by 100% of isolates) consisted of 1,015 genes. The accessory genome (genes in .2 isolates but not in all isolates) consisted of 3,641 genes, and the unique genome was composed of 1,654 genes. According to BPGA calculation, the pangenome was open but approached convergence (Fig. 4). Functional annotation of genes revealed a distribution of functional categories among 3 pangenome sets according to the COG (Clusters of Orthologous Genes) and KEGG databases (Fig. 5). The proportions of 4 COG categories were consistent in core, accessory, and unique genes (Fig. 5A). The functions of defense mechanisms, transcription and replication, and recombination and repair were enhanced in unique genes, whereas the functions of translation, ribosomal structure, and biogenesis were enhanced in core genes in KEGG functional pathways (Fig. 5B). The COG functional categories enriched in the unique genome included  human disease and membrane transport (Fig. 5C). In contrast, COG categories enriched in the core genome included energy metabolism, nucleotide metabolism, and translation (Fig. 5D). Phylogenetic analyses. A phylogenetic tree was constructed based on core genes of 86 L. garvieae genomes (see Fig. S1 in the supplemental material). The longer the branch, the more distant the evolutionary relationship. The tree had 3 clades that contained 4, 38, and 44 isolates, respectively. All current isolates, except Hebei-B-22, were in the same clades. EP01 was phylogenetically distant from all others. Isolates from various hosts had signs of host adaptation.
The phylogenetic tree based on 16S rRNA (Fig. S1) was similar to that using core genes, with only minor differences. For example, LG9 was assigned to clade A in the core genebased phylogenetic tree but was in clade B in the 16S rRNA-based phylogenetic tree.
Both trees corresponded well with STs and CCs predicted by GrapeTree analysis. For the core gene-based phylogenetic tree (Fig. S2), all isolates that belonged to the same CC were grouped in the same cluster, except 2 isolates (LG9 and IBB3403) in the 16S rRNA-based phylogenetic tree.
Pangenome-wide association analyses. No significant association was detected between genes and either country or host by pangenome-wide association analysis.
Core genome SNP analyses. The core genome single nucleotide polymorphism (SNP)-based phylogenetic tree with metadata annotation is displayed in Fig. 6. Numbers of core genome SNPs among 86 isolates are provided in Table S3. Several isolates from various hosts were phylogenetically closely related in core SNPs. For example, the numbers of SNPs between isolates within the same MLST but a different host were 0 for DM12426 (human) and CT2 (fish), 5 for 1001287H_170206_H11 (human) and UBA5784 (metal), 11 for Lg-ilsanpaik-gs201105 (human) and Hebei-B-22 (cow), and 105 for 21881 (human) and M14 (cow), implying potential host adaptation in L. garvieae.

DISCUSSION
Although L. garvieae was first isolated as a causative agent of bovine mastitis (27), most reports have focused on epidemiology of isolates from fish or humans. In addition, many studies used sequencing to investigate genotypic characteristics of L. garvieae isolates (8,(28)(29)(30)(31)(32)(33)(34)(35)(36)(37)(38)(39)(40). Therefore, we collected 39 L. garvieae isolates from bovine mastitis in China and conducted comparative genome sequence analyses. This is apparently the first report of the prevalence of L. garvieae in Chinese dairy herds. The prevalence of L. garvieae from clinical mastitis samples was 1.35% during 2017 to 2021, which increased from 0% in 2017 to 4.10% in 2020 in China. That L. garvieae has been misclassified as Streptococcus spp. (41) has resulted in underreporting. Similarly, the true incidence of human infective endocarditis is difficult to assess due to misidentification with other Gram-positive cocci (42).

Comprehensive Genomic Analyses of Lactococcus garvieae
Microbiology Spectrum phylogenetically close to one other. However, they were distant from isolates of bovine mastitis in other countries, which may indicate geographic effects on phylogeny. Meanwhile, new ST profiles were comprised of new alleles in gene loci (e.g., als, gyrB, and galP), perhaps due to a different evolution rate for those loci (38). Understanding phylogenetic relationships between strains is important for characterizing pathogen transmission. In this study, 3 phylogenetic trees were constructed using core genes, core genome SNPs, and 16S rRNA, respectively. The core gene-based tree and 16S rRNA-based tree produced similar clades, but the 16S rRNA-based tree failed to resolve relationships toward tree tips. Furthermore, a core gene tree was in line with MLST and CC. This was not surprising, as a 16S rRNA tree is based on only 1 gene, representing only a very small portion of the genome. Therefore, many studies have recommended core genes to infer phylogenies (45,46).
Although the pathogenicity of L. garvieae is poorly understood, some factors associated with pathogenicity have been determined, including the presence of a capsule, hemolytic activity via secreted proteins (47), and production of siderophores (48). In fish, capsulated L. garvieae Lg2 was more virulent than the noncapsulated isolate ATCC 49156 (49). The capsule gene cluster, located in a genomic island, was identified in Lg2 but absent in ATCC 49156: that could be crucial for virulence of L. garvieae in fish (40). However, the capsule gene cluster was not detected in all clinical fish isolates from Japan, Spain, Italy, France, Turkey (22), the United States (39), or India (50) nor in any human isolates (31,51). In this study, only 2 of 86 isolates had the complete capsule gene cluster, which confirmed that it was not essential for virulence.
Proteases are among the important virulence genes causing rapid and extensive destruction of host tissue. For example, enolase (50) can cleave an extracellular proteinaceous matrix and therefore break a host's structural barrier during colonization. In addition, hemolysin genes may act with secreted proteases to promote host tissue destruction. Genes encoding biosynthesis of iron uptake may be involved in iron acquisition during host colonization (48). LPxTG protein (Leu-Pro-any-Thr-Gly), an important virulence factor in L. garvieae, binds to the peptidoglycan of cell wall by transpeptidase enzymes called sortases (40). In a previous study, an L. garvieae strain isolated from rainbow trout colonized nonphagocytic cells with the help of LPxTG proteins (52). LPxTG proteins and sortases have important roles binding pathogenic bacteria to their host. In this study, genes coding for adhesin, proteases, hemolysin, iron uptake, and LPxTG protein were detected in most or all isolates. Strong positive associations within LPxTG-4, adhPsaA, srtA, and iron uptake genes suggested they may act together to promote host tissue destruction and colonization. Furthermore, the gene pgm, identified in all isolates, produces protein with an important role in antibody production (53). MIC results were consistent with reports that L. garvieae isolates from dairy farms were susceptible to penicillin, ampicillin, amoxicillin-clavulanic acid, imipenem, ceftiofur, enrofloxacin, vancomycin, and marbofloxacin (10, 54). However, compared to a previous report (10), there were variable degrees of increasing resistance rates for 8 antibiotics, including clindamycin (94% to 100%), chloramphenicol (6.4% to 100%), amikacin (2.1% to 90%), cefpodoxime (0% to 83%), cephalothin (0% to 10%), cefazolin (40% to 45%), gentamicin (0% to 38%), and erythromycin (0% to 5%). High resistance of clindamycin has been described as intrinsic for L. garvieae and proposed as a selection criterion to distinguish between L. garvieae and Lactococcus lactis (55). The lsaD gene, identified in most isolates (83/86), could be responsible for intrinsic resistance. lsaD is a novel lsa-type family gene detected in lincomycin-resistant strains isolated from fish (56). The lsa-type genes are responsible for crossresistance to lincosamides, streptogramins, or pleuromutilins [here referred to as the LSA (P)-resistant phenotype], by coding ATP-binding cassette F proteins in Gram-positive pathogens, including staphylococci (57), streptococci (58), enterococci (59), and lactococci (56). Increasing resistance against cephalosporins may be related to increasing use of these antibiotics on Chinese dairies (60).
The spread of antibiotic resistance genes in bacterial populations is aided by various mechanisms of horizontal gene transfer, with plasmid-mediated transfer being the main mechanism for transmission of resistance genes (63). Horizontal gene transfer between bacteria is largely mediated by specialized mobile genetic elements, including plasmids, bacteriophages, transposon, insert sequences (ISs), integrons, etc., and has been reported in L. garvieae. Both tetS and tetM were associated with conjugative transposon-associated genes in isolates from healthy fish intestines (64). Most of the ISs in L. garvieae had substantial homology to Lactococcus lactis elements, implying movement of ISs between these 2 species that are phylogenetically closely related (65,66). That these 9 AMR genes were reported only in humans does not support the assertion that AMR genes are transferred to humans from fish or dairy products. Regardless, L. garvieae could be a reservoir for antibiotic resistance genes for other bacteria.
In this study, no genes were associated with host specificity, consistent with the phylogenetic analysis and the core genome SNP analysis that host adaptation occurs in L. garvieae isolates. Previous research (5) summarized human L. garvieae infections associated with consumption of raw fish, seafood, or unpasteurized milk. The core genome SNP analysis underlies potential host adaptation of L. garvieae. Meanwhile, adhesins, hemolysin, fibronectin-binding proteins, penicillin acylase and WxL domain-containing proteins are considered to actively promote bacterial colonization (67); most had high similarity across hosts in those coding sequences. Regardless, the underlying mechanisms remain unclear. Consequently, further studies are needed to determine host adaptation mechanisms of L. garvieae.
Conclusions. This was apparently the first comparative genomic analysis of L. garvieae isolates from bovine mastitis in China. The incidence of L. garvieae mastitis was 1.35% in China. Most isolates (38/39) were novel sequence types, 3 antimicrobial resistance genes (mdtA, lsaD, and tetS) were identified, and there was evidence of host adaptation.

MATERIALS AND METHODS
Statement of ethics. This study was conducted in accordance with ethical guidelines and standard biosecurity and institutional safety procedures of China Agricultural University (CAU) (Beijing, China). Ethical approval was not needed, as no animal study was involved.

Comprehensive Genomic Analyses of Lactococcus garvieae
Microbiology Spectrum Sample collection and bacterial identification. Milk samples from cases of clinical mastitis were collected aseptically from dairy cows on Chinese dairy farms and sent to the Mastitis Diagnostic Laboratory at the College of Veterinary Medicine, CAU, Beijing, China. Pathogens were identified by bacteriological culture, colony morphology, and 16S rRNA sequencing, according to National Mastitis Council guidelines (68). In brief, 50 mL milk was spread on tryptone soy agar with 5% defibrinated sheep blood. The plate was incubated aerobically at 37°C for 24 h. Bacterial colony morphology was recorded; samples with $3 morphologically distinct colonies were considered contaminated and excluded from subsequent analyses.
Antimicrobial susceptibility testing. For all 39 L. garvieae isolates, MIC of 15 antimicrobials (Chinese National Institutes for Food and Drug Control, Beijing, China), commonly used to treat clinical mastitis in China, were determined by the broth microdilution method, according to the Clinical and Laboratory Standards Institute (CLSI) guidelines VET01-A4 (CLSI, 2013), with reported breakpoints (10). Staphylococcus aureus ATCC 29213 was used as the quality control strain.
Genome assembly and annotation. Genomic DNA of putative isolates was extracted using a bacterial DNA extraction kit (TransGen Biotech, Beijing, China) according to the manufacturer's instruction. Extracted DNA was quantified with a NanoDrop One spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) prior to 16S rRNA gene sequencing (Beijing Sunbiotech, Inc., Beijing, China). Wholegenome DNA was paired-end sequenced (2 Â 150 bp) using Illumina NovaSeq 6000 (Illumina, San Diego, CA, USA) at Shanghai Personal Biotechnology Co., Ltd. (Shanghai, China). For raw reads, quality control was done with FastQC version 0.11.9 (https://github.com/s-andrews/FastQC). Low-quality bases were trimmed using fastp version 0.20.1 (https://github.com/OpenGene/fastp) with default settings. Quality trimmed reads were assembled into scaffolds using SPades version 3.13.1 (https://github.com/ ablab/spades) with auto coverage cutoff and shovill version 1.1.0 (https://github.com/tseemann/shovill) with default settings. Thereafter, 2 assembled scaffolds for each isolate were obtained, and a draft genome of each isolate was selected using Quast version 5.0.2 (https://github.com/ablab/quast) with N 50 , and L 50 from the above-mentioned 2 assembled scaffolds. Assembly completeness was assessed using Busco version 5.2.2 (https://github.com/WenchaoLin/BUSCO-Mod) with reference to lineage lactobacilla-les_odb10. Only genomes with completion $ 95% were considered a "high-quality draft genome" and were included in further analyses (69). In addition, whole-genome sequence assembly fasta files of 51 L. garvieae (accessed on 24 March 2022) were downloaded from NCBI. To ensure high-quality genomes, all genomes were analyzed by BusCom version 5.2.2 (lineage lactobacillales_odb10) and 3 assemblies were excluded from subsequent analyses. There were 3 ATCC 49156 assemblies; we chose the one with the highest assembly level. Therefore, a total of 39 isolates from composite (or quarter) milk samples and 47 assemblies from NCBI were obtained in the subsequent genome annotation and pangenome analysis. Annotation of the genome was performed using Prokka version 1.14.6 (https://github.com/tseemann/ prokka) with default settings.
MLST analyses. Multilocus sequence typing using whole-genome sequences was performed to determine sequence types (STs) of the 86 isolates. An L. garvieae MLST database was constructed based on reported data sets (51) as there was no publicly available MLST scheme for L. garvieae. Similarly, the database was integrated into the ABRicate local database, and the 86 L. garvieae genomes were aligned against the data set by ABRicate. Sequence types were assigned to new allele patterns and added to the existing L. garvieae MLST scheme (51). The clonal complex (CC) was defined as a group of STs in which every ST shared at least 5 of 7 identical allele profiles with at least 1 other ST in the group. The minimum spanning tree (MST) was constructed by the goeBURST algorithm and visualized with the PhyloViz web server (https://online.phyloviz.net/index) to infer phylogenetic relationships among STs.
Pangenome analyses. The pangenome of 86 L. garvieae isolates was computed using BPGA version 1.3 (https://iicb.res.in/bpga/) with USEARCH algorithm to cluster orthologous gene families using faa files of local isolates produced by Prokka and retrieved from NCBI. For BPGA, a core gene was defined as a gene present in all the genomes, an accessory gene was present in .1 genome but not all genomes, and a unique gene was only present in a single genome. Functional annotations of core, accessory, and unique genes were obtained after comparing sequences to COG and KEGG databases incorporated in BPGA version 1.3.
Core-genome SNP analyses. In addition to pangenome-wide association analyses, core genome SNP analyses were also performed. Core genome alignment and SNPs were detected for all 86 genome sequences using parsnp version 1.7.2 (https://github.com/marbl/parsnp). Meanwhile, exact numbers of SNPs among genomes from various hosts in the same MLST group and closely related in the core gene-based phylogenetic tree were determined with snp-dists version 0.8.2 (http://sanger-pathogens.github.io/snp-sites/) using the sequence alignment file from parsnp. The phylogenetic tree based on core genome SNPs was annotated with iTOL.
Associations between co-occurrences of virulence genes. Co-occurrence of virulence genes was determined with a phi coefficient using the Phi function in psych package version 2.2.5 (https://cran.r -project.org/web/packages/psych/) with R version 4.1.3 (https://www.r-project.org/); a P value of ,0.05 was considered significant in a 2-tailed test. Pairwise phi coefficients between the presences of virulence genes were visualized using ggplot2 version 3.3.6 (https://cran.r-project.org/web/packages/ggplot2/).
Data availability. All whole-genome sequence data used in this study are available without restriction from NCBI under BioProject no. PRJNA848370.