Complete Genome Sequence of the Ice-Nucleation-Active Pseudomonas syringae Strain MUP17, Isolated from the Frost-Damaged Barley Cultivar Hordeum vulgare cv. La Trobe

ABSTRACT Pseudomonas syringae MUP17 was isolated from Western Australian frost-damaged barley. The MUP17 complete genome contained a 5,850,185-bp single circular chromosome with a GC content of 59.12%. IMG/M genome annotation identified 5,012 protein-coding genes, 1 of which encoded an ice-nucleation protein containing 19 occurrences of a highly repetitive PF00818 domain.

The Pseudomonas syringae species complex (1, 2) is divided into 13 distinct phylogroups, with many strains containing ice-nucleation proteins that contribute to frost damage in crops (3, 4). As part of an ongoing study, we have been identifying bacteria that dominate frost-damaged crops across Australia. Here, we report the complete genome sequence of a P. syringae isolate designated MUP17 (= WAC15077), which was isolated (5) from mid-April-sown barley (Hordeum vulgare cv. La Trobe) that had been frost damaged at the reproductive stage and was collected from a frost trial site in Wickepin, Western Australia, Australia, in 2017. Further details of the trial can be found in previous research (6).
A pure culture of this strain was grown in lysogeny broth (LB) and cryopreserved in 15% glycerol at −80°C. High-molecular-weight genomic DNA was isolated from a logarithmic-phase culture as described previously (7), and the same DNA preparation was sequenced using both Illumina and Oxford Nanopore Technologies (ONT) platforms. ONT libraries were prepared according to the ONT 1D ligation library preparation protocol (SQK-LSK109) and sequenced with a FLO-MIN-106D flow cell (R9.4.1) on a MinION platform. Guppy v3.2.6 was used for base calling with a read-pass-filter quality score cutoff value of 7. A total of 192,014 ONT reads were generated, producing a total of 1,747,664,262 bp with ~300× coverage and a read N50 value of 9,102 bp.
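For readers unfamiliar with the read N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed from a list of read lengths. The function name `read_n50` and the toy input are illustrative assumptions; this is not the tool that produced the reported value.

```python
def read_n50(lengths):
    """Return the N50 of a collection of read lengths.

    N50 is the length L such that reads of length >= L together
    contain at least half of all sequenced bases.
    """
    if not lengths:
        raise ValueError("at least one read length is required")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest read down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not data from this study).
toy_lengths = [2, 3, 4, 5, 6]
print(read_n50(toy_lengths))
```

In practice the lengths would be read from the base-called FASTQ files rather than typed in by hand.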
An Illumina library was prepared with the NuGEN Celero DNA-sequencing library preparation kit, following the manufacturer's protocol, and the library was sequenced on an Illumina NextSeq platform using a mid-output 2 × 150-bp kit. This workflow generated a total of 1,148,718 paired reads, producing a total of 168,091,574 bp with ~30× coverage of the genome.
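The approximate coverage figures follow from dividing the total sequenced bases by the chromosome length reported in this announcement. The short sketch below reproduces that arithmetic; the function name `mean_coverage` is an illustrative assumption, and the calculation ignores unmapped or filtered reads.

```python
# Chromosome length reported for MUP17 (bp).
GENOME_SIZE_BP = 5_850_185


def mean_coverage(total_bases, genome_size=GENOME_SIZE_BP):
    """Approximate mean depth: total sequenced bases / genome length."""
    return total_bases / genome_size


ont_cov = mean_coverage(1_747_664_262)      # ONT long reads
illumina_cov = mean_coverage(168_091_574)   # Illumina short reads

print(f"ONT: {ont_cov:.1f}x")            # ONT: 298.7x
print(f"Illumina: {illumina_cov:.1f}x")  # Illumina: 28.7x
```

Both values round to the approximate coverages stated in the text.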
ONT long reads were assembled using the Flye v2.9 assembler (8) with default parameters except for the number of iterations, which was set to 10. Short reads were mapped to the long-read assembly using minimap2 (9) to correct ONT-generated sequencing errors. The final genome contained a single circular chromosome (5,850,185 bp), with a GC content of 59.12%. The resulting chromosome was manually orientated using Geneious Prime (10) to position the dnaA gene at the first position in the sequence. A quantitative assessment of the genome assembly was performed using BUSCO v5.3.2 (11), which provided a completeness score of 99.7%. A comparison of average nucleotide identity values calculated using BLAST (ANIb) (12-14) for Pseudomonas genomes revealed that MUP17 is a P. syringae strain (98% identity of the genome to the P. syringae type strain DSM 10604 [GenBank accession number NZ_JALK00000000.1]) that belongs in phylogroup 2 (Fig. 1).
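Two of the steps above, reorienting a circular chromosome so that it begins at a chosen gene and computing GC content, are simple string operations. The sketch below illustrates them on a toy sequence. The function names and the start index are hypothetical; the reorientation in this study was done manually in Geneious Prime, and a gene on the reverse strand would additionally require reverse-complementing the sequence.

```python
def rotate_circular(seq, new_start):
    """Rotate a circular sequence so that 0-based index new_start comes first."""
    if not 0 <= new_start < len(seq):
        raise ValueError("new_start must lie within the sequence")
    return seq[new_start:] + seq[:new_start]


def gc_content(seq):
    """Return the percentage of G and C bases in a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


# Toy circular sequence; index 2 stands in for a gene start position.
toy = "AAGCTT"
rotated = rotate_circular(toy, 2)
print(rotated)                       # GCTTAA
print(f"{gc_content(rotated):.2f}")  # 33.33
```

Rotation leaves the base composition unchanged, so the GC content is the same before and after reorientation.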
Gene calling and annotation of the generated chromosomal sequence were performed using the IMG/M annotation pipeline (IMGAP v5.1.8) (15) and the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) (16). A genome annotation comparison is summarized in Table 1.
Of particular interest was the identification of a gene (locus tag Ga0555974_01_1713522_1717262) encoding an ice-nucleating protein that contained 19 repeats of the ice-nucleation domain PF00818 (17).
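The locus tag appears to end in two numbers that look like chromosomal coordinates. Assuming (this is not stated in the text) that they are 1-based, inclusive start and end positions, the gene span can be estimated as in the sketch below; the function name `parse_locus_coordinates` is hypothetical.

```python
def parse_locus_coordinates(locus_tag):
    """Split a locus tag on '_' and return its last two fields as integers.

    Assumption: the final two fields are start and end coordinates.
    """
    parts = locus_tag.split("_")
    return int(parts[-2]), int(parts[-1])


start, end = parse_locus_coordinates("Ga0555974_01_1713522_1717262")
length_bp = end - start + 1   # inclusive span under the stated assumption
codons = length_bp // 3       # whole codons, including any stop codon

print(length_bp, length_bp % 3, codons)  # 3741 0 1247
```

If that reading of the tag holds, the span is 3,741 bp, a multiple of three, which is consistent with a single open reading frame.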
Data availability. The whole-genome sequence for P. syringae MUP17 has been deposited in the publicly facing Genomes OnLine Database (GOLD) v8 (18) (project identification number Gp0618314), GenBank database (accession number CP110807), and IMG/M database (genome identification number 2966791590). The raw sequencing reads have been deposited in GenBank under the accession numbers SRR22226936 and SRR22252532 for long reads and short reads, respectively.

FIG 1 ... (12) and imported into DendroUPGMA, and the tree was constructed using a similarity matrix within the algorithm (13). The DendroUPGMA-generated tree was exported into Interactive Tree Of Life (iTOL) (14) for visualization. The superscript T indicates type strains, and the superscript PT indicates pathotype and type strains. Pseudomonas aeruginosa DSM 50071T was used as an outgroup.