Genome Sequence of an Arthroconidial Yeast, Saprochaete fungicola CBS 625.85

Saprochaete fungicola is an arthroconidial yeast classified in the Magnusiomyces/Saprochaete clade of the subphylum Saccharomycotina. Here, we report the genome sequence of holotype strain CBS 625.85, assembled to five putative chromosomes.

Saprochaete fungicola is an anamorphic yeast that reproduces by fragmentation of hyphae into asexual spores called arthroconidia (1). Arthroconidia facilitate dissemination, and in some pathogenic fungi, their formation contributes to virulence (2). To provide a resource for studying the genetic control of arthroconidiogenesis, we determined the genome sequence of strain CBS 625.85, originally isolated from ascocarps of Nectria cinnabarina, a plant-pathogenic fungus causing coral spot (1). Genomic DNA sequencing was performed using a combination of HiSeq 2000 (Illumina) and MinION (Oxford Nanopore Technologies) platforms. DNA was isolated using a standard protocol (3), and total cellular RNA was prepared from a culture grown in yeast extract-peptone-galactose (YPGal) medium (1% [wt/vol] yeast extract, 2% [wt/vol] peptone, and 2% [wt/vol] galactose) at 28°C using hot phenol extraction (4) and an RNeasy minikit (Qiagen).
Assembly with SPAdes v. 3.12.0 (5) resulted in 367 contigs (N50 value, 0.4 Mbp), and long-read assembly with miniasm v. 0.3-r179 (6) and minimap2 v. 2.13-r852 (7) polished by racon v. 1.3.1 (8) yielded 10 contigs (N50 value, 5.1 Mbp). Two contigs contained mitochondrial DNA (mtDNA), of which one complete copy (circular 33-kbp contig) was retained for the final assembly. Two contigs not supported by Illumina reads were discarded as contamination. One ribosomal DNA (rDNA) contig was discarded, and eight copies of the rDNA cluster were retained elsewhere. To further correct the long-read assembly, the rDNA cluster was polished separately by two iterations of pilon v. 1.21 (9) with BWA-MEM v. 0.7.17-r1188 (10), and four contig ends were extended using SPAdes contigs. Five local misassemblies, identified by a significant decrease in long-read coverage, were also corrected using the SPAdes assembly. All changes were based on reliable long-range overlaps and supported by multiple long reads. The final assembly contains five nuclear contigs of lengths 5.4, 5.1, 4.4, 3.2, and 2.1 Mbp, with an overall G+C content of 40.6%. These likely correspond to full-length chromosomes since they terminate at both ends with putative telomeric repeats ([AACAG]_{0-1}A_{2-6}G_{0-1}A_{0-1}G_{4-7}), with the predominating motif A_2GAG_6.
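As a minimal sketch of two quantities reported above, the following Python snippet computes an N50 value from a list of contig lengths and expresses the putative telomeric repeat pattern as a regular expression. The contig lengths are the rounded values given in the text, not exact sequence lengths. The names n50, FINAL_CONTIGS_BP, TELOMERE_UNIT, and count_units are illustrative and do not come from the original analysis. The regular expression is one reading of the published pattern on a single strand; the opposite chromosome end would carry its reverse complement, which this sketch does not handle.

```python
import re


def n50(lengths):
    """Return the N50: the length of the shortest contig in the smallest set
    of longest contigs that together cover at least half of the assembly."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Rounded nuclear contig lengths from the text (assumption: exact sizes differ).
FINAL_CONTIGS_BP = [5_400_000, 5_100_000, 4_400_000, 3_200_000, 2_100_000]

# One repeat unit: optional AACAG, 2-6 A, optional G, optional A, 4-7 G.
TELOMERE_UNIT = re.compile(r"(?:AACAG)?A{2,6}G?A?G{4,7}")


def count_units(sequence):
    """Count non-overlapping matches of the repeat unit in a sequence."""
    return sum(1 for _ in TELOMERE_UNIT.finditer(sequence.upper()))
```

For the rounded lengths above, n50(FINAL_CONTIGS_BP) returns 5100000, and the predominating motif A_2GAG_6 (AAGAGGGGGG) is matched in full by TELOMERE_UNIT.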
RNA-Seq reads, processed by Trimmomatic v. 0.36 (11), were assembled into transcripts by Trinity v. 2.8.4 (12) and aligned to the genome by blat v. 36x2 (13). Augustus v. 3.2.3 (14) trained on the related Magnusiomyces capitatus genome (15) with RNA-Seq evidence was used for initial gene predictions. Of these genes, 4,785 best supported by RNA-Seq transcripts (99% identity on 99% of length as identified by blat) were used to retrain Augustus parameters for S. fungicola and, together with the RNA-Seq evidence, to predict the final set of 6,138 protein-coding genes. The high-contiguity genome sequence of S. fungicola will be instrumental in comparative and functional studies focused on biology and evolution of arthroconidial yeasts.
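The support criterion above (99% identity on 99% of length) could be applied to blat output roughly as follows. This is a sketch under stated assumptions, not the procedure actually used: it assumes headerless, tab-separated PSL records, defines identity as matching bases over aligned bases, and defines coverage as the aligned query span over the query length. The text does not specify these definitions, and the function name passes_support_filter is hypothetical.

```python
def passes_support_filter(psl_line, min_identity=0.99, min_coverage=0.99):
    """Return True if one PSL alignment record meets both thresholds.

    Assumed definitions (not taken from the source):
      identity = (matches + repMatches) / (matches + misMatches + repMatches)
      coverage = (qEnd - qStart) / qSize
    """
    fields = psl_line.rstrip("\n").split("\t")
    if len(fields) < 21:
        raise ValueError("expected a 21-column PSL record")

    # PSL columns 1-3: matches, misMatches, repMatches.
    matches, mismatches, rep_matches = (int(x) for x in fields[0:3])
    # PSL columns 11-13: qSize, qStart, qEnd.
    q_size, q_start, q_end = (int(x) for x in fields[10:13])

    aligned = matches + mismatches + rep_matches
    if aligned == 0 or q_size == 0:
        return False

    identity = (matches + rep_matches) / aligned
    coverage = (q_end - q_start) / q_size
    return identity >= min_identity and coverage >= min_coverage
```

Under these definitions, gaps in the query reduce neither identity nor coverage, so a stricter filter might also bound the inserted bases reported in the PSL record.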
Data availability. The genome assembly has been deposited in ENA under the accession number CAACAH010000000, and the Illumina, MinION, and RNA-Seq reads were deposited in SRA under the accession numbers ERR3046939, ERR3046967, and ERR3046965, respectively. The genome annotations are available through a genome browser at http://genome.compbio.fmph.uniba.sk/ and are also archived through Zenodo (16).

ACKNOWLEDGMENTS
This project was supported by grants from the Slovak Research and Development Agency (APVV-14-0253 to J.N. and 15-0022 to L.T.) and VEGA (1/0052/16 to L.T., 1/0684/16 to B.B., and 1/0458/18 to T.V.). The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.