Chromosome-level genome sequence data and analysis of the white koji fungus, Aspergillus luchuensis mut. kawachii IFO 4308

Aspergillus luchuensis mut. kawachii is used primarily in the production of shochu, a traditional Japanese distilled alcoholic beverage. Here, we report the chromosome-level genome sequence of A. luchuensis mut. kawachii IFO 4308 (NBRC 4308) and a comparison of the sequence with that of A. luchuensis RIB2601. The genome of strain IFO 4308 was assembled into nine contigs consisting of eight chromosomes and one mitochondrial DNA segment. The nearly complete genome of strain IFO 4308 comprises 37,287,730 bp with a GC content of 48.85% and 12,664 predicted coding sequences and 267 tRNAs. Comparison of the IFO 4308 and RIB2601 genomes revealed a highly conserved structure; however, the IFO 4308 genome is larger than that of RIB2601, which is primarily attributed to chromosome 5. The genome sequence of IFO 4308 was deposited in DDBJ/ENA/GenBank under accession numbers AP024425–AP024433.


© 2022 The Author(s).

Value of the Data
• The white koji fungus, Aspergillus luchuensis mut. kawachii, is used in the production of the traditional Japanese distilled spirit shochu.
• The chromosome-level genome sequence of the white koji fungus can assist shochu brewers and researchers studying koji fungi.
• These data are useful for comparative genomics studies of koji fungi, providing further insights into the genetic background of the white koji fungus that makes it superior for use in shochu production.

Data Description
The white koji fungus, Aspergillus luchuensis mut. kawachii, is primarily used to produce shochu, a traditional distilled alcoholic beverage indigenous to Japan [1-3]. The white koji fungus plays an important role in supplying amylolytic enzymes that decompose starch in shochu ingredients, such as rice, barley, buckwheat, and sweet potato. The fungus also secretes large amounts of citric acid, which prevents the growth of contaminating microbes during the fermentation process. We previously reported the genome sequence of A. luchuensis mut. kawachii IFO 4308 (NBRC 4308) [4]. In addition, genome sequences of four other white koji fungi have recently been reported [5]. However, as these sequences were incomplete draft genome assemblies, we conducted a chromosome-level genome analysis of strain IFO 4308.
The nearly complete genome of strain IFO 4308 comprises 37,287,730 bp with a GC content of 48.85% and 12,664 predicted coding sequences and 267 tRNAs. Quality assessment identified 97.7% complete and single-copy, 0.2% complete and duplicate-copy, 0.9% fragmented-copy, and 1.2% missing Benchmarking Universal Single-Copy Orthologs (BUSCOs) [6]. We confirmed that most of the missing BUSCOs were actually present in the genome of IFO 4308. The discrepancy was attributed to technical limitations in gene prediction [6]. Details regarding the chromosomes present in strain IFO 4308 are summarized in Table 1.
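Summary statistics such as total assembly length and GC content can be recomputed from the deposited FASTA records. The following is a minimal sketch, not the authors' procedure: the function name `fasta_stats` and the toy sequences are hypothetical, and it assumes every base (including any ambiguous base such as N) counts toward the denominator, which may differ from the convention used for the reported 48.85%.

```python
from io import StringIO


def fasta_stats(handle):
    """Return (total_bp, gc_percent) over all records in a FASTA stream.

    Assumption: all sequence characters count toward the total length,
    and only G and C (case-insensitive) count toward GC.
    """
    total = 0
    gc = 0
    for line in handle:
        line = line.strip()
        # Skip blank lines and FASTA header lines.
        if not line or line.startswith(">"):
            continue
        seq = line.upper()
        total += len(seq)
        gc += seq.count("G") + seq.count("C")
    gc_percent = 100.0 * gc / total if total else 0.0
    return total, gc_percent


# Toy input standing in for a real assembly file (hypothetical records).
toy = StringIO(">toy_chr1\nATGC\nGGCC\n>toy_chr2\nAATT\n")
total_bp, gc_pct = fasta_stats(toy)
print(total_bp, round(gc_pct, 2))  # 12 50.0
```

For a real assembly, the same function can be given an open file handle, for example `with open(path) as fh: fasta_stats(fh)`, where `path` points to a locally downloaded FASTA file.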
Aspergillus luchuensis mut. kawachii is an albino mutant of a particular A. luchuensis black koji fungus; however, the parent strain of IFO 4308 remains unknown [1-3,7]. Determination of the nearly complete genome sequence of IFO 4308 enabled us to compare its genomic structure with that of A. luchuensis RIB2601, the nearly complete genome of which was sequenced previously [8]. The genome of strain RIB2601 is 35,508,746 bp in size [8], which is smaller than that of strain IFO 4308. Genome comparison indicated a high degree of conservation in the genome structures of strains IFO 4308 and RIB2601, with the larger genome of IFO 4308 primarily attributed to chromosome 5 (Fig. 1). Differences in the genomes could have resulted from transposable elements, such as retrotransposons, because putative reverse transcriptase-encoding genes and long interspersed nuclear elements (LINEs) have been identified in the region specific to IFO 4308 (indicated by triangles and lines in Fig. 1).
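The size difference between the two genomes follows directly from the totals quoted above. The short calculation below is only an arithmetic illustration; the variable names are hypothetical, and the result is a whole-genome difference that says nothing about how the extra sequence is distributed among chromosomes.

```python
# Genome sizes in bp, as reported in the text.
IFO4308_BP = 37_287_730
RIB2601_BP = 35_508_746

# Absolute and relative size difference, taking RIB2601 as the reference.
diff_bp = IFO4308_BP - RIB2601_BP
diff_pct = 100.0 * diff_bp / RIB2601_BP

print(f"{diff_bp:,} bp ({diff_pct:.2f}% larger)")  # 1,778,984 bp (5.01% larger)
```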

Sequencing and assembly
Strain IFO 4308 was grown in yeast extract-peptone-dextrose medium (2% [wt/vol] glucose, 1% [wt/vol] yeast extract, and 2% [wt/vol] peptone). After cultivation at 30 °C with shaking at 163 rpm for 24 h, mycelia were harvested by filtration. The cell pellet was freeze-dried and ground into powder using a mortar and pestle. DNA was extracted from the mycelial powder using DNAs-ici!-F DNA extraction reagent (Rizo, Inc., Tsukuba, Japan). The genome of strain IFO 4308 was sequenced using a hybrid approach that combined Oxford Nanopore Technologies (ONT) MinION and Illumina NovaSeq 6000 sequencing. ONT long reads were used for de novo assembly, whereas Illumina short reads were used for error correction. The genomic library for ONT sequencing was prepared using a Ligation Sequencing Kit (SQK-LSK109) and sequenced via MinION using a flow cell (R9.4.1). Adapter sequences were trimmed using Porechop v0.2.4, and chimeric reads were removed using Yacrd v0.6.1, yielding 1,664,000 ONT reads (mean length, 7,354 bp). The genomic library for Illumina sequencing was prepared using a NEBNext Ultra II DNA Library Prep Kit (E7645) and sequenced via the NovaSeq 6000 using a paired-end sequencing strategy. The Illumina reads were filtered using Fastp v0.20.1 with default parameters, yielding 42,205,278 reads (mean length, 150 bp). The ONT and Illumina reads provided 328× and 169× sequence coverage, respectively. De novo assembly of the ONT reads was performed using Canu v2.0 [9], and the initial assembly and the trimmed and corrected ONT reads were reassembled using Flye v2.8-b1674 [10]. Next, several contigs were bridged by contigs generated using MaSuRCA v3.4.2 [11]. The assembly with superior metrics was selected based on telomere-to-telomere chromosome assembly. Assemblies were polished with the ONT reads using medaka v1.0.3 [12] and pilon v1.23 [13], and with the Illumina reads using pilon v1.23 [13].
The resulting assembly consisted of nine contigs corresponding to eight chromosomes and one mitochondrial DNA segment. Chromosomes 2, 3, 5, 6, 7, and 8 were generated using only Canu and Flye, whereas chromosomes 1 and 4 were generated via an assembly in which two contigs were bridged using a MaSuRCA contig.
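The coverage figures reported in this section can be checked with the usual estimate of mean coverage, namely the number of reads multiplied by the mean read length and divided by the genome size. The sketch below is an illustration under stated assumptions rather than the authors' calculation: the helper name `mean_coverage` is hypothetical, the read counts are taken as totals across all reads, and the final assembly length is used as the genome size.

```python
def mean_coverage(n_reads, mean_read_len_bp, genome_size_bp):
    """Estimate mean sequencing depth as total sequenced bases / genome size."""
    return n_reads * mean_read_len_bp / genome_size_bp


GENOME_BP = 37_287_730  # assembly size reported for IFO 4308

# Read counts and mean lengths as reported after trimming and filtering.
ont_cov = mean_coverage(1_664_000, 7_354, GENOME_BP)
illumina_cov = mean_coverage(42_205_278, 150, GENOME_BP)

print(f"ONT: {ont_cov:.1f}x, Illumina: {illumina_cov:.1f}x")
# ONT: 328.2x, Illumina: 169.8x
```

Under these assumptions, the estimates truncate to the reported 328× and 169×.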

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.