Whole-genome sequence data of the cellulase-producing fungus Trichoderma asperellum PK1J2, isolated from palm empty fruit bunch in Riau, Indonesia

Trichoderma asperellum PK1J2 is a promising cellulase-producing fungus isolated from a palm empty fruit bunch in Riau, Indonesia. Presented here is the genome assembly of T. asperellum PK1J2. The whole genome of the fungus was sequenced using Illumina NovaSeq PE150. The genome was assembled with SOAPdenovo, SPAdes, and ABySS, and the results of the three assemblers were integrated with CISA. The final genome assembly was approximately 36 Mbp with a GC content of 48.45%. T. asperellum PK1J2 has 6,835 protein-coding genes with a total length of 9,233,597 bp. This whole genome shotgun project has been deposited at DDBJ/ENA/GenBank under accession JAGJIK000000000.


Specifications
Subject: Biological science
Specific subject area: Genomics, Microbiology
Type of data: Genome sequencing in FASTA format

Value of the Data
• The genome data of Trichoderma asperellum PK1J2 isolated from Indonesia provide insight into the genetic diversity of T. asperellum and essential genetic information on effector proteins and on metabolite and enzyme production.
• The data can be useful for researchers working on fungal microbiology, biotechnology, genomics, and genetic engineering.
• This genome information can be used for genome mining to discover the genes involved in metabolite and enzyme biosynthesis pathways.
• Stakeholders, including industry, can use this genetic information to develop T. asperellum PK1J2 as a biocontrol agent, biofertiliser, and producer of metabolites and enzymes, especially cellulase.

Data Description
T. asperellum is a mycoparasitic species widely used for its ability to inhibit the growth of plant pathogens [1]. T. asperellum has been shown to produce hydrolytic enzymes such as cellulase and xylanase [2, 3]. It has also been reported to hydrolyse wheat bran, wheat straw, paper, sawdust, corncob, duckweed, and agave by secreting cellulases [2-7]. Strain PK1J2 has been shown to produce high cellulase activity. The cellulase from this fungus can hydrolyse cassava stem and sago waste into fermentable sugar [8, 9]. In a previous study, strain PK1J2 produced the highest cellulase activity among the examined fungi isolated from Indonesia and was therefore selected for genome characterization. Fig. 1 shows a phylogenetic tree comparing the internal transcribed spacer (ITS) region of strain PK1J2 with that of other fungi. As can be seen in the figure, the ITS region of strain PK1J2 showed the highest similarity to Trichoderma asperellum.
T. asperellum PK1J2 had 6,835 protein-coding genes with a total length of 9,233,597 bp, as seen in Table 1. Functional gene annotation assigned 4,759 genes using GO, 6,398 genes using KEGG, 1,946 genes using KOG, 4,759 genes using Pfam, 2,783 genes using SWISS-PROT, and 6,544 genes using the NR database. Prediction of genes possibly involved in secondary metabolite production revealed T1PKS, NRPS, NRPS-like, T1PKS-NRPS hybrid, and terpene clusters. A carbohydrate-active enzyme analysis showed that the repertoire of T. asperellum PK1J2 was dominated by the glycoside hydrolase families GH18, GH3, GH16, GH2, and GH5.

Fungal Strain and DNA Extraction
Strain Trichoderma asperellum PK1J2 was obtained from the Laboratory of Biotechnology, Faculty of Agricultural Technology, Universitas Gadjah Mada. The strain was isolated from a rotten palm empty fruit bunch in Pekanbaru, Riau, Indonesia. It was grown on potato dextrose agar (PDA) at 30 °C for seven days. Genomic DNA was extracted using the ZymoBIOMICS™ DNA Mini Kit (Zymo Research, California). The extracted DNA was checked by agarose gel electrophoresis and quantified with a Qubit® 2.0 Fluorometer.

Species Identification
The ITS region was amplified using the universal primer set ITS1 (forward primer) 5′-TCCGTAGGTGAACCTGCGG-3′ and ITS4 (reverse primer) 5′-TCCTCCGCTTATTGATATGC-3′. The PCR product was sequenced in both directions. The resulting sequence was compared against the NCBI database using BLAST. The phylogenetic tree was constructed as an unrooted tree using the Neighbor-Joining method in NCBI BLAST.
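To illustrate how the primer pair delimits the amplified region, the following is a minimal in silico sketch, not the procedure used in this work. It assumes exact primer matches on the forward strand of a template; the function names and the toy template are hypothetical and introduced only for illustration.

```python
# Primer sequences as given in the text (5' to 3').
ITS1 = "TCCGTAGGTGAACCTGCGG"   # forward primer
ITS4 = "TCCTCCGCTTATTGATATGC"  # reverse primer

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def in_silico_amplicon(template, forward, reverse):
    """Return the region bounded by the two primers, primers included.

    Assumption: both primers match the template exactly. Real PCR
    tolerates mismatches, which this sketch does not model.
    """
    template = template.upper()
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the forward
    # strand its binding site appears as the reverse complement.
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Toy template (hypothetical, not real ITS sequence data).
toy_template = "AAAA" + ITS1 + "ACGTACGT" + reverse_complement(ITS4) + "TTTT"
toy_amplicon = in_silico_amplicon(toy_template, ITS1, ITS4)
```

The returned string starts with ITS1 and ends with the reverse complement of ITS4, which is the orientation in which a bidirectionally sequenced product is usually reported on the forward strand.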

Genome Sequencing and Assembly
Sequencing libraries were generated using the NEBNext® Ultra™ DNA Library Prep Kit for Illumina (NEB, USA) following the manufacturer's recommendations. Whole-genome sequencing was performed on an Illumina NovaSeq PE150 platform at Beijing Novogene Bioinformatics Technology Co., Ltd. The genome was assembled using SOAPdenovo, SPAdes, and ABySS. The assembly results from all three assemblers were integrated with CISA. The assembly result with the fewest scaffolds was selected.
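The selection criterion above, together with the summary figures reported for the assembly (scaffold count, total length, GC content), can be sketched as follows. This is an illustrative example under stated assumptions, not the pipeline used in this work: the function names are hypothetical, the input is plain FASTA text, and GC content is computed over all bases including any N characters.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def assembly_stats(lines):
    """Return scaffold count, total length (bp), and GC percentage."""
    scaffolds, length, gc = 0, 0, 0
    for _, seq in read_fasta(lines):
        seq = seq.upper()
        scaffolds += 1
        length += len(seq)
        gc += seq.count("G") + seq.count("C")
    gc_percent = 100.0 * gc / length if length else 0.0
    return {"scaffolds": scaffolds, "length": length, "gc_percent": gc_percent}


def pick_fewest_scaffolds(assemblies):
    """Return the name of the assembly with the fewest scaffolds.

    `assemblies` maps a label to an iterable of FASTA lines.
    """
    return min(assemblies, key=lambda name: assembly_stats(assemblies[name])["scaffolds"])


# Toy input (hypothetical sequences, not data from this study).
toy_assemblies = {
    "candidate_a": [">s1", "GGCC", ">s2", "AATT"],
    "candidate_b": [">s1", "GGCCAATT"],
}
best = pick_fewest_scaffolds(toy_assemblies)
```

For real files, an open file handle can be passed in place of the list of lines, for example `assembly_stats(open(path))` with a path of the reader's choosing.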

Ethics Statements
Not applicable.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.