Complete genome sequence dataset of entomopathogenic Aspergillus flavus isolated from a natural infection of the cattle tick Rhipicephalus microplus

As the most important bovine ectoparasite, the southern cattle tick Rhipicephalus microplus transmits lethal cattle diseases such as babesiosis and anaplasmosis, costing the global livestock industry billions of dollars annually. Preventive treatment of cattle with pesticides is the common practice for controlling cattle ticks; however, after decades of chemical treatment, pesticide resistance has arisen in cattle ticks, rendering most formulations ineffective over time. Given the prospect of running out of effective chemical treatments against R. microplus, research on biocontrol alternatives is necessary. Acaro-pathogenic microorganisms isolated from different developmental stages of R. microplus offer potential as biocontrol agents. Aspergillus flavus strain INIFAP-2021, isolated from naturally infected cattle ticks, produced high levels of mobility and mortality in the tick population during experimental infections. The whole genome of the fungus was sequenced using the DNBSEQ platform by BGI. The genome was assembled using SOAPaligner, with A. flavus NRRL3357 as the reference genome; the complete genome comprised eight chromosomes and 36.9 Mb with a GC content of 48.03%, and contained 11,482 protein-coding genes. The final genome assembly was deposited at GenBank as a BioProject under accession number PRJNA758689, and supplementary material is accessible through Mendeley Data, DOI: 10.17632/mt8yxch6mz.1.



Value of the Data
• The genome data of Aspergillus flavus INIFAP-2021 isolated from R. microplus infection provide insight into the genetic diversity of A. flavus and essential genetic information to reveal important details of its general metabolism.
• These data can be used as initial information for researchers working on fungal microbiology, the genetics of Aspergillus-like organisms and biocontrol biotechnology.
• The genome assembly could be used to identify genes involved in A. flavus infection of R. microplus, particularly those that could act as virulence factors.
• The genome assembly could be used for comparative genome studies to identify differences between A. flavus strains capable of infecting different organisms in different environments.
• The genome dataset helps to identify the proteins in metabolic routes involved in the production of mycotoxins, which may be targets for manipulation and/or suppression of toxicity in A. flavus destined for biocontrol in cattle.
• Our database may be useful for taxonomical classification within the Aspergillus genus.

Objective
Identifying the A. flavus genes involved in cattle tick infection may be useful for biocontrol purposes and for assessing the mycotoxin-producing potential of the strain.

Data Description
A. flavus is a well-known human and plant pathogen [1–5] that has been shown to affect the mortality of R. microplus. Additionally, some A. flavus strains produce aflatoxins, which are considered carcinogenic compounds, and the corresponding aflatoxin-synthesizing genes have been identified [2,6,7]. A. flavus has also been reported as a potential biocontrol agent for cattle ticks [8]. Strain A. flavus INIFAP-2021 was collected from diseased females of R. microplus and is now being examined to further characterize its genome. In this work, we first examined the phylogenetic relationship of strain INIFAP-2021 within the Aspergillus genus using a single-marker nuSSU phylogeny (Fig. 1) and a concatenated phylogeny built from the calmodulin, versicolorin B synthetase and internal transcribed spacer region markers (Fig. 2).
Sequencing produced 8,764,210 reads, equivalent to 1,314 Mb of raw data and 1,273 Mb of clean data, giving ∼34.4x read depth. The assembly has a total length of 36,944,197 bp distributed across 8 chromosomes, with an N50 of 4,658,663 bp. The assembly was uploaded to GenBank with accession number ASM1988044v1. Genomes of the L morphotype range between 37 and 37.5 Mbp, and genomes of the S morphotype range between 38.1 and 38.3 Mbp [9], placing this genome (∼36.9 Mbp) near the L morphotype range.
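The read depth and N50 figures above follow from simple arithmetic, which can be sketched as below. This is a minimal illustration, not the pipeline used by the sequencing provider: the clean-data total is taken as the rounded 1,273 Mb reported here, and because the individual chromosome lengths are not listed in this section, the N50 function is shown on an invented toy list of lengths.

```python
def read_depth(clean_bases: float, assembly_length: float) -> float:
    """Average read depth: sequenced bases divided by assembly length."""
    return clean_bases / assembly_length


def n50(lengths):
    """Length of the sequence at which the running total, taken from the
    longest sequence downwards, first reaches half of the assembly."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("n50() needs at least one sequence length")


# Values reported in the text (clean data rounded to the nearest Mb).
CLEAN_BASES = 1_273_000_000
ASSEMBLY_LENGTH = 36_944_197

depth = read_depth(CLEAN_BASES, ASSEMBLY_LENGTH)

# Hypothetical lengths, used only to show how N50 is derived.
toy_lengths = [5, 4, 3, 2, 1]
toy_n50 = n50(toy_lengths)

print(f"read depth: about {depth:.1f}x")
print(f"toy N50: {toy_n50}")
```

Because the input is rounded to the nearest Mb, the computed depth can differ from the reported ∼34.4x in the first decimal place.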
The genome annotation resulted in 11,482 predicted coding sequences (CDS), which were annotated as follows: 5543 (48.28%) with GO mapping, 2886 (25.14%) with BLAST, 553 (4.82%) with GO-Slim and 2499 (21.76%) with no hits. These predicted genes had a minimum length of 190 bp, a maximum length of 23,271 bp and an average length of 1604 bp. In addition, an annotation was made with EggNOG, resulting in 3791 GO-annotated sequences with 8.91 GO terms per sequence; this indicates that multiple GO terms were assigned to each annotated sequence and supports the annotation data. These annotated sequences had an average length of 533 bp. EggNOG also distributed the CDS into 4 COG categories: 1348 (12.02%) for information storage and processing, 1548 (13.8%) for cellular processes and signaling, 3546 (31.61%) for metabolism and 2802 (24.98%) for poorly characterized. Table 1

Fungal strain and DNA extraction
Strain Aspergillus flavus INIFAP-2021 was obtained from Arthropodology Laboratory, CENID-SAI, Mexico. This strain was isolated from an R. microplus female with a natural fungal infection acquired at the bovine tick-infestation stables and named isolate INIFAP-2021. The isolate spores were cultured in Sabouraud-agar medium for three days at 28 °C, and the biomass was collected and dried on filter paper, frozen to -70 °C and ground to powder in a mortar while maintaining the freezing conditions according to a previously described procedure [8] .
The powder was suspended in lysis buffer (20 mM Tris HCl, 5 mM EDTA, 0.4 M NaCl and 1% SDS, pH 8) and incubated for one hour at 60 °C. The DNA from the lysate was extracted with the phenol-chloroform-isoamyl alcohol (25:24:1) technique described previously [10], and the DNA was precipitated with an equal volume of 96% ethanol. The sample was then centrifuged at 13,000 rpm for one min, the supernatant was discarded, and the pellet was washed with 70% ethanol and centrifuged once again at 13,000 rpm for one min. The liquid was discarded and the pellet was dried and resuspended in 50 μL of TE buffer (10 mM Tris HCl and 1 mM EDTA, pH 8). The DNA concentration was measured using a Nabi instrument (Microdigital).

Species identification
A nuSSU fragment was amplified using a universal primer set, nu-SSU-0817 (5′-TTAGCATGGAATATRRAATAGGA-3′) and nu-SSU-1536 (5′-ATTGCAATGCYCTATCCCCA-3′), with an amplicon length of ∼720 bp according to previously described methods [8,11]. A primer set for calmodulin (CMD), CMD5 (5′-CCGAGTACAAGGARGCCTTC-3′) and CMD6 (5′-CCGATRGAGGTCATRACGTGG-3′) [12], with an amplicon length of ∼550 bp, and a primer set for versicolorin B synthetase (VBS), AFVBSF (5′-AGGCGCATACGATATGTGS-3′) and AFVBSR (5′-AACAGACCCTTACGCTGCT-3′), specifically designed for this project, with an amplicon length of ∼717 bp, were also used for PCR amplification. The PCR master mix contained 1x PCR buffer, 100 pg of fungal DNA, 20 pmol each of the corresponding forward and reverse primers, 2 mM dNTPs, 5 mM MgCl2 and 2 units of Taq polymerase in a total volume of 20 μL. The cycling conditions were 94 °C for 5 min, followed by 30 cycles of 94 °C for 30 s, 55 °C for 30 s and 72 °C for 30 s, and a final extension at 72 °C for 5 min. Amplicons were verified by 1% agarose gel electrophoresis in 1x TBE buffer according to a described procedure (Sambrook & Russell, 2001). The amplicons were purified using the Wizard® SV Gel and PCR Clean-Up System (Promega) according to the product instructions and submitted for commercial Sanger sequencing at the "Unidad de Sintesis y Secuenciacion de DNA del Instituto de Biotecnologia", UNAM, using the same oligonucleotides used for amplification. The obtained nuSSU sequences were compared against the NCBI database using the Basic Local Alignment Search Tool (BLAST) [13] to obtain similar sequences. Multiple sequence alignment was carried out with Multiple Sequence Comparison by Log-Expectation (MUSCLE) at the EMBL-EBI resources. The nuSSU phylogeny analysis was conducted with MEGA X using the maximum likelihood method with the general time reversible (GTR) model [14] and a 1000-replicate bootstrap test (Fig. 1).
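Several of the primers listed above contain IUPAC degenerate bases (R = A/G, Y = C/T, S = C/G), so each one stands for a small pool of oligonucleotides. The following minimal sketch expands each primer into its explicit variants; the function and variable names are illustrative and are not part of the original workflow.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "nu-SSU-0817": "TTAGCATGGAATATRRAATAGGA",
    "nu-SSU-1536": "ATTGCAATGCYCTATCCCCA",
    "CMD5": "CCGAGTACAAGGARGCCTTC",
    "CMD6": "CCGATRGAGGTCATRACGTGG",
    "AFVBSF": "AGGCGCATACGATATGTGS",
    "AFVBSR": "AACAGACCCTTACGCTGCT",
}


def expand_primer(primer):
    """Return every non-degenerate sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Number of distinct oligonucleotides represented by each primer.
degeneracy = {name: len(expand_primer(seq)) for name, seq in PRIMERS.items()}

for name, count in degeneracy.items():
    print(f"{name}: {len(PRIMERS[name])} nt, {count} variant(s)")
```

A primer with two R positions, such as nu-SSU-0817, expands to four variants, while a primer without ambiguity codes, such as AFVBSR, expands to itself.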
Phylogenetic analysis of the concatenated calmodulin (CMD), versicolorin B synthetase (VBS) and internal transcribed spacer (ITS) sequences was performed using the maximum likelihood method with a general time reversible (GTR) model [14] and a 1000-replicate bootstrap test (Fig. 2). The ITS sequence used in the phylogeny was obtained directly from the genome sequence using BLASTn, with the ITS sequence from A. parasiticus (AY373859.1) as the query.
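Concatenating markers for a multilocus phylogeny amounts to joining the per-marker alignments taxon by taxon. The sketch below shows one way this could be done, assuming each marker has already been aligned; the taxon names and sequences are invented toy data, and padding a missing marker with gaps is an assumption of this example rather than a step described in the text.

```python
def concatenate_alignments(alignments, order):
    """Join aligned markers per taxon, in the given marker order.

    alignments maps marker name -> {taxon: aligned sequence}.
    A taxon missing from a marker is padded with gap characters.
    """
    taxa = sorted(set().union(*(alignments[m].keys() for m in order)))
    supermatrix = {taxon: "" for taxon in taxa}
    for marker in order:
        aligned = alignments[marker]
        widths = {len(seq) for seq in aligned.values()}
        if len(widths) != 1:
            raise ValueError(f"{marker}: sequences are not aligned to equal length")
        width = widths.pop()
        for taxon in taxa:
            supermatrix[taxon] += aligned.get(taxon, "-" * width)
    return supermatrix


# Hypothetical toy alignments; real input would be the MUSCLE output per marker.
toy = {
    "CMD": {"INIFAP-2021": "ACGT-A", "taxon_B": "ACGTTA"},
    "VBS": {"INIFAP-2021": "GGCC", "taxon_B": "GGCA"},
    "ITS": {"INIFAP-2021": "TTAGC"},  # taxon_B lacks this marker
}

matrix = concatenate_alignments(toy, ["CMD", "VBS", "ITS"])
for taxon, sequence in matrix.items():
    print(taxon, sequence)
```

Every row of the resulting matrix has the same length, which is what maximum likelihood software expects from a concatenated alignment.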

Genome sequencing and assembly
> 3 mg of DNA sample was sent to BGI Genomics for sequencing and assembly. The genome sequence was obtained using the DNBSEQ platform by BGI. The genome was assembled using SOAPaligner with the A. flavus NRRL3357 genome (GCA_014117465.1) as the reference genome, and the genome sequence was then uploaded to GenBank (PRJNA758689).

Genome annotation
Genes were predicted in the Aspergillus flavus INIFAP-2021 genome by ab initio prediction based on a generalized hidden Markov model, using the software Augustus within OmicsBox with Aspergillus oryzae as the closest species.
Homologues of the predicted proteins of the Aspergillus flavus INIFAP-2021 genome were searched using GO mapping, GO annotation, BLAST and InterProScan.
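Summary statistics for predicted genes, such as the minimum, maximum and average lengths reported in the Data Description, can be derived from a gene prediction written in GFF format. The sketch below is a minimal illustration that assumes standard tab-separated GFF columns (feature type in column 3, start and end in columns 4 and 5); the records are invented toy lines, not output from this genome.

```python
from statistics import mean


def feature_lengths(gff_lines, feature="gene"):
    """Collect lengths (end - start + 1) of one feature type from GFF lines."""
    lengths = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5 or cols[2] != feature:
            continue
        start, end = int(cols[3]), int(cols[4])
        lengths.append(end - start + 1)
    return lengths


def summarize(lengths):
    """Count, minimum, maximum and mean of a non-empty list of lengths."""
    return {
        "count": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": mean(lengths),
    }


# Hypothetical GFF records for illustration only.
toy_gff = [
    "# toy example",
    "chr1\tAUGUSTUS\tgene\t101\t290\t.\t+\t.\tID=g1",
    "chr1\tAUGUSTUS\tCDS\t101\t290\t.\t+\t0\tParent=g1",
    "chr2\tAUGUSTUS\tgene\t1000\t2603\t.\t-\t.\tID=g2",
]

summary = summarize(feature_lengths(toy_gff))
print(summary)
```

Passing `feature="CDS"` instead would summarize coding segments rather than whole gene models, so the choice of feature type should match the statistic being reported.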

Ethics Statements
Neither human nor vertebrate experimentation was conducted, and the ethical committee approved these experiments.
The ITS amplicon is also available in GenBank with accession number ON644422. The VBS amplicon is also available in GenBank with accession number ON716447. The nuSSU amplicon is also available in GenBank with accession number ON716448. The CMD amplicon is also available in GenBank with accession number ON716449.