Metagenomes and Assembled Genomes from Diarrhea-Affected Cattle (Bos taurus)

The de novo metagenome assembly for C1-TPA is 68,577,389 bp long spread over 10,108 contigs, while that of C3-TPA is 55,517,929 bp distributed over 9,415 contigs. A total of 8 metagenome-assembled genomes (MAGs) were extracted from C1-TPA, and 10 were extracted from C3-TPA. Both samples have a Flavobacterium sp. and a Pseudomonas sp. in common among their bacterial communities.

Diarrheal disease remains a major cause of morbidity and mortality in the developing world; cattle and young calves are highly susceptible to enteric infections caused by various pathogens (1). Diarrheal samples, C1-TPA and C3-TPA, were collected from affected cattle (Bos taurus) directly from the rectum with sterile nitrile gloves, at Lokaleng Village in Mafikeng, South Africa (25.82°S, 25.58°E). About 150 mg of each of the fecal samples was apportioned for metagenome DNA extraction using a Quick-DNA fecal/soil microbe miniprep kit (Zymo Research Corp., USA). The library was prepared with a Nextera DNA Flex library preparation kit (Illumina) using Nextera DNA CD index adapters (96 indexes plated). The final concentrations of the libraries (70.80 ng/ml for both C1-TPA and C3-TPA) were measured using the Qubit double-stranded DNA (dsDNA) high-sensitivity (HS) assay kit (Life Technologies), and the average library sizes (521 bp and 523 bp for C1-TPA and C3-TPA, respectively) were determined using the Agilent 2100 bioanalyzer. The libraries were then pooled in equimolar ratios of 8.0 pM and sequenced on an Illumina NovaSeq 6000 system. The numbers of reads generated thereafter were 14,302,284 and 14,431,130 for samples C1-TPA and C3-TPA, respectively. The read length used in the library preparation was 2 × 150 bp, and the coverage of the sequence was 29× for C1-TPA and 36× for C3-TPA.
The sequenced data were assessed and filtered with Trimmomatic v0.36 (2) to remove low-quality reads and adapter fragments. The adapter sequences were clipped using a mismatch value of 2, a palindrome clip threshold of 30, and a simple clip threshold of 10. The taxonomy of the metagenomes was determined using Kaiju v1.7.2 (3) and GOTTCHA2 v2.1.6. The de novo metagenome assembly was constructed with metaSPAdes v3.13.0 (4). Each of the metagenome assemblies was binned using MaxBin 2 v2.2.4 (5) and CONCOCT v1.1 set at different modes (Bowtie2-default and Bowtie2-verysensitive, respectively) and BBMap. The binned contigs were then optimized using DAS Tool v1.2 to exclude bins with low completeness and high contamination. Each bin was then extracted as a metagenome-assembled genome (MAG) and assessed for quality control using CheckM v1.0.18 (6). The taxonomic assignments were obtained for the MAGs based on the genome taxonomy database using GTDB-Tk v1.1.0 (7) and Microbial Genomes Atlas (MiGA) (8); where the two taxonomic assignments differed, the identity with the higher average nucleotide identity (ANI) percentage was selected.
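The adapter-clipping settings reported above correspond to the three numeric fields of Trimmomatic's ILLUMINACLIP step, which takes the form ILLUMINACLIP:<adapter FASTA>:<seed mismatches>:<palindrome clip threshold>:<simple clip threshold>. The following is a minimal sketch of how that step string could be assembled; the helper name `illuminaclip_step` and the adapter file name `adapters.fa` are illustrative assumptions, not details taken from this study, and the remaining Trimmomatic options used by the authors are not stated here.

```python
def illuminaclip_step(adapter_fasta, seed_mismatches=2,
                      palindrome_clip=30, simple_clip=10):
    """Build a Trimmomatic ILLUMINACLIP step string.

    Defaults mirror the values reported in the text: a mismatch value
    of 2, a palindrome clip threshold of 30, and a simple clip
    threshold of 10. The adapter FASTA path is supplied by the caller.
    """
    return (f"ILLUMINACLIP:{adapter_fasta}:"
            f"{seed_mismatches}:{palindrome_clip}:{simple_clip}")


# "adapters.fa" is a hypothetical placeholder for the adapter file.
step = illuminaclip_step("adapters.fa")
print(step)
```

The resulting string would be passed as one trimming step on the Trimmomatic command line, after the input and output file arguments.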
The MAGs were annotated using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) v4.12 (9). The acquired drug resistance genes were determined using ResFinder v4.0 (10). Most of the software was accessed through the KBase workspace service v0.11.1 (11), except for MiGA, PGAP, and ResFinder. Default parameters were used for all software in the analysis except where stated otherwise.
The de novo metagenome assembly for C1-TPA has 68,577,389 bp distributed over 10,108 contigs, and the metagenome assembly was binned into eight MAGs. For sample C3-TPA, the assembly size is 55,517,929 bp distributed across 9,415 contigs, and ten different MAGs were extracted from the metagenome. In both samples, not all of the contigs were binned into MAGs. Both samples have Flavobacterium spp. (Flavobacterium sp. strain N1CT and Flavobacterium sp. strain NTP45) and Pseudomonas spp. (Pseudomonas sp. strain N17CT and Pseudomonas stutzeri NTP17) in common (Table 1).
Ethical clearance for the study was approved by the Research Ethics Committee of North West University, South Africa (NWU-00160-14-A9).
Data availability. All data were deposited under the GenBank BioProject number PRJNA661076. The whole-genome shotgun projects have been deposited in DDBJ/ENA/GenBank under the accession numbers JADGMW000000000 and JADKLW000000000. The versions described in this paper are the first versions, JADGMW000000000.1 and JADKLW000000000.1. The SRA accession numbers for the raw reads are SRX9212776 and SRX9218438 for samples C1-TPA and C3-TPA, respectively.