Data on rumen and faeces microbiota profiles of Yakutian and Kalmyk cattle revealed by high-throughput sequencing of 16S rRNA gene amplicons

The rumen microbiome directly or indirectly contributes to animal production and may be a prospective target for mitigation of greenhouse gas emissions [1]. At the same time, feed types and components of diet can influence the composition of the rumen microbiome [2,3]. Fluctuations in the composition of the digestive tract microbiota can alter the development, health, and productivity of cattle [4]. Many studies of cattle microbiomes have focussed on the rumen microbiota, whereas the faecal microbiota has received less attention [5-7]. Consequently, the features of the faecal and the ruminal microbiomes in different cattle breeds remain poorly studied. Here, we provide 16S rRNA gene amplicon data of the ruminal and the faecal microbiomes from Yakutian and Kalmyk cattle living in the Republic of Sakha (Yakutia), Russia. Total DNA was extracted from 13 faecal and 13 ruminal samples, and DNA libraries were prepared and sequenced on an Illumina MiSeq platform. Paired-end raw reads were processed, and final operational taxonomic units (OTUs) were assigned to the respective prokaryotic taxa using the RDP (Ribosomal Database Project) database. Analysis of the microbiome composition at the phylum level revealed very similar faecal microbiota between the introduced Kalmyk breed and the indigenous Yakutian breed, whereas the ruminal microbiomes of these breeds differed substantially in the relative abundance of some prokaryotic phyla. We believe that the data obtained may provide new insights into the dynamics of the ruminal and the faecal microbiota of cattle and reveal breed-specific features of ruminal microbiomes. In addition, these data will contribute to our understanding of the ruminal microbiome structure and function, and might be useful for the management of cattle feeding and ruminal methane production.



Value of the Data
• This dataset provides a description and comparison of the ruminal and the faecal microbiomes in cattle of Yakutian and Kalmyk breeds based on high-throughput sequencing of 16S rRNA gene amplicons.
• Analysis of 16S rRNA gene sequences at the phylum level revealed very similar faecal microbiota between the introduced Kalmyk breed and the indigenous Yakutian breed, as well as breed-specific ruminal microbiome profiles characterised by differentially distributed prokaryotic phyla.
• The data on the microbiomes of Kalmyk and Yakutian cattle adapted to cold weather conditions provide insights that could help to improve livestock rearing in regions with harsh climatic conditions.

Experimental design
The aim of this study was to assess the composition of rumen and faeces microbiomes in cattle of the introduced Kalmyk breed and the indigenous Yakutian breed. The Kalmyk (n = 7) and Yakutian (n = 6) groups were matched for sex (cows only), age (4-7 years old), and weight (350-480 kg). Animals in both groups were kept under similar conditions and provided the same feed rations. Faecal samples were obtained from the selected cows by a non-invasive method. After defecation, the top layer of the faeces was removed with a sterile spatula, and then 0.4-0.5 g of faeces was transferred into a 2.0-mL Eppendorf tube containing 500 μL of DNA/RNA Shield preservative solution (Zymo Research, Irvine, CA, USA). Samples of ruminal fluid were obtained by rumenocentesis with a sterile needle under local anaesthesia, following aseptic technique. Afterwards, 0.5 mL of ruminal fluid was transferred into a 2.0-mL Eppendorf tube containing 500 μL of DNA/RNA Shield.

Sample collection
Sampling was carried out on the same day for all animals of the same group. The samples were transported to the laboratory at 4-25 °C, in accordance with the manual for the DNA/RNA Shield preservative.

DNA extraction and 16S rRNA gene sequencing
Total DNA from ruminal fluid or faeces was isolated using a FastDNA® SPIN Kit for Faeces (MP Biomedicals Inc., Solon, OH, USA) with Lysing Matrix E. Samples were homogenised on a TissueLyser LT (Qiagen, Venlo, Netherlands). In a departure from the manufacturer's protocol, the duration of homogenisation was increased to 5 min. The quality of the extracted DNA was assessed by electrophoresis in 1% agarose gel and with a NanoDrop 8000 (Thermo Fisher Scientific, Inc., Waltham, MA, USA). The DNA concentration was quantified using a Qubit 4.0 Fluorometer with a dsDNA High Sensitivity Assay Kit (Life Technologies, Carlsbad, CA, USA).
DNA libraries were prepared according to the Illumina two-step protocol (Part #15044223, Rev. B). At the first stage, target amplicons were prepared using primers for the V3-V4 region of the 16S rRNA gene (S-D-Bact-0341-b-S-17 and S-D-Bact-0785-a-A-21) [8], which were linked to Illumina overhang sequences. The composition of the PCR mixture and the PCR parameters are presented in Table 2. At the second stage, sample-specific dual Illumina indices (Nextera XT, i7 and i5) were attached to the amplicons. Paired-end 2 × 300-bp sequencing was carried out on a MiSeq platform (Illumina, San Diego, CA, USA) with a Reagent Kit v.3 (Illumina).

Bioinformatics and statistical analysis
Paired-end reads were merged with a minimal overlap of 40 bp and a p-value of 0.0001 using PEAR v. 0.9.10 [9]. Subsequent treatment of the merged reads was conducted with USEARCH v. 10.0.240 [10] and included quality filtering and amplicon size selection (minimal size, 420 bp). Reads shorter than 420 bp and reads with an expected error (ee) higher than 1 per 100 nucleotides (max. ee, 1.0) were filtered out. Filtering quality was evaluated using FastQC v. 0.11.7. OTUs were formed by dereplication and clustering with USEARCH, and singletons and doubletons were removed. OTUs were determined using a similarity threshold of 97% between sequences to classify microorganisms at the species level. Chimeric sequences were detected and removed using USEARCH via UCHIME [11]. Contaminant OTUs were identified and removed via the USEARCH command 'ublast' by matching the sequences of experimental samples against those of negative control samples. The taxonomic classification of sequences was conducted using the RDP [12] and NCBI reference databases. Rarefaction curves were built using Microsoft Office Excel, based on the data obtained with the 'alpha_div_rare' command (USEARCH v.11).
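
The read filtering and dereplication steps described above can be illustrated with a minimal Python sketch. It is not the USEARCH implementation and was not used to produce the dataset. It assumes Phred+33 quality encoding, interprets the "max. ee, 1.0" setting as a cap on the total expected error per merged read, and uses hypothetical function names (expected_errors, passes_filter, dereplicate). The expected error of a read is taken as the sum of per-base error probabilities, 10^(-Q/10).

```python
from collections import Counter


def expected_errors(quals, offset=33):
    """Sum of per-base error probabilities for a FASTQ quality string.

    Assumes Phred+33 encoding: Q = ord(char) - 33, P(error) = 10 ** (-Q / 10).
    """
    return sum(10 ** (-(ord(c) - offset) / 10) for c in quals)


def passes_filter(seq, quals, min_len=420, max_ee=1.0):
    """Keep a merged read only if it is long enough and accurate enough.

    min_len and max_ee mirror the thresholds stated in the text; treating
    max_ee as a per-read total is an assumption of this sketch.
    """
    return len(seq) >= min_len and expected_errors(quals) <= max_ee


def dereplicate(seqs, min_size=3):
    """Collapse identical sequences and drop singletons and doubletons.

    Returns a dict mapping each unique sequence to its abundance,
    keeping only sequences observed at least min_size times.
    """
    counts = Counter(seqs)
    return {s: n for s, n in counts.items() if n >= min_size}
```

For example, a 420-bp read whose bases all have Q40 has an expected error of 420 × 0.0001 = 0.042 and is retained, whereas the same read at Q10 throughout has an expected error of 42 and is discarded.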

Ethical Statement
This sampling was carried out in accordance with the recommendations of the National Institutes of Health Guide for the Care and Use of Laboratory Animals (NIH Publications No. 8023, revised 1978).

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.