Environmental DNA (eDNA) dataset of foraminiferal diversity and distribution from the mining-impacted estuaries of Goa, west coast of India

Foraminiferal environmental DNA (eDNA) metabarcoding based on high-throughput sequencing (HTS) is a powerful tool for unravelling hidden genetic diversity and environmental lineages. Results from the eDNA approach have provided valuable insight into the unexplored diversity of soft-bodied monothalamous foraminifers [1]. Micropaleontologists have often overlooked monothalamids because their soft organic and/or finely agglutinated tests are frequently destroyed during routine morphological investigations [2]. Nevertheless, some studies have included monothalamids (soft-shelled species) in ecological and diversity investigations ranging from deep-sea locations to coastal marine habitats [1], [3], [4]. Here, we document our metabarcoding analysis of foraminiferal diversity and abundance in the mining-affected estuaries of the Indian state of Goa. High-throughput sequencing on the Illumina platform indicated an overwhelming abundance of monothalamous foraminifers in the studied estuarine sediments. This is the first time that such detailed data on foraminiferal diversity have been generated in India using sedimentary eDNA methods. The raw sequence data used for the analysis are available in the NCBI Sequence Read Archive (SRA) under BioProject accession number PRJNA1040471. The presented data may be used as baseline information for eDNA-based biomonitoring and biodiversity assessment surveys of Indian marine habitats across time and space.


© 2024 The Author(s

Value of the Data
• The generated environmental DNA (eDNA) dataset provides the first-ever insight into the uncharted genetic diversity of foraminifera from tropical locations along the west coast of India.
• This dataset may also serve as baseline information as a foraminiferal proxy for monitoring the long-term effect of mining pollution on the health of estuaries of the state of Goa, India.
• This dataset may serve as reference information for metabarcoding-based biomonitoring, biodiversity assessment surveys, and environmental impact assessment studies from other Indian marine habitats across time and space.
• The metabarcoding data hold the potential to aid predictive analysis, which is essential to understanding the effect of marine pollution on microorganisms.

Background
Goa, a state of the Republic of India, is among the primary producers of iron ore. Opencast iron ore mining in Goa creates huge piles of leftover dumps and rejects, which enter the estuaries during the Indian monsoon season. To minimize the harmful consequences of mining before they reach humans, it is critical to assess them from the perspective of organisms at lower levels of the food chain. Foraminifera are widespread, meiofauna-sized marine protists that are widely used to evaluate environmental quality [4]. Owing to their short life cycle and responsiveness to environmental stresses, foraminifers have proven to be an effective proxy in biomonitoring marine environments [5]. The wet-picking method preserves the soft-bodied and/or finely agglutinated tests of monothalamous foraminifera [1], [2], [6]. Ecotoxicological studies, on the other hand, use the dry-picking method, in which only hard-shelled foraminifers are detected microscopically in marine sediments [7]; this prevents benthic foraminifera from being used to their full potential in biodiversity and pollution monitoring studies. Foraminiferal eDNA metabarcoding approaches based on high-throughput sequencing techniques can provide valuable insight into the uncharted diversity of foraminifers in ecologically sensitive marine environments and thus have considerable implications for biomonitoring surveys tracing anthropogenic impacts [8]. Here, we document the first insight into the uncharted diversity of organic-walled benthic foraminifera, which outnumbered the hard-shelled foraminifers in the sediment samples analyzed using metabarcoding approaches. To our knowledge, no monothalamous foraminifera sequences have previously been reported from Goa, India. This work aimed to establish a metabarcoding dataset of foraminifera from a tropical location along the west coast of India that can be used as a reference for a wide range of eDNA-based studies of Indian marine habitats.

Data Description
The samples were collected during the premonsoon season (March 2023) from coastal regions of Goa, India (Fig. 1). The dataset described in this article covers foraminiferal genetic diversity, including hard-shelled Globothalamea and soft-bodied monothalamids, elucidated from coastal shallow marine sediments using an eDNA approach and representing two mining-affected estuaries of Goa along the west coast of India. In addition, in-situ physical environmental parameters, including temperature, pH, dissolved oxygen (DO), salinity, and sediment granulometry, were recorded. The Chapora estuary stations (GOA1 and GOA2) are affected by anthropogenic activities such as fishing, sand mining, and the flow of untreated effluents and solid waste. The Mandovi estuary stations (GOB1 and GOB2), on the other hand, are affected by severe metal pollution of the surficial sediments caused by Fe-Mn ore mining activities in Goa, India [9]. The sampling station coordinates, sampling depths, and in-situ environmental parameters recorded during sampling can be found in Table 1. The dataset consists of raw environmental DNA (eDNA) reads from coastal regions of Goa, India. We sequenced the foraminifera-specific 37f region of 18S rRNA using the Illumina HiSeqX platform and classified the reads taxonomically using the QIIME2 pipeline [10]. The complete dataset comprised 6,353,543 reads with 12,057 paired-end sequences of nearly 200 bp in length. Of these, 10,893 were successfully classified taxonomically. De-novo clustering at 95 % resulted in 456 OTUs (Table 2). Our data suggest greater alpha diversity at the Chapora estuary stations than at the Mandovi estuary stations. More unclassified foraminiferal sequences were documented from the Chapora estuary (57 %) than from the Mandovi estuary (19.67 %). The data without unclassified OTUs showed that, in the Chapora estuary, station GOA1 has a higher OTU count for soft-bodied monothalamids (66 %) than for multichambered globothalamids (33 %). At station GOA2, monothalamids have the maximum OTU abundance, at 95 % (Fig. 2). In contrast, in the Mandovi estuary, station GOB1 has its maximum OTU abundance for globothalamids, while at station GOB2 the OTU abundances of monothalamids and globothalamids were comparable (Fig. 2; Table 3). The results of taxonomic classification at the genus level are presented in Fig. 3.

Sampling
Surface sediment samples were collected using a Van Veen grab sampler in March 2023. The upper 2 cm of sediment was sampled from a surface area of approximately 10 cm². Samples for foraminiferal environmental DNA (eDNA) analysis were immediately transferred to sterile 15 ml Falcon centrifuge tubes and frozen in a −20 °C deep freezer.
of Forests and Head of Forest Force, Goa State, India, is duly acknowledged for granting permission to collect samples from the coastal regions of Goa state (ref. no. 2-66-WL-RESEARCH PROPOSALS-FD-Vol.IV/2341).

Fig. 2. Relative abundance of OTUs at Class level, including unassigned OTUs (green) for all samples in the dataset.

Fig. 3. Relative abundance of OTUs at Order and Genus level, excluding unassigned OTUs for all samples in the dataset.

Table 1
Geographical coordinates, depths, and in-situ environmental parameters of sampling stations.

Table 2
Filtering statistics through the QIIME 2.0 pipeline.

Table 3
Class, order, and genus level OTUs distribution across sampling locations.