Fish Diversity Monitoring Using Environmental DNA Techniques in the Clarion–Clipperton Zone of the Pacific Ocean

Marine fish populations have long suffered the consequences of overfishing, leading to a loss in biodiversity. Traditional methods have historically been used to survey fish diversity but are limited to commercial species, particularly on the high seas. Environmental DNA (eDNA) has been successfully used to monitor biodiversity in aquatic environments. In this study, we monitored fish diversity in the Clarion–Clipperton Zone (CCZ) of the Eastern Pacific Ocean using eDNA metabarcoding. Our results identified 2 classes, 35 orders, 64 families, and 87 genera. The genera Mugil, Scomberomorus, and Scomber had high relative abundance in the mesopelagic and demersal zones. Fish diversity varied with sampling sites, and the greatest number of species was found at a depth of 2500 m. Environmental changes drove fish aggregation, and our results indicated that chlorophyll a (Chla) was negatively correlated with fish communities, while dissolved oxygen (DO) was positively correlated with fish communities. This study revealed the fish diversity pattern and the effects of the environment in the CCZ, which provides useful information for biodiversity management and an environmental baseline for the International Seabed Authority.


Introduction
The Clarion-Clipperton Zone (CCZ) is an area of 6 million square kilometers [1] and is located in the northeastern subtropical mid-Pacific Ocean between Mexico and Hawaii [2]. The CCZ is delimited by two fault zones, the Clarion and Clipperton, and encompasses an extensive range of habitats, including hills, seamounts, fault zones, and vast abyssal plains [3,4]. The CCZ contains many metal nodules rich in manganese, nickel, copper, cobalt, iron, and other rare earth elements, and is an important area for deep-sea manganese nodule mining [5,6].
Mining equipment generates noise pollution and impacts the environment by disturbing ecosystems both physically and chemically [7–9]. Deep-sea mining pumps sediment and metallic nodules to the surface and releases sediment plumes back into the water column. The nutrients in sediment plumes influence pelagic food webs [9]. Mining affects not only the area where metallic nodules are removed but also adjacent areas, because the redeposition of sediment plumes disturbs a wider area of the seafloor than nodule removal alone. These changes are likely to persist for decades to centuries [3,10]. Mining of nodules therefore requires appropriate monitoring and conservation strategies [11]. Fish diversity is a part of the environmental baseline.
Seawater samples were collected from mesopelagic and demersal depths using a WTS-LV Large Volume Water Transfer System and filtered immediately after collection. All sampling and filtration equipment used for seawater collection was washed with Milli-Q water before use. Samples were filtered through a glass-fiber membrane with a nominal pore size of 0.3 µm (GF-75, ADVANTEC, Tokyo, Japan). After filtration, filter membranes were placed in cell culture dishes (NEST, Wuxi, China). All samples were immediately frozen at −80 °C until eDNA extraction. In the laboratory, the filter membranes were shredded and eDNA was extracted using the DNeasy PowerWater kit (Qiagen, Hilden, Germany) following the manufacturer's protocol. eDNA samples were stored at −80 °C until further analysis.

Metabarcoding of eDNA Samples
eDNA metabarcoding using universal MiFish primer pairs has been shown to amplify short fragments of fish DNA from a wide range of taxa in environmental samples [44]. Our samples were analyzed using two universal primer pairs (MiFish-U, MiFish-E, Shengong, Shanghai, China) to amplify the V5 region of the mitochondrial 12S rRNA gene. The multiplex polymerase chain reaction (PCR) volume was 50 µL, including 20 µL of sterile distilled H2O, 25 µL of Taq 2× Master Mix (Vazyme, Nanjing, China), 1 µL of each primer (MiFish-U-F: 5′-GTCGGTAAAACTCGTGCCAGC-3′; MiFish-U-R: 3′-GTTTGACCCTAATCTATGGGGTGATAC-5′; MiFish-E-F: 5′-GTTGGTAAATCTCGTGCCAGC-3′; and MiFish-E-R: 3′-GTTTGATCCTAATCTATGGGGTGATAC-5′), and 1 µL of DNA solution. The thermal cycling program consisted of an initial denaturation at 94 °C for 2 min, followed by 30 cycles of denaturation at 98 °C for 5 s, annealing at 50 °C for 10 s, and extension at 72 °C for 10 s, with a final extension at 72 °C for 5 min. Once the PCR was complete, equal amounts of 1× loading buffer (containing SYBR green) and PCR products were mixed and electrophoresed on 1% agarose gels. Samples with a bright main band of 297 ± 25 bp were selected. The mixed PCR products were then purified with the GeneJET Gel Extraction Kit (Thermo Scientific, Waltham, MA, USA).
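As a quick arithmetic check on the reaction mix described above, the following Python sketch sums the listed component volumes and rewrites the two reverse primers, which the text gives 3′ to 5′, in the conventional 5′ to 3′ orientation. The dictionary keys, function names, and the master-mix helper (which ignores pipetting overage) are illustrative assumptions, not part of the published protocol.

```python
# Component volumes (µL) for one multiplex PCR, as listed in the text.
REACTION_UL = {
    "sterile distilled H2O": 20.0,
    "Taq 2x Master Mix": 25.0,
    "MiFish-U-F": 1.0,
    "MiFish-U-R": 1.0,
    "MiFish-E-F": 1.0,
    "MiFish-E-R": 1.0,
    "DNA solution": 1.0,
}

# Reverse primers exactly as written in the text (3'->5').
REVERSE_PRIMERS_3_TO_5 = {
    "MiFish-U-R": "GTTTGACCCTAATCTATGGGGTGATAC",
    "MiFish-E-R": "GTTTGATCCTAATCTATGGGGTGATAC",
}


def total_volume(components):
    """Sum of all component volumes in µL."""
    return sum(components.values())


def to_5_to_3(seq_3_to_5):
    """Reading a 3'->5' string backwards gives the same oligo written 5'->3'."""
    return seq_3_to_5[::-1]


def master_mix(components, n_reactions, template="DNA solution"):
    """Scale every component except the template for n reactions.

    Hypothetical helper: no overage is added for pipetting loss.
    """
    return {
        name: volume * n_reactions
        for name, volume in components.items()
        if name != template
    }


if __name__ == "__main__":
    print("Total reaction volume (µL):", total_volume(REACTION_UL))
    for name, seq in REVERSE_PRIMERS_3_TO_5.items():
        print(name, "5'-" + to_5_to_3(seq) + "-3'")
```

The components add up to the stated 50 µL: 20 + 25 + 4 × 1 + 1.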
the Eastern Pacific Ocean. The sites were chosen because they contained habitats of seamounts and sea basins. A total of 22 samples were collected on board R/V XIANG YANG HONG 03 during the China Ocean 45 cruise in July 2017 and the China Ocean 45 cruise in August 2018, through WTS-LV Large Volume Water Transfer System (McLANE, Carrollton, TX, USA) for greater biodiversity coverage. The characteristics of the sampling sites are shown in Table 1 and Figure 1. Real-time data of environmental factors in different depths were measured by CTD on board, including temperature (T), salinity (S), turbidity (NTU), Chlorophyll a (Chla), and dissolved oxygen (DO).

Bioinformatics
Quality screening was performed on the paired-end reads in FASTQ format. The raw paired-end sequencing data were screened using a sliding window method with a window size of 10 bp; the window moved in steps of 1 bp, starting from the first base at the 5′ end. The average base quality in each window was required to be at least Q20, which corresponds to a base-calling accuracy of 99%. Each read was truncated at the first window whose average quality fell below this threshold, with a length cutoff of 150 bp applied to the truncated reads. Ambiguous bases (Ns) were not permitted.
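The sliding-window screening described above can be sketched in Python as follows. This is a minimal illustration, not the pipeline actually used. It assumes Phred+33 quality encoding, cuts each read at the start of the first window whose mean quality is below Q20, and treats 150 bp as a minimum length for the truncated read. These interpretations and all function names are assumptions.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(ch) - offset for ch in quality_string]


def error_probability(q):
    """Phred definition: Q = -10 * log10(p), so Q20 means p = 0.01 (99% accuracy)."""
    return 10 ** (-q / 10)


def sliding_window_truncate(seq, qual, window=10, step=1, min_q=20):
    """Cut the read at the start of the first window whose mean quality < min_q.

    The window starts at the first base of the 5' end and moves `step` bases
    at a time. Reads shorter than one window are returned unchanged.
    """
    scores = phred_scores(qual)
    last_start = max(len(scores) - window + 1, 0)
    for start in range(0, last_start, step):
        win = scores[start:start + window]
        if sum(win) / window < min_q:
            return seq[:start], qual[:start]
    return seq, qual


def passes_filter(seq, min_len=150):
    """Keep a truncated read only if it is long enough and has no ambiguous base."""
    return len(seq) >= min_len and "N" not in seq.upper()


if __name__ == "__main__":
    # Toy read: 160 high-quality bases ('I' = Q40) then 40 poor ones ('#' = Q2).
    seq = "A" * 200
    qual = "I" * 160 + "#" * 40
    trimmed_seq, trimmed_qual = sliding_window_truncate(seq, qual)
    print(len(trimmed_seq), passes_filter(trimmed_seq))
```

Cutting at the window start rather than at the first low-quality base is one of several reasonable conventions. Trimming tools differ on this detail.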
Following quality screening, Fast Length Adjustment of Short Reads (FLASH v1.2.7; http://ccb.jhu.edu/software/FLASH/ (19 October 2020)) [45] was used to merge the paired-end reads. FLASH extends short reads by merging read pairs that overlap; here, an overlap of at least 10 bp was required, and the number of mismatched bases had to be less than 10% of the overlap length.
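To make the two merging criteria concrete (overlap of at least 10 bp, mismatches below 10% of the overlap), here is a simplified Python sketch of overlap-based read merging. It is not FLASH's actual algorithm: it ignores base qualities, keeps the forward read's bases in the overlap, and picks the overlap with the lowest mismatch ratio (ties go to the longer overlap). The function names and these tie-breaking rules are assumptions for illustration.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA string (upper-case A, C, G, T, N)."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(read1, read2, min_overlap=10, max_mismatch_ratio=0.10):
    """Merge read1 with the reverse complement of read2, or return None.

    An overlap is accepted when it is at least `min_overlap` bases long and
    its mismatches are fewer than `max_mismatch_ratio` of the overlap length.
    """
    r2 = reverse_complement(read2)
    best = None  # (mismatch_ratio, -overlap_length, overlap_length)
    for olen in range(min_overlap, min(len(read1), len(r2)) + 1):
        tail, head = read1[-olen:], r2[:olen]
        mismatches = sum(a != b for a, b in zip(tail, head))
        ratio = mismatches / olen
        if ratio < max_mismatch_ratio:
            candidate = (ratio, -olen, olen)
            if best is None or candidate < best:
                best = candidate
    if best is None:
        return None
    overlap = best[2]
    # Keep read1 through the overlap, then append the rest of read2's strand.
    return read1 + r2[overlap:]


if __name__ == "__main__":
    fragment = "ACGTTGCAAGGCTTAACCGGATCGATTACGCCTGAAGTCA"
    forward = fragment[:25]
    reverse = reverse_complement(fragment[15:])
    print(merge_pair(forward, reverse) == fragment)
```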
Finally, using index information (i.e., barcode sequence, a short base sequence used to identify the sample), the indexed sequence was matched to the correct corresponding sample.
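The index-based assignment described above can be sketched as a simple demultiplexing step. The sketch assumes that each barcode sits at the very start of the read and must match exactly; the barcodes and sample names below are hypothetical and do not come from the study.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Assign each read to a sample by an exact barcode match at its 5' end.

    The barcode is stripped from assigned reads. Reads with no matching
    barcode are returned separately.
    """
    by_sample = defaultdict(list)
    unassigned = []
    for read in reads:
        for barcode, sample in barcode_to_sample.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
        else:
            unassigned.append(read)
    return dict(by_sample), unassigned


if __name__ == "__main__":
    # Hypothetical barcodes and sample labels, for illustration only.
    barcodes = {"AACC": "sample_A", "TTGG": "sample_B"}
    reads = ["AACCGGTTACGT", "TTGGACGTACGT", "GGGGACGT"]
    assigned, leftover = demultiplex(reads, barcodes)
    print(assigned, leftover)
```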

Statistical Analysis
Sequence analysis was performed using QIIME2 [46], according to the official tutorial (https://docs.qiime2.org/2019.4/tutorials/ (10 January 2021)). The raw data obtained via high-throughput sequencing were screened according to sequence quality, and high-quality sequences were used for subsequent analysis. The raw sequences that passed the quality screening were divided according to index and barcode information, and the barcode sequences were removed. Sequences were then quality filtered, denoised, and merged, and chimeras were removed using DADA2 [47]. The deduplicated sequences produced by DADA2 quality control are referred to as ASVs (amplicon sequence variants) [47,48]. An ASV is equivalent to an OTU clustered at 100% similarity [49]. The length distribution of the ASVs was examined to check whether the sequence lengths matched the target fragment and to identify sequences of abnormal length. Databases downloaded from NCBI (https://www.ncbi.nlm.nih.gov/ (28 February 2021)) and MitoFish (http://mitofish.aori.u-tokyo.ac.jp (28 February 2021)) were used for taxonomic assignment.
Heatmaps were plotted using the heatmap tools on the Genescloud platform (https://www.genescloud.cn (20 May 2021)). The tool was developed from the heatmap package (V1.0.8), which was slightly modified to improve the layout style. The data were normalized by z-scores. The package uses the clustering distances and methods implemented in the dist and hclust functions in R. The available distances are correlation, Euclidean (default), maximum, Manhattan, Canberra, binary, and Minkowski. The clustering method in our analysis was average linkage (UPGMA). Krona software (https://github.com/marbl/Krona/wiki (19 May 2022)) was used to display community taxonomic composition interactively [50]. The Krona figure represents seven taxonomic levels (domain, phylum, class, order, family, genus, and species) from inside to outside. The size of each sector reflects the relative abundance of the corresponding taxon, and specific values are given. At each taxonomic level, taxa are distinguished by different colors.
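The heatmap preprocessing described above (z-score normalization followed by average-linkage clustering) can be illustrated with a small pure-Python sketch. It is not the Genescloud or R implementation. It assumes that z-scores are computed per row using the sample standard deviation and that Euclidean distance is used, and all function names are hypothetical.

```python
import math
from statistics import mean, stdev


def zscore_rows(matrix):
    """Scale each row to mean 0 and unit sample standard deviation."""
    scaled = []
    for row in matrix:
        mu, sd = mean(row), stdev(row)
        scaled.append([(x - mu) / sd if sd > 0 else 0.0 for x in row])
    return scaled


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def upgma(rows):
    """Average-linkage (UPGMA) agglomerative clustering.

    Returns the merges in order as (members_a, members_b, height), where the
    height is the mean pairwise distance between the two merged clusters.
    """
    clusters = {i: [i] for i in range(len(rows))}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for i, a in enumerate(keys):
            for b in keys[i + 1:]:
                d = mean(
                    euclidean(rows[p], rows[q])
                    for p in clusters[a]
                    for q in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((tuple(clusters[a]), tuple(clusters[b]), d))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges


if __name__ == "__main__":
    # Toy abundance table: rows are taxa, columns are samples.
    table = [[10, 12, 50], [11, 13, 48], [90, 5, 1]]
    for merge in upgma(zscore_rows(table)):
        print(merge)
```

This brute-force version recomputes all pairwise distances at every step, which is fine for a toy table but not for large ASV tables.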
To compare species composition between samples and show species abundance, a heatmap was used for species composition analysis. ASV-level alpha diversity indices, including the Chao1 richness estimator [51], observed species, the Shannon diversity index [52], and the Simpson index [53], were calculated from the ASV table in QIIME2. For the grouped samples, R scripts were used to draw boxplots that show the differences in alpha diversity among groups. The significance of the differences was verified using the Kruskal-Wallis rank sum test, with Dunn's test as the post hoc test (for two groups of samples, the Kruskal-Wallis test is equivalent to the Wilcoxon test). A principal coordinates analysis (PCoA) based on the Bray-Curtis index was performed to visualize the similarity among the fish communities in different samples. Redundancy analysis (RDA) was used to analyze the relationship between the fish community and environmental factors [54]. Temperature, salinity, turbidity, chlorophyll a, and dissolved oxygen were analyzed for their correlation with the fish assemblage.
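For readers unfamiliar with the indices named above, the following Python sketch computes them from a vector of ASV counts for one sample, plus the Bray-Curtis dissimilarity between two samples. It is an illustration under stated assumptions rather than the QIIME2 code: Chao1 is given in its bias-corrected form, Simpson is reported as 1 minus the sum of squared proportions, and the Shannon log base (2 here) is a free choice.

```python
import math


def observed_species(counts):
    """Number of ASVs with a non-zero count."""
    return sum(1 for c in counts if c > 0)


def shannon(counts, base=2):
    """Shannon index H = -sum(p_i * log(p_i)); the log base is an assumption."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in props)


def simpson(counts):
    """Simpson diversity as 1 - sum(p_i^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of ASVs seen exactly once and exactly twice.
    """
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return observed_species(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator


if __name__ == "__main__":
    # Toy ASV counts for two hypothetical samples.
    sample_1 = [5, 3, 1, 1, 2, 2]
    sample_2 = [4, 0, 2, 1, 6, 1]
    print(observed_species(sample_1), chao1(sample_1))
    print(shannon(sample_1), simpson(sample_1))
    print(bray_curtis(sample_1, sample_2))
```

A PCoA would then be run on the matrix of pairwise Bray-Curtis values; that ordination step is omitted here.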

eDNA Metabarcoding Sequencing Results
The eDNA metabarcoding assay yielded a total of 2,406,141 sequencing reads. After the quality control process, a total of 1,512,485 reads were retained, corresponding to an average of 68,749 reads per sample. After taxonomic annotation, a total of 2 classes, 35 orders, 64 families, and 87 genera were classified (Table 2). All sequences from the water samples belonged to the classes Chondrichthyes and Actinopteri. The class Chondrichthyes contained three families: Carcharhinidae, Hexanchidae, and Myliobatidae. Within the family Carcharhinidae, the species Prionace glauca (blue shark) and Scoliodon laticaudus (spadenose shark) were found. Species of the families Hexanchidae and Myliobatidae were unclassified. The family Myliobatidae is listed under the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES).
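The read statistics above can be checked with simple arithmetic. The sketch below uses the read totals reported here and the 22 samples stated in the sampling description; the variable and function names are illustrative.

```python
RAW_READS = 2_406_141       # total sequencing reads
RETAINED_READS = 1_512_485  # reads kept after quality control
N_SAMPLES = 22              # samples collected across the two cruises


def mean_reads_per_sample(retained, n_samples):
    """Average number of retained reads per sample."""
    return retained / n_samples


def retained_fraction(retained, raw):
    """Share of raw reads that survived quality control."""
    return retained / raw


if __name__ == "__main__":
    print(round(mean_reads_per_sample(RETAINED_READS, N_SAMPLES)))
    print(f"{retained_fraction(RETAINED_READS, RAW_READS):.1%}")
```

The mean agrees with the reported 68,749 reads per sample, and roughly 63% of the raw reads passed quality control.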

Species Composition and Diversity
The relative abundance of species differed distinctly among sampling locations. Statistical analysis of the non-singleton data showed that the relative abundance of the genera Mugil, Scomberomorus, and Scomber was high in all samples over the two years (Figure 2). Mugil accounted for 14.61% of the overall relative abundance at the genus level, while Scomberomorus accounted for 11.66%. Taxonomic composition analysis indicated that the genus Serrivomer showed high relative abundance at 1000 and 1200 m in 2017. The relative abundance of Scomberomorus was the highest at 3000 m, and that of Mugil was the highest at 1000 m. The five genera with the highest relative abundance in the whole water column of each site in 2017 were Mugil, Scomberomorus, Serrivomer, Konosirus, and Scomber. In 2018, the genus Scomberomorus had the highest relative abundance in the whole water column of each station, followed in decreasing order by Mugil, Rhynchopelates, Coryphaena, and Scomber. The water column was dominated by Scombriformes, followed by Lophiiformes (Figure 3).
Krona diagram results showed that Scomberomorus niphonius and Mugil cephalus were the dominant fish species. Both are highly commercial species, according to FishBase. Lasiognathus sp. NMMBP 9030 is a bathypelagic fish and was the dominant species of Lophiiformes. Heatmap results (Figure 4) showed that the fish community differed among depths. The richness of fish at 1000 m was significantly greater than that at other sampling depths. The main distribution range of the species Scoliodon laticaudus and Prionace glauca was below 2500 m depth. Larimichthys crocea was mainly found at a water depth of 700 m. The heatmap also indicated that fish richness in the seamounts was higher than that in the sea basin.

Community Diversity
The alpha diversity parameters (Chao1, observed species, Shannon index, and Simpson index) for each sampling depth in 2018 tended to be consistent with those of 2017, with non-significant differences (Figure 5). In 2017, the variation among depths was not significant for either the Chao1 index or the observed species index, while the Shannon index and Simpson index gradually increased with sampling depth (Table 3). In 2018, the Chao1 index, observed species index, Shannon index, and Simpson index all decreased with water depth (Table 3). Alpha diversity parameters at 1000 m, including the Chao1 index and observed species index, were significantly larger than those at 3000 m (p = 0.05).

Species Distribution by Depth
Fish species distribution at different depths was detected using eDNA analysis. Our results indicated that the greatest number of species was detected at a depth of 2500 m, while the fewest were detected at 5000 m (Figure 6). The variation at 2500 m was relatively large over three random repeated sampling efforts. Samples collected at depths of 700 m, 1200 m, and 5000 m were not subject to repeated sampling, which may have allowed chance results. The beta diversity showed that the spatial structure based on depth was not obvious when the fish community was ordinated by Bray-Curtis PCoA (Figure 7). The RDA results clarified the relationship between the fish community and environmental factors: axis 1 explained 14.6% and axis 2 explained 7.8% of the variation in the fish community. As shown in Figure 8, Chla and DO were the main factors influencing fish community structure in the CCZ. Chla was negatively correlated with the fish community, whereas DO was positively correlated with it.

Discussion
Our study demonstrates that eDNA can be an effective method for studying fish diversity. eDNA can be collected from any type of aquatic or wild environment for monitoring fish ecology, composition, and distribution, as well as for monitoring endangered and invasive species [55,56]. eDNA metabarcoding is an efficient and versatile method that does not require extensive taxonomic expertise [31]. Compared to traditional sampling methods, eDNA methods are non-invasive and not destructive to the environment [44]. In CCZ areas, nodules support distinct species and community structures, including sessile organisms and numerous other megafaunal, meiofaunal, and microbial taxa [4,57]. The demersal fauna has a limited supply of externally derived food and is characterized by slow growth, replenishment, reproduction, and recovery after disturbance. Mining would affect benthic communities, which in turn would affect fish distribution through the food chain. Removal of polymetallic nodules by mining would lead to a loss of food-web integrity and a substantial decline in faunal biodiversity [58]. The determination of fish diversity by eDNA therefore provides a valuable community assessment before mining.
Since eDNA released by different individuals within a population coexists in the aquatic environment, eDNA analysis can be extended to the assessment of diversity within populations [15]. eDNA has been widely used to detect the presence of plants and animals, and fish have become a common study subject in recent studies. The DADA2 bioinformatics pipeline uses a denoising algorithm to obtain ASVs to infer the true biological sequence, discriminating differences in sequence variants as small as one nucleotide [15,47,59]. The ASV is considered to be equivalent to the DNA sequence present in the original environmental sample and has been proposed to improve the accuracy of assessing the intraspecific diversity of fish populations [47]. In a similar study, 66 functional entities were detected using eDNA technology on Malpelo Island, a remote marine protected area, and the functional richness for eDNA was higher than that in underwater videos [60]. eDNA metabarcoding detects more fish than underwater visual census techniques [61]. Research has shown that eDNA methods are capable of gathering a spectrum of functional traits, showing the most functionally diverse and least redundant fish assemblages [62].
At the genus level, Mugil accounted for 16.03% of the total relative abundance, and Scomberomorus accounted for 10.69% of the total relative abundance during our 2017 sampling efforts. The relative abundance of Mugil was the highest at 700 m, while the relative abundance of Scomberomorus was the highest at 2500 m. In 2018, Scomberomorus accounted for 13.92% of the total relative abundance, and Mugil accounted for 11.29% of the total relative abundance. The relative abundance of Mugil was highest at 1000 m, and the relative abundance of Scomberomorus was highest at 3000 m. The following species had high relative abundance and were widespread among the study area: Mugil cephalus, Scomberomorus niphonius, Konosirus punctatus, Scomber japonicus, and Serrivomer sector.
According to data retrieved from FishBase (https://fishbase.se/search.php (24 March 2021)), four of the fish species that we detected with high relative abundance are migratory. Because migratory fish are highly mobile and widely distributed, we speculate that the release of eDNA during migration results in a higher detection rate than for other fish. The species Scoliodon laticaudus, Prionace glauca, and Harpadon nehereus are listed as 'Near Threatened' on the IUCN Red List, according to FishBase. Other detected species on the IUCN Red List included Larimichthys crocea ('Critically Endangered') and Epinephelus fuscoguttatus ('Vulnerable'). Interestingly, our results showed that chondrichthyan fish were detected in the bathyal zone. Chondrichthyan fish, such as Scoliodon laticaudus and Prionace glauca, are important consumers in most marine ecosystems; they are commonly found to depths of 1000 m but are uncommon, exceedingly rare, or quite possibly absent deeper than 3000 m [63]. A survey by Priede et al. [64] reported that the deepest chondrichthyan recorded below 3000 m was a shark, Centrophorus squamosus, captured at 3280 m by baited long line. The deepest record for the shark Centroscymnus coelolepis is 3700 m [65]. Our results showed for the first time that eDNA metabarcoding detected the sharks Scoliodon laticaudus and Prionace glauca at depths over 1000 m in the CCZ. In our investigation of fish diversity, the Critically Endangered species Larimichthys crocea was found at sites DY45-II-CC-S06 and DY45-III-CCW-S01, in contrast to the other sites. However, Larimichthys crocea is a commercially important species in China and is distributed in the Western Pacific. We hypothesize that eDNA of Larimichthys crocea entered the sea with domestic water from the research vessel and was then collected by the WTS-LV Large Volume Water Transfer System.
In terms of global biodiversity in oceanic areas, Molinos et al. [66] analyzed the distribution of biodiversity under different climate change models. Their results showed that with an increase in temperature, the total number of species decreased at low latitudes, increased at middle latitudes, and remained unchanged at high latitudes. Costello and Chaudhary [20] analyzed the vertical distribution of biodiversity in a changing environment and found that biodiversity decreased with increasing water depth (distribution law of indexing). Burrows et al. [67] further analyzed the horizontal and vertical migration rules of biodiversity under climate change; their results showed that in areas with small temperature changes, biological migration was not obvious, while in areas with large temperature changes, organisms mainly adapted to temperature changes through vertical migration. These findings were similar to our results: high Shannon diversity index values were found at a depth of 1000 m, but the largest number of fish species was detected at 2500 m. eDNA degrades more slowly at low temperature than at surface temperature. We therefore hypothesize that DNA degradation time, DNA sinking, temperature, and light contribute to this phenomenon, and this process deserves attention in further study. Additionally, the sinking rate at great depth is very low; for DNA at 2500 m to 3000 m or deeper, the degradation period would be much longer, which limits the degradation that could otherwise cause fish to go undetected. The variance explained by PCoA among depths was not statistically significant, which may indicate connectivity of vertical habitats through dispersal, migration, or movement of seawater that carries eDNA [33]. Low temperature was likely a major contributor to the similarity of fish communities at different depths. The study of Takahara et al. showed that temperature may be the main driving factor of eDNA distribution [68].
Higher temperatures directly increase DNA degradation through the denaturation of DNA molecules, and indirectly degrade eDNA by increasing enzyme kinetics and microbial metabolism [69]. The decay rates of fish eDNA in marine water appear to be between 6.9 and 71.1 h [70]. Low temperature can preserve eDNA, but eDNA still degrades gradually as sedimentation time increases. The spatial dynamics of the fish community were affected by environmental factors. The RDA results of this study indicated that the main environmental factors influencing fish distribution were Chla and DO. Our results are similar to those of Diao et al., in which Chla was negatively correlated with fish assemblages and affected them through cascade effects [29]. Studies have shown that the interaction of temperature and DO drives how fish use horizontal and vertical space [71].
eDNA metabarcoding has been widely used in research on fish diversity and can detect a large number of fish species. Our results demonstrate the usefulness of eDNA metabarcoding for the conservation and management of marine fishes. We found the DNA signatures of Near Threatened, Critically Endangered, and Vulnerable fish in the CCZ. eDNA metabarcoding in biodiversity assessments will be crucial as humans continue to balance the use and conservation of marine resources in marine ecosystems.