Genomic Diversity as a Key Conservation Criterion: Proof‐of‐Concept From Mammalian Whole‐Genome Resequencing Data

ABSTRACT Many international, national, state, and local organizations prioritize the ranking of threatened and endangered species to help direct conservation efforts. For example, the International Union for Conservation of Nature (IUCN) assesses the Green Status of species and publishes the influential Red List of threatened species. Unfortunately, such conservation yardsticks do not explicitly consider genetic or genomic diversity (GD), even though GD is positively associated with contemporary evolutionary fitness, individual viability, and with future evolutionary potential. To test whether populations of genome sequences could help improve conservation assessments, we estimated GD metrics from 82 publicly available mammalian datasets and examined their statistical association with attributes related to conservation. We also considered intrinsic biological factors, including trophic level and body mass, that could impact GD and quantified their relative influences. Our results identify key population GD metrics that are both reflective and predictive of IUCN conservation categories. Specifically, our analyses revealed that Watterson's theta (the population mutation rate) and autozygosity (a product of inbreeding) are associated with the current Red List categorization, likely because demographic declines that lead to “listing” decisions also reduce levels of standing genetic variation. We argue that by virtue of this relationship, conservation organizations like IUCN could leverage emerging genome sequence data to help categorize Red List threat rankings (especially in otherwise data‐deficient species) and/or enhance Green Status assessments to establish a baseline for future population monitoring. Thus, our paper (1) outlines the theoretical and empirical justification for a new GD‐based assessment criterion, (2) provides a bioinformatic pipeline for estimating GD from population genomic data, and (3) suggests an analytical framework that can be used to measure baseline GD while providing quantitative GD context for consideration by conservation authorities.


| Introduction
Global biodiversity is declining rapidly as humans continually modify natural habitats and expand our environmental footprint.Habitat reduction and fragmentation, overharvesting, invasive species, and other anthropogenic impacts routinely lead to population declines, reduced gene flow, and subsequent increases in inbreeding and genetic drift (Almeida-Rocha et al. 2020;Schlaepfer et al. 2018).Collectively, these anthropogenic impacts lead to a loss of genetic/genomic diversity (GD) and a concomitant reduction in population fitness.The loss of GD and fitness can accelerate an extinction vortex (Blomqvist et al. 2010;Gilpin and Soulé 1986) and jeopardize the sustainability of a population or species because GD provides the evolutionary potential needed to adapt to a changing environment (DeWoody et al. 2021;England et al. 2003;Frankham 2005;Kardos et al. 2021).In this regard, the Convention on Biological Diversity (CBD) recently listed maintaining GD as an important goal in the Post-2020 Global Biodiversity Framework with support from the International Union for Conservation of Nature (IUCN) (Hoban et al. 2020).As one of three components of biodiversity (along with species and ecosystem diversity), GD should be a central component of modern conservation policies.
As an international entity comprised largely of academic, government, and private members, IUCN strives to help protect nature by using the best available science to prioritize conservation efforts.For example, one of IUCN's major undertakings is the production of the "Green Status of Species" to help measure the recovery of populations and species which are subject to active conservation efforts (Akçakaya et al. 2018).A more ominous task is IUCN's production and regular updating of their "Red List," which classifies species into one of nine categories (Extinct, Extinct in the Wild, Critically Endangered, Endangered, Vulnerable, Near Threatened, Least Concern, Data Deficient, and Not Evaluated).The Red List often influences national and state authorities in their official listing decision for species under their supervision.For example, the IUCN Red List is used at the international level by the Convention on International Trade in Endangered Species (CITES), CBD, and by the United Nations Sustainable Development Goals (SDGs).The IUCN Red List is also used at the national level by the National Institute of Biological Resources of South Korea and the U.S. Fish and Wildlife Service, and at the state or provincial level by the California Department of Fish and Wildlife and by the Indiana Department of Natural Resources (among many others).Ultimately, decisions made by IUCN regarding the Red List have long reverberated through the global conservation community and we expect future Green Status assessments to be similarly influential.
As it now stands, Red List assignments do not explicitly consider GD despite numerous calls to do so (Garner, Hoban, and Luikart 2020;Laikre et al. 2020; van Oosterhout 2020;Willoughby et al. 2015).Similarly, IUCN's Green Status metrics do not currently include GD although recent papers argue that it should (Jackson et al. 2022; van Oosterhout et al. 2022; van Oosterhout 2024).This is unfortunate because GD is an important component of population viability (Kardos et al. 2021;Reed and Frankham 2003), and in many instances GD can provide insights into rare or elusive species whose population attributes are otherwise difficult to address (Brüniche-Olsen, Westerman, et al. 2018;Khan et al. 2021).For example, Rice's whale (Balaenoptera ricei) is a newly described species of baleen whale endemic to the Gulf of Mexico (Rosel et al. 2021).Baleen whales are notoriously difficult to study at sea, but empirical GD estimates from only a few individuals (e.g., sourced from beached whales or noninvasively collected DNA) could provide both baseline population genomic data for future monitoring as well as critical demographic context for conservation plans.
Previous studies based on microsatellite genetic markers have suggested that threshold levels of GD can be used to help delimit conservation categories.For instance, Willoughby et al. (2015) proposed a conceptual framework that-based strictly on GD estimates from related species and recognizing that GD is but one component of population viability-designates IUCN conservation categories based on the estimated time (in generations) that a species or population is predicted to lose more GD than 75% of its taxonomic relatives.The conceptual framework of Willoughby et al. (2015) was proposed at the twilight of the microsatellite era.Here, we extend it into the dawning population genomic era.Although previous genomic studies have explored the relationship between GD and conservation status (e.g., Brüniche-Olsen, Kellner, et al. 2018;Genereux et al. 2020;Wilder et al. 2023), they used only a single genome per species.Kardos et al. (2021) insisted that population-level data are necessary to accurately reflect genetic variation because a single individual may not properly represent the species as a whole, and we agree.Genome resequencing datasets from natural populations are on the rise (DeWoody et al. 2022) and they provide rich new opportunities for conservation-informative GD metrics to be molded into one or more formal conservation criteria.
The IUCN Red List makes categorical assignments for each species they evaluate according to five criteria: (1) population size reduction; (2) geographic range; (3) small population size and decline; (4) very small or restricted population; and (5) associated quantitative analyses of population viability.In contrast, Green Status assessments evaluates the recovery status of species based on a "Green Score" that ranges from 0% to 100% where 100% represents fully recovered (i.e., higher scores indicate better recoveries).
In this paper, we first assessed whether IUCN yardsticks effectively reflect contemporary mammalian GD.We reasoned that if so, Red List of Threatened species (Critically Endangered, Endangered, and Vulnerable) should exhibit lower levels of GD than Non-Threatened species (Near Threatened and Least Concern) due to inbreeding, genetic drift, and reduced gene flow.If not, this would indicate that the Red List evaluation criteria insufficiently capture a key aspect of biological diversity (i.e., GD; see van Oosterhout 2024).We then thought about how GD could be utilized in Green Status assessments, then developed an explicit algorithm for doing so.We focused primarily on Watterson's theta (θ W ), the number of segregating sites in a gene pool (i.e., θ W = 4N e μ for diploids under mutation-drift equilibrium; Watterson 1975).Watterson's theta can serve simultaneously as an effective metric of GD and as a proxy for effective population size (N e ), both useful additions to IUCN assessments in part because of the inherent difficulties in consistently estimating N e across studies (Waples 2024a).Unlike some lagging indicators of GD such as nucleotide diversity (which may reflect more ancient demographic events such as bottlenecks or expansions), θ W is a leading indicator of GD because it depends more on contemporary N e (Tajima 1989b;Brüniche-Olsen et al. 2021).
Unfortunately, a simple fixed GD threshold for conservation (e.g., mean θ W < 0.002 = Endangered) would be misleading because of the inherent variation in GD observed among taxa.Species vary in key biological attributes such as body sizes, generation times, and metabolic rates that are known to affect GD (Bromham 2024;Ellegren and Galtier 2016;Romiguier et al. 2014).Thus, we examined associations among fundamental biological characteristics, such as trophic level and body mass, with GD to account for major biological factors that might otherwise confound the relationship between GD and Red List status.We did so using Class Mammalia as an example because many flagship species of conservation interest (e.g., pandas, tigers, and whales) are mammals.Furthermore, mammalian data are sufficiently dense in both the IUCN Red List and in sequence repositories to allow for robust analyses of our GD framework.Finally, we provide explicit suggestions for how authorities might improve the Red List and Green Status assessments-and hopefully improve subsequent conservation outcomes-by incorporating GD.

| Materials and Methods
The overall workflow of this study is illustrated in Appendix A1.In addition to θ W , we also evaluated four other commonly utilized population genomic metrics that each has a strong theoretical justification for being conservation-informative: (1) mean nucleotide diversity (); (2) mean observed genome-wide heterozygosity (H); (3) Tajima's D (D); and (4) the extent of autozygosity as measured by runs of homozygosity (ROH).Nucleotide diversity, , conveys the average number of nucleotide differences per site between all pairs of sequences in a population.Heterozygosity, H, measures the proportion of heterozygous sites considered in a given sample (Nei 1978).At the population level, mean H is averaged across estimates from individual genomes.Tajima's D is the difference between  and θ W , divided by its variance under mutation-drift equilibrium (Tajima 1989a).Tajima's D can be used to identify signatures of selection on individual loci, but demographic trends can also be detected when it is measured across the genome: D < 0 indicates population growth after a bottleneck, D = 0 indicates population stability, and D > 0 indicates a sudden population decline.Finally, ROHs describe the proportion of contiguous homozygous regions along the genome and can used to directly estimate both the extent and timing of inbreeding (and indirectly, the level of inbreeding depression due to associated reductions in fitness) (Ceballos et al. 2018).We calculated two ROH estimators, namely F ROH>100kb (the fraction of ROH longer than 100 kb, F100kb; representing the cumulative inbreeding level) and F ROH>1Mb (the fraction of ROH longer than 1 Mb, F1Mb; representing the recent inbreeding level).

| Data Collection
We collected four types of data from public databases: (1) reference genomes; (2) population-level whole genome resequencing (WGR) reads for the inference of GD metrics; (3) IUCN Red List information; and (4) ecological characteristics (trophic level, body mass, and habitat breadth) for statistical tests of association with the GD metrics.Because formal taxonomic designations (e.g., subspecies) do not always perfectly correspond with conservation units, we used subspecies or regional population level data from the Red List whenever available because conservation status can vary among demographically independent populations within the same species.
We searched all the available reference genomes of mammalian species (as of December 2021) from National Center for Biotechnology Information (NCBI) and collected assembly identifiers (e.g., accession number and assembly name) required for our bioinformatic pipeline.We also collected additional information on the assembly level (i.e., contig, scaffold, or chromosome), contig N50, scaffold N50, and assembly size for downstream analyses (Dataset S1).Species with a reference genome were further searched and population-level WGR data were retrieved from NCBI's Sequence Read Archive (SRA).We use population-level data when: (1) "WGS" type data were available; (2) the data were comprised of paired-end reads; (3) the data were sequenced on Illumina platforms, such as Genome Analyzer, Hiseq, Novaseq, or Nextseq platforms; and lastly (4) a minimum of two different individuals from the same wild population were available.We followed the data author's population designation and limited the maximum number of individuals to 25 for computational tractability.The largest population was selected when multiple populations were available.We recorded the Illumina sequencing chemistry used (i.e., two-channel or four-channel; De-Kayne et al. 2021) and the number of individuals for downstream use (Dataset S1, https:// github.com/ jyj55 58/ theta ).
For each species evaluated, we used the IUCN Red List to record conservation category (i.e., CR-Critically Endangered, EN-Endangered, VU-Vulnerable, NT-Near Threatened, LC-Least Concern), population trend (i.e., decreasing, stable, or increasing), and extent of geographic range.We imported the shape file of the species geographic range to ArcGIS Pro 2.9.0 (Esri 2021) and calculated the total habitat ranges except for "Extinct" and "Possibly Extinct" species.The shape files were clipped to match with the collection site of "subspecies" or "subpopulation," as defined by the Red List, whenever apparent and applicable based on the associated metadata.
We estimated GD metrics using ANGSD v0.94.0 (Korneliussen, Albrechtsen, and Nielsen 2014) and bcftools v1.17 (Danecek et al. 2021).We applied conservative filters in ANGSD, including removing low quality reads and ambiguously mapped reads.We estimated genotype likelihoods with GATK and maximum likelihood estimates of the folded site frequency spectrum were obtained using the realSFS tool.We estimated , θ W , and D applying a sliding window approach with non-overlapping 50 kb windows.Estimates of  and θ W were divided by the effective number of sites to represent genomic proportions.Individual genome-wide H was estimated using a similar process and averaged to provide a population-level mean H estimate for each species.To estimate the ROH burden, a bcf file was generated from quality filtered bam files using ANGSD.Subsequently, bcftools/roh (Narasimhan et al. 2016) was employed to apply the hidden Markov model to identify individual ROHs.The fraction of ROHs in individual genomes (F ROH ) were averaged to obtain a population-level estimate per species using an in-house python script.

| Statistical Analysis
Descriptive statistics (mean and standard deviation) and the distribution of GD metrics were summarized and plotted by both IUCN categories and taxonomic Orders.Before the main analyses described below, we partitioned the full dataset (Dataset S1) into two subsets.The first subset, the "IUCN dataset" (Dataset S2), included all the species having their own categorical Red List assessment but excluded those listed as "Data Deficient."The second subset, the "EcoEvo dataset" (Dataset S3), included all the species with eco-evolutionary traits available in the COMBINE database (Soria et al. 2021).
Uncorrelated GD metrics were then individually tested against IUCN categories to determine if there was a significant difference between mean GD values across IUCN categories.We used both "full" categories (CR, EN, VU, NT, and LC) as well as "binary" categories consisting of Threatened (=CR + EN + VU) or Non-Threatened (=NT + LC).We considered technical factors as well (e.g., sequence read depth) and controlled for them in statistical models (Technical Dimensions; Dim.1-Dim.4)as described in Appendix A2.To account for phylogenetic signal (), we ran Phylogenetic Generalized Least Squares models (PGLS; GD ~ IUCN category + Dim.1 + Dim.2 + Dim.3 + Dim.4-1) using the R package "caper" (Orme et al. 2018) with the maximum likelihood method.The significance of individual independent variables was further examined in each model and effect sizes (partial omega-squared) were determined for the independent variables of interest using R package "sjstats" (Lüdecke 2022).Comparisons among significant models were assessed using Akaike information criterion (AIC) values.
The mammalian phylogenetic tree used in the models was derived from VertLife (Upham, Esselstyn, and Jetz 2019) by sampling 1000 trees from the "Mammals birth-death nodedated completed trees" distribution (Upham, Esselstyn, and Jetz 2019).The "averageTree" function with default settings was applied using the R package "phytools" (Revell 2012) to obtain a consensus tree from the 1000 trees, which was then rooted with Sarcophilus harrisii as an outgroup (Damas et al. 2022) using the "root" function in R package "ape" (Paradis and Schliep 2019).Additional tips for each subspecies were manually added to the tree as a sister taxon of its closest relative using the "AddTip" function in the R package "TreeTools" (Smith 2019).Two Red List assessment criteria, "population trend" and "geographic range," were compared in place of the IUCN category with the same procedure above.
We used ordinal regression tests to examine the explanatory power of GD metrics that were significantly correlated with IUCN categorization after accounting for phylogenetic nonindependence.Each model consisted of IUCN full categories or binary categories (i.e., threatened vs. non-threatened) as dependent variables, and one of the significant GD metrics as an independent variable.IUCN categories were treated as pseudocontinuous following (Graber 2013).
For combinations of GD metrics and IUCN categorizations that covaried according to phylogenetic history in both PGLS tests, we ran multi-response phylogenetic mixed modeling (MR-PMM) using the R packages "MCMCglmm" (Hadfield 2010).In these tests, both a GD metric and IUCN categorization were treated as response variables to distinguish and estimate phylogenetic and independent covariances (Halliwell, Yates, and Holland 2023;Westoby et al. 2023).For each test, various priors were tested with Markov chain Monte Carlo (MCMC) sampling to obtain the best model convergence, then one set of priors that achieved successful convergence was fixed to the test.Models were run for 11,000,000 iterations including a 1000,000 burn-in period and sampling every 1000 runs.Model convergence was assessed by effective sample sizes and trace plots.We then estimated the link-scale independent (i.e., non-phylogenetic) correlations between GD metrics and IUCN categorization.Detailed R code is available at https:// github.com/ jyj55 58/ theta .
We considered several machine learning classifiers (i.e., random forest, k-nearest neighbors, and support vector machine) that are applicable to small datasets using the "scikit-learn" package (Osisanwo et al. 2017).Only GD metrics identified as significant in linear models were included as predictors, and IUCN binary categories were included as responses.For the random forest and linear support vector machine classifiers, we used predictors simultaneously and compared their feature importance.For k-nearest neighbors and non-linear support vector machine classifiers, we used predictors individually and compared their model accuracy.Predictors were standardized in k-nearest neighbors and support vector machine models.We used 70% of the data for model training and 30% for model testing.After hyperparameter tuning by a wide range of randomized grid searches and/or a finer parameter grid search for each machine learning classifier, the final models were set according to the best estimator.
We tested associations between IUCN categories and (a) population trend or (b) geographic range estimates, two criteria currently used to help determine IUCN status.We did so to provide perspective on the signal (or lack thereof) of individual criteria contained in GD metrics.
To strengthen our conservation-oriented analyses by accounting for potential confounding factors, the distribution of all the GD metrics, conservation criteria and eco-evolutionary factors across species was displayed on a multi factor analysis (MFA) plot using R packages "FactoMineR" (Lê, Josse, and Husson 2008) with the first two dimensions.We also leveraged the dataset to evaluate evolutionary factors by comparing GD against key eco-evolutionary factors that could drive levels of standing GD, including trophic level, habitat breadth, and body mass.See Appendix A3 for details.

| Genetic Diversity Across Red List Categories and Across Species
Among 613 species and subspecies with reference genome assemblies available from NCBI at the time of our search in 2022, 98 species had population genomic whole-genome resequencing (WGR) datasets that met our inclusion criteria.Sixteen species were subsequently dropped during the bioinformatic data analysis due to unsatisfied thresholds (e.g., low mapping rates, depths, and/or breadth) or an inconsistent data type (e.g., pooled-sequencing).Ultimately, this resulted in 82 species in our final WGR dataset (Dataset S1).Our "IUCN" (Dataset S2) and "EcoEvo" (Dataset S3) datasets had 72 and 63 species, respectively, after reconciling taxonomy and pruning for phylogenetic pseudoreplication.
Descriptive statistics for each GD metric are summarized in Table 1 and Table S1 (see also Figure 1 and Figures S1-S14).Out of 82 species, 21 species did not yield F ROH>1Mb (F1Mb) estimates due to low contiguity of their reference genome.Nucleotide diversity () and genome-wide heterozygosity (H) values were strongly correlated (r > 0.9) with Watterson's theta (θ W ). Thus, while we interpreted results mainly based on θ W as it is more sensitive to population decline, we employed H for practical assessment criterion below (see Section 4.3).
In general, Non-Threatened species have higher GD than Threatened species.Individual and categorical θ W was effectively twice as high in Non-Threatened species compared to Threatened species (Table 1 and Figure 2), and F1Mb (the metric of recent inbreeding) was doubled in Threatened species (Table 1 and Figure S14).

| Statistical Associations Between GD and the IUCN Red List
Overall, our analyses indicate that two GD metrics-θ W and F100kb-were significantly associated with IUCN conservation categories (detailed model results are presented in Table S2 and Figures S15-S17).The main phylogenetic generalized least squares (PGLS) model between IUCN full categories (Least Concern-LC, Near Threatened-NT, Vulnerable-VU, Endangered-EN, Critically Endangered-CR) and θ W was significant (significant partial ω 2 = 0.099).
There was significant phylogenetic signal () in the model ( = 0.961; 95% CI = 0.883-0.988),implying there was phylogenetic non-independence in the data which was accounted for in the model.The secondary PGLS model of θ W against IUCN binary categories of Threatened and Non-Threatened (LC + NT + VU = Non-Threatened; EN + CR = Threatened) was also significant with  = 0.951 (significant partial ω 2 = 0.106).The PGLS models also revealed a significant relationship between the ROH burden (as measured by F100kb) and IUCN full categories (significant partial ω 2 = 0.106) and between the ROH burden and binary IUCN categories (nonsignificant partial ω 2 = 0.001).None of the PGLS models revealed significant associations between D or F1Mb and IUCN category (full or binary).Among significant models, the F100kb with IUCN binary categories model was best, only slightly better than the model with IUCN full categories, followed by θ W (Table S3).See Appendix A4 for the results of models which used individual Red List assessment criteria (i.e., "population trend" and "geographic range") in place of the IUCN categories.
Our analyses indicate that θ W and F100kb were significantly associated with conservation status.Thus, we used either θ W or F100kb as independent variables in the PGLS for phylogenetic ordinal regression against IUCN categorization.The model with θ W as the independent variable was significant when IUCN binary category was the dependent variable (significant partial ω 2 = 0.121; Table S2).Regardless of whether the dependent variable was the IUCN full category or binary category, θ W was always significant as an individual factor whereas F100kb was never a significant predictor of IUCN category (whether full or binary; Table S2).
The results of MR-PMM indicated that all models converged well with effective sample sizes reaching nearly 100% of the sampled chains for each estimated model parameter (Figures S18-S21).Both full and binary IUCN categories had a strong a phylogenetic signal (Table S4), with θ W and F100kb both correlating with the IUCN categories after controlling for phylogenetic non-independence, but θ W was more strongly correlated.Among machine learning classifiers, θ W was generally a better predictor of IUCN categories than F100kb.The K-nearest neighbors approach with θ W as a predictor of IUCN binary categorization showed the highest accuracy, followed by the linear support vector machine approach with θ W and F100kb both as predictors and non-linear support vector machine with θ W as a predictor.However, the model accuracy (≤0.6) and importance of predictor (≤0.6) were relatively low for both θ W and F100kb.The fine-tuned hyperparameters and results for each of the machine learning classifiers are presented in Table S5.

| Discussion
The relationships among GD, N e , and fitness have been thoroughly reviewed and summarized by previous studies (James and Eyre-Walker 2020;Mitton 1994;Nevo, Beiles, and Ben-Shlomo 1984).These and other studies indicate that GD, as estimated by θ W or related measures such as individual heterozygosity, is a critical component not only of contemporary fitness but also of future evolutionary potential.The idea that GD can serve as an indicator of future evolutionary potential should not be overlooked considering the global environmental challenges facing natural populations today (but see Section 4.5).Furthermore, genome resequencing data offer remarkably high information content per individual (e.g., estimates of GD such as mean θ W or F100kb).This means that sampling the genomes of only a few individuals can provide key insights into population biology, and this could be especially important in the case of rare and/or secretive species whose populations are difficult to survey using conventional means.
A reduction in GD, with its concomitant loss of fitness and increased probability of extinction (Flight 2010;Frankham 2005), is expected to result from demographic events like population bottlenecks, population subdivision, and founder events that reduce population sizes.Neutral GD is determined by the product of the generational mutation rate and the effective population size (N e ), and thus GD is determined in part by the census size of the population (James and Eyre-Walker 2020; Leffler et al. 2012).Moreover, and not surprisingly, population census size is positively correlated with geographic range size.According to conservation theory, small, threatened populations tend to have lower GD than large, broadly distributed populations which are typically not threatened (Frankham 1996).
Our analyses of empirical data bear out those theoretical predictions (Figure 2).We analyzed population genomic data from 82 species belonging to 11 Orders of mammals among various IUCN conservation categories.For each species, we calculated GD metrics and tested for significant associations between these metrics and various biological parameters, such as geographic distribution or body size, that might impact diversity.The overarching goal of the research was to determine the relationship between population-level GD metrics and IUCN conservation categories while simultaneously identifying key intrinsic drivers of mammalian GD, which we address first.

| Description of Mammalian Genomic Diversity
Our results are consistent with a long history of empirical genetic studies dating to the 1960s when protein electrophoresis was first used to measure GD in natural populations of mammals.For example, Figure 2 indicates that the four species with the highest θ W values (i.e., the number of variable nucleotide sites) are Peromyscus maniculatus (deer mouse, including two subspecies), Mus musculus castaneus (southeastern Asian house mouse), and Myotis lucifugus (little brown bat) followed by Peromyscus leucopus (white-footed mouse).Nevo, Beiles, and Ben-Shlomo (1984) compiled an allozyme dataset of GD metrics (e.g., H) from 1111 species of animals and plants including 184 species of mammals.Their dataset was comprised of GD estimates from only a few dozen allozyme markers per species, and they examined only a few of the same species that we did.However, there are some remarkable similarities between Nevo, Beiles, and Ben-Shlomo (1984) and our current study.Nevo, Beiles, and Ben-Shlomo (1984) found only 13 species (not including humans and domestic cat) that had values of H > 0.09 out of 184 species of mammals.Among them were P. maniculatus and two species of bats of the genus Myotis.The taxonomic overlap in high GD species between Nevo, Beiles, and Ben-Shlomo (1984) and our study is reassuring.We also found a strong correlation (r = 0.81) between our heterozygosity estimates and Nevo, Beiles, and Ben-Shlomo's (1984) estimates among the 18 species included in both studies.These findings bolster our confidence that evolutionary genetics theory is buttressed by existing, publicly available genomic datasets that can be readily exploited by interested conservationists.
Taxonomic Order is the taxonomic level in which member species share a broad suite of morphological, physiological, genetic, and ecological characteristics; species of different Orders can easily be distinguished by many conservationists.If we just consider the four most speciose Orders, Rodentia had the highest mean value of θ W = 0.00626 and Carnivora had the lowest mean value θ W = 0.00090.This is not unexpected given that small herbivores generally have much larger population sizes and nucleotide substitution rates than do carnivores (Zhang et al. 2021).Conversely, Carnivores had the highest mean F1Mb = 0.06209 and rodents had the second lowest mean F1Mb = 0.02441.Again, this is consistent with their population biology in which rodents (a group of animals with small bodies, short lifespans, and high fecundity) are expected to have higher effective mutation rates and larger population sizes, thus higher GD, than carnivores (animals with larger bodies, longer lifespans, and lower fecundity), where there is generally far more opportunity for inbreeding in isolated populations (De Kort et al. 2021;Romiguier et al. 2014).Primates have relatively high inbreeding with F1Mb = 0.05569.This is perhaps a reflection of a high degree of social structuring, small census population sizes, and slower rates of molecular evolution in primates (Zhang et al. 2021).

| Genomic Diversity and Red List Status
We found that key population GD metrics are generally reflective and predictive of IUCN conservation categories that presumably reflect extinction threat status.The effect sizes of the independent GD metrics (partial ω 2 = 0.099-0.121)indicate they explained a modest to large proportion of the variance in the response variable (Field 2013).This supports the idea that GD is indirectly reflected by the current Red List assessment methodology.Our results also indicate that Threatened species or populations have reduced GD compared to those with Non-Threatened status.We found that θ W (and its correlates,  and H, which are both measures of genomic variation based on polymorphic nucleotide sites) was the best conservation metric, followed by F100kb (a measure of autozygosity that is reflective of inbreeding).One individual Red List criteria, "geographic range," also reflected GD.Geographic range was inversely proportional to longer fraction of ROH (Figure A5), another reasonable result in that habitat contraction can result in elevated levels of inbreeding relative to random mating (Nonaka et al. 2019).
The correlation between GD and Red List status has been tested before (Brüniche-Olsen, Kellner, et al. 2018;Brüniche-Olsen et al. 2021;Garner, Hoban, and Luikart 2020;Nabholz et al. 2008;Willoughby et al. 2015) but mostly with mitochondrial sequences, microsatellite marker data, or a single genome sequence.There has been no scientific consensus on whether the Red List indirectly captures GD.Recently, Schmidt et al. (2023) performed a meta-analysis of studies that used different markers and corroborated Willoughby et al. (2015), who found that GD is modestly predictive of Red List status.Our results are consistent with this interpretation.Several authors (Garner, Hoban, and Luikart 2020;Schmidt et al. 2023;Willoughby et al. 2015) have suggested the loss of GD over time would be even more valuable than snapshot values of GD in conservation assessments.In the next section, we extend this line of reasoning by detailing an approach for including GD as an explicit criterion in future conservation assessments.

| An Explicit Genetic Criterion for Conservation Assessments
Over 30 years ago, Mace and Lande (1991) originally suggested an assessment criterion based on N e in Version 1.0 of the Red List Categories and Criteria, but the most recent iteration of these Criteria (Version 3.1) still do not embrace N e despite recent pleas to include genetic considerations in status determinations (e.g., Garner, Hoban, and Luikart 2020;Laikre 2010;Willoughby et al. 2015).Furthermore, N e is the indicator proposed by the Global Biodiversity Framework (Hoban et al. 2020).Effective population size is notoriously difficult to estimate, but is a primary determinant of GD (e.g., Watterson 1975;equation 3 in Nei and Takahada 1993).Thus, we suggest that an additional criterion that considers GD and N e would help further inform conservation assessments.We think an additional criterion could be useful for all species where GD data are available, but especially for species that might otherwise be deemed "Data Deficient" (i.e., limited data available to assess conservation status based on traditional criteria).
Our proposal for an explicit GD criterion for status assessments is based on the mean loss of heterozygosity over time (Crow and Kimura 1970).We chose H in large part because (1) our results indicate that since H is correlated with θ W , H is both a good indicator and good predictor of existing IUCN categories because the number of variable sites across a genome (the θ W we used) is equal to H when calculated from a single individual; and (2) it has a solid theoretical foundation based on Crow and Kimura's equation.To accommodate a time frame (100 years) relevant for conservation programs and to align with Green Status assessments, we modified the original equation such that: where N e is the effective population size, H O is observed heterozygosity, H T is heterozygosity at time T where T is the number of generations in 100 years (e.g., T is 100 for most insects or annual plants, T is 50 for antelope with 2-year generation times, and T is 5 for whales with 20-year generation times).Furthermore, H has previously been proposed as a key genetic criterion for conservation efforts (Allendorf and Ryman 2002;Willoughby et al. 2015) and H can be accurately estimated from only a few whole genome sequences (Gorman and Renzi 1979;Nei and Roychoudhury 1974), an important consideration with respect to Threatened populations or species.Finally, and importantly, the concept of heterozygosity is well understood by most biologists.
Our proposed GD criterion for the Red List is illustrated in Figure 3 and, in principle, could be readily applied by any conservation organization that conducts status assessments given that the model parameters can be estimated from publicly available resources.For example, H O could be estimated from population genomic datasets using a standardized workflow (our Nextflow pipeline is available at https:// github.com/ jyj55 58/ theta ), and generation time is generally known from life history studies.N e can either be estimated indirectly from census population size (N c ) where N e is crudely estimated from N c (Frankham 1995;Palstra and Fraser 2012), or directly from population genomic data.For example, contemporary N e can be estimated using the linkage-disequilibrium-based method (e.g., currentNe; Santiago et al. 2020) so long as practitioners recognize that genomes do not immediately register demographic changes (i.e., there is a lag time or drift debt; Patton et al. 2019;Pinto et al. 2024;Waples 1990).
Here, we provide initial recommendations for how GD could be used to help assign extinction risk categories (i.e., CR, EN, or VU).These novel recommendations will ultimately evolve, but we base them on an established idea: the rate of heterozygosity loss over time (Allendorf and Ryman 2002;Díez-del-Molino et al. 2018;Frankham, Bradshaw, and Brook 2014;Lande 1988;Lynch and Lande 1998).We suggest that conservation authorities rely primarily on the rate of GD loss to help determine the threat category, where the rate is defined as the Secondarily, we propose that explicit N e cutoffs be used to help ensure that populations with demographic characteristics that put them at high risk of genetic erosion are effectively evaluated.For example, species with long generation times (e.g., whales) will register GD changes too slowly with the H T :H O ratio and thus are more appropriate to evaluate with N e .
If the generation time is long or if N e cannot be reasonably estimated (e.g., because of uncertainty in N c ), then we recommend using relative H O thresholds to classify taxa into "Nonthreatened" or "Threatened" categories.Our rationale is that H O varies among taxonomic categories (Figure 1) and that H O is often associated with threatened status (Table S2 In practice, it can be difficult to estimate credible N e values in a standardized way across taxa.This is because N e estimates depend on the approach employed (e.g., heterozygote excess, linkage disequilibrium, and temporal variance in allele frequencies), the number of variable loci, the number of samples, and other biological factors (e.g., reproductive skew) that vary across taxa (Waples 2024a).Thus, when possible we suggest the use of the N e /N c ratio to estimate N e (Waples 2024b) but recognize that N e may need to be estimated with genomic approaches (e.g., linkage disequilibrium).
Our specific recommendations are shown in Figure 3.If the rate of heterozygosity loss is extreme (i.e., H T :H O ≤ 0.95), we think this is sufficiently worrisome that it warrants a CR or EN categorization.In cases where the rate of heterozygosity loss is less extreme, perhaps due to the 100-year time frame coupled with long generation times, we suggest supplementing the H T :H O ratio with N e as described with the Boolean operators in Figure 3.For instance, H T :H O = 0.987 in the cheetah (Acinonyx jubatus) when using a 10% N e /N c ratio (Table S6), which is relatively high.As a safeguard for species that have long generation times (6 years for the cheetah), the logic shown in Figure 3 and Table S6 suggests that if an additional evaluation revealed that N e < 1000, then the cheetah should be assigned to the VU category.On the other hand, when using a 100% N e /N c ratio, the cheetah's H T :H O = 0.997 and N e = 6517, thus categorizing it as LC.However, the cheetah's H O is 0.00041, which is lower than the maximum (=0.00118 of Panthera tigris jacksoni) of genetically Threatened Canivora species (i.e., species of the Order Canivora which are assigned VU, EN, or CR according to the H T :H O cutoff or N e cutoff), and this would result in a Threatened determination for the cheetah.
How do these general recommendations perform using real data?
We tested these thresholds by using median census population size (N c ) for Red List species when available or, alternatively, by employing currentNe to estimate N e from the publicly available sequence data for "Data Deficient" species (i.e., those without Red information) to illustrate how genomic data could be used when demographic or other data are lacking.Because the ratio of effective to census size varies over time (Ardren and Kapuscinski 2003;Wang et al. 2023) and across taxa (Palstra and Fraser 2012), we used two extreme ratios of N e /N c , 10% (Frankham 1995) and 100% (Wang et al. 2023) to illustrate the strictest and the most lenient assessments.We found that in many cases, the loss of heterozygosity over 100 years ("H T /H O " in Table S6) was <2.5% due to large census population sizes and/or long generation intervals, but two "Data Deficient" species were assigned "CR" due to small estimates of N e .When we used a 10% N e /N c ratio, the overall trend of genetic categories was quite similar to that of the original IUCN categories (Figure 4a,b), but when a 100% N e /N c ratio was used, genetic categorization was more lenient than the original categorization (Figure 4c,d).Thus, our GD recommendations can effectively supplement the original Red List categorization, especially given that the final (official) category is determined as the highest threat category determined across multiple assessment criteria (IUCN 2012).Further research will be needed to determine the most robust N e /N c ratio or N e estimator for any given lineage.

| The Potential Role of GD in Green Status Conservation Assessments
IUCN's newer assessment scheme, the Green Score, is calculated as: where W S is the weight of a spatial unit (S; could be a geographic population, for example) defined as 0 (absent), 3 (present), 6 (viable state), or 9 (functional state); W F is weight of the functional state (i.e., 9); and N is the number of spatial units (https:// www.iucnr edlist.org/ asses sment/ measu ring-recov ery-green -statu sspecies).The Green Score can address intraspecific, populationlevel conservation status.This aspect of the Green Score is extremely valuable, but we think it could be enhanced with a "genetic diversity correction" because relative GD is associated with population productivity and adaptability.Explicitly, relative GD could be incorporated into the Green Score by weighting a population's GD against the average GD among related LC species (to correct for inherent biases in GD across organismal groups, as shown in Figures S2 and S3 and in Garner, Rachlow, and Hicks 2005).Incorporating this suggestion into the existing Green Score equation would give the equation: where GD S is the GD level of a spatial unit and GD LC is the average GD level of "LC" species or populations (the maximum value of GD S /GD LC would thus be capped at 1).
There are several reasonable GD parameters that would be suitable for use in our suggested modifications of the Green Score calculator.For example, H O estimates are alluring for the same reasons we recommend its use for estimating extinction risk: H O is related to existing IUCN categories and H has a solid theoretical foundation.However, if N e were used as the GD metric in the formula, this approach could accommodate two CBD indicators that Hoban et al. ( 2020) suggested for the CBD's post-2020  et al. 2021).To focus on the influence of GD correction, we assumed that all the available populations (spatial units in the Green Score formula) were sampled for each taxon and that all the California towhee populations are in a functional state whereas the Inyo California towhee population is in a viable state.Without the GD correction, the Green Score of California towhee would be: whereas the GD-corrected Green Score of California towhee would be Without the GD correction, the Green Score of the Inyo California towhee would be whereas the GD-corrected Green Score of Inyo California towhee would be Thus, the GD correction reduced the Green Scores by ~40%-50%, and provides a richer and more nuanced insight into the status of these two taxa.
Global Biodiversity Framework-"the number of populations [or breeds] within species with an effective population size (N e ) above 500 compared to the number below 500" and "the number of species and populations in which genetic diversity is being monitored using DNA based methods."We could further pursue Goal A in CBD's Post-2020 Global Biodiversity Framework (i.e., "maintaining at least 90% of GD of all species") by targeting the ratio of (∑GD S )/(GD LC × N) ≥0.9 for each species.Regardless of the selected metric, our intent is to illustrate that the Green Score could be further enhanced by explicitly addressing GD, and that these enhancements would help conservationists better estimate prominent biodiversity indicators used by international organizations like the CBD (Box 1).

| Limitations
GD worked well for conservation assessments in this study, but we acknowledge some potential limitations.First of all, genomic resequencing data is not always available but we note this is a potential criticism of any assessment criteria, including those already in use by IUCN and other organizations.
Secondly, some species seem to thrive even with low GD (e.g., Femerling et al. 2023).However, it is also true that some species seem to thrive even when they score poorly with existing assessment criteria, such as small population sizes or small geographic ranges (e.g., bottlenecked invasive species).Thirdly, there is the potential of unrecognized "drift debt," the time-lag impact of genetic drift akin to the "extinction debt," that has not yet manifested itself in species with long generation intervals (Pinto et al. 2024).Finally, our suggestions are overly simplistic.Additional genomic assessments (e.g., genetic load, genomic offset, and pangenomics) could ultimately be incorporated to produce a more comprehensive GD criterion of the future, but then the challenge to explain the criterion to non-geneticists becomes even more daunting.

| Conclusion
Our analyses show that the five conservation criteria currently used by the IUCN Red List (census population size, demographic trajectory, geographic range size, a combined index of population size and geographic range size, and associated quantitative analyses) indirectly capture GD.However, many species on IUCN's Red List are "Data Deficient" because parameters like census population size or demographic trajectory are extremely difficult to estimate.We think that GD could become valuable as an additional criterion for conservation assessments, in large part because GD can be more easily and inexpensively evaluated than census size or demographic trajectory and can be estimated directly from public sequence databases that are expanding rapidly.We reiterate that neither H nor any other GD criterion should replace any existing assessment criteria.Instead, we think a GD criterion should be another point of emphasis (e.g., as a baseline level or how it trends over time) in a holistic evaluation of conservation threats (Red List) or successes (Green Status).We have illustrated our ideas using mammalian data, but these ideas are applicable to other branches on the tree of life.
We hope our proof-of-concept analyses and suggestions for a novel GD criterion will provide a springboard for further discussion.Regardless of whether the scientific community embraces our specific recommendations, we think conservationists would do well to explicitly assess GD metrics as part of a comprehensive evaluation of each species and there are a number of conservation genetics organizations (e.g., the IUCN Conservation Genetics Specialist Group) poised to help.Our study outlines the theoretical and empirical justification for a new GD criterion, a bioinformatic pipeline for estimating GD from publicly-available population genomic data, an analytical framework, and explicit recommendations for use by conservation authorities.Contemporary GD is critical to population persistence, and we hope that institutional authorities are prescient enough to recognize that exponentially expanding sequence repositories (Karasikov et al. 2024;Offord 2024) are a rich source of biological insight; we only need to take advantage of them.

FIGURE 1 |
FIGURE 1 | A box plot of Watterson's theta by taxonomic Order.Taxonomic Orders are arranged by descending median value of diversity.Dashed line indicates the overall mean value.Silhouette images of animals are adapted from PhyloPic (https:// www.phylo pic.org/ ).

FIGURE 3 |
FIGURE3| A practical outline of how genetic or genomic diversity (GD) could be explicitly used to help determine formal conservation status.The loss of heterozygosity after 100 years can be predicted and used to determine conservation status according to specific thresholds or by comparison to H O estimates of related taxa.For example, even when H T /H O > 0.975, if N e < 1000 then the species or population would warrant categorization as VU as gauged exclusively on its GD.If N e cannot be reasonably estimated (e.g., because N c cannot be estimated), then a Threatened designation would be warranted when H O < maximum H O of species in "Threatened" genetic categories (i.e.VU, EN, or CR as determined by their N e values or their H T :H O ratios) within the same taxonomic Order or Family.CR = critically endangered, EN = endangered, H O = observed heterozygosity, H T = reduced heterozygosity after T generations, LD = linkage-disequilibrium, N c = census population size, N e = effective population size, t = generation time, T = the number of generations in 100 years, VU = vulnerable, μ = mutation rates.

FIGURE 4 |
FIGURE 4| The comparison between original, official IUCN categories and hypothetical genetic categories based solely on the genetic criterion described in this study, whether full (a and c) or binary (b and d) plus "Data Deficient," when the N e /N c ratio was 10% (a and b) or 100% (c and d).Note that we assigned "VU" to Threatened species using the relative H O threshold for purposes of illustration.CR = critically endangered, DD = data deficient, EN = endangered, LC = least concern, NT = near threatened, Thr = threatened, VU = vulnerable.

TABLE 1 |
Genomic diversity metrics grouped by IUCN full categories.
Bats and rodents (Order Chiroptera and Rodentia, respectively) had the highest mean θ W , , and H values, almost triple the next most genetically diverse Order (Artiodactyla; even-toed ungulates).Carnivores and large mammals (Orders Proboscidea and Cetacea) are at the other end of the GD distribution.Mean D was lowest among Proboscidea (elephants), Cetacea, Rodentia and highest among Eulipotyphla (hedgehogs and relatives), Pholidota (pangolins), and Carnivora.Chiroptera had the lowest mean (O'Brien et al. 2017;Schmidt et al. 2023often conveys valuable conservation signal.This relative H O threshold would effectively provide conservation insurance by enabling assessments of elusive species where sufficient sample sizes for accurate N e estimation are difficult or impossible to obtain(Waples 2024a).We propose that relative H O cutoffs be used in a simple binary fashion (yielding assessments of threatened or not) because snapshot H O values cannot determine whether low heterozygosity across the genome is due to recent genomic erosion (e.g., inbreeding) or to natural demographic events such as historic bottlenecks in species like the cheetah(O'Brien et al. 2017;Schmidt et al. 2023).Thus, the value of relative H O should be discounted relative to the information contained in the H T O ratio.We propose that species whose relative H O is less than the maximum H O of related species or populations (i.e., those in the same taxonomic Order or Family) that are already classified as CR, EN, or VU based on their H T :H O ratios or effective population sizes.We propose a threshold associated with a maximum H O , rather than a mean or median value, because those measures of central tendencies will change each time a new H O value is added (e.g., in a novel H O assessment of a related species) whereas the maximum H O will only increase and thus provides more stability to GD assessments.
(Brüniche-Olsen et al. 2021) novel GD criterion into the Green Score Here, we provide an example application of the modified Green Score for California towhee (Melozone crissalis) and Inyo California towhee (M.c.eremophilus).We used their H values(Black et al. 2023) and compared those to H values of Least-Concern birds(Brüniche-Olsen et al. 2021).California towhee was sampled from nine geographic sites where H values were [0.00169, 0.00187, 0.00194, 0.00206, 0.00206, 0.00208, 0.00218, 0.00246, 0.00249] and the Inyo California towhee was sampled from a single population whose H was 0.00180, compared to an average H value of 10 Least-Concern Passerine species of 0.00355 (Brüniche-Olsen