Genetic diversity varies with species traits and latitude in predatory soil arthropods (Myriapoda: Chilopoda)

Aim To investigate the drivers of intra-specific genetic diversity in centipedes, a group of ancient predatory soil arthropods. Location Asia, Australasia and Europe. Time Period Present. Major Taxa Studied Centipedes (Class: Chilopoda). Methods We assembled a database of 1245 mitochondrial cytochrome c oxidase subunit I sequences representing 128 centipede species from all five orders of Chilopoda. This sequence dataset was used to estimate genetic diversity for centipede species and compare its distribution with estimates from other arthropod groups. We studied the variation in centipede genetic diversity with species traits and biogeography using a beta regression framework, controlling for the effect of shared evolutionary history within a family. Results A wide variation in genetic diversity across centipede species (0–0.1713) falls towards the higher end of values among arthropods. Overall, 27.57% of the variation in mitochondrial COI genetic diversity in centipedes was explained by a combination of predictors related to life history and biogeography. Genetic diversity decreased with body size and latitudinal position of sampled localities, was greater in species showing maternal care and increased with geographic distance among conspecifics. Main Conclusions Centipedes fall towards the higher end of genetic diversity among arthropods, which may be related to their long evolutionary history and low dispersal ability. In centipedes, the negative association of body size with genetic diversity may be mediated by its influence on local abundance or the influence of ecological strategy on long-term population history. Species with maternal care had higher genetic diversity, which goes against expectations and needs further scrutiny. Hemispheric differences in genetic diversity can be due to historic climatic stability and lower seasonality in the southern hemisphere. 
Overall, we find that despite the differences in mean genetic diversity among animals, similar processes related to life-history strategy and biogeography are associated with the variation within them.


Introduction
Intra-specific genetic diversity (henceforth genetic diversity) is the amount of genetic variation present among individuals of a species and is an important component of biodiversity. It indicates the evolutionary potential of a species and is correlated with fitness and species' response to environmental change (DeWoody et al., 2021). Genetic diversity can also affect higher levels of biological organization by influencing species diversity, shaping communities (Vellend & Geber, 2005) and regulating ecosystem functioning (Raffard et al., 2019). Population genetic theory postulates that neutral genetic diversity increases with mutation rate and with effective population size, the size of an idealized population that loses genetic diversity at the same rate as the observed population (Kimura, 1983). A reduction in population size increases the sampling error in allele frequencies between generations, known as genetic drift, leading to the loss of genetic diversity (Charlesworth, 2009).
Previous studies have shown that genetic diversity is influenced by species traits and biogeography (Leigh et al., 2021). Species traits can modulate long-term effective population size by determining species' responses to environmental fluctuations. On the other hand, biogeographic correlates determine the strength of environmental fluctuations experienced by species, and therefore, can influence genetic diversity (Ellegren & Galtier, 2016). The strength of the relationship between species traits, biogeography and genetic diversity can be obscured by differences in mutation rates between lineages, which can vary based on the genetic locus under study (Nabholz et al., 2009). Comparisons of genetic diversity across a wide range of taxa show a limited range of genetic diversity, which does not scale as expected with census population size, a phenomenon known as Lewontin's paradox (Lewontin, 1974). This may be because current census size is not a good proxy for long-term effective population size, or due to mutation, selection and demographic fluctuations dampening genetic diversity (Charlesworth & Jensen, 2022; Ellegren & Galtier, 2016; Leffler et al., 2012).
Global-scale studies either investigate the biogeographic correlates of variation in cross-species average of genetic diversity between spatial grid cells or use species as analytical units to additionally understand the life history correlates of variation in genetic diversity. Analyses at the level of spatial grid cells show that mitochondrial genetic diversity of well-studied taxa decreases with latitude, indicating a relationship between latitude and evolutionary rate or stability (Gratton et al., 2017; Manel et al., 2020; Miraldo et al., 2016). Species-level global comparisons of nuclear genetic diversity reveal taxon-specific drivers of genetic diversity in animals, influenced by life-history strategy, environment, range size and position of the population within the range extent (De Kort et al., 2021). Taxon-specific studies show that traits indicative of life-history strategy, such as fecundity (Romiguier et al., 2014), reproductive mode (Paz et al., 2015) and body size (Mackintosh et al., 2019) are better predictors of genome-wide genetic diversity for a species than census population size. This class of studies has also revealed that apart from life history, biogeographic variables related to range size and latitudinal position also influence mitochondrial genetic diversity (Fujisawa et al., 2015). In addition to species range size, the proportion of geographic area sampled to generate sequence data also influences estimates of genetic diversity. This is illustrated by a recent study, which found a consistent power law relationship between genomic diversity and geographic area in each of the 20 plant and animal species studied (Exposito-Alonso et al., 2022). Although species differ in the basal level of genetic diversity depending on their biological traits, range size and abundance, this mathematical relationship for a given species was hypothesized to be driven by genetic drift and natural selection. Both global-scale and taxon-specific studies have limited representation of arthropod groups, undersampling the richness of species traits, evolutionary history and ecosystems they offer. Additionally, arthropods vary widely in their genetic diversity, having some of the highest values of genetic diversity among animals (Leffler et al., 2012).
The variation in species traits among centipedes can potentially influence genetic diversity. Centipedes show a striking variation in body size (ranging from a few mm to up to 300 mm), which can influence genetic diversity by regulating local population abundance (White et al., 2007). Centipedes are predominantly sexually reproducing and show variation in their reproductive strategy, which can influence fecundity and long-term effective population size and thus genetic diversity (Ellegren & Galtier, 2016). While species from two orders (Scutigeromorpha and Lithobiomorpha) lay single eggs, others (Craterostigmomorpha, Scolopendromorpha and Geophilomorpha) brood multiple eggs and maternal care is also provided to hatchlings (Bonato & Minelli, 2002; Fernández et al., 2014). Another species trait that can influence genetic diversity through its association with habitat specialization or dispersal ability is blindness, seen in the order Geophilomorpha, in a few species of Lithobiomorpha, and in three families along with a few mostly subterranean species within Scolopendromorpha (Edgecombe et al., 2019; Vahtera et al., 2012).
There has been an increase in the representation of centipedes in publicly available sequence data in the last two decades, primarily arising from integrative taxonomic studies (Edgecombe & Giribet, 2019 and references therein) and regional barcoding efforts (e.g. Spelda et al., 2011; Wesener et al., 2015). Among other genetic markers, the mitochondrial cytochrome c oxidase subunit I gene (COI), which is widely used as a DNA barcode, is well represented across centipede species. The availability of global-scale publicly available sequence data for centipede species that vary with respect to their evolutionary age, species traits and biogeography motivated us to study their relationship with genetic diversity in a comparative framework. In this study, we specifically ask the following:

1. How is genetic diversity distributed across centipede species? We aimed to understand the range of genetic diversity seen in centipedes, an ancient soil arthropod clade with a 420 million-year evolutionary history, in the context of genetic diversity documented in other well-studied arthropod clades.

2. What are the species traits and biogeographic variables correlated with genetic diversity in centipedes? Based on theory, we expect body size, maternal care (associated with low lifetime fecundity) and blindness (associated with habitat specificity and dispersal) to be negatively related to genetic diversity. These species traits can reduce effective population size, leading to a reduction in genetic diversity. While the latitudinal range is thought to be correlated with population size and may be positively associated with genetic diversity, the mean latitudinal position is expected to show the inverse relationship (Figure 1).
A comprehensive global dataset including species traits, biogeographic correlates and mitochondrial sequences for 128 centipede species allowed us to estimate genetic diversity and examine its drivers. We observed a wide variation in genetic diversity across species, which was high compared with other arthropod classes. Both life-history traits (body size and maternal care) and biogeographic correlates were important in explaining the variation in mitochondrial COI genetic diversity. This highlights the role of ecological strategy and latitudinal correlates of environmental stability as possible drivers of genetic diversity across organisms, despite the differences in absolute values of genetic diversity between taxonomic groups.
2 Materials and Methods

Georeferenced DNA sequence data
We compiled an exhaustive database of published literature containing centipede DNA sequences using an opportunistic search. To the sequences arising from this database, we added Chilopoda sequences from a curated georeferenced sequence database (Pelletier & Carstens, 2018), which matched GBIF (https://www.gbif.org/) and GenBank (https://www.ncbi.nlm.nih.gov/genbank/) data. This resulted in a literature database of 64 published studies and 11 sequence datasets. We used this database to compile sequence data by extracting accession numbers for the mitochondrial COI marker across centipede species. In addition to accession numbers, we also compiled information on museum catalogue number, collection locality and geographic coordinates from source literature. We filtered this dataset to only retain those species that had at least three distinct sequence representatives (Appendix S1 in Supporting Information).
Among the species that were retained, missing geographic coordinates associated with accession numbers were obtained by querying voucher numbers against museum websites (Appendix S1 in Supporting Information). When this was not available, we used geocoding to obtain geographic coordinates from locality names using the geocode_OSM function from the package 'tmaptools' (Tennekes, 2018). Average geographic distance between sequence locations for each species was calculated with the haversine formula using the function geodist in the package 'geodist' (Padgham & Sumner, 2021) in R version 4.3.0 (R Core Team, 2023).
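The distance calculation can be illustrated with a minimal Python sketch. It is not the R code used in the study: the Earth radius, the function names and the example coordinates are assumptions made for illustration, and the average is taken over all unordered pairs of sampling points, which is one plausible reading of "average geographic distance between sequence locations".

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_pairwise_distance_km(coords):
    """Average haversine distance over all unordered pairs of (lat, lon) points."""
    pairs = list(combinations(coords, 2))
    if not pairs:
        return 0.0  # a single locality has no pairwise distance
    return sum(haversine_km(*p, *q) for p, q in pairs) / len(pairs)


# Hypothetical sampling localities (lat, lon) for one species
localities = [(12.97, 77.59), (13.08, 80.27), (8.52, 76.94)]
print(round(mean_pairwise_distance_km(localities), 1))
```

Whether repeated localities are counted once or once per sequence changes the average, so that choice should be stated when reporting the value.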
We additionally queried the phylogatR database (Pelletier et al., 2022) to obtain georeferenced sequence data for this group. The phylogatR database repurposes already existing data by using an automated pipeline to match GenBank accession numbers and BOLD (Barcode Of Life Database, http://www.boldsystems.org/index.php) entries with GBIF occurrences at the specimen level and curates these data to minimize errors. The phylogatR database for Chilopoda was accessed on 12 July 2022 using the Ohio Supercomputer Center, where it is hosted. We compared our dataset with phylogatR and integrated data between them to increase species representation.

We had information for 119 species through our literature search, while phylogatR represented 73 species, of which 58 species are common to both datasets. Of the 15 species that were found only in phylogatR and missing from our data, four species (Geophilus proximus, Lamyctes emarginatus, Pachymerium ferrugineum, Scutigera coleoptrata) were filtered out as they contain likely synanthropic introductions, and one species (Anopsobius neozelanicus) that does not represent a monophyletic group in a larger phylogeny (Edgecombe & Giribet, 2004) was removed. The remaining 10 species unique to the phylogatR database were combined with our larger dataset. A few accession numbers for Scolopendra subspinipes in our dataset were assigned to Scolopendra mutilans in the phylogatR dataset. Given the uncertainty associated with the assignment of S. subspinipes sequences to its two subspecies (S. subspinipes subspinipes and S. subspinipes mutilans), this species was removed from the dataset and further analysis. This resulted in a final dataset consisting of 128 species.

Species traits and biogeographic information
Each species was supplemented with trait data from various sources. While the presence of maternal care and vision show variation at higher taxonomic levels (Edgecombe & Giribet, 2007), body size information for each species was obtained largely from species descriptions in taxonomic studies (Appendix S2 in Supporting Information). Species distribution information was collated from locations corresponding to species accession numbers, Chilobase 2.0 (Bonato et al., 2016), GBIF (GBIF.org, 2021), species descriptions and regional atlases. These distribution data were used to derive the latitudinal range for each species (Appendix S2 in Supporting Information). The mean latitudinal position of each species was calculated using only the geographic locations corresponding to the sequence dataset. We also analysed another version of the dataset including instances of synanthropic introductions, which led to the inclusion of additional sequences to existing species and the addition of six species (Geophilus proximus, Lamyctes africanus, Lamyctes coeculus, Lamyctes emarginatus, Pachymerium ferrugineum and Scutigera coleoptrata). The primary analysis described below was carried out using the smaller dataset representing only the native range of centipede species.

Sequence statistics
Mitochondrial COI sequences corresponding to the accession numbers were retrieved from the National Center for Biotechnology Information (NCBI) using the entrez_fetch function in the package 'rentrez' (Winter, 2017). For each species, sequence alignments were carried out separately using the MUSCLE algorithm in the package 'muscle' (Edgar, 2004) under the default parameters. The sequence alignment for each species was visualized in Aliview v1.26 (Larsson, 2014) and sequences were trimmed to bring them to the same length.
These edited alignments were used to calculate sequence statistics including sequence length, number of segregating sites (function seg.sites in the package 'ape'; Paradis & Schliep, 2019), number of parsimony informative sites (function pis in the package 'ips'; Heibl, 2008) and nucleotide diversity (function nuc.div in the package 'pegas'; Paradis, 2010). Nucleotide diversity is calculated as the per site average number of differences between a pair of sequences, which is the sum of the number of differences between sequence pairs divided by the total number of sequence pairs compared. All analyses were carried out in R 4.3.0 (R Core Team, 2023).
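The definition of nucleotide diversity given above can be written out as a short Python sketch. This is an illustration of the formula only, not the 'pegas' implementation: the function name and toy alignment are hypothetical, and gaps or ambiguous bases are compared as ordinary characters, which library implementations may handle differently.

```python
from itertools import combinations


def nucleotide_diversity(alignment):
    """Per-site average number of differences over all sequence pairs.

    `alignment` is a list of equal-length strings (aligned and trimmed).
    """
    if len(alignment) < 2:
        raise ValueError("need at least two sequences")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must have equal length")
    pairs = list(combinations(alignment, 2))
    # Count mismatching sites for every unordered pair of sequences
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    # Average over pairs, then scale by alignment length to get a per-site value
    return total_diffs / (len(pairs) * length)


# Toy alignment (made-up sequences, not from the study)
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
pi = nucleotide_diversity(toy)
print(pi)
```

In the toy alignment the three pairs differ at 1, 2 and 1 sites, so the sum of differences is 4 over 3 pairs and 10 sites.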

Statistical analysis
Genetic diversity is a proportion that estimates the probability of observing a mutation at a given site within a DNA sequence and can theoretically range from 0 to 1. However, intra-specific genetic diversity ranges closer to 0, as it is calculated from closely related individuals belonging to a single species. Our estimate of genetic diversity, average pairwise difference, is calculated by counting the number of mutations along a sequence that is hundreds of base pairs long. Given that genetic diversity is a proportion calculated using a large number of total counts (sequence length), it resembles continuous proportions, which can be analysed using a beta regression framework (Douma & Weedon, 2019).
The error distribution of our regression model, the beta distribution, belongs to the exponential family and is defined by two parameters: mean and precision. In the deterministic part of our model, our response variable of genetic diversity is predicted by species traits (body size, continuous; blindness and maternal care, binary) and biogeography (latitudinal range, mean latitude and geographic distance, all continuous). Since some of our genetic diversity estimates took zero values, which cannot be modelled using the beta regression algorithm, we replaced these with a small value following standard recommendations (Smithson & Verkuilen, 2006). All the predictor variables were standardized to have a mean of zero and a standard deviation of 1. We used a logit-link function for the linear transformation of our exponential-family model. Given that the sample size of sequence representatives can influence the precision of the genetic diversity estimate for a species, we modelled the precision parameter of the error distribution using sample size as a covariate. We expected larger sample sizes to provide more precise estimates of genetic diversity compared with smaller numbers of sequences.
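The components of this model can be sketched in Python to make the parameterization concrete. The sketch is illustrative and is not the fitted 'glmmTMB' model: the replacement value for zeros (eps), the log link for the precision sub-model and all function names are assumptions, since the text does not report them. It shows the zero replacement, predictor standardization, the logit link for the mean, and the beta log-density written in mean and precision form.

```python
import math
import statistics


def replace_zeros(values, eps=1e-4):
    """Replace exact zeros with a small positive value (eps is hypothetical)."""
    return [eps if v == 0 else v for v in values]


def standardize(values):
    """Centre to mean 0 and scale to a sample standard deviation of 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def inv_logit(eta):
    """Map a linear predictor on the logit scale to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def beta_logpdf(y, mu, phi):
    """Log-density of the beta distribution with mean mu and precision phi."""
    a, b = mu * phi, (1.0 - mu) * phi  # conventional shape parameters
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))


def log_likelihood(y, X, beta, n_seq, gamma):
    """Sum of beta log-densities.

    Mean: logit(mu_i) = beta[0] + sum_j beta[j + 1] * X[i][j]
    Precision (assumed log link): log(phi_i) = gamma[0] + gamma[1] * n_seq[i]
    """
    total = 0.0
    for yi, xi, ni in zip(y, X, n_seq):
        mu = inv_logit(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))
        phi = math.exp(gamma[0] + gamma[1] * ni)
        total += beta_logpdf(yi, mu, phi)
    return total


# Made-up data for three species: diversity, body size (mm), sequences per species
diversity = replace_zeros([0.0, 0.05, 0.12])
body_size = standardize([120.0, 40.0, 15.0])
X = [[b] for b in body_size]
print(log_likelihood(diversity, X, beta=[-2.5, -0.3], n_seq=[3, 8, 20], gamma=[2.0, 0.05]))
```

A fitting routine would maximize this log-likelihood over beta and gamma; a positive gamma[1] corresponds to the expectation that more sequences give more precise estimates.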
Our global model consisted of all the predictors mentioned above and taxonomic family as a random effect to account for the influence of shared evolutionary history on genetic diversity. Nested models created by dropping the precision parameter and/or the random effect were compared using their AIC values to choose the best model. We calculated a variance inflation factor for each predictor to check the influence of multi-collinearity between these variables on the coefficient estimates. We checked for the presence of spatial autocorrelation in model residuals by calculating Moran's I, which indicates the non-independence of observations that can lead to false positive errors (Dormann et al., 2007; Gaspard et al., 2019). To account for the presence of residual spatial autocorrelation, we used spatial eigenvectors as additional predictors in the model following the recommendations of Bauman, Drouet, Dray, et al. (2018) and Bauman, Drouet, Fortin, et al. (2018; details in Appendix S5 of Supporting Information). We measured the phylogenetic signal in model residuals using a family-level phylogenetic tree (Fernández et al., 2014) by calculating Pagel's λ (Pagel, 1999).
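To show what the Moran's I statistic measures, the standard formula is sketched below in Python. This is not the 'ape' implementation and it omits the significance test; the weighting scheme (inverse Euclidean distance), the function names and the example residuals are assumptions chosen for illustration.

```python
import math


def morans_i(values, weights):
    """Moran's I for `values` given a square matrix of spatial `weights`.

    weights[i][j] is the weight between observations i and j (diagonal 0).
    Values near 0 suggest no spatial autocorrelation; positive values mean
    that nearby observations tend to have similar values.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


def inverse_distance_weights(points):
    """Weights as 1 / distance between 2-D points (Euclidean, for illustration)."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = 1.0 / math.dist(points[i], points[j])
    return w


# Hypothetical residuals at four sites along a transect; neighbours have weight 1
adjacency = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
i_gradient = morans_i([1.0, 2.0, 3.0, 4.0], adjacency)      # smooth gradient
i_alternating = morans_i([1.0, -1.0, 1.0, -1.0], adjacency)  # checkerboard
print(i_gradient, i_alternating)
```

For geographic coordinates, the Euclidean distance in inverse_distance_weights would normally be replaced by a great-circle distance.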
To test the influence of sample size (number of sequences per species) on our inferences, we performed sensitivity analysis following Barrow et al. (2021). We sampled sequences with replacement and calculated genetic diversity for each of the 100 replicates for a given species. This was done for sample sizes ranging from 2 to 10 sequences. Based on plots of variance in genetic diversity against sample size, we used a four-sequence cut-off to redo the beta regression model using median genetic diversity across replicates (Appendix S5 in Supporting Information).
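The resampling procedure can be sketched in Python as follows. It is a minimal illustration under stated assumptions rather than the code used in the study: the function names, the fixed random seed and the toy alignment are hypothetical, and genetic diversity is computed as the per-site average pairwise difference.

```python
import random
import statistics
from itertools import combinations


def nucleotide_diversity(alignment):
    """Per-site average number of differences over all sequence pairs."""
    pairs = list(combinations(alignment, 2))
    length = len(alignment[0])
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def resample_diversity(alignment, sample_size, n_reps=100, seed=0):
    """Genetic diversity for n_reps samples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        nucleotide_diversity(rng.choices(alignment, k=sample_size))
        for _ in range(n_reps)
    ]


def sensitivity_table(alignment, sizes=range(2, 11), n_reps=100, seed=0):
    """Median and variance of resampled diversity for each sample size."""
    table = {}
    for k in sizes:
        reps = resample_diversity(alignment, k, n_reps, seed)
        table[k] = (statistics.median(reps), statistics.variance(reps))
    return table


# Toy alignment (made-up sequences, not from the study)
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT", "ACGAACGTAC", "TCGAACGTAC"]
for size, (median, variance) in sensitivity_table(toy).items():
    print(size, round(median, 4), round(variance, 6))
```

Plotting the variance column against sample size gives the kind of curve from which a sequence cut-off can be chosen.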
The beta regression models were run using the glmmTMB function in the package 'glmmTMB' (Brooks et al., 2017) and the betareg function in the package 'betareg' (Cribari-Neto & Zeileis, 2010), and phylogenetic signal in model residuals was calculated using the phylosig function in the 'phytools' package (Revell, 2012). Moran's I was calculated using the function Moran.I in the 'ape' package (Paradis & Schliep, 2019) and the selection of spatial eigenvectors was optimized using functions from the 'spdep' package (Bivand & Wong, 2018) in R 4.3.0 (R Core Team, 2023).

Geographic and taxonomic distribution of data
The complete data representing 50 published studies and 11 sequence datasets were supplemented with data from phylogatR. This georeferenced sequence dataset along with Overall, the sequences in the dataset were obtained from 774 unique geographic locations spanning more than 100 degrees in latitude (46.9° S to 60.5° N), with centipede orders showing distinct patterns of geographic distribution (Figure 2). For the two most well-represented orders, Scolopendromorpha sequences largely originated from tropical and sub-tropical regions (mean latitude = 18.73° N), while Lithobiomorpha sequences were predominantly from northern temperate regions (mean latitude = 45.62° N). There was a larger number of sequences from the northern (n = 1072) compared with the southern hemisphere (n = 173), with longitudinal under-representation from the Americas and Africa (Figure 2). These geographic gaps may arise from an interaction of differences in patterns of species distribution along with sequencing effort and taxon sampling.

Genetic diversity in centipedes compared with other arthropods
The average genetic diversity for centipedes was 0.0721 (range = 0 to 0.1713), with its distribution falling towards the higher end of values compared with other arthropod groups (Figure 3). The average values of genetic diversity for other arthropod classes ranged from 0.0098 in insects to 0.0445 in millipedes, the latter belonging to the same sub-phylum as centipedes, Myriapoda. Although not an exhaustive effort, our collation of genetic diversity values from other arthropod groups showed evidence for an increased representation in insects in comparison with other taxonomic classes (Figure 3; Appendix S4 in Supporting Information).

Variation in genetic diversity is related to species traits and mean latitude
Among the four models compared, the one using predictors for the fixed effects along with an independent predictor for precision emerged as the best model given its lowest AIC score (Table 1). The variance inflation factors associated with the predictor variables were lower than 5, indicating that there was no significant influence of predictor multi-collinearity on coefficient estimates. There was weak but significant spatial autocorrelation in model residuals (Moran's I = 0.0726, p = 0.0083), which was accounted for by using two spatial eigenvectors as additional predictors in the beta regression. The resulting model explained 27.57% of the variation in genetic diversity across centipede species. Two species traits, body size and maternal care, significantly contributed to explaining this variation, while average geographic distance and the mean latitude of sequences were the biogeographic variables that emerged as significant (Table 1). The phylogenetic signal in the model residuals was close to zero and not significant (λ = 6.61 × 10⁻⁵, p = 1), indicating that the residual variation in genetic diversity could not be explained by a Brownian motion model of trait evolution at the family level.
Genetic diversity showed a negative relationship with body size and mean latitude (Figure 4), where smaller species or those with sequences from lower latitudes had greater values of genetic diversity (Figure 5). The coefficient estimate for body size was significant, even though it had a wide confidence interval. Species with maternal care had higher values of genetic diversity, although there was substantial variation in genetic diversity within each reproductive strategy (Figures 4 and 5). Genetic diversity increased with a greater average geographic distance between sequences. Confidence intervals of coefficients for vision and species latitudinal range overlapped with zero, indicating that they were relatively less important in explaining the variation in genetic diversity between centipede species (Figure 4).
In the analysis carried out using the dataset including likely synanthropic introductions, body size, maternal care and latitude remained significant predictors of genetic diversity. Additionally, vision showed a significant positive relationship with genetic diversity with a wide confidence interval. Values of average geographic distance showed a large variation between the two datasets and it was not a significant predictor when likely introductions were retained (Appendix S5 in Supporting Information).
The analysis using a cut-off of four sequences per species (91 of the 128 species retained) yielded broadly similar results. The relationships of latitude and body size with genetic diversity remained the same, while the bootstrapped confidence interval of maternal care was wider and overlapped with zero despite the estimated regression coefficient being significant. The average geographic distance between sequences was no longer a significant predictor. The latitudinal range of a species emerged as a significant predictor of genetic diversity in this smaller dataset but had a wide bootstrapped confidence interval overlapping with zero.

Centipedes have relatively high genetic diversity among arthropods
We find that centipedes have a high genetic diversity in comparison with other arthropod groups, which themselves fall at the higher end of the spectrum compared with plants and chordates (Leffler et al., 2012). Among arthropods, where observations are skewed towards insects, high genetic diversity is hypothesized to be driven by their ability to reach large population sizes (Leffler et al., 2012). However, this mechanism, of a large effective population size holding greater genetic diversity, may not hold true for centipedes, which are predators that occur in low population densities in the soil ecosystem. The observed range of genetic diversity in centipedes may be explained by their persistence over a prolonged evolutionary history ('evolutionary framework' in Lawrence & Fraser, 2020) that extends back at least 420 million years (Edgecombe & Giribet, 2019). The limited dispersal ability of centipedes can also contribute to strong spatial differences in genetic composition reported in soil arthropod communities (Arribas et al., 2021) and the presence of geographically unique genetic diversity (Gloss et al., 2016). The positive relationship between geographic distance and genetic diversity seen in centipedes supports such distance decay in genetic similarity. Additionally, the presence of cryptic diversity may contribute to the relatively high values of genetic diversity observed for some species, which needs to be further examined with species delimitation methods using genome-level data.

Species traits are significant correlates of genetic diversity
Despite the differences in the absolute values of genetic diversity across taxonomic groups, we find an overlap in traits and geographic factors that are correlated with genetic diversity.
The genetic diversity of centipedes decreases with increasing body size, a relationship that has been observed across several animal groups (Brüniche-Olsen et al., 2021; De Kort et al., 2021; Mackintosh et al., 2019; Romiguier et al., 2014; an exception being Barrow et al., 2021). This association could be driven by the negative relationship between body size and abundance due to resource constraints (White et al., 2007), and by body size representing an ecological strategy that determines long-term effective population size (Ellegren & Galtier, 2016). Species with small body size, high fecundity and a short lifespan are hypothesized to recover from bottlenecks driven by environmental fluctuations more easily, therefore maintaining a larger long-term effective population size and greater genetic diversity (Ellegren & Galtier, 2016; Romiguier et al., 2014).
We find that centipede species showing maternal care of offspring had higher values of genetic diversity compared with those that abandon their eggs. This questions our assumption of maternal care translating to greater investment in offspring quality over quantity, and therefore, lower lifetime fecundity and genetic diversity. There is a dearth of information on breeding biology from orders lacking maternal care, with substantial variation in the number of eggs reported for a few species (Lewis, 1981), and very little information on their survivorship. Gathering more natural history information would clarify the relationship between maternal care and lifetime fecundity in centipedes, and the observed positive relationship with genetic diversity.
The effect of blindness on genetic diversity may be mediated through its association with specialization to a subterranean habitat and/or low dispersal ability. We find that vision only emerges as a significant positive correlate of genetic diversity when synanthropic introductions are included, indicating sensitivity to changes in input data. While the observed pattern aligns with a negative association between specialization and genetic diversity seen in amphibians (De Kort et al., 2021), parasitoid wasps (Bunnefeld et al., 2018) and bumblebees (Jackson et al., 2018), other studies show no relationship in butterflies (Mackintosh et al., 2019), forest carabid beetles (Brouat et al., 2004) and bees (Dellicour et al., 2015). A more balanced representation of species with and without vision (e.g. more Geophilomorpha, all of which are blind) using an expanded dataset could help resolve this relationship in centipedes.

Latitudinal gradient and hemispheric differences in genetic diversity
Apart from species traits, several recent studies document a decline in genetic diversity with increasing latitude (Species-level studies: beetles - Fujisawa et al., 2015; salamanders - Barrow et al., 2021; amphibians and molluscs - De Kort et al., 2021. Grid-level studies: amphibians - Gratton et al., 2017; Miraldo et al., 2016; mammals - Millette et al., 2020; Theodoridis et al., 2020), mirroring the latitudinal gradient in species diversity (Mittelbach et al., 2007). The mechanisms shaping latitudinal patterns in genetic diversity are thought to be congruent with those driving species diversity, related to climatic stability, longer evolutionary history, larger area with higher productivity and higher temperature resulting in high rates of molecular evolution at low latitudes (Fine, 2015). In centipedes, we find that genetic diversity increases from the northern hemisphere towards the tropics and the southern hemisphere. The hemispheric differences in the genetic diversity of centipedes are indicative of similar patterns in species diversity (Dunn et al., 2009), which may be driven by differences in the current range of environmental variables and historic climatic stability (Chown et al., 2004). However, the wide confidence intervals for the southern hemisphere and limited representation of data points do not allow us to comment on a trend in comparison with the tropics.
Other arthropod groups have been reported to show deviations from the commonly observed latitudinal gradient in genetic diversity. Evenness in insect genetic diversity across spatial grid cells shows a quadratic relationship with latitude, where it peaks in the arid subtropics and is lower at the equator and the poles. This has been hypothesized to be a result of the interaction of Rapoport's rule, which predicts increasing range size at higher latitudes, with the positive correlation between range size and genetic diversity. Glaciation at higher latitudes during the Last Glacial Maximum and associated demographic changes may lead to the observed drop in genetic diversity at the poles (French et al., 2022).

What can intra-specific genetic diversity tell us about species diversity?
Variation in genetic diversity can be indicative of broader patterns in species diversity, either through the same underlying mechanisms acting independently or because of a cause-and-effect relationship between the two. As mentioned earlier, area, time, environmental factors and climatic stability can influence intra-specific and species diversity in parallel. Genetic diversity can positively influence species diversity if it reflects population fitness and reduces extinction rates or increases the diversity of competing species. High species diversity can negatively influence genetic diversity if species packing leads to niche specialization and if limiting resources result in smaller population sizes per species (Vellend & Geber, 2005).
In an empirical evaluation, neutral mechanisms involving area and isolation were found to be associated with both species and genetic diversity in beetles within an island system, shaping community and haplotype similarity along with dispersal ability (Papadopoulou et al., 2011). A strong relationship between genetic diversity and phylogenetic diversity was also observed in a global study of mammals, where it was speculated that microevolution at the population level may drive patterns in species diversity through various mechanisms (Theodoridis et al., 2020). However, the relationship between species and genetic diversity may be decoupled due to biological differences between taxa, lack of correlation between range size and genetic diversity (as opposed to a strong relationship between range size and species diversity) and sampling biases at the population level (Lawrence & Fraser, 2020). It remains to be seen if these two hierarchical levels of biodiversity are correlated in centipedes, and if there is a causal link between genetic diversity as an emergent species trait and diversification rates among various centipede groups.

Significant variation in genetic diversity using a mitochondrial marker
As explained above, we find substantial variation in mitochondrial genetic diversity in centipedes, which is correlated with species traits, geographic distance and latitudinal distribution. This is in contrast with some previous studies, which find very limited variation in mitochondrial compared with nuclear estimates (Bazin et al., 2006; Mackintosh et al., 2019) and no correlation with species life-history traits (Dapporto et al., 2019). Estimates of genetic diversity can vary based on the properties of the genetic marker: mode of inheritance, ploidy (Berlin et al., 2007), length of the genetic map (Mackintosh et al., 2019) and mutation rate variation among taxa (Nabholz et al., 2009). The limited variation in mitochondrial genetic diversity and its lack of correlation with effective population size are ascribed to repeated selective sweeps and loss of diversity through genetic draft, given its maternal inheritance and smaller genome (Gillespie, 2001). For these reasons, the use of mitochondrial markers has been criticized despite the wide availability of sequences arising from barcoding efforts (Paz-Vinas et al., 2021).
In this context, it is interesting that we find significant variation in diversity estimates across centipede species, which is associated with species traits. This variation could be due to the smaller population sizes of predatory arthropods, which can dampen the frequency of selective sweeps as beneficial mutations are lost to genetic drift (Piganeau & Eyre-Walker, 2009). The strength of selection also depends on the nature and spatial structure of genetic variation, which shapes genetic diversity (Leffler et al., 2012).
While the predictors in this study explain just over a quarter of the variation in genetic diversity, the strength of the observed correlation is congruent with other studies at a similar scale (Leigh et al., 2021). The unexplained variation in genetic diversity could be due to spatial and temporal variation in drivers and population histories that cancel out at a broad spatial and taxonomic scale, the potential importance of environmental variables that are absent from the analysis, or the choice of the genetic marker as detailed above.

Taxonomic and geographic gaps in sequencing efforts
Apart from revealing potential drivers of variation in genetic diversity, our dataset revealed taxonomic and distributional gaps in sequencing effort. There is a dearth of sufficient sequence information from the Americas and Africa, leading to a longitudinal bias in the available data (Figure 2). This also adds to a latitudinal gap in sampling, as most sequence data in the southern hemisphere are from Australia and New Zealand (Figure 2). There is also a sampling bias in the Palearctic, where most available sequences are from Europe (Figure 2).
The existing sequence information used in our analysis represents about 4% of existing species diversity and 13 of the 18 centipede families. Among the five centipede orders, the sampling gap in terms of species and family representation is the starkest for Geophilomorpha, where we have sampled 17 of over 1300 species (Appendix S3 in Supporting Information; Edgecombe & Giribet, 2007). This group is unique in terms of its habitat, being obligate soil-dwellers, as well as its feeding behaviour. Geophilomorphs feed using a greater degree of liquid suction than other centipedes and use their mandibles to sweep or rasp food instead of chewing, which may potentially influence their prey resource base and population dynamics (Koch & Edgecombe, 2012; Lewis, 1981). These geographic and taxonomic gaps can be the focus of future sampling efforts to reassess the current results and would also contribute to centipede phylogenetics and biogeography.

Significance of examining intraspecific genetic diversity among divergent taxa
Our study generates hypotheses of drivers of genetic diversity in a relatively under-studied taxonomic group with a deep evolutionary history. These can be tested for their generality by using controlled comparisons of species with contrasting traits and distribution patterns and by screening additional nuclear markers. The generation of such hypotheses and efforts to test their validity provide means of understanding the generality of macroecological patterns across under-studied taxonomic groups (Beck & McCain, 2020) from unique and poorly explored habitats showing high biotic and abiotic variability (Thakur et al., 2020). While large-scale biogeographic studies in centipedes can be challenging due to a 'species identification bottleneck' reported in other arthropods (French et al., 2022), our study can act as a stepping-stone for future work. It also generates hypotheses for landscape-level studies exploring environmental and historical drivers of genetic diversity, and its relationship with population genetic structure (Salinas-Ivanenko & Múrria, 2021) and phylogenetic diversity (Bharti et al., 2021).

Supplementary Material
Refer to Web version on PubMed Central for supplementary material.

Schematic figure representing the theoretical drivers of intra-specific genetic diversity. Species traits and biogeography associated with species can influence their effective population size, which has a positive relationship with neutral genetic diversity. Variables with a negative influence on effective population size are highlighted in red and those with a positive influence in blue.

Geographic locations of centipede species associated with at least three mitochondrial COI sequences. The grey circles are GBIF occurrences for centipedes across the globe. See Figure S3.1 in Appendix S3, Supporting Information for the distribution of sequence coordinates including introductions.

Parameter estimates (standardized and in the logit scale) from the best-performing beta regression model using additional spatial eigenvectors as predictors to account for small residual spatial autocorrelation. The model is defined as: Genetic diversity_i ~ Beta(μ_i, ϕ …

… information on species traits and biogeography consisted of 1245 mitochondrial COI sequences representing 128 unique species, 13 of 18 centipede families and all five orders of Chilopoda. The species in our dataset varied in body size by two orders of magnitude (mean = 48 mm, range = 8.5-250 mm), with a relatively balanced distribution of reproductive strategy (83 of 128 species showing maternal care) and a predominance of species with the presence of vision (98 of 128 species). On average, each species was represented by around 10 unique sequences (range = 3-68), with a mean alignment length of 648 bp (range = 465-840 bp). The centipede orders varied in the number of species and the total number of sequences representing them. These sequences arose from an average of seven unique geographic locations for each species (range = 1-53), separated by geographic distances up to 5066 km (Appendix S3 in Supporting Information).

Figure 3. Distribution of mitochondrial COI genetic diversity (average pairwise differences) across arthropod groups. The values arise from the smoothed density of counts, scaled to a maximum of 1 for each taxonomic class. In the legend, the numbers in parentheses are the number of individual data points used for a given taxonomic class. Data and their source references are provided in Appendix S4 in Supporting Information.
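As an illustration of the diversity measure named in this caption, the following is a minimal sketch of an average pairwise difference calculation for aligned sequences of one species. It assumes an uncorrected proportion of differing sites and skips gaps and ambiguous bases; the function names and the handling of missing data are hypothetical and are not taken from the study's own pipeline.

```python
from itertools import combinations


def pairwise_difference(seq_a, seq_b, missing="-N?"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped
    (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differing / compared


def average_pairwise_difference(sequences):
    """Mean of pairwise_difference over all unordered pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)
```

For a species with identical sequences the function returns 0, which matches the lower bound of the range of values reported for centipedes.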

Figure 4. Standardized coefficient estimates (logit scale) from the beta regression model with the lowest AIC value, specified as: Genetic Diversity ~ Body size + Vision + Maternal care + Mean latitude + Latitudinal range + Geographic distance + MEM13 + MEM40 | Number of sequences. The spatial eigenvectors ('MEM' in the predictors) were obtained from a spatial weighting matrix, which was a product of a connectivity matrix derived from a Relative Neighbourhood graph of coordinates (centroid of sequence locations for a species), and a binary weighting matrix. Mean coefficient estimates are represented as points and 95% confidence intervals obtained from 1000 bootstrapped replicates are displayed as error bars for each predictor variable. Positive values indicate a positive relationship between the corresponding predictor variable and genetic diversity, and negative values a negative relationship.
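The model in this caption can be written out under the usual mean-precision parameterization of beta regression. The sketch below assumes that the term after the vertical bar specifies a submodel for the precision parameter, as in common beta regression software, and that this submodel uses a log link. Both points, along with the symbols β, γ and the abbreviated predictor names, are assumptions made for illustration and are not confirmed by the text.

```latex
\begin{align}
y_i &\sim \mathrm{Beta}\bigl(\mu_i \phi_i,\; (1-\mu_i)\,\phi_i\bigr), \qquad 0 < y_i < 1,\\
\operatorname{logit}(\mu_i) &= \beta_0
  + \beta_1\,\text{BodySize}_i
  + \beta_2\,\text{Vision}_i
  + \beta_3\,\text{MaternalCare}_i
  + \beta_4\,\text{MeanLatitude}_i \notag\\
  &\quad + \beta_5\,\text{LatitudinalRange}_i
  + \beta_6\,\text{GeographicDistance}_i
  + \beta_7\,\text{MEM13}_i
  + \beta_8\,\text{MEM40}_i,\\
\log(\phi_i) &= \gamma_0 + \gamma_1\,\text{NumberOfSequences}_i,\\
\operatorname{Var}(y_i) &= \frac{\mu_i\,(1-\mu_i)}{1+\phi_i}.
\end{align}
```

Here y_i is the genetic diversity of species i and μ_i its expected value. Under this reading, a larger φ_i means less variance around μ_i, so letting φ_i depend on the number of sequences allows the scatter of diversity estimates to change with sampling effort.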

Figure 5. Fitted relationships between the significant explanatory variables and genetic diversity (measured as average pairwise difference) from the beta regression model, where genetic diversity is in the scale of observed values. The 95% confidence intervals are represented by the shaded band around the fitted line for continuous variables and error bars for the categorical variables. The effect of each predictor variable is calculated by varying it across the observed range, while keeping other predictor variables at their mean values. Image credits: Lithobius microps - Donald Hobern, distributed under a CC-BY 2.0 license; Scolopendra morsitans - Umesh Pavukandy; Lithobius sp. and Cormocephalus hartmeyeri - Gonzalo Giribet.
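The procedure described in this caption, varying one predictor while holding the others at their means and returning to the observed scale, can be sketched as follows. This is a simplified illustration assuming a logit link for the mean. The function names, predictor names and coefficient values in the usage example are hypothetical and are not the fitted estimates from the study.

```python
import math


def inverse_logit(eta):
    """Map a linear predictor on the logit scale back to the (0, 1) scale."""
    return 1.0 / (1.0 + math.exp(-eta))


def fitted_curve(intercept, coefficients, predictor_means, focal, focal_values):
    """Predicted mean response as one predictor varies.

    coefficients and predictor_means are dicts keyed by predictor name.
    All predictors other than `focal` are held at their mean values.
    """
    curve = []
    for value in focal_values:
        eta = intercept
        for name, beta in coefficients.items():
            x = value if name == focal else predictor_means[name]
            eta += beta * x
        curve.append(inverse_logit(eta))
    return curve


# Hypothetical standardized predictors (mean 0) and made-up coefficients.
example_coefficients = {"body_size": -0.5, "mean_latitude": -0.3}
example_means = {"body_size": 0.0, "mean_latitude": 0.0}
example_curve = fitted_curve(
    intercept=-3.0,
    coefficients=example_coefficients,
    predictor_means=example_means,
    focal="body_size",
    focal_values=[-1.0, 0.0, 1.0],
)
```

With a negative coefficient for the focal predictor, the predicted values decrease along the curve, which is the shape a negative body-size effect would take on the observed scale.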