An analytical framework for estimating aquatic species density from environmental DNA

Abstract Environmental DNA (eDNA) analysis of water samples is on the brink of becoming a standard monitoring method for aquatic species. This method has improved detection rates over conventional survey methods and thus has demonstrated effectiveness for estimation of site occupancy and species distribution. The frontier of eDNA applications, however, is to infer species density. Building upon previous studies, we present and assess a modeling approach that aims at inferring animal density from eDNA. The modeling combines eDNA and animal count data from a subset of sites to estimate species density (and associated uncertainties) at other sites where only eDNA data are available. As a proof of concept, we first perform a cross‐validation study using experimental data on carp in mesocosms. In these data, fish densities are known without error, which allows us to test the performance of the method with known data. We then evaluate the model using field data from a study on a stream salamander species to assess the potential of this method to work in natural settings, where density can never be known with absolute certainty. Two alternative distributions (Normal and Negative Binomial) to model variability in eDNA concentration data are assessed. Assessment based on the proof of concept data (carp) revealed that the Negative Binomial model provided much more accurate estimates than the model based on a Normal distribution, likely because eDNA data tend to be overdispersed. Greater imprecision was found when we applied the method to the field data, but the Negative Binomial model still provided useful density estimates. We call for further model development in this direction, as well as further research targeted at sampling design optimization. It will be important to assess these approaches on a broad range of study systems.

Making the transition to estimation of animal density from eDNA concentration data requires (1) the development of adequate statistical models that account for issues inherent to eDNA studies (e.g., overdispersion) and (2) empirical validation of these models. These are the two novel elements that we provide in this article, which represents an important first step in this direction. First, we present a statistical modeling approach for the estimation of animal density for a number of sites from two sources of data: (1) "eDNA-only data": eDNA quantitative data, without any other type of information about animal density, available for all sampled sites; and (2) "dual data": eDNA quantitative data, associated with reliable animal density estimates or metrics, available from a subset of sites. Our approach uses the information on the relationship between eDNA concentration and animal density, contained in the dual data, to infer animal density for all other sites, from the eDNA-only data. Instead of relying on the post hoc extrapolation of a linear regression linking eDNA and density data, we provide a likelihood-based method that combines all data in a one-step analysis and directly provides estimates of animal density and the associated uncertainty (i.e., SE and 95% C.I.). After introducing the method, we assess its performance using experimental data from a controlled study on common carp (Cyprinus carpio). In this proof of concept study, the model was implemented on eDNA data collected in mesocosms where the number of carps was perfectly known, thus providing an ideal situation to evaluate the method's accuracy under controlled conditions. Finally, we implemented the approach on field data from a study on Idaho giant salamanders (Dicamptodon aterrimus) to assess its potential to work in natural settings.

| MODEL DESCRIPTION
For the model described here, we consider that K i eDNA water samples, obtained from either spatial or temporal replicates, are collected at each one of a total of I sites for which we want to infer animal density. The eDNA data are denoted w ik for each site i = {1, … I} and each sampling replicate k = {1, … K i }. Data w ik must be a quantitative metric of eDNA concentration (e.g., ng/l or DNA copies per ml).
Ideally, the K i samples for a site i are extracted from spatially replicated water samples pooled together and homogenized. Water replicates would typically be collected at selected locations inside the sampling unit (site; e.g., pond, wetland, section of river), at a single time, thus ensuring population closure. For diffusion-limited lentic waters where eDNA concentrations are likely to reflect space use of the target species (e.g., Eichmiller et al., 2014), the sampling unit should be carefully identified and sampling replicates collected identically over the area of inference. Such a "snapshot" sampling design is probably more reliable than using temporally separated replicates, given that temporal variability in environmental conditions can affect eDNA. Moreover, if collection of eDNA replicates is spread across a relatively long period of time (e.g., several weeks), we are more likely to violate the assumption of closure of the target population, as birth, death, immigration, or emigration might occur between sampling occasions. In such a case, the population density we are trying to estimate would not be meaningful. In addition to the collection of eDNA, "traditional" surveys (e.g., repeated visual counts, trapping) must be conducted at a subset J < I sites to provide accurate and reliable estimates of animal density D j for each site j = {1, …J}. We will refer to (1) the (I-J) sites where only eDNA are available (i.e., uninformed animal density) as eDNA-only sites and (2) the subset of J sites, randomly selected among the I sites, where both eDNA and animal density data are available, as dual data sites. The dual data sites inform the relationship between eDNA concentration and animal density and thus allow inference about animal density D j at the eDNA-only sites. Our results (below) suggest that very few dual data sites (e.g., J = 3-5) are necessary for the model to work properly and provide accurate estimates when the relationship between eDNA concentration and animal density is constant across sampled sites (i.e., over space and time).
The goal is to infer unknown animal density D j for all sites i ≠ j from the data w ik and D j . This can be achieved by modeling, for each site i, the probabilistic distribution of eDNA concentration w ik as a function f (D i ) of local animal density D j , which is known (or at least estimated) for any site j but totally unknown for any site i ≠ j. This model can be written in a very general fashion as: where Distr simply denotes any probabilistic distribution and θ rep- Here, we consider two different distributions to model the data w ik : (1) a Normal distribution that assumes that deviations of realized values w ik from E (w ik ) are normally distributed; and (2) a Negative Binomial distribution that allows for larger dispersion in the realized values w ik . To further investigate dispersion issues, we also implemented a model based on the Poisson distribution, which is a special case of the Negative Binomial distribution (see Appendix S1).
The Normal model can be written as follows: where μ i = β 0 × D i is the expected value of w ik at site i and σ 2 represents across-replicates variance in values w ik . The corresponding likelihood formulation is as follows: The Negative Binomial model is written as follow: where μ i = β 0 × D i is the expected value of w ik at site i and r is the dispersion parameter, sometimes referred to as the "target number of successful trials" in the traditional description of the negative binomial distribution. Parameter r quantifies across-replicates variability and overdispersion in values w ik . The likelihood formulation for this model is as follows: where, p i = r μ i +r , which also referred to as the probability of trial success of the negative binomial process. For the Negative Binomial model, data inputs must be integers. Ideally, eDNA concentration is quantified as a number of DNA copies, which does not require any transformation, as in our carp example. Otherwise, eDNA values must first be transformed as integers, like we did in our salamander example.

| Proof of concept data
As a proof of concept, we applied the method on an experimental dataset on common carp, where fish density was known without error. This experiment consisted in a total of I = 11 non-empty mesocosms and K i = 3 eDNA sampling replicates per mesocosm (site), taken at 1-week intervals. Here, temporal replicates were used instead of the "snapshot" spatial replicate design recommended above, but because it was a controlled experiment, variability in environmental conditions was not an issue. For full details on this case study, see Doi et al. (2015).
Model performance was quantified by the root mean squared error and 95% C.I. realized coverage of repeated animal density estimates (D i ) provided by the model in a cross-validation study.
We used an exhaustive leave-p-out cross-validation approach where p represents the number of sites for which animal density was assumed unknown (i.e., eDNA-only sites) and thus used for validation. The number J = I-p of dual data sites represents the subset of data used for model calibration. We assessed the model for four different scenarios of model calibration: I-p = {2, 3, 4, 5} dual data sites (I = 11; hence p = {9, 8, 7, 6}). Cross-validations were exhaustive: For each scenario, the learning or testing process was done for all possible ways C p I of subdividing the original dataset. For instance, for I = 11, with p = 8, a total of C p I = 165 validation repetitions were performed. For each individual repetition, we proceeded as follows: (1) J sites, for which we kept both eDNA and density data, were used as dual data sites, and p sites, for which only eDNA data were kept, were used as eDNA-only sites for the analysis; (2) model outputs provided density estimates D i,i≠j for a total of p = I-J eDNA-only sites; (3) the p estimates D i≠j obtained as model's outputs were compared to the "known" density values that were left out of the analysis. We also calculated the error (ε i = D i -D i ) and recorded whether the "known" value was included in the 95% confidence interval (C.I.). This process was repeated C p I times, for all possible unique combinations. From these cross-validation results, we then derived two summary measures to assess model performance. First, we calculated the root mean squared error (RMSE), across all cross-validation repetitions. The RMSE is a measure of total error, combining bias and systematic error (variance, imprecision). Second, we calculated the realized coverage of the 95% C.I., across all cross-validation repetitions, as the proportion of time that the known animal density value (D i ) was included in the 95% C.I. produced by the model output (D i ± SE).
For a well-behaved estimator procedure, the realized coverage should be very close to 0.95.

| Field dataset
To evaluate the potential applicability of the method in natural settings, we applied the model to data obtained from a stream salamander survey performed in summer 2011 in the South Fork Salmon River Sub-basin, Idaho, USA (Pilliod et al., 2013). During this field study, eDNA was collected and quantified at the downstream edge of a survey reach just prior to salamander sampling in five streams (see Pilliod et al., 2013 for details). Idaho giant salamanders were surveyed using single-pass electrofishing in a 500-m stream reach (defined as a site) of each stream and salamander density was then estimated as the number individuals captured divided by the area searched per stream.
The 500-m reach was randomly selected and predefined as the spatial sampling unit or site. While the rate of eDNA loss per distance of stream is unknown, it has been estimated to occur on the scale of 100-200 m (Wilcox et al., 2016) to as much as 9 km (Deiner & Altermatt, 2014). Here, we used the sites from the Pilliod et al. (2013) study that were sampled with electrofishing, not those that were only sampled with the kick-net technique (see Pilliod et al., 2013) because this latter method did not provide reliable metrics of salamander densities. On the other hand, electrofishing has been shown to be one of the most reliable field sampling methods for larval and paedomorphic life stages of this species (Cossel, Gaige, & Sauder, 2012;Pilliod et al., 2013). The environmental characteristics (see Table S1) of the five sites we used were never captured during electrofishing surveys, and thus were assumed to contribute little to no eDNA to the water. The whole dataset consisted of I = 5 sites (i.e., a 500-m upstream stretch in five different streams) and the number of per-site water sampling replicates (K i ) varied from 3 to 12. For two sites K i = 3, for two others K i = 9 and for one site K i = 12. The data for the two sites with nine samples actually consisted of three replicated samples from each of three slightly different methods of water collection (GrabFilter, GrabHold, Instream; see Pilliod et al., 2013). We were confident in using all nine samples together because (1) the difference between methods only concerned water collection and storage (not eDNA processing and analysis), and (2) previous investigation had shown no evidence to suggest differences in the amount of eDNA captured by each method for these samples (Pilliod et al., 2013(Pilliod et al., , pp 1126. To assess models' performance on this dataset, we used the same cross-validation approach as described above, but with the caveat that salamander densities were not known perfectly. Because densities were estimated from a field sampling technique (electrofishing) that cannot provide perfect detection of individuals, we acknowledge that there is inherent uncertainty in the density data that feed the model, which likely affects the accuracy of our estimator. Despite this caveat, we think it was useful to assess the model with field data obtained from typical methods used by field biologists. These imperfect density measures still provide useful relative measures of true density, which we believe capture consistent differences among the sampled sites.

| Results
Before running analyses with our new modeling approach, we looked at correlations between values of eDNA concentration and animal density, to get a sense of the "quality" of information contained in the dataset. Correlation was high for both the proof of concept (r = .84, Figure 1a) and the field datasets (r = .79, Figure 1b). We also quan- For the salamander data, the per-site VMR averaged 0.49, which is much lower than the VMR value observed for the proof of concept dataset. To apply the Negative Binomial model, values of eDNA concentration (initially expressed in ng/l) were transformed as integers, after having been scaled to the level of precision with which DNA concentration can be measured (see details in Appendix S2).  Figure 2b, Table S2). The RMSE was between 10 and 12, which seems reasonable. The coverage of the 95% C.I. was also good, falling between 0.95 and 1.00 for the four scenarios considered (i.e., different number of dual data sites I-p = {2, 3, 4, 5}). Increasing the number of dual data sites from two to five does not seem to improve estimator accuracy or coverage level very much. This suggests that small sample sizes of dual data sites might be appropriate to achieve good estimator properties, but it is important to keep in mind that these small numbers still represent fairly large proportions of the total number of sites (between 18% and 45%).We also assessed the ability of the Negative Binomial model at providing relative rankings of site density. We found rates of correct relative rank assignment of 76.2%, 76.8%, 78.9%, and 81.5% for scenarios with 2, 3, 4, and 5 dual data sites, respectively. Estimate of the dispersion parameter r was 0.94 (SE = 0.412), a value in accordance with the highly overdispersed nature of these data (note: the more overdispersion, the smallest the values of r; we consider no overdispersion as r→∞). As a consequence of this high degree of overdispersion, the Poisson model (i.e., equivalent of the Negative Binomial but without overdispersion) showed much poorer performance, in terms of C.I. coverage, than the Negative Binomial (Appendix S1).

| Field dataset
Similarly to the proof of concept dataset, we found the Negative Binomial model to perform better than the Normal model (Table 1

| DISCUSSION
We provided a model framework to estimate animal density using eDNA data in combination with known animal density values obtained for a subset of sampled sites. Cross-validation analyses, performed using a dataset for which fish densities were known with certainty, revealed that this approach can work well when the correlation between eDNA data and animal density values is high. Cross-sample variability of eDNA concentration was best modeled with a Negative Binomial distribution, as found in a previous study (Furlan, Gleeson, Hardy, & F I G U R E 1 Linear regression between eDNA concentration data and measures of animal density for the two datasets. The correlation (r) and proportion of explained variance (R 2 ) values are both shown on each graph. (a) Common carp dataset: number of eDNA copies quantified through droplet digital PCR across values of carp density (carps/m 2 ). (b) Idaho giant salamander dataset: concentration (ng/l) of eDNA quantified through qPCR across values of salamander density (salamanders/ m 2 ). We note that the absolute SD among sampling replicates tends to increase with larger values of DNA concentration estimates of animal density. We therefore emphasize that it will be important to pursue model developments of this approach in order to (1) add a detection-based process for density estimation in the modeling frame and (2) explicitly incorporate uncertainty of D i values when modeling the eDNA-density relationship (i.e., the model presented here). Both goals could be achieved by modeling these different processes in a hierarchical framework (Royle & Dorazio, 2008) using, for instance, Bayesian inferential tools (Link & Barker, 2009).
More specifically, it could be performed within a MCMC routine by (1) explicitly incorporating the estimation procedure of D j (e.g., mark-capture, N-mixture models, etc.) in a Bayesian model and (2) sampling the posterior distribution of parameters D j , instead of using point estimates.
Another important area that deserves discussion and further research is sampling design. As with any estimator, the key to obtaining reliable estimates relies on our ability to obtain representative samples and characterize the probabilistic distribution (e.g., variability) of the biological and sampling processes involved in the data generating mechanisms. Here, these mechanisms can be summarized hierarchically as consisting of (1) a latent process relating true (unknown) animal density (D i ) to the expected (average) concentration of eDNA [E(w ij )] in the environment and (2) another unknown process for the variability in the actual values of eDNA concentration collected and quantified through PCR [w ij |E(w ij )]. The first process combines elements of both the biological and environmental processes (Barnes & Turner, 2016) that determine (1) the rate of DNA shedding and production by the focal organisms, (2) how this DNA will be transported and distributed in the environment (spatial aspects), and (3) how long it will persist in the environment (temporal aspects). Altogether, these elements will determine how much DNA from the focal species is available to sampling in the environment. This is formalized in our model by the function f (D i ) which ultimately determines the expected value of the eDNA in our samples for a given local animal density (E(w ij )|D i ). The investigator cannot control any of these elements, but the timing and spatial design of water sample collection will affect the amount of total available eDNA exposed to sampling.
It is essential for the application of the method that the relationship between density and eDNA available to sampling remains consistent across sampling sites for which inference is to be made. The second process combines elements of sampling design and laboratory processing, which can be regrouped under the term "sampling process," determining the concentrations of eDNA actually measured (w ij ), for a given amount that was available for sampling. This process is formalized in our model by the choice of specific distributions: w ij | E(w ij )~Distr(E(w ij ), θ). The accuracy with which we can characterize this distribution and estimate model's parameters will strongly be affected by the sampling design chosen. We therefore encourage endeavors that focus on improving eDNA sampling (Goldberg et al., 2016). Important study design issues concern the timing and location of sampling, as well as the number of water sample replicates and the volume of water collected. Timing of eDNA sampling should be chosen in relation to the species' life cycle and in consideration of the assumption of population closure (Williams et al., 2002). Indeed, as with any monitoring effort, sampling must occur when all individuals from the focal population are available for detection. It is also important that the independent density data (e.g., animal counts) used for model calibration be collected at virtually the same time as the eDNA data.
The exact location at which water samples should be collected is also of prime importance for this method. First, it is important to clearly define the sampling unit to which inference is to be made.
In lentic systems that are naturally delimited in space, such as pond or lakes, the site and local population sampled are usually easily defined. However, to accurately sample such an enclosed space, the water from which eDNA is to be extracted must be representative of the entire sampling unit (e.g., whole pond). Individual spatial replicates, which would be analyzed separately for DNA, can be used directly as sampling replicates. However, because animal space use, and thus target DNA distribution, is likely to be heterogeneous within the unit, such a design is likely to inflate variability in eDNA measurements. Instead, we recommend using a sampling design where each sampling replicate is obtained from several spatial replicates covering the entire unit of interest. The latter could be pooled and homogenized before proceeding to water filtering and DNA amplification. Repeating this procedure for each individual sampling replicates (i.e., repeat K i times) will provide the most representative sample for DNA concentration and density. As an alternative, but less desirable, one could resample a homogenized bulk of water to obtain the desired number of sampling replicates. In open lotic systems such as streams and rivers, limits of spatial inference might be less obvious. Moreover, eDNA transport distance, which is affected by water flow, adsorption, and degradation, is a critical, but currently poorly understood component in lotic systems (Jane et al., 2015;Wilcox et al., 2016). Animal distributions and movements may also influence eDNA concentrations and transport along a stream with unknown consequences for abundance estimation (e.g., input from an upstream population, local eDNA unavailable due to downstream export). If the system is open, this will cause issues similar to violation of the (temporal) closure assumption discussed above (Williams et al., 2002). In our salamander example, we know that salamander movements are limited relative to the scale of sampling (J.O. Cossel, unpublished data), but transport of eDNA downstream could still be an issue (Pilliod et al., 2014).
The quantity of water filtered for each sample will affect the likelihood of obtaining outlier values of eDNA concentration W ik , which are due to potential aggregation of eDNA distribution in the environment (Barnes & Turner, 2016;Lacoursière-Roussel, Côté, et al., 2016;Lacoursière-Roussel, Rosabal, et al., 2016). Therefore, the more water is filtered, the more likely we are to dilute this aggregation pattern and tend toward the local average concentration of eDNA in our individual samples. Sampling replication is crucial to better characterize the relationshipf D i and the pattern of variability in individual samples. Given the overdispersed nature of quantitative eDNA data, the consideration of the adequate number of spatial replicates of water samples is even more important. Similarly, it will be important to carefully consider the number of technical replicates for each individual water sample k processed in the laboratory to provide values of eDNA concentration. Finally, it is important to mention that the consistency and sensitivity of DNA analysis methods will influence the precision of estimates.
Besides sampling issues concerned with the eDNA component, it will also be important to consider how many dual data sites are required and what sites should be targeted to accurately inform the eDNA-animal density relationship. The selection of dual data sites should be random, but it is also important to consider the sites' characteristics when making these choices. Indeed, the physical and environmental characteristics (e.g., flow, discharge level, substrate, etc.) of the sites might influence the eDNA-density relationship.
Therefore, a stratified sampling strategy across these characteristics would ensure representation of this variation. With enough sites, these measures could be used as covariates to further refine these modeled relationships. Methodologic work aimed at assessing and optimizing all these aspects of sampling design (eDNA and dual data sites) could also improve the reliability of the analytical method presented here.
Concerns and skepticism about using eDNA to estimate animal density have been raised (Iversen et al., 2015). The main argument of these critics relies on observations of heterogeneity in organism's individual DNA shedding rates (Klymus et al., 2015;Maruyama et al., 2014). Although we agree this is a relevant point that complicates estimation of animal density from eDNA, we emphasize that the existence of these sources of variation and uncertainties could be accommodated through appropriate statistical models (e.g., explicitly modeling sources of heterogeneity with covariates or random effects) and will not necessarily prevent density estimation. For us, the main limiting factor for this method to work appropriately in field studies concerns the level of correlation at the population level, not at the individual level, between eDNA concentration in the environment and local animal density. Strong correlations can exist despite substantial variability in individual shedding rate (Doi et al., 2015;Thomsen et al., 2012), making eDNA concentration a potentially useful source of data to estimate, or at least provides relative indices of animal density. However, we acknowledge that this method will not work in all settings (e.g., when population sizes are extremely low, Spear, Groves, Williams, & Waits, 2015). The presence of outliers will also influence the strength of existing correlations (Biggs et al., 2015;Pilliod et al., 2013), but as illustrated here, appropriate statistical distributions can be used to account for this phenomenon.
Environmental DNA methods are already showing great promise (Goldberg, Strickler, & Pilliod, 2015;Goldberg et al., 2016), and we are convinced that, over the years, they will follow a path of method developments (e.g., laboratory techniques, modeling, sampling designs), which will make them a standard tool for wildlife monitoring and ecological science. Our modeling approach represents one of the first steps to advance the difficult, but important topic of inferring animal density from eDNA data. This work builds upon previous studies (Barnes & Turner, 2016;Goldberg et al., 2013;Lacoursière-Roussel, Côté, et al., 2016;Lacoursière-Roussel, Rosabal, et al., 2016;Pilliod et al., 2013;Takahara, Minamoto, & Doi, 2013;Takahara et al., 2012;Wilcox et al., 2016), but moves beyond a simple post hoc extrapolation based on linear regressions. The next big steps in this methodological development will require: (1) further model development to accommodate varying sources of uncertainties and allow incorporation of environmental covariates, fixed, and random effects; (2) further investigation of sampling design optimization; and (3) testing of different species in natural lotic and lentic waters.

DATA ACCESSIBILITY
Data files are all provided in Supporting information.

AUTHOR CONTRIBUTIONS
TC conceived and developed the statistical model and methods; CG and DP conceived the salamander field sampling protocols and collected these data; CG performed the salamander eDNA analyses; TT and HD conceived the carp experiment and collected these data; HD performed the carp eDNA analyses; TC analyzed the data; TC, DP, CG, and HD led the writing of the manuscript. All authors contributed critically to the drafts and gave final approval for publication.