Misidentification errors in reencounters result in biased estimates of survival probability from CJS models: Evidence and a solution using the robust design

1. Misidentification of marked individuals is unavoidable in most studies of wild animal populations. Models commonly used for the estimation of survival from such capture-recapture data ignore misidentification errors, potentially resulting in biased parameter estimates. With a simulation study, we show that ignoring



| INTRODUCTION
The most widely used class of models to estimate survival probabilities from capture-recapture data comprises the Cormack-Jolly-Seber (CJS) models and their extensions (Lebreton et al., 1992), such as multistate (Brownie et al., 1993) and robust design models (Pollock, 1982). The key purpose of these models is to estimate the probability of survival as a function of individual age, state and various environmental covariates while accounting for imperfect detection of marked individuals. Their implementation in user-friendly software such as MARK (White & Burnham, 1999) or E-Surge (Choquet et al., 2009) has facilitated their wide application in population ecology and conservation.
One crucial assumption of the current CJS models is that all reencountered (resighted or recaptured) individuals are identified correctly (see, e.g. Lebreton et al., 1992). However, misidentification errors are likely to be widespread and common in many datasets (e.g. Lavers & Jones, 2008; Tucker et al., 2019). Several types of misidentification errors are possible. First, an as yet unmarked individual may be wrongly identified as an already marked individual. Such errors are possible when the capture-recapture study is performed on animals without artificial marks, for example, when photographs are taken and individuals are identified by natural marks such as patterns on the fur, or when genetic identification is used (Lukacs & Burnham, 2005; Morrison et al., 2011; Wright et al., 2009; Yoshizaki et al., 2009, 2011).
Second, an already marked (or identified) individual is misidentified and wrongly allocated to a new individual. Again, such misidentification errors are likely to occur in studies based on animals with natural marks, but can also occur when artificial marks are used and mark loss is possible (note that when all marks are lost and individuals are captured as 'new', this has been referred to as recycled individuals).
Because the matching of an identity is wrongly rejected, it is referred to as a false rejection error by Morrison et al. (2011). Third, a marked individual may be wrongly identified as another marked individual. This type of error occurs when animals have artificial marks such as colour rings that are read from a distance (Schwarz & Stobo, 1999; Tucker et al., 2019). Because the identity of two individuals is wrongly accepted to be the same, this is referred to as a false acceptance error by Morrison et al. (2011). Here we deal with this third type of misidentification error, when the recording of a marked individual is wrongly allocated to another marked individual.
If the capture-recapture data contain misidentification errors and are analysed with conventional methods, the estimates of survival are biased (unless a study design resulting in recycled individuals is used; Malcolm-White et al., 2020). The magnitude and direction of bias depend on the type of misidentification error. When only false acceptance errors occur and misidentification is not permanent, the survival estimates are positively biased. Let us take a standard individual encounter history as an example. Assume that an individual was captured and marked on occasion 1, was seen again on occasion 3 and then died. Assume now that on occasion 10 another individual was seen, whose identifier was wrongly attributed to our focal individual. This implies that our focal individual was alive up to occasion 10, although it died shortly after occasion 3. Obviously, survival from such a capture history will be overestimated while the recapture probability will be underestimated (see also Schwarz & Stobo, 1999).
Moreover, such misidentification errors will not only result in bias, but also in wrong temporal patterns of survival and recapture (Schwarz & Stobo, 1999; Tucker et al., 2019). This happens because cohorts of individuals that are marked early have a longer period in which to accumulate erroneous observations, and are therefore affected more than cohorts of individuals that are marked later in the study.
(>0.7) probabilities. For the empirical data, the CJS model estimated average juvenile survival at 0.997 and adult survival at 0.939, and also detected a strong decline in adult survival over time at a rate of −0.14 ± 0.029. In contrast, the RDMa model estimated a probability of correct identification of 0.94, annual juvenile survival at 0.234, adult survival at 0.834 and a less strong decline over time (−0.046 ± 0.016).
4. We conclude that estimates of survival probabilities obtained from data that include misidentification errors and analysed with a standard CJS model are unlikely to be correct. The bias in survival increases with the magnitude of misidentification errors, which is inevitable as datasets become longer. Since misidentification due to tag misreads is common in empirical data, we recommend the use of the RDM model presented here to provide unbiased parameter estimates.

KEYWORDS
Bayesian analysis, black-tailed godwit, capture-recapture, CJS, misidentification, misreading, survival

The statistical treatment of misidentification errors is challenging.
Models to deal with 'false positives' have been developed in the context of occupancy modelling (Miller et al., 2011; Royle & Link, 2006) and for photographic capture-recapture data to estimate population size (Link et al., 2010; McClintock et al., 2013, 2014; Vale et al., 2014; Yoshizaki et al., 2009, 2011). For the estimation of survival from classical capture-recapture data, however, we are aware of only two approaches that attempt to account for misidentification errors. Under the assumption of a single possible recapture event each year, Schwarz and Stobo (1999) constructed a multinomial model that estimates the probability of correct identification, $\Theta$. The probability of observation of a specific identifier was expressed as the sum of two probabilities: (a) the probability that an animal was observed and correctly identified, $p_i \Theta_i$, and (b) the probability that an animal acquired a wrong identification, $p_i^{\mathrm{misread}}$. This model provides unbiased estimates of survival, as confirmed by a simulation study.
However, the downside of the Schwarz and Stobo model is the assumption of a single possible reencounter within a year. This assumption may not be met: even when an animal has been correctly identified, it can still acquire incorrect encounters through misread marks of other animals. Because it is not implemented in standard software packages, the Schwarz and Stobo model has, unfortunately, not been applied.
The second solution to directly account for misidentification was developed by Schofield and Bonner (Bonner et al., 2016; Schofield & Bonner, 2015) and is based on the idea of Link et al. (2010) to model unobserved but correct encounter histories, the so-called 'latent multinomial', and project them to the imperfect observed encounters. Recent implementations of fast approximate solutions for this model (Zhang et al., 2021) allow the use of the model with relatively large datasets. The downside of the latent multinomial based models is that they (a) only allow a single reencounter of an individual within an occasion, (b) do not allow estimation of random effects and, most importantly, (c) do not estimate the misidentification probability, but instead require this parameter to be known a priori. Some ad-hoc approaches to the misidentification problem also resulted in improved parameter estimates. One approach was to remove all single sightings per occasion from the data; this solution was based on the reasonable assumption that a wrong assignment is more likely to be made once than multiple times (see, e.g. Kentie et al., 2016, 2018; Loonstra et al., 2019). Although generally correct, this data filtering also excludes correct single sightings and thus decreases the precision of the estimates, and it does not guarantee that all incorrect assignments are excluded. The same concerns apply to the solution used by Morrison et al. (2011), who excluded the first resighting of every individual. Details of these ad-hoc solutions and their shortcomings were summarised by Tucker et al. (2019).
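The single-sighting filter described above can be sketched in a few lines. This is only an illustration, not the code used in the cited studies: the function name `drop_single_sightings` and the toy count matrix are hypothetical, and special handling of the marking occasion is ignored.

```python
def drop_single_sightings(counts):
    """Set every within-occasion count of exactly 1 to 0 (ad-hoc filter)."""
    return [[0 if c == 1 else c for c in row] for row in counts]


# Hypothetical resighting counts: rows are individuals, columns are primary occasions.
counts = [
    [3, 1, 0, 2],
    [1, 0, 0, 1],
    [0, 4, 1, 0],
]

filtered = drop_single_sightings(counts)
# Number of individual-by-occasion cells discarded by the filter,
# whether the single sighting was a misread or a correct reading.
removed = sum(c == 1 for row in counts for c in row)
print(filtered, removed)
```

The sketch makes the cost of the filter visible: the second individual loses its entire resighting history, even if both single sightings were correct.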
We here introduce an improved Schwarz-Stobo model that extends to data sampling under the robust design and which accounts for misidentification errors by estimating a probability of correct identification (RDM). The robust design protocol is often realistic when colour-marked individuals are observed during several days (secondary occasions) within a year (primary occasion). We then fit our novel model in the Bayesian framework and provide JAGS (Plummer, 2003) code. We also develop an approximate, but computationally faster model (RDMa) that may be used with large datasets. With the developed models, we conduct a simulation study and show that the new models provide unbiased estimates of survival and recapture probabilities, while corresponding estimates from a classical CJS model are biased. Finally, we illustrate the application of the models to a capture-resighting dataset on black-tailed godwits Limosa limosa limosa from The Netherlands.

| MATERIALS AND METHODS
In the first part of this section, we introduce, in three steps, the extended robust design mark-resight model that accounts for individual misidentification (RDM). We start with the classic CJS model in state-space formulation (Gimenez et al., 2007; Royle, 2008) that we extend to the Poisson Robust Design Mark Resight Model (RD model) for allowing multiple observations of the same individual during the same primary occasion. We then introduce the possibility for misidentification (false acceptance errors) into the RD model (RDM), and finally develop the computationally more effective approximate model (RDMa). The second part of the methods section introduces the simulation functions, simulation settings and the dataset on black-tailed godwits that are used to assess the performance of the novel models.

| The standard CJS model
The standard CJS model in the state-space formulation consists of two parts: a state process model and a conditional observation model (Gimenez et al., 2007; Royle, 2008). Let $z_{i,t}$ ($i = 1, \ldots, n$; $t = 1, \ldots, T$) be the latent state (1 for alive, 0 for dead) of individual $i$ on occasion $t$. The state of individual $i$ at the first (marking) occasion $f_i$ is known and always $z_{i,f_i} = 1$. At later occasions, the state of individual $i$ is conditional on the state at the previous time step and governed by the probability of survival $\Phi_{i,t}$ from time $t-1$ to time $t$:

$$z_{i,t} \mid z_{i,t-1} \sim \mathrm{Bernoulli}\left(z_{i,t-1}\,\Phi_{i,t}\right) \tag{1}$$

If an individual $i$ is alive at occasion $t$, it can be reencountered (physically recaptured or resighted) with probability $p_{i,t}$, and the observation model is then

$$y_{i,t} \mid z_{i,t} \sim \mathrm{Bernoulli}\left(z_{i,t}\,p_{i,t}\right) \tag{2}$$

where $y_{i,t}$ indicates whether individual $i$ was reencountered on occasion $t$.
The code for this model implemented in the JAGS (Plummer, 2003) language is provided in Rakhimberdiev (2021).
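As a complement to the JAGS implementation cited above, the two-part structure of the state-space CJS model (a survival step followed by a detection step conditional on being alive) can be sketched as a small data simulator. This is a minimal Python sketch, not the authors' simulation code; the function name `simulate_cjs`, the constant `phi` and `p`, and the 0-based occasion index are assumptions made for illustration.

```python
import random


def simulate_cjs(first, n_occasions, phi, p, rng):
    """Simulate latent states z and observations y for one individual.

    first is the 0-based marking occasion; phi and p are held constant here.
    """
    z = [0] * n_occasions
    y = [0] * n_occasions
    z[first] = 1  # alive at marking by definition
    y[first] = 1  # marking counts as an encounter
    for t in range(first + 1, n_occasions):
        # State process: alive at t only if alive at t-1 and survived.
        z[t] = 1 if (z[t - 1] == 1 and rng.random() < phi) else 0
        # Observation process: detection is only possible for live animals.
        y[t] = 1 if (z[t] == 1 and rng.random() < p) else 0
    return z, y


rng = random.Random(1)
z, y = simulate_cjs(first=0, n_occasions=10, phi=0.9, p=0.5, rng=rng)
print(z)
print(y)
```

In this error-free simulator an individual can never be observed after its death, which is exactly the property that false acceptance errors violate.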

| RD model: Poisson robust design mark-resight model accounting for multiple readings of an individual per session
In many field studies, it is common that individuals are resighted repeatedly during a primary occasion because resighting efforts are not restricted to a single day during a year. Hence, there are multiple secondary occasions (typically days) nested within the primary occasions (typically years). To model these data, the observation process of the CJS model needs to be adapted to robust design models (Pollock, 1982), of which several variants exist. Here we follow the Poisson Robust Mark-Resight model (McClintock & White, 2009) and assume that during each secondary occasion only one encounter of an individual is recorded. These data are summarised such that the individual capture histories $y$ contain the number of times each individual is recorded in each primary occasion. The total number of sightings of individual $i$ at primary occasion $t$ is assumed to follow a Poisson distribution,

$$y_{i,t} \mid z_{i,t} \sim \mathrm{Poisson}\left(z_{i,t}\,\lambda_{i,t}\right) \tag{3}$$

where $\lambda_{i,t}$ is the expected number of times individual $i$ is observed at primary occasion $t$.
We define the probability of encountering ($p_{i,t}$) as the probability to encounter individual $i$ at a primary occasion $t$ at least once. The probability of not encountering an alive individual in a primary occasion is $1 - p_{i,t}$, which is identical to the probability of getting a zero from a Poisson distribution with expected value $\lambda_{i,t}$:

$$1 - p_{i,t} = \exp\left(-\lambda_{i,t}\right) \tag{4}$$

and thus

$$\lambda_{i,t} = -\log\left(1 - p_{i,t}\right) \tag{5}$$

The RD model described above operates an independent Poisson distribution for each marked individual. The unconditional distribution of $y_{1,t}, \ldots, y_{N,t}$ can be factored into the product of two distributions: a Poisson distribution for the total number of sightings in year $t$, $O_t$, and a multinomial distribution conditional on $O_t$ for the number of times each individual is sighted (see DasGupta, 2011; Steel, 1953). To introduce the possibility of misidentification, we replace Equation 3 with this factorisation. The overall number of observations $O_t$ is modelled with a Poisson distribution, while the number of times each individual is observed, $y_{i,t} \mid z_{i,t}$, is modelled with a multinomial distribution:

$$O_t \sim \mathrm{Poisson}\left(\sum_{j=1}^{N} \lambda_{j,t}\, z_{j,t}\right), \qquad \left(y_{1,t}, \ldots, y_{N,t}\right) \mid O_t \sim \mathrm{Multinomial}\left(O_t,\; \frac{\lambda_{1,t}\, z_{1,t}}{\sum_{j=1}^{N} \lambda_{j,t}\, z_{j,t}}, \ldots, \frac{\lambda_{N,t}\, z_{N,t}}{\sum_{j=1}^{N} \lambda_{j,t}\, z_{j,t}}\right) \tag{6}$$
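The relation between the encounter probability and the Poisson mean, and the cell probabilities of the multinomial part of the factorisation, can be checked numerically. The following is a minimal Python sketch; the function names are hypothetical, and the conversion assumes $p < 1$ (the logarithm is undefined at $p = 1$).

```python
import math


def lambda_from_p(p):
    """Expected number of sightings implied by P(at least one sighting) = p."""
    return -math.log(1.0 - p)


def p_from_lambda(lam):
    """Probability of at least one sighting under a Poisson(lam) count."""
    return 1.0 - math.exp(-lam)


def multinomial_cells(lams, alive):
    """Total Poisson mean and per-individual cell probabilities lambda_i z_i / sum."""
    weights = [lam * z for lam, z in zip(lams, alive)]
    total = sum(weights)
    return total, [w / total for w in weights]


lam = lambda_from_p(0.9)
total, cells = multinomial_cells([1.0, 2.0, 3.0], [1, 1, 0])
print(lam, p_from_lambda(lam), total, cells)
```

Without misidentification, a dead individual (third entry, `z = 0`) has a cell probability of exactly zero, so it can never be recorded.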

| RDM model: RD Model accounting for misidentification of individuals
Let $\Theta_t$ be the probability of correct identification of an individual during a secondary occasion within primary occasion $t$. At a single reencounter event (i.e. at a secondary occasion) of an individual, misidentification might happen with probability $1 - \Theta_t$.
We assume that no new identifiers, that is, new individuals, can be generated during a reencounter. The only possible error is the mixing-up of an encountered, marked individual with an already existing marked individual ('false acceptance error' in Morrison et al., 2011). We also assume that when a misidentification occurs, any of the already marked individuals is equally likely to be recorded.

Summary of the assumptions:
1. at a single reencounter event, the probability of correct identification is $\Theta_t$;
2. all individuals have the same probability of correct identification;
3. only a marked individual, that is, an individual that has been marked by the time of observation, can be recorded due to misidentification;
4. when the identification of an individual is wrong, the individual can be assigned to any other existing marked individual with the same probability;
5. at a primary occasion, the number of observations of each individual follows a Poisson distribution.
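Assumptions 1 to 4 translate directly into a recording rule for individual sightings. The following is a minimal Python sketch of that rule, not the simulation functions used in the study; the function name `record_sightings` and the toy identifiers are hypothetical.

```python
import random


def record_sightings(true_ids, marked_ids, theta, rng):
    """Return the identifiers written down for a list of true sightings."""
    recorded = []
    for true_id in true_ids:
        if rng.random() < theta or len(marked_ids) < 2:
            recorded.append(true_id)  # correct reading (probability theta)
        else:
            # Misread: any *other* already marked identifier, all equally likely.
            others = [m for m in marked_ids if m != true_id]
            recorded.append(rng.choice(others))
    return recorded


rng = random.Random(42)
marked = ["A", "B", "C"]
truth = ["A", "A", "B", "C", "B"]
print(record_sightings(truth, marked, theta=0.9, rng=rng))
```

Every true sighting produces exactly one record, so misreads redistribute observations among identifiers without changing their total number.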
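To illustrate the way we dealt with misidentification errors, we first use a simplified example with three marked individuals ($i = A, B, C$) that are alive and observed at primary occasion $t$. The expected number of observations of individual A, $obs_{a,t}$, is the sum of the cases when A was seen and correctly identified ($\lambda_{a,t} \times z_{a,t} \times \Theta_t$) and when A was not seen but the other observed individuals (in this case B or C) were wrongly identified as individual A. The number of times B is seen but incorrectly identified is $\lambda_{b,t} \times z_{b,t} \times (1 - \Theta_t)$. These encounters are evenly distributed among all other individuals marked prior to the time of the current observation, that is, divided by the number of all identifiers used but one. Table 1 shows the full list of possible identifications for a case of three individuals.

TABLE 1 RDM model connection between true and observed identifiers

| | Identified as A | Identified as B | Identified as C |
|---|---|---|---|
| True A | $\lambda_{a,t} z_{a,t} \Theta_t$ | $\lambda_{a,t} z_{a,t} (1 - \Theta_t)/2$ | $\lambda_{a,t} z_{a,t} (1 - \Theta_t)/2$ |
| True B | $\lambda_{b,t} z_{b,t} (1 - \Theta_t)/2$ | $\lambda_{b,t} z_{b,t} \Theta_t$ | $\lambda_{b,t} z_{b,t} (1 - \Theta_t)/2$ |
| True C | $\lambda_{c,t} z_{c,t} (1 - \Theta_t)/2$ | $\lambda_{c,t} z_{c,t} (1 - \Theta_t)/2$ | $\lambda_{c,t} z_{c,t} \Theta_t$ |

The expected number of observations of individual A, $obs_{a,t}$, is:

$$obs_{a,t} = \lambda_{a,t} z_{a,t} \Theta_t + \frac{\lambda_{b,t} z_{b,t} (1 - \Theta_t) + \lambda_{c,t} z_{c,t} (1 - \Theta_t)}{2}$$

The sum of misidentifications for an individual can be reformulated as the overall sum of sightings of individuals alive minus the sightings of the current individual:

$$obs_{a,t} = \lambda_{a,t} z_{a,t} \Theta_t + \frac{1 - \Theta_t}{2} \left( \sum_{j \in \{a,b,c\}} \lambda_{j,t} z_{j,t} - \lambda_{a,t} z_{a,t} \right)$$

In the general case with individuals $i = 1, 2, \ldots, N$, the expected number of observations of individual $i$ is:

$$obs_{i,t} = \lambda_{i,t} z_{i,t} \Theta_t + \frac{1 - \Theta_t}{N - 1} \left( \sum_{j=1}^{N} \lambda_{j,t} z_{j,t} - \lambda_{i,t} z_{i,t} \right) \tag{10}$$

The distribution of the overall number of observations $O_t$ at the primary occasion $t$ is not affected by the misidentification. Therefore, it is still modelled with a Poisson distribution, and the number of times each individual is recorded with a multinomial distribution. The only difference to the case without misidentification errors (Equation 6) is that we replace $\lambda_{i,t} z_{i,t}$ in the numerator of the multinomial cell probabilities with $obs_{i,t}$ as developed in Equation 10:

$$\left(y_{1,t}, \ldots, y_{N,t}\right) \mid O_t \sim \mathrm{Multinomial}\left(O_t,\; \frac{obs_{1,t}}{\sum_{j=1}^{N} \lambda_{j,t}\, z_{j,t}}, \ldots, \frac{obs_{N,t}}{\sum_{j=1}^{N} \lambda_{j,t}\, z_{j,t}}\right)$$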
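The verbal description above (correct readings of an individual, plus an equal share of every other individual's misreads, divided by the number of identifiers in use minus one) can be turned into a short numerical check. This is a hedged Python sketch under that reading of the text; the function name `expected_observations` and the toy values are assumptions.

```python
def expected_observations(lams, alive, theta):
    """Expected recorded sightings per identifier under equal-probability misreads."""
    n = len(lams)
    if n < 2:
        raise ValueError("need at least two marked individuals")
    seen = [lam * z for lam, z in zip(lams, alive)]  # expected true sightings
    total = sum(seen)
    # Correct readings of i, plus a 1/(n-1) share of all other individuals' misreads.
    return [theta * s + (1.0 - theta) * (total - s) / (n - 1) for s in seen]


# Individuals A and B are alive, C is dead (z = 0).
obs = expected_observations(lams=[2.0, 1.0, 3.0], alive=[1, 1, 0], theta=0.9)
print(obs, sum(obs))
```

Two properties of the model are visible here: the dead individual C still has a positive expected number of records (0.15), and the expected records sum to the expected number of true sightings (3.0), in line with the statement that the total number of observations is unaffected by misidentification.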

| RDMa model: Computationally faster RDM model with approximated total number of sightings
The RDM model presented above requires estimation of the sum of expected $\lambda_i$, $\sum_{j=1}^{N} \lambda_j \times z_j$. This summation slows down model estimation when sample size (number of individuals and/or primary occasions) increases.

| Simulation study: RDM and RDMa model performance with simulated data
To explore the bias, precision and performance of the RDM and RDMa models in comparison with the CJS model for estimating survival from capture-resighting data with different levels of misidentification errors, we performed a simulation study. We assessed absolute bias in survival and resighting probabilities for datasets of (a) typical size, with 25 primary occasions and 25 newly marked individuals at each occasion, and (b) small size, with 10 primary occasions, where between 5 and 200 individuals were marked at the first occasion only. We also compared model run times and looked into the bias of temporal trends in survival probabilities estimated by CJS models.
We used the R computing environment (R Core Team, 2019) for data simulation and analysis, and JAGS (Plummer, 2003) run via the jagsUI package (Kellner, 2018) for MCMC sampling.

| Simulation study of a typical dataset
To evaluate absolute bias, we first compared the performance of the CJS, RDM and RDMa models with constant parameters across time (Φ ⋅ p ⋅ and Φ ⋅ p ⋅ Θ ⋅ ) using simulated data with 25 occasions where 25 individual animals were marked at each occasion. We simulated 100 datasets for each of the 24 possible combinations of survival (Φ) of 0.5 and 0.9 (range of typical survival values in birds, Karagicheva et al., 2018), reencounter (p) of 0.3, 0.5 and 0.9 and correct identification probabilities (Θ) of 0.9, 0.95, 0.99 and 1. The range of reencounter probabilities was chosen based on typical values for capture-recapture studies; the range of probabilities of correct identification was chosen based on values reported earlier (Schwarz & Stobo, 1999; Tucker et al., 2019) and estimated in the current study.
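The simulation design is a full factorial grid over the three parameters, which can be enumerated as follows. This is a small illustrative Python sketch (the study itself used R); the variable names are hypothetical.

```python
from itertools import product

phis = [0.5, 0.9]                # survival probabilities
ps = [0.3, 0.5, 0.9]             # reencounter probabilities
thetas = [0.9, 0.95, 0.99, 1.0]  # probabilities of correct identification

grid = list(product(phis, ps, thetas))
n_reps = 100  # datasets simulated per combination

print(len(grid), len(grid) * n_reps)
```

The grid contains 2 × 3 × 4 = 24 parameter combinations, giving 2,400 simulated datasets in total.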
To fit CJS models, we flattened simulated counts of observations of each individual within a primary period to a binary variable stating whether the individual was observed at least once during the period.
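The flattening step amounts to thresholding the count matrix. A minimal Python sketch, with a hypothetical function name and toy data:

```python
def flatten_counts(counts):
    """Convert per-occasion sighting counts to 0/1 'seen at least once' histories."""
    return [[1 if c > 0 else 0 for c in row] for row in counts]


# Hypothetical counts: rows are individuals, columns are primary occasions.
counts = [
    [4, 0, 2, 1],
    [1, 0, 0, 0],
]
histories = flatten_counts(counts)
print(histories)
```

After flattening, a single misread carries the same weight in an encounter history as many repeated correct sightings.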
Each data analysis was run in parallel with six chains and a thinning rate of 1 while adaptively selecting the burn-in and overall number of iterations to reach values of the Gelman-Rubin statistic lower than 1.01. Because the goal of the simulation studies and field data analysis was to quantify bias, we used vague priors for all parameters (see code for details). We then calculated posterior means and medians, biases, mean squared errors (MSE), 95% credible interval coverages and widths, and effective sample sizes for all estimated parameters.
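For readers unfamiliar with the convergence criterion, the basic form of the Gelman-Rubin statistic compares between-chain and within-chain variance. The following is a simplified Python sketch of that basic form only; it is an assumption that this matches the intent of the criterion, and it omits the degrees-of-freedom correction that MCMC software may apply, so its values need not match those reported by jagsUI.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    w = mean([variance(c) for c in chains])  # mean within-chain variance
    b = n * variance(chain_means)            # between-chain variance
    var_hat = (n - 1) / n * w + b / n        # pooled variance estimate
    return (var_hat / w) ** 0.5


agreeing = [[0, 1, 2, 3], [0, 1, 2, 3]]
disagreeing = [[0, 1, 2, 3], [10, 11, 12, 13]]
print(gelman_rubin(agreeing), gelman_rubin(disagreeing))
```

Chains that explore the same region give values near or below 1, whereas chains stuck in different regions give values far above thresholds such as 1.01 or 1.05.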

| Simulation study of a small dataset
To see how the RDM and RDMa models perform with small datasets, we simulated data where individuals were marked only at the first occasion but could be resighted during the subsequent nine primary occasions. We used a time-independent survival of Φ = 0.9, resighting probabilities p of 0.3, 0.5 and 0.9, and probabilities of correct identification Θ of 0.5, 0.7, 0.9, 0.95 and 1. We also varied the number of marked individuals in the study (5, 25, 50 and 200). This resulted in 60 combinations that we simulated 100 times each, and then obtained parameter values for CJS, RDM and RDMa models with the same MCMC settings as in the previous simulations.

| Model running time comparison
To compare the running time of the CJS, RDM and RDMa models, we simulated data with constant Φ = 0.9, p = 0.9 and Θ = 0.9, and 5, 20, 100, 400 and 1,600 individuals marked once at the first occasion followed by nine observation sessions. We ran 100 simulations and saved the time it took the MCMC to make 1,000 iterations.

| Evaluation of bias in time-dependent models
To evaluate the bias in the estimated slopes of survival probability over time, we fitted CJS, RDM and RDMa models with survival probabilities being a function of time through a logistic link (Φ T p ⋅ and Φ T p ⋅ Θ ⋅ ) to 100 simulated datasets with 25 occasions and 25 individuals marked at every occasion, a time-independent survival probability Φ = 0.7, resighting probability p = 0.9 and identification probability Θ = 0.95.
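A linear trend on the logit scale can be illustrated with a short sketch. The Python code below is a hypothetical illustration, not the fitted model: the function names are assumptions, and the slope of −0.1 is an arbitrary value chosen to show what a non-zero trend looks like on the probability scale.

```python
import math


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


def survival_trend(phi_start, slope, n_intervals):
    """Survival for each interval when logit(phi) changes linearly in time."""
    intercept = logit(phi_start)
    return [inv_logit(intercept + slope * t) for t in range(n_intervals)]


# 25 occasions give 24 survival intervals.
flat = survival_trend(0.7, 0.0, 24)        # the simulated truth: no trend
declining = survival_trend(0.7, -0.1, 24)  # arbitrary negative slope
print(round(flat[-1], 3), round(declining[-1], 3))
```

Because the simulated survival is constant, any slope that a fitted model estimates to differ from zero is an artefact of the analysis.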

| Case study: Annual survival of black-tailed godwits
We used 16 years of capture-resight data collected on black-tailed godwits on their breeding grounds in southwest Friesland, The Netherlands (see Senner et al., 2015 and Kentie et al., 2018 for detailed descriptions of the study). Starting in 2004, during each breeding season (April-June), adults and pre-fledging chicks (older than 12 days) were marked with a unique combination of four colour rings and a coloured flag. From 2005, we and experienced volunteers resighted the marked birds in the breeding area between March and mid-July on a daily basis (Kentie et al., 2018).
A total of 902 chicks and 1,558 adults were marked with unique colour code combinations, which yielded a total of 58,415 resightings by the end of 2019. The fieldwork for this study was conducted under licence numbers 6350A and AVD105002017823 granted by the national Dutch committee for animal experiments following the Dutch Animal Welfare Act Articles 9 and 11.
Assuming that survival in the first year of life was different from survival later (Kentie et al., 2013; Loonstra et al., 2019), we estimated first-year (juvenile) and adult (after first year) survival independently from each other with time-dependent models (Equation 13). To check whether annual survival in any of the age groups declined over time, we modelled annual survival as a linear function of time (age2categories(i,t) × Year t). To allow additional temporal variability in annual survival, we also added random time effects (Equation 13; see Kéry and Schaub (2012) for the approach and Rakhimberdiev (2021) for implementation details). Resighting probabilities were modelled as fully time dependent with additive age effects. We also introduced individual random effects to account for the inter-individual variation in resighting probability. The final RDM model consisted of linear models for Φ, p and Θ (Equation 13), each with intercepts and slopes. The parameter age2categories had two levels: first year and older than first year. The parameter age4categories had three age levels for birds marked as chicks (1st year, 2nd year after being marked as chick, 3rd year or older) and one level for birds marked as adults. The probability of reencounter $p^{r}_{i,t}$ was converted to $\lambda_{i,t}$ with Equation 5. The CJS model only differed from the RDM model with respect to the absence of the correct identification probability in the model and of secondary periods in the data. Because of the relatively large dataset (~6 × 10^4 resightings), we estimated parameters only with RDMa.
We ran each model with six chains, a thinning rate of 1, 44,000 iterations and a 'burn-in' of 14,000 iterations. Model convergence was assumed when the Gelman-Rubin statistic was lower than 1.05. We report medians and 95% credible intervals of survival and resighting parameters.

| Model performance on a typical dataset
Results of the simulation study on the typical dataset are presented in Figure 1 and in Table S1. As expected, the CJS model performed well when no misidentification (Θ = 1) occurred. Under probabilities of correct identification of 0.95 or 0.90, the 95% credible interval coverage was below 50% for both survival and resighting probabilities. A very low probability of misidentification (Θ = 0.99) produced intermediate results. In general, with increasing survival probability, the absolute bias decreased to zero and the credible interval coverage increased from 0 to 0.95; see Table S1 for details.
The RDM and RDMa models accurately estimated survival, reencounter and identification probabilities in all cases, as indicated by low bias, low MSE and good coverage (Table S1). The approximation in the RDMa model had no effect on any of the parameter estimates, with results being similar to those under the

FIGURE 1 Average bias of parameter estimates from a simulation study with 25 individuals marked every year over 25 marking occasions. The data were simulated with constant apparent survival probabilities Φ = 0.5, 0.9, reencounter probability p = 0.3, 0.5, 0.9 and identification probability Θ = 0.9, 0.95, 0.99, 1

| Model performance in simulation study on a small dataset
Simulations with smaller sample sizes (individuals marked only once and observed over nine primary occasions) and with lower reencounter and correct identification probabilities revealed differences between RDM and RDMa models (Figure 2; Table S2).RDM

| Model running time comparison
The novel models had longer run times than the CJS model. With 400 individuals marked at the first occasion and followed over nine observation sessions, the CJS model required only 0.3 min to draw 1,000 MCMC samples while the RDM model required 10 min (Figure 3). The RDMa model in the same settings showed an intermediate running time of 3 min.

| Evaluation of bias in time-dependent models
In the dataset without misidentification (Θ = 1), a CJS model with a linear time trend in survival probability Φ T p ⋅ provided unbiased estimates of the slope.However, when a 5% probability of misidentification was introduced (Θ = 0.95), the CJS model estimated a spurious decrease in survival probability of −0.033 (LCI: −0.040, UCI: −0.027, based on 100 simulations) per time unit (Figure 4), while RDM and RDMa models provided unbiased estimates of Φ.

| Analyses of the empirical black-tailed godwit data
Applied to the existing black-tailed godwit demographic data, the naïve CJS model estimated mean annual adult survival probability as 0.939 (0.925, 0.951; here and below median and 95% credible intervals) and juvenile survival as 0.997 (0.984, 1.0; see Figure 5 for the annual survival estimates). The CJS model also detected a declining linear trend in adult survival of −0.14 ± 0.029 (logit scale).

| DISCUSSION
Misidentification errors in capture-recapture data sampling are likely to be especially common when marks are identified from some distance and by less trained persons. We demonstrated that erroneous individual identifier assignments bias estimates of survival probability obtained from misidentification-naïve CJS models. For the intercept-only model (Φ ⋅ p ⋅ ), we showed that Φ is always overestimated while p is underestimated. Four findings in the simulation results (Figures 1 and 2; Tables S1 and S2) are important to mention for the CJS models. (a) The average bias in parameter estimates increased with decreasing survival: over a fixed period of time, the shorter an individual lives, the more time is left after its death for wrong assignments. (b) The bias also grew with an increase in reencounter probability: misidentifications are conditional on encounters, so the more encounters are made, the more misidentifications may happen. (c) The bias increased and the coverage decreased with decreasing probability of correct identification. (d) Finally, bias in survival increased from single (Figure 2) to multiple marking occasions (Figure 1). These dependencies indicate that when misidentification occurs, the estimates by naïve CJS models will be biased most for species with low survival probabilities. Ironically, an increased number of reencounters in the mark-reencounter programme will not improve but worsen the bias. Furthermore, the intuitive assumption that the longer the dataset, the more resistant it will be to statistical bias, is contradicted in this case. In fact, long-term datasets are more prone to the confounding effects of individual misidentification.
Our novel RDM model provided unbiased estimates of survival probability in all simulations, even if we have only five marked

FIGURE 2 Average bias of parameter estimates from the simulation study with individuals marked once and followed over nine occasions. One hundred datasets were simulated for each combination of constant apparent survival probability Φ = 0.9, reencounter probability p = 0.3, 0.5, 0.9 and identification probability Θ = 0.5, 0.7, 0.9, 0.95, 1 (Φ ⋅ p ⋅ Θ ⋅ ), and each was analysed with three different models (CJS, RDM and RDMa). The dots and error bars show the median of absolute bias and the 95% credible intervals over 100 simulations

S2). When sample size is large, parameter estimates from RDMa are indistinguishable from those of the RDM model (Figure 2; Table S2). The benefit of the RDMa over the RDM model is a faster run time for large sample sizes. We thus recommend using the RDMa over the RDM when a dataset is large (≥200 marked individuals), and the probabilities of resighting (≥0.7) and of correct identification (≥0.9) are high.
Comparing the results of the analyses of the black-tailed godwit data using CJS and RDMa models corroborates the finding that ignoring misidentification seriously biases estimates of survival. Just as in the simulations, the CJS estimates of adult survival were higher than the estimates from the RDMa model (0.94 by CJS vs. 0.83 by RDMa). Also, in agreement with the simulation study, the CJS model overestimated juvenile survival even more strongly than adult survival (0.997 by CJS vs. 0.24 by RDMa). This confirms the notion of Schwarz and Stobo (1999) that the lower the real survival probability is, the more it is overestimated by CJS models when misidentification occurs. The RDM model thus improved the estimates of both juvenile and adult survival.
This improvement, however, comes at a cost in the form of a limitation: the RDM model requires that all individual identifiers used in the study are incorporated in the analysis. For example, in the black-tailed godwit case study, we could not estimate adult survival without modelling juvenile survival, because these two groups shared a colour marking scheme and thus misidentifications between them may have occurred. While all individuals from the current scheme should be included in the analysis, it can also happen that similar marking schemes exist elsewhere, making it possible to misidentify animals between schemes. There are two ways in which such mistakes can happen: (a) animals from the current programme can be mistakenly assigned to another programme and (b) animals from other schemes can be read as animals from the current programme. The first case does not create bias for the current programme, as the wrong observation will never be recorded in the current scheme and will just decrease the estimated resighting probability. The second case will create an excess of false observations and will bias parameter estimates. While such situations should generally be avoided by design, occasional misidentification between schemes should not create large biases in parameter estimates.
[...] (Φ ⋅ p ⋅ Θ ⋅, Φ = 0.7, p = 0.9, Θ = 0.95, 25 individuals marked every occasion). In contrast to the CJS (red lines), the temporal trends in survival probabilities estimated by the RDM (blue) and RDMa (green) models were correctly estimated not to be different from zero, which is also illustrated by their almost identical estimates to those from the CJS model run on the same data before misidentifications were introduced (black lines).

FIGURE 3 Average time required for obtaining 1,000 MCMC samples with JAGS for CJS, RDM and RDMa models. The data were simulated with constant apparent survival Φ = 0.9, reencounter p = 0.9 and identification Θ = 0.9 probabilities (Φ ⋅ p ⋅ Θ ⋅) and 5, 20, 100, 400 and 1,600 individuals marked once at the first occasion followed by nine observation sessions. One hundred simulations were run for every parameter combination. Dots and error bars show the median and 95% quantiles of running times of 100 simulations.

Similarly to other models that rely on a Poisson or binomial error structure, the RDM and RDMa models are sensitive to overdispersion (Lebreton et al., 1992 [...] accounting for overdispersion in our models is necessary.
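As a reminder of what overdispersion looks like in count data such as numbers of reencounters, the sketch below compares plain Poisson counts with counts whose log-rate carries an observation-level normal random effect. All names and parameter values are illustrative assumptions, and the code is not part of the RDM implementation.

```python
import math
import random
import statistics


def poisson_draw(rng, rate):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def dispersion_index(counts):
    """Variance-to-mean ratio: about 1 for Poisson data, above 1 if overdispersed."""
    return statistics.pvariance(counts) / statistics.fmean(counts)


def simulate_counts(rng, n, mean_rate, sd_log):
    """Counts with a normal observation-level random effect on the log rate."""
    return [poisson_draw(rng, mean_rate * math.exp(rng.gauss(0.0, sd_log)))
            for _ in range(n)]


rng = random.Random(1)
plain = simulate_counts(rng, 20000, mean_rate=3.0, sd_log=0.0)   # pure Poisson
noisy = simulate_counts(rng, 20000, mean_rate=3.0, sd_log=0.5)   # extra-Poisson noise
print("dispersion index, Poisson:", round(dispersion_index(plain), 2))
print("dispersion index, with random effect:", round(dispersion_index(noisy), 2))
```

A dispersion index clearly above 1 signals more variation than a Poisson error structure allows for.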
The current implementation of the RDM model assumes that any individual identifier can be equally mistaken for any other identifier. This assumption clearly does not hold in every study. For example, in a study that uses differently coloured rings, an orange ring is more likely to be mistaken for another orange ring than for, say, a green one. While we agree with Schwarz and Stobo (1999) that misidentifications are generally rare and thus specific correlations between the 'identified as' and 'true' identifiers are not likely to affect the results, we would like to note that monitoring and modelling of the specific identifier-to-identifier correlations is possible within the RDM model. It uses the full 'identified as' to 'true' correlation matrix presented in Table 1 and thus allows monitoring of the matrix values and also implicit modelling of unequal probabilities of misidentification. Exploring the effects of non-random assignment of identifications and developing a model that accounts for them deserve more effort, but lie outside the scope of the current paper.
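To make the contrast between equal and unequal misidentification concrete, the sketch below builds two small probability matrices with 'true' identifiers in rows and 'identified as' identifiers in columns. In the first, errors are spread evenly over all other identifiers, as currently assumed; in the second, errors are concentrated on identifiers sharing a ring colour. The identifiers, the weight of 5 for same-colour confusion and the function names are hypothetical and serve only as an illustration; Table 1 is not reproduced here.

```python
def misid_matrix(identifiers, theta, similarity=None):
    """Row-stochastic matrix of P(identified as j | true identifier i).

    theta is the probability of correct identification; the remaining
    1 - theta is split over the other identifiers, equally by default or
    in proportion to similarity(i, j) when a similarity function is given.
    """
    matrix = []
    for true_id in identifiers:
        weights = [0.0 if other == true_id
                   else (1.0 if similarity is None else similarity(true_id, other))
                   for other in identifiers]
        total = sum(weights)
        row = [theta if other == true_id else (1.0 - theta) * w / total
               for other, w in zip(identifiers, weights)]
        matrix.append(row)
    return matrix


def same_colour_weight(a, b):
    """Hypothetical similarity: same-colour rings are 5 times easier to confuse."""
    return 5.0 if a[0] == b[0] else 1.0


# Identifiers as (colour, code) pairs; purely illustrative.
rings = [("orange", "A1"), ("orange", "A2"), ("green", "B1"), ("green", "B2")]
equal = misid_matrix(rings, theta=0.95)
by_colour = misid_matrix(rings, theta=0.95, similarity=same_colour_weight)

for ring, row in zip(rings, by_colour):
    print(ring, [round(x, 4) for x in row])
```

Both matrices keep the same overall probability of correct identification on the diagonal; only the distribution of the errors over the other identifiers differs.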
[...] individuals. Accuracy and precision of parameter estimates increase with increasing sample size, probabilities of resighting p and correct identification Θ. While the RDM model directly models the full matrix of possible misidentifications, the approximate RDMa model ignores non-independence between individual sightings generated by misidentification. The negative effects of this approximation appear at small sample sizes, and low reencounter and identification probabilities (Figure 3; Table [...]

The simulation study (Figures 1 and 2; Tables S1 and S2) demonstrated that not only the accuracy, but also the precision of parameter estimates from the RDM and RDMa models applied to the capture-resight data without misidentification, when multiple observations of the same individual are possible, is better than the precision obtained from the CJS model. The reason is that the robust design (RD) approach, with a Poisson distribution of observations instead of the Bernoulli distribution used in CJS, does not require the dependent variable to be 'flattened' to binary values (i.e. it utilises the original number of reencounters). The RD model thus utilises more information from the original data, providing more precise parameter estimates, while the misidentification (RDM) part is responsible for the improved accuracy of the estimates. The RD model also solves the problem of boundary estimation in the case of high reencounter probabilities because, instead of estimating the probability of at least one observation (as the Bernoulli-based CJS does), it estimates the expected number of reencounters. Using the connection between the probabilities of non-zero values in Poisson and Bernoulli distributions (Equations 4 and 5), one could also report the expected number of reencounters instead of or together with p in publications. Thus, instead of only reporting p = 0.95 or p = 0.99, one could also provide the expected number of reencounters per primary occasion, 3.00 and 4.61 respectively, as already done in camera trap studies (Gardner et al., 2010; Royle et al., 2009).

[...] Harrison, 2014) [...]led by the addition of random effects (at group, time, individual or observation level; Harrison, 2014). The addition of such random effects is possible at any level of the Bayesian state-space formulation of our RDM model, that is, random effects can be added in the survival, resighting and/or misidentification sub-models. For example, we used individual-level random effects on resighting probability in the black-tailed godwit model. When working with noisy field data, we recommend considering and modelling overdispersion. Clearly, further research on the detection and optimal [...]

The wide application of CJS models in the 1990s stimulated researchers to start demographic monitoring. By now, these long-term monitoring efforts have generated 20-30 years of observations, datasets which have grown ever more sensitive to misidentification events. At the same time, demographic data are increasingly collected by amateur observers in citizen science projects rather than by trained professionals. Automated resightings with photo- or DNA-identification are also becoming more common. These methods facilitate the flow of data, but may well come at a cost of quality. Whenever chances of misidentification exist, we suggest evaluating the misidentification error before making inference from the analysis. The RDM models introduced here allow for accurate estimation of survival when misidentification errors occur.

[...] models estimated Φ and p without bias in such settings even in simulations with only five marked individuals and Θ as low as 0.3, while the estimate of Θ was unbiased only when at least 25 individuals were marked. While optimising slower than the RDM model when sample sizes were small, the RDMa model started to converge faster than RDM when sample size increased to 200 marked individuals (60 min for RDMa vs. 70 min for RDM), but still was more than 10 times slower than the CJS model (3 min). The RDMa model failed to estimate parameter values in simulations with a small dataset and with low probabilities of correct identification. For the simulations with five marked individuals, estimates were biased under all parameter combinations. With 25 marked individuals, estimates of Φ and p were unbiased only for p ≥ 0.9; with 50 individuals RDMa required p ≥ 0.5 and Θ ≥ 0.9, or p ≥ 0.9 and any value of Θ. Under scenarios with 200 marked individuals, RDMa estimates were only biased with p ≤ 0.3 and Θ ≤ 0.5.

To speed up estimation, we can approximate the [...]

[...] RDM model for all parameter combinations. For datasets without misidentification error (Θ = 1), the RDM and RDMa estimates of survival were almost identical to those of the CJS model, with MSE being similar for survival probability but generally lower for resighting probability.

[...] and analysed each with a different model (CJS, RDM, RDMa). The dots and error bars show the median of absolute bias and the 95% credible intervals over 100 simulations.

FIGURE 5 Annual apparent juvenile (left) and adult (right) survival of black-tailed godwits in Friesland, the Netherlands, estimated by a CJS (red) and an RDMa (green) model accounting for misidentification. The correct identification probability Θ was estimated at 0.940 ± 0[...]

2041210x, 2022, 5, Downloaded from https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.13825 by Uva Universiteitsbibliotheek, Wiley Online Library on [22/02/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License.
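The conversion between the resighting probability p and the expected number of reencounters per primary occasion, mentioned in the discussion of the robust design, can be sketched as follows. The sketch assumes the standard Poisson property that the probability of at least one event equals 1 − exp(−rate), which we take to be what Equations 4 and 5 express; the function names are hypothetical.

```python
import math


def expected_reencounters(p):
    """Poisson mean that gives probability p of at least one reencounter."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return -math.log1p(-p)          # -ln(1 - p)


def prob_at_least_one(rate):
    """Probability of one or more reencounters for a Poisson mean `rate`."""
    return -math.expm1(-rate)       # 1 - exp(-rate)


for p in (0.95, 0.99):
    print(f"p = {p}: expected reencounters = {expected_reencounters(p):.2f}")
```

The loop reproduces the values 3.00 and 4.61 quoted in the text for p = 0.95 and p = 0.99. Because the expected number of reencounters is unbounded above, it stays well away from a boundary even when p approaches 1.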