Patterns of zero and nonzero counts indicate spatiotemporal distributions, aggregation, and dispersion of invasive carp

.


Introduction
Silver carp Hypophthalmichthys molitrix (Valenciennes, 1844) and bighead carp H. nobilis (Richardson, 1845), collectively referred to as bigheaded carp, are large pelagic fishes native to Asia and introduced into North America (Kolar et al. 2007).Bigheaded carp have spread throughout most of the Mississippi River Basin, USA, and into major sub-basins such as the Missouri, Illinois, Ohio, and Lower Mississippi rivers.Their invasions have the potential to disrupt ecological processes due to the large amounts of plankton they consume (Sass et al. 2014;Tumolo andFlinn 2017, 2019;Collins and Wahl 2018).Knowledge of the spatial and temporal distributions of these fishes and their seasonal aggregation patterns are key to the development of effective monitoring and harvesting programs.Bigheaded carp are highly mobile species.Their distribution and movements in impounded rivers are under study but are not yet well understood (Coulter et al. 2018;Prechtel et al. 2018).Bigheaded carp can disperse great distances, seasonally aggregate in dense schools, and periodically change fine-scale occupancy distributions (Liang and Wu 2015;Glubzinski et al. 2021).In impounded river systems they occupy different habitats in different seasons (Ridgway and Bettoli 2017;Prechtel et al. 2018).At the invasion front where bigheaded carp are expanding in distribution and increasing in abundance, not all seemingly suitable habitats have been occupied or saturated.As a result of these differences in distribution, movement, and habitat saturation, catches of bigheaded carp can be extremely variable and include a large proportion of zeros.This variability in catch counts can aid interpretation of observed patterns of spatiotemporal distributions, aggregation, and dispersion.The catch can be dissected into two separate components: proportion of nonzero counts (p) and magnitude of nonzero counts (C).These two components are combined by the standard catch estimator that combines zero and nonzero counts and is equivalent to  *  (Stefánsson 1996).Partitioning catch into its p and C components may help understand a species spatiotemporal aggregation and distribution.By itself, p predicts the probability of presence (i.e., encounter probability) and C the nonzero count.Each of these components reflects different aspects of a species distribution.Moreover, spatiotemporal comparisons in p and C may indicate shifts in aggregation patterns.For example, reductions in p (i.e., fewer nonzero counts) coupled with increases in C (i.e., higher nonzero counts) can indicate increases in aggregations.
The objective of this study was to identify spatiotemporal distributions and the seasonal shifts in aggregation of bigheaded carp at an invasion front.A better understanding of seasonal habitat selection and aggregations is a key step to interpreting monitoring data, refining surveillance programs designed to monitor spread, and developing targeted harvest plans aimed at curbing population densities.To this end, we apply a count model approach to assess three aspects of bigheaded carp ecology: spatiotemporal distributions, aggregation, and dispersion.

Study area
This study was conducted in Lake Barkley in the Cumberland River and Kentucky Lake in the Tennessee River, USA (Figure 1).Barkley and Kentucky dams are fitted with navigation locks and located 110 km and 145 km, respectively, upstream from the Mississippi River, the hub of bigheaded carp expansion.Lake Barkley has an area of 230 km 2 , length of 216 km, and mean depth of 11.0 m.Kentucky Lake, the largest reservoir in the United States east of the Mississippi River, has an area of 650 km 2 , Figure 1.Lake Barkley in the Cumberland River Basin and Kentucky Lake in the Tennessee River Basin, in southeastern North America.Both reservoirs are gateways to a cascade of reservoirs impounded in their respective river systems.
length of 296 km, and mean depth of 14.4 m.Both impoundments are gateways to a cascade of reservoirs impounded in their respective river systems, and many reservoirs in each cascade are connected through navigation locks that delay but do not impede fish movement (Lubejko et al. 2017).These two impoundments are elongated waterbodies with multiple embayments (< 10% of the reservoir area).Sampling for bigheaded carp was limited to the impounded sections in the State (Commonwealth) of Kentucky, roughly the lower 1/3-1/2 of both reservoirs.Bigheaded carp were first recorded in the study reservoirs in 2004 (USGS 2021).
Fish assemblages in the Cumberland and Tennessee rivers have the greatest species richness in the Mississippi River basin.Over 10% of species are native endemic species, the highest level of endemism in the Mississippi basin (Miranda et al. 2021a).Nearly 130 fish species have been recorded in reservoirs of the Tennessee River, with about 90 species in Kentucky Lake alone (Miranda et al. 2021b).Considering this high species richness, potential for competitive interactions with bigheaded carp, and the connectivity allowed by navigation locks, surveillance with the goal of containing the expansion of bigheaded carp is a priority for aquatic conservation in the Cumberland and Tennessee rivers.

Fish collections
Monitoring of adult bigheaded carp has generally relied on gillnet gear, although alternative methods including electrofishing, hydroacoustics, eDNA, and herding are under investigation (Erickson et al. 2016;MacNamara et al. 2016;Bouska et al. 2017;Hammen et al. 2019;Butler et al. 2019;Ridgway et al. 2020).Sampling with sinking, monofilament, experimental gillnets was standardized across the study area.All gillnets were 90-m long, 3.7-m deep, hobbled to 3.0 m every 2.4 m, and included three 30-m panels each with 7.6, 10.2, and 12.7-cm bar mesh.Nets were fished in water 4-8 m deep.Gillnets were fished in 2018-2021, during spring, summer, and fall (except spring 2018) in two discrete macrohabitats including sites in the main stem (channel) of the impoundments and sites in embayments adjacent to the main stem (bay).Seasonal and habitat layers were applied to gather preliminary data and in due course refined sampling effort.With rare exceptions caused by missing nets, a standard 16 net-nights (i.e., collection) were fished per macrohabitat, season, reservoir, and year with each collection extending over four nights.A net-night consisted of a net deployed in late afternoon and retrieved the following morning.Exposure (i.e., soak time) was recorded for each net.All bigheaded carp collected were counted and discarded according to agency protocol.

Data analyses
We modeled the proportion of nonzero counts (p) and magnitude of nonzero counts (C) with a two-stage hurdle model (also known as Delta model; Maunder and Punt 2004).Stage 1 is a binary 0/1 model that represents fish absence (0) and presence (1) in a net, and stage 2 is a truncated count distribution that has no zeros and models the counts ≥ 1 only.Stage 1 is implemented with a logistic regression model with one or more predictor variables.The counts in stage 2 may be modeled with various density distributions but are frequently implemented with a Poisson distribution (i.e., hurdle-Poisson) or a negative binomial distribution (i.e., hurdlenegative binomial) model.The Poisson-lognormal density distribution can be an alternative to the negative binomial, but these two distributions generally provide practically identical results that are hard to distinguish without extremely large sample sizes (Warton et al. 2016).Stage 2 can also have one or multiple predictor variables, but they do not have to be the same variables as for stage 1.Thus, the hurdle model allows the flexibility of accounting for different processes that influence the zero and the nonzero (positive) captures.In preliminary analyses we considered alternative models (i.e., Poisson, negative binomial, zero-inflated Poisson, zero-inflated negative binomial) but these competing distributions were less supported than the hurdle model based on information theoretic model selection, indicating there are excessive zero catches in gillnets explained by the binary model (stage 1).
We fitted the hurdle model to the gillnet captures using the FMM (finite mixture models) procedure of SAS/STAT® software, version 9.4 (SAS Institute Inc., Cary, North Carolina, USA).The FMM procedure fits regression models in which the dependent variable is an integer, non-negative count; the independent variable can be continuous, categorical, or a mix of both types.This count model contrasts with ordinary linear regression in that in linear regression the dependent variable is continuous, and the model can produce negative values (objectionable with counts).The surveillance program's main objective is to track annual changes in bigheaded carp abundance.Therefore, it is necessary to control for spatial (habitat, reservoir) and seasonal sources of variation that potentially mask annual changes in population dynamics.Accordingly, independent categorical variables included in the hurdle model to account for catch variation were reservoir (R; Barkley or Kentucky), macrohabitat (H; channel or bay), year (Y; 2018, 2019, 2020, or 2021), and season (S; spring, summer, or fall): Equation 1 predicts stage 1, the probability p i of nonzero captures; R, H, Y, and S are categorical variables; and b 0 -b 6 are regression coefficients.Exposure (log e E) was added as an offset variable to control for differences in soak time among gillnets.An offset variable accounts for unequal exposure, preserves the cardinal nature of count data, and avoids the need for turning counts into ratios (i.e., fish/h).Equation 2predicts stage 2, the count of fish C i caught in the i gillnet, a discrete variable.Other variables and coefficients in equation 2 are as per equation 1.We included the same categorical variables in equations 1 and 2 because we expected these to influence the zero counts as well as the nonzero counts.In both equations we also inserted the macrohabitat * season interaction because exploratory plots of count data indicated counts in macrohabitats depended on season, and vice versa.
We used Akaike's Information Criterion (AIC) to evaluate support for competing models.The AIC evaluates how well a model fits a data set while penalizing models that use excessive independent variables to guard against over-fitting.A preliminary model comparison indicated a global hurdle model using a negative binomial count distribution (AIC = 2,611.9)was far better supported than a global hurdle model using a Poisson distribution (AIC = 3,315.8).Therefore, we used a negative binomial count distribution for comparisons among all further hurdle models.We used a two-step process to identify the best-supported model structure for the two parts of a hurdle model.In step 1, we identified the best-supported model structure for the binary model part of the hurdle model via backward selection by comparing the AIC of the global model to AIC values of models with all iterations of a single term dropped from the global model.If a simpler model was better supported than the global model, then the simpler model was retained for subsequent backward selection.Backward selection stopped when a more complex model was better supported than all iterations of simpler models.We kept the global model structure for the count-model component when performing backward selection for the binary-model component.In step 2, we performed backward selection on the count-model component beginning with the global model structure and iteratively dropping single terms until identifying the best-supported count model structure.We used the best supported structure for the binary-model component derived in step 1 for the binary-model component in step 2. Bayesian Information Criterion (BIC) can be more appropriate than AIC when candidate models include interaction terms (Dennis et al. 2019).Therefore, we also report BIC values for candidate models as a supplement to AIC.
We investigated how catch of bigheaded carp varied temporally and spatially.For this analysis we applied the best-supported hurdle model identified by the AIC.Model residuals were inspected to verify fits.The focus of this analysis was on examining p and C differences over seasons and habitats.However, reservoir and year were included as covariates to remove their effects and facilitate a more direct comparison among seasons and between habitats.

Results
In all, 691 gillnets were fished over 9,640 h to collect 2,365 bigheaded carp, for an average 3.4 carp per net night.Exposure averaged 14.0 hours (minmax, 11.0-17.3).Silver carp represented 94.7% of the catch and bighead carp 5.3%.Overall, 39.5% of the gillnets collected no bigheaded carp, 19.7% collected one, and 40.8% had multiple captures.The largest catch in a single net was 51 bigheaded carp (Figure 2).Table 1.Evaluation criteria for two-part hurdle models explaining observed catch of bigheaded carps (Hypophthalmichthys spp.) in gillnets in Lake Barkley and Kentucky Lake, 2018-2021.K is the number of estimated parameters.Bayesian Information Criterion (BIC) is calculated as -2(log e -likelihood) + log e (N)*K.N = 691.Akaike's Information Criterion (AIC) is calculated as -2(log e -likelihood) + 2K.ΔAIC is the difference in AIC between the lowest AIC model and alternative model.First, backward selection was performed using ΔAIC for binary presence-absence models while retaining the global negative binomial count model.Second, backward selection was performed using ΔAIC for negative binomial count models with the best-supported binary model.Main effects for macrohabitat and season are implicit for models with a macrohabitat*season interaction.All models included log e (Exposure) as an offset variable (not shown) to control for differences in soak time among gillnets.The encounter probability (probability of nonzero count, p) and catch per net (nonzero catch only, C) components of the gillnet captures were effectively represented by the two-stage hurdle model.In step 1 (p), the hurdle model with terms for reservoir, year, season, and macrohabitat (AIC = 2,609.2;BIC = 2,695.4;Table 1) was better supported than the global model containing the macrohabitat * season interaction (AIC = 2,611.9;BIC = 2,707.2;Table 1).In step 2 (C), the best supported hurdle model for the count component was the global model (AIC = 2,609.2;BIC = 2,695.4).Thus, the best-supported model across both backwardselection steps included terms for reservoir, year, season, and macrohabitat in the binary component and terms for reservoir, year, season, macrohabitat, and season * macrohabitat in the count component.An inspection of residuals and an overlay plot of observed and predicted values (Figure 3) showed the hurdle negative binomial model adequately described the empirical data.

Model
Estimates of p and C showed various patterns over the four categorical variables (Figure 3).For the year effect, predicted p and C decreased after 2018 indicating annual population declines, but the decrease in C was sharper than the decrease in p, as illustrated in Figure 3A, B and quantified by the slopes (b i ) in Table 2. Slopes of −0.274, −0.496, and −0.545 for p translate to drops of 24%, 39%, and 42% from 2018 highs (drop = 1 −    ).In contrast, slopes of −1.567, −2.725, and −2.038 for C in 2019, 2021, respectively, translate to drops of 79%, 93%, and 87% from 2018 highs.For the reservoir effect (Figure 3C, D), predicted p was higher in Barkley than Kentucky by 22% (i.e., e 0.199 -1; Table 2), but C was lower in Barkley than Kentucky by 40% (i.e., 1 -e -0.508 ; Table 2).Estimates of predicted C were highest in spring and decreased though summer and fall.An inspection of the season * macrohabitat interaction indicated predicted C remained relatively steady over seasons in bays but showed large reductions in channels from spring to fall (Figure 3F).For p, the lack of season * macrohabitat interaction  1 explaining gillnet captures of bigheaded carps (Hypophthalmichthys spp.) in Lake Barkley and Kentucky Lake.Binary presence-absence model (equation 1) refers to the encounter probability (p, probability of nonzero count) and negative binomial count model (equation 2) to non-zero counts of bigheaded carps in gillnets.The b i coefficients are the natural logarithms of ratios, and    indicates the magnitude of differences within effects (holding other variables constant).For example, the probability of having a nonzero catch is 1.22 times higher in Lake Barkley (i.e., 22% higher) than Kentucky Lake.Similarly, means of nonzero counts in Lake Barkley are 0.60 of those in Kentucky Lake (i.e., 40% lower).For the season * macrohabitat interaction,    for summer * bay is e (-0.993)+(-0.756)+(1.113)= 0.636.P-values for coefficients were included as interpretation aids for those who rely on hypothesis testing; however, no P-values were used in model selection.indicated p was consistently greater in bays across seasons (Figure 3E).The low p coupled with a high C in channels in spring point to intensified aggregations of bigheaded carp.The larger deviation between predicted and observed C's in spring (Figure 3F) reflects the missing spring 2018 collections and therefore observed estimates that included only the smaller 2019-2021 counts; yet the pattern of higher C in channel remained.

Discussion
Considering gillnet captures from the perspective of a hurdle model requires thinking about two separate but related components of catch: proportion of nonzero captures (p), and magnitude of nonzero captures (C; Figure 4).These two components are generally integrated into a standard catch estimator (i.e., catch =  * ; Stefánsson 1996) that combines zero and nonzero captures.Dissecting catch into p and C conveys additional information about distributions and has various advantages.By itself p is an indicator of density because in low-density situations, an increase in p reflects an increase in abundance (Gaston 1996;Gaston and Blackburn 2000).In sites with low encounter probability, such as an invasion front, examining spatial and temporal changes in p alone may provide the information necessary to monitor expansion.Early in the invasion when habitat is unsaturated, p may be most useful as count data will have excess zeros due to undersaturation.Later in the invasion, as habitats become saturated and zero captures become rare, C may be most sensitive to changes in population densities.Additionally, patterns in p and C pairs can indicate spatiotemporal aggregations.Bigheaded carps are difficult to detect (Cupp et al. 2021).Occupancy methods that incorporate spatially or temporally replicated samples can estimate detection probability to adjust for biases in naïve occupancy and counts (Guillera-Arroita 2017).However, our monitoring program allocated fishing effort with a goal to estimate spatial and temporal distributions of fish rather than to estimate detection probability.Thus, our data would likely violate assumptions needed to estimate detection probability if analyzed within an occupancy framework.Future monitoring might consider concentrating effort within a finite space and time to incorporate detectability into estimated p and C within seasons and macrohabitats identified by our study to maximize catch.Nevertheless, adjustments for imperfect detection can add new biases, and it is not clear when or if adjustment methods outperform the unadjusted p and C, particularly when some assumptions (population closure, no unmodeled structure in counts or detection) go unmet (Welsh et al. 2013).Thus, trying to adjust for detectability could be as misleading as ignoring it.Lacking reliable estimates of detectability in our study, we conjecture that any incomplete detection must have reduced the proportion and magnitude of nonzero counts.
The dynamics of p and C revealed interesting details about patterns in bigheaded carp distributions.Estimates of p and C varied temporally and spatially, sometimes in opposing directions, indicating temporal and spatial shifts in fish distributions and aggregations.For example, dense aggregations were observed in channels in spring, possibly caused by increased main stem discharge that induce spawning behavior, or by residual overwintering in the deeper waters of the main stem channel.These aggregations scattered by summer and fall as shown by reduced C but increased p.This pattern was also apparent in bays but was less distinct, as C's remained little changed but p's increased in summer and fall, possibly reflecting changes in school distributions from channel to bay habitats.There was a decrease in p's and C's over the four-year study.Silver carp do not spawn annually in the study reservoirs but reportedly spawned in 2015 (Ridgway and Bettoli 2017).By 2018, the 2015 cohort was being exploited by the commercial fishery, and by 2021 fishing and natural mortality, and perhaps emigration, had markedly decreased abundance in the reservoirs.There was a conspicuous decrease in C over the four years, but a weaker decrease in p.Although we do not know for certain what caused this pattern, we hypothesize that as abundance declined, school sizes were reduced faster than school numbers, casting fish into smaller schools or solitariness.Lastly, C was lower in Barkley than Kentucky, but p was higher in Barkley than Kentucky, indicating fish aggregate less or schools are smaller in Lake Barkley.Partitioning the standard catch estimator into its p and C components with a hurdle model provided insight into population ecology not immediately obvious when considering the standard catch estimator (i.e., p * C) alone.
The hurdle-negative binomial model indicates an alternative but complementary interpretation of our observations.Schooling and aggregation behavior is common to many cyprinid species (Winfield and Nelson 1991) and has been described for common carp Cyprinus carpio Linnaeus, 1758 and grass carp Ctenopharyngodon idella (Valenciennes, 1844) (Penne and Pierce 2008;Hessler et al. 2023).Reportedly, in their natural environments bigheaded carp exhibit strong schooling aggregations that change seasonally to sustain annual activities such as spawning, feeding, and wintering (Senlin 1988).Conceivably, if bigheaded carp occur mostly in schools in the study reservoirs, the hurdle model may be reflecting incidence and size of aggregations.The binary portion of the hurdle model, which would represent the probability of encountering a school, may be interpreted as the density of schools in a waterbody.Correspondingly, the count portion of the model may be interpreted as the size of the school.Under this interpretation, the temporal and spatial variations in the abundance and size of schools observed may be caused by schools merging and fragmenting seasonally as fish cycle through annual activities (Coulter et al. 2016).Thus, spring seemed to be a period of schools aggregating in channels, whereas fragmentation or restructuring of schools ensued in summer and fall.
The hurdle-negative binomial model adequately described bigheaded carp distributions in the study reservoirs.Nevertheless, this model may not be the best fit in all settings as others may fit best depending on invasion status.Count data for an expanding species commonly have many zeros (Potts and Elith 2006).An expanding species may be absent from many sites because of unsuitable habitat, or from suitable but unoccupied habitat because the species has not yet saturated its new environment (Martin et al. 2005).Thus, collections in waterbodies at the edge of the invasion may have a preponderance of zero captures along with occasional single captures that a binary presence-absence distribution alone may adequately fit.However, as density increases, zero captures may become less frequent and other distributions may become more applicable.Thus, while a binary model may be most suitable early in the invasion, a hurdle-model may work best midway through the invasion, and a count model may provide a better fit in saturated environments late in the invasion.
Our results can also inform development of surveillance and harvesting programs.First, aggregations of bigheaded carp vary seasonally and by macrohabitats as demonstrated by the interplay between p and C.This variation indicates that precision of counts may be improved if the monitoring program is stratified by time and space, or else if a single standard season and habitat are identified for comparative monitoring over time.Choice of macrohabitat may depend on p and C characteristics, but also on the percentage composition of habitats in the reservoirs (i.e., if certain macrohabitats represent only a small fraction of the reservoir, they may not need to be considered).Nevertheless, if composition of macrohabitats in the waterbody is known, multiple macrohabitats may be sampled and estimation of p and C weighted by macrohabitat composition.Second, if the objective is to maximize total catch (i.e., p * C), fishing in bays in summer or fall may produce the largest captures.Targeting spring-time fish aggregation in channels could also produce large total captures if p is increased by refining predictability of school distributions.Additional considerations such as annual distribution of labor workplans, seasonal distribution of bycatch, bycatch mortality rates, and ancillary data needs (e.g., reproductive status, age determination, condition) are likely to assist the selection.Third, applying the appropriate statistical model when considering change over time and space is critical to making correct inferences.Binary models based on presence-absence, count models based on Poisson or negative binomial distributions, or hurdle models that combine binary and count models are pertinent to gillnet catch distributions, but the type of model depends on local densities that may change temporally as bigheaded carp expand their distributions.
Altogether, bigheaded carp are well established in Lake Barkley and Kentucky Lake reservoirs, evidenced by high total captures nearly two decades after their initial reported invasion.Intensive monitoring revealed sources of spatiotemporal variation in these species' catch dynamics.Although seasonal aggregations and habitat use contribute to catch dynamics, hurdle models revealed successive years of population declines potentially attributed to low recruitment, emigration, and mortality.Successfully distinguishing between drivers of seasonal and annual patterns is key to fine tuning monitoring programs, identifying factors that regulate populations, and ultimately, developing management strategies to limit impacts of invasive species.

Figure 2 .
Figure 2. Observed frequency distribution of 2,365 bigheaded carp counted in 691 gillnets fished in Lake Barkley and Kentucky Lake, 2018-2021.The highest count in a single net was 51 bigheaded carp.

Figure 3 .
Figure 3. Encounter probability (p, probability of nonzero count; A, C, E) and catch per net night (C, nonzero catch only; B, D, F) of bigheaded carp.Estimates are given across years (2018-2021), reservoirs (Barkley and Kentucky), and the interaction of season (spring, summer, fall) and macrohabitat (bay and channel).The predicted values were computed holding other variables constant with the model in Table2.The observed values estimated by averaging counts in the 691 gillnets over categories.The larger deviation between predicted and observed C's in spring (panel F) reflects the missing spring 2018 collections and therefore observed estimates that included only the smaller 2019-2021 counts.

Figure 4 .
Figure 4.A conceptual model for interpreting shifts in zero and nonzero catches (p) and counts per net (nonzero counts only, C).Increases in population density are indicated by increases of p and C; conversely, decreases in population density entail decreases in p and C.However, a decrease in p coupled with an increase in C indicates aggregation behavior; conversely, an increase in p coupled with a decrease in C indicates dispersion behavior.

Table 2 .
Coefficients and statistics for the best-supported hurdle model in Table