Contact heterogeneities in feral swine: implications for disease management and future research

Contact rates vary widely among individuals in socially structured wildlife populations. Understanding the interplay of factors responsible for this variation is essential for planning effective disease management. Feral swine (Sus scrofa) are a socially structured species that poses an increasing threat to livestock and human health, yet little is known about their contact structure. We analyzed 11 GPS data sets from across the United States to understand the interplay of ecological and demographic factors on variation in co-location rates, a proxy for contact rates. Between-sounder contact rates strongly depended on the distance among home ranges (less contact among sounders separated by >2 km; negligible between sounders separated by >6 km), but other factors causing high clustering between groups of sounders also seemed apparent. Our results provide spatial parameters for targeted management actions, identify data gaps that could lead to improved management and provide insight on experimental design for quantitating contact rates and structure.

www.esajournals.org PEPIN ET AL.

(Cross et al. 2012, Lavelle et al. 2014, Long et al. 2014). Global positioning system (GPS) devices and ultra-high-frequency proximity loggers (PL) are two of the most commonly used technologies for quantitating contact in wildlife. Both devices are used to quantitate contact rates, social networks, factors driving contact heterogeneities and impacts of contact heterogeneities on disease transmission and management (e.g., Cross et al. 2012, Hamede et al. 2012, Drewe et al. 2013, Lavelle et al. 2014, Long et al. 2014, Podgorski et al. 2014, Williams et al. 2014). Networks are one method for analyzing contact heterogeneities using GPS or PL data from wildlife populations. Network properties that describe contact heterogeneities (e.g., degree, transitivity and connectedness) can be quantitated, and their effects on disease transmission can be studied using disease transmission models (e.g., Bansal et al. 2007, Ames et al. 2011). Degree is the number of individuals to which an individual is connected. Variability in degree distribution affects epidemiological progression by increasing rates of disease spread early during an epidemic and decreasing them later on, relative to a randomly mixing population (Bansal et al. 2007). Networks with some high-degree individuals termed "superspreaders" can reach peak numbers of cases faster and have higher probability of pathogen extinction before an epidemic starts (Lloyd-Smith et al. 2005). Network transitivity describes the proportion of the network that contains individuals whose contacts are interconnected (forming closed triangles: A contacts B and C, and B and C contact each other). Global transitivity can reduce disease transmission because redundancy in local connections leads to local depletion of susceptible individuals (Eames and Keeling 2003).
Connectedness describes how completely a network is connected; for example, unconnected networks include one to several discrete clusters with no possibility for disease transmission between them. Realistic representations of contact heterogeneities in wildlife populations can be used to optimize disease management in terms of when and where to increase surveillance and implement controls (Hamede et al. 2012).

Feral swine: data-poor disease threats
Feral swine (Sus scrofa) are a highly socially structured species that are globally distributed and pose a threat to human and livestock health in many countries. They have been implicated in the spillover of economically devastating livestock pathogens such as classical and African swine fever viruses and Brucella abortus (Penrith et al. 2011, Costard et al. 2013, Kreizinger et al. 2014), as well as zoonotic pathogens such as hepatitis E virus and influenza A virus (Feng et al. 2014, Caruso et al. 2015). Although the ecological capacity of wild pig populations to maintain pathogens or spark outbreaks is uncertain, population density and social structure are thought to be important due to their effects on contact between individuals (Cowled and Garner 2008, Penrith et al. 2011, Costard et al. 2013). To our knowledge, only a single study in one location (Poland) has attempted to quantitate contact heterogeneities of feral swine (or wild boar; Podgorski et al. 2014). The study found significant clustering of individuals and sex-based differences in the duration of associations. In addition, it is well-accepted that dominant boars generally occur alone, while reproductively active sows and their offspring occur in groups called sounders (Mayer and Brisbin 2009). Individuals in the same sounder likely contact each other more frequently than individuals from different sounders, but the relative difference in contact rates within and between sounders, and contact structure among sounders is unknown.
To begin addressing these knowledge gaps, we analyzed GPS data from 207 individual pigs (104 males, 103 females) from 11 populations across the United States (Gaston et al. 2008, Campbell and Long 2010, Cooper et al. 2010, Wyckoff et al. 2012, Hartley et al. 2015 and unpublished data; Supplement S1: Table S1). All except one study included individuals from different sounders, allowing us to focus on variation in between-sounder contact rates, the scale of contact that is likely most heterogeneous and that potentially limits disease transmission most strongly. Goals of the above studies were to estimate contact rates with domestic livestock or quantitate pig movement behavior and territoriality. Thus, our analyses of the data should not be taken as absolute measures of contact rates or heterogeneities. Our goals were to: (1) explore potential causes of contact heterogeneities in feral swine; (2) examine the relationship between contact and spatial distribution of home ranges; (3) provide insight for experimental design of contact studies; and (4) evaluate the effects of missing data on contact structure.

Methods
We calculated pairwise co-location rates for all pairs of individuals co-monitored for ≥7 d, with home range centroids less than 10 km apart. Home range centroids were the median latitude and longitude during co-monitoring. Of the 207 individuals (812 unique pairs), only 3 (three unique pairs) were from the same sounder; thus, our analyses focused on contact rates among sounders. Daily co-location rates were calculated as the number of co-locations in a 24-h period between individuals A and B within time interval T and distance D (fixed at 10 m; distance criterion for co-location; Eq. 1), where $d_{j,k}$ is the distance between location j for individual A and location k for individual B, and $t_{j,k}$ is the time between location j for individual A and location k for individual B ($n_j$ is the total number of locations for individual A and $n_k$ is the total number of locations for individual B). $T_x$ was the time interval between co-locations, where x = 1 week or 15 min. These intervals were chosen because they reflect the most frequent time interval at which GPS locations were taken ("direct contact" = 15 min between co-locations) and realistic persistence times for bacterial and viral pathogens outside the host ("indirect contact" = 1 week between co-locations). In addition, the mean number of daily locations (location intensity) and duration of monitoring were calculated to account for sampling effort variation. Location intensity was calculated from the mean total locations for sounders A and B throughout the entire period of monitoring (i.e., $(n_j + n_k)/2$), divided by the number of days they were monitored (i.e., $[(n_j + n_k)/2]$/days of overlap).
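The co-location count described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names, the track layout (time in seconds, projected x and y in metres), the use of inclusive thresholds and the toy fixes are all assumptions made for the example.

```python
from math import hypot


def daily_colocation_rate(track_a, track_b, days_overlap,
                          max_dist_m=10.0, max_lag_s=15 * 60):
    """Co-locations per day of overlap between two GPS tracks.

    Each track is a list of (time_s, x_m, y_m) fixes in a projected
    coordinate system (an assumption of this sketch). A pair of fixes
    counts as a co-location when it lies within both thresholds;
    whether the thresholds are inclusive is also an assumption.
    """
    count = 0
    for t_a, x_a, y_a in track_a:
        for t_b, x_b, y_b in track_b:
            close_in_time = abs(t_a - t_b) <= max_lag_s
            close_in_space = hypot(x_a - x_b, y_a - y_b) <= max_dist_m
            if close_in_time and close_in_space:
                count += 1
    return count / days_overlap


def location_intensity(n_a, n_b, days_overlap):
    """Mean number of fixes per individual, per day of overlap."""
    return ((n_a + n_b) / 2) / days_overlap


WEEK_S = 7 * 24 * 3600  # "indirect contact" window of 1 week, in seconds

# Hypothetical fixes for two collared pigs.
track_a = [(0, 0.0, 0.0), (900, 5.0, 0.0), (1800, 500.0, 0.0)]
track_b = [(0, 3.0, 4.0), (900, 200.0, 0.0), (259200, 498.0, 0.0)]

direct = daily_colocation_rate(track_a, track_b, days_overlap=4.0)
indirect = daily_colocation_rate(track_a, track_b, days_overlap=4.0,
                                 max_lag_s=WEEK_S)
intensity = location_intensity(len(track_a), len(track_b), days_overlap=4.0)
print(direct, indirect, intensity)
```

Widening the time window from 15 min to 1 week can only add co-locations, which is why the indirect rate is never lower than the direct rate for the same pair.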
For statistical analyses we used linear mixed effects models implemented via the "lme4" package (R software, R Core Team 2013), using maximum likelihood for parameter estimation. Because we were only interested in factors explaining variation in contact rates for pairs that made contact at least once, we excluded the pairs that never made contact (i.e., zeros) in all statistical analyses. To account for error correlation within individuals, two random effects were included: unique identifiers for each of collar A (interactor) and collar B (interactee). Our conceptual approach to analyzing fixed effects was hierarchical because we expected that factors intrinsic to the study design (i.e., location intensity and duration of monitoring) would explain significant variation in contact rates, yet we were not interested in their effects. Thus, we first investigated the effects of these intrinsic factors, with the goal of removing the variation they explained before examining effects of the factors we were interested in. Duration of monitoring did not improve AIC over an intercept-only model (Table 1); thus, we did not consider this covariate further. In contrast, location intensity significantly improved AIC; thus, we used the residuals from the location intensity model as the response in models with covariates such as distance between home range centroids, sex, age class (adult vs. juvenile/subadult) and sounder membership (Table 1). In addition, because distance between home range centroids is another factor that is intrinsic to the study design, we looked at the effects of sex, age and sounder membership both before and after accounting for effects of distance between home range centroids (i.e., response was residuals from a model with just location intensity vs. residuals from a model with location intensity and distance between home range centroids; Table 1).
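The two-stage logic (remove variation due to location intensity, then model the residuals) can be illustrated with a short Python sketch. It deliberately swaps in ordinary least squares with a single predictor and no random effects, so it is not the mixed-model analysis described above; the helper names and the toy data are hypothetical. The log transformation of the response follows the description given for contact-rate models in this section.

```python
from math import log


def ols_fit(x, y):
    """Intercept and slope of a simple least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(x, y):
    """Observed minus fitted values from the simple least-squares line."""
    intercept, slope = ols_fit(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical pairs: daily contact rate, fixes per day, centroid distance (km).
rate = [0.0, 2.0, 1.0, 0.5, 0.2, 4.0]
intensity = [10, 48, 24, 24, 12, 96]
distance_km = [5.0, 0.2, 0.8, 1.5, 3.0, 0.1]

# Keep only pairs that made contact at least once, then log-transform.
kept = [i for i, r in enumerate(rate) if r > 0]
log_rate = [log(rate[i]) for i in kept]
kept_intensity = [intensity[i] for i in kept]
kept_distance = [distance_km[i] for i in kept]

# Stage 1: remove variation explained by location intensity.
stage1_resid = residuals(kept_intensity, log_rate)

# Stage 2: model what is left as a function of centroid distance.
_, distance_slope = ols_fit(kept_distance, stage1_resid)
print(distance_slope)
```

In this toy data set the stage-2 slope is negative, meaning that after location intensity is accounted for, pairs with more distant centroids still have lower contact rates.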
The distance between home range centroids, sex, age class and sounder membership covariates were examined in separate, univariate models because each variable had different amounts of missing data and we were trying to maximize the amount of data used to examine each factor. Although some of the data sets distinguished three age classes (juvenile, subadult and adult), others only distinguished juvenile and adult, and age-class data were often missing. As we were most interested in the age classes that represent pre- and post-dispersal, we aggregated the data to make two age classes: adult vs. younger than adult. For models with contact rate data as the response, the response was log-transformed. For all fitted models, the residuals were normally distributed as evidenced by normal quantile plots of the residuals.

March 2016, Volume 7(3), Article e01230, www.esajournals.org

We calculated P-values using the t statistics generated by the "lmer" function (lme4 package, R software), assuming that the t-distribution converges to the z distribution at large sample size. We assessed explained variation in the models (absolute goodness of fit) with squared correlation coefficients of observed and model-predicted data. We calculated network properties (mean degree, global transitivity and number of independent clusters) using the "igraph" package in R (Csardi and Nepusz 2006). Network properties were based on undirected, unweighted graphs (i.e., binary adjacency matrices), except that we additionally calculated transitivity from weighted graphs to examine how contact rate heterogeneity affected this property. Mean degree was the average number of edges per node. Global transitivity for the unweighted networks was the ratio of triangles to triplets in the network (Csardi and Nepusz 2006). A triangle was defined as a closed triplet (three nodes connected by three edges), whereas a triplet was defined as three nodes connected by either two or three edges (thus a triangle is a special case of a triplet). Global transitivity for the weighted networks was calculated using the "clustering_w" function in the "tnet" package in R, assuming the geometric mean method for averaging weights, as defined in Opsahl and Panzarasa (2009). The equation for calculating global transitivity of the weighted networks is equivalent to the ratio of triangles to triplets when an unweighted network is used. The number of independent clusters was the total number of single nodes or subgroups of connected nodes that were completely unconnected to the rest of the network. The number of independent clusters was calculated using the "clusters" function in the "igraph" package. For the network analyses, we only included the three studies with the most individuals (N = 20, 18 and 13).
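The two model summaries described above (a P-value obtained by treating a t statistic as a standard normal deviate, and goodness of fit as the squared correlation between observed and predicted values) can be written compactly in Python. This is a sketch with hypothetical function names; it assumes a two-sided test, which the text does not state explicitly.

```python
from math import erfc, sqrt


def p_value_from_t(t_stat):
    """Two-sided P-value treating t as a standard normal deviate.

    Uses the identity 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2)).
    """
    return erfc(abs(t_stat) / sqrt(2.0))


def squared_correlation(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


p_example = p_value_from_t(1.96)  # close to the familiar 0.05 threshold
fit_example = squared_correlation([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(p_example, fit_example)
```

The normal approximation is reasonable only when the sample is large; with few observations it gives P-values that are too small relative to a t distribution.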
We also dropped different amounts of data (from 1 point up to 50% of points) from the networks and recalculated the network properties to investigate effects of missing data. For each number of data points excluded, nodes were dropped at random and the procedure was repeated 100 times.
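The network properties and the node-dropping procedure can be sketched in plain Python. The analysis itself used the "igraph" and "tnet" packages in R; the functions below are stand-in implementations of the unweighted definitions given above, and the six-node graph, function names and random seed are hypothetical.

```python
import random
from itertools import combinations


def build_adjacency(nodes, edges):
    """Undirected, unweighted graph as a dict of neighbour sets."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def mean_degree(adj):
    """Average number of edges per node."""
    return sum(len(nb) for nb in adj.values()) / len(adj)


def global_transitivity(adj):
    """Closed triplets divided by all connected triplets (NaN if none)."""
    closed = 0
    triplets = 0
    for neighbours in adj.values():
        for u, v in combinations(neighbours, 2):
            triplets += 1
            if v in adj[u]:
                closed += 1
    return closed / triplets if triplets else float("nan")


def count_clusters(adj):
    """Number of connected components, counting isolated nodes."""
    seen = set()
    clusters = 0
    for start in adj:
        if start in seen:
            continue
        clusters += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
    return clusters


def drop_nodes(adj, n_drop, rng):
    """Copy of the graph with n_drop randomly chosen nodes removed."""
    dropped = set(rng.sample(sorted(adj), n_drop))
    return {n: nb - dropped for n, nb in adj.items() if n not in dropped}


nodes = list("ABCDEF")
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("E", "F")]
adj = build_adjacency(nodes, edges)

rng = random.Random(1)  # fixed seed so the sketch is repeatable
for n_drop in range(1, len(nodes) // 2 + 1):  # from 1 node up to 50% of nodes
    degrees, clusters = [], []
    for _ in range(100):
        sub = drop_nodes(adj, n_drop, rng)
        degrees.append(mean_degree(sub))
        clusters.append(count_clusters(sub))
    print(n_drop, sum(degrees) / 100, sum(clusters) / 100)
```

Transitivity is left out of the averaging loop because small subgraphs may contain no connected triplets, in which case the ratio is undefined and would need explicit handling.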

Heterogeneity in co-location rates
The maximum daily contact rates among sounders were 45 times higher for indirect relative to direct contact. Direct contact rates did not change with the duration of monitoring (Fig. 1a, c), but did increase significantly with the number of locations recorded per day (P = 0.013 for direct contact and P = 0.001 for indirect contact; Fig. 1b, d, Models 1 and 2 in Table 1). This emphasizes that when using GPS-type data for quantitating contact rates, the interval between locations should approximate the minimum time between the start and end of a perceived disease-relevant contact. For example, if a pathogen requires 5 min of contact for a high probability of transmission, then recording locations at least every 5 min would be important for quantitating disease-relevant contact rates. Furthermore, variability from the duration of monitoring may be weak or insignificant except when durations are very short (<7 d in our case). Additional studies that include long-term monitoring should be undertaken to determine if contact rates change seasonally, which could obscure the potential importance of monitoring duration on quantitation of contact rates.

Effects of distance between home range centroids on contact rates
As previous work has shown that wild pig home ranges are relatively small, we expected that distance between pig home range centroids would be a strong predictor of contact rates. Indeed, contact rates decreased significantly with distance between home range centroids (Fig. 2a; considering pairs with at least one contact only: P < 0.0001 for direct and indirect contact, Model 3 in Table 1), and this relationship remained significant even after variation from location intensity was removed (P < 0.0001 for direct and indirect contact, Model 5 in Table 1). Contacts were rare between individuals in different sounders whose home range centroids were >2 km apart, despite there being many data points from pairs of sounders with home range centroids separated by greater distances (up to ~10 km; Fig. 2a). There were very few contacts for pairs with home range centroids separated by 2-6 km and none for pairs at distances >6 km. Thus, direct disease transmission is likely to be rare between sounders ranging further than 2 km apart and non-existent between sounders separated by 6 km, unless a significant amount of long-distance dispersal occurs during persistence of a given disease.
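To show how the spatial pattern could be applied to new tracking data, the sketch below computes home range centroids as the median latitude and longitude of each animal's fixes and sorts a pair into the distance bands reported above. The use of the haversine great-circle formula, the treatment of exactly 2 km and 6 km, the band labels, the function names and the coordinates are all assumptions of this example; the text does not say how distances between centroids were computed.

```python
from math import asin, cos, radians, sin, sqrt
from statistics import median

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def centroid(fixes):
    """Median latitude and longitude of a list of (lat, lon) fixes."""
    return (median(lat for lat, _ in fixes),
            median(lon for _, lon in fixes))


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = radians(p[0]), radians(p[1])
    lat2, lon2 = radians(q[0]), radians(q[1])
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def distance_band(distance_km):
    """Label a pair by the distance bands reported in the results."""
    if distance_km <= 2.0:
        return "within 2 km: contacts observed"
    if distance_km <= 6.0:
        return "2-6 km: very few contacts"
    return "beyond 6 km: no contacts observed"


# Hypothetical fixes for two pigs from different sounders.
pig_a = [(30.000, -97.000), (30.002, -97.001), (30.001, -96.999)]
pig_b = [(30.010, -97.000), (30.012, -97.002), (30.011, -97.001)]

centroid_a = centroid(pig_a)
centroid_b = centroid(pig_b)
separation_km = haversine_km(centroid_a, centroid_b)
print(round(separation_km, 2), distance_band(separation_km))
```

These bands describe the sampled populations only; they are not a general rule for populations at other densities.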

Effects of sex and age class on contact rates
No significant effects of sex on either direct (Fig. 2b) or indirect contact were apparent (Models 6 and 7, Table 1) after accounting for intrinsic factors, suggesting that any potential sex-based differences may be mostly explained by intrinsic factors. Effects of age class were significant on indirect contact but not direct contact (Fig. 2c) after variation from location intensity was accounted for (Model 8 in Table 1), but not after variation from both location intensity and distance was considered (Model 9 in Table 1). Specifically, indirect contacts between different age classes were significantly lower than indirect contacts between like age classes. The age class results suggest that although age-based differences in contact rates may explain some variation in indirect contact rates among sounders, they may be less important for explaining variation in direct contact rates. Also, the variation in contact rates among sounders that was explained by age class was very small (explained variation = 0.04), and when distance between home range centroids was accounted for, the age class effect became non-significant (variation explained was reduced by 50%), suggesting that potential heterogeneities due to age class could be mostly explained by other factors such as distance between home range centroids. It should be noted, however, that age class data were missing in some studies and ages were defined using different methods across studies. Effects of age class should therefore be tested more rigorously in a study that uses a standardized measure of age for all individuals and is designed to decipher the relative contributions of age-based differences in contact rates compared with home range locations.

Effects of sounder membership on contact rates
Although only three individuals in one study (SC) were from the same sounder, sounder membership explained a significant amount of variation in both direct (Fig. 2d) and indirect contact rates even after location intensity and distance between home range centroids were accounted for (Models 10 and 11 in Table 1). Fig. 2d shows a >20-fold difference in contact rates for pairs from the same vs. adjacent sounders. This difference was roughly fivefold for indirect contact.
Taken together, our results emphasize that heterogeneities in contact rates are substantial in feral swine populations. Most of the variation can be explained by individual, location intensity, distance between home range centroids and sounder membership. Although sex was not significant and age class only explained a tiny amount of variation, it may be worth testing effects of these factors in future studies that are specifically designed to measure contact rates. The random effects for interactor A and interactee B explained the largest proportion of variation (compare absGOF for Model 0 to Models 1-4), highlighting that some individuals are inherently more likely to interact with others. After accounting for location intensity, distance between home range centroids and sounder membership were the two most important factors explaining variation in contact rates after individual-level behavior.

Contact structure
In all three of the studies for which we calculated network properties (FL, TX4 and TX6), average degree (G) and connectedness (C) increase (meaning the value of C decreases) as the time between contacts increases (meaning contact is more indirect; Fig. 3), highlighting that consideration of indirect contacts increases opportunity for disease transmission through increased connectedness between individuals (or, here, sounders; Drewe et al. 2013). Network properties differed across studies and contact type (direct vs. indirect) despite the relatively stable pattern in average distance between home range centroids (d, Fig. 3), suggesting that network structure may describe contact heterogeneities in addition to those imposed by spatial relationships.
Global transitivity for the unweighted networks decreased with increasing time between contacts in two of the studies (FL and TX4), but it showed an inconsistent pattern in the third study (TX6, see T values in Fig. 3). This inconsistency may partly be due to the very weak connectedness of these networks (up to nine distinct clusters): transitivity levels likely vary between distinct clusters, and thus global effects from addition of connections strongly depend on which connections are added. It is also noteworthy that transitivity was very high (≥0.67 for all networks) and connectedness very low in all three networks from geographically distinct populations. When transitivity was calculated by considering contact rate heterogeneities, it was even higher at ≥0.95 for all networks, suggesting that high transitivity may be a common characteristic in wild pig populations. This hypothesis should be validated with a study that is specifically designed to measure contact networks because disease dynamics in disconnected, clustered populations are predicted to be very different from those in fully connected, unclustered ones and disease management is most effective when based on knowledge of contact structure (Eames and Keeling 2003, Hamede et al. 2012). One biological explanation for the high transitivity could be that sounders are more dynamic than is traditionally believed, frequently swapping members with sounders of closely related individuals. To verify this hypothesis, all individuals from multiple adjacent sounders need to be monitored for contact, a study that remains to be done. The networks we presented are based on a small number of sounders (13-20), undoubtedly with missing data (i.e., not all sounders within the area of the sampled individuals were included). We explored the effects of such missing data on network properties by dropping up to 50% of the sounders and recalculating the network properties (Fig. 4).
Degree decreased significantly when 50% of nodes were dropped (Fig. 4). Transitivity was relatively stable with up to 50% missing data for the networks with indirect contacts, but changed in different directions in the direct-contact networks (which were much more clustered). The number of independent clusters was also more sensitive to missing data in the direct-contact networks relative to those with indirect contacts. Thus, networks of wild pig social structure based on indirect contacts may be relatively robust to inherent difficulties with field study design. Nevertheless, to quantitate contact structure every effort should be made to sample all adjacent sounders within the target spatial area.

Conclusion
The 45-fold variation in direct vs. indirect contact rates, and differences in network properties, highlight that appropriately defining disease-specific direct and indirect contact is a crucial first step for designing studies to measure contact rates. To quantitate contact rates in absolute terms for predicting rates of disease spread, a second important consideration is to record locations at a frequency that approximates the minimum time of an effective contact. As expected, within-sounder contact rates appear to be much higher than between-sounder rates (>20 times in our study), but studies with larger sample sizes are needed to quantitate the magnitude of these differences. Between-sounder contact rates strongly depended on home range location, suggesting that between-sounder contact heterogeneities can be partly described using knowledge of wild pig movement ecology. Specifically, we expect disease transmission to be reduced between sounders separated by >2 km and negligible between sounders separated by >6 km. It should be noted, however, that densities of the populations sampled are unknown, and lower density populations could show increased contact across greater distances, a relationship that remains to be studied. In addition to space, other factors causing high clustering may be at play (e.g., access to water, baiting). Studies designed to identify and quantitate these factors, as well as potential seasonality, are imperative for planning effective management of disease threats from feral swine because they directly impact the timing, severity and spatial spread of outbreaks.