A stage-structured Bayesian hierarchical model for salmon lice populations at individual salmon farms - Estimated from multiple farm data sets

Salmon farming has become a prosperous international industry over the last decades. Along with growth in the production farmed salmon, however, an increasing threat by pathogens has emerged. Of special concern is the propagation and spread of the salmon louse, Lepeophtheirus salmonis. In order to gain insight into this parasites population dynamics in large scale salmon farming system, we present a fully mechanistic stage-structured population model for the salmon louse, also allowing for complexities involved in the hierarchical structure of full scale salmon farming. The model estimates parameters controlling a wide range of processes, including temperature dependent demographic rates, fish size and abundance effects on louse transmission rates, effects sizes of various salmon louse control measures, and distance based between farm transmission rates. Model parameters were estimated from data including 32 salmon farms, except the last production months for five farms which were used to evaluate model predictions. We used a Bayesian estimation approach, combining the prior distributions and the data likelihood into a joint posterior distribution for all model parameters. The model generated expected values that fitted the observed infection levels of the chalimus, adult female and other mobile stages of salmon lice, reasonably well. Predictions for the time periods not used for fitting the model were also consistent with the observational data. We argue that the present model for the population dynamics of the salmon louse in aquaculture farm systems may contribute to resolve the complexity of processes that drive that drive this host-parasite relationship, and hence may improve strategies to control the parasite in this production system.


Introduction
Salmon farming has become a large and economically prosperous international industry over the last decades. Norway holds a leading position as a producer of farmed salmonids with an annual production of about 1.2 million tonnes, which is roughly half of the worldwide production (Anonymous, 2015). Further growth in the production of salmonids is in demand (Anonymous, 2015) , but this will come at the cost of increasing risks of pathogen propagation and transmission. Large-scale host density dependence acting on pathogen transmission has been demonstrated in salmon farming production systems, both for macro parasites (Aldrin et al., 2013;Jansen et al., 2012;Kristoffersen et al., 2014) and viruses (Aldrin et al., 2011(Aldrin et al., , 2010Kristoffersen et al., 2009). Of special concern, is the propagation and spread of the salmon louse, Lepeophtheirus salmonis, and this parasite's potentially harmful effect on wild salmon populations (Krkošek et al., 2007).
Mathematical and statistical models are increasingly being used to evaluate infection pathways and risk factors for pathogen propagation and disease development, both in aquatic and terrestrial animal farming (Aldrin et al., 2013(Aldrin et al., , 2011(Aldrin et al., , 2010Salama and Murray, 2013;Murray and Salama, 2016;Jonkers et al., 2010;Diggle, 2006;Höhle, 2009;Keeling et al., 2001;Scheel et al., 2007). When such models are able to describe the main patterns in the host-pathogen population dynamics, including the spread within and between farms, they can be used to predict future infection levels as well as simulate the outcomes of disease mitigation scenarios, examples being interventions to mitigate bovine tuberculosis in Great Britain (Brooks-Pollock et al., 2014) and long term effects of infection control measurements to mitigate salmonid alphavirus (SAV) incidences causing pancreas disease (PD) outbreaks (Aldrin et al., 2015). The Norwegian salmonid production system is exceptionally well suited for developing models for salmon lice infection dynamics because of the wealth of surveillance time-series that document both the spatial locations and population sizes of host populations at risk of infection, as well as salmon lice abundances in these host populations. Coupling these host and parasite population data have provided insights into e.g. how salmon lice spread between farms depending on between-farm distances, and how transmission and parasite abundances depend on local host biomasses (Jansen et al., 2012;Aldrin et al., 2013;Kristoffersen et al., 2014). However, previous models that describe both between and within farm parasite population dynamics have for simplicity typically been autoregressive statistical models focusing on single aggregated measures of parasite infection levels (Jansen et al., 2012;Aldrin et al., 2013;Kristoffersen et al., 2013Kristoffersen et al., , 2014. Alternatively, models have been developed for simulation purposes only (Groner et al., 2014) or focused on the population dynamics on single farms over a limited time period (Krkošek et al., 2010). Most of these approaches have relied heavily on estimates of demographic rates ob-tained in the laboratory (Stien et al., 2005;Revie et al., 2005;Gettinby et al., 2011;Groner et al., 2013;Rittenhouse et al., 2016;Groner et al., 2016).
The aim of the present paper is to formulate a fully mechanistic stage-structured population model for the salmon louse, that also allows for the complexities involved in full scale salmon farming. Furthermore, the model accounts for the hierarchical structure of the data obtained from the production system where salmon lice are counted on subsamples of fish, the fish being aggregated into separate cages and the cages being aggregated to farm. The model estimates parameters controlling a wide range of processes, including effects of temperature on demographic rates, fish size and abundance effects on transmission rates, the different effect sizes, temporal and stage specific effects of a wide range of salmon lice control measures, and distance-based transmission rates between farms. The objectives for developing such a complex population model for the salmon louse are: 1) To evaluate whether estimates of demographic rates obtained in the laboratory seems applicable in full scale production settings. 2) To evaluate the efficiency of different control measures. 3) To explore importance of different sources of infection (e.g. internal versus external sources). 4) To develop a tool that can keep account of the salmon louse populations at the production unit level in salmon farms, based on the successive counting of salmon louse infestations. 5) To develop a tool for short term predictions of salmon louse infection levels. 6) Finally, to develop a sufficiently realistic model that can be used for scenario-simulations exploring the effects of various parasite control strategies. In this paper, we describe the model in detail and discuss the results in relation to objectives 1-5.

Modelling background
Many authors have previously presented models for salmon louse population dynamics. Most of these models are formulated on a continuous time scale (Stien et al., 2005;Revie et al., 2005;Gettinby et al., 2011;Groner et al., 2013;Rittenhouse et al., 2016;Krkošek et al., 2009) , whereas others are formulated at a discrete scale with a time step one day (Groner et al., 2016), which is also the case for our model. However, more important when comparing with our model is that the parameter values in previous models primarily are based on laboratory data or on small scale experimental units in the marine environment. When real production data has been used for estimation, these have been aggregated. For instance, Stien et al. (2005) used only laboratory data published in previous papers, whereas Groner et al. (2013) and Groner et al. (2016) used values from Stien et al. (2005) for some parameters and values from several other previous papers for other parameters. Furthermore, Revie et al. (2005) and Gettinby et al. (2011) used laboratory data from previous studies to estimate the majority of the parameters and real production data to estimate the remaining parameters, but the production data were aggregated over farms and to a monthly time scale. One exception is the study by Krkošek et al. (2009), who estimated their model by experimental data from small scale marine cages, but their model covers only the parasitic stages of the salmon louse.
The present estimating approach is fundamentally different. The parameters are estimated by fitting the model to every lice count collected through the whole production period on each cage at each of 32 farms. However, to ensure that the final parameter values are within biological plausible ranges, we use laboratory data, mostly based on results summarised in Stien et al. (2005), to specify informative prior distributions for many of the parameters. The priors are then updated to posterior distributions by the full scale farm data using Bayesian methods. Since the model is estimated on real data from many different farms under various conditions, it has to simultaneously incorporate many features to handle activities or events that affect lice abundance, including various types of treatments, external infection from neighbouring farms and the movement of fish (and then also lice) between cages at the same farm. Our model is therefore more complex than the aforementioned models.
For instance, our model takes into account and estimate the effect of several different types of treatments, such as medical bath treatments, in-feed treatments and the use of cleaner fish. Of the aforementioned models, the model of Groner et al. (2013) includes the effect of cleaner fish, but the effect size was taken from previous studies from the 1990-ies. Likewise, the models used in Revie et al. (2005), Gettinby et al. (2011) and Groner et al. (2013) include effects of medical treatments, but again the values of the effects were based on previous studies. Our model is designed to allow new interventions to be incorporated and their effect on lice abundances to be estimated from real-time production data.
Fish are always free of lice when they are stocked as smolt to seawater cages, so the lice population at a farm is always initiated by external infection. The models used in Revie et al. (2005), Gettinby et al. (2011) and Groner et al. (2013) include external infection as a constant, estimated from aggregated data. In our model, the external infection is farm-specific and time-varying, depending on abundance of adult female lice at neighbouring farms. Because we intend to maximise the model fit to each lice count at each cage at each farm, some of the parameters in our model are farm-specific or even cage-specific and some are time-varying, whereas other parameters are constant and common for all cages and farms.

Salmon farming and salmon lice
Farm production of salmon comprises of a freshwater juvenile phase, being followed by a marine grow out phase, the latter which is the focus of this study. The production of salmon on a marine farm typically initiates by stocking juvenile smolts to cages (or net-pens) either in spring or in autumn. Salmon are kept in the marine farms for about 1.5 years after which they are slaughtered for food consumption. In Norway, only fish of the same year class of age are kept on a given farm and we term this a cohort throughout the present paper. After slaughtering, it is mandatory to fallow the farm for a period of at least two months before stocking a new cohort of salmon. Fish may occasionally be moved from one marine farm location to an empty farm location, in which case this farm will initially report fish weights larger than expected for smolts recently stocked into the sea.

Data
The main body of data in the present study consist of cage-level data from 32 marine salmon farms in Norway, of which 12 farms are located north of the island Frøya in Mid-Norway ( Figure 1). For each farm, the data covers a full production cycle for farmed salmon, from stocking as smolts to slaughtering as adult Atlantic salmon (Salmo salar L.), including fish production data, lice counts, temperatures and louse control efforts. Salmon were stocked between 2011 and 2013 and slaughtered about 1 1/2 year after stocking, between 2012 and 2014. The number of production units (cages) per farm varied from 3 to 12, but were usually around 8 (mean 7.7). For 9 farms, the fish were moved between cages within the farm during the production period.
Seawater temperatures were measured at 5 m depth at the farms. The average temperature was 9.1 • C, and 95 % of the temperatures were between 3.6 and 15.0 • C. Data on salinity were not available in sufficient detail and have therefore not been used.
The production data consist of daily numbers and mean weights of salmon per cage during the production period, information on movement of salmon between cages within farms and information on antiparasitic lice treatment using chemotherapeutic medicals (day of application and type of medical). Furthermore, the data contain information on stocking of cleaner fish (day and number of cleaner fish stocked), but with limited information on their mortality, and hence also for the number of cleaner fish present at a given day. We do not distinguish between various species of cleaner fish.
The production cycles lasted on average 16.5 months per cage, with on average 140 000 fish per cage, typically more in the beginning of a production cycle and less towards the end. The average minimum and maximum fish weights during a production cycle was 140 g and 5.7 kg, respectively. 89 % of the cages contained cleaner fish in parts of the production period.
As a main rule, lice counts were performed on a sample of at least 10 fish every second week for each cage. The salmon lice were divided into three categories according to developmental stages, i.e. i) chalimus (CH), ii) other mobiles (OM), which consist of pre-adults and adult males, and iii) adult females (AF). There were on average 41 lice counts per cage, with averages (abundance) of 0.23 CH, 0.76 OM and 0.18 AF per fish.
Six different types of antiparasitic medicals were used (Table 1), and there were on average 4.6 events of medical treatments per cage. We assume that the effects of deltamethrin and cypermethrin are equal, since these are similar compounds. Furthermore, when azamethiphos is used in combination with deltamethrin or cypermethrin, we assume it has the same effect as using deltamethrin or cypermethrin alone, since this combination is used when reduced treatment effect is expected due to resistance towards the medicals. The medicals emamectin benzoate and diflubenzuron are given through the feed, typically over a period of around two weeks. These treatments have a relatively low daily effect, but effects last over a prolonged period. The other medicals are applied as bath treatments over a duration of a few hours, with a larger daily effect, but lasting over a shorter period. We assume it is a time delay of ∆ del days (Table 1) after application before the treatments give visible effects. Furthermore, we assume that the duration of the effect, ∆ dur , depends on the seawater temperature according to where δ dur is a constant given in Table 1 and T t 0 is the seawater temperature when the medical is applied. One exception is when hydrogen peroxide was applied, for which ∆ dur is temperature independent and given by ∆ dur = δ dur (in days).
In addition to the detailed cage-level data on the 32 farms, we have more aggregated data on all other Norwegian marine salmon farms. For a farm f at day t, we know the number of salmon, denoted by N SAL tf . We also have an estimate A AF tf of the abundance of AF lice at the farm, based on weekly lice counts on a sample of fish, and therefore also an estimate of the total number of AF lice, given by N AF . Finally, we have the seaway distances between all farms, and we let d f f denote the seaway distance between a farm f and another farm f . Based on these quantities for all other farms, we have calculated an external infection pressure index for each of the 32 farms, which can be seen as a preliminary estimate of external infection pressure at time t. This is a weighted sum of the estimated numbers of AF lice at neighbouring farms, where the weights decrease  Nygaard (2010) and Ottesen et al. (2012). For hydrogen peroxide, the unit for δ dur is days.
by increasing seaway distance to the farm in question. This index is denoted by N AF Ext tf for farm f at time t, and is given by where g(·) is a function decreasing by increasing distance given by This distance function is taken from Aldrin et al. (2013), and is based on a datadriven model for lice abundance estimated from more than eight years of data on all 1400 Norwegian salmon farms that were active in the data period.
We have also calculated a corresponding weighted average of the counted abundance of adult females at neighbouring farms, given by Figure 2 shows the most relevant data for one cage at one farm. The upper panel shows time plots of the seawater temperature (on the left y-axis) and the external infection pressure index. In addition, the first stocking of salmon in this cage is indicated by the vertical pink line and the various medical treatments are shown as blue vertical lines. Finally, the stocking of cleaner fish is also shown as vertical lines. The vertical extension of these lines is proportional to the stocked cleaner fish ratio (on the right y-axis), i.e. the number of stocked cleaner fish divided by the number of salmon. The lower three panels show the counted abundance of lice in the CH, OM and AF categories. 3.0 q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q Adult females data

Model framework 2.4.1 Some model overview and notation
Biologically, the life cycle of the salmon louse consists of eight developmental stages (Hamre et al., 2013). These are aggregated into the following five stages in our model: i) recruits (R, eggs and nauplii larvae), ii) copepodids (CO, infective planktonic larvae), iii) chalimi (CH, sessile lice on fish), iv) pre-adults (PA, mobile lice on fish) and v) adults (A, also mobile lice on fish). The adults are further divided into adult females (AF) and adult males (AM). When an infective copepodid (stage CO) attaches to a fish host, it takes approximately 24 hours before it moults into the CH stage (Krkošek et al., 2009). We ignore this short period and assume that a copepodid enters the CH stage immediately upon attachment to a fish host. The salmon lice count data from fish farms pool the stages PA and AM together in one category (OM), whereas the stages CH and AF are counted as separate categories.
The time resolution of the model is one day. The general idea is that for stage-age a = 0, the lice have developed into the given stage from the previous stage, and that for a > 0, the lice can develop into the subsequent stage. We further assume that within a day, in the following order; 0) lice may be counted on a sample of fish, i) lice may die due to natural mortality or treatment, ii) the surviving lice might develop to the next stage, and finally, iii) fish, with sessile or mobile lice, can be moved to another cage or be slaughtered.
When fish are stocked to marine farms as smolt, they are free of lice, and hence the initial lice transmission is caused by external infections. Not until some lice at the farm have developed into adults, the internal infection process can start. The population model for a farm with two cages is illustrated by Figure 3. In the R and CO stages, the lice are associated with the farm, but not with any specific cage. From the CH stage and onwards, however, the lice infect fish and are therefore associated with specific cages. In the following subsections we describe the various aspects of the population model and how the model is related to lice count data. Table 2 gives an overview of the main notation we use.
The main elements of the model are inspired by the model in Stien et al. (2005), but with some extensions, e.g. to capture management interventions on the farms. The model is validated rather informally, by assessing whether parameter estimates are plausible and by graphical evaluation of predictions ahead in time for five farms where the last 3-11 months of data were not used in the estimation process.
The model is estimated from the available data by a Bayesian approach. We thus need prior distributions for the model parameters. Many of these priors are informative, based on results from laboratory experiments, e.g those reported in Stien et al. (2005). For some priors, however, we use more vague settings (see Section 2 in the Supplementary material for details).

R CH
Cage 2 Figure 3: Overview of the population model for the salmon louse. Lice in the orange, red and green stages are counted, whereas lice in the blue stages are not counted. Lice are associated with a cage from the chalimus stage, here illustrated by a farm with two cages. The d-s, m-s and r-s symbolise development, mortality and recruitment, respectively.

Model for the recruitment stage
At the R stage, lice are not associated with a specific cage, but rather seen as a reservoir of recruits with the potential to infect a fish host at the farm in question in the future. The model is: The first term in (5) represents recruitment (into stage-age 0) from neighbouring farms, also called external recruitment. Here, N AF Ext t−1 is a weighted sum of adult females at neighbouring farms at time t − 1 (see Section 2.3). Furthermore, we assume that these reproduce with a rate r Ext t−1 (see Section 2.4.7 ). Then, can be interpreted as a preliminary estimate of the number of external recruits reaching the farm. However, this accounts for seaway distances to neighbouring farms, but not for the sea currents in the area that may be more or less favourable for a given farm, and which also may vary over time. Therefore we have introduced the modifying factor e Ext t , which is a farm-dependent and time varying modifying factor, see Section 2.5 for an exact definition.
The second term in (5) represents recruitment (into stage-age 0) from adult female lice at the same farm, also called internal recruitment. The product N AF (t−1)a c s AF (t−1)a c is the number of adult females at stage-age a in cage c that survives at time t − 1 and they reproduce with a rate r (t−1)a c (see Section 2.4.6).
The new recruits are summed over all possible stage-ages of the adult females and over all cages.
Eq. (6) keeps track of the number of recruits of stage-age a > 0 that i) survives from the previous time point with survival rate s R (t−1)(a−1) (see Section 2.4.3) and ii) do not develop into the infective CO stage, where d R (t−1)(a−1) is the development rate, i.e. the proportion of recruits that develop into the CO stage (see Section 2.4.4).
The equations for the next stages use similar notation for survival rates, development rates and numbers of lice.
Model for the copepodid stage Here, the development rate d CO (t−1)(a−1)c (see Section 2.4.5) represents the infection rate, i.e. the proportion of the available copepodids that during a day infect fish in cage c and thus enter the CH stage. The sum over cages, d CO (t−1)(a−1) = c d CO (t−1)(a−1)c is then the total infection rate at the farm. This is modelled as independent of stage-age.

Model for the chalimus stage
From the CH stage on, the lice are attached to a fish, and therefore associated with a specific cage. We assume that the attached lice follow the fish if the fish are moved to another cage or if the fish are removed from the farm (including slaughtering and other fish mortality). To handle this, the equations given here for the CH, PA and AF stages are extended slightly (see Section 1.2 in the Supplementary material).
Model for the pre-adult stage

Model for the adult stages
For the adult stage, we distinguish in principle between males and females. However, we assume that males and females have the same survival and development rates and therefore each constitute 50 % of the adults. The main reason for this is that we do not have data on the number of adult males on the fish, and therefore do not have the information necessary for separate estimation of adult male demographic rates.. The equations for adult females are then while the number of adult males is equal to the number of adult females:

Survival rates
We assume that the survival rates may be farm-specific for some stages, and therefore use the index f when convenient, but we sometimes drop the superscript that indicates the stage name. We assume that the total survival rate is the product of three terms; where s nat tf ac is survival after natural mortality, s clf tf c is survival after additional mortality due to cleaner fish predation (independent of stage-age) and s cht tf c is survival after additional mortality due to chemotherapeutic treatment (independent of stage-age). The m-s denote the corresponding mortalities. The two latter terms are relevant only for the CH, PA and A stages. All the three mortality (and survival) terms must lie between 0 and 1, but they have different structures.

Natural mortality
For the R and CO stages, we simply assume that the natural mortality is a constant that is common for both stages, i.e.
For each of the CH, PA and A stages, we assume that the natural mortalities are stochastic processes that can vary over time and between farms, but are common for all cages within a farm and independent of stage-age. This may account for factors that differ between farms and change over time, for instance salinity, which is not included in the model. Furthermore, including these mortalities as farm-specific and time-varying terms improves the fit of the model to data. For each farm and louse stage, the mortality is assumed to follow an autoregressive model of order 1 (AR(1)) on the logit-scale as m nat tf ac = m nat tf = exp(z nat tf /(1 + exp(z nat tf )), Var(ε nat tf ) = (σ nat ) 2 .
Here, z nat tf = logit(m nat tf ) = log(m nat tf /(1 − m nat tf )). Furthermore, λ nat 0 is the expected value on the logit-scale, φ nat an autoregressive coefficient and ε nat tf a white noise process with variance (σ nat ) 2 . These parameters have separate values for each stage. In addition, the time-varying mortalities m nat tf ac are restricted to lie within specified intervals, which are (0.0006-0.02) for CH, (0.002-0.21) for PA and (0.0003-0.70) for A. These limits are motivated from the various studies summarised in Stien et al. (2005), and are simply the most extreme limits of the intervals given in their Table 4.
In addition, we assume m = 1 from stage-age 80 for adults and from stage-age 60 for the other stages. This is an approximation made to save computer time.

Mortality due to cleaner fish
We assume that cleaner fish feed on lice at the PA and A stages only (Leclercq et al., 2014), and that the corresponding mortality for these two stages are equal. Let x clf tf c = N CLF tf c /N SAL tf c be the ratio of the number of cleaner fish to the number of salmon in cage c, at farm f and time t. This ratio depends among others on the mortality of cleaner fish, which has to be estimated, and the model for this is described later in Section 2.5.1. Lice mortality in the PA and A stages due to cleaner fish is then given by where the parameter λ clf is non-negative, such that the mortality always is between 0 and 1. One reason for assuming such a simple model for the effect of cleaner fish (as opposed to the effect of medical treatments discussed below) is that we also must estimate the cleaner fish ratio which is multiplied by the cleaner fish effect.

Mortality due to chemotherapeutic treatment
We assume that the chemotherapeutic treatments introduce extra mortality of lice in some or all the stages CH, PA and A, depending on the type of treatment. However, for simplicity we assume that lice are only affected as long as they stay in the stage they were at the time of treatment. If they manage to develop to the next stage, they are clear of the treatment effect. Let the set of subscripts f cbi denote the i-th application of a chemotherapeutic of type b in cage c at farm f . Assuming that this treatment was given at time t 0 f cbi , we define an indicator variable x cht tf acbi that is 1 when the treatment is active (a period after the treatment is given) for lice at stage-age a, i.e. when where the delay constant ∆ del b and the duration ∆ dur f cbi are explained in Section 2.3.
The mortality due to chemotherapeutic treatment is given by where u cht f cbi is a regression coefficient expressing the effect of the specific application of the treatment. These regression coefficients vary systematically between treatment types, accounting for varying efficiency of different types of treatments. In addition, they vary randomly between different applications of the same treatment type, which for instance may be due to a varying degree of resistance in the lice populations. This is handled by the following formulation In general, the parameters λ cht and σ cht differ between various types of treatment, but are set equal for some treatment types. These parameters are equal for the stages for which a treatment have effect, but the effect of a specific treatment f cbi, represented by the random coefficient u cht f cbi , may vary between stages (this is for simplicity omitted from the notation above).

Development rates
We consider here the development rate from one stage to the next, for the stages R, CH and PA. In all of the population models for lice that we mentioned in the introduction, the development rate is 0 until some, perhaps temperature-specific, minimum stage-age, and afterwards positive and constant. In our opinion, the concept of a strict and absolute minimum development time can be questioned in a population with millions of individuals, and the assumption of a constant development thereafter may be unrealistic. We have chosen to consider the development to the next stage as a time-to-event process, and model it as a discretised version of a Weibull distribution. The Weibull distribution is widely used in statistical models for time-to-event or survival analysis (Aalen et al., 2008). It also gave a better fit to our data than a comparable formulation that included the minimum development time model mentioned above as a special case (data not shown).
When the time to an event is continiuous and Weibull distributed, the event rate (often called hazard in survival analysis) is (δ sc ) −δ s δ s a δ s −1 , where a is the stage-age or time, δ s is a shape parameter and δ sc is a scale parameter (sometimes (δ sc ) −δ s is termed the scale parameter). In our case, it is convenient to re-parameterise this as a function of the median time to event, δ m , and the shape parameter. The event rate then becomes log(2)(δ m ) −δ s δ s a δ s −1 , since the median in the Weibull distribution is δ sc (log(2)) (1/δ s ) .
We use a discretised version of this, i.e. our development rate is the probability to develop to the next stage within a day, and it must therefore also be restricted to be at most one. We assume that the median time to develop may vary over time and between farms, and introduce therefore the subscripts tf a on it. The model for the development rate is then d tf a = min(log(2)(δ m tf a ) −δ s δ s a δ s −1 , 1) for a = 0, 1, . . . .
We further assume that the median development time depends on the temperature history as δ m tf a = c/(T tf a ) δ p , whereT tf a is the average temperature that lice at stage-age a at farm f have experienced, i.e. the average temperature from time t − a to time t, c is a constant and δ p is another constant that performs a power transformation ofT tf a . To get a more clear interpretation of the constant c, we parametrise it as a function of the median development time at 10 • C, denoted by δ m10 . The final model for the median development time then becomes δ m tf a = (10 δ p δ m10 )/(T tf a ) δ p = δ m10 (10/T tf a ) δ p .
The development rate defined by Eqs. (27) and (28) also depends on the stage in the way that the parameters δ m10 , δ s and δ p are stage-specific. One motivation for introducing δ m10 as a basic parameter is that we use results on development times around 10 • C from other studies as prior information, to ensure that our estimates lies within biological plausible ranges. For the R stage, which consists of eggs and nauplii, this prior information is given separate for eggs and nauplii. Therefore, for the R stage, δ Rm10 is the sum of one parameter δ Em10 for eggs and another quantity δ N m10 for nauplii, i.e.
and δ Em10 is also contained in the reproduction factor introduced later in Section 2.4.6.
In the estimation, we restrict δ s to be larger than 1, and the development rate will then be 0 at stage-age 0 and then increase by increasing stage-age. Furthermore, the larger δ s is, the more steep will the development rate increase from 0 to 1 around the median. When δ s > 2, the difference between the mean and the median will be less than 7 %. It should further be noted that the parameter δ m10 is only approximately the median development time, since we consider a time-discrete version of the Weibull distribution.
Assuming a constant temperature T , Stien et al. (2005) modelled the minimum development time as c 1 /(T +c 2 ) c 3 , where c 1 , c 2 and c 3 are constants. They further assumed that c 3 = 2 and estimated c 1 and c 2 . We use a similar formulation for the median development time, but assume c 2 = 0 and estimate c 1 and c 3 . In practice, these two formulations are quite similar for the relevant temperatures and for the estimated values of c 3 = δ p (between 0.4 and 1.3, see Table Table 3).

Infection rate
The infection rate is the proportion of the copepodids that infect fish during a day and thus develop into the CH stage. It is farm-and cage-dependent, but does not depend on stage-age a, except that we assume that development may only happen for a ≥ 1. This is modelled as where Here N SAL tf c and W tf c are the number (in millions) and the average weight (in kg), respectively, of fish in cage c at farm f and time t, and 0.55 is roughly the mean of the natural logarithm of the weight of fish. With this formulation, d CO tf = c d CO tf c will be the proportion of copepodids that infect fish in any cage during day t, and this will always be between 0 and 1. Furthermore, when the proportions or rates are small, the rate d CO tf ac for each cage will approximately be proportional to the number of fish N SAL tf c in the cage and to W δ CO 1 tf c . The parameter δ CO 0f c controls the magnitude of the infection rate conditioned on the number and weight of fish within a given cage. In our model δ CO 0f c depends on cage and farm, reflecting that some farms or cages may be more exposed to infection than others due to for instance sea current conditions. This is handled by the following hierarchical structure: where δ CO 0f is a farm-specific mean and δ CO 0 an overall mean. Furthermore, σ COdf reflects the variability between cages at the same farm, whereas σ COd reflects the variability between farms.

Reproduction factor
The recruitment model, Eq. (5), includes the internal reproduction factor r tac , which is modelled taking into account the following factors: Female adults extrude pairs of egg strings. They can extrude a new set of egg strings within 24 hour after the previous set was hatched, but hatching can take several days (Stien et al., 2005). The number of eggs per string may increase for each consequtive extrusion, which we approximate with stage-age. Finally, not all eggs are viable. In addition, we allow for density dependence in recruitment as suggested by Stormoen et al. (2013), Krkošek et al. (2012) and Groner et al. (2014).
The reproduction factor r tac for internal recruitment at time t, stage-age a and cage c is thus modelled as The first term in Eq. (34), β r 0 , represents the number of viable eggs for the first extrusion. The next term, (a + 1) β r 1 models how the number of viable eggs per extrusion increases by stage-age. The third term, 1/(δ Em t + 1), represents the rate of pairs of egg strings produced per day, which is the inverse of average time between each egg extrusions, which further is approximately the median hatching time plus one day for developing new egg strings. The median hatching time is given by where T t is the seawater temperature and δ Em10 and δ Rp are parameters defined in Section 2.4.4. Finally, the term (1 − exp(−γ r · A tc )) allows for density dependent recruitment. Here, A tc = N AF tc /N SAL tc is the abundance of adult females in cage c at time t. A very large value of γ r corresponds to a model without density dependent recruitment. Of the parameters involved in Eq. (34), we estimate δ m10E , δ pR and γ r and fix β r 0 and β r 1 to 172.5 and 0.2, respectively (see Section 2.5 in the Supplementary material for a motivation of these values).

Reproduction factor for external recruitment
The reproduction factor r Ext t for external recruitment in Eq. (5) is similar to the internal one, but the female lice abundance A tc in Eq. (34) is replaced by a weighted average of the counted abundance at neighbouring farms, A AF Ext t (Section 2.3 and Section 1.1 in the Supplementary material). Furthermore, we assume that all these female lice at neighbouring farms are at stage-age a = 10. The assumed stage-age of 10 is rather arbitrary, but the results are insensitive to this choice.

Modifying factor in the external recruitment
The modifying factor e Ext t for external recruitment is farm-specific, so we include the farm index f as well. At the log-scale, it varies over time around a farm-specific level according to the following AR(1) model: Here, µ Ext f is the farm-specific expected value on the log-scale, φ Ext the autoregressive coefficient and (σ Extar ) 2 the residual variance. Furthermore, µ Ext is the overall expected value and (σ Ext ) 2 the between-farm variance of µ Ext f .

Cleaner fish model
Let S clf tc denote the number of cleaner fish stocked and N clf tc the total number of cleaner fish in cage c at time t. S clf tc is observed, whereas N clf tc is unknown and modelled as where κ clf is the daily constant mortality rate of cleaner fish, common for all farms.

Data model
In this subsection, we describe how the population model is related to the lice count data. Let Y CG tc be the number of lice in count group CG found on n tc counted fish at time t and cage c, where the count groups are either chalimus (CH), adult females (AF ) or other mobiles (OM , i.e, pre-adults and adult males). We assume that these follow a negative binomial distribution with mean µ CG tc = E(Y CG tc ) and a heterogeneity or aggregation parameter n tc ρ CG , such that the variance of Y CG tc is µ CG tc + (µ CG tc ) 2 /(n tc ρ CG ). Deleting the superscript CG and subscript tc for a moment, the probability distribution of Y is We get the total likelihood for each count group by multiplying over all counts, cages and farms. We further assume independence between count groups and get the total likelihood by multiplying the contribution from each count group.
The expected numbers of the various Y CG tc 's are given from the population model as where the role of the factor p CHcount tc is to adjust for under-reporting of CH lice, since they are very small and difficult to count, especially on large fish. We assume that this factor is farm-specific (for instance, the staff at some farms may be more trained or motivated than staff at other farms), and we introduce from now on the index f for farm. Then, the model for p CHcount where where W f tc as before is the mean weight of fish in cage c at farm f at time t. The constant 0.1 is chosen to make it easier to specify prior distributions for β CHcount 0f and β CHcount 1 . Here, β CHcount 1 is common for all farms, but β CHcount 0f varies between farms according to the following hierarchical model: The model was estimated from the data including 32 farms, except the last months (3-11) of data for five of the farms that were used for evaluating conditional predictions. We used a Bayesian estimation approach, combining the prior distributions and the data likelihood into a joint posterior distribution for all model parameters. This was done by Markov Chain Monte Carlo (MCMC) simulations (Gilks et al., 1996). First, several initial chains were run to identify a rough range for plausible parameter values. Then four independent chains were started from slightly different starting values within this range. The first 25000 iterations were used as burn-in to establish convergence, and the posterior distributions were calculated by combining 100 thinned samples from the last 6000 iterations from each of the chains. See Section 4 in the Supplementary Material for more details on the MCMC algorithm. However, for periods with elevated predicted population sizes, abundances of salmon lice were sometimes over-estimated (e.g. AF abundance in August 2013, Figure 3, Section 3 in Supplementary material) and sometimes under-estimated (e.g. AF and OM in first part of September 2013, Figure 4). These large deviations in some predictions are likely to reflect 1) that there are predictor variables that have not been included in the present model (e.g. salinity), 2) substantial uncertainty in some predictor variables like the abundance of cleaner fish in the cages, and 3) that stochasticity, in particular with respect to the infection process, limit our ability to make precise predictions. Accordingly, also the credible intervals for the predictions were wide when elevated abundances of infection were predicted (e.g. Figure 4).
There was substantial underreporting of the number of lice at the CH stage. Depending on the size of the fish, the model estimates suggested that on average only 9% to 19% of the CH lice were counted ( Figure 5). In addition, there was substantial between farm variability in this counting error ( Figure 5). The relationship between observed abundances of lice at the CH stage and predicted values was poorer than for the other stages (OM and AF), even when the underreporting was accounting for (the grey "counting error" area for CH in Figure 4 is wide and includes zero). This can be quantified by the aggregation parameter ρ which was 50-75% lower than for OM and AF (Table 3). This indicates that the information content in the counts of CH stage lice is limited.

Parameter estimates
Posterior mean estimates and credible intervals for parameters in the model are given in Tables 3 and 4. Note that for those parts of the model where we have no data, covariation between parameters in the model may lead to potential bias, i.e. high estimates of one parameter may be compensated for by an associated change in the value of another parameter. An example of this is that mortality and development rates for the R and CO stages and the reproduction rate are related, but without relevant observations to tease them apart. For instance, if the reproduction rate is overestimated, this can be compensated either by increasing the mortality in the R and CO stages (which are assumed to be equal) or by reducing the infection rate (development from CO to CH).
When compared to previously published estimates on stage-specific mortality, there are some notable differences ( Table 5). The estimate of the mean mortality rate at the R and CO stages (λ RCOnat ) was higher than the previous estimate (Table 5). However, note that in our model, this quantity also accounts for nauplii and copepodids that drift away from the farm, in addition to the pure natural mortality. For the CH, PA and A stages, the mortality rates vary over time, but we calculated their overall expectations (averages in the long run) by simulation. The overall expectation for the mortality rate of the CH and PA stages tended to be towards the lower range of previous estimates, while the estimate for the adult stage was within the range of previous studies (Table 5). These estimates must be interpreted with caution since the model assumes equal development rates between genders and, furthermore, that adult males are pooled together with the PA stages in the observational data. More detailed figures on mortality, including parameter uncertainties, are presented in Figures 9-13 in the Supplementary material.
We also used simulations to find the estimated median development times, since the δ m10 parameters given in Table 3 have exact interpretations only in the continuous Weibull distribution. Estimated median development times at 10 • C were similar to mean and minimum estimates from previous studies (Table 6), however with a slightly higher estimate at the CH stage and fairly low estimate for the PA stage. Panel a) in Figure 6 shows how the estimated development rates increases by stage-age at a temperature of 10 • C. These curves differ in principle from the development rates used in all population models mentioned in Section 2.1, since all these models use step functions with a development rate of 0 until a minimum development time and then a constant rate afterwards. For the R and CH stages, however, the estimated cumulative proportion of lice developed to the next stage (panel b) in Figure 6) resembel these step functions. For the PA stage, however, the estimated development rate is fundamentally different from those used in other models, since the estimated development rate for PA is non-zero already after one day. This implies that some lice in the PA stage may develop We obtain an estimate of the daily cleaner fish mortality of 0.027 (C.I. 0.022-0.034, Table 3). This suggest that the cleaner fish population is reduced to its half about 1 month after release. The model confirms that there is increased mortality of lice associated with the use of cleaner fish (Figure 7 and Table 3). With a 10% cleaner fish to salmon ratio, the estimated daily lice mortality (for lice in the PA and A stages) due to the use of cleaner fish, is 0.080 (C.I. 0.060-0.100). This implies a reduction in the life expectancy for adult lice from 8.5 to 5 days with an increase in cleaner fish ratio from 0 to 10%, and a decrease in the life expectancy for pre-adult lice (PA) going from 127 days to 11 days for the same change in cleaner fish ratio. Therefore, in particular for PA lice, the use of cleaner fish is estimated to have a substantial effect on lice survival. However, in the present data, the (estimated) cleaner fish ratio seldom amounted stocked to more than 5%, which seems to be too low to avoid additional treatments, since medical treatments were applied in almost all cages in the data set.
The effects of the various chemotherapeutic treatments are difficult to compare since the assumed duration of the effects varies between treatments and by temperature, and because they affect different stages of lice. Furthermore, since lice develop resistance towards such treatments (Aaen et al., 2015), we expect that the effect will decrease over time. Nevertheless, the estimated expected cumulative mortality of lice (found by simulation) in the PA or A stages due to bath treatments (i.e. non-feed) ten days post treatment at 10 • C, were high for hydrogen peroxide (0.99, C.I. 0.97-1.00), and deltamethrin/cypermethrin (0.94 , C.I. 0.88-0.97) and somewhat lower for azamethiphos (0.74 , C.I. 0.64-0.85). The first two of these are similar to what others have reported for non-resistant lice populations, being 99% for hydrogen peroxide (Groner et al., 2013) and 95% for deltamethrin and cypermethrin (Revie et al., 2005). The estimated effects of all treatment types are further illustrated in Figures 9-13 in the Supplementary material.
Both external and internal recruitment varied substantially over time, and one of the two may dominate the other in certain periods. The estimated proportion of internal recruitment for each farm averaged over the whole production cycle varies from about 4 to 73%. On average over all farms, this proportion is 24% (C.I.   based on a simulation of the spread of lice larvae between 591 farms from a hydrodynamic model that takes sea currents into account (Johnsen et al., 2014). When the abundance of adult female lice was low in the farm, naturally internal recruitment tended to be low, while internal recruitment was estimated to be very important in farms with a high abundance of adult females (Figure 8).
Concerning the reproduction rate, we notice that the estimate of γ r 0 is 493 (C.I 481-498). In practice, this means that there is no evidence of density dependent reproductive rates (Allee effect) in the estimated model, contrary to the hypothesis suggested by Stormoen et al. (2013), Krkošek et al. (2012) and Groner et al. (2014). One reason for this discrepancy is that recruitment from external sources is not considered in the aforementioned models. In our model, internal recruitment is on average of less importance than external recruitment at low abundances of AF lice. Reduced reproductive rates at low abundances of infection are therefore likely to be masked by external recruitment.

Conclusion
The presented process model for the population dynamics of salmon lice in aquaculture farm systems combine 1) a model for the main stage structure of salmon lice; 2) model terms that describe internal and external recruitment of salmon lice to the farms; 3) models for the impact of a management strategies adopted to reduce and control salmon lice infection levels; and 4) allows for stochasticity in aspects of these processes. In addition, we describe a model for the link between the process model and data collected on a routine basis in modern aquaculture. This allows the process model to be fitted to data and updated when new production and salmon lice data from fish farms become available. The model produces estimates of lice abundances in fish farms that correspond well with observed infection levels. Furthermore, it allows reasonable predictions of pre-adult and adult sea lice infection levels to be made for several weeks into the future (Figure 4). We note, however, that the observational data on the number of lice at the chalimus stage vary substantially, and is underestimated to varying degrees between fish farms. This suggests that the data collected on the abundance of lice at the chalimus stages contain limited information in terms of real infection levels in the farms.
The model generates stage specific estimates of development and mortality rates based on real production data. For the planktonic stages, the estimates of mortality rates were high relative to earlier field and laboratory studies, but these are not directly comparable, since our definition of mortality include lice that drift away from the farm. For the chalimus and pre-adult stages, the mortality rates were low compared to previous studies, while it was rather high for the adult stage. One reason could be the poor data support for the early life stages, since we have no data on the planktonic stages and poor data on the chalimus stages. The poor support by data makes it difficult to separate survival-estimates at these stages from other parameters in the model, in particular reproduction rates, development rates and biases in chalimus counts, and perhaps this problem propagates to the pre-adult stage. The median development times for are reasonable. However, we note that the distribution of development times from the pre-adult to the adult stages include the possibility of very short development times. It is unclear why this happens, but one possible explanation could be movement of adult lice between cages within a fish farm. Another explanation could be that some of the non-gravid female adults have been misclassified as pre-adults in the lice count data.
The model is well suited to quantify the effect of different treatments on lice infection levels. This aspect may be useful in monitoring development of resistance to treatments in sea lie populations. Furthermore, to our knowledge this is a first data-based quantification of the effect of using cleaner fish to manage lice levels in full scale salmon farms.
The model allows both internal and external infection processes to be quantified. When there are no adult female lice in a fish farm, all new infections will necessarily come from external sources. When reproducing adult female lice are present, however, they may contribute substantially to infection pressure at the farm. Their impact on local recruitment will depend both on their numbers and their reproductive rate, but also on the external infection pressure. The model did not suggest any evidence of density dependent reproductive rates.
The emphasis on salmon louse control in salmon farming has gravely intensified in later years, resulting in the implementation of a variety of innovative control methods. Some of these methods aim to reduce salmon louse recruitment from external sources, whereas other methods aim at increasing the mortality of parasitic stages of the lice, such as the use of cleaner fish. To tease out actual effects of such control efforts at farm levels, or more so for farms in a production area, is not a simple task, given the interconnected production-system of fish farms and planktonic spread of this parasite. The present model for the population dynamics of salmon lice in aquaculture farm systems may resolve the complexity of processes to a degree that will improve evaluations of effects of various control efforts at farm levels. We believe that using the model as "mathematical laboratory" to explore effects of different salmon louse control strategies in interconnected farms in production areas has a large potential for rationalising area-wise control strategies in an informed manner.