Parameterizing a dynamic influenza model using longitudinal versus age-stratified case notifications yields different predictions of vaccine impacts

Dynamic transmission models of influenza are often used in decision-making to identify which vaccination strategies might best reduce influenza-associated health and economic burdens. Our goal was to use laboratory confirmed influenza cases to fit model parameters in an age-structured, two-type (influenza A/B) dynamic model of influenza. We compared the fitted model under two different types of fitting methodologies: using longitudinal weekly case notification data versus using crosssectional age-stratified cumulative case notification data. These two approaches allow us to compare model predictions when using two different types of model fitting procedures, according to data availability. We find that the longitudinal fitting method provides best fitting parameter sets that have a higher variance between the respective parameters in each set than the cross-sectional cumulative case method. Also, model predictions–particularly for influenza A–are very different for the two fitting approaches under hypothetical vaccination scenarios that expand coverage in either younger age classes or older age classes. The cross-sectional method predicts much larger decreases in total cases from baseline vaccination coverage than the longitudinal method. Also, the longitudinal method predicts that vaccinating younger age groups yields greater declines in total cases than vaccinating older age groups, whereas the cross-sectional method predicts the opposite. These results show that the type of data used to fit a dynamic transmission model can produce very different outcomes, hence multiple fitting methods should be used whenever possible.


Introduction
Seasonal influenza imposes a significant health burden each year, reducing the quality of life for many across the globe [1].Although often viewed as a mild illness typically causing school or workplace absenteeism, influenza can cause significant complications for vulnerable populations such as the elderly or those with weakened immune systems.In order to combat influenza, health jurisdictions may implement vaccination programmes (such as the Universal Influenza Immunization Program in Ontario, Canada) that may target certain age groups, professions, or make vaccines widely available to the public.
Dynamic transmission models can be used to evaluate the effectiveness of control strategies for seasonal influenza, such as targeted vaccinations and vaccine types [2,3,4,5,6,7,8,9,10,11].These models are increasingly essential for decision-making regarding vaccine implementation, since in silico experiments regarding optimal age of vaccination can be done when experiments in real populations are impossible or impractical.For example, a frequent problem addressed in the literature is finding an optimal approach to distributing vaccines [12].Some research has found that targeting younger age groups produces the most benefit in limiting influenza spread and improving health outcomes across a population [13,14,15,16,17,18].However, other research has also shown this may not be the case in all circumstances [19,2].In the past, influenza transmission models have either chosen parameters without a fitting process [20,21], have been fitted to a single year using cross-sectional cumulative cases for that season [22,23], to weekly time series longitudinal data [24,8], to influenza like illness (ILI) data, assuming ILI incidence follows the same patterns as seasonal influenza [22,24], or incorporating data of laboratory confirmed influenza cases [8,23].
Here, we create an age stratified dynamic transmission model of seasonal influenza following similar approaches to Thommes et al. [9], and use positive influenza specimen tests for parameter estimation.Our research questions are (1) to determine whether a dynamic two-strain (influenza A/influenza B) transmission model can be fitted to longitudinal time series data of weekly laboratory confirmed influenza cases spanning multiple years, and (2) to compare the resulting fit and model predictions of the impact of vaccination to the case where the model is fitted only to non-longitudinal, cross-sectional data on age-stratified cumulative attack rates instead.This comparison will help determine how the quality and type of data can impact a model's outcomes, as some regions have more complete surveillance than others.In our model, we seek to estimate the key parameters of influenza transmission for both A and B strains.Previous models have not solely used laboratory confirmed case data for both A and B strains of seasonal influenza, or used the same fitting process to directly compare results when fitting to longitudinal time series data and cross-sectional cumulative case data over multi-year time spans.

Methods
Our model is a compartmental age structured model [25], and we will fit important transmission parameters to longitudinal influenza case data, as well as cross-sectional age stratified case data.
Incorporating age structure is a critical factor, as population contact patterns, and therefore influenza transmission, depend on age.Details of the model development are given in the following sections.

Population Demographics
Our model uses age compartments 1 year in size, starting from 0-1 years, and ending at 99+ years [24,9].The population size and age structure are modelled after the province of Ontario, Canada, to remain consistent with our data on influenza incidence.We also chose Ontario as our study population on account of its relatively large population size and presence of a universal influenza immunization program in the province.When we age the population, we use yearly population projections given by Ontario's Ministry of Finance, which are based on census data, birth/death rates, immigration, and emigration [26,27,28], or census data (for the model's 2011 population) [29].Due to the model's high age resolution, we are able to specify age dependent contact rates.
These contact rates play a crucial role in influenza transmission, and we use a contact matrix which specifies the mean daily duration of contact time in minutes between age groups [30].These contact data are based on studies conducted in the United States, and thus we are making the assumption that contact rates in the region we are modelling are similar.

Influenza Incidence Data and Epidemiology
All data used in our study are publicly available.Data on confirmed influenza cases are available for the province of Ontario, Canada from the years 2010 to 2015 [31].The data give the weekly number of confirmed cases in the province for the specified years.For fitting our model to age stratified cumulative cases, we use the years 2011 to 2016 due to these years having the required data available.The age categories used in the fitting are 0-19, 19-65, and 65+.In our model, we will consider influenza cases caused by both the A and B strains.
The influenza virus in our model has a susceptible-infected-recovered-vaccinated natural history.
For transmission, we use the contact hypothesis [30] where our contact matrix C taken from Table 1 in Zagheni et al [30] defines the average daily time of contact between age groups.We define β i to be the probability that an individual in age group i becomes infected after being in contact with an infectious individual, which in our case is constant across age groups.The time varying force of infection for age group i is given by where I j is the number of infected individuals in age group j and N j is the size of age group j.
Additionally, influenza incidence shows a prominent annual recurrence in the winter months, which has been thought to be caused by a variety of factors such as temperature, humidity, and changes in contact patterns [32,33,34].To ensure this seasonal variation in our model, we use a sinusoidal function [35] and multiply the force of infection by where A is the amplitude of the seasonality function which determines the variation of the basic reproductive number R 0 , and δ determines on what day the maximum value of the seasonality occurs (δ = 0 corresponds to January 1).This formulation is similar to previous work modelling the same dynamic [24,9,20], and we use the derivation found in Thommes et al. [9] to relate β i to R 0 .
Finally, infected individuals recover at a constant rate γ.Also, to model the antigenic drift of the influenza virus [36], we force individuals that have been infected to lose their immunity at a constant rate.In our model, natural immunity loss occurs at rate ρ N .

Vaccination
In Ontario, the primary types of vaccines used are the trivalent inactivated vaccine, the quadrivalent inactivated vaccine, and the quadrivalent live-attenuated vaccine [37].In this region, the recommended individuals to receive vaccination are those aged 6 months and older, and especially individuals in high risk groups or those who may directly transmit to high risk groups [37].
Much like natural immunity, vaccine acquired immunity wanes at a constant rate of ρ V .In our model, we choose ρ V to be a fitted parameter rather than choosing it as a fixed value or assuming it to be equal to ρ N , as was used in previous studies [24,9,20].Finally, those who become infected regardless of vaccinating will not show a reduction in infectiousness.

Model Structure
Our system of differential equations consists of susceptible S i (t), infected and vaccinated V i (t) individuals where i denotes the respective age class an individual belongs to.
The system is integrated with a time step of one day allowing for precise calculation the the daily force of infection as well as sufficient numerical solution accuracy.We use the MATLAB package ODE4 to fulfill our fixed time step requirement.In addition to the 5 year time period for which we have historical influenza incidence data, we run our model with a 10 year burn in period.During the burn in period, we use the 2010 population demographics and maintain the same vaccine uptake rates that were used during our period of interest.
Each year we choose a day near the end of summer (August 31), to age the population [20,24,9].
Individuals are moved to the next age class in one time step, and those in the 99+ age category remain.Then, the population is scaled to match the demographics of the next year's population, as projected by Statistics Canada and Ontario Ministry of Finance.If these more in depth metrics are not available, population birth and death rates may simply be used.Newborns entering the first age category all populate the S 0 compartment.
Next, vaccination occurs on October 1 of each year because in our selected region the majority of vaccination occurs in the fall.In our model, we make the approximation that vaccination of the population occurs before each influenza epidemic begins.Then, at a point t seed we add an external value λ ext to the force of infection for the remainder of the influenza season.This is a hybrid between models by Goeyvaerts et al. [24] and Thommes et al. [9] as we find this small addition to the force of infection grants a smoother transition into each new influenza season as opposed to infecting a bulk amount of individuals all at once on t seed .In addition, for the period of time between seasons, we remove λ ext so no additional new cases arise.A diagram showing the primary transitions of our model is shown in Figure 1.Vertical arrows represent aging, and on the day of the year the population is aged, members in each compartment are added to the corresponding compartment in the next age group.

Parameter Fitting
We compared two methods of fitting our model's parameters: fitting the parameters to longitudinal weekly case notification data spanning multiple years (we will call this the "longitudinal method") and fitting the parameters to cross-sectional age-stratified data that lack a temporal variable (we call this the "cross-sectional method").

Longitudinal Method
We aim to fit the parameters of our model to multi-year longitudinal time series data taken from [31] in a similar manner to Goeyvaerts et al. [24].However, we use laboratory confirmed influenza specimen cases instead of ILI incidence data used by previous models which are based on reported influenza-like symptoms rather than laboratory confirmed cases.
In order to quantify the goodness of fit for a given parameter set, we use a least squares approach: historical weekly incidence of the number of positive influenza cases is compared to our model's corresponding output.We define the number of historical reported cases in week w to be I H w , and the number of cases given by the model in week w to be I M w .In order to directly compare the two quantities, we use the parameter α introduced by Goeyvaerts et al. [24] to scale the model incidence.Here, α captures the probability that an infected individual is symptomatic, visits a medical practitioner and gets tested for the influenza virus which returns a positive result.The sum of squares error is then ∀w To evenly sample the parameter space, we use Latin hypercube sampling [45] to generate 35,000 parameter combinations.Parameter descriptions and fitting ranges (that is, Latin hypercube sampling ranges) are given in Table 1.We then determine each parameter set's sum of squares score over a simulation run.Next, we utilize MATLAB's GlobalSearch algorithm to search for optimal parameter combinations using the parameter sets that offered the lowest sum of squares values.
GlobalSearch attempts to find a function's global minimum, and initializes its search over the parameter space from a user defined start point.In our case, the function we are seeking to minimize is the sum of squares score of our system of differential equations.The input points are used by the solver to determine an initial estimate for a basin of attraction, and the algorithm also generates a set of trial points to be used in finding the minimum.Additionally, upper and lower bounds may be specified for each parameter, which we define as the same bounds used in the Latin hypercube sampling.Any number of runs of the GlobalSearch algorithm may be performed, using a different starting point corresponding to the parameter sets obtained from the Latin hypercube samples for each run.Moreover, maximum runtimes may be specified as well.
γ Mean latent plus infectious period 4 days (fixed) [49] ρN Natural waning immunity rate 1.0-2.Due to the stochastic nature of the process, more runs may result in lower least squares fits, and the available computational resources will be a determinant of how many initial points, and therefore runs, of GlobalSearch are used.In our analysis, we use the 50 best performing parameter sets obtained from the Latin hypercube sampling to use as initial points for the GlobalSearch algorithm.
We also tested a group of random initial points gathered from the top 15% of parameter sets from the Latin hypercube sampling, but they did not provide better results (lower sum of squares) than the aforementioned top 50 sets.

Cross-Sectional Method
For fitting age-stratified cumulative cases over the 5-year period, the available data is from Canada as a whole [31] (Ontario level data does not include age-stratified cases).Thus, we scaled the cases by the proportion of the Canadian population that lives in Ontario in order to remain consistent with the longitudinal method's fitting.
The fitting for the cross-sectional method was identical to the longitudinal method, except we did not compute a difference of squares from the model output to the historical data for each week.
Instead, the difference of squares was computed over total cases over the entire 5-year period.Also, the model output of each age category (ages 0-19, 19-65, and 65+) was separately compared to the corresponding historical data, such that we attempted to fit age-specific number of cases in the model to the age-specific profile observed in the data.

Parameter Fitting Comparison
Time series of the best parameter combinations resulting from the the fitting processes for influenza A and B for the longitudinal method are shown in Figure 2 and Table 2.The plotted results are compared to the historical laboratory confirmed cases over the time period.We used a separate fitting process for each strain, although we assume that the vaccine efficacy and the infectious periods are the same for both.The largest differences in our model emanate from the 2012 season for influenza B. Most parameter sets undershoot the peak in this season, although some achieve much closer fits.Simulations using the cross-sectional method produce the parameter combinations shown in Table 3, with the results for each age category shown in Figure 3   results, the variance of total cases produced by the parameter sets for influenza A (Figure 3A) in each age category are much lower than that of influenza B (Figure 3B).This could stem from the fitting process, and how Globalsearch attempts to find optimal parameter combinations to match the age-stratified data.In the case of influenza A, each search has parameters converge to very similar values, whereas the final values for influenza B have much higher variance in comparison.This could be due to how the infections are spread out across each age category.For example, influenza A has many more cases in the ages 19+ than the ages 0-18.However, influenza B's cases are evenly spread across all ages.For influenza A, the primary differences in parameters for the two fitting methods stem from the parameters A (seasonality amplitude), R 0 (average basic reproduction number), and α (incidence scaling factor).For the seasonality amplitude, we notice that when not required to meet multiple varying seasonal peaks as we did in the longitudinal method, the average amplitude is lower with less variance in the cross-sectional method fits than in its longitudinal counterparts.Similarly, the average R 0 value amongst the parameter sets follows the same pattern: in the cross-sectional method's fits, the average value and variance amongst the sets is lower than the values seen in the longitudinal method's fits.Finally, α is much larger in the cross-sectional method's fits.
For influenza B, the biggest differences in parameters for the two fitting methods stem from R 0 and the waning immunity rates ρ N and ρ V .Similarly to influenza A, the average basic reproduction number is smaller and has less variance amongst the sets in the cross-sectional method compared to the longitudinal method's parameters.Also, the average natural waning immunity rate is smaller as well.An interesting note is that the vaccine conferred waning immunity rate takes on its maximum allowed value in all of the sets for the cross-sectional method.A similar, but less extreme, shift towards the maximum ρ V value occurs in the influenza A parameter sets as well.

Projected Impact of Expanded Vaccination Coverage
These results may be further tested by observing the impact of implementing different vaccination scenarios or strategies for vaccine allocation approaches.Here, we test the changes in outcomes of our model with a targeted vaccine allocation in a scenario where a health jurisdiction is expanding their influenza vaccination program.
When expanding vaccination coverage, an important consideration is targeted distribution of vaccines.The two main strategies are to target children, who are believed to be responsible for the majority of transmission, or to target high risk individuals and age groups, such as the elderly.To test these two scenarios, we will increase the vaccine uptake of younger aged age groups in our model (ages 0-18) by 30% for each age, a strategy which has been believed to indirectly protect other age groups as well [18,17,16].Then, we compare these results to increasing the total number of vaccines administered by the same amount in older age groups instead, which in our population is the ages 55+.
With the longitudinal fitting method for influenza A, vaccinating younger age groups produces a 24.53% drop in total cases on average from baseline vaccination (Figure 4A and Figure 5C).
When targeting older age groups, we see an average reduction in total cases of 13.86%.Total mean confirmed cases and their 95% CIs are found in Table 4. Thus, the vaccination program aimed at the younger age classes provides a small benefit in total case reduction on average compared to a similar program targeting older ages.This stems from the low baseline vaccine uptake in children and their high contact rates with each other as well as middle aged adults.In the case of influenza B, targeting the younger ages gives an average 19.52% drop in total cases, whereas targeting the older age groups gives an average 24.27% drop in the mean (Figure 4B and Figure 5D).Total mean confirmed cases and their 95% CIs are found in Table 4.In this case, vaccinating older age groups produces a small but largely negligible average reduction in the mean of total cases across parameter sets used.Using the cross-sectional method, influenza A results differ from the longitudinal method.In this case, vaccinating older age groups results in the best case reduction (Figure 5A with comparison to the longitudinal model in Figure 5C).When increasing vaccination rates in the ages 55+, we see less than half the total cases than when expanding vaccination amongst ages 0-18.Total mean confirmed cases from the simulations are found in Table 4.For influenza B, vaccinating the younger age groups yields a 31.89%reduction in mean cases compared to baseline, and vaccinating older age groups provides a 50.16% reduction (Figure 5B with comparison to the longitudinal model in Figure 5D).Total mean confirmed cases from the simulations are found in Table 4.In general, the cross-sectional method's fitting predicts a much larger decrease in total cases for any vaccination expansion strategy than the longitudinal time series fitting method.These results reveal that the varying types of data that can be used to fit a predictive model of influenza transmission can produce very different results.

Discussion
We have designed and implemented an age-stratified dynamic transmission model of seasonal cumulative case data from the years 2011-2016 in Canada.We also used this model to evaluate vaccine expansion strategies which target certain age groups.
Using the cross-sectional method, the variance amongst the respective parameters in each of the 50 best sets is generally smaller than that of the variance amongst the parameters found using the longitudinal method.Also, when introducing vaccination scenarios targeting different age groups, outcomes from using parameters derived from the two types of data differ-particularly for influenza A. For example, the cross-sectional method's data predicts much larger decreases in total cases from baseline vaccination coverage than the time series data.Additionally, those simulations show that vaccinating older age groups will provide the most benefit in reducing the total number of cases in the population.Using the longitudinal method, results show that vaccinating younger age groups provides a moderate total case reduction for influenza A, and vaccinating older age groups provides a slight total case reduction for influenza B.
Our model makes some simplifying assumptions.For example, the parameter α, which represents the rate at which an infected individual is symptomatic and visits a physician who in turn administers a laboratory test for influenza which returns positive, is constant across all age groups.In reality, this may not be the case as some age groups may be more likely to visit a physician after becoming ill, or physicians may be more likely to administer tests for certain age groups.We also assume that the laboratory confirmed case data is a consistently uniform sample of all influenza cases.
However, physicians may send in more tests depending on the time of year or when they perceive the prevalence of influenza is higher.Finally, we assume that vaccine efficacies are the same for both A and B strains, and that the infectious period is the same for both as well [9].
There are some differences in the data used which hinder direct comparisons.The age-stratified cumulative case data gives country wide cases, whereas the time series data is for the province of Ontario.Although we scale the number of cases country wide by the proportion of Canada that lives in Ontario, the cases will still not be directly comparable.Moreover, the age-stratified cumulative case data available covers a one year difference from the time series data, causing some discrepancy in the number of cases over each 5 year span.
Different regions will often have varying types of data available for influenza attack rates.In this work, we have considered weekly time series confirmed influenza cases and age-stratified cumulative cases, but other research has utilized ILI incidence as well [24,8,52].We have shown that when using an identical fitting process, these different types of data used to fit the model can produce varying results.Thus, when fitting a dynamic transmission model for influenza, the quality of case notification data used is an important aspect that impacts model outputs.

Figure 1 :
Figure 1: Diagram of the age-stratified SIRS compartmental model with vaccination.See Methods for definitions of parameters and variables.

α
Encompasses the range used by similar models[9,24], and based on research deducing that antibodies may wane near the end of a season[50].λextInfectionsoriginating from an outside source.(Valueadded to the force of infection) 0.01-0.2Assumption.Scaling factor of model incidence 0.0005-0.15Estimate based on[51].*Also based on preliminary Latin hypercube sampling.Wider ranges were originally used, but the best results were contained within ranges shown above.**Ranges used for influenza B.

Figure 2 :
Figure 2: Time series of confirmed influenza cases (black) and the fitted model (grey) for (A) Influenza A and (B) Influenza B. Shaded region represents 95% confidence intervals.

Figure 3 :
Figure 3: Age-stratified cumulative cases for influenza A and B compared to empirical targets in our model.Target number of cases from the emprical data are given by X's where applicable.(A) Cross-sectional method for influenza A. (B) Cross-sectional method for influenza B. (C) Longitudinal method with age-stratified results for influenza A. (D) Longitudinal method with age-stratified results for influenza B. From bottom to top, each line in each boxplot shows the following information: minimum value, first quartile, median, third quartile, maximum value.Red crosses are considered outliers.

Figure 4 :
Figure 4: Time series of confirmed influenza A and B cases in the model under different vaccination scenarios for (A) Influenza A and (B) Influenza B.

Figure 5 :
Figure 5: Cross-sectional method results of model predictions for influenza A and B cases under different vaccination scenarios, including comparison to the longitudinal method's corresponding results.Subpanels show number of cases for (A) Influenza A, (B) Influenza B, (C) Longitudinal method with age-stratified results for influenza A, (D) Longitudinal method with age-stratified results for influenza B. From bottom to top, each line in each boxplot shows the following information: minimum value, first quartile, median, third quartile, maximum value.Red crosses are considered outliers.

Table 1 :
Parameter descriptions, fitting ranges and literature sources.

Table 2 :
Best fitting parameter values (mean and standard deviation) for the longitudinal method.Parameter Mean Value, A Strain Std.Dev.Mean Value, B Strain Std.Dev.

Table 3 :
Best fitting parameter values (mean and standard deviation) for the cross-sectional method.

Table 4 :
Mean number of cases for influenza strains A and B under different vaccination scenarios.