Estimation of influenza‐attributable medically attended acute respiratory illness by influenza type/subtype and age, Germany, 2001/02–2014/15

Background The total burden of influenza in primary care is difficult to assess. The case definition of medically attended "acute respiratory infection" (MAARI) in the German physician sentinel is sensitive; however, it requires modelling techniques to derive estimates of disease attributable to influenza. We aimed to examine the impact of type/subtype and age.

Methods Data on MAARI and virological results of respiratory samples (virological sentinel) were available from 2001/02 until 2014/15. We constructed a generalized additive regression model for the periodic baseline and the secular trend. The weekly number of influenza‐positive samples represented influenza activity. In a second step, we distributed the estimated influenza‐attributable MAARI (iMAARI) according to the distribution of types/subtypes in the virological sentinel.

Results Season‐specific iMAARI ranged from 0.7% to 8.9% of the population. Seasons with the strongest impact were dominated by A(H3), and the iMAARI attack rate of the 2009 pandemic (A(H1)pdm09) was 4.9%. The two child age groups (0‐4 and 5‐14 years old) regularly had the highest iMAARI attack rates, frequently reaching levels of up to 15%‐20%. Influenza B affected the age group of 5‐ to 14‐year‐old children substantially more than any other age group. Sensitivity analyses demonstrated both comparability and stability of the model.

Conclusion We constructed a model that is well suited to estimate the substantial impact of influenza on the primary care sector. Overall, A(H3) causes the greatest number of iMAARI, and influenza B has the greatest impact on school‐age children. The model may incorporate time series of other pathogens as they become available.


an der Heiden and Buchholz

[...] many countries to monitor intensity and spread of influenza. The two major syndrome categories that have been used are acute respiratory illness (ARI) and influenza-like illness (ILI). Among European countries, a considerable variation of case definitions has been employed for both syndromes. 2 ILI case definitions may include the presence of fever 3 (or another systemic symptom) 4 in addition to one or more respiratory symptoms. In contrast, ARI case definitions usually do not require the presence of fever or feverishness. 4 As a result, ILI case definitions are more specific, but less sensitive compared to ARI, and as a corollary, surveillance systems using ILI see more pronounced illness waves and peaks during influenza epidemics. However, because only a portion of all symptomatic influenza cases are captured by ILI case definitions, 1,5-9 ILI surveillance systems are less well suited to describe and capture the burden of disease of influenza.
In 2013, we published results of a cyclic regression model ("sin/cos-excess model") that estimated the excess burden of MAARI during periods of influenza circulation compared to a baseline (using a combination of sine and cosine curves) that was established for all weeks leaving out periods of influenza circulation. 10 We demonstrated that the season-specific attack rate (cumulative incidence) of influenza-attributable MAARI (iMAARI) is lower in older age groups.
We demonstrated also that the order among the child age groups (aged 0-4 and 5-14 years, respectively) varies; that is, in some years children aged 0-4 years had the highest attack rate, in others children aged 5-14 years. Because we worked with ARI data, this model enabled us to estimate the total number and attack rate of iMAARI in primary care. In strong seasons, such as in 2004/05 or 2008/09, up to 9% of the general population (i.e. seven of 82 million inhabitants) were seeking health care due to an influenza infection.
In the sin/cos-excess model, the periods of influenza circulation were determined using virological data. These data also include type and subtype information for influenza, and therefore, we intended to develop a model that allows us to derive the number and the attack rate of MAARI attributable to influenza type and subtype, stratified by age group. We also wanted to overcome a limitation of the sin/cos-excess model, which occasionally produced negative excesses during some weeks of a period of influenza circulation. These occurred, for example, at the end of such a period, because influenza was still confirmed in the laboratory although the impact on the primary care sector had already waned.
Moreover, the sin/cos-excess model could not easily be generalized to estimate the weekly number of MAARI attributable to other relevant pathogens, such as respiratory syncytial virus (RSV), human metapneumovirus (hMPV) or rhinovirus, on which we have been collecting data systematically only since 2013/14. The model presented in this study should be capable of incorporating these data in the future.

| Data
In Germany, national surveillance for influenza on primary care level is organized by the "Working Group for Influenza" ("Arbeitsgemeinschaft Influenza" (AGI); influenza.rki.de), which collects data on medically attended ARI (MAARI). 11 Briefly, sentinel physicians in Germany cooperate in the AGI, covering approximately 1%-1.5% of the total population of approximately 82 million persons. Physicians report aggregated age group-specific frequencies of patients presenting with acute respiratory illness (syndromic surveillance). "Acute respiratory illness" is defined as pharyngitis, bronchitis or pneumonia with or without fever. Data are collected in the following age groups: 0-4, 5-14, 15-34, 35-59 and 60 years and older. Data are sent to and analysed by the AGI (influenza.rki.de), yielding the MAARI attack rate by age group and calendar week. Physicians record illness syndromes regardless of whether an individual patient has already presented earlier in the season with the same syndrome. Thus, an individual patient may contribute illness data more than once in a given season. However, a second occurrence of an illness of an individual patient is only recorded again if at least 2 weeks have passed after the first incident. 10 In addition, a subset of about 20% of the sentinel physicians collects respiratory samples from patients with influenza-like illness (ILI), which are sent to the National Reference Center for influenza (NRCI). During the study period, physicians participating in the virological surveillance arm were requested to take respiratory samples from patients presenting with influenza-like illness. "Influenza-like illness" was defined as acute respiratory illness with fever and cough or sore throat. Physicians were asked to take at least one, but not more than two samples from a given age group (0-4 years, 5-14, 15-34, 35-59 and 60+ years).
All samples were tested and typed in the NRCI by real-time PCR for presence of influenza A, B and the subtypes A(H1N1) and A(H3N2). 10 For the study,

| GAM sample-based Model
We used the age-specific weekly MAARI attack rate $m_t$ as dependent variable and developed a generalized additive regression model 12 with linear link function to analyse this curve. The model was stratified with respect to age group; to simplify the notation, we omitted in the formulas the subscript for age group. We denoted by $i_t$ the (age group-specific) number of respiratory samples with ILI having tested positive for influenza in week $t$. From here on, we refer to $i_t$ as the course of laboratory-confirmed influenza. We made the following assumptions in each age group:

1. The MAARI attack rate $m_t$ can be described as additive composition of a periodic baseline, a secular trend and the iMAARI attack rate.

2. In each season, the course of the laboratory-confirmed influenza, $i_t$, mirrors the course of the attack rate of influenza-attributable ILI (iMAILI) in the total population.

3. In each season, the age group-specific proportion of iMAILI among the iMAARI is approximately constant over the weeks.

We modelled the periodic MAARI baseline using a penalized cyclic p-spline $f(p_t)$ with at most 52 knots (one for each calendar week); here, $p_t$ counts the calendar weeks of the year. The secular trend was modelled by a penalized p-spline $g(t)$ with at most 7 knots (1 for every two seasons).
The third component of the model is $i_t$ multiplied by a season-specific factor $\beta_{s_t}$, where $s_t$ describes the season week $t$ belongs to. The formula of the expected weekly MAARI attack rate for each age group (five models) then looks as follows:

$$E(m_t) = f(p_t) + g(t) + \beta_{s_t}\, i_t$$

Hence, we estimated the expected iMAARI attack rate in week $t$ as

$$\hat{\beta}_{s_t}\, i_t$$

When this model is used prospectively, the inclusion of $i_t$ with a season-specific factor is only useful after the start of a period of influenza circulation, for example as defined in an der Heiden, 10 after the lower confidence limit of the proportion of samples positive for influenza in the NRCI exceeds 10% in two consecutive weeks. A potential season without a period of influenza circulation should be disregarded.
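The start criterion quoted above (lower confidence limit of the proportion of influenza-positive samples exceeding 10% in two consecutive weeks) can be sketched as follows. This is a minimal illustration only: the text does not state which confidence interval is used, so the Wilson score interval is an assumption here, and the function names and sample counts are hypothetical.

```python
import math

def wilson_lower(positives, total, z=1.96):
    """Lower limit of the Wilson score interval (assumed CI method, not from the source)."""
    if total == 0:
        return 0.0
    p = positives / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom

def circulation_start(weekly_counts, threshold=0.10):
    """Index of the second of the first two consecutive weeks whose lower
    confidence limit exceeds the threshold, or None if this never happens.

    weekly_counts: list of (influenza-positive samples, all samples) per week.
    """
    previous_above = False
    for week, (positives, total) in enumerate(weekly_counts):
        above = wilson_lower(positives, total) > threshold
        if above and previous_above:
            return week
        previous_above = above
    return None

# Hypothetical weekly sample counts (positives, total)
example = [(1, 50), (4, 50), (12, 50), (20, 50), (30, 50)]
start_week = circulation_start(example)
```

In a prospective setting, the season-specific influenza term would only be added to the model from `start_week` onwards; a season for which the function returns `None` would be disregarded.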
In weak seasons, the estimated season-specific coefficient $\hat{\beta}_{s_t}$ of a particular age group might be a negative number, as the observed MAARI may lie below the baseline. In this case, the expected iMAARI attack rate in this age group would be negative throughout the season. We then conclude that it cannot be quantified and set it to zero.
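The weekly and seasonal iMAARI calculation, including the handling of a negative season-specific coefficient, can be sketched as below. The coefficient value and the weekly counts are hypothetical; in practice the coefficient would come from the fitted regression model for one age group.

```python
def weekly_imaari(season_factor, positives_per_week):
    """Expected weekly iMAARI attack rate: season-specific factor times the
    weekly number of influenza-positive samples. A negative factor means the
    iMAARI cannot be quantified, so every week is set to zero."""
    if season_factor < 0:
        return [0.0 for _ in positives_per_week]
    return [season_factor * i_t for i_t in positives_per_week]

def season_attack_rate(season_factor, positives_per_week):
    """Cumulative iMAARI attack rate of one season (sum over its weeks)."""
    return sum(weekly_imaari(season_factor, positives_per_week))

# Hypothetical factor (attack rate in % per positive sample) and weekly counts
strong = season_attack_rate(0.05, [2, 10, 20, 8])
weak = season_attack_rate(-0.01, [1, 2, 1, 0])
```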
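In a second step, we subdivided the estimated iMAARI attack rate according to the distribution of types and subtypes in the virological sentinel. We combined the uncertainties of the estimated iMAARI and the (sub)type distribution by drawing random samples from the model normal distribution $N(\hat{\beta}_{s_t}, \hat{\sigma}_{s_t}^2)\, i_t$, where $\hat{\beta}_{s_t}$ is the estimated season-specific coefficient and $\hat{\sigma}_{s_t}$ its standard error, and from the Dirichlet distribution with parameter vector $d_t = (n_t(\mathrm{H1pdm09}), n_t(\mathrm{H1prepan}), n_t(\mathrm{H3}), n_t(\mathrm{B}))$. The latter describes the uncertainty in the (sub)type distribution; $n_t$ is the number of samples having tested positive for the respective (sub)type in week $t$ and its two neighbouring weeks.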
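The sampling scheme described above can be sketched as a small Monte Carlo routine. This is an illustration under stated assumptions, not the authors' implementation: the Dirichlet draw is generated from normalised gamma variates, (sub)types with a count of zero are given a share of zero (a Dirichlet parameter must be positive), and all names and numbers are hypothetical.

```python
import random

SUBTYPES = ("H1pdm09", "H1prepan", "H3", "B")

def window_counts(counts_by_week, t):
    """Sum the (sub)type counts of week t and its two neighbouring weeks."""
    lo, hi = max(0, t - 1), min(len(counts_by_week) - 1, t + 1)
    return {s: sum(counts_by_week[w].get(s, 0) for w in range(lo, hi + 1))
            for s in SUBTYPES}

def sample_dirichlet(alpha, rng):
    """Draw (sub)type shares via normalised gamma variates; zero counts
    are assigned a share of zero (assumption)."""
    draws = {s: (rng.gammavariate(a, 1.0) if a > 0 else 0.0)
             for s, a in alpha.items()}
    total = sum(draws.values())
    return {s: (d / total if total > 0 else 0.0) for s, d in draws.items()}

def simulate_subtype_imaari(beta_hat, beta_se, i_t, alpha, n_draws=1000, seed=1):
    """Simulate the weekly iMAARI attack rate by (sub)type.

    Each draw combines a normal sample of the season-specific coefficient
    (times the weekly positives i_t) with a Dirichlet sample of the shares."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_draws):
        total = rng.gauss(beta_hat, beta_se) * i_t
        shares = sample_dirichlet(alpha, rng)
        sims.append({s: total * shares[s] for s in SUBTYPES})
    return sims

# Hypothetical weekly (sub)type counts from the virological sentinel
weeks = [{"H3": 3, "B": 1}, {"H3": 5}, {"H3": 2, "H1pdm09": 1}]
alpha = window_counts(weeks, 1)
sims = simulate_subtype_imaari(0.05, 0.005, 20, alpha)
```

Percentiles of the simulated values per (sub)type could then serve as uncertainty intervals.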
The expected iMAARI attack rate associated with influenza of subtype $\tau$ is then given by

$$\hat{\beta}_{s_t}\, i_t \cdot \frac{n_t(\tau)}{\sum_{\tau'} n_t(\tau')}$$

where $\hat{\beta}_{s_t}$ is the estimated season-specific coefficient, $i_t$ the weekly number of influenza-positive samples and $n_t(\tau)$ the number of samples positive for (sub)type $\tau$ in week $t$ and its two neighbouring weeks.

Based on the consideration that a high influenza attack rate in one season may lead to a relative immunity of at least 1 year (until the virus has drifted sufficiently to evade immunity), one could postulate that a season with a higher attack rate (overall, or in a particular type/subtype) may be followed by a lower attack rate and vice versa. We therefore attempted to "predict" the magnitude of iMAARI attack rates (by type/subtype and overall) based on the magnitude of the preceding season. To do that, we built thirteen pairs with the iMAARI attack rate of the season of interest as dependent variable and that of the preceding season as explanatory variable (the first season (2001/02) had no preceding season and dropped out). We used a Poisson regression to quantify the associations between the pairs and checked both qualitatively and with pseudo-$R^2$ how well these considerations were met by the data.

[...] estimates for the iMAARI attack rate were compared for all age groups combined and also separately for the five age groups used in this study.

S2: To gauge the effect of varying degrees of freedom in the secular trend, we also considered a model with a more flexible secular trend that allowed 1 df per included season, that is 14 df instead of 7. We estimated the iMAARI attack rates in an ongoing season by using retrospective data to simulate how additional information becomes available as the season evolves. We used season 2013/14 as example and calculated (cumulative) iMAARI attack rates for all age groups combined and for the five age groups separately.

| Sensitivity analysis
S3: To better understand the differences between our previous (sin/cos-excess) model 10 and the GAM sample-based model presented in this study, we compared iMAARI attack rates derived from these two models with each other. Moreover, we considered an intermediate "GAM-excess" model that incorporates characteristics of both the sin/cos-excess model and the GAM sample-based model.
It leaves out periods of influenza circulation for model building (as done in the sin/cos-excess model 10 ) but uses penalized p-splines for the secular trend and the cyclic component (as in the GAM sample-based model).
All estimations including the fitting of generalized additive models (package "mgcv") were performed using version 3.3.0 of the statistical analysis software R. 13

| Model fit
The GAM sample-

| Virological data
The frequency and proportional distribution of (sub)types varied considerably from season to season ( Figure 2, top panel; Table 1). Until

| Influenza-attributable MAARI
In most seasons, we found a considerable amount of MAARI explained by circulation of influenza viruses (iMAARI; red-shaded areas in Figure 1) with a wide variation among seasons (Figure 2, bottom panel; Table 1). The proportion of the population with iMAARI ranged from 0.74% in 2003/04 (0.6 million individuals) to 8.9% in season 2012/13 (7.1 million individuals; Table 1). In the median influenza season, 3.4% of the population (2.8 million individuals) consulted a physician due to their influenza infection (interquartile range, 2.3%-6.0%). [...] In only one season (2013/14), no subtype led to an iMAARI attack rate of more than 0.5% (Table 2). [...]

In eight (57%) of 14 seasons, at least one of the two child age groups had an attack rate of more than 10%. In the pandemic season 2009/10, the age group 5-14 was by far the most affected and had an iMAARI attack rate that was more than two times higher than that of either the 0- to 4- or the 15- to 34-year-old age group. In Table 3, we categorized the age patterns of the season attack rates into three classes: (i) "low" means that all age-specific attack rates in a season were below 1%; (ii) "monotone" means that the pattern was not low and the age group 0-4 had the highest attack rate, which generally decreased towards age group 60+, which had the smallest attack rate or was below 0.5%; and (iii) "skewed hat" means that the season was not low and the age group 5-14 had the highest attack rate, which generally decreased towards age group 60+, which had the smallest attack rate or was below 0.5%. The low, monotone and skewed-hat patterns are shown as icons in Table 3. The typical pattern was "monotone" for A(H1) and A(H3) and "skewed hat" for B.
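The three pattern classes can be expressed as a small decision rule. The sketch below is a simplification: the "generally decreasing" condition is not checked week by week or group by group, only the position of the maximum and the condition on age group 60+ are tested, and the label "other" for seasons that fit none of the three classes is hypothetical.

```python
AGE_GROUPS = ("0-4", "5-14", "15-34", "35-59", "60+")

def classify_age_pattern(rates):
    """Classify the age pattern of season attack rates (in % of the age group).

    rates: dict mapping each age group in AGE_GROUPS to its attack rate."""
    values = [rates[g] for g in AGE_GROUPS]
    # (i) "low": all age-specific attack rates are below 1%
    if all(v < 1.0 for v in values):
        return "low"
    # Age group 60+ must have the smallest attack rate or lie below 0.5%
    oldest_ok = rates["60+"] == min(values) or rates["60+"] < 0.5
    top = max(AGE_GROUPS, key=lambda g: rates[g])
    if top == "0-4" and oldest_ok:
        return "monotone"      # (ii) highest attack rate in age group 0-4
    if top == "5-14" and oldest_ok:
        return "skewed hat"    # (iii) highest attack rate in age group 5-14
    return "other"             # hypothetical label, not used in the source

# Hypothetical season attack rates in %
example = {"0-4": 6.0, "5-14": 14.0, "15-34": 5.0, "35-59": 4.0, "60+": 0.4}
pattern = classify_age_pattern(example)
```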
In the right panel of Figure 5, the age-specific iMAARI attack rates are summed up by (sub)type for those seasons that followed the typical pattern. The median ratio of the iMAARI attack rates of the age

| Sensitivity analysis
S1: Using our GAM sample-based model on different data sets and comparing the estimates for the included seasons showed that the estimated iMAARI attack rate in a given season depends on the whole history of data included (Fig. S1A). Nevertheless, most changes lie inside the range given by the 95% confidence intervals. Moreover, the differences within seasons are substantially smaller than the differences between seasons; hence, the ranking of the seasons regarding the iMAARI attack rate was quite stable. Considering only seasons after the pandemic season 2009/10 leads to systematically higher estimates for the iMAARI attack rates. These are due to deviations in the two child age groups (Fig. S1B). In contrast, estimates for the age [...]

[...] as an example (Fig. S2A). In particular, in calendar week 11/2014 the estimate was too high and had to be corrected downwards thereafter. In addition, iMAARI of the age group 0-4 was quite unstable, as it seemed to overshoot not only in the latter part of the season but also very early on (Fig. S2D). In comparison, the model using the less flexible secular trend resulted in more stable estimates (Fig. S2C). This [...]

[...] The one exception is the pandemic (Fig. S3A). This relates to the fact that the baseline in the sin/cos-excess model showed in most seasons a bimodal course with two peaks of similar height separated by the turn of the year (Fig. S3B). The GAM sample-based model, however, has a lower peak before the turn of the year and a higher peak in January/February. As the influenza epidemic almost always starts after the turn of the year, the resulting excess of MAARI is therefore [...]

| DISCUSSION
We have developed a model that uses virological type and subtype as well as age-specific data from sentinel physician surveillance to "explain" MAARI attributable to influenza using a GAM regression model.

[Figure: iMAARI in % of the population plotted against iMAARI in % of population in the preceding season, with 95% CI]

[...] the estimated season iMAILI attack rate yielded average estimates in the range of 0.35%-2.16% for the age group 0-4 and 0.32%-2.96% in the age group 5-14. Only estimates from Italy (8.9% and 9.8%) were in a comparable range to ours, but seem to represent an outlier among countries that collect ILI data. Population-based studies have shown that the actual burden of influenza in children is underrecognized with common (ILI) surveillance methods. For example, between 2004/05 and 2008/09 Poehling et al. conducted a population-based study in three US counties and found that per season, between 10% and 25% of children aged 0-4 years sought outpatient medical care because of influenza. 15 Our estimates in these two age groups between the seasons 2002/03 and 2008/09 ranged from 2% to 17% (age group 0-4; median 11%) and from 3% to 13% (age group 5-14; median 8%), respectively, and are therefore comparable with this thoroughly conducted population-based research study. Similarly, in the three seasons following the pandemic, influenza-associated consultations by patients with ILI were estimated in a population-based surveillance project in 13 US health jurisdictions as 0.7%, 0.2% and 1.1%. 16 Given that only between 30% and 80% of all influenza cases manifest themselves as ILI 5-8 and that influenza seasons in the same year may differ considerably between countries, the estimated 2.6%, 1.0% and 8.9% in our study are of a comparable magnitude to the US data. We believe that our combination of surveillance (using ARI data) followed by modelling estimates the population impact of influenza more realistically than sentinel systems that use ILI data.
Moreover, the surveillance system used provides such estimates not just for a few counties for a limited number of seasons, as is the case for research studies, but for the entire country, the entire age range and every season.
Surveillance systems that collect ILI data may estimate iMAILI by multiplying the proportion positive for influenza among ILI patients in virological surveillance by the number of ILI patients in the population. 17 We could not use this approach because syndromic surveillance collects data on MAARI, whereas virological surveillance focusses on ILI patients. Certainly, the distribution of respiratory viruses among all MAARI patients is not the same as for MAILI patients. For example, the influenza positivity rate among ARI is considerably lower than for ILI. Hence, our model does not use the proportion positive for influenza but the total number of samples that tested positive for influenza, together with their type and subtype. We still have to assume that the proportion of iMAILI among iMAARI (or the iMAILI to iMAARI ratio) is constant during a given season and does not depend on the influenza type or subtype. The fact that the GAM-excess model and the GAM sample-based model estimated a quite similar excess in most seasons shows that the course of the laboratory-confirmed influenza is suitable as a proxy for the iMAARI attack rate, as the GAM-excess model does not imply any assumptions on the shape of this course.
In fact, we found an almost identical baseline in both models. In the seasons where we found differences in the iMAARI attack rate (07/08, 09/10, 10/11), these resulted from weeks at the beginning or end of the influenza season. Either the observed MAARI were below the baseline and hence the GAM sample-based model did not follow them (seasons 09/10 and 10/11) or there were additional spikes (season 07/08) that were not paralleled in the course of the laboratory-confirmed influenza (Fig. S4). As the number of samples with laboratory-confirmed influenza was quite low for these periods, these deviations in the observed MAARI rate were probably not directly connected to influenza.
The previous model 10 used sine/cosine curves to calculate periodicity of the baseline. However, recently the use of splines has proved to provide a more flexible tool to estimate irregular periodic curves. 18 Using 52 knots for the yearly oscillating baseline allowed us to construct a quite realistic pattern. A particular characteristic of the sin/cos-excess model was that we accounted for oscillations of the MAARI baseline with periods of 2 and more years. This allowed us to adapt quite well to the MAARI we observed before the period of influenza circulation. As described in the Results section, the GAM sample-based model does not have this property and we observe unusual MAARI activity in the autumn period of several seasons. On the other hand, in the sin/cos-excess model we implicitly assumed that the (sometimes) unusual autumn MAARI activity continues into the period of influenza circulation which is also not always plausible. In the end, to adequately account for unusual MAARI activity in autumn, data about the MAARI or MAILI activity due to other pathogens are needed. These are now being collected but comprehensively only since 2013/14. Until we are able to construct a stable baseline over several years, we will be using a more parsimonious model.
Because we have estimated the total number of iMAARI over a long time frame, we were also able to cumulate these over time. While A(H3) has been associated with both very strong and quite weak influenza seasons, it has overall led to more than half of all iMAARI in those 14 seasons. In contrast, A(H1)prepan has contributed the least, first of course because it ceased to circulate after the advent of A(H1)pdm09, and second because it was generally associated with a weaker seasonal impact when it did circulate (Figure 3; Table 1).
Even both A(H1) variants (pre-pandemic and pdm09) together led only to approximately one-quarter of all iMAARI between 2001/02 and 2014/15.
The age dependency of influenza can also be seen clearly in Figures 4 and 5. The two child age groups mostly have a rather high attack rate; it then drops and stays rather constant in "younger" adulthood (15-59 years) before it drops again in old age. Several other population-based studies or analyses of surveillance data have observed that medically attended respiratory illnesses on primary care level decline with age. 1,19-22 What this study adds is the additional information of the comprehensive burden of all influenza cases in primary care, be they mild or more severe, not only by age, but also by type/subtype, over a long time period. It is interesting that the "typical" pattern of influenza B shows a substantially higher relative iMAARI attack rate among school-age children (5-14 years) compared to that in age group 0-4 (Figure 5, right panel), although in absolute terms the attack rate among 5- to 14-year-old children is comparable to that caused by A(H3N2). This striking characteristic of influenza B concurs with data from two serological studies, one from the Netherlands and one from Germany, which investigated the seroprevalence of antibodies against influenza virus types and subtypes by year of age among children. 23,24 Both studies showed that seroprevalence of antibodies against influenza A viruses rises faster in early childhood compared to influenza B. In the German study, seroprevalence of antibodies against influenza A rises asymptotically with age, reaching a rate of approximately 90% by the age of 6-7 years. However, seroprevalence rises only linearly for influenza B, and at the age of 6-7 years, 70% of children are still lacking detectable IgG antibodies. 24 Another interesting outcome of our analysis is that the impact of seasons tends to oscillate with a period of almost two years; that is, a strong season is followed by a weaker season (Figure 2). The pandemic only interrupted this pattern for a couple of years.
Indeed, when we modelled the iMAARI attack rate as an (inverse) function of the magnitude of the preceding season, we found a pattern that seems to support this observation (Figure 6). However, this is largely driven by the dynamics of A(H3) and less so by influenza B. The reason for this pattern may be that the degree of population immunity after a heavy season is large enough to dampen the impact of next season's influenza virus, independently of its type or subtype.
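The pairing of each season with its predecessor and a Poisson regression on these pairs can be sketched as follows. This is a minimal sketch under assumptions: a log link is assumed (the text only says "Poisson regression"), the fit uses plain Newton-Raphson iterations rather than a statistics package, and the attack rates are hypothetical placeholders, not the study's estimates.

```python
import math

def season_pairs(rates):
    """Pair each season's attack rate (response) with that of the preceding
    season (explanatory variable); the first season drops out."""
    return [(rates[k - 1], rates[k]) for k in range(1, len(rates))]

def fit_poisson_loglinear(pairs, iterations=50, tol=1e-10):
    """Fit log(mu) = a + b * x by Newton-Raphson on the Poisson log-likelihood."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    a, b = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(iterations):
        mus = [math.exp(a + b * x) for x in xs]
        # Score vector (gradient of the log-likelihood)
        g0 = sum(y - m for y, m in zip(ys, mus))
        g1 = sum(x * (y - m) for x, y, m in zip(xs, ys, mus))
        # Entries of the Fisher information matrix
        h00 = sum(mus)
        h01 = sum(x * m for x, m in zip(xs, mus))
        h11 = sum(x * x * m for x, m in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        # Newton step: inverse information matrix times score
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

# Hypothetical season attack rates in % for 14 consecutive seasons
rates = [3.0, 6.5, 0.8, 7.9, 2.1, 5.0, 3.2, 8.0, 4.9, 2.6, 1.0, 8.9, 1.2, 6.0]
pairs = season_pairs(rates)
intercept, slope = fit_poisson_loglinear(pairs)
```

A negative `slope` would correspond to the inverse relationship discussed above, that is, a strong season tending to be followed by a weaker one.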
We acknowledge the following limitations. First, the weekly number of samples taken for virological surveillance is somewhat capped because physicians are requested to take no more than three to five specimens per week. This might slightly overemphasize the tail ends of the epidemic, and it requires a season-specific factor to calculate iMAARI from the number of confirmed influenza viruses. Second, as we are lacking a time series of data on RSV (and other respiratory pathogens) and as RSV seasons may overlap with influenza epidemics, 25 we might overestimate the burden of influenza to a certain extent. However, the above-mentioned study in four European countries found that in two of the four countries, RSV was not a significant term in the model (explaining MAILI), and in the two others, the model attributed only 11% and 13%, respectively, to RSV. 14 We therefore do not believe that we vastly overestimate the seasonal attack rate of iMAARI by neglecting RSV circulation.

| CONCLUSION
We present a GAM model that is capable of estimating the influenza-attributable MAARI attack rate on the basis of aggregated ARI as well as virological data stemming from a sentinel physician network. The model is capable of yielding type- and subtype- as well as age-specific estimates of the burden of influenza on primary care level. The estimated seasonal iMAARI attack rate is substantially higher than in other countries using ILI surveillance data and agrees better with detailed population-based research studies.
About half of the impact during these 14 seasons was caused by A(H3). The two child age groups (0-4, 5-14) regularly had the highest iMAARI attack rates, frequently reaching levels of up to 15%-20%. Influenza B has led to an exceptionally high impact among 5- to 14-year-old children, compared to all other age groups. The degree of influenza activity in 1 year seems to influence the degree in the next, a pattern largely driven by the activity of A(H3). The model is ready to integrate data from other pathogens that will become available in the near future.

ACKNOWLEDGEMENT
We would like to thank all sentinel physicians for their voluntary contributions to collect the syndromic and virological data needed to facilitate the analyses for this study. We wish to also thank Brunhilde Schweiger and other colleagues in the NRCI for analysing respiratory samples during the study period. We are also indebted to Silke Buda for her continuous and thoughtful input as well as Kerstin Prahm and Karla Köpke for their contributions relating to obtaining AGI data, as well as Michael Herzhoff for his assistance in managing the complex data obtained through the sentinel surveillance system.