Ten-year performance of Influenzanet: ILI time series, risks, vaccine effects, and care-seeking behaviour

Recent public health threats have propelled major innovations in infectious disease monitoring, culminating in the development of innovative syndromic surveillance methods. Influenzanet is an internet-based system that has monitored influenza-like illness (ILI) in cohorts of self-reporting volunteers in European countries since 2003. We investigate and confirm its coherence over the first ten years in comparison with ILI data from the European Influenza Surveillance Network, and demonstrate country-specific behaviour of participants with ILI regarding medical care seeking. Using regression analysis, we determine that chronic diseases, being a child, living with children, being female, smoking, and having pets at home are all independent predictors of ILI risk, whereas practicing sports and walking or bicycling for locomotion are associated with a small risk reduction. No effect was found for using public transportation or living alone. Furthermore, we determine the vaccine effectiveness for ILI for each season.


Introduction
Recent concerns with emerging infectious diseases have exposed deficiencies in disease surveillance systems and impelled radical rethinking on how to monitor population health and detect anomalies in real time (Butler, 2006). In this context, new approaches in syndromic surveillance, the collection and interpretation of data for public health before laboratory or clinical confirmation is available (Lazarus et al., 2001; Mandl et al., 2004), have emerged. Several systems are in evaluation, showing a large diversity of data sources and methodologies employed, such as telephone-based health information services (Cooper et al., 2008), automated medical records (Lazarus et al., 2001; van den Wijngaard et al., 2008), pharmacy sales and absenteeism (Chretien et al., 2008), queries to online search engines (Ginsberg et al., 2009), and telephone-based self-reporting in cohorts of randomly selected participants (Merk et al., 2013; Rehn et al., 2014). Syndromic surveillance is complementary to traditional public health surveillance in disease reporting (Henning, 2004; Lipsitch et al., 2009).
Influenzanet is a monitoring system for influenza-like illness (ILI) in voluntary cohorts of internet users. It was initially conceived to make scientific information accessible to a broad public and to kindle students' enthusiasm for science, and was launched in the Netherlands and Belgium (www.degrotegriepmeting.nl). Based on single-season analysis, previous studies established good correlations between ILI incidences as determined by Influenzanet and by the clinical surveillance by sentinel General Practitioners (GPs) as coordinated by the European Centre for Disease Prevention and Control (ECDC) (Friesema et al., 2009; Marquet et al., 2006; van Noort et al., 2007; Paolotti et al., 2014; Vandendijck et al., 2013). The absolute ILI incidence is, however, much more consistent across countries according to Influenzanet than as reported by the ECDC, due to country-specific medical care seeking rates and disparities in the ILI case definitions used by GPs in different countries (van Noort et al., 2007). This uniformity in rates reported across European countries facilitates the geographical analysis and modelling of epidemics (van Noort et al., 2012). In the United Kingdom (Eames et al., 2012a) and France, the vaccine effectiveness for ILI has been determined for a single season. By integrating serological data sources, ILI rates reported by Influenzanet have been converted to estimates of influenza attack rates (Patterson-Lomba et al., 2014).
Here we aim to further establish the Influenzanet system as a valid sentinel for ILI surveillance, by confirming that both the timing and relative intensities of epidemics are consistent with those reported by ECDC, and that the identified risk factors for ILI are consistent with those in the published literature. The analysis is based on data collected over the first 10 seasons (2003-2013) from the countries in which Influenzanet was implemented for at least 5 seasons: the Netherlands, Belgium, Portugal, and Italy. Time series analyses are applied to compare ILI incidences from Influenzanet and ECDC, whereas regression analysis is used to determine individual risk factors based on personal characteristics and vaccination status. Furthermore, based on the health care seeking behaviour as reported to Influenzanet, differences in ILI incidence between Influenzanet and ECDC are explained.

Data collection
Influenzanet participants are recruited from the general population by completing an intake questionnaire on one of the national websites, containing various demographic and life style questions. During the influenza season, participants receive a weekly newsletter by e-mail in which they are directed to an online questionnaire about a number of symptoms that they might have experienced since their last report. The Ethics Committee of Instituto Gulbenkian de Ciência approved the study.

Participants
An active population of participants is essential for the consistency of the system. An important cornerstone for success is the feedback of information to keep the participants involved and motivated. Although the specific recruitment strategies vary between countries (Bajardi et al., 2014), they tend to be based on mass communication. The websites contain a wealth of information on influenza, ILI and common cold, while the educational and scientific aims of the project are explained in direct mailings to schools, in repeated interviews on television and radio, and in newspapers. Schools are provided with educational material on influenza to promote incorporation of ideas of disease surveillance in science classes. At the beginning of each season, all participants from previous seasons are sent an email inviting them to participate again by completing an intake questionnaire for the new season. Based on a unique user id, participants can be tracked over multiple seasons.

Bias
Public health statistics such as asthma, diabetes and influenza vaccination rates in the Influenzanet participants have been shown to be similar for the Dutch (Marquet et al., 2006) and Belgian (Vandendijck et al., 2013) populations. Although younger and older age groups are underrepresented in Influenzanet (Cantarelli et al., 2014), these differences did not seem to have an impact on the observed ILI trends (van Noort et al., 2007). To minimize the selection bias in recruiting participants who already have ILI, any symptoms that started before or on the registration date are excluded from the analysis. Only participants who participate at least 3 times during a season are included in the analyses.

ILI incidence
ILI is defined as the acute onset (within a few hours) of fever (a measured temperature of at least 38 °C), together with muscle pain or headache, and cough or sore throat. The day of fever onset determines the day of ILI onset. Participants are considered active between their registration date and the date of their last completed symptoms questionnaire. ILI incidence is determined by dividing the number of ILI onsets per week by the number of active participants. If participants fit the ILI case definition in consecutive questionnaires, this is considered a single ILI episode.
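The case definition and the incidence ratio described above can be written down compactly. The following is a minimal sketch, not the project's own code: the `SymptomReport` fields and both function names are hypothetical, chosen only to mirror the wording of the definition.

```python
from dataclasses import dataclass


@dataclass
class SymptomReport:
    """One weekly symptoms questionnaire (field names are illustrative)."""
    sudden_onset: bool      # symptoms appeared within a few hours
    temperature_c: float    # measured temperature in degrees Celsius
    muscle_pain: bool
    headache: bool
    cough: bool
    sore_throat: bool


def is_ili(report: SymptomReport) -> bool:
    """Apply the ILI case definition: acute onset of fever (>= 38 C),
    with muscle pain or headache, and with cough or sore throat."""
    return (
        report.sudden_onset
        and report.temperature_c >= 38.0
        and (report.muscle_pain or report.headache)
        and (report.cough or report.sore_throat)
    )


def weekly_incidence(ili_onsets: int, active_participants: int) -> float:
    """ILI incidence for one week: onsets divided by active participants."""
    if active_participants <= 0:
        raise ValueError("need at least one active participant")
    return ili_onsets / active_participants
```

Merging consecutive questionnaires that both meet the definition into a single episode would be a separate step applied before counting onsets.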

European Influenza Surveillance Network
The clinical surveillance of influenza in the European Influenza Surveillance Network (EISN, formerly EISS), coordinated by the ECDC, is generally based on reports made by sentinel GPs. The ILI incidence for each country is determined by the number of patients who visit their (sentinel) GP and fit the (country-specific) ILI case definition (Aguilera et al., 2003), divided by the total number of people assigned to the participating GPs.

Crosscorrelation
For each country, the crosscorrelation between ILI incidence rates as reported by ECDC and Influenzanet is determined. Since both time series are autocorrelated and share a common seasonal trend, this direct crosscorrelation could give a misleading indication of their relationship (Bloom et al., 2007). Therefore, both time series are also prewhitened by fitting seasonal ARIMA models using the Box-Jenkins approach (Allard, 1998), where the model with the lowest Akaike information criterion is selected (Hyndman and Athanasopoulos, 2014). The detrended time series are obtained by filtering each time series by the selected model, and for each country the crosscorrelation between the detrended time series from Influenzanet and ECDC is determined.
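As an illustration of the final step, the sketch below computes a lagged correlation between two series that are assumed to have been prewhitened already; the seasonal ARIMA fitting itself is not shown. The function name and the sign convention for `lag` are assumptions. It uses a Pearson correlation on the overlapping segments, which is a simplification of the usual sample cross-correlation estimator (that one uses full-series means and variances).

```python
from math import sqrt


def cross_correlation(x, y, lag=0):
    """Pearson correlation between x[t] and y[t + lag].

    x and y are equally long sequences of residuals (already prewhitened).
    A positive lag pairs each x value with a later y value.
    """
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    if lag >= 0:
        a, b = x[:len(x) - lag], y[lag:]
    else:
        a, b = x[-lag:], y[:len(y) + lag]
    n = len(a)
    if n < 2:
        raise ValueError("not enough overlapping points for this lag")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_a * var_b)
```

Evaluating this function over a range of lags and locating the maximum gives the kind of lag comparison reported per country.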
Since the ILI incidence from Influenzanet is based on the reported day of onset and the ILI incidence from ECDC is determined by the week a patient visited their GP, it would be expected that the reported ILI incidence from Influenzanet precedes the ILI incidence as determined by ECDC. Since in Influenzanet not only the week of onset but the actual day is recorded, the weekly ILI incidence from Influenzanet can be shifted by single days, where a shift of zero days indicates that the ILI incidence from both systems is compared for the period Monday-Sunday.
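The day-level shift can be sketched as a re-binning of onset dates into weeks. The direction of the shift is an assumption here: each onset is assigned to the Monday-Sunday week that contains the onset date plus `shift_days`, so that a positive shift pairs an onset with the week in which a GP visit that many days later would fall. The function name is hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def weekly_onsets(onset_dates, shift_days=0):
    """Count ILI onsets per week, keyed by the Monday that starts the week.

    With shift_days=0 the weeks run Monday-Sunday. With a positive shift,
    an onset is counted in the week containing (onset + shift_days).
    """
    counts = Counter()
    for onset in onset_dates:
        shifted = onset + timedelta(days=shift_days)
        monday = shifted - timedelta(days=shifted.weekday())  # Monday == 0
        counts[monday] += 1
    return dict(counts)


# Three onsets in the same calendar week (Monday, Thursday, Sunday).
example = [date(2013, 1, 7), date(2013, 1, 10), date(2013, 1, 13)]
unshifted = weekly_onsets(example)
shifted_by_four = weekly_onsets(example, shift_days=4)
```

Dividing each weekly count by the number of active participants then yields the shifted incidence series used in the comparison.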

Medical care seeking behaviour
Each participant who reports (ILI) symptoms is asked some follow-up questions, such as whether the participant visited a medical doctor. This allows the determination of the percentage of participants with ILI who seek medical care. Since participants could seek medical care after they have reported their symptoms to Influenzanet, a reported visit within 15 days after a reported ILI onset is still considered. Since season 2011-2012, Influenzanet participants who reported having visited a medical doctor are also asked how many days elapsed between the onset of symptoms and the visit.
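A minimal sketch of the two quantities described above, the share of ILI episodes followed by a doctor visit and the median delay to that visit, is given below. The data layout (pairs of onset date and optional visit date) and the function names are hypothetical, and treating the 15-day window as inclusive is an assumption.

```python
from datetime import date
from statistics import median


def care_seeking_rate(episodes, window_days=15):
    """Fraction of ILI episodes with a doctor visit within the window.

    episodes: list of (onset_date, visit_date_or_None) pairs.
    """
    if not episodes:
        raise ValueError("no ILI episodes given")
    sought = sum(
        1
        for onset, visit in episodes
        if visit is not None and 0 <= (visit - onset).days <= window_days
    )
    return sought / len(episodes)


def median_delay_days(episodes, window_days=15):
    """Median number of days between onset and visit, among those who visited."""
    delays = [
        (visit - onset).days
        for onset, visit in episodes
        if visit is not None and 0 <= (visit - onset).days <= window_days
    ]
    if not delays:
        raise ValueError("no doctor visits within the window")
    return median(delays)


example = [
    (date(2012, 1, 2), date(2012, 1, 3)),   # visit after 1 day
    (date(2012, 1, 5), date(2012, 1, 9)),   # visit after 4 days
    (date(2012, 1, 8), None),               # no visit
    (date(2012, 1, 9), date(2012, 2, 20)),  # visit outside the window
]
```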

Risk factor analysis
We apply a logistic regression model to explore the association between several individual covariates and the occurrence of at least one ILI episode during a season. These covariates are selected beforehand, consisting mostly of characteristics which have been identified in other studies regarding influenza risk, and some extra characteristics which are not normally analyzed. Most covariates are considered equal across all seasons: age group (<15, 15-49, 50-64, 65+), household situation (alone, with children or with only adults), gender, chronic disease (asthma, diabetes, heart disease, and/or immuno-compromised), smoking, sports (at least 1 h per week), pets at home (dogs, cats, and/or birds), and primary mode of daily locomotion (bicycle/foot, car, or public transport). The covariate "risk group (others)" includes those participants who report to belong to a risk group, but are younger than 65 years (60 in the Netherlands since 2008) and did not report any of the chronic diseases. The effect of vaccination is considered as a season-dependent covariate. For the season 2009-2010, the vaccine status is based on the pandemic vaccine. Country of residence and season (indirectly) are two extra covariates.
Only ILI onsets during the weeks when influenza strains were circulating in the population are considered. These periods are defined for each season and country as the weeks when the number of influenza-confirmed samples as reported by ECDC was at least 15% of the maximum for that season (moving average over 3 weeks) (Supplementary Figure S13). All participants are considered independent between two different seasons, and participants who were not active for the complete influenza period are excluded. Since Influenzanet is a cohort study in which healthy individuals (without ILI) are recruited and the possible onset of ILI is monitored over a fixed period of time, Influenzanet can determine the risk ratios for all the covariates. For each covariate a univariate risk ratio is determined. Adjusted risk ratios are determined by a multivariate log-binomial regression model including all global and season-dependent covariates, fitted as a generalized linear model in R software (R Development Core Team, 2014). The variance inflation factor (VIF) for each covariate is determined to check for collinearity in the multivariate regression model.
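Two of the steps above lend themselves to a short sketch: selecting the influenza period and computing a univariate risk ratio. Several details are assumptions rather than the authors' stated choices: the 3-week moving average is taken as centred and truncated at the ends of the season, the 15% threshold is applied to the maximum of the smoothed series, and the confidence interval uses the common log-scale (Katz) approximation. Function names are hypothetical.

```python
from math import exp, log, sqrt


def influenza_period(weekly_positive, threshold=0.15):
    """Indices of weeks whose smoothed count of influenza-confirmed samples
    is at least `threshold` times the seasonal maximum."""
    n = len(weekly_positive)
    smoothed = []
    for i in range(n):
        window = weekly_positive[max(0, i - 1): i + 2]  # centred 3-week window
        smoothed.append(sum(window) / len(window))
    peak = max(smoothed, default=0)
    if peak <= 0:
        return []
    return [i for i, value in enumerate(smoothed) if value >= threshold * peak]


def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed, z=1.96):
    """Univariate risk ratio with an approximate 95% confidence interval."""
    if min(cases_exposed, cases_unexposed) <= 0:
        raise ValueError("both groups need at least one case")
    rr = (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)
    se = sqrt(
        1 / cases_exposed - 1 / n_exposed
        + 1 / cases_unexposed - 1 / n_unexposed
    )
    return rr, exp(log(rr) - z * se), exp(log(rr) + z * se)


weeks = influenza_period([0, 0, 3, 30, 60, 30, 3, 0, 0])
rr, lower, upper = risk_ratio(30, 100, 20, 100)
```

The adjusted risk ratios require fitting the full regression model and are not reproduced by this sketch.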

ILI incidence
The ILI incidence as determined by Influenzanet correlates well over multiple seasons with the ILI incidence as reported by ECDC (Fig. 1). However, Influenzanet measures ILI incidence in all countries on the same scale, while the incidences reported by ECDC are in general lower and vary in scale between countries.
The crosscorrelation between the raw ILI incidences from Influenzanet and ECDC is significant (Fig. 2A, C; Table S1). The detrended time series of Influenzanet and ECDC also show a significant level of crosscorrelation at a lag of zero weeks (Fig. 2B, D, F, and H), for the Netherlands (0.38), Belgium (0.53), and Italy (0.29).

Medical care seeking behaviour
The percentage of participants with ILI who sought medical care varies greatly by country (Fig. 3). Similar differences are observed in the number of days between the onset of symptoms and visiting the doctor (Fig. 4). The observed patterns do not change if only working adult participants are considered in the analysis. The crosscorrelation between the detrended time series from Influenzanet and ECDC is maximal when a shift of 4 days is applied in the Netherlands, 1-2 days in Belgium, 1 day in Portugal, and no shift in Italy (Table 1). This corresponds well with the median delay between ILI onset and seeking medical care as reported during the seasons 2011-2013: 4 days in the Netherlands, 2 days in Belgium, 1 day in Portugal, and 1 day in Italy.

Participation
The Netherlands has the most participants (on average 19,491 per season, of which 16,481 completed at least 3 symptoms questionnaires), followed by Belgium (6001; 5072), Portugal (2871; 1894), and Italy (1882; 1219) (Fig. 5). This corresponds to 0.1% of the population in the Netherlands and Belgium (Flanders, since basically all participants are from Flanders), 0.02% in Portugal, and 0.003% in Italy. Of all participants who completed at least 3 symptoms questionnaires during a season, 76 ± 8% participated again in the following season in the Netherlands, 74 ± 12% in Belgium, 69 ± 12% in Portugal, and 70 ± 4% in Italy.

Risk factors
The univariate risk ratios are listed in the Supplementary material (Table S2), whereas the adjusted risk ratios from the multivariate regression are listed in Table 2. The variance inflation factor (VIF) for the covariates varies between 1.0 and 2.7 (Supplementary Table S2), reassuring that model specification is not compromised by undesirable collinearities (O'Brien, 2007). McFadden's pseudo R-squared for the final model fit (R² = 0.035) is relatively low, indicating that only a small part of the variation in ILI infections is explained by the presented covariates. The primary risk factor for acquiring an infection is having contact with an infectious person and this is absent from these analyses.
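For reference, the two diagnostics quoted above have standard textbook definitions, given here in general form. The symbols are introduced for illustration only, and the source does not state which R routines produced the reported values.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
R^2_{\mathrm{McFadden}} = 1 - \frac{\ln \hat{L}_{\mathrm{model}}}{\ln \hat{L}_{\mathrm{null}}}
```

Here $R_j^2$ is the coefficient of determination obtained when covariate $j$ is regressed on all other covariates, $\hat{L}_{\mathrm{model}}$ is the maximized likelihood of the fitted model, and $\hat{L}_{\mathrm{null}}$ is that of a model with an intercept only. A VIF of 1 means a covariate is uncorrelated with the others, and a pseudo R-squared close to zero means the fitted model improves little on the intercept-only model.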
According to the adjusted risk ratios, having a chronic disease (asthma, diabetes, heart disease and/or immune-compromising condition), living with children, being female, belonging to a younger age group, pets at home (cats and/or dogs), and being a smoker, were all independent predictors of the risk of having at least one ILI episode during a flu season (Table 2). A small risk reduction was observed in participants who primarily use bicycle or foot for locomotion (compared to a car) and participants who practice more than 1 h of sports per week. No significant effect was observed for participants who live with other adults (compared to living alone), participants who have birds at home, and participants who use public transportation (compared to using a car).
The vaccine effectiveness for influenza-like illness varies from season to season. A significant reduction in ILI due to vaccination was observed in the seasons 2007-2008, 2008-2009, 2010-2011, and 2012-2013, while no significant effect was observed in other seasons. The vaccine effectiveness for season 2009-2010 is possibly underestimated, since the vaccine only became available when the ILI activity was already epidemic.

Discussion
Based on single-season analysis, previous studies established excellent correlations between ILI incidences as determined by Influenzanet and by ECDC (Friesema et al., 2009; Marquet et al., 2006; van Noort et al., 2007). A question remained on whether this consistency would persist for multiple-season data streams. We showed that during 10 seasons in the Netherlands and Belgium (2003-2013), 8 seasons in Portugal (2005-2013), and 5 seasons in Italy (2008-2013), the ILI trends from Influenzanet and ECDC are consistent in both timing and relative magnitude, with a significant crosscorrelation between both time series at a lag of zero weeks. The signal from Influenzanet precedes ECDC by a few days, corresponding approximately to the median number of days between ILI onset and seeking medical care. However, this does not necessarily indicate that in real-time monitoring Influenzanet would detect ILI trends earlier, since this depends on when the data become available and on the statistical uncertainties in the data (van Noort, 2014).
Although both time series are correlated over the full 10-year period, there are localized discrepancies between the data streams, which could be attributed to the different methodology and composition of the cohorts in both systems. As an example, young children are largely underrepresented in Influenzanet (van Noort et al., 2007), whereas young children visit a medical doctor relatively more often. This could explain why for the season 2007-2008 in Portugal, dominated by the influenza B strain affecting mostly children, a small epidemic was reported by ECDC which went mostly undetected by Influenzanet. Another local discrepancy is the relatively high ILI incidence as reported by ECDC in the Netherlands during the months preceding the 2009 ILI pandemic, which might be attributed to an increase in awareness by medical doctors and patients due to a global concern about the new H1N1 influenza strain (Keramarou et al., 2011).
The presence of multiple independent sources encourages the development of integrative methods that explore the specific strengths of each system (Reis et al., 2007). Having multiple independent systems could uncover aspects of influenza transmission that would go unnoticed if only one data stream was available. Another cross-country data source for ILI incidences is Google Flu Trends (Ginsberg et al., 2009), which determines ILI incidence based on the frequency of ILI-related search terms. However, Google Flu Trends is not a strictly independent data source, since its algorithms rely on the ECDC data streams for calibration.
Fig. 3. Influenzanet participants with ILI who visited a medical doctor, specified by country (2011-2013). ILI is defined as the acute onset (within a few hours) of fever (a measured temperature of at least 38 °C), together with muscle pain or headache, and cough or sore throat.
Patterns in medical care seeking behaviour suggest a cultural difference between northern and southern Europe. In southern Europe (France, Italy, Portugal, and Spain) participants generally visit a medical doctor within 1-2 days after the onset of symptoms, whereas in northern Europe (Sweden, United Kingdom, and the Netherlands) participants generally seek medical care only 5-7 days after the onset of symptoms. Belgium (Flanders) seems an exception to the suggested pattern, most likely because according to Belgian law, an employer can require from their employee a medical statement within 24 h to justify work absenteeism. A similar pattern is observed in the percentage of participants with ILI who seek medical care, which is lower in northern Europe (except Belgium) than in southern Europe. The two patterns could be associated by considering that in countries where participants wait longer before seeking medical care, many participants would no longer feel sufficiently ill to warrant a visit to a medical doctor.
This variation in medical care seeking rates across countries is one of the reasons why ILI incidences reported by ECDC cannot be compared directly (van Noort et al., 2007). Variations in medical care seeking could also affect the determined ILI incidence by ECDC within a country, if certain subgroups of the population visit a doctor at different rates. Influenzanet does not only serve as an independent source for ILI activity, but could also be used to calibrate ILI data as collected by GP sentinel systems (van Noort, 2014).
A crucial element in the success of Influenzanet is having a sufficiently large cohort of participants. In the Netherlands (on average 16,481 active participants), Belgium (5072), Portugal (1894), and Italy (1219) the Influenzanet cohort was large enough to detect similar ILI epidemics as ECDC in all seasons, with the exception of season 2007-2008 in Portugal. Larger cohorts would lead to lower statistical noise such that epidemics could be detected earlier and even small ILI epidemics could be distinguished from baseline ILI activity. Furthermore, in larger cohorts, different subgroups, for example based on age or vaccine status, could be monitored separately. Of all active participants during a certain season, 73 ± 11% participated again in the following season. Although this shows impressive loyalty of participants, each season an effort should be made to recruit new participants to at least replace those who have left.
Risk factors estimated from the Influenzanet cohort are consistent with the influenza literature. Higher risk of ILI in children and in those living with children was observed, consistent with observational studies (Cauchemez et al., 2009; Monto and Ross, 1977; Monto, 2004; Viboud et al., 2004). The increased ILI risk in women compared to men, which may be due to more intensive contact between women and children, has also been previously recognized (Monto and Ross, 1977). We found a significantly reduced risk of ILI among participants over 65. This is not due to the higher vaccine uptake in seniors, since vaccine status is already included as a separate covariate in this multivariate analysis. Seniors are generally considered a risk group for influenza, not because of a higher probability of infection, but due to their greater risk of complications (Monto, 2004). Having a chronic disease, such as asthma, diabetes, heart disease or an immune-compromising condition, was a strong predictor of ILI in the Influenzanet cohort. People with these chronic diseases are generally advised to take an influenza vaccine. Increased risk of influenza has been observed in children with asthma in clinical cohort studies (Gordon et al., 2009), while diabetes is known to be strongly associated with complications due to influenza infections (Irwin et al., 2001). An increased risk of ILI was observed among the Influenzanet participants who smoke, as has been confirmed by other studies (Arcavi and Benowitz, 2004).
The Influenzanet system is flexible to the extent that questions of interest can easily be added or removed, allowing for the estimation of risk factors which are not usually considered. In this study, we found a small but significant protective effect of walking or bicycling as a primary means of locomotion in comparison with travelling by car, while no significant risk of travelling by public transportation was observed, nor in participants who live with other adults in comparison with adults who live alone. A small increase in risk was observed in participants who have pets at home. Practicing sports for at least one hour per week was associated with a small but significant decrease in ILI risk.
Not only can extra questions be included in the intake questionnaire, but entirely new questionnaires can be added in particular seasons, enabling further studies. A stress-related questionnaire released in the Netherlands in season 2004-2005 revealed significant trends between stress/personality and ILI self-reporting (Smolderen et al., 2007), and a simple questionnaire related to contact behaviour showed that changes in contact patterns could explain changes in disease incidence (Eames et al., 2012b).
In 4 out of 10 seasons Influenzanet estimated a significant reduction in ILI due to vaccination, whereas in the other seasons no significant effect was observed. The direct effectiveness of vaccination varied between 33% (22-42%) in season 2010-2011 and −10% (−28 to 6%) in season 2004-2005. A relatively low vaccine effectiveness against ILI is to be expected, since vaccination targets specifically the influenza virus, and not other influenza-like illnesses. A double-blind, randomized, placebo-controlled trial measured within the same cohort a vaccine efficacy for serologically confirmed influenza of respectively 50% (1997/98) and 86% (1998/99), but a vaccine effectiveness for ILI of −10% (1997-1998) and 33% (1998-1999) (Bridges et al., 2000). According to a large meta-study based on 48 reports on vaccine effectiveness in healthy adults, inactivated parenteral vaccines were 30% effective against ILI, and 80% efficacious against influenza when the vaccine matched the circulating strain and circulation was high, but this decreased to an effectiveness against ILI of 12% and efficacy against influenza of 50% when it did not (Demicheli et al., 2009).
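Effectiveness figures of this kind are conventionally obtained from a risk ratio as one minus the ratio of ILI risk in vaccinated to unvaccinated participants. The sketch below shows that conversion under the assumption that this convention applies here; the function name is hypothetical, and the example risk ratio is simply the value implied by inverting the reported 33% (22-42%), not a number taken from the source tables.

```python
def vaccine_effectiveness(rr, rr_lower, rr_upper):
    """Convert a risk ratio and its confidence limits to effectiveness.

    Effectiveness is 1 - RR. The interval limits swap: the upper limit of
    the risk ratio gives the lower limit of the effectiveness.
    """
    if not (rr_lower <= rr <= rr_upper):
        raise ValueError("expected rr_lower <= rr <= rr_upper")
    return 1 - rr, 1 - rr_upper, 1 - rr_lower


# A risk ratio of 0.67 (0.58-0.78) corresponds to 33% (22-42%) effectiveness.
ve, ve_lower, ve_upper = vaccine_effectiveness(0.67, 0.58, 0.78)

# A risk ratio above 1 yields a negative effectiveness estimate.
ve_neg, ve_neg_lower, ve_neg_upper = vaccine_effectiveness(1.10, 0.94, 1.28)
```

An interval that includes zero, as in the second example, corresponds to a non-significant effect.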
For two seasons (2003-2004 and 2004-2005) Influenzanet estimated a negative although non-significant vaccine effectiveness. Both seasons were characterized by a poor vaccine match (Belongia et al., 2009; Jin et al., 2005). A negative vaccine effect can be due to original antigenic sin, the tendency for antibodies produced in response to exposure to influenza vaccine antigens to suppress the maturation of antibodies with high affinity to the actual virus (Gupta et al., 2006).
In an observational study of vaccine effectiveness, any preexisting bias between vaccinated and unvaccinated participants could affect the estimates. The univariate analysis (Supplementary Table S2) indicates a 10% higher ILI reduction in vaccinated participants than the multivariate study. However, with participants over 65 years of age having a lower ILI rate and a relatively high vaccination rate, the multivariate model estimates that a part of the reduction in ILI in vaccinated participants is due to their age. Although the multivariate logistic regression aims to correct for these biases, it is possible that other biases not represented by any of the risk factors listed in Table 2 exist. A cohort study of 72,527 seniors over 65 years of age followed during an 8-year period found that vaccinated seniors already had a reduced risk of death and pneumonia hospitalization in the periods before the influenza season, and that the risk reduction actually decreased during the influenza season (Jackson et al., 2006). Such a preferential receipt of vaccine by relatively healthy seniors could lead to overestimation of the vaccine effectiveness in observational studies. It is plausible that most elderly Influenzanet participants are relatively healthy and that this selection bias is less present in Influenzanet, leading to relatively lower estimates of vaccine effectiveness than in the average literature. Because of global recommendations for influenza vaccination, placebo-controlled trials, which could clarify the effects of influenza vaccines in individuals, are no longer considered possible on ethical grounds (Jefferson et al., 2010).
Since the Influenzanet participants are not a random sample of the overall population, care should be taken in extrapolating the estimated risks to the overall population in the respective countries. However, the observed consistency in risk factors for ILI between Influenzanet and those reported by studies in community settings further establishes that the Influenzanet population is a valuable sentinel for ILI surveillance in the population, in addition to the merits of engaging the participants in public health research and promoting risk awareness.
The system presented here stands on a concept for syndromic surveillance that depends on intense activity in science communication, public awareness and sufficient levels of Internet penetration. It has reported ILI activity in a consistent way for over 10 seasons in multiple countries. Influenzanet reports ILI trends consistent with GP sentinel surveillance (ECDC), and can complement these systems by providing valuable information about medical care seeking behaviour. Based on reported symptoms, Influenzanet can be extended to detect diseases other than influenza, including those in developing settings. Influenzanet as an Internet monitoring system based on voluntary participants might therefore develop into an important weapon to fight influenza as well as other contagious diseases globally.