Determining Risk Factors for Infection with Influenza A (H5N1)

Lukrafka et al. (1) warn against the dangers of overfitting a regression model when the number of outcomes is <10 per variable, “which could result in imprecise estimates or spurious associations.” This warning is valid, but it is equally important to consider the relative merits of multiple analysis options given the data available, the difficulties in collecting the data, and the objective of the study. The objective of our study (2) was to explore possible risk factors for human infection with influenza A (H5N1) rather than to test an explicit a priori hypothesis or to obtain precise estimates of risk. We were limited to a finite number of cases, and had we slavishly followed criteria to avoid overfitting, we would not have run a regression model at all because we could have included only 2 variables, for which a stratified analysis would have been preferable. The regression model was run to confirm that the variables identified in the bivariate analysis retained their importance in the context of other variables; it was not intended to confirm or refute an a priori hypothesis, to be a predictive model, or to obtain precise and adjusted measures of risk. Despite the sample size limitations, we felt that looking at independence in a multivariable analysis was still valuable. 
 
We explicitly acknowledge the limitations imposed by a small study size and were cautious in our interpretation, stating that the findings are the “basis for formulating new hypotheses.” The wide confidence intervals clearly indicate the low level of precision. The 3 variables in the final regression model were all statistically significant in bivariate analysis, and we do not believe they are spurious associations arising solely from an overfitted regression model.


Determining Risk Factors for Infection with Influenza A (H5N1)
To the Editor: Novel antigenic subtypes of influenza viruses have been introduced periodically into the human population, resulting in large-scale global outbreaks (1). Highly pathogenic avian influenza (H5N1) viruses reemerged in 2003. Since then, they have reached endemic levels among poultry in several Southeast Asian countries, and across Asia, they have caused nearly 300 human infections, with a high rate of mortality (1,2). The results of many studies, including one recently conducted by Dinh et al. (3), have been published in an effort to identify the source(s) and modes of transmission of influenza A (H5N1) to humans and to guide the control and prevention of influenza infection.
Although new data regarding influenza A (H5N1) are urgently required, scientific rigor must be maintained during research and analysis to prevent misidentification of exposures as risk factors for the disease and to prevent creation of iatrogenic panic among the exposed population and the scientific community (4). One point of scientific rigor that must be maintained is the use of adequate statistical analysis. The multivariate model in the study by Dinh et al. (3) was constructed by using a backward, stepwise variable selection strategy, in which variables with p<0.20 were included in the initial model. However, this strategy produced a first model and subsequent steps with far fewer than 10 outcomes per variable (the study included only 28 persons with avian flu). The consequence is model overfitting (i.e., a statistical model that is too complex for the amount of data), which could result in imprecise estimates or spurious associations (5).
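The arithmetic behind the rule of thumb of 10 outcomes per variable can be sketched briefly. The snippet below is an illustration only, not code from either group of authors; the function names and the count of 10 candidate variables are hypothetical, and only the figure of 28 case-patients comes from the letter.

```python
def max_variables(n_events: int, min_events_per_variable: int = 10) -> int:
    """Largest number of variables that keeps at least
    `min_events_per_variable` outcome events per variable."""
    if n_events < 0 or min_events_per_variable <= 0:
        raise ValueError("n_events must be >= 0 and the threshold must be > 0")
    return n_events // min_events_per_variable


def events_per_variable(n_events: int, n_variables: int) -> float:
    """Outcome events available per variable in a candidate model."""
    if n_variables <= 0:
        raise ValueError("n_variables must be > 0")
    return n_events / n_variables


n_cases = 28  # persons with avian flu, as stated in the letter

# Under the 10-per-variable rule, 28 outcomes support at most 2 variables.
print(max_variables(n_cases))

# Hypothetical: a starting model with 10 candidate variables (an assumed
# number, not taken from the study) would have fewer than 3 events each.
print(events_per_variable(n_cases, 10))
```

With 28 outcomes, the rule allows only 2 variables, so any larger starting model falls below the threshold.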
We believe that scientific methods must be meticulously applied when planning, executing, analyzing, and interpreting the results of influenza (H5N1) studies to prevent identification of false risk factors for acquiring infection.

Ilheus Virus Isolate from a Human, Ecuador
To the Editor: Ilheus virus (ILHV) (genus Flavivirus in the Ntaya antigenic complex) is most closely related to Rocio virus. However, antibodies produced during ILHV infection cross-react in serologic assays with other flavivirus antigens, and ILHV was originally classified in the Japanese encephalitis antigenic complex (1–3). ILHV is transmitted in an enzootic cycle between birds and mosquitoes. Since the first isolation of ILHV from a pool of Aedes spp. and Psorophora spp. mosquitoes collected in 1944 at Ilheus City, on the eastern coast of Brazil (4), isolates have been obtained in Central and South America and Trinidad, primarily from Psorophora ferox mosquitoes (5,6). ILHV is not associated with epidemic disease and has been only sporadically isolated from humans (5,7–9). The clinical spectrum of human infections documented by virus isolation ranges from asymptomatic to signs of central nervous system involvement suggestive of encephalitis. Most commonly, patients exhibit a mild febrile illness accompanied by headache, myalgia, arthralgia, and photophobia, symptoms that may result in clinical diagnosis of dengue, Saint Louis encephalitis, yellow fever, or influenza (7). Laboratory diagnosis of ILHV infection may be difficult, unless a virus isolate can be obtained, because of the cross-reactivity in serologic assays with other flaviviruses that circulate in the same area, such as Rocio, dengue, yellow fever, and Saint Louis encephalitis viruses.
On March 1, 2004, after 4 days of symptoms, a 20-year-old male soldier stationed in Lorocachi, Ecuador, was admitted to the Hospital de la IV División del Ejercito "Amazonas" in Puyo, Ecuador. Lorocachi is in the Amazonian province of Pastaza, of which Puyo is the capital. The patient

All material published in Emerging Infectious Diseases is in the public domain and may be used and reprinted without special permission; proper citation, however, is required.