Outbreaks of highly pathogenic avian influenza (HPAI) H5N1 virus in Africa were first reported in Nigeria in February 2006. By mid-February of the same year, HPAI H5N1 virus had emerged in poultry at many locations in Egypt [22]. Since then, the virus has spread rapidly across many geographical regions of the country, infecting many species of poultry, and it was declared endemic by 2008 [16]. Human infections started shortly after the virus emerged, reaching 115 cases with 38 deaths by the end of 2010 [21]. Although transmission to humans is still limited, the continuous close contact between humans, especially children, and birds in this country raises concerns about possible adaptation of the virus to humans and the consequent pandemic threat. Further, different susceptible hosts of avian influenza viruses are commonly reared together in a single area with direct human contact, which poses a high risk of inter-species circulation, thereby enhancing viral persistence and potentially generating new variants. Meanwhile, sequence data from Egypt derived from species other than human and chicken are lacking, which interferes with accurate analysis and epidemiological characterization.

According to the timeline of major events of HPAI H5N1 virus infection reported by the WHO [22], Egypt seems to have a seasonal infection pattern characterized by a high peak of outbreaks from November through February, followed by a decline as the weather warms. Human infections are mostly linked to the peaks of the cold-season avian outbreaks. This pattern was detected repeatedly in the following years, up to 2009 and 2010, and such long-term endemicity poses challenges for eradication. In this report, we analyzed samples collected between the 2006 peak and the beginning of its decline in 2007. Sequencing the H5N1 virus directly from the original tissue, without prior propagation, revealed molecular differences in seven viral segments that were unique to each host species and distinct from those reported for viruses of other seasons.

In this article, we directly sequenced and analyzed the complete genomes of H5N1 virus from the tissues of two host species, chicken and duck. Tissue samples were collected separately from individual dead birds, 10 birds in total for each species. The samples were collected in January and March 2007 from house-reared ducks (Damanhour, El Behiera governorate; n = 30) and a large-scale breeder chicken farm (Alexandria governorate; n = 10,000), respectively. All of the ducks and chickens showed severe clinical signs of classical HPAI H5N1 [2, 3] with high mortality rates. To minimize any variation due to laboratory passage in either embryonating chicken eggs (ECEs) or the MDCK cell line, total RNA was extracted directly from the original clinical materials of the chicken and duck samples using TRIZOL reagent (Invitrogen, Japan) according to the manufacturer’s instructions. RT-PCR amplification of the entire genome was performed using sets of specific primers [11]. The PCR products were separated on 1.5% agarose gels, and the fragments of interest were isolated from the gel using a QIAquick gel extraction kit (Qiagen, Japan). The purified full-length DNA fragments were cloned using the Mighty TA cloning kit (Takara, Japan) according to the manufacturer’s instructions. For each segment, ten colony-purified plasmids were sequenced by capillary electrophoresis on an Applied Biosystems Genetic Analyzer 3130 (Applied Biosystems, USA). Sequence data were assembled using GENETYX (Software Development Co, Ltd, Tokyo, Japan) and BioEdit [10]. The GenBank/EMBL/DDBJ accession numbers for the sequences reported in this article are AB465592-AB465595, AB465620-AB465629, AB468063, and AB468064.

Sequence analysis showed >99% identity among the sequences derived from each species, so representative sequences of the chicken- and duck-derived viruses, CL6/07 and D2br10/07, respectively, were selected for further analysis. Analysis of the HA genes showed a high percent identity (>98%) with other sequences from Egypt, Nigeria, and the Middle East. Several nucleotide changes (Table 1) were detected between CL6/07 and D2br10/07 as well as relative to the reference sequences from Egypt available in GenBank. These changes were reflected in the amino acid sequences, with three substitutions in CL6/07 and two in D2br10/07. Further, the two amino acid substitutions in D2br10/07, Asn154 in HA1 and Ser207 in HA2 (H5 numbering), differed from those in CL6/07 and the 2006 reference sequences but were the same as in A/Bar-headed-Goose/Qinghai/65/05 (GenBank accession no. DQ095622). For 2007 and 2008, the CL6/07- and D2br10/07-specific amino acids were detected in sequences available in GenBank, but they could not be correlated with host species because of the low number of duck reference sequences; the exception was Lys140 (H5 numbering) in CL6/07, which was not detected in any of the available reference sequences. For 2009–2010, human sequences from Egypt in GenBank were more abundant than chicken and duck sequences; the 2009 sequences were closer to CL6/07, whereas the 2010 sequences were equally close to CL6/07 and D2br10/07 (Table 1). The sequence of multiple basic amino acids at the cleavage site that is characteristic of highly pathogenic viruses, GERRRKKRRG, was detected in the HA of both CL6/07 and D2br10/07. Further, both viruses maintained the avian α2-3-NeuAcGal receptor-binding preference, carrying Gln226 and Gly228 (H3 numbering) [9, 15]. In addition, the Lys216 in the HA receptor-binding site, found in all clade 2.2 viruses [1], was also detected in our viruses.
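
The identity and substitution comparisons described above can be illustrated with a minimal Python sketch. It assumes the sequences have already been aligned to equal length; the function names and the two toy peptides are hypothetical and are not taken from the deposited sequences. Only the cleavage-site motif string is quoted from the text.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def substitutions(ref: str, query: str) -> list[str]:
    """List differences as 'RefPosQuery' with 1-based positions, e.g. 'A19T'."""
    return [f"{r}{i}{q}"
            for i, (r, q) in enumerate(zip(ref, query), start=1)
            if r != q]


# Polybasic cleavage-site motif as quoted in the text.
CLEAVAGE_MOTIF = "GERRRKKRRG"

# Toy peptides for illustration only (assumption: not real HA sequences).
ref_pep = "MEKIVGERRRKKRRGLFGAI"
qry_pep = "MEKIVGERRRKKRRGLFGTI"

print(percent_identity(ref_pep, qry_pep))   # 95.0
print(substitutions(ref_pep, qry_pep))      # ['A19T']
print(CLEAVAGE_MOTIF in qry_pep)            # True
```

Positions reported by such a script follow the numbering of the input alignment, so they would need to be mapped to H5 or H3 numbering before being compared with the residues named in the text.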

Table 1 CL6/07- and D2br10/07-specific amino acid changes in the glycoproteins and the internal genes

The alignment of the NA sequences revealed further differences between CL6/07 and D2br10/07 (Table 1): CL6/07 had one amino acid substitution whereas D2br10/07 had two, even though the former had nine nucleotide mutations versus four in the latter. Unlike those in the HA, these amino acid changes were not detected in any of the sequences from Egypt available in GenBank for 2006 to 2010. Moreover, the 20-amino-acid deletion in the stalk of the NA protein that is dominant in genotype Z [14], spanning residues 49 to 68 and resulting in the loss of an N-linked glycosylation site upstream of the deletion, was detected in both CL6/07 and D2br10/07.
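
The observation that more nucleotide mutations can coincide with fewer amino acid substitutions follows from the degeneracy of the genetic code: many nucleotide changes are synonymous. The sketch below, which assumes in-frame, gap-free coding sequences of equal length and uses the standard genetic code, counts both kinds of change. The function name and the toy sequences are hypothetical and do not come from the NA data.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def compare_coding(ref: str, var: str):
    """Return (number of nucleotide differences, list of amino acid changes)."""
    if len(ref) != len(var) or len(ref) % 3:
        raise ValueError("expected in-frame sequences of equal length")
    nt_changes = sum(x != y for x, y in zip(ref, var))
    aa_changes = []
    for pos in range(0, len(ref), 3):
        r = CODON_TABLE[ref[pos:pos + 3]]
        v = CODON_TABLE[var[pos:pos + 3]]
        if r != v:  # nonsynonymous codon change
            aa_changes.append(f"{r}{pos // 3 + 1}{v}")
    return nt_changes, aa_changes


# Toy example (assumption: made-up sequences). Three nucleotide changes,
# of which only one alters the encoded amino acid.
toy_ref = "ATGGCTAAAGGT"   # M A K G
toy_var = "ATGGCCAGAGGA"   # M A R G
print(compare_coding(toy_ref, toy_var))   # (3, ['K3R'])
```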

Phylogenetic analysis was performed with MEGA4 software [19] using the neighbor-joining method on the full-length nucleotide sequences of all genome segments. The reliability of the phylogenies was estimated with 1000 bootstrap replicates. Phylogenetic analysis of the HA (Fig. 1) and NA (data not shown) genes showed that both isolates belong to clade 2.2 [20], together with other Egyptian isolates as well as isolates from Nigeria and the Middle East. Although CL6/07 and D2br10/07 sub-clustered far from each other, both were close to the Egyptian sequences from 2006 and 2007, indicating that they originated from endemic viruses circulating in the same year, with the duck viruses closer to their ancestors and sub-clustering independently. Further, CL6/07 sub-clustered with A/chicken/Egypt/C3Br11/2007 (GenBank accession no. AB551132), which was sequenced directly from clinical materials without prior amplification. The 2009- and 2010-derived sequences were distant from our 2007 sequences, indicating that the virus is undergoing continuous genetic evolution in the country.
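
For readers unfamiliar with the procedure, the following is a minimal pure-Python sketch of neighbor-joining with bootstrap support. It is not the MEGA4 implementation: it assumes simple p-distances (the distance model used in the analysis is not stated in the text), a gap-free alignment, and toy sequences invented for illustration. The function names are hypothetical. The tree is summarized as its set of non-trivial splits, and support is the fraction of column-resampled replicates in which each split reappears.

```python
import random
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nj_splits(seqs):
    """Neighbor-joining on p-distances; returns non-trivial splits of the unrooted tree."""
    names = sorted(seqs)
    all_leaves, ref = frozenset(names), names[0]
    leaves = {n: frozenset([n]) for n in names}      # node -> leaves below it
    dist = {frozenset(p): p_distance(seqs[p[0]], seqs[p[1]])
            for p in combinations(names, 2)}

    def d(i, j):
        return dist[frozenset((i, j))]

    nodes, splits = list(names), set()
    while len(nodes) > 3:
        n = len(nodes)
        r = {i: sum(d(i, k) for k in nodes if k != i) for i in nodes}
        # Join the pair minimizing the neighbor-joining Q criterion.
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d(*p) - r[p[0]] - r[p[1]])
        u = f"({i},{j})"
        leaves[u] = leaves[i] | leaves[j]
        for k in nodes:
            if k not in (i, j):
                dist[frozenset((u, k))] = (d(i, k) + d(j, k) - d(i, j)) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]
        # Store each split as the side that excludes the reference taxon.
        side = leaves[u]
        splits.add(all_leaves - side if ref in side else side)
    return splits


def bootstrap_support(seqs, replicates=1000, seed=0):
    """Fraction of column-resampled replicates recovering each split of the full tree."""
    rng = random.Random(seed)
    length = len(next(iter(seqs.values())))
    reference = nj_splits(seqs)
    counts = dict.fromkeys(reference, 0)
    for _ in range(replicates):
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {n: "".join(s[c] for c in cols) for n, s in seqs.items()}
        found = nj_splits(resampled)
        for s in reference:
            counts[s] += s in found
    return {s: c / replicates for s, c in counts.items()}


# Toy alignment (assumption: invented sequences, two obvious pairs).
toy = {"A": "AAAAAAAAAA", "B": "AAAAAAAAAT", "C": "TTTTTAAAAA", "D": "TTTTTAAAAC"}
for split, support in bootstrap_support(toy, replicates=200).items():
    print(sorted(split), support)
```

Dedicated phylogenetics software additionally estimates branch lengths, supports substitution models, and handles gaps, all of which this sketch omits.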

Fig. 1

Phylogenetic tree of the hemagglutinin (HA) segments of CL6/07 and D2br10/07 and other Egyptian reference viruses from GenBank. The neighbor-joining tree, based on the full-length nucleotide sequences, was generated with MEGA4 with 1000 bootstrap replicates. Bootstrap values over 50% are shown at the tree nodes. The chicken and duck sequences are indicated in bold and underlined. The arrowhead points to A/chicken/Egypt/C3Br11/2007, which represents directly sequenced viral RNA without prior amplification. The tree is rooted to A/chicken/Egypt/R1/2006, which was isolated in December 2006. The scale bar represents the distance unit between sequence pairs

Pairwise sequence comparison further revealed several nucleotide changes in the NP, M, and NS genes (Table 1). The amino acid substitutions in the transmembrane region of the M2 protein that are known to be key determinants of drug resistance [18] were not detected in CL6/07 or D2br10/07. Consistently, there were no amino acid changes associated with amantadine or rimantadine resistance, and all the amino acids were avian-specific except for the Val33Ile human signature in the NP of both viruses [5, 6, 7]. In addition, both isolates contained the Glu92 mutation in the NS1 protein, which is a major contributor to the virulence of H5N1 viruses [12]. Furthermore, phylogenetic analysis confirmed the separate sub-clustering of CL6/07 and D2br10/07, except for the NS gene (data not shown).

Further sequence comparison of the genes encoding the polymerase complex (PB2, PB1, and PA) revealed a number of differences between CL6/07 and D2br10/07, which also differed from the GenBank Egyptian sequences (Table 1). In contrast to CL6/07, the D2br10/07-derived sequences shared several nucleotides and the Leu82Ser substitution in PB1-F2 with the two human isolates from Egypt (Table 1). The Glu627Lys mutation in PB2, which is characteristic of human viruses and is associated with increased pathogenicity and host range [5, 17], was detected in both viruses. Moreover, phylogenetic analysis of the polymerase genes also showed separate sub-clustering (data not shown). In addition, the polymerase complex appeared to be very distinct from the two Nigerian lineages SO and BA [8], but more closely related to the Middle East isolates (data not shown).
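
Screening a protein sequence for marker residues such as those discussed above (Lys627 in PB2, Ile33 in NP, Glu92 in NS1) amounts to a position lookup. The sketch below assumes ungapped protein sequences whose numbering matches the positions cited in the text; the function name, the marker dictionary layout, and the placeholder sequence are hypothetical.

```python
# Marker residues named in the text, as {protein: {1-based position: residue}}.
MARKERS = {
    "PB2": {627: "K"},   # Glu627Lys
    "NP": {33: "I"},     # Val33Ile
    "NS1": {92: "E"},    # Glu92
}


def check_markers(protein_seq: str, markers: dict) -> dict:
    """Return {position: (observed residue or None, matches expected)}."""
    report = {}
    for pos, expected in markers.items():
        observed = protein_seq[pos - 1] if pos <= len(protein_seq) else None
        report[pos] = (observed, observed == expected)
    return report


# Placeholder sequence, NOT a real PB2: filler residues with Lys at position 627.
toy_pb2 = "M" + "A" * 625 + "K"
print(check_markers(toy_pb2, MARKERS["PB2"]))   # {627: ('K', True)}
```

A sequence shorter than the queried position is reported as `None` rather than raising an error, which makes partial sequences easy to spot.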

In our analysis, comparative genetic characterization of the eight RNA segments of CL6/07 and D2br10/07 together with the 2006–2010 Egyptian reference sequences from GenBank showed that both viruses had nucleotide and amino acid differences that appeared to be specific to each, except in the M gene, which was highly conserved. The NA gene appeared to be uniquely maintained, in that the CL6/07- and D2br10/07-specific mutations were not detected in the Egyptian reference sequences from 2006 to 2010. Moreover, 2007 was linked to more human infections (25 cases) than 2006 (18 cases) or 2008 (8 cases) [21], suggesting a possible host-dependent molecular adaptation and/or evolution of H5N1 in 2007, for which the host has not yet been identified. An increase in human infections was detected in 2009 (39 cases), even though no reassortment or new virus entry has yet been reported for Egypt; this increase could be a consequence of the endemicity of the virus in the country, declared in 2008. In contrast, only 25 cases were reported for 2010 [21], underscoring the need to identify the possible transfer host. Even though CL6/07 and D2br10/07 were derived from different cities, they clustered with other Egyptian sequences, indicating a single origin with possibly divergent molecular evolution. This further confirms that, in contrast to Nigeria and as reported by Cattoli et al. [4], Egypt seems to have had a single entry of the virus, which appears to have happened in early 2006 or late 2005.
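
As a quick consistency check, the yearly human case counts quoted above can be tallied against the cumulative figures given for the end of 2010 (115 cases, 38 deaths [21]). The short sketch below uses only numbers from the text; the fatality proportion it prints is a derived value that the text itself does not state.

```python
# Yearly human cases as quoted in the text [21].
cases_by_year = {2006: 18, 2007: 25, 2008: 8, 2009: 39, 2010: 25}
reported_total = 115   # cumulative cases by end of 2010, from the text
reported_deaths = 38   # cumulative deaths by end of 2010, from the text

total = sum(cases_by_year.values())
print(total, total == reported_total)                   # 115 True
print(round(100 * reported_deaths / reported_total, 1)) # 33.0 (derived)
```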

The maintenance of two different viruses in two different species may increase the threat to humans, especially where direct contact with different avian species is common. Notably, D2br10/07 carried more amino acid mutations and shared several nucleotides in the polymerase genes with the human isolate Human/Egypt/902782/06, as well as the human signature Leu82Ser in PB1-F2 with the isolate Human/Egypt/902786/06. This raises the possibility that ducks, being closer to the ancestors, could serve as a disseminating and/or amplifying host for the virus and possibly as the source of human infection.

Ducks have been shown to play a central role in the generation and maintenance of H5N1 viruses in China [14]. In Egypt, however, duck-derived sequences are scarce compared with chicken- or human-derived ones, even though ducks are extensively reared and consumed, mainly in the countryside. Considering the diversity of duck susceptibility to HPAI H5N1, it is necessary to understand the role of ducks in the emergence and maintenance of these viruses and in viral spread to other poultry species and humans.

The differences in the amino acids detected in our sequences relative to the 2006–2010 Egyptian sequences available in GenBank could result from using the original clinical materials directly for sequence analysis, without prior amplification in vivo (ECE) or in vitro (MDCK cells). Recently, Le et al. [13] showed that the PB2 gene population of H5N1 virus grown in ECE or MDCK cells did not reflect that of the original sample. A similar effect likely applies to our analysis, because the GenBank reference sequences were mostly obtained from ECE-derived viruses. Further, it may indicate that the viral variants harbored inside the infected host differ from those shed outside. Moreover, such species-associated changes could have been independently selected during replication in individual birds and/or individual tissues (Y. Watanabe and M.S. Ibrahim, unpublished results), reflecting possible differential evolution within individual species.

Human infections are mainly linked to previous contact with an unknown dead host. Thus, accurate molecular analysis of the H5N1 gene assemblage in different avian hosts, mainly ducks, as well as in humans would improve the detection of host-associated changes with the potential for viral spread within the human population and would help identify the source of infection and the as-yet-unidentified host behind it. Furthermore, our findings highlight the need to use the original clinical material as the source for viral sequence analysis, both to accurately understand the molecular evolution of H5N1 in individual hosts and to identify sequence changes that may facilitate cross-species infection.