BIOINFORMATICS ANALYSIS TO ASSESS THE HOMOLOGY AMONG INFLUENZA A VIRUSES AND OTHER PATHOGENS

The mechanisms by which influenza viruses cross species barriers to infect humans or other mammals, either causing dead-end infections or leading to subsequent human-to-human transmission, are unknown. Moreover, the properties of influenza viruses that have the greatest medical and public health relevance, such as human infectivity, transmissibility, and pathogenicity, appear to be complex and polygenic and are poorly understood (Morens et al., 2009). Influenza viruses are members of the Orthomyxoviridae family of RNA viruses and are grouped into types A, B, and C on the basis of their nucleoprotein (NP) and matrix protein characteristics. Type A influenza viruses are classified into subtypes based on two proteins on the surface of the virus, hemagglutinin (HA) and neuraminidase (NA) (Oliveira et al., 2003; Marjuki et al., 2007).

Every influenza A virus has a gene coding for 1 of 16 possible hemagglutinin (HA) surface proteins and another gene coding for 1 of 9 possible neuraminidase (NA) surface proteins. These two proteins (facilitating viral attachment and release) are critical for the infection of susceptible cells of a host (Portela and Digard, 2002).
Of the 144 total combinatorial possibilities (16 HA × 9 NA), only three HAs and two NAs, in only three combinations (H1N1, H2N2, and H3N2), have ever been found in truly human-adapted viruses (Morens et al., 2009). Influenza A viruses infect a large variety of mammals and birds, occasionally producing devastating pandemics in humans (Alexander and Brown, 2000). Epidemics frequently occur between pandemics as a result of gradual antigenic change in the prevalent virus; this phenomenon is termed antigenic drift (Laver et al., 1990). Three severe pandemics occurred during the 20th century (in 1918, 1958, and 1968): an H1N1 virus caused the 1918 "Spanish flu" pandemic, while an H3N2 virus was responsible for the 1968 "Hong Kong flu" pandemic (Taubenberger and Morens, 2006a, b).
All avian influenza viruses are classified as type A. Only four avian influenza A subtypes, H5N1, H7N3, H7N7, and H9N2, have jumped host species to infect humans (Bao et al., 2008). The H5N1 subtype, in particular, has been reported in 410 human cases and has caused 256 human deaths in 15 countries. In Egypt it has been reported in 57 human cases and has caused 23 human deaths, as reported on the World Health Organization website in 2009 (http://www.who.int/csr/disease/avian_influenza/country/cases_table_2009_03_10/en/index.html).
The species barrier reflects, at least in part, the different receptor preferences of mammalian and avian viruses. Researchers have suggested that human tracheal epithelial cells lack receptors for the attachment of avian influenza viruses and that avian tracheal epithelial cells lack the appropriate receptors for human viruses (Rogers et al., 1983). Pigs, however, possess receptors for both avian and mammalian viruses and are postulated to be the host in which influenza viruses of different origins can genetically reassort (Castrucci et al., 1994; Kida et al., 1994).
The genome of type A influenza viruses is single-stranded, negative-sense RNA; that is, it cannot be translated into protein directly upon entering the host cell. It consists of eight segments that encode 10 proteins (Huang et al., 1990; Webster et al., 1992; Portela and Digard, 2002).
Influenza viruses are highly variable. Mutations, including substitutions, deletions, and insertions, are among the most important mechanisms for producing variation in influenza viruses. The lack of proofreading by RNA polymerases contributes to replication errors (Robert et al., 2008). RNA recombination is another mechanism leading to genetic variation. Recombination in RNA viruses occurs by two different mechanisms: reassortment, and template switching or copy-choice replication (Posada et al., 2002). Recombination by reassortment is limited to viruses with segmented genomes, such as influenza viruses and rotaviruses (Lai, 1992). Reassortment occurs when two or more strains infect the same cell and exchange genomic segments during viral replication. This mechanism has been well studied for influenza A and is postulated to account for the emergence of antigenically and genetically novel viruses that evade the immune response and persist in the host's body (Worobey and Holmes, 1999; Posada et al., 2002). The second mechanism, copy-choice replication, can be used by either segmented or unsegmented viruses. Copy-choice is a process whereby the viral RNA-dependent RNA polymerase jumps from one RNA template to another during replication, creating a chimeric recombinant that is an amalgamation of both parental strands (Worobey and Holmes, 1999).
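As a rough illustration of why reassortment is such a strong source of novelty, the sketch below enumerates the segment combinations that two co-infecting strains could in principle produce. The eight segment names are the standard influenza A segments; the parent labels "avian" and "human" are hypothetical placeholders, and the sketch assumes every segment combination is viable, which is a simplification.

```python
from itertools import product

# The eight genome segments of influenza A.
SEGMENTS = ["PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS"]

def reassortants(parent_a, parent_b):
    """Return every genotype obtained by taking each segment from either parent."""
    genotypes = []
    for choice in product((parent_a, parent_b), repeat=len(SEGMENTS)):
        genotypes.append(dict(zip(SEGMENTS, choice)))
    return genotypes

# Hypothetical parent labels, used only for illustration.
all_genotypes = reassortants("avian", "human")

# A genotype is novel if it mixes segments from both parents.
novel = [g for g in all_genotypes if len(set(g.values())) == 2]

print(len(all_genotypes), len(novel))  # 256 254
```

Two parental strains with eight segments each give 2^8 = 256 possible genotypes, of which 254 differ from both parents. Copy-choice recombination is not modeled here, because it acts within a segment rather than by exchanging whole segments.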
In the past few years, there has been a worldwide effort to isolate and sequence the genomes of influenza A viruses, which has led to the deposition of more than 46,000 sequences in the Influenza Virus Resource of the National Center for Biotechnology Information (NCBI) (www.ncbi.nlm.nih.gov/genomes/FLU/FLU.html). As of May 25, 2009, the NCBI database included sequences from more than 220 strains of the 2009 swine-origin human influenza A (H1N1) virus isolated at various sites around the world. Consequently, the origin and recent history of new strains can be inferred from the study of the most similar deposited sequences. The percentage of matching nucleotides (the nucleotide identity) after nucleotide alignment, as determined with the NCBI Basic Local Alignment Search Tool (BLAST) or other tools, is a common measure of similarity used by researchers in the field.
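The nucleotide identity mentioned above can be sketched as follows for two sequences that have already been aligned. This is a minimal illustration, not the calculation BLAST itself reports: the function name, the gap character, and the choice of denominator (all alignment columns except those where both sequences have a gap) are assumptions, and other tools may define the denominator differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # a column with gaps in both sequences carries no information
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Toy sequences, invented for illustration only.
print(percent_identity("ATGCATGC", "ATGAATGC"))  # 87.5
print(percent_identity("ATG-C", "ATGAC"))        # 80.0
```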
To gain insight into the biology of this devastating disease, this study aimed to assess the relationships among influenza A viruses, including human and avian influenza viruses, and other pathogens using the basic local alignment search tool (BLAST) and a multiple sequence alignment (MSA) program for proteins.

Basic local alignment search tool (BLAST)
BLAST finds regions of local similarity between sequences. The program compares nucleotide or amino acid sequences to sequence databases and calculates the statistical significance of matches based on a pairwise alignment method. BLAST can be used to infer functional and evolutionary relationships between sequences. In addition, it helps identify members of gene families (http://www.ncbi.nlm.nih.gov/BLAST). The whole proteome of the influenza A virus [A/Chicken/Hong Kong/258/97 (H5N1)] isolate, which contains 10 proteins, was downloaded from the National Center for Biotechnology Information (NCBI) website (http://www.ncbi.nlm.nih.gov/). This proteome was then used to search the RefSeq database for similar sequences.
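To make the idea of local similarity concrete, the sketch below implements the Smith-Waterman algorithm, the exact dynamic-programming method for pairwise local alignment. It is not BLAST: BLAST uses a faster heuristic search and adds statistical significance estimates, and protein searches score residues with a substitution matrix rather than the simple match, mismatch, and gap values assumed here. The function name, the scoring parameters, and the toy sequences are all illustrative assumptions.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned part of a, aligned part of b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; scores never drop below zero, which keeps alignments local.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

# Toy sequences, invented for illustration only.
print(smith_waterman("AAAGATTACACCC", "TTGATTACAGG"))  # (14, 'GATTACA', 'GATTACA')
```

The shared region is recovered even though the flanking residues are unrelated, which is the property that makes local alignment suitable for finding short conserved stretches within otherwise dissimilar sequences.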

RefSeq database project
The Reference Sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq) is a nonredundant collection of richly annotated DNA, RNA, and protein sequences from diverse taxa. The collection includes sequences from plasmids, organelles, viruses, archaea, bacteria, and eukaryotes. Each RefSeq represents a single, naturally occurring molecule from one organism. The goal is to provide a comprehensive, standard dataset that represents sequence information for a species. It should be noted that RefSeq has been built using data from public archival databases only.

AlignX and ClustalW
ClustalW (Higgins and Sharp, 1988; Larkin et al., 2007) is a general-purpose multiple sequence alignment (MSA) program for DNA or proteins. It produces biologically meaningful multiple sequence alignments of divergent sequences. It calculates the best match for the selected sequences and lines them up so that the identities, similarities, and differences can be seen. Evolutionary relationships can be displayed as cladograms or phylograms.
The AlignX module provides rapid multiple sequence alignment with minimal preparation. AlignX uses a modified ClustalW algorithm to generate multiple sequence alignments of either protein or nucleic acid sequences for similarity comparisons and for annotation. The strength of AlignX is that it maintains annotated features within the alignment for easy visualization and localization of regions of interest.
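The step from an alignment to a tree can be sketched as follows: pairwise distances are computed from the aligned sequences, and the closest clusters are merged repeatedly. The sketch uses UPGMA (average-linkage clustering) because it is short; it is a swapped-in method for illustration and is not claimed to be the tree-building procedure used by AlignX or ClustalW. The function names, the distance definition (fraction of differing residues over ungapped columns), and the four toy sequences are assumptions.

```python
from itertools import combinations

def p_distance(a, b, gap="-"):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)

def upgma(aligned):
    """Build a nested-tuple tree from a dict mapping names to aligned sequences."""
    names = list(aligned)
    # Each cluster id maps to (subtree, number of leaves in that subtree).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {(i, j): p_distance(aligned[names[i]], aligned[names[j]])
            for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)  # closest pair of clusters
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Size-weighted average distance from the merged cluster to each remaining cluster.
        new_dist = {}
        for k in clusters:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            new_dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        dist = {pair: d for pair, d in dist.items() if i not in pair and j not in pair}
        dist.update(new_dist)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    (tree, _), = clusters.values()
    return tree

# Toy aligned sequences, invented for illustration only.
toy_alignment = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "TTGTACCA",
    "D": "TTGAAGCA",
}
tree = upgma(toy_alignment)
print(tree)  # (('A', 'B'), ('C', 'D'))
```

In this toy case, A and B differ at one position and C and D at two, so each pair is joined first and the two pairs are joined last.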

Phylogenetic relationships among influenza viruses
The Orthomyxoviridae are a family of RNA viruses with a single-stranded, segmented RNA genome of six to eight segments and three major genera: Influenzavirus, Isavirus, and Thogotovirus. The first genus includes some of the most important human viral pathogens. Influenza viruses A (eight RNA segments), B (eight RNA segments), and C (seven RNA segments) are responsible worldwide for most respiratory disease epidemics and are associated with thousands of deaths annually (Murphy and Webster, 1996).
The whole proteome (10 proteins) of the influenza A virus [A/Chicken/Hong Kong/258/97 (H5N1)] isolate was downloaded from NCBI (http://www.ncbi.nlm.nih.gov/) and used to search the RefSeq database for similar sequences. Each segment showed high similarity with several types of influenza viruses. Of the three virus types, A and B viruses are much more similar to each other in protein homology than to C viruses.
The general structural features and genome organization of influenza A, B, and C viruses suggest that they share a common ancestry distinct from other negative-strand RNA viruses (Desselberger et al., 1980). Because the virus cannot proofread its RNA, errors accumulate, resulting in a high mutation load (Robert et al., 2008). These accumulated mutations selectively permit influenza viruses to partially evade a host's immune system, so that new strains and lineages of influenza A, B, and C viruses can arise (Wong et al., 2006).

Phylogenetic relationships between influenza A viruses and Salmonella typhi
In addition to the observed similarities among the different types of influenza viruses, a BLAST search of six influenza A virus [A/Chicken/Hong Kong/258/97 (H5N1)] segments surprisingly showed highly significant similarity (>90%) with 12 sequences belonging to Salmonella typhi. Four of these Salmonella typhi protein sequences, which were similar to the viral polymerase subunits PB2 (encoded by viral genome segment No. 1) and PA (encoded by viral genome segment No. 3), two for each polymerase subunit, were downloaded and used to build four datasets, each containing the protein sequences similar to one of the four Salmonella typhi proteins. Multiple sequence alignment (MSA) was performed with AlignX to find the best MSA and to construct a phylogenetic tree within each dataset (Figures 1-4).
The MSA and phylogenetic results revealed that the Zp03359657 and Zp03374490 hypothetical proteins of Salmonella typhi shared similarities with the polymerase subunit PB2 encoded by viral segment No. 1. Moreover, the Zp03359711 and Zp03348148 hypothetical proteins shared similarities with the PA polymerase subunit encoded by viral segment No. 3.
These results (Figures 1-4) indicated that Salmonella typhi is the organism that acquired partial viral sequences, because complete gene sequences are known for different types of influenza viruses, whereas only short sequences (hypothetical proteins), and not the complete gene or protein sequences, could be found in Salmonella typhi.
The influenza A virus PB1 polymerase subunit did not show similarity with any of the Salmonella typhi sequences. In addition, no similarities were found between the Salmonella typhi sequences under investigation and other Salmonella strains.
Salmonella enterica typhi (referred to as Salmonella typhi) is the causative agent of typhoid fever, which affects roughly 17 million people worldwide annually, causing nearly 600,000 deaths. Salmonella enterica is a major cause of gastroenteritis in humans. Infections with this facultative intracellular pathogen originate in food-producing animals (cattle, pigs, and poultry) that are infected with S. enterica (Humphrey, 2004; Mastroeni et al., 2009). Members of the public are less likely to know that domestic pets, birds, rodents, and cold-blooded animals, including tropical fish and reptiles, also harbor S. enterica (Ward, 2000).
It is important to mention that the term viteria (bacteria-related sequences) was chosen to describe stealth-adapted viruses that had acquired bacterial genes. The term vifungus was also introduced, since some of the novel sequences in the stealth-adapted virus culture were of apparent fungal origin (Martin, 2005).
The highly similar sequences between different influenza A viruses and Salmonella typhi could be explained by the possibility that influenza A virus acquired the capacity to recombine with Salmonella typhi genetic sequences via nonhomologous recombination and presumably overcame the normal barrier restricting eukaryotic virus growth in prokaryotes. This argument is strengthened by the fact that both avian influenza A virus and Salmonella typhi are intestinal and intracellular pathogens that can infect humans, pigs, and avian species (Matrosovich et al., 2004; Mastroeni et al., 2009). Furthermore, bacterial superinfections after viral infections have been studied extensively in humans and animals (Beadling and Slifka, 2004), meaning that both pathogens can be found in the same cell at the same time. Nonhomologous recombination between viral RNA segments and Salmonella typhi could result in highly pathogenic strains of Salmonella typhi and influenza A viruses.
Many recombinant RNA virus strains provide ample indication that recombination can generate beneficial new variation. In some viruses this new variation is achieved by borrowing genetic material from their hosts. Influenza A virus has been observed to recombine with cellular RNA, resulting in increased pathogenicity of the hybrid viruses (Khatchikian et al., 1989). Recombination between virus and host genetic material evidently occurs in plant viruses as well, as illustrated by a luteovirus isolate with a 5'-terminal sequence derived from a chloroplast exon (Mayo and Jolly, 1991) and by closteroviruses that have acquired host cellular protein-coding genes (Dolja et al., 1994), which are nonessential for replication and virion production (Peremyslov et al., 1998).
Horizontal gene transfer is the transfer of genetic material between cells or genomes belonging to unrelated species by processes other than usual reproduction. In the usual process of reproduction, genes are transferred vertically from parent to offspring, and such a process can occur only within a species or between closely related species. Horizontal gene transfer, in which a significant proportion of the coding sequence is contributed by external sources, might give rise to extremely dynamic genomes, which affects the ecological and pathogenic characteristics of the recipient organisms. The results of this study will likely encourage scientists in several fields to rethink their approach to the study of host-virus systems, which are believed to play a key evolutionary role by facilitating the transfer of genes between species.

SUMMARY
Influenza A virus causes annual epidemics and, every 10 to 50 years at unpredictable intervals, major pandemics. Its high mutation rate, the ability of its gene segments to reassort, recombination, and the huge pool of influenza viruses in birds and mammals generate a high degree of genetic diversity, which explains its changing behavior and the difficulty of developing a permanent, long-lasting, and effective vaccine. The whole proteome of the influenza A virus [A/Chicken/Hong Kong/258/97 (H5N1)] was used to search the RefSeq database for similar sequences. Each segment showed high similarity to several types of influenza viruses. In addition to the observed similarity among the different types of influenza viruses, the BLAST search showed high similarity between different influenza A virus [A/Chicken/Hong Kong/258/97 (H5N1)] proteins and twelve sequences belonging to Salmonella typhi. Only four of the twelve Salmonella typhi sequences were chosen for further analysis.
BLAST search and multiple sequence alignment of these four sequences identified highly significant homology to the PB2 and PA polymerase subunits (two Salmonella sequences for each polymerase subunit). These findings highlight the dynamic interface between bacterial and viral genomes and the potential of this interaction in the emergence and spread of novel and more virulent viral and bacterial pathogens.