Is There a Twelfth Protein-Coding Gene in the Genome of Influenza A? A Selection-Based Approach to the Detection of Overlapping Genes in Closely Related Sequences

Journal of Molecular Evolution

Abstract

Protein-coding genes often contain long overlapping open reading frames (ORFs), which may or may not be functional. Current methods that use the signature of purifying selection to detect functional overlapping genes are limited to the analysis of sequences from divergent species, which makes them inapplicable to genes found only in closely related sequences. Here, we present a method for detecting selection signatures on overlapping reading frames by using closely related sequences. We apply the method to several known overlapping genes and to an overlapping ORF on the negative strand of segment 8 of influenza A virus (NEG8), which has been suggested to be functional. We find no evidence that NEG8 is under selection, suggesting that the intact reading frame might be non-functional, although we cannot fully exclude the possibility that the method is not sensitive enough to detect the signature of selection acting on this gene. We present the limitations of the method using known overlapping genes and suggest several approaches to improve it in future studies. Finally, we examine alternative explanations for the sequence conservation of NEG8 in the absence of selection. We show that overlap type and genomic context affect the conservation of intact overlapping ORFs and should therefore be considered in any attempt to estimate the signature of selection in overlapping genes.
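To make the notion of an intact ORF on the negative strand concrete, the following minimal Python sketch scans the reverse complement of a nucleotide sequence for the longest ATG-to-stop reading frame. It is not the selection-based method described in the abstract; it only illustrates the prerequisite step of locating a candidate antisense ORF such as NEG8. The function names (`reverse_complement`, `longest_orf`) and the toy input sequence are hypothetical and chosen for illustration only.

```python
from typing import Tuple

# Illustrative sketch only: locates a candidate ORF, it does not test for selection.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_orf(seq: str) -> Tuple[int, int, int]:
    """Return (frame, start, length_nt) of the longest ATG-to-stop ORF.

    All three frames of the given strand are scanned. The length includes
    the stop codon. Returns (0, 0, 0) when no complete ORF is found.
    """
    seq = seq.upper()
    best = (0, 0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                length = i + 3 - start
                if length > best[2]:
                    best = (frame, start, length)
                start = None  # close the ORF and keep scanning
    return best


# Toy positive-strand sequence (hypothetical, not influenza data).
positive_strand = "TCAGGCCAT"
negative_strand = reverse_complement(positive_strand)
candidate = longest_orf(negative_strand)
```

Finding such a frame shows only that the ORF is intact. As the abstract stresses, an intact ORF can persist without being functional, so a separate test for a selection signature is still required.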



Acknowledgments

We thank Dr. Chris Upton for suggesting this problem and providing valuable information. DG and NS were supported by a Small Grant Award from the University of Houston and by the US National Library of Medicine grant LM010009-01 to DG and Giddy Landan.

Conflict of interest

The authors declare that they have no conflict of interest.

Author information

Corresponding author

Correspondence to Niv Sabath.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (ZIP 1,348 kb)

About this article

Cite this article

Sabath, N., Morris, J.S. & Graur, D. Is There a Twelfth Protein-Coding Gene in the Genome of Influenza A? A Selection-Based Approach to the Detection of Overlapping Genes in Closely Related Sequences. J Mol Evol 73, 305–315 (2011). https://doi.org/10.1007/s00239-011-9477-9

  • DOI: https://doi.org/10.1007/s00239-011-9477-9
