Investigating Some Attributes of Periodicity in DNA Sequences via Semi-Markov Modelling

  • Conference paper
Stochastic Processes, Statistical Methods, and Engineering Mathematics (SPAS 2019)

Part of the book series: Springer Proceedings in Mathematics & Statistics ((PROMS,volume 408))

Abstract

The periodicity of DNA segments and sequences has been studied thoroughly over the past decades. One of the main problems is the identification of protein-coding and non-coding regions inside genes using mathematical techniques. Periodicity plays an important role in the structure of DNA, as specific regions have been shown to exhibit periodic patterns. In this paper, we assume that a DNA sequence is described by a semi-Markov chain (SMC) with a discrete state space consisting of the four nucleotides. Closed-form analytic equations are derived to characterize strong or weak d-periodic and quasiperiodic behaviour of the model in both the homogeneous and the non-homogeneous case. The model is applied to 3-base periodic sequences, which characterize the protein-coding regions of a gene. The related probabilities and the corresponding indexes are provided, yielding a description of the underlying periodic pattern. Finally, the theoretical results are illustrated with data from synthetic and real DNA sequences.
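As a rough illustration of the ingredients mentioned in the abstract, the following Python sketch shows one possible way to view a nucleotide string as a semi-Markov trajectory and to probe d-periodicity empirically. It is not the authors' method: the abstract does not specify how states and sojourn times are defined, so the run-length encoding used here (a sojourn is a maximal run of one nucleotide) is an assumption, and the function names `runs`, `embedded_transition_probs`, and `lag_match_fraction` are hypothetical helpers introduced only for this example.

```python
from collections import Counter, defaultdict
from itertools import groupby


def runs(seq):
    """Run-length encode seq into (nucleotide, sojourn_length) pairs.

    Assumption: a sojourn is a maximal run of identical nucleotides.
    """
    return [(base, sum(1 for _ in group)) for base, group in groupby(seq)]


def embedded_transition_probs(seq):
    """Empirical transition probabilities between successive distinct states."""
    states = [base for base, _ in runs(seq)]
    counts = defaultdict(Counter)
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    return {
        a: {b: n / sum(c.values()) for b, n in c.items()}
        for a, c in counts.items()
    }


def lag_match_fraction(seq, d):
    """Fraction of positions i for which seq[i] == seq[i + d].

    A crude empirical indicator of d-periodicity, not the paper's index.
    """
    pairs = len(seq) - d
    if pairs <= 0:
        raise ValueError("sequence must be longer than the lag d")
    return sum(seq[i] == seq[i + d] for i in range(pairs)) / pairs
```

For a perfectly 3-periodic toy sequence such as `"ACGACGACG"`, `lag_match_fraction(seq, 3)` equals 1.0 while `lag_match_fraction(seq, 1)` equals 0.0. Real coding regions would be expected to show a much weaker contrast between lags.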

Author information

Corresponding author

Correspondence to Pavlos Kolias.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kolias, P., Papadopoulou, A. (2022). Investigating Some Attributes of Periodicity in DNA Sequences via Semi-Markov Modelling. In: Malyarenko, A., Ni, Y., Rančić, M., Silvestrov, S. (eds) Stochastic Processes, Statistical Methods, and Engineering Mathematics. SPAS 2019. Springer Proceedings in Mathematics & Statistics, vol 408. Springer, Cham. https://doi.org/10.1007/978-3-031-17820-7_9
