Abstract
We have refined entropy theory to make it more convenient to explore the meaning of the growing body of sequence data on nucleic acids and proteins. The concept of selection constraint was not introduced; only the analyzed sequences themselves were considered. The refined theory serves as a basis for deriving a method to analyze non-coding regions (NCRs) as well as coding regions. Positions with maximal entropy, as opposed to positions with minimal entropy, might play the most important role in genome functions. The method was tested on the well-characterized coding regions of 12 strains of Classical Swine Fever Virus (CSFV) and on the non-coding regions of 20 strains of CSFV. It is suitable for analyzing the nucleic acid sequences of a complete genome and for detecting sensitive positions for mutagenesis. As such, the method provides a basis for elucidating the functional mechanism.
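As a rough illustration of ranking alignment positions by entropy, the sketch below computes the ordinary Shannon entropy of each column in a set of aligned sequences. It is not the refined theory itself, whose formulation is not given in the abstract. The function name `column_entropies` and the toy sequences are assumptions made for this example and are not CSFV data.

```python
import math
from collections import Counter


def column_entropies(aligned):
    """Return the Shannon entropy (in bits) of each column of an alignment.

    `aligned` is a list of equal-length strings. This is plain Shannon
    entropy, used here as a stand-in; it is an assumption, not the
    refined entropy measure of the paper.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    entropies = []
    for column in zip(*aligned):
        total = len(column)
        # Relative frequency of each symbol observed at this position.
        freqs = [n / total for n in Counter(column).values()]
        entropies.append(sum(-p * math.log2(p) for p in freqs))
    return entropies


# Hypothetical toy alignment (illustrative only).
toy = ["ACGT", "ACGA", "ACTC", "AGTG"]
h = column_entropies(toy)

# Positions ordered from highest to lowest entropy.
ranked = sorted(range(len(h)), key=lambda i: h[i], reverse=True)
```

In this toy alignment the last column, where all four symbols differ, has the highest entropy, and the fully conserved first column has the lowest.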
Cite this article
Xiao, M., Zhu, Z.Z., Liu, J. et al. A New Method Based on Entropy Theory for Genomic Sequence Analysis. Acta Biotheor 50, 155–165 (2002). https://doi.org/10.1023/A:1016587025917
DOI: https://doi.org/10.1023/A:1016587025917