CROSS-DISCIPLINARY PHYSICS AND RELATED AREAS OF SCIENCE AND TECHNOLOGY

Chaos game representation of functional protein sequences, and simulation and multifractal analysis of induced measures

, , , and

2010 Chinese Physical Society and IOP Publishing Ltd
, , Citation Yu Zu-Guo et al 2010 Chinese Phys. B 19 068701 DOI 10.1088/1674-1056/19/6/068701

1674-1056/19/6/068701

Abstract

Investigating the biological function of proteins is a key aspect of protein studies. Bioinformatic methods become important for studying the biological function of proteins. In this paper, we first give the chaos game representation (CGR) of randomly-linked functional protein sequences, then propose the use of the recurrent iterated function systems (RIFS) in fractal theory to simulate the measure based on their chaos game representations. This method helps to extract some features of functional protein sequences, and furthermore the biological functions of these proteins. Then multifractal analysis of the measures based on the CGRs of randomly-linked functional protein sequences are performed. We find that the CGRs have clear fractal patterns. The numerical results show that the RIFS can simulate the measure based on the CGR very well. The relative standard error and the estimated probability matrix in the RIFS do not depend on the order to link the functional protein sequences. The estimated probability matrices in the RIFS with different biological functions are evidently different. Hence the estimated probability matrices in the RIFS can be used to characterise the difference among linked functional protein sequences with different biological functions. From the values of the Dq curves, one sees that these functional protein sequences are not completely random. The Dq of all linked functional proteins studied are multifractal-like and sufficiently smooth for the Cq (analogous to specific heat) curves to be meaningful. Furthermore, the Dq curves of the measure μ based on their CGRs for different orders to link the functional protein sequences are almost identical if q ≥ 0. Finally, the Cq curves of all linked functional proteins resemble a classical phase transition at a critical point.

Export citation and abstract BibTeX RIS

10.1088/1674-1056/19/6/068701