Data on graphical representation (CGR and FCGR) of bacterial and archaeal species from two Soda Lakes

Abstract
This paper presents datasets generated using the Chaos Game Representation (CGR) and the Frequency Chaos Game Representation (FCGR) of bacterial and archaeal 16S rRNA sequences. The graphical representations were generated with the ENDMEMO tool. These computational representations are useful for the study and interpretation of microbial sequences. Based on a technique from chaotic dynamics, the method produces a picture of any gene sequence (DNA or RNA) that displays both local and global patterns. Eukaryotes and prokaryotes can be distinguished based solely on the visual representations generated from their sequences.
Value of the data
• The data generated in this study permit the representation and investigation of patterns in any type of sequence, visually revealing previously unknown patterns.
• The graphical data were generated from sequences using a tool derived from chaotic dynamical systems, which allows the frequencies of oligonucleotides to be depicted as images.
• The CGR and FCGR data are the main factors explaining the variability observed among sequences.
• The distance between images is helpful for measuring phylogenetic proximity.
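
The last point can be illustrated with a minimal sketch. An FCGR image of order k is a spatial arrangement of the 4^k oligonucleotide (k-mer) frequencies, so a cell-by-cell Euclidean distance between two images equals the Euclidean distance between the two k-mer frequency vectors. The choice of Euclidean distance, the default k = 3, and the function names `kmer_frequencies` and `image_distance` are assumptions made for illustration; the source does not state which distance measure is intended.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_frequencies(seq, k):
    """Relative frequency of every k-mer over A, C, G, T (RNA 'U' is read as 'T')."""
    seq = seq.upper().replace("U", "T")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # k-mers containing ambiguous symbols (e.g. N) are ignored in the total.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def image_distance(seq_a, seq_b, k=3):
    """Euclidean distance between two order-k frequency vectors (assumed metric)."""
    fa = kmer_frequencies(seq_a, k)
    fb = kmer_frequencies(seq_b, k)
    return sqrt(sum((a - b) ** 2 for a, b in zip(fa, fb)))
```

Calling `image_distance(seq_a, seq_b)` on two 16S rRNA sequences returns 0.0 for identical k-mer compositions and grows as the compositions diverge; other distance measures could be substituted in the same way.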

Data
This paper describes data on 16S rRNA sequences of bacterial and archaeal species isolated from two soda lakes, Sambhar Lake and Chilka Lake (India). The data, generated in the form of graphical representations, contain information on the distribution and counts of their oligonucleotides.
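
How such representations encode oligonucleotide distribution and counts can be sketched as follows. In a CGR, each base moves the current point halfway toward that base's corner of the unit square; in an FCGR of order k, the square is divided into a 2^k by 2^k grid whose cells hold k-mer counts. This is a minimal illustration, not the ENDMEMO implementation: the corner assignment (A bottom-left, C top-left, G top-right, T bottom-right), the handling of ambiguous symbols, and the function names are assumptions, and a given tool may use a different layout.

```python
# Assumed corner layout (x, y); tools may assign corners differently.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def clean(seq):
    """Upper-case, read RNA 'U' as 'T', and drop ambiguous symbols such as N."""
    return [b for b in seq.upper().replace("U", "T") if b in CORNERS]


def cgr_points(seq):
    """CGR coordinates: start at the centre, move halfway to each base's corner."""
    x, y = 0.5, 0.5
    points = []
    for base in clean(seq):
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def fcgr(seq, k):
    """k-mer counts on a 2^k x 2^k grid; row 0 is the bottom row (y near 0)."""
    bases = clean(seq)
    n = 2 ** k
    grid = [[0] * n for _ in range(n)]
    for i in range(len(bases) - k + 1):
        col = row = 0
        for j, base in enumerate(bases[i:i + k]):
            cx, cy = CORNERS[base]
            # Later bases in the k-mer select coarser quadrants of the square.
            col += cx << j
            row += cy << j
        grid[row][col] += 1
    return grid
```

The grid cell of each k-mer is computed with integer arithmetic, which gives the same cell as binning the corresponding CGR point but avoids floating-point rounding on long runs of one base. Plotting `cgr_points(seq)` as a scatter plot gives the CGR picture, and rendering `fcgr(seq, k)` as a heat map gives the FCGR image.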