Research Topics of the Bioinformatics of Gene Regulation

The study of gene expression regulation raises the challenge of developing bioinformatics tools and algorithms, demanding data integration. A recent journal Special Issue, "Bioinformatics of Gene Regulations and Structure-2022", collected papers on bioinformatics applications originally based on the materials presented at "Bioinformatics of Genome Regulation and Structure/Systems Biology" (BGRS\SB-2022) in July 2022 in Novosibirsk, Russia (https://bgrssb.icgbio.ru/2022, accessed on 10 May 2023). BGRS is a biennial bioinformatics conference series that has taken place in Novosibirsk since 1998. It facilitates broad scientific discussion of achievements in systems biology and genomics, and its materials are published in thematic journal issues [1][2][3]. Based on these conference discussions, we can review current trends in medical genomics [4] and cancer genomics [5]. Gene expression regulation at the transcriptional level was a key problem discussed at the conference series and in IJMS Special Issues [5][6][7].
The molecular mechanisms of human disease progression [4], as well as gene expression in laboratory animal and plant models [2], have been studied using omics technologies. The analytical techniques presented were discussed at the BGRS meeting series in Novosibirsk, Russia [1][2][3], highlighting recent advances in the fields of evolution, biomedicine, and biotechnology, and have been reviewed in existing (https://www.mdpi.com/journal/ijms/special_issues/Bioinformatics_Gene, accessed on 10 May 2023) and new IJMS issues (https://www.mdpi.com/journal/ijms/special_issues/MVA479KFR7, accessed on 10 May 2023). In fact, there are ten thematic paper collections (Special Issues) on bioinformatics, gene expression regulation, computational genomics, and computational biology in MDPI journals that refer to past conferences of the BGRS series [1,8].
This collection builds on studies in the field of gene expression initially presented in Frontiers in Genetics [4] and BioMed Central Special Issues [1], and also in International Journal of Molecular Sciences [2,3,5]. The examination of this topic has also continued through thematic issues in Genes [6] (www.mdpi.com/journal/genes/special_issues/Transcriptional_Regulation_Tumor, accessed on 10 May 2023) and Life [7] (https://www.mdpi.com/journal/life/special_issues/computational_genomics_life, accessed on 10 May 2023). Here, we have collected papers with an overarching theme of bioinformatics of gene expression regulation, initially considering biomedical applications [2,5].
The issue "Bioinformatics of Gene Regulations and Structure-2022" showcased insights into the fields of genomics, transcriptomics, and proteomics, as well as the works conducted in model organisms. This Special Issue contains nine research manuscripts and one review, each concerning a bioinformatics solution or tool for the analysis of the molecular mechanisms of gene expression regulation.
The papers present novel bioinformatics models on gene expression regulation and chromatin immunoprecipitation-sequencing (ChIP-seq) technology to analyze transcription factor binding to DNA, gene regulation at transcription and translation levels, single cell sequencing tools, and applications in model organisms.
We open this collection of papers with works discussing the molecular mechanisms by which DNA binding regulates gene transcription. Anastasia Melikhova et al. [9] discussed evolutionary constraints on the DNA double helix in RNA Pol II core promoters. This classical approach used representative sets of aligned gene promoter sequences of fifteen eukaryotic species. The evolutionarily stable and heterogeneous secondary structure of Pol II core promoters was revealed, including the TATA-box [10]. It should be noted that the TATA-box structure and Polymerase II binding have been described in detail, along with the development of specialized tools, in recent IJMS publications in the "Bioinformatics of Gene Expression" Special Issue by M.P. Ponomarenko and colleagues [11,12] in relation to the same conference series.
Gene expression is regulated at the transcription and translation levels [10,13]. Bioinformatics methods for assessing the efficiency of different stages of gene expression, including translation elongation, were presented by Aleksandra Korenskaia et al. [14]. The authors estimated mRNA sequence features, such as codon usage bias and mRNA secondary structure properties, and evaluated correlation coefficients between experimentally measured protein abundance and predicted elongation efficiency characteristics for a set of prokaryotes belonging to diverse taxonomic groups. The study extends the results on elongation efficiency estimates presented earlier [15].
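The kind of analysis described above, relating a sequence-derived feature to measured protein abundance, can be illustrated with a minimal sketch. It is not the method of Korenskaia et al. [14]: the feature used here (the fraction of codons drawn from a user-supplied "optimal" set) and the function names are assumptions chosen for illustration, and Spearman rank correlation stands in for whichever correlation coefficients the authors evaluated.

```python
def codons(seq):
    """Split a coding sequence into complete codons (trailing bases are dropped)."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def optimal_codon_fraction(seq, optimal):
    """Fraction of codons in seq that belong to a hypothetical 'optimal' codon set."""
    cs = codons(seq)
    if not cs:
        return 0.0
    return sum(c in optimal for c in cs) / len(cs)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data, invented for illustration only.
optimal = {"CTG", "AAA", "GAA"}          # hypothetical optimal codons
genes = {
    "geneA": "ATGCTGCTGAAA",
    "geneB": "ATGTTATTAAAG",
    "geneC": "ATGCTGGAAAAA",
}
abundance = {"geneA": 120.0, "geneB": 5.0, "geneC": 300.0}  # made-up values

names = sorted(genes)
feature = [optimal_codon_fraction(genes[g], optimal) for g in names]
measured = [abundance[g] for g in names]
rho = spearman(feature, measured)
```

A rank-based coefficient is used in the sketch because protein abundance typically spans several orders of magnitude; any other feature, such as a predicted mRNA folding energy, could be substituted for the codon fraction.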
Gene expression is controlled at the nucleotide and structural levels, as well [16]. The next group of papers discussed applications of the chromosomal regulation of gene expression in animal models. Artem Nurislamov and co-authors [17] used a lampbrush chromosome model to analyze DNA methylation in chicken. The authors performed a single-cell methylome analysis of chicken diplotene oocytes. A. Nurislamov and colleagues characterized methylation patterns in these cells, obtained a methylation-based segmentation of the chicken genome, and identified oocyte-specific methylated gene promoters. The role of chromatin architecture alterations in genomes has been discussed in our topical journal issue by the same research group (V. Fishman and colleagues) [18]. In the next study, Evelyn Kabirova et al. [19] reviewed distal gene expression regulation in animals, considering the evolution of the loop extrusion machinery. The loop extrusion machinery consists of various proteins that arose at different times in evolution [20]. The evolutionarily conserved core of all extrusion complexes is formed by SMC (structural maintenance of chromosomes) proteins. The review by Kabirova and co-authors [19] covers the roles of SMC complexes in gene regulation, DNA repair, and chromatin topology.
Evgenia Solodneva et al. [21] considered genetic models in cattle. The authors analyzed the genetic structures of transboundary and local cattle based on STR (short tandem repeat) markers. To determine the population genetic characteristics and clarify the phylogenetic relationships of modern representatives of hundreds of cattle populations from different regions of the world, the authors analyzed a large set of STR data (more than 10 individuals) including unique native cattle populations and breeds. The classification extends the results presented recently by this research group [22].
The next group of papers showed new bioinformatics tools for transcription regulatory region analysis [23]. ChIP-seq technology provides abundant data for the annotation of regulatory modules and nucleotide motifs in gene promoter regions [10,24]. Victor Levitsky et al. [23] presented a new web server, Web-MCOT (Motifs Co-Occurrence Tool), for searching ChIP-seq data for co-occurring nucleotide motifs. Note that, in a paper by the same research group highlighted in an IJMS Special Issue [25], V. Levitsky and colleagues studied the cooperative binding of transcription factors to DNA as a mechanism of transcription regulation. Transcription factor binding and the clustering of binding sites based on ChIP-seq have been studied in plant [26] and yeast genomes [27]. However, the shift from bulk sequencing to single-cell sequencing and its variants demands the development of new tools [28]. In this Special Issue, Mikhail Raevskiy and colleagues [29] presented the Epi-Impute software for single-cell RNA-seq (scRNA-seq) and ATAC-seq analysis. Because mRNAs are captured inefficiently in individual cells, scRNA-seq data contain many dropouts, which hamper downstream analyses [30]. The suggested method is intended for dropout imputation by reconciling expression and epigenomic data. Epi-Impute leverages single-cell ATAC-seq data to reduce the number of dropouts [29]. Work by the same group (Y. Medvedeva and colleagues) on the application of scRNA-seq data analysis was published in an IJMS issue on medical bioinformatics [31].
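The general idea of using chromatin accessibility to decide which zero counts are likely dropouts can be shown with a deliberately simplified sketch. It is not the Epi-Impute algorithm [29]: the rule used here (fill a zero with the cell-type mean of non-zero counts when the gene is marked accessible in that cell type), the function name, and the data layout are all assumptions made for illustration.

```python
def impute_dropouts(expr, cell_types, accessible):
    """
    Toy accessibility-guided imputation (illustrative assumption, not Epi-Impute).

    expr:        dict mapping gene -> list of counts, one per cell
    cell_types:  list of cell-type labels, one per cell
    accessible:  dict mapping (gene, cell_type) -> bool, e.g. derived from
                 scATAC-seq signal near the gene; missing keys mean False

    A zero count is replaced by the mean of the non-zero counts of the same
    gene in cells of the same type, but only if the gene is accessible in
    that cell type. All other values are left unchanged.
    """
    imputed = {}
    for gene, counts in expr.items():
        new_counts = list(counts)
        for ctype in set(cell_types):
            idx = [i for i, t in enumerate(cell_types) if t == ctype]
            nonzero = [counts[i] for i in idx if counts[i] > 0]
            # Nothing to borrow from, or closed chromatin: keep the zeros.
            if not nonzero or not accessible.get((gene, ctype), False):
                continue
            fill = sum(nonzero) / len(nonzero)
            for i in idx:
                if counts[i] == 0:
                    new_counts[i] = fill
        imputed[gene] = new_counts
    return imputed


# Toy data, invented for illustration only.
expr = {"G1": [4, 0, 2, 0], "G2": [0, 0, 5, 0]}
cell_types = ["T", "T", "B", "B"]
accessible = {("G1", "T"): True, ("G1", "B"): False, ("G2", "B"): True}

result = impute_dropouts(expr, cell_types, accessible)
```

In this toy example, the zero for G1 in the second T cell is filled because G1 is accessible in T cells, whereas the zero for G1 in the last B cell is kept as a plausible true absence of expression.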
Finally, a group of papers presents text mining and machine learning applications in gene regulation studies. Vladimir Ivanisenko et al. [32] present a new version of the ANDDigest (Associative Network Design-Digest) tool with improved AI-based recognition of short gene names. This tool, an extension of the ANDSystem (Associative Network Discovery System) [33,34] for the text mining of scientific literature and network reconstruction, has been successfully applied to a range of biomedical problems [35] discussed in IJMS Special Issues (https://www.mdpi.com/journal/ijms/special_issues/Medical_Genetics_2022, accessed on 10 May 2023). Currently, it is employed in applications for model plant organisms [36]. Aleksandar Veljkovic and colleagues [37] developed a new tool, BioGraph, for querying diverse biological metadata. The presented model enables information retrieval using interlinked entities from highly diverse biological datasets in a unified manner, extending the capabilities of the ANDDigest tool [32,33].
Zeping Cai et al. [38] presented the genome-wide mining of tandem duplicated type III polyketide synthases in the Senna tora crop. The authors analyzed the tandem duplicated genes of S. tora. The study [38] provides helpful information for the further functional analysis of the CHS-L genes in the regulation of anthraquinone biosynthesis in S. tora [39].
The problems associated with plant bioinformatics and discovering molecular mechanisms of gene expression in plant models are presented in a parallel Special Issue, "Plant Biology and Biotechnology: Focus on Genomics and Bioinformatics 2.0" (https://www.mdpi.com/journal/ijms/special_issues/PlantBi_Biology, accessed on 10 May 2023). It can be noted that the next PlantGen-2023 conference, "Plant Genetics, Genomics, Bioinformatics and Biotechnology", will take place in Kazan, Russia, in July 2023 (https://plantgen2023.ofr.su/, accessed on 10 May 2023). The PlantGen conference series originated from the Institute of Cytology and Genetics SB RAS events, and developed in parallel to the BGRS conference series on bioinformatics. We collected bioinformatics papers on previous topical issues [2], as well.
Thus, the current Special Issue series on bioinformatics has confirmed the research interest in gene expression regulation studies [2,3,5]. We can summarize the content of the papers on this research topic as covering regulatory region sequence analysis, ChIP-seq studies, and epigenetic and single-cell sequencing applications. The model genomes studied range from human and animal genomes to recently annotated plant genomes.
The guest editors are happy to announce the Special Issue topic "New Sights into Bioinformatics of Gene Regulations and Structure" (https://www.mdpi.com/journal/ijms/special_issues/MVA479KFR7, accessed on 10 May 2023) of MDPI's IJMS, as well as the next conference on systems biology and bioinformatics in Russia in 2023 (https://conf.icgbio.ru/sblai2023/, accessed on 10 May 2023), which considers Artificial Intelligence applications in bioinformatics. We also note the recent VII Congress of Russian Biophysicists in Krasnodar, Russia (http://rusbiophysics.ru/db/conf.pl?cid=1&lang=en&div=1, accessed on 10 May 2023), and the range of biophysical models related to molecular mechanisms of gene expression presented there. We hope that readers find these materials interesting and stimulating, and we will continue to collect papers on gene expression regulation for new IJMS journal issues.