Boolean modelling as a logic-based dynamic approach in systems medicine

Graphical abstract


Introduction
Extensive amounts of omics data generated to understand disease mechanisms require interpretation to formulate meaningful hypotheses [1]. Pathway databases [2][3][4] give an overview of disease-related processes, while mathematical models give qualitative and quantitative insights into their complexity. Similarly to pathway databases, mathematical models are stored and shared using dedicated platforms [5][6][7][8][9]. Moreover, communitydriven initiatives such as disease maps [10] encode diseasespecific mechanisms in both computable and diagrammatic form using dedicated tools for diagram biocuration [11][12][13] and visualisation [14,15]. In all cases, computationally readable content can be used as a scaffold to build dynamic models in an automated fashion to investigate the dynamic properties of the system [16].
Modelling of a biological process depends on the scope and it may vary depending on the nature of the process (e.g.signalling vs metabolic). The experimental design and resulting data also influence the model structure and analysis [17]. Dynamic modelling approaches include Boolean or Multi-valued Networks [18], Petri nets [19] or Ordinary Differential equations (ODEs) [20]. However, model parameterisation is a challenging task [21] making logical models an interesting alternative [22,23]. Boolean models are qualitative rather than quantitative and do not require detailed kinetic information. However, in some research areas, such as pharmacogenomics, presenting data to simple Boolean models may be challenging, and does not introduce the best description of the biological system [24]. Therefore, researchers studied the qualitative nature of Boolean models, facilitating the integration with other quantitative methods to allow better analysis [25][26][27]. Such methods, including ODEs and Petri nets, combined with BNs and constraint-based models, show that Boolean models are useful scaffolds for quantitative models [24].
Boolean network (BN) represents a dynamic system under the Boolean formalism, where the state of biomolecules has two possible values, one or zero, and changes following their interactions described by Boolean functions (BFs). BFs define the state of the outputs based on the interaction inputs and their internal logic. The order of evaluating BFs in a BN is governed by an updating scheme [28]. This straightforward framework of BNs was applied to a range of biological problems [29][30][31] and model qualitative behaviours [32,33]. Here, we review the application of Boolean modelling to systems medicine problems. First, we explain the modelling process itself, and follow by reviewing applications of Boolean modelling in clinical and translational medicine. We conclude by discussing emerging tools and methods improving the reproducibility and reuse of such models in biomedical research.

Boolean modelling process
Boolean models represent a logical formalism, where available variables have binary values one (ON) or zero (OFF). These variables are connected by BFs, which define the state of output variables based on input variables. BFs (interactions) connect the variables (nodes) in a model, creating a Boolean Network (BN). The updating schemes define conditions and order in which the BFs are calculated.
A BN can describe the dynamics of a biological system, where biomolecules are represented by the variables and BFs encode the interactions between the biomolecules, describing the behaviour of regulated outputs based on regulator inputs. Connectivity and logic of BNs are usually constructed based on biocuration, by defining relevant biomolecules and interactions based on available literature. BNs can also be built from mechanistic data, like time series or phosphoproteomics [28,34]. Finally, the updating schemes of BNs need to be selected, to govern the transitions of BN components from one state to another following the defined BFs.

Building the model structure
BNs can be manually built from literature. This requires selection of key components (nodes) to represent the biological system, including non-physical elements such as pathway endpoints (phenotypes). The interactions (edges) between these network components are based on relevant literature and represent logical dependencies of the components, i.e. the BFs of the system [35]. When building a BN, it is crucial to define its purpose and use cases to better determine information that should be collected and kept throughout the analysis [36]. Importantly, a process of manual curation depends on the expertise and reasoning of the curator. To ensure reliability of such models, curated BNs should be well documented, harmonised with community guidelines, and reproducible with the corresponding data [37]. The literature based curation can be complemented by model inference from timeseries data, which allows constructing hypothesis-free models [38][39][40][41].
For gene expression time series, the data is binarised based on a given threshold to infer ON/OFF states of particular genes in each time step. There are a range of possible approaches for determining an optimal threshold [38,42], and they should be chosen based on the nature of the measured biological process. After binarization, BFs can be inferred based on the sequence of state changes of selected genes, which may involve adjustments of the original binarization thresholds [43,44]. Importantly, such inference may be inaccurate for coexpressed genes [43]. To improve accuracy, different binarization methods can be combined to reduce the network complexity and rank common candidates by the performance of inferred models [45]. Another approach to infer the BN from expression data is to first construct a regulatory network and then assign BFs, avoiding data binarization. Here, the TIGRESS algorithm [40] can be used to infer the network and pass it to the TaBooN workflow [41] that infers the fittest BFs based on the compatibility with given expression profiles by using a Tabu metaheuristic [46]. BNs built from literature or inferred from expression data may require further refinement. Such refinement is possible by matching the performance of a BN against additional datasets [29], like perturbation experiments or phosphoproteomic readouts. Based on such data, a BN can be reduced to a version that best explains the validation data. This search can get computationally expensive, requiring heuristic-based approaches to identify a set of the most fitting BNs. For instance, in [47] the algorithm weights the fit between the data and a BN by measuring the deviation between the model steady states and the perturbation data in matched conditions (knockouts, overexpressions). However, this approach is not scalable in large complex networks. To solve this problem, caspo workflows [48] were proposed to infer all optimised BNs by using the Answer Set Programming (ASP) [49]. This approach was used to infer BNs of different cell lines from curated networks and infer their BFs based on phosphoproteomic datasets [48]. Importantly, for each perturbation condition only the corresponding part of the BN is analysed, selected based on the downstream phenotype. The ASP then identifies a set of BNs which fit the experimental data and satisfy a specific condition.

Construction of Boolean functions
Biological processes can be modelled as interactions of biomolecules, controlled by their regulators, where BFs describe the relationship between all interaction participants (see Fig. 1). The state of a biomolecule is changed based on these BFs in an iterative manner [50]. In each iteration, the states of biomolecules are changed based on a chosen updating scheme [28].
The three basic BFs are AND, OR, and NOT. BFS can be represented using a truth table in which each row represents a combination of Boolean variables and their output values (see Fig. 1). Other types of BFs are not as widely used, as they describe complex and non-intuitive relationships [51]. One of such functions is a canalysing function, allowing to define a hierarchical relationship between multiple input variables of a BF [52]. A canalysing function has a defined structure with at least one input and a fixed output. An input takes a specific value and determines the value of the function, making the network stable [52].
BFs can also be constructed by probability distributions [51] to represent combinatorial effects of regulations in a simple and interpretable representation [53]. In models with such BFs, called threshold Boolean Networks (TBNs), biomolecule regulation is an additive process, in which the operator functions describe the sequence of events in the regulatory system [32,54]. The dynamics of a TBN can be described by: x i ðt þ 1Þ represents the expression of the regulated biomolecule i at the next time.
(t þ 1Þ, and the interaction coefficient a ij refers to the strength and type of regulation that biomolecule j exerts on i. Positive regulation is specified by positive values of and negative regulation by negative values of a ij. Any regulation is a product of the regulator's state x j ðtÞ and the type and strength of the regulation a ij. The next state of a biomolecule depends on its regulator's state. In particular, the next state x i ðt þ 1Þ of a biomolecule i is ON if the sum of its regulators' regulatory effects surpasses 0, OFF if the sum is below 0, and when the sum is 0, the state remains the same.

Analysis of Boolean models
The dynamics of a model are simulated by incremental execution of one or more of its BFs. They change the states corresponding biomolecules in a series of discrete time steps called transitions. The spectrum of all possible transitions is illustrated using the state transition graph. The vertices represent 2 n possible states (n: number of the network elements) and the edges represent the transition from a state (s) to another (s0) as follows:: In which f i is the updating function and ðx i1 ; x i2 ; :::x ik ) are state variables.
The order at which BFs are executed is governed by an updating scheme [55]. The most frequently used are synchronous, asynchronous and hybrid updating schemes [35]. The synchronous scheme updates the state of all biomolecules at the same time (see Fig. 2.A) and is deterministic in nature. Another class is the asynchronous scheme, which updates randomly the state of a single biomolecule per transition (see Fig. 2.A), making each execution non-deterministic [56]. This makes the runtime considerably longer, especially for complex networks. This limitation is addressed by improved updating schemes such as random order asynchronous [57] and deterministic asynchronous [58] that apply reduction techniques to simplify the BNs. Another limitation of synchronous and asynchronous schemes is that they may inaccurately represent mechanisms that need more than one time step. This limitation is addressed by probabilistic BNs (PBN). This scheme assigns probabilities to the BFs, and biomolecules are updated based on this probability (see Fig. 2B) [35,53].
In a hybrid updating scheme, using synchronous and asynchronous schemes is possible. The synchronous update can include time delays in interactions between regulators and their final products [35,38]. This scheme assumes partitioning the network into groups, the variables at each group are synchronously updated, but the update between groups is asynchronous, causing time delays during the executing partitions of a network [35,59].

Attractor analysis
A simulated model can reach a stable dynamic behaviour, where the states of the biomolecules converge to a stable configuration, called an attractor, which is interpreted as a physiological endpoint [35,50].
An attractor is a state of a BN with no outgoing edges in the state transition graph. Attractors can be classified as i) stable states (fixed points) which are time invariant, and ii) complex attractors -sets of possible outcomes that can be reached following the synchronous and asynchronous scheme [60]. The set of states within an attractor is called the basin of attraction. It can be interpreted as a set of possible biological scenarios, supporting testable hypotheses [61]. In synchronous and deterministic asynchronous schemes, the system may oscillate regularly when attractors form a limit cycle, and each node has not more than one successor. An example of a limit cycle is the cell cycle in models of a eukaryotic cell [62][63][64]. In a stochastic asynchronous scheme, the system may oscillate irregularly due to the random initial condition selection leading to loose attractors. That means the network does not oscillate in a cycle due to the target node having more than one successor. It is challenging to interpret complex attractors with large numbers of steady states that oscillate in an irregular cycle.
To find an attractor, the past states of the model are compared to the updated ones to find recurring patterns. This search process can be exhaustive or heuristic. An exhaustive search starts from all states synchronously until the attractor is reached. This mode is mostly limited to small-size networks [65], although a SAT solver can increase the search speed, identifying the possible attractors in large networks with hundreds of components [66]. In turn, the heuristic search starts with a chosen subset of states to identify the attractor synchronously or asynchronously. The heuristic search performs random transitions, creating network states with a high probability. Then, the algorithm computes the forward reachable sets of the network states. If all sets are similar, an attractor is identified [56].
Identifying an attractor in a complex network is challenging. Many reduction techniques were implemented to simplify the original BFs to include a fewer number of operations [58,67,68]. This can be achieved by removing components that do not affect the behaviour of the original BFs. For example, some reduction techniques identify the biomolecules whose state do not change, turning the corresponding BFs into a simpler model [58,67]. In complex BNs, this technique is followed by removing interactions with one input and output and self-loops [67]. Another approach splits the network into strongly connected components (SCCs) to decrease the model complexity, and the simulations are run for all the SCCs independently [69]. Recently proposed Most Permissive Boolean Network simulations (MPBNs) is a paradigm to perform trajectories sampling and to reach the complete set of attractors faster than the asynchronous search, allowing to run more fine-grained simulations [70].

Topology, perturbation, and controllability analysis
Simulation of BNs to identify their stable states and attractors provides insights into the behaviour of the model. From this point, it is possible to predict meaningful interventions towards desired outcomes by analysing the structure of the model and its response to perturbations. To gain insight into the model structure, it is important to study the topology of a BN which may be a necessary prerequisite for some updating schemes and/or attractor analysis (the ones that need modularisation). Such analysis helps to understand the connectivity of the network components and how they affect phenotypes. This information can improve the understanding of BN dynamics under different updating schemes or attractor analysis [71] by identifying structural cycles in the BN topology. Moreover, this information can be used to define components sensitive against perturbations [72].
Perturbation analysis means changing the state of a biomolecule or its BFs, to analyse the topological robustness and the dynamic resilience of the Boolean model, and the attractors it reaches [73,74]. Comparing the original attractors with those after perturbation allows evaluating its impact. One of frequently used perturbations sets the state of a biomolecule to a fixed value, zero or one, emulating permanent activation or activation, e.g., due to a drug action. Other types of perturbation may change the rule structure of the BFs, either entirely (rule-flip) or partially. Such partial perturbations are called edge perturbations, as they affect the connectivity of a BN (Fig. 3).
The control of the BNs can be achieved by adding external sets of signals to affect the state of the biomolecules so the model reaches the desirable stable state or attractor [75,76]. The added signals, represented as additional nodes in the BNs, have no parent interactions and their values are a series of state values corresponding to simulation time steps, guiding the model towards the desirable states. They can represent possible therapies e.g., the control of gene expression essential for therapeutic interventions [75]. The second approach to control a BN is to perturb the states of the network randomly to select the biomolecules that may result in attractors representing the desired outcomes of the model. This approach was implemented as an algorithm [77] that identifies the optimal one-bit perturbation, i.e., the simplest form of perturbation that inverts the states of biomolecules in an attractor, for a given configuration of external inputs.

Boolean modelling formats and tools
A Boolean model can be constructed and represented using various modelling tools relying on different formats, as illustrated in Fig. 4. One of these formats is the simple interaction format (SIF), which is used for encoding a model topology from a list of interactions, giving an easy solution for combining new interactions to models. SIF is supported by different tools and databases such as Cytoscape [78], OmniPath [79] and Signor [80].
In order to re-use or integrate models, they need to be translated from their original format. Literature-constructed diagrams can be transformed into BNs as a SIF using automated conversion tools (Fig. 4). SIF can be translated into a list of BFs in Boolsim format [81] using a Standardised QUAlitative Dynamic approach (SQUAD) [82]. In addition, GNA allows the encoding of model functions and specifies qualitative values of a model from the experimental literature. [83].
Model annotations can be stored along with the topology and BFs using a SBML-qualitative format (SBML-qual). SBML-qual is a standard format designed by the CoLoMoTo community [84], extending the SBML [85] to represent the qualitative models of biological networks [86]. Pathway diagrams from KEGG, BioCarta and SABIO-RK [87] can be transformed to SBML-qual using PAth2Models [87]. Using a dedicated converter CaSQ [16], Cell-Designer SBML format can be translated to SBML-qual. Notably, SBML can be translated into SBML-qual by CellNOpt [29] and SQUAD [82]. However, the SBML-qual format is still incompati-ble with some tools such as RMut [74], NetDS [88] and CABEAN [89], pointing out that further integration efforts are required to allow reproducibility of SBML-qual models with the incompatible tools. In the modelling process, a range of tools is available for model inference, simulation, and attractor analysis. In the case of model inference, most represented tools in Table 1 generate Boolean models randomly based on selected information, which is known as random network generation. The two software-CABERNET and caspo-create a Boolean model by augmentation if the topological/ functional characterization is incomplete (Table 1). A user can generate ensembles of models by combining components and interactions with the original network.
As mentioned, particular tools such as RMut, NetDS, and CABEAN are incompatible with the SBML-qual modelling format. There may be limitations to the reusability of models created with these tools since they have their own formats. As a result, ensuring interoperability and reproducibility of models is necessary when incompatibilities exist.

Applications of Boolean modelling in clinical and translational medicine
Boolean modelling was applied in clinical and translational medicine research [50,[97][98][99] for various purposes. Simulation of the complex biological systems allowed to predict the activity of pathway endpoints (phenotypes) [100], drug targets [76] and cellular crosstalk [101]. Identifying attractors helped to understand the activity of the phenotypes, since they represent the steady states of biomolecules [102,103]. Finally, comparing attractors  before and after perturbations allowed evaluating the model stability and gave insight into how the in-vivo systems maintain their homeostasis. Below we discuss examples of such applications, as summarised in Table 2.

Modelling of cell signalling
A complex signalling network can determine cellular decisions, but the kinetic parameters and quantitative data that enable dynamic modelling may not be sufficient. Therefore, computational approaches based on the qualitative structure of these networks are of great interest. Boolean modelling of cellular signalling provided insights into the process of signals and the interactions between regulators and target molecules. A model of T cell signalling included: i) a T cell receptor, its co-receptors CD4/ CD8, and CD28 which regulates T cell function, ii) selected MAPK signalling and PI3K/PKB signalling driving cellular activation and differentiation. The model was able to reproduce the literature and experimental results upon different activation scenarios of TCR, CD4 and CD28. Moreover, it reproduced the T cell phenotype in response to knockouts and predicted unexpected activation of the PI3K/PKB signalling pathway after TCR activation [104]. This model was extended [105] into large BNs modelling a regulatory Th cell. TCR signalling, cytokine signalling, and cell cycle models were studied separately, and integrated into a single model. The model showed the naive cell differentiation into Th1, Th2, Th17 and Treg subtypes. The analysis predicted an unexpected plasticity behaviour of the canonical cell types as well as the potential of regulatory T cells to differentiate into Th1 or Th2 subtypes.
Another BN modelled the plasticity of CD4 + T cell differentiation [106] and showed that it is controlled by the dose and composition of cytokines. The model explained the T cell fate by defining 500 external conditions and considering all possible endogenous interactions. These interventions were perturbed to control the dynamics of the model from undesired to desired phenotypes. The model reproduced known synergistic actions of feedback loops on IL-12R expression and confirmed results from other studies [107,108], showing that the balance between i-Treg and Th17 was regulated by IL-6. Furthermore, the model predicted a complex phenotype (Th1-Th2) after activation of Tbet and GATA3 transcription factors under the similar environmental conditions proposed by an in vivo study [109].
Integrating different layers of biological data allowed for understanding the heterogeneity of multifactorial diseases and for reducing the possibility of false positive results. A Boolean model of an integrative network was used to analyse the regulation of key transcription factors (TFs) in Rheumatoid Arthritis (RA) and derive patient-specific models to understand the disease complexity and the response to treatment [110]. The model highlighted the impact of TGFB1, IL6, and TNF in response to the anti-TNF drugs on the model outputs. The analysis showed that TFs are master regulators-the activation of IL6 and/or TGFB1 positively regulates TFs expression, even with deactivation of TNF cascade. Blocking IL6 and TGFB1, and TNF cascades deactivates TFs expression. Further, the MAPK molecules depend on the activation of IL6 and TGFB1 and do not be affected by TNF deactivation.

Modelling of cancer growth signalling and apoptosis
In cancer, targeted therapies inhibit driver molecules of tumorigenic pathways [112]. However, it is difficult to identify targets that have crucial functions in tumour progression because of complex interactions and feedback loops between implicated molecules. Moreover, monotherapies were found to be additive in their actions because tumours are highly complex and evolve continuously [30]. Therefore, they had limited efficacy and needed many clinical trials. Perturbation analysis may help to understand this complexity, proposing the interventions between molecular targets and predicting their possible synergistic action. The BNs of gastric cancer used this analysis on seven known inhibitors that target the gastric cancer pathways [97]. All possible combinations were calculated then simulated in silico to identify new synergistic targets, which were then experimentally validated. In another work, PBN allowed associating the activity of the pathological phenotype to the perturbation probability of its regulators. Under a given perturbation, the model tested the possible synergistic perturbations to decrease the activity of the phenotype [113].
Boolean modelling was proposed to simplify the complex interactions and their downstream signals. The molecular intervention analysis showed that the combinatory inhibition of oncogenic molecules e.g. PDK1, AKT, and MDM2 or the activation of P53, RB and CDH1 reduces the proliferation and increases quiescent phenotypes since the targeted drug associations blocked cancer pathways at different regions [112].
Signalling networks in cancer are complex cascades and their pathological rewiring may alter cellular proliferation, migration and apoptosis resistance [114], and BNs can help to understand this complicated rewiring [111,115]. A Boolean model was constructed combining the main cancer pathways such as RTKs, WNT/WNT/b-catenin, TGF-b/Smads, Rb, HIF-1, p53, PI3K/AKT signalling pathways [114]. Identified attractors were associated with apoptosis, proliferation, and quiescent phenotypes in response to environmental conditions. The model revealed that growth factor signalling significantly increased the proliferation and quiescent phenotypes but decreased the apoptosis. The similar result was proposed by another model [116] which combined the intrinsic and extrinsic pro-apoptotic pathways with the growth factor signalling.
In another study, a BN describing the PI3K/AKT1 signalling pathway showed increased tissue proliferation and cell invasion phenotypes [115]. In particular, the oscillations of PI3K protein expression were studied by simulating its different activity levels at different cellular stages. Using different updating schemes can be more appropriate in specific settings and this is an example that illustrates it -While applying the synchronous updating scheme, the inhibition effect of PI3K induced four phenotypes including G2 arrest, mitotic catastrophe, and aberrant and normal anaphase. However, the asynchronous scheme showed that the previous four phenotypes didn't occur at the same time, and they are not synergistic in signal transduction because the asynchronous scheme updates the biomolecules at different time intervals. Therefore, depending on the biological process and the knowledge about the real biological time, we get to decide which updating schemes make more sense to achieve a desired output.
Logical models of cancer are usually generic because they use heterogeneous data and require clinical data to calibrate them. To generate precise BNs, a PROFILE framework [114] was proposed, integrating the mechanistic insights of logical modelling with multi-omics data. The PROFILE framework combined mutations and expression data (METABRIC [117], TCGAdataset https:// www.cancer.gov/tcga) with the cancer BN to simulate different cases and compare the model outputs. After data binarization, the activity of the nodes and the transition rates were modified based on specific cases. Stochastic simulations were performed using MaBoSS [91] for a semi-quantitative analysis of model perturbations. This approach was used in another study to investigate BRAF inhibition in melanoma and colorectal cancer which have significant variations despite the similar omics profiles [100]. The model was able to differentiate between the two cancers based on different datasets. This approach extends the previous works using the dynamic data [118] and the same pathways [119] to personalise the signalling behaviours in response to treatments.
Recently, researchers tested the PROFILE framework on a prostate model and infer patient-specific treatments [120]. The model of prostate cancer includes major deregulated signalling pathways integrated with mutation and RNAseq data. The biomolecules are fixed to zero/one according to the type of the mutations. For the continuous RNAseq data, the expression levels are translated as a modulation of a signal to the initial conditions to influence the probability of transitions. The analysis highlights that apoptosis is activated by Caspase 8/9, while proliferation is activated by cyclins D/B. Further, several readouts of cancer hallmarks (phenotypic outputs) were detected such as metastasis and DNA repair. The analysis identifies a list of drug combinations that reduce the proliferation phenotype or increase the apoptosis. The researchers use Boolean simulations to grade the effect of the combined drugs on patient-specific phenotypes, comparing the effects of treatments on each patient to suggest suitable treatment.

Perspectives
Computational models can improve understanding of complex disease mechanisms and help to develop treatment strategies applicable in the clinical settings. These applications need to be validated by closer integration with clinical research. To this end, modelling results and predictions need to be presented side by side with their uncertainties and biases. Despite successful applications of Boolean models in disease mechanism predictions and therapy (Table 2), scientists need to work on challenges of data integration, model building, precision, and standardisation.
We believe that working collectively toward solving these challenges will increase the development of decision-making pipelines using Boolean models in the future. Currently, clinical evaluation of Boolean models focuses on proposing personalised treatments. Further systematic approaches are required to study modes of action and doses besides studying the combinatorial effect on a single therapeutic target. Such approaches need to be tested and validated in vitro and on large cohorts, to better understand response and resistance to treatment, on a patient level.
The examples discussed in Section 4 show the ability of logical models to represent the dynamics of complex mechanisms. Still, more systematic approaches, including model curation, annotation and referencing in standardised formats, are needed to advance their application. Moreover, the application of Boolean models requires mathematical and bioinformatics expertise and can be made more understandable and reproducible by following established protocols [121]. Important parts of such protocols, to be defined before running simulations and experimental validations, should involve identifying the scope of the model, a choice of a justified modelling approach, and the strategy to reproduce the known behaviours.
It is important to keep the reproducibility of the generated models. This can be achieved using standardised formats (e.g., SBML packages) which facilitate the development of logical modelling pipelines (e.g., ColoMoTo notebook). Repositories like GINsim and CellCollective allow to construct, annotate, and share models. Still, reproducibility is a challenge and further integration of bioinformatic repositories with logical modelling, similarly to the BioModels platform (https://www.ebi.ac.uk/biomodels/) which already supports logical models [5]. However, a wide support to logical models requires involvement of communities Computational Modelling in Biology Network (COMBINE), the Simulation Experiment Description Markup Language (SED-ML) and SBML to advance the standards they develop. An example is the initiative of the ColoMoTo community and the Computational Modelling of Biological Systems (SysMod) community [122] to develop best practices for the curation and annotation of the logical models in biology [123]. Another example is the reproducibility scorecard, a list of eight questions to evaluate the reusability and reproducibility of the systems biology models [124]. In essence, a minimum information needs to be defined for the annotation of the logical models and systematically applied to enable their storage and reuse [123]. Automatic approaches of model annotation, quality assessment and curation will be crucial for this task.

Conclusions
Boolean modelling is a powerful and promising formalism to analyse a range of dynamic properties of the biological systems and disease mechanisms. It allows the use of many existing formats, including SBML-qual, SIF and GNA, offering the interoperability and the annotation of the created model. Boolean models need less parametrization than the quantitative models, making them a helpful approach to analyse less explored mechanisms. However, insufficient details in model construction may lead to inaccurate predictions. To avoid this, modellers can perform exploratory investigation, gathering the associated information of the model from literature and data resources. The missing details can be inferred by omics data integration that identifies the missing components and optimises the model accuracy. Another challenge is the model scale -complex models are harder to analyse and makes the attractor reachability very difficult [125][126][127]. This challenge motivates the scientists to propose different approaches to control complex models by reduction techniques.
Accurate and computable models improve the efficiency of simulations and the resulting analysis of their controllability. This makes Boolean models better suited for application in the areas of complex diseases such as cancer and immune cell differentiation. Therefore, it is crucial to emphasise the model quality in the construction and the analysis step. In parallel, the maintenance of the model repositories and sharing the models in easily interoperable formats are also important to improve their reproducibility. Further, the modelling community should cope with the advances in the experimental workflows and the new research findings by setting the best practices for the modelling process. This can be achieved by improving the existing models and trying to develop dedicated approaches to update the models automatically. Such reproducible models, further refined by omics data integration, will help to analyse the heterogeneity of complex diseases, simulate personalised responses to perturbations, and identify personalised treatments.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.