Introduction

Rheumatoid arthritis (RA) is a complex inflammatory autoimmune disease whose aetiology is still not fully understood1. RA is primarily characterised by a persistent inflammatory cascade in the synovial tissue2 resulting in painful, swollen, rigid joints and, later, in extra-articular manifestations like gastrointestinal and cardiovascular diseases3. There is currently no cure for RA, and if prescribed treatments merely seek to reduce the inflammation and alleviate disease symptoms4, they have also been associated with various adverse events5. Recently, research has revealed that the innate immune system is crucial to initiating and developing RA pathogenesis5. Macrophages are one of the most common innate immune cell populations in RA, and their number significantly correlates with the disease severity6. Macrophage populations are heterogeneous and can differentiate into various phenotypes in response to the local microenvironment stimuli. The M1 and M2 phenotypes represent the extremes of their activation spectrum.

Consequently, depending on their phenotype, these cells play a role in both the initiation and resolution of inflammation7. The M1 macrophages are responsible for the overproduction of inflammatory cytokines and the release of matrix degradation enzymes, leading to cartilage destruction5. They can also attract proinflammatory T cells and induce their hyperactivation. On the other hand, the M2 macrophages alleviate inflammation via (1) the production of anti-inflammatory cytokines, including IL-10 and TGF-β, (2) tissue homoeostasis and repair8, and (3) activation of regulatory T-cell functions9. Due to their excessive activation and proliferation and enhanced anti-apoptosis ability, the proportion of M1 macrophages is higher than that of M2 macrophages in RA6. Two approaches currently exist for targeting macrophages: downregulating M1 phenotype and expanding M2 phenotype or repolarising M1 to M2 macrophages10,11. Despite this, no medicines specifically targeting macrophages are currently used in clinics6,8. Thus, understanding the specific approach for targeted depletion of the inflammatory macrophage while sparing other macrophage subsets and reestablishing macrophage balance might be a practical therapeutic approach in RA12.

Investigation of such complex diseases has been hindered by reductionist approaches, focusing on specific cellular components but failing to provide a global picture of the pathogenic mechanisms under study. Indeed, each molecule can rarely be assigned a distinct role13. Instead, cellular functions and phenotypes arise from the interactions between the biological system components14. Ongoing developments in high-throughput experimental techniques provide tremendous data regarding these molecular interactions. One strategy to represent them is their abstraction to networks13. Initiatives have been conducted by creating mechanistic molecular interaction maps for various diseases15,16,17,18,19, including RA20. These maps are a rich source of knowledge. However, they remain limited in predictions and hypothesis testing due to their static nature.

One of the primary goals of dynamical modelling is to understand the emergent features and behaviours of such complex biological systems14. Several modelling approaches are available and can be divided into two categories: quantitative and qualitative modelling. The quantitative modelling approach better characterises a system but requires kinetic data as well as a high number of parameters. Because many of these characteristics are unknown and difficult to determine in most systems, these models are limited in size21. Qualitative modelling approaches, on the other hand, are more suitable for characterising systems for which kinetic parameters are unavailable. They do not require kinetic data and can provide a qualitative dynamic description of the system. Moreover, they are scalable for networks containing hundreds of components22. Qualitative modelling techniques include Petri nets (PNs)23 and logic-based models24. PNs can represent biological mechanisms and processes at various granularities and levels of abstraction in a single model. They include two types of nodes, places, which represent conditions or resources (passive elements), and transitions, that represent activities (active elements). Tokens represent movable objects along a directed edge and are located on the places. The tokens define the state of the PN. A PN can be simulated by moving tokens according to a firing rule, that defines the dynamic behaviour. PNs are versatile and can be used to model discrete, continuous, and stochastic events. However, defining the number of tokens for large-scale biological networks can be challenging, as these networks often lack stoichiometry. In this case logic-based modelling approaches are more straightforward25,26,27.

Logic-based modelling approaches have long been used to describe gene regulatory networks and signalling cascades28,29. In their simplest form, logic-based models permit each biochemical species to be in one of two discrete states: ON or OFF30. More complex logic-based methods have been developed, such as multi-state and fuzzy logic methods24,31. Although these approaches have the potential to more precisely simulate biochemical reactions, they require parameter values that are rarely known and, in some cases, difficult to correlate with biological data. Thus, discrete two-state logic models (Boolean models) remain the most intuitive and predictive method for describing biochemical interactions without requiring prior knowledge of complex mechanistic details of reaction kinetics or degrees of membership (needed for multi-state fuzzy logic systems).

Changes in the biomolecules’ values in Boolean models are defined by logical rules using the Boolean operators “AND”, “OR”, and “NOT”. The regulation of this state variable is given in a parameter-free way, making Boolean modelling a viable option for large-scale systems with unknown kinetic parameters32,33. When simulated, Boolean models may converge to stable configurations called attractors. Once reached, the system cannot escape unless an external perturbation occurs. Attractors comprise steady states of only one state; simple cycles, that include a fixed sequence of states periodically replicated; and more complex attractors, formed by overlapping loops34. Attractors represent the model’s long-term behaviour and have been connected to biological phenotypes, making their computation a key point in Boolean models’ analysis33.

Building and analysing Boolean models for large-scale complex biological systems remains challenging. When the logical rules are manually defined, the generated models are usually smaller and do not entirely cover the biological systems described in the molecular interaction maps. When the models are inferred automatically from maps, they are closer representations of the systems35, but these large-scale models are more complex, including especially a much higher number of inputs. Considering that the size of the state space of a Boolean model is exponentially dependent on its node number (2n states for n nodes), computing all of their attractors is computationally demanding36 and identifying biologically coherent states is difficult, especially in a cell or disease-specific context.

This work presents an efficient computational framework to build, analyse and validate the behaviours of large-scale Boolean models with hundreds of nodes and a significant number of inputs. The framework uses publicly available molecular interaction maps to automatically infer their corresponding executable Boolean models via the CaSQ tool35. Our approach enables the analysis of the generated models in a synchronous scheme using a new version of the BioModel Analyzer (BMA) tool37 deployed to a high-performance computing cluster set-up. The framework identifies all the existing attractors of the models using parallel computing and then tests their coherence against gene expression data sets and prior knowledge. It computes a similarity score that describes the ability of the model to reproduce what is known in the literature and observed in data sets. We successfully apply our framework to generate and validate the behaviour of the RA M1 macrophage and RA M2 macrophage Boolean models using their corresponding maps within the RA-Atlas20. Although the heterogeneity of macrophages in RA has not been fully uncovered, these models aim to cover the phenotypic diversity of macrophages through a phenotype-specific representation of their secreted cytokines/chemokines, stimulatory molecules, receptors, and transcription factors12. We used these validated models to investigate potential mono- and bi-therapies that specifically downregulate proinflammatory macrophages and promote anti-inflammatory macrophages in RA synovium. We perform in silico KnockOut (KO) simulations to evaluate new RA drug combinations and propose potential therapeutic repurposing.

Results

We illustrate in this section how we built and validated the large-scale Boolean models describing the RA M1 and M2 synovial macrophages using their maps that are available in the RA atlas20. We also demonstrate how we used the calibrated models to investigate potential therapeutic options that would specifically eliminate inflammatory synovial macrophages and boost anti-inflammatory macrophages in RA synovium. We perform in silico simulations to evaluate new RA drug combinations and propose potential therapeutic repurposing.

Generation of the Boolean models of M1 and M2 macrophages in RA

The updated RA M1 macrophage molecular interaction map includes 601 components interacting via 405 reactions. The updated version of the RA M2 macrophage map comprises 513 components and 323 reactions. Converting these two maps into executable Boolean models with CaSQ35 produced a network of 309 nodes, 75 inputs, and 562 interactions for the M1 macrophage and a network of 254 nodes, 57 inputs and 430 interactions for the M2 macrophage (Fig. 1).

Fig. 1: The RA M1 macrophage model in the BMA graphical interface.
figure 1

The black squares outline the cellular compartments represented in the model. The red square shows a zoom-in on a section of the extracellular space and the cytoplasmic membrane.

To focus on regulating the cell-specific phenotypes only, we used the export option in CaSQ via the argument -u to identify all the upstream nodes of these phenotypes.

Regarding the RA M1 macrophage model, the selected nodes are upstream of Apoptosis, Proliferation and Osteoclastogenesis phenotypes. The number of nodes decreased from 309 to 233, comprising 64 inputs. In the RA M2 macrophage model, the nodes of interest are upstream of Apoptosis and Proliferation phenotypes. Their number decreased from 254 to 169, including 39 inputs. The nodes that are not involved in regulating our phenotypes of interest were not considered. The inputs regulating these nodes were fixed at one, the default value in BMA.

Identification of differentially expressed molecules using literature search and transcriptomic data analysis

To identify the differentially expressed genes (DEGs) present in the models, we used the GSE97779 microarray data set. It contains nine RA synovial macrophage samples and five healthy peripheral blood monocyte-derived macrophage samples. We identified the DEGs between RA and healthy control samples as described in the Methods’ section. We also reviewed thoroughly the literature. Using low- and high-throughput experimental data, we extracted information regarding the changes in the nodes’ expression levels between RA and healthy control conditions.

From both: the analysis of the GSE97779 data set and an exhaustive literature search, we identified 105 and 88 differentially expressed biomolecules in the RA M1 and M2 macrophage models, respectively. The retrieved differential expressions can be associated with nodes at mRNA or protein levels in the models, assuming a linear relationship exists between the expression of mRNAs and the expression of their corresponding proteins. We discretised the differentially expressed molecules’ expressions: molecules that were overexpressed in RA were linked to the value 1, whereas molecules that were under expressed in RA were linked to the value 0. Supplementary Tables 1 and 3 list these differentially expressed molecules and their corresponding Boolean values in the M1 and M2 macrophages.

Computation of all the possible attractors of the models

Given the high number of inputs in the model, we reduced the list of input combinations by fixing the values of the differentially expressed ones. Based on the information displayed in Supplementary Table 1, 43 out of the 64 inputs present in the RA M1 macrophage model were fixed. The total number of input combinations was then equal to 221. We used the BMA tool deployed to a machine with 96 single-core CPUs and 768 GB of RAM to run the attractors’ search. All the resulting attractors were steady states and were kept for further analysis.

Regarding the RA M2 macrophage model, 24 inputs were fixed using the information provided in Supplementary Table 3. The number of input combinations was then equal to 215. All the corresponding attractors were steady states that we used for the following steps.

Validation of the models’ behaviours

First, we filtered the steady states according to the values of their cell-specific phenotypes. The biologically coherent Boolean values of these phenotypes were extracted from the literature in a disease and cell-specific manner. They reflect the increased M1/M2 ratio in the synovial macrophage population7,38,39 and the enhanced osteoclastic bone resorption in the RA joint40,41. Indeed, RA M1 macrophages predominate in RA synovial fluid due to their excessive proliferation (Proliferation phenotype in the model should be ON) and enhanced anti-apoptosis capabilities (Apoptosis phenotype in the model should be OFF) compared to the RA M2 macrophages (Apoptosis phenotype should be ON and Proliferation phenotype should be OFF in the model)6. In addition, the Osteoclastogenesis phenotype, which is only present in the RA M1 macrophage model, should be ON41,42.

All the steady states of the RA M1 macrophage model passed through this filtering step, while only 8192 of the steady states of the RA M2 macrophage model did.

We calculated the similarity score between the list of differentially expressed molecules (Supplementary Tables 1 and 3) and their matching nodes in each filtered steady state. Regarding the RA M1 macrophage model, 384 steady states had the highest similarity score, and their average vector was calculated. In the resulting vector, 222 nodes were fixed at zero or one, while eleven were not fixed (Supplementary Table 2). This model’s state can reproduce 99% of the observed Boolean values. Indeed, 104 of 105 differentially expressed nodes’ states matched their experimentally observed Boolean values. The only observed inconsistency is the CASP7 pro-apoptotic protein, which is overexpressed in RA macrophage samples but has a Boolean state equal to zero in the model.

Regarding the RA M2 macrophage model, 96 steady states had the highest similarity score. In their resulting mean vector, 158 out of 169 nodes were fixed at zero or one. Eleven nodes were not (Supplementary Table 4). This model’s state can reproduce 96,5% of the observed Boolean values. 85 out of the 88 differentially expressed nodes’ states matched their experimentally observed Boolean values. The only mismatches are BCL2L1 and MCL1, two upregulated anti-apoptotic proteins in RA macrophage samples with a Boolean state equal to zero in the model, and CASP3, a downregulated protein in RA macrophage samples with a Boolean state equal to one in the model.

Testing the effects of therapeutic targets’ single knockouts on the RA M1 and M1 macrophages

Selective downregulation of proinflammatory macrophages while promoting anti-inflammatory macrophages is a promising approach for inhibiting chronic inflammation and bone erosion in RA10. It can be achieved through the induction of the Apoptosis phenotype and the inhibition of the Proliferation phenotype in the RA M1 macrophage model and the activation of the Proliferation phenotype and the inhibition of the Apoptosis phenotype in the RA M2 macrophage model.

We performed an exhaustive search using the Therapeutic Target Database (TTD) to identify potential therapeutic targets (targets that have already been experimentally modulated) present in the models. It is a drug database designed to provide information about the known therapeutic protein and nucleic acid targets described in the literature, the targeted disease conditions, the pathway information, and the corresponding drugs/ligands directed at each target. The database currently contains 3578 targets and 38,760 drugs. Targets can be divided into four categories: successful targets, clinical trial targets, preclinical trial targets and research targets43. Drugs can also be divided into four categories depending on their status. They go from Approved drugs to Clinical trial drugs, to Preclinical/patented drugs to Experimental drugs. We screened the targets based on their associated drug’s Mode Of Action and only kept the components that can be targeted by at least one inhibitor (1643 targets). Then, we identified the targets in the RA M1 and M2 macrophage models. Regarding the RA M1 macrophage model, 71 therapeutic targets were identified (Supplementary Table 5). Sixty targets were identified in the RA M2 macrophage model (Supplementary Table 6).

We mimic the effect of these drugs using in silico KO simulations. We use the calibrated state of both RA M1 and RA M2 macrophage models as initial simulation conditions (Supplementary Table 2 and Supplementary Table 4). Then, the models’ phenotype states after the target KOs are compared to their corresponding calibrated states.

Table 1 summarises the identified therapeutic targets in both models with their associated drugs. We selected, for each identified target, the drug with the highest TTD status.

Table 1 Single knockouts of the therapeutic targets from the TTD database that perturb the RA macrophages’ phenotypes.

Inhibition of NFkB, in our models, stimulated the M1 macrophage’s death, inhibited the M1 macrophage’s growth (Fig. 2), and did not influence the M2 macrophage’s phenotypes. Even though ERK1 inhibition did not affect the M1 macrophage’s apoptosis, it did suppress their proliferation and reduce the release of most proinflammatory cytokines (CCL2, CSF2, IFNG, IL-18, IL-1, IL-6 and TNF). It also blocked the synthesis of the angiogenic factor VEGFA in the M2 macrophage model. GSK3B inhibition, on the other hand, induced the M2 macrophage’s proliferation while suppressing their apoptosis (Fig. 3).

Fig. 2: In silico simulation of NFkB KO in the RA M1 macrophage model.
figure 2

a Simulation with NFkB active in the model. The Apoptosis phenotype gets inhibited and the Proliferation phenotype gets activated in the presence of NFkB. b Simulation with NFkB inactive in the model. The Apoptosis phenotype gets activated and the Proliferation phenotype gets inhibited in the absence of NFkB.

Fig. 3: In silico simulation of GSK3B KO in the RA M2 macrophage model.
figure 3

a Simulation with GSK3B active in the model. The Apoptosis phenotype gets activated and the Proliferation phenotype gets inhibited in the presence of GSK3B. b Simulation with GSK3B inactive in the model. The Apoptosis phenotype gets inhibited and the Proliferation phenotype gets activated in the absence of GSK3B.

Testing the effects of therapeutic targets’ double knockouts on the RA M1 and M2 macrophages

In order to investigate the potential synergistic effect of the previously tested therapeutic targets on the models’ phenotypes, the targets were combined in pairs. Both RA M1 and RA M2 macrophages models were used to predict the outcome of their corresponding combined KOs. We used the same initial conditions for the mono drug testing; then, we compared the perturbed states with their corresponding calibrated states.

Table 2 summarises the identified target combinations with their associated drugs. We selected, for each identified target, the drug with the highest TTD status.

Table 2 Combinations of therapeutic targets (from the TTD database) that perturb the RA macrophages’ phenotypes.

Two thousand four hundred eighty-five drug combinations were tested using the RA M1 macrophage model. Among these combinations, the Notch1/ERK1 pair and the JAK1/JAK2 pair were identified as having a synergistic effect on the model’s phenotypes (Table 2). Indeed, ERK1 KO alone inhibited the M1 macrophage’s proliferation. When combined with Notch1 KO, it also led to the promotion of the M1 macrophage’s apoptosis (Fig. 4). JAK1 and JAK2 separate inhibitions did not perturb the M1 macrophages’ phenotypes either. When paired together, they suppressed the M1 macrophages’ proliferation and induced their apoptosis. All the other drug pairs did not provide a synergistic effect on the model’s Apoptosis and Proliferation phenotypes. Apoptosis induction and proliferation suppression were only driven by NFkB KO.

Fig. 4: In silico simulation of ERK1 and Notch1 KOs in the RA M1 macrophage model.
figure 4

a Simulation with ERK1 inactive and Notch1 active in the model. The Proliferation phenotype gets inhibited in the absence of ERK1. b Simulation with ERK1 and Notch1 inactive in the model. The Apoptosis phenotype gets activated and the Proliferation phenotype gets inhibited in the absence of ERK1 and Notch1.

Regarding the RA M2 macrophage model, 1770 drug combinations were tested. None of the drug combinations demonstrated a synergistic effect on the M2 macrophage model’s phenotypes. Apoptosis suppression and proliferation activation were only driven by GSK3B KO in the model.

Testing of the effects of all possible receptor double knockouts in the RA M1 and M2 models

Receptors, located on both the cell surface and within the cell, are the molecular targets through which drugs produce their beneficial effects in various disease states. They are coupled to various signal transduction systems within the membrane and intracellularly and can therefore regulate responses to the cellular/tissue microenvironment44. In order to investigate the effects of double KOs of the cellular receptors present in the M1 and M2 macrophage models, receptors were combined two by two. Both RA M1 macrophage and RA M2 macrophage models were used to predict the outcome of their corresponding combined KOs. We used the same initial conditions as for the mono and dual drug testing; then, we compared the perturbed states with their corresponding calibrated states.

Four hundred six double KOs were tested in the RA M1 macrophage model, while 300 double KOs were tested in the RA M2 macrophage model. None of these simulations perturbed the RA macrophages’ apoptosis or proliferation phenotypes.

Discussion

The number of macrophages in inflamed synovial tissue overgrows during RA, and their polarisation plays a critical role in RA’s physiological and pathological progression45. Thus, selectively suppressing the M1 macrophages or boosting the M2 macrophage could be a promising strategy for treating RA. To investigate such complex mechanisms, we developed a framework to calibrate large-scale Boolean models that can be either automatically inferred from molecular interaction maps using the CaSQ tool35 or manually built in the BMA JSON format. We analyse the models using a newly developed BMA tool37 version that can be deployed on Linux-based High-Performance Computing (HPC) clusters to identify all their attractors. These attractors are filtered to keep only the steady states. While these stable states are better suited to answering biological questions regarding stable patterns of biomolecule activities and cell phenotypes, and are easier to test against expression data, filtering out cyclic attractors may remove oscillatory behaviours that might be interesting to investigate in RA.

Given the high number of input combinations in large-scale Boolean models, using HPC clusters enables high-throughput model analysis. It overcomes the lack of computational power and takes advantage of parallel computing to considerably reduce the running time for the search of the possible steady states. Because we are using Boolean formalism, the resulting steady states of the models are binary vectors. To be able to validate their behaviours, we compare them to differentially expressed molecules, we discretised the expressions. Biomolecules not differentially expressed, such as housekeeping genes, were not considered. Qualitative models with a higher granularity could be envisioned to address this limitation; nevertheless, additional computational resources would be needed to cope with the exponentially higher complexity of these models.

We applied the proposed methodology to the large-scale RA M1 and M2 synovial macrophages interaction maps20, setting the path to many other disease maps to be explored, such as the Atlas of Cancer Signalling Network17, multiple sclerosis pathway map46 or COVID-19 disease map19. To analyse the resulting RA M1 and M2 macrophage models, we adapted the framework to make it relevant to the disease and cell type under study. Indeed, we filtered the models’ steady states according to the values of their cell-specific phenotypes and kept the ones that reflect the imbalanced M1/M2 ratio in the RA synovial macrophage population and the enhanced osteoclastic bone resorption in RA joints. In addition, to calibrate the models, we selected an RA and macrophage-specific gene expression data set and carefully curated the extracted information from the literature to ensure that it was specific to both RA disease and synovial macrophages. The experimentally observed values were extracted from macrophage- and RA-specific data but not phenotype-specific data. Therefore, those observed values do not consider the phenotypic differences between the M1 and M2 macrophages and do not reflect the increased apoptosis resistance and the high proliferation observed in the M1 macrophage compared to the M2 macrophage. Indeed, the identified mismatches in both models are related to biomolecules participating in the apoptosis pathway. Since the M1 phenotype resists apoptosis, pro-apoptotic components (CASP7) tend to be inhibited in the calibrated state of the M1 macrophage model. On the other hand, the M2 phenotype is pro-apoptotic; hence, pro-apoptotic components are active (CASP3) while anti-apoptotic components are inhibited (BCL2L1, MCL1) in the calibrated state of the M1 macrophage model. Having access to RA M1 and M2 synovial samples to calibrate the models would help us providing a phenotype-specific dynamics of the biological processes and reducing the observed mismatches between the models’ Boolean states and the experimentally observed states.

A recent initiative to provide gene signatures of the M1 and M2 macrophages has been made by Ghosh et al. 47. Using machine learning, authors identified signatures that accurately identified both physiologic and pathologic spectra of reactivity and tolerance in macrophages. Even though the overlap between our models’ components and these macrophage signatures is poor given their lack of RA specificity, it would be interesting to integrate this kind of data in our approach.

The integration of additional data sets for the calibration of the RA macrophage models, either publicly available or proprietary, would help improving the models’ robustness. It would also help fixing the values of the nodes that were not fixed due to a lack of information regarding their expression. As RA is a highly heterogeneous disease, changes in certain DEGs are to be expected.

It is important to note that the M1–M2 definitions we used in this work to account for different activation states are a simplified representation of the macrophage activation process. In fact, their activation exists on a spectrum and cannot easily be binned into defined groups48. These intermediate polarity stages are distinguished by the expression of specific surface markers and the production/release of distinct molecules7. The refinement of our models to represent the phenotypic diversity of macrophages as a continuous spectrum would improve our approach and provide a better understanding of the RA macrophage heterogeneity and its involvement in the disease pathogenesis. This could be achieved through the construction of a large-scale macrophage model that covers a wider range of phenotype-specific processes. However, the diversity of terminology in the literature, the inconsistent use of markers to describe macrophage activation and the great lack of phenotype-specific data represent a major limitation to this, at the moment.

We performed in silico simulations on the calibrated models to investigate the effects of mono and bi-therapies on RA macrophage phenotypes. NFkB inhibition in our model led to selective suppression of the RA M1 macrophage. NFkB represents an interesting potential therapeutic target as it is a key transcription factor of M1 macrophages, responsible for the upregulated expression of M1 macrophage-derived cytokines in the RA synovium49. Several studies support the concept of NFkB inhibition for therapeutic interventions in inflammatory diseases50,51,52,53. In RA, its in vitro inhibition induces apoptosis in fibroblasts, and contribute to a significant downregulation of M1 markers and upregulation of M2 markers7,54.

Further investigations showed that the observed beneficial effects of non-steroidal anti-inflammatory drugs and glucocorticoids, both used for RA treatment, are also due to NFkB inhibition55,56,57,58. However, their usage is limited due to severe side effects58,59 .Other NFkB inhibitors were identified, but most do not meet the standards to join clinical development programs60,61,62,63. Indeed, non-selective inhibition of NFkB in all cell types has multiple detrimental effects as it is critical for maintaining homoeostatic cellular pathways. Biological treatments have been developed that directly target the products of NFkB-driven genes, such as TNF, IL-6 and IL-1. However, these treatments mainly target inflammation rather than apoptosis. Furthermore, as the mechanisms of apoptosis are highly sophisticated and several cytokines have synergistic biological activities64, inhibiting a single cytokine may not be optimal. This is further underlined by the simulations performed on our models that mimic such treatments (anti-TNF, anti-IL6, anti-IL1,…) and fail to induce apoptosis in the inflammatory RA macrophages. Therefore, the discovery of techniques for cell-type-specific NFkB inhibition is needed to shift the benefit/risk balance65.

Regarding the RA M2 macrophage model, GSK3B was identified as a promising target for promoting the M2 macrophage population in RA. GSK3B is involved in the progression of various diseases, including RA66. Evidence suggests that GSK3B plays a central role in signalling pathways relevant to macrophage function, including polarisation and inflammatory response67. Its inhibition in RA suppresses inflammatory responses in fibroblast-like synoviocytes and collagen-induced arthritis68. Furthermore, its inhibition in allergic rhinitis inflammatory disease increases the expression of the M2 phenotypic signature markers69. CREB1 is one of GSK3B’s targets70. When GSK3B is inhibited, it induces CREB1 gain of function, sending an anti-inflammatory and anti-apoptotic survival signal in monocytes and macrophages71. It also increases M2 marker expression and promotes M2 phenotype in murine macrophages72,73.

We explored the synergistic effects that some therapeutic target pairs might have on macrophages models’ phenotypes. The M1 macrophage’s proliferation was suppressed by ERK1 KO alone. When paired with Notch1 deletion, it also promoted the M1 macrophage’s death. The potential therapeutic value of co-targeting ERK1 and Notch1 has already been demonstrated in cancer but not RA. Indeed, it has been shown that targeting Notch1 enhances the efficacy of ERK1 inhibitors in cancer patients74,75. In RA, separate ERK1 and Notch1 inhibitions reduce inflammation in mouse collagen-induced arthritis76,77. Notch1 signalling, on the other hand, is known to regulate M1 macrophage fate through direct transcriptional and indirect metabolic regulation78. We also identified the JAK1/JAK2 pair as a potential drug combination for the RA M1 macrophage depletion. Baricitinib, a Janus kinase (JAK) proteins inhibitor, is a Food and Drug Administration (FDA) approved for treating RA79. It prevents activation of STAT pathways and inhibits the cascade of transcription initiation of effector genes, which, in turn, prevents the autoimmune and inflammatory reactions associated with RA, including IFNg secretion. However, the way JAK inhibitors modulate macrophage phenotypes and whether this phenomenon explains their clinical benefit in RA is still not fully understood. A recent study showed that Baricitinib modulated the expression of membrane phenotype markers and the secretion of some cytokines in healthy macrophages80. Another study further supports the effect of JAKs inhibition on RA macrophage phenotypes by shifting the metabolic profile of M1 macrophage and rebalancing the metabolic reprogramming toward oxidative phosphorylation81.

Receptors are the molecular targets through which drugs produce their beneficial effects in various disease states. They are coupled to various signal transduction systems and can therefore regulate the cell’s responses to its microenvironment. We investigated the effects of double KOs of the macrophages’ receptors represented in the models, but none of the KOs perturbed the RA macrophages’ apoptosis or proliferation. These results underline the intricate crosstalks between the intracellular signalling pathways and the high synergistic activities of the various receptors represented in the M1 and M2 macrophage models.

Taken all together, these results further validate the behaviour of our macrophage models through the identification of a new potential drug combination as well as targets whose potential/proved therapeutical benefit in RA has been highlighted in the literature.

The following steps of this work would be to combine the RA M1 and M2 macrophages maps with other cell-specific maps from the RA-Atlas20, namely the RA fibroblast and the RA CD4+ T helper 1 (Th1) maps, via the addition of intercellular interactions. Then, we would apply our framework to the resulting multicellular model. As the model’s size and complexity would considerably increase, we could assess our framework’s scalability.

Methods

This section describes the computational framework we developed to convert publicly available molecular interaction maps into large-scale Boolean models, analyse their state space and validate their behaviours using prior knowledge and transcriptomic data (Fig. 5). This workflow is also applicable to Boolean models that were directly built in BMA JSON format. We also describe how we apply our framework to generate and validate the RA M1 macrophage and RA M2 macrophage Boolean models using their maps that are available in the RA atlas20.

Fig. 5: Schematic representation of the workflow we developed to generate and analyse large-scale Boolean models.
figure 5

Molecular interaction maps built in CellDesigner XML format are converted to executable Boolean models using the CaSQ tool. A new version of the BMA tool is then deployed on a high-performance computing cluster to identify all the models’ attractors. These attractors are filtered to keep only the steady states. Next, the filtered steady states are validated. Differentially expressed biomolecules in the models are identified using literature mining and transcriptomic data analysis. The identified biomolecule expressions are discretised and converted to a binary vector of experimentally observed Boolean values. After that, similarity scores are computed to describe the ability of the filtered steady states to reproduce the experimentally observed values. The steady states with the highest score are selected; their average vector represents the calibrated model’s state. The calibrated model can perform in silico simulations, test biological hypotheses and generate predictions.

Generation of Boolean models from molecular interaction maps

The workflow uses the CaSQ version 1.1.435 to convert publicly available molecular interaction maps to executable Boolean models. CaSQ automatically infers logical rules based on the network topology and semantics for each node in the starting XML file. The tool produces either Systems Biology Marked up Language Qualitative (SBML-Qual)82 or BMA JSON executable files. Using the latter format, it is also possible to produce qualitative networks where nodes can vary over a wide range of discrete values, which is defined as granularity in BMA. Granularity defines the higher value the nodes can take in the model. Since we use Boolean formalism, the granularity used in the following analysis equals one.

We used their corresponding maps available in the RA-Atlas to generate the RA M1 macrophage and RA M2 macrophage Boolean models20. These two maps are built in the Systems Biology Markup Language (SBML) format83 using CellDesigner84 and are compliant with the Systems Biology Graphical Notation (SBGN)85. They cover cell-specific signalling pathways, gene regulations, molecular processes and phenotypes involved in RA’s pathogenesis. Biomolecules and reactions in these maps are manually curated and extensively annotated through PubMed IDs, DOI, GEO and KEGG identifiers, following Minimum Information Required In The Annotation of Models (MIRIAM) standards85. All the references are available as csv files in the GitHub project we provide (see Code availability section). The XML files of the corresponding maps are also publicly available and can be parsed using the CellDesigner software where the references can be displayed using the MIRIAM section. The RA M1 and M2 macrophage maps can also be easily visualised in the form of online interactive maps using the platform MINERVA via the following link: https://ramap.uni.lu/minerva/.

Phenotypes are particular nodes in the maps. They describe biological states known to be active or inactive in RA. To make them more appropriate to the purpose of this work, we divided them into two categories. The first corresponds to cell-specific phenotypes, describing the cellular outcomes of RA synovial macrophages like proliferation and apoptosis. Depending on the map, their names end with “M1_macrophage” or “M2_macrophage” suffix. The second category is not specific to a particular cell type and corresponds to cellular signals and biological conditions in the RA joint, like inflammation and matrix degradation. Their names end with the “signal” suffix in both updated maps. We also looked for duplicates, removed them whenever found, and corrected the signalling pathways accordingly.

Stabilisation proof using BMA

BMA is a tool for constructing, analysing, and importing executable models of biological mechanisms37. The user is presented with a web-based interface, allowing for rapid and straightforward model construction and analysis. Whilst the graphical user interface is the primary tool for interacting with BMA, a console tool is also available, giving access to a wide range of analysis algorithms and enabling scripting for large and complex combinatorial analyses. CaSQ can generate models in the BMA JSON format, which can be used with either version of the BMA tool.

Fundamental analysis in BMA is the proof of model stability that symbolically analyses the model attractors without explicitly calculating the model transitions. A modular proof algorithm is used under the synchronous update scheduler to show whether or not a single steady state attractor exists and no cycles. Briefly, this proceeds in two steps. Initially, the ranges of individual variables are reduced to the set of reachable values by examining the input variable ranges and the target function. Stability is proven if this process reduces all ranges to a singleton and the global steady state attractor is returned to the user. If this fails, it uses Boolean satisfiability (SAT) queries via a constraint solver, with the reduced variable ranges from the first step, to search for cycles. If no cycles are found, the model must be stable and a final check searches for and returns the steady state86.

BMA architecture and underlying technologies

BMA is developed on the Microsoft .NET Framework and .NET standard, which tie the tools to Windows environments. The BMA web tool is hosted on Azure and is structured as two services, one hosting the user-facing client and another computing service dedicated to calculating proofs and simulations. The console tool is developed for Windows, which provides similar functionality to the compute service. To enable high-throughput model analysis and take advantage of parallelisation on high-performance computing facilities (typically Linux-based), we developed a prototype of the console tool based on the open source .NET core 3.1, which can be built using the dotnet SDK. All codes are available at https://doi.org/10.5281/zenodo.7541023.

Parallel computing for the calculation of all possible attractors

Attractors depend on the external stimuli the model receives from its environment. Stimuli in Boolean models are modelled in the form of inputs. Inputs are nodes with no upstream regulation. They are not associated with any logical rule in the model; therefore, their values are user-defined. In the BMA console tool, the user can assign values to the input nodes with the flag -ko that allows setting the specified nodes to be constants (zero or one). Depending on the input nodes’ state, the model reaches different attractors. To identify all the attractors of the model, we generate all the possible combinations of inputs’ values. For each input combination, we search for the corresponding attractor. We reduce the number of input combinations when possible by fixing the inputs associated with experimentally observed expressions. The Boolean values of these inputs are set based on the available literature and transcriptomic data.

The computation time complexity is exponential. Indeed, the number of all possible combinations of inputs’ values equals 2n, n being the number of inputs that vary in the model. Given the high number of inputs in the inferred large-scale Boolean models, we failed to execute the attractors search on a local Windows machine with eight cores and 64 GB of RAM. Therefore, we deployed BMA on a high-performance computing cluster to compensate for the lack of computational power. The cluster capacity should be selected based on the model’s size and the number of input combinations to process. Figure 6 illustrates the number of input combinations the BMA console tool can process per hour using one core and various model sizes. As the attractor search is slower on larger models, the number of processed combinations decreases proportionally with the model size. Therefore, Fig. 6 can be used to estimate the computational resources required to execute the analysis depending on the model’s size. We utilised the Joblib python package as well87 to parallelise the process and considerably reduce the running time of the framework.

Fig. 6: Plot showing the number of processed inputs’ combinations by BMA per hour using a single core machine.
figure 6

The number of processed combinations decreases proportionally with the model size.

Filtering the model’s attractors

Attractors can be viewed as the modelling equivalents of cell phenotypes, such as cell growth, differentiation, and apoptosis. Under this interpretation, cyclic attractors correspond to cell cycles and steady state attractors correspond to differentiated states. Steady states are therefore suitable for analysing stable patterns of biomolecule activities and cell phenotypes. Steady states can be compared with gene expression data sets, to calibrate and validate a model’s behaviour. In this work, the primary focus is on the steady states, as we study cell phenotypes, and use omics data sets for model calibration and validation.

Identification of differentially expressed biomolecules using literature search and gene expression data analysis

We use low- and high-throughput experimental data to identify the differentially expressed biomolecules in the model under study. First, we thoroughly review the literature regarding each node in the model. We extract information about the change in its expression level between two biological conditions. These conditions are defined based on the system under study. Depending on literature availability, these differential expressions can be at the mRNA and/or protein levels. When it is relevant to the model, we curate the retrieved information to keep it disease- and/or cell-type specific. We integrate transcriptomic data set(s) as well. We select the data set(s) according to the biological question we would like the model to address and perform DEA on the selected one(s). The final list of differentially expressed molecules in the model combines literature search and DEA outcomes.

To calibrate the RA M1 and RA M2 macrophage models, we use the GSE97779 data set, a publicly available microarray data set from the GEO database88. The data set contains nine RA synovial macrophage samples from nine patients and five peripheral blood monocyte-derived macrophage samples from five healthy donors. We normalised gene expression using quantile normalisation and the preprocessCore package89. We performed DEA using the Limma package90 to identify the DEGs between RA and healthy samples. We filtered the DEGs using an adjusted p-value threshold equal to 0.05.

Computation of similarity scores and data discretisation

We discretise the data to compare the expressions of the identified differentially expressed components with the model’s steady states. Overexpressed biomolecules in the condition under study are associated with the value one, while under-expressed molecules are associated with the value zero. Since we use Boolean formalism, where each biomolecule can only have two possible states, biomolecules that are not differentially expressed are not considered. The resulting discretised vector of experimentally observed expressions is then used to calculate similarity scores with each steady state to describe the ability of these filtered steady states to reproduce the experimentally observed values. To do so, we calculate a similarity score S (1) using the simple matching coefficient.

$${\rm{S}}=({\rm{N}}00+{\rm{N}}11)/({\rm{N}}00+{\rm{N}}11+{\rm{N}}10+{\rm{N}}01)$$
(1)

Where,

N00 = number of nodes with a state of zero in both the steady state and the discretised vector of experimentally observed expressions.

N01 and N10 = number of nodes with different states in the steady state and the discretised vector of experimentally observed expressions.

N11 = number of nodes with a state of one in both the steady state and the discretised vector of experimentally observed expressions.

Selection of the steady states with the highest similarity score

The model’s state is validated based on the signal propagation from the inputs to the internal nodes. The objective here is to select the input combinations that lead to coherent states in the internal nodes of the model. To do this, we filter the model’s steady states and select the ones that can reproduce what is known in the literature or observed in transcriptomic data sets. We select the steady states with the highest similarity score. Then, we compute the mean value of each node over these stable states to determine the nodes that are fixed at either zero or one and those that can be found in both states. The resulting average vector represents the calibrated model’s state.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.