An ontology-based modelling system (OBMS) for representing behaviour change theories applied to 76 theories

Background: To efficiently search, compare, test and integrate behaviour change theories, they need to be specified in a way that is clear, consistent and computable. An ontology-based modelling system (OBMS) has previously been shown to be able to represent five commonly used theories in this way. We aimed to assess whether the OBMS could be applied more widely and to create a database of behaviour change theories, their constructs and propositions. Methods: We labelled the constructs within 71 theories and used the OBMS to represent the relationships between the constructs. Diagrams of each theory were sent to authors or experts for feedback and amendment. The 71 finalised diagrams plus the five previously generated diagrams were used to create a searchable database of 76 theories in the form of construct-relationship-construct triples. We conducted a set of illustrative analyses to characterise theories in the database. Results: All 71 theories could be satisfactorily represented using this system. In total, 35 (49%) were finalised with no or very minor amendment. The remaining 36 (51%) were finalised after changes to the constructs (seven theories), relationships between constructs (15 theories) or both (14 theories) following author/expert feedback. The mean number of constructs per theory was 20 (min. = 6, max. = 72), with the mean number of triples per theory 31 (min. = 7, max. = 89). Fourteen distinct relationship types were used, of which the most commonly used was ‘influences’, followed by ‘part of’. Conclusions: The OBMS can represent a wide array of behavioural theories in a precise, computable format. This system should provide a basis for better integration and synthesis of theories than has hitherto been possible.


Introduction
Many of society's pressing global problems, such as environmental degradation, social conflict and poor health, require changes in a wide range of human behaviours at individual, community and population levels. Interventions are likely to be more effective if they are informed by behaviour change theories which are developed to represent a body of knowledge and understanding about processes of change (Craig et al., 2008; Wolfenden et al., 2018; Gourlan et al., 2016). However, many interventions aimed at changing behaviour are not explicitly guided by theory (Michie & Prestwich, 2010; Prestwich et al., 2014). We define theory as 'a set of concepts and/or statements with specification of how phenomena relate to each other. Theory provides an organising description of a system that accounts for what is known, and explains and predicts phenomena' (Michie et al., 2005; qeios.com/read/definition/631).
Those wishing to draw on theory in developing interventions are faced with a plethora of different theories and potential underlying mechanisms through which change can occur. A systematic review of 83 theories of behaviour change identified several barriers to effective theory use in intervention design and evaluation (Davis et al., 2015). Theories tend to overlap in terms of content and scope, often lack detail in terms of clear definitions of constructs and relationships, and most include only a subset of relevant constructs within their stated scope. Different labels are also used interchangeably to describe the same construct, or the same labels are used to describe different constructs (Michie et al., 2014).
Theories of behaviour are generally represented through natural language, sometimes with visualisations that may be whole, simplified or partial. Whilst natural language benefits from the richness and subtlety of language, it can introduce ambiguity and limits comparison, integration and testing, holding back the advancement of understanding human behaviour and hence the potential to develop effective interventions (Fried, 2020; Muthukrishna & Henrich, 2019; Oberauer & Lewandowsky, 2019). Moreover, it is often unclear whether natural language descriptions of theories written by authors are intended to be testable propositions.
To make theories more useable and useful, we need a method for accurately representing them that can identify ambiguities and allow comparisons in terms of content, scope, and predictions. Such a method would enable theory integration and development by making propositions within existing theories more easily discoverable, and facilitate the testing (including falsification) of propositions by making them more precise. More precise and accurate theory representations may also enhance the selection and application of theories when developing and evaluating interventions for real-world problems. Several methodologies have been developed to improve openness, precision and rigour in the cycle of psychological theory development (e.g. Borsboom et al., 2020; Forstmann et al., 2011; Guest & Martin, 2020; Haslbeck et al., 2019; van Rooij & Baggio, 2020). These methodologies agree that theories should be formally specified in unambiguous terms, for example as a set of equations or computational model. A systematic method capable of formally representing any kind of theory proposition in a way that can be consistently interpreted, whether read verbally, computationally or diagrammatically, can improve theory development and synthesis. An additional advantage of such a system is that theories can be made 'computer-readable' to enable searching and other computer-assisted operations.
One such method, termed an 'ontology-based modelling system' (OBMS), has been developed to enable systematic and precise theory representation (West et al., 2019). The OBMS is a system for modelling theories of behaviour and behaviour change, using a formal representation of the constructs within theories and the ways in which constructs may relate to or interact with each other. Ontologies are formal representations of entities and relations in a given domain, in which each entity is assigned an unambiguous label and a clear definition. The OBMS is not an ontology (as ontologies typically seek to represent knowledge about the world, rather than theories about it) but uses an equivalent modelling system to that used in ontologies to express theories formally as a set of construct-relationship-construct triples, such as "intention-influences-behaviour" or "anxiety-is correlated with-performance". The OBMS also provides a system for graphically representing different types of triples, which allows a given theory to be represented as a diagram that fully specifies all the proposed relationships among its constructs in an unambiguous format. The viability of the OBMS was demonstrated by West et al. (2019) by successfully capturing the propositions of five commonly used behaviour change theories (Painter et al., 2008; Prestwich et al., 2014): the Health Belief Model (HBM; Rosenstock, 1974), the Information-Motivation-Behavioural Skill Model (Fisher & Fisher, 1992), Social Cognitive Theory (Bandura, 1986), the Theory of Planned Behaviour (Ajzen, 1991), and the Transtheoretical Model (DiClemente et al., 1991).
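To make the triple format concrete, the following is a minimal sketch of how construct-relationship-construct triples might be held in code. It is illustrative only and is not the representation used by West et al. (2019); the names `Triple`, `constructs` and `as_text` are assumptions introduced here, and the toy "theory" simply reuses the two example triples quoted above.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One proposition: source construct, relationship type, target construct."""
    source: str
    relationship: str
    target: str


# Toy example only: the two triples quoted in the text, not a real theory.
theory = [
    Triple("intention", "influences", "behaviour"),
    Triple("anxiety", "is correlated with", "performance"),
]


def constructs(triples):
    """Return every construct label that appears as a source or a target."""
    return {t.source for t in triples} | {t.target for t in triples}


def as_text(triple):
    """Render a triple in the hyphenated form used in the text."""
    return f"{triple.source}-{triple.relationship}-{triple.target}"


for t in theory:
    print(as_text(t))
```

Because each proposition is a plain three-part record, counting constructs or filtering by relationship type reduces to simple operations over a list.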
The application of the OBMS to a much wider range of existing and future theories can facilitate theory testing and development, particularly when used in conjunction with ontological tools. Ontological tools provide an agreed and coherent system of reference, against which we can map theories that have been formally specified, and annotate evidence for the links proposed (Figure 1). In this way, theory propositions can be shared, tested and evidence synthesised to advance theory much more efficiently. One specific ontological tool was developed in The Human Behaviour Change Project, a wide programme of research which uses behavioural science and machine learning to answer 'What intervention(s) work, compared with what, how well, with what exposure, with what behaviours, for how long, for whom, in what settings and why?' (Michie et al., 2020a). This work involves the development of an overarching Behaviour Change Intervention Ontology (BCIO) which specifies entities making up an intervention scenario and how they are related (Michie et al., 2020b). A recent scoping review informing the development of the BCIO identified 15 existing ontologies related to human behaviour change, but none that represented the breadth and details of human behaviour change (Norris et al., 2019). Therefore, the BCIO provides an ontology that can be used to systematise the reporting of behaviour change interventions, make evidence synthesis more effective and point to what is not known. The BCIO builds on the Behaviour Change Technique Taxonomy v1 (Michie et al., 2013) and a subsequent programme of work linking behaviour change techniques with frequently occurring mechanisms of action and generating a Theory and Techniques Tool for easy identification of these links (theoryandtechniquetool.humanbehaviourchange.org).
The research generated links between 61 behaviour change techniques and 26 mechanisms of action, based on authors' reports in published behaviour change literature (Carey et al., 2019), expert consensus (Connell et al., 2019) and the triangulation of the two sources of evidence (Johnston et al., 2018).
The present study aimed to (i) investigate whether a wide range of behaviour change theories could be represented using the OBMS with sufficient precision to capture all constructs and relationships in computer-readable diagrammatic form and (ii) build a database of theories represented using this system that can be searched for construct names and relationships. This type of structured database provides a basis for logical and mathematical inferences. In the present study, we conducted a set of illustrative analyses to characterise theories in the database, with further research currently underway to use computational methods to identify 'canonical' theories.

Selection of theories
We screened descriptions of the 83 theories identified in a multidisciplinary review of theories of behaviour and behaviour change (Davis et al., 2015; Michie et al., 2014) for inclusion in this study (Figure 2). Theories were numbered 1-83 in alphabetical order for ease of identification throughout the study (see Extended data for theory numbers; osf.io/dcqft (Hale et al., 2020)). In total, 71 theories were included for representation in the present study, with five theories already represented by West et al. (2019). All 76 theories were included in the final database. Seven theories were excluded according to the following criteria:
1. Did not clearly specify at least one relationship linking a construct to behaviour or behaviour change. Two theories did not meet this criterion: Goal Directed Theory (Bagozzi, 1992) and Social Action Theory (Weber, 1978) both contain constructs that describe behaviour, i.e. goal achievement and social action, respectively. However, neither describes a link between behaviour and one of the other constructs in the theory.
[...] , 1994), which was included.
These criteria were decided through discussion between the researchers and study leads (SM and RW).

Identification of theory constructs
Theory constructs were identified from the original theory sources, using the following definition of a construct: "a component of that model or theory that is a representation of an object, event, state of affairs, feature of one of these, derived from observation and inference" (Michie et al., 2014, p. 39). At least two researchers independently identified constructs that could be unambiguously derived from theory descriptions. Discrepancies between researchers were resolved through discussion, and if necessary through consultation with the study leads (SM and RW). A further check for accuracy was conducted by comparing constructs extracted from the original sources with those in the theory summaries in Michie et al. (2014). Where the new reading of the original theory source led to an omission or addition of a construct compared with Michie et al. (2014), the differences were recorded and checked by a further researcher, and were checked with the theory author/expert when sending the theory diagram for feedback (see below section, 'Author/expert feedback on diagrams').

Generation of theory diagrams using the OBMS
At the same time as extracting theory constructs, one researcher drafted an initial diagram of the theory using the OBMS (West et al., 2019). The diagram was generated using Lucidchart diagram software (www.lucidchart.com), which allows constructs (visualised as shapes) to be linked together by labelled relationships (visualised as lines or arrows). Draw.io is a free alternative with similar functionality. Nineteen conceivable types of relationship were initially identified in the development of the OBMS and different symbols and labels were developed to denote each type (see Extended data for guide to relationship types; osf.io/8fvdb (Hale et al., 2020)).
The number, wording and visual representation of these relationship types were revised for comprehensiveness and parsimony during the project, resulting in one relationship type being added to those described in West et al. (2019). The second and/or third researcher agreed, refined or changed the initial representation in line with their reading of the theory and discussion with the other researcher(s). The agreed diagram was then sent to the theory author or an expert for feedback (see below section, 'Author/expert feedback on diagrams'). The diagrams of two theories, CEOS (Borland, 2014) and PRIME Theory (West & Brown, 2013), were initially drawn by the research team, but early feedback resulted in completely new diagrams being drawn by the theory authors, which did not subsequently need to go through the feedback process.

Author/expert feedback on diagrams
Theory authors were identified from the original theory sources (Michie et al., 2014). When contact details were not provided in the theory source, the website of an author's affiliated institution was consulted. For the 15 authors with whom contact could not be made (14 deceased, 1 retired), alternative theory experts were identified by searching the Web of Science and Scopus databases for authors who had cited the theory most frequently, and then examining published articles by these authors for relevance (West et al., 2019).
Theory authors and experts were sent a PDF copy of the OBMS diagram of their theory, a guide to the symbols and labels we used to specify the propositions (Extended data: osf.io/8fvdb (Hale et al., 2020)), and a copy of the published theory review (Davis et al., 2015) which lists the constructs of each behaviour change theory. They were asked to review the OBMS representation of their theory for its accuracy, including any added or omitted constructs (see section above, 'Identification of theory constructs'). If no response was received, two reminder emails were sent. In the final reminder, we indicated that we would take a non-response as an indication of approval of our theory representation.

Revision of diagrams following author/expert feedback
If a theory author/expert responded, we coded the content of their response according to whether it contained the following types of feedback: (1) specific feedback on constructs, relationships and/or structure; (2) general feedback on diagram as a whole; (3) general feedback on process/methodology; (4) reference to readings; (5) queries about diagram; (6) sent revised diagrams; (7) constraints on ability to comment.
A researcher identified necessary changes to the diagram from the content of the feedback and any additional readings sent. A second researcher reviewed this process for each theory. Disagreements were resolved through discussion. The agreed changes were coded into three categories: (1) add/remove/amend constructs; (2) add/remove/amend relationships between constructs; (3) both. Very minor changes such as spelling corrections or hyphenations were not coded. The revised representation was sent back to the author/expert; this process was repeated until a final representation was agreed or until no more response was received. Following each revision of the theory, up to two reminder emails were sent, with the final reminder indicating that we would take a non-response as an indication of approval.

Generating a searchable database of theories represented using the OBMS
To generate a searchable database of theories, we pooled the five theory representations generated by West et al. (2019) and the 71 theory representations generated in the present study.
Taking the finalised diagrams of all 76 theories, we derived a formal representation of each theory, i.e. a set of construct-relationship-construct triples. This was done by exporting CSV data from the Lucidchart diagram of each theory using the Lucidchart process diagram CSV export facility, then running a script in Python (Version 3.8) to identify construct-relationship-construct triples from the CSV data. Where theories contained a construct that was related to another relationship (rather than related to a construct), this was represented by 'reifying' (i.e. treating a triple as if it were a construct) the second relationship triple, e.g., "Feedback-influences-[the 'Goals' to 'Persistence' Influences (*) relationship]". A web interface for the theory database was also implemented in Python, using the Flask web framework (Version 1.1) with search functionality provided by the Whoosh indexing library (Version 2.7). To aid the readability of theories in the database, we generated new diagrams with simpler visual conventions than those used in the theory diagrams sent to authors, e.g. using text labels instead of shapes to denote the types of relationship.
The new diagrams were automatically generated from the set of triples for each theory using Python's library interface to the GraphViz application (Ellson et al., 2003).
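The pipeline described above (diagram CSV to triples, triples to an automatically generated diagram) can be sketched roughly as follows. This is not the study's script: the CSV column names (`Id`, `Name`, `Text Area 1`, `Line Source`, `Line Destination`) and the sample rows are assumptions made for illustration, and the sketch writes GraphViz DOT text directly rather than calling a GraphViz library.

```python
import csv
import io

# Hypothetical export: two construct shapes and one labelled line joining them.
# The column names are assumed for this sketch, not taken from the study.
SAMPLE_CSV = """Id,Name,Text Area 1,Line Source,Line Destination
1,Process,intention,,
2,Process,behaviour,,
3,Line,influences,1,2
"""


def triples_from_csv(text):
    """Turn shape rows and line rows into (source, relationship, target) triples."""
    rows = list(csv.DictReader(io.StringIO(text)))
    # Every non-line row is treated as a construct, keyed by its shape id.
    labels = {r["Id"]: r["Text Area 1"] for r in rows if r["Name"] != "Line"}
    return [
        (labels[r["Line Source"]], r["Text Area 1"], labels[r["Line Destination"]])
        for r in rows
        if r["Name"] == "Line"
    ]


def quote(label):
    """Quote a label for DOT, escaping backslashes and double quotes."""
    return '"' + label.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_dot(theory_name, triples):
    """Build DOT text with one text-labelled edge per triple."""
    lines = [f"digraph {quote(theory_name)} {{"]
    for source, relationship, target in triples:
        lines.append(f"    {quote(source)} -> {quote(target)} [label={quote(relationship)}];")
    lines.append("}")
    return "\n".join(lines)


triples = triples_from_csv(SAMPLE_CSV)
print(to_dot("Toy theory", triples))
```

The resulting text can be passed to the GraphViz `dot` program to render an image. Using a text label on each edge mirrors the simpler visual convention described above.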
In the present study, descriptive analyses were carried out on the database to explore the mean number of constructs and relationships in each theory and the frequencies of constructs and relationships across theories. A 'network neighbourhood' map was constructed to illustrate links to and from two exemplary constructs across all theories, and a bi-clustered percentage containment heat map was constructed to explore the similarities among theories' constructs across the whole dataset. Both maps were created using Python (Version 3.8).

Results

Representation of 71 theories using the OBMS in computer-readable diagrammatic form
All 71 included theories of behaviour could be represented using the OBMS. Table 1 shows how many rounds of revision were required before the theory representations were finalised. It also shows the types of feedback received in each round and the category of changes made to the theories. Of the 69 authors/experts contacted, 50 (72.5%) responded and 38 (55.1% of those contacted) included feedback on the theory representation in their response. The finalised diagrams of the 76 theories represented using the OBMS, including the five theories previously described in West et al. (2019), are available in Extended data (osf.io/4urjc/files (Hale et al., 2020)).
Notes to Table 1: For two theories (Needs-Opportunities-Abilities Model and Social Identity Theory) first round feedback was received but necessary changes were not identified and therefore these theories were finalised with no revision. c For two theories (Feedback Intervention Theory and Self-Regulation Theory) second round feedback was received but necessary changes were not identified and therefore these theories were finalised after one round of revision. d For one theory (Control Theory) third round feedback was received but necessary changes were not identified and therefore this theory was finalised after two rounds of revision.

A searchable database of 76 OBMS-represented behaviour change theories
The online database of behaviour change theories is accessible at humanbehaviourchange.org/theory-database. The web interface provides online searching and browsing functionality for the database of triples associated with the theories. This provides a formal, computer-readable and searchable database of the 76 theories that have so far been represented using the OBMS.
The database can be searched by the name of any construct in any theory by entering a search string in the search box on the home page. Searches may include wildcards, for example, searching for 'act*' will retrieve results related to 'act', to 'actor' and to 'action'. If the search retrieves any results, the search results page will display the construct name that was found as well as the theories in which that construct was found. Some constructs are found in multiple theories. For example, the construct 'action' occurs in theories 'Health Action Process Approach' and 'Six staged model of communication effects.' Alternatively, it is possible to browse the database by theory. A full list of the theories currently included in the database is given at the bottom of the home page. Clicking the link with the name of a theory (whether in the search results or the browse facility) will open a theory page in the theory database.
The theory page lists all the triples included in that theory and includes an automatically generated diagram of the relationships specified by the triples.
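The wildcard search behaviour described above can be approximated with the Python standard library. The sketch below is a stand-in for illustration, not the Whoosh-based search used by the database; the index contents are hypothetical apart from the 'action' entry, which follows the example in the text, and `search` is a name introduced here.

```python
from fnmatch import fnmatchcase

# Hypothetical index of construct name -> theories containing it.
# Only the 'action' entry follows the text; the rest is invented for the sketch.
INDEX = {
    "act": ["Hypothetical theory A"],
    "actor": ["Hypothetical theory B"],
    "action": [
        "Health Action Process Approach",
        "Six staged model of communication effects",
    ],
    "attitude": ["Hypothetical theory C"],
}


def search(pattern, index=INDEX):
    """Return the constructs (and their theories) whose name matches the pattern.

    '*' matches any run of characters, including none; matching ignores case.
    """
    pattern = pattern.lower()
    return {
        name: theories
        for name, theories in index.items()
        if fnmatchcase(name.lower(), pattern)
    }


for name, theories in sorted(search("act*").items()):
    print(name, "->", "; ".join(theories))
```

A dedicated indexing library adds ranking and tokenisation that this sketch omits; the point here is only the shape of the lookup from a pattern to construct names to theories.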
It was possible to derive several statistics and observations from the database of 76 theories, which were numbered for ease of identification (see Extended data for theory numbers; osf.io/dcqft (Hale et al., 2020)). The mean number of constructs per theory was 20 (min. = 6, max. = 72), while the mean number of triples (i.e. relationships between constructs) per theory was 31 (min. = 7, max. = 89). Broadly, these two counts are correlated, but some theories are denser in connections than constructs and others are denser in constructs (Figure 3). For example, the four large theories (labelled by number in Figure 3) are theories number 13, 41, 60 and 80 (i.e., "Diffusion of Innovations" theory, "Needs-Opportunities-Abilities Model", "Social Action Theory (Ewart)" and "Theory of Triadic Influence", respectively). Amongst these, however, "Social Action Theory (Ewart)" (theory 60) has fewer triples relative to the number of constructs and theory 80 has fewer constructs relative to the number of triples. As would be expected, most theories have more triples than constructs.
In order to see how often each construct in the whole dataset was referenced across theories, we took each construct name (a total of 1,290) and searched for all theories where it matched all or part of a construct name. With this method, if we search for the construct 'behaviour', partial matches with 'coping behaviour' and 'attitude to the behaviour' would be included. We chose to include partial matches in the search because there were very few exact matches of construct names across theories. We also used the "lemmatized" form of the construct label so as to combine singular forms with plurals. Using this search method, we found that 1,028 constructs only appeared in one theory. Even with partial matching, only 20% (n = 262) of the constructs appeared in more than one theory. Among the 262 construct names that did get referenced in multiple theories, Figure 4 illustrates the 25 most common.
As we might expect, 'behaviour' is the most common construct across all theories, named in 61 theories. The list also includes other crucial constructs that feature commonly in theories of behaviour change, including 'self-efficacy', 'motivation', 'ability', 'intentions' and 'skills'.
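The partial-match counting described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: `lemmatize` is a crude stand-in that only lower-cases labels and strips a plural 's', matching is done on whole words, and the three toy theories and their construct labels are hypothetical.

```python
import re


def lemmatize(label):
    """Crude stand-in for lemmatization: lower-case, strip a trailing plural 's'."""
    words = re.findall(r"[a-z\-]+", label.lower())
    return " ".join(
        w[:-1] if len(w) > 3 and w.endswith("s") and not w.endswith("ss") else w
        for w in words
    )


def mentions(name, label):
    """True if the name equals the label or occurs within it as whole words."""
    return f" {lemmatize(name)} " in f" {lemmatize(label)} "


# Hypothetical toy data: theory name -> construct labels.
THEORIES = {
    "Theory A": ["behaviour", "intentions", "attitude to the behaviour"],
    "Theory B": ["coping behaviour", "intention", "skills"],
    "Theory C": ["action", "skill"],
}


def theory_counts(theories):
    """For each distinct construct name, count the theories that mention it."""
    names = {lemmatize(c) for labels in theories.values() for c in labels}
    return {
        name: sum(
            any(mentions(name, label) for label in labels)
            for labels in theories.values()
        )
        for name in names
    }


counts = theory_counts(THEORIES)
shared = sorted(name for name, k in counts.items() if k > 1)
print(shared)
```

In this toy data 'behaviour' is counted for Theory B only because it is part of 'coping behaviour', which is the kind of partial match the text describes.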
Fourteen relationship types were used across the whole dataset of theories. Table 2 lists the counts of relationships used in terms of both the number of theories that a given relation appears in, and the number of triples it is used in across all theories. Thirteen of these relationship types were identified within the five theories represented by West et al. (2019); the 'transitions to' relationship was added during the present study.
The theory database provides the ability to get an overview across multiple theories. For example, we can probe the 'network neighbourhood' of a construct across the theories in which it appears to see which relationships have been captured for that construct, regardless of which theory they were captured in. Figure 5 illustrates all the 'influences' relationships to and from the 'beliefs' and 'attitudes' constructs across all theories.
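A network neighbourhood query of this kind can be sketched as a filter over pooled triples. The function name `neighbourhood` and the toy triples below are hypothetical and do not reproduce the content of Figure 5; the sketch assumes exact matching on construct labels.

```python
# Hypothetical toy data: theory name -> list of (source, relationship, target).
THEORIES = {
    "Theory A": [
        ("beliefs", "influences", "attitudes"),
        ("attitudes", "influences", "intention"),
    ],
    "Theory B": [
        ("experience", "influences", "beliefs"),
        ("beliefs", "part of", "knowledge"),
    ],
}


def neighbourhood(construct, theories, relationship="influences"):
    """Collect links of one relationship type into and out of a construct.

    Returns (incoming, outgoing); each item pairs the theory in which the
    link was found with the construct at the other end of the link.
    """
    incoming, outgoing = [], []
    for theory, triples in theories.items():
        for source, rel, target in triples:
            if rel != relationship:
                continue
            if target == construct:
                incoming.append((theory, source))
            if source == construct:
                outgoing.append((theory, target))
    return incoming, outgoing


incoming, outgoing = neighbourhood("beliefs", THEORIES)
print("influences beliefs:", incoming)
print("influenced by beliefs:", outgoing)
```

Keeping the theory name with each link preserves provenance, so a pooled view can still be traced back to the theory that made each proposition.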
One of the most promising applications of the theory database is to ask how theories themselves are related. While there are potentially many ways that a measure of theory similarity can be computed for the OBMS formalism, and we plan to explore these further in future work, an initial naïve metric could be based on the construct mentions as calculated on a per-construct basis above, generalised to a per-theory basis. Using this approach, we calculated for each theory the percentage of its constructs that are mentioned in another theory. Doing this for all possible pairwise combinations of theories in the dataset, we constructed a heat map showing the degree to which each pair of theories contains similar constructs (Figure 6).
The value in each cell of the heat map is the percentage of the row theory's constructs that are also mentioned in the column theory. All percentages are calculated with respect to the row theory, numbered along the right side of the matrix (see Extended data for theory numbers; osf.io/dcqft (Hale et al., 2020)). Thus, every pairwise combination of theories has two values in the heat map. For example, we can see that the percentage of constructs from theory 40 mentioned in theory 8 (light pink cell) is higher than the percentage of constructs from theory 8 mentioned in theory 40 (light blue cell).
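The asymmetry of the percentage containment measure is easiest to see in a small worked example. The sketch below is a simplification under stated assumptions: it uses exact set membership instead of the partial 'mention' matching described earlier, and the two toy theories and the name `containment_matrix` are hypothetical.

```python
# Hypothetical toy data: theory name -> set of construct names.
THEORIES = {
    "Theory A": {"behaviour", "intention", "attitude", "norm"},
    "Theory B": {"behaviour", "intention"},
}


def containment_matrix(theories):
    """Percentage of each row theory's constructs found in each column theory."""
    return {
        row: {
            col: 100.0 * len(row_constructs & col_constructs) / len(row_constructs)
            for col, col_constructs in theories.items()
            if col != row
        }
        for row, row_constructs in theories.items()
    }


matrix = containment_matrix(THEORIES)
print(matrix["Theory A"]["Theory B"])  # share of A's constructs found in B
print(matrix["Theory B"]["Theory A"])  # share of B's constructs found in A
```

Here half of Theory A's constructs appear in Theory B, while all of Theory B's constructs appear in Theory A, so the two cells for the pair differ because each is normalised by the row theory's own size.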
For a given theory, the heat map can be interpreted by row or by column. Rows show the percentage of constructs from the numbered theory that are 'mentioned in' other theories. For example, a row of red cells would indicate 100% of constructs from the numbered theory are mentioned in every [...] ("Needs-Opportunities-Abilities" and "Focus Theory of Normative Conduct" respectively), while other rows that show higher percentages of mentions include theories 8, 55, 64 and 63 among others ("COM-B", "Risk as Feelings Theory", "Social Consensus Model" and "Social Cognitive Theory" respectively). This can be considered evidence of the generality of these theories according to the percentage similarity metric, which may indicate theories that describe a broader part of the domain of behaviour change or include commonly used construct names.

Figure 4. Twenty-five most commonly referenced theory construct names. Bars show how many theories referenced each construct name, either by exact match or within a longer construct name. The calculation of constructs appearing in theories was done using the "lemmatized" form of the construct label so as to combine singular forms with plurals.
Column-wise, there are several groupings of theories that appear to mention higher percentages of constructs from other theories that are not all widely-mentioned overall; for example, theories 72, 29, 80, 66, 30 and 31 form one such group ("Systems model of behaviour change", "I-Change model", "Theory of triadic influence", "Social ecological model of behaviour change", "Information-Motivation-Behavioural Skills Model" and "Information-Motivation-Behavioural Skills Model of Adherence Behaviour"). We can expect to see such groupings as a result of theories covering a similar aspect or part of the overall behaviour change domain. On the other hand, the left-most group of dark blue columns indicates a group of theories that mention few constructs from other theories, which may be expected from theories that are somewhat dissimilar to most of the domain. This includes theories 13, 9, 44, 51, and 6 (namely, the "Diffusion of Innovations", "Consumption as social practices", "Precaution Adoption Process Model", the "Rational addiction model", and "Change theory"). Inspection of their respective constructs verifies that indeed, most of these theories are using different terminologies or construct names than the more typical language used in the bulk of the theories in the behaviour change domain. For example, "Change theory", which appears in this grouping, includes construct names such as "Freezing", "Restraining forces" and the "Quasi-stationary equilibrium" that are not shared with any other theories.

Discussion
For the 71 theories included in the present study, a diagram using the OBMS could be generated which adequately captured the constructs and relationships proposed in the theory. Pooling these diagrams with five previously generated diagrams using a nearly identical methodology (West et al., 2019), we derived a database of formal representations of each theory in the form of construct-relationship-construct triples. The initial thirteen relationship types were grouped by West et al. (2019) into three basic types: causal (e.g. "influences"), semantic (e.g. "type of") and structural (e.g. "part of"). "Transitions to", on the other hand, describes a process relationship whereby one state changes to another. Our findings suggest this type of relationship was not uncommonly used across theories, but only accounts for a small fraction of the overall number of propositions made. Our initial analyses on the theory database give a picture of the landscape of existing behaviour change theories. The theories themselves vary greatly in size, with a correlation between number of constructs per theory and number of relationships between constructs. Unsurprisingly, "behaviour" was the most commonly referenced construct across theories, but, perhaps surprisingly, a fifth of theories did not use the word behaviour at all. Instead, they used other terms for behaviour, such as "action" (e.g. Transtheoretical Model; Prochaska & DiClemente, 1982) and "action or inaction response" (Norm Activation Theory; Schwartz, 1975). This illustrates a practical consideration for representing and comparing theories systematically: construct names alone provide limited information about the actual construct. Additional information such as the definition of that construct and its ontological relationship to others is needed to search and compare across theories accurately. Without this information, it is hard to be precise about commonalities and gaps that may exist among theories. Our next steps (outlined below) will address this.
Our findings also indicated some groups of overlapping theories. On the one hand, we could identify groups of theories that contained constructs commonly used across the whole dataset, which could be interpreted as more general theories in terms of describing a broad part of the behaviour change domain or using commonly mentioned construct names. These were typically less detailed theories, containing fewer than the average number of constructs and triples. One of the theories in this grouping, Social Cognitive Theory (Bandura, 1986), was also identified in previous network analysis as contributing to 12 of the 83 theories in the corpus, based on explicit indication in the theory sources (Michie et al., 2014). However, the others from the grouping contributed to three or fewer theories in the network analysis, suggesting there is not necessarily high overlap between theories on the construct similarity metric used in the present study and the contribution metric used in the previous study.
On the other hand, we could also identify groups of theories that mention similar constructs from other theories which are not necessarily widely mentioned overall, thus representing a similar part of the overall domain, as well as theories which mentioned very few constructs contained in other theories. Furthermore, we have shown that formalising theories into the OBMS triple format allows exploration of the 'network neighbourhood' of a given construct, such as beliefs or attitudes: that is, all the propositions made across the whole dataset about what influences, or is influenced by, that construct. These findings demonstrate potential for the OBMS to facilitate synthesis and identify 'canonical' theories, which we will discuss in the next steps.
Despite the apparent groupings of similar theories, we found that only around 20% of construct names from the 76 theories appear in more than one theory, and this is further evidenced by the dominance of low percentages of mentions between theories. This is a striking result, considering that all of the theories pertained to behaviour change, and therefore more commonalities might be expected within a domain. It highlights the extent of theoretical heterogeneity in the labels used to describe different phenomena across the field, and further illustrates the need to map the constructs that have been used in theories to shared ontologies, so that overlaps in construct definitions could also be identified.
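As an illustration of how such a proportion could be computed, the sketch below counts the construct names that occur in more than one theory. It assumes exact, case-insensitive name matching, which may differ from the matching rules used in the study; the function name and toy data are hypothetical.

```python
from collections import defaultdict

# Each triple: (theory, source construct, relationship, target construct).
# Toy data for illustration only; not drawn from the published database.
triples = [
    ("Theory A", "attitude", "influences", "intention"),
    ("Theory A", "intention", "influences", "behaviour"),
    ("Theory B", "self-efficacy", "influences", "Behaviour"),
    ("Theory C", "contemplation", "transitions to", "action"),
]


def shared_name_fraction(triples):
    """Fraction of distinct construct names that appear in more than one theory."""
    theories_by_name = defaultdict(set)
    for theory, source, _relationship, target in triples:
        # Assumption: names are compared after lower-casing, nothing more.
        theories_by_name[source.lower()].add(theory)
        theories_by_name[target.lower()].add(theory)
    shared = [name for name, ths in theories_by_name.items() if len(ths) > 1]
    return len(shared) / len(theories_by_name)


fraction = shared_name_fraction(triples)
```

Name matching of this kind cannot tell whether "action" in one theory and "behaviour" in another denote the same construct, which is why mapping constructs to shared ontologies is needed.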
As well as facilitating generation of the theory database, the OBMS provides a tool which researchers can use to formally specify theories under development. Because the OBMS is ontologically based, it is able to accommodate a variety of different relationship types, including causal, semantic, structural and process relationships, and could accommodate more types of propositions if needed. This means the OBMS is compatible with, but not restricted to, quantitative approaches to theory specification such as mathematical equations or simulation models, recommended in methodologies for psychological theory construction and testing (e.g. Borsboom et al., 2020; Forstmann et al., 2011; Oberauer & Lewandowsky, 2019).
Using the OBMS as an integrative semantic framework built on the existing database could help to shift from current practice, characterised by small group 'ownership' of theories and a climate which discourages theory development during early career stages (Borsboom et al., 2020; van Rooij & Baggio, 2020), to a more open and collaborative approach within and between teams.
A current limitation of the OBMS format is its representation of constructs which relate to a relationship, such as a moderator. Because the triple format only allows for construct-relationship-construct propositions, in some cases we needed to treat relationships as constructs in order for triples with relationships as targets to be expressible. This type of workaround is commonly called 'reification' in the computer science and conceptual modelling literature. In the OBMS format it results in triples such as "Feedback-influences-[the 'Goals' to 'Persistence' Influences (*) relationship]" (Goal Setting Theory; Locke, 1968; Locke & Latham, 2002). This creates additional complexity when comparing and combining theories, because triples can be the targets of other triples and can thus be "nested". However, the ability to capture relationships such as moderators, which influence other relationships, is essential for a faithful representation of the semantic content of some theories. In the future, we may explore the use of a more expressive formalism than the basic construct-relationship-construct triple format in order to capture these elements of theories more intuitively.
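The nesting that reification introduces can be sketched with a recursive data structure in which the source or target of a triple is either a construct name or another triple. The `Triple` class and `depth` function below are hypothetical illustrations rather than the OBMS storage format; the example triple is the Goal Setting Theory proposition quoted above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Triple:
    """A proposition whose source or target may itself be a proposition."""
    source: "Node"
    relationship: str
    target: "Node"


# A node is either a plain construct name or a reified (nested) triple.
Node = Union[str, Triple]


def depth(node):
    """Nesting depth: 0 for a construct, 1 for a flat triple, 2+ if nested."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(node.source), depth(node.target))


# The relationship being moderated, expressed as an ordinary triple.
goals_to_persistence = Triple("Goals", "influences", "Persistence")
# The moderator targets that relationship rather than a construct.
moderation = Triple("Feedback", "influences", goals_to_persistence)
```

Any routine that compares or combines theories then has to handle both cases, for example by recursing into nested targets, which is the additional complexity noted above.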
The OBMS database structure can be searched and used as a basis for mathematical and logical inference. While some preliminary results are illustrated in this manuscript, to make more powerful inferences it will be necessary to assign an unambiguous meaning to the constructs. We plan to address this in the next steps of the project by annotating the constructs in the theories to ontology entities from e.g. the Behaviour Change Intervention Ontology (Michie et al., 2020b). Ontology annotations of this type provide additional metadata, such as synonyms and definitions, and also a formal logical semantics in the relevant ontology language (e.g. OWL). This will allow theories to be checked for logical consistency internally and in combination with other theories. It will then be possible to determine which theories are mutually inconsistent (i.e. cannot both be true) based on logical contradictions across the subsets of annotated ontology entities. This added functionality will mean that in future steps we can build a 'canonical' theory that is most consistent with the combined theoretical representation across all the theories. It will also allow researchers to identify and assess different predictions made by theories from the same domain.

Conclusions
The ontology-based modelling system (OBMS) is a systematic approach that can be used to represent a large body of behaviour change theories. Feedback from theory authors suggests that this method captures theory propositions adequately, with nearly half of theory authors requiring no revisions of the initial OBMS representation of their theory. This study extended the initial development of the system (West et al., 2019), and a significant output of the research is a database of 76 published behaviour change theories which can be used for comparison and synthesis. Initial analyses on the database indicate this body of theories is characterised by heterogeneous construct names, as well as some areas of overlap. This suggests that greater integration and synthesis of theories is possible and would be beneficial. The next steps of the research programme will facilitate this through mapping the theory database to ontologies of behaviour change and using computational methods to synthesise theories.
• GenerateTheoryStatistics.py (Source code scripted in Python to conduct statistical analyses on the database of 76 theories, as presented in the Results section, Figure 3-Figure 6 and

Austin S. Baldwin
Department of Psychology, Southern Methodist University, Dallas, TX, USA
In this paper, the authors report on the creation and development of a searchable database (an ontology-based modeling system [OBMS]) of behavior change theories and their related constructs and relationships. The database includes 76 different behavior change theories that went through a rigorous identification and coding process prior to inclusion (71 of the theories were identified and coded in the phase of the project reported here). The creation, development, and validation of these tools are a very important and needed undertaking to systematically synthesize the vast array of behavior change theories. The database has the potential to be a critical tool in facilitating scientific advances in behavior change theory and interventions. This paper is well-constructed and well-written. The rigorous process for identifying and coding the various theories, constructs, and relationships included feedback from theory authors and is an important strength of the paper. Several of the tables and figures are also helpful in conveying what types of information are available in the database and potential uses. Finally, the inclusion of database materials relevant to this paper on an open repository (OSF) is an important strength of the paper and the project. I support the indexing of this paper.
I also have a few suggestions for the authors that largely center on more clearly discussing the implications and elaborating on potential uses of the database. For example, the authors might consider including concrete examples of how researchers could use the database to design interventions and test theories. I think addressing the following issues would strengthen an already strong paper:
1. It is not entirely clear how researchers might use information available in the database to inform intervention design (e.g., Which theory should we use? Does it matter which we use?), or to properly design interventions to test a theory and inform its refinement. Are there concrete examples of how researchers might approach these questions using the database? Is the database ultimately agnostic to these issues? Either way, it would be helpful to clarify for readers.
2. The information about overlap in constructs among theories conveyed in the heat map (Figure 6) is quite interesting. One persistent challenge in advancing behavior change
© 2020 Dixon D. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Diane Dixon
Institute of Applied Health Sciences, School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, Aberdeen, UK
The preponderance of theories is a significant barrier to the development of a cumulative evidence base for understanding and changing human behaviour. This study is part of a programme of work that is developing the tools to address the issue of theory proliferation and construct overlap. I would strongly support its indexing. 'What's in a name?', asked Juliet, and whilst I'd generally avoid challenging Shakespeare, it turns out the answer is 'quite a lot, really'. As the authors discuss in the introduction, behavioural science is not bounded by the discipline imposed by the properties of matter that benefits the natural sciences. Our reliance on natural language poses challenges to behavioural science because it is unbounded. I see the work described in the paper as part of a wider attempt to develop tools and mechanisms that will enable behavioural medicine to identify the boundaries and shared content of the theories and constructs it regularly, and not so regularly, uses. This is a worthwhile endeavour. I have some comments to share with the authors:
○ They are collating constructs by collating construct names and partial names. This generates Figure 4. However, not all of the 'constructs' displayed in Figure 4 would generally be recognised as 'constructs'. For example, whilst self-efficacy is a recognised construct, 'social' and 'personal' are not 'constructs' in the generally accepted meaning of the term, e.g. 'social support' and 'social norm' are very different constructs. I am not arguing against the authors' method here; rather, I am struggling with their use of the term 'construct' to label the groupings that are produced by the process of collation: 'all constructs are equal but some are more equal than others'… some aren't even constructs.
○ The authors themselves are surprised by the finding that even for partial matching only 20% of constructs appear in more than one theory. Either they have a few theories that contain a very large number of unique constructs or there is a problem with the use of non-standardised terms to label constructs. I am struggling to believe the 20% figure is a true representation of reality rather than an artefact of poor construct labelling. The authors allude to this in the discussion, but I wonder if they might consider strengthening the discussion of this possibility.
○ The included theories all seem to lean towards conceptualising behaviour as a result of deliberative processes. Perhaps I am in error here, but if not, it would be useful if the authors could consider how environmental models of behaviour, especially behaviourist approaches, might be accommodated within the OBMS.
○ Similarly, how well does the OBMS accommodate important sociodemographic, cultural and socially constructed influences on behaviour? These are often treated as moderators of belief-behaviour relationships. The authors point out the limitations of the reification approach to accommodating moderators and I would very much encourage the authors to