Modelling as a process

Broadly speaking, models are representations of something, concrete or not. In science, models always have a purpose related to understanding and explaining phenomena. This requires focus: selecting what to represent, what not to represent, and how to represent it, among other things. Thus, a side effect of developing the scientific method is the development of a well-structured modelling paradigm. Starting from phenomena and objects, I discuss many decision-abstraction steps in the modelling process that lead to models of phenomena expressed mathematically or computationally, highlighting underlying contexts and procedures. This discourse is centred on a cross- and trans-disciplinary systems science perspective. It is grounded in a personal perspective and may be considered a model of the modelling process.


Introduction
Models, in the sense of representations, allegories, or metaphors, were already used by Greek philosophers, and possibly earlier, without being remarked upon or named. Presently, in the light of what we have learned from the neurocognitive sciences, it looks as though models are necessary for, and lie at the foundations of, our mental processes. In science, at least, models always highlight what modellers think is relevant or important in any given situation and for any phenomenon.
They are present, crucial, and rather conspicuous in all intellectual developments since the scientific revolution, particularly in the formalisation of empirical sciences (Béziau and Kritz 2010; Kritz and Béziau 2011; Badiou 1969). For instance, each equation in physics embodies a model, as does any concept in physics and other sciences. Or, more correctly, an equation is the expression of a model that resides mostly inside our minds. This is why no two scientists read an equation in exactly the same way. The fact that most of what is expressed by equations and other tokens is common sense among scientists is a result of the scientific method but, nevertheless, a marvel. Words in a natural language also embody models. Distinctions among such models explain why texts may be interpreted in different ways.
This conscious and critical perception of models and their role was not needed before the middle of the last century because a phenomenon usually had just one model and rarely would the same model represent more than one phenomenon. Even when this was the case, the model often represented yet another phenomenon that is instantiated in two different manners. A classical example of this occurrence is the harmonic oscillator with its mechanical and electro-magnetic variants. The phenomenon is the harmonic oscillation that appears in distinct contexts.
For this reason, models remained subjacent and unnamed until the middle of the last century, when quantum phenomena and the theory of generic systems (General Systems Theory, or GST) unveiled the possibility of having several models representing a single phenomenon from distinct standpoints and with different details, enriching its understanding. Nowadays, model multiplicity is believed to be indeed mandatory for taming the inherent complexity of biological phenomena and of organised-complexity phenomena containing biological entities as elements (von Bertalanffy 1971; Kritz 2010).
The identification of models as an important concept and intellectual tool fashioned modelling and launched a flush of ad hoc activity in which models were, and still are, built upon demand, tailored to specific classes of phenomena and based on formalisms chosen in advance for reasons not necessarily phenomenological. Modelling was initially called simulation and focused almost exclusively on (dynamic) behaviour. Nevertheless, these activities triggered theoretical investigations about systems by themselves that yielded sound and valuable understanding about their generic properties, from both a formal (Mesarović and Takahara 1975, 1988; Klir 2001) and an epistemological (von Bertalanffy 1971; Weinberg 2001; Klir 2001) standpoint. In the course of time, modelling widened and converged into an identifiable procedure that remains in large measure implicit, hiding important steps, choices and decisions intrinsic to the process. This is the topic of this text, which aims to highlight what has become undercurrents over the years. The discourse follows the broad guidelines set by GST (Kalman et al. 1969; Mesarović and Takahara 1975; Klir 2001), while preserving the unique flavour of a personal perspective towards nature, which may eventually lead to differences in notation and in the meaning of certain concepts.

A roadmap
In scientific discourses, models are by and large symbolic, abstract, and linguistic objects, having the purpose of explaining what is observed and enhancing knowledge about some subject of interest (Kritz and Béziau 2011; Vieira Kritz 2020b). A model is always used when investigating scientific subjects and is for the most part built upon demand. Modelling is the process of creating models. Models are usually built by means of their expressions in some language or formalism chosen in advance and without proper consideration of the characteristics or properties of what is being modelled. Due to the variety and unboundedness of possibilities within this process, there is no well-delineated procedure to describe modelling. Nevertheless, the sketch of a feasible roadmap is available and presented in Fig. 1.
Some milestones and actions displayed in Fig. 1 have been addressed elsewhere, supporting discussions about certain particularities of the scientific process (Kritz 2010; Kritz and dos Santos 2011; Kritz et al. 2010; Vieira Kritz 2020a). They are compactly recalled below in a more mathematical language where symbols, diagrams and (mathematical or programming) formalisms are employed. Other elements in the roadmap will be visited without abiding by any particular thread, and their selection is moulded just by available space. It is worth noting, though, that the general and pictorial descriptions previously presented are no less rigorous than what follows. While modelling, it is utterly important to understand that rigour stems from precise reasoning and careful argumentation, not from symbols nor from the language used. Of course, it may be easier to express something in one (formal) language than in another, as writing the same algorithm in different programming languages clearly illustrates.
Though symbolic and apparently abstract, the description below is not detached from reality. It is indeed very close to it, as the interspersed examples indicate. For the sake of communication, the examples will be simple and mostly drawn from physics and chemistry. Nevertheless, all actions and statements below are valid for any kind of phenomenon, and they are utterly relevant and important when addressing organised-complexity phenomena (Weaver 1948; Klir 2001; Vieira Kritz 2020b).

Phenomena and objects
Every phenomenon subsumes change. Changes subsume (or induce) the perception of before and after, leading to time. For the time being, time is anything that allows the recognition of changes. Space is where things are situated.
The most conspicuous distinction between objects and phenomena is that in the latter something changes during the observation period, while all aspects of the former remain immutable during observations. This dichotomy encompasses all that exists: anything we perceive either appears to change or not. Hence, observers are needed to distinguish between objects and phenomena, as well as to identify what changes, when it changes and how it changes. Clearly, depending on characteristic times and time-scales, a phenomenon may be perceived as an object and objects may show themselves as very slow phenomena (e.g., flows of solids, glasses, and non-Newtonian fluids). By and large, it is assumed that changes result exclusively from internal and external interactions of phenomenological components.

Definition 1 A Phenomenon, $F$, is a quintuple

$$F = \langle T, \mathcal{I}, \mathcal{O}, \tau, t \rangle,$$

where $T$ is a set of things that are the interacting elements of $F$, $\mathcal{I}$ encompasses all possible and potential interactions that may provoke change, $\mathcal{O}$ is a collection of observers, $\tau$ is an account of characteristic times of changes in $F$, and $t$ is the minimum time needed for changes to be perceived and acknowledged.
Note that τ is a characteristic of phenomena, while t relates to the conjunction of observers, sensing apparatuses and observation methods. Put together, they attest to a tension, and comparing them informs the modeller about what can be safely modelled. This tension is here denoted as [< τ , t >]. If both are numbers, the tension can be written as the Boolean comparison τ ≤ t or its reverse.
In general, however, τ is a collection of values without any specific structure, the determination of t can be a rather complicated procedure involving dimensional and other analyses, and [< τ , t >] is thus a complex object, either mathematical or computational. In certain cases, this tension may have to do with parameters resulting from the modelling process itself as is the case of discretisation parameters (see Sect. 6.1).
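As a minimal sketch, assuming numeric characteristic times, the quintuple of Definition 1 and the tension between τ and t might be recorded as follows. The class name `Phenomenon`, the helper `perceivable`, and the pendulum values are illustrative assumptions, as is the reading that a change can be perceived when its characteristic time is not shorter than t.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Phenomenon:
    """Hypothetical container mirroring the quintuple of Definition 1."""
    things: FrozenSet[str]                     # T: interacting elements
    interactions: FrozenSet[Tuple[str, str]]   # I: possible interactions
    observers: FrozenSet[str]                  # O: collection of observers
    tau: Tuple[float, ...]                     # characteristic times of change
    t: float                                   # minimum perceivable time


def perceivable(f: Phenomenon) -> Tuple[bool, ...]:
    """Simplified numeric tension: for each characteristic time, report
    whether it is at least as long as t (an assumed criterion)."""
    return tuple(tau_k >= f.t for tau_k in f.tau)


# Illustrative values: a pendulum filmed at 25 frames per second (t = 0.04 s)
# with a slow swing (2 s) and a fast vibration of the string (1 ms).
pendulum = Phenomenon(
    things=frozenset({"bob", "string"}),
    interactions=frozenset({("bob", "string")}),
    observers=frozenset({"camera"}),
    tau=(2.0, 0.001),
    t=0.04,
)
```

Under these assumptions, `perceivable(pendulum)` flags the swing as observable and the string vibration as too fast for the chosen apparatus. When τ and t are not plain numbers, the comparison would need a richer object than this tuple of Booleans.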

Describing phenomena: things
The first step in studying or discoursing about any subject is the ability to refer to it, to its parts and to all factors considered relevant. For phenomena, this is achieved by naming things, interactions, and observers, the last ones only when needed. Phenomenological elements without names do not exist or are not relevant for the modelling process.
Things are described by naming and enrolling them (indexing them) and by stating how they should be seen. That is, by pointing out what is observed as well as how changes in observations will be registered. Things in phenomena are never accessed in their entirety. That is, not all their observable details will be recorded or studied. Instead a (finite) subcollection A of aspects (qualities) characterising them is selected to represent each object, with its neatly defined form and volume, or each entity, otherwise.
That is, for each $th \in T(F_1)$, there is a finite set $A_{th}$ of selected aspects that are effectively and regularly observed. For most phenomena in physics, $A_{th_1} = A_{th_2}$ for any $th_1, th_2 \in T$. However, this is not mandatory, as wave-particle interactions indicate. In chemistry, aspects may vary from one thing to another depending on substance classes and chemical affinities. But, in the case of inorganic chemistry, aspects remain the same within each categorical sort stemming from these classes and affinities. In biochemistry, though, aspects of the same substance may vary during observation due to conformational changes in its molecules. For economic and social phenomena, $A_{th}$ may vary wildly from one thing to another, and it is even possible that $A_{th_1} \cap A_{th_2} = \emptyset$ when $th_1 \neq th_2$. If $A_{th_1} \subset A_{th_2}$, then $th_2$ is seen in greater detail than $th_1$; that is, more aspects are considered important and observed in one thing than in the other. From counting principles, the number of things composing phenomenon $F_1$, $\#(T(F_1))$, is either finite, enumerable or densely infinite. Hence, a good way to name things is through an index function $J \to T$, where $J$ is either $J_n = \{1, \dots, n\}$ with $n \in \mathbb{N}$, $\mathbb{N}$ itself, or any convenient non-denumerable set like $\mathbb{R}^n$ or $D \subset \mathbb{R}^n$.
For material objects, a commonly considered aspect is where the object sits in space and time. Due to the principle of non-interpenetrability, this attribute uniquely identifies the object (e.g., billiard balls on a table, cars on roads, particles suspended in fluids, or molecules in a gas). That is, it can be used as a name for the object. For material entities, like fluids and fields, localisation is not a value but a function from a region in space-time to {0, 1}, meaning presence or absence of the entity at each point in the region. This description embodies other aspects such as the entity's form and boundary.
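A localisation function of this kind can be sketched in a few lines. The one-dimensional moving entity, the names `make_localisation` and `boundary`, and the numeric values are assumptions chosen only for illustration.

```python
def make_localisation(speed, width):
    """Return the localisation function of a hypothetical 1-D entity that
    occupies the interval [speed * t, speed * t + width] at time t."""
    def present(t, x):
        left = speed * t
        return 1 if left <= x <= left + width else 0
    return present


def boundary(present, t, xs):
    """Sample points of xs where presence switches between 0 and 1 at time t."""
    values = [present(t, x) for x in xs]
    return [xs[i] for i in range(1, len(xs)) if values[i] != values[i - 1]]


# Illustrative entity: one unit wide, moving at one unit of length per unit of time.
drop = make_localisation(speed=1.0, width=1.0)
```

The function `boundary` hints at how form and boundary can be read off the localisation function itself, here only up to the resolution of the sample points.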
In science, everything needs to be done in a coherent manner. Aspects that uniquely identify things are called identifying aspects. If they pervade all $th \in T(F_1)$, they can always be used to construct naming schemes for $F_1$, that is, to construct the index functions above. Eventually, things in $T$ may be completely bypassed by naming their aspects directly, aided by the multi-valued function $N$ on $J$, where $N(j) = A_{th_j}$. This results in a double index scheme with variable index-bounds in the aspects' space $A = \bigcup_{th \in T} A_{th}$. When $J$ contains space-time positions in the usual sense, that is, when $J \supset B$ with $B \subset \mathbb{R}^{+} \times \mathbb{R}^n$ or $B \subset \mathbb{R} \times \mathbb{R}^n$, $n = 1, 2, 3$, $J$ is clearly a localisation space, besides being an identification space. Hence, its elements can be used to identify and locate all attribute changes (events). This perception can be maintained whenever $J$ is associated with identifying aspects, even if these relate to space-time only indirectly. On that account, $J$ will be called the event space.
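The naming scheme just described can be illustrated with a small sketch for a finite index set. The things, their aspect sets, and the helper names below are hypothetical; only the structure (index function, multi-valued naming function, double index) follows the text.

```python
# Hypothetical things of a toy phenomenon, named through a finite index set.
things = ["ball_a", "ball_b", "table"]
J = range(1, len(things) + 1)                  # J_n = {1, ..., n}
index = {j: th for j, th in zip(J, things)}    # index function J -> T

# Selected aspects A_th for each thing (illustrative choices).
aspects = {
    "ball_a": {"position", "velocity"},
    "ball_b": {"position", "velocity", "radius"},
    "table": {"friction"},
}


def N(j):
    """Multi-valued naming function: N(j) is the aspect set of thing j."""
    return aspects[index[j]]


def less_detailed_than(th1, th2):
    """True when the aspects of th1 form a proper subset of those of th2,
    i.e. th2 is seen in greater detail than th1."""
    return aspects[th1] < aspects[th2]


# Aspects' space A (union of all aspect sets) and the double index (j, a).
A = set().union(*aspects.values())
double_index = [(j, a) for j in J for a in sorted(N(j))]
```

The number of aspects per index varies (two, three and one here), which is what the variable index-bounds of the double index scheme amount to in this toy case.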

Describing phenomena: changes and interactions
Summing up the above, some aspects vary during the observation of phenomena while others do not, and the change of any phenomenological aspect $A_k$ associated with a $th \in T$ is called an event. Observed events induce an immediate sense of before-after that is generally associated with time. Conversely, definitions of (linear) time establish a before-after relation in any set of events associated with it. The (apparently) immutable aspects necessary for explaining a phenomenon are called parameters. Aspects that change and affect behaviour are called variables, a concept that differs slightly between mathematical and computational models.
However, in reference to usual time, the observation of event $e_1$ before $e_2$ does not entail with certainty that $e_1$ occurred before $e_2$, as the following example shows. If $st_1$ and $st_2$ are two stars and $st_2$ is three times more distant from Earth than $st_1$, signals from events in the planetary system of $st_1$ need only one third of the time that signals from events around $st_2$ need to reach Earth, because the signals associated with both events travel at the same speed, that of light. Hence, it is possible that an event in the $st_1$ system occurring later than another event around $st_2$ will be seen first and appear to precede the latter. Due to constraints in observation methods and apparatuses, this mismatch can happen more often than not, even in controlled laboratories on Earth's surface or in contexts where signals do not travel at the same speed. Nevertheless, any time something changes one can identify moments before and after that change, which induces a (subjective) sense of time.
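The star example can be checked with a short calculation. The distances and event times below are invented for illustration, and the sketch assumes a single shared clock and distances measured in light-years, so that the signal speed equals one.

```python
def arrival_time(t_event, distance_ly):
    """Time (in years) at which light from an event reaches Earth.
    Distances are in light-years, so the speed of light is 1."""
    return t_event + distance_ly


# Hypothetical numbers: st2 is three times farther away than st1.
d1, d2 = 10.0, 30.0
e2_occurs, e1_occurs = 0.0, 5.0          # e2 (around st2) happens first

seen = {
    "e1": arrival_time(e1_occurs, d1),   # light from st1
    "e2": arrival_time(e2_occurs, d2),   # light from st2
}

occurrence_order = ["e2", "e1"]
observation_order = sorted(seen, key=seen.get)
```

With these values the event that occurred later (`e1`) is observed first, so the observation order reverses the occurrence order.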
To describe changes in general and start addressing dynamics, a special but simple artefact that respects before-after relations is useful.

Definition 2 A timeline is any intellectual construct that allows one to distinguish before and after among observations of any kind, since perceptions are events inside observers.
Perceptions, observers and signals are as defined in Vieira Kritz (2017). For instance, a set B A endowed with a reflexive linear order ≤ and a plain Hausdorff topology separating points is a timeline.

Definition 3
Let $O$ be a collection of observations where, for any $o_1, o_2 \in O$, it can be said that $o_1$ was observed concomitantly with $o_2$, before it ($o_1 \prec o_2$), or vice versa. A chronicle is any association between $O$ and a timeline that preserves $\prec$ and identifies $\prec$ with $\leq$.
Statistical time series are chronicles. See Rosen (1991) for a different treatment of chronicles and more examples. There are many situations where all observations apparently occur at the same time and other information is required to establish the precedence needed for a chronicle, as in microbiology, for instance. When modelling, chronicles are tentative temporal arrangements. Experimenting with them is part of the modelling process. The usual time is the chronicle of a very special phenomenon: motion, indeed motion with constant velocity. Moreover, if B A = ℝ with its usual metric topology, we get the usual measurable time.
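Since chronicles are tentative arrangements, one may want to test whether a proposed assignment of observations to a timeline respects the observed precedence. The sketch below uses numbers with their usual order as the timeline. The function name, the observation labels, and the convention that strict precedence maps to strictly smaller stamps and concomitance to equal stamps are assumptions made for this example.

```python
def is_chronicle(stamp, precedes, concomitant=()):
    """Check a tentative chronicle.

    stamp       : dict mapping each observation to a point of the timeline
    precedes    : pairs (o1, o2) meaning o1 was observed before o2
    concomitant : pairs (o1, o2) observed at the same time
    """
    keeps_order = all(stamp[a] < stamp[b] for a, b in precedes)
    keeps_ties = all(stamp[a] == stamp[b] for a, b in concomitant)
    return keeps_order and keeps_ties


# Hypothetical observations of a thunderstorm.
precedes = [("flash", "bang"), ("bang", "echo")]

tentative_1 = {"flash": 0.0, "bang": 2.9, "echo": 4.1}
tentative_2 = {"flash": 1.0, "bang": 0.0, "echo": 4.1}
```

Here `tentative_1` is an admissible chronicle while `tentative_2` is not, because it places the bang before the flash. Trying out alternative stamps in this way is a crude version of the experimentation mentioned above.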
Traditionally, interactions are accessed (named) indirectly through the things that take part in each interaction. For instance, a chemical reaction $R$ that transforms three substances <| $s_1$; $s_2$; $s_3$ |> into products <| $p_1$; $p_2$ |>,

$$R: \; s_1 + s_2 + s_3 \longrightarrow p_1 + p_2,$$

is named by a token derived from the tuple $(s_1, s_2, s_3; p_1, p_2)$. This token can be simply an integer associated with the tuple by means of a table, the tuple itself, or any other arrangement based on it. Describing interactions to their full extent, that is, what is exchanged (Kritz 2010) and how these exchanges happen, is an ad hoc, involved, and surprising research work which is out of the scope of this paper. For the most part such descriptions possess no general rules and can only be achieved case by case or within well-defined phenomenological classes. Building stable and widely accepted descriptions often takes decades, sometimes centuries. This step is at the heart of modelling and explaining (natural) phenomena, i.e., why there are changes at all. Notwithstanding, a certain number of steps and tools are common to all modelling of interactions, even when used only implicitly. They are described below.
An initial description of interactions is often much humbler. For phenomena in the organised complexity class, it is common to start studying interactions by enrolling who can interact with whom and how each interaction does occur, that is, enrolling interaction possibilities and channels. In general, these elements are of immediate observation or inferable with little effort and scarce observations. Typical examples of this kind of description are the trophic webs of ecosystems, process enchainment in biochemistry, and biochemical networks in micro-biology. The latter are indeed hyper-graphs (Klamt et al. 2009).
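If there is no natural law that forbids two things, $th_1, th_2 \in T$, from interacting, we say that $th_1$ and $th_2$ are connected (or connectable). The collection of all pairs of things that can eventually interact forms an important connection-structure in $T$.

Definition 4 The interaction graph $g(F)$ and the interaction hyper-graph $h(F)$ of a phenomenon $F$ are

$$g(F) = (N, I), \; I \subset N \times N, \qquad h(F) = (N, I), \; I \subset \mathcal{P}(N),$$

where $N$ is the set of nodes naming the things in $T$, $\mathcal{P}(N)$ is the collection of all subsets of $N$, and whenever $th_1$ and $th_2$ may interact phenomenologically, either $(th_1, th_2) \in I$ or $(th_2, th_1) \in I$, for graphs, or $\{th_1, th_2\} \subset a \in I$, for hyper-graphs.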
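The two membership conditions stated above, ordered pairs for graphs and containing subsets for hyper-graphs, translate directly into code. The node names and arcs below are hypothetical and the function names are chosen for this sketch only.

```python
# Hypothetical nodes naming the things of a small chemical phenomenon.
nodes = {"s1", "s2", "s3", "p1", "p2"}

# Graph form: I is a set of ordered pairs of nodes.
graph_arcs = {("s1", "p1"), ("s2", "p1"), ("s3", "p2")}

# Hyper-graph form: I is a set of subsets of the nodes.
hyper_arcs = {frozenset({"s1", "s2", "s3", "p1", "p2"})}


def connected_in_graph(a, b, arcs):
    """True when (a, b) or (b, a) belongs to I."""
    return (a, b) in arcs or (b, a) in arcs


def connected_in_hypergraph(a, b, arcs):
    """True when {a, b} is contained in some hyper-arc of I."""
    return any({a, b} <= arc for arc in arcs)
```

In this toy case the single hyper-arc connects all substances of one reaction at once, whereas the graph form only records pairwise possibilities, which is one reason biochemical networks are more naturally hyper-graphs.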
For phenomena of classical mechanics (billiard balls on a table, planetary systems, fluids, cars on roads, etc.), interaction graphs are always undirected complete graphs $K_n$, where $n = \#(N)$ is the cardinality of $N$, because every thing can interact with every other thing as long as they come into contact. Physical interactions exchange momentum and energy, changing aspects of motion (Schiller 2013-2019). Interaction graphs in these cases are also immutable for the phenomenon's duration and do not provide interesting or useful information.
In the case of electro-magnetic circuits, there are elements that restrain electric current flows in certain directions. For that matter, the circuit wiring confines all electric and magnetic fields around it. That is, the circuit itself constrains the flow of electrons and conducts the electric currents. Considering that circuit elements interact with each other exchanging electrical charge, the circuit itself, the connections between the electrical elements, is a concrete instance of interaction graphs of this kind of phenomenon.
For chemical phenomena, interactions are restrained by chemical affinities. For instance, molecules of substances that are acids and bases may cross-interact, but molecules of two basic or two acidic substances cannot in general react chemically, although they can still interact physically, exchanging momentum and energy and changing their motion patterns, thus forming mixtures. Usually, when basic and acidic substances react, their affinities are complementary and the product is saline. Therefore, g(F) or h(F) loses some arcs and is endowed with a richer, non-trivial connection structure. We can paint the nodes of N with colours that depend on whether the substance is acidic, basic or neutral (salt), or on other chemical characteristics, to help unveil the structure of the interaction graph. This shows that, even in simple chemical reactions, interaction graphs for chemical phenomena are bi-, tri-, or pluri-partite, besides having arcs directed from substrates to products.
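The contrast between the complete graph of classical mechanics and the partite graph induced by chemical affinities can be made concrete. The sketch below is deliberately simplified: the four substances are merely examples, the rule that only acid-base pairs react is an assumption, and arc direction from substrates to products is left out.

```python
from itertools import combinations

# Hypothetical substances coloured by chemical character.
colour = {"HCl": "acid", "HNO3": "acid", "NaOH": "base", "KOH": "base"}

# Mechanical view: every pair may interact, giving the complete graph K_n.
complete = {frozenset(pair) for pair in combinations(colour, 2)}

# Chemical view (simplified rule): only acid-base pairs may react.
reactive = {pair for pair in complete
            if {colour[s] for s in pair} == {"acid", "base"}}


def is_partite(arcs, colouring):
    """True when no arc joins two nodes of the same colour."""
    return all(len({colouring[s] for s in arc}) == len(arc) for arc in arcs)
```

For these four substances the complete graph has six arcs and the reactive graph keeps four of them. The two removed arcs are the acid-acid and base-base pairs, and the colouring shows that what remains is bipartite.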
Scaling further up into organisational complexity, the structure of interaction graphs becomes ever richer in possibilities, requiring an analysis of its own (Ulanowicz 1983; Kritz et al. 2010). In biochemistry-grounded phenomena, for instance, interaction graphs may even vary (Mohler and Ruberti 1978) along a phenomenon's evolution, which requires the insertion of interaction graphs in any sensible description of their behaviour (Kritz 2010). Interaction graphs carry a lot of information about a phenomenon and have been used for a long time in certain scientific domains without any further modelling. This is the case of biochemical networks and trophic webs, which have stood as "the" description of phenomena in micro-biology and ecology for decades. Both have the advantage of being directly observable and of becoming first-order approximations of dynamical behaviour whenever their arcs are labelled with information about fluxes of matter or energy. This is straightforwardly achieved in micro-biology and ecology, for instance. Notwithstanding, the deeper information to be found in interaction graphs comes from the missing arcs, which indicate where the phenomenon cannot evolve to (Kritz and dos Santos 2011).
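Both ideas, arcs labelled with fluxes and the information carried by missing arcs, can be illustrated on a toy trophic web. The compartments and flux values are invented, and `outflow` is only one possible first-order reading of such a labelled graph.

```python
# Hypothetical trophic web: arcs labelled with energy flux (arbitrary units).
flux = {
    ("plants", "herbivores"): 100.0,
    ("herbivores", "carnivores"): 10.0,
    ("plants", "detritus"): 50.0,
}

compartments = sorted({node for arc in flux for node in arc})

# Missing arcs: ordered pairs of distinct compartments that carry no flux.
missing = {(a, b) for a in compartments for b in compartments
           if a != b and (a, b) not in flux}


def outflow(node):
    """Total flux leaving a compartment along the labelled arcs."""
    return sum(value for (source, _), value in flux.items() if source == node)
```

Of the twelve possible directed arcs among four compartments, nine are missing. In this toy web, for instance, nothing flows from carnivores back to plants, which is the kind of constraint on evolution that the missing arcs express.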

Idealised phenomena and systems
The previous steps take us a long way down the modelling avenue. However, previous knowledge affects how we perceive and think about what we 'see' and study. Hence, we never start an inquiry into nature ab initio. Instead, we start from a simpler, idealised version of the phenomenon where things are chosen from a collection of cherished objects developed long ago: the thing-types.
When approaching a phenomenon, modellers almost always see it already in such simplified form, where the things th ∈ T(F) come from this relatively small collection of standardised components. For instance, interacting things in Nature may be sub-atomic particles, grains, molecules, atoms, waves, celestial bodies, fluids, organelles, macro-molecules, organisms, artefacts, reactants, substrates, products, populations, collectivities, firms, industries, environments and so on.
By contrast, in science and modelling we refer to particles, fields, bodies, substances, individuals, populations, systems, organisations, or institutions. The second list above may be as long as the first, but it is finite, while the first is potentially unbounded and limited solely by Nature's creativity. Some words or things appearing in both lists have slightly different meanings. In models, we refer to the particles of a fluid or body, never to their molecules, macro-molecules or aggregates. Also, we talk about the individuals in a population, never about persons in a collectivity, not to mention a "population of molecules".
A complete enrolment of both lists and a deeper discussion of their distinctions fall outside the scope of this paper. Objects in the second list are what are called here thing-types.
Definition 5 A thing-type is a typical exemplar of a class of things commonly encountered in scientific studies of natural phenomena. Each type is characterised by a small set of type-specific attributes which never change.
Nevertheless, these attribute-sets can be decorated on an ad hoc basis with other attributes relevant to understanding a phenomenon.
Consider, for instance, billiard balls on a table. Idealised particles have only position and no volume as type-specific attributes. Parameters like mass are required to understand and explain their motion. Other parameters, the decorations, may or may not be necessary, depending on how sophisticated their interaction descriptions will be. Knowing the radius of the balls allows for a better description of interaction results. An idealised phenomenon is described solely in terms of thing-types and their interactions. Thing-types unveil some amazing facts about nature: for instance, individuals are particles that decide. Idealised phenomena are more than half-way to establishing a system to study.
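The billiard-ball example suggests a simple data structure: a fixed set of type-specific attributes, parameters needed to explain behaviour, and optional decorations. The class names and the numeric values (in kilograms and metres) are hypothetical, and this is only one of many ways to organise such a description.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ThingType:
    """A thing-type with its small, immutable set of type-specific attributes."""
    name: str
    type_attributes: frozenset


@dataclass
class Thing:
    """An instance of a thing-type inside an idealised phenomenon."""
    kind: ThingType
    parameters: dict                                  # needed to explain behaviour
    decorations: dict = field(default_factory=dict)   # ad hoc extra attributes

    def attributes(self):
        """All attribute names currently attached to this thing."""
        return (set(self.kind.type_attributes)
                | set(self.parameters) | set(self.decorations))


PARTICLE = ThingType("particle", frozenset({"position"}))

# A billiard ball seen as an idealised particle with mass (illustrative value).
ball = Thing(PARTICLE, parameters={"mass": 0.17})

# A more sophisticated interaction description may call for the radius.
ball.decorations["radius"] = 0.028
```

Freezing `ThingType` mirrors the statement that type-specific attributes never change, while decorations remain an open dictionary that can grow with the sophistication of the interaction description.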
Interaction graphs stem directly from phenomena and Definition 4 above refers to and is centred on things. Considering that mathematical relations, graphs and (hyper-)graphs are tightly related (Schmidt and Ströhlein 1993) and also invoking the systems science definition of a system (Klir 2001), we can say that interaction graphs and systems are the same concept whenever just thing-types are considered as things and the interaction graph is expressed solely in terms of the minimal attribute-set (variables) of its thing-type instances. One important thing to note, though, is that these variables are not constrained to be numeric, quantitative, or measurable.

Modelling
There is no language or knowledge that allows for a general and encompassing description of the modelling process. Nature's creativity is mesmerising and so is human creativity in inventing formalisms and formal systems. The best way to summarise what comes after thing-types and systems in the roadmap of Fig. 1 is Rosen's modelling relation (Rosen 1991), sketched in Fig. 2. Clearly, $N$ represents a portion of nature and $F_M$ a formal model, that is, a model described in a formal system or language. Cause is a particular case of enchainment while inference is a particular type of reasoning. On the left side, the difference is that cause requires a temporal alignment while enchainment does not: if a causes b, a occurs before b and directly provokes the occurrence of the latter. On the right side, inference strictly abides by the inference rules of $F_M$, while reasoning allows for creating hypotheses and deducing consequences to be tested against observation.
All previously described modelling steps belong to the codification procedure. Nevertheless, they do not exhaust it. De-codification is even harder to describe and encompasses the ontological interpretation of whatever a model outputs. In principle, this diagram commutes. Roughly speaking, inference emulates cause and reasoning emulates enchainments.
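What it means for the diagram to commute can be shown on a deliberately small case: a dropped object whose fallen distance is predicted by the usual free-fall formula. The recorded observations are invented, and the functions `encode`, `infer` and `decode` are hypothetical stand-ins for codification, inference and de-codification, which in practice are far richer than this.

```python
G = 9.81  # m/s^2, standard gravity (a parameter of the formal model)

# Hypothetical observations of a dropped object: (time in s, fallen distance in m).
observed = [(0.0, 0.00), (0.5, 1.22), (1.0, 4.91)]


def encode(observation):
    """Codification: keep only the elapsed time as the formal state."""
    time, _ = observation
    return time


def infer(time):
    """Inference inside the formal model: d = g * t**2 / 2."""
    return 0.5 * G * time ** 2


def decode(distance):
    """De-codification: read the formal result as a fallen distance in metres."""
    return distance


def commutes(tolerance=0.05):
    """The diagram commutes (approximately) when decoded inferences agree
    with what nature produced, here the recorded distances."""
    return all(abs(decode(infer(encode(obs))) - obs[1]) <= tolerance
               for obs in observed)
```

With a tolerance of 5 cm the two paths agree for these data; with a tolerance of 1 mm they do not, which illustrates that commutation is always judged relative to the observers and apparatuses involved.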
This said, to model behaviour, we need to wisely choose a formal system or language, $F_S$, in which to build a formal model $F_M$. This is commonly done by inertia, choosing the $F_S$ with which we are best acquainted. This approach frequently leads to poorer descriptions and inferences compared with what is obtained by experimenting and looking around at other possibilities. As an aperitif, I list some of the mathematical and computational formal systems available, which are nevertheless related (Kalman et al. 1969).
Each mathematical discipline entails a formalism that is for the most part an enriched form of formal system. Lying on the shelf, one finds many options that interrelate swiftly and in interesting manners (Mac Lane 1986; Kritz and dos Santos 2011): differential equations, ordinary and partial, functional equations and inequations, algebraic structures, graphs and hyper-graphs, iterative equations, topological structures, differential geometry, games and differential games, to cite a few. Most have been used to model one thing or another.
It is possible, but not wise, to say the same of each programming language. There are far too many programming languages, each designed with certain questions in mind. To list some modelling possibilities, it is better to consider programming paradigms: functional, procedural, object-oriented, event-driven, declarative, adaptive agents, program specification and so on. Still a lot, but much better. Each paradigm encompasses a large number of programming languages, but the way of thinking and dividing the tasks within each does not change much. That is why they are considered paradigms and why it is wise to start your choice of programming language by examining them. Paradigms are also much closer to the many intermediate models that appear along the modelling workflow.

Mathematical and computational modelling
Both mathematical and computational models can be constructed along the lines unveiled by Rosen's modelling relation when adequate mathematical or computational formal systems, languages and contexts, as above, are chosen. This approach, not widely used, provides computational models encoded directly from phenomena. However, more often than not a mathematical model is established first and later transformed into a computational model through the application of a plethora of existing, well-tested techniques. The following diagram summarises this workflow.
In diagram (4), Nat represents all natural phenomena, Mod is the category of all models, M M is the sub-category of mathematical models, M M( ) is a sub-category of Mod consisting of potentially computable models, $T_i$, $i = 1, 2$, are justification theories (Kritz and Béziau 2011), Alg is the category of algorithms, and P the class of all computer programs writable in the programming language $L_P$ for a chosen computer architecture Org(C). This is the preferred approach for modelling physical and chemical phenomena, for which robust and widely tested mathematical models have long existed; an approach that started to be developed even shortly before computers appeared in the middle of the last century. (The calculations were made by hand with mechanical calculating machines.) A reasonably solid treatment of this workflow goes far beyond the goals of this article. Indeed, it is the subject of several books that address the manifold possibilities concealed in the workflow suggested by diagram (4).
The number and variety of these possibilities correlate tightly with the combinations of mathematical disciplines, algorithmic formal systems, and programming paradigms. The important thing to retain is that $T_1$ unveils phenomenological parameters, while $T_2$ introduces parameters that connect the various intermediate models along the production line of diagram (4) and that do not relate to the phenomenon itself. Last but not least, it is worth noting that parameters need not be numbers, as exemplified by the parameters $[L_P, Org(C)]$.
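The distinction between the two kinds of parameters can be seen in a small instance of the workflow from mathematical model to algorithm to program. The sketch takes the harmonic oscillator u'' + ω²u = 0 as the mathematical model and a semi-implicit Euler scheme as the algorithm. The choice of scheme, the function name and the numeric values are assumptions made for illustration and are not prescribed by the text.

```python
import math


def simulate(omega, h, steps, u0=1.0, v0=0.0):
    """Semi-implicit (symplectic) Euler scheme for u'' + omega**2 * u = 0.

    omega : phenomenological parameter (angular frequency)
    h     : step size, a parameter introduced by the discretisation only
    Returns the approximate displacement after `steps` steps.
    """
    u, v = u0, v0
    for _ in range(steps):
        v -= h * omega ** 2 * u   # update velocity with the current position
        u += h * v                # update position with the new velocity
    return u


omega = 2.0 * math.pi             # one oscillation per unit of time (illustrative)
exact = math.cos(omega * 1.0)     # analytical solution at time 1 for u0=1, v0=0

coarse = simulate(omega, h=0.01, steps=100)
fine = simulate(omega, h=0.001, steps=1000)
```

Here `omega` belongs to the phenomenon, whereas `h` exists only because the continuous model was turned into an algorithm. Refining `h` moves the computed value closer to the analytical one without changing anything about the oscillator itself.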

Conclusion
M.A. Raupp was a skilful mathematician who moved to the politico-administrative side of science (Vieira Kritz 2022). Jim Douglas Jr., his advisor, told me in the early 2000s that the result he obtained in his thesis was only superseded 25 years later. Few can claim such a record in their vitae. But he enjoyed mathematics throughout all scientific disciplines. My thesis resulted from the non-iterative fixed-point algorithms, rather new at that time, that were being developed within the economic sciences milieu. I wish to publicly express my gratitude to him, first, for having convinced me to undertake doctoral work and, secondly, for agreeing to advise this work and introduce me to the inner workings of science.
However, my involvement with modelling did not spring directly from my thesis but from the adventure he threw me into when he sent me to Manaus, capital of Amazonas state and central to the region, to discuss the modelling of artificial lakes that were being planned in the Amazon landscape to produce electrical power. The aim was to assess their ecological impact and sustainability. However, the Amazon landscape is extreme in ecological terms: the soil has no nutrients, the biodiversity is high, and biotic factors are strongly coupled and interdependent. Moreover, the landscape provokes variations in the interaction possibilities due to the annual floods and other factors, which means that interaction graphs vary. Furthermore, field research unveiled non-trivial couplings between the forest and atmospheric phenomena by the end of the 1980s (Kritz et al. 2008).
Studying the Amazon has immersed me deep in trans-disciplinary subjects and taught me about aspects of the modelling process not generally addressed, and indeed not generally needed at all for physical and chemical phenomena. These aspects of the modelling process are nevertheless vital for modelling any phenomena in Weaver's organised complexity class (Weaver 1948), particularly those where the presuppositions underlying existing models fail. Although constrained by space, they lie at the core of this text. I acknowledge that some statements above are not completely supported by the arguments presented, due to space constraints, and rely too much on readers' background. I will be glad to further explain them individually or in future work.