Using statistical methods to model the fine-tuning of molecular machines and systems

Fine-tuning has received much attention in physics, where it refers to the claim that the fundamental constants of physics are finely tuned to the precise values needed for a rich chemistry and for life to be permitted. It has not yet been applied in a broad manner to molecular biology. However, in this paper we argue that biological systems present fine-tuning at different levels, e.g. functional proteins, complex biochemical machines in living cells, and cellular networks. This paper describes molecular fine-tuning, how it can be used in biology, and how it challenges conventional Darwinian thinking. We also discuss the statistical methods underpinning fine-tuning and present a framework for such analysis.


Introduction
Fine-tuning has received much attention in physics, and many studies have been completed since Brandon Carter presented his first results at the conference honoring Copernicus's 500th birthday (Carter, 1974). Luke Barnes has published a good review paper on the fine-tuning of the universe (Barnes, 2012), and Lewis and Barnes wrote an up-to-date book (2016). This naturally raises the question whether it is appropriate to introduce and address fine-tuning in biology as well.
The term fine-tuning is used to characterize sensitive dependences of functions or properties on the values of certain parameters (cf. Friederich, 2018). While technological devices are fine-tuned products of actual engineers and manufacturers who designed and built them, only sensitivity with respect to the values of certain parameters or initial conditions is considered sufficient in the present paper. We call an object fine-tuned if it has two properties: it must a) be unlikely to have occurred by chance, under the relevant probability distribution (i.e. be complex), and b) conform to an independent or detached specification (i.e. be specific).
The notion of design is also widely used within both historic and contemporary science (Thorvaldsen and Øhrstrøm, 2013). The concept needs a description for its use in our setting. A design is a specification or plan for the construction of an object or system, or the result of that specification or plan in the form of a product. The very term design is from the Medieval Latin word "designare" (denoting "mark out, point out, choose"); from "de" (out) and "signum" (identifying mark, sign). Hence the association with a public notice that advertises something or gives information. The design usually has to satisfy certain goals and constraints. It is also expected to interact with a certain environment, and thus be realized in the physical world. Humans have a powerful intuitive understanding of design that precedes modern science. Our common intuitions invariably begin with recognizing a pattern as a mark of design. The problem has been that our intuitions about design have been unrefined and pre-theoretical. For this reason, it is relevant to ask ourselves whether it is possible to turn the tables on this disparity and place those rough and pre-theoretical intuitions on a firm scientific foundation.
Fine-tuning and design are related entities. Fine-tuning is a bottom-up method, while design is more like a top-down approach. Hence, we focus on the topic of fine-tuning in the present paper and address the following questions: Is it possible to recognize fine-tuning in biological systems at the levels of functional proteins, protein groups and cellular networks? Can fine-tuning in molecular biology be formulated using state-of-the-art statistical methods, or are the arguments just "in the eyes of the beholder"?

Statistical methods
The real world is complicated, and scientific models must handle it by simplifying matters, approximating, and focusing on some aspects of a structural or numerical investigation, namely the aspects that interest us. Mathematical models have proven invaluable in several fields of both science and engineering (Quarteroni, 2009). In biology, they provide structured abstractions that enable the study of design, organization and evolution of biological systems. Whenever we use mathematics to study observed phenomena, we must begin by building either a deterministic or a stochastic model to represent them; these are the two main types of mathematical framework used in science.
For a large number of situations the deterministic mathematical model will suffice. However, there are also many phenomena which require a different mathematical model for their investigation, stochastic (often called probabilistic) models. A model is stochastic when it is able to represent different choices and to provide information on the probability of these choices. It differs from deterministic models, where the conditions determine the actual outcome and no choices are represented. The randomness of a stochastic model is either epistemic or ontological. Epistemic randomness represents our lack of knowledge within a deterministic framework, whereas ontological randomness corresponds to a more fundamental uncertainty. Even if all the initial conditions of an experiment were known, a model with ontological randomness would still only provide probabilities for a range of possible observable outcomes (Coffman, 2014).
In order to summarize all possible ways to choose the outcome of a stochastic model, with different probabilities, a distribution is used. This distribution (or likelihood) typically involves some unknown parameters (such as the mean or standard deviation). Each possible parameter setting gives rise to a different stochastic model. The collection of all such stochastic models is usually referred to as a statistical model. The objective of statistical inference is not to predict the randomness of a stochastic model (whether epistemic or ontological). The best we can do is to infer (or estimate/test) the values of the unknown parameters, and based on this estimate the probabilities of a certain event A, which represents a specific collection of possible outcomes.
Within statistical modeling there are two main traditions for doing this, the Frequentist and Bayesian schools (see Fig. 1), which differ in the way they treat parameters. Frequentists generally consider parameters to be fixed but unknown. Probabilities are interpreted as the fraction of times an event occurs, if it is possible to repeat an experiment a large number of times under identical circumstances. Bayesians instead assign probability distributions to parameters, according to a prior distribution, which represents either subjective beliefs or prior knowledge. In any case, there is a modeled continuity between past and present in Bayesian statistics, since new observations are used to update subjective beliefs or prior knowledge into a posterior distribution according to Bayes' Rule. Consequently, the posterior distribution also takes the observed outcomes of the experiment into account. A Bayesian speaks of the probability of a parameter or a theory $\theta$, while a true frequentist can speak only of the consistency of the evidence with the parameter or the theory, through hypothesis testing or confidence regions. While there is a fundamental philosophical difference between the frequentist and Bayesian approaches, many statisticians use both models, depending on the type of problem they study.
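The contrast between the two schools can be illustrated with a minimal sketch for a single unknown proportion. The data (3 successes in 20 trials), the uniform Beta(1, 1) prior and all function names are assumptions chosen for illustration; they do not come from the studies discussed in this paper.

```python
from fractions import Fraction

def frequentist_estimate(successes: int, trials: int) -> Fraction:
    """Point estimate of a fixed but unknown proportion: the observed fraction."""
    return Fraction(successes, trials)

def bayesian_posterior(successes: int, trials: int,
                       prior_a: int = 1, prior_b: int = 1) -> tuple[int, int]:
    """Update a Beta(prior_a, prior_b) prior with binomial data.

    Returns the parameters (a, b) of the Beta posterior distribution.
    """
    return prior_a + successes, prior_b + (trials - successes)

def beta_mean(a: int, b: int) -> Fraction:
    """Mean of a Beta(a, b) distribution."""
    return Fraction(a, a + b)

# Hypothetical data: 3 successes observed in 20 independent trials.
k, n = 3, 20
print("frequentist point estimate:", frequentist_estimate(k, n))
a, b = bayesian_posterior(k, n)
print("posterior distribution: Beta(%d, %d)" % (a, b))
print("posterior mean:", beta_mean(a, b))
```

The frequentist output is a single number attached to a fixed parameter, whereas the Bayesian output is a whole distribution for the parameter, of which the posterior mean is only one summary.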
Nonparametric statistics is a way to relax assumptions on the distribution of outcomes of a stochastic model. The word is actually a misnomer, since infinitely many (or a very large number of) parameters are used in these kinds of models to represent the greater uncertainty of how data is distributed, so that data to a larger extent "speaks for itself". Although nonparametric statistics was first developed within a frequentist setting, it is actually consistent with a Bayesian approach as well.
Bayesian statistics was pioneered through the work of Thomas Bayes (who introduced Bayes' Rule) and Pierre-Simon Laplace. It was the prevailing view of statistics throughout the 19th century. Then, through the work of Ronald Fisher, Jerzy Neyman, Egon Pearson and others, frequentist statistics came to dominate during most of the 20th century. More recently, Bayesian statistics has seen an upswing, not least through the development of effective simulation methods, such as Markov chain Monte Carlo and Approximate Bayesian Computation, which enable complex models to be studied within a Bayesian framework (Berger, 1985; Lehmann and Casella, 1998; Gilks et al., 1996).
Both schools have impressive records of successful application. Classical frequentist statistics is well suited for designed repeatable experiments. It has the larger record because numerous results, tailored for these methods, were obtained with mechanical calculators and printed tables of special statistical distribution functions. Bayesian methods have been highly successful in the analysis of information that is naturally sequentially sampled (like radar and sonar). They have also been applied within such diverse areas as philosophy or religion and social science, for instance in order to analyze complicated decision making, where debates and other types of social interactions are taken into account (Korb, 2003; Colin and Urbach, 2006; Chen et al., 2010; Chandler and Harrison, 2012).
A common task of proving fine-tuning is to demonstrate that a certain event A is very unlikely to occur by chance, that is, to show that the probability $P(A)$ of this event, the prevalence, is small. Typically, A is a classification; that an existing observation is fine-tuned. But it is also possible that A corresponds to a future observation being fine-tuned, a prediction. Regardless of whether A represents a classification or a prediction, a stochastic model (II or IV in Fig. 1) can be used in order to determine the probability $P(A \mid \theta)$ for each parameter value $\theta$, by summing the probabilities of all outcomes included in A. Since the parameter $\theta$ is typically unknown, one needs to estimate it from data.
With a frequentist approach, a point estimate $\hat{\theta} = \hat{\theta}(\text{data})$ is used, and this leads to an estimate of the prevalence. In order to assess the uncertainty of (3), a Bayesian may translate the posterior distribution of the parameter $\theta$ into a posterior distribution of the prevalence $P(A \mid \theta)$. For complex models, whether a frequentist or Bayesian approach is used, it is often the case that $P(A \mid \theta)$ is unknown for all values $\theta$ of the parameter. In this case one typically computes an estimate $\hat{P}(A \mid \theta)$ of $P(A \mid \theta)$ and then inserts it into (2) or (3).
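The displayed equations (1)-(3) referred to in this section are not reproduced in the present text. The following is a reconstruction from the verbal descriptions only, offered as an assumption rather than as the exact original formulas: (1) sums the likelihood over the outcomes in A, (2) plugs in the frequentist point estimate, and (3) averages over the posterior distribution. The posterior notation $P(\theta \mid \text{data})$, the symbol $\hat{P}(A)$ and the integral form of (3) are assumed.

```latex
\begin{align}
P(A \mid \theta) &= \sum_{x \in A} P(x \mid \theta), \tag{1}\\
\hat{P}(A) &= P(A \mid \hat{\theta}), \qquad \hat{\theta} = \hat{\theta}(\text{data}), \tag{2}\\
\hat{P}(A) &= \int_{\Theta} P(A \mid \theta)\, P(\theta \mid \text{data})\, d\theta. \tag{3}
\end{align}
```

Under this reading, a deterministic model is the special case of (1) in which $P(A \mid \theta)$ takes only the values 0 and 1.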
Eq. (1) is actually consistent with a deterministic model (I or III in Fig. 1) as well, with $P(A \mid \theta)$ equal to 1 or 0 depending on whether the observed event A is consistent with theory $\theta$ or not. In particular, there is a way of reasoning denoted abductive reasoning (cf. III of Fig. 1) or inference to the best explanation (Walton, 2001). An explanation is a story $\theta$ about an event A that has occurred, and this kind of explanatory inference plays a central role, both in ordinary life and contemporary science. Abduction was introduced by Charles Peirce as a form of logical inference that starts with a set of observations A and seeks to find the simplest and most likely explanation for the observations. Peirce considered it a topic in logic, but not in formal or mathematical logic. Computer science, expert systems and artificial intelligence research frequently employ abduction. In our framework, it can be viewed as a procedure of choosing the hypothesis or theory $\hat{\theta}$ that best explains the available data A, based on some guiding principle. This process yields a plausible conclusion but does not positively verify it. Ernan McMullin (1992) even refers to abduction as "the inference that makes science." Even though the original version of abduction was not stochastic, one still refers to a plausible result as relatively likely to be true, compared to competing hypotheses, given the background knowledge. In III of Fig. 1, this would mean that all the likely hypotheses, theories, explanations or parameter values generate outcomes deterministically within the observed event A, i.e. $P(A \mid \theta) = 1$. In recent years, several statisticians have become interested in a more mathematical version of abduction that is probabilistic in nature, with Bayesian inference as a special case (Douven and Wenmackers, 2017). Some authors have argued that not only is abduction compatible with Bayesianism, it is a much-needed supplement to it (Douven, 2017).
This leads to a probabilistic view of abduction, where past events are analyzed through a stochastic Bayesian model (II), with a distribution being assigned to all possible theories or explanations. The analyst must then assign a prior to all possible explanations, using some criterion such as simplicity or scope. The likelihood, on the other hand, describes the distribution of outcomes for each possible explanation and thereby quantifies whether a theory explains the observed event well or not. In principle we may also frame abduction within a frequentist framework (IV), where all explanations are treated as fixed. One may argue though that a frequentist approach is less appealing, since the past event only happened once, whereas the likelihood within a frequentist framework involves probabilities that require a hypothetical assumption of how the outcome would appear if the experiment was repeated (a counterfactual). With a Bayesian approach, there is more freedom in modeling the distribution of counterfactuals, and hence the likelihood. When a past event is observed before the study begins, statisticians refer to it as an observational study. It is well known that sometimes (but not always) the likelihood of such a study needs to be adjusted in order to account for the way in which the past event was observed, also within a Bayesian approach (Rosenbaum, 2010). A designed experiment, on the other hand, is planned before the outcomes occur, and then the likelihood simply describes the randomness involved in the experiment.

Fig. 1. Deterministic (I, III) and stochastic (II, IV) models. All types of models involve parameters (or more generally possible theories or explanations). For a given parameter, the outcome of a deterministic and stochastic model is non-random and random respectively. The Bayesian models I and II treat the parameter (or collection of parameters) $\theta$ as random, whereas the abduction and frequentist models III and IV treat it as fixed. The outcome $x$ is either completely determined by the parameter (I, III) or it is an observation of a random quantity $X$, with a distribution $P(x \mid \theta)$ (the likelihood) that depends on the parameter (II, IV). The sample space $\Omega$ is the collection of all outcomes that are possible for at least one $\theta$, whereas an event $A \subset \Omega$ is a subset of the sample space, that is, a specific collection of outcomes. The parameter space $\Theta$ is the set of possible values of the parameter, with each $\theta \in \Theta$ giving rise to a different deterministic or stochastic model. For II and IV, the collection of all stochastic models is usually referred to as a statistical model.
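The probabilistic view of abduction described above can be sketched for a finite set of candidate explanations. The explanation labels, priors and likelihood values below are hypothetical and serve only to show the mechanics of Bayes' Rule; a likelihood of 0 or 1 recovers the deterministic case.

```python
def posterior(prior: dict[str, float], likelihood: dict[str, float]) -> dict[str, float]:
    """Bayes' Rule over a finite set of explanations.

    prior[h]      : prior probability of explanation h
    likelihood[h] : probability P(A | h) of the observed event A under h
    """
    joint = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("the observed event is impossible under every explanation")
    return {h: value / total for h, value in joint.items()}

def best_explanation(post: dict[str, float]) -> str:
    """Inference to the best explanation: pick the most probable one a posteriori."""
    return max(post, key=post.get)

# Hypothetical explanations, with priors reflecting e.g. simplicity or scope.
prior = {"h1": 0.5, "h2": 0.3, "h3": 0.2}
# Hypothetical likelihoods P(A | h); h3 cannot produce the observed event at all.
likelihood = {"h1": 0.01, "h2": 0.2, "h3": 0.0}

post = posterior(prior, likelihood)
print(post)
print("best explanation:", best_explanation(post))
```

In this toy example the explanation with the highest prior is not the one selected, because it explains the observed event poorly; the choice of prior and likelihood is where the modeling judgment enters.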

Some historical background of fine-tuning
The biochemist Lawrence Henderson (1878-1942) at Harvard University wrote one of the first books to explore concepts of fine-tuning in the universe (Henderson, 1913). He discusses the significance of water and the environment with respect to living things, arguing that life depends entirely on the very specific environmental conditions on the Earth, particularly with regard to the prevalence and properties of water.
In the 1970s the astrophysicist Brandon Carter worked on a kind of counterfactual analysis of cosmology by asking the question: Suppose the laws of physics had been a bit different from what they actually are, what would the consequences be? (Davies, 2006). Carter was the first to name and employ the term Anthropic Principle, in his important contribution to the 1973 conference in Poland honoring Copernicus's 500th birthday. To his surprise, it turned out that many of the parameters necessary for life to exist in our universe must fall within very narrow margins, or the universe would either not exist or not be able to support life. In his lecture, Carter derived the Anthropic Principle (AP) in reaction to the Copernican Principle, which states that humans do not occupy a privileged position in the universe. As Carter said on Copernicus's birthday: "Although our situation is not necessarily central, it is inevitably privileged to some extent" (Carter, 1974).
The chances that the universe should be life permitting are so infinitesimal as to be incomprehensible and incalculable.
Having said this, it should also be noted that there is also a critique of the Anthropic Principle, referred to as the Weak Anthropic Principle (WAP). WAP states that only in a life-supporting universe will there be living beings around who are able to observe it. In the terminology of Section 2, we say that the act of discovering that we live in a life-permitting universe is part of an observational study, and therefore we have to modify the likelihood accordingly. Although there is some truth in this objection against the AP, it is also problematic. Indeed, if we apply the WAP consistently to other occasions where we discover regular or unexpected patterns, we should never be able to infer fine-tuning or design as an explanation. Philosopher John Leslie gives the picture of a person who very unexpectedly survives a firing squad. Is he then allowed to infer that all the marksmen missed deliberately (because of someone planning this to happen) or not? (Leslie, 1989).
It is hard to give a definitive answer to the number of fine-tuning parameters. Based on the items discussed in Barrow and Tipler's classic book (1988) there are about 100, and the Astronomer Royal Martin Rees lists six dimensionless constants that give overall fine-tuning to the universe (Rees, 1999). The finely tuned universe is like a panel that controls the parameters of the universe with about 100 knobs that can be set to certain values. In the framework of Section 2, the parameter $\theta$ is a vector with 100 components (the knobs), the sample space $\Omega$ is the set of all possible universes (including no universe at all), whereas A is either the set of possible universes, or the set of those possible universes that are also hospitable. If you turn any knob just a little to the right or to the left, the result is either a universe that is inhospitable to life or no universe at all. If the Big Bang had been just slightly stronger or weaker, matter would not have condensed, and life never would have existed. The odds against our universe developing were "enormous", and yet here we are, a point that is associated with religious implications, as expressed by Brian Schmidt at the Australian National University: "Like a Bach fugue, the Universe has a beautiful elegance about it, governed by laws whose mathematical precision is meted out to the metronome of time. These equations of physics are finely balanced, with the constants of nature that underpin the equations tuned to values that allows our remarkable Universe to exist in a form where we, humanity, can study it. A slight change to these constants, and poof, in a puff of gedanken experimentation, we have a cosmos where atoms cease to be, or where planets are unable to form. We seem to truly be fortunate to be part of Our Universe" (Lewis and Barnes, 2016, p. xi).
What Brian Schmidt refers to as a "gedanken experiment" is often called "multiuniverses", i.e. an enormous supply of universes, each one a little different. There is a subtle difference between the set of possible universes $\Omega$ referred to above (of which one is assumed to exist), and a multiverse theory, which holds that some or all of these universes exist in parallel. This multiverse hypothesis is not backed up by any empirical support, and may be regarded as a rather speculative idea.
A probabilistic argument presumes adequate knowledge of (the limits on) the space of possibility. It presupposes that current knowledge provides an accurate, unbiased statistical account of, or means of determining, what may or may not happen by chance. As Colyvan et al. (2005) and Dembski (2014, pp. 128-129) have argued, the fine-tuning argument for our universe is not a strict statistical argument, since it involves features that need to be in place before the universe can be said to exist and operate. And there is no way of assigning a reference probability distribution to the universe at that early stage. Probabilities for the initial formation of the universe are by their nature independent of known processes operating in our present universe, i.e. they are "gedanken probabilities". William Dembski, who mainly belongs to the frequentist school of statistics, regards the fine-tuning argument as suggestive, as a pointer to underlying design. We may describe this inference as abductive reasoning or inference to the best explanation. This reasoning yields a plausible conclusion that is relatively likely to be true, compared to competing hypotheses, given our background knowledge. In the case of fine-tuning of our cosmos, design is considered to be a better explanation than a set of multi-universes that lacks any empirical or historical evidence. If the existence/habitability of a universe follows deterministically from the fine-tuned initial conditions, such a frequentist approach leads to a model for the physical universe that is essentially deterministic (cf. III of Fig. 1). A Bayesian approach, on the other hand, corresponds to a model with deterministic outcomes for each parameter, but randomness still enters in the choice of parameters (cf. I of Fig. 1).
As noted in Section 2, a more general type of abductive reasoning is closely related to a Bayesian stochastic model. By applying methods from Bayesian statistics, several authors have framed a stronger conclusion than Dembski. Robin Collins (2012), Richard Swinburne (2012) and Vesa Palonen (2008) give the fullest and most up-to-date detailed account of the argument, and conclude that the possible existence of a multiverse does not greatly diminish the powerful force of the argument from fine-tuning to the existence of design. The main argument of their Bayesian analyses is that even under a multiverse we should use the proposition "this universe is fine-tuned" as data, even if we do not know the "true standing" of our universe. Since multiverse hypotheses do not predict fine-tuning for this particular universe any better than a single universe hypothesis, it follows that multiverse hypotheses are not plausible explanations for fine-tuning. Therefore, our data on cosmic fine-tuning does not offer support to the multiverse hypotheses. For physics in general, irrespective of whether there really is a multiverse or not, the rational consequence of the above discussion is that we should prefer those theories which best predict (for this or any universe) the phenomena we observe in our universe.
One of the surprising discoveries of modern biology has been that the cell operates in a manner similar to modern technology, while biological information is organized in a manner similar to plain text. Words and terms like "sequence", "code", "information", and "machine" have proven very useful in describing and understanding molecular biology (Wills, 2016). The basic building blocks of life are proteins, long chain-like molecules consisting of varied combinations of 20 different amino acids. Complex biochemical machines are usually composed of many proteins, each folded together and configured in a unique 3D structure dependent upon the exact sequence of the amino acids within the chain. Proteins employ a wide variety of folds to perform their biological function, and each protein has a highly specified shape with some minor variations.
In the 1990s, a large number of publications and proceedings started to appear, with the book "Evidence of Purpose", edited by Sir John Marks Templeton with papers from 10 distinguished scientists, as one of the first (Templeton, 1994). Michael Behe and others presented ideas of design in molecular biology, and published evidence of "irreducibly complex biochemical machines" in living cells. In his argument, some parts of the complex systems found in biology are exceedingly important and do affect the overall function of their mechanism. The fine-tuning can be outlined through the vital and interacting parts of living organisms. In "Darwin's Black Box" (Behe, 1996), Behe exemplified systems, like the flagellum bacteria use to swim and the blood-clotting cascade, that he called irreducibly complex, configured as a remarkable teamwork of several (often a dozen or more) interacting proteins. Is it possible, on an incremental model, for such a system to evolve toward a function that does not yet exist? Many biological systems do not appear to have a functional viable predecessor from which they could have evolved stepwise, and the probability of their occurring in one leap by chance is extremely small. To rephrase the first man on the moon: "That's no small steps of proteins, no giant leap for biology." Living forms exhibit structures and functions that can best be understood as nano-level engineering. In 1998 Bruce Alberts, president of the National Academy of Sciences, published an important paper preparing the next generation of molecular biologists: The Cell as a Collection of Protein Machines (Alberts, 1998).

Main results and discussion
In this section, we will present and discuss some relevant observations from experimental biology. This will be done in the light of the theory of stochastic models, outlined in Section 2. More specifically, we will identify events A whose probability $P(A)$ is very low under naturalistic stochastic models, and argue that these represent extreme examples of fine-tuning.

Functional proteins
Natural proteins are known to fold only to a limited number of folds. The designability of a structure is defined as the number of sequences folding to the structure (Zhang et al., 2014). Some of these folds are frequently occurring and often referred to as highly designable, whereas some others are rarely observed and are less designable. Li et al. (1996) first introduced this concept of protein designability. One interesting aspect of their study was that the structures differed strongly in designability, and highly designable structures were only a small fraction of all structures.
An important goal is to obtain an estimate of the overall prevalence of sequences adopting functional protein folds, i.e. the right folded structure, with the correct dynamics and a precise active site for its specific function. Douglas Axe worked on this question at the Medical Research Council Centre in Cambridge. The experiments he performed showed a prevalence between 1 in $10^{50}$ and 1 in $10^{74}$ of protein sequences forming a working domain-sized fold of 150 amino acids (Axe, 2004). Hence, functional proteins require highly organised sequences, as illustrated in Fig. 2. Though proteins tolerate a range of possible amino acids at some positions in the sequence, a random process producing amino-acid chains of this length would stumble onto a functional protein only about once in every $10^{50}$ to $10^{74}$ attempts due to genetic variation. This empirical result is quite analogous to the inference from fine-tuned physics. That is, we may regard the space $\Omega$ of all possible proteins as the outcomes of a stochastic model, where each outcome is a string of letters (amino acids). The prevalence $P(A_p)$ is the probability of the event $A_p$ that a randomly chosen amino acid sequence leads to a functional protein (or more generally a protein with some characteristic patterns), whereas $\theta_p$ involves all biochemical constants of relevance for protein formation.
The experimental results reported by Douglas Axe are empirical studies of a single protein that typically would be involved as one of the constituent parts of a coherent Behe-system (see Section 4.2). Protein sequence space may look like a limitless desert of maladjusted sequences with only a few oases of working sequences, as outlined by Axe. Another study examines the probability of finding ATP binding proteins from a random sample of sequence space regardless of the fold (Ferrada and Wagner, 2010). The authors estimated a probability of 1 in $10^{11}$ to find an ATP binding protein, suggesting a higher probability than found by Axe. Recently Kozulic and Leisola (2015) made careful analyses of these results, and concluded that even with very conservative conditions, the probability of finding ATP binding activity that would function in a cell would be less than 1 in $10^{32}$. Estimates like these depend on various factors (the components of the parameter vector $\theta_p$), including the length of the proteins considered. They indicate that the probability of finding a functional protein in sequence space can vary broadly, but commonly remains far beyond the reach of Darwinian processes (Axe, 2010a). Some authors have even suggested that the original amino acid repertoire consisted of only four or five amino acids, in order to reduce the gigantic sequence space, and "rule out the big number game" (Dryden et al., 2008). However, this would need another type of genetic code, something considered highly speculative. Hence, for a typical functional protein we may state experimentally:

The functional protein arguments outlined above are empirical studies based on a standard statistical estimation of prevalence, using either a frequentist or a Bayesian framework (2)-(3). Such studies are commonly performed within scientific research by Monte Carlo estimates of the prevalence (cf. the discussion below (3)), examining a randomly selected sample from the entire population.
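A Monte Carlo estimate of prevalence of the kind mentioned above can be sketched on a toy sequence space. Everything specific here is an assumption made for illustration: the sequence length of 6, the hypothetical "functional" criterion (two fixed residues) and the function names. It is not a model of any of the cited experiments.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids
LENGTH = 6                          # toy sequence length (assumption)
REQUIRED = {0: "M", 3: "K"}         # hypothetical criterion: fixed residues at two positions

def is_functional(seq: str) -> bool:
    """Toy stand-in for a functional assay: checks the required residues."""
    return all(seq[pos] == aa for pos, aa in REQUIRED.items())

def exact_prevalence() -> float:
    """Exact P(A) when residues are drawn independently and uniformly."""
    return (1 / len(ALPHABET)) ** len(REQUIRED)

def estimate_prevalence(n_samples: int, seed: int = 0) -> float:
    """Monte Carlo estimate: fraction of random sequences that pass the criterion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        seq = "".join(rng.choices(ALPHABET, k=LENGTH))
        hits += is_functional(seq)
    return hits / n_samples

if __name__ == "__main__":
    print("exact prevalence:    ", exact_prevalence())
    print("Monte Carlo estimate:", estimate_prevalence(200_000, seed=1))
```

With N random samples, events whose prevalence is much smaller than 1/N will typically not be observed at all, so very small prevalences cannot be obtained by direct sampling of this kind and require additional modeling or experimental extrapolation.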
Using such estimates, the proteins of life are found to be specific kinds of events with low probability. Notice however that the prevalence will depend on how the stochastic model of protein formation is built. The simplest approach is to choose the amino acids of the protein sequence independently and randomly, as above. A more refined approach is to model protein evolution (as briefly discussed in Section 5). Randomness is then built into an ancestral tree of proteins, whose dynamics is driven by random drift through reproduction, random mutations and natural selection. The parameters $\theta_p$ of such a model include the size of the protein population, the effective population size, mutation rates and the fitness of organisms carrying a certain protein, where organisms with well-functioning proteins are assigned a higher fitness. Axe also elaborates on the massive improbabilities of anything like functional proteins arising by natural selection (Axe, 2016). The search space turns out to be far too vast for blind selection to have even a slight chance of success. The contrasting view is that innovations are based on ingenuity, cleverness and intelligence. An element of this is what Axe calls "functional coherence", which always involves hierarchical planning, and hence is a product of fine-tuning. He concludes: "Functional coherence makes accidental invention fantastically improbable and therefore physically impossible" (Axe, 2016, p. 160).
Life as it is today is an interdependent DNA-protein world (Voie, 2006). However, RNA molecules can function both as enzyme ("protein") and as replicator ("DNA"). Eugene Koonin (2007, 2012) has made a theoretical study of the path from a putative RNA world to an explicit translation system (like a "DNA-protein world"). He found this path to be incredibly steep (Koonin, 2012, p. 376), even under the best-case scenario. Koonin studied the requirements for a specified coupled replication-translation RNA system to emerge, after our universe was formed, within an O-region of planets. Assuming that the replication-translation RNA system corresponds to an n-mer with $n = 1800$ nucleotides, he calculated vanishingly small odds for it to emerge within a time interval of length $t = 3 \times 10^{17}$ seconds after the Big Bang. The quantity $E(T) = 4^{n} / (10^{21} \times 5 \times 10^{22})$ in the denominator of (5) is the expected waiting time until the first coupled replication-translation RNA system emerges by chance somewhere among the $10^{21}$ planets of the O-region. It is assumed that each one of these planets has the same dimension as the earth, and a rate of $5 \times 10^{22}$ molecules per second at which n-mers are formed within its habitable layer. Koonin raises a rather speculative solution of an infinite multiverse: the Many Worlds in One (MWO). This changes the very definition of what is possible and likely in such a way that the probability of the realization of any scenario in an infinite multiverse is 1. The odds do not matter anymore. Nevertheless, Koonin has presented a detailed calculation of a threshold for biological evolution. He also states that the RNA world hardly has the potential to evolve beyond very simple "organisms" (Koonin, 2012, p. 366).
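The order of magnitude implied by the quantities quoted above can be checked with a short calculation in log10 space, since the numbers underflow ordinary floating point. Equation (5) itself is not reproduced in this text; the sketch assumes that it approximates the odds by the ratio t / E(T), which is consistent with E(T) appearing in its denominator. The function name and the default arguments are illustrative.

```python
import math

def log10_emergence_odds(n: int = 1800,
                         t_seconds: float = 3e17,
                         planets: float = 1e21,
                         rate_per_planet: float = 5e22) -> float:
    """log10 of t / E(T), with E(T) = 4**n / (planets * rate_per_planet).

    Assumes the odds are approximated by t / E(T); this is an assumption
    about the form of the equation, not a quotation of it.
    """
    log10_sequences = n * math.log10(4)  # number of distinct n-mers
    log10_expected_wait = (log10_sequences
                           - math.log10(planets)
                           - math.log10(rate_per_planet))
    return math.log10(t_seconds) - log10_expected_wait

if __name__ == "__main__":
    value = log10_emergence_odds()
    print(f"log10 of odds under the stated assumptions: {value:.2f}")
```

Under these assumptions the exponent is dominated by the term n log10(4), about 1084 for n = 1800, so changing the number of planets or the formation rate by a few orders of magnitude barely affects the result.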

Protein complexes
Proteins rarely work alone. They can interact with a variety of different molecules, but it is their simultaneous interactions with one another at the same location that account for many of the functions of the cell (Jones and Thornton, 1996). Proteins in a protein complex are linked by non-covalent protein-protein interactions. Protein complexes are a form of quaternary structure. These complexes are fundamental in many biological processes and together they form various types of molecular machinery that perform a vast array of biological functions. Protein assemblies are at the basis of numerous biological machines by performing actions that none of the individual proteins would be able to do. There are thousands, perhaps millions of different types and states of proteins in a living organism, and the number of possible interactions between them is enormous. Proper assembly of multiprotein complexes is important, and change from an ordered to a disordered state leads to a transition from function to dysfunction of the complex. Some protein complexes can be quite constant and exist for the lifetime of the cell while others can be transient, accumulated for some purpose and broken down when no longer needed. A Behe-system of irreducible complexity was mentioned in Section 3. It is composed of several well-matched, interacting modules that contribute to the basic function, wherein the removal of any one of the modules causes the system to effectively cease functioning.
Behe does not ignore the role of the laws of nature. Biology allows for changes and evolutionary modifications. Evolution is there, irreducible design is there, and they are both observed. The laws of nature can organize matter and force it to change. Behe's point is that there are some irreducibly complex systems that cannot be produced by the laws of nature: "If a biological structure can be explained in terms of those natural laws [reproduction, mutation and natural selection] then we cannot conclude that it was designed... however, I have shown why many biochemical systems cannot be built up by natural selection working on mutations: no direct, gradual route exist to these irreducible complex systems, and the laws of chemistry work strongly against the undirected development of the biochemical systems that make molecules such as AMP¹" (Behe, 1996, p. 203).
¹ AMP: Adenosine monophosphate is a nucleotide that is found in RNA and plays an important role in intracellular signaling. Its delusive function is also used, especially in diabetic products, as a bitterness suppressor.
Then, even if the natural laws work against the development of these "irreducible complexities", they still exist. The strong synergy within the protein complex makes it irreducible to an incremental process. Such complexes are rather to be acknowledged as fine-tuned initial conditions of the constituting protein sequences. These structures are biological examples of nano-engineering that surpass anything human engineers have created. Such systems pose a serious challenge to a Darwinian account of evolution, since irreducibly complex systems have no direct series of selectable intermediates, and in addition, as we saw in Section 4.1, each module (protein) is of low probability by itself.
Extensive arguments have been written about whether or not Darwinian evolution can plausibly explain irreducibly complex systems (Behe, 2001; 2019, pp. 283-301; Miller, 2004; Dembski, 2004; Pallen and Matzke, 2006; Liu and Ochman, 2007; Doolittle, 2012). Irreducible complexity does not mean that irreducibly complex systems are logically impossible to evolve based on existing modules. One cannot definitively rule out the possibility of an indirect, circuitous route. A well-known subsystem of the bacterial flagellum (the type III secretion system, TTSS) performs a function distinct from that of the flagellum. However, finding a subsystem of a functional system that performs some other function is hardly an argument for the original system evolving from that other system. As the complexity of an interacting system increases, the likelihood of such an indirect route drops quickly. Hence, Darwinian explanations of irreducibly complex systems are improbable. Ultimately, this is a question that must be studied both experimentally and by computer simulations. Behe's concept of irreducible complexity has not been falsified by computer models (Ewert, 2014), and there are presently no detailed Darwinian accounts of the evolution of any such biochemical or cellular system, "only a variety of wishful speculations" (Harold, 2003, p. 205).
In the framework of Section 2, the set of all possible protein complexes is regarded as the sample space $\Omega$ of a stochastic model. According to a naturalistic model, the outcomes are generated randomly by evolution, driven by random drift through reproduction, random mutations and natural selection. The prevalence $P(A)$, i.e. the fraction of functioning protein complexes, will typically be even smaller than in Section 4.1, since it requires even more for a complex of proteins to function compared to one single protein.
Indeed, the stochastic model of protein complexes is quite involved, including, for instance, physical interaction. Physical interactions between proteins are specific types of interactions, and a Behe-system may be analyzed by the biochemical principle of complementarity. When a biologically active protein complex consists of more than one separate subunit, the so-called quaternary structure describes the topology of contacts, i.e. how the constituent units come together in space. The surface molecules in such a biological system fit together because of both spatial and electrostatic complementarity. Contours of one subunit of the system are complementary to the contours of the others, and regions of positive excess charge on the surface of one unit must fit closely with regions of negative excess charge on the others, as illustrated in Fig. 3. In addition, hydrophobicity and other physicochemical properties are also involved in the final configuration. The asymmetry between the proteins involved is conventionally divided into "bait" and "prey" (Scholtens et al., 2008). The bait is the protein whose interaction partners we are seeking; the prey proteins are those proteins detected to interact with a particular bait. The basic subunits fit into the multi-subunit system like a big 3D puzzle.
The principle of complementarity was first proposed by Nobel Prize winner Paul Ehrlich. It resonates throughout the whole of biochemistry, and continues to underpin much of modern research into the mode of action of enzymes (Hall, 2000, p. 303). Protein docking and pattern recognition at the molecular level are based on multilevel complementarity (geometry, charge, hydrophobicity, etc.).
Dembski applies the term "Discrete Combinatorial Object" to any of the biomolecular systems which have been defined by Behe as having "irreducible complexity" (Dembski, 2002, pp. 289-302). The Drake equation is an expression often used in astrobiology to estimate the prevalence of active civilizations in our galaxy. By analogy to the Drake equation, Dembski proposes an equation based on three independent events: $A_p$: originating the building blocks (protein chains) of the protein complex (as outlined in Section 4.1), $A_l$: localizing the building blocks in the same place, and $A_c$: configuring the building blocks correctly to form the complex. Then the probability of a protein complex is the multiplicative product of the probabilities of the origination of its constituent parts, the localization of those parts in one place, and the configuration of those parts into the resulting system (contact topology). This leads to the following estimate for the probability of a protein complex (PC) composed of N independent building blocks:
$$P(A_{PC}) = \prod_{n=1}^{N} P\big(A_p \mid \theta_p^{(n)}\big)\, P\big(A_l \mid \theta_l^{(n)}\big)\, P\big(A_c \mid \theta_c^{(n)}\big) \tag{6}$$
where $\theta_p^{(n)}$, $\theta_l^{(n)}$, and $\theta_c^{(n)}$ are the parameters involved in forming the protein chain, the localization and the configuration of the nth building block. Modeling the formation of structures like protein complexes via this three-part process of production, convergence and assembly is of course problematic, because the parameters in the model are very difficult to estimate. Therefore, analogous to the Drake equation, the usefulness of the equation is not in the solving, but rather in the contemplation of all the various concepts which science must incorporate when considering the question of how to explain this kind of complex structure.
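Because the estimate is a product of many small factors, it is natural to evaluate it as a sum of logarithms. The following Python sketch does this for a list of building blocks, assuming (as the text does) that origination, localization and configuration are independent. The function name and all numerical values are hypothetical placeholders chosen only to show the arithmetic; they are not estimates from the literature.

```python
import math

def log10_complex_probability(blocks):
    """Return log10 of the product of per-block probabilities.

    Each block is a triple (p_origination, p_localization, p_configuration).
    Summing logarithms avoids floating-point underflow for tiny products.
    """
    total = 0.0
    for p_orig, p_loc, p_conf in blocks:
        for p in (p_orig, p_loc, p_conf):
            if not 0.0 < p <= 1.0:
                raise ValueError("probabilities must lie in (0, 1]")
            total += math.log10(p)
    return total

if __name__ == "__main__":
    # Hypothetical complex with four blocks; origination and localization
    # are set to 1 so that only the configuration term contributes.
    blocks = [(1.0, 1.0, 1e-3)] * 4
    print(f"log10 P(A_PC) = {log10_complex_probability(blocks):.1f}")
```

Setting two of the three factors to 1, as in the usage example, isolates the contribution of the remaining factor, which is how such an equation is typically used to explore which term dominates.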
Even if we take $P(A_p)$ equal to 1, and thus assume there are no problematic obstacles involved in generating the building blocks, and also eliminate the localization probability by collapsing chance to necessity (self-organization), $P(A_c)$ can still pose huge obstacles to the chance configuration of the quaternary structure of operative biological systems (Csermely et al., 2010). This problem of estimating $P(A_c)$ seems quite intractable, but it may be addressed by performing perturbation experiments (Antal et al., 2009). The idea is to take a functional system, perturb it, and determine how perturbation affects the probability of retaining function. There is much biological work to be done here, empirically and theoretically, and it is important to be open to any type of conclusion from new experiments. For instance, can we allow that irreducible complexity in the present tells us little or nothing about functional precursors in the past?
As we have seen, the stochastic model of protein complexes involves more complicated types of outcomes than the protein models of Section 4.1. Whereas single proteins correspond to strings of amino acids, the protein complexes are often represented as graphs (Fig. 4). Much of the research in studying protein interactions has been done with the use of mathematical graph theory (Chiang et al., 2007; Su et al., 2018). Graph theory is a straightforward and flexible way of representing real interactive systems. The language of graph theory offers a mathematical abstraction for the description of such relationships. An important use of graphs is in statistical modeling. A directed graph model is appropriate for bait-to-prey systems, in which a multinomial error model is used to represent the interactions. Both global and local statistics on the topology of interaction graphs aim to infer the nature and behavior of interactions of the protein complex. Su et al. (2018) have addressed the issue of significance testing procedures for real biological protein complexes. Their statistical studies show that the interactions in such complexes occur much less randomly than expected by chance.
The bait-prey model is in itself a way to model fine-tuning of protein modules. Moreover, the final function of the protein complex is achieved by complementarity between the binding cavity of the protein and its substrate. This involves an additional level of fine-tuned complementarity with respect to the interacting groups of atoms that are involved in the ultimate function of the protein complex, a factor that additionally lowers the prevalence $P(A_{PC})$ of functioning protein complexes. There is also an additional level of information that should be accounted for in a stochastic model of protein complexes. This level of information is embedded in the language of molecular complementarity, which may also be understood as a biosemiotic sign language, i.e. signals written and read at the molecular level. Biosemiotics is in general the study of signs, of communication, and of information in living organisms. Charles Peirce is considered to be one of the founders of semiotics, and hence also of biosemiotics. In biosemiotics, the sign, rather than the molecule, is the basic unit for the study of life (Hoffmeyer, 1997). Our current preferential focus on the genome and amino acid sequences needs to be complemented by a similar focus on the senome (Baluška and Miller, 2018), representing the sum of all the activities of the living cell and its apparatus (Compagno, 2018).

Cellular networks
As Denis Noble states, biological systems function as a full orchestra with its different elements playing ensemble the score of life (Noble, 2006). Protein complexes perform their biological functions in a cooperative manner through their participation in many biological processes and networks, from the nucleus to the cell membrane. Cellular networks are also known to contain feedback loops and cycles. A stochastic model with cellular networks as outcomes is exceedingly complex. However, Bayesian models provide one of the most flexible frameworks for modeling such networks in terms of Dynamic Bayesian networks. In order to describe these structures, modern textbooks often utilize the pedagogical similarities between the cell's network and a modern city, or ''smart city" (Daempfle, 2016).
Studying protein interaction networks of all proteins in an organism (the "interactomes") remains one of the major challenges in modern biology, and constitutes the objective of systems biology (Fig. 5). The reconstruction of cellular networks by statistical methods is a vast and fast-developing area of research, including Bayesian networks, Gaussian graphical models and graph-based methods for data from experimental interventions and perturbations (Markowetz and Spang, 2007). Random graphs may also be used for modeling cellular networks. They are described in terms of a random process that generates them, and the parameters $\theta$ of this random process are chosen so that the edge configuration of the resulting random graph makes sense in comparison to real data. These resulting graphs should capture the fact that genes and gene products are connected in highly organized networks of information flow through the cell, which themselves do not work in isolation. We observe correlations between genes by the presence of other genes. Correlation graphs generate the simplest correlation structures of genes, whereas Bayesian networks encompass a more sophisticated set of models, with more intriguing correlations. Perturbation experiments are key to inferring gene function and regulatory pathways, and a common genetic technique is to perturb a gene of interest and to study which other genes' expression is affected. Several types of perturbations have a large effect on network stability, and a graph theoretical study shows that protein complex interaction networks are non-random networks (Jalan, 2013; Huang et al., 2016, 2019). Low randomness means that the probability of any two randomly chosen nodes being wired to each other is very low or zero.
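The comparison between an observed network and a random-graph model can be sketched as a Monte Carlo test. The Python sketch below assumes the simplest possible null model, an Erdős–Rényi graph whose single parameter (the edge probability) is matched to the observed edge density, and uses the number of triangles as the test statistic. Both choices, as well as the function names, are illustrative assumptions and are not the procedures used in the cited studies.

```python
import itertools
import random

def count_triangles(n_nodes, edges):
    """Count triangles in an undirected graph given as a set of (u, v) pairs with u < v."""
    adj = {v: set() for v in range(n_nodes)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return sum(
        1
        for a, b, c in itertools.combinations(range(n_nodes), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )

def random_graph(n_nodes, p, rng):
    """Erdos-Renyi G(n, p): each possible edge is present independently with probability p."""
    return {
        pair
        for pair in itertools.combinations(range(n_nodes), 2)
        if rng.random() < p
    }

def monte_carlo_p_value(n_nodes, observed_edges, n_sim=1000, seed=0):
    """Estimate P(triangles >= observed) under the density-matched random-graph model."""
    rng = random.Random(seed)
    n_pairs = n_nodes * (n_nodes - 1) // 2
    p = len(observed_edges) / n_pairs  # match the observed edge density
    t_obs = count_triangles(n_nodes, observed_edges)
    hits = sum(
        count_triangles(n_nodes, random_graph(n_nodes, p, rng)) >= t_obs
        for _ in range(n_sim)
    )
    return (hits + 1) / (n_sim + 1)  # add-one correction avoids a p-value of zero

if __name__ == "__main__":
    # Toy "observed" network: two fully connected groups of four nodes each.
    observed = set(itertools.combinations(range(4), 2)) | set(
        itertools.combinations(range(4, 8), 2)
    )
    print(monte_carlo_p_value(8, observed))
```

A small p-value indicates that the observed graph has more clustering than the null model typically produces; richer null models (for example, ones preserving the degree sequence) would be needed before drawing conclusions about real interaction data.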
However, although results such as these indicate the difficulty of random naturalistic processes to generate protein networks, there is still much work to be done before we can make more sense of biological networks in the light of fine-tuning. Network-based analysis falls into the following major categories: (a) motif identification and analysis, (b) global architecture study, (c) local topological properties, and (d) robustness of the network under different types of perturbations.
As we have outlined above, the internal organization of the cell comprises many layers. The genome refers to the collection of information stored in the DNA, while the proteome covers the set of all proteins. The metabolome contains small molecules (sugars, salts, nucleotides, and amino acids) that participate in metabolic reactions required for the maintenance and usual function of a cell, and all the proteins in the cell interact in a great network called the interactome. To understand the complexity of living cells, research will need to build models on all these layers. Statistical modeling of these systems may provide deeper insight into our understanding of the physical and biological universe, as displayed in Table 1.
In the following two sections, we will discuss some further implications and mathematical modeling questions related to fine-tuned systems.

Achieving fine-tuning in a conventional Darwinian model: The waiting time problem
In this section we will elaborate further on the connection between the probability of an event and the time available for that event to happen. In the context of living systems, we need to ask the question whether conventional Darwinian mechanisms have the ability to achieve fine-tuning during a prescribed period of time. This is of interest in order to correctly interpret the fossil record, which is often interpreted as having long periods of stasis interrupted by sudden, abrupt changes (Bechly and Meyer, 2017). Examples of such sudden changes include the origin of photosynthesis, the Cambrian explosion, the evolution of complex eyes and the evolution of animal flight. The accompanying genetic changes are believed to have happened very rapidly, at least on a macroevolutionary timescale, during a time period of length t. In order to test whether this is possible, a mathematical model is needed in order to estimate the prevalence $P(A)$ of the event A that the required genetic changes in a species take place within a time window of length t.
More specifically, in the framework of Section 2 we consider a time interval of length t (typically measured in units of generations) and ask the question whether evolutionary mechanisms (mutations, natural selection, and random genetic drift) may change a DNA-string of nucleotides for a whole population (species), from one pattern to another through a series of m coordinated genetic changes. The outcome x is the evolutionary path of the system from the starting point of the interval, $T = T(x)$ is the time required to bring about a series of m specific changes and A is the set of all outcomes x for which these changes take place within time t. This corresponds to a prevalence $P(A \mid \theta) = P(T(X) \le t \mid \theta)$, where X is random, with a distribution that assigns probabilities to all possible outcomes, according to a population genetic model of the system, whereas $\theta$ includes the parameters of that model, such as the (effective) size of the population, the length of the DNA-string, the mutation rate, the type of genetic changes required in each of the m steps, and the selective fitness of individuals that have acquired $i = 0, 1, \ldots, m$ genetic changes. For instance, if the final target of the evolutionary process is an irreducibly complex system with m subunits, the fitness of the corresponding targeted DNA-string is higher than the fitness of individuals with no genetic changes ($i = 0$), whereas individuals that have acquired $i = 1, \ldots, m - 1$ genetic changes should have even lower fitness than those with no genetic changes.
Table 1. The table gives an overview of scientific data and statistical models. The data structure corresponds to the outcome x of the corresponding model for Proteins, Molecular motors and Cellular networks, or the specificity (functioning) $f(x)$ of this outcome for fine-tuned physics.
The larger the population is, the more difficult it is for deleterious mutations of the intermediate steps to spread and get fixed in the whole population. Therefore, the prevalence $P(A)$ of an irreducibly complex system is extremely small for all but very small populations. It is important here to differentiate between mutational adaptations which are based on internally-coded information and those which are the results of mere chance. More specifically, one or several mutations of the former kind are needed to build up new information and move the system from state $i$ to $i + 1$. But at the same time, other random mutations of the second kind will arrive, and sometimes these mutations destroy information and move the system back from state $i$ to state $i - 1$. The effect of such back mutations is to enlarge the required time T to reach the target of m coordinated genetic changes, and consequently to make the prevalence $P(A)$ of an irreducibly complex system even smaller. In order to estimate the prevalence of the system, we thus need to find the distribution of the waiting time T until m coordinated genetic changes take place. For one single change ($m = 1$), this is a well-studied problem of population genetics when the target represents a single point mutation (Crow and Kimura, 1970; Durrett, 2008). These results have been generalized to more complicated settings with $m = 1$, where the target represents a whole DNA-string of nucleotides, using either analytical approximations (Durrett and Schmidt, 2007; Behrens and Vingron, 2010; Tugrul et al., 2015) or simulations (Sanford et al., 2015).
The literature on the distribution of the waiting time for $m = 2$ genetic changes includes a pioneering article of Kimura (1985), and more recent publications in the context of tumour spread by Komarova et al. (2003) and Iwasa et al. (2004). The mathematical results of the latter two papers were used by Durrett and Schmidt (2008, 2009) in order to estimate the time required for two coordinated mutations to change the expression of a gene in such a way that the first mutation deactivates a binding site within a nearby regulatory region, whereas the second mutation activates a second binding site within the same regulatory region. This work was later extended to an arbitrary number m of mutations. Behe (2007) has argued that $m = 2$ coordinated mutations seems to be the edge of what evolution is capable of achieving, with the development of chloroquine resistance in the parasite that causes malaria (P. falciparum) as a well-known example. Behe (2009) also stressed the importance of including back mutations in models for the waiting time of coordinated mutations. This has been confirmed, in different contexts, by Axe (2010b) and Hössjer et al. (2018). In one section of the latter paper the authors consider a system with m subunits, each of which may experience forward and backward mutations independently, back and forth, in any order. They further assume a neutral model where all intermediate states of $i = 1, \ldots, m - 1$ acquired forward mutations have no selective disadvantage. It is proved in equation (12.109) of Hössjer et al. (2018) that the expected waiting time $E(T)$ until the system acquires all m forward mutations is approximately [equation (7)] when m is large, with $u > 0$ the probability of a forward mutation per generation and individual, and the probability of a backward mutation per generation and individual denoted by $Cu > 0$.
If each subunit is a single DNA nucleotide A, G, C or T, then typical parameter values are $u = 10^{-8}/3$ and $C = 3$, since only one mutation out of three is a forward mutation (corresponding to the targeted nucleotide of that subunit), whereas all mutations are back mutations. The waiting time in (7) is approximately exponentially distributed, so by Taylor expansion the prevalence is given as
$$P(A) = P(T \le t) \approx 1 - e^{-t/E(T)} \approx \frac{t}{E(T)} \tag{8}$$
Notice in particular that the expected waiting time in (7) grows with m at an exponential rate when back mutations are allowed ($C > 0$), whereas the prevalence in (8) decreases exponentially with m. The waiting time grows even more quickly with m for an irreducibly complex system with back mutations, since the intermediate states are not neutral but deleterious. Consequently, the prevalence $P(A)$ of an irreducibly complex system with back mutations is exceedingly small even for moderately large m.
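The qualitative effect of back mutations on the waiting time can be illustrated with a much simpler model than the population-genetic one. The Python sketch below assumes a single lineage whose state i (the number of subunits carrying the forward mutation) moves up with probability (m − i)u and down with probability iCu per generation, and computes the exact expected hitting time of state m by the standard first-step recursion for birth-death chains. This is a simplification introduced here for illustration; it is not the formula of Hössjer et al. (2018), and the function names are hypothetical.

```python
import math

def expected_waiting_time(m, u, C):
    """Expected number of generations to go from 0 to m forward mutations.

    State i = number of subunits carrying the forward mutation.
    Per generation: i -> i+1 with prob (m - i) * u, i -> i-1 with prob i * C * u.
    Assumes m * max(1, C) * u <= 1 so that these are valid probabilities.
    """
    total = 0.0
    h_prev = 0.0  # expected time to climb from i-1 to i (unused when i = 0)
    for i in range(m):
        up = (m - i) * u
        down = i * C * u
        h = (1.0 + down * h_prev) / up  # first-step recursion for a birth-death chain
        total += h
        h_prev = h
    return total

def prevalence(t, expected_time):
    """P(T <= t) when T is treated as exponentially distributed with the given mean."""
    return -math.expm1(-t / expected_time)

if __name__ == "__main__":
    u, C = 1e-8 / 3, 3.0  # parameter values quoted in the text
    for m in (1, 2, 5, 10):
        e_t = expected_waiting_time(m, u, C)
        print(m, f"{e_t:.3e}", f"{prevalence(1e6, e_t):.3e}")
```

Setting `C = 0.0` removes back mutations and makes the expected time grow only slowly with m, which gives a direct way to see how much of the growth is attributable to the backward steps in this toy chain.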
A number of authors have tried to overcome the waiting time problem by proposing mechanisms of change within the evolutionary pathway X that shorten the time to reach the target. These mechanisms include symbiogenesis, the action of transposable elements, horizontal gene transfer, and the use of alternative evolutionary pathways. However, LeMaster (2018) argues that none of these mechanisms really solve the waiting time problem.
It is also possible to address the waiting time problem in the context of fine-tuning of structures of the living cell that connect to the origin of life, such as proteins (see equation (5) of Section 4.1), protein complexes (Section 4.2) or the genetic code (Wichmann and Ardern, 2019).
The prevalence $P(A) \approx t/E(T)$ then corresponds to the probability that (some aspect of) life arose purely by chance within a prescribed time frame t. Whereas the fine-tuning of the diversity of life (given that life first occurred) requires a Darwinian (biological) evolutionary process X in order to estimate the probability $P(A)$ that the observed genomic structure occurred randomly, within a prescribed time frame, the origin of life corresponds to a scenario where X is a chemical evolutionary process.
6. Modelling of fine-tuning in biological systems

Previous modeling work
Intelligent Design (ID) has gained a lot of interest and attention in recent years, mainly in the USA, by creating public attention as well as triggering lively discussions in the scientific and public world. ID aims to adhere to the same standards of rational investigation as other scientific and philosophical enterprises, and it is subject to the same methods of evaluation and critique. ID has been criticized, both for its underlying logic and for its various formulations (Olofsson, 2008; Sarkar, 2011).
William Dembski originally proposed what he called an "explanatory filter" for distinguishing between events due to chance, lawful regularity or design (Dembski, 1998). Viewed on a sufficiently abstract level, its logic is based on well-established principles and techniques from the theory of statistical hypothesis testing. However, it is hard to apply to many interesting biological applications or contexts, because a huge number of potential but unknown scenarios may exist, which makes it difficult to phrase a null hypothesis for a statistical test (Wilkins and Elsberry, 2001; Olofsson, 2008).
The re-formulated version of a complexity measure published by Dembski and his coworkers is named Algorithmic Specified Complexity (ASC) (Ewert et al., 2013). ASC incorporates both Shannon and Kolmogorov complexity measures, and it quantifies the degree to which an event is improbable and follows a pattern. Kolmogorov complexity is related to compression of data (and hence patterns), but suffers from the property of being unknowable, as there is no general method to compute it. However, it is possible to give upper bounds for the Kolmogorov complexity, and consequently ASC can be bounded without being computed exactly. ASC is based on context and is measured in bits. The same authors have applied this method to natural language, random noise, folding of proteins, images, etc. (Marks et al., 2017).
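The bounding argument can be made concrete with an off-the-shelf compressor. ASC is defined as $-\log_2 P(x) - K(x \mid C)$, where $K$ is the conditional Kolmogorov complexity given a context $C$; since the length of any compressed encoding bounds $K$ from above (up to an additive constant for the decompressor), substituting it yields a lower bound on ASC. The Python sketch below assumes zlib as the compressor, an empty context, and an i.i.d. uniform-byte chance model; these are illustrative choices made here, not the specific setups used by the cited authors.

```python
import zlib

def uniform_log2_prob(data: bytes) -> float:
    """log2 probability of `data` under i.i.d. uniform bytes (an assumed chance model)."""
    return -8.0 * len(data)

def asc_lower_bound_bits(data: bytes, log2_prob: float) -> float:
    """Lower bound on ASC = -log2 P(x) - K(x|C), in bits.

    K(x|C) is replaced by the zlib-compressed length, an upper bound on the
    Kolmogorov complexity up to an additive constant (the decompressor).
    """
    compressed_bits = 8 * len(zlib.compress(data, 9))
    return -log2_prob - compressed_bits

if __name__ == "__main__":
    patterned = b"AB" * 500  # highly regular sequence of 1000 bytes
    print(asc_lower_bound_bits(patterned, uniform_log2_prob(patterned)))
```

A strongly patterned input yields a large positive bound under the uniform model, whereas typical random bytes yield a bound near or below zero, because they do not compress.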

Towards a general statistical framework for testing fine-tuning
More recently, George Montañez published a model for detecting fine-tuning that incorporates randomness and specificity, and which unifies many previous attempts (Montañez, 2018). In order to describe this method, let $f(x)$ be a function that quantifies, for each outcome $x \in \Omega$, how specified it is, with a larger value corresponding to a higher degree of specificity. Let $x_{obs}$ be the observed outcome, and define the set A of outcomes which are either at least as unlikely or at least as specified as the observed one (9). The prevalence $P(A)$ corresponds to the outlyingness of $x_{obs}$, that is, how likely it is to observe an outcome at least as improbable and/or specified as $x_{obs}$. Another possibility is to define an event
$$A = \{x \in \Omega : f(x) \ge f(x_{obs})\} \tag{10}$$
that consists of all outcomes at least as specified as the observed one. An advantage of (10) over (9) is that (10) makes it possible to treat models where some outcomes are discrete whereas others are continuous, as is common in problems with censoring and truncation.
The choice of specificity function f is crucial. In the simplest case an outcome is either specified or not, quantified as 1 or 0. This corresponds to an indicator function
$$f(x) = 1_A(x) = \begin{cases} 1, & x \in A, \\ 0, & x \notin A, \end{cases} \tag{11}$$
where A is the set of specified outcomes, that is, a function that equals 1 for all outcomes in A and 0 for all outcomes outside of A. Notice that (10) retrieves A whenever f satisfies (11) and we observe a specified outcome ($x_{obs} \in A$).
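For a finite sample space, the prevalence of the event of all outcomes at least as specified as the observed one can be computed by direct summation. The Python sketch below is a minimal illustration under assumptions made here: the sample space is the set of binary strings of length 3 with uniform probabilities, and the two specificity functions (an indicator and a count of ones) are toy choices. None of these names or values come from Montañez (2018).

```python
import itertools

def prevalence(prob, f, x_obs):
    """P(A) for A = {x : f(x) >= f(x_obs)} on a finite sample space.

    prob maps each outcome to its probability; f is the specificity function.
    """
    threshold = f(x_obs)
    return sum(p for x, p in prob.items() if f(x) >= threshold)

# Toy sample space: all binary strings of length 3, equally likely.
outcomes = ["".join(bits) for bits in itertools.product("01", repeat=3)]
prob = {x: 1.0 / len(outcomes) for x in outcomes}

def f_binary(x):
    """Indicator specificity: 1 for the single specified outcome, 0 otherwise."""
    return 1 if x == "111" else 0

def f_graded(x):
    """Graded specificity: the number of ones in the string."""
    return x.count("1")

if __name__ == "__main__":
    print(prevalence(prob, f_binary, "111"))  # event contains only "111"
    print(prevalence(prob, f_graded, "110"))  # event contains strings with >= 2 ones
```

With the indicator function and a specified observation, the event reduces to the set of specified outcomes, as noted in the text; with the graded function the event grows or shrinks with the specificity of the observed outcome.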
In other applications, there are different degrees of specificity, and this requires more sophisticated choices of f than (11). It is possible, for instance, to state Haldane's Dilemma in the framework of (10). Haldane (1932) asked the question whether natural selection is capable of removing deleterious mutations as they arrive within a species over time. If not, they may cause a mutational load that increases to such an extent that the survival of the species is threatened (Lynch et al., 1993). Such an increased mutational load corresponds to an increase of genetic entropy (Sanford, 2008) or a decreased biological fitness. Haldane's Dilemma is in fact related to the waiting time problem of Section 5. More specifically, we ask the following question: If a population evolved randomly during a time period of length t, then at the end of this time period what fraction of individuals x within the population X would have a fitness $f(x)$ at least as large as the one observed, $f(x_{obs})$, for some individual alive at this time point? In the context of (10), this corresponds to a prevalence $P(A \mid \theta) = P(f(X) \ge f(x_{obs}) \mid \theta)$, where $f(X)$ is the fitness of a randomly chosen individual X at the end of the time period, according to predictions of an evolutionary model.
The parameters $\theta$ of this model include the fitness distribution at the beginning of the time period, the (effective) size of the population, the mutation rate, and the mutational spectrum (the distribution of fitness changes caused by mutations). If the mutational spectrum is such that mutations are neutral on average, then Fisher's Fundamental Theorem of Natural Selection (Fisher, 1930; Price, 1972) predicts that biological fitness increases over time, corresponding to a large prevalence $P(A)$. However, it is well known (Kimura, 1979) that most mutations are slightly detrimental. Basener and Sanford (2018) recently extended Fisher's Theorem, allowing for arbitrary mutational spectra. In particular, they showed that Kimura's mutational spectrum implies that fitness decreases over time, in line with the predictions of Haldane's Dilemma. Consequently, the prevalence $P(A)$ is very small for species that have existed for a long period t of time.
Regardless of whether (9) or (10) is used, and regardless of whether f corresponds to a binary or continuous function, the prevalence $P(A)$ involves a number of unknown parameters $\theta$. Therefore, in order to estimate the prevalence we need some training data set (=data) different from $x_{obs}$ in order to estimate the unknown parameters, either through a frequentist approach (2) or a Bayesian approach (3). In the former case $P(A)$ is referred to as a p-value. We also face the challenge that the prevalence $P(A)$ depends on f. We must either know f beforehand or be able to estimate it in some way (this is related to the abovementioned difficulty of framing a null hypothesis of testing). Our previous examples in Sections 3-5 correspond to a binary specificity (11), where $f(x) = 1$ (or equivalently $x \in A$) when x is a universe that either exists or is habitable (Section 3) or when x is a protein or protein complex that functions (Sections 4.1-4.2). In Section 5 we addressed the waiting time problem and asked the question whether the time $T = T(x)$ until a pre-specified sequence of changes of an evolutionary path x occurs is less than t or not. This corresponds to the binary specificity function (11) with $A = \{x : T(x) \le t\}$. For Haldane's Dilemma we rather used a continuous specificity function that corresponds to biological fitness.

Model selection
A general approach is to detect fine-tuning by demonstrating that the prevalence of the event (9) or (10) is low. A critic may say that this to some extent is a "fine-tuning-of-the-gaps" argument, since we may never know for sure whether a better naturalistic model, with a much higher prevalence, will be found in the future. That is, if the prevalence $P(A)$ is low, we have only falsified one specific naturalistic model, not necessarily naturalism in general. Of course, we can never be sure whether a better naturalistic explanation will be found later on or not. However, one may argue that the most suitable approach of science is to compare the best explanations found so far within two competing worldviews. This naturally leads to model selection. Recall from Fig. 1 that a statistical model M is a collection of data generating mechanisms $P(\cdot \mid \theta)$ for all parameters $\theta$ that the model allows for. It is possible for some problems to suggest a design model $M_1$ that competes with the currently most promising naturalistic model $M_2$, in terms of which model explains data the best.
Such a model selection can be performed by computing estimates $\hat{P}(A \mid M_1)$ and $\hat{P}(A \mid M_2)$ of the prevalence of $A$ (chosen as in (9) or (10)) for both models and then choosing the model with the larger estimated prevalence. The prevalence of each model can be estimated by a frequentist (2) or a Bayesian approach (3). In either case, the prevalence corresponds to the outlyingness of $x_{\mathrm{obs}}$, so that the chosen model is the one for which $x_{\mathrm{obs}}$ is least of an outlier. We may interpret such a model selection between $M_1$ and $M_2$ as a comparison of the two models' goodness-of-fit to the set $A$ of all possible data sets that are at least as specified as $x_{\mathrm{obs}}$. A more traditional kind of model selection, which does not take specificity into account, is to compare the goodness-of-fit of the observed outcome $x_{\mathrm{obs}}$ for both competing models by comparing $\hat{P}(x_{\mathrm{obs}} \mid M_1)$ and $\hat{P}(x_{\mathrm{obs}} \mid M_2)$, or versions of these probabilities that are penalized by model size. This corresponds to choosing $A = \{x_{\mathrm{obs}}\}$ in (2) or (3). We believe the model selection approach is very promising for future fine-tuning research. It can be used, for instance, when deciding whether the diversity of life is best explained by Darwinian macroevolution ($M_2$) or a design-inspired model ($M_1$). Examples of design-inspired models are the Dependency Graph of Winston Ewert (2018), and a forest of microevolutionary family trees, where the species within each family tree descend from a designed common ancestral population (Tan, 2015;. One may also study the more restricted problem of human/chimp ancestry, and compare a model $M_2$ with common ancestry of the two species, with a unique origin model $M_1$, according to which each species is founded by one single couple (Sanford and Carter, 2014; Hössjer et al., 2016a, 2016b; Carter et al., 2018; Hössjer and Gauger, 2019).
In order to extend and strengthen the results of these articles, data $x$ could involve not only DNA patterns, but also one or several layers of organization from the cell, as outlined in Section 4.
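The selection rule described above can be sketched on a toy problem. The example below is only an illustration under stated assumptions: the two candidate models are hypothetical binomial models, labelled `M_1` and `M_2` purely to match the notation of this section, their fitted parameters are invented, and the specificity function is taken to be $f(x) = x$, so that $A$ contains all outcomes at least as large as $x_{\mathrm{obs}}$. It does not represent any of the biological models cited in the text.

```python
import math


def binom_pmf(n, k, theta):
    """P(X = k) for X ~ Binomial(n, theta)."""
    return math.comb(n, k) * theta**k * (1.0 - theta) ** (n - k)


def prevalence(n, x_obs, theta):
    """Estimated P(A | M) with A = {x : x >= x_obs}, i.e. f(x) = x."""
    return sum(binom_pmf(n, k, theta) for k in range(x_obs, n + 1))


def select_model(scores):
    """Return the label of the model with the largest score."""
    return max(scores, key=scores.get)


# Hypothetical setting: 20 trials, 15 observed successes, and one fitted
# success probability per candidate model (invented values).
n, x_obs = 20, 15
theta_hat = {"M_1": 0.7, "M_2": 0.3}

# Selection that takes specificity into account: compare estimated P(A | M).
prev = {m: prevalence(n, x_obs, th) for m, th in theta_hat.items()}

# Traditional selection: compare estimated P(x_obs | M), i.e. A = {x_obs}.
point = {m: binom_pmf(n, x_obs, th) for m, th in theta_hat.items()}

print("chosen by prevalence of A:     ", select_model(prev))
print("chosen by probability of x_obs:", select_model(point))
```

Both criteria happen to agree in this example. They can differ in general, because the first one scores each model on the whole set $A$ rather than on the single observed outcome, and because no penalty for model size is applied here.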

Concluding remarks
Statistical modeling and inference on molecular systems may provide valuable insights into how we understand the physical and biological universe. In this paper, we have elaborated on basic information from DNA sequences, proteins, protein complexes, signaling pathways and networks, using the prevalence $P(A)$ of an observed event $A$ of fine-tuning, which corresponds to a Shannon information of $-\log_2 P(A)$. By elaborating such models, we may adequately capture some of the richness of the natural world. In this context, statistical methods are part of a new approach that in many cases enables us to quantify how challenging it is for naturalistic, random processes to explain contemporary scientific observations and material, and instead propose fine-tuning as a credible alternative explanation.
The laws, constants, and primordial initial conditions of nature present the flow of nature. These purely natural objects discovered in recent years show the appearance of being deliberately fine-tuned. Functional proteins, molecular machines and cellular networks are both unlikely when viewed as outcomes of a stochastic model, with a relevant probability distribution (having a small $P(A)$), and at the same time they conform to an independent or detached specification (the set $A$ being defined in terms of specificity). These results are important and are deduced from central phenomena of basic science. In both physics and molecular biology, fine-tuning emerges as a uniting principle and synthesis, an interesting observation in itself.
In this paper we have argued that a statistical analysis of fine-tuning is a useful and consistent approach to model some of the categories of design: "irreducible complexity" (Michael Behe), and "specified complexity" (William Dembski). As mentioned in Section 1, this approach requires a) that a probability distribution for the set of possible outcomes is introduced, and b) that a set $A$ of fine-tuned events, or more generally a specificity function $f$, is defined. Here b) requires some a priori understanding of what fine-tuning means for each type of application, whereas a) requires a naturalistic model for how the observed structures would have been produced by chance. The mathematical properties of such a model depend on the type of data that is analyzed. Typically a stochastic process should be used that models a dynamic feature such as stellar, chemical or biological (Darwinian) evolution. In the simplest case the state space of such a stochastic process is a scalar (one nucleotide or amino acid), a vector (a DNA or amino acid string) or a graph (protein complexes or cellular networks).
A major conclusion of our work is that fine-tuning is a clear feature of biological systems. Indeed, fine-tuning is even more extreme in biological systems than in inorganic systems. It is detectable within the realm of scientific methodology. Biology is inherently more complicated than the large-scale universe, and so fine-tuning is an even more prominent feature there. Still, more work remains in order to analyze more complicated data structures, using more sophisticated empirical criteria. Typically, such criteria correspond to a specificity function $f$ that is more than a helpful abstraction of an underlying pattern, such as biological fitness. Rather, one needs a specificity function that, although of non-physical origin, can be quantified and measured empirically in terms of physical properties such as functionality. In the long term, these criteria are necessary to make the explanations both scientifically and philosophically legitimate. However, we have enough evidence to demonstrate that fine-tuning and design deserve attention in the scientific community as a conceptual tool for investigating and understanding the natural world. The main agenda is to explore some fascinating possibilities for science and create room for new ideas and explorations. Biologists need richer conceptual resources than the physical sciences have so far been able to offer, in terms of complex structures having non-physical information as input (Ratzsch, 2010). Yet researchers have more work to do in order to establish fine-tuning as a sustainable and fully testable scientific hypothesis, and ultimately a Design Science.

Author statement
ST initiated the study, and OH developed the statistical model in Section 2. ST and OH designed the study and wrote the final manuscript together. Both authors contributed equally to this work and approve the manuscript.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.