Explaining the rise of moralizing religions: a test of competing hypotheses using the Seshat Databank

ABSTRACT The causes, consequences, and timing of the rise of moralizing religions in world history have been the focus of intense debate. Progress has been limited by the availability of quantitative data to test competing theories, by divergent ideas regarding both predictor and outcome variables, and by differences of opinion over methodology. To address all these problems, we utilize Seshat: Global History Databank, a large storehouse of information designed to test theories concerning the evolutionary drivers of social complexity. In addition to the Big Gods hypothesis, which proposes that moralizing religion contributed to the success of increasingly large-scale complex societies, we consider the role of warfare, animal husbandry, and agricultural productivity in the rise of moralizing religions. Using a broad range of new measures of belief in moralizing supernatural punishment, we find strong support for previous research showing that such beliefs did not drive the rise of social complexity. By contrast, our analyses indicate that intergroup warfare, supported by resource availability, played a major role in the evolution of both social complexity and moralizing religions. Thus, the correlation between social complexity and moralizing religion seems to result from shared evolutionary drivers, rather than from direct causal relationships between these two variables.


Introduction
Religious constructs relating to supernatural agency, ritual efficacy, and the afterlife have been documented across the ethnographic record and likely have deep roots in our species' evolutionary history (Boyer, 2001; Hood et al., 2009). By contrast, moralizing religions, in which moral behavior between humans is a principal concern of supernatural agents or powers, appear to be a much more recent cultural innovation (Bellah, 2011; Botero et al., 2014; Henrich et al., 2010; Norenzayan & Shariff, 2008; Purzycki et al., 2016; Strathern, 2019; Watts et al., 2015). Here we use the term "moralizing religion" to refer to clusters of beliefs and practices postulating a system of supernatural punishment and reward for morally salient behavior, where such systems are primarily concerned with the way humans interact with other humans, rather than how they interact with supernatural forces. "Moralizing supernatural punishment and reward" (MSP), on the other hand, refers to the presence of such beliefs and practices in any degree. This terminology does not assume that the supernatural mechanism involved is agentic (as in the case of phrases like "Big Gods" or "moralizing gods"), recognizing that non-agentic variants of MSP, for example those based on karmic principles (found in Hinduism and Buddhism and their offshoots), can foster prosocial and cooperative norms. Moreover, our preferred terminology does not privilege sanctions over incentives as the principal mechanism for moralizing enforcement (arguably a drawback of phrases like "broad supernatural punishment") (Willard et al., 2020). This approach also acknowledges that the process of supernatural moral enforcement in human affairs involves religious traditions operating as systems, rather than relying on a single aspect of religious belief, such as an all-seeing punitive deity.
Broadening our approach to moral enforcement in this way allows us to explore a wider range of dimensions of religion that may have been involved in the evolution of sociopolitical complexity. In this paper we use a similarly broad definition of sociopolitical complexity (SPC) that aggregates social scale (e.g., population and territory), levels of hierarchy, as well as sophistication of government institutions, information systems, and economic exchange (see also Methods).
All the so-called "world religions" recognized today exhibit primary concern for interpersonal morality through systems of moralizing supernatural punishment, and scholars have long debated why that may be so (Darwin, 1871; Wilson, 2002). An influential trend in the evolutionary theorizing of religion proposes that belief in all-knowing, morally concerned, punitive deities ("Big Gods") facilitated increases in social complexity (Johnson, 2005; Norenzayan, 2013; Norenzayan & Shariff, 2008; Roes & Raymond, 2003; Swanson, 1960). One formulation of the Big Gods theory begins with the premise that religious beliefs and behaviors originated as an evolutionary byproduct of ordinary cognitive tendencies, such as mind-body dualism (Bering, 2006) or teleological reasoning (Kelemen, 2004). By exploiting these intuitive biases, culturally evolved beliefs in supernatural surveillance and punishment increased the ability of groups to sustain complex social organizations and successfully scale up and expand. Competition among cultural groups gradually aggregated these elements into cultural packages, in the form of organized religions. Thus, Big Gods coevolved with larger and more complex societies (Norenzayan et al., 2016, p. 6). A variant of the Big Gods theory proposes that "broad supernatural punishment" (including non-agentic forces such as karma) contributed to the transition to large-scale, complex sociopolitical organization in different parts of the world (Raffield et al., 2019; Watts et al., 2015).
Entangled with theories exploring the relationship between MSP and sociopolitical complexity is the fact that rising complexity itself is often seen as resulting from the evolutionary demands of increasingly intense intergroup competition in the form of warfare. Several theorists have argued that warfare is a critical factor explaining the rise and spread of MSP (Bellah, 2011; Geertz, 2014; Martin, 2014; Turchin, 2006, 2016). Specifically, they propose that the intensification of military competition between polities placed increasing evolutionary pressure to develop cultural systems that foster within-group cooperation and cohesion, characteristics thought crucial to success in between-group rivalries (Whitehouse et al., 2017). Alternative explanations for the evolution of MSP have focused on (among other things) animal husbandry (Peoples & Marlowe, 2012), resource scarcity (Botero et al., 2014), and rising affluence and material security.
What most theories of the evolution of MSP have in common is the idea that belief in supernatural punishment motivates prosociality (i.e., behavior that facilitates cooperation) in ways that contribute to the flourishing of complex societies. However, efforts to demonstrate empirically that there is a link between adherence to a moralizing religion and prosocial behavior have, so far, proven inconclusive (Kavanagh et al., 2020; McKay & Whitehouse, 2014). While religiosity has often been shown to predict self-reported prosociality (Brooks, 2006), studies using behavioral measures of prosociality have produced mixed results (Annis, 1976; Darley & Batson, 1973; Ge et al., 2019; Grossman & Parrett, 2011; Hofmann et al., 2014; Smith et al., 1975; Townsend et al., 2020). Moreover, there is evidence, from the past as well as from research in contemporary populations, that religiosity can trigger prejudice and antisocial attitudes towards minorities and outgroups in ways that would be more likely to foment conflict than cooperation in large-scale societies (Johnson et al., 2012; Scheepers et al., 2002; Siegman, 1962; Whitley, 2011). Equally concerning is that secular primes may be just as effective as religious ones in motivating prosocial behavior (Mazar et al., 2008; Paciotti et al., 2011). This raises the question of whether religious priming studies are tapping supernatural beliefs specifically or only a set of moral norms that happen to be associated with those systems of belief but which could just as readily have been incorporated into a secular belief system (McKay & Whitehouse, 2014). Thus, even when religious priming has been clearly linked to cooperation, this may be because the primes render moral norms more salient but not because those norms are attributed a supernatural origin.
One of the most pervasive problems with efforts to demonstrate the possible role of MSP in fostering prosocial behavior is the lack of precision regarding how exactly beliefs in supernatural punishment motivate cooperation, as distinct from other features of religion, such as group bonding through collective rituals or moving in synchrony, found in all kinds of societies, not only those which postulate mechanisms of supernatural moral enforcement (Feinman, 2016). What features of religion are most useful in different kinds of societies, and what specific beliefs in supernatural punishment and reward might, under the right circumstances, contribute to an increase in sociopolitical complexity? Our goal in this paper is to help to clarify many of these key issues.
Previous comparative research on MSP and moralizing religions has depended on the availability of cross-cultural data on the topic. Data compilations, such as the Standard Cross-Cultural Sample (SCCS; White, 2008) and the Ethnographic Atlas (Murdock, 1967), have been exploited to produce a number of insights (Botero et al., 2014; Brown & Eff, 2010; Johnson, 2005; Peoples & Marlowe, 2012; Peregrine, 1996; Roes & Raymond, 2003; Swanson, 1960), and these are revisited in our discussion section. However, such repositories of cultural information have serious limitations that restrict their application in testing theories of cultural evolution. First, these databases promote a global "ethnographic present" that largely excludes modern European populations and large-scale complex societies of the past. Many entries draw exclusively on dated summaries of contact-era accounts, or ethnographic research conducted with indigenous populations living under colonial rule, strongly influenced by the scholarly discourse of the mid-twentieth century. Second, synchronic or static databases, such as the SCCS and the Ethnographic Atlas, cannot tell us how societies change over time, and thus provide only limited insight into the causal mechanisms at work in cultural evolution (Turchin, 2018). Although some researchers have treated the social institutions of contemporary small-scale societies as a window into Pleistocene foragers or early Neolithic villages, all extant societies are inevitably affected by the more complex societies that surround them or by the spread of moralizing religions. One way to get around these problems is to use the methods of phylogenetic analysis developed in evolutionary biology (Currie & Mace, 2012; Mace & Holden, 2005; Watts et al., 2015).
However, this approach can be highly sensitive to assumptions underlying the phylogenetic analysis (Lukas et al., 2021), as demonstrated by efforts to reconstruct the origins of the Indo-European linguistic family (Bouckaert et al., 2012) and the debates these have prompted (Anthony & Ringe, 2015; Chang et al., 2015; Pereltsvaig & Lewis, 2015).
Here we showcase a relatively new approach to testing theories on the cultural evolution of MSP, large-scale prosociality, and sociopolitical complexity using Seshat: Global History Databank (François et al., 2016; Turchin et al., 2015), which systematically samples past societies around the world from the Neolithic to the Industrial Revolution. Seshat data are resolved at 100-year intervals, enabling us to fit dynamic regression models with lagged predictor variables, thus greatly increasing the statistical potential to empirically test causal theories. In another paper (Whitehouse et al., 2022), we seek to establish how beliefs in moralizing religion (specifically moralizing supernatural concern as primary) have evolved in different parts of the world and test predictions of theories explaining this evolution. Here we expand that analysis in two major ways. First, in addition to social complexity, we consider other predictor variables suggested by a broader range of theories, including intensity of warfare, resource abundance, and animal husbandry. We regard this as an important first foray into a very large topic, recognizing that other potential drivers of social complexity, such as interregional trade, craft specialization, and urbanization, may have coevolved together with new forms of religious, economic, and military institutions that should also be explored in future analyses of Seshat data. Second, whereas previously we focused only on the earliest appearance of moralizing supernatural concern, treating it as a binary variable, here we consider a more varied and nuanced set of outcome variables. This is a significant advance because the great variety of religious practices in past populations is not easily categorized as either moralizing or not.
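To make the idea of lagged predictors concrete, the following minimal Python sketch pairs each century's response with predictor values from the preceding century. The field names (`msp`, `spc`, `warfare`), the single-century lag, and the numerical values are illustrative assumptions only; they are not Seshat data and are not the model specification used in our analyses.

```python
def lagged_rows(series, lag=1):
    """Pair each time step's response with predictor values from `lag` steps earlier.

    `series` is a list of dicts ordered in time at a fixed interval
    (hypothetically, one entry per century for a single region).
    """
    rows = []
    for t in range(lag, len(series)):
        now, past = series[t], series[t - lag]
        rows.append({
            "year": now["year"],
            "msp": now["msp"],               # response at time t
            "msp_lag": past["msp"],          # autoregressive term at t - lag
            "spc_lag": past["spc"],          # predictor at t - lag
            "warfare_lag": past["warfare"],  # predictor at t - lag
        })
    return rows


# Hypothetical values for illustration only; not Seshat data.
example = [
    {"year": -500, "msp": 0.125, "spc": 4.1, "warfare": 0.3},
    {"year": -400, "msp": 0.25, "spc": 4.4, "warfare": 0.5},
    {"year": -300, "msp": 0.5, "spc": 4.9, "warfare": 0.6},
]
rows = lagged_rows(example)
```

Because the first time step has no predecessor, a series of three centuries yields two usable rows; each additional lag costs one further row per region.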
In early formulations of the Big Gods theory (Norenzayan, 2013; Swanson, 1960) proponents characterized such gods in "all or nothing" terms, and claimed that only big societies have Big Gods, while small-scale societies lack them. More recently, at least some proponents of the theory have described the phenomenon more as a continuum. Thus, a recent article concludes that although moralizing supernatural punishment may be present in a broad range of societies, "the trend in the cultural evolution of religion has been an expansion of deities' scope, powers, and monitoring abilities." Another example is the proposal that local moralizing gods may provide sufficient support for cooperation in smaller societies but become less effective in larger multi-ethnic empires, where gods associated with universally applicable morals and global provenance over human affairs are required (e.g., Lang et al., 2019; Purzycki et al., 2016). Such conclusions, while suggestive, still await thorough empirical investigation. Here we fractionate MSP into a variety of more precisely specified dimensions, including the degree to which supernatural agents were thought to care about the moral behavior of adherents, the power they had to monitor and enforce moral norms, the focus of their concerns (did these apply to whole populations, elite individuals, or rulers only, for example?), and the scope of punishments (were sinners singled out for punishment, or did entire communities suffer for one individual's transgressions?). We also distinguish between punishments meted out in this life versus the afterlife, as well as between agentic and impersonal supernatural powers.

Overview
Translating knowledge constructed by historians, archaeologists, and scholars of religion about past societies into coded data that can be analyzed with statistical methods is not a straightforward task. Our knowledge about religions in past societies is obviously incomplete. Experts often disagree and offer divergent interpretations of the available evidence; there are multiple ways in which the information in human narratives can be summarized to create computer-readable data; and there are many thorny issues to address in statistical analysis (for example, how should missing data and expert disagreement be handled?). Previous work utilizing Seshat data (Mullins et al., 2018) developed methods for dealing with these issues in various ways, suited to the research questions at hand. Solutions require collaboration and debate, often inspired by critical engagement. Our work is guided by a strong commitment to open science and the aim to be as transparent as possible in our approaches to data gathering and analysis in order to facilitate fruitful discussion and help progress the scientific study of world history.
The data on the predictor variables pertaining to social complexity, warfare, and agricultural productivity were already available in Seshat, and the data-gathering strategy for these variables has been described elsewhere (Turchin et al., 2015). The MSP data, by contrast, were gathered following a strategy described in detail below (Variables Used in the Analysis). An early draft of this paper was made publicly available more than a year before submitting it for publication to enable scholars and analysts to propose alternative approaches and analyses, with the goal of collectively investigating how such decisions affect the results. The aim was not to achieve universal consensus but to sharpen our interpretations and draw attention to areas that remain contentious. Seshat is designed on the understanding that historical and archaeological data are debatable and dynamic rather than authoritative and static.
Testing theories about cultural evolution, and especially the role that religion played in it, requires a massively interdisciplinary approach. In particular, we benefit from humanities scholars adding, correcting, or providing alternative interpretations in the sections of the analytic narrative on which they have expertise (see Structured Analytic Narratives). Similarly, we benefit from social scientists proposing additional or alternative ways of encoding information from analytic narratives into machine-readable data. Finally, we benefit from computational and quantitative scientists refining our statistical methods or offering alternative analytical approaches. The Seshat project has already demonstrated that such a transdisciplinary collaboration is both possible and fruitful by bringing together humanities scholars and social and quantitative scientists. Our goal in this target article is even more ambitious, insofar as we propose to expand the scale of this collaboration beyond the Seshat project to explicitly include a broader network of voices, including potential critics. In this way, we hope that critique and discussion can be channeled in productive directions and will result in an overall advancement of the field.

Seshat: global history databank
Seshat (http://seshatdatabank.info/) is a large database of information about global history from the Neolithic Revolution up to the Industrial Revolution (François et al., 2016; Turchin et al., 2018; Turchin et al., 2015). During the early stages of the project, we created an initial stratified sample of past societies by identifying 10 world regions distributed as widely as possible across the Earth's surface and, within each of those regions, designating three "Natural Geographical Areas" (NGAs) with discrete ecological boundaries, on average about 10,000 km² in size, thus creating a sampling scheme of 30 such areas around the world (http://seshatdatabank.info/methods/world-sample-30/). The 30 regions and their selection rationale were published previously (Turchin et al., 2015) before the start of data collection. Our aim was to maximize variability in our global sample while minimizing historical relationships between cultures. We are in the process of adding further NGAs to the initial sample of 30, and at the time of writing Seshat contains 35 NGAs comprising 372 unique polities (see http://seshatdatabank.info/databrowser/). Data on political systems (polities) that emerged and persisted in each of the NGAs are organized into a continuous time series. For the purposes of the present study, these are queried at 100-year intervals, going back as far into the history of that area as scholarly literature would allow (up to a maximum of roughly 10,000 years before present). In the case of NGAs containing clusters of very small-scale polities that share a similar culture but are not under a single system of jurisdictional control, we refer to these as "quasi-polities" and code information on all of them generically, unless information is available that would allow us to differentiate between these polities.
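As an illustration of what querying a continuous polity sequence at 100-year intervals involves, consider the minimal Python sketch below. The polity names, date ranges, and function names are hypothetical placeholders rather than Seshat records or Seshat tooling, and the rule of selecting whichever polity spans the sampled year is an assumption made for the example.

```python
def polity_at(polities, year):
    """Return the name of the polity occupying the area in `year`, or None."""
    for name, start, end in polities:
        if start <= year <= end:
            return name
    return None


def century_series(polities, first, last, step=100):
    """Sample an area's polity sequence at fixed intervals (negative years = BCE)."""
    return [(year, polity_at(polities, year))
            for year in range(first, last + 1, step)]


# Hypothetical polities as (name, start year, end year); not Seshat records.
area = [("Polity A", -750, -510), ("Polity B", -509, -27)]
series = century_series(area, -700, -100)
```

Sampled in this way, a polity that lasts several centuries contributes several observations, while a polity that falls entirely between two sampling points contributes none.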
All variables for which data have been gathered and entered into Seshat are derived from a Seshat Codebook that can be accessed and downloaded (http://seshatdatabank.info/methods/code-book/). The Codebook was designed by, and is continually updated and extended in consultation with, a large network of professional historians, archaeologists, anthropologists and other specialists whom we refer to as "Seshat experts". Especially during the early phases of data entry, variables in the codebook were revised and improved through continuous discussions among Seshat research assistants, experts, and the Data Review Board (see Data Gathering and Collation below). Most variables in Seshat require the data to take the form of a number or numerical range or they specify a feature that can be coded as absent, present or unknown (additionally coding items as "inferred present" or "inferred absent", where the evidence permits). All data are linked to scholarly sources, including peer-reviewed publications and personal communications from established authorities. A large subset of our dataset, including all variables used for this paper, can already be accessed at the project website http://seshatdatabank.info/databrowser/downloads.html.
While the Seshat project is constantly changing and evolving, many of our procedures and methods have become fairly standardized. Here, we redeploy certain descriptions about our methods and the construction of variables from other published work. This is done for parsimony and to reflect the interconnection between the various outputs from the Seshat team; together, these contribute to our collective effort to explore different hypotheses about the rise and fall of large-scale societies across the globe and human history.

Developing a coding scheme for quantitative data analysis
Capturing variation in religious beliefs and practices across time and space requires a conceptual scheme capable of disambiguating a wide range of features that are theoretically important. For example, to test hypotheses concerning the role of MSP in the rise of social complexity, we seek to capture features that could plausibly facilitate cooperation as increasingly anonymous social interactions become harder to monitor and as cross-cutting structural tensions in society grow more intense. We also attempt to capture the degree of penetration of a particular religion into the region under consideration.
The Natural Geographic Areas (NGAs) covered in this paper were primarily determined by the availability of data previously compiled in the Seshat Databank. This approach was essential in order to explore the possible causal influence of key factors (sociopolitical complexity, intensity of interpolity competition, and production/resources) on the rise and spread of MSP. We thus restrict our analyses in this paper to regions where we have structured, reliable data on these potential predictor variables. Our unit of analysis, here as in all other Seshat papers, is not the NGA, but a Seshat polity, which we define pragmatically as an independent political unit ranging in scale from autonomous villages (independent local communities) through simple and complex chiefdoms, to states and empires. We populate our list by determining historical polities that occupied each of our sample regions (NGAs) over time, starting with the early modern period and working back in time to the Neolithic, or as far as available evidence allows (see François et al., 2016; Manning et al., 2017; Turchin et al., 2015).
For each of the polities in our sample, we gathered data for ten variables on supernatural moral enforcement (see Variables Used in the Analysis below). All variables were "binary" in the sense of attempting to capture presence or absence of a particular religious feature. Obviously, the extent to which it is possible to code variables in this way varies between different world regions and chronological periods. We discuss this issue at greater length below (under the heading Assessing the Effect of Uncertainty in Quantifying MSP). As usual, we employed the Seshat approach to capturing uncertainty and disagreement as well. Thus, codes of "absence" and "presence" could be modified with "inferred". "Unknown" was also a possible code. Finally, codes of "absent-to-present" and "present-to-absent" (which are different from "unknown") could be used to code a particular aspect of MSP during transitional periods that cannot be precisely dated. The first seven variables on moralizing supernatural punishment (Table 1) were combined into an integrated measure of moralizing supernatural enforcement (see the next section). Three additional variables code two other characteristics of moralizing religions (AfterLife, ThisLife, and Agency; Table 1).

Variables used in the analysis
Moralizing supernatural concern and punishment
Ten variables pertaining to moralizing supernatural concern and punishment/reward were coded, as follows:
Moralizing Supernatural Concern is Primary (MSCP): MSCP is coded as present when the principal concerns of supernatural agents or forces pertain to cooperation in human affairs. It is coded as absent when the primary concern is the behavior of humans towards the supernatural realm, e.g., by discharging ritual obligations. Importantly, codings of MSCP as present were applied not only when moralizing religion took the form of a morally concerned agent but also when beliefs in non-agentic forms of supernatural moral concern were present, including karmic principles emphasizing incentives to behave morally as well as punishments for transgressions.

MSP (Moralizing Supernatural Punishment) is Certain: This variable reflects the predictability of supernatural punishment for transgression or reward for ethical behavior. A code of absence here could result from a variety of characteristics of supernatural agents: if they are fickle or capricious, if they can be bought off or tricked, or, alternatively, if they are not independently concerned about human morality and need to be persuaded or induced to punish transgressions.
MSP is Broad: This reflects how many aspects of morality deities care about and enforce. It is coded as absent when moralizing supernatural punishment/reward pertains to only very narrowly circumscribed domains, for example, kin-based moral precepts punishing incest or rewarding hospitality rather than enforcing moral norms across a broad range of social situations.
MSP is Targeted: This reflects whether punishment and rewards are targeted specifically at culpable individuals. It is coded as absent when the whole group is punished rather than just the individual transgressor.
Ruler: This reflects whether supernatural forces or agents punish/reward rulers for their antisocial/ prosocial behavior. It can be absent where such punishment is present generally, but rulers remain exempt.
Elites: This reflects whether elites of the polity subscribe to a religion with moralizing elements. In some cases, only a vocal segment of the elites advocated a particular moralizing religion (for example, early Buddhists, some Christians, Confucians) but not entire elite populations.

Table 1. Summary of the supernatural moral punishment/reward variables used in constructing the measures of MSP used in analysis. For more details, see Variables Used in the Analysis below.

| Variable | Description |
| --- | --- |
| Primary | The principal concerns of supernatural agents or forces pertain to cooperation in human affairs (rather than the behaviour of humans toward the supernatural realm, for example by discharging ritual obligations) |
| Certain | Moralizing supernatural punishments and/or rewards are certain and predictable (rather than arbitrary or capricious) |
| Broad | Moralizing supernatural punishments and/or rewards enforce norms across a broad range of moral domains (instead of just a few domains) |
| Targeted | Moralizing supernatural punishments and/or rewards are targeted specifically at culpable individuals (instead of the whole group) |
| Rulers | Moralizing supernatural forces or agents punish and/or reward rulers |
| Elites | The elites of the polity subscribe to moralizing supernatural punishments and/or rewards |
| Commoners | The commoners of the polity subscribe to moralizing supernatural punishments and/or rewards |
| AfterLife | Moralizing enforcement in afterlife: punishment is delayed until after the death of the transgressor |
| ThisLife | Moralizing enforcement in this life: punishment occurs during transgressor's lifetime |
| Agency | Moralizing enforcement is administered by a supernatural agent, such as a deity or spirit (as opposed to an impersonal supernatural force, such as karma) |

Commoners: This reflects the extent to which beliefs in MSP are adopted by the masses. A typical situation in which this variable is coded absent is when the state religion professed by rulers and elites, and endorsing beliefs in supernatural punishments and rewards, is different from the popular religion which lacks or professes only much weaker beliefs in supernatural enforcement. On the other hand, this variable might be coded as present, even while the Elites variable is coded absent, for example when popular religion emphasizes supernatural enforcement, but the religion of rulers and elites does not. Assessing the beliefs of commoners is methodologically challenging. Depending on the period and polity, different types of evidence may be used to determine whether belief in supernatural punishments and/or rewards was widely distributed. In the prehistoric periods of Latium, for example, we used linguistic evidence (comparisons of oath formulas across a broad range of Indo-European languages) to code "inferred present." For the historical period, in addition to linguistic evidence, we used written sources such as popular comic plays with moralizing sentiment and expectations of MSP. In polities where a moralizing religion (i.e., one with primary concern for interpersonal cooperation) has been installed for a long time, we generally code MSP "present" for commoners, making due allowance for transitional periods.
AfterLife: Moralizing enforcement in afterlife, reflecting whether punishment is delayed until after the death of the transgressor.
ThisLife: Moralizing enforcement in this life. Reflects whether punishment occurs during transgressor's lifetime. It is possible to code both this variable and AfterLife as present, if punishment can occur both in this life and in afterlife.
Agency: Moralizing enforcement is agentic. Reflects whether punishment/reward is administered by a supernatural agent, such as a deity or spirit (as opposed to being administered by an impersonal supernatural force, such as karma).
Our main measure of moralizing supernatural punishment (MSP) is based on the first seven MSP characteristics in the list above (Primary through Commoners). If all characteristics were present, the aggregated moralizing religion variable was set to 1 (the maximum). Each code of absent reduced the score by half; that is, the overall score was multiplied by 0.5. The minimum of the aggregated measure, thus, is 0.5^7 ≈ 0.008. Unknowns were treated as missing data and were dropped from the analysis.
This procedure assumes multiplicative effects. We also reran all analyses with an alternative, additive aggregation scheme (equating present with 1, absent with 0, absent/present with 0.5, and adding together these numerical scores).
The resulting MSP measure (whether multiplicative or additive) is a categorical variable with 15 levels (due to "half-tones" introduced by transitional periods absent/present). It is used as the response (dependent) variable in dynamic regression analyses.
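The two aggregation schemes can be sketched in a few lines of Python. The function names and code labels below are our own illustrative choices. Two assumptions are made for the sake of the example: a single unknown code causes the whole observation to be treated as missing, and the multiplicative function declines to score transitional (absent/present) codes, because the description above gives a numerical value for them only under the additive scheme.

```python
PRESENT, ABSENT = "present", "absent"
TRANSITIONAL, UNKNOWN = "absent/present", "unknown"


def msp_multiplicative(codes):
    """Start at 1 and halve the score once per characteristic coded absent.

    Returns None if any code is unknown (assumption: the observation is
    then treated as missing). Transitional codes are rejected because
    their multiplicative weight is not specified in the text above.
    """
    if UNKNOWN in codes:
        return None
    if TRANSITIONAL in codes:
        raise ValueError("transitional codes are outside this sketch")
    return 0.5 ** codes.count(ABSENT)


def msp_additive(codes):
    """Sum present = 1, absent = 0, absent/present = 0.5; None if any unknown."""
    if UNKNOWN in codes:
        return None
    score = {PRESENT: 1.0, ABSENT: 0.0, TRANSITIONAL: 0.5}
    return sum(score[c] for c in codes)


# Seven characteristics, Primary through Commoners (hypothetical codings).
all_present = [PRESENT] * 7
all_absent = [ABSENT] * 7
mixed = [PRESENT] * 5 + [TRANSITIONAL, ABSENT]
```

With these inputs, `msp_multiplicative(all_present)` is 1.0, `msp_multiplicative(all_absent)` is 0.5 ** 7 = 0.0078125 (the minimum of roughly 0.008 noted above), and `msp_additive(mixed)` is 5.5.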
The last three variables (AfterLife through Agency) were used to explore whether the immediacy of punishment (in this life, or the afterlife) and the mechanism of punishment (by a supernatural agent or supernatural force) affect our results. To do this we constructed three additional measures that reflected only moralizing punishment/reward in the afterlife, only that in this life, and only that administered by supernatural agents. Thus, MSP_after, relying on punishment in the afterlife, was calculated by setting MSP to zero if AfterLife = absent. The other two measures, MSP_this and MSP_agen, were constructed analogously by setting MSP to zero if ThisLife or Agency, respectively, was coded as absent.
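The construction of these three conditional measures amounts to a simple gating rule, sketched below in Python. The helper name and the example record are hypothetical. How an unknown or transitional gating code should be treated is not specified above; in this sketch any code other than "absent" leaves MSP unchanged, which is an assumption.

```python
def conditional_msp(msp, gate_code):
    """Return `msp` unchanged unless the gating variable is coded absent.

    Assumption: only an explicit "absent" zeroes the measure; every other
    code (present, unknown, transitional) passes `msp` through as is.
    """
    return 0.0 if gate_code == "absent" else msp


# Hypothetical polity record for illustration only; not Seshat data.
record = {"MSP": 0.5, "AfterLife": "absent",
          "ThisLife": "present", "Agency": "present"}

msp_after = conditional_msp(record["MSP"], record["AfterLife"])
msp_this = conditional_msp(record["MSP"], record["ThisLife"])
msp_agen = conditional_msp(record["MSP"], record["Agency"])
```

For this record the afterlife-based measure drops to zero, while the this-life and agency-based measures retain the aggregate value of 0.5.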

Predictor variables
We constructed several predictor variables theorized to interact with moralizing supernatural punishment/reward, as outlined in the Introduction. These include:
Sociopolitical Complexity (SPC). Current theories disagree about whether high levels of MSP help to drive the rise of sociopolitical complexity, or whether the causal influence goes the other way around (Whitehouse et al., 2022). Some argue that as societies grow, evading punishment for norm violations becomes easier, while surveillance and enforcement become harder (Norenzayan, 2013). On this view, rising complexity puts evolutionary pressure on societies to adopt cultural systems that "offload" surveillance and enforcement to moralizing gods or forces.
Following previously established procedures, we aggregated 51 Seshat variables coding different dimensions of sociopolitical complexity into eight "complexity characteristics": polity population size, population size of the largest settlement, polity territory size, levels of hierarchy, polity-produced infrastructure, sophistication of government institutions, information systems, and sophistication of economic exchange. Our investigation of the dimensions of sociopolitical complexity (SPC) characterizing polities in the Seshat sample indicated that they are well captured by the first Principal Component, which explains more than three-quarters of variance in the data. We use this principal component, SPC1, as our measure of complexity. In order to make SPC1 easily interpretable, we scale it in such a way that it corresponds to log10(Polity Population). In other words, polities with SPC1 = 3 have, on average, populations of 1,000, and SPC1 = 6 corresponds to polities with populations of 1,000,000.
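To make the scaling concrete, a small Python helper (hypothetical name) converts an SPC1 value into the average polity population it is scaled to represent, assuming only the log10 correspondence stated above. It reflects an average relationship, not the population of any particular polity.

```python
def implied_population(spc1: float) -> float:
    """Average polity population implied by an SPC1 value.

    Assumes SPC1 is scaled to correspond to log10(polity population).
    """
    return 10.0 ** spc1


for value in (3.0, 5.3, 6.0):
    print(f"SPC1 = {value}: about {implied_population(value):,.0f} people")
```

Under this scaling, the SPC1 = 5.3 threshold used later in the paper corresponds to a population of roughly 200,000.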
Warfare Intensity. As we noted in the Introduction, the positive relationship between MSP and SPC may arise as a result of both these factors responding to the evolutionary demands of increasingly intense intergroup competition in the form of warfare. We characterize this evolutionary intergroup pressure through the intensity of warfare, which we measure with two proxies.
The first proxy aggregates 46 variables measuring the realized sophistication and variety of military technologies in Seshat polities, MilTech. These variables code for the presence or absence of various types of technology in six composite categories: handheld weapons, armor, projectiles, and defensive structures, as well as the use of metals for making weapons and armor, and of transport animals used for military logistics. We describe these as "realized" technologies, as our coding approach assigns 1 when there is evidence that a particular weapon, projectile, etc. was used by the coded society and 0 when such evidence is absent. The reason for this "strong evidence" scheme is that our focus is not on whether a technology was known, but whether it was used. A large variety of sophisticated means of attack and defense, thus, serves as a quantitative proxy for the intensity of warfare in the environment of the polity. The MilTech measure used here is the sum of the six composite categories, which are, in turn, aggregated using the above scheme. Thus, the total range over which MilTech can vary is 0-46. Details on the 46 variables and methods of aggregation are in (Turchin, Korotayev, et al., 2020).
The second warfare proxy is Cavalry (mounted warriors or soldiers). We single out this variable as a potential predictor because several hypotheses explaining the rise of moralizing "world" religions during the Axial Age (c.800-200 BCE) identify as the major driving force the new forms of horse-based warfare, which emerged among societies in the Pontic-Caspian Steppe and then spread to the rest of Eurasia (and, ultimately, the whole world) (Bellah, 2011;Jaspers, 1953;Turchin, 2006). The data on the spread of mounted warfare are from Turchin et al. (2016). Previous research suggested that horse-mounted warfare in particular is an important predictor influencing the evolution of the social scale and complexity of polities, beyond the influence of military technologies generally (Turchin, 2009;Turchin et al., 2013). The Cavalry variable differs from the "Horse" variable included in the MilTech measure as Horse codes the use of horses in military activity including logistics (such as draft or pack animals), whereas Cavalry measures the adoption of a package of technological and tactical features employed in mounted warfare.
Resource Scarcity vs. Greater Affluence. Two prominent theories make opposite predictions about the role of resource abundance in the evolution of MSP. Botero et al. (2014) review several studies suggesting that beliefs in moralizing high gods promote cooperation in situations of increased environmental risk. Furthermore, ecological threats can strengthen mechanisms of norm enforcement in human groups (Gelfand et al., 2011). Analysis of a large set of historical data about cultural, linguistic, and ecological factors found that populations inhabiting resource-scarce or uncertain environments have greater tendency to adopt beliefs about moralizing high gods (Botero et al., 2014).
Invoking recent ideas in evolutionary psychology, Baumard et al., conversely, proposed that increasing affluence and declining uncertainty have predictable effects on human motivation and reward systems, moving individuals away from "fast life" strategies (resource acquisition and coercive interactions) and toward "slow life" strategies (self-control techniques and cooperative interactions). These authors adapted Morris' (2013) "energy capture" measure as a proxy for affluence and concluded that economic development, not political complexity or population size, accounts for the rise of moralizing religions in North China, North India, and the Eastern Mediterranean.
Not only do the theories proposed by Botero et al. and Baumard et al. offer contrasting takes on the same relationship (between moralizing religion and resource abundance/scarcity), but questions have also been raised about the proximate measures used in their analyses. The Botero et al. study utilized data from the Ethnographic Atlas to obtain measures of religious practices, political complexity, and economic characteristics; we noted above the limitations of this dataset, which is a static database that under-samples large-scale societies. The Baumard et al. approach combines coarse-grained data (for example, on the "Mediterranean") with fine-grained dynamics of individual societies (e.g., "Greece") at specific points in time. Both the theoretical and empirical aspects of this study have been criticized (Curry et al., 2019; Mullins et al., 2018; Purzycki et al., 2018).
Despite such shortcomings, these works offer intriguing and valuable insights into the possible role of ecological and economic factors in the emergence of moralizing religion. Here, we attempt to add some clarity to these debates by utilizing a quantitative approach for agricultural productivity (as a proxy for resource abundance), SPC, and moralizing religion, based throughout on the same polity-level unit of analysis, while utilizing a global sample of past societies and following the development of key variables in time. For productivity specifically, we use a synthetic measure of a polity's agricultural practices (Agri). Agri is measured in tons of the main carbohydrate crop (wheat, rice, maize, root vegetables, etc.) per hectare per year (see Turchin et al. (2021) for details). In addition, we conduct tests of the effect of environmental variables, using the approach of Botero et al. (2014) to reduce a variety of environmental characteristics to two predictor variables (the first two principal components, EnvPC1 and EnvPC2).
Pastoralism. The final hypothesis that we test here was formulated by Peoples and Marlowe (2012). In their analysis of beliefs in High Gods, using the Standard Cross-Cultural Sample, they found the incidence of active and moral High Gods to be highest in pastoralist societies. Their explanation of this pattern invoked the instability and violence that characterize the fraught pastoralist life and the ease with which pastoralists' primary resource (livestock) can be stolen. "When drought devastates pasture, disease decimates herds, and constant violence over grazing rights becomes unrelenting, a bond of cooperation within one group or tribe must provide a survival advantage when challenged by other feuding groups" (Peoples & Marlowe, 2012, p. 264).
To develop a proxy for this hypothesis, we use the data recently published by the ArchaeoGlobe project (Stephens et al., 2019). This project synthesized the knowledge of c.250 archaeologists who have coded 146 world regions ("AG regions") for the presence of pastoralism (as well as foraging, extensive and intensive agriculture, and urbanization, but our focus is on pastoralism) at 10 time intervals stretching from 10k BP (8,000 BCE) to 1850. ArchaeoGlobe experts coded each AG region for each time step for pastoralism, Pastor, using a categorical scale with four levels, which we translated into a numerical range. These levels and associated numbers are 0: none (no evidence that any land in the region was used for pastoralism), 1: minimal (pastoralism was present, but less than 1% of land in the region was used for it), 2: common (between 1% and 20% of land was used for pastoralism), and 3: widespread (greater than 20% of land was used for pastoralism).
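The translation of the four ArchaeoGlobe categories into numbers can be written as a simple lookup. The sketch below also includes a hypothetical helper that maps a land-use percentage to a level using the thresholds quoted above; ArchaeoGlobe experts coded categories directly, so that helper is purely illustrative, and the treatment of values exactly at 1% and 20% is an assumption.

```python
# Numeric translation of the four ArchaeoGlobe pastoralism categories.
PASTOR_LEVELS = {"none": 0, "minimal": 1, "common": 2, "widespread": 3}


def pastor_series(labels):
    """Convert a chronological list of category labels into numeric scores."""
    return [PASTOR_LEVELS[label] for label in labels]


def pastor_from_land_share(percent_of_land: float) -> int:
    """Hypothetical helper: map a land-use percentage to a Pastor level."""
    if percent_of_land <= 0:
        return 0  # none
    if percent_of_land < 1:
        return 1  # minimal: less than 1% of land
    if percent_of_land <= 20:
        return 2  # common: between 1% and 20%
    return 3      # widespread: more than 20%
```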

Data gathering and collation
MSP and moralizing religion: Our data gathering strategy followed a transparent and rigorous process taking place over several years and involving project experts, research assistants, and a Data Review Board (DRB) (see http://seshatdatabank.info/methods/). The latter comprises the senior team responsible for data management on a given paper. For the present paper, the DRB included three historians (DH, PF, and JL), an anthropologist (HW), and a complexity scientist (PT). The process of data collection for MSP variables typically involved matching each of the fully trained research assistants with one or more experts (recognized authorities on the polity in question, typically holding a relevant doctorate and occupying a faculty position in a university). Initial input by the experts focused on providing help with assembling initial reading lists or, where necessary, advice on how to interpret some of the key historiographical debates. Research assistants gathered the information necessary to put forward a provisional coding recommendation, together with a condensed overview of the data used to buttress that coding, highlighting any areas of uncertainty. These codes are thus based on scholarly sources and are fully referenced. In addition to the codes of "absence", "presence", and "unknown", a coding of "inferred" absence (or presence) was used when direct evidence for a particular variable was sparse or lacking but indirect evidence made clear that it was more likely to have been absent (or present) than not. This approach avoids a situation in which researchers inaccurately coded the trait "unknown" when in fact what was known was more than nothing. In addition, variables could be coded as first absent but then present during transitional periods or could be coded in multiple ways simultaneously where experts disagreed, thus providing grounds for more than one coding outcome. Research assistants then conducted consistency checks. 
Coding recommendations and the data provisionally used to buttress them were then presented to experts for further review, often in multiple iterations. Where research assistants found no information on a particular variable, they assigned a temporary code of "suspected unknown", which was later converted to "unknown" after being confirmed by an expert.
When Seshat experts pointed out disagreements in the literature or disagreed among themselves on a particular coding, we kept a record of this so that multiple analyses could be run taking into account contrasting interpretations. Finally, the DRB reviewed the resulting coding recommendations and supporting data. At this stage the DRB could approve codes as ready for analysis or request further review, where appropriate, involving additional experts to address remaining points of uncertainty. The DRB was also responsible for ensuring at this point that coding conventions were consistently applied across NGAs. Only when the DRB was satisfied that the rationales for coding decisions and the associated buttressing statements were transparently and compellingly articulated, following a set of agreed coding conventions, were the data "frozen" and converted into the correct syntax for the analysis. As such, final responsibility for coding decisions relating to data frozen for publication was assumed by the DRB rather than being outsourced to contributing experts.

Structured analytic narratives
Analytic Narratives (Bates et al., 1998;Bates et al., 2000) are formalized written accounts focusing on in-depth case studies. As part of the Supplementary Online Materials for this paper, we have compiled a group of analytic narratives pertaining to moralizing religions in world history, which will be developed as an edited volume. The goal is to employ the specialized knowledge possessed by historians, archaeologists, anthropologists, and scholars of religion to build and test generalizable theories concerning the factors driving the rise and spread of MSP and moralizing religions. Theories necessarily impose structure on the data by specifying which aspects of past societies are crucial for properly adjudicating between contrasting accounts. Our analytic narratives on moralizing aspects of religion are organized by space and time. Although the selection of regions represented in the analytic narratives was primarily determined by the availability of data previously compiled in the Seshat Databank, we welcome further expansion of geographical coverage as additional scholars become involved in the project.

Assessing the effect of uncertainty in quantifying moralizing religion
Our knowledge about past societies is imperfect and has many gaps. Thus, the statistical methods we use in testing various theories about the evolution of complex societies need to deal effectively with such uncertainty. Our goal should be to avoid the two extremes of either assuming that we know more than we really know, or the opposite, of treating imprecise or incomplete knowledge as no knowledge at all. The analytic strategy that we adopt in this paper involves running all the analyses for scenarios that span these two extremes and examining how this affects our results.
Suppose we have reasonably certain knowledge that some or most aspects of moralizing religion were absent in a particular society at a certain time T. Such knowledge could result from having enough written material produced by the society itself; or, perhaps, there are credible reports from an external observer. Can we make inferences about the state of MSP prior to time T? One scenario results from the assumption that if these elements were attested as absent at a certain point in time (A), then they were similarly absent at any preceding time. We would use the code of inferred absence (A*, with an asterisk indicating inference) and extend it back in time as long as there is absence of rapid cultural change resulting from, for example, conquest, migration, or close cultural contact with a different culture. In the absence of such catalysts, which are often visible archaeologically, culture typically changes slowly. Our data further indicate that declines in MSP are particularly rare (and much rarer than increases). The first inference scenario assumes that we can ignore such rare events.
The alternative would be to assume that no inferences can be made about the past and to treat such data points as unknown (U). At the analysis stage, we would simply omit the rows in the data matrix that contain such missing values. There are problems with this highly conservative approach, however. First, to renounce the ability to make judicious historical inferences on a case-by-case basis is to throw out what we do know about the cultures in question. Second, row deletion could lead to biased estimates because there are often systematic differences between the complete and incomplete cases. Some regions of the world have been subject to greater levels of research effort than others. Omitting many of the lesser-known cases, due to their larger proportion of missing values, would give too much weight to later or better-known societies and certain geographical areas. A third drawback to the conservative approach is that it reduces the sample size and, thus, our ability to detect causal influences in cultural evolution. For these reasons, we consider row deletion to be an inferior approach. Nevertheless, as we said at the beginning of this section, we conducted an analysis with row deletion in order to determine whether (and how much) this change in method affects our conclusions.
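The two scenarios can be illustrated with a short Python sketch operating on a chronological list of codes for one region ("A" absent, "P" present, "U" unknown). The function names are hypothetical. The sketch assumes that no rapid cultural change (conquest, migration, close contact) interrupts the sequence, which in practice has to be judged case by case.

```python
def backfill_inferred_absence(codes):
    """Scenario 1: extend an attested absence back in time as 'A*'.

    Unknowns that precede an attested (or already inferred) absence
    become 'A*'; unknowns preceding a presence are left unknown.
    """
    result = list(codes)
    later = None  # nearest later code that is not unknown
    for i in range(len(result) - 1, -1, -1):
        if result[i] == "U":
            if later in ("A", "A*"):
                result[i] = "A*"
                later = "A*"
        else:
            later = result[i]
    return result


def delete_unknown_rows(codes):
    """Scenario 2: conservative row deletion of all unknowns."""
    return [c for c in codes if c != "U"]


series = ["U", "U", "A", "P"]
print(backfill_inferred_absence(series))  # ['A*', 'A*', 'A', 'P']
print(delete_unknown_rows(series))        # ['A', 'P']
```

The first scenario keeps all four time steps in the analysis, whereas row deletion keeps only two, which illustrates the loss of sample size discussed above.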

Statistical analyses
Descriptive statistics
All analyses reported in this article are based on the Equinox2020 data release of the Seshat Databank and were performed in R version 4.0.2 (2020-06-22). To explore and summarize the relationship between moralizing religion, predictor variables, and time, we first examine basic statistics and perform a correlation analysis. Results are presented in a correlation matrix. Correlation analysis among the variables was performed with the R PerformanceAnalytics package.

Relative timing
We examine temporal interrelations in the dynamics of sociopolitical complexity and moralizing religion by looking at the relative timing of increases in these variables. This analysis advances a previous paper on the topic (Whitehouse et al., 2022) by employing a more nuanced and quantitative measure of the various constituent elements of moralizing religion identified above. The goal is to offer clarity on the competing theories about the reasons that MSP arose, when and where they did, and, more significantly, why they have come to play such a dominant role in religious practice around the world today.
First, we ask when each region in our sample crosses into "high complexity" territory, using the threshold of SPC1 = 5.3 (see Results: Correlations below for how this threshold was chosen). Next, we define a relative time scale (RelTime) with 0 at the time when the SPC1 trajectory crosses the 5.3 threshold. Thus, RelTime = -1000 corresponds to a time point 1,000 years before crossing the threshold, and RelTime = 500 corresponds to 500 years after that event. Only 19 NGAs cross this threshold and are thus retained in this analysis. To determine the relative timing between the increases in moralizing religion and SPC1, we calculate delMSP = MSP(t+1) - MSP(t), where t is time in centuries.
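A minimal Python sketch of these two steps follows. The function names are hypothetical, and "crossing" is taken to mean the first sampled century at which SPC1 is at or above the threshold, which is an assumption about a detail the text does not spell out.

```python
def relative_time(years, spc1, threshold=5.3):
    """Re-center calendar years on the first year SPC1 reaches the threshold.

    Returns None for a region that never crosses the threshold; such
    regions are dropped from the relative-timing analysis.
    """
    for year, value in zip(years, spc1):
        if value >= threshold:
            return [y - year for y in years]
    return None


def del_msp(msp):
    """Century-to-century change: delMSP(t) = MSP(t+1) - MSP(t)."""
    return [later - earlier for earlier, later in zip(msp, msp[1:])]


years = [-300, -200, -100, 0, 100]   # calendar years, century steps
spc1 = [4.5, 5.0, 5.5, 5.75, 6.0]    # made-up trajectory
msp = [0.0, 0.0, 0.25, 0.5, 1.0]     # made-up trajectory

print(relative_time(years, spc1))  # [-200, -100, 0, 100, 200]
print(del_msp(msp))                # [0.0, 0.25, 0.25, 0.5]
```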

Regression analyses
To investigate the relationship between MSP and the potential predictor variables, we fitted a dynamic regression model to the data. This approach has been previously described (Turchin, 2018) and applied to Seshat data (see Turchin et al., 2018, 2019). It allows us to examine the effects of predictor variables (SPC1, MilTech, Cavalry, Agri, and others, see previous section) while controlling for serial autocorrelations, geographic cultural diffusion, and shared cultural history. The regression model used to examine the factors affecting MSP takes the following form:

$$Y_{i,t} = a + \sum_{\tau} b_{\tau} Y_{i,t-\tau} + c \sum_{j \neq i} \exp(-\delta_{i,j}/d)\, Y_{j,t-1} + h \sum_{j \neq i} w_{i,j} Y_{j,t-1} + \sum_{k} g_k X_{k,i,t-1} + e_{i,t}$$

where $b_{\tau}$, $c$, $h$, and $g_k$ are regression coefficients. On the left side, $Y_{i,t}$ is the response variable quantifying MSP in a polity occupying NGA $i$ at time $t$. We sampled polities (or quasipolities) within specific NGAs (Natural Geographic Areas; see above) at century intervals (time step Δt = 100 years). The first term on the right side of the equation, $a$, is the regression constant (intercept). The second term represents autoregressive terms, meaning the influences of previous values of MSP within an NGA, with τ = 1, 2, … (number of centuries) referring to time-lagged values of $Y$. For example, this means that $Y_{i,t-1}$ accounts for the value of MSP 100 years before $t$. The third term accounts for the potential influences of geographic diffusion on MSP, with $c$ representing the regression coefficient for the importance of diffusion and using a negative-exponential form to relate the distance between society $i$ and society $j$ ($\delta_{i,j}$) to the influence of $j$ on $i$. Here $d$ scales the effect of distance on geographic diffusion. We use $d$ = 1000 km because this value approximates the average distance between neighboring NGAs. We avoid potential issues of endogeneity by again applying $Y_{j,t-1}$ to produce a weighted average of the occurrence of MSP in geographic proximity to $i$ in the previous century, with weight diminishing to 0 as the distance between $i$ and $j$ increases.
The fourth term accounts for potential shared cultural history, where $w_{i,j}$ represents the influence of linguistic similarity. This weight is set to 1 if society $i$ and society $j$ share the same language, 0.5 for the same linguistic genus, 0.25 for the same linguistic family, and 0 if they belong to different linguistic families. Linguistic genera and families were derived from Glottolog (Hammarström et al., 2017) and the World Atlas of Language Structures (Dryer & Haspelmath, 2013). The penultimate term reflects the influence of predictor variables, where $g_k$ are regression coefficients and $X_{k,i,t-1}$ are time-lagged predictor variables. Finally, $e_{i,t}$ is the error term.
Analysis of possible evolutionary drivers of sociopolitical complexity was performed using the same general framework, but with SPC1 as the response variable ($Y_{i,t}$).
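The diffusion and cultural-history terms described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the fitting code used for the paper: the function and parameter names (including the coefficient labels `b` and `h`) are hypothetical, the terms are computed as plain weighted sums without normalization, and the model is assumed to be a linear sum of the terms as described.

```python
import math

D_KM = 1000.0  # distance scale d used in the text

# Linguistic-similarity weights w, as given in the text.
LANG_WEIGHT = {"language": 1.0, "genus": 0.5, "family": 0.25, "none": 0.0}


def diffusion_term(distances_km, y_prev, d=D_KM):
    """Sum over other societies j of exp(-distance/d) * Y[j, t-1]."""
    return sum(math.exp(-dist / d) * y for dist, y in zip(distances_km, y_prev))


def cultural_term(shared_levels, y_prev):
    """Sum over other societies j of w[i, j] * Y[j, t-1]."""
    return sum(LANG_WEIGHT[lvl] * y for lvl, y in zip(shared_levels, y_prev))


def predicted_response(a, b, y_lags, c, diffusion, h, cultural, g, x_prev):
    """Linear predictor: intercept + autoregressive + diffusion + cultural + predictors."""
    autoregressive = sum(bt * yt for bt, yt in zip(b, y_lags))
    predictors = sum(gk * xk for gk, xk in zip(g, x_prev))
    return a + autoregressive + c * diffusion + h * cultural + predictors
```

A neighbor at 1,000 km contributes exp(-1), roughly 0.37, of its previous-century MSP value to the diffusion term, while a neighbor at 3,000 km contributes about 0.05.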

Confidence intervals
Post-regression diagnostic tests indicate that the distribution of residuals does not conform to the Normal distribution. For this reason, we use the nonparametric bootstrap (Efron & Tibshirani, 1993) to estimate confidence intervals associated with regression coefficients. To approximate the confidence intervals we resample the data, with replacement, to create 1,000 bootstrapped data sets. We then calculate the statistics of interest (regression coefficients associated with various predictors) for each data set and construct the frequency distribution of the 1,000 bootstrapped values. The 95 percent confidence interval is then approximated by eliminating the smallest 25 and largest 25 values, and the P-value is approximated by the proportion of bootstrapped values less than 0 (if the hypothesis we test is that the effect of the predictor is positive), or greater than 0 (otherwise).
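The percentile bootstrap described here can be sketched in Python for the simplest case of a single-predictor regression slope. This is a toy illustration, not the paper's analysis code: the function names are hypothetical, rows are resampled independently (ignoring the time-series structure of the real data), and the one-sided P-value for a positive effect is taken as the share of bootstrapped coefficients at or below zero.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def bootstrap_slope(x, y, n_boot=1000, seed=0):
    """Return (lower, upper, p_positive) for the slope via percentile bootstrap."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # slope undefined when all resampled x values coincide
        slopes.append(ols_slope(xs, [y[i] for i in idx]))
    slopes.sort()
    cut = round(0.025 * n_boot)  # 25 values trimmed from each tail when n_boot = 1000
    lower, upper = slopes[cut], slopes[-cut - 1]
    p_positive = sum(s <= 0 for s in slopes) / n_boot
    return lower, upper, p_positive


x = list(range(20))
y = [2.0 * a + ((a * 7) % 5 - 2) * 0.1 for a in x]  # made-up data, slope near 2
lower, upper, p_value = bootstrap_slope(x, y)
```

Because the interval comes from the sorted bootstrapped values themselves, it makes no assumption that the residuals are Normally distributed.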

Does the earliest appearance of minimal MSP predict increases in social complexity?
Our first empirical test of the Big Gods hypothesis, which investigated whether moralizing gods tend to appear before significant increases in social complexity (Whitehouse, François, Savage, et al., 2019), was critiqued (Beheim et al., 2019) on the grounds that little could be known about prehistoric beliefs in Big Gods (but see also Whitehouse et al., 2022). Here we examine how much a measure of moralizing supernatural punishment that moves the threshold significantly back in time for a given society affects our results.
To address this question, we defined "Minimal Moralizing Supernatural Punishment" (minMSP) equated to 1 if any element of MSP (primary, certain, broad, targeted, ruler, elite, commoners) is present and 0 if none are present. By definition the appearance of minMSP either precedes the presence of moralizing concern as primary or coincides with it.

Temporal Patterns
We first examine how the incidence and degree of MSP have changed with time (Figure 1). The heat map (red color indicates high density of points) suggests two hotspots: one corresponds to low values of moralizing religion and another one corresponds to high values. As time advances, an increasing proportion of trajectories migrate to the high-level hotspot. The earliest transition is observed in Egypt, which precedes the next earliest shift by nearly 2,000 years. The next two trajectories, Mesopotamia and North India, make the transition to high levels of MSP in the mid-first millennium BCE, which corresponds to the Axial Age as it is traditionally dated (Hoyer & Reddish, 2019; Mullins et al., 2018). The majority of transitions, however, happen later, after 1 CE or in the Post-Axial Period. This concentration of transitions is captured by the yellow "bridge", which connects the two red hotspots. Only two sample trajectories early in this period (North China and Italy) are shown in order not to clutter Figure 1.
Figure S2 in the Supplementary Results presents the basic statistics and correlations between the predictor variables and moralizing religion (also including calendar date to show how all variables evolve with time). The focus of Seshat is on agrarian polities, that is, the period between the Neolithic and Industrial Revolutions. The distribution of sampled time periods peaks between 1500 and 1800. Earlier periods are less well sampled, partly because the adoption of agriculture as a dominant subsistence practice occurred at different times in different world regions, and partly because earlier periods are less well known (for how we deal with the general problem of missing values, see Turchin et al., 2018).

Correlations of MSP with predictor variables
The distribution of SPC1 has two peaks. The first peak corresponds to mid-range societies with a modal polity population of a few thousand (these are typically organized as simple or complex chiefdoms), while the second peak includes large-scale societies with populations of a million or more (typically organized as states and empires). The breakpoint occurs at SPC1 = 5.3, corresponding to polity population = 200,000 (this is also the threshold at which polities tend to transition to the state-level of organization, see Turchin et al., 2019).
The frequency distributions of two other variables are characterized by similar bimodality: MilTech and MSP. The distribution of MSP is even more bimodal than SPC1: most polities are characterized by low values (MSP < 0.2), or by high values (> 0.8), with a few values in between. As we saw in Figure 1, this pattern results from a relatively rapid transition from low to high MSP levels, compared to periods before and after this transition. The plot of MSP against SPC1 shows that the relationship between these two variables is nonlinear: MSP increases very slowly for SPC1 values below 5.3, followed by a rapid rise beyond this threshold.
The distribution of agricultural productivities is unimodal, but with a long right tail. The scatter plot suggests that the relationship between moralizing religion and Agri may be nonlinear, with middle ranges of Agri corresponding to highest values of MSP. We will investigate whether adding nonlinearity in this variable improves the model fit in Dynamic Regression Analysis.
Cavalry is a binary variable with 0 = absence and 1 = presence. Transition periods between absence and presence, when the precise timing of the switch is uncertain, are coded as 0.5, but such transitions are rare.
Examining cross-correlations, we observe that SPC1, MilTech, and Cavalry are all strongly correlated with MSP. However, such correlation analysis cannot reveal causal interconnections between variables. We now proceed to using the temporal component of Seshat to empirically test the competing causal theories.

MSP and complex societies: relative timing
The smoothed delSPC1 curve peaks at RelTime = 0 and is symmetric around the peak. This result confirms that the average rate of increase in SPC1 is fastest at the time when it crosses the 5.3 threshold (this creates the bimodal distribution of SPC1). Next, we observe that most increases in moralizing religion occur after RelTime = 0 (Figure 2). As the smoothed delMSP curve indicates, the average time lag between crossing the high complexity threshold and the transition from low to high moralizing religion is about 300 years. While there is much variation, the great majority of MSP increases come after the peak in SPC1 increase. Thus, these temporal relations are not consistent with an interpretation that the causality flows from MSP to SPC1 (see also discussion of this finding in Whitehouse et al., 2022).

Dynamic regression analysis
We first focus on the possible causal factors explaining the evolution of MSP. Model selection by Akaike Information Criterion (AIC, see Supplementary Results for all models with delAIC < 2) indicates that the best model (with lowest AIC) is as shown in Table 2.
Apart from autocorrelation terms, the strongest effects on MSP are those of Cavalry and Agri (compare standardized regression coefficients in the column "Estimate"), followed by MilTech. The bootstrapped approximate confidence intervals for Agri.sq, EnvPC1, and EnvPC2 overlap 0, suggesting lower statistical support for these terms. Sociopolitical complexity (SPC1) is not selected for the best-fitting model, but shows up in some worse-fitting models. However, its coefficient is negative and not statistically significant (see Supplementary Results for details).
The coefficient of determination for all these models is nearly the same and is very high (R² = 0.923). However, one reason for such a high R² is the temporal autocorrelation terms (that is, the previous value of MSP, one century before, has a very strong effect on the current value of MSP; this effect is nonlinear, as suggested by a strong MSP.sq term). If we rerun the regression model while omitting autoregressive terms, we obtain the following results (Table 3).
This table omits P-values, because their estimates are highly biased when autoregressive terms are omitted. The high R² = 0.75 indicates that the predictor variables explain three-quarters of variance in moralizing religion, suggesting that MSP is strongly conditioned on these predictor variables.
Next, we examine the evidence for reverse causation, from MSP to SPC1. Full analysis of the factors affecting the evolution of sociopolitical complexity is reported elsewhere (Turchin et al. in prep, see also Supplementary Results); here we summarize it. Our analysis shows that the primary influence on the evolution of sociopolitical complexity is warfare (or, more precisely, intense military competition between polities). Two variables, in particular, have a strong effect: development of military technologies (MilTech) and the spread of cavalry. A secondary factor promoting high social complexity is agricultural productivity. When we add MSP to the model, we obtain the following results (Table 4).
This result is strong evidence against the causal effect of MSP on SPC1, because the MSP term is associated with a negative t-statistic that is not statistically significant at P < 0.05 level.
We now test whether a different measure of MSP, Minimal MSP (defined as the first appearance of any MSP elements, see Methods), has an effect on this result. The average difference between the first appearances of minMSP and MSP as primary is c.1000 years (the median is 450 years). When we use minMSP as a possible predictor (instead of MSP), however, we still do not detect any significant effect on SPC1 (see Supplementary Results for details). Thus, the best model (by AIC) suggests that the main factors driving the evolution of social complexity are the proxies for warfare intensity (MilTech and Cavalry) and agricultural productivity (Agri). At the same time we find no effect of MSP, whether we use as predictor the full quantitative measure or Minimal MSP (as well as moralizing supernatural concern as primary, see Whitehouse et al., 2022).
Overall, dynamic regression analysis reveals the following structure of causal arrows connecting warfare, sociopolitical complexity, and moralizing religion. There are no causal arrows going from either MSP to SPC1, or from SPC1 to MSP. Instead, SPC1 is affected by other evolutionary forces (intensity of military competition and productivity of agriculture). The main factors affecting MSP are very similar: the warfare proxies (Cavalry and MilTech) and intensity of agriculture. However, the effect of Agri on MSP is nonlinear, requiring a quadratic term to fully capture. Additionally, we detect a moderate effect of Pastoralism (the estimated standardized coefficient is lower than for other strongly supported terms, but the bootstrap-estimated 95% confidence interval does not overlap zero). Finally, we found weaker evidence that environmental variables (EnvPC1 and EnvPC2) have an effect on MSP. EnvPC2, in particular, is consistent with the hypothesis that environmental risk may play some role, although it is not a major driver of MSP (and statistical support for it is not high, as the bootstrap-estimated 95% confidence interval overlaps 0). The result of these causal influences is a positive correlation between all variables (see correlation graph in SOM). But the dynamic regression analysis indicates that the strong correlation between MSP and SPC1 is not causal and, thus, in a sense, spurious (Rohrer, 2018), arising because both processes are driven by a similar set of causal factors.
We have extensively tested how this overall result is affected by (1) various ways in which moralizing religion is quantified, (2) by utilizing distinct methods for handling the effect of uncertainty in moralizing religion, and (3) by using alternative measures of MSP focusing solely on AfterLife, ThisLife, or Agency as response variables (see Methods and Supplementary Results). These analyses suggest that the overall result is robust. The strongest effects that we have detected, namely the effect of warfare intensity and agriculture on both sociopolitical complexity and moralizing religion, and the absence of direct causation between moralizing religion and complexity, are supported in all scenarios and model specifications.

Discussion
The analysis presented here provides strong support for the view that military competition between societies is one of the main factors driving the evolution of MSP and moralizing religions. A number of military innovations helped to shift the balance between offensive and defensive warfare in favor of offense, intensifying military competition between societies and increasing the probability that defeated groups were eliminated as cultural entities (Turchin, 2003, 2009, 2016). The resulting process of cultural multilevel selection favors the spread of cultural traits that (1) sustain large-scale societies (because having more soldiers and taxpayers increases the probability of survival in between-society competition) and (2) promote "ultrasocial" institutions, including religious ones, that increase internal cohesion and cooperation in large-scale societies (because, all else being equal, societies that solve collective action problems most effectively are more likely to survive such competition). One of the most important military innovations in history to have shifted the offense/defense balance in favor of the former was mounted warfare, or cavalry (Turchin, 2009). The potential of horse-riding in combat was successfully harnessed by Pontic-Caspian nomads around 1000 BCE (Drews, 2004). Mounted warfare, together with a powerful but short compound bow (which could be used on horseback) and the nearly simultaneous spread of iron metallurgy (whose new smelting technologies made arrows deadlier), triggered a military revolution in agrarian societies located along the Steppe belt. New forms of warfare spread rapidly through Afro-Eurasia, triggering additional military innovations in areas such as armor to better protect against projectiles (Drews, 2004).
Agrarian societies that were unable to secure an ample supply of horses for their cavalries were forced to dramatically scale up the size of their infantry armies to survive in the face of the new existential threat (Turchin, 2016). Previous work (Turchin et al., 2013) modeled these processes in theoretical terms, strongly suggesting that the pressures from cavalry warfare played a significant causal role in the rise and spread of "Macrostates" (defined specifically as polities controlling at least 100,000 km² of territory) across Afro-Eurasia (see also Bennett, 2020).
Although cavalry warfare provides us with one of the most important evolutionary drivers for large-scale societies, ultrasocial institutions, and moralizing religions, it is only one instance of a military innovation that had large consequences in history. Other such innovations include the chariot, which had earlier revolutionized warfare in Bronze Age Eurasia. Furthermore, although the horse stands out as by far the most effective animal in warfare, domestication of other transport animals, such as donkeys, camels, and llamas (often also linked to the expansion of trade rather than military goals) is also statistically associated with the subsequent rise of large-scale societies (Turchin, 2009). Finally, after 1500 CE, important military innovations included the spread of gunpowder weapons and ocean-sailing ships (Chase, 2003; Cipolla, 1965; Roberts, 1956).
Cavalry warfare thus appears in at least some regions of the world to have been an important evolutionary driver not only of social complexity, but also of the rise and spread of moralizing religions. In addition to questions of ultimate causation, this interpretation of the data also raises interesting questions about the possible proximate mechanisms linking warfare to the proliferation of MSP beliefs. One possible mechanism would be the well-documented psychological effects of outgroup threat on both levels of religiosity (Jong & Halberstadt, 2018) and normative tightness (Gelfand et al., 2011). But although existential anxiety in general, and warfare in particular, seem to motivate stricter adherence to norms that may or may not entail MSP beliefs, evidence of a direct causal link between militarization and MSP beliefs specifically is presently scant, requiring further investigation. Moreover, it is possible that the real evolutionary driver of the rise and spread of moralizing religion was not warfare, but some other process with which our warfare intensity proxies are strongly correlated. Future investigations of this issue should consider additional explanatory factors, based on empirically discernible proxies for the postulated processes, and add these variables to the analysis.
Whatever the ultimate drivers of interpolity competition intensity turn out to be, our results here clearly support the finding of our earlier papers (Whitehouse et al., 2022), that the appearance of moralizing religion follows rather than precedes the rise of large-scale complex societies. Our analysis shows that the sharpest rises in social complexity precede moralizing religions, on average by three centuries, a finding that has been the subject of much recent debate (Beheim et al., 2019). More significantly, the strong correlation between sociopolitical complexity and moralizing religion is a result of shared evolutionary drivers, including intense military competition aided by increasing agricultural productivity. In addition, moralizing religion, but not sociopolitical complexity (see Supplementary Results: Evolutionary Drivers of Sociopolitical Complexity), is also affected by pastoralism. These causal arrows are summarized in Figure 3.
We do not include possible effects of environmental variables in Figure 3, because the evidence for them is statistically weak (estimated confidence regions overlap 0). In this our results differ from an analysis of data in the Ethnographic Atlas by Botero et al. (2014), who found that belief in moralizing high gods is more prevalent in societies that inhabit poorer environments that are prone to ecological distress. In contrast, our results suggest that the first principal component, positively correlated with means and predictabilities of temperature and precipitation and negatively with variance in temperature (see the PCA results and Figure S1 in Supplementary Results), has a positive effect on MSP (with a caveat that statistical support for it is weak). On the other hand, the positive effect of the second principal component, proxying for hot and dry environments, is more in line with the conclusion of Botero et al. These researchers additionally documented a positive correlation between moralizing high gods and their measure of political complexity (number of jurisdictional hierarchy levels). Our study also found this correlation, but we conclude that it was not causal, since this effect disappeared once the military competition proxies were included in the model. The third factor, detected by the analysis of Botero et al., was the positive effect of animal husbandry. In this our results agree, as we also found a statistically significant effect of pastoralism on MSP, although its magnitude was not high. More generally, there is nearly universal agreement among the analyses based on the Standard Cross-Cultural Sample, the Ethnographic Atlas, and now Seshat Databank, that animal husbandry/pastoralism is a significant positive influence on moralizing religion (see also Brown & Eff, 2010; Peoples & Marlowe, 2012).
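The criterion invoked at the start of this comparison, a bootstrap confidence interval that does or does not overlap zero, can be illustrated with a minimal sketch. It assumes a single predictor and simple case resampling with a percentile interval; the bootstrap scheme applied to the Seshat time series may differ, and the data, the coefficient 0.8, and the function names `slope` and `bootstrap_ci` are hypothetical.

```python
import random


def slope(x, y):
    """Ordinary least-squares slope of y on a single predictor x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def bootstrap_ci(x, y, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the slope, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(slope([x[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    lo = estimates[int((1 - level) / 2 * n_boot)]
    hi = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi


# Invented data: one predictor with a real effect on the response.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.8 * xi + rng.gauss(0, 1) for xi in x]

lo, hi = bootstrap_ci(x, y)
overlaps_zero = lo <= 0.0 <= hi
print(f"95% interval for the slope: [{lo:.2f}, {hi:.2f}]; overlaps zero: {overlaps_zero}")
```

A predictor whose interval excludes zero would be treated as supported under this criterion, whereas one whose interval straddles zero, as with the environmental components here, would be reported as weakly supported.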
Such a mixture of agreement and disagreement between our results and analyses of the ethnographic databases is to be expected. It is due, in part, to Seshat's dynamic approach that allows us to trace how variables change with time, thus giving us a better ability to capture cause-effect relationships. Additionally, the Seshat Databank places a much greater emphasis on large-scale complex societies, which are undersampled in the ethnographic databases. Furthermore, analyses based on existing ethnographic databases utilize a simple binary measure of moralizing religion. While this approach works well for establishing broad-brush patterns, our analysis demonstrates the benefits of capturing additional nuance in the dynamics of MSP.
The relationship between MSP and "affluence," or economic development, is similarly complicated, as suggested by a comparison of our results to those from the previous analysis of , who proxied affluence with an index of energy capture, derived from Morris (2013) (we do not yet have a Seshat variable for a direct comparison, but coding efforts for such a measure are underway). Nevertheless, a key input in their measure is agricultural productivity (Agri), for which we can use Seshat. Our results provide support for the view that this aspect of development is a factor in the rise and spread of MSP: when Agri is added to the regression model it helps to account for substantial additional variation in the MSP measure. However, this effect is nonlinear, and the strongest positive effect of Agri is achieved at intermediate levels of this variable.
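The nonlinear effect described above can be written schematically as a quadratic specification. This is a sketch under stated assumptions: the coefficient symbols, the signs (inferred from the verbal description of a peak at intermediate levels), and the turning point Agri* are illustrative notation, and the dots stand for the other terms of the fitted model, which are not reproduced here.

```latex
\begin{align}
\mathrm{E}[\mathrm{MSP}] &= \beta_0 + \beta_1\,\mathrm{Agri} + \beta_2\,\mathrm{Agri}^2 + \dots,
  \qquad \beta_1 > 0,\ \beta_2 < 0, \\
\frac{\partial\,\mathrm{E}[\mathrm{MSP}]}{\partial\,\mathrm{Agri}} &= \beta_1 + 2\beta_2\,\mathrm{Agri}, \\
\mathrm{Agri}^{*} &= -\frac{\beta_1}{2\beta_2}.
\end{align}
```

Under these assumptions the marginal effect of Agri is positive below Agri* and negative above it, so predicted MSP is highest at intermediate productivity whenever Agri* falls inside the observed range of the variable.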
The nonlinear effect of Agri may offer an explanation for the divergent effects of environmental variables that our analysis detected. Perhaps the effect of the first environmental principal component is associated with the positive effect of Agri, observed at low to intermediate levels of this variable, while the effect of the second principal component (hot and dry environments) is associated with the negative effect of Agri at intermediate to high levels of this variable. The latter effect is also in agreement with the finding that Pastoralism is a positive influence on MSP (because Pastoralism is associated with hot and dry environments). We emphasize that this interpretation is speculative, given the data we currently have. We need to develop better and more nuanced instruments to unravel this complex nexus of environmental and productivity influences on religion. Thus, at present our regression results can neither support nor reject the life-history theory. Instead, our critique centers on the empirical and conceptual foundation of previous tests of the theory.
The life-history theory proposed by Baumard and colleagues (Baumard & Boyer, 2013; Baumard & Chevalier, 2015; cf. Purzycki et al., 2018) utilized measures of MSP extracted from a standard list of Axial Religions and Movements (Greek philosophy and Second Temple Judaism in Eastern Mediterranean, North Indian movements such as Buddhism and Jainism, and Taoism and Confucianism in North China). These measures do not fully capture the diversity and nuance of the historical data, especially the nature and extent of moralizing monitoring and enforcement and institutionalized measures to promote prosociality. They also exclude from consideration other faiths that were at least as moralizing as those included (Hoyer & Reddish, 2019; Mullins et al., 2018). For example, Baumard et al. count Egypt as part of their non-Axial regions. Yet the Seshat data, buttressed by an extensive analytic narrative devoted to Egypt, indicate that Egypt was one of the earliest regions in the world to develop a religion in which concern for interpersonal morality could be described as primary, preceding the Axial Age, as usually defined, by two millennia (see Figure 1).
The life-history approach also brings to the fore various theoretical concerns. Baumard et al. focus on how individuals respond to affluence, while sociologists of religion emphasize that the process of adopting theistic beliefs is essentially social (Stark, 1996;Stark & Bainbridge, 1996). As preindustrial societies grew more affluent, most individuals living in them did not enjoy greater affluence. Part of the explanation for this lies in changes, which can be analyzed in Malthusian and Marxian terms, that meant that population growth up to the carrying capacity of cultivable land negated advances in productive technology and resulted in elites appropriating surpluses. Untangling these issues requires considering both individual-level and society-level processes, but the interplay between them is controversial. This can be seen in the divergent views of evolutionary psychologists, human behavioral ecologists, and cultural evolutionists on the role of cultural group selection in explaining developments in human cooperation (Richerson et al., 2016).

Conclusions
This article focuses on a sample of regions where a multifaceted coding of MSP could be developed and analyzed alongside documented increases in sociopolitical complexity. This coding breaks relevant evidence down into constituent elements focused on the type, range, and focus of moralizing supernatural powers. Combining these elements into a single quantitative measure makes it possible to trace the evolution of this significant cultural innovation in considerable detail. We find some evidence that beliefs in moralizing supernatural powers have ancient roots in some parts of the world, but the idea that such powers can monitor and enforce moral norms tends to increase after rather than before the sharpest rises in social complexity (Whitehouse et al., 2022). Not only do these moralizing powers' abilities increase in scope, but we find that they become more strongly focused on moral behavior, punishment for violations becomes more targeted and certain, and the provenance of punishment is extended to more groups in a society. In short, we find that as societies grow in complexity (notably driven by increasingly intense inter-state warfare), they tend to produce religions concerned with policing morality in human affairs in increasingly systematic ways. This policing function may have facilitated cooperation as societies grew more internally differentiated and, at the same time, fostered effective collective action against rival polities.
In our sample, we find that both MSP and sociopolitical complexity were strongly influenced by the evolutionary demands of intense inter-state competition, particularly cavalry warfare. Our analysis also supports the hypothesis that the productivity of agriculture and pastoralism have a positive effect on the evolution of MSP. Utilizing the large dynamic dataset gathered by Seshat: Global History Databank, we were able to trace how all of these factors relate to each other in our global sample and make inferences about temporal causality. We structured the Seshat Sample deliberately to include regions where large-scale societies organized as states formed early, as well as regions with smaller-scale ones (and everything in between).
Our dynamic data show that moralizing religions tend to persist even after the states first adopting them have disappeared. A possible explanation for this is that, once established, moralizing religions confer such a significant advantage to the populations of a given area that they are preserved, even following societal collapse, invasion, or reductions in sociopolitical complexity. Moreover, as doctrinal systems (Whitehouse, 2004), moralizing religions tend to spread very quickly and efficiently to neighboring societies or are readily adopted by new powers occupying territory encompassing populations that adhere to such belief systems.
As well as clarifying some long-standing debates among scholars in a variety of fields, our findings raise several significant questions that can be approached in new ways: Do all ten MSP characteristics we identify here have similar evolutionary effects? Are some characteristics (for example, those affecting the domain or intensity of moralizing enforcement) more effective than others at strengthening cooperation? Do some MSP characteristics help to suppress structural inequalities or tensions, contributing to stability at high levels of complexity? Do MSP characteristics foster trust across ethnic divisions, as these grow more complex and fractious as a result of invasion, incorporation, migration, and the expansion of trading networks? How do beliefs in the afterlife or impersonal forces like karma fit into the picture? How do these MSP elements overlap, or interact, with those traits identified as universal Moral Foundations found in all human societies (Curry et al., 2019; Haidt & Joseph, 2008; McKay & Whitehouse, 2014)? Do these religions confer societal benefits beyond success in intense intergroup competition, such as increased longevity and other measures of well-being for different segments of the population? How did the specialization, volume, and scope of trading networks contribute to the development and spread of MSP? How did MSP interact with secular institutions designed to solve collective-action problems, such as imperial bureaucracies and policing organizations that monitored people's contributions to public goods (and could impose punishments when individuals fell short)?
Although our focus in this article is on the moralizing aspect that may promote social cooperation, religion may also serve as an instrument of social control by legitimizing inequality and despotic power. And, as we acknowledged in the Introduction, religious differences can trigger prejudice and antisocial attitudes towards minorities and outgroups, leading to conflict, rather than cooperation. How do we study such divergent functions within a single evolutionary framework? These questions require further exploration. Utilizing large, dynamic (time-resolved) databases like Seshat is, we argue, a useful approach to address such big questions about human evolution.
Developing a more refined set of measures of MSP led us to identify a number of key areas in which additional work is needed to develop a truly global explanation of the relationship between sociopolitical complexity and the rise and spread of moralizing religions. While this study analyzes moralizing punishments and rewards using several new variables, it does not yet address all degrees and aspects of MSP. For example, it considers whether rulers are subject to systems of reward and punishment, but not how such rules might or might not apply to other social categories (e.g., low-status groups, women, children). It focuses on moral transgressions most likely to affect interpersonal cooperation (such as assault or lying) but does not address whether moral offenses against the gods, such as failure to carry out ritual obligations, are cultural proxies for good behavior in other domains. This paper demonstrates the utility of a more fine-grained approach to the historical development of MSP, encouraging further refinements like these.
Our study also draws attention to the need to broaden the scope of analysis in order to be more globally representative. Having focused on regions that figure prominently in explanations for the rise of MSP during the Axial Age, we recognize the importance of considering regions where sociopolitical, economic, and religious trajectories were quite different. These include centers of domestication and state formation like Mesoamerica and the central Andes, as well as parts of sub-Saharan Africa, Polynesia, and North America that had complex societies at the time of contact, where local traditions represented a diverse range of supernatural powers. Our analysis has already started addressing this gap, but building an even broader set of case studies will present opportunities, and challenges, for refining and testing military, economic, and other factors that are common across a global sample. Because of the lack of detailed pre-colonial religious texts in these regions, archaeological data constitute a key source of evidence, raising important practical questions about how to develop analytical narratives that draw on both the written and material records.
Synthesizing the evidence from archaeology and history on the evolution of moralizing religions represents an exceedingly challenging aspiration, but a necessary one for robust conclusions to be reached. The findings reported here may not accord well with those assembled by scholars working with datasets based on late nineteenth and early to mid-twentieth century ethnographies of indigenous societies, many of which experienced generations of colonial rule and missionary efforts before the arrival of anthropologists. Furthermore, using the ethnographic present as a proxy for inferring processes of religious evolution in the past blurs the distinction between the prehistoric origins of complex societies, detectable only in their archaeology, and more recent increases in social complexity, as recounted in the writings of explorers, missionaries, ethnographers, and other literate observers prior to and during early phases of colonization. There is good reason to suspect that these are problematic records to use as proxies for Pleistocene foragers, early Holocene farmers, or the cities and states that developed long before the first narrative histories (Singh & Glowacki, 2021).
Seshat results highlight some of the limitations of continuing with indirect studies of human sociocultural evolution, but they also offer an important way forward toward a more comprehensive explanation of the human past. We hope that by providing access to our data, analyses, and conclusions in parallel with the process of peer review, we will encourage critical engagement from a broad range of scholars. Such engagement would allow us not only to demonstrate the usefulness of a quantitative approach to the analysis of world history in tackling longstanding puzzles in the study of cultural evolution, but also to increase the scope and quality of the data and methods available to researchers.

Data availability and supplementary online materials
The Seshat team makes its data and analysis scripts publicly available in several ways. First, we periodically publish "snapshots" of the Seshat Databank for well-curated variables and polities. The current such data release is Equinox-2020 (http://seshatdatabank.info/databrowser/), which presents data both in browsable format and as a spreadsheet. Whereas the spreadsheet contains data in computer-readable form suitable for statistical analyses, the Seshat Data Browser also includes narrative paragraphs explaining the codes, as well as references. Second, we deposit in open access all data on which analyses are based at the time of publication of the article that reports these analyses. These "replication datasets" are published as downloadable spreadsheets (see Seshat Datasets, http://seshatdatabank.info/datasets/).
In addition to the Supplementary Results and code for analysis (https://osf.io/pa4qf/), we have made available Analytic Narratives describing moralizing supernatural punishment and reward in the polities for each region, which have guided our coding decisions (http://seshatdatabank.info/databrowser/moralizing-supernatural-punishment-narratives.html). A separate coding table summarizes the codes generated for each of the NGAs used in these analyses (http://seshatdatabank.info/databrowser/moralizing-supernatural-punishment-nga_tables.html). Finally, a list of domain experts consulted is available at http://seshatdatabank.info/databrowser/moralizing-supernatural-punishment-acknowledgements.html.