Energy, hierarchy and the origin of inequality

Where should we look to understand the origin of inequality? I propose an unusual window of evidence—modern societies. I hypothesize that evidence for the origin of inequality is encoded in the institutional structure of industrial societies. To test this idea, I use a model to project modern trends into the past. This model takes the modern relation between energy, hierarchy, and inequality and creates a hindcast of the origin of inequality. The results are broadly consistent with the available evidence. The model predicts an explosion of inequality with the transition from hunter-gathering to agriculture, followed by a plateau. This finding potentially opens a new window of evidence into the origin of inequality.


Introduction
The origin of inequality is one of the great mysteries of human social evolution. For the vast majority of our history, we lived in small bands that were fiercely egalitarian [1]. But then around 10,000 years ago, something changed [2,3]. For reasons that remain poorly understood, we began to abandon our ancestral state, and started allowing some individuals to command vastly more resources than others. At first inequality was the exception, but it soon spread until it became the most common form of organization.
These windows focus either on societies that are long gone, or societies whose form is archaic. This is perfectly reasonable, but it also limits the evidence we can uncover. The archaeological and written record of inequality will always be sparse. And traditional societies are rapidly disappearing from the world. Given the limits of these windows, where else might we look to study the origin of inequality? I suggest we draw inspiration from evolutionary biology.
One of the great breakthroughs in studying the origin of life was the discovery that the DNA of living organisms contains a coded history of their evolution [31]. Might something similar be true of human societies? Might the social structure of modern societies contain a coded history of the origin of inequality? I test this possibility here. I use institutions as the social corollary of DNA. Institutions are systems of organizing that are passed between generations. I think we can use modern institutional trends to infer the origin of inequality. a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 bars indicating the uncertainty in average firm size. As energy use per capita increases, average firm size increases as well.
The growth of average firm size is not caused by a horizontal shift in the distribution. Instead, it is caused by a fattening of the distribution tail. Fig 1B visualizes this behavior. Here I group the countries of the world into quintiles (5 groups) ranked by energy use per capita. For each quintile, I plot the aggregate firm size distribution. Note how the slope of the firm size distribution decreases with greater energy use. This indicates that large firms become more common.
The firm size distribution can be modeled by a power law [48][49][50][51]. This means that the probability of finding a firm of size x is roughly proportional to x −α , where α is the power-law exponent. A smaller power-law exponent indicates a fatter tail. As shown in Fig 1B, greater energy use is associated with a smaller power-law exponent for the firm size distribution. This provides a simple way to model the relation between energy use and firm size.

Institution size and hierarchy
The second step of the energy-hierarchy-inequality hypothesis is to connect institution size to hierarchy. I hypothesize that (virtually) all human institutions are hierarchically organized. This means they have a nested chain of command that grows with institution size. As the hierarchy grows, new ranks are added at a logarithmic rate [52,53]. This scaling behavior has been observed in business firms [54], historical empires [55], and hunter-gather societies [56]. Hierarchical organization also means that elite ranks should become more common as a hierarchy grows. Assuming that managers occupy top ranks, this implies that the management share of employment should increase with average firm size. This trend has been observed at the international level [48].
The most direct evidence for hierarchical organization comes from firm case studies [57][58][59][60][61][62]. Fig 2 shows the hierarchical structure of six case-study firms (which come from Britain, the Netherlands, Portugal, and the United States). Although the specific structure varies, all six firms share the pyramid shape that we expect of a hierarchy. I use these case studies to inform the energy-hierarchy-inequality model (see Section 7 for details).
To summarize, the evidence suggests that institutions tend to become larger as energy use increases. If institutions are hierarchically organized, this implies that the growth of energy is associated with the growth of hierarchy.

Hierarchical power and income
The last component of the energy-hierarchy-inequality hypothesis is a relation between hierarchical power and income. The idea is that elites use their power within a hierarchy to gain preferential access to resources.
Is the same true for income? Evidence suggests so. But before looking at this evidence, I note a key difference between human and non-human hierarchies. All other animals form linear hierarchies-an ordinal ranking from top to bottom. But humans form branching hierarchies, in which each superior controls multiple subordinates. This has important consequences for income distribution. In a branching hierarchy, the number of subordinates grows exponentially with rank (Fig 3). If income stems from power over subordinates, than it too should increase exponentially with rank. This means that hierarchy can lead to vast inequalities.
To make this relation quantitative, I define 'hierarchical power' as: The idea is that control over subordinates is a form of power-it increases "the possibility of imposing one's will upon the behavior of other persons" [80]. All individuals start with a baseline power of 1, indicating they have control over themselves. Hierarchical power then increases proportionally with the number of subordinates. Is income within hierarchies a function of hierarchical power? Evidence from case-study firms suggests so. Fig 4 plots average income (relative to the bottom hierarchical level) against average hierarchical power for each rank in our six case-study firms. There is a strong correlation. A similar correlation exists between changes in income and changes in hierarchical power [81]. In an idealized hierarchy, the total number of subordinates (blue) tends to grow exponentially with hierarchical rank (red). The exact relation will depend on the span of control-the number of subordinates directly below each superior.
https://doi.org/10.1371/journal.pone.0215692.g003  [57][58][59][60][61][62]. The vertical axis shows average income within each hierarchical level of the firm (relative to the base level), while the horizontal axis shows my metric for average power, which is equal to one plus the average number of subordinates below a given hierarchical level. Each point represents a single firm-year observation, and color indicates the particular case study. Grey regions around the regression indicate the 95% prediction interval. For methods, see Section 7. https://doi.org/10.1371/journal.pone.0215692.g004 Energy, hierarchy and the origin of inequality The power-income relation implies that inequality should increase as a hierarchy grows. This is because hierarchical power gets concentrated as a hierarchy gets larger (Fig 5). Importantly, this relation is non-linear. The initial growth of hierarchy rapidly concentrates power. But further growth of hierarchy leads to progressively slower growth of hierarchical-power concentration. If income scales with hierarchical power, the same should be true of inequality. As a hierarchy grows, inequality should explode and then plateau.
To summarize, modern evidence suggests a joint relation between energy use, hierarchy, and inequality. As energy use increases, societies become more hierarchical. If income is proportional to hierarchical power, this should cause an increase in income inequality. To investigate the origin of inequality, I propose that we extrapolate this relation back in time.

An energy-hierarchy-inequality model
To extrapolate the energy-hierarchy-inequality evidence, I create a numerical model. This model simulates the empirical relation between energy, hierarchy, and income. I discuss the basic components of the model below. For a technical discussion, see Section 7.

Model assumptions
The energy-hierarchy-inequality model extrapolates modern trends into the distant past. To do this, we assume the following: Assumption 1. Institutions have a power-law size distribution. The growth of institution size is synonymous with a decline in the power-law exponent.

Assumption 2.
Institutions are hierarchically organized with a structure equivalent to modern firm hierarchies.

Assumption 3.
The modern trend between energy use per capita and institution size applies to all societies.

Assumption 4.
Income scales with hierarchical power in all societies. The rate of scaling may vary over time and space.
Are these assumptions realistic? Regarding assumption 1, there is evidence that pre-capitalist societies had a power-law distribution of institution size. For instance, feudal manor size was roughly power-law distributed [82,83]. Similarly, slave estate size in the antebellum American South was roughly power-law distributed (see S1 Fig). Evidence also suggests that huntergatherer settlement sizes had a power-law distribution tail [84]. The types of institution certainly vary across time and space. But regardless of type, the power-law distribution of institution size seems common. Assumptions 2, 3 and 4 are speculative. But given empirical evidence, why not extrapolate it and see where it takes us?

Model structure
The energy-hierarchy-inequality model has four main steps, discussed below. For technical details, see Section 7.
Step 1: Generate the institution-size distribution. The model generates an institution size distribution using a discrete power law. The power-law exponent varies stochastically over different model iterations. This simulates changes in institution size.
Step 2: Estimate energy use from institution size. Energy use per capita (E pc ) is modeled as a function of average institution size � I: The parameters c 1 and c 2 are determined from a regression on the international energy and firm data shown in Fig 1A. Step 3: Create hierarchical structure. The model uses firm case-study data (Fig 2) to determine the hierarchical structure of institutions. All modeled institutions have the same 'shape', but the number of ranks varies with institution size.
Step 4: Endow individuals with income Individual income I scales with hierarchical power P as where β determines the rate of scaling and � is a noise factor. To simulate variation between societies, β varies stochastically between model iterations. I use case studies of modern firms, as well as an antebellum US slave estate, to determine a plausible range for this variation. The noise factor � adds a small amount of dispersion to the power-income relation. This is determined by income dispersion within hierarchical levels of the case-study firms. On its own, the noise factor corresponds to a Gini index of about 0.1.

Between-Institution Income Dispersion.
The model excludes income dispersion between institutions. US evidence suggests that between-institution income dispersion accounts for a minority of total income dispersion (about 30%) [85]. I assume that the growth of between-institution dispersion is not important for the emergence of inequality. Future research can determine if this is an appropriate assumption. Fig 6 visualizes the energy-hierarchy-inequality model as a landscape. Hierarchies appear as pyramids, with hierarchical rank indicated by height and color. On top is a subsistence society that consumes 5GJ of energy per capita per year. This is 3200 Kcal per day-not much above the metabolic needs of an average person. Hierarchical organization is negligible. Consequently, hierarchical power is very equally distributed, with a Gini index of 0.13. We expect very little inequality in this society. On the bottom is an industrial society that consumes 500GJ of energy per capita per yearsimilar to modern Iceland or Qatar. Hierarchical organization is ubiquitous. Consequently, hierarchical power is extremely concentrated, with a Gini index of 0.76. We expect significant inequality in this society.

Extrapolating the origin of inequality
I use the EHI model to extrapolate the origin of inequality. Fig 7 shows the predicted relation between energy use, the concentration of hierarchical power, and inequality. There are four notable predictions: 1. Hierarchy vanishes at metabolic levels of energy use, causing a collapse of inequality.
Hierarchical organization disappears as energy use approaches metabolic levels (i.e. food energy only). Consequently, hierarchical-power concentration is eliminated and inequality becomes negligible.
2. Inequality explodes during the transition to agriculture. Virtually all increases in inequality occur during the transition from subsistence to agrarian levels of energy use. (In Fig 7, agrarian energy use is represented by Eastern Eurasia from 5,000 BCE to 1500 CE [86]).
3. The range of inequality grows with energy use. The transition to agriculture opens a huge range of 'inequality space'. The governing factor is β-the rate that income scales with hierarchical power. Societies with low β remain equal during the transition to agriculture. But societies with high β experience an explosion of inequality.

Energy growth beyond agrarian levels has little effect on inequality.
After the transition to agriculture, the concentration of hierarchical power plateaus. As a result, further increases in energy use have a negligible effect on inequality.

Testing the energy-hierarchy-inequality prediction
The EHI model predicts how the emergence and evolution of inequality should relate to energy use. Fig 8 compares this prediction to the available evidence. Fig 8A compares the model to archaeological data for ancient societies. The caveat is that the archaeological data measures inequality using house size [18]. This is not strictly comparable to the 'income inequality' produced by the EHI model. Nonetheless I make a comparison. The archaeological data is grouped by societal adaptation. Horizontal error bars indicate the plausible range of energy use for each adaptation. Points represent the mean estimate. (For sources and methods, see Section 7). The model's prediction is consistent with the archaeological evidence-inequality explodes during the transition to agriculture.
Fig 8B compares the model to data from pre-industrial societies [29]. Horizontal error bars show the uncertainty in energy use (which is estimated from GDP). Again, the model is consistent with the empirical data. In pre-industrial societies, inequality increases rapidly with energy use.
Fig 8C compares the model to modern evidence. The model's range is consistent with the empirical data. But there is a downward trend in the empirical data that is not predicted by the model. I discuss possible interpretations of this trend below. Fig 8D also compares the model to modern evidence, but measures inequality using the top 1% income share. The empirical data is in a range that is consistent with the model. Again, there is a downward trend in the empirical data, but far less pronounced than in Fig 8C. To summarize, EHI model predictions for the origin of inequality are consistent with the available evidence. But for industrial societies, the model predictions are more ambiguous. Modern evidence is within the range predicted by the model. However, the data shows a decline of inequality with energy use that is not predicted.

The Kuznets curve: The decline of hierarchical despotism?
Fig 9A aggregates all the empirical data in Fig 8A-8C to show the long-term trend between energy use and inequality. A Kuznets curve-like pattern [87] emerges (an inverted U-shaped relation). Inequality tends first to increase with energy use, and then decline. The increase is predicted by the model, but the decrease is not. Is the model wrong?
More evidence is required to answer this question. The problem is that the model predicts a huge range of 'inequality space' for industrial societies. The range of this space is determined by β-the scaling of income with hierarchical rank. I have assumed that the distribution of β is independent of energy use. But this could be wrong. To test the model, we need independent estimates of β in real-world societies. Such estimates do not presently exist.
While we cannot confirm or falsify the model, we can infer how β should behave if the model is correct. To do this, we match the empirical data to the best-fit model iteration. We Testing the energy-hierarchy-inequality model. This figure compares the EHI model to empirical data. Panel A shows archaeological data from ancient societies, measured using housing size and reported by 'adaptation'. Horizontal lines indicate the plausible range of energy use for each adaptation. Panel B shows income inequality in pre-industrial societies. Energy use is estimated from per capita income data (horizontal lines show the uncertainty). Panel C shows data for modern nation-states, with vertical lines showing the range of inequality estimates for each country. Panel D also shows modern data, but measures inequality using the top 1% income share. For sources and methods, see Section 7. https://doi.org/10.1371/journal.pone.0215692.g008 Energy, hierarchy and the origin of inequality then assign the model's β to the real-world society. The resulting inference is shown in Fig 9B. If the model is correct, β should decline with energy use.
Future research can test this inference. For now, I reflect on what it means. The parameter β determines how rapidly income scales with hierarchical power. I interpret β as an index of hierarchical despotism. It measures elites' ability to use their hierarchical power to concentrate resources. A larger β indicates a more despotic hierarchy (greater returns to hierarchical power). The model predicts that hierarchical despotism declines as energy use increases.
This suggests that the Kuznets curve is created by two trends that accompany increases in energy: (1) the growth of hierarchy; and (2) the decline of hierarchical despotism. The first half of the Kuznets curve is created by the growth of hierarchy, which concentrates hierarchical power, leading to greater inequality. But hierarchical power concentration eventually plateaus. At this point, the decline in hierarchical despotism dominates the trend. This causes the second half of the Kuznets curve-inequality declines with greater energy use.
The decline of hierarchical despotism is an untested inference. But it seems plausible. History suggests that as societies develop, they introduce checks on power. These include the rule of law, democracy, and labor unions. Might these checks on power gradually reduce hierarchical despotism? Future research can test if this is true.

Discussion
The results of the model suggest that the energy-hierarchy-inequality hypothesis is plausible. By projecting modern trends into the past, we can accurately predict the origin of inequality. I discuss here some of the implications. I begin with the model's limitations, and then speculate about causation and the role of hierarchy in the emergence of inequality.

Model limitations
The energy-hierarchy-inequality model is built on correlation and is not intended to answer questions of causation. This is a limitation, but is also the main reason the model provides insight. Energy, hierarchy and the origin of inequality The model takes two correlations as inputs: (1) the correlation between energy use and institution size; and (2) the correlation between hierarchical power and income. The model does not explain why these correlations exist. Instead, it explains why they might be important for the origin of inequality. The model indicates that if these trends existed in the past, they imply that inequality arose during the transition to agriculture.
Even though this operates at the level of correlation, it is an important insight. It suggests that modern trends provide a new window into the origin of inequality. The task for future research is to use this window to better understand the ultimate causes.

Causation
For the EHI hypothesis, understanding causation means explaining our two correlations. We want to know why the growth of energy use relates to the growth of hierarchy, and why income increases with hierarchical power. Answering these questions exceeds the scope of this paper. But I will speculate here.
I suspect that energy relates to hierarchy via a feedback loop, meaning causation runs both ways. I think this because different evidence suggests different causal directions. The collapse of the Soviet Union is one example where hierarchy seems to drive energy use. When the Soviet government collapsed, energy use in former Soviet states decreased dramatically [48]. Because there was no global energy shortage at the time, we can plausibly infer that the institutional collapse caused the decline in energy use.
But we can also think of reasons for the reverse causation-when energy drives (or limits) the growth of hierarchy. This ties into the surplus theory of social stratification [88][89][90]. In agrarian societies, the energy surplus of farmers is too small to support many non-farm workers [34]. Thus, there is little room for a managerial class. But if the surplus grew, it could loosen these limits and allow the growth of hierarchy [48,91]. This suggests that energy growth could cause the growth of hierarchy.
These two examples suggest that causation can run both ways-energy can drive the growth of hierarchy and vice versa. Untangling this causal process is a difficult task for future research.
What causes the relation between income and hierarchical power? I think it is likely caused by many different factors. In despotic hierarchies (such as slave plantations), superiors may use coercion and brute force to derive their income. But in less despotic hierarchies, ideology is likely more important. The substance of these ideologies differs, but the function is always the same-to justify the power of elites. Traditional societies often justify power through kinshiptracing lineage to a founding ancestor [92,93]. Feudal societies use religion, as in the divine right of kings [94]. Capitalist societies use ownership to justify power [95]. In each society, the ideology justifies both the authority of elites and their greater access to resources. To understand why income relates to hierarchical power, I think we must understand the ideologies that legitimize power. These ideologies have been well studied [95][96][97][98][99][100][101][102][103][104][105][106][107][108][109][110][111], but much remains to be learned.
We must also take seriously the social practices that evolve to check the power of elites. In modern societies, this would include labor unions and democratic oversight. There is strong evidence that labor unions limit inequality [112,113]. This suggests that by organizing lowranked individuals, unions check the power of elites. There is also evidence that democratic oversight limits the income of elites. For instance, US CEOs in industries that are regulated by government earn less than CEOs in non-regulated industries [114]. And elite compensation in the democratically-controlled public sector is far lower than in the private sector. As one example, the US president earns about 40 times less than CEOs in the largest US firms [115][116][117].
While there are many plausible causes for the power-income relation, studying them brings us back to the problem of measurement. I focused on hierarchical power because it was easy to quantify. But when we try to look under the hood of this power, measurement becomes difficult. For instance, how do we measure the effect of an ideology? To understand causation, we must wrestle with these difficulties. Again, this is a task for future research.

The emergence of hierarchy
Although we poorly understand the mechanisms at work, I want to speculate about the origin story told by the EHI model. It suggests that the origin of inequality can be reframed as the emergence of hierarchy. But this raises a question. After hundreds of thousands of years of living in (relatively) egalitarian societies, why would humans suddenly choose to organize in despotic hierarchies?
Scientists have long puzzled over this question. Was there an advantage to hierarchy, as functionalist theory contends [118,119]? Or was it a matter of coercion, as conflict theory contends [102,[120][121][122]? Or did the emergence of hierarchy involve both function and coercion [101,[123][124][125]? I think the latter is most likely. Without a functional advantage, it is hard to understand why hierarchy would emerge. But without coercion, it is hard to understand the great inequalities that exist within hierarchies.
Let's begin with the advantages of hierarchy. The modern evidence indicates that hierarchy increases with energy use. One interpretation is that hierarchy somehow enables, or is necessary for, greater energy use (for a different interpretation, see [91]). If this is true, then we need to ask two questions. First, why is using more energy advantageous? Second, why is hierarchy required to use more energy?
Regarding the first question, if life is the struggle for energy [35,126], then using more energy may give a competitive advantage to an organism (or group of organisms). This is the idea behind the maximum power principle, which attempts to give an energetic basis to Darwinian fitness [127][128][129]. It proposes that organisms (and ecosystems) evolve to maximize power-the flow of energy per unit of time. While it has some empirical support [130,131], the maximum power principle remains controversial.
Still, there are clear instances where using more energy is advantageous to human groups. The most conspicuous is warfare. The evolution of military armament moves towards increasingly devastating weaponry (bows and arrows, guns, missiles, and nuclear warheads). This reduces to energetics: the destructive capability of a weapon is proportional to the amount of energy it releases. We need only look at the history of European conquests to see how better armament led to a group advantage [132]. Greater energy use may also allow reproductive benefits. For instance, in existing traditional societies, agrarian societies tend to have higher fertility than hunter-gatherers and horticulturists [133]. To summarize, using more energy may be advantageous in inter-group competition. We can think of this as a form of 'group selection' [134,135]. The idea is that groups that use more energy outcompete groups that use less energy.
But why is greater energy use associated with greater hierarchy? One possibility is that using more energy requires greater social coordination, and hierarchy is the most potent way to achieve this. Here is my reasoning. Increasing energy use involves profound technological changes. Most notably, the scale and complexity of technology increases [48]. I suggest that this increasing complexity requires more social coordination. This is where hierarchy comes in. While humans can organize without hierarchy, the scale appears limited. The problem is that human sociability likely has biological limits [136]. Individuals generally cannot maintain more than a few hundred social relations. Hierarchy sidesteps these limits [55]. A member of a hierarchy needs to interact only with his direct superior and direct subordinates. This allows group size to grow without the need for more social interactions.
If hierarchy confers energetic benefits (via coordination), we can imagine a feedback loop emerging: Hierarchical organization enables large-scale coordination that then enables greater energy use, that then enables more hierarchy (and so on). This explains why energy and hierarchy go together. But it leads to a problem. For the vast majority of human history, hierarchical organization was negligible. Clearly there was no energy-hierarchy feedback loop. What are we missing?
The missing ingredient is resource distribution within the hierarchy. The problem is that hierarchy is a double-edged sword. It allows greater coordination, but it also leads to despotism. The nested chain of command gives enormous power to top-ranked individuals. When this power is (predictably) used for personal gain, it leads to vast inequalities. This would explain why income scales with hierarchical power. The resulting inequality means that hierarchy may not benefit low-ranking individuals. If the material gains from coordination are monopolized by elites, low-ranking individuals may be better off leaving the hierarchy. The stability of a hierarchy thus depends on the net advantage for low-ranking individuals [125]. If there is no advantage, the hierarchy will be unstable.
For the majority of human history, the costs of hierarchical despotism likely outweighed any coordination benefits from hierarchy. We know that modern hunter-gatherers (and presumably ancient ones as well) aggressively suppress individuals with power-seeking tendencies [137,138]. Without a concentrated energy source (such as agriculture) the benefits to largescale coordination were likely marginal. Therefore, hierarchy was not tolerated because it conferred no advantage.
This likely changed during the Neolithic revolution. The details remain poorly understood, but we can guess that the benefits of large-scale coordination increased. This is likely related to sedentism and the development of agriculture [139,140]. Irrigation likely also played an important role [141,142]. I argue that during the Neolithic revolution, the energy-hierarchy feedback loop took hold. As a result, hierarchical power became more concentrated. Elites predictably used their power for personal gain, resulting in the emergence of inequality.
I have so far treated inequality as an effect of hierarchy. But it may actually play a role in the growth of hierarchy. I have argued that the growth of hierarchy depends on the net advantage to low-ranking individuals. One way to increase this advantage is to increase the returns to hierarchical coordination (through environmental or technological change). But another way to increase the net advantage is to decrease hierarchical despotism. If the gains of hierarchy are more equally distributed, the net benefit to low-ranking members is greater.
This reasoning means that inequality may play a causal role in the growth of hierarchy and the growth of energy use. This is speculation, but it fits with the inference that hierarchical despotism declines with energy use (Fig 9B). Perhaps limiting hierarchical despotism is a prerequisite for industrialization? Or put another way, is it possible to have an industrial economy built on slavery-the most despotic mode of human organization? These are open questions worth investigating.
To summarize, I think that understanding the energy-hierarchy-inequality relation requires merging both functional and conflict theories of social stratification. It requires understanding what Wilson calls the "fundamental problem of social life" [134]. The idea is that cooperative groups beat uncooperative groups. But selfish individuals beat unselfish individuals within groups. Hierarchy nicely highlights both aspects of this problem. It is a powerful tool for coordination, and thus has potential group benefits. But it is also predictably used for selfish gain, thus resulting in great inequality. Thinking in this way may provide an important tool for understanding the origin of inequality.

Conclusions
Origin questions are some of the most seductive in science. At the same time, they are among the most difficult questions to answer. The problem is that origins are always locked in the past, meaning evidence is frustratingly sparse. Scientific progress on origin questions happens when we find reliable windows into the past.
It is instructive to see how new windows of evidence have advanced other fields. In modern cosmology, the breakthrough came when Edwin Hubble discovered that galaxies are receding from us. Reversing this trend implied that the universe had once been smaller-perhaps infinitely so. And so the big bang theory was born [143]. In biology, the breakthrough came with the discovery of DNA. By comparing the DNA of different organisms, we can infer the history of evolution. It suggests that all life has a single origin [31].
What about the origin of inequality? Obviously we should continue to gather historical and archaeological evidence. But this evidence will always remain limited. We should also continue studying traditional societies. But these societies are rapidly disappearing from the world. That leaves modern societies as a source of evidence.
I have proposed that the institutional structure of modern societies contains a coded history of the origin of inequality. To test this idea, I used a model to project into the past the modern relation between energy use, hierarchy, and inequality. The model predictions are generally consistent with the evidence. This suggests we may have found a new window into the origin of inequality. Because the GEM data over-represents large firms, I use only firms with 1000 or fewer employees. For method details, see the Appendix in Ref. [48]. Uncertainty in average firm size is estimated using the bootstrap method. Firm size distribution power-law exponents are estimated using the R PoweRlaw package [144]. Energy data comes from the World Bank, series EG. USE.PCAP.KG.OE.

Sources for Figs 2 and 4.
Firm case-study data comes from [57][58][59][60][61][62]. For a description of this data, see the Appendix in Ref. [145]. Hierarchical power is defined as P = 1 + S, where P is hierarchical power and S is the number of subordinates. Because the case studies provide data for aggregate hierarchical structure only (not the chain of command), I calculate average hierarchical power, Here S h is the average number of subordinates below level h. It is defined as the sum of employment (E) in all subordinate levels, divided by employment in the level in question: Income is normalized relative to the average income in the base hierarchical level (in the year in question).
Sources for Fig 7. I assume that human metabolic needs range from 2000 Kcal to 2500 Kcal per day. Western and Eastern Eurasia energy use data comes from Morris [86]. US total energy consumption is from Historical Statistics of the United States, Tables Db164-171 (1900Db164-171 ( -1948 and Energy Information Agency Table 1. 3 (1949-2000). US population is from Maddison [146]. Qatar data comes from the World Bank (series EG.USE. PCAP.KG.OE).
Sources for Fig 8. Fig 8A. Archaeological inequality data is from Kohler et al.
[18] and is measured using house size. I estimate the energy use range for each adaptation using the data in Table 1. Results for this energy range are shown in Fig 10. Fig 8B. Pre-industrial inequality data is from Milanovic  Fig 9B. The hierarchical despotism index β is estimated by matching empirical data to the best fit model iteration. β is chosen by minimizing the following error function: E r and G r are energy use per capita and the Gini index of inequality (respectively) in the realworld society. E m,i and G m,i are energy use per capita and the Gini index of inequality (respectively) in the model iteration i. I assign real-world societies the model parameter β i associated with the best-fit model iteration i.

Hierarchy model equations
This section provides technical details for the algorithm used to generate institutional hierarchies. Notation is shown in Table 2. Energy, hierarchy and the origin of inequality

Generating the employment hierarchy.
To generate the hierarchical structure of an institution, we begin by defining the span of control (s) as the ratio of employment (E) between two consecutive hierarchical levels (h), where h = 1 is the bottom hierarchical level. It simplifies later calculations if we define the span of control in level 1 as s = 1. This leads to the following piecewise function: The model assumes that the span of control is not constant; rather it increases exponentially with hierarchical level. I model the span of control as a function of hierarchical level (s h ) with  Table 1. https://doi.org/10.1371/journal.pone.0215692.g010 As one moves up the hierarchy, employment in each consecutive level (E h ) decreases by 1/s h . This yields Eq 7, a recursive method for calculating E h . Since we want employment to be whole numbers, we round down to the nearest integer (notated by #). By repeatedly substituting Eq 7 into itself, we can obtain a non-recursive formula (Eq 8). In product notation, Eq 8 can be written as Eq 9.
Total employment in the whole institution (E T ) is the sum of employment in all hierarchical levels. Defining n as the total number of hierarchical levels, we get Eq 10, which in summation notation, becomes Eq 11.
In practice, n is not known beforehand, so we define it using Eq 9. We progressively increase h until we reach a level of zero employment. The highest level n will be the hierarchical level directly below the first hierarchical level with zero employment: To summarize, the hierarchical employment structure of our model institution is determined by 3 free parameters: the span of control parameters a and b, and base-level employment E 1 . Code for this hierarchy generation algorithm can be found in the C++ header files hierarchy.h and exponents.h, located in the Supplementary Material [158]. 7.2.2 Calculating hierarchical power in the hierarchy model. I define an individual's hierarchical power as one plus the number of subordinates (S) under their control: Because the hierarchy model simulates only the aggregate structure of institutions (employment by hierarchical level), hierarchical power is calculated as an average per rank. For hierarchical rank h, the average hierarchical power ( � P h ) is defined as the average number of subordinates ( � S h ) plus 1: Energy, hierarchy and the origin of inequality Each individual with rank h is assigned the average power � P h . The average number of subordinates � S h is equal to the sum of employment (E) in all subordinate levels, divided by employment in the level in question: As an example, consider the hierarchy in Fig 11. The average number of subordinates below each individual in hierarchical level 3 (red) would be: Therefore, these individuals would all be assigned a hierarchical power of 7.

Restricting model parameters
The model's parameters are summarized in Table 3. My method for restricting these parameters is detailed below.   [49] and other G7 countries [51] can be modeled accurately with a power law. A power law has the simple form shown in Eq 17, where the probability of observation x is inversely proportional to x raised to the exponent α: The hierarchy model assumes that all human societies have power-law institution size distributions. The model simulates different societies by allowing the power-law exponent α to vary stochastically between different model iterations.
A characteristic property of power-law distributions is that as α approaches 2, the mean becomes undefined. In the present context, this means that the model can produce institution sizes that are extremely large-far beyond anything that exists in the real world. To deal with this difficulty, I truncate the power-law distribution at a maximum institution size of 2.3 million. This is the present size of Walmart, the largest firm that has ever existed.
Code for the discrete power-law random number generator can be found in the C++ header file rpld.h, located in the Supplementary Material [158]. This code is an adaptation of Collin Gillespie's [144] discrete power-law generator found in the R poweRlaw package (which is, in turn, an adaptation of the algorithm outline by Clauset [159]).

Span of Control Parameters.
The parameters a and b together determine the shape of the model's institutional hierarchies. These parameters are estimated from an exponential regression on firm case-study data (Fig 12A). The model assumes that these parameters are constant across all institutions. The resulting modeled hierarchy shape is shown in Fig 12B. Because the case-study sample size is small, there is considerable uncertainty in the span of control parameters. I incorporate this uncertainty into the model using the bootstrap method [160], which involves repeatedly resampling the case-study data (with replacement) and then estimating the parameters a and b from this resample. I run the model many times, each time with a and b determined by a bootstrap resample of case-study data. The resulting variation in the shape of the model's hierarchies is indicated by the error bars in Fig 12B. Code implementing this bootstrap can be found in the C++ header file boot_span.h located in the Supplementary Material [158].

Base-level employment.
Given span of control parameters a and b, each hierarchy is constructed from the bottom hierarchical level up. Thus, we must know base level employment. To get this value, I input a range of different base employment values into Eqs 6, 9, and 11 and calculate total employment for each value. The result is a discrete mapping relating base-level employment to total employment. I then use the C++ Armadillo interpolation function to linearly interpolate between these discrete values. This allows us to predict base level E 1 , given total employment E T . Code implementing this method can be found in the C++ header file base_fit.h, located in the Supplementary Material [158].

Power-income exponent.
The model assumes that income scales with hierarchical power as where I h is income in hierarchical level h, I 1 is income in the base hierarchical level, P is hierarchical power, and � is the stochastic noise factor. To simulate variation between societies, I allow β to vary over different model iterations. I use two different data sources to determine a plausible range for this variation. The first is case-study data from modern firms [57][58][59][60][61][62]. I determine β from regressions on the data shown in Fig 4. For each case-study firm, I regress log(I h /I 1 ) onto logP h . The slope of the relation is the estimate for β. I estimate the uncertainty in β using the bootstrap method [160]. I repeatedly resample case-study data and re-run the regression to estimate β. The resulting probability distribution of β is shown in Fig 13A for each case-study firm.  [57][58][59][60][61][62]. The span of control is the subordinate-to-superior ratio between adjacent hierarchical levels. The x-axis corresponds to the upper hierarchical level in each corresponding ratio. Case-study firms are indicated by color. Horizontal 'jitter' has been introduced to better visualize the data. The line indicates an exponential regression, with the grey region indicating the regression 95% confidence interval. Panel B shows the idealized firm hierarchy that is implied by the regression in Panel A. Error bars show the uncertainty in the hierarchical shape, calculated using a bootstrap resample of case-study data. https://doi.org/10.1371/journal.pone.0215692.g012 Energy, hierarchy and the origin of inequality The second data source is a case study of a US slave estate-Cannon's Point Plantation [161]. I estimate β from the living standard of the plantation owner relative to his slaves. For this estimate, we solve the power-income relation for β: Although we do not know the hierarchical structure of the slave estate, we know that the owner sits on top of the hierarchy. All of the slaves are his subordinates. Therefore the number of slaves (n slave ) gives us a rough estimate for the owner's hierarchical power: If we know the living standard of the owner (I owner ) and slaves (I slave ), we can combine Eqs 19 and 20 to get a rough estimate for β: The living standard of the owner is equal to his income. But slaves have no income, so we must use another method to estimate their living standards. One way is to use the slave expenses paid by the owner. Another method is to compare the owner and slaves in terms of house size. The results for both methods are shown in Fig 13B. Again, I use the bootstrap technique to investigate the plausible range of β that is implied by the Cannon's Point data. I sample different values for the owner's income, the slaves' income (living standard), and the number of slaves and put them repeatedly into Eq 21.
As we would expect, the resulting β for our slave estate is far higher than in our case-study firms. In a slave regime, the evidence suggests that β could approach 1. To put this in perspective, this means income scales linearly with hierarchical power. If this were the case in industrial societies, the CEO of Walmart would earn 2 million times that of an entry-level worker. Nothing like this exists in industrial societies-for good reason. They are not based on slavery. But slavery was ubiquitous in human history, so we need to allow for its existence in our model.
Based on the case-study data in Fig 13, I allow β to vary over the range 0.2 � β � 1. 7.3.5 Power-income noise factor. Noise (�) in the power-income relation is modeled with a lognormal random variate with dispersion determined by the parameter σ: The noise factor reproduces the average within-hierarchical level income dispersion in casestudy firms [57][58][59][60][61][62]. The distribution of within-hierarchical level income dispersion is shown in Fig 14. To determine σ, we first calculate the mean Gini index ( � G) of the case-study data shown in Fig 14. We then calculate σ using: This equation is derived from the definition of the Gini index of a lognormal distribution: G = erf(σ/2). To incorporate uncertainty in the case-study data, each model iteration uses a different bootstrap resample to calculate � G. Code implementing this method can be found in the C++ header file boot_sigma.h, located in the Supplementary Material [158]. 7.3.6 Estimating energy use from average institution size. The energy-hierarchyinequality model assumes that energy use E pc is proportional to average institution size I: The parameters c 1 and c 2 are determined from regressions on the international firm data shown in Fig 1A.   Fig 13. Probability distribution of β in case-study institutions. This figure shows the probability distribution of the parameter β in different case-study institutions. This parameter indicates the scaling behavior between income and hierarchical power: income / (hierarchical power) β . Probabilities are determined using the bootstrap method. Panel A shows the β probability distribution for casestudy firms [57][58][59][60][61][62]. Panel B shows the β probability distribution for a US slave estate (Cannon's Point Plantation [161]). I show results for measuring inequality in terms of both house size and income. Steps' indicate the bins in the original data. The red line shows the best-fit power-law distribution, which has an exponent α = 2.7. The shaded region indicates the range of uncertainty for a sample of 1 million. Slave-estate size roughly follows a power-law distribution. Data is from [163], as reported in [164]. The best-fit power law is determined using the methods in [165]. Energy, hierarchy and the origin of inequality