1 Introduction

Why was Japan the first non-Western nation to industrialize? Why did China take longer to modernize? At first glance, China’s later industrialization appears puzzling. Unified growth theory suggests that the transition from stagnation to growth is driven by the positive interaction between population expansion and technological progress and its impact on the demand for human capital and the onset of the demographic transition (Galor 2005, 2011). All else equal, China, one of the most technologically advanced and certainly the most populous nation in the world throughout most of history, should be an early industrializer. While differences in geographical endowment, institutions, culture, and diversity could help explain the Great Divergence between China and Europe (Jones 1981; Landes 1998; Pomeranz 2000; Ashraf and Galor 2013), these differences seem unsatisfactory in explaining the economic divergence between China and Japan.

Traditional accounts typically attribute Japan’s earlier industrialization to the Meiji Restoration. According to this view, Qing China (1644–1911) and Tokugawa Japan (1600–1868) were both governed by despotic rulers who were uninterested in promoting economic growth.Footnote 1 Their paths diverged only after 1868, when the Tokugawa regime was overthrown and the new Meiji government introduced drastic reforms that transformed Japan. As Beasley (1972) put it,

During the middle decades of the nineteenth century China and Japan both faced pressure from an intrusive, expanding West. ... Emotionally and intellectually, Chinese and Japanese reacted to the threat in similar ways. ... Yet they differed greatly in the kind of actions that this response induced. ... The Meiji Restoration is at the heart of this contrast, since it was the process by which Japan acquired a leadership committed to reform and able to enforce it. For Japan, therefore, the Restoration has something of the significance that the English Revolution has for England or the French Revolution for France; it is the point from which modern history can be said to begin.

Recent reassessments have put the Chinese and Japanese economies on the eve of the modern age in better standing. It has been shown that, like Western Europe, China and Japan experienced widespread commercialization and proto-industrialization during the early modern period (Pomeranz 2000). However, the revisionist view, too, tends to play down the differences between pre-1850 China and Japan, and focus instead on their similarities (Pomeranz 2000; He 2013).

Indeed, early modern China and Japan had much in common. Both depended heavily on small-scale, labor-intensive, and rice-based agriculture. Both were ruled by stable and sophisticated governments long before the arrival of the West. Furthermore, they shared a common cultural, institutional, and technological heritage. As a result of active cultural borrowing from China, Tokugawa Japan was also deeply influenced by Confucianism. Chinese administrative codes played an important role in shaping the way that the Tokugawa shogunate was run (Jansen 1992). Existing evidence suggests that living standards in China and Japan were comparable during this period (Maddison 2001; Baten et al. 2010; Allen et al. 2011; Broadberry 2013).Footnote 2

We point to an important empirical observation that fits neither traditional nor revisionist perspectives, however. As Figure 1 illustrates, from 1650 to 1850, tax revenue per capita was significantly higher in Tokugawa Japan than in Qing China, and the gap widened over time.Footnote 3 In our estimates, the Chinese state’s annual revenue on the eve of the Opium War (1839–1842) was equivalent to 2 % of its national income at the maximum, while the comparable number for the Tokugawa shogunate was more than 15 %.Footnote 4

Fig. 1

Per Capita Tax Revenue in China and Japan. Sources: Shogunate’s land tax from Ohno (1996); Japan’s population estimates from Hayami and Miyamoto (1988); China’s tax revenues from Sng (2014); China’s population estimates from Perkins (1969)

What were the reasons for these diverging revenue trends? The existing literature offers two hypotheses for China’s low tax revenue in general: the absence of warfare and the ideology of benevolence. Economic historians have shown that warfare was a major driver for European states to expand fiscal capacity (Hoffman and Rosenthal 1997; O’Brien 2005; Dincecco et al. 2011; Gennaioli and Voth 2011). In this view, the absence of interstate competition in China and the resulting low fiscal demand were the primary reasons for low taxation in China (Rosenthal and Wong 2011). Alternatively, China historians have argued that low taxation was mainly a reflection of the Confucian ideology of “benevolent rule” (Elliott 2009; Rowe 2009). However, neither of these hypotheses can fully explain the diverging trends in China and Japan because Tokugawa Japan, too, experienced no interstate competition and shared the Confucian ideology of benevolence.Footnote 5 If anything, the European experience suggests that China, not Japan, should have developed a higher fiscal capacity because it was exposed to military threats from Inner Asia.

In this paper we focus on geography as a primary factor. China was a sprawling land empire with vast inner frontiers, while Japan was a small island nation. We propose that the difference in their geographical size and heterogeneity led to a much more acute problem of political control in the former than in the latter.Footnote 6 In pursuing our research, we follow the methodology of comparative and historical institutional analysis proposed by Greif (1998). That is, we first develop a context-specific model based on historical details to theoretically examine the nature of the problems that the rulers in China and Japan faced and then check its implications using comparative historical evidence.

Between 1650 and 1850, both nations were ruled by stable dictatorships. Following Olson (1993), we model stable dictators as “stationary bandits” who understand that excessive exaction in the short run would be counterproductive in the long run.Footnote 7 However, the ruler’s encompassing interest is by itself insufficient to guarantee good governance. Because dictators cannot rule alone and have to rely on agents to govern, a principal–agent problem is inherent in dictatorships.Footnote 8 Unless the interests of the ruler and the agents are well-aligned, in the absence of perfect monitoring, the agents tend to pursue their self-interest at the ruler’s expense. For example, they may extort the taxpayers and thereby increase the likelihood of rebellion.

We hypothesize that in a stable dictatorship, agency problems increase with its geographical size and heterogeneity. Given premodern information technologies, it is costly for the ruler of a large domain to monitor the agents closely. This gives the agents strong incentives to extort the taxpayers. To prevent overexploitation that could foment rebellion, the ruler has to keep taxes low and government small. By contrast, in a smaller domain, lower monitoring costs allow the ruler to impose heavier taxes without risking popular resistance.

If the sole purpose of taxation is to support the consumption of the ruling class, whether it enriches the ruler or his agents will not matter to the taxpayers. However, unlike corruption, taxation is rarely a pure rent-seeking activity. The ruler, as the owner of his domain, may use the tax receipts to invest in public goods to keep his property productive. If so, the competition between the ruler and the agents over the economic surplus may have an impact on social welfare, especially in the long run.

To formalize our hypothesis, we build a dynamic principal-agent model and analyze optimal taxation and public goods provision in a stable dictatorship. The ruler taxes the peasants through agents and invests part of the tax revenue in a local public good that protects the economy from exogenous shocks (e.g., natural disasters). If the ruler under-invests in the public good, the risk of a large shock destroying the economy increases.

The static predictions of the model are straightforward: holding monitoring technology constant, as the geographical size of the ruler’s domain increases, bureaucratic expropriation worsens and per capita tax revenue falls due to managerial diseconomies of scale.

New insights come from the dynamic implications. While one may expect economic expansion to generate more tax revenues and higher public good investments, this is not always the case. The model predicts that economic expansion could actually hurt the ruler because it also exacerbates agency problems. When monitoring cost is sufficiently high, bureaucratic expropriation will outpace economic expansion. It is only when monitoring cost is low that economic change is likely to bring net benefits to the ruler as well as the population.

Our model provides a potential explanation for the tax revenue dynamics in China and Japan documented in Fig. 1. To further check its implications, we examine the provision of local public goods (coinage, transportation network, urban management, forest protection, famine relief) in the two regimes. In line with the model’s prediction, we find that, compared to the Chinese emperor, the Tokugawa shogun displayed a greater capability to provide these public goods over a longer period of time.

We take the size of domains in China and Japan as exogenous in our analysis. Given the high agency costs, one may ask if China’s vast size was ever optimal. In a broader framework, such as Alesina and Spolaore (1997), the ruler determines the size of his domain by balancing the accompanying costs and benefits, where agency costs are just one such factor. In the case of China, we conjecture that the benefits of political integration—peace and risk sharing among contiguous regions—outweighed high agency costs, thereby justifying its size. We do not model this, however, to keep the scope of our analysis manageable.Footnote 9

To our knowledge, this study is the first comparative analysis of state capacity in preindustrial Asia.Footnote 10 The European experience indicates that most states had a strong fiscal system in place before industrializing (Dincecco 2011; Johnson and Koyama 2014a, b). Indeed, there is a growing body of theoretical and empirical research highlighting the importance of state capacity in facilitating modern economic growth (Acemoglu 2005; Besley and Persson 2009, 2013; Dincecco and Prado 2012; Dincecco and Katz 2014). Studies also show that a proactive state could accelerate the transition from stagnation to growth by implementing policies that promote human capital formation (Doepke 2004; Doepke and Zilibotti 2005; Galor and Moav 2006; Galor et al. 2009).Footnote 11 In light of these works, our finding of an increasingly weak state in China in contrast to Japan might help explain the puzzle of China’s late industrialization.

This paper builds directly upon Sng (2014), who studies the impact of geographical size on the principal–agent problem in late imperial China. Our work significantly extends his model by incorporating public goods provision and offers new comparative empirical evidence by bringing Japan into the picture. Our approach is complementary to Brandt et al. (2014), who provide a comprehensive survey of the long-run evolution of the Chinese political economy since the tenth century.

Importantly, four recent contributions also explore the impact of geographical proximity and size on the quality of political and corporate governance. Stasavage (2011) finds that in preindustrial Europe, high communication and travel costs prevented representative assemblies in large polities from convening regularly and functioning effectively. Using contemporary data from 127 countries, Olsson and Hansson (2011) detect strong negative effects of territorial size on the rule of law. Giroud (2013) shows that a reduction of travel time (a proxy for monitoring costs) between company headquarters and plants has positive effects on plant-level productivity and profits. Campante and Do (2014) provide strong evidence that isolated capital cities in US states are associated with lower accountability, greater levels of corruption, and worse public goods provision. These studies show that distance and size are a challenge to good governance not only in premodern regimes in Asia and Europe, but also in modern states and corporations.

The rest of the paper is organized as follows: Section 2 provides the historical background. Section 3 presents the model and derives predictions. Section 4 provides comparative historical evidence. Section 5 concludes.

2 Historical background

In this section, we compare the geography, administrative structure, and system of tax collection in Qing China and Tokugawa Japan to motivate our theoretical model.

2.1 Geography

Tokugawa Japan was an archipelago comprising three main islands,Footnote 12 while China was a continental empire (Fig. 2). At its peak, China under the Qing dynasty (1644–1911) controlled a landmass larger than China or the United States today. Even if we disregard the thinly populated Inner Asian borderlands, the region known as China proper—the 18 provinces between the Great Wall and the South China Sea, which accounted for about 98 % of the empire’s population—was still 12 times the size of Tokugawa Japan.

Fig. 2

Early modern China and Japan. Source: CHGIS, Version 4, Cambridge: Harvard Yenching Institute, January 2007

If information transmission posed any challenge to effective public administration, this challenge was clearly more acute in China than in Japan. Sending a high-priority official document from Beijing to Shanghai, 1,000 kilometers to the south, would take up to 10 days (Xie 2002). By contrast, a similar trip between Japan’s two biggest cities, Edo (Tokyo) and Osaka, about 520 kilometers apart, would require only 4 days (Nakane and Oishi 1990). It is also worth noting that no one in Japan lived more than 120 kilometers from the sea, which offered a cheap mode of transportation in an age before railroads.

2.2 Administrative structure

Both China and Japan were ruled by a succession of stable dictators between 1650 and 1850. However, while China was ruled by one dictator—the emperor of the Qing dynasty—during this period, multiple dictatorships coexisted in Japan.

Nominally, Japan was led by the shogun of the Tokugawa house, who controlled 15 % of the arable land (Fig. 3). The bulk of the remaining land was divided into 260-odd mutually exclusive domains, each headed by a daimyo (local lord).Footnote 13 While a daimyo had to swear allegiance to the shogun and subject himself to a system of controls aimed to prevent dissent, he retained virtually complete autonomy over his domain.Footnote 14 As such, instead of treating Tokugawa Japan as a unified but decentralized empire, we interpret it as a league of dictatorships and treat each daimyo as a dictator.Footnote 15 We focus primarily on the shogunate, for which historical records are most abundant, and compare it with China proper.Footnote 16

Fig. 3

Tokugawa Japan in 1664. Source: China Historical GIS Project, “Tokugawa Japan GIS, Demo Version.” Feb 2004

The systems of territorial administration in China proper and the shogunate were broadly similar. To administer his domain, the Qing emperor structured his bureaucracy into four layers (center–province–prefecture–local). China proper was organized into 18 provinces; each province was then divided into several prefectures, and each prefecture into several counties. The responsibility of local administration fell on the county, which sat at the bottom of the bureaucratic hierarchy. Each county was headed by a magistrate, whose term was usually limited to 3 years (Ch’u 1962).

In the Tokugawa shogunate, local administration was also carried out by nonhereditary magistrates (daikan). Like his Chinese counterpart, the shogunate magistrate was subjected to rotation.Footnote 17 They also shared a similar scope of responsibilities. In both regimes, the magistrate was expected to focus on two tasks: collection of taxes and adjudication of disputes (Wang 1985; Totman 1967).

There were only two layers of government (center–local) in the shogunate. At any one time, 40–50 magistrates reported directly to the shogun’s cabinet (Totman 1967). By contrast, there were about 1,500 county-level jurisdictions and hence 1,500 magistrates in Qing China. A shogunate magistrate typically governed 50,000–100,000 people, while the population of an average Chinese county rose from 100,000 (in 1700) to 300,000 (in 1850).

2.3 Monitoring system

Because China proper was almost 90 times bigger than the shogunate domain, it had a greater number of administrative officials and a longer bureaucratic chain of command. This implies that unless the Chinese emperor possessed superior monitoring technologies, it would be more difficult for him than for the shogun to monitor local officials. There is little evidence to suggest that monitoring technologies were better in China, however. In fact, the two regimes instituted similar monitoring systems that combined top-down, parallel, and bottom-up monitoring.

In top-down monitoring, local officials were supervised by higher-ranking officials within the same bureaucratic hierarchy. In the shogunate, the magistrate’s office was periodically audited by the Finance Office in Edo (Totman 1967, p. 76). In China, the administration conducted a grand review once every three years to evaluate the magistrate’s performance and mete out reward or punishment accordingly (Watt 1977).

Top-down monitoring, however, could be ineffective in the presence of bureaucratic patronage networks. To prevent this, the Chinese emperor established an independent surveillance agency known as the Censorate to detect bureaucratic malpractices and report them to the emperor (Feuerwerker 1976). Likewise, the shogun sent out censors to keep an eye on the local administration (Totman 1967; Nakane and Oishi 1990).

Finally, to carry out bottom-up monitoring, both regimes adopted petition systems. The system had a long tradition in China, where it had been in place since the seventh century (Ocko 1988; Fang 2009). In Japan, it was not until 1720 that the shogun set up petition boxes in major cities and permitted the public to make suggestions for better governance or to report misconduct and abuse of power by shogunate officials. The petitions were sent directly to the shogun for his review. Over 75 % of large local domains instituted similar systems (Ohira 2003).

In both cases, the petition system was costly to implement, as it typically generated a large number of petitions including irrelevant requests and false accusations. In the Tokugawa shogunate, each petition was investigated and petitioners were punished for misstatements. The system functioned reasonably well and was maintained until the end of the Tokugawa period (Ohira 2003).Footnote 18

By contrast, the sheer size of the Chinese population made it extremely costly for the Qing rulers to verify the authenticity of every petition. Both the emperors Qianlong (r. 1736–1795) and Jiaqing (r. 1796–1820) initially encouraged petitions from their subjects but quickly reversed their policies after receiving a flood of complaints that they could not possibly deal with (Fang 2009). The system did not function as intended, and some complainants resorted to extreme measures, such as committing suicide outside the palace gates, to attract the emperor’s attention to their grievances. In other words, although China and Japan used similar systems of bottom-up monitoring to check corruption, the system was less effective in China because of its much greater size and population.

The rulers in China and Japan were concerned about the well-being and grievances of their subjects for both ideological and practical reasons. Because Confucianism demanded that rulers treat their subjects benevolently, it legitimized popular resistance against an oppressive ruler.Footnote 19 The resulting fear of violent rebellion served as a constraint on dictators in both China and Japan and gave them an incentive to prevent the overexploitation of their subjects.

2.4 The system of tax collection

Land taxation was the most important source of government revenue in both Qing China and Tokugawa Japan. Both economies depended heavily on small-scale, labor-intensive agriculture. Every land-holding household was obligated to pay the land tax, the amount of which was determined based on the size and quality of the land the family held (Ch’u 1962; Nakane and Oishi 1990). In the case of Japan, the fiscal base was measured in rice, the primary staple crop nationwide. Fields, forests, residential lands, mines, and fishing grounds were also assessed and taxed in terms of rice (Nishikawa 1985, pp. 23–24). If rice were not the main crop cultivated, then part of the tax would be levied in cash at a conversion rate set by the lord.

By contrast, regional diversity necessitated the denomination and collection of taxes in a variety of crops and metals in China. Although most taxes had been monetized by the seventeenth century, the peasants still had to pay part of their land taxes in kind, which, depending on the region, could be rice, wheat, millet, barley, sorghum, beans, or other staple crops. Furthermore, it was common for the portion of the land tax denominated in silver to be paid in copper coins when and where silver was scarce (Ch’u 1962). In such cases, commutation rates were set by magistrates based on local conditions. This high heterogeneity made it very difficult for the imperial court to monitor the over-collection of taxes by the county administration (Ch’u 1962; Zelin 1984).

In Qing China, the primary unit of taxation was the household, whereas in Tokugawa Japan, it was the village. Under the village contract system (murauke), the Japanese rulers levied the land tax on each village based on its total assessed yield. Village leaders were in charge of assigning and collecting taxes from individual households and transferring the sum to the magistrate. Moreover, households in the same village were made collectively responsible for the payment of taxes. This arrangement reduced the frequency of contact between the magistrate and individual peasants and, therefore, limited the opportunities for tax officials to abuse power. Indeed, the magistrate rarely showed up in the villages except for annual inspections, and villages retained a high degree of autonomy in running their affairs in Japan (Walthall 1991).

For this system to work, village communities had to remain tightly knit to facilitate mutual monitoring and discourage free riding. To restrict geographical mobility, the shogunate and local lords required every village to keep a household registry and required their subjects to obtain permission before changing residency or traveling.

We do not model the village contract system in Japan in the next section as doing so would further reduce the monitoring costs for the Japanese rulers and strengthen our main results. It should be noted, however, that the village contract system was not a uniquely Japanese system. In fact, China had instituted a similar system during the Ming dynasty (1368–1644). The system eventually unraveled, however, as the potential for migration given China’s vast inner frontiers made it difficult to maintain tightly knit communities that were necessary to implement collective responsibility.Footnote 20 By contrast, the village contract system was firmly institutionalized in Tokugawa Japan. Even though it was abolished by the Meiji government with the introduction of a new land tax system, tax collection was delegated to local communities that continued to use a collective responsibility system well into the 1930s (Sakane 2011b).

We also do not incorporate taxpayer heterogeneity in our model. In China, taxpayers could be broadly classified into two groups: the gentry and the peasants. Historical studies suggest that unlike ordinary peasants, the gentry were rarely subjected to bureaucratic extortion because of their political connections (Ch’u 1962; Watt 1977).Footnote 21 Some gentry took advantage of their sheltered position to act as tax farmers, earning extralegal income by paying taxes on the peasants’ behalf and charging for the service.

There was heterogeneity among taxpayers in Japan too, although not to the extent observed in China. Under the village contract system, wealthy peasants were typically appointed as village leaders. Some village leaders took advantage of their position and colluded with the magistrate to extort the villagers (Nishizawa 2004).

Based on these observations, we consider local elites (the gentry and the village leaders) as tax intermediaries instead of taxpayers and incorporate them as a constituent of the tax agent.Footnote 22

3 The model

Motivated by the historical observations, in this section we develop a formal model to study the impact of geographical size on a ruler’s capacity to collect taxes and provide public goods.

Consider a discrete-time, infinite-horizon game with three types of players: ruler, tax agents, and peasants. As a stable dictator with dynastic succession, the ruler is assumed to live infinitely long, while the agents and the peasants are assumed to be short-lived.

For analytical simplicity, we assume that the dictatorship consists of \(S\) homogeneous regions and that \(S\) is exogenously given to the ruler.Footnote 23 We let the number of regions \(S\) represent the geographical size of the dictatorship and take a region as the unit of analysis. In other words, when comparing large and small dictatorships, we assume that the two regimes differ only in the number of regions they encompass and that all regions in the two regimes are “identical.”

3.1 The basic setup

We first describe a basic, single-period game in a representative region consisting of a fixed number of jurisdictions.Footnote 24 Assume that the region is populated by \(N\) peasants who engage in agricultural production.Footnote 25 Let \(Y\) denote the agricultural output in the region and assume that it increases with labor inputs at a diminishing rate: \(Y=Y(N)\), where \(N>0\), \(Y(0)=0\), \(Y'(\cdot )>0\), and \(Y''(\cdot )<0\). In other words, the aggregate output increases with population, and hence population growth and economic growth are synonymous in our model.

In each region, the ruler sets a tax rate \(\tau \) and stations a fixed number of agents to collect taxes from the peasants, where one agent is assigned to every jurisdiction.Footnote 26 We interpret the agent as a figure that represents all tax intermediaries in his jurisdiction.Footnote 27 When collecting taxes, the agent may demand an extralegal surcharge of rate \(\theta \) from the peasants, in addition to the official tax rate \(\tau \), for his private benefit. As a result, the effective expropriation rate for the peasants is \(\tau +\theta \), creating a potential wedge between what the ruler receives and what the peasants pay.

When the agent announces \(\tau +\theta \), the peasants pay that share of their output to the agent as demanded.Footnote 28 If \(\tau +\theta \) does not exceed an exogenously given rate \(r\), then the peasants consider it acceptable and stay put. However, if it exceeds \(r\), then the peasants deem this “unjust” and revolt. We assume that only the ruler actively seeks to prevent rebellion for three reasons. First, because peasant rebellion destroys productive capacity and affects future agricultural outputs, it hurts the long-lived ruler much more than the short-lived agents. Second, within each jurisdiction there are coordination problems among the tax intermediaries that collectively constitute the agent. Even if revolts hurt them, it is individually rational for each of them to ignore the no-revolt condition in setting \(\theta \). Third, a rebellion could spill over to other jurisdictions. If agents in different jurisdictions cannot coordinate their actions in setting \(\theta \), then it will be individually rational for each agent to ignore the no-revolt condition. By contrast, as the sole dictator governing the entire domain, the ruler internalizes externalities across both time and space.

To discourage the agents from engaging in extralegal expropriation, the ruler employs the following monitoring mechanism. First, the ruler conducts audits in randomly selected regions after the agents finish tax collection. Let \(A(S)\) denote the probability of the representative region receiving audits where \(0\le A(S)\le 1\). Due to the ruler’s resource constraints, we assume that the probability of audits decreases with the number of regions in a dictatorship: \(A'(\cdot )<0\).Footnote 29 In other words, in the absence of modern information technologies, the ruler faces managerial diseconomies of scale.

Next, when an agent is indicted on misconduct charges in the auditing process, the ruler punishes the agent by imposing a fine \(X\). Audits, however, detect misconduct only imperfectly with probability \(D(\theta )\) where \(0\le D(\theta )\le 1\) and \(D(0)=0\). We assume that the detection probability increases with the rate of surcharge \(\theta \) at an increasing rate, but that the marginal rate of detection is concave in \(\theta \): \(D'(\cdot )>0\), \(D''(\cdot )>0\), and \(D'''(\cdot )\le 0\).Footnote 30 A simple example would be a quadratic function: \(D(\theta )=\theta ^2\).

To summarize, the timing of events in the basic, single-period game in the representative region is as follows: (1) The ruler sets a tax rate \(\tau \) to maximize tax revenue. (2) The representative agent selects \(\theta \) to maximize his expected payoff and proceeds to collect taxes. (3) The peasants pay \(\tau +\theta \) of their output to the agents and decide whether or not to revolt. (4) The ruler conducts randomized audits and punishes the agents if the audits uncover misconduct.

The representative agent. To provide benchmark results, we derive the equilibrium of the single-period game. First, consider the optimization problem of the representative agent. The agent chooses a rate of extralegal surcharge \(\theta \) to maximize his expected payoff, given the monitoring mechanism, \(A(\cdot ), D(\cdot )\), and \(X\):

$$\begin{aligned} \mathop {\max }_{0\le \theta \le 1}\text { }v^{A}=\theta \cdot Y(N) - A(S)\cdot D(\theta ) \cdot X \end{aligned}$$
(1)

The optimal rate of surcharge \(\theta ^{*}\) is given by the following condition:

$$\begin{aligned} Y(N)=A(S)\cdot D'(\theta ^{*})\cdot X \end{aligned}$$
(2)
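
As a worked illustration only, take the quadratic detection function \(D(\theta )=\theta ^{2}\) mentioned in the setup. The quadratic form and the requirement \(A(S)>0\), \(X>0\) are assumptions of this example, not general features of the model; note also that \(D'(0)=0\), so \(D'(\cdot )>0\) holds only for \(\theta >0\). Under these assumptions, condition (2) has a closed form:

$$\begin{aligned} v^{A}&=\theta \cdot Y(N)-A(S)\cdot \theta ^{2}\cdot X,\\ \frac{\partial v^{A}}{\partial \theta }&=Y(N)-2\,A(S)\cdot X\cdot \theta =0\quad \Longrightarrow \quad \theta ^{*}=\frac{Y(N)}{2\,A(S)\cdot X},\\ \frac{\partial ^{2}v^{A}}{\partial \theta ^{2}}&=-2\,A(S)\cdot X<0. \end{aligned}$$

The negative second derivative confirms that \(\theta ^{*}\) is a maximum, provided the solution is interior (\(\theta ^{*}\le 1\)). In this example, the surcharge rises with output \(Y(N)\) and falls with the audit probability \(A(S)\) and the fine \(X\).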

The ruler. The ruler chooses a tax rate to maximize tax revenue. We assume, however, that unlike the agents, the ruler is deeply concerned about peasant rebellion and is thus constrained by the no-revolt condition: \(\tau + \theta \le r\).

Formally, the ruler’s maximization problem can be written as:

$$\begin{aligned} \mathop {\max }_{0\le \tau \le 1}\text { }v^{R}&=\tau \cdot Y(N)\\ \text {s.t.}\quad \tau + \theta&\le r \end{aligned}$$
(3)

Anticipating the responses by the agents and the peasants, the ruler sets a tax rate given the optimality condition (2) and the no-revolt condition. It is simple to show that there is a unique equilibrium in the single-period game in which \(\tau ^{*}\) and \(\theta ^{*}\) are determined by \(Y(N)=A(S)\cdot D'(\theta ^{*})\cdot X\) and \(\tau ^{*}+ \theta ^{*}=r\).

Comparative statics. To examine the effects of the size of a dictatorship on the optimal tax and corruption rates, we perform comparative statics with respect to the number of regions \(S\). From the optimality condition \(Y(N)=A(S)\cdot D'(\theta ^{*})\cdot X\) and the assumptions \(A'(S)<0\) and \(D''(\theta )>0\), we obtain the following result:

Result 1

The equilibrium corruption rate \(\theta ^{*}\) is higher in a larger dictatorship: \(\frac{d\theta ^{*}}{dS}>0\).
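To make the step explicit, Result 1 can be obtained by implicitly differentiating condition (2) with respect to \(S\), holding \(N\) and \(X\) fixed. The sketch below also uses \(A(S)>0\) and \(D'(\theta ^{*})>0\), which we take as maintained assumptions:

$$\begin{aligned} 0&=A'(S)\cdot D'(\theta ^{*})\cdot X+A(S)\cdot D''(\theta ^{*})\cdot X\cdot \frac{d\theta ^{*}}{dS}\\ \Rightarrow \quad \frac{d\theta ^{*}}{dS}&=-\frac{A'(S)\cdot D'(\theta ^{*})}{A(S)\cdot D''(\theta ^{*})}>0 \end{aligned}$$

The sign follows because \(A'(S)<0\) makes the numerator negative, while \(D''(\theta )>0\) keeps the denominator positive.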

From \(\tau ^{*}+ \theta ^{*}=r\), it also follows that:

Result 2

The equilibrium tax rate \(\tau ^{*}\) is lower in a larger dictatorship: \(\frac{d\tau ^{*}}{dS}<0\).

In other words, assuming that production and monitoring technologies are identical across comparable regions in the two dictatorships, the model predicts lower official tax rates and higher extralegal expropriation rates in Qing China than in Tokugawa Japan. These results are driven solely by the assumption of managerial diseconomies of scale, \(A'(S)<0\).
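A minimal numerical sketch of Results 1 and 2 may help fix ideas. It relies on functional forms that are assumed here purely for illustration and are not part of the model: \(A(S)=a/S\) and \(D(\theta )=\theta ^{2}\). The names `Regime` and `equilibrium`, and all parameter values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Regime:
    """Hypothetical parameterisation; none of these values come from the paper."""
    S: float          # number of regions in the dictatorship
    X: float = 10.0   # size of the punishment (assumed)
    r: float = 0.5    # revolt threshold on tau + theta (assumed)
    a: float = 1.0    # scale of the audit probability (assumed)

    def audit_prob(self) -> float:
        # Assumed A(S) = a / S, so A'(S) < 0 (managerial diseconomies of scale)
        return self.a / self.S


def equilibrium(regime: Regime, Y: float) -> tuple[float, float]:
    """Return (tau*, theta*) for regional output Y.

    Assumes D(theta) = theta**2, so D'(theta) = 2 * theta and D'' = 2 > 0.
    Condition (2), Y = A(S) * D'(theta*) * X, then gives
    theta* = Y / (2 * A(S) * X); the no-revolt condition binds: tau* = r - theta*.
    """
    theta = Y / (2.0 * regime.audit_prob() * regime.X)
    if not 0.0 <= theta <= regime.r:
        raise ValueError("corner solution; this sketch covers interior cases only")
    return regime.r - theta, theta


small = equilibrium(Regime(S=2), Y=1.0)   # stand-in for a small dictatorship
large = equilibrium(Regime(S=6), Y=1.0)   # stand-in for a large dictatorship
print("small (tau*, theta*):", small)
print("large (tau*, theta*):", large)
```

Under these assumptions the larger regime ends up with the higher surcharge rate and the lower official tax rate, while the sum of the two stays at \(r\) in both cases.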

3.2 The dynamic setup

We now consider a dynamic game (\(t=1, 2, 3...\)) and introduce two additional features. First, to provide a link between tax revenue and the economy, we allow the ruler to spend part of the revenue on a local public good. Second, to study dynamic implications, we endogenize population and permit the economy to grow.

If the ruler spends all tax revenue on nonproductive pursuits, such as private consumption, then from an economic point of view, there is little difference between taxation and corruption. Suppose, however, that the ruler spends part of the tax revenue on public goods in each region. For simplicity, assume that a random shock (e.g., natural disaster) hits the representative region at the end of every period. Assume also that the ruler can invest in a public good in the beginning of every period to prepare for the possible disaster. We consider a local public good (as opposed to a pure public good) that is non-excludable but rivalrous within the region and has no spillover effects on other regions. Most infrastructure investments, such as roads, flood control, fire protection, and famine relief, satisfy these conditions and must be provided on a regional basis.

Let \(\gamma _{t}\) be the level of public good the ruler provides in period \(t\). Unless the level of public good investment is sufficiently large relative to the size of the shock, the shock destroys the region’s economy and terminates the game at the end of period \(t\). Let \(G(\gamma _{t})\) denote the probability that the region survives the shock and the game continues into period \(t+1\) given the investment \(\gamma _{t}\). We assume that the continuation probability increases with \(\gamma _{t}\) but at a diminishing rate: \(G(0)=0\), \(G'(\cdot )>0\), and \(G''(\cdot )<0\).Footnote 31 In other words, the ruler now has an incentive to invest in the public good to protect the regional economy from the random shock to secure future tax revenue.Footnote 32

Next, we model consumption and reproduction decisions of a representative peasant. Assume that the peasant lives for just one period, earns income from agricultural production, and spends his after-tax income on consumption and reproduction to maximize his utility.Footnote 33 Let \(u(c_{t},n_{t+1})\) represent the utility the peasant receives from the consumption \(c_{t}\) and the number of offspring \(n_{t+1}\) produced in period \(t\). Collectively, \(n_{t+1}\) gives total population in the next period \(N_{t+1}\), namely, \(N_{t+1}=N_{t}\cdot n_{t+1} \). Following Hansen and Prescott (2002), we assume that the two goods are complements and are subject to diminishing marginal utility: \(u_{1}(.)>0\), \(u_{2}(.)>0\), \(u_{11}(.)<0\), \(u_{22}(.)<0\), \(u_{12}(.)>0\).

The timing of events in the dynamic game in period \(t\) (\(t=1, 2, 3...\)) is as follows: (1) The ruler sets a tax rate \(\tau _{t}\) and public good investment \(\gamma _{t}\). (2) The representative agent selects a rate of extralegal expropriation \(\theta _{t}\). (3) The representative peasant pays \(\tau _{t}+\theta _{t}\) of his income to the agent, makes consumption and reproductive decisions (\(c_{t}, n_{t+1}\)), and revolts if \(\tau _{t}+\theta _{t}>r\). (4) The ruler conducts randomized audits and fines the agents if misconduct is detected. (5) An exogenous shock hits the region and destroys the economy unless \(\gamma _{t}\) is sufficiently large; the game continues to the next period with probability \(G(\gamma _{t})\).

The representative peasant We derive an equilibrium of the dynamic game by backward induction.

First, the optimization problem of the representative peasant in period \(t\) is given by:

$$\begin{aligned}&\underset{c_{t}, n_{t+1}>0}{\max }u_{t}=u(c_{t},n_{t+1})\end{aligned}$$
(4)
$$\begin{aligned} s.t.&c_{t}+n_{t+1}\le [1-\tau _{t}-\theta _{t}]\cdot y_{t} \end{aligned}$$
(5)

where individual income is defined by \(y_{t}=\frac{Y(N_{t})}{N_{t}}\). Note that \(y_{t}\) is exogenous to the peasant even though \(N_{t}=N_{t-1}\cdot n_{t}\) because \(n_{t}\) is a decision variable of the previous generation. From the first order condition and the assumption \(u_{12}(.)>0\), it can be shown that the optimal number of offspring \(n_{t+1}^{*}\) is an increasing function of net individual income \((1-\tau _{t}-\theta _{t}) \cdot y_{t}\).
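For illustration, suppose (purely as an example; the model does not impose this form) that \(u(c_{t},n_{t+1})=c_{t}^{\beta }\cdot n_{t+1}^{1-\beta }\) with \(0<\beta <1\), where \(\beta \) is a hypothetical preference parameter. This function satisfies all the sign restrictions above, including \(u_{12}(.)>0\). With the budget constraint (5) binding, the first order conditions give:

$$\begin{aligned} c^{*}_{t}=\beta \cdot (1-\tau _{t}-\theta _{t})\cdot y_{t},\qquad n^{*}_{t+1}=(1-\beta )\cdot (1-\tau _{t}-\theta _{t})\cdot y_{t} \end{aligned}$$

In this special case the optimal number of offspring is linear, and hence strictly increasing, in net individual income.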

The representative agent The representative agent is assumed to be short-lived. As a result, his maximization problem is essentially the same as in the single-period game, and thus the optimal rate of extralegal expropriation in period \(t\) is given by:

$$\begin{aligned} Y(N_{t})=A(S)\cdot D'(\theta ^{*}_{t})\cdot X \end{aligned}$$
(6)

The ruler The ruler is assumed to live for infinitely many periods. He sets the current and future values of \((\tau , \gamma )\) to maximize the expected discounted value of the tax revenue stream. We again assume that the ruler is bound by the no-revolt condition in every period. Let \(V^{R}_{t}\) represent the ruler’s present value of the future revenue stream in period \(t\). His maximization problem in period \(t\) is given by:

$$\begin{aligned} \mathop {\max }_{0\le \tau _{t+j} \le 1, \gamma _{t+j}\ge 0}V^{R}_{t}=&\tau _{t} \cdot Y(N_{t})-\gamma _{t}+G(\gamma _{t})\cdot V^{R}_{t+1}\\ s.t.&\tau _{t+j} + \theta _{t+j} \le r \quad \forall j=0, 1, 2 \ldots \nonumber \end{aligned}$$
(7)

The optimal level of public good investment \(\gamma _{t}\) is given by the following condition:

$$\begin{aligned} G'(\gamma ^{*}_{t})\cdot V^{R*}_{t+1}=1 \end{aligned}$$
(8)

In other words, the ruler invests in the public good up to the level where the marginal return from the investment equals its marginal cost. The higher the present value of his future revenue stream \(V^{R*}_{t+1}\), the more willing the ruler is to invest in the public good to increase the continuation probability.
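Condition (8) can be illustrated with a closed-form case. The sketch below assumes \(G(\gamma )=1-e^{-\gamma }\), a hypothetical form chosen only because it satisfies \(G(0)=0\), \(G'>0\), and \(G''<0\), and treats the continuation value \(V^{R*}_{t+1}\) as given. The function names are illustrative.

```python
import math


def survival_prob(gamma: float) -> float:
    """Assumed continuation probability G(gamma) = 1 - exp(-gamma).

    It satisfies G(0) = 0, G' > 0 and G'' < 0, but the form is hypothetical.
    """
    return 1.0 - math.exp(-gamma)


def optimal_investment(v_next: float) -> float:
    """Solve condition (8), G'(gamma*) * V_{t+1} = 1, under the assumed G.

    With G'(gamma) = exp(-gamma), the solution is gamma* = ln(V_{t+1}).
    If V_{t+1} <= 1, the marginal return at gamma = 0 does not exceed the
    marginal cost, so the ruler invests nothing (corner solution).
    """
    return math.log(v_next) if v_next > 1.0 else 0.0


for v_next in (0.5, 2.0, 8.0):
    gamma = optimal_investment(v_next)
    print(f"V_next={v_next}: gamma*={gamma:.3f}, G(gamma*)={survival_prob(gamma):.3f}")
```

Under this assumption the optimal investment rises with the continuation value, in line with the discussion above, and drops to zero when the future revenue stream is worth too little.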

The ruler sets an optimal tax rate, taking the agent’s optimality condition (6) and the peasant’s no-revolt condition as given. Because these conditions are the same as before, the equilibrium tax and corruption rates (\(\tau _{t}^{*}, \theta _{t}^{*}\)) in the dynamic game are again determined by \(Y(N_{t})=A(S)\cdot D'(\theta _{t}^{*})\cdot X\) and \(\tau _{t}^{*}+ \theta _{t}^{*}=r\) (\(t=1, 2, 3 \ldots \)).

Population dynamics We now turn to equilibrium population dynamics. Because the peasant’s net income is \((1-r)\cdot y_{t}\) in the equilibrium and \(r\) is a constant, the optimal number of offspring can be expressed as \(n^{*}_{t+1}=n^{*}_{t+1}(y_{t})\), where \(n^{*}_{t+1}(\cdot )\) is strictly increasing in \(y_{t}\). This, in turn, provides the population dynamics because by definition:

$$\begin{aligned} n^{*}_{t+1}(y_{t})=\frac{N^{*}_{t}\cdot n^{*}_{t+1}}{N^{*}_{t}}=\frac{N^{*}_{t+1}}{N^{*}_{t}} \end{aligned}$$
(9)

In the spirit of Malthus, condition (9) implies that the direction and rate of population growth depend on the peasant’s per capita income. Let \(\underline{y}\) denote the level of income defined by \(n^{*}_{t+1}(\underline{y})=\frac{N^{*}_{t+1}}{N^{*}_{t}}=1\). If \(y_{t}>\underline{y}\), then \(N_{t+1}>N_{t}\) or population will expand; if \(y_{t}<\underline{y}\) instead, then population will contract. Either way, in the long run the region’s population will converge to a stationary level \(N(\underline{y})\) associated with the steady-state per capita income \(\underline{y}\) (see Fig. 4).

Fig. 4 Converging to the steady state population level
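The convergence described above can be sketched numerically. The functional forms are assumptions made only for this illustration: output \(Y(N)=z\cdot N^{\alpha }\) with \(0<\alpha <1\), and offspring linear in net income, \(n^{*}(y)=k\cdot (1-r)\cdot y\). The parameters \(z\), \(\alpha \), and \(k\) and their values are hypothetical.

```python
def simulate_population(n0: float, periods: int, z: float = 100.0, alpha: float = 0.5,
                        r: float = 0.5, k: float = 0.5) -> list[float]:
    """Iterate N_{t+1} = N_t * n*(y_t) under assumed functional forms.

    Assumptions (hypothetical): Y(N) = z * N**alpha with 0 < alpha < 1, and
    n*(y) = k * (1 - r) * y, i.e. offspring are linear in net income (1 - r) * y.
    """
    path = [n0]
    for _ in range(periods):
        n = path[-1]
        y = z * n ** alpha / n              # per capita income y_t = Y(N_t) / N_t
        path.append(n * k * (1.0 - r) * y)  # N_{t+1} = N_t * n*_{t+1}(y_t)
    return path


def steady_state(z: float = 100.0, alpha: float = 0.5, r: float = 0.5, k: float = 0.5) -> float:
    """Population at which n*(y) = 1: N = (k * (1 - r) * z) ** (1 / (1 - alpha))."""
    return (k * (1.0 - r) * z) ** (1.0 / (1.0 - alpha))


from_below = simulate_population(n0=100.0, periods=30)
from_above = simulate_population(n0=2000.0, periods=30)
print(steady_state(), from_below[-1], from_above[-1])
```

Starting either below or above the stationary level, the simulated population approaches the same value, which is the pattern the phase diagram is meant to convey.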

3.2.1 Comparative statics

We compare the two dictatorships that differ only in the number of regions that they encompass. In particular, we assume the same initial populations in the representative regions in the two dictatorships. Recall that the optimal tax rate is higher in a smaller dictatorship in every period (Result 2). However, since population growth depends not on the official tax rate (\(\tau \)) alone, but on the effective expropriation rate (\(\tau +\theta \)), the two representative regions will be identical in population size in every period.

Result 3

A larger dictatorship invests less in the public good per region (and therefore per capita): \(\frac{d\gamma ^{*}_{t}}{dS}<0\) \(\forall \) \(t\).

Proof

Suppose that the result does not hold; at some \(t\), \(\gamma ^{*}_{t}(S_{large})\ge \gamma ^{*}_{t}(S_{small})\), where \(S_{large}>S_{small}\). Let \(\{\gamma ^{*}_{t+j}(S_{large})\}^{\infty }_{1}\) represent the sequence of public good provision that maximizes \(V^{R}_{t+1}(S_{large})\) at \(t\). Given Result 2, this sequence is financially feasible for the ruler of the smaller dictatorship to adopt. Let \(U\) represent the value of \(V^{R}_{t+1}(S_{small})\) when he implements this sequence. Let \(V^{R*}_{t+1}(S_{small})\) represent the maximum attainable value of \(V^{R}_{t+1}(S_{small})\). By definition, \(V^{R*}_{t+1}(S_{small})\ge U\).

Result 2 implies that \(U>V^{R*}_{t+1}(S_{large})\). Therefore, \(V^{R*}_{t+1}(S_{small})>V^{R*}_{t+1}(S_{large})\) must hold. The ruler’s optimality condition (8) and the assumption \(G''(\cdot )<0\) then imply that \(\gamma ^{*}_{t}(S_{large})< \gamma ^{*}_{t}(S_{small})\). This completes the proof by contradiction. \(\square \)

The intuition of Result 3 is straightforward. When the agency problem is more severe and hence the continuation payoff \(V^{R*}_{t+1}\) is lower, the ruler has less incentive to invest in the future of the region. For ease of exposition, we assume that the agency problem exists only in tax collection but not in public goods provision. Historically, however, the agency problem in the provision of public goods was a serious concern as shown in the next section. Relaxing this assumption will only strengthen the result.

Next, we explore dynamic implications. We focus on the case where the initial size of the region’s population is below the stationary level \(N(\underline{y})\). According to the equilibrium population dynamics, the population will grow until it reaches the steady state unless interrupted by external shocks. Because aggregate output \(Y(N)\) increases with the population, one may expect the ruler’s tax revenue to increase with the population too. The next result, however, establishes that the ruler’s revenue first rises and then falls as the population expands.

Result 4

For any given \(S\), there exists a unique threshold population \(\hat{N}(S)\) such that the ruler’s period tax revenue \(v^{R*}_{t}\) increases with \(N\) if \(N<\hat{N}(S)\), and decreases with \(N\) if \(N>\hat{N}(S)\). Moreover, the threshold population \(\hat{N}(S)\) is smaller in a larger dictatorship: \(\frac{d\hat{N}(S)}{dS}<0\).

Proof

From the agent’s optimality condition \(Y(N)=A(S)\cdot D'(\theta ^{*})\cdot X\) and the assumptions \(Y(\cdot )>0, Y^{\prime }(\cdot )>0, D^{\prime }(\cdot )>0\), and \(D^{\prime \prime }(\cdot )>0\), it follows that:

$$\begin{aligned} \frac{d\theta ^{*}}{dN}=\frac{Y^{\prime }(N)}{A(S)\cdot X\cdot D^{\prime \prime }(\theta ^{*})}=\frac{D^{\prime }(\theta ^{*})\cdot Y^{\prime }(N)}{D^{\prime \prime }(\theta ^{*})\cdot Y(N)} >0 \end{aligned}$$
(10)

which implies that the equilibrium corruption rate is strictly increasing in population.

Recall that the ruler’s period tax revenue is given by \(v^{R*}=\tau ^{*}\cdot Y(N)\). Note that \(\tau _{t}^{*}+\theta _{t}^{*}=r\) implies \(\frac{d\tau ^{*}}{dN}=-\frac{d\theta ^{*}}{dN}\). Then it follows that:

$$\begin{aligned} \frac{dv^{R*}}{dN}&=\tau ^{*} \cdot Y^{\prime }(N) + \frac{d\tau ^{*}}{dN}\cdot Y(N)\nonumber \\&=\left[ \tau ^{*} -\frac{D^{\prime }(\theta ^{*})}{D^{\prime \prime }(\theta ^{*})}\right] \cdot Y^{\prime }(N) \end{aligned}$$
(11)

From \(\frac{d\theta ^{*}}{dN}>0\), \(D''(\cdot )>0\) and \(D'''(\cdot )\le 0\), \(\frac{D^{\prime }(\theta ^{*})}{D^{\prime \prime }(\theta ^{*})}\) is strictly increasing in \(N\).Footnote 34 Because \(\frac{d\tau ^{*}}{dN}<0\), \(\tau ^{*}-\frac{D^{\prime }(\theta ^{*})}{D^{\prime \prime }(\theta ^{*})}\) is a strictly decreasing function of \(N\). Since \(Y^{\prime }(\cdot )>0\), the sign of \(\frac{dv^{R*}}{dN}\) is determined by the sign of \(\tau ^{*}-\frac{D^{\prime }(\theta ^{*})}{D^{\prime \prime }(\theta ^{*})}\). Let \(\hat{N}(S)\) be the population level at which \(\tau ^{*}-\frac{D^{\prime }(\theta ^{*})}{D^{\prime \prime }(\theta ^{*})}=0\). It is simple to verify that \(\frac{dv^{R*}}{dN}>0\) if population is below \(\hat{N}(S)\), and \(\frac{dv^{R*}}{dN}<0\) if population is above \(\hat{N}(S)\).

Finally, note that \(\tau ^{*}-\frac{D'(\theta ^{*})}{D''(\theta ^{*})} =\tau ^{*}-\frac{Y(\hat{N})}{D''(\theta ^{*})\cdot A(S)\cdot X}=0\). This and \(\frac{d\tau ^{*}}{dS}<0\) (Result 2) and the assumptions \(Y'(\cdot )>0\), \(D'''(\cdot )\le 0\), and \(A'(\cdot )<0\) together imply that \(\frac{d\hat{N}(S)}{dS}<0\). \(\square \)

The above analysis makes it clear that population growth and the resulting economic expansion have two opposing effects on the ruler’s tax revenue (see Eq. 11). On the one hand, expansion enlarges the tax base. On the other hand, it increases the rate of extralegal expropriation and reduces the fraction of the economic surplus that goes to the ruler. Result 4 shows that the latter effect begins to dominate the former once the population in the region crosses the threshold. What is more, it shows that the larger the dictatorship, the sooner the region reaches the tipping point where the negative effect of economic growth dominates the positive effect.
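A small numerical sketch of Result 4 follows. It assumes, for illustration only, \(Y(N)=z\cdot N^{\alpha }\), \(A(S)=a/S\), and \(D(\theta )=\theta ^{2}\) (so that \(D'''=0\)); none of these forms or parameter values come from the model itself. Under them, the threshold condition \(\tau ^{*}=D'(\theta ^{*})/D''(\theta ^{*})\) reduces to \(\theta ^{*}=r/2\).

```python
def revenue(n: float, s: float, z: float = 1.0, alpha: float = 0.5,
            r: float = 0.5, a: float = 1.0, x: float = 10.0) -> float:
    """Ruler's period revenue v = tau* * Y(N) under assumed functional forms.

    Assumptions (hypothetical): Y(N) = z * N**alpha, A(S) = a / S and
    D(theta) = theta**2, so condition (2) gives theta* = Y * S / (2 * a * x)
    and the binding no-revolt condition gives tau* = r - theta*.
    """
    y = z * n ** alpha
    theta = y * s / (2.0 * a * x)
    return (r - theta) * y


def threshold_population(s: float, z: float = 1.0, alpha: float = 0.5,
                         r: float = 0.5, a: float = 1.0, x: float = 10.0) -> float:
    """N_hat(S) solves tau* = D'(theta*) / D''(theta*); here that means theta* = r / 2."""
    return (r * a * x / (s * z)) ** (1.0 / alpha)


for s in (2.0, 6.0):
    n_hat = threshold_population(s)
    print(s, n_hat, revenue(0.5 * n_hat, s), revenue(n_hat, s), revenue(1.5 * n_hat, s))
```

In this parameterisation revenue is hump-shaped in population and peaks at \(\hat{N}(S)\), and the peak arrives at a smaller population when \(S\) is larger.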

More generally, the following result holds:

Result 5

For any given \(N\), economic expansion is less beneficial to the ruler in a larger dictatorship: \(\frac{d}{dS} \left( \frac{dv^{R*}}{dN}\right) <0\).

Proof

It follows from equation (11), \(\frac{d\theta ^{*}}{dS}>0\) (Result 1) and \(\frac{d\tau ^{*}}{dS}<0\) (Result 2), and the assumptions \(D''(\cdot )>0\) and \(D'''(\cdot )\le 0\). \(\square \)

According to Result 5, at every population level \(N\), the positive effect of economic growth on the ruler’s revenue is always larger and the negative effect always smaller in a smaller dictatorship. In other words, the ruler in a larger dictatorship gains consistently less from the economic expansion due to greater agency costs.
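As a worked special case, suppose \(D(\theta )=\theta ^{2}\), an illustrative assumption that satisfies \(D''(\cdot )>0\) and \(D'''(\cdot )\le 0\) but is not imposed by the model. Then \(\frac{D'(\theta ^{*})}{D''(\theta ^{*})}=\theta ^{*}\), and substituting \(\tau ^{*}=r-\theta ^{*}\) into equation (11) gives:

$$\begin{aligned} \frac{dv^{R*}}{dN}&=\left[ r-2\theta ^{*}\right] \cdot Y'(N),\qquad \theta ^{*}=\frac{Y(N)}{2\cdot A(S)\cdot X}\\ \frac{d}{dS}\left( \frac{dv^{R*}}{dN}\right)&=-2\cdot \frac{d\theta ^{*}}{dS}\cdot Y'(N)=\frac{A'(S)\cdot Y(N)\cdot Y'(N)}{A(S)^{2}\cdot X}<0 \end{aligned}$$

The sign follows from \(A'(S)<0\), \(Y(\cdot )>0\), and \(Y'(\cdot )>0\).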

Two dynamic outcomes For two dictatorships that differ significantly in size, the model predicts two distinct outcomes.

In the case of the small dictatorship, as its ruler is capable of capturing a significant portion of the economic surplus consistently (Results 2 and 5), he will invest relatively heavily in the public good (Result 3) to protect the economy from periodic external shocks. In the absence of extraordinarily large shocks to disrupt the process, the population in every region that he governs will expand until per capita income falls to \(\underline{y}\). At this point, the economy enters the steady state and will stay there unless a large exogenous shock knocks it out of that state (Fig. 5a).

Fig. 5 Two dynamic outcomes

The picture is different in the large dictatorship. In this case, the ruler’s revenue begins to fall early while the economy still expands. As fiscal conditions worsen, the ruler cuts his investment in the public good. His regime could even go bankrupt before the economy enters the steady state. Here, we observe a clear pattern of dynastic rise and fall. The establishment of the dynasty brings order and stability initially, which allows economic expansion to take place. However, in a paradoxical manner, the regime finds itself increasingly incapable of managing the prosperity that it has helped create (Fig. 5b).

Our results affirm the conjecture in Usher (1989) that a society under despotic rule could either evolve into a stationary state or into a dynastic cycle. We show in the next section that the Tokugawa patterns match the stationary state scenario. The Japanese population grew steadily between 1600 and the early 1700s, and stayed almost constant from then until 1850. The shogunate’s revenue followed a similar path. By contrast, China saw an almost uninterrupted population expansion from the 1680s right up to 1850. Yet the fiscal capacity of the Qing state began to contract in the first half of the 1700s in a manner that fits the second scenario.

4 Empirical evidence

Assuming that premodern China and Japan used similar production and monitoring technologies, our model predicts lower rates of corruption (Result 1), higher tax rates (Result 2), and higher levels of public goods provision per region (Result 3) in Tokugawa Japan than in Qing China. The model also predicts that, with economic expansion, the fiscal revenue in the shogunate would likely reach a steady state, while the revenue in China would hit its peak and begin to fall (Results 4 and 5).

In what follows, we assemble comparable quantitative data from primary and secondary sources to check these predictions. We first discuss the issue of corruption in China and Japan with respect to Result 1. We then provide further evidence that Results 2, 4, and 5 are consistent with the fiscal and population patterns observed historically. Finally, we evaluate Result 3 by comparing the provision of key local public goods.

4.1 Corruption

By its very nature, evidence of corruption is elusive. Nevertheless, consistent with our theoretical predictions, historical accounts by contemporary observers in Qing China suggest that bureaucratic corruption was pervasive during the eighteenth century and that it worsened over time.Footnote 35 By comparison, the problem was reasonably contained in Tokugawa Japan.

As noted earlier, regional diversity in crops and multiple commutation rates across the empire made it difficult for the Chinese emperor to monitor local officials. In collecting land tax, over-collection (fu-shou) by magistrates and their underlings was reported to be endemic (Ch’u 1962; Zelin 1984).Footnote 36

One popular form of over-collection was the manipulation of commutation rates, wherein magistrates required households to pay taxes in copper (instead of the officially stipulated silver or grain) and set the commutation rate above the market rate (Ch’u 1962, p. 142). Feng Guifen, a contemporary scholar, observed in the 1840s that commoners paid up to 16,000 wen of copper cash for every shi of rice that they owed in taxes, when the market price of rice was less than 3,000 wen per shi (Wang 1973, p. 38). In Guangdong, the commutation rate increased by 250 % (from 2 taels to 7 taels of silver per shi of grain) from the mid-eighteenth century to the mid-nineteenth century, even though the grain price rose only 40 % during the same period (Lin 1997).

Setting a high commutation rate was just one example of over-collection. In Shanxi, 12 surtaxes unauthorized by the central government were imposed between 1735 and 1795, followed by 7 additional items between 1796 and 1820 and 15 more between 1821 and 1850 (Wang 1973, p. 59).

According to Zhang (1962, p. 32), in the early nineteenth century a typical Chinese magistrate earned 30,000 silver taels a year through extralegal channels. By this estimate, the extralegal incomes of the 1,500 magistrates (45 million taels) would have exceeded the annual amount of tax silver that entered the state coffers (40 million taels in the early nineteenth century). Ni and Van (2006) estimate that corruption consumed more than 20 % of China’s agricultural output in 1850.
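The arithmetic behind these magnitudes is easy to reproduce. The short check below uses only figures quoted in this subsection; the variable names are illustrative.

```python
MAGISTRATES = 1_500                       # number of magistrates (from the text)
EXTRALEGAL_INCOME_TAELS = 30_000          # per magistrate per year (Zhang 1962 estimate)
STATE_SILVER_REVENUE_TAELS = 40_000_000   # annual tax silver reaching the state coffers

total_extralegal = MAGISTRATES * EXTRALEGAL_INCOME_TAELS
ratio = total_extralegal / STATE_SILVER_REVENUE_TAELS
print(f"{total_extralegal:,} taels, {ratio:.3f} times state silver revenue")
# → 45,000,000 taels, 1.125 times state silver revenue

# Guangdong commutation rate: from 2 to 7 taels of silver per shi of grain
increase_pct = (7 - 2) / 2 * 100
print(f"{increase_pct:.0f} % increase")
# → 250 % increase
```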

While bureaucratic corruption was a subject of intense discussion in official and scholarly discourse in Qing China, it attracted less attention in Tokugawa Japan. Political and intellectual elites in the late Tokugawa period were more concerned with the declining economic status of the ruling class (who served as government officials) than their misconduct (Jansen 1989; Totman 1993).

Bribes and gifts to tax officials were the common form of illegal exaction in Japan. However, historical accounts indicate that corruption was reasonably contained after the reforms in the 1680s (Nishizawa 2004, Chap. 2). For example, Teranishi Takamoto, a well-known magistrate in the 1790s, did extensive research and estimated that in a typical local jurisdiction of 50,000 koku, villagers spent about 500 ryo in bribing or entertaining the magistrate and his subordinates, which amounted to 1.0 % of the agricultural output.Footnote 37 Similarly, according to village records in the 1860s, when petitioning for a tax reduction in the face of a poor harvest, peasants in a village of 1,000 koku spent 13.6 ryo as gifts to tax officials who came for harvest inspection. This is equivalent to 1.6 % of the village’s total output.Footnote 38 Case studies suggest that bribery on this scale was probably widespread and tolerated by the Japanese lords.

To obtain an upper bound estimate of corruption in Japan, we consider one of the worst corruption incidents that was uncovered and prosecuted by the shogunate (Matsuo 1995; Nishizawa 2004). In 1794, multiple villages in Osaka were found to have falsely obtained a tax reduction for harvest failure by bribing the local officials. Subsequently, 32 tax officials and 86 village leaders were charged with corruption and severely punished. Even in this case, the bribes constituted 12.8 % of total output,Footnote 39 which is considerably below Ni and Van’s (2006) estimate of 20 % in Qing China.

4.2 Tax rate

The model predicts a lower tax rate in the larger dictatorship (Result 2). It further predicts that tax rates will decline with population growth, but at a faster rate in the larger dictatorship. Assuming that per capita output was comparable between China and Japan, per capita tax revenue is a proxy for the tax rate. As shown in Fig. 1, per capita tax revenue was consistently higher in the Tokugawa shogunate than in China. Furthermore, per capita tax revenue fell over time in both regimes but at a faster rate in China.

Intra-country comparisons within China as well as within Japan provide further evidence in support of Result 2. Sng (2014) has shown that the Qing state collected more taxes in regions closer to the capital where the imperial court could monitor the tax officials better. Although fiscal information on the smaller Japanese domains is fragmented, existing evidence suggests that tax rates were higher outside the shogunate (Nakabayashi 2012). Compared to an average tax rate of 34 % in the Tokugawa shogunate, the lord of Aizu taxed his peasants at 50–55 % between 1637 and 1764 (Furushima 1963). In Choshu domain, agricultural outputs were taxed at an average rate of 40 % in 1840 (Nishikawa 1985). As Fig. 6 illustrates, tax rates in Kumamoto were also higher than those in the shogunate (Miyamoto 2004; Hosokawa Hanseishi Kenkyuukai 1974).Footnote 40

Fig. 6 Tax rate. Sources: Ohno (1996), Miyamoto (2004) and Hosokawa Hanseishi Kenkyuukai (1974)

Importantly, unlike the case in early modern Europe where “war made the state and the state made war” (Tilly 1975), high tax rates in Japan were not driven by interstate competition. The Tokugawa era was one of extraordinary peace. In the two centuries after the Shimabara rebellion (1637–1638), no major armed incident occurred. Until the West forced Japan to open up in the 1850s, tensions between the shogunate and local domains were never high enough to make war a real possibility.

4.3 Population growth and fiscal change

According to demographic trends, the Tokugawa era can be divided into two sub-periods. From 1600 to the early 1700s, population grew from 12 million to 30 million and towns and cities proliferated. From 1700 to 1850, however, Japan’s population stayed at around 30 million (Fig. 7a). As Fig. 8a shows, aggregate tax revenue of the shogunate evolved in tandem with population change: land tax revenues grew steadily before the early eighteenth century and stayed more or less flat afterwards.Footnote 41

Fig. 7 Population. Sources: Hayami and Miyamoto (1988), Perkins (1969) and Cao (2000)

Fig. 8 Aggregate tax revenue (Real). Sources: Ohno (1996) and Sng (2014)

In China proper, the population expanded steadily from the late 1600s to around 1850 (Fig. 7b). However, the Qing state’s tax revenue peaked in the first half of the eighteenth century and tailed off from then on (Fig. 8b). Ironically—but consistent with Result 4—the turning point occurred in the midst of the High Qing period, when the Chinese economy was expanding steadily and interregional trade was flourishing (Shiue and Keller 2007).Footnote 42

According to Fig. 8, even though the Qing state collected lower taxes per capita, its aggregate tax revenue remained far greater than that of the Tokugawa shogunate throughout the period. If we assume that a ruler maximizes aggregate tax revenue and not per capita revenue, then it might have been perfectly rational for the Qing emperor to prefer to govern a large empire.

4.4 Provision of local public goods

Finally, we compare the provision of local public goods by the state in China and Japan. For our analysis, it is important to distinguish two categories of local public goods: local public goods that serve (one or several) local communities and those that serve a wide region (or province). Historical studies indicate that in early modern China and Japan, many of the community-level local public goods, such as local roads and irrigation, were privately provided on the initiative of local elites.Footnote 43

By contrast, region-level local public goods, such as coinage and intercity transportation, require the involvement of the state to overcome coordination problems that arise from their greater scale of provision. Result 3 predicts that per capita provision of these public goods would be lower in Qing China than in the Tokugawa shogunate. In what follows, we compare state provision of the following local public goods in the two regimes: (a) coinage, (b) transportation network, (c) urban management, (d) forest protection, and (e) famine relief (Table 1).

Table 1 Public goods provision in Qing China and Tokugawa Japan

4.4.1 Coinage

The circulation of quality, standardized coins helps to reduce the cost of everyday transactions. The Tokugawa shogunate produced gold, silver, and copper coins. The Chinese state minted copper coins only. In the absence of a reliable government-issued, large-denomination currency, the Chinese used silver bullion and foreign-denominated silver coins for large transactions. As Deng (2008) put it, “China’s silver stock was made of a collage of pieces in just about all shapes, sizes and qualities under the sun.”

Lin (2006) suggests that even in its heyday, the Qing state did not produce enough copper coins to satisfy the needs of its growing population. As a result, it had to tolerate the use of counterfeit coins to relieve currency scarcity. When the output of the Qing mints peaked between 1756 and 1765, national production reached 3,640 million pieces annually, or 15 copper coins per head. By comparison, the shogunate produced 1,096 million copper coins annually between 1764 and 1788, or 35 pieces per head (Table 1a).Footnote 44

4.4.2 Transportation network

The Tokugawa period witnessed the development of an extensive road network nationwide. The shogunate built a system of five major highways (Gokaido), centered on Edo (Fig. 3). Local lords, too, constructed roads and bridges to facilitate the flow of goods from rural areas to their castle towns (Yamamoto 1993).Footnote 45

The shogunate also built a coastal transportation network to bring personnel and goods to Edo (Yamamoto 1993). Coastal waters were charted and lighthouses built to guide ships through the rocky coastline. In the 1670s, the shogunate established two shipping routes—the eastern sea circuit and the western sea circuit (Fig. 3)—which together formed a complete loop surrounding the main island of Honshu and lowered transport costs (Nakai and McClain 1998, pp. 164–5).

By contrast, the Chinese rulers did relatively little to improve their transport infrastructure. With the notable exception of trade along the Grand Canal, most long-distance trade was carried out among regions either well served by natural inland waterways or along the coast. Many roads appeared to be poorly built (Pomeranz 2000). Schran (1978) observes that “as a rule, the rivers and lakes were not made more passable for boats by the removal of obstacles such as rocks, silt, and debris, by the dredging or marking of channels, by the construction of tow paths, etc,” and “the Chinese people adapted to this limited involvement of the government in communication by ‘struggling’ on their own (individually or in groups) against the natural elements as well as each other.”

In Table 1b, we use the length of trunk roads as a crude measure of state investment in transportation. The Qing imperial postal system, which the imperial court relied on to maintain communications with the rest of the country, was about 13,770 km long, or eight times the length of the Gokaido (Fig. 2). However, this implies a trunk road density (length divided by domain size, in km per 100 km\(^{2}\)) of only 0.26, compared to 3.37 in the shogunate if we assume that the Gokaido served only the shogunate domain. Even if we divide the length of the Gokaido by the whole of Japan, the resulting road density, at 0.51, would still be twice that of China.Footnote 46
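The density figures can be cross-checked with back-of-the-envelope arithmetic. The sketch below backs out the magnitudes implied by the rounded numbers in the text; the implied lengths and areas are derived quantities, not values reported in the sources, and the helper name `implied_area_km2` is hypothetical.

```python
def implied_area_km2(length_km: float, density_per_100km2: float) -> float:
    """Back out the land area implied by a road length and a density in km per 100 km^2."""
    return length_km / density_per_100km2 * 100.0


QING_POSTAL_KM = 13_770.0           # from the text
GOKAIDO_KM = QING_POSTAL_KM / 8.0   # implied by "eight times"; a rough approximation

print(f"Gokaido length   ~ {GOKAIDO_KM:,.0f} km")
print(f"Qing area        ~ {implied_area_km2(QING_POSTAL_KM, 0.26):,.0f} km^2")
print(f"Shogunate domain ~ {implied_area_km2(GOKAIDO_KM, 3.37):,.0f} km^2")
print(f"All of Japan     ~ {implied_area_km2(GOKAIDO_KM, 0.51):,.0f} km^2")
print(f"Japan/China density ratio ~ {0.51 / 0.26:.2f}")
```

Because the inputs are rounded, the implied areas are only indicative, but the density ratio confirms that the all-Japan figure is roughly twice the Chinese one.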

4.4.3 Urban management

The state played an active role in Japan’s urban expansion. Local lords transformed their castles into towns as they strove to expand their tax base (McClain 1980). As these castle towns grew, their rulers imposed detailed regulations and devised new systems of urban administration (Nakai and McClain 1998). For example, after a big fire in 1657, the shogunate created open spaces in Edo to serve as fire breaks (Hanley 1987). Professional firefighting units were set up and watchtowers were built.Footnote 47 Measures were also taken to ensure that waste materials were properly recycled or disposed of, and streets and waterways were kept clean and open in Edo as well as in the smaller towns and cities (Hanley 1987).

Contrary to Max Weber’s claim that a heavy state presence in Chinese cities stifled China’s economic development, formal administration penetrated far less in Chinese cities than in Japanese ones (Rozman 1973). Over 95 % of the towns and cities in early modern China did not have a permanent bureaucratic presence (Zelin 2004). Unlike the case in Japan, where there was a distinction between the urban magistrates (machi-bugyō) and the rural ones (daikan), Chinese counties were not functionally differentiated into urban and rural units.Footnote 48 The urban infrastructure in China was less developed than in Japan (Mosk 2011). A Chinese scholar observed in the early 1900s that “the hundred and one undertakings, such as roads, streetlights, removal of rubbish, water supply, school system, police, fire protection, etc., which people of the West are accustomed to regard as functions of a municipal government are, with a few exceptions of recent date, never undertaken by the proper government officials” (Rowe 1989, p. 135).

Rozman (1973) calculated that in 1800, Japan’s urbanization rate (16.5 %) was more than twice that of China (5.5 %), and “the most urbanized province of China [Zhili] was considerably less urban than the least urbanized region of Japan [Tohoku]” (Table 1c). Some scholars have pointed out that conventional measures of urbanization may have underestimated China’s true level of urbanization, for these measures overlook the proliferation of small market towns in early modern China (Li 2000; Brandt et al. 2014). Our comparative analysis shows that the lack of state leadership in solving urban collective action problems may help to explain why, instead of seeing its largest cities growing, China’s “urbanization” took such a unique path.

4.4.4 Forest preservation

Population growth and urbanization brought about rapid deforestation in seventeenth-century Japan. By the mid-seventeenth century, few prime forests were still in existence. Because environmental degradation threatened the long-term economic and security interests of the shogun and local lords, they responded by issuing regulations to restrict entry into forests and clearance of woodland for cultivation. Over time, they created new administrative bodies (e.g., the Kinai Office of Erosion Control) and positions (e.g., forest magistrates) to enforce the regulations, demanded the compilation of forest registers to track illegal logging, set up inspection points along rivers and roads to detect smugglers, and implemented sumptuary rules to prohibit the use of precious timber on “wasteful” activities.Footnote 49 Attempts were also made to delineate the boundaries between domains as well as between villages to avoid “the tragedy of the commons.” Finally, the shogunate and some domains actively promoted reforestation programs (Totman 1989).

Early modern China, too, experienced rapid deforestation (Elvin and Liu 1998; Elvin 2004; Marks 2012). Like the Japanese rulers, the Qing emperor was acutely aware of the problem and the threat it posed to social stability. When flash floods caused by excessive land reclamation plagued the middle reaches of the Yangzi River in the second half of the eighteenth century, the state intervened spontaneously. However, its efforts were thwarted by corruption and inefficient administration. Zhang (2006, p. 100) observes that the government’s attempt to issue new regulations and throw resources at the problem was unsuccessful and “money was wasted on a top-heavy, inefficient, and corrupt bureaucracy.”Footnote 50 According to McCaffrey (2003), the Qing state’s inability to manage the rivalry among local communities in the region was a major contributing factor to the White Lotus Rebellion in 1796.

Saito (2009) provides a quantitative measure to compare environmental preservation outcomes in China and Japan. Between 1600 and 1850, the estimated woodland area in Japan fell from 27 million hectares to 25.5 million hectares, and the movement between the two time points followed a U-shaped trajectory: forest cover first contracted sharply before rebounding. In Lingnan, a region in South China that “share[d] much the same flora and climate” as Japan, forest-covered area was almost halved from 18.3 million hectares in 1700 to 9.6 million hectares in 1850 (Table 1d).
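As a quick arithmetic check on the figures quoted from Saito (2009), the following Python sketch computes the net percentage change in forest area for each region. The function name and the dictionary layout are illustrative assumptions. Only the endpoint estimates come from the text, so the U-shaped path within the period for Japan is not represented.

```python
def percent_change(start, end):
    """Net percentage change from start to end."""
    return (end - start) / start * 100.0

# Endpoint estimates in million hectares, as quoted in the text (Saito 2009).
forest_area = {
    "Japan, 1600-1850": (27.0, 25.5),
    "Lingnan, 1700-1850": (18.3, 9.6),
}

# Net change for each region over its own observation window.
changes = {region: percent_change(a, b) for region, (a, b) in forest_area.items()}

for region, change in changes.items():
    print(f"{region}: {change:.1f}%")
```

On these endpoints, the net decline is roughly 5.6 % for Japan and 47.5 % for Lingnan, consistent with the description of Lingnan's forest cover as "almost halved." The two windows differ in length (250 versus 150 years), so the comparison concerns net outcomes and not annual rates.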

4.4.5 Famine relief

For agrarian societies, crop failures could undermine social stability. To mitigate this threat, the Qing emperors built a nationwide system of public granaries known as the ever-normal granaries. Located in the provincial, prefectural, and county capitals, these granaries were managed by the local magistrates and were expected to perform two main functions: famine relief in natural disasters and price smoothing in normal times (buying low and selling high) (QSG, 1976 [1927], juan 121).

The size and frequency of the Qing state’s granary operations reached a peak in the mid-eighteenth century, and by the 1780s, the system was on a path of decline (Will and Wong 1991). Corruption was considered to be a main factor. In 1781, a major embezzlement case was exposed in the northwestern province of Gansu, where provincial officials conspired to falsely report droughts and carry out phantom relief operations. In 1792, when Emperor Qianlong instructed the governor-general of Zhili to provide relief to famine-stricken areas, he found out that the province’s reported grain reserve was grossly inflated and it had no capacity to execute his order (QSL, QL juan 1417).

In Japan, the shogunate established a nationwide system of rice stockpiles in 1633 in which a fixed amount of rice was stored in over 50 castles in various domains for military and emergency purposes (Yanagitani 1985, 1989). During the Kyoho famine in 1732, the shogunate used this system to send a large volume of rice to disaster areas and successfully contained the damage (Yanagitani 1985; Kikuchi 1997). The same system, however, proved inadequate in coping with the much greater Tenmei famine from 1783 to 1786, which resulted in heavy casualties in the northern domains and major riots in Edo and Osaka. To avoid repeating this dismal experience, the shogunate set up nonmilitary granaries modeled after China’s and (1) ordered magistrates to create village granaries and store 0.1 % of rice output (0.2 % of other grains) throughout the shogunate domain; (2) established ever-normal granaries in major cities; and (3) ordered local lords to build granaries in their domains and store 2.5 % of rice output (Ando 2000).Footnote 51

The new granaries complemented the existing ones and played a major role in mitigating the impact of the Great Tempo Famine from 1832 to 1837, especially in Edo (Yoshida 1991; Kikuchi 1997). Right after the famine, the shogunate ordered local lords to stock an additional 2.5 % of rice output in their granaries (Ando 2000).

The shogunate made periodic inspections to monitor the amount of rice in the castle and city granaries. Although the actual amount often fell short of the stipulated amount due to high maintenance and replenishment costs, corruption did not appear to be a major problem. Ando (2000) reports corruption cases in Edo involving town leaders overstating the number of needy households to misappropriate money, but the degree of overstatement was small. Although data are scarce, our estimates indicate that the amount of emergency reserves increased steadily from 1750 to 1850 in Japan, reflecting the successive reforms. In the shogunate domain, the amount of reserves, measured in husked rice, grew from less than 300,000 koku in 1751 to 368,000 koku in 1843, and to 555,000 koku in 1861; in per capita terms, it rose from less than 0.038 koku in 1751 to 0.046 koku in 1843, and to 0.068 koku in 1861 (Table 1e).Footnote 52

In China, the amount of grain stockpiles in public granaries increased initially from 14.7 million koku in 1751 to 17.0 million koku in 1782, but declined subsequently to 11.9 million koku in 1843, and to 10.3 million koku in 1850. In per capita terms, it declined from 0.065 koku in 1751 to 0.060 koku in 1782, and fell further to 0.030 koku in 1843 and to 0.025 koku in 1850.Footnote 53

In other words, while the Qing emperor had a higher capacity to provide famine relief than the Tokugawa shogun in the mid-eighteenth century, their positions were reversed by the mid-nineteenth century, which is consistent with our theoretical predictions.Footnote 54
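The reversal can be illustrated with a short Python sketch that takes the per capita reserve figures quoted above and computes the China-to-Japan ratio for the years in which both series are reported. The variable and function names are illustrative assumptions. The Japanese figures refer to the shogunate domain only, and the 1751 value is reported only as "less than 0.038," so it is treated here as an upper bound.

```python
# Per capita reserves in koku, as quoted in the text.
# Japan's 1751 value is given only as "less than 0.038"; using 0.038 itself
# is an assumption of this sketch and makes it an upper bound.
japan_per_capita = {1751: 0.038, 1843: 0.046, 1861: 0.068}
china_per_capita = {1751: 0.065, 1782: 0.060, 1843: 0.030, 1850: 0.025}

def china_to_japan_ratio(year):
    """Ratio of China's per capita reserve to Japan's in a given year."""
    return china_per_capita[year] / japan_per_capita[year]

# Only years reported for both countries can be compared directly.
common_years = sorted(set(japan_per_capita) & set(china_per_capita))
ratios = {year: china_to_japan_ratio(year) for year in common_years}

for year, ratio in ratios.items():
    print(f"{year}: China/Japan per capita reserve ratio = {ratio:.2f}")
```

Because the 1751 Japanese figure is an upper bound, the 1751 ratio of about 1.71 is a lower bound on China's initial advantage. By 1843 the ratio falls to about 0.65, meaning that per capita reserves in the shogunate domain exceeded those in China.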

5 Conclusion

In this paper, we provide a comparative and historical institutional analysis of state capacity in Qing China and Tokugawa Japan. We show that despite a common cultural, institutional, and technological heritage, there was already a fiscal divergence between China and Japan before their first attempts at modernization in response to the arrival of the West.

Theoretically and empirically, we demonstrate that the extraordinary geographical size of China imposed increasingly insurmountable constraints on the regime’s capacity to collect taxes and provide essential local public goods as its economy expanded.Footnote 55 It is our conjecture that this factor alone might have been sufficient to hold back China’s transition from stagnation to growth even in the absence of Western imperialism.Footnote 56

By contrast, aided partly by Japan’s geographical compactness, the Tokugawa shogunate and the local lords were able to perform basic functions and thereby maintain social order up until the arrival of the Black Ships.Footnote 57 Although the shogunate was replaced by a coalition of powerful local lords during the Meiji Restoration, the new government inherited from the Tokugawa period an established system of political controls and a functioning tax system that provided the newly established state with a sizable fraction of the national income from the outset (Sakane 2011a, b; Nakabayashi 2012).

It is often argued that the paths of China and Japan diverged after 1868 due to the rise of a more effective government in Japan (Beasley 1972; Ma 2004). In this paper, we show that the divergence of their state capacity was well underway before 1850. In light of recent findings on the importance of a proactive state in facilitating the transition from stagnation to growth (Doepke 2004; Galor et al. 2009; Dincecco and Prado 2012), we conjecture that Tokugawa Japan’s legacy of a strong state might have prepared Japan better for the age of industrialization. In other words, we see the proactive Meiji government as a product of Japan’s history, not a radical break from the past.