The Nature of Equilibrium in Macroeconomics: A Critique of Equilibrium Search Theory

The standard Walrasian equilibrium theory requires that the marginal value product of a production factor such as labor be equal across firms and industries. However, productivity dispersion is widely observed in the real economy. Search theory allegedly fills this gap by encompassing apparent disequilibrium phenomena in the neoclassical equilibrium framework. Taking up Lucas and Prescott (1974) as a primary example, we show that the neoclassical search theory cannot explain the observed pattern of productivity dispersion. Non-self-averaging, a concept little known to economists, plays the major role. Empirical observation strongly suggests the presence of disturbing forces which dominate the equilibrating forces due to the optimizing behavior of economic agents. We must seek a new concept of equilibrium in macroeconomics, different from the standard Walrasian equilibrium.


Introduction
In every branch of economics, equilibrium is a central organizing concept. The Walrasian equilibrium, which is arguably the most important of all, was once confined to the realm of microeconomics. Macroeconomics was then synonymous with Keynesian economics. It was taken for granted that Keynesian economics, meant to explain demand deficiency, unemployment, and recession, analyzes a different kind of equilibrium than the Walrasian equilibrium. For example, a famous treatise on general equilibrium theory by Arrow and Hahn (1971) has an independent chapter entitled the Keynesian model. This understanding of macroeconomics has been completely redrawn over the last forty years. Real business cycle (RBC) theory (Kydland and Prescott 1982), now taught at many leading graduate schools all over the world, is basically a macro version of the Walrasian equilibrium theory. A moment of reflection, however, suggests that the standard Walrasian equilibrium cannot account well for the productivity dispersion across firms and industries widely observed in the real economy. Mortensen (2003), for example, documents that the marginal value product of labor differs across firms. Okun (1973) attempted to explain his own celebrated law by way of productivity dispersion in the economy. Obviously, the Walrasian equilibrium, which requires the uniformity of the marginal value product of a production factor like labor, contradicts such well-known empirical findings.
Search theory allegedly fills this gap by encompassing apparent "disequilibrium" phenomena such as unemployment and productivity dispersion in the neoclassical equilibrium framework. Many economists believe that this endeavour succeeded and, therefore, that we can well explain apparent "disequilibrium" phenomena by the standard neoclassical theory. Lucas (1987) concluded his Yrjo Jahnsson Lectures as follows:
The most interesting recent developments in macroeconomic theory seem to me describable as the reincorporation of aggregative problems such as inflation and the business cycle within the general framework of "microeconomic" theory. If these developments succeed, the term "macroeconomic" will simply disappear from use and the modifier "micro" will become superfluous. We will simply speak, as did Smith, Ricardo, Marshall and Walras, of economic theory. If we are honest, we will have to face the fact that at any given time there will be phenomena that are well-understood from the point of view of the economic theory we have, and other phenomena that are not. We will be tempted, I am sure, to relieve the discomfort induced by discrepancies between theory and facts by saying that the ill-understood facts are the province of some other, different kind of economic theory. Keynesian "macroeconomics" was, I think, a surrender (under great duress) to this temptation. It led to the abandonment, for a class of problems of great importance, of the use of the only "engine for the discovery of truth" that we have in economics. Now we are once again, putting this engine of Marshall's to work on the problems of aggregate dynamics. (Lucas, 1987, pp. 107-108)
The purpose of this paper is to show that Lucas' verdict is unwarranted. The neoclassical equilibrium search theory cannot, in fact, explain an important stylized fact of the macroeconomy, namely the pattern of productivity dispersion observed in the real economy.
We need an approach to understanding the macroeconomy that is different from that of microeconomics. To be concrete, in what follows, we take up Lucas and Prescott (1974) as a primary example of the equilibrium search theory. However, the major point of the present paper does not pertain only to their specific model, but is quite generic. In particular, our criticism also applies to another well-known model of equilibrium search due to Mortensen and Pissarides (1994).

Equilibrium Search and Unemployment
Lucas and Prescott (1974) is a model of equilibrium search and unemployment. The model, in the authors' own description, is as follows:
We think of an economy in which production and sale of goods occur in a large number of spatially distinct markets. Product demand in each market shifts stochastically, driven by shocks which are independent over markets (so that aggregate demand is constant) but autocorrelated within a single market. Output to satisfy current period demand is produced in the current period, with labor as the only input. Each product market is competitive.
There is a constant workforce which at the beginning of a period is distributed in some way over markets. In each market, labor is allocated over firms competitively with actual money wages being market clearing. Each worker may either work at this wage rate, in which case he will remain in this market into the next period, or leave. If he leaves, he earns nothing this period but enters a "pool" of unemployed workers which are distributed in some way over markets for the next period. In this way, a new workforce distribution is determined, new demands are "drawn", and the process continues.
In this process, all agents are assumed to behave optimally in light of their objectives and the information available to them. For firms, this means simply that labor is employed to the point at which its marginal value product equals the wage rate. For workers, the decision to work or to search is taken so as to maximize the expected, discounted present value of the earnings stream. In carrying out this calculation, workers are assumed to be aware of the values of the variables affecting the market where they currently are (i.e., demand and workforce) and of the true probability distributions governing the future state of this market and the present and future states of all others. That is, expectations are taken to be rational. (Lucas and Prescott, 1974, p. 190)
Markets are all competitive, so that the marginal value product of labor equals the wage in every market. However, the state of demand, represented by a realization of a stochastic variable s, differs across markets, while at the same time the mobility of labor is not instantaneous. As a consequence, the marginal value products of labor and wages differ across markets. That is, in contrast to the standard general equilibrium model, in the Lucas and Prescott (1974) model productivity dispersion exists in equilibrium, as actually observed in the economy. The problem is the nature of stochastic equilibrium in their model.

Stochastic Equilibrium in Lucas/Prescott Model
The stochastic disturbances in Lucas and Prescott (1974) are the demand shifts s. They are assumed to be independent across markets and the number of markets is large.
By large, we mean either a continuum of markets or a countable infinity. Economically, then, the assumption of independent demand shifts means that aggregate demand is taken to be constant through time. (Lucas and Prescott, 1974, footnote 8, p. 192)
The micro disturbances are assumed to cancel each other out. The central limit theorem is implicitly assumed to hold. As Lucas and Prescott acknowledge, "the direct ancestor" of their model is Phelps's (1969) famous "island model", in which N islands, meant to describe N local markets, are identical in structure and at equal distance from each other. This assumption is common in the literature and may appear innocuous. However, it is actually very special, and it is crucial in leading to a result which does not square with the productivity dispersion actually observed in the economy.
One may think of Lucas and Prescott's N markets as leaves of a one-level tree with N branches from the root. This organization is a special case of multi-level trees; see Chapter 5 of Aoki and Yoshikawa (2007a) for ultrametric trees. In the one-level tree arrangement of N markets, each branch is the same as any other branch because markets are identical by assumption. Then, every one of the N markets can serve as a representative market. Mixing these markets randomly by introducing a probability distribution, as Lucas and Prescott do in their paper (to be specific, their probability distribution Φ on p.198) does nothing to the model. The mixture is identical to any one of the branches; that is, the mixture is again a representative market.
Thus, Lucas and Prescott can describe the determination of the stationary distribution of employment, workforce, and wages or marginal value products in a representative market. On their own assumption, they state as follows:
The distribution of the workforce over locations (indexed by (s, y)) would in this case be the same as the stationary distribution of (s, y) in any one market. (This follows from our assumptions that the number of markets is large and that demand shifts are independent across markets.) (Lucas and Prescott, 1974, p. 202)
The same assumption allows Lucas and Prescott to focus on the means, the characteristics of which are described in a representative market. Specifically, a worker's search depends crucially on the expected present value of search, λ. The maximization exercises (Section 3 of their paper) are done on the assumption that λ is common to all the markets, and that "the search process eliminate rents on average." The focus on the means is justified by the central limit theorem and the assumption that the number of markets N is large. With the normal distribution, for example, the coefficient of variation, that is, the ratio of the standard deviation to the mean, converges to zero as N goes to infinity. This property is called self-averaging.
When the coefficient of variation does not converge to zero even as N goes to infinity, the model is said to be non-self-averaging. In such a case, the focus on the means is not justified even if N is large. In what follows, we explain that a large class of models are, in fact, non-self-averaging, and that self-averaging applies only to very special cases. Furthermore, the observed productivity dispersion points to non-self-averaging.
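As a worked illustration of the self-averaging property, consider the simplest case, assumed here only for concreteness: N independent market-level quantities $x_1, \ldots, x_N$ with a common mean $\mu > 0$ and a common finite variance $\sigma^2$. The symbols $x_i$, $X_N$, $\mu$ and $\sigma$ are introduced for this sketch and do not appear in Lucas and Prescott (1974).

```latex
\begin{align*}
X_N &= \sum_{i=1}^{N} x_i, \qquad
\mathrm{E}[X_N] = N\mu, \qquad
\mathrm{Var}[X_N] = N\sigma^2, \\
\mathrm{CV}(X_N) &= \frac{\sqrt{\mathrm{Var}[X_N]}}{\mathrm{E}[X_N]}
  = \frac{\sigma\sqrt{N}}{N\mu}
  = \frac{\sigma}{\mu\sqrt{N}} \;\longrightarrow\; 0
  \quad \text{as } N \to \infty .
\end{align*}
```

The argument relies on independence and on a finite variance; when either fails, the $1/\sqrt{N}$ decay need not hold.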

Non-Self-Averaging and Power-Law
The normal and Poisson distributions commonly assumed in economics are self-averaging. However, self-averaging does not hold true for a large class of stochastic models. In fact, recent empirical work points to non-self-averaging. For example, the empirical distribution of the marginal value product of labor is found to obey a power law; see Aoyama, Yoshikawa, Iyetomi and Fujiwara (2008). When the summands are power-law distributed, a normalized sum does not have a vanishing coefficient of variation; namely, the system is non-self-averaging.
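A small Monte Carlo sketch may make the contrast concrete. It compares the empirical coefficient of variation of a sum of N independent normal draws with that of a sum of N independent Pareto (power-law) draws. The tail index 0.5, the normal parameters, the seed, and all function names are illustrative assumptions, not values taken from the cited empirical work; the tail index is deliberately chosen heavy enough that the mean does not exist, and how strongly the effect appears depends on that choice.

```python
import random
import statistics


def cv_of_sums(draw, n_terms, replications=200, seed=0):
    """Empirical coefficient of variation of a sum of n_terms i.i.d. draws."""
    rng = random.Random(seed)
    sums = [
        sum(draw(rng) for _ in range(n_terms))
        for _ in range(replications)
    ]
    return statistics.pstdev(sums) / statistics.mean(sums)


def normal_draw(rng):
    # Assumed for illustration: mean 1, standard deviation 1.
    return rng.gauss(1.0, 1.0)


def pareto_draw(rng):
    # Assumed for illustration: Pareto tail index 0.5 (no finite mean).
    return rng.paretovariate(0.5)


if __name__ == "__main__":
    for n_terms in (100, 2500):
        print(
            n_terms,
            cv_of_sums(normal_draw, n_terms),
            cv_of_sums(pareto_draw, n_terms),
        )
```

In the normal case the coefficient of variation should shrink roughly like one over the square root of N, whereas in the heavy-tailed case it stays large because a few extreme draws dominate each sum.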

Non-Self-Averaging: An Example
We can best understand how non-self-averaging arises with the help of a simple model of growth. We assume that the economy grows by innovations. Innovations are stochastic events. There are two kinds of innovations. Namely, an innovation, when it occurs, either raises the productivity of one of the existing sectors, or creates a new sector. Thus, the number of sectors is not given, but increases over time.
By the time the n-th innovation occurs, a total of $K_n$ sectors have been formed in the economy, wherein the i-th sector has experienced $n_i$ innovations ($i = 1, 2, \ldots, K_n$). By definition, the following equality holds:
$$n_1 + n_2 + \cdots + n_k = n \qquad (1)$$
when $K_n = k$. If the n-th innovation creates a new sector (sector k), then $n_k = 1$.
The aggregate output or GDP when n innovations have occurred is denoted by $Y_n$. $Y_n$ is simply the sum of outputs in all the sectors, $y_i$:
$$Y_n = \sum_{i=1}^{K_n} y_i. \qquad (2)$$
Output in sector i grows thanks to innovations which stochastically occur in that sector. Specifically, we assume [equation (3)]. For our purpose, it is convenient to rewrite Equation (1) as follows: [equation (4)].
We now describe how innovations stochastically occur. An innovation follows the two-parameter Poisson-Dirichlet (PD) distribution.³ Given the two-parameter PD(α, θ) distribution, when there are k clusters of sizes $n_i$ ($i = 1, 2, \ldots, k$), and $n = n_1 + n_2 + \cdots + n_k$, an innovation occurs in one of the existing sectors of "size" $n_i$ with probability rate
$$p_i = \frac{n_i - \alpha}{n + \theta}.$$
The "size" of sector i, $n_i$, is equal to the number of innovations that have already occurred in sector i. The two parameters α and θ satisfy the following conditions: $\theta + \alpha > 0$ and $0 < \alpha < 1$.
With α = 0, there is a single parameter θ, and the distribution reduces to the one-parameter PD distribution, PD(θ).
On the other hand, a new sector emerges with probability rate⁴
$$p = \frac{\theta + k\alpha}{n + \theta}.$$
_________________________
2 See Chapter 2 of Aoki and Yoshikawa (2007a) for partition vector.
3 Kingman invented the one-parameter Poisson-Dirichlet distribution to describe random partitions of populations of heterogeneous agents into distinct clusters. The one-parameter Poisson-Dirichlet model is also known as the Ewens model (Ewens 1972); see Aoki (2000a, 2000b) for further explanation. The one-parameter model was then extended to the two-parameter Poisson-Dirichlet distributions by Pitman; see Kingman (1993), Carlton (1999), Feng and Hoppe (1998), Pitman (1999, 2006), and Pitman and Yor (1996), among others. Aoki (2008) has shown that the two-parameter Poisson-Dirichlet models are qualitatively different from the one-parameter version because the former is not self-averaging while the latter is. These models are therefore not exponential growth models familiar to economists; they belong to a broader class of models without a steady-state constant exponential growth rate. None of the previous works, however, has comparatively examined the asymptotic behavior of the coefficient of variation of these two classes of models.
It is important to note that in this model, sectors are not homogeneous with respect to the probability that an innovation occurs. The larger sector i is, the greater the probability that an innovation occurs in sector i becomes. Moreover, these probabilities change endogenously as $n_i$ changes over time.
In the two-parameter PD(α, θ) distribution, the probability that the number of sectors increases by one at n + 1, conditional on $K_n = k$, is given by⁵
$$P(K_{n+1} = k + 1 \mid K_n = k) = \frac{\theta + k\alpha}{n + \theta}. \qquad (10)$$
On the other hand, the corresponding probability that the number of sectors remains unchanged is
$$P(K_{n+1} = k \mid K_n = k) = \frac{n - k\alpha}{n + \theta}. \qquad (11)$$
It can be shown that this two-parameter PD model is non-self-averaging. Namely, in the two-parameter PD model, the aggregate output $Y_n$ becomes non-self-averaging (Aoki 2008; Aoki and Yoshikawa 2007b). We note that the one-parameter PD model (α = 0) is self-averaging. It is then important to understand why the two-parameter PD model is non-self-averaging. The answer lies in (10) and (11).
In this model, innovations occur in one of two different types of sectors: a new type, or the known, pre-existing types. The probability that an innovation generates a new sector is $(\theta + K_n\alpha)/(n + \theta)$, whereas the probability that an innovation occurs in one of the existing sectors is $(n - K_n\alpha)/(n + \theta)$. $K_n$ is the number of types of sectors in the model by the time n innovations have occurred. Plainly, these probabilities and their ratio vary endogenously, depending on the history of how innovations occurred. In other words, the mix of old and new sectors evolves endogenously and is path-dependent. Specifically, the greater the number of existing sectors is, the greater the probability that a new sector emerges becomes. This kind of "size effect" on probability is the reason why non-self-averaging emerges in the two-parameter PD model. We note that in the one-parameter PD model, in which α = 0, the two probabilities (10) and (11) become independent of $K_n$, and that the model becomes self-averaging.
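The role of α can be checked directly from the transition probabilities. The sketch below assumes the usual two-parameter form in which an existing sector of size $n_i$ receives the next innovation with probability $(n_i - \alpha)/(n + \theta)$; this per-sector rate is an assumption chosen to be consistent with the aggregate probabilities quoted in the text. It shows that the probabilities sum to one and that the number of sectors k drops out when α = 0.

```latex
\begin{align*}
\sum_{i=1}^{k} \frac{n_i - \alpha}{n + \theta} + \frac{\theta + k\alpha}{n + \theta}
  &= \frac{(n - k\alpha) + (\theta + k\alpha)}{n + \theta} = 1, \\
\alpha = 0: \qquad
\frac{\theta + k\alpha}{n + \theta} &= \frac{\theta}{n + \theta}, \qquad
\frac{n - k\alpha}{n + \theta} = \frac{n}{n + \theta}.
\end{align*}
```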
The example explained above is a growth model. However, it should be easy to see that the point is generic. Namely, the two-parameter Poisson-Dirichlet model, in which the probabilities vary endogenously depending on the history of the "events", leads us to non-self-averaging. To the extent that a kind of "size effect" on probabilities ((10) and (11) above) is generic, we should expect that non-self-averaging is generic.
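The size effect can be illustrated numerically with a minimal sketch. It tracks only the number of sectors $K_n$, not aggregate output $Y_n$, because $K_n$ evolves according to the transition probabilities (10) and (11) alone; the exact distribution of $K_n$ is then built up one innovation at a time. The parameter values (θ = 1, α = 0 versus α = 0.5), the assumption θ > 0, and the function names are choices made for this illustration only.

```python
import math


def sector_count_distribution(n_innovations, alpha, theta):
    """Exact distribution of K_n: entry k is P(K_n = k).

    Assumes theta > 0, so the first innovation always opens a sector.
    """
    probs = [1.0]  # before any innovation there are zero sectors
    for n in range(n_innovations):
        nxt = [0.0] * (len(probs) + 1)
        for k, p in enumerate(probs):
            if p == 0.0:
                continue
            # Probability (10): a new sector appears given k sectors so far.
            p_new = (theta + k * alpha) / (n + theta)
            nxt[k + 1] += p * p_new
            # Probability (11): the innovation lands in an existing sector.
            nxt[k] += p * (1.0 - p_new)
        probs = nxt
    return probs


def sector_count_cv(n_innovations, alpha, theta):
    """Coefficient of variation of K_n computed from its exact distribution."""
    probs = sector_count_distribution(n_innovations, alpha, theta)
    mean = sum(k * p for k, p in enumerate(probs))
    second = sum(k * k * p for k, p in enumerate(probs))
    return math.sqrt(second - mean * mean) / mean


if __name__ == "__main__":
    for alpha in (0.0, 0.5):
        for n in (100, 1000):
            print(alpha, n, sector_count_cv(n, alpha, theta=1.0))
```

With α = 0 the coefficient of variation of $K_n$ declines as n grows, although only slowly, since the mean grows logarithmically in n. With α = 0.5 it stays roughly constant instead of shrinking, which is the behavior the text calls non-self-averaging.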

_________________________
4 Probabilities of new types entering the Ewens model are discussed in Aoki (2002, Sec. 10.8, App. A.5).
5 Because the following inequality holds:
$$\frac{\theta + k\alpha}{n + \theta} > \frac{\theta}{n + \theta},$$
we observe that the probability that a new sector emerges is higher in the two-parameter PD model than in the one-parameter PD model.

www.economics-ejournal.org
Despite this fundamental fact, virtually all the models of equilibrium search rest naively on the assumption of self-averaging.

Concluding Remarks
In this paper, we explained that self-averaging, taken for granted by economists, is not actually so robust but holds true only for a limited class of models. When a model is non-self-averaging, we cannot legitimately focus on the means. This, in turn, means that the maximization exercises done for the representative agent or a representative market are meaningless.
The fact that the empirical productivity dispersion obeys a power law (Aoyama, Yoshikawa, Iyetomi and Fujiwara 2008) rather than the normal distribution strongly suggests that the macroeconomy is non-self-averaging. This has an extremely important implication.
The optimizing behavior of economic agents introduced into modern micro-founded macroeconomics produces a "regression towards means" because a price vector common to all the economic agents guides them that way: workers move away from low-productivity sectors toward high-productivity sectors. In the limit, in the standard Walrasian model, the equilibrium price vector equates the marginal conditions across agents. In the Lucas/Prescott model, the mobility of labor is not perfect, and as a result, productivity dispersion persists. However, guided by the expected present value of search λ common to all the workers, labor flows out of low-productivity sectors toward high-productivity sectors. This process, which Lucas and Prescott analyze in detail, necessarily narrows dispersion.
The power-law distribution of productivity as actually observed, however, suggests that the disturbances to the macroeconomy which generate non-self-averaging actually dominate the "regression towards means" due to the optimizing behavior of economic agents. Thus, productivity dispersion in the macroeconomy cannot be properly accounted for by equilibrium search theory such as Lucas and Prescott (1974), which rests heavily on the assumption of self-averaging and on maximization exercises.
To understand productivity dispersion, we must explore the disturbing forces generating non-self-averaging rather than the equilibrating forces due to the optimizing behavior of economic agents. Some such disturbing forces are analyzed empirically by Davis, Haltiwanger, and Schuh (1996) under the heading of job creation and destruction.
An important research topic is to explore the stochastic process of these disturbing forces, or equivalently, the nature of the "stochastic macro-equilibrium" due to Tobin (1972). We explained in Section 4 that non-self-averaging emerges when "size effects" on probabilities are present. Simon (1975 and 1977) presents a model in which a power law emerges. The stochastic equilibrium in the macroeconomy must be analyzed by such models, in which disturbing forces generate a power law and non-self-averaging. Perhaps surprisingly, this resurrects the old Keynesian economics or the principle of effective demand; see Yoshikawa (2003), Chapter 3 of Aoki and Yoshikawa (2007a), and Aoyama, Yoshikawa, Iyetomi and Fujiwara (2008).