Imitators and optimizers in a changing environment

https://doi.org/10.1016/j.jedc.2007.06.001

Abstract

We analyze the repeated interaction between an imitating and a myopically optimizing player in an otherwise symmetric environment of changing marginal payoff. Focusing on finite irreducible environments and the case of strategic substitutes, we unfold a trade-off between the degree of interaction and the size of environmental shocks. The optimizer outperforms the imitator if interaction is weak or shocks are large. In the case of duopoly, this translates into small cross-price elasticities or large shocks in marginal cost and/or the maximum willingness to pay. In these cases, a changing environment creates selection pressure against imitative behavior.

Introduction

The purpose of this paper is to analyze which type of behavioral rule, imitation or myopic optimization, earns the higher payoff in submodular games with randomly perturbed marginal payoffs. Our motivation for looking into this issue comes from the fact that the evolutionary literature has shown that more sophisticated optimizers are often outperformed by less sophisticated imitators (Conlisk, 1980; Droste et al., 2002; Schipper, 2001). In particular, imitators earn strictly higher payoff in submodular (or Cournot-type) games if interaction takes place in a stable environment. In this case the imitation rule works like a commitment device, inducing the imitator to bring higher quantities to the market and thus leading to higher relative payoff.

Presuming a stable environment is a strong assumption. First, cost and/or demand conditions normally change over the business cycle; in that sense, the models referred to above make an unrealistic assumption. Second, stability implicitly builds in an advantage for imitative behavior. Within an environment of changing demand or cost conditions, imitation becomes a riskier behavioral rule. For instance, an imitator might mimic a firm that produced a high quantity last period, not taking into account that demand is lower in the current period. Whenever this leads to a market price below average cost, the imitator will incur lower payoff than an optimizer, who produces a lower quantity because of the reduced demand.

In this paper, we consider a two-player symmetric game. To introduce a changing environment, we assume that marginal payoff follows an exogenous and stationary Markov chain on a finite environmental state space. One player, the optimizer, always plays the myopic best reply to his rival's previous action, given the current state. The other, the imitator, copies whichever action earned the highest payoff in the previous period. Unlike other papers in the evolutionary literature, we add no noise to the players' actions.
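As a concrete illustration, the following minimal sketch (in Python, with parameter values of our own choosing; the payoff function anticipates the quadratic specification of Section 2) implements one period of this dynamic:

```python
import numpy as np

def payoff(q_own, q_other, theta, p11=2.0, p12=1.0, p0=0.0):
    """Quadratic stage payoff from Section 2: -p11*q^2/2 - p12*q_own*q_other
    + theta*q_own + p0, where theta plays the role of the marginal-payoff
    parameter pi_1."""
    return -p11 * q_own**2 / 2 - p12 * q_own * q_other + theta * q_own + p0

def best_reply(q_other, theta, p11=2.0, p12=1.0, q_max=10.0):
    """Myopic best reply: the unconstrained maximizer (theta - p12*q_other)/p11,
    clipped to the action space [0, q_max]."""
    return float(np.clip((theta - p12 * q_other) / p11, 0.0, q_max))

def step(q_opt, q_imi, theta_prev, theta_now):
    """One period: the optimizer best-replies to the imitator's previous action
    under the current state; the imitator copies whichever of the two previous
    actions earned the higher payoff under the previous state."""
    u_opt = payoff(q_opt, q_imi, theta_prev)
    u_imi = payoff(q_imi, q_opt, theta_prev)
    # Tie-breaking (imitator keeps its own action) is our assumption;
    # the paper's exact rule is specified in the full text.
    return best_reply(q_imi, theta_now), (q_opt if u_opt > u_imi else q_imi)
```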

Two factors shape the interaction between the two behavioral rules. First, the degree of interaction (measured by the slope of the best-response function) determines how strongly the optimizer responds to a change in the imitator's quantity. Second, the size of environmental shocks determines how costly it is not to adjust to the actual state of the environment. This trade-off is reflected in our main result, Theorem 3.3, according to which the optimizer earns strictly higher expected payoff in the long run if interaction is weak or environmental shocks are large.
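A self-contained simulation along these lines can make the trade-off visible by sweeping the cross effect and the spread of the environmental states. The parameter values and the i.i.d. two-state environment below are our simplifying assumptions, not the paper's setup:

```python
import numpy as np

def avg_payoff_gap(p12, spread, T=50_000, p11=2.0, q_max=10.0, seed=0):
    """Average per-period payoff difference (optimizer minus imitator) over
    T periods, with theta drawn i.i.d. from {1 - spread, 1 + spread}."""
    rng = np.random.default_rng(seed)
    payoff = lambda q, q_rival, th: -p11 * q**2 / 2 - p12 * q * q_rival + th * q
    thetas = (1.0 - spread, 1.0 + spread)
    th_prev = rng.choice(thetas)
    q_opt = q_imi = 0.5
    gap = 0.0
    for _ in range(T):
        th_now = rng.choice(thetas)
        u_opt = payoff(q_opt, q_imi, th_prev)   # payoffs earned last period
        u_imi = payoff(q_imi, q_opt, th_prev)
        q_opt, q_imi = (np.clip((th_now - p12 * q_imi) / p11, 0.0, q_max),
                        q_opt if u_opt > u_imi else q_imi)
        gap += payoff(q_opt, q_imi, th_now) - payoff(q_imi, q_opt, th_now)
        th_prev = th_now
    return gap / T

print(avg_payoff_gap(p12=0.1, spread=0.5))   # weak interaction, large shocks: expect > 0
print(avg_payoff_gap(p12=1.9, spread=0.01))  # strong interaction, tiny shocks: may be <= 0
```

The first call probes the weak-interaction/large-shock regime of Theorem 3.3; the second probes the opposite corner, where the stable-environment results cited above suggest the imitator should do at least as well.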

The intuition can be understood by considering the limit where the degree of interaction is zero. In this limit, both players act like independent monopolists serving separate but identical markets. The myopic optimizer always chooses a monopolist's payoff-maximizing quantity, while the imitator mimics the optimizer's previous quantity. Obviously, the imitator earns strictly lower relative payoff whenever the environment has changed, since a different environmental state implies a different monopoly quantity. Thus, for a given set of environmental states, the optimizer is better off whenever interaction is sufficiently weak. The picture becomes blurred when we address the reverse question: it is not at all clear whether, for any given degree of interaction, there exists an environment such that the optimizer outperforms the imitator. To tackle this problem, we provide an upper bound on the degree of interaction such that for all lower degrees the question can be answered in the affirmative.
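A back-of-envelope calculation (our own, using the quadratic payoff specified in Section 2 with the cross effect set to zero and assuming interior quantities) makes the imitator's loss explicit. With zero interaction the stage payoff reduces to $u(q) = -\tfrac{\pi_{11}}{2} q^2 + \theta q + \pi_0$, maximized at the monopoly quantity $q^*(\theta) = \theta/\pi_{11}$. The optimizer plays $q^*(\theta_t)$ while the imitator mimics $q^*(\theta_{t-1})$, so the period-$t$ payoff gap is
$$u\bigl(q^*(\theta_t)\bigr) - u\bigl(q^*(\theta_{t-1})\bigr) = \frac{\pi_{11}}{2}\bigl(q^*(\theta_t) - q^*(\theta_{t-1})\bigr)^2 = \frac{(\theta_t - \theta_{t-1})^2}{2\pi_{11}},$$
which is strictly positive whenever the state has changed.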

To establish our results, we rely on the theory of Markov chains with continuous state space and discrete time (Meyn and Tweedie, 1996). In particular, we show that introducing a stochastic environment with finitely many states suffices to make an otherwise deterministic process, defined on uncountable action spaces, ergodic. This setup does not allow us to apply the ergodicity theorems typically used in the evolutionary literature (cf., e.g., Kandori et al., 1993; Young, 1993; Vega-Redondo, 1997). As a further obstacle, our process does not satisfy the (weak) Feller property on the overall state space, because the imitator's response is discontinuous on one of the zero-relative-profit lines. In consequence, we cannot apply the standard ergodicity theorems for continuous-state-space Markov chains either (cf., e.g., Stokey et al., 1989, Chapter 12).

The main step in establishing the evolutionary advantage of the optimizer in changing environments is then (a) to find a recurrent absorbing subset of the state space on which the optimizer earns strictly higher payoff than the imitator and (b) to show that this subset is reached from all states outside it within a number of periods whose expectation is uniformly bounded. On this absorbing subset, the imitator's response becomes a continuous function of the optimizer's previous quantity. Our main theorem builds on this property to show that the process restricted to the absorbing set is a T-chain, a property stronger than the Feller property.

Gale and Rosenthal (1999, 2001) and Rhode and Stegeman (2001) examine the dynamic interaction between different behavioral rules in a changing environment. Rhode and Stegeman (2001, Appendix) present simulations in which two players, one imitator and one econometrician, interact within an occasionally changing environment. The econometrician does not know the payoff function (i.e., the current state) but estimates it by regression on all strategies observed in the past. The simulations show that the imitator outperforms the econometrician in the presence of structural change. Note, however, that Rhode and Stegeman (2001) do not give the econometrician the advantage of observing the current environmental state.

The papers by Gale and Rosenthal study the interaction between a single experimenter and a finite number of imitators. While the experimenter randomly searches for a better strategy, the imitators adjust towards the average action of the other agents. In contrast to Gale and Rosenthal, we obtain two important properties without relying on random experimentation. First, in the terminology of Gale and Rosenthal, our overall process is stable in the large: it converges with probability one to an absorbing subset of the state space. Second, it is unstable in the small: any small subset of the absorbing set is left with probability one. The latter result is a direct consequence of our focus on changing environments. Interestingly, a changing environment thus suffices to obtain these two properties.

The paper is organized as follows. In Section 2 we outline the model, which we subsequently analyze (Section 3). Section 4 discusses our main assumptions. Section 5 concludes.


The stage game and the environment

We consider a symmetric two-player game with players O and I. Every period each player chooses an action $q \in Q \equiv [0, \bar{q}]$. Given the action profile $(q^O, q^I)$ and payoff parameters $\pi = (\pi_{11}, \pi_{12}, \pi_1, \pi_0)$, player $n$'s payoff is
$$u^n = -\frac{\pi_{11}}{2} (q^n)^2 - \pi_{12}\, q^O q^I + \pi_1 q^n + \pi_0 \quad \text{for } n = O, I.$$
Focusing on strategic substitutability, we assume $\pi_{11} > \pi_{12} > 0$ and $\pi_1 > 0$.
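For later reference, the myopic best reply implied by this payoff follows from the first-order condition (our derivation; the clipping to $Q$ handles corner solutions):
$$\frac{\partial u^n}{\partial q^n} = -\pi_{11} q^n - \pi_{12} q^{-n} + \pi_1 = 0 \quad\Longrightarrow\quad q^{\mathrm{BR}}(q^{-n}) = \min\Bigl\{\bar{q},\, \max\Bigl\{0,\, \frac{\pi_1 - \pi_{12}\, q^{-n}}{\pi_{11}}\Bigr\}\Bigr\}.$$
Its interior slope, $-\pi_{12}/\pi_{11} \in (-1, 0)$ under $\pi_{11} > \pi_{12} > 0$, is the degree of interaction referred to in the Introduction: actions are strategic substitutes.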

Let payoff parameter $\pi_1$ follow a stationary Markov chain with state space $\Theta = \{\theta_1, \ldots, \theta_H\}$ and strictly ordered states, $0 < \theta_1 < \cdots < \theta_H$, where $H \geq 2$. Let $\mathcal{H} \equiv \{1, \ldots, H\}$ be the corresponding index set.
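A minimal sketch of such an environment (the two states and the transition matrix below are illustrative assumptions; the analysis only requires a stationary, finite, irreducible chain):

```python
import numpy as np

rng = np.random.default_rng(1)

thetas = np.array([0.8, 1.2])      # illustrative state space Theta, strictly ordered
P = np.array([[0.7, 0.3],          # transition probabilities out of state 1 ...
              [0.4, 0.6]])         # ... and out of state 2 (rows sum to one)

def sample_environment(T, h0=0):
    """Sample a path (theta_t) of length T from the Markov chain with transition matrix P."""
    h, path = h0, np.empty(T, dtype=int)
    for t in range(T):
        h = rng.choice(len(thetas), p=P[h])
        path[t] = h
    return thetas[path]
```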

Analysis

In this section, we provide a condition that allows us to decompose the state space into two sets, one uniformly transient and the other absorbing (Lemmas 3 and 4). Lemma 3 moreover shows that the optimizer earns strictly higher payoff on the absorbing set. Subsequently, we show that the process restricted to the absorbing set constitutes a $\phi$-irreducible aperiodic positive Harris chain (Theorem 3.1). It follows that it has a unique invariant distribution and that the strong law of large numbers applies.

Discussion of the main assumptions

In this section, we comment on the main assumptions of our analysis.

Assumption (E): We check where Assumption (E) enters the analysis. First, we use it to show that the set $X_2$ is left within $H+1$ periods with strictly positive probability. The proof relies on $(\theta_H - \theta_1)/\theta_H > \rho$, which is a weaker condition, since Assumption (E) implies $(\theta_H - \theta_1)/\theta_H > 2\rho(H-1)$. Second, Assumption (E) entails $\bar{q}_h < \underline{q}_{h+1}$ for all $h \in \mathcal{H} \setminus \{H\}$ and hence $\hat{X}_{ij} \cap \hat{X}_{i'j'} = \emptyset$ for $i \neq i'$ or $j \neq j'$. To guarantee these properties, we could …
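The implied inequality can be checked mechanically. A small sketch (the function name is ours, $\rho$ is taken to be the best-response slope $\pi_{12}/\pi_{11}$ suggested in the Introduction, and Assumption (E) itself is only stated in the full text):

```python
def implied_by_assumption_E(thetas, rho):
    """Check (theta_H - theta_1)/theta_H > 2*rho*(H - 1), the inequality the
    text says Assumption (E) implies; since H >= 2, it in turn implies the
    weaker condition (theta_H - theta_1)/theta_H > rho used in the proof."""
    H = len(thetas)
    return (thetas[-1] - thetas[0]) / thetas[-1] > 2 * rho * (H - 1)

print(implied_by_assumption_E([0.5, 1.0], rho=0.1))  # 0.5 > 0.2 -> True
```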

Conclusions

The purpose of this paper has been to analyze the dynamic interaction between an imitator and an optimizer in a changing environment. To this end, we put forward a symmetric quadratic two-player game, recurrently played by one imitator and one myopic optimizer within an environment of changing marginal payoff. To create a stark contrast with the stable-environment case, we investigated the polar opposite: a permanently changing environment. Our focus was on aperiodic stochastic environments.

Acknowledgments

We thank three anonymous referees and the associate editor for helpful comments. The paper was written while the first author was visiting the University of Bergen. He would very much like to thank the people at the Department of Economics for creating such a warm and inspiring research atmosphere and Ruhrgas for making this research visit possible.

