Only Time Will Tell: Credible Dynamic Signaling

This paper characterizes informational outcomes in a model of dynamic signaling with vanishing commitment power. It shows that, contrary to popular belief, informative equilibria with payoff-relevant signaling can exist without requiring unreasonable off-path beliefs. The paper provides a sharp characterization of possible separating equilibria: all signaling must take place through attrition, in which the weakest type mixes between revealing its own type and pooling with the stronger types. The framework explored in the paper is general, imposing only minimal assumptions on payoff monotonicity and single-crossing. Applications to bargaining, monopoly price signaling, and labor market signaling are developed to demonstrate the results in specific contexts.


Introduction
An antitrust case was recently brought against iPhone producer Apple by Epic Games, seeking to determine "whether the market for in-app purchases within the App Store is unfairly monopolistic, and whether iOS itself is a monopoly that should be opened up to third-party stores and side-loaded apps" (Robertson [2021c]). This can be seen as the culmination of long-lasting rumblings among iPhone app developers. For example, as reported by the New York Times a year earlier:
Many companies and app developers complain that Apple forces them to pay its commission to be included in the App Store, which is crucial to reaching the roughly 900 million people with iPhones.... "If you're not in the App Store today, you're not online. Your business cannot function," said Andy Yen, the chief executive of ProtonMail.... "If you want to pass through their gates, they're going to charge you 30 percent of your revenue." (Nicas and McCabe [2020])
One of Apple's main defensive points with regard to its App Store monopoly, both within the scope of this case and more broadly, is that it never raised the commission it charges the app developers, even after it had allegedly acquired substantial monopoly power in the smartphone market.¹ Apple argued that it has, in fact, reduced the commission for some developers as recently as 2020 via its Small Business Program (see Statt [2020] for more details). The judge in the Epic v. Apple hearings was not convinced by this argument and countered that it is not indicative of competition: "The issue with the $1 million Small Business Program, at least from what I've seen thus far: that really wasn't the result of competition. That seemed to be a result of the pressure that you're feeling from investigations, from lawsuits, not competition," said the judge (Robertson [2021b]), referring to the increasing scrutiny that Apple and other tech giants have been facing from regulators in recent years.²
Does Apple's argument have merit? Can the prices that a firm sets serve as evidence of its competitive environment when it faces regulatory pressure? When the firm has market power, it can still price low in an attempt to signal to the regulator that it is pressured by competition, but it also faces a stronger temptation than a genuinely competitive firm to exploit its market power and set high prices. The classical signaling theory à la Spence [1973] suggests that both pooling and separating equilibria can arise, i.e., prices may or may not be informative of the firm's competitive environment. However, the standard signaling models are static, meaning that for their results to apply, the firm in our story needs to be able not only to set the price today, but also to commit to not changing it in the future. Without such commitment, no single pricing decision would seemingly have enough weight to be informative, as originally noted in a different context by Admati and Perry [1987]: if the firm could set some price that conclusively proves it faces strong competition, it would do so for a short time, document this decision, and use it as proof in the future (after reverting to monopolistic pricing).
This paper shows that the scope for informative signaling, while limited, does in fact exist in dynamic settings without commitment, contrary to the intuition above. In particular, it explores a general signaling model in which a single long-lived sender is privately informed of his type and engages in a repeated interaction with a receiver, where the periods are vanishingly short. The receiver makes inferences about the sender's type from his action choices, and the sender's payoff is increasing in his reputation with the receiver. We show that under appropriate monotonicity and single-crossing conditions on the sender's payoff function, payoff-relevant signaling is possible in this setting via what is effectively a war of attrition, in which all sender types pool on the same action, with the lowest type mixing between pooling with the rest and separating to a myopically optimal action. Beyond such attrition, actions are only as informative as cheap talk, meaning that attrition is the only way in which payoff-relevant signaling can proceed in this class of models.
The contribution of this paper is both in showing the existence of a wedge between signaling and cheap talk in the setting under consideration (i.e., that signaling is possible), and in characterizing the equilibrium outcomes.
¹ "In the more than a decade since the App Store debuted, [Apple has] never raised the commission or added a single fee. In fact we have reduced them for subscriptions and exempted additional categories of apps." (Written testimony of Apple CEO Tim Cook, available in Balluck [2020]). This testimony was referred to during the Epic v. Apple hearings (Robertson [2021a]). In particular, during the closing remarks, "Apple countered by claiming that its cut of app and in-app purchases ... didn't go up at all when Epic Games claimed Apple became a monopoly in 2010" (Peterson [2021], emphasis added).
In the context of the Apple example, this paper implies that while singular price drops or cuts cannot be convincing evidence of a lack of monopoly power in equilibrium, persistent pricing at a low level can be suggestive of it. In such an informative equilibrium, a monopolist would mix between maintaining the pooling (competitive) price and raising it to the monopoly level, thereby revealing itself. However, there exists no perfectly separating equilibrium in which the firm charges different prices when it has monopoly power and when it is competing against other firms in the industry.
As a consequence, it is not possible to completely rule out the possibility that the firm has market power based on pricing decisions alone. All this is demonstrated by the applied model in Section 5.2.
The main model explored in this paper is substantially more general than the story above and is intended to serve as a framework suitable for many applications. To illustrate this, Section 5 also develops applications to bargaining and labor market signaling. In the former, a seller who privately knows the value of the item bargains with a single potential buyer over the price of the sale. In the latter, a student privately informed of his ability repeatedly chooses whether to invest in education, which may signal this ability to firms on the job market. Both problems are classical applications of signaling theory. We show that the results from the general model apply, so all informative equilibria in each of these settings must take the attrition form.
Going back to the general model, it is worth noting that even though the attrition structure is restrictive, it allows for nontrivial equilibrium multiplicity. In addition to various possible combinations of informative and uninformative periods, multiple attrition outcomes can be sustained in equilibrium in any given period, which differ in the probability with which the lowest type separates. So while attrition is the unique form that informative equilibria can take, multiple informative equilibria may exist in which attrition proceeds at different speeds. The extent to which this dimension of multiplicity manifests in a given setting depends on the richness of the action set.
Finally, this paper provides a takeaway regarding modelling assumptions that should be valuable to applied theorists investigating whether the receiver perfectly learns the sender's type (equivalently, whether social learning occurs) in a given setting in the limit as $t \to \infty$. In particular, if one adopts the simplifying assumption of there being only two types, then one could plausibly arrive at the conclusion that asymptotic learning is perfect. Yet, as this paper implies, this conclusion would not extend to the setting with finitely many types: learning can only occur regarding the lowest type, but cannot distinguish any of the higher types. In turn, this impossibility result for finite types does not necessarily extend to the setting with an interval of types, where asymptotic learning is possible again (Fuchs and Skrzypacz [2010] provide an example of such a model and equilibrium in the bargaining context). The latter observation implies that a finite-type approximation of a continuous-type signaling model may produce misleading results.
The fact that informative equilibria of attrition form exist in dynamic signaling models has been observed in applied models before. In particular, similar equilibria in specific settings have been obtained by Vincent [1990], Deneckere and Liang [2006], Daley and Green [2012], Lee and Liu [2013], Dilmé and Li [2016], Dilmé [2017], Kaya and Kim [2018] in the context of bargaining; Strebulaev, Zhu, and Zryumov [2016] in corporate finance; Vettas [1997], Aköz, Arbatli, and Celik [2020], Gryglewicz and Kolb [2021], Smirnov and Starkov [2021] in industrial organization/marketing; Smirnov and Starkov [2019] and Vong [2021] in cheap talk games; Gul and Pesendorfer [2012] in disclosure games; De Angelis, Ekström, and Glover [2021] in Dynkin games; Pei [2021] in trust games. The contribution of this paper is in setting up a general model that nests many of the models above and in identifying sufficient conditions under which attrition is the only informative equilibrium outcome.
Our analysis relies on the restriction of off-equilibrium-path beliefs to be "reasonable". In particular, we adopt the assumption of non-increasing belief supports or, as labeled by Osborne and Rubinstein [1990], the NDOC ("Never Dissuaded Once Convinced") assumption. As the name suggests, it implies that once the receiver has ruled out some type of the sender as impossible, the receiver stands by this belief and never again assigns positive probability to that type, including off the equilibrium path. Kaya [2009] and Roddie [2012a,b] have shown that in the absence of NDOC, full instantaneous separation is possible in dynamic settings, since the sender's behavior can be disciplined by strong reputational threats in case of deviations. While the approach can be justified when the sender's type may change over time and hence needs constant re-verification, in other settings it is susceptible to the critique of using unreasonable off-path threats to sustain an equilibrium, a practice typically criticized in the literature on equilibrium refinements for static signaling games, as well as on equilibrium concepts for dynamic games.³
This paper belongs to the literature on signaling models, which developed from the seminal contribution of Spence [1973]. See Riley [2001] for an excellent survey of the early literature on static signaling models. Admati and Perry [1987] were among the first to recognize that if signaling is viewed as a dynamic process, and the sender cannot commit (contractually or otherwise) to future actions, then perfectly separating outcomes may not be sustainable in equilibrium, as in our pricing example above. Beaudry and Poitevin [1993] made a similar point in a contracting model in which the informed party can propose to renegotiate after a contract is signed and before it is executed. The literature has responded to this conceptual challenge by searching for aspects of such dynamic interactions which would neutralize this impossibility and restore perfectly separating equilibria. For example, Weiss [1983] considers a model in which the sender derives explicit utility from signaling. Nöldeke and van Damme [1990a] and Swinkels [1999] argue that perfect separation can be sustained via tacit collusion on the receivers' side.⁴ Roddie [2012a,b] obtains perfectly separating outcomes in a dynamic signaling model in which the receiver's beliefs violate NDOC, which is justified by the possibility that the sender's type can change over time and thus needs to be constantly reaffirmed. This paper shows instead that informative, albeit not perfectly separating, equilibria can exist in dynamic settings without any of the aforementioned features.
Dilmé [2017], Heinsalu [2018], and Whitmeyer [2021] consider dynamic signaling models in which the sender's actions are only imperfectly observed by the receiver. While equilibria with perfect separation in strategies are possible in such settings, the receiver cannot learn the sender's type instantaneously. This paper shows that exogenous noise in observations is not necessary to generate such an equilibrium with gradual learning, and that the noise can stem from the sender's strategy instead.
The remainder of this paper is organized as follows. Section 2 sets up the general model and introduces the two assumptions that serve as sufficient conditions for our results: payoff monotonicity and NDOC. We then proceed to analyze two versions of this model. The two-type version in Section 3 can be seen as an illustrative example. The version with finitely many types, which requires an additional single-crossing assumption, is then explored in Section 4. Section 5 considers applications to price signaling, bargaining, and labor market signaling, setting up the respective models and verifying that the required assumptions hold. Section 6 concludes. The proofs of most statements are relegated to Appendix A. Appendix B constructs an informative equilibrium in the context of the price signaling application from Section 5.2.

Primitives
We will be looking at the continuous-time limit of a discrete-time infinite-horizon game. Time is indexed by $t \in T \equiv \{0, dt, 2dt, \ldots\}$; all the results apply to the limit as $dt \to 0$. There are two players: a long-lived sender (agent) and a receiver. The agent has some persistent type $\theta \in \Theta$, where $\Theta \subseteq \mathbb{R}$ is finite. Equivalently, $\theta$ can be the state of the world that the agent is privately informed of.
In every period $t$, a Stackelberg-type sequential game is played between the agent and the receiver.⁵ First, the agent chooses an action $a_t \in A$ from a compact action set $A$. This action choice affects the realization of a public outcome $x_t \in X$, the distribution of which at time $t$ depends on the agent's type $\theta$ and the past history $h_t$. We assume that outcomes never perfectly identify $\theta$: the support of $x_t$ conditional on $(h_t, a_t, \theta)$ is independent of $\theta$.⁶ After an outcome has realized, the receiver selects an action $b_t \in B$ from a compact set $B$. A time-$t$ history is defined as $h_t \equiv ((a_0, x_0, b_0), \ldots, (a_{t-dt}, x_{t-dt}, b_{t-dt}))$, i.e., as the record of past actions and outcomes up to, but not including, time $t \in T$. Let $H_t$ denote the set of all such time-$t$ histories, and let $H \equiv \cup_{t \in T} H_t$ be the set of all histories. For any two periods $s > t$, we write $h_s \succ h_t$ if history $h_s$ succeeds history $h_t$, i.e., $h_s$ and $h_t$ coincide on $(0, \ldots, t - dt)$. Weak succession is denoted as $h_s \succeq h_t$ and means "either $s > t$ and $h_s \succ h_t$, or $s = t$ and $h_s = h_t$".
The receiver begins the game with a commonly known prior belief $p_0 \in \Delta(\Theta)$ about the agent's type. In every period $t$ the receiver observes the agent's latest action $a_t$ and the realized outcome $x_t$ [...]. To avoid having to deal with repeated-game effects and to focus purely on the signaling concerns, we work under the assumption that the receiver is myopic and only maximizes the current flow payoff. This assumption can be justified on its own merit in some settings, e.g., when "the receiver" is a proxy for a competitive market of receivers or a sequence of short-lived players.⁷ This assumption is not strictly necessary, although it simplifies the exposition. Footnote 9 describes the extent to which the results can be applied in a model with a strategic receiver.
The receiver's strategy is $b: H \times A \times X \to B$. Strategy $b^*$ is optimal for the receiver at a given history $h_t$ if the action it prescribes maximizes the receiver's current flow payoff given the agent's strategy and the receiver's belief at that history [...], where the expectation is taken over $\theta$. For simplicity, assume no sunspots: for any $a$, $x$, and any [...].
Moving on to the agent, define a bliss (myopically optimal) action set for the agent of type $\theta$ at history $h_t$ given the receiver's strategy $b$ as [...], where the expectation is taken over $x_t$. Note that $b^*$ exists and $A^*$ is non-empty due to the upper semi-continuity of the respective utility functions $w$ and $u$.
A pure strategy for the agent of type $\theta$ is $a_\theta : H \to A$. Given some belief system $p$ and the receiver's strategy $b$, let $U(a_\theta | h_t, b, \theta)$ denote the expected discounted continuation utility of type $\theta$ from following strategy $a_\theta$ starting from $h_t \in H$ [...], where $r$ is the agent's discount rate, and the expectation is taken over future outcomes $x_s$. The agent is assumed to maximize his expected discounted sum of utilities. Strategy $a_\theta$ is optimal for the agent of type $\theta$ given belief system $p$ and receiver's strategy $b$ if it maximizes his continuation payoff at every history $h_t \in H$, i.e., if [...], where $V(h_t, b, \theta)$ is hereinafter referred to as the value function. With a slight abuse of notation we let [...].
Finally, a behavioral strategy for the agent of type $\theta$ is $\alpha_\theta : H \to \Delta(A)$. By Kuhn's Theorem (Aumann [1964]), behavioral strategies are equivalent to mixed strategies in this setting. Let $\alpha_\theta(a | h_t)$ denote the probability with which action $a$ should be played by type $\theta$ after history $h_t$ according to strategy $\alpha_\theta(h_t)$. A behavioral strategy $\alpha_\theta$ is then optimal for $\theta$ if there exists an equivalent mixed strategy (i.e., a probability distribution over pure strategies) such that all pure strategies in its support are optimal.
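One way to write the objects described above in symbols is sketched below. The notation is assumed rather than taken from the text: in particular, the receiver's flow payoff is written as $w(a, b, \theta)$, the arguments of $A^*$ are a guess, and flow payoffs are scaled by $dt$ and discounted at $e^{-r(s-t)}$.

```latex
% Sketch under assumed notation; not the original displays.
\begin{align*}
  % Myopic receiver: best response to the current belief
  b^*(h_t, a_t, x_t) &\in \arg\max_{b \in B} \;
      \mathbb{E}\big[ w(a_t, b, \theta) \,\big|\, p(h_t, a_t, x_t) \big], \\
  % Bliss (myopically optimal) action set of type \theta
  A^*(h_t, b, \theta) &\equiv \arg\max_{a \in A} \;
      \mathbb{E}\big[ u\big(a, b(h_t, a, x_t), \theta\big) \big], \\
  % Expected discounted continuation utility from strategy a_\theta
  U(a_\theta \mid h_t, b, \theta) &\equiv
      \mathbb{E}\Big[ \sum_{s \in T,\, s \geq t} e^{-r(s-t)} \,
      u(a_s, b_s, \theta) \, dt \Big], \\
  % Value function
  V(h_t, b, \theta) &\equiv \max_{a_\theta} \; U(a_\theta \mid h_t, b, \theta).
\end{align*}
```

In this sketch the first expectation is over $\theta$, the second over $x_t$, and the third over future outcomes $x_s$, matching the verbal description.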

Equilibrium Concept
Introduced above is a dynamic game of incomplete information. The common denominator among the solution concepts used for this class of games (and requiring belief consistency) is Perfect Bayesian Equilibrium (PBE). In such an equilibrium, all players maximize their expected continuation payoffs given their beliefs about other players' actions and beliefs, and these beliefs must be consistent on path with the players' knowledge of the game.
Definition 1. A Perfect Bayesian Equilibrium is given by an agent's strategy profile $\alpha = \{\alpha_\theta\}_{\theta \in \Theta}$ with $\alpha_\theta : H \to \Delta(A)$, a receiver's strategy $b: H \times A \times X \to B$, and a belief system $p : H \times A \times X \to \Delta(\Theta)$ such that:
1. for all $\theta$, strategy $\alpha_\theta$ is optimal for the agent of type $\theta$;
2. strategy $b$ is optimal for the receiver at all histories $h_t \in H$;
3. belief $p$ is updated using Bayes' rule whenever possible.
Our main results characterize signaling in all PBE that satisfy assumption (NDOC) as defined in the following subsection; hence they also apply if one imposes additional restrictions or equilibrium refinements on top of PBE with (NDOC). As mentioned previously, we explore equilibria for small but positive $dt$, and we are interested in the properties of these equilibria as $dt \to 0$.

Assumptions
The two sections above define the primitives of the model but impose only minimal restrictions on them. This section describes the two significant assumptions that will be imposed [...]. The assumptions can then be phrased as follows (with the discussion following afterwards):
(MON) Flow payoff function $\tilde{u}_t(a_t, p_t, \theta)$ is weakly increasing in $p_t$ (w.r.t. FOSD order) for all $t$, $h_t$, $a_t$, $\theta$ and all optimal $b$.⁸ Further, if $p_t >_{FOSD} \delta_\theta$ then $\tilde{u}(a_t, p_t, \theta) > \tilde{u}(a_t, \delta_\theta, \theta)$, and if [...].
(NDOC) Process $p_t$ is progressively absolutely continuous. I.e., belief supports are non-increasing: for any $h_s \succ h_t$, $S(h_s) \subseteq S(h_t)$.
The first assumption, (MON), requires that the agent's flow payoff function is increasing in his reputation $p_t$. This captures the core idea of signaling models: the agent would like to signal that his type is high because that induces a favorable reaction from the receiver. For example, a firm with a reputation for a quality product is more likely to sell more units, an able worker is more likely to be offered a job, and a strong bargainer is more likely to see the opponent concede to a demanding offer. Monotonicity is primarily a restriction on the model primitives, namely the utility functions: given the receiver's preferences $w$, her optimal strategy $b$ is unique up to indifference for any $a$ and $p(h_t)$. This makes $\tilde{u}(a, p, \theta)$ a well-defined function given some tie-breaking rule for the receiver. Hence, given $w$, (MON) is a condition on the agent's utility function $u(a, b, \theta)$.⁹ While the condition is phrased using weak monotonicity, strict preferences relative to degenerate reputation are required to guarantee the presence of signaling effects: any type must always be strictly willing to pool with the higher types and to separate away from the lower types.
The second assumption, (NDOC), is the refinement of the equilibrium beliefs that drives our analysis. In particular, it says that if $p(\theta | h_t) = 0$ then $p(\theta | h_s) = 0$ for any pair of histories $h_s \succ h_t$ in $H$. Note that this applies both on and off the equilibrium path. In other words, once the receiver is convinced that a given type of the agent is inconsistent with the evidence (the observed history), she can never be dissuaded from this conviction. This restriction appears reasonable, since over time only more evidence is collected, and the existing evidence is never forgotten, including the evidence that led the receiver to rule out certain types of the agent at the time.¹⁰
Alternatively, (NDOC) can be seen as a weak form of renegotiation-proofness in some settings, as suggested by Ely and Välimäki [2003]. For example, in the context of labor market signaling, suppose that a firm offers a contract to a high school graduate, according to which it would hire the worker as soon as he obtains a college degree. Suppose further that in equilibrium such a contract is only accepted by able workers (whose cost of learning is low), while less able workers reject it in favor of getting a job immediately. Then if such a contract is accepted, the firm knows the worker is able, and it is in the mutual interest of the firm and the worker to renegotiate the contract to start the job immediately, since the delay to obtain education is wasteful for both parties.
To simplify the analysis, we strengthen (NDOC) by rendering the receiver pessimistic off the equilibrium path: her off-path beliefs must put all weight on the lowest type among those she has not yet ruled out. This stronger condition is labeled (NDOC-P) and is defined as follows:
(NDOC-P) The off-equilibrium-path beliefs are such that after any action $a$ that is not on the equilibrium path at $h_t \in H$: $p(h_t, a, x_t) = \delta_{\min S(h_t)}$ for any $x_t \in X$.¹¹
Given (MON), this condition imposes the strongest possible punishment on the sender for any deviation among those punishments that satisfy (NDOC). Therefore, we argue that for any equilibrium that satisfies (NDOC), there exists an equivalent one that satisfies (NDOC-P), despite the latter being a stronger condition. This claim is formalized by the following lemma, with the proof available in Appendix A.
Lemma 1. If (MON) holds, then for any equilibrium that satisfies (NDOC), there exists a payoff-equivalent and on-path strategy-equivalent equilibrium that satisfies (NDOC-P).
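To make the two belief restrictions concrete, the following minimal sketch implements a one-step belief update that uses Bayes' rule on path and the pessimistic (NDOC-P) rule off path. It is an illustration under simplifying assumptions: types are encoded as numbers (lower means weaker), the public outcome $x_t$ is ignored, and the function names `support`, `update_belief`, and `satisfies_ndoc` are hypothetical.

```python
def support(belief):
    """Types that the receiver has not ruled out."""
    return {theta for theta, prob in belief.items() if prob > 0}


def update_belief(prior, action_prob):
    """One-step belief update under (NDOC-P).

    prior:       dict type -> probability (types are numbers; lower = weaker)
    action_prob: dict type -> equilibrium probability of the observed action
    """
    joint = {th: prior[th] * action_prob.get(th, 0.0) for th in prior}
    total = sum(joint.values())
    if total > 0:
        # On path: Bayes' rule applies and can only shrink the support.
        return {th: weight / total for th, weight in joint.items()}
    # Off path: all weight goes to the lowest type not yet ruled out.
    lowest = min(support(prior))
    return {th: 1.0 if th == lowest else 0.0 for th in prior}


def satisfies_ndoc(prior, posterior):
    """(NDOC): the posterior support is a subset of the prior support."""
    return support(posterior) <= support(prior)
```

For instance, with prior `{1: 0.0, 2: 0.5, 3: 0.5}` (type 1 already ruled out), an action that no remaining type plays in equilibrium leads to a belief concentrated on type 2, not on type 1, so the support never grows.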

Two Types
This section explores the version of the model with only two types: $\Theta = \{L, H\}$. Here we show that signaling must take the form of attrition regardless of the sender's payoffs, as long as they are monotone in reputation $p_t$. The first part of Theorem 1 states that perfect separation cannot occur at any history in equilibrium: if a given action is on path for $\theta = H$ then it is also on path for $\theta = L$. This statement captures the idea of Admati and Perry [1987] and Nöldeke and van Damme [1990a]. We also observe that there may effectively be only one such pooling action in any period, in the sense that all pooling actions must be payoff-equivalent for all types of the agent. This follows trivially from the fact that both types must be indifferent between any such actions if there is more than one.
The insight that is novel (in the general setting) is that the converse to the first statement is not necessarily true: if $\alpha_L(a|h_t) > 0$ then $\alpha_H(a|h_t)$ may or may not be positive. In other words, there may exist actions which perfectly identify the low type, even if there do not exist any that identify the high type. It is immediate that the low type must be mixing for this to be possible. All this is summarized by the second part of the theorem. The statement does not claim existence of any such separating actions, since they, as previously mentioned, need not exist in any given case, though Appendix B presents an example in which such an informative equilibrium exists.
¹¹ On-pathness is defined in the usual way; see Section 4.2 for a formal definition.
Theorem 1. Suppose $\Theta = \{L, H\}$, (MON) holds, and $dt \to 0$. In any equilibrium $(\alpha, b, p)$ such that (NDOC-P) holds, at any $h_t \in H$ with $S(h_t) = \{L, H\}$, and for any $a \in A$:
1. [...] Further, all such $a$ are payoff-equivalent in the sense that $V(a|h_t, b, \theta)$ is the same across such $a$ for both types $\theta$.
2. [...] for any $a'$ such that $\alpha_H(a'|h_t) > 0$.
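Read together with the surrounding discussion, the two parts of the theorem can be paraphrased as follows. This is a hedged restatement assembled from the prose, not the theorem's original display; in particular, the exact form of the payoff condition in part 2 is an assumption.

```latex
% Paraphrase of Theorem 1 based on the surrounding text; the precise
% original statement may differ.
\begin{enumerate}
  \item (No separation of the high type.)
        If $\alpha_H(a \mid h_t) > 0$, then $\alpha_L(a \mid h_t) > 0$.
  \item (Attrition of the low type.)
        If $\alpha_L(a \mid h_t) > 0 = \alpha_H(a \mid h_t)$, then $a$ is
        myopically optimal for type $L$, and type $L$ is indifferent between
        separating and pooling:
        $V(a \mid h_t, b, L) = V(a' \mid h_t, b, L)$
        for any $a'$ such that $\alpha_H(a' \mid h_t) > 0$.
\end{enumerate}
```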
Note that the attrition structure of signaling imposes strong restrictions on actions that can be played in equilibrium. Firstly, any separating action that perfectly identifies the low type must be myopically optimal for him, since the low type does not have any strategic incentives to play anything else. Secondly, if the low type mixes between pooling and separating, then he must be indifferent between the two: the gains from pooling (higher reputation) are exactly offset by the cost of taking suboptimal actions in current and/or future periods.
It is worth emphasizing that the result holds under very minimal assumptions on payoffs and signals: the only requirements imposed on the model are that the sender's payoff is increasing in p (which, in fact, is only required for the low type) and that the outcomes x are not perfectly revealing.
If the setting of interest fits this framework, then attrition is the only informative equilibrium structure that can arise, unless one is willing to allow off-path beliefs that violate NDOC. Under attrition, the high type plays some pooling action, while the low type mixes between that and a separating action.
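The belief dynamics implied by attrition can be sketched numerically. In the snippet below the high type pools with probability one and the low type separates with a per-period probability `sigma`; holding `sigma` constant is purely an illustrative assumption (in equilibrium it would be pinned down by the low type's indifference condition), and the function name `pooling_belief_path` is hypothetical.

```python
def pooling_belief_path(q0, sigma, periods):
    """Belief that the sender is the high type along the pooling path.

    q0:     prior probability of the high type, 0 < q0 < 1
    sigma:  per-period probability that the low type separates,
            0 <= sigma < 1 (held constant here purely for illustration)
    """
    path = [q0]
    q = q0
    for _ in range(periods):
        # Bayes' rule: the high type pools with probability 1,
        # the low type pools with probability 1 - sigma.
        q = q / (q + (1.0 - q) * (1.0 - sigma))
        path.append(q)
    return path
```

Each observation of the pooling action multiplies the odds in favor of the high type by $1/(1-\sigma)$, so reputation rises gradually along the pooling path but the low type is never ruled out after finitely many periods. A separating action, by contrast, reveals the low type at once.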
One important case, which lies beyond the scope of our model but is nonetheless worth mentioning, is that with a behaviorally committed type of the sender and a strategic type who prefers to mimic the committed type. For example, in the bargaining model of Abreu and Gul [2000], a player may be either committed to rejecting all offers that give him anything less than the whole surplus, or fully strategic. Similarly, in the (static) cheap talk model of Chen [2011], the sender may either be committed to truthful communication, or communicate strategically. Theorem 1 applies to such problems (with the exception of the payoff equivalence part of statement 1), since its proof only relies on the incentives of the low type, which in these settings is the strategic type.
It follows that if the committed type is unable to verifiably demonstrate his commitment, perfect separation is impossible in equilibrium, and all equilibria feature either attrition of the strategic type, as in Abreu and Gul [2000], or full pooling.

Finite Types
We now move to exploring the setting with more than two but finitely many types. In this section we show that the insight of Theorem 1 can be extended to this case, although allowing for many types does raise a number of additional issues and calls for extra assumptions.

Single-Crossing
In order to secure the result in the case of many types, we need to impose the following new assumption on payoffs:
(SC) For all optimal $b$, any $a', a'' \in \cup_{h_t \in H} \cup_{\theta \in S(h_t)} \arg\max_a U(a|h_t, b, \theta)$, and all $h_t \in H$, function $U(\theta) \equiv U(a'|h_t, b, \theta) - U(a''|h_t, b, \theta)$ either crosses zero at most once, or is identically zero.
This assumption belongs to a family of single-crossing conditions widely encountered in the literature on signaling, monotone comparative statics, and mechanism design.¹² The purpose of our condition is standard: to ensure that the agent's preferred strategy is, in some sense, monotone w.r.t. his type. There are, however, some distinctive features that differentiate it slightly from other single-crossing conditions in the literature.
Firstly, (SC) is a condition on the expectation of a discounted sum $\mathbb{E} \sum_t e^{-rt} u(a_t, b_t, \theta)$ rather than on the flow utility $u(a, b, \theta)$. While the latter would be preferable, aggregating single-crossing is not a trivial problem. Quah and Strulovici [2012] discuss this problem and offer possible solutions, but none of them apply to our setting. Secondly, (SC) is more demanding than it might initially appear. The dependence of $U(a|h, b, \theta)$ on $a$ arises not only directly, through the effect of the agent's own action $a_t$ on his flow utility $u(a_t, b_t, \theta)$, but also via an indirect reputation channel.
The receiver's response $b_t$ depends on the agent's reputation $p_t$, which is, in general, affected by the agent's action choice $a_t$. Further, this reputation effect is persistent, with the choice of $a_t$ affecting not only the contemporaneous response $b_t$, but also the continuation payoff: for a given fixed path $\{a_s, x_s\}_{s>t}$, reputation $\{p(h_s)\}_{s>t}$ will be persistently shifted by $a_t$, meaning the receiver's responses $\{b_s\}_{s>t}$ are affected.
All of the above means that (SC) is a non-trivial condition and may be difficult to verify in some settings. If anything, verifying (SC) may well be the main impediment to exploiting this paper's results in applied models. However, this task is far from impossible, with Section 5 demonstrating a number of examples of applied models that can be easily verified to satisfy (SC).
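In a finite-type application, one piece of this verification is mechanical: given the payoff differences between two candidate strategies, listed by type, check that they change sign at most once. The sketch below does only that. It is not a full test of (SC), which quantifies over all histories, optimal receiver strategies, and pairs of optimal strategies; treating zeros as not constituting a crossing is an interpretive assumption, and the function name is hypothetical.

```python
def crosses_zero_at_most_once(diffs):
    """Check the sign pattern of payoff differences across types.

    diffs: values of U(a'|h_t, b, theta) - U(a''|h_t, b, theta), listed in
           increasing order of theta.
    Returns True if the sequence is identically zero or changes sign at most
    once (zeros are skipped, i.e., touching zero is not counted as crossing).
    """
    signs = [1 if d > 0 else -1 for d in diffs if d != 0]
    changes = sum(1 for prev, cur in zip(signs, signs[1:]) if prev != cur)
    return changes <= 1
```

For example, differences `[-1.0, -0.5, 0.2, 3.0]` pass the check, whereas `[-1.0, 1.0, -1.0]` do not.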

Attrition Structure of Equilibrium Signaling
Theorem 2, which we gradually build up to, is the analog of Theorem 1 for the case $|\Theta| > 2$, in the sense that it characterizes the actions available in equilibrium at any history. We begin, however, by stating a weaker result which, by looking at strategies rather than actions, provides a clearer characterization of the attrition structure of equilibrium signaling with $|\Theta| > 2$. Proposition 1 below establishes that as long as (SC) and the other previously stated assumptions hold, strategies played in an arbitrary equilibrium of the game can be split into two classes. The first class consists of pooling strategies played by all types. While a nominal multiplicity of such strategies may arise, they must all be payoff-equivalent, so this class is, in a sense, degenerate. The second class is that of separating strategies employed by the lowest type; these may vary in which pooling strategies they mimic and for how long. However, any separating strategy is only played by the lowest type.
To state this and other results we need to introduce some additional notation and definitions.
Firstly, denote the two boundaries of the belief support at a given history $h_t$ as $\bar{S}(h_t) \equiv \max S(h_t)$ and $\underline{S}(h_t) \equiv \min S(h_t)$ respectively. Furthermore, in a manner similar to type support $S$, given an equilibrium strategy profile $\{\alpha_\theta\}$, let us define action support as [...]. We say that, given the receiver's strategy $b$, a pure strategy $a$ arrives at history $h_t$ if $a(h_\tau) = a_\tau(h_\tau)$ for all $h_\tau$ s.t. $h_t \succ h_\tau$. Further, say that $a$ is on path for $\theta$ at $h_t$ if $a$ arrives at $h_t$ and $a$ is on path according to type $\theta$'s equilibrium strategy $\alpha_\theta$ starting from $h_t$: $\alpha_\theta(a(h_t)|h_t) > 0$.
Say that $a$ is on path at $h_t$ if it is on path at $h_t$ for some $\theta \in S(h_t)$.
We proceed by defining payoff equivalence of strategies in a straightforward manner.
Definition 2. Fix an equilibrium $(\alpha, b, p)$ and history $h_t$. Any two pure strategies $a', a''$ arriving at $h_t$ are:
• payoff-equivalent at $h_t$ if they are not payoff-distinct at $h_t$.
The result can then be stated as follows.
Proposition 1. Suppose the payoff function $u$ satisfies (MON) and $dt \to 0$. Fix an equilibrium $(\alpha, b, p)$ such that (NDOC-P) and (SC) hold. Fix some history $h_t \in H$ and define $\underline\theta \equiv \underline S(h_t)$. Then for any pure strategy $\bar a$ on path at $h_t$ for some type $\theta \in S(h_t) \setminus \underline\theta$, the following hold:
1. $\bar a$ is optimal for all $\theta \in S(h_t)$ at $h_t$;
2. any $\bar a'$ optimal for any $\theta \in S(h_t) \setminus \underline\theta$ is payoff-equivalent at $h_t$ to $\bar a$;
3. there exists $\bar a''$ that is payoff-equivalent at $h_t$ to $\bar a$ and is on path for $\underline\theta$ at $h_t$;
4. any $\underline a$ that is on path at $h_t$ and payoff-distinct at $h_t$ from $\bar a$ is only on path for $\underline\theta$.
To understand this proposition, it is instructive to ignore payoff equivalence for a moment and treat any pair of payoff-equivalent strategies as the same strategy. In this reading, the proposition implies that any pure strategy on path for some type $\theta \in S(h_t) \setminus \underline\theta$ is on path for all types in $S(h_t)$, including the currently-lowest type $\underline\theta$. Therefore, no type of the agent can ever conclusively separate from $\underline\theta$. At the same time, there may exist strategies that separate $\underline\theta$ away from the remaining types. The weight that the receiver's belief assigns to $\underline\theta$ may thus decrease over time along the pooling path of play, and it may even converge to zero asymptotically as $t \to \infty$, but it can never become exactly zero. However, the interpretation above is overly strong, since payoff-equivalent strategies need not coincide at all histories. In other words, Proposition 1 is a statement about strategies, whereas we would like to have a result about actions.

From Strategies to Actions
The question that remains unanswered by Proposition 1 is whether payoff-equivalence implies strategy-equivalence (i.e., that any two payoff-equivalent strategies must prescribe the same actions at either all, or at least some histories) or, if not, whether payoff-equivalent strategies at least produce equivalent belief paths for all types.Unfortunately, the answer to both of the above is negative: in general, not only may there be multiple payoff-equivalent strategies, but they may even induce different beliefs.This is demonstrated by the following example.
Example 1. Suppose $\Theta = \{1, 2, 3\}$, types are ex ante equiprobable, $A = \mathbb{R}_+$, and outcomes are uninformative. Suppose the sender's reduced-form utility function (3) is given by $\tilde u(a, p, \theta) = E_p(\theta)$ (so the agent's actions are cheap talk; (SC) holds trivially in this scenario). Then the following strategies constitute an equilibrium together with the respective beliefs: type $\theta = 2$ plays strategy $\mathbf{a}' = (a', 0, 0, \ldots)$, while types $\theta = 1, 3$ play $\mathbf{a}'' = (a'', 0, 0, \ldots)$, where $a' \neq a''$ are arbitrary. In this PBE some information about the type is conveyed in period zero: type $\theta = 2$ separates from $\theta = 1, 3$. However, the receiver's expectation of the type equals 2 after either action, so all types of the sender are indifferent between the two strategies. Hence the information revealed by $a_0$ is not relevant to the sender's payoff, although it may be relevant for the receiver.
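The indifference in Example 1 can be checked mechanically. The sketch below is illustrative only: it takes the type space to be {1, 2, 3} (so that the types 1, 2, 3 named in the strategies all belong to it), and the labels `"a1"` and `"a2"` are placeholders for the two arbitrary distinct period-zero actions.

```python
from fractions import Fraction

# Uniform prior over the three types of Example 1 (assumed to be 1, 2, 3).
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# Period-0 actions: type 2 plays "a1", types 1 and 3 play "a2".
# The labels are hypothetical stand-ins for the two arbitrary actions.
message = {1: "a2", 2: "a1", 3: "a2"}


def posterior(msg):
    """Bayes posterior over types after observing the period-0 action msg."""
    mass = {th: pr for th, pr in prior.items() if message[th] == msg}
    total = sum(mass.values())
    return {th: pr / total for th, pr in mass.items()}


def expected_type(belief):
    """Sender's flow payoff under cheap talk: the receiver's expected type."""
    return sum(th * pr for th, pr in belief.items())


post_a1 = posterior("a1")
post_a2 = posterior("a2")
payoff_a1 = expected_type(post_a1)
payoff_a2 = expected_type(post_a2)

# The two actions induce different beliefs but the same expected type.
print(post_a1, post_a2, payoff_a1, payoff_a2)
```

The posteriors differ (a point mass on type 2 versus an even split between types 1 and 3), yet the payoff is the same after either action, which is what makes the signaling payoff-irrelevant for the sender.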
However, we are arguably more interested in payoff-relevant signaling, which relies on the heterogeneity of the agent's preferences across types to convey information, as opposed to the agent's complete indifference. Narrowing the focus to such payoff-relevant information revelation allows us to carry the insight of Proposition 1 over from strategies to actions. We begin by stating the formal definitions of payoff-relevant and payoff-irrelevant signaling in our setting.
Definition 3. Fix an equilibrium $(\alpha, b, p)$ and history $h_t \in H$.
• Payoff-relevant signaling happens at $h_t$ if there exist $a', a'' \in A(h_t)$ and $\theta \in S(h_t)$ such that $V(a' \mid h_t, b, \theta) \neq V(a'' \mid h_t, b, \theta)$.
• Payoff-irrelevant signaling happens at $h_t$ if there exist $a', a'' \in A(h_t)$ such that $p(h_t, a', x) \neq p(h_t, a'', x)$ for some $x \in X$ but $V(a' \mid h_t, b, \theta) = V(a'' \mid h_t, b, \theta)$ for all $\theta \in S(h_t)$.
In other words, payoff-relevant signaling implies that at a given history h t there are two distinct actions on path, a and a , and there is some type of the agent for which the choice between these two actions has payoff consequences.Note that since both actions are on path, it cannot be the case that all types prefer one over another -both a and a must be optimal for some types of the agent.Payoff-relevance of this action choice is then defined as some type θ ∈ S(h t ) having strict preference between the two.
We are now ready to state the theorem that characterizes payoff-relevant signaling in terms of actions, making the implications of Proposition 1 more explicit.The result below expands the message obtained in Theorem 1 to the case of finitely many types, albeit at the cost of restricting model scope to payoff functions that satisfy (SC) and to the continuous-time limit of the model (as opposed to any sufficiently small dt).
Theorem 2. Suppose the payoff function $u$ satisfies (MON) and $dt \to 0$. Fix an equilibrium $(\alpha, b, p)$ such that (NDOC-P) and (SC) hold. Fix some history $h_t \in H$. If payoff-relevant signaling happens at $h_t$ then, defining $\underline\theta \equiv \underline S(h_t)$, the following hold:
1. any on-path action $a \in A(h_t)$ is on path for $\underline\theta$ at $h_t$;
2. $A(h_t) \cap A^*(h_t, b, \underline\theta)$ is nonempty, and any $\underline a$ in the intersection is on path only for $\underline\theta$ at $h_t$;

any action ā ∈
What the theorem says is that in any equilibrium with payoff-relevant signaling, there are effectively at most two types of actions (as opposed to strategies in Proposition 1) on path at any history: pooling actions (typical element $\bar a$) and separating actions (typical element $\underline a$). The latter are only ever played by the currently-lowest type $\underline\theta = \underline S(h_t)$ and separate him from the remaining types. As in Theorem 1, any separating action must be myopically optimal for the lowest type given that he is revealed.
Pooling actions, on the other hand, are optimal for all types.Further, if no payoff-irrelevant signaling takes place, then any pooling action is, in fact, on path for all θ ∈ S(h t ) -i.e., all types do actually pool on the pooling action(s).Notably, both payoff-relevant and payoff-irrelevant signaling may occur simultaneously at a given history.In that case there will be more than one pooling action, and while all of them are necessarily on path for θ, the higher types may vary in their action choices, despite all types being indifferent between all of these pooling actions.
The corollary below relates to situations in which payoff-relevant signaling occurs at successive histories. It states that the pooling action at the earlier history must then leave the low type indifferent between separating and pooling. Since the low type must be indifferent between separating at $t$ and separating at $t + dt$, one period of pooling must be exactly as attractive to him as one period of being identified as $\underline\theta$: the flow payoffs from the separating and the pooling action must coincide. In practice, this means that the pooling action must be costlier for $\underline\theta$ than the separating action, since the former yields a higher reputation payoff.
Finally, Theorem 2 applies to all histories, including those off the equilibrium path.Applying it inductively starting from the root history, we obtain Corollary 2 below, which states that in the absence of payoff-irrelevant signaling, only the lowest type L ≡ min Θ can ever separate from the rest, while the remaining ones can never separate from one another.
Corollary 2. Suppose the conditions in Theorem 2 hold. In any equilibrium in which no payoff-irrelevant signaling happens, for any on-path history $h_t$, one of the following must hold:
1. $S(h_t) = \Theta$;
2. $S(h_t) = \{\min \Theta\}$.
It is worth noting that there may be histories h t at which p(h t ) assigns arbitrarily small weight to the lowest type.So while this type can never be ruled out completely along the pooling path, asymptotically the receiver's belief may assign arbitrarily low weight to it.Notably, this implies that the mechanism of attrition of the lowest type can yield full separation asymptotically if there are only two types of the sender but not if there are more (but finitely many), which is an important takeaway, since many applied papers treat two-type models as proxies for more general settings.
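The contrast between two and more types can be illustrated by iterating Bayes' rule along the pooling path. The sketch below is a hypothetical illustration, not a construction from the paper: it assumes that only the lowest type separates, with a constant per-period probability `hazard * dt`, that all other types pool for sure, and that outcomes are uninformative; the priors, hazard rate and horizon are arbitrary choices.

```python
def attrition_path(prior, lowest, hazard, dt, periods):
    """Beliefs along the pooling path when only `lowest` ever separates.

    Each period the lowest type takes the separating action with probability
    hazard * dt, while every other type pools for sure. Observing the pooling
    action therefore scales the weight on `lowest` by (1 - hazard * dt)
    before renormalizing (Bayes' rule).
    """
    belief = dict(prior)
    path = [dict(belief)]
    for _ in range(periods):
        belief[lowest] *= 1.0 - hazard * dt
        total = sum(belief.values())
        belief = {th: w / total for th, w in belief.items()}
        path.append(dict(belief))
    return path


# Hypothetical parameters: equiprobable types, hazard 0.5, period length 0.1.
two = attrition_path({1: 0.5, 2: 0.5}, lowest=1, hazard=0.5, dt=0.1, periods=400)
three = attrition_path({1: 1 / 3, 2: 1 / 3, 3: 1 / 3}, lowest=1, hazard=0.5,
                       dt=0.1, periods=400)

print(two[-1])    # weight on type 2 approaches one
print(three[-1])  # types 2 and 3 keep equal weights: they are never told apart
```

In both runs the weight on the lowest type stays strictly positive at every finite date. With two types the belief approaches a point mass on the high type, whereas with three types it approaches the prior conditional on the type not being the lowest, so the remaining types are never separated from one another.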
At the same time, the assumption that is crucial to our analysis is that the lowest type is separated away from all other types. Otherwise (e.g., with an interval type space) full asymptotic revelation is possible again. An example of such an outcome in the context of bargaining is presented by Fuchs and Skrzypacz [2010]. Their equilibrium resembles the attrition equilibria of this paper, except that in their equilibrium an interval of lowest types separates away in every period, instead of the single lowest type mixing between that and pooling. This observation serves to illustrate the nontrivial implications of model discretization: if one attempts to compute equilibria of a continuous-time signaling model with an interval of types by approximating it with a discrete-time finite-type model, different approximations may yield qualitatively different results. In particular, if an interval type space is approximated by a grid that is too coarse relative to time discretization, the researcher would conclude that no asymptotic learning takes place in the discretized model, whereas it could take place in the continuous model.
Theorem 2 and its corollaries effectively provide a cookbook on how to construct an equilibrium with payoff-relevant signaling only. Suppose we want signaling to occur during the time interval $[0, T]$. Then in every period, along the pooling path we shall have two actions available to the sender: a separating action $\underline a \in A^*(h_t, b, L)$ only taken by the lowest type $L \equiv \min \Theta$, and a pooling action $\bar a$ that satisfies the condition in Corollary 1. The latter action will be played by $L$ with some probability and by all other types for sure. Note that we have a degree of freedom in this construction: the reputation from taking a pooling action depends on the probability with which type $L$ separates in that given period. Hence by changing these probabilities we will be able to sustain different pooling actions $\bar a$ in equilibrium. To complete the construction, we need to verify that from time $T$ onwards, the pooling strategy is such that $L$ is exactly indifferent at $T$ (or the last period before $T$) between separating and following this pooling path, and to verify that all other types always weakly prefer the pooling action to the optimal deviation. Appendix B provides an example of an equilibrium constructed using this cookbook, in the context of the price signaling model developed in the following section.

Applications
This section presents examples of applied models in different settings that fit our framework.
It is meant to demonstrate some instances of models yielding additively and/or multiplicatively separable payoff functions that allow (SC) to be verified with little effort.We begin by presenting in Section 5.1 a specific separable framework and show that there exist simple sufficient conditions for (MON) and (SC) within this framework.We then proceed to applying this framework, starting with a model for the example from the Introduction, in which a firm can use its historic prices to convince the antitrust authority that the firm has no monopoly power.Section 5.2 sets up the model and verifies (both directly and using results from Section 5.1) that our results apply to it; Appendix B constructs an example of an informative equilibrium to verify that they exist and to show more concretely how they can look.Other applications, to labor market signaling and bargaining, are considered in Sections 5.3 and 5.4 respectively.

Separable Settings
This section shows that if the agent's flow utility function ũ(a, p, θ) is separable in a specific way, then (MON) and (SC) can be easily verified.While this form of the utility function may appear restrictive, the remainder of Section 5 shows that it captures a wide range of settings, including classic signaling and bargaining models.
In particular, suppose the agent's flow utility function $\tilde u$ defined in (3) can be represented as
$$\tilde u(a, p, \theta) = \phi_0(a, p) + \phi_1(a, p)\,\psi(\theta) \qquad (4)$$
for some collection of functions $\phi_0, \phi_1, \psi$. Then we can derive simple conditions on these three functions that are sufficient for (MON) and (SC) to hold. These conditions are given by the two respective propositions below.
As in the rest of the paper, monotonicity in $p$ is understood with respect to the $\geq_{FOSD}$ order on $p$ (see footnote 8). Note that $\phi_1(a, p)\psi(\theta)$ can be negative for some or all $a, p, \theta$; without loss we let $\phi_1$ absorb the negative sign in this case.
Proposition 3. If representation (4) applies and outcomes x t are uninformative at all h t , then (SC) holds if ψ(θ) is strictly monotone in θ.
We now continue to the more specific applications that use these results.

Price Signaling
In this section we revisit the example from the Introduction, in which Apple is using its past pricing decisions to argue that it faces competitive pressure with regard to its App Store. This section constructs a simple model for this story that fits the framework of Section 2, thereby demonstrating that past prices cannot serve as conclusive proof of a lack of monopoly power. We then construct an informative equilibrium in Appendix B for a special case of this model, showing that prices can serve as suggestive evidence.
To construct the simplest model possible, let us adopt a framework in the spirit of monopolistic competition.Consider a firm (Apple) that serves app developers, and for simplicity ignore the downstream market, in which developers interact with app users.The firm faces residual demand curve q t = (1 − θa t )dt in every period t ∈ T ≡ {0, dt, 2dt, ...}, where a t ∈ R + is the price (App Store commission) the firm sets in that period, and θ ∈ Θ ⊆ R ++ is the degree of the competitive pressure that the firm faces, hereinafter referred to as the state.In this specification, if Apple App Store is, in fact, competing with Google Play Store and other app and game stores on other devices, this is reflected by higher θ and lower residual demand for Apple's services.
In every period, after the firm sets price a t , the developers may file a complaint to the antitrust authority (regulator).This happens with probability λ(a t )dt, which is weakly increasing in a t .If a complaint is filed, the regulator opens an investigation, which results in a fine of size F if evidence indicates that the competitive pressure θ is low.In particular, assume that if an investigation is launched, a fine is imposed with probability γ(E[θ|h t ]) that is strictly decreasing in the expectation E[θ|h t ] of the state inferred from some prior belief p 0 ∈ ∆(Θ) and the history of the firm's past pricing choices h t = (a 0 , ..., a t−dt ).
To verify that Theorems 1 and 2 apply to this model, we need to verify that assumptions (MON) and, in the case of Theorem 2, (SC) hold. The flow payoff function can be written down as
$$\tilde u(a, p, \theta) = a(1 - \theta a) - \lambda(a)\gamma(p)F, \qquad (5)$$
where $\gamma(p)$ is shorthand for $\gamma(E[\theta \mid p])$. This utility function fits representation (4) with $\phi_0(a, p) = a - \lambda(a)\gamma(p)F$, $\phi_1(a, p) = -a^2$, and $\psi(\theta) = \theta$. One can see that $\psi(\theta) > 0$ is strictly increasing, and $\phi_0(a, p)$, $\phi_1(a, p)$ are weakly increasing in $p$, since $\phi_0(a, p)$ is strictly decreasing in $\gamma$, which is strictly decreasing in $E[\theta \mid p]$, which is strictly increasing in $p$ w.r.t. FOSD shifts. Hence Propositions 2 and 3 apply, and (MON) and (SC) hold in this model.
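Both assumptions can also be verified directly. To verify (MON), note that (5) only depends on $p$ through $\gamma(p)$, which enters with a negative sign, and as argued above, $\gamma$ is strictly decreasing in $p$ w.r.t. FOSD shifts, hence (MON) holds. To verify (SC), fix some $a', a''$; then the function $\mathcal{U}(\theta) \equiv U(a' \mid h_t, \theta) - U(a'' \mid h_t, \theta)$ for some fixed $h_t$ is given by
$$\mathcal{U}(\theta) = C_1 + C_2\,\theta$$
for some constants $C_1, C_2$ that depend on $a', a'', h_t$. We see that $\mathcal{U}(\theta)$ is a linear function of $\theta$, so it either crosses zero at most once or is identically zero, hence (SC) holds.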
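The linearity argument can be illustrated numerically in a one-period version of the model. The sketch below is an assumption-laden illustration: it takes the flow payoff to be revenue $a(1 - \theta a)$ net of the expected fine $\lambda(a)\gamma F$, which matches the decomposition into $\phi_0$, $\phi_1$, $\psi$ stated above, and the functional form of `lam`, the fine, the two prices and the two fine probabilities are all hypothetical values chosen for the example.

```python
def flow_payoff(a, gamma_p, theta, lam, fine):
    """Revenue a * (1 - theta * a) net of the expected fine lam(a) * gamma_p * fine."""
    return a * (1.0 - theta * a) - lam(a) * gamma_p * fine


def phi0(a, gamma_p, lam, fine):
    """Type-independent part of the payoff."""
    return a - lam(a) * gamma_p * fine


def phi1(a):
    """Coefficient on psi(theta) = theta."""
    return -a ** 2


def lam(a):
    """Hypothetical complaint intensity, weakly increasing in the price."""
    return 0.3 * a


fine = 2.0                 # hypothetical fine F
a1, gamma1 = 0.5, 0.6      # price a' and the fine probability it induces
a2, gamma2 = 0.3, 0.2      # price a'' and the fine probability it induces
thetas = [0.2, 0.3, 0.4, 0.5]

# The separable representation reproduces the flow payoff for every type.
gap = max(
    abs(flow_payoff(a1, gamma1, th, lam, fine)
        - (phi0(a1, gamma1, lam, fine) + phi1(a1) * th))
    for th in thetas
)

# Payoff difference between the two prices, as a function of the type.
diff = [
    flow_payoff(a1, gamma1, th, lam, fine) - flow_payoff(a2, gamma2, th, lam, fine)
    for th in thetas
]
sign_changes = sum(1 for x, y in zip(diff, diff[1:]) if x * y < 0)
print(gap, diff, sign_changes)
```

On this equally spaced grid the payoff difference has a constant increment, as a linear function should, and it changes sign exactly once, so low-$\theta$ types prefer the higher price and high-$\theta$ types prefer the lower one.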
The results of Theorems 1 and 2 therefore apply: the firm cannot prove conclusively, by referring to prices alone, that its market power is sufficiently low (i.e., that the competitive pressure $\theta$ is above some threshold set by the regulator). However, prices can serve as suggestive evidence. To demonstrate this, an informative equilibrium is constructed in Appendix B given specific functional forms for $\lambda$ and $\gamma$.
It should be self-evident that the goal of this section is to produce the simplest model for the setting.As a result, the model reduces the effects of competition to a residual demand curve, reduces consumers to non-strategic complainers, assumes the antitrust action is just a fixed fine, that investigations do not condition on past investigations, etc.A paper aiming to explore this particular phenomenon could set up a more convincing model that avoids the aforementioned simplifications.

Labor Market Signaling
In this section we revisit the classic labor market signaling model (Spence [1973]), which sparked the original discussion around dynamic signaling (Nöldeke and van Damme [1990a], Swinkels [1999]).In the dynamic version of this model, a long-lived candidate of privately known ability θ ∈ Θ ⊆ R + acquires costly and, w.l.o.g., unproductive education in an attempt to signal her ability to potential employers.A high-ability worker is more productive on the job and can thus bargain for a higher wage, while also having lower cost of education than a low-ability worker.In every period t ∈ T ≡ {0, dt, 2dt, ...} she chooses education intensity e ∈ E ⊂ R. The flow cost of education is given by c(e|θ) ≡ l(e) • m(θ), where l(e) is increasing in e with l(0) = 0, and m(θ) is strictly decreasing in θ.
There is a population of homogeneous competitive employers, who observe the full history of the candidate's education choices and grades. In every period they simultaneously offer employment contracts to the candidate.¹³ After observing all contracts, the candidate may accept at most one of them. If a contract is accepted, in every future period the candidate receives wage $w \cdot dt$, where $w$ is as specified in the contract. Let $d \in \{0, 1\}$ denote the worker's acceptance decision: whether she chooses to accept an offer in a given period or not. If the candidate chooses to accept, she would trivially find it optimal to choose the highest-wage contract.
Alternatively, we could once again invoke Propositions 2 and 3, since (6) can, given the buyer's equilibrium strategy $\{y_B(p), z_B(y_S, p)\}$, be represented in terms of (4) with $\phi_0((y_S, z_S), p \mid \chi) =$
The main implication of Theorems 1 and 2 for this bargaining model is that the perfectly efficient allocation is unattainable for any finite delay. However, our results make no statements on whether the Coase conjecture is realized and all seller types trade at $t = 0$ at the lowest acceptable price, or delay can be used as an (imperfect but effective) screening device.¹⁷ The exact shape that the equilibria can take depends on $c(\theta)$, $v(\theta)$, and $x(h_t)$. For example, if $c(\theta) = c$ for some constant $c$, $v(\theta) \geq c + g$ for some "gap" value $g$, and the uninformed buyer makes all the offers, then the Coase conjecture holds, and in the unique equilibrium all types of the seller pool on accepting the lowest price (Gul, Sonnenschein, and Wilson [1986]). In the alternating-offer scenario with the same assumptions, on the other hand, a delay equilibrium is possible, where attrition occurs for some finite time. During that time, some of the lowest-type sellers sell, and after that all the remaining sellers pool on the same price (Ausubel and Deneckere [1998], as presented in Ausubel, Cramton, and Deneckere [2002]). If $v(\theta) = v \geq c(\theta)$ for some constant $v$ and all $\theta$, then in the seller-proposing game pooling is, again, the only equilibrium, but not in the Coase conjecture sense. In that equilibrium all types of the seller propose $v$ in every period and obtain surplus $v - c(\theta)$, with their offer being accepted straight away (Ausubel and Deneckere [1989]). For a review of these and other results on bargaining under incomplete information, see Ausubel et al. [2002]. As applied to more recent literature, our framework also subsumes the model of Daley and Green [2020], who explore a setting in which public news arrives during the bargaining process.

Conclusion
This paper explores a model of dynamic signaling with observable actions. In this model a single privately-informed agent takes an action every period, but cannot commit to future actions. The receiver tries to infer the agent's information from his actions, and the receiver's opinion is relevant to the agent's payoff. The existing literature has implied that signaling is impossible in such a setting, unless strong assumptions about off-equilibrium-path beliefs are adopted. This paper confirms the negative result that perfect separation is impossible in such a setting. However, it provides a novel positive result, showing that imperfect signaling is possible under reasonable off-path beliefs. Further, we show that such signaling must necessarily happen through attrition of the lowest type of the agent. In this attrition scenario, all types pool on the same action (or split across a number of different yet payoff-equivalent actions), while the lowest type also plays some separating action with positive intensity.
The paper identifies sufficient conditions under which the results hold. These include a restriction on monotonicity of the agent's preferences w.r.t. his reputation, and a requirement that off-path beliefs be reasonable. In the case of many types, single-crossing of the agent's preferences must also hold. The latter is, arguably, the strongest of the three assumptions and the one that would be most difficult to verify in applied work. However, the paper presents a number of applied signaling models to demonstrate that this notion of single-crossing can be used in applied work. Future work could involve the exploration of simpler notions of single-crossing that would work for dynamic signaling games.
... depends on $x_t$ or, after taking expectations, on its distribution $\chi(h_t)$. However, in common settings of buyer-proposing ($\chi(h_t) \equiv 1$), seller-proposing ($\chi(h_t) \equiv 0$), and even alternating-offer ($\chi_0 \in \{0, 1\}$, $\chi(h_{t+dt}) = 1 - \chi(h_t)$) bargaining, we can treat $\chi(h_t)$ as an exogenous parameter.
¹⁶ If $c(\theta)$ is constant, Proposition 3 can be applied by setting $\tilde u(a, p, \theta) = \phi_0(a, p)$.
¹⁷ There is a subset of literature exploring possible reasons for delay in bargaining (e.g., Abreu and Gul [2000] and Feinberg and Skrzypacz [2005]). Our focus is different: instead of demonstrating sufficient conditions within the bargaining models under which delay is the only equilibrium, this paper considers a much more general class of models and aims to provide weaker sufficient conditions under which attrition is the only informative equilibrium. In particular, when applied to bargaining, our model does not rule out the possibility of instant agreement, in which all types of the sender pool on the same action.
Importantly, the paper assumes the receiver to be myopic.The main issue that arises when both the agent and the receiver are strategic is the folk theorem, which says that any individually rational payoff for either player can be sustained in equilibrium for dt small enough.The consequence of this equilibrium multiplicity is that (MON) and (SC) almost never hold across the whole spectrum of equilibria in a given setting.The solution, if one wishes to explore settings with a strategic long-lived receiver, is to focus on some selected equilibria -i.e., to restrict attention to some fixed strategy (or a class of strategies) b of the receiver and to test (MON) and, if necessary, (SC) against those strategies.This approach has been demonstrated in this paper in the bargaining application.
Exploration of the specific equilibrium selection criteria that could yield favorable results lies beyond the scope of this paper, but could be another prospective direction for future research.
Payoff-equivalence is shown as follows: for any two $a', a'' \in A$ such that $\alpha_H(a' \mid h_t) > 0$ and $\alpha_H(a'' \mid h_t) > 0$ it must be that $V(a' \mid h_t, b, H) = V(a'' \mid h_t, b, H)$, otherwise the high type would only play one of the actions and not the other. The first part of the argument showed that $\alpha_L(a' \mid h_t) > 0$ and $\alpha_L(a'' \mid h_t) > 0$, hence $V(a' \mid h_t, b, L) = V(a'' \mid h_t, b, L)$ by the same logic.
Statement 2. Begin with the first part (that $a' \in A^*(h_t, b, L)$). For any $a'$ such that $\alpha_H(a' \mid h_t) = 0$ and $\alpha_L(a' \mid h_t) > 0$ and any outcome $x_t$, we have $p(h_{t+dt}) = \delta_L$, where $h_{t+dt} = (h_t, a', x_t)$. By Lemma 2, at all histories $h_s$ following $h_t$, only bliss actions are played: $a_s \in A^*(h_t, b, L)$. If $a' \notin A^*(h_t, b, L)$ then playing a bliss action $a'' \in A^*(h_t, b, L)$ at $h_t$ instead, and continuing with $a_s$ at all subsequent histories, yields a strictly higher flow payoff at $h_t$ and the same continuation payoff. Hence playing $a'$ at $h_t$ was not optimal.
The second part of the second statement follows from the same argument as did payoff equivalence for L in the first statement.

A.3 Proofs: Finite Types
Before proceeding to the proof of Theorem 2, it is convenient to split parts of it off into supplementary lemmas. We begin by arguing in Lemma 3 that at no history can actions lead to separation of types into disjoint sets that can be compared by a strong set order, unless one of these sets is a singleton coinciding with the lower bound of the other set. In particular, we show that sets of types in the support of two different actions necessarily have to overlap (not in the sense of having common elements, but in the sense of upper and lower bounds).
Lemma h s h t by the same argument as above. Unlike in the previous case, ∆U is not bounded away from zero for all dt. However, for the IC condition U (a |h t , b, θ) ≥ U (a |h t , b, θ) to hold, we must have that ∆U → 0 as dt → 0. This would require that p → δ S(ht,a ) as dt → 0, which yields a contradiction in the limit, since
Next, Lemma 4 puts the (SC) property to use, establishing a form of monotonicity of optimal strategies w.r.t. type ("higher types play higher strategies"). The main problem in the dynamic setting is the lack of any nice complete order over strategies, so given two arbitrary strategies, we generally cannot say which one of them is "higher". Therefore, we rephrase this monotonicity result to say instead that if a given strategy (or its equivalent) is optimal for two agent types, then it must also be optimal for all types in between. We cannot say with certainty that the given strategy is chosen on the equilibrium path by any of these types in between, but we can claim that any strategy they play must be payoff-equivalent to the one under consideration.
Lemma 4. Suppose (SC) holds. Fix any equilibrium $(\alpha, b, p)$ (for any $dt > 0$) and history $h_t \in H$. If there exists a pair of strategies $\underline a, \bar a$ arriving at $h_t$ that are payoff-equivalent at $h_t$ and are on path at $h_t$ for some types $\underline\theta$ and $\bar\theta > \underline\theta$ respectively, then any strategy $\hat a$ arriving at $h_t$ that is on path at $h_t$ for any $\hat\theta \in (\underline\theta, \bar\theta)$ must be payoff-equivalent at $h_t$ to $\bar a, \underline a$.
Proof. Fix any such $\hat a$. Strategy $\bar a$ has to be optimal for type $\bar\theta$. In particular, when evaluated at $h_t$, it has to be weakly better than $\hat a$:
$$U(\bar a \mid h_t, b, \bar\theta) \geq U(\hat a \mid h_t, b, \bar\theta).$$
The same holds for type $\underline\theta$, since $\bar a$ and $\underline a$ are payoff-equivalent:
$$U(\bar a \mid h_t, b, \underline\theta) \geq U(\hat a \mid h_t, b, \underline\theta).$$
At the same time, $\hat\theta$ at least weakly prefers $\hat a$ to $\bar a$, meaning that the converse holds for $\hat\theta$:
$$U(\bar a \mid h_t, b, \hat\theta) \leq U(\hat a \mid h_t, b, \hat\theta).$$
If this inequality is strict, then this is a direct contradiction with (SC), which requires that $U(\bar a \mid h_t, b, \theta) - U(\hat a \mid h_t, b, \theta)$ as a function of $\theta$ either crosses zero at most once, or is exactly zero.
denote an outcome profile which prescribes some outcome for every history.Fix some x.Coupled with some pure strategy of the agent, the receiver's equilibrium strategy b and the equilibrium belief system p, it fully determines the path of play and the agent's payoffs.
Begin the second layer of induction, iterating forwards on time periods from $t$. At $h_t$ and any subsequent history $h_s$, one of the following must apply:
1. There is an action $a$ on path for both types $\bar S(h_t)$ and $\underline S(h_t)$ at $h_s$. If this is the case, call $h_s$ a non-splitting history and continue to $h_{s+dt} = (h_s, a, x(h_s))$.
2. There is no action $a$ on path for both $\bar S(h_t)$ and $\underline S(h_t)$ at $h_s$. If this is the case, call $h_s$ a splitting history.
Proceed along the non-splitting path (according to the chosen $x$) until the first splitting history $h_s$. Pick arbitrary actions $\bar a$ and $\underline a$ that are on path for $\bar S(h_t)$ and $\underline S(h_t)$ at $h_s$ respectively, and consider two continuation histories $\bar h_{s+dt} \equiv (h_s, \bar a, x(h_s))$ and $\underline h_{s+dt} \equiv (h_s, \underline a, x(h_s))$. Then we have that $|S(h_{s+dt})| < |S(h_s)| = k$ for both continuation histories, because $S(\bar h_{s+dt}) \subseteq S(h_s) \setminus \underline S(h_s)$ and $S(\underline h_{s+dt}) \subseteq S(h_s) \setminus \bar S(h_s)$. Therefore, by the induction assumption, the statement of the lemma holds at both $\bar h_{s+dt}$ and $\underline h_{s+dt}$.
In particular, the statement of the lemma for $\bar h_{s+dt}$ states that there exist two $\bar h_{s+dt}$-payoff-equivalent strategies on path at $\bar h_{s+dt}$ for $\bar S(h_s)$ and $\underline S(\bar h_{s+dt})$ respectively. Playing $\bar a$ at $h_s$ is on path for both of these types, hence there also exists a pair of strategies on path at $h_s$ for the two types respectively, which grant the same payoff conditional on $x$.²² However, the argument above applies to any outcome profile $x$ and, in particular, to any outcome $x(h_s)$, hence there also exists a pair of strategies $\bar a', \bar a''$ on path at $h_s$ for the two types $\bar S(h_s)$ and $\underline S(\bar h_{s+dt})$ respectively, which are payoff-equivalent at $h_s$ (unconditionally).
By a mirror argument, there also exists a pair of $h_s$-payoff-equivalent strategies $\underline a', \underline a''$ on path at $h_s$ for $\underline S(h_s)$ and $\bar S(\underline h_{s+dt})$ respectively. Note further that by Lemma 3 we have that $\bar S(\underline h_{s+dt}) > \underline S(\bar h_{s+dt})$. Lemma 4 hence applies: $\underline a''$ must be payoff-equivalent to $\bar a', \bar a''$, thus so is $\underline a'$. We have shown that the statement of the lemma holds at $h_t$ if $|S(h_t)| = k$ and $h_t$ is a splitting history.
We are left to cover non-splitting histories. Suppose $h_t$ is non-splitting. Fix $x$. Then we know that the statement of the lemma holds at the first splitting history $h_s$ following $h_t$ along the path of pooling actions and fixed outcomes $x$. Therefore, there exists a pair of strategies on path at $h_t$ for $\bar S(h_t)$ and $\underline S(h_t)$, which grant the same payoff at $h_t$ conditional on $x$. This applies to any outcome profile $x$, hence there exists a pair of strategies $\bar a, \underline a$ on path at $h_t$ for $\bar S(h_t)$ and $\underline S(h_t)$, which are payoff-equivalent at $h_t$. This concludes the induction argument and the proof of the lemma.
Proof of Proposition 1. Let $\bar\theta \equiv \bar S(h_t)$. Note that the statement of the proposition holds trivially if $|S(h_t)| = 1$, so for the remainder of this proof we assume that this is not the case (i.e., $\bar\theta \neq \underline\theta$). From Lemma 5 we know there exist $h_t$-payoff-equivalent strategies $\bar a, \underline a$ arriving at $h_t$ that are on path at $h_t$ for $\bar\theta$ and $\underline\theta$ respectively. Then by Lemma 4, any pure strategy arriving at $h_t$ that is on path at $h_t$ for any $\theta \in S(h_t) \setminus \{\bar\theta, \underline\theta\}$ is payoff-equivalent at $h_t$ to $\bar a, \underline a$.
Suppose now there exists a pure strategy $\bar a'$ arriving at $h_t$ that is on path at $h_t$ for $\bar\theta$ and payoff-distinct at $h_t$ from $\bar a$. By (SC), all types $\theta \in S(h_t) \setminus \bar\theta$ must have a strict preference at $h_t$ between $\bar a$ and $\bar a'$. The former is optimal for these types, hence $\bar a'$ is only on path for $\bar\theta$. The two strategies cannot prescribe different actions $\bar a(h_t) \neq \bar a'(h_t)$ at $h_t$, since this is in violation of Lemma 3. The same, however, applies to any subsequent history $h_s$ such that $|S(h_s)| > 1$. At all $h_s$ following $h_t$ s.t. $|S(h_s)| = 1$, Lemma 2 implies that all pure strategies on path at $h_s$ are $h_s$-payoff-equivalent. Both facts together imply that $\bar a(h_s) = \bar a'(h_s)$ for all $h_s$ following $h_t$. This contradicts $\bar a$ and $\bar a'$ being payoff-distinct, hence such $\bar a'$ does not exist. Therefore, any pure strategy $\underline a$ on path at $h_t$ that is $h_t$-payoff-distinct from $\bar a$ is only on path for $\underline\theta$. This concludes the proof.
Proof of Theorem 2. From Proposition 1, all actions a ∈ A(h t ) are on path for θ, which proves the first statement of the theorem.
From the fact that payoff-relevant signaling happens at h t we know that there exist two pure strategies a, ā that are payoff-distinct at h t and prescribe different actions at h t : a ≡ a t = ā ≡ āt .From Proposition 1 we know at least one of these strategies -suppose a -is on path for θ but not for any other θ ∈ S(h t )\ θ at h t .Furthermore, it follows from the definition of payoff-relevant signaling that there is no ā s.t.ā h t and ā t = a, and which is payoff-equivalent to ā at h t .Therefore, a is only on path for θ, while ā is optimal for all θ ∈ S(h t ) at h t .
We now show that $\underline{a} \in A^*(h_t, b, \underline{\theta})$. Suppose not. Then type $\underline{\theta}$ can play some $a_s \in A^*(h_s, b, \underline{\theta})$ at every history $h_s \succeq h_t$. Compared to following $\underline{a}$, this strategy would yield the same payoff at all times $s > t$ and a strictly higher payoff at $t$ (same as in the proof of Theorem 1), hence $\underline{a}$ is not optimal for $\underline{\theta}$ at $h_t$, a contradiction.
To complete the proof of statements 2 and 3 of the theorem, we need to show that $\bar{a} \notin A^*(h_t, b, \underline{\theta})$.
Assume not. Consider the strategy of playing $\bar{a}$ at $h_t$ and all subsequent histories. Compared to following $\underline{a}$, this strategy would yield $\underline{\theta}$ a weakly higher payoff at all times $s > t$ and a strictly higher payoff at $t$ (due to $p(h_t, (\bar{a}, x_t)) > \delta_{\underline{\theta}}$ for all $x_t$ and to the strict part of (MON)), hence it is a profitable deviation from $\underline{a}$ for $\underline{\theta}$ at $h_t$, a contradiction.
This completes the proof of Theorem 2.
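Proof of Corollary 1. The low type must be indifferent between taking a separating action $\underline{a}$ at $h_t$ and pooling on $\bar{a}$ at $h_t$ and separating at $h_{t+dt}$. This indifference dictates that one period of pooling must be exactly as attractive as one period of being revealed as $\underline{\theta}$, i.e., $E_x\left[\tilde{u}\left(\bar{a}, p(h_t, (\bar{a}, x)), \underline{\theta}\right) \mid \underline{\theta}\right] = \tilde{u}\left(\underline{a}, \delta_{\underline{\theta}}, \underline{\theta}\right)$.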
Proof of Corollary 2. Proposition 1 states that all pure strategies $a$ on path at $h_t$ for any $\theta \in S(h_t) \setminus \{\underline{S}(h_t)\}$ are payoff-equivalent at $h_t$. Since there is no payoff-relevant signaling in equilibrium, the set of such strategies is a singleton: if there is more than one then there exists $h_s \succeq h_t$ at which the two prescribe different actions,

type $\theta = 1$. The bliss price for type $\theta$ can be calculated by maximizing the flow payoff:
It makes sense to select the negative root, since the intuition behind the problem suggests $a^P(h_t) < a^S(1)$: convincing the regulator that the market is competitive involves setting prices below the monopoly price, not above.
To finish the equilibrium construction, it remains to pick an arbitrary prior belief $p_0$ and a per-period separation probability for the lowest type (which pins down the evolution of $p_t$), calculate the resulting path of $a^P(h_t)$, and verify that it is incentive compatible for the competitive types to join in on the pooling price $a^P(h_t)$ rather than separating away to $a^S(\theta) = \frac{z(\delta_1)}{2\theta}$. This IC condition is difficult to verify analytically (and it may or may not hold, depending on the parameters, possibly leading to nonexistence of an informative equilibrium), but can be verified numerically. Figure B.1 presents a numerical example with period length $dt = 0.1$. In this example, separation occurs over $t \in [0, 10] \cap T$, and type $\theta = 1$ separates with intensity 0.5 (which is equal to separation probability $0.5\,dt = 0.05$ per period) over that time interval. From $t = 10$ onwards, all types switch to pooling indefinitely.
From Panel B.1a we can see that the pooling price in this example is higher than the competitive prices, i.e., the bliss prices that types $\theta = 2$ or $\theta = 3$ would set, but this observation is specific to the parameters. The latter follows from Corollary 1, which states that the pooling action is dictated by the lowest type's indifference and is completely independent of other types' preferences. Panel B.1b demonstrates visually that all the IC constraints hold: the monopolist $\theta = 1$ is indifferent between pooling and separating at all $t$, while the competitive types $\theta \in \{2, 3\}$ always prefer pooling. Panels B.1c and B.1d demonstrate that the extent of information revelation can still be substantial in equilibrium: the regulator becomes more convinced of market competitiveness over time, with the probability she assigns to $\theta = 1$ dropping from 0.63 to 0.01 along the pooling path.
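The reported drop in the regulator's belief can be roughly sanity-checked with a short Bayesian updating sketch. This is an illustration under simplifying assumptions that are not taken from the paper's own computation: outcomes are treated as uninformative, only type θ = 1 separates (with probability 0.05 per period), the initial probability on θ = 1 is 0.63, and the interval [0, 10] is counted as 100 periods of length 0.1. The function name `pooling_belief_path` is hypothetical.

```python
def pooling_belief_path(p0, q, n_periods):
    """Probability assigned to the lowest type after each period of observed pooling.

    Assumes (for illustration only) that the lowest type separates with
    probability q per period, that all other types always pool, and that
    outcomes carry no additional information.
    """
    path = [p0]
    p = p0
    for _ in range(n_periods):
        # Bayes' rule conditional on observing the pooling action
        p = p * (1 - q) / (p * (1 - q) + (1 - p))
        path.append(p)
    return path


dt = 0.1                      # period length from the numerical example
intensity = 0.5               # separation intensity of type theta = 1
q = intensity * dt            # per-period separation probability, 0.05
n_periods = round(10 / dt)    # assumed number of periods in [0, 10]

path = pooling_belief_path(0.63, q, n_periods)
print(f"initial belief: {path[0]:.2f}, belief at t = 10: {path[-1]:.2f}")
```

Under these assumptions the belief falls monotonically and ends close to 0.01, which is in line with the figures quoted for Panels B.1c and B.1d. If outcomes were informative, the path would depend on their realizations and this simple recursion would no longer apply.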
throughout and which are sufficient for the results in the two-type model: Monotonicity and Never Dissuaded Once Convinced. (The version of the model with more than two types requires a third assumption, Single Crossing, which is introduced and discussed separately in Section 4.1.) To introduce these assumptions, a few extra bits of notation are useful. Firstly, let $\delta_\theta$ denote the Dirac delta: given some $\theta \in \Theta$, $p(h_t) = \delta_\theta$ is equivalent to $S(h_t) = \{\theta\}$. Secondly, let $\tilde{u}_t$ denote the agent's induced flow payoff function given some fixed strategy $b$ of the receiver:

$\chi z_S y_B(p) + (1 - \chi) z_B(y_B, p) y_S$ and $\phi_1((y_S, z_S), p \mid \chi) = -\chi z_S - (1 - \chi) z_B(y_B, p)$ being monotone in $p$ and $\psi(\theta) = c(\theta)$ being strictly monotone.

$p_t) \equiv 1 - \lambda F\left(1 - E[\theta \mid p_t]\right)$ and updates her belief $p_t$ upon this information. Specifically, we use $p(h_t, a_t, x_t)$ to denote the resulting posterior belief. Hereinafter, belief $p_t$ is also referred to as the agent's reputation. Further, let $S(p_t) \subseteq \Theta$ be the support of belief $p_t$, i.e., the set of types to which $p_t$ assigns positive weight. With abuse of notation, let $S(h_t) \equiv S(p(h_t))$.

At the end of every period $t$, the agent receives flow payoff $u(a_t, b_t, \theta)$, and the receiver obtains payoff $w(a_t, b_t, x_t, \theta)$. For simplicity, we assume that the agent's payoff $u$ does not depend on the realized outcome $x_t$ except through the effect it has on the receiver's action $b_t$. Both functions are assumed to be upper semi-continuous in the respective player's action: $\limsup_{a \to a_t} u(a, b_t, \theta) \leq u(a_t, b_t, \theta)$ and $\limsup_{b \to b_t} w(a_t, b, x_t, \theta) \leq w(a_t, b_t, x_t, \theta)$ for all $a_t, b_t, x_t, \theta$. Further, assume that function $u$ is bounded.
a, b, θ denote the highest expected continuation utility that type $\theta$ can achieve conditional on taking action $a$ at history $h_t$. The outer expectation is taken w.r.t. the realization of the period-$t$ outcome $x_t$, which affects the receiver's belief $p_t$ and, thus, her action $b_t$ and the agent's contemporaneous utility $u(a_t, b_t, \theta)$. The $(t + dt)$-history is $h_{t+dt} = (h_t, (a_t, x_t, b_t))$.
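Lemma 3. Suppose (MON) holds and $dt \to 0$. Fix any equilibrium and any history $h_t \in H$. Then for any $a', a'' \in A(h_t)$ we have $\bar{S}(h_t, a') \geq \underline{S}(h_t, a'')$, with equality only if $S(h_t, a')$ is a singleton.

Proof. Assume by contradiction that $\bar{S}(h_t, a') < \underline{S}(h_t, a'')$ for some $a', a'' \in A(h_t)$. Pick any type $\theta \in S(h_t, a')$ and any strategy $a'$ on path for $\theta$ at $h_t$. Construct strategy $a''$ as $a''(h_t) = a''$ and $a''(h_s) = a'(h_s)$ for all $h_s \succ h_t$. This strategy constitutes a profitable deviation for $\theta$ at $h_t$. To see this, observe that the agent's lifetime utility can be written as

$$U(a \mid h_t, b, \theta) \equiv E\left[ \tilde{u}\big(a(h_t), p(h_t, a(h_t), x_t), \theta\big)\, dt + \sum_{s > t} e^{-r(s-t)}\, \tilde{u}\big(a(h_s), p(h_s, a(h_s), x_s), \theta\big)\, dt \;\Big|\; h_t, \theta \right].$$

Since $p(h_s, a''(h_s), x_s) \geq_{FOSD} \delta_{\underline{S}(h_t, a'')} >_{FOSD} \delta_{\bar{S}(h_t, a')} \geq_{FOSD} p(h_s, a'(h_s), x_s)$ for any $x_s$ and all $h_s \succ h_t$, (MON) implies that

$$U(a'' \mid h_t, b, \theta) - U(a' \mid h_t, b, \theta) \geq E\left[ \tilde{u}\big(a'', p(h_t, a'', x_t), \theta\big) - \tilde{u}\big(a', p(h_t, a', x_t), \theta\big) \right] dt + \Delta U, \qquad \text{(A.1)}$$

where

$$\Delta U \equiv E\left[ \sum_{s > t} e^{-r(s-t)} \Big( \tilde{u}\big(a''(h_s), \delta_{\underline{S}(h_t, a'')}, \theta\big) - \tilde{u}\big(a'(h_s), \delta_{\bar{S}(h_t, a')}, \theta\big) \Big)\, dt \;\Big|\; h_t, \theta \right].$$

By construction of $a''$, $a''(h_s) = a'(h_s)$ for all $h_s \succ h_t$, hence $\tilde{u}\big(a''(h_s), \delta_{\underline{S}(h_t, a'')}, \theta\big) = \tilde{u}\big(a'(h_s), \delta_{\underline{S}(h_t, a'')}, \theta\big)$. By assumption, $\underline{S}(h_t, a'') > \bar{S}(h_t, a')$, hence (MON) implies $\tilde{u}\big(a'(h_s), \delta_{\underline{S}(h_t, a'')}, \theta\big) > \tilde{u}\big(a'(h_s), \delta_{\bar{S}(h_t, a')}, \theta\big)$ for all $h_s \succ h_t$. We conclude that $\Delta U > 0$. Further, the value of $\Delta U$ is strictly positive regardless of $dt$.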
For $dt$ small enough, it dominates the first term on the RHS of (A.1) because $u$ is bounded, implying that for small enough $dt$, $U(a'' \mid h_t, b, \theta) > U(a' \mid h_t, b, \theta)$, which contradicts $a'$ being optimal for $\theta$ at $h_t$.

Now suppose $\bar{S}(h_t, a') = \underline{S}(h_t, a'')$. Suppose by way of contradiction that $|S(h_t, a')| > 1$, meaning $\underline{S}(h_t, a') < \bar{S}(h_t, a')$. Let $p' \equiv E[p(h_t, a', x_t) \mid h_t, a']$ be the belief induced by action $a'$ alone (note $p' <_{FOSD} \delta_{\bar{S}(h_t, a')} = \delta_{\underline{S}(h_t, a'')}$). Consider type $\theta \equiv \underline{S}(h_t, a')$. This type $\theta$ must have an on-path strategy $a'$ that yields reputation $p \not>_{FOSD} p'$ at all histories $h_s \succ (h_t, a')$ with positive probability (with certainty if outcomes $x$ are uninformative). Construct strategy $a''$ as above. Then inequality (A.1) holds with e

Lemma 5 below is the final step before we can move on to the proofs of the main results. It can be seen as a weaker version of Proposition 1, claiming that the highest and lowest types at any history have a strategy in common.

Lemma 5. Suppose (MON) and (SC) hold and $dt \to 0$. Fix an equilibrium $(\alpha, b, p)$ and any $h_t \in H$. There exist $h_t$-payoff-equivalent strategies $\bar{a}$, $\underline{a}$ on path at $h_t$ for $\bar{S}(h_t)$ and $\underline{S}(h_t)$ respectively.

Proof. We will proceed by induction on the support size $|S(h_t)|$. The claim of the lemma holds trivially for $|S(h_t)| = 1$, and by Theorem 1 it also holds for $|S(h_t)| = 2$. The remainder of the proof shows that if the claim holds when $|S(h_t$