1 Introduction

The price formation process in financial markets involves equating supply and demand for securities over time for arriving investors with heterogeneous trading preferences. In present day markets, large investors act on their underlying trading preferences, sometimes called parent demands, by splitting their trading into dynamic sequences of smaller orders, called child orders (see O’Hara [32]), to minimize their price impact. Since the parent demands driving child-order trading are private information, investors use information from arriving child orders to form inferences over time about the dynamically evolving fundamental state of the market. In particular, investors learn about imbalances in the underlying aggregate parent demands and the associated pressure on future market-clearing prices and incorporate this information in their current child orders. Given the widespread prevalence of optimized order-splitting of parent orders into flows of child orders, dynamic learning about aggregate parent demands is a critical part of market dynamics.Footnote 1

This paper is the first to provide an analytically tractable equilibrium model of dynamic learning, trading, and pricing with parent trading demands. We consider a continuous-time model with high-frequency trading at times \(t\in [0,1]\) over short time-horizons with [0, 1] being a day or an hour. Trading occurs between price-sensitive optimizing traders with two different types of parent trading targets: One group has fixed individual targets, and the other group wants to track a stochastically evolving target over time. Since parent targets are initially not public, information about parent demand imbalances is partially revealed through market-clearing stock prices. Our analysis models the equilibrium dynamic learning process, stock holdings, and stock-price processes.

Our main results are:

  • We construct and solve two different equilibrium models: A simpler price-friction equilibrium and a subgame perfect Nash financial-market equilibrium. In the price-friction equilibrium, price impact is due to an exogenous trading friction, but in the subgame Nash equilibrium, price impact includes both exogenous frictions and an endogenous price impact due to market clearing with constrained market asset-holding capacity. We find that these two equilibria are numerically similar.

  • Intraday price drifts due to price pressure change over the trading day and are path-dependent. This leads to time-varying incentives for investors to provide liquidity to the child orders of other investors.

  • A practical application of our model is that we can compute total trading costs for investors given the effects of dynamic learning and optimal trading by other investors. We show these costs are quadratic in the rebalancers’ trading targets.

  • Trading in our model reflects a combination of liquidity provision and speculation but not predatory trading. We conjecture that the absence of predatory trading is because our model replaces the exogenous price-elastic residual supply used in both Brunnermeier and Pedersen [9] and Carlin, Lobo, and Viswanathan [10] with endogenous demands coming from rational profit-maximizing investors.

Our paper advances several strands of research on market microstructure. First, dynamic learning and trading have been extensively studied in the context of markets with strategic investors with long-lived asymmetric information as in Kyle [29]. However, equilibrium trading, learning, and pricing with optimal dynamic order-splitting by large uninformed investors are less understood. Thus, we model price pressure to equate supply and demand rather than adverse selection. Second, Grossman and Miller [21] model pricing and liquidity provision with impatient traders who submit single orders equal to their parent demands and with symmetric payoff information. In contrast, we model liquidity provision with optimal order-splitting of parent demands into child order flows. Third, Choi, Larsen, and Seppi [12] construct an equilibrium with optimal dynamic trading and learning in a market with a strategic rebalancer with an end-of-day trading target and an informed investor who trades on private long-lived asset-payoff information. By filtering the order flow over time, the rebalancer learns about the underlying asset payoff, the informed investor learns about the rebalancer’s trading target, and market makers learn about both when setting prices. That earlier paper provides a characterization result for equilibrium and gives numerical examples but does not have an existence proof or analytic solutions. In contrast, our model is solved analytically and gives the equilibrium in closed form. Fourth, Brunnermeier and Pedersen [9] and Carlin, Lobo, and Viswanathan [10] show how dynamic rebalancing by a large investor can lead to predatory trading. However, these papers abstract from the learning problem by assuming the parent trading needs are publicly observable. They also make an ad hoc assumption about the price sensitivity of a residual market-maker trading demand due to exogenous price-elastic noise traders. 
In contrast, our model assumes the underlying parent trading demands are private information, which leads to a learning problem. In addition, our prices are rationally set with no ad hoc residual demand. Fifth, a large body of research models optimal order-splitting strategies for a single strategic investor given an exogenous pricing rule with no learning about latent trading demands of other investors (see, e.g., Almgren and Chriss [3, 4], Almgren [2], and Schied and Schöneborn [34]). In contrast, we solve for optimal trades, learning, and pricing jointly. van Kervel, Kwan, and Westerholm [25] solve for optimal trading strategies for two dynamic rebalancers with learning over time about each other’s latent trading demands. This leads to predictions about the effect of aggregate parent demand on individual investor child orders, which are then verified empirically. However, they assume an ad hoc linear pricing rule, and there are no existence proofs or analytic solutions. In contrast, price pressure in our Nash model is partly endogenously determined in equilibrium, and we solve our model analytically. As in van Kervel, Kwan, and Westerholm [25], trading in our model is a combination of speculation on expected future price changes and trading-demand accommodation.

The mathematics of our model is tractable because we use a modeling approach from the asset-pricing literature for non-dividend paying stocks. The simplification involves finding equilibrium price drifts that clear the market without determining the levels of market-clearing prices as discounted future cash flows. Karatzas and Shreve [27,  Chap. 4] use this approach in complete market settings, and Cuoco and He [14] consider an extension to incomplete markets. Atmaz and Basak [1] show that non-dividend paying stocks are relevant for asset pricing. However, the non-dividend paying stock approach is new in the mainstream microstructure literature. Gârleanu and Pedersen [20], Bouchard, Fukasawa, Herdegen, and Muhle-Karbe [7], and Noh and Weston [31] use the zero-dividend stock approach to model prices given exogenous transaction costs. We extend this approach to include learning and endogenous price impact.

2 Model

We model equilibrium trading, learning, and pricing in a market with a risky stock and a riskless bank account over a short time horizon [0, 1] (e.g., a trading day). For simplicity, the net supply of both the stock and bank account are set to zero. Since the time horizon is short, the risk-free interest rate on the bank account is set to zero. Stock differs from the bank account in two ways: First, investors have individual parent demands for the stock. Second, stock prices are stochastic over time. Stock valuation can be viewed as the sum of two components: One component is a fundamental valuation of future dividends absent price pressure from trading targets. The other component is incremental price pressure for markets to clear given parent trading demand imbalances. It is the price pressure component that is the focus of our analysis. Our analysis treats these two components as being orthogonal and, for simplicity, normalizes the dividend valuation component to zero. Thus, hereafter, when we refer to the “stock price”, this is shorthand for the “price pressure valuation component of stock prices.” Our prices are random due to random trading demand imbalances. In a more complicated model, a separate fundamental dividend valuation component could be added to our stock-price pressure valuation to get the full stock price.

Two different groups of investors trade in our equilibrium model.

  1. (i)

    Price-sensitive rebalancers. Rebalancer \(i\in \{1,...,M\}\) maximizes her expected profit subject to a parent trading target \(\tilde{a}_i\) where \(\tilde{a}_i\) is private information for i. The targets \((\tilde{a}_1,...,\tilde{a}_M)\) are assumed independent and homogeneously distributed \(\tilde{a}_i \sim \mathcal {N}(0,\sigma _{\tilde{a}}^2)\) for all rebalancers \(i\in \{1,...,M\}\) with identical zero means and standard deviations \(\sigma _{\tilde{a}}\). The aggregate target is

    $$\begin{aligned} \tilde{a}_\Sigma := \sum _{i=1}^M \tilde{a}_i. \end{aligned}$$
    (2.1)

    Rebalancer i’s control is her stock holdings, which are denoted by \((\theta _{i,t})_{t\in [0,1]}\) for \(i\in \{1,...,M\}\). For simplicity, the initial endowed holdings of both the bank account and the stock are normalized to zero for all rebalancers. When \(\tilde{a}_i\) is close to zero \((\tilde{a}_i\approx 0)\), rebalancer i is a “high-frequency” liquidity provider with inventory penalties. Because \(\tilde{a}_i\) is private information for i, other traders k, \(k\ne i\), do not know whether rebalancer i has an active latent trading demand \((|\tilde{a}_i|\gg 0)\) or is a liquidity provider \((\tilde{a}_i\approx 0)\).

  2. (ii)

    Price-sensitive trackers. Trackers \(j\in \{M+1,...,M+\bar{M}\}\) all track a dynamic target given by a common exogenous Brownian motion process \(w_t\) over time \(t\in [0,1]\)

    $$\begin{aligned} w_t := w_0 + w^\circ _t,\quad t\in (0,1], \end{aligned}$$
    (2.2)

    where the initial target is \(w_0 \sim {{\mathcal {N}}}(0,\sigma ^2_{w_0})\), and \(w^\circ _t\) is a standard Brownian motion that starts at zero, has a zero drift, and a unit volatility.Footnote 2 While trackers observe the same \(w_t\) at time \(t\in [0,1]\), rebalancers do not and instead filter \(w_t\) over time \(t\in [0,1]\). Tracker j’s control is her stock holdings, which are denoted by \((\theta _{j,t})_{t\in [0,1]}\) for \(j\in \{M+1,...,M+\bar{M}\}\). Their initial stock and money market holdings are also normalized to zero. We assume the random variables \((\tilde{a}_1,...,\tilde{a}_M)\), \(w_0\), and \((w^\circ _t)_{t\in [0,1]}\) are all independent.

van Kervel, Kwan, and Westerholm [25] show that interactions between multiple heterogeneous investors are an empirically important part of the trading process. Our model with \(M \ge 1\) and \({\bar{M}} \ge 1\) lets us analyze such trading interactions. In the following, index \(k\in \{1,...,M+{\bar{M}}\}\) denotes any generic trader, index \(i\in \{1,...,M\}\) denotes a rebalancer, and index \(j\in \{M+1,...,M+{\bar{M}}\}\) denotes a tracker. This allows us to express the stock-market clearing condition as

$$\begin{aligned} 0= \sum _{k=1}^{M+\bar{M}} \theta _{k,t}= \underbrace{\sum _{i=1}^M \theta _{i,t}}_\text {rebalancer demand}+\underbrace{\sum _{j=M+1}^{M+\bar{M}} \theta _{j,t}}_\text {tracker demand},\quad t\in [0,1]. \end{aligned}$$
(2.3)

Investor stock demands change over time due to stochastic shocks to the tracker target \(w_t\) and due to randomness in imperfect learning about the rebalancer targets. As a result, the stock-price process that clears the market as in (2.3) changes randomly over time. Thus, stock randomness in our model — given that the fundamental dividend valuation is normalized to zero — comes from learning about traders’ parent targets (which are initially private information of the individual rebalancers and the trackers) and from random changes over time in the trackers’ target \(w_t\).Footnote 3

Investor information is represented as generic filtrations \({\mathcal F}_{i,t}\) and \({\mathcal F}_{j,t}\) for rebalancers and trackers. These filtrations are constructed explicitly in the equilibria considered below. In the price-friction equilibrium in Sect. 3, the filtrations \(\mathcal {F}_{i,t}\) and \(\mathcal {F}_{j,t}\) are

$$\begin{aligned} \begin{aligned}&\sigma (\tilde{a}_i,S_{i,u})_{u\in [0,t]},\quad t\in [0,1],\quad i\in \{1,...,M\},\\&\sigma (w_u,S_{j,u})_{u\in [0,t]},\quad t\in [0,1],\quad j\in \{M+1,...,M+\bar{M}\}, \end{aligned} \end{aligned}$$
(2.4)

where \(S_{i,t}\) and \(S_{j,t}\) denote perceived stock-price processes for a rebalancer i and a tracker j. In the Nash equilibrium in Sect. 4, more complicated filtrations are needed to derive traders’ optimal off-equilibrium response functions.

Our model is a model of dynamic learning. As we shall see, trackers infer the aggregate target \(\tilde{a}_\Sigma \) in (2.1) from the initial stock price, and so trackers have no need to filter the rebalancers’ individual targets \((\tilde{a}_1, ..., \tilde{a}_M)\). The situation is different for each rebalancer \(i\in \{1,...,M\}\), who only observes her own target \(\tilde{a}_i\) and past and current stock prices. When \(\sigma _{w_0} >0\), these observations are insufficient to infer \(\tilde{a}_\Sigma \) and \(w_t\) separately, so rebalancer i filters based on \(\tilde{a}_i\) and on past and current stock-price observations to learn about the underlying latent parent demands \(\tilde{a}_\Sigma \) and \(w_t\). In contrast, when \(\sigma _{w_0}:=0\), the model only has static learning about \(\tilde{a}_\Sigma \) at time \(t=0\) from the initial stock price. At later times \(t\in (0,1]\), the rebalancers can infer \(w_t\) from their stock-price observations. The static learning model with \(\sigma _{w_0}:=0\) was developed in Choi, Larsen, and Seppi [13].

2.1 Individual maximization problems

This section introduces the individual maximization problems. A generic trader k’s optimal stock holdings are determined in terms of a trade-off between expected terminal wealth \(X_{k,1}\) and a penalty for deviations of their holdings \(\theta _{k,t}\) over time from their parent target \(\tilde{a}_i\) (rebalancers) or Brownian motion \(w_t\) (trackers). An investor’s terminal wealth \(X_{k,1}\) depends on the stock prices \(S_{k,t}\) associated with k’s holdings \(\theta _{k,t}\) over time. An exogenous continuous (deterministic) function \(\kappa :[0,1]\rightarrow [0,\infty ]\) models the severity of the target penalty over time.Footnote 4 For example, more severe target penalties later in the day would be associated with a penalty severity function \(\kappa \) that is increasing with time t. The rebalancer and tracker objectives are

$$\begin{aligned} \begin{aligned}&\sup _{\theta _{i,t}\in \mathcal {F}_{i,t}} \mathbb {E}\Big [ X_{i,1} - \int _0^1 \kappa (t)(\tilde{a}_i-\theta _{i,t})^2dt\Big |\,\mathcal {F}_{i,0}\Big ],\quad i\in \{1,...,M\},\\&\sup _{\theta _{j,t}\in \mathcal {F}_{j,t}} \mathbb {E}\Big [ X_{j,1} - \int _0^1 \kappa (t)(w_t-\theta _{j,t})^2dt\Big |\,\mathcal {F}_{j,0}\Big ],\quad j\in \{M+1,...,M+\bar{M}\}, \end{aligned} \end{aligned}$$
(2.5)

where \(\tilde{a}_i\) is the ideal holdings for rebalancer i and \(w_t\) is the ideal holdings for tracker j at time \(t\in [0,1]\). However, stock-market clearing prevents \(\theta _{i,t}\) and \(\theta _{j,t}\) from being \(\tilde{a}_i\) and \(w_t\). The suprema in (2.5) are taken over progressively measurable holding processes \(\theta _{i,t}\) and \(\theta _{j,t}\) with respect to traders’ filtrations \(\mathcal {F}_{i,t}\) and \(\mathcal {F}_{j,t}\). As we shall see in Sects. 3 and 4 below, our traders optimally use controls given as smooth functions evaluated at a finite set of state processes (i.e., Markov controls). The next section constructs such a set of Markovian state processes. To rule out doubling strategies, we require square integrability

$$\begin{aligned} \mathbb {E}\left[ \int _0^1 \theta _{k,t}^2 dt \right] <\infty ,\quad k\in \{1,...,M+\bar{M}\}. \end{aligned}$$
(2.6)

Terminal wealth \(X_{k,1}\) in (2.5) is generated by trader k’s perceived wealth process

$$\begin{aligned} dX_{k,t}&:= \theta _{k,t}dS_{k,t},\quad X_{k,0} := 0,\quad k\in \{1,...,M+\bar{M}\}, \end{aligned}$$
(2.7)

which is affected by k’s holdings \(\theta _{k,t}\) both directly and also indirectly via the impact of k’s holdings on an associated perceived stock-price process \(S_{k,t}\). Trader k’s holdings \(\theta _{k,t}\) are price sensitive because market-clearing price pressure affects price drifts and, thus, investor wealth. In (2.7), the zero initial wealth \(X_{k,0}=0\) reflects the normalization of trader k’s initial endowed money market and stock holdings to zero. Thus, \(\tilde{a}_i\) and \(w_t\) are ideal holding changes relative to investors’ normalized initial zero holdings. Given the objectives in (2.5), trading reflects a combination of motives: Investors seek to have stock holdings close to their own targets \(\tilde{a}_i\) and \(w_t\), but they also seek to increase their expected terminal wealth by trading on price pressure from other investors trading on their targets. Thus, traders demand liquidity (to come close to their targets), supply liquidity for markets to clear (by being willing to deviate from their targets so that other traders can trade towards their targets, given the appropriate price incentives), and speculate on future predictable price pressure.
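
As a rough illustration of how the wealth dynamics (2.7) and the penalty in (2.5) combine, the following sketch evaluates a discretized version of one trader's realized payoff along a single path. The function name `realized_objective`, the uniform time grid, and the left-point sums are assumptions made here for illustration and are not part of the model. The objective in (2.5) is the conditional expectation of this quantity, which the sketch does not compute.

```python
def realized_objective(theta, S, target, kappa, dt):
    """Discretized realized payoff X_1 - int_0^1 kappa(t) (target_t - theta_t)^2 dt.

    theta  : holdings theta_{t_n} on the grid t_n = n * dt, n = 0..N-1
    S      : perceived prices S_{t_n}, n = 0..N (one more point than theta)
    target : targets on the same grid as theta (the constant a_i for a
             rebalancer, w_{t_n} for a tracker)
    kappa  : callable t -> penalty severity
    """
    if len(S) != len(theta) + 1 or len(target) != len(theta):
        raise ValueError("grid sizes do not match")
    wealth = 0.0    # X_0 = 0 as in (2.7)
    penalty = 0.0
    for n, th in enumerate(theta):
        wealth += th * (S[n + 1] - S[n])   # theta dS as a left-point (Ito) sum
        penalty += kappa(n * dt) * (target[n] - th) ** 2 * dt
    return wealth - penalty
```

For example, holding exactly the target makes the penalty vanish, so the realized payoff reduces to the trading gains \(\sum _n \theta _{t_n}(S_{t_{n+1}}-S_{t_n})\).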

Our remaining model construction involves specifying investor stock-price perceptions \(S_{i,t}\) and \(S_{j,t}\) and the associated investor filtrations \({\mathcal F}_{i,t}\) and \({\mathcal F}_{j,t}\). We then state conditions that these perceptions and filtrations must satisfy in equilibrium. Finally, we give theoretical results that ensure equilibria exist.

2.2 State processes

The fundamental underlying state of the market in our model depends on the aggregate parent demand imbalances \(\tilde{a}_\Sigma \) and \(\bar{M} w_t\). As already noted, there is a significant informational difference between trackers and rebalancers. Each tracker directly observes \(w_t\) in (2.2) and — as we shall see — can therefore infer the aggregate rebalancer target \(\tilde{a}_\Sigma \) in (2.1) from the initial stock price. In contrast, rebalancers learn about \(w_t\) and \(\tilde{a}_\Sigma \) using dynamic filtering. Thus, the rebalancer filtrations \(\mathcal {F}_{i,t}\), \(i\in \{1,...,M\}\), and tracker filtrations \(\mathcal {F}_{j,t}\), \(j\in \{M+1,...,M+{\bar{M}}\}\), are not nested. Rebalancers know prices and their individual target \(\tilde{a}_i\), whereas trackers know \(\tilde{a}_\Sigma \), \(w_t\), and prices.

Before considering specific stock-price perceptions in Sects. 3 and 4 below, we describe a set of conjectured state processes \((Y_t,\eta _t,q_{i,t},w_{i,t})\) for rebalancer \(i\in \{1,...,M\}\). These processes are all endogenous in the equilibria we construct. However, it is convenient to describe the state processes’ informational properties first, before showing how they arise in equilibrium. The processes \((Y_t,\eta _t)\) are public in that they are adapted to \(\mathcal {F}_{k,t}\) for all traders \(k\in \{1,...,M+\bar{M}\}\). Furthermore, \(\eta _t\) will be adapted to \(\sigma (Y_u)_{u\in [0,t]}\). The state processes \((q_{i,t},w_{i,t})\) are specific to individual rebalancers. They are adapted to i’s filtration \(\mathcal {F}_{i,t}\), but they are not adapted to other traders’ filtrations \(\mathcal {F}_{k,t}\) for \(k\ne i\).

Rebalancers learn by extracting information about aggregate demand imbalances from stock prices. In the equilibria we construct, the information extracted from stock prices over time t is a state process \(Y_t\), which has the form

$$\begin{aligned} Y_t:=w_t - B(t)\tilde{a}_\Sigma ,\quad t\in [0,1], \end{aligned}$$
(2.8)

where \(B:[0,1]\rightarrow \mathbb {R}\) is a smooth deterministic function of time that is endogenously determined in equilibrium. The function B(t) controls how \(\tilde{a}_\Sigma \) and \(w_t\) are mixed in stock prices. The process \(Y_t\) is not directly observable for the rebalancers, but Lemma 3.1 below shows that \(Y_t\) can be inferred from stock prices. Because rebalancer \(i\in \{1,...,M\}\) also knows her own target \(\tilde{a}_i\), by knowing \(Y_t\) over time \(t\in [0,1]\), she equivalently knows

$$\begin{aligned} \begin{aligned} Y_{i,t}:&= Y_t + B(t)\tilde{a}_i\\&= w_t - B(t)(\tilde{a}_\Sigma -\tilde{a}_i). \end{aligned} \end{aligned}$$
(2.9)

Unlike \(Y_t\) in (2.8), the process \(Y_{i,t}\) is independent of rebalancer i’s private trading target \(\tilde{a}_i\) and satisfies

$$\begin{aligned} \sigma (\tilde{a}_i,Y_u)_{u\in [0,t]} = \sigma (\tilde{a}_i,Y_{i,u})_{u\in [0,t]},\quad t\in [0,1]. \end{aligned}$$
(2.10)
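
The difference between \(Y_t\) and \(Y_{i,t}\) can be checked directly from the definitions. The short computation below is a sketch that uses only (2.8), (2.9), the independence and zero-mean assumptions of Sect. 2, and \(\mathbb {V}[\tilde{a}_i]=\sigma ^2_{\tilde{a}}\):

$$\begin{aligned} \begin{aligned} \operatorname {Cov}(Y_t,\tilde{a}_i)&= -B(t)\operatorname {Cov}(\tilde{a}_\Sigma ,\tilde{a}_i) = -B(t)\sigma ^2_{\tilde{a}},\\ \operatorname {Cov}(Y_{i,t},\tilde{a}_i)&= -B(t)\operatorname {Cov}\Big (\sum _{k\ne i}\tilde{a}_k,\tilde{a}_i\Big ) = 0. \end{aligned} \end{aligned}$$

Since \(Y_{i,t}\) is a function of \(w_t\) and \((\tilde{a}_k)_{k\ne i}\) only, it is in fact independent of \(\tilde{a}_i\) and not merely uncorrelated with it. The equality (2.10) holds because \(Y_t\) and \(Y_{i,t}\) differ by the term \(B(t)\tilde{a}_i\), which is known once \(\tilde{a}_i\) is known.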

Rebalancers use knowledge of \(Y_t\) to estimate \(\tilde{a}_\Sigma \) and \(w_t\) from stock prices at time t. For a continuously differentiable function \(B:[0,1]\rightarrow \mathbb {R}\), we define two processes

$$\begin{aligned} \begin{aligned} q_{i,t}&:= \mathbb {E}\left[ \tilde{a}_\Sigma -\tilde{a}_i\,\Big |\, \sigma (Y_{i,u})_{u\in [0,t]}\right] ,\\ dw_{i,t}&:=dw_t-B'(t)\big (\tilde{a}_\Sigma -\tilde{a}_i - q_{i,t} \big )dt,\quad w_{i,0} := Y_{i,0}, \end{aligned} \end{aligned}$$
(2.11)

for each rebalancer \(i \in \{1,...,M\}\) and \( t\in [0,1]\). The expectation \(q_{i,t}\) describes what rebalancer i has learned up through time t about the aggregate target \(\tilde{a}_\Sigma -\tilde{a}_i\) of the other rebalancers.Footnote 5 In particular, \(q_{i,t}\) is a path-dependent process because it depends on the path of \(Y_{i,s}\) over time \(s\in [0,t]\).

Let the function \(\Sigma (t)\) denote the remaining variance

$$\begin{aligned} \Sigma (t):= \mathbb {V}[\tilde{a}_\Sigma -\tilde{a}_i -q_{i,t}]=\mathbb {E}[(\tilde{a}_\Sigma -\tilde{a}_i -q_{i,t})^2],\quad t\in [0,1], \end{aligned}$$
(2.12)

where the second equality follows from the zero-mean assumptions for \((\tilde{a}_1,...,\tilde{a}_M)\) and \(w_0\). Because the targets \((\tilde{a}_1,...,\tilde{a}_M)\) are assumed independent and homogeneously distributed \({{\mathcal {N}}}(0,\sigma ^2_{\tilde{a}})\), the initial variance \(\Sigma (0)=\mathbb {E}[(\tilde{a}_\Sigma -\tilde{a}_i -q_{i,0})^2]\) is identical across all rebalancers \(i\in \{1,...,M\}\). This property and the formula for \(\Sigma (t)\) in (2.15) below imply that \(\Sigma (t)\) is also identical across all rebalancers \(i\in \{1,...,M\}\) for all \(t\in [0,1]\).
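
For concreteness, the initial estimate and the initial variance can be written out with the standard formula for conditioning jointly Gaussian variables. This is a sketch based on (2.9) at \(t=0\), the independence assumptions of Sect. 2, and \(\mathbb {V}[\tilde{a}_\Sigma -\tilde{a}_i]=(M-1)\sigma ^2_{\tilde{a}}\); it assumes the denominator is strictly positive:

$$\begin{aligned} \begin{aligned} q_{i,0}&= \frac{\operatorname {Cov}(\tilde{a}_\Sigma -\tilde{a}_i,Y_{i,0})}{\mathbb {V}[Y_{i,0}]}\,Y_{i,0} = -\frac{(M-1)B(0)\sigma ^2_{\tilde{a}}}{\sigma ^2_{w_0}+(M-1)B(0)^2\sigma ^2_{\tilde{a}}}\,Y_{i,0},\\ \Sigma (0)&= (M-1)\sigma ^2_{\tilde{a}} - \frac{(M-1)^2B(0)^2\sigma ^4_{\tilde{a}}}{\sigma ^2_{w_0}+(M-1)B(0)^2\sigma ^2_{\tilde{a}}} = \frac{(M-1)\sigma ^2_{\tilde{a}}\,\sigma ^2_{w_0}}{\sigma ^2_{w_0}+(M-1)B(0)^2\sigma ^2_{\tilde{a}}}. \end{aligned} \end{aligned}$$

Neither expression depends on the index i beyond the observation \(Y_{i,0}\) itself. When \(\sigma _{w_0}=0\) and \(B(0)\ne 0\), the second line gives \(\Sigma (0)=0\), consistent with the static-learning case in which the initial price fully reveals the other rebalancers' aggregate target.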

Now consider the \(w_{i,t}\) processes. Eqs. (2.9) and (2.11) give the dynamics of \(Y_{i,t}\) as

$$\begin{aligned} \begin{aligned} dY_{i,t}&= dw_t - B'(t)(\tilde{a}_\Sigma -\tilde{a}_i)dt\\&=dw_{i,t} - B'(t)q_{i,t} dt. \end{aligned} \end{aligned}$$
(2.13)

The following result is a special case of the Kalman-Bucy result from filtering theory (see Appendix B for details).

Lemma 2.1

(Kalman-Bucy) For a continuously differentiable function \(B:[0,1]\rightarrow \mathbb {R}\), the process \(w_{i,t}\) is independent of \(\tilde{a}_i\), is a Brownian motion, and satisfies (modulo \(\mathbb {P}\) null sets)

$$\begin{aligned} \sigma (\tilde{a}_i,Y_{i,u})_{u\in [0,t]}=\sigma (\tilde{a}_i,w_{i,u})_{u\in [0,t]},\quad t\in [0,1]. \end{aligned}$$
(2.14)

Furthermore, the remaining variance at time t is given by

$$\begin{aligned} \begin{aligned} \Sigma (t)&= \frac{1}{\frac{1}{\mathbb {V}[\tilde{a}_\Sigma -\tilde{a}_i -q_{i,0}]}+\int _0^t \big (B'(u)\big )^2du},\quad t\in [0,1]. \end{aligned} \end{aligned}$$
(2.15)

\(\diamondsuit \)

Because the process \(w_{i,t}\) is independent of \(\tilde{a}_i\), \(w_{i,t}\) is also a Brownian motion with respect to the filtration \(\sigma (\tilde{a}_i,w_{i,u})_{u\in [0,t]}\). Furthermore, Lemma 2.1 shows that \((\tilde{a}_i,w_{i,t})\) are informationally equivalent with \((\tilde{a}_i,Y_{i,t})\) in the sense that (2.14) holds. However, while \(w_{i,t}\) on the left in (2.11) is observable by rebalancers, the individual terms \(w_t\) and \(\tilde{a}_\Sigma \) in \(w_{i,t}\)’s decomposition on the right of (2.11) are not.
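
To illustrate Lemma 2.1 numerically, the following Monte Carlo sketch simulates the filter for one rebalancer and compares the empirical mean squared error of \(q_{i,1}\) with \(\Sigma (1)\) from (2.15). Everything specific here is an assumption made for illustration: the choice \(B(t)=-(1+t)\), the parameter values, the function name, and the Euler discretization. The update \(dq_{i,t}=-B'(t)\Sigma (t)dw_{i,t}\) is the standard Kalman-Bucy form for a constant Gaussian signal observed through \(dY_{i,t}\) in (2.13); we assume it agrees with the derivation in Appendix B.

```python
import math
import random


def simulate_filter_error(M=5, sigma_a=1.0, sigma_w0=1.0,
                          n_steps=200, n_paths=4000, seed=0):
    """Monte Carlo estimate of E[(X - q_{i,1})^2] with X = a_Sigma - a_i.

    Hypothetical choice B(t) = -(1 + t), so B(0) = -1 and B'(t) = -1.
    Returns (empirical mean squared error, Sigma(1) from (2.15)).
    """
    rng = random.Random(seed)
    dt = 1.0 / n_steps
    B0, dB = -1.0, -1.0
    var_x = (M - 1) * sigma_a ** 2             # variance of a_Sigma - a_i
    denom = sigma_w0 ** 2 + B0 ** 2 * var_x    # variance of Y_{i,0}
    Sigma0 = var_x * sigma_w0 ** 2 / denom     # Gaussian conditioning at t = 0
    gain0 = -B0 * var_x / denom                # q_{i,0} = gain0 * Y_{i,0}
    sq_err = 0.0
    integral = 0.0
    for _path in range(n_paths):
        x = rng.gauss(0.0, math.sqrt(var_x))   # a_Sigma - a_i
        w0 = rng.gauss(0.0, sigma_w0)
        q = gain0 * (w0 - B0 * x)              # initial estimate from Y_{i,0}
        integral = 0.0                         # int_0^t B'(u)^2 du
        for _step in range(n_steps):
            Sigma = 1.0 / (1.0 / Sigma0 + integral)   # (2.15)
            dw = rng.gauss(0.0, math.sqrt(dt))
            dY = dw - dB * x * dt              # first line of (2.13)
            dw_i = dY + dB * q * dt            # innovation, second line of (2.13)
            q += -dB * Sigma * dw_i            # Kalman-Bucy update (assumed form)
            integral += dB ** 2 * dt
        sq_err += (x - q) ** 2
    Sigma1 = 1.0 / (1.0 / Sigma0 + integral)
    return sq_err / n_paths, Sigma1
```

Note that the filter only uses \(dY_{i,t}\) and its own past estimate, never \(x\) or \(dw_t\) separately, which mirrors the observability statement above. With these hypothetical inputs the two returned numbers should agree up to Monte Carlo and discretization error.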

The stock-market clearing condition (2.3) lets us relate prices to the state processes driving investor demands. The sum \(\sum _{i=1}^M q_{i,t}\) is an important term in this relation, so the following decomposition results are useful:

Lemma 2.2

Let \(B:[0,1]\rightarrow \mathbb {R}\) be a continuously differentiable function.

  1. The decomposition

    $$\begin{aligned} \sum _{i=1}^Mq_{i,t} = \eta _t + A(t) \tilde{a}_\Sigma ,\quad t\in [0,1], \end{aligned}$$
    (2.16)

    holds with the process \(\eta _t\) being adapted to \(\sigma (Y_u)_{u\in [0,t]}\) with \(Y_t\) in (2.8) and

    $$\begin{aligned} \begin{aligned} A'(t)&= - \big (B'(t)\big )^2\Sigma (t)\big (A(t) +1\big ),\quad A(0)=-\tfrac{(M-1)B(0)^2\sigma ^2_{\tilde{a}}}{\sigma ^2_{w_0} +(M-1)B(0)^2\sigma ^2_{\tilde{a}}},\\ d\eta _t&= - \big (B'(t)\big )^2\Sigma (t)\eta _tdt- MB'(t)\Sigma (t)dY_t,\quad \eta _0 =-\tfrac{M(M-1)B(0)\sigma ^2_{\tilde{a}}}{\sigma ^2_{w_0} +(M-1)B(0)^2\sigma ^2_{\tilde{a}}}Y_0. \end{aligned} \end{aligned}$$
    (2.17)
  2. The inverse relation

    $$\begin{aligned} q_{i,t}&= \frac{\eta _t}{M} - F_1(t)\left( \tfrac{(M-1)B(0)^2 \sigma ^2_{\tilde{a}}}{\sigma _{w_0}^2 + (M-1)B(0)^2\sigma ^2_{\tilde{a}}} +F_2(t) \right) \tilde{a}_i \end{aligned}$$
    (2.18)

    holds with deterministic functions \(F_1(t)\) and \(F_2(t)\) given by the ODEs

    $$\begin{aligned} \begin{aligned} F_1'(t)&=-B'(t)^2 \Sigma (t) F_1(t), \quad F_1(0)=1,\\ F_2'(t)&=\tfrac{B'(t)^2 \Sigma (t)}{F_1(t)}, \quad F_2(0)=0. \end{aligned} \end{aligned}$$
    (2.19)

\(\diamondsuit \)

There are two key points: First, no investor knows \(\sum _{i=1}^Mq_{i,t}\), but it can be decomposed into a public term \(\eta _t\) and a term \(A(t)\tilde{a}_\Sigma \) that trackers know but rebalancers do not. Second, from (2.17), the process \(\eta _t\) depends on the path of \(Y_s\) over time \(s\in [0,t]\). Thus, the state process \(\eta _t\) reflects common path dependence due to \(w_t\). The expression (2.18) shows that the individual rebalancer expectation \(q_{i,t}\) includes a common learning component \(\frac{\eta _t}{M}\) and then the effect of i’s private information \(\tilde{a}_i\). In particular, it follows from (2.19) that \(F_1(t)\) is positive and \(F_2(t)\) is nonnegative so that, consistent with intuition, the loading on \(\tilde{a}_i\) is negative in (2.18).
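
The deterministic functions in Lemma 2.2 are straightforward to compute once \(B\) is given. The sketch below integrates the ODEs in (2.17) and (2.19) with a simple Euler scheme. The function name, the step count, and the inputs used in the usage note are hypothetical, and the code assumes \(M>1\) and \(\sigma _{w_0}>0\) so that \(\Sigma (0)>0\). As a consistency check, our own calculation from (2.15), (2.17), and (2.19), which is not stated in the lemma, suggests the closed forms \(F_1(t)=\Sigma (t)/\Sigma (0)\), \(A(t)+1=(A(0)+1)F_1(t)\), and \(F_2(t)=\Sigma (0)\int _0^t B'(u)^2du\).

```python
def solve_lemma_2_2(M, sigma_a, sigma_w0, B0, dB, n_steps=1000):
    """Euler integration on [0, 1] of the ODEs in (2.17) and (2.19).

    B0 is B(0) and dB is a callable t -> B'(t).
    Returns A(1), F1(1), F2(1), Sigma(0), Sigma(1).
    """
    var_x = (M - 1) * sigma_a ** 2              # variance of a_Sigma - a_i
    denom = sigma_w0 ** 2 + B0 ** 2 * var_x
    Sigma0 = var_x * sigma_w0 ** 2 / denom      # Gaussian conditioning at t = 0
    A = -B0 ** 2 * var_x / denom                # A(0) from (2.17)
    F1, F2, integral = 1.0, 0.0, 0.0            # initial values from (2.19)
    dt = 1.0 / n_steps
    for n in range(n_steps):
        b2 = dB(n * dt) ** 2
        Sigma = 1.0 / (1.0 / Sigma0 + integral)  # (2.15)
        rate = b2 * Sigma
        # Simultaneous update: the right-hand side uses the old A and F1.
        A, F1, F2 = (A - rate * (A + 1.0) * dt,
                     F1 - rate * F1 * dt,
                     F2 + rate / F1 * dt)
        integral += b2 * dt
    Sigma1 = 1.0 / (1.0 / Sigma0 + integral)
    return A, F1, F2, Sigma0, Sigma1
```

For instance, calling `solve_lemma_2_2(5, 1.0, 1.0, -1.0, lambda t: -1.0)` uses the hypothetical choice \(B(t)=-(1+t)\); the returned \(F_1(1)\) can then be compared with \(\Sigma (1)/\Sigma (0)\) up to discretization error.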

3 Price-friction equilibrium

Investor perceptions of the impact of their trading on stock prices are a key part of the optimizations in (2.5) and the resulting market equilibrium. We consider two specifications of investor stock-price perceptions. This section presents a simplified model in which perceived price impact is a fully exogenous trading friction. This approach is analogous to the exogenous price impact used in van Kervel, Kwan, and Westerholm [25]. We then solve for the endogenous stock-price process that clears the market (and also satisfies some weak consistency conditions) and the associated optimized investor-holding processes. Sect. 4 presents a richer model of price impact in which investor stock-price perceptions are partially endogenized in a subgame perfect Nash financial-market equilibrium.

Our equilibrium construction is a conjecture-and-verify analysis. Section 3.1 conjectures functional forms for investor perceptions of stock-price dynamics. Section 3.2 defines equilibrium and then solves for equilibrium price-perception coefficients and the associated price dynamics and holdings that satisfy the definition of equilibrium.

3.1 Stock-price perceptions

Recall that price pressure is different from the value of future dividends. It is a valuation adjustment needed to clear the stock market given trading demand imbalances. This allows us to model price pressure as zero-dividend asset prices as in, e.g., Karatzas and Shreve [27,  Chap. 4].

Rebalancers optimize (2.5) with respect to perceived stock-price processes of the form

$$\begin{aligned} \begin{aligned} dS^f_{i,t}&:= \Big \{f_0(t)Y_t +f_1(t)\tilde{a}_i +f_2(t)q_{i,t}+f_3(t)\eta _t+ \alpha \theta _{i,t}\Big \}dt+ \gamma dw_{i,t}, \\ S^f_{i,0}&:=Y_0,\quad i\in \{1,...,M\}, \end{aligned} \end{aligned}$$
(3.1)

where \(f_0,f_1,f_2,f_3:[0,1]\rightarrow \mathbb {R}\) are continuous (deterministic) functions of time \(t\in [0,1]\) and \((\alpha ,\gamma )\) are constants. The “f” superscript indicates that the perceived price \(S^f_{i,t}\) is defined with respect to a particular set of coefficient functions f in (3.1). The stock-price drift in (3.1) is perceived by rebalancer i to be affine in a set of state processes. Consistent with intuition, we will see that in equilibrium the loadings \(f_0(t)\) and \(f_3(t)\) on \(Y_t\) and \(\eta _t\) are negative. In particular, \(Y_t\) with \(B(t) < 0\) measures a mix of aggregate demand from rebalancers and trackers, and \(\eta _t\) reflects public expectations of aggregate private rebalancer expectations about other rebalancers’ parent-demand imbalances, both of which depress price change expectations. The other coefficients describe the perceived impact of rebalancer i on the stock-price drift.

Theorem 3.5 below endogenously determines \((f_0,f_1,f_2,f_3)\) in equilibrium. The exogenous parameters \((\alpha ,\gamma )\) can be found by calibrating model output to empirical data. The term \(\alpha \theta _{i,t}\) allows for ad hoc trading frictions. The price-friction parameter \(\alpha \) is an exogenous model input. Price taking is a special case with \(\alpha :=0\), whereas the empirically relevant case is \(\alpha <0\) such that buy (sell) orders decrease (increase) the future stock-price drifts.

The innovations in the rebalancers’ perceived stock prices \(dw_{i,t}\) come from new information rebalancer i learns over time about the underlying parent-demand state variable \(Y_t\), which has both a direct effect on the future stock-price drift and an additional indirect effect via its impact on \(\eta _t\) since \(\eta _t\) is adapted to \(\sigma (Y_u)_{u\in [0,t]}\) from Lemma 2.2.

The zero-dividend stock valuation approach (see, e.g., Chapter 4 in Karatzas and Shreve [27]) has several consequences: First, we model perceived and equilibrium stock-price drifts rather than price levels. Second, in (3.1), the stock’s volatility and initial value are not determined in equilibrium but rather are model inputs. For simplicity, we set the volatility to be a constant \(\gamma > 0\) (i.e., positive demand innovations \(dw_{i,t}\) increase prices), and the initial price is set to be \(Y_0\) in (3.1). However, other choices of \(S_0\) would work equally well as long as \(S_0\) satisfies \(\sigma (S_0) = \sigma (Y_0)\).
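
To make the structure of (3.1) concrete, the following sketch builds a discretized perceived price path for a rebalancer from given state paths. The function name, the Euler scheme, and the grid are assumptions made for illustration. The coefficient functions passed in are placeholders supplied by the user; in the model they are determined in equilibrium by Theorem 3.5, which this sketch does not implement.

```python
def perceived_price_path(Y, a_i, q, eta, theta, dw_i, f, alpha, gamma, dt):
    """Euler discretization of the perceived price dynamics (3.1).

    Y, q, eta, theta, dw_i : lists of length N with values on t_n = n * dt
                             (dw_i holds the innovation increments)
    a_i                    : the rebalancer's private target
    f                      : tuple of four callables (f0, f1, f2, f3), t -> float
                             (placeholders, not equilibrium values)
    Returns the list S_{t_0}, ..., S_{t_N}.
    """
    f0, f1, f2, f3 = f
    S = [Y[0]]                                  # initial price S_0 := Y_0
    for n in range(len(dw_i)):
        t = n * dt
        drift = (f0(t) * Y[n] + f1(t) * a_i + f2(t) * q[n]
                 + f3(t) * eta[n] + alpha * theta[n])
        S.append(S[-1] + drift * dt + gamma * dw_i[n])
    return S
```

Because the drift is affine in the state processes and in \(\theta _{i,t}\), a negative \(\alpha \) lowers the perceived drift whenever the rebalancer holds a long position, which is how the exogenous friction enters her optimization.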

The next result shows that \(w_{i,t}\) is rebalancer i’s innovations process in the sense that \(w_{i,t}\) is a Brownian motion relative to i’s filtration defined with perceived stock prices \(S^f_{i,t}\) in (3.1) and such that \(S^f_{i,t}\) and \(w_{i,t}\) generate the same information.

Lemma 3.1

Let \(f_0,f_1,f_2,f_3:[0,1]\rightarrow \mathbb {R}\) be continuous functions and let \(B:[0,1]\rightarrow \mathbb {R}\) be a continuously differentiable function. For a rebalancer \(i\in \{1,...,M\}\), let \(\theta _{i,t}\) satisfy (2.6) and be progressively measurable with respect to \(\mathcal {F}_{i,t}:=\sigma (\tilde{a}_i,S^f_{i,u})_{u\in [0,t]}\) with \(S^f_{i,t}\) defined in (3.1) and \(Y_t\) defined in (2.8). Then, modulo \(\mathbb {P}\)-null sets, we have

$$\begin{aligned} \sigma (\tilde{a}_i, w_{i,u})_{u\in [0,t]} = \sigma (\tilde{a}_i,S^f_{i,u})_{u\in [0,t]} ,\quad t\in [0,1],\quad i\in \{1,...,M\}. \end{aligned}$$
(3.2)

\(\diamondsuit \)

Thus, given a path of perceived prices generated by a price process \(S^f_{i,t}\) of the form in (3.1) and her personal target \(\tilde{a}_i\), rebalancer i can infer the path of \(w_{i,t}\). Furthermore, given the path \(w_{i,t}\), rebalancer i can infer \(Y_{i,t}\) using (2.14) and, thus, can infer \(Y_t\) from (2.10). Consequently, rebalancer i can infer \((q_{i,t},\eta _t)\), where we recall from Lemma 2.2 that \(\eta _t\) is adapted to \(\sigma (Y_u)_{u\in [0,t]}\).

Trackers optimize (2.5) with respect to a perceived stock-price process of the form

$$\begin{aligned} \begin{aligned} dS^{{\bar{f}}}_{j,t} :&= \Big \{\bar{f}_3(t)\eta _t+\bar{f}_4(t)\tilde{a}_\Sigma +\bar{f}_5(t)w_t+ \alpha \theta _{j,t}\Big \}dt + \gamma dw_t,\\ S^{{\bar{f}}}_{j,0}:&=Y_0,\quad j\in \{M+1,...,M+\bar{M}\}, \end{aligned} \end{aligned}$$
(3.3)

where \(\bar{f}_3,\bar{f}_4,\bar{f}_5:[0,1]\rightarrow \mathbb {R}\) are continuous (deterministic) functions, and \(\alpha \) is a constant.Footnote 6 Trackers have different information in that they observe \(w_t\) directly and can infer \(\tilde{a}_\Sigma \) from the initial stock price \(Y_0\) using (2.8) and their knowledge of \(w_0\). Therefore, their perceived stock prices differ from those of the rebalancers. Theorem 3.5 below endogenously determines \((\bar{f}_3,\bar{f}_4,\bar{f}_5)\) in equilibrium, and \((\alpha ,\gamma )\) are exogenous model inputs. Again, \(\alpha :=0\) is the special case of price-taking.

The motivation for these price perceptions for the trackers is as follows. First, the perceptions in (3.3) allow trackers to condition their perceived price drift to take into account price pressure from target imbalances \(\tilde{a}_\Sigma \) and \(w_t\) that depress expected price changes. Since trackers and rebalancers trade differently on their targets, the price-drift impacts \(\bar{f}_4\) and \(\bar{f}_5\) are in general different. Second, the trackers understand that the state process \(Y_t\) affects the rebalancer demand and, thus, the stock-price drift. However, \(Y_t\) does not need to be included explicitly in the tracker perceived price drift in (3.3) since \(Y_t\) is given by a linear combination of \(\tilde{a}_\Sigma \) and \(w_t\), which are already included in the drift. Third, trackers know that rebalancers can infer \(\eta _t\) and that this potentially affects the rebalancers’ price perceptions in (3.1) and, thus, is likely to affect their trading and, in turn, pricing. Trackers therefore allow for the pricing effect of \(\eta _t\) in their perceptions in (3.3). Fourth, as already noted, \(\alpha \) allows for possible exogenous trading frictions, if any.

An important difference between rebalancer and tracker perceived prices in (3.1) and (3.3) is that rebalancer price dynamics are based on the informational innovations \(dw_{i,t}\), whereas tracker price dynamics are based on the tracker target changes \(dw_t\). Reconciling the price perceptions of rebalancers and trackers will impose restrictions on equilibrium price perceptions and holdings and will rely on the relation between \(dw_{i,t}\) and \(dw_t\) in (2.11).

Given the price perceptions in (3.1) and (3.3), we solve (2.5) for optimal rebalancer and tracker holdings.

Lemma 3.2

Let \(f_0,f_1,f_2,f_3,\bar{f}_3,\bar{f}_4,\bar{f}_5:[0,1]\rightarrow \mathbb {R}\) and \(\kappa :[0,1]\rightarrow (0,\infty ]\) be continuous functions, let \(B:[0,1]\rightarrow \mathbb {R}\) be continuously differentiable, let \(\alpha \le 0\), and let the perceived stock-price process in the wealth dynamics (2.7) be as in (3.1) and (3.3). Then, for \(\mathcal {F}_{i,t}:=\sigma (\tilde{a}_i,S^f_{i,u})_{u\in [0,t]}\) and \(\mathcal {F}_{j,t}:=\sigma (w_u,S^{{\bar{f}}}_{j,u})_{u\in [0,t]}\), and, provided the holding processes

$$\begin{aligned} \begin{aligned} \hat{\theta }_{i,t}&:= \frac{f_0(t)}{2 (\kappa (t)-\alpha )}Y_t+\frac{f_1(t)+2 \kappa (t)}{2(\kappa (t)- \alpha )}\tilde{a}_i+\frac{f_2(t)}{2 (\kappa (t)-\alpha )}q_{i,t}+\frac{f_3(t)}{2 (\kappa (t)-\alpha )}\eta _t,\\ \hat{\theta }_{j,t}&:= \frac{\bar{f}_3(t)}{2 (\kappa (t)-\alpha )}\eta _t+\frac{\bar{f}_5(t)+2 \kappa (t)}{2(\kappa (t)- \alpha )}w_t+\frac{\bar{f}_4(t)}{2 (\kappa (t)-\alpha )}\tilde{a}_\Sigma , \end{aligned} \end{aligned}$$
(3.4)

satisfy (2.6), the traders’ maximizers for (2.5) are \(\hat{\theta }_{i,t} \) for rebalancer \(i\in \{1,...,M\}\) and \(\hat{\theta }_{j,t}\) for tracker \(j\in \{M+1,...,M+\bar{M}\}\). \(\diamondsuit \)

The proof of Lemma 3.2 shows that pointwise quadratic maximization gives the maximizers for (2.5) for rebalancers and trackers for arbitrary f and \({\bar{f}}\) functions.
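
As a hedged sketch of this pointwise argument, suppose (an assumption made here, since (2.5) is not restated in this section) that the time-t integrand of rebalancer i's objective is \(\theta \,\mu _{i,t}(\theta ) - \kappa (t)(\theta -\tilde{a}_i)^2\), where \(\mu _{i,t}(\theta )\) is the perceived drift in (3.1). The shorthand \(c_{i,t}\) below is introduced only for this illustration and collects the drift terms that do not depend on \(\theta \). The first-order condition then reads

$$\begin{aligned} \begin{aligned} c_{i,t}&:= f_0(t)Y_t+f_1(t)\tilde{a}_i+f_2(t)q_{i,t}+f_3(t)\eta _t,\\ 0&=\frac{\partial }{\partial \theta }\Big \{\theta \big (c_{i,t}+\alpha \theta \big )-\kappa (t)\big (\theta -\tilde{a}_i\big )^2\Big \} = c_{i,t}+2\alpha \theta -2\kappa (t)\big (\theta -\tilde{a}_i\big ),\\ \hat{\theta }_{i,t}&=\frac{c_{i,t}+2\kappa (t)\tilde{a}_i}{2(\kappa (t)-\alpha )}. \end{aligned} \end{aligned}$$

Under this assumed form, the last line reproduces the rebalancer holdings in (3.4), and the second derivative \(2(\alpha -\kappa (t))\) is negative whenever \(\alpha \le 0\) and \(\kappa (t)>0\), so the critical point is a maximizer. The same steps with the target \(\tilde{a}_i\) replaced by \(w_t\) and the drift in (3.3) give the tracker holdings in (3.4).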

Stock-price perceptions play two interconnected roles in our model. First, rebalancers and trackers solve their optimization problems in (2.5) based on their perceptions in (3.1) and (3.3) for how hypothetical holdings \(\theta _{i,t}\) and \(\theta _{j,t}\) affect price dynamics. Second, investor stock-price perceptions affect how they learn from observed prices. In particular, Lemma 3.1 shows that rebalancers use their stock-price perceptions (3.1) to infer the aggregate demand state variable \(Y_t\) based on past and current stock prices. In other words, dynamic learning by rebalancers depends critically on their stock-price perceptions. Similarly, trackers also use their stock-price perception of \(Y_0\) in (3.3) to infer the aggregate parent demand \(\tilde{a}_\Sigma \) from the initial price at time \(t=0\). However, thereafter, there is no additional learning from prices by the trackers at \(t>0\) since they directly observe their target \(w_t\).

3.2 Equilibrium

This section defines the first of our two equilibrium concepts and then derives price-perception coefficients for the conjectured functional form in Sect. 3.1 that satisfy the equilibrium definition along with the associated equilibrium price dynamics and holdings. The notion of equilibrium in our first construction is relatively simple, being based just on market clearing and consistency of investor price perceptions.

Definition 3.3

Deterministic functions of time \(f_0,f_1,f_2,f_3,\bar{f}_3,\bar{f}_4,\bar{f}_5,B:[0,1]\rightarrow \mathbb {R}\) constitute a price-friction equilibrium if:

  1. (i)

    Maximizers \(\hat{\theta }_{k,t}\) for (2.5) exist for traders \(k \in \{1,...,M+\bar{M}\}\) given the stock-price perceptions (3.1) and (3.3) for filtrations \(\mathcal {F}_{i,t}:=\sigma (\tilde{a}_i,S^f_{i,u})_{u\in [0,t]}\) and \(\mathcal {F}_{j,t}:=\sigma (w_u,S^{{\bar{f}}}_{j,u})_{u\in [0,t]}\).

  2. (ii)

    Inserting trader k’s maximizer \(\hat{\theta }_{k,t}\) into the perceived stock-price processes (3.1) and (3.3) produces identical stock-price processes across all traders \(k\in \{1,...,M+\bar{M}\}\). This common equilibrium stock-price process is denoted by \(\hat{S}_t\).

  3. (iii)

    The money and stock markets clear. \(\diamondsuit \)

Definition 3.3 places only minimal restrictions on the perceived stock-price coefficient functions in (3.1) and (3.3): Markets must clear and result in consistent perceived stock-price processes when all investors use their equilibrium strategies. Section 4 below considers a subgame perfect Nash extension of our basic model that imposes more restrictions on allowable off-equilibrium stock-price perceptions such as off-equilibrium market clearing and various consistency requirements.

Definition 3.3(ii) requires that in equilibrium rebalancers and trackers perceive identical stock-price dynamics when using their equilibrium holdings. This may seem paradoxical because rebalancers and trackers have different information (i.e., rebalancers form imperfect inferences about \(w_t\) and \(\tilde{a}_\Sigma \), whereas trackers observe \(w_t\) directly and infer \(\tilde{a}_\Sigma \) at time 0). The resolution of this apparent paradox is that trackers and rebalancers all agree on \(d\hat{S}_t\), but they disagree on how to decompose \(d\hat{S}_t\) into drift and volatility components. Because the trackers observe \(w_t\), they can use \(dw_t\) in their decomposition of \(d\hat{S}_t\). However, \(w_t\) is not adapted to the rebalancers’ filtrations and therefore cannot be used in their \(d\hat{S}_t\) decompositions. Instead, rebalancers use their innovations processes \(dw_{i,t}\) when decomposing \(d\hat{S}_t\) into drift and volatility. By replacing \(dw_{i,t}\) in \(dS^f_{i,t}\) in (3.1) with the decomposition of \(dw_{i,t}\) in terms of \(dw_t\) from (2.11), we can rewrite \(dS^f_{i,t}\) in (3.1) as

$$\begin{aligned} \begin{aligned} dS^f_{i,t}&= \Big \{f_0(t)Y_t +f_1(t)\tilde{a}_i +f_2(t)q_{i,t}+f_3(t) \eta _t +\alpha \theta _{i,t}\\&\quad -B'(t)\big (\tilde{a}_\Sigma -\tilde{a}_i - q_{i,t} \big ) \gamma \Big \}dt+ \gamma dw_t,\quad i\in \{1,...,M\}. \end{aligned} \end{aligned}$$
(3.5)

Therefore, to ensure identical equilibrium stock-price perceptions for all rebalancers and trackers, it suffices to match the drift of \(dS^{{\bar{f}}}_{j,t}\) in (3.3) for the equilibrium holdings \(\theta _{j,t} = \hat{\theta }_{j,t}\), \(j\in \{M+1,...,M+\bar{M}\}\), with the drift of \(dS^f_{i,t}\) in (3.5) for the equilibrium holdings \(\theta _{i,t}:= \hat{\theta }_{i,t}\), \(i\in \{1,...,M\}\). This produces the following equilibrium requirement:

$$\begin{aligned} \begin{aligned}&f_0(t)Y_t +f_1(t)\tilde{a}_i +f_2(t)q_{i,t}+f_3(t) \eta _t +\alpha \hat{\theta }_{i,t} -B'(t)\big (\tilde{a}_\Sigma -\tilde{a}_i - q_{i,t} \big )\gamma \\&=\bar{f}_3(t)\eta _t+\bar{f}_4(t)\tilde{a}_\Sigma +\bar{f}_5(t)w_t+ \alpha \hat{\theta }_{j,t}, \end{aligned} \end{aligned}$$
(3.6)

for all rebalancers \(i \in \{1,...,M\}\) and all trackers \(j\in \{M+1,...,M+\bar{M}\}\). We note that the right-hand side of (3.6) does not depend on the rebalancer index i. Matching up coefficients in front of \((\tilde{a}_i,\tilde{a}_{\Sigma },q_{i,t},\eta _t,w_t)\) in (3.6) using \(\hat{\theta }_{i,t}\) and \(\hat{\theta }_{j,t}\) in (3.4) and \(Y_t\) in (2.8) produces five equations. In addition, inserting \(\hat{\theta }_{i,t}\) and \(\hat{\theta }_{j,t}\) in (3.4) into the market-clearing condition (2.3) and using (2.16) produces three more equations from matching \((\tilde{a}_{\Sigma },\eta _t,w_t)\) coefficients. All in all, we have eight equilibrium restrictions for \((f_0,f_1,f_2,f_3,\bar{f}_3,\bar{f}_4,\bar{f}_5)\) and \(B'\), which give the equilibrium coefficient functions (A.1) in Appendix A and the ODE for B(t) in (3.7) below.

Our equilibrium existence result is based on the following technical lemma. It guarantees the existence of a solution to an autonomous system of coupled ODEs. In particular, given rebalancer stock-price perceptions of the form in (3.1) with an aggregate demand state variable \(Y_t\) process of the form in (2.8) (and the associated \(\eta _t\) process), we must construct a deterministic function B(t) that gives an equilibrium.

Lemma 3.4

Let \(\kappa :[0,1]\rightarrow [0,\infty ]\) be a continuous and integrable function (i.e., \(\int _0^1 \kappa (t)dt <\infty \)). For an initial constant \(B(0) \in \mathbb {R}\), the coupled ODEs

$$\begin{aligned} \begin{aligned} B'(t)&= \frac{2 \kappa (t) (\bar{M} B(t)+1)}{\gamma (A(t)+\bar{M}+1)},\\ A'(t)&= - \big (B'(t)\big )^2\Sigma (t)\big (A(t) +1\big ),\quad A(0)=-\tfrac{(M-1)B(0)^2\sigma ^2_{\tilde{a}}}{\sigma ^2_{w_0} +(M-1)B(0)^2\sigma ^2_{\tilde{a}}},\\ \Sigma '(t)&= -\big (B'(t)\big )^2\Sigma (t)^2, \quad \Sigma (0) =\tfrac{(M-1) \sigma _{\tilde{a}}^2 \sigma _{w_0}^2}{(M-1)B(0)^2 \sigma _{\tilde{a}}^2+\sigma _{w_0}^2}, \end{aligned} \end{aligned}$$
(3.7)

have unique solutions with \(\Sigma (t) \ge 0\), \(\Sigma (t)\) decreasing, \(A(t) \in [-1,0]\), A(t) decreasing for \(t\in [0,1]\), and \(B(t),B'(t)<0\) when \(\bar{M}B(0) +1< 0\). \(\diamondsuit \)

The solutions to the ODEs for A(t) and \(\Sigma (t)\) in (3.7) agree with the expressions in (2.15) and (2.17). The exogenous price-friction coefficient \(\alpha \) does not appear in the ODEs (3.7). It is possible to restate the ODE system (3.7) using a single path-dependent ODE. The special case \(B(0) := -\frac{1}{\bar{M}}\) produces a model with no dynamic learning because \(B'(t)=0\) implies \(\Sigma '(t)=0\) and so \(dq_{i,t}=0\).
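
One way to see the path-dependent reduction, sketched here directly from (3.7) and assuming \(\Sigma (0)>0\) (i.e., \(M>1\) and \(\sigma _{w_0}>0\)): dividing the \(\Sigma \) equation by \(\Sigma (t)^2\) gives \((1/\Sigma )'=(B')^2\), and \(A+1\) and \(\Sigma \) solve the same linear ODE, so their ratio stays at its initial value \(1/((M-1)\sigma ^2_{\tilde{a}})\). Hence

$$\begin{aligned} \begin{aligned} \frac{1}{\Sigma (t)}&=\frac{1}{\Sigma (0)}+\int _0^t\big (B'(s)\big )^2ds,\qquad A(t)+1=\frac{\Sigma (t)}{(M-1)\sigma ^2_{\tilde{a}}},\\ B'(t)&=\frac{2\kappa (t)\big (\bar{M}B(t)+1\big )}{\gamma \Big (\frac{\Sigma (t)}{(M-1)\sigma ^2_{\tilde{a}}}+\bar{M}\Big )}. \end{aligned} \end{aligned}$$

The last equation involves B only, but through the whole path of \(B'\) on [0, t]. This is a derivation from (3.7) alone and may differ in form from the expressions in (2.15) and (2.17).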

The following theorem gives the price-friction equilibrium in terms of the ODEs (3.7). In this theorem, the price-friction parameter \(\alpha \), volatility \(\gamma \), and initial value \(B(0)\in \mathbb {R}\) are free parameters. The intuition for B(0) being free is discussed after our equilibrium construction in Theorem 3.5.

Theorem 3.5

Let \(\kappa :[0,1]\rightarrow (0,\infty )\) be continuous, let the functions \((B,A,\Sigma )\) be as in Lemma 3.4, and let \(\alpha \le 0\). Then, we have:

  1. (i)

    A price-friction equilibrium exists and is given by the price-perception functions (A.1) in Appendix A.

  2. (ii)

    The equilibrium in (i) has holdings \(\hat{\theta }_{i,t}\) for rebalancer i and \(\hat{\theta }_{j,t}\) for tracker j given by

    $$\begin{aligned} \begin{aligned} \hat{\theta }_{i,t}&=-\tfrac{\gamma B'(t)-2 \kappa (t)}{2 \kappa (t)-\alpha }\tilde{a}_i -\tfrac{\gamma B'(t)}{2 \kappa (t)-\alpha }q_{i,t} \\&\qquad +\tfrac{\gamma B'(t)}{(M+\bar{M}) (2 \kappa (t)-\alpha )}\eta _t-\tfrac{2 \bar{M} \kappa (t)}{(M+\bar{M}) (2 \kappa (t)-\alpha )}Y_t,\quad i\in \{1,...,M\}, \\ \hat{\theta }_{j,t}&=\tfrac{\gamma B'(t)}{(M+\bar{M}) (2 \kappa (t)-\alpha )}\eta _t +\tfrac{2 M \kappa (t)}{(M+\bar{M}) (2 \kappa (t)-\alpha )}w_t \\&\qquad +\tfrac{\gamma (A(t)-M+1) B'(t)-2 \kappa (t)}{(M+\bar{M}) (2 \kappa (t)-\alpha )}\tilde{a}_\Sigma ,\quad j\in \{M+1,...,M+\bar{M}\}. \end{aligned} \end{aligned}$$
    (3.8)
  3. (iii)

    The equilibrium in (i) has the equilibrium stock-price process \(\hat{S}_t\) given by \(\hat{S}_0 := w_0 - B(0)\tilde{a}_\Sigma \) and dynamics with respect to the trackers’ filtrations \(\mathcal {F}_{j,t}:=\sigma (w_u,S^{{\bar{f}}}_{j,u})_{u\in [0,t]}\) given by

    $$\begin{aligned} \begin{aligned} d\hat{S}_t&=\Big \{\tfrac{\gamma B'(t)}{M+\bar{M}}\eta _t-\tfrac{2 \bar{M} \kappa (t)}{M+\bar{M}}w_t +\tfrac{\gamma (A(t)-M+1) B'(t)-2 \kappa (t)}{M+\bar{M}}\tilde{a}_\Sigma \Big \}dt + \gamma dw_t, \end{aligned} \end{aligned}$$
    (3.9)

    and dynamics with respect to the rebalancers’ filtrations \(\mathcal {F}_{i,t}:=\sigma (\tilde{a}_i,S^f_{i,u})_{u\in [0,t]}\) given by

    $$\begin{aligned} \begin{aligned} d\hat{S}_t&=\Big \{\tfrac{\gamma B'(t)}{M+\bar{M}}\eta _t-\tfrac{2 \bar{M} \kappa (t)}{M+\bar{M}}Y_t-\gamma B'(t)\big ( \tilde{a}_i + q_{i,t}\big ) \Big \}dt+ \gamma dw_{i,t}. \end{aligned} \end{aligned}$$
    (3.10)

    \(\diamondsuit \)

Several observations follow from Theorem 3.5:

  1.

    Lemma 3.1 ensures that rebalancer i can infer her innovations process \(w_{i,t}\) from perceived prices \(S^f_{i,t}\) and \(\tilde{a}_i\), but rebalancer i cannot infer the trackers’ target \(w_t\) from the equilibrium prices \(\hat{S}_t\) in (3.9). This is because the aggregate target \(\tilde{a}_\Sigma \) also appears in the drift of \(d\hat{S}_t\) and \(\tilde{a}_\Sigma \) is not observed by individual rebalancers.

  2.

    The equilibrium holdings (3.8) follow from inserting the equilibrium f and \({\bar{f}}\) functions in (A.1) in Appendix A into (3.4). Thus, the holdings in (3.8) are expressed in terms of the investors’ state processes, which, in particular, are adapted to the investors’ filtrations. However, these state processes are not mutually independent, so we also give representations of (3.8) in terms of independent variables in (A.2) and (A.3) in Appendix A. First, the price-friction equilibrium rebalancer holdings \(\hat{\theta }_{i,t}\) in (3.8) can be written in terms of the independent variables \((\tilde{a}_i, \tilde{a}_\Sigma -\tilde{a}_i, w_0)\) and a residual orthogonal term given as a stochastic integral with respect to \(w^\circ _t\) of a deterministic function of time. Likewise, the price-friction equilibrium tracker holdings \(\hat{\theta }_{j,t}\) can be written in terms of the independent variables \((\tilde{a}_\Sigma , w_0)\) and a residual orthogonal term given in terms of a stochastic integral with respect to \(w^\circ _t\) of a deterministic function of time. Both these residual terms are Gaussian. Section 3.4 illustrates the loading coefficients on these independent state processes.

  3.

    Because the exogenous price-friction coefficient \(\alpha \le 0\) does not appear in the ODEs (3.7), \(\alpha \) is irrelevant for the equilibrium stock-price dynamics (3.9). However, \(\alpha \) does affect the equilibrium holdings in (3.8).

  4.

    The stock-price volatility \(\gamma \) affects the stock-price drift and holdings via its impact on B(t) in (3.7) and, thus, on (A.1).

  5.

    It can seem paradoxical that trackers and rebalancers all perceive the same equilibrium stock-price process \({\hat{S}}_t\), but they decompose its dynamics \(d{\hat{S}}_t\) into different perceived drifts and martingale terms (i.e., they have different Itô decompositions). The resolution lies in the rebalancers and trackers having different filtrations:Footnote 7 The drift and martingale terms in (3.10) are not adapted to \(\mathcal {F}_{j,t}\) and the drift and martingale terms in (3.9) are not adapted to \(\mathcal {F}_{i,t}\). The dynamics (3.9) and (3.10) both produce the same process \(\hat{S}_t\) because the innovations process \(w_{i,t}\) in (2.11) links \(dw_t\) with \(dw_{i,t}\) and the drift term \(B'(t)(\tilde{a}_\Sigma -\tilde{a}_i-q_{i,t})dt\).

  6.

    Investors’ off-equilibrium perceived stock-price drifts differ linearly from their equilibrium drifts due to the differences \(\theta _{k,t}-\hat{\theta }_{k,t}\) between their off-equilibrium and equilibrium holdings.Footnote 8 Rebalancer i’s perceived stock-price drift in (3.1) can be decomposed for arbitrary holdings \(\theta _{i,t}\) as

    $$\begin{aligned} \begin{aligned}&f_0(t)Y_t +f_1(t)\tilde{a}_i +f_2(t)q_{i,t}+f_3(t)\eta _t+ \alpha \theta _{i,t}\\&= \tfrac{\gamma B'(t)}{M+\bar{M}}\eta _t-\tfrac{2 \bar{M} \kappa (t)}{M+\bar{M}}Y_t -\gamma B'(t)\big ( \tilde{a}_i + q_{i,t}\big ) +\alpha (\theta _{i,t} - \hat{\theta }_{i,t}), \end{aligned} \end{aligned}$$
    (3.11)

    where we have used the formulas for \((f_0,f_1,f_2,f_3)\) in (A.1) in Appendix A. Likewise, for arbitrary holdings \(\theta _{j,t}\), tracker j’s perceived stock-price drift in (3.3) is

    $$\begin{aligned} \begin{aligned}&\bar{f}_3(t)\eta _t+\bar{f}_4(t)\tilde{a}_\Sigma +\bar{f}_5(t)w_t+ \alpha \theta _{j,t}\\&=\tfrac{\gamma B'(t)}{M+\bar{M}}\eta _t-\tfrac{2 \bar{M} \kappa (t)}{M+\bar{M}}w_t +\tfrac{\gamma (A(t)-M+1) B'(t)-2 \kappa (t)}{M+\bar{M}}\tilde{a}_\Sigma +\alpha (\theta _{j,t} - \hat{\theta }_{j,t}), \end{aligned} \end{aligned}$$
    (3.12)

    where we have used the formulas for \((\bar{f}_3,\bar{f}_4,\bar{f}_5)\) in (A.1) in Appendix A. Continuity between equilibrium and off-equilibrium is a reasonable property of investor stock-price perceptions. The representation of the perceived rebalancer drift in (3.11) relative to \(\hat{\theta }_{i,t}\) from (3.8) also explains the presence of the rebalancer-specific terms \((\tilde{a}_i,q_{i,t})\) in the rebalancers’ perceptions in (3.1).

  7.

    Investors initially use block trades at time 0 to trade to positions \(\theta _{i,0}\) and \(\theta _{j,0}\) from (3.8) that are generically different from their initial normalized holdings of 0. Thereafter, investors trade continuously at times \(t > 0\).

  8.

    Theorem 3.5 verifies that price-perception coefficients in (3.1) and (3.3) can be constructed such that an equilibrium satisfying Definition 3.3 exists. However, as with many other rational expectations models, we do not have a proof of uniqueness. For example, there may be other public state variables in addition to \(\eta _t\) that could hypothetically be included in the perceived price drifts that might also be associated with other equilibria as defined in Definition 3.3.
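
The off-equilibrium drifts in (3.11) and (3.12) also explain why the denominators change from \(2(\kappa (t)-\alpha )\) in (3.4) to \(2\kappa (t)-\alpha \) in (3.8). The following minimal Python sketch checks this fixed-point property numerically at a single point in time. It assumes, as an illustration only, that the pointwise objective is holdings times perceived drift minus \(\kappa \) times the squared distance to the target; the function names and the numerical inputs are hypothetical and not taken from the paper.

```python
def equilibrium_holding(c, target, kappa, alpha):
    """Holding of the form in (3.8): (c + 2*kappa*target) / (2*kappa - alpha).

    Here c stands for the equilibrium drift at time t (the drift in (3.9)
    or (3.10)); it is a plain number in this illustration.
    """
    return (c + 2.0 * kappa * target) / (2.0 * kappa - alpha)


def perceived_drift(theta, c, theta_hat, alpha):
    """Off-equilibrium drift as in (3.11)/(3.12): c + alpha*(theta - theta_hat)."""
    return c + alpha * (theta - theta_hat)


def objective(theta, c, theta_hat, target, kappa, alpha):
    """ASSUMED pointwise objective: theta * drift - kappa * (theta - target)^2."""
    drift = perceived_drift(theta, c, theta_hat, alpha)
    return theta * drift - kappa * (theta - target) ** 2


def best_response(c, theta_hat, target, kappa, alpha):
    """Maximizer of the assumed objective, from its first-order condition:
    c + alpha*(2*theta - theta_hat) - 2*kappa*(theta - target) = 0.
    """
    return (c - alpha * theta_hat + 2.0 * kappa * target) / (2.0 * (kappa - alpha))


# Hypothetical inputs: equilibrium drift c, target, penalty kappa, friction alpha.
c, target, kappa, alpha = -0.3, 1.0, 1.0, -0.5
theta_hat = equilibrium_holding(c, target, kappa, alpha)
theta_star = best_response(c, theta_hat, target, kappa, alpha)
print(theta_hat, theta_star)  # the two values coincide up to rounding
```

Under the stated assumption, the best response to the perceived drift centered at \(\hat{\theta }\) is \(\hat{\theta }\) itself, which is the consistency that Definition 3.3(ii) requires at the level of a single investor.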

The function B(t) from (3.7) is key both in constructing the equilibrium and for interpreting the equilibrium price and holding processes. First, there is the issue that the initial value B(0) is a free input in Theorem 3.5. The intuition is that our model determines equilibrium stock-price drifts but not price levels. As can be seen in Theorem 3.5(iii), B(0) controls the initial price level in our model. Second, the relation between B(t) and price levels allows us to impose additional structure on B(t). In particular, \(w_t\) and \(\tilde{a}_\Sigma \) represent different types of demand imbalances. Thus, if \(B(t) < 0\), then \(Y_t\) in (2.8) plays the role of an aggregate demand state variable. How the two component quantities \(w_t\) and \(\tilde{a}_\Sigma \) are mixed in the aggregate demand state variable \(Y_t\) is different given the two components’ different informational dynamics (i.e., \(\tilde{a}_\Sigma \) is not time dependent while \(w_t\) changes randomly over time) and given their different impacts on investor demands (i.e., each rebalancer only knows their personal \(\tilde{a}_i\) component of \(\tilde{a}_\Sigma \) where other rebalancers’ targets do not affect investor i’s parent demand whereas \(w_t\) affects both an individual tracker’s parent demand and is also information about other trackers’ parent demands). It seems reasonable that the sign of the impact of \(w_t\) and \(\tilde{a}_\Sigma \) on the price level should be the same, which imposes the additional restriction that \(B(t) < 0\). From Lemma 3.4, a sufficient condition for \(B(t) < 0\) for all \(t \in [0,1]\) is \({{\bar{M}}}B(0) + 1<0\), which implies \(B'(t)<0\).Footnote 9

With the economically reasonable parametric restriction that \(B'(t) < 0\) and given that \(\alpha \le 0\) so that \(\alpha - 2 \kappa (t) < 0\), we can sign the impact of various quantities in the model on holdings and prices, which leads to the following comparative statics:

  1.

    In (3.8), the equilibrium holdings \(\hat{\theta }_{i,t}\) of rebalancers are positively related to their parent targets \(\tilde{a}_i\). This is intuitive because rebalancers want holdings close to \(\tilde{a}_i\). Rebalancer holdings \(\hat{\theta }_{i,t}\) are negatively related to the aggregate demand imbalance state variable \(Y_t\). The fact that \(\hat{\theta }_{i,t}\) is decreasing in \(Y_t\) is consistent with the theoretical results and empirical evidence in van Kervel, Kwan, and Westerholm [25] that investors buy less when there is a positive parent-demand imbalance for other investors in the market. The same intuition applies to the negative impact of the common component \(\eta _t\) on \(\hat{\theta }_{i,t}\). However, the impact of \(q_{i,t}\) on \(\hat{\theta }_{i,t}\) is positive. The intuition is that when rebalancer i expects the other rebalancers (given i’s ability to filter using her private target information \(\tilde{a}_i\)) to have a net positive parent-demand imbalance \(\mathbb {E}[\tilde{a}_\Sigma - \tilde{a}_i |\mathcal {F}_{i,t}]\) from (2.11), she buys at time t to speculate on the resulting anticipated positive drift in future price pressure in (3.10).

  2.

    In (3.8), the equilibrium holdings \(\hat{\theta }_{j,t}\) of tracker j are increasing in \(w_t\) (which reflects both her own parent demand and also information about the parent demands of other trackers). Tracker holdings \(\hat{\theta }_{j,t}\) are also decreasing in \(\eta _t\), which is related to imbalances in rebalancers’ aggregate parent demand expectations. The negative effect of \(\eta _t\) is consistent with the van Kervel, Kwan, and Westerholm [25] liquidity-provision result and empirical evidence. However, the impact of \(\tilde{a}_\Sigma \) is ambiguous in (3.8), and numerical calculations in Sect. 3.4 show that the sign is positive. This is again consistent with speculation on future predicted price pressure due to the tracker’s superior information about aggregated latent parent-demand imbalances.

  3.

    The equilibrium stock-price drift in (3.9) is decreasing in the tracker parent demand \(w_t\). However, the impact of \(\tilde{a}_\Sigma \) in the price drift is again ambiguous, which is related to information about \(\tilde{a}_\Sigma \) being useful in forecasting future price pressure.

3.3 Tractability and model structure

This section discusses the key modeling features that make our model tractable. First, we assume all traders seek to maximize their individual objectives in (2.4). Linear-quadratic objectives have been used extensively in the literature because of their tractability. Such objectives have been used in, e.g., Sannikov and Skrzypacz [33], Gârleanu and Pedersen [20], and Bouchard, Fukasawa, Herdegen, and Muhle-Karbe [7]. The linear-quadratic objectives (2.5) allow us to solve for the optimal holdings in Lemma 3.2 using quadratic pointwise optimization. In the price-friction equilibrium, we could equivalently use dynamic programming to produce the same optimal holdings.

Second, our stock does not pay dividends, which means that only the stock drift can be endogenously determined in equilibrium. Models with non-dividend paying stocks have been used extensively in the literature. The monograph Karatzas and Shreve [27] gives an overview.Footnote 10 In particular, non-dividend paying stock models have been used for short horizon models like ours where consumption only takes place at the terminal time.Footnote 11 The rebalancers’ dynamic learning produces forward-running filtering equations and by considering a non-dividend paying stock, we circumvent having additional backward-running equations. Equilibrium models with both forward and backward-running equations include Kyle [29], Foster and Viswanathan [18, 19], Back, Cao, and Willard [5], and Choi, Larsen, and Seppi [12].

Third, price impact is often modeled as the impact of investor holdings and orders on price levels (e.g., as in Almgren [2]) and as the impact of orders on price changes (e.g., Kyle [29]). However, for the sake of tractability, we follow Cuoco and Cvitanić [15] and model price impact in terms of the impact of investor holdings on the price drift. Price impact matters for the trading decisions of strategic investors because of its effect on future expected price changes (e.g., high holding demand raises prices, which lowers expected future price appreciation). Our price-friction specification simply assumes directly that investor holdings affect expected future price changes. Thus, while our price impact specification is a simplification, it is a reasonable simplification that preserves the essential economics of price impact.

Fourth, instead of exogenous noise traders, we use optimizing trackers with a Brownian motion target \(w_t\). Grossman and Stiglitz [22] and Kyle [29] are standard references with an exogenous Gaussian stock supply. Gaussian noise traders are also used in the predatory trading models in Brunnermeier and Pedersen [9] and Carlin, Lobo, and Viswanathan [10]. In our setting, we could eliminate trackers by setting \(\bar{M}:=0\) and replace the stock-market clearing condition (2.3) by using \(w_t\) to model the exogenous stock supply as in

$$\begin{aligned} w_t= \sum _{i=1}^{M} \theta _{i,t},\quad t\in [0,1]. \end{aligned}$$
(3.13)

Including noise traders as in (3.13) in the model would be tractable in the price-friction equilibrium. However, surprisingly, exogenous noise traders complicate constructing a Nash equilibrium with dynamic learning, whereas, as we show in Sect. 4, optimizing trackers and market clearing in (2.3) produce a subgame perfect Nash financial-market equilibrium in closed form. The models in Sannikov and Skrzypacz [33] and Choi, Larsen, and Seppi [13] have optimizing trackers but no dynamic learning.

3.4 Numerics

Our price-friction equilibrium is straightforward to compute numerically. This is because equilibrium stock prices and holdings are available in closed form given the solutions to the associated coupled ODEs in (3.7). We illustrate our models for several different parameterizations. In these parameterizations, there are \(M := 5\) rebalancers and \({\bar{M}} := 10\) trackers. The penalty function is a constant over the trading day and set to \(\kappa (t):=1\). The rebalancer target volatility is normalized to \(\sigma _{\tilde{a}} := 1\) whereas we consider \(\sigma _{w_0} \in \{\frac{1}{10}, 1\}\) to illustrate the impact of dynamic learning. Recall that \(\sigma _{w_0}:=0\) gives the model with only initial learning of \(\tilde{a}_\Sigma \) as developed in Choi, Larsen, and Seppi [13]. To be consistent with our negative B(t) restriction, we consider an initial value \(B(0) :=-0.2\). We consider two stock-price volatility parameters \(\gamma \in \{\frac{1}{2},1\}\) and a zero price-friction parameter \(\alpha :=0\) (i.e., the competitive equilibrium). As noted above, \(\alpha \) does not affect the endogenous price-drift coefficients, but \(\alpha \) does affect investor holdings.
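
As a rough illustration of this computation, the following self-contained Python sketch integrates the coupled ODEs (3.7) for the four \((\gamma ,\sigma _{w_0})\) parameterizations above. The function name `solve_price_friction_odes`, the classical fourth-order Runge-Kutta scheme, and the step count are choices made for this sketch only; they are not taken from the paper, which does not specify its numerical method here.

```python
def solve_price_friction_odes(M=5, M_bar=10, gamma=1.0, sigma_a=1.0,
                              sigma_w0=1.0, B0=-0.2,
                              kappa=lambda t: 1.0, n_steps=1000):
    """Integrate (B, A, Sigma) from (3.7) on [0, 1] with classical RK4.

    Returns a list of tuples (t, B(t), A(t), Sigma(t)).
    """
    # Initial values A(0) and Sigma(0) as stated in (3.7).
    var_a = (M - 1) * B0 ** 2 * sigma_a ** 2
    denom = sigma_w0 ** 2 + var_a
    y = (B0, -var_a / denom, (M - 1) * sigma_a ** 2 * sigma_w0 ** 2 / denom)

    def rhs(t, state):
        B, A, S = state
        dB = 2.0 * kappa(t) * (M_bar * B + 1.0) / (gamma * (A + M_bar + 1.0))
        return (dB, -dB ** 2 * S * (A + 1.0), -dB ** 2 * S ** 2)

    def shift(state, slope, step):
        return tuple(s + step * k for s, k in zip(state, slope))

    h = 1.0 / n_steps
    path = [(0.0,) + y]
    for n in range(n_steps):
        t = n * h
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, shift(y, k1, h / 2))
        k3 = rhs(t + h / 2, shift(y, k2, h / 2))
        k4 = rhs(t + h, shift(y, k3, h))
        y = tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                  for s, a, b, c, d in zip(y, k1, k2, k3, k4))
        path.append(((n + 1) * h,) + y)
    return path


# The four (gamma, sigma_w0) pairs used in this section.
for gamma, sigma_w0 in [(0.5, 0.1), (0.5, 1.0), (1.0, 0.1), (1.0, 1.0)]:
    t, B, A, S = solve_price_friction_odes(gamma=gamma, sigma_w0=sigma_w0)[-1]
    print(f"gamma={gamma}, sigma_w0={sigma_w0}: B(1)={B:.4f}, A(1)={A:.4f}, Sigma(1)={S:.4f}")
```

With \(B(0)=-0.2\) and \(\bar{M}=10\), the condition \(\bar{M}B(0)+1<0\) of Lemma 3.4 holds, so the computed paths should have \(B(t)<0\) decreasing, \(A(t)\in [-1,0]\) decreasing, and \(\Sigma (t)\ge 0\) decreasing. Given \(B'(t)\), the holdings and drift coefficients then follow by direct substitution into (3.8) and (3.9).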

3.4.1 Equilibrium holdings

First, we consider equilibrium holdings. Fig. 1 shows the coefficient functions for the equilibrium stock holdings \(\hat{\theta }_{k,t}\) in (3.8) for rebalancers and trackers using their orthogonal representations in (A.2) and (A.3) in Appendix A. Alternatively, we could plot coefficient loadings on the state processes \((\tilde{a}_i, q_{i,t}, \eta _t,Y_t)\) and \((\eta _t,w_t,\tilde{a}_\Sigma )\) in (3.8). We prefer to illustrate orthogonal loadings to avoid cancelation effects in the different state processes.

Fig. 1

Plots of coefficient loadings over time for holdings \(\hat{\theta }_{k,t}\) using the orthogonal representations in (A.2) and (A.3) in Appendix A. The exogenous model parameters are \( \sigma _{\tilde{a}}:=1, M:=5, \bar{M}:=10, \;\alpha :=0\), \(B(0):=-0.2\), \(\kappa (t):=1\) for \(t\in [0,1]\), \((\gamma , \sigma _{w_0}) =(\frac{1}{2},\frac{1}{10}) (\text {blue}), (\frac{1}{2},1) (\text {amber}), (1,\frac{1}{10}) (\text {green}),\) and \( (1,1)(\text {red}).\)

Fig. 1E shows rebalancer i’s loadings over time on her own parent target \(\tilde{a}_i\). As expected, these loadings are positive, but they are less than 1 because trading towards a positive target depresses equilibrium price drifts in order for markets to clear. The rebalancer loading on \(\tilde{a}_i\) is over 0.9, which implies a large initial block trade at time \(t=0\). The negative coefficients on \(\tilde{a}_\Sigma - \tilde{a}_i\) (for rebalancer i) in Fig. 1A and \(\tilde{a}_\Sigma \) (for tracker j) in Fig. 1B reflect demand accommodation. In particular, rebalancers and trackers reduce their holdings when other rebalancers want to buy. The loadings on \(w_0\) in Fig. 1C and 1D are more subtle. When the initial tracker target \(w_0\) has a high volatility (as in the red and amber trajectories), the tracker holdings load positively on \(w_0\) over time in Fig. 1D, and the negative rebalancer loadings in Fig. 1C indicate demand accommodation by the rebalancers. However, when the initial tracker target has low volatility (as in the green and blue trajectories), the initial positive tracker loadings on \(w_0\) eventually flip signs as do the initial negative rebalancer loadings. At first glance, this is puzzling. The explanation is that, as noted above, the trackers and rebalancers have different stock-price drift perceptions in (3.9) and (3.10) given their different filtrations. In particular, there is dynamic learning over time by the rebalancers based on the information \(Y_t\) inferred from prices, whereas the trackers are fully informed about \(\tilde{a}_\Sigma \) and \(w_t\) (trackers infer \(\tilde{a}_\Sigma \) at time 0). Fig. 3C and 3D below illustrate that the rebalancers’ and trackers’ stock-drift perceptions are quite different in these two low \(\sigma _{w_0}\) parameterizations.

In addition to the effects illustrated in Fig. 1, investor holdings are also affected by the realized path of \(w_t = w_0+w^\circ _t\) over time. This is because of fluctuations in the underlying tracker parent demand and also due to the effect of \(w^\circ _t\) on dynamic learning by the rebalancers. Appendix A shows the exact specification of this term in the tracker holdings (given as a \(dw^\circ _u\) integral of a deterministic function). Given the linearity of investor holdings and since the Brownian motion \(w^\circ _t\) has zero expected increments, this random path effect disappears in ex ante expected investor holdings.

To summarize, Fig. 1 shows there are three main drivers of investor holdings: First, investors’ holdings in most cases are drawn partially towards their own targets \(\tilde{a}_i\) and \(w_t\). Second, investors provide partial accommodation to other investors’ parent demands. Third, dynamic learning and speculation on the price drift affect demand accommodation. Interestingly, there is no evidence in Fig. 1 of predatory trading. Predatory trading differs from demand accommodation in that a predator first trades in the same direction as another investor and then subsequently unwinds her position. In this context, the hump shape of the blue trajectories (for low \(\sigma _{w_0}\)) in Fig. 1C and 1D does not indicate predatory trading: Because \(w_0\) is the trackers’ own target, the blue hump in Fig. 1D cannot reflect predatory trading. Furthermore, the blue hump-shaped trajectory in Fig. 1C also differs from predatory trading because the tracker and rebalancer loadings have opposite signs as seen in Fig. 1D. This is due to market clearing. For example, when the rebalancers are buying given \(w_0> 0\), the trackers are actually selling. Instead of predatory trading, we shall see below that the blue trajectories are explained by price perceptions and dynamic learning.

Fig. 2 plots the instantaneous intraday unconditional trading autocorrelations

$$\begin{aligned} \rho _k(t) := \lim _{h\downarrow 0}\frac{ \text {corr} (\hat{\theta }_{k,t+h}-\hat{\theta }_{k,t}, \hat{\theta }_{k,t+2h}-\hat{\theta }_{k,t+h})}{h},\quad k\in \{1,...,M+\bar{M}\}, \end{aligned}$$
(3.14)

for the price-friction equilibrium holding processes for both the rebalancer and tracker in (3.8). These autocorrelations are scaled by the time step \(h>0\) (the unscaled versions converge to zero as \(h\downarrow 0\)).
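To see why the scaling by \(h\) in (3.14) is needed, it may help to look at a toy example that is not the model of this paper: a stationary Ornstein-Uhlenbeck process with a hypothetical mean-reversion speed `lam`. For that process, the correlation between consecutive increments of length \(h\) equals \(-(1-e^{-\lambda h})/2\), which vanishes as \(h\downarrow 0\), whereas the version divided by \(h\) converges to \(-\lambda /2\). A minimal sketch under these assumptions:

```python
import math


def ou_increment_autocorr(lam: float, h: float) -> float:
    """Correlation of consecutive length-h increments of a stationary
    Ornstein-Uhlenbeck process (toy example, NOT the paper's holdings)."""
    # Autocovariance of the stationary process, up to a positive constant.
    def r(tau: float) -> float:
        return math.exp(-lam * tau)

    # Cov(X_{t+h}-X_t, X_{t+2h}-X_{t+h}) = 2 r(h) - r(0) - r(2h)
    cov = 2.0 * r(h) - r(0.0) - r(2.0 * h)
    # Var(X_{t+h}-X_t) = 2 (r(0) - r(h))
    var = 2.0 * (r(0.0) - r(h))
    return cov / var


def scaled_autocorr(lam: float, h: float) -> float:
    """Increment autocorrelation divided by the time step h, as in (3.14)."""
    return ou_increment_autocorr(lam, h) / h


if __name__ == "__main__":
    for h in (1e-1, 1e-2, 1e-3):
        print(h, ou_increment_autocorr(2.0, h), scaled_autocorr(2.0, h))
```

In this toy case the sign of the scaled limit is negative because the process mean-reverts; the signs in Fig. 2 come from the equilibrium holdings in (3.8) and are not reproduced by this sketch.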

Fig. 2

Plots of unconditional autocorrelation (3.14) of trading over time. The exogenous model parameters are \( \sigma _{\tilde{a}}:=1, M:=5, \bar{M}:=10, \;\alpha :=0\), \(B(0):=-0.2\), \(\kappa (t):=1\) for \(t\in [0,1]\), and \((\gamma , \sigma _{w_0}) =(\frac{1}{2},\frac{1}{10}) (\text {blue}),\,(\frac{1}{2},1) (\text {amber}), \,(1,\frac{1}{10}) (\text {green})\), and \((1,1)(\text {red}).\)

Thus, consistent with empirical evidence, trading is autocorrelated due to order splitting. Fig. 2 shows that rebalancers’ orders are positively autocorrelated (2A) whereas trackers’ orders exhibit negative autocorrelation (2B).

Market clearing forces the intraday instantaneous unconditional cross-correlation between changes in rebalancers’ and trackers’ holdings to be perfectly negative:

$$\begin{aligned} \lim _{h\downarrow 0} \text {corr} (\hat{\theta }_{i,t+h}-\hat{\theta }_{i,t}, \hat{\theta }_{j,t+h}-\hat{\theta }_{j,t})=-1, \end{aligned}$$
(3.15)

for all \(i\in \{1,...,M\}\) and \(j\in \{M+1,...,M+{\bar{M}}\}\).

3.4.2 Equilibrium prices

Next, we consider the price-friction equilibrium stock-price dynamics in (3.9) and (3.10). For the trackers, we can rewrite the drift in (3.9) in terms of the independent random variables \((\tilde{a}_\Sigma , w_0)\) and a residual orthogonal term given as a stochastic integral with respect to \(w^\circ _t\) of a deterministic function of time. For the rebalancers, we can rewrite the perceived drift in (3.10) in terms of the independent random variables \((\tilde{a}_\Sigma -\tilde{a}_i, w_0, \tilde{a}_i)\) and a residual orthogonal term given as a stochastic integral with respect to \(w^\circ _t\) of a deterministic function of time. These formulas are given in (A.4) and (A.5) in Appendix A and are illustrated in Fig. 3.

Fig. 3

Plots of coefficient loadings over time in stock-price drifts in (A.5) (rebalancer i) and (A.4) (tracker j). The exogenous model parameters are \( \sigma _{\tilde{a}}:=1, M:=5, \bar{M}:=10, \;\alpha :=0\), \(B(0):=-0.2\), \(\kappa (t):=1\) for \(t\in [0,1]\), and \((\gamma , \sigma _{w_0}) =(\frac{1}{2},\frac{1}{10}) (\text {blue}), (\frac{1}{2},1) (\text {amber}), (1,\frac{1}{10}) (\text {green}),\) and \( (1,1)(\text {red}).\)

Fig. 3 shows that positive parent demands \(\tilde{a}_i\), \(\tilde{a}_\Sigma - \tilde{a}_i\), and \(\tilde{a}_\Sigma \) all depress perceived stock-price drifts. The same is true for the tracker perceived stock-price drift loading on the initial tracker parent demand \(w_0\). However, the relation between the rebalancer perceived drift and \(w_0\) is more nuanced. When the initial tracker demand volatility \(\sigma _{w_0}\) is high (red and amber lines in Fig. 3C and 3D), then rebalancers perceive that \(w_0\) depresses the price drift. However, when \(\sigma _{w_0}\) is low, then the dynamic learning process, given the inability of rebalancers to observe \(w_0\) directly, causes the rebalancer perceived stock-price drift loading on \(w_0\) to change sign. The blue and green lines in Fig. 3C and 3D illustrate that low values of \(\sigma _{w_0}\) make the trackers use their superior knowledge of \(w_0\) to manipulate stock-price perceptions to create gains from trade that outweigh their penalties. More specifically, the blue and green lines in Fig. 1C and 1D show that rebalancers have large positive stock holdings and trackers have large negative holdings based on a positive realization \(w_0>0\). Such large negative holdings imply that trackers incur large inventory penalties because they deviate from the target trajectory \(w_t= w_0 + w^\circ _t\). Trackers find this behavior optimal because their blue and green lines in Fig. 3D are negative (giving trackers large gains from trade), and rebalancers are willing to hold these large positive stock positions because their blue and green lines in Fig. 3C are positive (also giving rebalancers large gains from trade).

Fig. 4A plots the instantaneous intraday unconditional stock-price autocorrelation, which is again scaled relative to h

$$\begin{aligned} \rho (t) := \lim _{h\downarrow 0}\frac{ \text {corr} (\hat{S}_{t+h}-\hat{S}_{t}, \hat{S}_{t+2h}-\hat{S}_{t+h})}{h},\quad t\in [0,1), \end{aligned}$$
(3.16)

for the equilibrium stock-price process \(\hat{S}_t\).

Price pressure from persistent parent demands leads to rising intraday price autocorrelation over the trading day. Fig. 4B plots the time trajectory of the unconditional variance of intraday price drifts over the trading day based on the trackers’ equilibrium perceptions in (3.9). Predictable price drifts are important in actual markets as incentives for intraday liquidity provision by HFT market makers (represented in our model by rebalancers with realizations \(\tilde{a}_i \approx 0\)). We see that price-drift variability due to price pressure increases over the trading day.

Fig. 4

Plots of the stock-price scaled autocorrelation (3.16) and of the variance of trackers’ equilibrium stock-price drift over time for the equilibrium stock-price dynamics \(d\hat{S}_t\) in (3.9). The exogenous model parameters are \( \sigma _{\tilde{a}}:=1, M:=5, \bar{M}:=10, \;\alpha :=0\), \(B(0):=-0.2\), \(\kappa (t):=1\) for \(t\in [0,1]\), and \((\gamma , \sigma _{w_0}) =(\frac{1}{2},\frac{1}{10}) (\text {blue}),\,(\frac{1}{2},1) (\text {amber}), \,(1,\frac{1}{10}) (\text {green})\), and \((1,1)(\text {red}).\)

3.4.3 Equilibrium learning

Fig. 5A shows that \(\sigma _{w_0}>0\) controls the starting point \(\Sigma (0)>0\) whereas \(\gamma >0\) controls the speed of learning (i.e., how negative the slope of \(\Sigma (t)\) is). For example, the green and red lines (\(\gamma =1\)) illustrate a slower speed of learning relative to the amber and blue lines (\(\gamma =\frac{1}{2}\)). These effects on \(\Sigma '(t)\) come from Fig. 5B and the formula for \(\Sigma (t)\) in terms of \(B'(t)\) in (2.15). The red line in Fig. 5A also shows that the remaining variance \(\Sigma (1)\) at \(t=1\) can be substantial.

Fig. 5

Plots of solutions of ODEs in (3.7) over time. The exogenous model parameters are \( \sigma _{\tilde{a}}:=1, M:=5, \bar{M}:=10, \;\alpha :=0\), \(B(0):=-0.2\), \(\kappa (t):=1\) for \(t\in [0,1]\), and \((\gamma , \sigma _{w_0}) =(\frac{1}{2},\frac{1}{10}) (\text {blue}),\,(\frac{1}{2},1) (\text {amber}), \,(1,\frac{1}{10}) (\text {green})\), and \((1,1)(\text {red}).\)

3.4.4 Equilibrium welfare

In this section, we study the impact of the exogenous model input \(B(0)\in \mathbb {R}\) on equilibrium welfare. There are many ways to measure social welfare (see, e.g., Vayanos [35,  Section 6]). We follow Du and Zhu [17,  Eq. 42] and consider maximizing the expected aggregate certainty equivalent for the \(M+\overline{M}\) investors. The certainty equivalent CE\(_k\in \mathbb {R}\) for investor \(k\in \{1,...,M+\overline{M}\}\) is defined by the expressions in (2.5). The aggregate expected welfare is given by

$$\begin{aligned} \begin{aligned} \sum _{k=1}^{M+\overline{M}}\mathbb {E}[\text {CE}_k], \end{aligned} \end{aligned}$$
(3.17)

where the expectation in (3.17) is ex ante in the sense that it is taken over the random variables \((\tilde{a}_1,...,\tilde{a}_M)\) and \(w_0\) (Gaussian and independent).

Fig. 6

Plots of aggregate expected welfare in (3.17) for varying B(0). The exogenous model parameters are \( \sigma _{\tilde{a}}:=1, M:=5, \bar{M}:=10, \; \kappa (t):=1\) for \(t\in [0,1],\; \alpha :=0\), and \((\gamma , \sigma _{w_0}) =(\frac{1}{2},\frac{1}{10}) (\text {blue}),\,(\frac{1}{2},1) (\text {amber}), \,(1,\frac{1}{10}) (\text {green})\), and \((1,1)(\text {red}).\)

Fig. 6 shows that in the price-friction equilibrium with \(\alpha :=0\), expected welfare is maximized at \(B(0)=\frac{1}{{\bar{M}}}\). This is not too surprising because \(B(0)=\frac{1}{{\bar{M}}}\) implies full revelation and no dynamic learning takes place for \(t>0\) (see the discussion after Lemma 3.4). Aggregate welfare is decreasing in the initial tracker parent standard deviation \(\sigma _{w_0}\) both because more demand accommodation is required and also because the rebalancer learning problem is more difficult. This effect can be seen by comparing the blue and green (low initial standard deviation) and amber and red (high initial standard deviation) cases in Fig. 6.

4 Subgame perfect Nash equilibrium

This section builds on the analysis in Section 3 by endogenizing stock-price perceptions and price impact. In particular, we partially endogenize the impact of an investor’s hypothetical off-equilibrium holdings on off-equilibrium market-clearing stock prices based on her perceptions of how other investors perceive prices and on other investors’ resulting optimal response functions to her off-equilibrium holdings. More specifically, a subgame perfect Nash equilibrium involves describing how each trader \(k_0\) (who might be a rebalancer \(i_0\) or a tracker \(j_0\) with their different filtrations) perceives all other traders’ price perceptions.

The major difference between the price-friction equilibrium in Sect. 3 and our subgame perfect Nash equilibrium lies in the traders’ stock-price perceptions. For a subgame perfect Nash equilibrium, investor stock-price perceptions must be such that:

  (i) Trader \(k_0\)’s own stock-price perceptions must be consistent with market-clearing for any off-equilibrium holdings \(\theta _{k_0,t}\) used by \(k_0\), when other traders’ holding responses are optimal given the stock-price dynamics \(k_0\) perceives other traders \(k\ne k_0\) to have. This off-equilibrium market-clearing requirement can be found in, e.g., Vayanos [35].

  (ii) Trader \(k_0\)’s equilibrium holdings are found by solving her optimization problem using her own market-clearing stock-price dynamics from (i).

  (iii) All optimizers from (i) must be consistent with traders’ equilibrium holdings in (ii).

Definition 4.3 below makes properties (i)–(iii) operational. We refer to the last property (iii) as a consistency requirement between off- and on-equilibrium holdings.

4.1 Optimal off-equilibrium responses

In our subgame perfect Nash model, a generic trader \(k_0\) perceives that other rebalancers and trackers have stock-price perceptions of the form

$$\begin{aligned} \begin{aligned} dS^Z_{i,t}&:= \Big \{Z_t +\mu _1(t)\tilde{a}_i+\mu _2(t)q_{i,t}+\mu _3(t)\eta _t + \alpha \theta _{i,t}\Big \}dt+ \gamma dW_{i,t}, \\ S^Z_{i,0}&:=Z_0,\quad i\in \{1,...,M\},\\ dS^Z_{j,t}&:= \Big \{Z_t +\bar{\mu }_4(t)\tilde{a}_\Sigma +\bar{\mu }_5(t)w_t+ \alpha \theta _{j,t}\Big \}dt + \gamma dW_{j,t},\\ S^Z_{j,0}&:=Z_0,\quad j\in \{M+1,...,M+\bar{M}\}, \end{aligned} \end{aligned}$$
(4.1)

where \(W_{i,t}\) and \(W_{j,t}\) are Brownian motions and \(Z_t\) is an arbitrary Itô process (i.e., \(Z_t\) is the sum of a drift term and a stochastic-integral term). The “Z” superscript in (4.1) indicates that the perceived stock prices \(S_{i,t}^Z\) and \(S_{j,t}^Z\) are defined with respect to \(Z_t\). We use the market-clearing condition (2.3) to construct two such Itô processes in (4.5) and (4.9) below. These \(Z_t\) processes differ from \(Y_t\) in (3.1) and (3.3) in that we use \(Z_t\) to capture the effect of arbitrary off-equilibrium stock holdings by trader \(k_0\) on market-clearing prices given optimal responses by other investors k, \(k\ne k_0\). We then go on to determine endogenously the deterministic functions \((\mu _1,\mu _2,\mu _3,\bar{\mu }_4, \bar{\mu }_5)\) in equilibrium in Theorem 4.5 below.

Lemma 4.1 gives traders’ optimal response to an arbitrary Itô process \(Z_t\) and is the Nash equilibrium analogue of Lemma 3.2.

Lemma 4.1

(Optimal responses to \(Z_t\)) Let \(\mu _1,\mu _2,\mu _3,\bar{\mu }_4,\bar{\mu }_5:[0,1]\rightarrow \mathbb {R}\) and \(\kappa :[0,1]\rightarrow (0,\infty ]\) be continuous functions, let \(\alpha \le 0\), let \((Z_t)_{t\in [0,1]}\) be an Itô process, and let the perceived stock-price process in the wealth dynamics (2.7) be as in (4.1). Then, \(Z_t\) is adapted to both \(\mathcal {F}_{i,t}:=\sigma (\tilde{a}_i,Y_u,W_{i,u},S^Z_{i,u})_{u\in [0,t]}\) and \(\mathcal {F}_{j,t}:=\sigma (\tilde{a}_\Sigma ,w_u,Y_u,W_{j,u},S^Z_{j,u})_{u\in [0,t]}\) and, provided

$$\begin{aligned} \begin{aligned} \theta ^Z_{i,t}&:= \tfrac{1}{2 (\kappa (t)-\alpha ) }Z_t+\tfrac{2 \kappa (t)+\mu _1(t)}{2 (\kappa (t)-\alpha )}\tilde{a}_i+\tfrac{\mu _2(t)}{2 (\kappa (t)-\alpha )}q_{i,t}+\tfrac{\mu _3(t)}{2 (\kappa (t)-\alpha )}\eta _t,\\ \theta ^Z_{j,t}&:= \tfrac{1}{2 (\kappa (t)-\alpha )}Z_t+\tfrac{2 \kappa (t)+\bar{\mu }_5(t)}{2 (\kappa (t)-\alpha )}w_t +\tfrac{\bar{\mu }_4(t)}{2 (\kappa (t)-\alpha )}\tilde{a}_\Sigma , \end{aligned} \end{aligned}$$
(4.2)

satisfy (2.6), the maximizer for (2.5) is \(\theta ^Z_{i,t} \) for rebalancer \(i\in \{1,...,M\}\) and \(\theta ^Z_{j,t}\) for tracker \(j\in \{M+1,...,M+\bar{M}\}\). \(\diamondsuit \)

Similar to Lemma 3.2, Lemma 4.1 is proven using pointwise quadratic maximization. Unlike \(Y_t\) in Lemma 3.2, there is no Markov structure imposed on \(Z_t\) in Lemma 4.1, which makes dynamic programming inapplicable. Therefore, the simplicity of the linear-quadratic objectives in (2.5) is crucial for the proof of the optimality of \(\theta ^Z_{i,t}\) and \(\theta ^Z_{j,t}\) in (4.2).
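The pointwise maximization behind (4.2) can be illustrated with exact rational arithmetic. The sketch below assumes that, at each time \(t\), the integrand of the objective has the linear-quadratic form \(\theta \cdot (\text {drift}) - \kappa (t)(\theta - \text {target})^2\); this form is inferred from the coefficients in (4.2) and is not quoted from (2.5). The function names are hypothetical.

```python
from fractions import Fraction as F


def pointwise_objective(theta, drift_free, alpha, kappa, target):
    """ASSUMED integrand: holdings times perceived drift minus a quadratic
    penalty for deviating from the target (inferred, not quoted from (2.5))."""
    return theta * (drift_free + alpha * theta) - kappa * (theta - target) ** 2


def optimal_response(drift_free, alpha, kappa, target):
    """Solve the first-order condition
    drift_free + 2*alpha*theta - 2*kappa*(theta - target) = 0.
    Concavity requires kappa - alpha > 0 (kappa > 0 and alpha <= 0)."""
    return (drift_free + 2 * kappa * target) / (2 * (kappa - alpha))


def rebalancer_response_4_2(Z, a_i, q, eta, mu1, mu2, mu3, alpha, kappa):
    """Closed-form rebalancer response as displayed in (4.2)."""
    d = 2 * (kappa - alpha)
    return Z / d + (2 * kappa + mu1) / d * a_i + mu2 / d * q + mu3 / d * eta


if __name__ == "__main__":
    # Arbitrary illustrative values (not calibrated to the paper).
    Z, a_i, q, eta = F(1, 3), F(2), F(-1, 2), F(1, 4)
    mu1, mu2, mu3 = F(-1, 5), F(1, 7), F(-2, 3)
    alpha, kappa = F(-1, 10), F(1)
    drift_free = Z + mu1 * a_i + mu2 * q + mu3 * eta  # theta-free drift in (4.1)
    print(optimal_response(drift_free, alpha, kappa, a_i))
    print(rebalancer_response_4_2(Z, a_i, q, eta, mu1, mu2, mu3, alpha, kappa))
```

Under the stated assumption, the two printed values coincide, which is the sense in which (4.2) is a pointwise maximizer; no Markov structure on \(Z_t\) is used.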

4.2 Market-clearing stock-price perceptions

Investor \(k_0\)’s perceptions about other investors’ stock-price perceptions ensure that the stock market clears for any choice of \(k_0\)’s holdings. Thus, when solving for trader \(k_0\)’s individual equilibrium holdings, we require \(k_0\)’s perceived stock-price process (denoted by \(S^\nu _{k_0,t}\) below) clears the stock market for arbitrary hypothetical holdings \(\theta _{k_0,t}\). We assume that a given trader \(k_0\in \{1,...,M+\bar{M}\}\) perceives that other traders \(k\ne k_0\) perceive the stock-price processes in (4.1). Hence, trader \(k_0\) perceives that other traders k, \(k\ne k_0\), optimally hold \(\theta ^Z_{k,t}\) in (4.2) shares of stock. Given this, we then find market-clearing \(Z_{k_0,t}\) processes associated with arbitrary hypothetical holdings \(\theta _{k_0,t}\) for trader \(k_0\).

First, consider a rebalancer \(i_0\in \{1,...,M\}\). We construct a process \(Z_{i_0,t}\) such that the stock market clears in the sense

$$\begin{aligned} \begin{aligned} 0&= \underbrace{\theta _{i_0,t}}_{\text {rebalancer }i_0}+\underbrace{ \sum _{i=1, i\ne i_0}^M \theta ^{Z_{i_0}}_{i,t}}_{\text {other rebalancers}}+\underbrace{\sum _{j=M+1}^{M+\bar{M}}\theta ^{Z_{i_0}}_{j,t}}_{\text {trackers}},\quad t\in [0,1], \end{aligned} \end{aligned}$$
(4.3)

where \(\theta _{i_0,t}\) denotes an arbitrary stock-holdings process for rebalancer \(i_0\) and other investors’ responses \(\theta ^{Z_{i_0}}_{k,t}\) are from (4.2) for \(Z_t := Z_{i_0,t}\). Clearly, any solution \(Z_{i_0,t}\) of (4.3) is specific for rebalancer \(i_0\). To describe one particular solution \(Z_{i_0,t}\), we insert (4.2) into (4.3). This produces an affine equation in \((\theta _{i_0,t},Z_{i_0,t}, \tilde{a}_{i_0}, q_{i_0,t},\eta _t,w_t,\tilde{a}_\Sigma )\). Because rebalancer \(i_0\) can neither observe nor infer \(w_t\) and \(\tilde{a}_\Sigma \) separately, she has to filter based on observing a linear combination of \(w_t\) and \(\tilde{a}_\Sigma \) given by \(Y_t := w_t -B(t)\tilde{a}_\Sigma \) where \(B:[0,1]\rightarrow \mathbb {R}\) is a continuously differentiable function satisfying

$$\begin{aligned} B(t)=-\frac{A(t) \mu _2(t)+\bar{M} \bar{\mu }_4(t)+2 \kappa (t)+\mu _1(t)}{2 \bar{M} \kappa (t)+\bar{M} \bar{\mu }_5(t)}, \end{aligned}$$
(4.4)

where A(t) is as in (2.17). The specific form of (4.4) comes from rewriting (4.3) in terms of \((\theta _{i_0,t},Z_{i_0,t}, \tilde{a}_{i_0}, q_{i_0,t},\eta _t,Y_t)\) rather than \((\theta _{i_0,t},Z_{i_0,t}, \tilde{a}_{i_0}, q_{i_0,t},\eta _t,w_t,\tilde{a}_\Sigma )\). Because A(t) in (2.17) depends on B(t), Eq. (4.4) is a fixed point requirement for B(t). Below, we show that the coupled ODEs in (4.19) characterize \((A,B)\) in (4.4), and we give conditions ensuring that (4.19) has a solution. Given a solution B(t) to (4.4), we use \(Y_t:=w_t - B(t)\tilde{a}_\Sigma \) from (2.8) to express a solution of (4.3) as follows (see Footnote 12):

$$\begin{aligned} \begin{aligned} Z_{i_0,t}&:=\tfrac{2 (\alpha -\kappa (t))}{M+\bar{M}-1}\theta _{i_0,t} +\tfrac{2\kappa (t)+\mu _1(t)}{M+\bar{M}-1}\tilde{a}_{i_0}+\tfrac{\mu _2(t)}{M+\bar{M}-1}q_{i_0,t}\\&-\tfrac{(M-1) \mu _3(t)+\mu _2(t)}{M+\bar{M}-1}\eta _t -\tfrac{\bar{M} (2 \kappa (t)+\bar{\mu }_5(t))}{M+\bar{M}-1}Y_t,\quad t\in [0,1]. \end{aligned} \end{aligned}$$
(4.5)

The process \(Z_{i_0,t}\) in (4.5) captures the impact of arbitrary holdings \(\theta _{i_0,t}\) by rebalancer \(i_0\) on market-clearing stock prices given \(i_0\)’s perceptions of how other traders \(k\ne i_0\) optimally respond using \(\theta _{k,t}^{Z_{i_0}}\) from (4.2) with \(Z_t := Z_{i_0,t}\).
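The way market clearing pins down the coefficient \(\tfrac{2(\alpha -\kappa (t))}{M+\bar{M}-1}\) on \(\theta _{i_0,t}\) in (4.5) can be sketched generically. By (4.2), each of the other \(M+\bar{M}-1\) traders holds \((Z_t + c_k)/(2(\kappa (t)-\alpha ))\), where \(c_k\) is shorthand (introduced here for illustration only) for the part of her response that does not involve \(Z_t\). Solving the clearing condition for \(Z_t\) then gives an affine function of the deviating trader's holdings. A minimal sketch with hypothetical names:

```python
from fractions import Fraction as F


def response(Z, c_k, alpha, kappa):
    """Optimal response of another trader as in (4.2); c_k collects the
    Z-free terms (targets, filter estimates) and is illustrative shorthand."""
    return (Z + c_k) / (2 * (kappa - alpha))


def clearing_Z(theta0, others_c, alpha, kappa):
    """Solve 0 = theta0 + sum_k response(Z, c_k) for Z, where theta0 is the
    arbitrary holding of the deviating trader and others_c lists c_k for the
    remaining M + M_bar - 1 traders."""
    n_others = len(others_c)
    return (2 * (alpha - kappa) * theta0 - sum(others_c)) / n_others


if __name__ == "__main__":
    alpha, kappa = F(0), F(1)
    others_c = [F(k, 10) for k in range(14)]  # 14 = M + M_bar - 1 with M=5, M_bar=10
    for theta0 in (F(0), F(1)):
        Z = clearing_Z(theta0, others_c, alpha, kappa)
        total = theta0 + sum(response(Z, c, alpha, kappa) for c in others_c)
        print(theta0, Z, total)  # total holdings are zero for every theta0
```

The slope of `clearing_Z` in `theta0` is \(2(\alpha -\kappa )/(M+\bar{M}-1)\), matching the first term of (4.5); the remaining terms of (4.5) depend on the filtering relations in Section 2, which this sketch does not reproduce.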

Next, we describe rebalancer \(i_0\)’s stock-price perceptions for \(i_0\in \{1,...,M\}\). Rebalancer \(i_0\) filters based on her own target \(\tilde{a}_{i_0}\) and on observations of past and current perceived market-clearing stock prices \(S^\nu _{i_0,u}\) defined by

$$\begin{aligned} \begin{aligned} dS^\nu _{i_0,t}&:= \Big \{\nu _0(t)Z_{i_0,t} +\nu _1(t)\tilde{a}_{i_0} +\nu _2(t)q_{i_0,t}+\nu _3(t)\eta _t+ \alpha \theta _{i_0,t}\Big \}dt + \gamma dw_{i_0,t},\\ S^\nu _{i_0,0}&:=Y_0,\quad i_0\in \{1,...,M\}, \end{aligned} \end{aligned}$$
(4.6)

where \((\tilde{a}_{i_0},\theta _{i_0,t})\) are known and \((Z_{i_0,t} ,q_{i_0,t},\eta _t)\) are inferred by rebalancer \(i_0\). The “\(\nu \)” superscript in (4.6) indicates that the perceived stock prices are defined with respect to a particular set of deterministic functions \((\nu _0,\nu _1,\nu _2,\nu _3)\), which we endogenously determine in Theorem 4.5 below. More specifically, by observing \(\tilde{a}_{i_0}\) and \((S^\nu _{i_0,u})_{u\in [0,t]}\) defined in (4.6), rebalancer \(i_0\) infers \(Y_t:=w_t - B(t)\tilde{a}_\Sigma \) from (2.8) using the Volterra argument behind Lemma 3.1. To see this, we insert (4.5) into (4.6) to produce rebalancer \(i_0\)’s perceived market-clearing stock-price dynamics

$$\begin{aligned} \begin{aligned} dS^\nu _{i_0,t}&= \Big \{\Big (\tfrac{\nu _0(t) (2 \kappa (t)+\mu _1(t))}{M+\bar{M}-1}+\nu _1(t)\Big )\tilde{a}_{i_0} +\Big (\tfrac{\mu _2(t) \nu _0(t)}{M+\bar{M}-1}+\nu _2(t)\Big )q_{i_0,t}\\&\quad +\Big (\nu _3(t)-\tfrac{\nu _0(t) ((M-1) \mu _3(t)+\mu _2(t))}{M+\bar{M}-1}\Big )\eta _t-\tfrac{\bar{M} \nu _0(t) (2 \kappa (t)+\bar{\mu }_5(t))}{M+\bar{M}-1}Y_t\\&\quad +\Big (\alpha -\tfrac{2 \nu _0(t) (\kappa (t)-\alpha )}{M+\bar{M}-1}\Big )\theta _{i_0,t}\Big \}dt + \gamma dw_{i_0,t}. \end{aligned} \end{aligned}$$
(4.7)

Because the expressions multiplying \((\tilde{a}_{i_0},q_{i_0,t},\eta _t,Y_t,\theta _{i_0,t})\) in (4.7) are continuous (deterministic) functions of time \(t\in [0,1]\), Lemma 3.1 applies and shows that by observing \(\tilde{a}_{i_0}\) and \((S^\nu _{i_0,u})_{u\in [0,t]}\) in (4.7) over time \(t\in [0,1]\), rebalancer \(i_0\) can infer \(w_{i_0,t}\). Subsequently, rebalancer \(i_0\) can use (2.10) and (2.14) to also infer \(Y_t\) over time \(t\in [0,1]\).

Next, consider a tracker \(j_0\in \{M+1,...,M+\bar{M}\}\). For arbitrary off-equilibrium holdings \(\theta _{j_0,t}\), the market-clearing solution \(Z_{j_0,t}\) from

$$\begin{aligned} \begin{aligned} 0&= \underbrace{ \theta _{j_0,t}}_{\text {tracker }j_0}+\underbrace{\sum _{j=M+1, j\ne j_0}^{M+\bar{M}}\theta ^{Z_{j_0}}_{j,t}}_{\text {other trackers}} +\underbrace{\sum _{i=1}^M \theta ^{Z_{j_0}}_{i,t}}_{\text {rebalancers}},\quad t\in [0,1], \end{aligned} \end{aligned}$$
(4.8)

is given by

$$\begin{aligned} \begin{aligned} Z_{j_0,t}&:=\tfrac{2 (\alpha -\kappa (t))}{M+\bar{M}-1}\theta _{j_0,t}-\tfrac{M \mu _3(t)+\mu _2(t)}{M+\bar{M}-1}\eta _t -\tfrac{(\bar{M}-1) (2 \kappa (t)+\bar{\mu }_5(t))}{M+\bar{M}-1}w_t\\&\;-\tfrac{A(t) \mu _2(t)+(\bar{M}-1) \bar{\mu }_4(t)+2 \kappa (t)+\mu _1(t)}{M+\bar{M}-1}\tilde{a}_\Sigma , \end{aligned} \end{aligned}$$
(4.9)

where A(t) is as in (2.17). Once again, \(Z_{j_0,t}\) captures tracker \(j_0\)’s perceptions of the impact of her holdings \(\theta _{j_0,t}\) on market-clearing stock prices given \(j_0\)’s perceptions of other investors’ \(k\ne j_0\) responses \(\theta _{k,t}^{Z_{j_0}}\) to \(\theta _{j_0,t}\).

Tracker \(j_0\)’s perceived market-clearing stock-price process is defined as

$$\begin{aligned} \begin{aligned} dS^{\bar{\nu }}_{j_0,t}&:= \Big \{Z_{j_0,t} +\bar{\nu }_3(t)\eta _t+\bar{\nu }_4(t)\tilde{a}_\Sigma +\bar{\nu }_5(t)w_t+ \alpha \theta _{j_0,t}\Big \}dt + \gamma dw_t,\\ S^{\bar{\nu }}_{j_0,0}&:=Y_0,\quad j_0\in \{M+1,...,M+\bar{M}\}, \end{aligned} \end{aligned}$$
(4.10)

where \(\bar{\nu }_3,\bar{\nu }_4,\bar{\nu }_5:[0,1]\rightarrow \mathbb {R}\) are deterministic functions of time (endogenously determined in Theorem 4.5 below). Inserting (4.9) into (4.10) gives tracker \(j_0\)’s perceived market-clearing stock-price dynamics

$$\begin{aligned} \begin{aligned} dS^{\bar{\nu }}_{j_0,t}&= \Big \{\Big (\bar{\nu }_3(t)-\tfrac{M \mu _3(t)+\mu _2(t)}{M+\bar{M}-1}\Big )\eta _t\\&+\Big (\bar{\nu }_5(t)-\tfrac{(\bar{M}-1) (2 \kappa (t)+\bar{\mu }_5(t))}{M+\bar{M}-1}\Big )w_t \\&+\Big (\bar{\nu }_4(t)-\tfrac{A(t) \mu _2(t)+(\bar{M}-1) \bar{\mu }_4(t)+2 \kappa (t)+\mu _1(t)}{M+\bar{M}-1}\Big ) \tilde{a}_\Sigma \\&+\tfrac{\alpha (M+\bar{M}+1)-2 \kappa (t)}{M+\bar{M}-1}\theta _{j_0,t}\Big \}dt + \gamma dw_t. \end{aligned} \end{aligned}$$
(4.11)

We note that tracker \(j_0\)’s perceived market-clearing stock-price dynamics \(dS^{\bar{\nu }}_{j_0,t}\) in (4.11) are driven by the exogenous Brownian motion \(w_t\) from (2.2) whereas rebalancer \(i_0\)’s stock prices \(dS^\nu _{i_0,t}\) in (4.7) are driven by \(i_0\)’s innovations process \(dw_{i_0,t}\) from (2.11). This is due to the different information sets of rebalancers and trackers.

Unlike the price-friction equilibrium in Theorem 3.5, we see from (4.7) and (4.11) that, even with no direct price impact in the sense \(\alpha := 0\) in (4.6) and (4.10), the remaining net price impacts \(-\frac{2 \nu _0(t) \kappa (t)}{M+\bar{M}-1}\) and \(-\frac{2 \kappa (t)}{M+\bar{M}-1}\) of \(\theta _{i,t}\) and \(\theta _{j,t}\) are nonzero. This is because price pressure in (4.7) and (4.11) clears the stock market for arbitrary holdings \(\theta _{i,t}\) and \(\theta _{j,t}\).
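As a numerical illustration of these net price impacts, the sketch below evaluates the coefficients on \(\theta _{i_0,t}\) in (4.7) and on \(\theta _{j_0,t}\) in (4.11). The values \(M=5\), \(\bar{M}=10\), and \(\kappa \equiv 1\) are borrowed from the figure captions in Section 3; the value \(\nu _0=1\) is purely hypothetical because \(\nu _0\) is determined endogenously in Theorem 4.5.

```python
from fractions import Fraction as F


def rebalancer_impact(alpha, kappa, nu0, M, M_bar):
    """Coefficient on theta_{i0,t} in the drift of (4.7)."""
    return alpha - 2 * nu0 * (kappa - alpha) / (M + M_bar - 1)


def tracker_impact(alpha, kappa, M, M_bar):
    """Coefficient on theta_{j0,t} in the drift of (4.11)."""
    return (alpha * (M + M_bar + 1) - 2 * kappa) / (M + M_bar - 1)


if __name__ == "__main__":
    M, M_bar = 5, 10
    kappa, alpha = F(1), F(0)  # no direct price impact
    nu0 = F(1)                 # HYPOTHETICAL value, for illustration only
    print(rebalancer_impact(alpha, kappa, nu0, M, M_bar))
    print(tracker_impact(alpha, kappa, M, M_bar))
```

With these inputs both coefficients equal \(-2/14\), and both shrink in magnitude as the number of traders \(M+\bar{M}\) grows, since more counterparties share the accommodation of any one trader's off-equilibrium holdings.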

The next result gives the optimal holdings \(\theta ^*_{k,t}\) for all traders \(k_0:=k\in \{1,...,M+\bar{M}\}\) given their perceptions of market-clearing stock prices in (4.7) and (4.11). While both \(\theta ^*_{k,t}\) and the optimal response holdings \(\theta ^{Z}_{k,t}\) in (4.2) maximize (2.5), they differ because they are based on different perceived stock-price processes. On one hand, the optimal responses \(\theta ^{Z}_{k,t}\) in (4.2) are based on the stock-price perceptions in (4.1). On the other hand, the optimizer \(\theta ^*_{k,t}\) is based on the market-clearing stock-price perceptions in (4.7) and (4.11).

Lemma 4.2

Let \(\nu _0,\nu _1\), \(\nu _2,\nu _3,\bar{\nu }_3,\bar{\nu }_4,\bar{\nu }_5:[0,1]\rightarrow \mathbb {R}\) and \(\kappa :[0,1]\rightarrow (0,\infty ]\) be continuous functions with \(\nu _0>0\) and assume \(\alpha \le 0\). Let the perceived market-clearing stock-price processes in the wealth dynamics (2.7) be given by (4.7) and (4.11) with corresponding filtrations \(\mathcal {F}_{i,t}:= \sigma (\tilde{a}_i,S^\nu _{i,u})_{u\in [0,t]}\) and \(\mathcal {F}_{j,t}:= \sigma (w_u,S^{\bar{\nu }}_{j,u})_{u\in [0,t]}\) for \( i\in \{1,...,M\}\) and \(j\in \{M+1,...,M+\bar{M}\}\). Then, provided the holding processes

$$\begin{aligned} \begin{aligned} \theta _{i,t}^* :&=\tfrac{2 \kappa (t) (M+\bar{M}+\nu _0(t)-1)+(M+\bar{M}-1) \nu _1(t)+\mu _1(t) \nu _0(t)}{2 (\kappa (t)-\alpha ) (M+\bar{M}+2 \nu _0(t)-1)}\tilde{a}_i\\&+\tfrac{(M+\bar{M}-1) \nu _2(t)+\mu _2(t) \nu _0(t)}{2 (\kappa (t)-\alpha ) (M+\bar{M}+2 \nu _0(t)-1)}q_{i,t}\\&- \tfrac{\nu _0(t) ((M-1) \mu _3(t)+\mu _2(t))-(M+\bar{M}-1) \nu _3(t)}{2 (\kappa (t)-\alpha ) (M+\bar{M}+2 \nu _0(t)-1)}\eta _t\\&-\tfrac{\bar{M} \nu _0(t) (2 \kappa (t)+\bar{\mu }_5(t))}{2 (\kappa (t)-\alpha ) (M+\bar{M}+2 \nu _0(t)-1)}Y_t,\\ \theta _{j,t}^* :&= \tfrac{(M+\bar{M}-1) \bar{\nu }_3(t)-M \mu _3(t)-\mu _2(t)}{2 (M+\bar{M}+1) (\kappa (t)-\alpha )}\eta _t \\&+\tfrac{(M+\bar{M}-1) \bar{\nu }_5(t)+2 M \kappa (t)-(\bar{M}-1) \bar{\mu }_5(t)}{2 (M+\bar{M}+1) (\kappa (t)-\alpha )}w_t\\&-\tfrac{A(t) \mu _2(t)-(M+\bar{M}-1) \bar{\nu }_4(t)+(\bar{M}-1) \bar{\mu }_4(t)+2 \kappa (t)+\mu _1(t)}{2 (M+\bar{M}+1) (\kappa (t)-\alpha )}\tilde{a}_\Sigma , \end{aligned} \end{aligned}$$
(4.12)

satisfy (2.6), the traders’ maximizers for (2.5) are \(\theta _{i,t}^*\) for rebalancer \(i\in \{1,...,M\}\) and \(\theta _{j,t}^*\) for tracker \(j\in \{M+1,...,M+\bar{M}\}\). \(\diamondsuit \)
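Some of the loadings in (4.12) can be cross-checked against the drift coefficients in (4.7) and (4.11) with exact rational arithmetic. The sketch assumes, as an inference from (4.2) rather than a quotation of (2.5), that the pointwise objective is \(\theta \cdot (\text {drift}) - \kappa (t)(\theta -\text {target})^2\), so that a drift of the form \(C + c_\theta \theta \) yields the maximizer \((C+2\kappa \cdot \text {target})/(2(\kappa -c_\theta ))\). All function names are hypothetical.

```python
from fractions import Fraction as F


def foc_loading(drift_coeff, theta_coeff, kappa, on_own_target):
    """Loading implied by the first-order condition of the ASSUMED objective
    theta * (C + theta_coeff * theta) - kappa * (theta - target)**2."""
    extra = 2 * kappa if on_own_target else 0
    return (drift_coeff + extra) / (2 * (kappa - theta_coeff))


def rebalancer_checks(M, Mb, kappa, alpha, nu0, nu1, mu1, mu5b):
    """Pairs (FOC loading, closed form in (4.12)) for a_i and Y_t."""
    n1 = M + Mb - 1
    theta_coeff = alpha - 2 * nu0 * (kappa - alpha) / n1   # from (4.7)
    den = 2 * (kappa - alpha) * (n1 + 2 * nu0)             # from (4.12)
    own = (foc_loading(nu0 * (2 * kappa + mu1) / n1 + nu1, theta_coeff, kappa, True),
           (2 * kappa * (n1 + nu0) + n1 * nu1 + mu1 * nu0) / den)
    y = (foc_loading(-Mb * nu0 * (2 * kappa + mu5b) / n1, theta_coeff, kappa, False),
         -Mb * nu0 * (2 * kappa + mu5b) / den)
    return own, y


def tracker_w_check(M, Mb, kappa, alpha, nu5b, mu5b):
    """Pair (FOC loading, closed form in (4.12)) for the tracker's w_t term."""
    n1 = M + Mb - 1
    theta_coeff = (alpha * (n1 + 2) - 2 * kappa) / n1      # from (4.11)
    drift_coeff = nu5b - (Mb - 1) * (2 * kappa + mu5b) / n1
    closed = (n1 * nu5b + 2 * M * kappa - (Mb - 1) * mu5b) / (2 * (n1 + 2) * (kappa - alpha))
    return foc_loading(drift_coeff, theta_coeff, kappa, True), closed


if __name__ == "__main__":
    # Arbitrary rational inputs; they are not equilibrium values.
    args = dict(M=5, Mb=10, kappa=F(1), alpha=F(-1, 3))
    print(rebalancer_checks(nu0=F(3, 2), nu1=F(1, 5), mu1=F(-2, 7), mu5b=F(-3, 8), **args))
    print(tracker_w_check(nu5b=F(1, 4), mu5b=F(-3, 8), **args))
```

Each returned pair agrees exactly under the assumed objective, which is consistent with Lemma 4.2 being a pointwise quadratic maximization applied to the market-clearing drifts.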

From Lemma 4.2, we note that a generic rebalancer \(i_0\) has filtration \(\sigma (\tilde{a}_{i_0},S^\nu _{i_0,u})_{u\in [0,t]}\) whereas she perceives that other rebalancers \(i\ne i_0\) have filtrations \(\sigma (\tilde{a}_i,Y_u,W_{i,u},S^Z_{i,u})_{u\in [0,t]}\) as in Lemma 4.1. Because these are \(i_0\)’s off-equilibrium perceptions, this is allowable as long as they are consistent with i’s equilibrium holdings. We require this consistency in Definition 4.3(iii) below. We also note from Lemma 4.1 that rebalancer i can infer \(Z_{i_0,t}\) in (4.5). In turn, this allows rebalancer i, \(i\ne i_0\), to also know the process

$$\begin{aligned} \tfrac{2 (\alpha -\kappa (t))}{M+\bar{M}-1}\theta _{i_0,t} +\tfrac{2\kappa (t)+\mu _1(t)}{M+\bar{M}-1}\tilde{a}_{i_0}+\tfrac{\mu _2(t)}{M+\bar{M}-1}q_{i_0,t}. \end{aligned}$$
(4.13)

However, knowing (4.13) is insufficient for rebalancer i, \(i\ne i_0\), to infer rebalancer \(i_0\)’s private target \(\tilde{a}_{i_0}\).

4.3 Equilibrium

Definition 4.3

Deterministic functions of time \(\mu _1,\mu _2,\mu _3,\bar{\mu }_4,\bar{\mu }_5,\nu _0,\nu _1,\nu _2,\nu _3,\bar{\nu }_3,\bar{\nu }_4,\bar{\nu }_5:[0,1]\rightarrow \mathbb {R}\) constitute a subgame perfect Nash financial-market equilibrium if:

  (i) For \(k \in \{1,...,M+\bar{M}\}\), trader k’s maximizer \(\theta ^*_{k,t}\) for (2.5) exists given the market-clearing stock-price perceptions (4.7) and (4.11).

  (ii) For \(k\in \{1,...,M+\bar{M}\}\), inserting trader k’s maximizer \(\theta ^*_{k,t}\) into the perceived market-clearing stock-price processes (4.7) and (4.11) produces identical stock-price processes across all traders. This common equilibrium stock-price process is denoted by \(S^*_t\).

  (iii) Optimizers and equilibrium holdings must be consistent in the sense that trader k’s perceived response to trader \(k_0\)’s maximizer \(\theta ^*_{k_0,t}\) is trader k’s maximizer \(\theta ^*_{k,t}\).

  (iv) The money and stock markets clear. \(\diamondsuit \)

The identical stock-price requirement in Definition 4.3(ii) is similar to the one in Definition 3.3(ii). We see from the rebalancers’ perceptions (4.6) that both the drifts and the martingale terms have i dependence. Similar to (3.5), we replace \(dw_{i,t}\) in \(dS^\nu _{i,t}\) in (4.6) with the decomposition of \(dw_{i,t}\) in terms of \(dw_t\) in (2.11) and rewrite \(dS^\nu _{i,t}\) in (4.6) as

$$\begin{aligned} \begin{aligned} dS^\nu _{i,t}&= \Big \{\nu _0(t)Z_{i,t} +\nu _1(t)\tilde{a}_i +\nu _2(t)q_{i,t}+\nu _3(t) \eta _t +\alpha \theta _{i,t}\\&\quad -B'(t)\big (\tilde{a}_\Sigma -\tilde{a}_i - q_{i,t} \big ) \gamma \Big \}dt+ \gamma dw_t,\quad i\in \{1,...,M\}. \end{aligned} \end{aligned}$$
(4.14)

Therefore, to ensure identical equilibrium stock-price perceptions for all traders \(k\in \{1,...,M+\bar{M}\}\), it suffices to match the drift of \(dS^{\bar{\nu }}_{j,t}\) in (4.10) for \(j\in \{M+1,...,M+\bar{M}\}\) with the drift of \(dS^\nu _{i,t}\) in (4.14) for the optimal holdings \(\theta _{i,t}:= \theta ^*_{i,t}\) for \(i\in \{1,...,M\}\). This produces the requirement

$$\begin{aligned} \begin{aligned}&\nu _0(t)Z^*_{i,t} +\nu _1(t)\tilde{a}_i +\nu _2(t)q_{i,t}+\nu _3(t) \eta _t +\alpha \theta ^*_{i,t} -B'(t)\big (\tilde{a}_\Sigma -\tilde{a}_i - q_{i,t} \big )\gamma \\&={{Z^*_{j,t}}}+\bar{\nu }_3(t)\eta _t+\bar{\nu }_4(t)\tilde{a}_\Sigma +\bar{\nu }_5(t)w_t+ \alpha \theta ^*_{j,t}, \end{aligned} \end{aligned}$$
(4.15)

for all rebalancers \(i \in \{1,...,M\}\) and all trackers \(j\in \{M+1,...,M+\bar{M}\}\). The right-hand side of (4.15) does not depend on the rebalancer index i. In (4.15), the process \(Z_{i,t}^*\) is (4.5) evaluated at \(\theta _{i,t}:= \theta ^*_{i,t}\), and \(Z_{j,t}^*\) is (4.9) evaluated at \(\theta _{j,t}:= \theta ^*_{j,t}\) so that:

$$\begin{aligned} \begin{aligned} Z^*_{i,t}&:=\tfrac{2 (\alpha -\kappa (t))}{M+\bar{M}-1}\theta ^*_{i,t} + \tfrac{2\kappa (t)+\mu _1(t)}{M+\bar{M}-1}\tilde{a}_{i}+\tfrac{\mu _2(t)}{M+\bar{M}-1}q_{i,t}\\&\;-\tfrac{(M-1) \mu _3(t)+\mu _2(t)}{M+\bar{M}-1}\eta _t -\tfrac{\bar{M} (2 \kappa (t)+\bar{\mu }_5(t))}{M+\bar{M}-1}Y_t,\\ Z^*_{j,t}&:=\tfrac{2 (\alpha -\kappa (t))}{M+\bar{M}-1}\theta ^*_{j,t}-\tfrac{M \mu _3(t)+\mu _2(t)}{M+\bar{M}-1}\eta _t \\&\;-\tfrac{(\bar{M}-1) (2 \kappa (t)+\bar{\mu }_5(t))}{M+\bar{M}-1}w_t-\tfrac{A(t) \mu _2(t)+(\bar{M}-1) \bar{\mu }_4(t)+2 \kappa (t)+\mu _1(t)}{M+\bar{M}-1}\tilde{a}_\Sigma , \end{aligned} \end{aligned}$$
(4.16)

for rebalancers \( i\in \{1,...,M\}\) and trackers \(j\in \{M+1,...,M+\bar{M}\}\).

As for the consistency requirement in Definition 4.3(iii), we first fix a rebalancer \(i_0\in \{1,...,M\}\). We require that the response holdings in (4.2) are consistent with \(\theta ^*_{i_0,t}\) in the sense that

$$\begin{aligned} \begin{aligned} \theta ^*_{i,t}&=\tfrac{1}{2( \kappa (t)- \alpha )}Z^*_{i_0,t}+\tfrac{2 \kappa (t)+\mu _1(t)}{2( \kappa (t)- \alpha )}\tilde{a}_i+\tfrac{\mu _2(t)}{2( \kappa (t)- \alpha )}q_{i,t}+\tfrac{\mu _3(t)}{2( \kappa (t)- \alpha )}\eta _t,\\ \theta ^*_{j,t}&= \tfrac{1}{2( \kappa (t)- \alpha )}Z^*_{i_0,t}+\tfrac{2 \kappa (t)+\bar{\mu }_5(t)}{2( \kappa (t)- \alpha )}w_t +\tfrac{\bar{\mu }_4(t)}{2( \kappa (t)- \alpha )}\tilde{a}_\Sigma , \end{aligned} \end{aligned}$$
(4.17)

for rebalancers \(i\in \{1,...,M\}\setminus \{i_0\}\) and trackers \(j\in \{M+1,...,M+\bar{M}\}\). Second, we fix a tracker \(j_0\in \{M+1,...,M+\bar{M}\}\) and require that the response holdings in (4.2) must be consistent with \(\theta ^*_{j_0,t}\) in the sense that

$$\begin{aligned} \begin{aligned} \theta ^*_{i,t}&=\tfrac{1}{2( \kappa (t)- \alpha )}Z^*_{j_0,t}+\tfrac{2 \kappa (t)+\mu _1(t)}{2 \alpha -2 \kappa (t)}\tilde{a}_i+\tfrac{\mu _2(t)}{2( \kappa (t)- \alpha )}q_{i,t}+\tfrac{\mu _3(t)}{2( \kappa (t)- \alpha )}\eta _t,\\ \theta ^*_{j,t}&=\tfrac{1}{2( \kappa (t)- \alpha )}Z^*_{j_0,t}+\tfrac{2 \kappa (t)+\bar{\mu }_5(t)}{2( \kappa (t)- \alpha )}w_t +\tfrac{\bar{\mu }_4(t)}{2( \kappa (t)- \alpha )}\tilde{a}_\Sigma , \end{aligned} \end{aligned}$$
(4.18)

for rebalancers \(i\in \{1,...,M\}\) and trackers \(j\in \{M+1,...,M+\bar{M}\}\setminus \{j_0\}\).

Similar to the price-friction equilibrium, our Nash equilibrium existence result is based on a technical lemma, which guarantees the existence of a solution to an autonomous system of coupled ODEs.

Lemma 4.4

Let \(\kappa :[0,1]\rightarrow (0,\infty ]\) be a continuous and integrable function (i.e., \(\int _0^1 \kappa (t)dt <\infty \)), let \(M+\bar{M}>2\), and let \(\alpha \le 0\). For a constant \(B(0) \in \mathbb {R}\), the coupled ODEs

$$\begin{aligned} \begin{aligned} B'(t)&=\frac{\begin{array}{l}\Big \{2 \kappa (t) \Big (\bar{M} B(t) (M+\bar{M}-1) \big (\alpha (M+\bar{M})-2 (M+\bar{M}-1) \kappa (t)\big )\\ +(M+\bar{M}-2)\big (\alpha (M+\bar{M}+1)-2 (M+\bar{M}) \kappa (t)\big )\Big )\Big \}\\ \end{array}}{\begin{array}{l} \Big \{ \gamma \Big (A(t) (M+\bar{M}-2) \big (\alpha (M+\bar{M}+1)-2 (M+\bar{M}) \kappa (t)\big )\\ +\alpha \big ((M^2+M-1) \bar{M}+M^2+2 M \bar{M}^2-M+\bar{M}^3-2\big )\\ -2 \left( (M^2-1) \bar{M}+(2 M-1) \bar{M}^2+(M-2) M+\bar{M}^3\right) \kappa (t)\Big )\Big \},\\ \end{array} },\\ A'(t)&= - \big (B'(t)\big )^2\Sigma (t)\big (A(t)+1\big ),\quad A(0)=-\frac{(M-1)B(0)^2\sigma ^2_{\tilde{a}}}{\sigma ^2_{w_0} +(M-1)B(0)^2\sigma ^2_{\tilde{a}}},\\ \Sigma '(t)&= -\big (B'(t)\big )^2\Sigma (t)^2, \quad \Sigma (0) =\frac{(M-1) \sigma _{\tilde{a}}^2 \sigma _{w_0}^2}{(M-1)B(0)^2 \sigma _{\tilde{a}}^2+\sigma _{w_0}^2}, \end{aligned} \end{aligned}$$
(4.19)

have unique solutions with \(\Sigma (t) \ge 0\), \(\Sigma (t)\) decreasing, \(A(t) \in [-1,0]\), and A(t) decreasing for \(t\in [0,1]\). \(\diamondsuit \)
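The system (4.19) can be explored numerically. The following is a minimal sketch, not the authors' implementation: it integrates (4.19) with a classical fourth-order Runge–Kutta scheme in plain Python. The names `b_prime`, `rhs`, `solve_419`, and the dictionary `par` are hypothetical, the values of \(\gamma \) and \(\sigma _{w_0}\) are illustrative choices, and \(\kappa \) is assumed finite on all of [0, 1].

```python
def b_prime(t, B, A, par):
    """B'(t) as given by the first equation in (4.19)."""
    M, Mb, al, g = par["M"], par["Mbar"], par["alpha"], par["gamma"]
    k = par["kappa"](t)
    N = M + Mb
    c1 = al * N - 2 * (N - 1) * k    # alpha(M+Mbar) - 2(M+Mbar-1)kappa
    c2 = al * (N + 1) - 2 * N * k    # alpha(M+Mbar+1) - 2(M+Mbar)kappa
    num = 2 * k * (Mb * B * (N - 1) * c1 + (N - 2) * c2)
    den = g * (
        A * (N - 2) * c2
        + al * ((M**2 + M - 1) * Mb + M**2 + 2 * M * Mb**2 - M + Mb**3 - 2)
        - 2 * ((M**2 - 1) * Mb + (2 * M - 1) * Mb**2 + (M - 2) * M + Mb**3) * k
    )
    return num / den


def rhs(t, y, par):
    """Right-hand sides of (4.19) for the state y = (B, A, Sigma)."""
    B, A, S = y
    bp = b_prime(t, B, A, par)
    return (bp, -bp**2 * S * (A + 1.0), -bp**2 * S**2)


def solve_419(par, n_steps=1000):
    """Integrate (4.19) on [0, 1]; returns a list of (t, B, A, Sigma)."""
    v = (par["M"] - 1) * par["B0"] ** 2 * par["sigma_a"] ** 2
    w = par["sigma_w0"] ** 2
    # Initial values A(0) and Sigma(0) as stated in (4.19).
    y = (par["B0"], -v / (w + v), (par["M"] - 1) * par["sigma_a"] ** 2 * w / (v + w))
    h = 1.0 / n_steps
    path = [(0.0,) + y]
    for n in range(n_steps):
        t = n * h
        k1 = rhs(t, y, par)
        k2 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(y, k1)), par)
        k3 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(y, k2)), par)
        k4 = rhs(t + h, tuple(a + h * b for a, b in zip(y, k3)), par)
        y = tuple(a + h / 6 * (p + 2 * q + 2 * r + s)
                  for a, p, q, r, s in zip(y, k1, k2, k3, k4))
        path.append((t + h,) + y)
    return path


# Illustrative parameters (gamma and sigma_w0 are assumptions).
par = dict(M=10, Mbar=10, alpha=-0.1, gamma=1.0, sigma_a=1.0,
           sigma_w0=1.0, B0=-1.0, kappa=lambda t: 1.0)
path = solve_419(par)
t1, B1, A1, S1 = path[-1]
print(f"B(1)={B1:.4f}, A(1)={A1:.4f}, Sigma(1)={S1:.4f}")
```

A fixed-step scheme is used only to keep the sketch dependency-free; an adaptive solver would be a natural replacement if \(\kappa (t)\) grows quickly near \(t=1\).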

The ODE for B(t) in (4.19) is more complicated than the corresponding ODE in (3.7) because the Nash equilibrium has the additional fixed point requirement in (4.4) that is absent in the price-friction equilibrium. Both ODEs are, however, affine in B(t). It is possible to restate the ODE system (4.19) using a single path-dependent ODE. The special case \(\alpha :=0\) and \(B(0):=-\frac{1}{{\bar{M}}}+\frac{1}{{\bar{M}} (M+{\bar{M}}-1)^2}\) produces a Nash model with no dynamic learning because \(B'(t)=0\) implies \(\Sigma '(t)=0\) and so \(d\eta _t=dq_{i,t}=0\). The resulting subgame perfect Nash equilibrium model only has learning at \(t=0\) and can be seen as a special case of Choi, Larsen, and Seppi [13].
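One way to see the reduction of (4.19) to a single path-dependent ODE is sketched here. It uses only (4.19) and assumes \(M>1\), \(\sigma _{\tilde{a}}>0\), and \(\sigma _{w_0}>0\) so that \(\Sigma (0)>0\); the authors' own restatement may differ. The last two equations in (4.19) give

$$\begin{aligned} \begin{aligned} \frac{d}{dt}\frac{1}{\Sigma (t)}&=-\frac{\Sigma '(t)}{\Sigma (t)^2}=\big (B'(t)\big )^2,\\ \frac{d}{dt}\frac{A(t)+1}{\Sigma (t)}&=\frac{A'(t)\Sigma (t)-\big (A(t)+1\big )\Sigma '(t)}{\Sigma (t)^2}=0. \end{aligned} \end{aligned}$$

The initial values in (4.19) satisfy \(\frac{A(0)+1}{\Sigma (0)}=\frac{1}{(M-1)\sigma ^2_{\tilde{a}}}\), and so

$$\begin{aligned} \begin{aligned} \Sigma (t)&=\Big (\frac{1}{\Sigma (0)}+\int _0^t \big (B'(s)\big )^2ds\Big )^{-1},\quad A(t)=\frac{\Sigma (t)}{(M-1)\sigma ^2_{\tilde{a}}}-1. \end{aligned} \end{aligned}$$

Substituting this expression for A(t) into the first equation in (4.19) leaves one ODE for B(t) whose right-hand side depends on the path of \(B'\) through the integral \(\int _0^t (B'(s))^2ds\).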

Our main theoretical result gives a Nash equilibrium in terms of the ODEs (4.19). In this theorem, the price-friction parameter \(\alpha \le 0 \), volatility \(\gamma >0\), and initial value \(B(0)\in \mathbb {R}\) are free parameters.

Theorem 4.5

Let \(\kappa :[0,1]\rightarrow (0,\infty )\) be continuous, let the functions \((B,A,\Sigma )\) be as in Lemma 4.4, let \(M+\bar{M}>2\), and let \(\alpha \le 0\). Then, we have:

  1. (i)

    A subgame perfect Nash financial-market equilibrium exists and is given by the functions in (A.6) in Appendix A.

  2. (ii)

    The Nash equilibrium in (i) has holdings given by

    $$\begin{aligned} \theta _{i,t}^*&:= -\frac{(M+\bar{M}-2) \left( 2 \kappa (t)-\gamma B'(t)\right) }{\alpha (M+\bar{M})-2 (M+\bar{M}-1) \kappa (t)}\tilde{a}_i\nonumber \\&+\frac{\gamma (M+\bar{M}-2) B'(t)}{\alpha (M+\bar{M})-2 (M+\bar{M}-1) \kappa (t)}q_{i,t}\nonumber \\&-\frac{\begin{array}{l}\Big \{\gamma (M+\bar{M}-2)^2 B'(t) (\alpha (M+\bar{M}+1)-2 (M+\bar{M}) \kappa (t))\Big \}\nonumber \\ \end{array}}{\begin{array}{l} \Big \{(\alpha (M+\bar{M})-2 (M+\bar{M}-1) \kappa (t)) \big (\alpha \big ((3 M-1) \bar{M}^2+M (3 M-2) \bar{M}\nonumber \\ +(M-2) M (M+1)+\bar{M}^3\big )-2 \left( (M+\bar{M}-2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)\big )\Big \} \end{array} }\eta _t\nonumber \\&+\frac{\begin{array}{l}\Big \{2 \bar{M} (M+\bar{M}-2) (M+\bar{M}-1) \kappa (t)\Big \}\\ \end{array}}{\begin{array}{l} \Big \{\alpha \left( (3 M-1) \bar{M}^2+M (3 M-2) \bar{M}+(M-2) M (M+1)+\bar{M}^3\right) \\ -2 \left( (M+\bar{M}-2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)\Big \} \end{array} } Y_t, \end{aligned}$$
    (4.20)
    $$\begin{aligned} \begin{aligned} \theta _{j,t}^* :&= -\tfrac{\gamma (M+\bar{M}-2) (M+\bar{M}-1) B'(t)}{\alpha \left( (3 M-1) \bar{M}^2+M (3 M-2) \bar{M}+(M-2) M (M+1)+\bar{M}^3\right) -2 \left( (M+\bar{M}-2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)}\eta _t \\&\qquad -\tfrac{2 M (M+\bar{M}-2) (M+\bar{M}-1) \kappa (t)}{\alpha \left( (3 M-1) \bar{M}^2+M (3 M-2) \bar{M}+(M-2) M (M+1)+\bar{M}^3\right) -2 \left( (M+\bar{M}-2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)}w_t \\&\qquad +\tfrac{(M+\bar{M}-2) (M+\bar{M}-1) \left( \gamma (-A(t)+M-1) B'(t)+2 \kappa (t)\right) }{\alpha \left( (3 M-1) \bar{M}^2+M (3 M-2) \bar{M}+(M-2) M (M+1)+\bar{M}^3\right) -2 \left( (M+\bar{M}-2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)} \tilde{a}_\Sigma , \end{aligned} \end{aligned}$$

    for rebalancers \(i\in \{1,...,M\}\) and trackers \( j\in \{M+1,...,M+\bar{M}\}\).

  3. (iii)

    The Nash equilibrium in (i) has the stock-price process \(S^*_t\) given by \(S^*_0 := w_0 - B(0)\tilde{a}_\Sigma \) and dynamics with respect to the trackers’ filtrations \(\mathcal {F}_{j,t}:=\sigma (w_u,S^{\bar{\nu }}_{j,u})_{u\in [0,t]}\) given by

    $$\begin{aligned} \begin{aligned} dS^*_t&=\Big \{\tfrac{\gamma (M+\bar{M}-2) B'(t) (\alpha (M+\bar{M}+1)-2 (M+\bar{M}) \kappa (t))}{\alpha \left( (3 M-1) \bar{M}^2+M (3 M-2) \bar{M}+(M-2) M (M+1)+\bar{M}^3\right) -2 \left( (M+\bar{M}-2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)}\eta _t\\&\qquad -\tfrac{2 \bar{M} (M+\bar{M}-1) \kappa (t) (\alpha (M+\bar{M})-2 (M+\bar{M}-1) \kappa (t))}{\alpha \left( (3 M-1) \bar{M}^2+M (3 M-2) \bar{M}+(M-2) M (M+1)+\bar{M}^3\right) -2 \left( (M+\bar{M}-2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)}w_t \\&\qquad -\tfrac{(M+\bar{M}-2) (\alpha (M+\bar{M}+1)-2 (M+\bar{M}) \kappa (t)) \left( \gamma (-A(t)+M-1) B'(t)+2 \kappa (t)\right) }{\alpha \left( (3 M-1) \bar{M}^2+M (3 M-2) \bar{M}+(M-2) M (M+1)+\bar{M}^3\right) -2 \left( (M+\bar{M}-2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)}\tilde{a}_\Sigma \Big \}dt \\&\qquad + \gamma dw_t, \end{aligned} \end{aligned}$$
    (4.21)

    and dynamics with respect to the rebalancers’ filtrations \(\mathcal {F}_{i,t}:=\sigma (\tilde{a}_i,S^\nu _{i,u})_{u\in [0,t]}\) given by

    $$\begin{aligned} \begin{aligned} dS^*_t&=\Big \{\tfrac{\gamma (M+\bar{M}-2) B'(t) (\alpha (M+\bar{M}+1)-2 (M+\bar{M}) \kappa (t))}{\alpha \left( (3 M-1) \bar{M}^2+M (3 M-2) \bar{M}+(M-2) M (M+1)+\bar{M}^3\right) -2 \left( (M+\bar{M}-2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)}\eta _t\\&-\tfrac{2 \bar{M} (M+\bar{M}-1) \kappa (t) (\alpha (M+\bar{M})-2 (M+\bar{M}-1) \kappa (t))}{\alpha \left( (3 M-1) \bar{M}^2+M (3 M-2) \bar{M}+(M-2) M (M+1)+\bar{M}^3\right) -2 \left( (M+\bar{M}-2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)}Y_t \\&-\gamma B'(t)(\tilde{a}_i+q_{i,t})\Big \}dt + \gamma dw_{i,t}. \end{aligned} \end{aligned}$$
    (4.22)

    \(\diamondsuit \)

The following observations follow from Theorem 4.5:

  1. 1.

    The logic for the initial value B(0) being a free input parameter is the same as in the price-friction equilibrium.

  2. 2.

    The price-friction parameter \(\alpha \) and stock-price volatility \(\gamma \) affect the stock-price drift and holdings via their impact on B(t) in (4.19). The dependence on \(\alpha \) is different from the price-friction equilibrium where the corresponding B(t) in (3.7) is independent of \(\alpha \). The reason is that \(\alpha \) affects the perceived optimal responses in (4.2).

  3. 3.

    Similar to (3.11) and (3.12), for an arbitrary trader \(k_0 \in \{1,...,M+\bar{M}\}\) and her arbitrary holdings \(\theta _{k_0,t}\), the optimal responses in (4.2) can be decomposed as

    $$\begin{aligned} \begin{aligned} \theta ^{Z_{k_0}}_{i,t}&= \theta ^*_{i,t} -\frac{1}{M+\bar{M}-1} (\theta _{k_0,t}-\theta ^*_{k_0,t}),\quad i \in \{1,...,M\},\\ \theta ^{Z_{k_0}}_{j,t}&=\theta ^*_{j,t} -\frac{1}{M+\bar{M}-1} (\theta _{k_0,t}-\theta ^*_{k_0,t}),\quad j\in \{M+1,...,M+\bar{M}\}, \end{aligned} \end{aligned}$$
    (4.23)

    where the equilibrium holdings \((\theta ^*_{i,t}, \theta ^*_{j,t},\theta ^*_{k_0,t})\) are in (4.20).Footnote 13

  4. 4.

    The subgame perfect Nash financial-market equilibrium is attractive because of its reasonable off-equilibrium market-clearing perceptions. However, although much of the mathematical structure is similar, the expressions for the equilibrium stock price and holding coefficients are algebraically more complex. Nonetheless, our numerical results in Sect. 4.4 below show that the differences between the price-friction and the subgame perfect Nash financial-market equilibria are quantitatively small. This, in turn, suggests that the economic logic from the price-friction equilibrium carries over to the Nash equilibrium.
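The decomposition (4.23) makes the off-equilibrium market-clearing perception explicit. As a short worked check, reading (4.23) as applying to the \(M+\bar{M}-1\) traders other than \(k_0\) (an assumption about the intended index sets), summing the responses and adding trader \(k_0\)'s own holdings gives

$$\begin{aligned} \begin{aligned} \theta _{k_0,t}+\sum _{k\ne k_0}\theta ^{Z_{k_0}}_{k,t}&=\theta _{k_0,t}+\sum _{k\ne k_0}\theta ^*_{k,t}-\tfrac{M+\bar{M}-1}{M+\bar{M}-1}(\theta _{k_0,t}-\theta ^*_{k_0,t})\\&=\sum _{k=1}^{M+\bar{M}}\theta ^*_{k,t}. \end{aligned} \end{aligned}$$

Aggregate holdings are therefore unchanged by the deviation: whenever the equilibrium holdings clear the market, so do the perceived responses, with each of the other traders absorbing an equal share \(\tfrac{1}{M+\bar{M}-1}\) of trader \(k_0\)'s deviation.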

4.4 Numerics

We have experimented extensively with the subgame perfect Nash model’s numerics, which are very similar to those of the price-friction equilibrium in Sect. 3. The numerical similarity of the two equilibria suggests that the intuitions for the signs of the various coefficients in the price-friction equilibrium carry over to the subgame perfect Nash financial-market equilibrium. Because the two equilibria produce similar numerics, it appears that the in-equilibrium market-clearing requirement (common to both equilibria) has a much larger effect on equilibrium prices than the off-equilibrium market-clearing requirement (present only in the subgame perfect Nash equilibrium).

5 Empirical predictions

The primary contribution of our analysis is theoretical. The Kyle model has provided a tractable framework for a large body of theoretical research on price discovery and dynamic order splitting given long-lived asymmetric information about stock cash flows. However, no corresponding tractable framework exists for modeling price discovery and dynamic order splitting with private trading targets (e.g., by large index funds). Our model provides such a framework. While our zero-dividend modeling approach precludes statements about the impact of orders on price levels, our analysis does have empirical implications for intraday price drifts:

First, intraday price predictability is an important empirical driver of high-frequency liquidity provision. Our model’s equilibrium price dynamics in (3.9) and (4.21) suggest that intraday price drifts are path dependent (via the \(\eta _t\) term) and also that learning about parent-demand imbalances early in the trading day is associated with predictable price drifts later in the day.

Second, our analysis provides insights about the determinants of price impact as it relates to imbalance-related parent trading demands and toxic cumulative order flow. In particular, the holdings \(\theta _{k,t}\) are cumulative trading up through time t, and large parent targets \(\tilde{a}_i\) lead to toxic streams of orders. Our subgame perfect Nash model endogenizes the price-drift impact of investor holdings (i.e., cumulative trading). The Nash model’s price-friction coefficient in the rebalancer’s perceived stock-price dynamics (4.7) is given by

$$\begin{aligned} -\frac{2 \nu _0(t) (\kappa (t)-\alpha )}{M+\bar{M}-1} = -2\frac{ \kappa (t)-\alpha }{M+\bar{M}-2}, \end{aligned}$$
(5.1)

where we have inserted \(\nu _0(t)\) from (A.6). An implication of (5.1) is that if, as is widely believed, investor target penalties become stronger as time passes (i.e., if \(\kappa (t)\) increases with time), then our Nash model predicts that the magnitude of the total price impact in (5.1) should increase over time. On its face, this is contrary to evidence in Barardehi and Bernhardt [6] that price impact declines over the trading day. We conjecture, however, that a richer model can be reconciled with these stylized facts if the number of investors (and, thus, the available inventory bearing capacity to absorb aggregate parent demand imbalances) is also allowed to grow as the market approaches the end of the trading day. Increased investor participation toward the end of the trading day is also empirically common.
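The comparative statics of (5.1) can be illustrated with a few lines of Python. This is a minimal sketch: the function name `price_impact_coefficient` and all parameter values are hypothetical, and the growing number of traders is only the conjectured extension discussed above, not part of the model.

```python
def price_impact_coefficient(kappa, alpha, M, Mbar):
    """Right-hand side of (5.1): -2 (kappa - alpha) / (M + Mbar - 2)."""
    n_traders = M + Mbar
    if n_traders <= 2:
        raise ValueError("(5.1) requires M + Mbar > 2")
    return -2.0 * (kappa - alpha) / (n_traders - 2)


# Illustrative values only.
early = price_impact_coefficient(kappa=1.0, alpha=-0.1, M=10, Mbar=10)
# Stronger penalty later in the day, same number of traders:
late_same_n = price_impact_coefficient(kappa=2.0, alpha=-0.1, M=10, Mbar=10)
# Stronger penalty and (conjectured) increased participation:
late_more_n = price_impact_coefficient(kappa=2.0, alpha=-0.1, M=20, Mbar=20)

print(abs(early), abs(late_same_n), abs(late_more_n))
```

With these numbers, doubling \(\kappa \) alone raises the magnitude of the coefficient, whereas doubling both \(\kappa \) and the number of traders lowers it below its initial level, which is the mechanism behind the conjecture.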

6 Measuring execution costs

As an application, this section gives a measure of a rebalancer’s costs of rebalancing from zero endowed shares at time \(t=0\) to a given target \(\tilde{a}_i\). We present the measure in the price-friction equilibrium in Sect. 3 (the Nash analogue is logically similar and produces similar numerics). In the price-friction equilibrium, rebalancer i’s value function is

$$\begin{aligned} \begin{aligned} J(\tilde{a}_i,0,\eta _0,Y_0,q_{i,0}):=&\mathbb {E}\Big [ \int _0^1 \hat{\theta }_{i,t}d{\hat{S}}_t - \int _0^1 \kappa (t)(\tilde{a}_i-\hat{\theta }_{i,t})^2dt\Big |\,\mathcal {F}_{i,0}\Big ], \end{aligned} \end{aligned}$$
(6.1)

where \(\hat{\theta }_{i,t}\) denotes rebalancer i’s equilibrium stock holdings in (3.8), and \(\mathcal {F}_{i,t}:=\sigma (\tilde{a}_i,S^f_{i,u})_{u\in [0,t]}\) with the f coefficient functions as in (A.1) in Appendix A for \(i\in \{1,...,M\}\). We seek a value function \(J= J(\tilde{a}_i,s,\eta ,Y,q_i)\) such that the process

$$\begin{aligned} \begin{aligned} J(\tilde{a}_i,s,\eta _s,Y_s,q_{i,s})&+ \int _0^s \Big \{\hat{\theta }_{i,t}\Big (f_0(t)Y_t +f_1(t)\tilde{a}_i +f_2(t)q_{i,t}+f_3(t)\eta _t+\alpha \hat{\theta }_{i,t}\Big )\\&- \kappa (t)(\tilde{a}_i-\hat{\theta }_{i,t})^2\Big \}dt,\quad s\in [0,1], \end{aligned} \end{aligned}$$
(6.2)

is a martingale with respect to \(\mathcal {F}_{i,t}\). Because rebalancer i’s objective in (2.5) is linear-quadratic, the value function J is again linear-quadratic in the state processes. Thus, J can be written as

$$\begin{aligned} \begin{aligned} J(\tilde{a}_i,s,\eta ,Y,q_i)&= J_0(s) + J_{\eta }(s) \eta + J_Y(s) Y+ J_{q_i}(s) q_i+ J_{\eta \eta }(s)\eta ^2 \\&\quad + J_{\eta Y}(s) \eta Y + J_{YY}(s) Y^2+J_{q_iq_i}(s) q^2_i+ J_{q_i\eta }(s)q_i\eta +J_{q_iY}(s) q_iY, \end{aligned} \end{aligned}$$
(6.3)

for deterministic functions of time \((J_0, J_\eta , J_Y, J_{q_i},J_{\eta \eta },J_{\eta Y}, J_{YY},J_{q_iq_i},J_{q_i\eta },J_{q_iY})\). These functions are given by a coupled set of ODEs with zero terminal conditions (we omit the ODEs for brevity). In (6.3), the dummy variables \((\eta ,Y,q_i)\) are real numbers and \(s\in [0,1]\).

Fig. 7 Plots of the rebalancers’ value function J for various values of \((\gamma ,\sigma _{w_0})\). The exogenous model parameters are \( \sigma _{\tilde{a}}:=1, M:=\bar{M}:=10, \;\alpha :=-0.1,\; B(0):=-1,\;\kappa (t):=1\) for \(t\in [0,1]\), and \(w_0:= B(0)(\tilde{a}_\Sigma -\tilde{a}_i)\).

To quantify the costs associated with rebalancer i’s trading target \(\tilde{a}_i\), the quadratic mapping RC (Rebalancing Costs) defined by

$$\begin{aligned} \begin{aligned} \text {RC}(\tilde{a}_i):=J(0,0,\eta ,Y,q_{i})-J(\tilde{a}_i,0,\eta ,Y,q_{i}), \end{aligned} \end{aligned}$$
(6.4)

measures the change in profit (i.e., the change in value function) associated with a non-zero target \(\tilde{a}_i\). The rebalancing cost RC in (6.4) for a target \(\tilde{a}_i\) is computed as the value function evaluated at \(\tilde{a}_i = 0\) minus the value function evaluated at \(\tilde{a}_i\). Since the value function J is highest at \(\tilde{a}_i = 0\), the measure RC is nonnegative.

Figure 7 plots the rebalancer’s value function J across target values \(\tilde{a}_i\) under different model parameterizations. When the target \(\tilde{a}_i\) is close to zero, the rebalancers become high-frequency liquidity providers. Their value function is positive due to expected profits from liquidity provision and price-pressure speculation. As the target moves away from zero, the rebalancer starts to have larger stock-holding penalties that eventually drive the rebalancer’s value function negative. Interestingly, the impact of the stock-price volatility parameter \(\gamma \) on the rebalancer’s value function can be positive or negative. Liquidity-providing rebalancers are better off with a small \(\gamma \) whereas rebalancers with large rebalancing targets are better off when \(\gamma \) is large.

7 Conclusion

This paper presents the first analytically tractable model of dynamic learning about parent trading-demand imbalances with optimized order-splitting. In particular, we provide closed-form expressions for prices and stock holdings in terms of solutions to systems of coupled ODEs in both the price-friction and subgame perfect Nash equilibria. Trading in our models reflects a combination of reaching investors’ own trading targets, liquidity provision so that markets can clear, and speculation based on predictions of future price pressure.

There are many interesting directions for future research based on our analysis. First, replacing the zero-dividend stock approach with valuation based on a terminal payoff would be a significant technical step. Second, the model could be enriched by allowing for investor heterogeneity in the form of different penalty functions \(\kappa (t)\) and by having multiple tracker targets (which would weaken the trackers’ informational advantage). Third, it would be interesting to investigate if other off-equilibrium refinements have larger equilibrium effects. Fourth, incorporating risk-aversion into the investors’ objectives would be interesting too. For example, how can Lemma 4.1 be extended if the objectives in (2.5) are changed to exponential utilities?