Confounding dynamics

In the context of a dynamic model with incomplete information, we isolate a novel mechanism of shock propagation that results in waves of optimism and pessimism along a Rational Expectations equilibrium. We term the mechanism confounding dynamics because it arises from agents’ optimal signal extraction efforts on variables whose dynamics—as opposed to superimposed noise—prevents full revelation of information. Employing methods in the space of analytic functions, we are able to obtain analytical characterizations of the equilibria that generalize the celebrated Hansen-Sargent optimal prediction formula. We apply our results to a canonical one-sector real business cycle model. We show that, in response to a persistent positive productivity shock, confounding dynamics generate expansions and recessions that would not be present under complete information.


Introduction
Modeling and seeking to understand economic fluctuations is one of the cornerstones of modern economics. The role of incomplete information in this endeavor was acknowledged very early on by Pigou (1929) and Keynes (1936). Their ideas were first formalized in a rational expectations setting by Lucas (1972, 1975), King (1982), and Townsend (1983b). The underlying theme that ties these papers together is that unresolved uncertainty, in and of itself, can be a source of fluctuations in the economy. This idea has seen a resurgence. Dynamic models with dispersed information are becoming increasingly prominent in several literatures such as asset pricing, optimal policy communication, international finance, and business cycles. 1 Our paper contributes to this literature by introducing the concept of confounding dynamics and doing so in a manner that permits tractability.
Confounding dynamics arise from optimal prediction (i.e., rational expectations) in which past realizations of economic shocks prevent full revelation of information today, even when an arbitrarily large amount of data is available. Ensuring that confounding dynamics survive amounts to deriving (non-invertibility) restrictions on equilibrium dynamics. If endogenous variables are non-invertible in current and past observations, agents will never fully unravel the contemporaneous economic shock. This is true even when the number of shocks is equal to the number of observables. Thus, a distinguishing feature of our approach is that confounding dynamics do not rely on imposing exogenous noise on the model. 2 The defining property of confounding dynamics is that the equilibrium impulse response functions consist of oscillating over- and under-reactions relative to the full information (or exogenous noise) equilibrium. The typical shape of an impulse response to a fundamental shock under complete information is one that monotonically decays back to steady state either from above or below.
With confounding dynamics, impulse response functions fluctuate around the full information counterpart. That is, they display the "waves of optimism and pessimism" of Pigou (1929).
2 Many of the papers in the dynamic dispersed information literature introduce exogenous noise in order to preserve asymmetric information in equilibrium. In Section 4.3 we show that our approach can accommodate exogenous noise, while still maintaining confounding dynamics.
confounding dynamics are preserved in equilibrium (see Section 4.2). Finally, we provide economic intuition by introducing confounding dynamics into a standard Real Business Cycle model. We find that the elasticity of intertemporal substitution in agents' preferences plays a critical role in both the existence of confounding dynamics and the degree to which they affect equilibrium allocations.
To the best of our knowledge, this equilibrium dynamic behavior and formal characterization is new to the rational expectations literature (see Section 5).
We solve and analyze the rational expectations equilibrium in the space of analytic functions. This approach has several advantages vis-a-vis standard time-domain methods. For example, as emphasized in Townsend (1983a), equilibria are sought in generic functional spaces spanned by linear combinations of shocks, which allows one to avoid explicitly modeling higher-order belief dynamics (Section 3.5). Moreover, the matrix Riccati equation is replaced by a more transparent spectral factorization problem. This allows us to solve and analyze the equilibrium in closed form.
We are not the first to advocate such an approach. Others, such as Futia (1981), Townsend (1983a), Taub (1989), Kasa (2000), Walker (2007), Rondina (2009), Bernhardt, Seiler, and Taub (2010), Kasa, Walker, and Whiteman (2014), and Huo and Takayama (2016) have used similar techniques to solve dynamic rational expectation models with incomplete information. We contribute to this literature by deriving analytical representations (e.g., generalized Hansen-Sargent formulas) and by providing a systematic treatment of equilibrium conditions in models with dispersed information that display confounding dynamics. Futia (1981) and Townsend (1983a) were the first to advocate for the use of analytic functions to solve dynamic rational expectations models with heterogeneous information. Many of the mathematical antecedents of this paper can be found there and in Whiteman (1983). Taub (1989) demonstrates how the algebra associated with dynamic signal extraction (i.e., spectral factorization) is simplified through the analytic function approach. We take advantage of these formulas to completely characterize existence and uniqueness of equilibria in dispersed informational setups. Bernhardt, Seiler, and Taub (2010) and Kasa, Walker, and Whiteman (2014) do not examine models with dispersed information, but show how these methods can be used to help resolve asset pricing anomalies.

Preliminaries
In this section, we first establish notation and introduce relevant mathematical definitions. We then formalize the notion of confounding dynamics, presenting a simple example that shows the mechanism at work.

Mathematical Preliminaries
Throughout the paper, we work in the space of polynomials in the lag operator L with square-summable coefficients that operate on Gaussian random variables.
In our framework, any stochastic process $\omega_t$ can always be written as

$$\omega_t = Q(L)\varepsilon_t = \sum_{j=0}^{\infty} Q_j \varepsilon_{t-j}, \tag{2.1}$$

where $\sum_{j=0}^{\infty} |Q_j|^2 < \infty$, and $\varepsilon_t \sim N(0, \sigma_\varepsilon)$ are innovations identically and independently distributed over time. In linear-Gaussian environments, working with representations of the form of (2.1), and their functional equivalents, has three advantages for analyzing rational expectations models with incomplete information. 3 First, representation (2.1) is general in the sense that it can accommodate both auto-regressive (AR) and moving-average (MA) components of any order. This is especially useful when searching for an equilibrium because it avoids the need to specify a conjecture with a specific ARMA order.
Regardless of the complexity of the equilibrium conditions that emerge in models of dispersed information (e.g., infinite regress in expectations), the solution will take the form of (2.1). 4 Second, the Wold Representation Theorem ensures that processes like $\omega_t$ can always be written uniquely as a moving average representation in which the innovations are the linear forecast errors for $\omega_t$, conditional on any linear-Gaussian information set [Brockwell and Davis (1987)]. That is, the Wold representation establishes the invertibility of $Q(L)$ and one may write $Q(L)^{-1}\omega_t = \varepsilon_t$, which implies that the space spanned by $\{\omega_t, \omega_{t-1}, \ldots\}$ is equivalent (in mean-square norm) to the space spanned by $\{\varepsilon_t, \varepsilon_{t-1}, \ldots\}$. Consequently, one can apply the optimal prediction formulas derived by Wiener-Kolmogorov [Whittle (1983)] to compute the conditional expectation of processes like $\omega_t$.
Third, the Riesz-Fischer Theorem [see Sargent (1987)] establishes an isometric, isomorphic mapping from the space of lag polynomials with square-summable coefficients Q(L) to the space of analytic complex-valued functions, where (2.1) is represented as Q(z), but with z ∈ C. In several key steps of the analysis in this paper we find it convenient to exploit the properties of such functions, which allows us to derive simple existence and uniqueness conditions for rational expectations equilibria with incomplete and dispersed information following Whiteman (1983). In a slight abuse of notation, we employ L and z interchangeably when working in the space of analytic functions.
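As a purely illustrative numerical check of the isometry behind this mapping, the sketch below compares the sum of squared lag coefficients with the average of $|Q(z)|^2$ taken over a grid on the unit circle. The coefficients $Q_j = \rho^j$, the value of `rho`, the truncation length, and the function names are assumptions made for the example and do not come from the paper.

```python
import numpy as np

def time_domain_norm_sq(coeffs):
    """Sum of squared lag coefficients, sum_j |Q_j|^2."""
    coeffs = np.asarray(coeffs, dtype=float)
    return float(np.sum(coeffs ** 2))

def unit_circle_norm_sq(coeffs, n_grid=4096):
    """Average of |Q(z)|^2 over n_grid equally spaced points on |z| = 1.

    The FFT evaluates Q(z) = sum_j Q_j z^j at z = exp(-2*pi*i*k/n_grid);
    n_grid must be at least len(coeffs) so that no coefficient is dropped.
    """
    q_on_circle = np.fft.fft(np.asarray(coeffs, dtype=float), n=n_grid)
    return float(np.mean(np.abs(q_on_circle) ** 2))

rho = 0.8                        # hypothetical decay rate, |rho| < 1
Q = rho ** np.arange(200)        # truncated square-summable sequence Q_j = rho^j
lhs = time_domain_norm_sq(Q)     # norm computed from the lag coefficients
rhs = unit_circle_norm_sq(Q)     # norm computed from the function on the unit circle
print(lhs, rhs, 1.0 / (1.0 - rho ** 2))
```

Both numbers should agree with each other and, up to truncation error, with $1/(1-\rho^2)$, which is the sense in which working with $Q(z)$ loses no information relative to working with $Q(L)$.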
While this methodology is extremely helpful in solving dynamic models with incomplete information, it is not well known by economists. Thus, we provide Online Appendix C with the statements of the key theorems cited above and further references for interested readers. We now restrict our focus to the formulation of optimal prediction formulas, which is where confounding dynamics emerge.
Suppose that we would like to formulate the prediction of $\omega_{t+j}$ so as to minimize the mean-squared forecast error, conditional on the observation of the history of an $n \times 1$ vector of variables, $s_t$, up to time $t$. To denote such history, we use the compact notation $s^t \equiv \{s_{t-j}\}_{j=0}^{\infty}$. Let

$$s_t = \Gamma(L) u_t, \tag{2.2}$$

same number of shocks and signals, non-invertibility stems from the way signals combine over time and become themselves a source of noise.
We are interested in this internal source of non-invertibility, which we term "Confounding Dynamics". Confounding dynamics naturally arise also when m > n, in which case they compound with the first source of non-invertibility, but their characterization can be substantially more involved. Because our focus is on showing that confounding dynamics can endogenously emerge in equilibrium, in the rest of the paper we focus mostly on the m = n case. In Section 4.3 we analyze an example where the two sources of non-invertibility are simultaneously present, while we provide a more formal treatment of the m > n case in Appendix B.3. The following definition formalizes the above discussion.
7 Whittle (1983) shows that the computation of $\Gamma^*(L)$ is the dual to the solution of the $n$th-order Riccati equation in the Kalman filter approach to optimal linear-quadratic prediction. In both cases the objective is to figure out the variance-covariance of the optimal prediction errors. Hence, if the Riccati equation is solved, then $\Gamma^*(z)$ can be determined; and if $\Gamma^*(L)$ is computed, then a solution to the Riccati equation can be determined. In this paper we take the latter approach, while other authors, most notably Huo and Takayama (2016), take the former. Whether it is easier to solve the Riccati equation or to obtain the canonical factorization often depends on the application at hand, and whether one is looking for an analytical or a numerical solution. The canonical factorization, however, has a wider scope of applicability. Whittle (1983) presents a generalized approach, which he terms the Hamiltonian approach, where the prediction problem is solved by the canonical factorization of a matrix function obtained by augmenting the plant and observation equations of a Kalman system with the saddle point conditions of a Hamiltonian optimization problem. Huertgen, Hoffmann, Rondina, and Walker (2016) make an explicit connection between these solution methodologies.
Definition CD. Let s t be specified as in (2.2), with m = n. The s t process is said to display confounding dynamics if there exists a λ, with |λ| < 1, such that det Γ(z)| z=λ = 0.
If the determinant vanishes at $z = \lambda$ with $|\lambda| < 1$, then $\Gamma(z)$ is not invertible, and therefore knowledge of $s^t$ will not translate into direct knowledge of $u_t$ (as it would through $u_t = \Gamma(L)^{-1} s_t$). The Wold representation theorem implies that there always exists a $\Gamma^*(L)$ that is invertible. Deriving this $\Gamma^*(L)$ is the purpose of the next section.
The definition allows for an arbitrary number of roots at which the determinant of Γ(L) vanishes.
For simplicity, we will focus on characterizing equilibria with confounding dynamics due to only one root λ. An immediate consequence of this is that λ must necessarily be real-valued, since complex zeros of a determinant with real coefficients come in conjugate pairs. Appendix B.4 presents an example of confounding dynamics due to N non-invertible roots. Taken together, the results in Appendix B.3 and B.4 ensure that the learning mechanism at the core of the equilibria that we characterize in Section 4 does not hinge upon the dimensions of Γ(L) or the number of zeros inside the unit circle.
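The condition in Definition CD can be checked numerically once det Γ(z) is available as a polynomial. The following is a minimal sketch under that assumption; the function names, the coefficient ordering, and the tolerance are choices made for this illustration and are not part of the paper.

```python
import numpy as np

def roots_inside_unit_circle(det_coeffs, tol=1e-12):
    """Zeros of det Gamma(z) that lie strictly inside the unit circle.

    det_coeffs holds the coefficients of det Gamma(z) in ascending powers
    of z, so [-lam, 1.0] stands for z - lam (assumed input convention).
    """
    # np.roots expects the highest power first, so reverse the order.
    zeros = np.roots(np.asarray(det_coeffs, dtype=float)[::-1])
    return [z for z in zeros if abs(z) < 1.0 - tol]

def displays_confounding_dynamics(det_coeffs):
    """True if det Gamma(z) vanishes at some lambda with |lambda| < 1."""
    return len(roots_inside_unit_circle(det_coeffs)) > 0

# det Gamma(z) = z + 0.5 has its zero at -0.5, inside the unit circle.
print(displays_confounding_dynamics([0.5, 1.0]))
# det Gamma(z) = z - 2 has its zero at 2, outside the unit circle.
print(displays_confounding_dynamics([-2.0, 1.0]))
```

The check covers any number of zeros, in line with the remark that the definition allows for an arbitrary number of roots at which the determinant vanishes.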

Prediction with Confounding Dynamics
To study the mechanism of confounding dynamics and showcase the usefulness of the Wiener-Kolmogorov prediction formula (2.4), we present a simple version of the prediction problem that operates at the heart of the rational expectations equilibria established in Section 4.
Let $Q(L) = 1$ in (2.1), so that $\omega_t = \varepsilon_t$, and let $s_t$ be a univariate process specified as

$$s_t = \varepsilon_{t-1} - \lambda\varepsilon_t, \tag{2.7}$$

which results in $\Gamma(L) = (L - \lambda)\sigma_\varepsilon$ and $u_t = \sigma_\varepsilon^{-1}\varepsilon_t$, and let $\lambda < 0$. Suppose that the prediction problem is to compute the mean-squared error minimizing prediction for $\varepsilon_t$ given that $s^t$ is observed. To fix ideas, imagine that $\varepsilon_t$ is the time-$t$ unobserved innovation in aggregate productivity in the economy, while $s_t$ is the observed market rental rate of physical capital. The prediction problem asks for an estimate of the current productivity innovation using the history of the market rental rate. 8 To solve the problem we can apply the Wiener-Kolmogorov prediction formula (2.4) with $j = 0$.
Note that $g_{\omega s}(z) = \sigma_\varepsilon^2 (z^{-1} - \lambda)$, whereas to obtain $\Gamma^*(L)$ we need to consider two possible cases. If $|\lambda| \geq 1$, $\Gamma^*(L) = \Gamma(L) = (L - \lambda)\sigma_\varepsilon$, which means that the stochastic process (2.7) is invertible, and therefore there exists a linear combination of current and past $s_t$'s that allows the exact recovery of $\varepsilon_t$. One can easily verify that applying (2.4) leads to $\Pi(L) = (L - \lambda)^{-1}$, and the optimal prediction corresponds to

$$E(\varepsilon_t \mid s^t) = (L - \lambda)^{-1} s_t = -\frac{1}{\lambda}\sum_{j=0}^{\infty} \lambda^{-j} s_{t-j} = \varepsilon_t, \tag{2.8}$$

which verifies that the history of $s_t$ contains all the information needed to perfectly know $\varepsilon_t$.
Consider now the case of $|\lambda| < 1$, so that $s_t$ displays confounding dynamics, according to Definition CD. Clearly, the prediction formula (2.8) is no longer well defined as the coefficients diverge. The problem is that $\Gamma(L)$ is non-invertible at $L = \lambda$ and so our previous choice for $\Gamma^*(L)$ would not work in this case. In this simple environment, Rozanov (1967) shows that the canonical factorization is given by flipping the root outside of the unit circle, $\Gamma^*(L) = (1 - \lambda L)\sigma_\varepsilon$, which results in

$$\Pi(L) = -\frac{\lambda}{1 - \lambda L}. \tag{2.9}$$

The optimal prediction for $\varepsilon_t$ is then

$$E(\varepsilon_t \mid s^t) = -\frac{\lambda}{1 - \lambda L}\, s_t = -\lambda\,\frac{L - \lambda}{1 - \lambda L}\,\varepsilon_t, \tag{2.10}$$

which results in a mean squared forecast error of $(1 - \lambda^2)\sigma_\varepsilon^2 > 0$, demonstrating that as $|\lambda|$ approaches one from below there is exact recovery of $\varepsilon_t$.
When the process is non-invertible, (2.10) shows that the history of current and past $s_t$'s reveals a particular linear combination of $\varepsilon_t$'s. Expanding this last term yields

$$E(\varepsilon_t \mid s^t) = \underbrace{\lambda^2\varepsilon_t}_{\text{information}} \; \underbrace{-\,(1 - \lambda^2)\left[\lambda\varepsilon_{t-1} + \lambda^2\varepsilon_{t-2} + \lambda^3\varepsilon_{t-3} + \cdots\right]}_{\text{noise from confounding dynamics}}. \tag{2.11}$$

Thus, the noise resulting from confounding dynamics takes an unusual form as it consists of a linear combination of past realizations of $\varepsilon_t$. This is the sense in which the non-invertibility of $\Gamma(L)$ is of an "internal" nature. Expression (2.11) suggests that the process (2.7) is informationally equivalent to a noisy signal about $\varepsilon_t$, where the noise is the linear combination of past shocks (in the bracketed term), and the signal-to-noise ratio is measured by $\lambda^2$. A $\lambda$ closer to zero results in less information and more noise, but, at the same time, it also makes past shocks less persistent. In fact, as $\lambda \to 0$, there is no information in $s_t$ about $\varepsilon_t$ and the optimal prediction is 0, the unconditional average.
As long as λ ∈ (−1, 1), the value of ε t will never be learned, and in this sense the history of the fundamental shock acts as a standard noise shock. This is the defining characteristic of confounding dynamics.
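The claims above can be illustrated with a small simulation. The sketch below first checks that Γ(L) = (L − λ)σ ε and the root-flipped Γ * (L) = (1 − λL)σ ε imply the same autocovariances for the signal, and then verifies by simulation that the forecast error variance is close to (1 − λ 2 )σ 2 ε . It assumes that the signal is s t = ε t−1 − λε t and that the optimal predictor obeys the recursion p t = λp t−1 − λs t implied by Γ * (L); the function names, sample size, seed, and the value λ = −0.6 are hypothetical choices for the example.

```python
import numpy as np

def ma1_autocov(c0, c1, sigma=1.0):
    """Lag-0 and lag-1 autocovariances of c0*e_t + c1*e_{t-1}, Var(e_t) = sigma^2."""
    return (c0 ** 2 + c1 ** 2) * sigma ** 2, c0 * c1 * sigma ** 2

def simulated_forecast_mse(lam, sigma=1.0, n_obs=100_000, burn_in=500, seed=0):
    """Simulated mean squared error of the predictor of eps_t given the signal history."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=n_obs)
    # Assumed signal: s_t = eps_{t-1} - lam * eps_t (eps_{-1} is set to zero).
    s = -lam * eps
    s[1:] += eps[:-1]
    # Assumed predictor: (1 - lam L) p_t = -lam * s_t, solved recursively.
    pred = np.empty(n_obs)
    pred[0] = -lam * s[0]
    for t in range(1, n_obs):
        pred[t] = lam * pred[t - 1] - lam * s[t]
    err = eps[burn_in:] - pred[burn_in:]   # drop initial transient
    return float(np.mean(err ** 2))

lam = -0.6
# Gamma(L) = L - lam has weights (-lam, 1); Gamma*(L) = 1 - lam L has weights (1, -lam).
print(ma1_autocov(-lam, 1.0), ma1_autocov(1.0, -lam))
print(simulated_forecast_mse(lam), 1.0 - lam ** 2)
```

Because the two factorizations share the same second moments, an observer of the signal alone cannot tell them apart, and only the invertible one can be used to build the predictor.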
variance of the forecast error conditioned on the infinite history of the observed signal, i.e. (2.13) Notice that when the signal-to-noise ratio increases (decreases), this corresponds to a higher (lower) absolute value of λ. In the limit, as σ 2 η → 0, then λ 2 → 1, which ensures exact recovery of the state in both cases.
While the informational content can be made identical, the dynamics of the two signal extraction problems are very different. To visualize this, we report the impulse response function for the prediction equations that contain confounding dynamics (2.11) and for the standard signal extraction problem (2.12) to a one time, one unit increase in ε t in Figure 1. We do this for both a low and high value of λ 2 (resp. τ ).
First notice that with confounding dynamics, (2.11) under-predicts the actual innovation on impact, with a smaller value of λ under-predicting more significantly. This is due to the first term on the RHS of (2.11). The same is true for the standard signal extraction formulation (dashed lines). Agents weigh the initial innovation by the signal-to-noise ratio τ < 1 and therefore under-predict on impact. This is where the similarities end. With confounding dynamics, periods two through six show waves of over- and under-prediction relative to the actual realization and relative to the standard signal extraction problem. As discussed above, the current and past innovations will persistently affect the prediction function several periods beyond impact. This defining characteristic of confounding dynamics leads to the waves of over- and under-reaction. This is in contrast to the full information case and standard signal-extraction case where the impulse response is zero after impact. As already pointed out, the smaller the λ, the larger the noise term in (2.11), but the less persistent the over- and under-prediction. Thus optimal signal extraction with confounding dynamics generates fluctuations where the full-information and exogenously imposed noise counterparts generate none. We now show how to embed these dynamics in a rational expectations model.
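The two impulse responses described above can be tabulated directly. The sketch below assumes that the prediction under confounding dynamics loads λ 2 on the current innovation and −(1 − λ 2 )λ j on the innovation j periods back, and that the standard signal extraction prediction responds by τ on impact and by zero afterwards; the function names, the horizon, and the value λ = −0.8 are illustrative assumptions.

```python
import numpy as np

def irf_confounding(lam, horizon=8):
    """Response of the prediction to a unit innovation under confounding dynamics.

    Assumed coefficients: lam^2 on impact, -(1 - lam^2) * lam^j at horizon j >= 1.
    """
    irf = np.empty(horizon)
    irf[0] = lam ** 2
    j = np.arange(1, horizon)
    irf[1:] = -(1.0 - lam ** 2) * lam ** j
    return irf

def irf_noisy_signal(tau, horizon=8):
    """Response under standard signal extraction: tau on impact, zero afterwards."""
    irf = np.zeros(horizon)
    irf[0] = tau
    return irf

lam = -0.8
tau = lam ** 2   # matches the informational content of the two problems
for j, (cd, std) in enumerate(zip(irf_confounding(lam), irf_noisy_signal(tau))):
    print(j, round(cd, 4), round(std, 4))
```

With λ < 0 the responses after impact alternate in sign and shrink geometrically, which is the pattern of over- and under-prediction discussed in the text, while the standard signal extraction response is flat at zero.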

Model, Information, and Equilibrium
We now model confounding dynamics in a generic rational expectations formulation that permits many interpretations (e.g., monetary model, asset pricing model, etc.). We do this via dispersed information, which introduces well-known difficulties. We lay out a solution strategy that takes advantage of aforementioned mathematical properties and compare that strategy to alternative methodologies.

Model
We consider models that are populated by a continuum of agents indexed by i ∈ [0, 1]. Let µ(i) be the density of agent i characterized by the information set at time t, denoted by Ω it .
We are interested in the class of models in which the individual optimal choice can be represented by the dynamic expectational difference equation, , is a vector of coefficients, and ψ(L) ≡ [ψ x (L) ψ y (L) ψ θ (L)], is a vector of square-summable lag polynomials in non-negative powers of L. x it is the choice variable under the control of the individual agent i; y t is an endogenous aggregate variable that agents take as given, and θ it is an exogenous stochastic process specified as the sum of an aggregate component θ t and , and A(L) is a square-summable polynomial in non-negative powers of L. To close the model we need to specify a relationship between the distribution of x it across agents, and the aggregate y t . We thus posit that , is a vector of square-summable finite-degree lag polynomials in non-negative powers of L, and we assume γ x (L) ≠ 0. 10 As we proceed with the analysis it will be useful to think of equation (3.1) as representing a demand (or supply) schedule for agent i, and (3.4) as the relative market clearing condition. However, the specific form depends on the particular application at hand. As we show in Section 3.3, this setup nests incomplete information models typical of the macro and finance literatures.
The expectational difference equation (3.1) is a dispersed information version of the system originally considered by Blanchard and Kahn (1980), and subsequently studied by Uhlig (1999), Klein (2000), and Sims (2002), among others. Dispersed information implies that individual expectations are heterogeneous, so that the aggregation in (3.4) will result in taking an average of expectations. In particular, model (3.1)-(3.4) can accommodate both average expectations of aggregate variables and average expectations of individual variables.

Information
In our dispersed information setup, we assume that the information set $\Omega_{it}$ of an arbitrary agent $i$ at time $t$ consists of the smallest closed subspace generated by the history of the random variable $\theta_i^t \equiv \{\theta_{it}, \theta_{it-1}, \ldots\}$ and the history of the aggregate variable $y^t \equiv \{y_t, y_{t-1}, \ldots\}$,

$$\Omega_{it} = \theta_i^t \vee y^t,$$

where the operator $\vee$ denotes the span (i.e., the smallest closed subspace which contains the subspaces) generated by the sequences $\theta_i^t$ and $y^t$. This notation simply suggests that expectations will be taken optimally; i.e., they will be consistent with the prediction formulas discussed in Section 2.2. In a multivariate moving-average setting, the invertible representation achieved via canonical factorization is the smallest closed subspace containing the observables, $\theta_i^t$ and $y^t$ (see Hoffman (1962)).
Given (3.1), x it will be a function of the history of idiosyncratic innovations, v it , and the aggregate innovations, ε t , namely In addition, aggregation implies that y t is only a function of aggregate innovations, so that The signal structure can be thus represented as We point out that our information set is in line with the typical information set assumed in the dispersed information rational expectations literature: we provide agents with both an exogenous signal about the aggregate unobserved state (θ it ), and an endogenous signal that is determined in equilibrium (y t ). The analytical convenience of the signal structure (3.7), for our purposes, is that the invertibility of the matrix Γ(L) hinges only upon the zeros of Y (L). At the same time, the structure imposes analytical discipline that is uncommon in the literature: the endogenous signal y t can reveal perfectly the underlying state, under the appropriate parametrization of model (3.1)-(3.4). Thus, the theorems below must establish both the degree to which information remains incomplete in equilibrium, along with the more standard existence and uniqueness conditions.

Examples
In this section we present four applications that can be cast into our model specification. The list is by no means exhaustive.
Example 1: Real Business Cycle with Capital. In a standard real business cycle model with capital, in the presence of dispersed information about the aggregate productivity shock and incomplete insurance markets, the linear dynamics of capital around the steady state can be expressed as which is a standard second-order difference equation in capital, and where is the market-clearing rental rate for capital. Here β is the subjective discount factor, α is the capital share in the Cobb-Douglas output good technology, η is the elasticity of intertemporal substitution, and a it is the individual productivity shock. 11 Model (3.8)-(3.9) maps into (3.1)-(3.4) by setting x it = k it+1 , y t = r t , θ it = a it , and
Example 2: Calvo Pricing and New Keynesian Phillips Curve. Nimark (2008) considers a dispersed information version of the optimal Calvo pricing problem that micro-founds the New Keynesian Phillips Curve popularized by Woodford (2003b) and Galí (2008). Nimark (2008) shows that the optimal price for an optimizing firm is where $p_t$ is the aggregate price level, defined as $p_t = \vartheta p^*_t + (1 - \vartheta)p_{t-1}$, with $p^*_t \equiv \int_0^1 p^*_{it}\,\mu(i)\,di$, and $mc_{it}$ is the individual marginal cost at time $t$ specified as $mc_{it} = mc_t + v_{it}$ so that $\int_0^1 mc_{it}\,\mu(i)\,di = mc_t$. The parameter $\beta$ is the discount factor for price setters, while $\vartheta$ measures the probability of resetting one's price in a given period. Define $p_{it} \equiv \vartheta p^*_{it} + (1 - \vartheta)p_{t-1}$, which maintains $p_t$. The individual and aggregate price level dynamics can then be written as, (3.12) Equations (3.11)-(3.12) map into (3.1)-(3.4) by setting $x_{it} = p_{it}$, $y_t = p_t$, $\theta_{it} = mc_{it}$, and As recognized by Nimark (2008), in the presence of dispersed information a compact representation of the New Keynesian Phillips Curve cannot be obtained. However, once a solution for $p_t$ is derived from (3.11)-(3.12), inflation dynamics are immediately given by $\pi_t = p_t - p_{t-1}$.
Example 3: Dynamic Asset Pricing. Singleton (1987) presents a dynamic asset pricing model motivated by the market microstructure of the U.S. bond market, which features a competitive, Walrasian market structure with a single security that is traded among speculative investors and nonspeculative or liquidity traders at the price $p_t$. 12 The security is assumed to pay a constant coupon every period, which we normalize to zero. Purchases of the security are financed by borrowing at the constant rate $r$, and the wealth of investor $i$ evolves according to The $i$th investor is assumed to have a one-period investment horizon and to rank alternative investment strategies according to the utility $E_{it}[-\exp(-\varrho w_{it+1})]$, where $\varrho$ is the constant coefficient of absolute risk aversion. Singleton (1987) shows that the demand schedule for the risky asset takes the form where $\nu$ is the variance of $p_{t+1}$ and is set to be an exogenous constant. Singleton (1987) assumes that the net supply of the asset, denoted by $n_t$, is specified as (3.14) The shock to net asset supply $f_t$ arises from nonspeculative traders (such as the U.S. Treasury, the Federal Reserve, financial intermediaries), that attempt to satisfy macroeconomic objectives for technical reasons related to the intermediation process. Nonspeculative traders are assumed to respond positively to an increase in prices; thus $\vartheta > 0$. Investors in setting their strategy $z_{it}$ are assumed to receive a private signal, $f_{it} = f_t + v_{it}$, about the shock to the net asset supply. Market clearing is therefore given by (3.15) Model (3.13)-(3.15) maps into (3.1)-(3.4) by setting $x_{it} = z_{it}$, $y_t = p_t$, $\theta_{it} = f_{it}$, and $\phi = [\,0 \;\; 1 \;\; 0\,]$, $\psi(L) = [\,\varrho\nu \;\; 1 + r \;\; 0\,]$, $\gamma(L) = [\,1 \;\; -\vartheta \;\; -1\,]$.
Example 4: Classical Monetary Models of Inflation. In classical monetary models of inflation, money demand takes the form popularized by Cagan (1956), where m it is nominal money demand by agent i, p t is an aggregate price index, and α > 0. The money supply M t is assumed to possess persistent dynamics specified as where f t is a money supply shock process. The money market clearing condition is then Agents are assumed to receive a private signal, f it = f t + v it , about the money supply shock.

Equilibrium Definition
Uncertainty is assumed to be driven by Gaussian innovations, which, together with linearity, implies that conditional expectations are computed as optimal linear projections. We thus have  Radner (1979) as an early example].

Weighted Sum of Expectations
Before discussing our solution methodology, we give a brief overview of the typical approach to solving model (3.1)-(3.4), which consists of two steps. The first step is to iteratively substitute the endogenous variables $x_{it+j}$ and $y_{t+j}$ forward by leading (3.1) $j$ periods forward and aggregating over agents. The end result is a pair of expressions for $x_{it}$ and $y_t$ that are functions of expectations of the exogenous variable $\theta_t$ at all future horizons. The second step is then to compute those expectations, which is non-trivial because the law of iterated expectations may not be operational. Most of the work that uses this approach relies on numerics to calculate these expectations. 13 Specifically, consider the univariate case, where $y_t = \int_0^1 x_{it}\,\mu(i)\,di$, setting $\phi_\theta = 0$, $\psi_x(L) = 1$, $\psi_y(L) = 0$, and $\psi_\theta(L) = -1$. Next, bring (3.1) one period forward to obtain an expression for $x_{it+1}$ and aggregate to get the analogous expression for $y_{t+1}$. Substitute both expressions back into (3.1), which now will contain $x_{it+2}$ and $y_{t+2}$. Iterating on this procedure and aggregating over agents, the expression for $y_t$ becomes 14 Here $\bar{E}^j(\theta_{t+j})$ stands for the $j$th order average expectation of $\theta_{t+j}$, with the convention that $\bar{E}^0(\theta_t) = \theta_t$. Disregarding for a moment the operator $P^j_{\phi_x}$, equation (3.20) shows that $y_t$ can be represented as a weighted sum of average expectations of higher order about the future realizations of the exogenous process $\theta_t$. The higher the values of $\phi_x$ and $\phi_y$, the higher the relative weight of higher order thinking. The operator $P^j_{\phi_x}$ operates on the order of expectations $j$ by reducing some of the higher order compounding depending on the position of $\phi_x$ in the terms of the polynomial $(\phi_x + \phi_y)^j$. To see why the order might need to be reduced, consider the expression for $\phi_x E_{it}(x_{it+1})$ which contains the term $\phi_x E_{it+1}(x_{it+2})$, whose expression in turn contains $\theta_{t+2}$.
It follows that the law of iterated expectations (LIE) applies in this context so that $\phi_x^2 E_{it}E_{it+1}(\theta_{t+2}) = \phi_x^2 E_{it}(\theta_{t+2})$, and aggregation implies $\phi_x^2 \bar{E}_t(\theta_{t+2})$ for $j = 2$ in (3.20). Intuitively, in each round of the iterative substitutions to achieve representation (3.20) there are terms where agent $i$ is taking expectations of both her own future expectations and of future average expectations. The law of iterated expectations applies to the former, so that the order of expectations is reduced, but not to the latter. 15 It should be evident at this point that the second step required by the canonical approach, computing closed form solutions for the expectations of arbitrary order, is a daunting task under dispersed information. As already remarked and discussed thoroughly in the next section, we approach the solution to (3.20) from a completely different angle.
13 Nimark (2010), and Melosi (2016) are recent examples of sophisticated numerical methods to characterize equilibria with dispersed information.
14 We provide the details of the derivation in Appendix B.5.
15 Mechanically, whether LIE applies or not at each iteration depends on the position of $\phi_x$ in the coefficients of the polynomial $(\phi_x + \phi_y)^j$, i.e. on the set of permutations of size $j$ of $\phi_y$ and $\phi_x$ with repetition. For instance, for the case of $j = 2$, the set of terms that multiply $\psi_\theta$ in (3.20) are $(\phi_y^2 + \phi_y\phi_x)\bar{E}_t\bar{E}_{t+1}(\theta_{t+2}) + (\phi_x\phi_y + \phi_x^2)\bar{E}_t(\theta_{t+2})$. For more details on this see Appendix B.5.

Solution Methodology
Our aim is to characterize an equilibrium for model (3.1)-(3.4) with confounding dynamics. From definition CD, the critical requirement for confounding dynamics to emerge is that the information matrix Γ(L) must be non-invertible at a λ ∈ (−1, 1). However, there is no guarantee that this condition will hold. Following Townsend (1983a), our approach is to formulate a guess for the endogenous variables that follows a generic polynomial in the underlying shocks, and then derive conditions on the exogenous parameters that yield non-invertibility in equilibrium. The following steps describe our procedure when looking for an equilibrium with confounding dynamics.
1. Specify the guesses for x it and y t as generic polynomials of underlying shocks (3.21) where y t has confounding dynamics, so that Y (λ) = 0, for λ ∈ (−1, 1).
4. Aggregate over agents according to (3.4) and use the relationship between X(L) and Y (L) to substitute X(L) with Y (L) in (3.1). Both the right hand side and the left hand side will now be lag polynomial operators in ε t and v it , and will thus provide the fixed point conditions for Y (L) and V (L).
5. Derive conditions on exogenous parameters so as to ensure that the solution exists and is unique, and that there exists a |λ| < 1, verifying (3.22). Once Y (L) is solved for, use (3.4) to recover X(L).
Note that at no point in the solution procedure does one need to worry about higher-order expectations.
The so-called "higher-order thinking" that complicates the iterative approach outlined in Section 3.5 is implicit in how the guess (3.21) combines with the information matrix Γ(L) to provide a closed form for the first order expectations in (3.1). As recognized by Townsend (1983a), by guessing a generic lag polynomial, the higher-order beliefs are built into the guess and we do not have to track these terms explicitly, although higher-order beliefs can be backed out of the solution in closed form.
The same solution procedure is followed when we solve for an equilibrium with full information, with the only difference that condition (3.22) is not imposed, and thus does not have to be verified, and the signal matrix Γ(L) corresponds to full information so that prediction formula (2.5) applies.

Equilibrium with Confounding Dynamics
This section establishes the main result of the paper: the existence of a rational expectations equilibrium with confounding dynamics in a dispersed information environment.

Full Information Benchmark
We consider first the case of Full Information to establish a useful benchmark and to show, in the simplest of settings, how our solution methodology works.
We define Full Information as the case when every agent is endowed with perfect knowledge of the history of the aggregate innovations and of her own idiosyncratic innovations up to time $t$. Denoting the full information set by $\Omega_{it}$, the set is formally specified as $\Omega_{it} = v_i^t \vee \varepsilon^t$. Here, and in the following analysis, we assume that agents know that the equilibrium relationship is given by (3.1)-(3.4). When information is full, all agents share the same expectations about the future value of the exogenous process $\theta_t$. As a consequence, the entire structure of higher-order expectations in (3.20) collapses to the common first-order expectation. Hansen and Sargent (1980) worked out a formula to express the discounted sum of future expectations, such as the one in (4.2), in closed form, which since then has been known as the Hansen-Sargent formula. Here we show how to derive the Hansen-Sargent formula by applying the methodology of Whiteman (1983). 16 Our main theorem extends the Hansen-Sargent formula to models with incomplete information.
Throughout the full information analysis we will maintain that $\phi_\theta = 0$, $\psi_\theta(L) = -1$, and consider market clearing (3.4) with $\gamma_\theta(L) = 0$. Here $\tilde{\gamma}_y(L)$ denotes the ratio $\gamma_y(L)/\gamma_x(L)$, and we assume that $\gamma_x(0) \neq 0$, so that $\tilde{\gamma}_y(0)$ is well defined. 17 We begin by guessing that the solution takes a form in which the lag polynomials, including $Y(L)$, are square-summable in non-negative powers of $L$. Under full information, direct application of the Wiener-Kolmogorov formula (2.5) provides expressions for the relevant expectational terms. Substituting these expressions into equation (3.1), invoking $X(L) = \tilde{\gamma}_y(L)Y(L)$, and aggregating over agents, we obtain an equation featuring $Y(L)$, while $V(L)$ is washed out by the aggregation process. After substitution, the $\varepsilon_t$ terms can be dropped, both sides can be multiplied by $L$, and the expression can be rearranged to solve for $Y(L)$, which gives (4.7), where $\Phi(L)$ is the characteristic polynomial that drives the auto-regressive behavior and the stationarity of $y_t$. Expression (4.7) is not a solution because it features the endogenous constant $Y(0)$ on both sides, and, if we evaluate both sides at $L = 0$, the term $Y(0)$ drops out and cannot be solved for.
16 Whiteman (1983) provides a rigorous treatment of solving linear rational expectation models using the space of analytic functions. We rely on his theorems to establish existence and uniqueness of equilibria and provide an overview of his approach in Online Appendix C.
17 The solution under the general specification (3.1)-(3.4) is not instructive for our purposes here and has been relegated to Appendix A.1.
As shown in Whiteman (1983), this indeterminacy is pinned down by the requirement that Y(L) be square-summable (i.e. stationary), which is equivalent to the denominator polynomial of Φ(L) not having any roots inside the unit circle. Thus, the constant Y(0) must be set to remove roots inside the unit circle in Φ(L).
Assumption (S). The polynomial $\Phi(L)$ has exactly one root inside the unit circle.
It is important to note that Assumption (S) is not a special case. It is the standard assumption necessary to yield a unique rational expectations equilibrium (e.g., Sims (2002)), and it immediately implies that $\Phi(L)$ can be factorized into a term with root $\zeta$, where $|\zeta| < 1$, and a polynomial $\tilde{\Phi}(L)$ that has no roots inside the unit circle. Under Assumption (S), the constant $Y(0)$ can be chosen so as to introduce a root in the numerator polynomial of (4.7) that cancels the non-stationary root $\zeta$ in the denominator. Solving the resulting condition for $Y(0)$ and substituting the expression into (4.7), one finally obtains the solution (4.11), where, by construction, the root $\zeta$ in the denominator is canceled by a zero at $\zeta$ in the numerator. 18 Equation (4.11) is an instance of the Hansen-Sargent formula. To better understand the formula, it can be rewritten as in (4.12). Comparing this expression to (4.2) shows that the Hansen-Sargent formula turns the infinite sum of expectations about future $\theta_t$'s into the difference between the infinite sum of future $\theta_t$'s under perfect foresight (the first summation term) and the innovations to those future realizations that are not known at time $t$ given the specified information set (the second summation term). In this sense, it is a true prediction formula. It takes the best guess if all information were available to the agents and subtracts off the precise linear combination of unknown elements that minimizes the agent's forecast error.
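As a numerical illustration of the root-cancellation logic, the sketch below checks the textbook form of the Hansen-Sargent identity for a finite moving average $\theta_t = A(L)\varepsilon_t$: the discounted sum $\sum_j \zeta^j E_t\theta_{t+j}$ should equal $[L A(L) - \zeta A(\zeta)]/(L - \zeta)$ applied to $\varepsilon_t$. The function names, the use of $\zeta$ as the discount factor, and the normalization are assumptions made for illustration; the sign and scaling conventions of (4.11) may differ.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def discounted_forecast_direct(a, zeta):
    """Coefficients on eps_t, eps_{t-1}, ... of sum_j zeta^j E_t[theta_{t+j}]
    for theta_t = sum_k a[k] eps_{t-k}, built horizon by horizon."""
    q = len(a)
    out = np.zeros(q)
    for j in range(q):  # forecasts are zero beyond horizon q-1
        # E_t[theta_{t+j}] keeps only shocks dated t or earlier:
        # coefficient a[j+m] on eps_{t-m}
        out[: q - j] += zeta**j * np.asarray(a[j:], dtype=float)
    return out

def discounted_forecast_hs(a, zeta):
    """Same object via [L*A(L) - zeta*A(zeta)] / (L - zeta)."""
    a = np.asarray(a, dtype=float)
    numerator = np.concatenate(([0.0], a))      # L*A(L), ascending powers of L
    numerator[0] -= zeta * P.polyval(zeta, a)   # subtract zeta*A(zeta)
    quotient, remainder = P.polydiv(numerator, np.array([-zeta, 1.0]))
    assert np.allclose(remainder, 0.0)          # the root at L = zeta cancels
    return quotient

# MA(1) example with illustrative values theta = 1.5, zeta = 0.6
print(discounted_forecast_direct([1.0, 1.5], 0.6))
print(discounted_forecast_hs([1.0, 1.5], 0.6))
```

For the MA(1) case $A(L) = 1 + \theta L$ both routes give the coefficients $(1 + \theta\zeta,\ \theta)$, and the zero remainder in the polynomial division is the numerical counterpart of the numerator zero that cancels the root $\zeta$.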
We conclude this section by pointing out that, even though one does not need to solve for $V(L)$ to figure out the solution for $y_t$ (because all agents are equally informed), one can apply the same steps as above to obtain a closed form for $V(L)$. The agent-specific component $V(L)v_{it}$ determines the cross-sectional distribution of $x_{it}$. The characteristic polynomial that drives the autoregressive behavior of this component is $\phi_x(L)$. In order to have the cross-sectional distribution well defined at any point in time, except possibly for the unit root limit, we assume the following.
Assumption (s). The polynomial φ x (L) has exactly one root inside the unit circle.
We report the closed form solution to V(L) in Appendix A.1.

Equilibrium with Confounding Dynamics: Main Theorem
In this section we state our main Theorem, which provides conditions under which a REE with Confounding Dynamics exists. As stated in Section 3.2, we specify the information set as in (4.13). Agents thus observe the entire history of the exogenous process $\theta_{it}$ up to time $t$, together with the history of the aggregate variable $y_t$. In addition, the signal structure (2.2) and the model equations (3.1)-(3.4) are both common knowledge across agents. By Definition CD, in solving for $y_t = Y(L)\varepsilon_t$, we must find restrictions on exogenous parameters that ensure that there exists a $\lambda \in (-1, 1)$ such that $Y(\lambda) = 0$. The following theorem derives conditions for such a $\lambda$ to exist.
Theorem 1. Consider model (3.1)-(3.4) with Assumptions (S) and (s). Let the information sets be specified as in (4.13). There exists a Rational Expectations Equilibrium with Confounding Dynamics of the form (4.14) if there exists a $\lambda \in (-1, 1)$ that solves condition (4.15), which depends only on exogenous parameters, and $Y(L)$ in (4.14) has no zeros inside the unit circle other than $\lambda$.
Theorem 1 provides sufficient conditions for an equilibrium with confounding dynamics to exist.
The existence condition is given by (4.15), while the functional form of the equilibrium is (4.14).
The form of (4.14) is intuitive when contrasted with the full information counterpart. As noted above, the standard Hansen-Sargent formula (4.12) subtracts off the particular linear combination of future values of $\varepsilon_t$ that minimizes the agent's forecast error. As described in Section 2.2, confounding dynamics implies that a particular linear combination of past values of $\varepsilon_t$ is never revealed to the agent. In order to make a direct comparison to the Hansen-Sargent formula (4.12), set $\tilde{\gamma}_y(L) = 1$, $\psi_x(L) = 1$, $\psi_y(L) = 0$, $\phi_\theta = 0$ and $\psi_\theta(L) = -1$. According to Theorem 1, the solution under confounding dynamics can be written as in (4.16). The first two components on the right-hand side are exactly the Hansen-Sargent formula. The third component, represented by the weighted sum $\sum_{j=0}^{\infty} \lambda^j \varepsilon_{t-j}$, arises due to confounding dynamics and is similar to the prediction formula of Section 2.2. Agents do not observe the linear combination of shocks weighted by $\lambda$. Conditioning down implies that this linear combination will (optimally) be subtracted from the Hansen-Sargent full-information equilibrium. The extent to which the unknown past matters depends on the imprecision of the private signal $\theta_{it}$, measured by $1 - \tau(\lambda)$; the imprecision stemming from confounding dynamics, measured by $1 - \lambda^2$; and the fixed point.
Equation (4.15) provides the condition for the existence of equilibrium (4.14). It is obtained by evaluating the right-hand side of (4.14) at $\lambda$ and setting it equal to zero. By doing so, (4.15) ensures that, once the conditioning down due to confounding dynamics is taken into account, the $\lambda$ responsible for such conditioning down is indeed a point at which the equilibrium function is non-invertible. Condition (4.15) is intuitive from an informational point of view.
Note first that the LHS, Y(λ)Φ(λ), corresponds to the moving average part of the full information solution (4.11), evaluated at λ. Suppose for a moment that the RHS of (4.15) is set to zero. If a |λ| ∈ (0, 1) satisfying the condition existed, it would mean that the equilibrium with confounding dynamics would take the same form as the full information equilibrium Y(L). However, equation (4.16) shows that in the presence of confounding dynamics the unknown past must be subtracted from the full information equilibrium, which would make the full information solution Y(L) inconsistent with confounding dynamics. The implication of this observation is that whenever the RHS of (4.15) is made small enough, an equilibrium with confounding dynamics may fail to exist. In particular, as the noise-to-signal ratio in private information σ v /σ ε declines, the signal-to-noise ratio, τ (λ), gets closer to one, and eventually leads to non-existence of an equilibrium with confounding dynamics.
When restriction (4.15) is not satisfied, the solution is given by the full-information equilibrium (4.12).
We finally note that the auto-regressive factor in (4.14), 1/(1 − λL), injects into the equilibrium dynamics of y t the waves of over- and under-reaction depicted in Figure 1, which are the hallmark of signal extraction under confounding dynamics. In Section 5, in the context of a real business cycle model, we provide a description of how economic incentives can combine with the waves of over- and under-reaction to deliver the fixed-point condition (4.15).
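To see why this factor produces alternating responses, note that $1/(1 - \lambda L) = \sum_{j \ge 0} \lambda^j L^j$, so for $\lambda \in (-1, 0)$ the coefficients flip sign every period while shrinking geometrically. A minimal sketch follows; the function name and the value $\lambda = -0.6$ are illustrative choices, not values taken from the paper.

```python
def ar1_factor_response(lam, horizon):
    """Moving-average coefficients of 1/(1 - lam*L): lam**j for j = 0..horizon-1."""
    if not -1.0 < lam < 1.0:
        raise ValueError("lam must lie in (-1, 1) for a stationary response")
    return [lam**j for j in range(horizon)]

# Hypothetical negative root: the response alternates in sign and decays
response = ar1_factor_response(-0.6, 6)
for period, coefficient in enumerate(response):
    print(period, round(coefficient, 4))
```

A root closer to $-1$ makes the alternation die out more slowly, while a root closer to zero makes it vanish quickly.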

Exogenous Noise
As discussed in Section 2.1, there are two ways to preserve heterogeneous information in equilibrium: by continually adding exogenous noise until the noise terms overwhelm all signals, and/or by proving that there exists a zero inside the unit circle of the equilibrium, as is done in Theorem 1. These categories are not mutually exclusive. Combinations of the two can certainly exist. In this section we first show that the standard way of introducing exogenous aggregate noise will not lead to the characteristic over- and under-reaction of the impulse response which is the hallmark of confounding dynamics. 19 We then show that confounding dynamics can coexist with superimposed exogenous noise, and when they do, the characteristic over- and under-reaction reemerges. For transparency, we work within the stylized version of the generic rational expectations model [$\tilde{\gamma}_y(L) = 1$, $\psi_x(L) = 1$, $\psi_y(L) = 0$, and $\phi_x = 0$, so that $\zeta = \phi_y$], and we modify the private signal of an arbitrary agent $i$ as in (4.17). 20 All agents also observe the endogenous variable with superimposed exogenous noise $\tilde{\eta}_t$, as in (4.18). The noise $\tilde{\eta}_t$ is assumed to be of the form $\tilde{\eta}_t = U(L)\eta_t$, where $U(L)$ is a ratio of two lag polynomials in non-negative powers of $L$, and $\eta_t$ is i.i.d. Gaussian with distribution $N(0, \sigma_\eta)$. Define the relative signal-to-noise ratios as in (4.19), and note that $\lim_{\sigma_\eta \to \infty} \tau_v = \tau = \sigma_\varepsilon^2/(\sigma_\varepsilon^2 + \sigma_v^2)$. Following our solution strategy, we posit a candidate solution $y_t = Q_\varepsilon(L)\varepsilon_t + Q_\eta(L)\eta_t$. In order to achieve a closed-form solution, we follow Taub (1989) in specifying $U(L) = \lambda(L)$. 21 The following proposition characterizes analytically a rational expectations equilibrium for the exogenous noise economy, without confounding dynamics.
The form of the equilibrium (4.20) can be best understood by studying the limiting functions of the noise terms. Note that the polynomial $\lambda(L)$ takes the form of a Hansen-Sargent formula involving $A(L)$ and $\tau_v\zeta$. To understand its role, suppose that the public information $\tilde{y}_t$ is made uninformative so that $\tau_\eta \to 0$ (i.e. $\sigma_\eta \to \infty$). The equilibrium would then just be equal to $y_t = \lambda(L)\varepsilon_t$, which is the first term in (4.20) with $\tau_v$ equal to $\tau$. As soon as public information is made informative, two additional terms appear: one that captures the additional information about $\varepsilon_t$ transmitted by the public information, and another that injects the public noise $\eta_t$ into the equilibrium price. Note that the two terms enter the equilibrium price with the same dynamics, which is a consequence of the assumption $U(L) = \lambda(L)$. This process is also characterized by a Hansen-Sargent formula involving $\lambda(L)$ and $\tau_\eta\zeta$. When public information is made arbitrarily precise, i.e. $\sigma_\eta \to 0$ so that $\tau_\eta = 1$, then (4.20) corresponds to the full information equilibrium (4.11).
20 Assuming the private signal is $\varepsilon_{it}$, rather than $\theta_{it}$, greatly simplifies the algebra in the characterization of the equilibria of Propositions 1 and 2. All the key steps in the equilibrium derivation would go through if one were to consider $\theta_{it}$.
21 The reasons for choosing a convenient form for $U(L)$ are two-fold. First, it streamlines the analytical derivation of the canonical factorization of the variance-covariance matrix. Second, it ensures that a solution to the equilibrium exists that takes the form of a finite order ARMA representation, or, in the frequency domain jargon, of an analytic function that can be represented as the ratio of two polynomials. The conditions for the existence of an ARMA solution in the presence of exogenous noise superimposed on endogenous variables are an open and active area of research; see Huo and Takayama (2016).
A comparison with Theorem 1 reveals that the additional noise of Proposition 1 coming from (4.18) implies that condition (4.15), which guarantees heterogeneous beliefs are preserved in equilibrium, is no longer necessary. In fact, (4.21) is an explicit assumption that there are no zeros inside the unit circle, which is the standard assumption in models with exogenous noise. In turn, this implies that the equilibrium cannot support confounding dynamics.
To see this more clearly, suppose that $A(L) = 1 + \theta L$. Applying (4.11), the full information solution can be immediately obtained as the MA(1) process (4.22). Substituting into the equilibrium (4.20) under the assumption that $\theta < 1/[1 - \zeta(\tau_v + \tau_\eta)]$, so that the invertibility condition (4.21) holds, yields (4.23). The impulse response dynamics of $y_t$ in (4.23) to a shock $\varepsilon_t$ are entirely consistent with the optimal prediction formula associated with the standard signal extraction problem described in Section 2.2.
The impulse response to a shock in ε t is smaller than the full information counterpart at impact, since τ v + τ η < 1, but otherwise unchanged (i.e. it matches the dashed dynamics of Figure 1).
Next we want to characterize a solution with both exogenous noise and confounding dynamics.
Under the same assumptions about the private and public information signals, we posit a candidate solution. The following proposition holds.
Proposition 2. Consider model (3.1)-(3.4) and let $\tilde{\gamma}_y(L) = 1$, $\psi_x(L) = 1$, $\psi_y(L) = 0$, $\phi_x = 0$, so that $\zeta = \phi_y$. Let the information sets consist of the histories of the private and public signals (4.17)-(4.18). A Rational Expectations Equilibrium of the form (4.24) exists if, and only if, there exists a $\lambda \in (-1, 1)$ that solves (4.25).
The building block of this equilibrium is $\tilde{\lambda}(L)$, which can be directly compared to the $\lambda(L)$ function of Proposition 1. As discussed above, the second term of a Hansen-Sargent prediction formula has an informational interpretation in that it amounts to what must be subtracted away from a complete-information equilibrium. The conditioning down associated with $\tilde{\lambda}(L)$ of Proposition 2 is due to the endogenous zero, $\lambda$, determined by (4.25), while the conditioning down associated with $\lambda(L)$ of Proposition 1 is due to the exogenous noise term, $\tau_v$.
Most importantly, Proposition 2 reintroduces confounding dynamics. Under the specification $A(L) = 1 + \theta L$, the equilibrium (4.24), when (4.25) is satisfied, 22 is given by (4.26). Comparing expression (4.26) to (4.23), both contain an MA(1) term for $\varepsilon_t$, and a constant coefficient which, as seen in Section 2.2, injects the dynamic pattern typical of confounding dynamics. 23 The impulse response of $y_t$ in (4.26) to a shock $\varepsilon_t$ is smaller than the full information counterpart at impact and it matches the qualitative behavior of confounding dynamics in Figure 1. 24 Taken together, Propositions 1 and 2 show that it is the learning mechanism due to confounding dynamics, rather than the one due to exogenous noise, that injects persistence in innovations, and, simultaneously, an amplification pattern that resembles waves of optimism and pessimism.

Application: Business Cycle with Confounding Dynamics
In this section we apply our results to a model of business cycle fluctuations driven by productivity shocks. The purpose of this section is to analytically demonstrate the confounding dynamics mechanism within a well established framework. To achieve this goal, we work within a linearized model reminiscent of the islands model of Lucas (1975).
The economy consists of a continuum of islands indexed by i ∈ [0, 1]. Each island is inhabited by an infinitely-lived representative household, and by a representative firm, also indexed by i.
Household $i$ supplies labor services exclusively to firm $i$ in a decentralized competitive labor market or, equivalently, workers cannot move across islands. Households supply labor inelastically to firms, and the labor supply is normalized to 1. Households own capital in the economy, which is rented out to firms in a centralized spot market. Firms use capital and labor to produce output, also supplied in a centralized competitive spot market. Households derive utility from consuming the output good. Output is produced by firm $i$ according to a Cobb-Douglas technology with capital and labor inputs, with income shares $\alpha$ and $1 - \alpha$ respectively, and total factor productivity that is firm-specific and denoted by $e^{a_{it}}$, where $a_{it}$ has a common and an island-specific component: the term $a_t$ is common across all the islands, while $v_{it}$ is a productivity component that is specific to island $i$.
In what follows, we consider a log-linearized version of the model with full capital depreciation and constant elasticity of intertemporal substitution, denoted by $\eta > 0$. 25 Household $i$ sets consumption intertemporally according to an Euler equation, subject to an intertemporal budget constraint in which $k_{it+1}$ is the capital stock that household $i$ is carrying into period $t + 1$, $w_{it}$ is the wage rate, $r_t$ is the rental rate of capital, and $\beta \in (0, 1)$ is the subjective discount factor. The island-specific wage rate is given by $w_{it} = \frac{1}{1-\alpha}(a_{it} - \alpha r_t)$.
Aggregate capital is defined as $k_{t+1} \equiv \int_0^1 k_{it+1}\,\mu(i)\,di$, and market clearing pins down the interest rate $r_t$. Using the household's budget constraint at $t$ and at $t + 1$ to get expressions for $c_{it}$ and $c_{it+1}$, and leading (5.3) one period forward, one can substitute (5.1) into the Euler equation to obtain a second-order difference equation for capital $k_{it+1}$ which completely characterizes the equilibrium. As remarked in Section 3.3, the model maps into our general setting by specifying $x_{it} = k_{it+1}$, $y_t = r_t$, and $\theta_{it} = a_{it}$.
22 Equilibria (4.20) and (4.24) do not necessarily exist simultaneously for the same parameter values. However, our objective here is to compare the qualitative features of the equilibrium dynamics across the two different classes of equilibria they represent, one with exogenous noise only, the other with both exogenous noise and confounding dynamics. For such an exercise, the space of existence across parameter values is of secondary relevance.
23 For $\theta > 0$, one can show that $\lambda \in (-1, 0)$ when (4.25) is satisfied.
24 To see this one needs to show that the impact coefficient in (4.26) is smaller than the impact coefficient in (4.22), which corresponds to $-\lambda\theta - \lambda(1 + \lambda\theta) - \lambda\theta\zeta(\tau_\eta + (1 - \tau_\eta)\tau) < 1 + \theta\zeta$. This can be easily shown using the property that when $\theta > 0$, $-\lambda\theta \in (0, 1)$.

Full Information
We first derive the full information ($\Omega_{it} = v_i^t \vee \varepsilon^t$) solution for aggregate capital and the interest rate, starting from a full-information guess for island-specific capital. The characteristic polynomial associated with equation (5.4) is given by (5.7). Given that $\alpha$ (capital's share of production) and $\beta$ (subjective discount factor) are both less than one, (5.7) contains one root inside the unit circle ($\zeta$) and one outside ($\beta/\zeta$), and their product is always equal to $\beta$. That is, Assumption (S) holds with $0 < \zeta < 1$ and $\beta/\zeta > 1$.
Following the steps outlined in Section 4.1, the full information equilibrium for capital can be derived as the AR(2) process (5.8), and the interest rate takes the ARMA(2, 2) form (5.9), where $\kappa \equiv \frac{(\theta+\rho)(1-\zeta)(\alpha\beta/\zeta - 1)}{(1-\rho\zeta)(1-\alpha)}$. 26 Figure 2 reports the impulse responses of capital and the interest rate to a persistent unitary shock in aggregate productivity in the full information equilibrium (5.8)-(5.9). The parameters for the aggregate productivity process are set to ρ ≈ 1 and θ = 1.5. This calibration of the productivity process implies that a unitary shock to ε t raises a t by 1 on impact, by an additional 2.5 in the subsequent period, and no additional growth thereafter. More generally when θ > 1, technical knowledge is assumed to diffuse slowly at first, then accelerate, only to level off in subsequent periods. This parameterization corresponds to a canonical "S-shaped" diffusion process.
A productivity process with this property emerges when, for example, a new technology (e.g. a new computer chip) is first adopted only by the most advanced sectors in the economy, and then it diffuses to the rest of the economy (see Canova (2007), pages 115-116 for further discussion). We consider two values for the elasticity of substitution, $\eta = 2$ and $\eta = 6$, in Figure 2. In both cases capital climbs smoothly towards the same new permanent level. However, in the case of low elasticity ($\eta = 2$), the climb takes longer because, with a low elasticity of substitution, agents are less willing to give up their current consumption to accumulate more capital, everything else equal. 27
Figure 2: Impulse response of capital, $k_{t+1}$, and the interest rate, $r_t$, to a unitary positive shock to aggregate productivity $a_t$ in the Full Information equilibrium. The dashed-black line represents the response for the equilibrium with $\eta = 2$; the solid-black line represents the response for the equilibrium with $\eta = 6$. Remaining parameter values are $\beta = 0.985$, $\alpha = 0.33$, $\rho \approx 1$, $\theta = 1.5$. The impulse responses for capital are normalized with respect to the impact response for the $\eta = 6$ case.
26 The detailed derivation is reported in Appendix B.7.
27 Formally, when $\rho \approx 1$, it can be shown that the coefficient at impact on $k_{t+1}$ is $\frac{\zeta}{\alpha\beta}(1 + \kappa) \approx \left(1 + \theta - \frac{\zeta}{\alpha\beta}(\alpha + \theta)\right)/(1 - \alpha)$. Since $\zeta$ is declining in $\eta$, the reaction of capital at impact is smaller when $\eta$ is low.
The response of the interest rate exhibits the general equilibrium implication of a more sluggish adjustment of capital. Under $\rho \approx 1$, one can show that $R(L) \approx \left(1 + \frac{\zeta}{\alpha\beta}\theta L\right)/\left(1 - \frac{\zeta}{\beta}L\right)$. At impact the reaction of the interest rate is always 1: the capital stock is fixed, and the increase in the interest rate mimics the unitary shock to productivity. In the subsequent period, the response coefficient is $\frac{\zeta}{\beta}\left(1 + \frac{\theta}{\alpha}\right)$, which, for $\eta$ low enough, can be bigger than 1. For $\eta = 6$, the root $\zeta$ is small, and the increase in the supply of capital is more than enough to reduce the interest rate after impact. In the $\eta = 2$ case, on the other hand, $\zeta$ is larger, and the small increase in the supply of capital keeps the interest rate from decreasing. The combined effect of a higher interest rate and a longer time to adjust finally brings down the interest rate in the subsequent periods. The contrasting forces at work in this example carry over to equilibria with confounding dynamics, to which we now turn.
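The first two response coefficients can be checked by expanding the approximate $R(L)$ as a power series in $L$. The sketch below does this for given values of $\zeta$. Because $\zeta$ is determined by the characteristic polynomial (5.7), which is not reproduced here, the $\zeta$ values used are hypothetical placeholders, and the function name is likewise illustrative.

```python
def interest_rate_response(zeta, theta, alpha, beta, horizon):
    """Power-series coefficients of (1 + a*L) / (1 - b*L),
    with a = zeta*theta/(alpha*beta) and b = zeta/beta."""
    a = zeta * theta / (alpha * beta)
    b = zeta / beta
    coeffs = [1.0]  # impact response is always 1
    for j in range(1, horizon):
        # r_j = b * r_{j-1}, plus the moving-average term a at j = 1 only
        coeffs.append(b * coeffs[-1] + (a if j == 1 else 0.0))
    return coeffs

# Calibration from Figure 2; the zeta values are hypothetical, not solved from (5.7)
alpha, beta, theta = 0.33, 0.985, 1.5
for zeta in (0.1, 0.3):
    print(zeta, [round(c, 3) for c in interest_rate_response(zeta, theta, alpha, beta, 4)])
```

With $\alpha = 0.33$, $\beta = 0.985$ and $\theta = 1.5$, the second coefficient $\frac{\zeta}{\beta}(1 + \theta/\alpha)$ exceeds 1 whenever $\zeta$ is above roughly 0.178, which matches the statement that a larger $\zeta$ (lower $\eta$) keeps the interest rate from falling right after impact.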
Confounding Dynamics
Because households participate in two competitive markets every period, the labor market and the rental market for capital, they observe the island-specific wage rate $w_{it}$ and the rental rate $r_t$. The observation of $w_{it}$ and $r_t$ implies that household $i$ can always back out $a_{it}$ at time $t$ through the expression for $w_{it}$ reported above. As a consequence, observing the prices of labor and capital is equivalent to the information set (5.10). We also assume that households cannot observe the aggregate capital $k_t$, so as to avoid the full revelation of $a_t$, and thus $v_{it}$, which would be implied by (5.3). 28 Following Theorem 1, existence of confounding dynamics requires that the process for $r_t = R(L)\varepsilon_t$ satisfies property (5.11) for a $\lambda \in (-1, 1)$. A direct application of Theorem 1 leads to the following corollary.
28 There are many other information structures that would preserve confounding dynamics in this setting and would be consistent with the general specification of Section 3.1.
Corollary 1. Consider the Real Business Cycle model (5.3)-(5.5). Let the information sets be specified as in (5.10). There exists a Rational Expectations Equilibrium with Confounding Dynamics of the form $k_{t+1} = K(L)\varepsilon_t$ and $r_t = R(L)\varepsilon_t$, given by (5.12), if there exists a $\lambda \in (-1, 1)$ solving (5.13), where the full information $K(L)$ and $R(L)$ are as in (5.8) and (5.9), and the equilibrium $R(L)$ has no zeros inside the unit circle other than $\lambda$. If there exists only one $\lambda \in (-1, 1)$ solving (5.13), the equilibrium (5.12) is unique.
While the functional forms of equations (5.12)-(5.13) have the same general structure as Theorem 1 (and same interpretation), the context of the application allows us to gain additional insights into the existence and behavior of an equilibrium with confounding dynamics. For the calibration discussed below (and for a large neighborhood around the calibration), λ ∈ (−1, 0) and C(λ) > 0.
One immediate implication of C(λ) > 0 is that capital will under-react on impact compared to the full information case. The intuition of this result is consistent with our discussion in Section 2.2: under confounding dynamics, agents do not know exactly whether a positive innovation in aggregate productivity has been realized, and will act as if there is a chance that aggregate productivity is unchanged.
A second implication of $C(\lambda) > 0$ is that a necessary condition for an equilibrium with confounding dynamics to exist, i.e. for (5.13) to be satisfied, is $R(\lambda) < 0$. Table 1 reports the endogenous values of $\lambda$, solved using (5.13), for several numerical combinations of the idiosyncratic noise in $a_{it}$, $\sigma_v/\sigma_\varepsilon$, the elasticity of substitution, $\eta$, and the moving average parameter, $\theta$. In Panel 1 we hold fixed $\sigma_v/\sigma_\varepsilon = 1$, and vary $\eta$ and $\theta$. Columns (a)-(c) show that, for $\theta = 0.5$, there is no equilibrium with confounding dynamics, independent of the value of the elasticity of substitution. Recall that with $\rho \approx 1$, we have $R(L) \approx \left(1 + \frac{\zeta}{\alpha\beta}\theta L\right)/\left(1 - \frac{\zeta}{\beta}L\right)$, which results in the necessary condition (5.14). Whenever $\theta > 0$ is too small, this condition cannot possibly hold for $\lambda \in (-1, 0)$. This is consistent with our optimal prediction formulas, established in Section 2.2. Recall that if the moving average representation is fundamental with respect to the underlying innovations, then perfect revelation of the state is achieved. For values of $|\theta| \in (0, 1)$, the exogenous process ($a_t$) perfectly reveals the underlying shocks, which gets embedded in endogenous variables. As $\theta$ increases from 0.5 to 1.5 (columns (d)-(f)), the non-fundamental root in the exogenous process ($a_t$), coupled with the idiosyncratic uncertainty, is sufficient to prevent full revelation of the underlying shocks.
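Under the approximation for $R(L)$ stated above, the sign requirement can be made explicit. For $\lambda \in (-1, 0)$ the denominator $1 - (\zeta/\beta)\lambda$ is positive, so $R(\lambda) < 0$ requires $1 + \zeta\theta\lambda/(\alpha\beta) < 0$, that is, $\theta > \alpha\beta/(\zeta|\lambda|)$. The sketch below encodes this bound together with the location of the zero of a moving average $1 + \theta L$, which lies inside the unit circle only when $|\theta| > 1$. The function names, the $\zeta$ and $\lambda$ values, and the assumption that the exogenous moving-average part is $1 + \theta L$ are illustrative; this is a reading of the approximation, not a reproduction of (5.14).

```python
def approx_R(lam, zeta, theta, alpha, beta):
    """Approximate full-information R evaluated at lam (the rho ~ 1 case)."""
    return (1.0 + zeta * theta * lam / (alpha * beta)) / (1.0 - zeta * lam / beta)

def theta_threshold(lam, zeta, alpha, beta):
    """For lam in (-1, 0), approx_R(lam) < 0 exactly when theta exceeds this value."""
    if not -1.0 < lam < 0.0:
        raise ValueError("threshold derived for lam in (-1, 0)")
    return alpha * beta / (zeta * abs(lam))

def ma_zero(theta):
    """Zero of 1 + theta*L; it is inside the unit circle only if |theta| > 1."""
    return -1.0 / theta

# Hypothetical zeta and lam; alpha and beta follow the calibration in Figure 2
alpha, beta, zeta, lam = 0.33, 0.985, 0.5, -0.8
print(theta_threshold(lam, zeta, alpha, beta))
print(ma_zero(0.5), ma_zero(1.5))
```

Since $|\lambda| < 1$, the threshold is always above $\alpha\beta/\zeta$, which is one way to see why a small positive $\theta$ cannot satisfy the sign requirement.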
Panel 1 also shows that as the elasticity of substitution increases, the confounding dynamics root increases in absolute value. As agents become more sensitive to changes in the interest rate, the deviation in capital accumulation from the full information counterpart on impact diminishes.
This can be seen directly in (5.14) through the endogenous root ζ, as it can easily be shown that ∂ζ/∂η < 0.
Panel 2 of Table 1 looks at the effect of changing the idiosyncratic noise in the private signal a it on the existence of an equilibrium with confounding dynamics, for a fixed value of η = 2, and two values for θ: 1.25 and 2. Column (a) shows that if the information contained in the private signal is high (σ v /σ ε = 0.1), an equilibrium with confounding dynamics does not exist, while it exists once the private signal is made less informative (columns (b) and (c)). This is consistent with our interpretation of condition (4.15) in Theorem 1, and it is also intuitive: with a very precise private signal agents react to aggregate productivity in a manner consistent with complete information, which makes the interest rate more informative, and thus prevents confounding dynamics. Finally, columns (b)-(c), and (d)-(f), show that the effect of increasing σ v /σ ε on confounding dynamics is highly non-linear, and tends to level-off quite rapidly: in response to a change from 0.5 to 2 to the noise-to-signal ratio, λ declines by merely 0.01 in absolute value.
We conclude the analysis by looking at the qualitative effects of confounding dynamics on capital and the interest rate. Figure 3 shows the response of capital, $k_{t+1}$, and the interest rate, $r_t$, to a persistent unitary positive shock to aggregate productivity $a_t$ under confounding dynamics, measured in percentage deviation from the respective full information responses. We report the responses for $\eta = 2$ and $\eta = 6$. Capital under-reacts on impact, then over-reacts after impact (4.5% and 6%), and then under-reacts again, and so on. For the $\eta = 2$ case the decay in fluctuations is faster, while for the $\eta = 6$ case oscillations are more persistent. This is consistent with the impulse responses of prediction under confounding dynamics in Figure 1: for a lower absolute value of $\lambda$ the initial under-reaction is stronger, but the fluctuations are less persistent.
The key to sustaining confounding dynamics resides in the behavior of the interest rate, r t , which is reported in the right panel of Figure 3. Because capital is predetermined, the interest rate under confounding dynamics increases at impact by 1, exactly as in the full information case. However, in the period after impact, due to the limited capital accumulation, it rises 20% above full information. As agents cannot observe aggregate capital, the behavior of the interest rate "confuses" them in the following sense: in their learning effort, they input a realization for the interest rate in which a past innovation, the unitary jump in productivity, has a larger effect than under full information. They thus interpret the interest rate behavior as possibly implying a positive innovation in productivity in period t = 1. In other words, the innovation at impact becomes a source of noise in estimating the innovation in period 1. This translates into an overly optimistic capital accumulation. In period t = 2, the resulting large increase in capital supply depresses the interest rate to a level (-10% and -14%) that is incompatible with the productivity innovation believed to exist in period 1. Beliefs about productivity then turn more pessimistic, which results in a slow-down in capital accumulation. The subsequent dynamics alternate the same mechanism at a gradually declining scale.
One interesting feature of the response in Figure 3 is that, despite the difference in the extent of confounding dynamics across the two cases, the deviations from full information look remarkably similar in size for the first few periods. This reveals a fundamental tension that exists in the context of a real business cycle application between the severity of endogenous informational frictions and the impact they have on capital accumulation. From Table 1, we know that a lower elasticity of substitution results in stronger confounding dynamics, i.e. a less informative interest rate. However, a lower elasticity of substitution also implies that movements in the interest rate have a smaller impact on capital accumulation. Indeed, it is for this exact reason that confounding dynamics are stronger.
It follows that, even if the dynamics of the interest rate transmit little information on aggregate productivity, capital accumulation remains relatively unaffected because it is already sluggish in the first place. This is an important insight that extends beyond our application: in the presence of information transmitted by endogenous signals, the factors that hamper equilibrium information transmission can very well be the same factors that make information incompleteness less relevant for equilibrium allocations.
We believe that the numerical application of Figure 3 displays a qualitative behavior of capital that is interesting and promising for quantitative applications. First, we are able to generate dynamics that resemble waves of optimism and pessimism, just as a consequence of optimal learning from endogenous variables. Second, while the period of such waves is mechanically determined by the assumption that we look at equilibria with only one non-invertible root λ, richer non-invertibility conditions (such as ones with multiple roots, conjugate pairs, etc.) result in waves that can be longer and asymmetric in duration (we explore a simple example with multiple roots in Appendix B.4). Third, in order to keep things analytically tractable and transparent, we have assumed away additional sources of frictions, thereby limiting the potential of the model to provide quantitatively significant results. However, we envision a richer environment with several types of frictions, such as financial frictions (which are likely to introduce stronger sensitivity of allocations to the interest rate) or exogenous noisy signals, but where confounding dynamics remain a major determinant of equilibrium behavior.
Hur, Leeper, Rondina, and Walker (2016) apply Bayesian methods to estimate both a real business cycle model and a medium-sized new Keynesian model with and without confounding dynamics. Results suggest that allowing for confounding dynamics can substantially increase the empirical fit of these models relative to competing information specifications (e.g., news shocks, complete information).

Concluding Comments
We have introduced a rational expectations equilibrium that generates over- and under-reaction relative to its complete information counterpart. This systematic optimism and pessimism is generated from a simple and optimal learning mechanism that can be easily applied to any dynamic setting. Future work will seek to better understand the empirical properties of confounding dynamics by incorporating them into real and nominal business cycle models designed to be taken to data. Theoretical results of Section 5 and preliminary empirical results of Hur, Leeper, Rondina, and Walker (2016) show much promise. Future work [Huertgen, Hoffmann, Rondina, and Walker (2016)] will also seek to show an equivalence between the analytic function approach advocated here and the more familiar time-domain approach. Contrasting these approaches in a side-by-side fashion will help to highlight the benefits of the analytic function approach while (hopefully) demystifying certain aspects of it.

A.1 Full Information Solution
This section derives the solution under full information for model (3.1)-(3.4).
The fixed point condition under full information can be found by substituting (4.4)-(4.6) into (3.1), so that This equation defines a fixed point condition for V(L) with all the terms that multiply vit. Collecting terms that multiply vit, multiplying both sides by L and rearranging we get Note that φx(L) ≡ φx + ψx(L)L, which, under assumption (s), has exactly one zero inside the unit circle, denoted by ζx. We thus pick V(0) to remove such zero by setting Solving for V(0), substituting back into (A.2) one finally obtains
We now focus on the fixed point for Y(L) and X(L). As remarked in the text, the fixed point condition does not feature any components of V(L), so that one does not need to solve for the latter to obtain the former. To proceed with the solution there are two possibilities: solve for Y(L) and then recover X(L), or vice versa. In general, both routes are possible, but there are situations in which one direction is substantially easier than the other. This depends on whether γx(0) ≠ 0 or γy(0) ≠ 0. We report here both cases.
We first consider the case that works whenever γx(0) ≠ 0. We begin by manipulating condition ( Taking all the terms that multiply εt in (A.6), multiplying by L both sides and rearranging, one gets Under assumption (S), Φ(L) has exactly one zero inside the unit circle, denoted by ζ, which means that we can choose Y(0) to remove such zero. We thus set The expression for X(L) can then be recovered using (A.5).
Next we consider the case that works whenever γy(0) ≠ 0. We begin by manipulating condition (3.4) to get the following relationship between X(L) and and Analogously to Assumption (S), let us assume that Φx(L) has exactly one zero inside the unit circle, denoted by ζ̃, which means that we can choose X(0) to remove such zero. We thus set Solving for X(0), substituting into (A.13) and rearranging, one finally gets The expression for Y(L) can then be recovered using (A.11).
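The zero-removal step used above (choosing the free constant so that the numerator vanishes at the zero inside the unit circle) can be illustrated on a much simpler problem than model (3.1)-(3.4). The sketch below assumes, purely for illustration, the scalar forward-looking equation yt = βEt(yt+1) + at with at = A(L)εt and A(L) = (1 + θL)/(1 − ρL). Its fixed point is (L − β)Y(L) = LA(L) − βY(0), and setting Y(0) = A(β) removes the zero at L = β, which is the Hansen-Sargent formula. The function names and parameter values are hypothetical.

```python
import numpy as np

def a_coeffs(theta, rho, n):
    """First n moving-average coefficients of A(L) = (1 + theta L) / (1 - rho L)."""
    a = np.empty(n)
    a[0] = 1.0
    a[1:] = (rho + theta) * rho ** np.arange(n - 1)
    return a

def solve_by_zero_removal(beta, theta, rho, n, y0=None):
    """Coefficients of Y(L) from (L - beta) Y(L) = L A(L) - beta Y(0).

    By default Y(0) = A(beta), the choice that cancels the zero at L = beta.
    Passing another y0 shows what happens when the zero is not removed.
    """
    a = a_coeffs(theta, rho, n)
    y = np.empty(n)
    y[0] = (1 + theta * beta) / (1 - rho * beta) if y0 is None else y0
    for k in range(1, n):
        # matching powers of L: Y_{k-1} - beta * Y_k = A_{k-1}
        y[k] = (y[k - 1] - a[k - 1]) / beta
    return y

def discounted_sums(beta, theta, rho, n, horizon=2000):
    """Direct evaluation of Y_k = sum_j beta^j A_{k+j}, truncated at `horizon`."""
    a = a_coeffs(theta, rho, n + horizon)
    weights = beta ** np.arange(horizon)
    return np.array([weights @ a[k:k + horizon] for k in range(n)])

if __name__ == "__main__":
    y = solve_by_zero_removal(beta=0.9, theta=0.5, rho=0.8, n=15)
    print(np.max(np.abs(y - discounted_sums(0.9, 0.5, 0.8, 15))))
```

With the constant set to A(β) the recursion reproduces the discounted sums of expected future at; with any other constant the coefficients grow at rate 1/β, which is the non-stationarity that the choice of V(0), Y(0) or X(0) rules out in the text.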
A.2 Proof of Theorem 1
Step 1: Factorization
We operationalize the key requirement that Y(λ) = 0 for λ ∈ (−1, 1) by specifying a guess of the form Y(L) = (L − λ)G(L), where G(L) has no zeros inside the unit circle. The first step in the proof is to then use the equilibrium guess to derive the canonical factorization for the information set, so that the Wiener-Kolmogorov formula (2.4) can be applied. The information set can be written as where εt = σε ε̃t, vit = σv ṽit is a convenient normalization so that the variance-covariance matrix of the innovations vector is the identity matrix. It follows that The following Lemma shows the canonical factorization for Γ(L).
Lemma A1. The canonical factorization Γ*(z)Γ*(z⁻¹)ᵀ of the variance-covariance matrix Γ(z)Γ(z⁻¹)ᵀ, where Γ(z) is defined in (A.19), is given by Proof. Using the procedure of Rozanov (1967), Γ*(z) is computed as (A.22) The form of Wλ is obtained by application of Lemma C1 in the Online Technical Appendix. Solving out the matrix multiplication after some algebra one obtains (A.20).
Step 2: Expectations
Equipped with the canonical factorization (A.20), we next derive the three expectational terms: Eit(xit+1), Eit(yt+1), and Eit(θit+1) (recall that Eit(θit+1) = Eit(θt+1)). The second and third in the list are given by The term Eit(xit+1) is substantially more involved to derive, because the correlation between xit+1 and θit exists not only because they both depend on εt, but also because they both depend on vit. Formally, the application of the Wiener-Kolmogorov formula leads to where g_{xi,(θi,y)}(L) is the variance-covariance generating function between xi and the information set. Given the equilibrium guess, such function takes the form It follows that The application of the annihilator operator requires taking the annihiland minus the principal part of its Laurent series expansion. All the terms have the usual principal part around L = 0. However, the term containing (1 − λL)/(L − λ) also has a principal part around L = λ; it follows that Finally one gets
Step 3: Fixed Point
We begin by manipulating condition (3.4) to get the following relationship between X(L) and As one would expect, both on the left and right hand sides there are lag polynomials that multiply εt and vit. Because the two stochastic processes are uncorrelated, the equality must hold independently for the terms that multiply εt and for those that multiply vit. Taking into account relationship (A.31), equation (A.32) thus defines two fixed points: one for (L − λ)G(L) and one for V(L). Differently from the full information case, the fixed point for the aggregate yt (that defined by the terms multiplying εt) also contains elements of the function V(L), more precisely the constant V(0) − V(λ). Therefore, in order to solve for (L − λ)G(L), we need first to solve for V(L). Taking the fixed point condition for the terms that multiply vit, multiplying both sides by L and rearranging one obtains where φx(L) ≡ φx − ψx(L)L.
Similarly, the fixed point for (L − λ)G(L) is where we have used (A.31) to substitute for X(L) − X(0) and X(L), and Φ(L) ≡ φx(L) + φy − ψy(L)L. The next Lemma will prove very useful.
We can now use Lemma A2 to substitute for V(λ) in (A.33) and (A.34). It follows that to solve for (L − λ)G(L) we just need an expression for V(0), to which we now turn. From assumption (s) we know that there is a root ζV that needs to be removed for V(L) to be stationary. We achieve this by choosing the appropriate constant V(0) so that the numerator on the right hand side of (A.33) vanishes when evaluated at ζV, Using (A.31) to substitute for X(λ) − X(0), and rearranging, one obtains the expression Next we use (A.38) in (A.34), and we also substitute X(λ) − X(0) using (A.31) to get and Under assumption (S), Φ(L) has a zero inside the unit circle at ζ, which means that we need to choose the constant G(0) so as to cancel it. This is achieved by setting − λG (0)  Next, recall that we defined one can show that (A.45) can be written as . Expression (4.14) follows. Finally, for the solution to be consistent with the information set that we have used to derive it, it must be that the polynomial in (4.14) vanishes at L = λ, which corresponds to condition (4.15) in the Theorem.
The last step of the proof consists in making sure that when the equilibrium coefficients are evaluated using the λ that solves (4.15), there are no other points at which Y(L) vanishes inside the unit circle. More precisely, it has to be that there is no ξ ≠ λ that solves such that |ξ| < 1. If this were not the case, then the information conveyed by yt in equilibrium would be inconsistent with the information used to derive the expectations that we use to determine the fixed point. More precisely, the factorization of Γ(L) would be incorrect, as Γ*(L) in (A.20) would still be non-invertible. To see this, suppose that λ is a solution to (4.15), while ξ is a solution to (A.50), and they are both inside the unit circle. Then, the equilibrium function must have the form G(L)(L − λ)(L − ξ), but the factorization above only removes the zero associated with λ.
It follows that , (A.51) whose determinant still vanishes at L = ξ, so that Γ * (L) is not the appropriate factorization. In this case one can modify the initial guess and consider N > 1 roots inside the unit circle, looking then for a condition like (4.15) to deliver exactly N solutions. We restrict our attention to N = 1 for simplicity and because the full description of the space of REE with confounding dynamics is beyond the scope of this paper, but we hope it is clear that our methods extend to the more general case.
Step 4: No Information from the Model
The last thing to check to complete the proof is to ensure that there is no information that is transmitted by a clever manipulation of the model conditions (which are part of the information set of the agents) combined with the knowledge of the history of θit and yt. For instance, suppose that the market clearing condition (3.4) is specified so that ∫₀¹ xit µ(i) di = yt, which means that yt is the aggregate of xit; then this would imply X(L) = Y(L), which would result in xit − yt = V(L)vit. Because rational agents know all this, they know that the difference xit − yt is just a linear combination of the individual innovations vit. It follows that they could, in principle, back out the realizations of the vit's by inverting V(L). More generally, the link between X(L) and Y(L) due to (3.4) can be used by rational agents to obtain additional information on the underlying innovations. For this not to happen, if one augments the information set of the agents by xit − yt, the information matrix must still be non-invertible at λ. The following Lemma shows that this is indeed the case for the equilibrium of Theorem 1.

Lemma A3. In the equilibrium with confounding dynamics of Theorem 1, consider the augmented information matrix
The 2-by-2 minors of Γ̃(L) all vanish at λ.

A.3 Proof of Proposition 1
The first step in the proof is to recognize that the expectations of agent i in equilibrium take the form of a linear combination of current and past realizations of the observed variables εit and pt, Eit(yt+1) = π̃1(L)εit + π̃2(L)ỹt, (A.55) where π̃1(L) and π̃2(L) are assumed to be representable as ratios of two finite-degree lag polynomials with zeros outside the unit circle. The guess for the equilibrium price is specified as Substituting (A.55) in the equilibrium equation (3.1) and rearranging one obtains where π1(L) ≡ βπ̃1(L) + A(L), and π2(L) ≡ βπ̃2(L). Recalling that ỹt = yt + ηt, the information set of agent i can then be expressed as Denoting the entire history of the signal vector by ω, under the assumption that U(L) = π1(L) the variance-covariance generating function for the signal vector is In addition, the covariance generating function between the signal vector and y, the variable to be predicted, can be written as gys(z) = [ π1(z)/(1 − π2(z)) σ²ε , π1(z)π1(z⁻¹)/((1 − π2(z))(1 − π2(z⁻¹))) (σ²ε + π2(z)σ²η) ]. where Ξ*(z) is the canonical factorization of the variance-covariance matrix gss(z) such that gss(z) = Ξ*(z)Ξ*(z⁻¹)ᵀ. Following Rozanov (1967) and Taub (1989), the factorization can be shown to take the form . Using (A.60) and (A.62) one can show that (0) Define τv ≡ , the fixed point condition results in Because τvβ < 1, to ensure covariance-stability of π̃1(L) we need to pick π̃1(0) so as to cancel the unstable root at the denominator. This is achieved by setting π̃1 (0) (0) .
Note that κ(L) is a known function except for the constant π̃2(0). Since β < 1, in order for the left hand side of (A.67) to be covariance-stationary, the right hand side should vanish at L = β. The constant π̃2(0) can be conveniently chosen to achieve this by setting π1(β) − κ(β) = 0. Solving this condition for π̃2(0) and plugging the expression back into (A.67) one obtains Using π2(L) = βπ̃2(L), one can use the resulting expression together with π1(L) to substitute in (A.56) and obtain (4.20) in Proposition 1. To complete the proof we need to argue why λ(L) + βτη (λ(L) − λ(β))/(L − β) = 0 must have no solution inside the unit circle. Note that, from (A.62), the determinant of the matrix Ξ*(L) is proportional to π1(L)/(1 − π2(L)), and, for the matrix to be a canonical factorization, the determinant must not vanish inside the unit circle. Because π1(L)/(1 − π2(L)) = λ(L) + βτη (λ(L) − λ(β))/(L − β), for Ξ*(L) to be the appropriate factorization, the right hand side must not vanish inside the unit circle.
A.4 Proof of Proposition 2
The candidate solution for the equilibrium in Proposition 2 is specified as yt = Qε(L)εt + Qη(L)ηt. (A.70) The first part of the proof is equivalent to that of Proposition 1, up until equation (A.60). We then need to conjecture confounding dynamics, which we do by assuming that there exists a λ ∈ (−1, 0) such that If this is the case then the matrix Ξ*(z) in (A.62) harbors confounding dynamics since its determinant vanishes at λ. For notational convenience we assume that π1(L)/(1 − π2(L)) = π(L)(L − λ), which embeds conjecture (A.71). To obtain the canonical factorization of gss(z) we apply the steps in Appendix C.3 to Ξ*(z) and we obtain For convenience define π̂2(z) = (z − λ)π2(z), and using Ξ**(z) in the Wiener-Kolmogorov formula (A.61) one obtains the following two fixed point conditions in π(L) and π̂2(L) after some straightforward rearrangements: We require π(L)(L − λ) to be stationary, which means that the two unstable roots at the denominator, ζ < 1 and ζτ(1 − τη) < 1, need to be removed. In addition, our conjecture of confounding dynamics requires the left hand side expression to vanish at L = λ. We can achieve all this by the appropriate choice of constants π(0), λ2(0) and λ. We thus have the following three conditions in three unknowns, We first note that λh2(λ) = τηh1(λ)(1 − λ²), which implies that condition (A.78) is satisfied when π(0) = A(λ) ζτ . Substituting this into the expressions for h1(L) and h2(L) one sees that, Using these expressions into (A.79) one obtains (A.83) and note that condition (A.80) is satisfied when which corresponds to (4.25) once we multiply both sides by σ²v/σ²ε. With some additional straightforward algebra it is then possible to solve for π(L) and π2(L), and using the conditions Qε(L) = π(L) and Qη(L) = π1(L)π2(L)/(1 − π2(L)), equation (4.24) obtains.
A.5 Proof of Corollary 1
The proof of the corollary is a straightforward application of the following lemma.
Lemma A4. Consider the Real Business Cycle model (5.3)-(5.4). Let the information sets be specified as in (5.10). There exists a Rational Expectations Equilibrium with Confounding Dynamics of the form, kt+1 = K(L)εt, and rt = R(L)εt, with where K(L) and R(L) are the full information solutions, τ (λ) ≡ is a function of λ that depends only on exogenous parameters, and R(L) has no zeros inside the unit circle other than λ. If there exists only one λ ∈ (−1, 1) solving (A.86), the equilibrium is unique.
Proof. The proof follows the same steps as that of Theorem 1, with the difference that we solve for X(L) first (K(L) in the application). Recall that φx = αβ, φy = 1 − αβ, φθ = 1, ψx(L) = α(1 + β) − αL, ψy(L) = 0, ψθ(L) = −1. Note that, although the notation adopted in the model has the two variables having different time subscripts, rt and kt+1, they are both pre-determined at time t, and so they are both functions of possibly the infinite history of εt up to time t. Since we are looking for an equilibrium with confounding dynamics, we operationalize the condition R(λ) = 0 by conjecturing R(L) = (L − λ)G(L), (A.87) where G(L) has no zeros inside the unit circle. Because in equilibrium R(L) = A(L) − (1 − α)K(L)L, the conjecture immediately implies a relationship that will be useful in what follows. One important remark on (A.88) is that it implies λ ≠ 0. In fact, evaluating the expression at λ = 0, provided that K(0) is well defined, which must be the case in the solution we want to characterize, gives A(0) = 0, which never holds by assumption. Hence, the statement of the Proposition requires |λ| ∈ (0, 1). The information set takes the form of (A.18), where xit = ait and yt = rt, so that Eit(at+1) and Eit(rt+1) are provided by (A.24) and (A.25), respectively. For the term Eit(kit+2) things require some extra steps.
We work under the conjecture that kit+1 = K(L)εt + V(L)vit, (A.89) Next, we evaluate the variance-covariance generating function between the information set and kit+1, which is We then use this expression, together with the canonical factorization Γ*(z) in (A.20), in the Wiener-Kolmogorov formula (2.4), and follow steps similar to (A.28) and (A.29) to finally get We can now use the expressions for the expectational terms to obtain a fixed point condition similar to (A.32), where we have used (L − λ)G(L) = A(L) − (1 − α)K(L)L, and thus −λG(0) = A(0), to substitute for terms related to G(L). The fixed point equation contains only terms related to the endogenous polynomials V(L) and K(L), and one can proceed to solve for the fixed point as in the proof of Theorem 1. In particular, using the same steps as in Lemma A2, one can show that A(λ)V(λ) = K(λ), and, in addition, we know that (A.88) holds, so we can set K(λ) = A(λ)/(λ(1 − α)). The uniqueness of a stationary solution under Assumption (s) and condition (5.7) is once again obtained by the appropriate choice of V(0) and K(0). In the end, the expression for Ak(λ), analogous to the constant A(λ) in Theorem 1, can be simplified to The condition for the existence of one |λ| ∈ (0, 1) follows from using K(L) to write R(L) and then imposing R(λ) = 0. The same argument that we have used in the proof of Theorem 1 to argue that when the equilibrium coefficients are evaluated using the λ that solves R(λ) = 0, there must be no other points at which R(L) vanishes inside the unit circle, applies here as well. This completes the proof.
The proof of Corollary 1 consists in plugging A(L) = (1 + θL)/(1 − ρL) into the above expressions and rearranging terms when possible.

B.1 Derivation of Predictions in Section 2.2
The critical step in obtaining Equation (2.9) is the solution to the annihilating operator function Multiply both numerator and denominator of the right hand side by L to get The argument is now a regular function that has an isolated singularity at L = λ. The principal part of the Laurent series expansion is determined as Using Lemma C2 and after some algebra we have Equation (2.9) immediately follows. In Section B.4 we show that (B.4) generalizes to the case of N singularities.
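The prediction obtained through the annihilator can be cross-checked numerically in the time domain. The sketch below assumes, for illustration only, a signal of the form st = ∏i (L − λi)εt with unit-variance innovations and all |λi| < 1, computes the least-squares projection of εt on a long but finite history of st from population covariances, and returns the implied impulse response of the prediction to εt. For a single root the projection weights should approach the coefficients of the operator −λ/(1 − λL). Function names, parameter values and the truncation at 200 lags are hypothetical choices.

```python
import numpy as np

def signal_polynomial(roots):
    """Coefficients, in ascending powers of L, of prod_i (L - lambda_i)."""
    coeffs = np.array([1.0])
    for lam in roots:
        coeffs = np.convolve(coeffs, np.array([-lam, 1.0]))
    return coeffs

def projection_weights(roots, n_lags=200):
    """Weights of the projection of eps_t on s_t, s_{t-1}, ..., s_{t-n_lags+1}."""
    b = signal_polynomial(roots)
    # autocovariances of s_t implied by the moving-average coefficients b
    gamma = np.zeros(n_lags)
    for j in range(len(b)):
        gamma[j] = np.dot(b[: len(b) - j], b[j:])
    idx = np.arange(n_lags)
    cov_ss = gamma[np.abs(idx[:, None] - idx[None, :])]  # Toeplitz matrix
    cov_es = np.zeros(n_lags)
    cov_es[0] = b[0]  # eps_t enters only s_t, with loading b_0
    return np.linalg.solve(cov_ss, cov_es)

def prediction_response(roots, horizon=12, n_lags=200):
    """Response of the prediction of eps_t to a unit innovation in eps_t."""
    w = projection_weights(roots, n_lags)
    return np.convolve(w, signal_polynomial(roots))[:horizon]

if __name__ == "__main__":
    print(prediction_response([-0.5], horizon=6))       # one non-invertible root
    print(prediction_response([-0.5, 0.3], horizon=6))  # two non-invertible roots
```

Working with population covariances rather than simulated data keeps the check deterministic; the only approximation is the finite number of lags, whose effect decays geometrically when the roots are well inside the unit circle.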
B.3 Confounding Dynamics with m > n
Let the signal structure be specified as in the text st = Γ(L)ut, (B.8) where Γ(L) is n × m. The objective of this section is to provide a formal definition of confounding dynamics that applies when m ≥ n, and then provide two examples for the case of m = 2 and n = 1. Denote the variance-covariance matrix of the signal vector by gss(z). The matrix gss(z) is an n × n positive-definite matrix of rank r ≤ n for |z| = 1, and with rational elements. The final goal here is to find an appropriate factorization of gss(z) that can be used in the Wiener-Kolmogorov prediction formula. To that end, we follow Rozanov (1967), pages 44-47, and we divide the factorization in two steps. First, we perform a factorization that delivers a function Γ̃(z) which is n × r, has rational elements, and is analytic inside the unit circle. Next, we check whether the function Γ̃(z) has rank r for z inside the unit circle. If not, then it means that there exist one or more points at which all the minors of order r of Γ̃(z) vanish. For our purposes, this means that the original signal structure Γ(L) is non-invertible not only because m > n, but also because of confounding dynamics. We begin here by stating the existence of the function Γ̃(z) as a Lemma.
Lemma B1. A positive definite matrix function gss(z) of dimension n × n, and of rank r ≤ n, with elements which are rational functions of z, can be represented in the form where Γ̃(z) is n × r, and the elements in Γ̃(z) are rational with respect to z and analytic inside the unit circle.
The general definition of confounding dynamics immediately follows.
Definition GCD. Let st be specified as in (B.8), with m ≥ n, rank gss(z) = r ≤ n for |z| = 1, and let Γ̃(z) be defined as in Lemma B1. The st process is said to display confounding dynamics if there exists some λ with |λ| < 1, such that rank Γ̃(λ) < r.
Note first that for m = n, and r = n, one has that Γ̃(L) = Γ(L), and the above definition is consistent with the definition stated in Section 2.1. For the case m > n, one necessarily has that r < m, which is a formal way to express the fact that the initial signal system is not able to perfectly reveal the history of ut. Matrix Γ̃(z) of Lemma B1 performs a linear combination of the m elements in ut into at most r orthogonal components, so as to have a representation of the signals st that can be used for optimal prediction. However, such linear combination might come short of providing r orthogonal components when the rank of Γ̃(z) is less than r for some z inside the unit circle. When that happens, it means that the r orthogonal components are combined in such a way that their information is confounded into fewer than r orthogonal components, i.e. confounding dynamics are present. Once Γ̃(z) is obtained and the set of λ's from the definition above identified, the derivation of the canonical factorization Γ*(z) follows the steps outlined in Appendix C.3. The canonical factorization Γ*(z) finally returns a representation of st into r orthogonal components, with variances that provide the least-squares prediction. We present two examples of signal systems with n = 1 and m = 2, which contain confounding dynamics. and the variance-covariance generating function is, Here r = 1 since gss(z) is non-zero when evaluated at |z| = 1. Applying Lemma B1, one has that Γ̃(L) = √(σ²ε + σ²v) (L − λ). Application of Definition GCD shows that the 1 × 2 process st has confounding dynamics provided that λ ∈ (−1, 1). Using the procedure in Appendix C.3 one can show that the canonical factorization here is Γ*(L) = √(σ²ε + σ²v) (1 − λL). Plugging this into the Wiener-Kolmogorov formula for the mean-squared error minimizing prediction P(εt|s^t) one obtains, , as usual.
Equation (B.13) clearly shows how the two sources of non-invertibility combine: the dimension m > n turns into the signal-to-noise coefficient τ, while confounding dynamics turn into the dynamic operator −λ/(1 − λL). In higher dimensional systems the combination takes a substantially more convoluted form.
Example 2. Consider a process st specified as where, once again, εt The variance-covariance generating function is According to Lemma B1, we have that Γ̃(L) = (1 − λ̂L)σw, where λ̂ is a solution to the quadratic equation (B.17) and σ²w is determined by σ²w λ̂ = λσ²ε. To see this note that gss(z) can be rewritten as and we are looking for the factorization Note that (B.19) can be written as Matching coefficients we get two conditions in two unknowns, namely Using (B.21) to substitute for σw in (B.22) one gets the quadratic equation (B.17). Note that our candidate Γ̃(L) satisfies the requirements of Lemma B1 independently of which root λ̂ is chosen. However, the quadratic form (B.17) is such that there is always one root inside and one root outside the unit circle. The roots are (B.23) Taking the limit of (B.23) for σv → 0, which corresponds to the case presented in Section 2.2, one has that lim σv→0 λ̂ (B.24) where the first root is the one associated with the "+" sign for the discriminant term, and the second root the one associated with the "−" sign. Suppose that we set λ̂ = λ̂+ in Γ̃(L); then according to our Definition GCD the process st has confounding dynamics whenever |λ̂+| > 1, which is always the case when |λ| < 1. The canonical factorization in this case is Γ*(L) = (1 − λ*L)σw, where λ* = λ̂−, and σ²w solves (B.21). Application of the Wiener-Kolmogorov formula then leads to Expression (B.25) shows that the confounding dynamics hallmark is retained even in presence of exogenous noise. The factor that multiplies εt in (B.25) has the same format as the factor in (2.10), with the only difference that the impact is now scaled by λ*, and the autoregressive root is also λ*. One important difference in the rectangular case is that the additional noise term also appears in the prediction function, which is represented by the term that multiplies vt in (B.25). The noise term has a persistent effect on the prediction, with the same autoregressive root as the εt term.
Definition GCD and Examples 1 and 2 show that confounding dynamics can be present in any type of signal structure. The signal matrix structure with m = n that we employ in the main text is analytically convenient, and, at the same time, expressions (B.13) and (B.25) suggest that it is without loss of generality, insofar as the purpose is to qualitatively characterize confounding dynamics.
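The root structure described in Example 2 can be checked numerically. The sketch below assumes, as an illustration consistent with the condition σ²w λ̂ = λσ²ε and with the σv → 0 limit, that the signal is st = (L − λ)εt + vt with uncorrelated innovations. Matching the variance and first autocovariance of st with those of (1 − λ̂L)σw wt then gives the quadratic λλ̂² − (1 + λ² + σ²v/σ²ε)λ̂ + λ = 0, whose two roots multiply to one. The function name and the parameter values are hypothetical.

```python
import numpy as np

def factorize_noisy_signal(lam, sigma_eps, sigma_v):
    """Roots and scale of g_ss(z) = (1 - lh z)(1 - lh / z) sigma_w^2.

    Assumes s_t = (L - lam) eps_t + v_t with lam != 0 and |lam| < 1.
    Returns (root inside the unit circle, root outside, sigma_w^2 for the
    canonical factorization built on the inside root).
    """
    ratio = (sigma_v / sigma_eps) ** 2
    # lam * lh^2 - (1 + lam^2 + ratio) * lh + lam = 0
    roots = np.real(np.roots([lam, -(1.0 + lam ** 2 + ratio), lam]))
    inside = roots[np.abs(roots) < 1][0]
    outside = roots[np.abs(roots) > 1][0]
    # first-autocovariance condition: sigma_w^2 * lh = lam * sigma_eps^2
    sigma_w2 = lam * sigma_eps ** 2 / inside
    return inside, outside, sigma_w2

if __name__ == "__main__":
    inside, outside, sigma_w2 = factorize_noisy_signal(-0.5, 1.0, 0.5)
    print(inside, outside, sigma_w2)
    # as the noise vanishes the roots approach lam and 1 / lam
    print(factorize_noisy_signal(-0.5, 1.0, 1e-6)[:2])
```

The discriminant of the quadratic is bounded below by (1 − λ²)², so the roots are always real and distinct under the stated assumptions, which is why taking real parts is harmless here.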

B.4 N Non-Invertible Roots
The following proposition describes the prediction formula for the innovations of a process with N non-invertible roots.
with |λi| < 1, for i = 1, ..., N . The least squares prediction P(εt|s t ) is given by Proof. The first step in the proof is to figure out the canonical factorization of gss(z) when st is as in (B.26). Rozanov (1967) method applies directly here so that (1 − λiL). (B.28) The application of the Wiener-Kolmogorov prediction formula results in the following The next Lemma is useful in solving for the annihilating operator.
The proof of the Lemma is by induction, repeatedly using Lemma C2 to obtain a solution for N = 1, 2, 3, ....

Application of the Lemma leads to
The final result of the proposition then immediately follows. Figure 4 shows the impulse response of the prediction formula for N = 2 and N = 3. As one can see, the mechanical alternation of over-reaction and under-reaction typical of the N = 1 case is lost here. In fact, longer and asymmetric cycles become clearly possible.
Under the form we have assumed for θit, we always have that for j ≥ 1, a property that will keep the notation below manageable. To initiate the iterative substitution, take (B.32) one period forward so that xit+1 = φxEit+1(xit+2) + φyEit+1 ( Proceeding in such manner up to some arbitrary time J, one ends up getting a weighted sum of expectations of different orders about θt+j, for j = 1, ..., J + 1, while the remaining endogenous variables xit+J+1 and yt+J+1 multiply coefficients that tend to zero as. Letting J → ∞ and aggregating over agents, one finally obtains where Ē^j(θt+j) stands for the j-th order average expectation of θt+j, and, for notational convenience, we let Ē^0(θt) = θt. The way the operator P^j_φx works is visible in the first four terms of the last line of (B.41) where j = 2. In the first and second term, φx appears as the first coefficient, which results in the expectation being of the first order. In the last two terms, φy appears as the first coefficient, which results in the expectation being of the second order. In subsequent substitutions the pattern that results in the reduction of some of the higher order compounding is quite complex, as the number of combinations of relative positions of φx and φy in the coefficients grows at the rate 2^n, so we omit it here. For instance, if one considers the expression at the 3rd iteration, i.e. 
for J = 3, then one has yt = terms with xit+4, yt+4 + φxφxφx + φxφxφy Ē t(θt+3) + + φxφyφx + φxφyφy Ē t Ē t+2(θt+3) Note that, together with the direct average expectationĒt(θt+3), and the expectation of third order (the average expectation of the average expectation of the average expectation)Ēt Ē t+1 Ē t+2(θt+3) , expression (B.43) also displays the average expectation at t of the average expectation at t + 2,Ēt Ē t+2(θt+3) . Substituting further, other combinations of higher order expectations compounding appear too. In summary, equation (B.43) shows that, in presence of dispersed information, the pattern of higher order expectations can be extremely cumbersome, and so the requirement of the canonical approach to work out each possible combination of expectations quickly becomes prohibitive.

B.6 Real Business Cycle Application: Full Model and Log-Linearization
The economy is structured in a continuum of islands indexed by i ∈ [0, 1]. Each island is inhabited by a representative household i and by a representative firm i. Household i supplies labor services exclusively to firm i in a decentralized competitive labor market. Labor of household i is the only labor productive in firm i. Households own capital in the economy and rent it out to firms in a centralized spot market. Capital, expressed in consumption goods, is productive in all the firms across the islands. The problem for Household i can be then written as with β ∈ (0, 1), η > 0, and subject to a sequence of budget constraints of the form where Wit is the wage rate in the labor market of island i and Rt is the rental rate of capital in the centralized capital market. Households are assumed, for the moment, to supply labor N (s) it inelastically at the prevailing wage rate. We normalize the labor supplied by household i to N (s) it+i denotes the total capital that household i is bringing into period t + 1. The superscript (s) stands for "supply" to denote the fact that the capital that household i is bringing into period t + 1 will be the amount supplied by the same household in the centralized rental capital market at t + 1. Symmetrically, in what follows the superscript (d) will stand for demand.
The problem for the representative firm in island i is Output Yit is supplied in the centralized market for output. In other words, in this economy there is only one consumption good centrally traded. The price of the consumption good at t is normalized to 1. Output is produced by firm i according to a Cobb-Douglas technology with labor and capital inputs and a technological factor Zit that can be specified as The term at is common across all the islands, while εit is a productivity component that is specific to island i. The existence of a decentralized labor market together with an island specific productivity results in a labor income with an idiosyncratic risk component against which households would like to insure. We assume that markets for state contingent securities are not available, so that household i has to bear the labor income risk. We will also assume that the idiosyncratic labor income risk is not present in steady state, which means that the wealth distribution of the economy in steady state would be degenerate and an economy-wide representative household will exist. This is relevant at the linearization stage, as one would want to linearize the first order conditions for each island around the same steady state. The first order conditions for firm i are it Rt, so that the wage bill in island i is equal to a fraction (1 − α) of output in the island, while the capital bill is the remaining fraction α. In addition, under the assumption that N it . This will always be the case, independently of the equilibrium behavior of the variables. The Euler equation for household i is while, using the first order conditions from firm i and the market clearing for the labor market, the budget constraint can be re-written as it .
In addition, a no-Ponzi condition is assumed so that the solution path to the steady state has to satisfy the usual transversality condition. To close the model in terms of market interactions we need to specify the market clearing conditions for capital and for output, formally
We will work with a log-linearized version of the economy around a steady state that is derived under the assumption that the long-run unconditional average of $Z_{it}$ is 1. Notice that this implies that the economy does not display a growth trend in steady state. This is without loss of generality for the purpose of the application.
For any variable $X_t$ we define $x_t$ by $X_t = X^* e^{x_t}$, where $X^*$ is the steady state value of $X_t$. The log-linearized economy is given by the following set of equations (details available from the authors upon request). The budget constraint for household $i$ is The market clearing condition for aggregate capital is while the market clearing for aggregate output is The set of equations (B.44)-(B.51) completely describes the equilibrium dynamics of the linearized economy, conditional on the sequence of cross-sectional distributions of information sets implicit in the expectational operator $E_{it}$ for $i \in [0, 1]$ and all $t$. Setting $\delta = 1$, one obtains the model equations of Section 5.
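As an illustration of the change of variables $X_t = X^* e^{x_t}$, the sketch below computes log deviations for a Cobb-Douglas production function. The functional form and all numerical values are assumptions made for the example; only the steady-state value $Z^* = 1$ is taken from the text. For this particular function the log-linear relation $y = z + \alpha k + (1 - \alpha) n$ holds exactly, and $y$ is close to the percentage deviation of output from steady state.

```python
import math

ALPHA = 0.33  # hypothetical capital share


def log_dev(value, steady_state):
    """Return x such that value = steady_state * exp(x)."""
    return math.log(value / steady_state)


def output(z, k, n, alpha=ALPHA):
    """Assumed Cobb-Douglas technology Y = Z * K**alpha * N**(1 - alpha)."""
    return z * k**alpha * n ** (1.0 - alpha)


# Steady state with Z* = 1, and a nearby point (hypothetical values)
z_ss, k_ss, n_ss = 1.0, 4.0, 1.0
z1, k1, n1 = 1.02, 4.1, 1.0

y_dev = log_dev(output(z1, k1, n1), output(z_ss, k_ss, n_ss))
linear = (log_dev(z1, z_ss)
          + ALPHA * log_dev(k1, k_ss)
          + (1.0 - ALPHA) * log_dev(n1, n_ss))
pct_change = output(z1, k1, n1) / output(z_ss, k_ss, n_ss) - 1.0
```

Equations that involve sums, such as a budget constraint, are only approximately linear in log deviations, which is why the remaining model equations are linearized around the steady state rather than transformed exactly.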

B.7 Derivation of Full Information Solution of RBC Model
In Section 4.1 we solved for $Y(L)$, which would correspond to $R(L)$ in the present application, and left the solution of $X(L)$, here $K(L)$, as a corollary. In the application of Section 5 this route is precluded by the fact that here $\gamma_x(L) = (1 - \alpha)L$, which means $\gamma_x(0) = 0$. We thus take the alternative route, as shown in Appendix A.1: we solve for $K(L)$ and leave the solution for $R(L)$ as a straightforward corollary. Under full information we know that $E_{it}(k_{it+2}) = [K(L) - K(0)] L^{-1} \varepsilon_t + [V(L) - V(0)] L^{-1} v_{it}$, $E_{it}(k_{t+1}) = k_{t+1} = K(L)\varepsilon_t$, and $E_{it}(a_{it+1}) = E_{it}(a_{t+1}) = [A(L) - A(0)] L^{-1} \varepsilon_t$. Substituting (5.3) into (5.4), using the above expressions for the expectations, aggregating over agents, multiplying both sides by $L$, and rearranging, one obtains the fixed point condition To ensure stationarity we choose . Next substitute this expression in (B.52), and specify $A(L) = \frac{1 + \theta L}{1 - \rho L}$. By construction the numerator polynomial contains the factor $(\zeta - L)$, which can be easily isolated and cancelled against the same factor in the denominator, so as to finally obtain .
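The full-information expectation $E_t(a_{t+1}) = [A(L) - A(0)] L^{-1} \varepsilon_t$ amounts to dropping the leading moving-average coefficient and shifting the remaining ones forward by one period. The sketch below illustrates this for $A(L) = (1 + \theta L)/(1 - \rho L)$ with hypothetical values of $\theta$ and $\rho$, and compares the shifted coefficients with the closed form $(\rho + \theta)/(1 - \rho L)$ implied by the algebra. The function names are illustrative.

```python
def ma_coeffs(theta, rho, n):
    """First n moving-average coefficients of A(L) = (1 + theta L) / (1 - rho L).

    They satisfy a_0 = 1, a_1 = rho * a_0 + theta, and a_j = rho * a_{j-1} for j >= 2.
    """
    a = [1.0]
    for j in range(1, n):
        a.append(rho * a[j - 1] + (theta if j == 1 else 0.0))
    return a


def forecast_coeffs(a):
    """Coefficients of [A(L) - A(0)] L^{-1}: drop a_0 and shift forward by one lag."""
    return a[1:]


theta, rho, n = 0.4, 0.9, 12  # hypothetical parameter values
a = ma_coeffs(theta, rho, n)
shifted = forecast_coeffs(a)

# Closed form: [A(L) - A(0)] L^{-1} = (rho + theta) / (1 - rho L)
closed_form = [(rho + theta) * rho**j for j in range(n - 1)]
max_gap = max(abs(x - y) for x, y in zip(shifted, closed_form))
```

The two coefficient sequences coincide, so with this specification the one-step-ahead forecast of $a_{t+1}$ is an AR(1)-type filter of the shocks with impact coefficient $\rho + \theta$.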

C Technical Preliminaries for Frequency Domain Methods
Elementary results concerning the theory of stationary stochastic processes and the residue calculus are necessary for grasping the z-transform approach advocated in Rondina and Walker (2016). The purpose of this appendix is to offer readers unfamiliar with the methods used in Rondina and Walker (2016) the additional background necessary such that the paper is self-contained. The appendix introduces important theorems that are relatively well known but is by no means exhaustive. Interested readers are directed to Brown and Churchill (2013) and Whittle (1983) for good references on complex analysis and stochastic processes. Sargent (1987) provides a good introduction to these concepts and discusses economic applications.

C.1 Variance-Covariance Generating Function
Consider two covariance-stationary linear-Gaussian multivariate processes, $\{\omega_t, t \in \mathbb{Z}\}$ and $\{s_t, t \in \mathbb{Z}\}$, where the vector dimensions are $n \times 1$ and $m \times 1$, respectively. Let $\Upsilon_{\omega s}(j)$ denote the $n \times m$ unconditional covariance matrix between $\omega_t$ and $s_{t-j}$, for $j \in \mathbb{Z}$, formally $\Upsilon_{\omega s}(j) = E(\omega_t s_{t-j}^T)$, where $T$ denotes transpose. The variance-covariance generating function is then defined as $g_{\omega s}(z) = \sum_{j=-\infty}^{\infty} \Upsilon_{\omega s}(j) z^j$, where $g_{\omega s}(z)$ is an $n \times m$ matrix. When $\omega_t = s_t$ the function is referred to as the auto-covariance generating function and denoted by $g_{\omega\omega}(z)$, or $g_{ss}(z)$. An extensive treatment of the properties of the variance-covariance generating function can be found in Sargent (1987).
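As a scalar illustration, the auto-covariance generating function of a finite moving average $s_t = \Gamma(L)u_t$ with unit innovation variance can be computed in two ways: by summing the autocovariances against powers of $z$, or as the product $\Gamma(z)\Gamma(z^{-1})$. The sketch below, with hypothetical MA coefficients and illustrative function names, checks that the two agree at a point on the unit circle.

```python
import cmath


def autocovariances(ma, sigma2=1.0):
    """Autocovariances gamma(j), j = 0..q, of s_t = sum_k ma[k] * u_{t-k}."""
    q = len(ma) - 1
    return [sigma2 * sum(ma[k] * ma[k + j] for k in range(q + 1 - j))
            for j in range(q + 1)]


def acgf_from_covariances(gamma, z):
    """g(z) = sum over j = -q..q of gamma(j) z^j, using gamma(-j) = gamma(j)."""
    total = gamma[0]
    for j in range(1, len(gamma)):
        total += gamma[j] * (z**j + z ** (-j))
    return total


def acgf_from_ma(ma, z, sigma2=1.0):
    """g(z) = Gamma(z) * sigma2 * Gamma(1/z) for a scalar MA polynomial."""
    def poly(x):
        return sum(c * x**k for k, c in enumerate(ma))
    return poly(z) * sigma2 * poly(1.0 / z)


ma = [1.0, 0.5, -0.3]         # hypothetical MA(2) coefficients
gamma = autocovariances(ma)   # gamma(0), gamma(1), gamma(2)
z = cmath.exp(1j * 0.7)       # a point on the unit circle
gap = abs(acgf_from_covariances(gamma, z) - acgf_from_ma(ma, z))
```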

C.2 Wold Fundamental Representation Theorem
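Much of the analysis in Rondina and Walker (2016) is conducted in the space of lag polynomials without assuming specific functional forms (e.g., ARMA(m, n)). The Wold representation theorem allows for such a general specification: any $n \times 1$ covariance-stationary process $s_t$ can be represented as $s_t = \Gamma^*(L)\hat{w}_t + \eta_t$, where $\Gamma^*(L)$ is a matrix polynomial in the lag operator, and $\sum_{j=0}^{\infty} \Gamma^*_j \Gamma^{*T}_j$ is convergent. The process $\hat{w}_t$ is $n$-variate white noise with $E(\hat{w}_t) = 0$, $E(\hat{w}_t \hat{w}_t') = I_n$ and $E(\hat{w}_t \hat{w}_{t-m}') = 0$ for $m \neq 0$. The process $\Gamma^*_0 \hat{w}_t$ is the innovation in predicting $s_t$ linearly from its own past: $\Gamma^*_0 \hat{w}_t = s_t - P(s_t \mid s_{t-1}, s_{t-2}, \ldots)$, where $P(\cdot)$ denotes linear projection. The process $\eta_t$ is linearly deterministic; there exist an $n \times 1$ vector $c_0$ and $n \times n$ matrices $C_s$ such that, without error, $\eta_t = c_0 + \sum_{s=1}^{\infty} C_s \eta_{t-s}$ and $E[\hat{w}_t \eta_{t-m}'] = 0$ for all $m$.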
The Wold representation theorem states that any covariance-stationary process can be written as the sum of a (possibly infinite) moving average, whose innovations are the linear forecast errors for $s_t$, and a process that can be predicted arbitrarily well by a linear function of past values of $s_t$. The representation is determined by the second moments of the stochastic process only and therefore may not fully capture the data generating process. For example, because the decomposition is linear, a process could be deterministic in the strict sense and yet linearly non-deterministic; Whittle (1983) provides examples of such processes. The innovations in the Wold representation are generated by linear projections, which need not coincide with the conditional expectation $E[s_t \mid s_{t-1}, s_{t-2}, \ldots]$. However, when working with linear Gaussian stochastic processes, as is standard in the rational expectations literature, the conditional expectation coincides with the linear projection.
As discussed in detail in the paper, the innovations derived from the Wold representation are an essential element of a rational expectations equilibrium in that they define the information set obtained from conditioning on current and past values of $s_t$. Using the language of Rozanov (1967), the innovation sequence $\{\hat{w}_{t-j}\}_{j=0}^{\infty}$ of (C.4) is fundamental for the sequence $\{s_{t-j}\}_{j=0}^{\infty}$ if the Hilbert space generated by the observables is equivalent (in mean-square norm) to the Hilbert space generated by the innovations.
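The role of fundamentalness can be illustrated with a scalar MA(1), $s_t = u_t + \theta u_{t-1}$. The sketch below, which uses hypothetical parameter values and illustrative function names, tries to recover the shocks from current and past observations by inverting the moving average recursively, starting from a wrong guess (zero) for the unobserved pre-sample shock. When $|\theta| < 1$ the initial error dies out and the observables reveal the shocks; when $|\theta| > 1$ the same recursion amplifies the initial error, so the shocks $u_t$ are not fundamental for $s_t$.

```python
import random


def simulate_ma1(theta, shocks):
    """s_t = u_t + theta * u_{t-1}; shocks[0] is the unobserved pre-sample shock."""
    return [shocks[t] + theta * shocks[t - 1] for t in range(1, len(shocks))]


def recover_innovations(theta, s):
    """Invert the MA(1) recursively, guessing zero for the pre-sample shock."""
    recovered, prev = [], 0.0
    for value in s:
        prev = value - theta * prev
        recovered.append(prev)
    return recovered


def final_error(theta, shocks):
    """Absolute gap between the last recovered shock and the true one."""
    s = simulate_ma1(theta, shocks)
    return abs(recover_innovations(theta, s)[-1] - shocks[-1])


random.seed(0)
shocks = [random.gauss(0.0, 1.0) for _ in range(61)]
err_fundamental = final_error(0.5, shocks)     # zero of 1 + 0.5 z lies outside the unit circle
err_nonfundamental = final_error(2.0, shocks)  # zero of 1 + 2 z lies inside the unit circle
```

After 60 periods the recovery error is proportional to $|\theta|^{60}$ times the pre-sample shock, which is negligible in the first case and enormous in the second.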
C.3 Canonical Factorization
Let $s_t$ be specified as
$s_t = \Gamma(L)u_t$, (C.5)
where $\Gamma(L)$ is a lag polynomial matrix with square-summable coefficients, and $u_t$ is an $m \times 1$ vector of i.i.d. Gaussian innovations with zero mean and variance-covariance matrix $\Sigma_u$. Testing for fundamentalness in a process like (C.5) can be done by checking for the invertibility of $\Gamma(z)$ for $|z| < 1$. 30 While testing for fundamentalness is straightforward, deriving the unique fundamental Wold representation is not. Here we rely on powerful factorization theorems. The following definition is taken from Whittle (1983). If $g(z)$ is a square matrix function of $z$, then the canonical factorization
$g(z) = \Gamma^*(z)\Gamma^*(z^{-1})^T$, (C.6)
will refer to a factorization in which both $\Gamma^*(z)$ and its inverse have valid expansions in non-negative powers of $z$ for $|z| \leq 1$, and $\Gamma^*(z^{-1})^T$ and its inverse have expansions in non-positive powers of $z$ for $|z| \geq 1$.
The canonical factorization of the variance-covariance generating function delivers the Wold fundamental representation of the time series. It yields both the fundamental moving average Wold representation and the autoregressive representation that is consistent with span $\{s_{t-j}\}_{j=0}^{\infty}$ being equivalent to span $\{\hat{w}_{t-j}\}_{j=0}^{\infty}$. There does not exist a general method to compute analytically the canonical factorization of an arbitrary matrix $g(z)$. Several methods that have been proposed work well when knowledge of the structure of the matrix $g(z)$ is used. Rozanov (1967), page 47, proposes an algorithm to obtain $\Gamma^*(z)$ from $\Gamma(L)$ that can be used when $\Gamma(L)$ is known to have a finite number of isolated singularities $\lambda_1, \lambda_2, \ldots, \lambda_p$ inside the unit circle. In the case of $n = 2$ and one singularity at $|\lambda| < 1$, one has that
$\Gamma^*(z) = \Gamma(z) W_\lambda B_\lambda(z)$, (C.7)
where $B_\lambda(z)$ is the Blaschke matrix
$B_\lambda(z) = \begin{pmatrix} 1 & 0 \\ 0 & \frac{1 - z\lambda}{z - \lambda} \end{pmatrix}$. (C.8)
The constant matrix $W_\lambda$ is a unitary matrix, so that $W_\lambda^{-1} = W_\lambda^T$, whose columns are the left singular vectors of the system matrix $\Gamma(z)$ evaluated at $\lambda$. While the inverse of $\Gamma(\lambda)$ does not exist, the pseudo-inverse (also known as the Moore-Penrose inverse) always exists and it is given by where $V \Sigma U^T$ is the singular value decomposition of the matrix $\Gamma(\lambda)$. The left singular vectors of $\Gamma(\lambda)$ are the columns of $U$. The following result, which can be shown by using widely available formulas for the singular value decomposition of 2 by 2 matrices, is used in the proof of Lemma A1. (C.11)
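A scalar analogue of (C.7) may help fix ideas. In the sketch below the moving average $\Gamma(z) = z - \lambda$ has a zero inside the unit circle, and multiplying it by the Blaschke factor $(1 - \lambda z)/(z - \lambda)$ flips that zero to $1/\lambda$ without changing the covariance generating function. This is a simplification made for illustration: $\lambda$ is taken to be real, and the rotation $W_\lambda$, which is needed only in the matrix case, is omitted. Function names and the value of $\lambda$ are hypothetical.

```python
import cmath


def blaschke(z, lam):
    """Scalar Blaschke factor (1 - lam * z) / (z - lam) for real |lam| < 1."""
    return (1.0 - lam * z) / (z - lam)


def gamma_nonfundamental(z, lam):
    """Gamma(z) = z - lam, which vanishes at z = lam inside the unit circle."""
    return z - lam


def gamma_star(z, lam):
    """Gamma*(z) = Gamma(z) * B(z) = 1 - lam * z, which vanishes at z = 1 / lam."""
    return gamma_nonfundamental(z, lam) * blaschke(z, lam)


lam = 0.4
points = [cmath.exp(2j * cmath.pi * k / 16) for k in range(16)]  # unit circle

# The Blaschke factor has modulus one on the unit circle ...
modulus_gap = max(abs(abs(blaschke(z, lam)) - 1.0) for z in points)

# ... so both representations imply the same covariance generating function.
spectrum_gap = max(
    abs(gamma_nonfundamental(z, lam) * gamma_nonfundamental(1 / z, lam)
        - gamma_star(z, lam) * gamma_star(1 / z, lam))
    for z in points
)
```

Because the two representations share the same second moments, they cannot be told apart from the autocovariances alone; only $\Gamma^*(z)$ is invertible in non-negative powers of $z$.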

C.4 Riesz-Fischer Theorem
Theorem C2. [Riesz-Fischer] Let $D(\sqrt{r})$ denote a disk in the complex plane of radius $\sqrt{r}$ centered at the origin. There is an equivalence (i.e., an isometric isomorphism) between the space of $r$-summable sequences, $\sum_j r^j |f_j|^2 < \infty$, and the Hardy space of analytic functions $f(z)$ in $D(\sqrt{r})$ satisfying the restriction $\frac{1}{2\pi i} \oint f(z) f(r z^{-1}) \frac{dz}{z} < \infty$, where $\oint$ denotes (counterclockwise) contour integration around $D(\sqrt{r})$. An analytic function satisfying the above condition is said to be $r$-integrable. 31
The Riesz-Fischer theorem ensures that one can work either in the space of infinite sequences of square-summable matrix coefficients $\{\ldots, Q_{-1}, Q_0, Q_1, \ldots\}$, or in the space of complex-valued analytic functions $Q(z) = \sum_{j=-\infty}^{\infty} Q_j z^j$, since the two spaces are equivalent.
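The equivalence can be checked numerically in a simple case. The sketch below assumes that the integrability restriction takes the form $\frac{1}{2\pi i}\oint f(z) f(r z^{-1}) \frac{dz}{z}$ on the circle $|z| = \sqrt{r}$, and uses the hypothetical sequence $f_j = a^j$, for which $f(z) = 1/(1 - az)$ and the weighted sum of squares is $1/(1 - r a^2)$. The contour integral is approximated by an equally spaced sum over the circle.

```python
import cmath
import math


def weighted_sum(a, r, n_terms=200):
    """Time-domain norm: sum over j of r^j * f_j^2 for the sequence f_j = a^j."""
    return sum((r * a * a) ** j for j in range(n_terms))


def f(z, a):
    """Analytic function whose power-series coefficients are a^j."""
    return 1.0 / (1.0 - a * z)


def contour_integral(a, r, n_points=512):
    """Approximate (1 / 2 pi i) * integral of f(z) f(r / z) dz / z on |z| = sqrt(r)."""
    radius = math.sqrt(r)
    total = 0.0
    for k in range(n_points):
        z = radius * cmath.exp(2j * math.pi * k / n_points)
        # With z = radius * exp(i * angle), dz / z = i d(angle), so the i's cancel.
        total += f(z, a) * f(r / z, a)
    return total / n_points


a, r = 0.6, 0.9  # hypothetical values with a * sqrt(r) < 1
lhs = weighted_sum(a, r)
rhs = contour_integral(a, r)
```

Here `lhs` and `rhs` agree to numerical precision, which is the sense in which the sequence space and the function space carry the same norm.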

C.5 Wiener-Kolmogorov Prediction Formula
Consider the problem of computing the linear least-squares estimate of $\omega_{t+j}$, denoted by $\hat{\omega}_{t+j}$, conditional on the realized history $\{s_{t-k}\}_{k=0}^{\infty}$. The solution to the problem consists of the sequence of real-valued matrices $\{\Pi_k\}_{k=0}^{\infty}$, or, equivalently, the complex-valued function $\Pi(z)$, such that
$\hat{\omega}_{t+j} = \sum_{k=0}^{\infty} \Pi_k s_{t-k}$. (C.12)
We assume that all the processes have zero unconditional mean. The following result is a version of the Wiener-Kolmogorov prediction formula taken from Whittle (1983).
The requirement that $\Gamma^*(z)$ be the canonical factorization of $g_{ss}(z)$ is essential in two steps. First, it ensures that $[\Gamma^*(z^{-1})^T]^{-1}$ has an expansion in non-positive powers of $z$, which means that the term $h(z)[\Gamma^*(z^{-1})^T]^{-1}$ disappears when the annihilating operator is applied. Second, it ensures that $\Gamma^*(z)^{-1}$ has an expansion in non-negative powers of $z$, which results in $\Pi(z)$ having an expansion in non-negative powers of $z$ only, as required.
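The mechanics of the annihilating operator can be sketched in the simplest scalar case. The example below assumes the textbook form of the one-step-ahead predictor for a fundamental MA(1) with unit innovation variance, $\Pi(z) = [z^{-1}(1 + \theta z)]_+ (1 + \theta z)^{-1}$, which is a special case in which the variable to be predicted is $s_t$ itself. Polynomials are stored as dictionaries mapping powers of $z$ to coefficients; all names and the value of $\theta$ are hypothetical.

```python
def shift(coeffs_by_power, j):
    """Multiply a (Laurent) polynomial by z^{-j}."""
    return {p - j: c for p, c in coeffs_by_power.items()}


def annihilate(coeffs_by_power):
    """[.]_+ : keep only the terms in non-negative powers of z."""
    return {p: c for p, c in coeffs_by_power.items() if p >= 0}


def inverse_ma1(theta, n):
    """First n coefficients of 1 / (1 + theta z), valid when |theta| < 1."""
    return [(-theta) ** k for k in range(n)]


def predictor_coeffs(theta, n):
    """Pi(z) = [z^{-1} (1 + theta z)]_+ / (1 + theta z), truncated at n terms."""
    numerator = annihilate(shift({0: 1.0, 1: theta}, 1))  # leaves {0: theta}
    inv = inverse_ma1(theta, n)
    pi = [0.0] * n
    for p, c in numerator.items():
        for k in range(n - p):
            pi[p + k] += c * inv[k]
    return pi


theta = 0.5
pi = predictor_coeffs(theta, 8)  # weights on s_t, s_{t-1}, ...

# Loading of the forecast on the shocks u_t, u_{t-1}, ...: Pi(z) * (1 + theta z)
loading = [pi[0]] + [pi[k] + theta * pi[k - 1] for k in range(1, 8)]
```

The weights on past observations are $\theta(-\theta)^k$, and once they are expressed in terms of the shocks the forecast collapses to $\theta u_t$, the only part of $s_{t+1}$ that is known at time $t$.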
C.6 Whiteman (1983) Solution Method: Existence and Uniqueness of REE
We use the existence and uniqueness criteria of Whiteman (1983) developed for linear rational expectations equilibria. The following works through the three relevant cases: existence without uniqueness, no existence, and existence with uniqueness. Consider the following generic rational expectations model where $\varepsilon_t$ is assumed to be fundamental for $\theta_t$ (i.e., $A(L)$ is assumed to have a one-sided inverse in non-negative powers of $L$). Following the solution principle, we will look for a solution that is square-summable in the Hilbert space generated by the fundamental shock $\varepsilon$, $y_t = Q(L)\varepsilon_t$ (third tenet). If we invoke the optimal prediction formula (C.14), then
Appealing to the Riesz-Fischer Theorem, square-summability (stationarity) in the time domain is tantamount to analyticity of $Q(z)$ on the unit disk. The function $Q(z)$ is analytic at $z_0$ if it is continuously (complex) differentiable in an open neighborhood of $z_0$. 32 Any rational function $f(z)/g(z)$, where $f(\cdot)$ and $g(\cdot)$ are polynomials, will be analytic on the unit disk provided $g(z) \neq 0$ at every point inside the unit circle. The extent to which this is true for $Q(z)$ depends upon the parameters $\zeta$ and $\tilde{\zeta}$. As shown in Whiteman (1983), there are three cases one must consider.
First, assume that $|\zeta| > 1$ and $|\tilde{\zeta}| > 1$. Then (C.20) is an analytic function on $|z| < 1$ and the representation is given by For any finite value of $Q_0$, this is a solution that lies in the Hilbert space generated by $\{\theta_t\}$ and satisfies the tenets of the solution principle. Thus, we have existence but not uniqueness because $Q_0$ can be set arbitrarily.
The second case to consider is $|\zeta| < 1 < |\tilde{\zeta}|$. In this case, the function $Q(z)$ has an isolated singularity at $\zeta$, implying that $Q(z)$ is not analytic on the unit disk. The free parameter $Q_0$ can then be used to remove the singularity at $\zeta$, by setting $Q_0$ in such a way as to cause the residue of $Q(\cdot)$ to be zero at $\zeta$:
$\lim_{z \to \zeta} (\zeta - z) Q(z) = \frac{\zeta A(\zeta) + Q_0}{\zeta - \tilde{\zeta}} = 0$.
Solving for $Q_0$ delivers $Q_0 = -\zeta A(\zeta)$. Substituting this into (C.21) yields the following rational expectations equilibrium
The function $Q(z)$ is now analytic and (C.22) is the unique solution that lies in the Hilbert space generated by $\{\theta_t\}$.
32 Analytic is synonymous with holomorphic, regular and regular analytic.
The solution is the ubiquitous Hansen-Sargent prediction formula that clearly captures the cross-equation restrictions that are the "hallmark of rational expectations models" [Hansen and Sargent (1980)]. 33
The final case to consider is $|\zeta| < 1$ and $|\tilde{\zeta}| < 1$. In this case, (C.20) has two isolated singularities at $\zeta$ and $\tilde{\zeta}$, and $Q_0$ cannot be set to remove both singularities. 34 Hence in this case, there is no solution in the Hilbert space generated by $\{\theta_t\}$ and we do not have existence.
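The way the free constant $Q_0$ removes the singularity in the second case can be illustrated numerically. The sketch below rests on several assumptions that are not taken from the text: the candidate solution is written as $Q(z) = [zA(z) + Q_0]/[(\zeta - z)(\tilde{\zeta} - z)]$, a form chosen only because it is consistent with the condition $Q_0 = -\zeta A(\zeta)$; the exogenous process is $A(z) = 1/(1 - \rho z)$; and all parameter values are hypothetical. With the pinned-down $Q_0$ the numerator vanishes at $z = \zeta$, so $Q(z)$ stays bounded near $\zeta$; with any other value it diverges.

```python
def A(z, rho):
    """Hypothetical exogenous MA representation A(z) = 1 / (1 - rho z)."""
    return 1.0 / (1.0 - rho * z)


def Q(z, q0, zeta, zeta_tilde, rho):
    """Assumed form of the candidate solution with free constant q0."""
    return (z * A(z, rho) + q0) / ((zeta - z) * (zeta_tilde - z))


def Q_closed_form(z, zeta, zeta_tilde, rho):
    """Q(z) for this A(z) once the factor (zeta - z) has been cancelled by hand."""
    return -1.0 / ((1.0 - rho * zeta) * (1.0 - rho * z) * (zeta_tilde - z))


# Second case: one root inside the unit circle, one outside
rho, zeta, zeta_tilde = 0.8, 0.5, 2.0
q0_star = -zeta * A(zeta, rho)   # value of Q0 that zeroes the residue at zeta

z_near = zeta + 1e-6             # evaluation point next to the singularity
value_pinned = Q(z_near, q0_star, zeta, zeta_tilde, rho)
value_free = Q(z_near, 0.0, zeta, zeta_tilde, rho)
```

In this example `value_pinned` matches the closed form in which the common factor has been cancelled, while `value_free` is several orders of magnitude larger, reflecting the pole that remains when $Q_0$ is left arbitrary.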