Essex Finance Centre Working Paper Series, Working Paper No 29: 01-2018

Testing for Parameter Instability in Predictive Regression Models

Abstract

We consider tests for structural change, based on the SupF and Cramer-von-Mises type statistics of Andrews (1993) and Nyblom (1989), respectively, in the slope and/or intercept parameters of a predictive regression model where the predictors display strong persistence. The SupF type tests are motivated by alternatives where the parameters display a small number of breaks at deterministic points in the sample, while the Cramer-von-Mises alternative is one where the coefficients are random and slowly evolve through time. In order to allow for an unknown degree of persistence in the predictors, and for both conditional and unconditional heteroskedasticity in the data, we implement the tests using a fixed regressor wild bootstrap procedure. The asymptotic validity of the bootstrap tests is established by showing that the asymptotic distributions of the bootstrap parameter constancy statistics, conditional on the data, coincide with those of the asymptotic null distributions of the corresponding statistics computed on the original data, conditional on the predictors. Monte Carlo simulations suggest that the bootstrap parameter stability tests work well in finite samples, with the tests based on the Cramer-von-Mises type principle seemingly the most useful in practice. An empirical application to U.S. stock returns data demonstrates the practical usefulness of these methods.


Introduction
Predictive regression (hereafter PR) is a widely used tool in applied finance and economics. A leading example concerns whether future stock returns can be predicted by current information. In this context PR methods have been extensively utilised in studies of mutual fund performance, tests of the conditional CAPM and studies of optimal asset allocation; see Paye and Timmermann (2006, pp. 274-275) and references therein. Predictors commonly considered for returns include the dividend yield, the term structure of interest rates, and default premia. It is often found that the posited predictor (e.g. the dividend yield) exhibits strongly persistent behaviour akin to that of a (near-) unit root autoregressive process, whilst the variable being predicted (e.g. the stock return) resembles a (near-) martingale difference sequence [m.d.s.].
Predictability tests which are asymptotically valid when the putative predictor is strongly persistent and driven by innovations which are correlated with the series being predicted (the latter is often thought to be the case; e.g., the stock price is a component of both the return and the dividend yield) have been proposed in Cavanagh et al. (1995), Campbell and Yogo (2006), Kostakis et al. (2015), Breitung and Demetrescu (2015), Elliott et al. (2015) and Jansson and Moreira (2006), inter alia. These approaches are all based on the maintained assumption that the coefficients of the PR model are constant over time. There is, however, a growing body of empirical evidence casting doubt on this assumption. Henkel et al. (2011), for example, find that return predictability in the stock market appears to be closely linked to economic recessions, with dividend yield and term structure variables displaying predictive power only during recessions. Johannes et al. (2014) find strong empirical evidence of time-variation in the parameters of PRs for returns, including evidence of non-constant volatility. Timmermann (2008) argues that for most time periods stock returns are not predictable but that there are 'pockets in time' where evidence of local predictability is seen. Paye and Timmermann (2006) also cite a number of applied studies which find significant evidence of in-sample (ex post) predictability in returns data yet find very weak evidence of out-of-sample (ex ante) predictability, and argue that a possible explanation is structural instability in the predictive relations involved. Paye and Timmermann (2006) use tests for structural breaks in a model's parameters to investigate the structural stability of PRs for stock returns, testing for breaks in the coefficients on state variables (including the lagged dividend yield, short interest rate, term spread and default premium) for a data-set of monthly stock returns for ten OECD countries.
They find evidence of instability for the vast majority of these countries, arguing that the "[e]mpirical evidence of predictability is not uniform over time and is concentrated in certain periods" (op. cit., p. 312). They also present simulation evidence on the size and power of the structural change tests they consider for the case where the predictors involved are I(0) and a one-time break is allowed in the coefficient on a single predictor, and conclude in favour of the approach of Bai and Perron (1998, 2003). A significant drawback of applying the Bai and Perron approach to the PR model, however, is that it is not asymptotically valid in cases where the predictive variables are (near-) unit root processes. Moreover, as argued by Cai et al. (2015, p. 954) and the references therein, its focus on models of abrupt deterministic coefficient change might be considered unattractive in practice relative to tests designed for the case where the parameters of the PR are random and evolve smoothly over time. Indeed, using Bayesian model selection and averaging methods, Dangl and Halling (2012) conclude that time-variation in the coefficients of return prediction models is very important, with a random walk coefficients model performing best in practice, quickly adapting to changes in environment. They also find evidence suggesting that predictability is linked to the business cycle. The Bai and Perron approach also requires that the variables in the PR do not display unconditional heteroskedasticity, which would again appear to considerably limit its applicability for financial data; see, e.g., Johannes et al. (2014).
Our aim here is to address these shortcomings and develop structural change tests that can be more reasonably applied to empirically testing the constancy of the intercept and slope parameters in a PR model driven by heteroskedastic innovations. In earlier work, Georgiev et al. (2018) [GHLT hereafter], we investigated a variant of the stationarity test of Kwiatkowski et al. (1992) [KPSS] in the context of the PR model. This is a test of the instability of the regression intercept and can be viewed as a test against the alternative that the error in the PR model follows a near-unit root process. As such, GHLT interpret this as a test for spurious predictability. The present paper extends the work in GHLT to cover tests on all or a subset of the parameters of the PR model, not just the intercept, thereby allowing us to also investigate the constancy or otherwise of the slope parameter on the predictive regressor.
In the light of the arguments above, we consider parameter constancy tests based on the SupF type statistics of Andrews (1993) and the Cramer-von-Mises type statistics of Nyblom (1989). The former are designed for abrupt deterministic change models and the latter for (near-) unit root coefficient models. Although originally developed for asymptotically stationary regressors, Hansen (1992a) examines the large sample properties of these statistics for the case of pure unit root regressors, showing how these limits differ from the asymptotically stationary case. However, in the context of the PR model we need to go further and allow for the case where the predictive regressor is a near-unit root process. Doing so introduces the considerable complication, relative to the case of a pure unit root regressor, that the limiting null distributions of the parameter constancy statistics depend on the local-to-unity (persistence) parameter of the putative predictor. In principle, this makes it very difficult to control the size of the tests given that this parameter is unknown in practice and cannot be consistently estimated.[1]

[1] Cai et al. (2015) also develop a test against smooth parameter variation in the parameters of the PR model based on a non-parametric L2-type statistic. However, their proposed statistic requires the variables in the PR to be homoskedastic. We therefore do not consider their approach further here.
To resolve this problem we use bootstrap implementations of the parameter constancy tests which treat the putative predictor as a fixed regressor; i.e., the observed data on the predictor is used in calculating the bootstrap analogues of the structural change statistics. Because, as noted above, many economic and financial time series are thought to display non-stationary volatility and/or conditional heteroskedasticity, it is also important for our proposed bootstrap tests to be (asymptotically) robust to these effects. To achieve this we use a heteroskedasticity-robust variant of the fixed regressor bootstrap approach proposed in Hansen (2000). We show that this approach yields asymptotically size-controlled tests, without requiring knowledge of the local-to-unity parameter or the form of any heteroskedasticity present, and delivers tests which are powerful against both forms of coefficient variation considered. Moreover, the bootstrap tests are also valid when the predictive regressors are asymptotically stationary or contain a mix of both asymptotically stationary and strongly persistent regressors. They are also valid for regressors whose marginal distributions are subject to structural change, meaning that rejection by the bootstrap tests can be unambiguously interpreted as evidence for structural instability in the slope coefficients of the PR, even where the predictors themselves display structural change.
Closely related to this paper, Hansen (2000) also applies the fixed regressor bootstrap to the Andrews (1993) and Nyblom (1989) statistics we consider here, and Paye and Timmermann (2006) include the fixed regressor bootstrap implementation of the Andrews (1993) test in their simulation study. Although Hansen (2000) employs assumptions that allow for pure and near-unit root behaviour in the regressor variables, we demonstrate that his formal analysis needs an amendment. Hansen (2000) justifies bootstrap validity by claiming equivalence of the limiting distribution of the bootstrap parameter constancy statistics given the data and the unconditional limiting distribution of the original test statistics under the null. We show that this equivalence does not occur; in particular, the limits of the bootstrap statistics in his Theorems 5 and 6 on page 107 are both imprecisely stated when the predictive regressors are (near-) unit root processes. We establish that the fixed regressor bootstrap is nevertheless valid, at least in the PR set-up we consider, in the sense that the bootstrap parameter instability tests are asymptotically size-controlled. This is done by demonstrating that the limiting distributions of the bootstrap statistics, conditional on the data, are the same as the limiting null distributions of the corresponding statistics computed on the original data, conditional on the predictors.
The paper is organised as follows. Section 2 outlines our basic time-varying parameter PR model. To aid lucidity, we expound our approach through a single predictor variable whose innovations are serially uncorrelated. Generalisations to allow for multiple predictors and weak dependence are discussed in Section 6. Section 3 outlines the structural change statistics and derives their asymptotic distributions. Section 4 details the fixed regressor wild bootstrap tests based on these statistics and establishes their asymptotic validity. The asymptotic local power of the bootstrap tests is examined in Section 5, while Section 7 presents Monte Carlo simulation results investigating their finite sample performance. An empirical application to monthly U.S. stock returns data is presented in Section 8. Section 9 concludes. Proofs appear in Appendix A. Additional material relating to the limiting distributions of the statistics given in Section 3 is provided in an accompanying on-line supplementary appendix.

The predictive regression model with structural change
The basic PR model allowing for structural change that we consider for observed y_t is given by

y_t = α_t + β_t x_{t-1} + ϵ_yt, t = 1, . . . , T (1)

where ϵ_yt is a mean zero innovation process and x_t is an observed process, specified according to the data generating process [DGP]

x_t = µ + s_xt, t = 0, . . . , T (2)
s_xt = ρ_x s_{x,t-1} + ϵ_xt, t = 1, . . . , T (3)

where ρ_x := 1 - c_x T^{-1} with c_x ≥ 0, such that x_t is a strongly persistent unit root or local-to-unit root autoregressive process with mean zero innovation process ϵ_xt. We let s_x0 be an O_p(1) variate. Exact conditions on the innovations ϵ_yt and ϵ_xt will be given in Assumption 1 below. The DGP in (1) generalises the constant parameter PR model by allowing the intercept and slope coefficients to vary over time.
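To fix ideas, the DGP in (1)-(3) is straightforward to simulate under constant parameters (a = b = 0). The sketch below is illustrative only: the sample size, the local-to-unity parameter c_x, the slope value and the negative ϵ_x/ϵ_y correlation are arbitrary choices, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 500
c_x = 10.0                 # local-to-unity parameter; c_x = 0 gives a pure unit root
rho_x = 1.0 - c_x / T      # rho_x := 1 - c_x * T^{-1}
mu, alpha, beta = 0.0, 0.0, 0.02

# Correlated innovations (eps_x, eps_y); the negative correlation mimics the
# stock return / dividend yield case discussed in the text.
corr = -0.9
e = rng.standard_normal((T + 1, 2))
eps_x = e[:, 0]
eps_y = corr * e[:, 0] + np.sqrt(1.0 - corr**2) * e[:, 1]

# x_t = mu + s_xt, with s_xt = rho_x * s_{x,t-1} + eps_xt and s_x0 = 0
s_x = np.zeros(T + 1)
for t in range(1, T + 1):
    s_x[t] = rho_x * s_x[t - 1] + eps_x[t]
x = mu + s_x

# y_t = alpha + beta * x_{t-1} + eps_yt, t = 1, ..., T
y = alpha + beta * x[:-1] + eps_y[1:]
```

Setting c_x = 0 recovers the pure unit root predictor, while larger c_x values produce a mean-reverting but still strongly persistent x_t.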
To nest the constant parameter PR within (1) we formulate the time-varying intercept and slope coefficients as: α t := (α + as αt ) and β t := (β + bs βt ). The parameter instability tests we discuss in this paper are, by construction, invariant to the values of α and β. However, in the context of a time-invariant PR (i.e., α t = α, β t = β) with near-unit root predictors it is usual to follow Cavanagh et al. (1995) and parameterise β to be local-to-zero at an appropriate rate; precisely, β = g * T −1 where g * is some constant.
This entails that under parameter constancy, and when g* is nonzero, y_t is a near-m.d.s. process. Where β is instead fixed, as in Shin (1994), (1) should rightly be interpreted as a co-integrating regression because y_t will be a (near-) unit root process. However, because no particular parameterisation of β is needed for the theoretical results which follow (only for the PR interpretation of (1)), we do not directly impose a localisation on β; moreover, in the case where x_t is (asymptotically) stationary no such localisation of β is needed for a PR interpretation of (1).
In the context of (1) our focus will centre on testing the null hypotheses that the intercept and slope parameters are constant over time against the alternative that they vary over time through the sequences of associated time-varying coefficients, s αt and s βt . This can be done by testing the restrictions that a = 0 and b = 0 in (1). We will consider two possible mechanisms.

S: Stochastic Coefficient Variation
The first mechanism we consider for time variation in α_t and β_t in (1), in the spirit of Nyblom (1989), is one where s_αt and s_βt follow (near-) unit root processes. That is,

s_αt = ρ_α s_{α,t-1} + ϵ_αt, s_βt = ρ_β s_{β,t-1} + ϵ_βt, t = 1, . . . , T (4)

where ρ_α := 1 - c_α T^{-1} and ρ_β := 1 - c_β T^{-1} with c_α ≥ 0 and c_β ≥ 0, such that s_αt and s_βt are unit root or local-to-unit root autoregressive processes.[2] Precise m.d.s.-based assumptions on the innovations ϵ_αt and ϵ_βt will be given in Assumption 1. The coefficient processes are initialised at s_α0 = s_β0 = 0.
In the context of (4), the PR in (1) reduces to a fixed coefficient model when a = 0 and b = 0. The intercept alone is random if a ≠ 0 while b = 0. In this situation, if (1) is treated as a fixed coefficient regression model, it is then under-specified by an unobserved (local-to) unit root autoregressive process; this is akin to the omission of a valid (local-to) unit root predictive regressor, as studied in GHLT. If a = 0 while b ≠ 0, treating (1) as a fixed coefficient regression model ignores the fact that the relationship between y_t and the predictive regressor x_{t-1} is not stable but is evolving through time. If a ≠ 0 and b ≠ 0, then both forms of mis-specification are present together when (1) is assumed to be a fixed coefficient model. In terms of hypothesis testing, then, we summarise these possibilities via the following taxonomy covering the null, H_0, and the various alternatives, H_S, in the context of (1) and (4):

H_0 : a = 0 and b = 0 (constant intercept and slope);
H_S_1 : a ≠ 0, b = 0 (the intercept varies);
H_S_x : a = 0, b ≠ 0 (the slope coefficient varies);
H_S_1x : a ≠ 0 and/or b ≠ 0 (the intercept, the slope coefficient, or both, vary).

[2] For (1) to be interpreted as a PR the parameter a should, parallelling the discussion surrounding β above, be localised as a = g_α T^{-1} under (4), otherwise y_t will be a (near-) unit root process.
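The scheme S coefficient paths in (4) can be sketched with a minimal simulation, under illustrative (not paper-supplied) parameter values; setting a = b = 0 recovers the constant-parameter null H_0.

```python
import numpy as np

rng = np.random.default_rng(1)

T = 500
c_alpha, c_beta = 0.0, 5.0          # c = 0 gives a pure random walk coefficient
rho_a = 1.0 - c_alpha / T           # rho_alpha := 1 - c_alpha * T^{-1}
rho_b = 1.0 - c_beta / T            # rho_beta  := 1 - c_beta  * T^{-1}
a, b = 0.05, 0.05                   # a = b = 0 recovers the null H_0

# s_alpha and s_beta initialised at zero, per the text
s_a = np.zeros(T + 1)
s_b = np.zeros(T + 1)
eps_a = rng.standard_normal(T + 1)
eps_b = rng.standard_normal(T + 1)
for t in range(1, T + 1):
    s_a[t] = rho_a * s_a[t - 1] + eps_a[t]
    s_b[t] = rho_b * s_b[t - 1] + eps_b[t]

# alpha_t := alpha + a * s_alpha,t and beta_t := beta + b * s_beta,t
alpha_t = 0.0 + a * s_a
beta_t = 0.02 + b * s_b
```

With c_beta > 0 the slope path slowly mean-reverts towards β, matching the near-martingale case discussed in Remark 3 below.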

N: Non-stochastic Coefficient Variation
The second mechanism we consider for time variation in α_t and β_t in (1) follows, among others, Andrews (1993) and is one where they are subject to abrupt changes which occur at a fixed number of deterministic points in the sample. For simplicity we will expound our analysis through the case of a one-time break, although the extension to allow for multiple such breaks is straightforward. However, where it is thought that multiple breaks are possible, the stochastic coefficient variation case might be considered a more natural framework; see also Remark 3 below. In the one-time break case s_αt and s_βt are modelled as

s_αt = s_βt = D_t(⌊τ_0 T⌋), t = 1, . . . , T (5)

where D_t(⌊τT⌋) := I(t ≥ ⌊τT⌋), with ⌊τT⌋ denoting a generic shift point with associated break fraction τ, ⌊·⌋ the integer part of its argument and I(·) the indicator function. We take the true shift fraction τ_0 to be unknown to the practitioner but to satisfy τ_0 ∈ Λ, where Λ is a compact subset of (0, 1). Here then, at time ⌊τ_0 T⌋, the intercept changes value from α to α + a and the coefficient on x_{t-1} changes value from β to β + b. The corresponding taxonomy covering the null, H_0, and the various alternatives, H_N, in the context of (1) and (5) is then:

H_0 : a = 0 and b = 0 (constant intercept and slope);
H_N_1 : a ≠ 0, b = 0 (the intercept breaks);
H_N_x : a = 0, b ≠ 0 (the slope coefficient breaks);
H_N_1x : a ≠ 0 and/or b ≠ 0 (the intercept, the slope coefficient, or both, break).

We conclude this section by detailing in Assumption 1 the conditions that we will place on the innovation vector ϵ_t := [ϵ_xt, ϵ_yt, ϵ_αt, ϵ_βt]′ in what follows, noting that only the assumptions pertaining to the leading two elements of ϵ_t are germane under scheme N. Some remarks follow.
Assumption 1. The innovation process ϵ_t can be written as ϵ_t = H D_t e_t where: (a) H is a 4 × 4 non-stochastic lower triangular matrix with unit leading diagonal, and D_t := diag(d_1(t/T), . . . , d_4(t/T)) is a 4 × 4 non-stochastic diagonal matrix whose elements d_i(·), i = 1, . . . , 4, are strictly positive, bounded functions displaying at most a countable number of jumps; (b) e_t is a 4 × 1 vector m.d.s. with respect to a filtration F_t, to which it is adapted, with conditional covariance matrix σ_t := E(e_t e_t′ | F_{t-1}) satisfying: (i) E(e_t e_t′) = I_4 and T^{-1} Σ_{t=1}^T σ_t →_p I_4, with →_p denoting convergence in probability as T → ∞, and (ii) sup_t E‖e_t‖^{4+δ} < ∞ for some δ > 0, where for any vector, x, ‖x‖ denotes the usual Euclidean norm, ‖x‖ := (x′x)^{1/2}.

Remark 1. Assumption 1 implies that ϵ_t is a vector m.d.s. relative to F_t, with conditional variance matrix Ω_{t|t-1} := E(ϵ_t ϵ_t′ | F_{t-1}) = (H D_t) σ_t (H D_t)′, and time-varying unconditional variance matrix Ω_t := E(ϵ_t ϵ_t′) = (H D_t)(H D_t)′. Conditional heteroskedasticity and non-stationary unconditional volatility are obtained as special cases with d_i(·) = d_i, i = 1, 2, 3, 4 (constant unconditional variance, hence only conditional heteroskedasticity), and σ_t = I_4 (so Ω_{t|t-1} = Ω_t = Ω(t/T), only unconditional non-stationary volatility), respectively. Assumption 1(a) implies that the elements of Ω_t are only required to be bounded and to display a countable number of jumps, therefore allowing for a wide class of models for the behaviour of the variance matrix of ϵ_t (subject to the structure imposed by H), including single or multiple (co-) variance shifts, variances which follow a broken trend, and smooth transition variance shifts. Assumption 1(b) coincides with the m.d.s. conditions in Assumption 1 of Breitung and Demetrescu (2015), except that the cross product moment summability condition given there is not required as we do not allow ϵ_xt to be serially correlated at this stage. We will discuss extensions to allow for this in Section 6.1, where a corresponding condition will be introduced. Deo (2000) provides examples of commonly used stochastic volatility and generalised autoregressive conditional heteroskedasticity (GARCH) processes that satisfy Assumption 1(b). □

Remark 2.
Assumption 1 permits correlation between the elements of ϵ t through the elements h ij , i = 2, 3, 4, j = 1, 2, 3, of the matrix H. In particular, where h 21 ̸ = 0, then y t and the innovations driving x t , ϵ xt , are correlated. □
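The innovation structure ϵ_t = H D_t e_t of Assumption 1 can be illustrated by simulation. In this sketch the matrix H, the variance functions d_i(·) and the mid-sample volatility shift are all hypothetical choices made for illustration, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

T = 400
# Lower triangular H with unit diagonal; h_21 != 0 correlates eps_y with eps_x.
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [-0.9, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])

def d(r):
    # Piecewise-constant variance functions d_i(t/T): a one-time upward shift
    # in the volatility of eps_y at r = 0.5 (illustrative).
    return np.array([1.0, 1.0 if r < 0.5 else 2.0, 1.0, 1.0])

e = rng.standard_normal((T, 4))      # plays the role of the m.d.s. e_t
eps = np.empty((T, 4))
for t in range(T):
    D_t = np.diag(d((t + 1) / T))
    eps[t] = H @ D_t @ e[t]          # eps_t = H * D_t * e_t
```

Other shapes allowed by Assumption 1(a), such as trending or smooth-transition variance paths, amount to different choices of d_i(·).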

Remark 3.
Where c_α = c_β = 0, such that ρ_α = ρ_β = 1, (4) is a martingale of the form considered in Equation (2.1) of Nyblom (1989, p. 224). This permits α_t and β_t to undergo either a deterministic or a random number of jumps of random magnitude, with the number of jumps remaining (on average) a non-vanishing fraction of the sub-sample size in every sub-sample. Where the (expected) number of jumps is lower than the sample size, they occur at random points in the sample. Where c_α > 0, c_β > 0, [s_αt, s_βt]′ is a near-martingale and the coefficient processes α_t and β_t display long run mean reversion (towards α and β, respectively). □

Parameter constancy tests
We first outline the structural change statistics that we will consider for testing parameter constancy in the PR in (1). We will then establish the large sample properties of these statistics.

S: Stochastic Coefficient Variation
To test H_0 against H_S we adopt the LM statistic of Nyblom (1989). Under certain conditions, including homoskedasticity and the requirement that ρ_α = ρ_β = 1 in (4), then, conditional on x_t, this test statistic has a Locally Best Invariant (LBI) property. For testing H_0 against H_S_1x, the relevant LM statistic, denoted LM_1x, is a Cramer-von-Mises-type quadratic form in the partial sums of the products of the regressors under test, (1, x_{t-1})′, with the OLS residuals ê_t from the fitted regression

y_t = α̂ + β̂ x_{t-1} + γ̂ ∆x_t + ê_t, t = 1, . . . , T. (7)

[3] Notice that the assumptions that E(e_t e_t′) = I_4 made in part (b)i and that the leading diagonal elements of H are unity involve no loss of generality.
As in Shin (1994), (7) contains the additional regressor ∆x_t to account for the possibility of a non-zero correlation between ϵ_xt and ϵ_yt (which occurs when h_21 ≠ 0 in Assumption 1). The same will be needed in the context of the SupF statistics considered below. We can also consider the corresponding single parameter LM statistics, LM_1 and LM_x, relating to the intercept alone and to the slope coefficient alone, respectively. Therefore LM_1 is appropriate for testing H_S_1, while LM_x is appropriate for testing H_S_x. The LM_1 statistic coincides with the statistic proposed in GHLT.
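For concreteness, a Nyblom/Hansen (1992b)-type LM statistic of the kind described above can be sketched as follows. This is a minimal homoskedastic implementation assuming the standard partial-sum quadratic form; the paper's exact definition and scaling of LM_1x may differ.

```python
import numpy as np

def nyblom_lm(y, x):
    """Nyblom-type LM parameter-constancy statistic for the PR model.

    Homoskedastic sketch (illustrative scaling): tests joint constancy of
    the intercept and the slope on x_{t-1}.  Delta x_t is included in the
    fitted regression, as in (7), to absorb eps_x / eps_y correlation.
    y has length T; x has length T + 1 (so x[:-1] supplies x_{t-1}).
    """
    T = len(y)
    X = np.column_stack([np.ones(T), x[:-1], np.diff(x)])  # [1, x_{t-1}, dx_t]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    e_hat = y - X @ coef
    sigma2 = e_hat @ e_hat / T

    Z = X[:, :2]                                   # directions under test
    S = np.cumsum(Z * e_hat[:, None], axis=0)      # partial-sum process S_t
    V = Z.T @ Z
    # Cramer-von-Mises-type quadratic form: (1/(T*sigma2)) * sum_t S_t' V^{-1} S_t
    return np.trace(np.linalg.solve(V, S.T @ S)) / (T * sigma2)
```

Using only the first (or second) column of Z gives single-parameter analogues in the spirit of LM_1 and LM_x.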

N: Non-stochastic Coefficient Variation
To test H_0 against H_N we use the SupF statistic of Andrews (1993). In a rather general, but asymptotically stationary, setting, Andrews (1993) shows that a test based on this statistic has certain weak asymptotic local optimality properties against this form of parameter variation. For testing H_0 against H_N_1x in (1) and (5), this statistic is given by

SupF_1x := sup_{τ∈Λ} F(τ), F(τ) := T (σ̂² - σ̂²(τ)) / σ̂²(τ) (9)

with σ̂² defined as above, and σ̂²(τ) := T^{-1} Σ_{t=1}^T ê_t(τ)², where ê_t(τ) are the OLS residuals from the regression

y_t = α + β x_{t-1} + γ ∆x_t + a D_t(⌊τT⌋) + b D_t(⌊τT⌋) x_{t-1} + error_t. (10)

To test against H_N_1, exclude D_t(⌊τT⌋)x_{t-1} from (10); denote the resulting statistic by SupF_1. For testing against H_N_x, D_t(⌊τT⌋) is excluded from (10), and we denote this statistic by SupF_x.
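A SupF-type computation over a trimmed grid of candidate break points can be sketched as below. The RSS-based form of F(τ) and the 15% trimming are common conventions and are assumptions here rather than details taken verbatim from the paper.

```python
import numpy as np

def sup_f(y, x, trim=0.15):
    """SupF_{1x}-type statistic: sup over break points k in the trimmed
    range of a scaled F-statistic for excluding the break dummies
    D_t and D_t * x_{t-1} from the regression (10).

    Homoskedastic sketch; the paper's exact finite-sample scaling of
    F(tau) may differ.  y has length T; x has length T + 1.
    """
    T = len(y)
    # Restricted regression: y_t on [1, x_{t-1}, dx_t], as in (7)
    Xr = np.column_stack([np.ones(T), x[:-1], np.diff(x)])
    er = y - Xr @ np.linalg.lstsq(Xr, y, rcond=None)[0]
    rss_r = er @ er

    best = -np.inf
    for k in range(int(trim * T), int((1 - trim) * T) + 1):
        D = (np.arange(1, T + 1) >= k).astype(float)   # D_t(floor(tau*T))
        Xu = np.column_stack([Xr, D, D * x[:-1]])      # unrestricted regression (10)
        eu = y - Xu @ np.linalg.lstsq(Xu, y, rcond=None)[0]
        rss_u = eu @ eu
        best = max(best, T * (rss_r - rss_u) / rss_u)
    return best
```

Dropping the D * x[:-1] (respectively D) column from the unrestricted regression gives statistics in the spirit of SupF_1 (respectively SupF_x).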
Remark 4. The LM and SupF statistics are used to test the same null hypothesis, H_0, but differ in which alternative hypothesis they are directed towards. Still, as Hansen (1992a, p. 325) points out, they will ". . . tend to have power in similar directions . . .". The numerical results reported later in this paper accord with this view. Hansen argues that, as a result, the choice between the tests might be made on computational grounds, and that this favours the LM statistic. He also argues that the purpose of the test is important: if one is looking to test against a rapid change in regime then the SupF statistics would be appropriate, while ". . . if one is simply interested in testing whether or not the specified model is a good model that captures a stable relationship, the notion of martingale parameters is more appropriate, since it captures the notion of an unstable model that gradually shifts over time" (op. cit., p. 325). □

Asymptotic distribution theory
Under Assumption 1, the conditions of Lemma 1 of Boswijk et al. (2016) are satisfied, such that the scaled partial sum processes of the elements of ϵ_t converge weakly; in particular, T^{-1/2} Σ_{t=1}^{⌊rT⌋} ϵ_it ⇒ B_ηi(r), where B_ηi(·) is a variance-transformed Brownian motion; see, for example, Davidson (1994, p. 484). Under unconditional homoskedasticity, B_ηi(·) reduces to a standard Brownian motion, up to scale. It will also prove convenient to define the variance-transformed Ornstein-Uhlenbeck [OU] process M_ηx,cx(r) := ∫_0^r e^{-c_x(r-s)} dB_η1(s), which characterises the weak limit of the scaled predictor process T^{-1/2} s_x,⌊rT⌋. In order to examine the asymptotic local power properties of the tests we discuss, we will specify H_S and H_N as local-to H_0 by normalising the parameters a and b to be local-to-zero. The relevant normalisations are different for a and b and differ according to which form of coefficient variability is being considered.
Specifically, under scheme S these are given by a = g_α T^{-1} and b = g_β T^{-3/2} in H_S, while under scheme N they are given by a = g_α T^{-1/2} and b = g_β T^{-1} in H_N_1x. In each case g_α and g_β are fixed Pitman drift constants. Notice also that in these local settings H_S and H_N reduce to H_0 when g_α = g_β = 0. In what follows, reference to these alternative hypotheses is understood to be made under these localisations, unless otherwise stated.
We now provide representations for the asymptotic distributions of the LM and SupF statistics under the local alternatives stated above. In Theorem 1 we do this for LM 1x and SupF 1x in terms of matrix-valued processes. Alternative expressions for these limiting distributions in terms of scalar processes, together with those for the single parameter LM 1 , LM x , SupF 1 and SupF x statistics are provided in the on-line supplement.

Theorem 1. Consider the model in (1)-(3) and let Assumption 1 hold.
Then under the null hypothesis and the local alternatives outlined above, Remark 5. The limit expressions given in Theorem 1 for the LM 1x and SupF 1x statistics can be regarded as statistics of the LM and SupF type, respectively, in the context of a continuous-time least squares regression of dY (r) on dr andM ηx,cx (r)dr. Under the null hypothesis of parameter stability, Q (r) = 0 for all r in these representations, while under the alternative hypotheses considered, the presence of parameter instability affects the limit distributions of both test statistics through the process Q (·) which is a function of the Pitman drifts, g α and g β . Although motivated under specific forms of instability, both statistics can therefore be seen to be sensitive to both of the considered alternatives. □ Remark 6. The representations in Theorem 1 for the limiting distributions of LM 1x and SupF 1x depend, under both the null, H 0 , and the local alternatives considered, on the local-to-unity parameter, c x , characterising the degree of persistence in x t . For LM 1x this dependence can be seen more clearly in the alternative representation of its limiting distribution in Corollary S.1 in the supplement. For SupF 1x , consider for simplicity the benchmark case of unconditional homoskedasticity in ϵ t , and observe first that the limiting processes J(·), V (·) and Q (·) all depend on c x .
It follows that the limiting null distribution of F(τ) from (9) under unconditional homoskedasticity is χ²(2), regardless of whether x_{t-1} is a pure unit root (c_x = 0), a near-unit root (c_x > 0), or even an asymptotically stationary process (see Andrews, 1993, for the latter). However, upon taking the supremum over τ ∈ Λ, the distribution of the resulting functional SupF_1x conditional on {A(s)}_{s∈[0,1]} depends on all the conditional covariances of {V(r) - V(r)V(1)^{-1}V(r)}^{-1/2} J(r), and not just on its pointwise conditional variance, and so depends on c_x. This dependence carries over to the unconditional distribution of SupF_1x. □

Remark 7. It can also be seen from the representations given in Theorem 1 that the limiting distributions of the structural change statistics do not depend on any of the elements of the matrix H in Assumption 1, under either the constant parameter null hypothesis, H_0, or the local alternatives involving non-stochastic coefficient variation N. They do, however, depend on any unconditional non-stationary volatility present in the innovations through the variance-transformed Brownian motion B_η2(r) and the variance-transformed OU process M_ηx,cx(r); that is, on any unconditional heteroskedasticity present in ϵ_yt and ϵ_xt. Where the local alternatives pertain to the stochastic coefficient variation scheme S in (4), these limiting distributions also depend on any unconditional heteroskedasticity present in ϵ_αt when g_α ≠ 0 and in ϵ_βt when g_β ≠ 0, and on the correlation between ϵ_yt, ϵ_xt and ϵ_αt, ϵ_βt. □

Remark 8. The representations given in Theorem 1 fit with the generic representations given in Theorem 2 of Hansen (2000). Hansen (2000, p. 98) gives a set of high-level conditions governing the weak convergence of the sample moments of the data and gives representations for the limiting distributions of the parameter constancy statistics under those conditions.
The model set-up with associated assumptions that we consider here is, in the benchmark case of unconditional homoskedasticity in ϵ_t and of no correlation between ϵ_yt and ϵ_xt (i.e., h_21 = 0), an example which satisfies Hansen's conditions, and so we would expect our result to accord with his generic result. This is indeed seen to be so on noting that the processes V(·) and ∫_0^· A(s) dY(s) which appear in Theorem 1 coincide with the generic M(·) and N(·) processes, respectively, in Theorem 2 of Hansen (2000) under the specific conditions of the benchmark case outlined above. □

Remark 9. Where x_{t-1} is asymptotically stationary (in the sense of Definition 1 of Hansen (2000)), and the error term d_2t e_2t is homoskedastic, the asymptotic null distributions of the LM_1x and SupF_1x statistics are, by Theorem 1 of Hansen (2000), of the form given in Equation (3.3) of Nyblom (1989, p. 226) and Theorem 3 of Andrews (1993, p. 838), respectively. More generally, suppose that the sample moment processes underlying V(·) and V_η(·) converge in probability to deterministic processes (say, Ṽ(·) and Ṽ_η(·), respectively), which are continuous and, for r > 0, positive definite. Suppose further that the corresponding scaled partial sum process converges weakly to a zero-mean Gaussian process (say, J̃_0(·)) with independent increments and variance function Ṽ_η(·), and let x_{t-1} be such that the inclusion of ∆x_t in (7) eliminates the effects of h_21 d_1t e_1t from the residual, ê_t. Then the asymptotic null distributions of the LM_1x and SupF_1x statistics are as given in Theorem 1, but with J̃_0 and Ṽ in place of ∫ A dY and V, respectively.[4] The dependence of these distributions on (among other things) heteroskedasticity introduced by a general function d_2(·) of the form given in Assumption 1 would make the use of a bootstrap approximation desirable and, by the arguments of Theorems 5 and 6 of Hansen (2000), the fixed regressor wild bootstrap we outline in Section 4 would be asymptotically valid.
Because in this case the fixed regressor bootstrap statistics would converge to non-random distributions, the focus on conditioning which characterises the central results of this paper for the case of strongly persistent regressors becomes unnecessary. However, the key point is that the fixed regressor wild bootstrap implementations of the structural change tests we consider will be asymptotically valid regardless of whether x t satisfies the conditions outlined in Section 2 (the proof of which is given in Section 4.1) or the generic conditions outlined above, and can therefore be validly used regardless of which of these two set-ups holds for x t , or, when allowing for multiple predictors (see Section 6.2), the case where both types are present; indeed, they are also asymptotically valid in cases where the degree of persistence of the variables in x t changes over the sample. Moreover, as demonstrated in Theorems 5 and 6 of Hansen (2000), the fixed regressor bootstrap tests will also be asymptotically valid if the set of predictors includes regressors whose marginal distributions are subject to structural change of the form given in Section 4 of Hansen (2000). □

Fixed regressor wild bootstrap tests
As the results in the previous section show, implementing tests based on the LM and SupF statistics will require us to address the fact that their limiting null distributions depend on any unconditional heteroskedasticity present in ϵ_xt and ϵ_yt, and on the persistence parameter c_x. To account for the former we employ a wild bootstrap procedure based on the residuals ê_t of the fitted regression (7), while for the latter we use the observed outcome on x := [x_0, x_1, . . . , x_T]′ as a fixed regressor when implementing the bootstrap procedure.

[4] The corresponding limiting distributions under local alternatives can be obtained by appropriately modifying the limiting process Q(·) in Theorem 1.
We now outline our fixed regressor wild bootstrap approach in Algorithm 1. To aid exposition we do so for the bootstrap tests based on the LM_1x statistic, but it should be entirely clear how the same approach can be applied to the LM_1, LM_x, SupF_1x, SupF_1 and SupF_x statistics, with the resulting bootstrap analogues of these statistics correspondingly denoted by LM*_1, LM*_x, SupF*_1x, SupF*_1 and SupF*_x, respectively.

Algorithm 1 (Fixed Regressor Wild Bootstrap):

(i) Generate the bootstrap sample as y*_t := w_t ê_t, t = 1, . . . , T, where ê_t are the residuals from the fitted regression (7) and {w_t} is an IID sequence with mean zero and unit variance (for example, standard normal variates), generated independently of the data.

(ii) Calculate the fixed regressor wild bootstrap analogue of LM_1x as outlined in Section 3, but with y*_t in place of y_t and with the regressor ∆x_t omitted. Denote the resulting bootstrap statistic as LM*_1x.

(iii) Define the corresponding p-value as P*_T := 1 − G*_T(LM_1x), where G*_T(·) denotes the conditional (on the data) cumulative distribution function of LM*_1x. In practice, G*_T(·) will be unknown, but can be simulated in the usual way.

(iv) The wild bootstrap test of H_0 at level ξ rejects if P*_T ≤ ξ.
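As a purely illustrative sketch of how such a bootstrap p-value might be computed, the following Python fragment treats the regressor as fixed across bootstrap draws and rebuilds the bootstrap sample from residual-times-multiplier draws. The helper name `fixed_regressor_wild_bootstrap_pvalue`, the generic `stat_fn` argument, the simple OLS residual step and the Gaussian multipliers are all assumptions of this sketch, not the paper's exact implementation.

```python
import numpy as np

def fixed_regressor_wild_bootstrap_pvalue(y, x, stat_fn, B=499, rng=None):
    """Sketch of a fixed regressor wild bootstrap p-value.

    stat_fn(y, x) computes a parameter-constancy statistic from the data;
    the regressor x is held fixed (never resampled) across bootstrap draws.
    """
    rng = np.random.default_rng(rng)
    T = len(y)
    # Residuals from the fitted regression of y on an intercept and the
    # regressor (the Delta x_t term of regression (7) is omitted for brevity).
    X = np.column_stack([np.ones(T), x])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    e_hat = y - X @ beta_hat
    stat = stat_fn(y, x)
    # Wild bootstrap: y*_t = w_t * e_hat_t with IID N(0,1) multipliers w_t;
    # the statistic is recomputed on (y*, x) with x treated as fixed.
    boot_stats = np.empty(B)
    for b in range(B):
        y_star = rng.standard_normal(T) * e_hat
        boot_stats[b] = stat_fn(y_star, x)
    # Bootstrap p-value with the usual finite-B correction.
    return (1 + np.sum(boot_stats >= stat)) / (B + 1)
```

Any of the constancy statistics discussed above could be passed as `stat_fn`; the key point is that `x` enters every bootstrap replication unchanged.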
Remark 10. Although ê_t depends on g_α and/or g_β unless H_0 is true, we will show in the next subsection that this does not translate into large-sample dependence of LM*_1x and SupF*_1x on these parameters. In the case of developing bootstrap tests based on the SupF_1x, SupF_1 and SupF_x statistics, and where it was thought that scheme N applied, one could also consider replacing ê_t in step (i) of Algorithm 1 by the residuals ê_t(τ̂), where τ̂ := arg sup_{τ∈Λ} F(τ).
This would not alter the large-sample results which follow, and in Monte Carlo experiments we found almost no difference between this approach and that outlined in Algorithm 1. □

Remark 11. Notice that in the bootstrap regression in step (ii) of Algorithm 1 we do not need to include ∆x_t as an additional regressor. This is because the ê_t used to construct y*_t are free of any effects arising from the correlation between ϵ_xt and ϵ_yt. Also observe that we can assume that α = β = 0 with no loss of generality when generating the bootstrap y*_t data in step (i), because of the invariance of the residuals ê_t to the values of α and β in (1). □

Remark 12. An alternative approach to account for unconditional heteroskedasticity in the context of the SupF tests is to replace F(τ) in (9) with a corresponding robust Wald statistic based around a heteroskedasticity-robust variance estimate; see White (1982). However, although the marginal limiting null distributions for these statistics, for a fixed value of τ, do not depend on any unconditional heteroskedasticity present in ϵ_xt and ϵ_yt, the suprema of the sequences of such statistics taken over all τ ∈ Λ do still depend, in general, on the heteroskedasticity, and hence a wild bootstrap would still be needed to obtain asymptotic size control. The limiting distributions of these sup-Wald statistics differ from those of the corresponding SupF statistics under both the null and local alternatives and, as a result, their local power functions do not coincide. Similarly, one could also consider heteroskedasticity-corrected versions of the LM statistics, as discussed in Hansen (1992b), but the limiting distributions for these statistics are also not invariant to unconditional heteroskedasticity, and so again a wild bootstrap would still be needed. In unreported finite sample simulations comparing these alternative approaches with those based on the tests outlined in Section 3, we found neither approach to dominate the other overall in terms of size and power performance. □
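For concreteness, a minimal sketch of how a SupF-type statistic over a trimmed set of break fractions might be computed, assuming a single regressor, OLS throughout, and the hypothetical helper name `sup_f`:

```python
import numpy as np

def sup_f(y, x, tau_low=0.1, tau_high=0.9):
    """Sketch of a SupF statistic for a single break in intercept and slope.

    For each candidate break fraction tau in [tau_low, tau_high], an F-type
    statistic compares the restricted regression of y on (1, x) with one that
    also includes post-break intercept and slope dummies; the supremum over
    the trimmed range is returned.
    """
    T = len(y)
    X0 = np.column_stack([np.ones(T), x])
    rss0 = np.sum((y - X0 @ np.linalg.lstsq(X0, y, rcond=None)[0]) ** 2)
    q = 2  # number of restrictions: intercept shift and slope shift
    best = -np.inf
    for k in range(int(tau_low * T), int(tau_high * T) + 1):
        d = (np.arange(T) >= k).astype(float)        # post-break dummy D_t
        X1 = np.column_stack([X0, d, d * x])
        rss1 = np.sum((y - X1 @ np.linalg.lstsq(X1, y, rcond=None)[0]) ** 2)
        f = ((rss0 - rss1) / q) / (rss1 / (T - X1.shape[1]))
        best = max(best, f)
    return best
```

In the paper's setting this statistic would then be referred to its wild bootstrap distribution rather than to fixed critical values.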

Asymptotic theory for the bootstrap tests
We first show that the limiting behaviour of the bootstrap statistics LM*_1x and SupF*_1x, conditional on the data, cannot be described in the standard terms of weak convergence in probability to a non-random distribution. Rather, to formulate a useful asymptotic result, a weaker convergence mode and a more general form of the limit are required. Using the concept of weak convergence of random measures, we demonstrate that the distributions of LM*_1x and SupF*_1x, given the data, converge to the random distributions which obtain by conditioning the limiting null distributions given in Theorem 1 on the weak limit B_1 of T^{−1/2} ∑_{t=1}^{⌊T·⌋} e_1t. Second, we establish that under H_0 and a strengthening of Assumption 1, the distributions of the LM_1x and SupF_1x statistics, conditional on x := [x_0, x_1, . . . , x_T]′, converge weakly to the same random distributions referred to above. This result allows us to establish the asymptotic validity of our bootstrap test. As in GHLT, in order to proceed we strengthen Assumption 1 as follows:

Assumption 2. Let Assumption 1 hold, together with the following conditions:

(a) e_t is drawn from a doubly infinite strictly stationary and ergodic sequence {e_t}_{t=−∞}^{∞} which is a martingale difference sequence with respect to its own past.

(c) The initial value s_{x,0} is measurable with respect to X (in particular, it could be a fixed constant).

Remark 13. A detailed discussion of the implications of Assumption 2 is given in GHLT, to which we refer the reader. Assumption 2 enables us to invoke a conditional (on x) functional central limit theorem, together with a bootstrap analogue of that result for T^{−1/2} ∑_{t=1}^{⌊T·⌋} y*_t conditional on all of the data (x and y := [y_1, . . . , y_T]′). Taken together with further results on conditional convergence to stochastic integrals adapted from GHLT, these results allow us to obtain the limiting distributions of the original statistics LM_1x and SupF_1x, conditional on x, together with the limiting distributions of the corresponding bootstrap LM*_1x and SupF*_1x statistics from Algorithm 1, conditional on the data. These are now reported in Theorem 2 and underlie the validity of our bootstrap approach. □

Theorem 2. Consider the model in (1)-(3) and let Assumption 2 hold. Under the null hypothesis and under the same local alternatives as were considered in the context of Theorem 1, the following converge jointly as T → ∞, in the sense of weak convergence of random measures on R:

Remark 14. For the precise meaning of joint weak convergence of random measures, we refer the reader to Appendix A and to the discussion on this point in section 4.3 of GHLT. The concept is weaker than weak convergence in probability, although it reduces to the latter when the limit distribution is non-random. Nevertheless, joint weak convergence of random measures implies convergence of the (conditional) distribution functions in a way that is still sufficient to yield consistency of the bootstrap in the usual p-value sense, as we will subsequently show in Corollary 1. □

Remark 15. Under the null hypothesis, the process J coincides with the process J_0, whose form is invariant as to which of the null and local alternatives considered in this paper holds. As a result, the limiting distributions of the bootstrap statistics are the same under both the null and local alternatives and, moreover, coincide with the limiting null distributions, conditional on B_1, of the corresponding original test statistics. □

Remark 16. As discussed in Remark 6, in the case of unconditional homoskedasticity, the random variable J_0(r)′{V(r) − V(r)V(1)^{−1}V(r)}^{−1}J_0(r) conditional on B_1 has a χ²(2) distribution for every fixed r ∈ Λ and, in particular, is independent of B_1. Nevertheless, even in this case, the conditional limiting null distribution of SupF_1x and SupF*_1x is genuinely random (non-degenerate). This is so because the non-contemporaneous autocovariances of {V(r) − V(r)V(1)^{−1}V(r)}^{−1/2}J_0(r) conditional on B_1 depend on V, and thus on B_1, which is random. As a result, upon taking the supremum over r ∈ Λ, the distribution of the functional obtained, conditional on B_1, still depends on B_1 and is, therefore, random. Regarding LM_1x and LM*_1x, the randomness of their conditional limiting null distributions is even more obvious because, even for fixed r ∈ Λ, the distribution of J_0(r)′{V(1)}^{−1}J_0(r) given B_1 is not independent of B_1, as V(1) is not the conditional variance of J_0(r). □

Remark 17. With a slight abuse of terminology, we could think of the random distributional limits of SupF_1x and SupF*_1x (and likewise, of LM_1x and LM*_1x) as random draws from a family of distributions indexed by B_1. Such random draws are distinct from the non-random mixture distribution obtained by averaging the family of distributions over B_1. Since the limit of SupF*_1x in Theorem 2 is distinct from this mixture distribution, it follows that the mixture distribution cannot be a weak limit in probability of SupF*_1x, because weak convergence in probability implies convergence to the same limit also in the mode employed in Theorem 2. Furthermore, as the limits in Theorem 2 are invariant to the value of h_21, and our unconditionally homoskedastic case with h_21 = 0 satisfies Assumption 2 of Hansen (2000) (see also Example 3 therein), we can conclude that the part of Theorem 6 in Hansen (2000) asserting the weak convergence in probability of Hansen's counterpart of SupF*_1x to the unconditional (and hence, non-random mixture) null limit distribution of SupF_1x given in Theorem 1, is not correct.
The same error appears in Theorem 3 of Cavaliere and Taylor (2006, p. 626), who discuss fixed regressor wild bootstrap implementations of the Shin (1994) tests for the null of co-integration. Nevertheless, the ultimate claim in Hansen (2000), Corollary 2, that the bootstrap p-values under H_0 are asymptotically uniformly distributed (and, thus, that the fixed regressor wild bootstrap is asymptotically valid in this sense) can still be shown to hold true for the testing problem considered in this paper, though as a consequence of our Theorem 2 (see Corollary 1). By similar considerations, the fixed regressor wild bootstrap implementations of the Shin (1994) tests in Cavaliere and Taylor (2006) could be shown to be asymptotically valid in the same sense. □

As we have seen in Theorem 2, the bootstrap statistics LM*_1x and SupF*_1x, conditional on the data, and the original statistics LM_1x and SupF_1x, conditional on x, share the same asymptotic distribution under the null hypothesis. We can obtain as an implication, now formalised in Corollary 1, that the bootstrap tests based on LM*_1x and SupF*_1x are asymptotically valid. We state the result for LM_1x and SupF_1x, but the same conclusions hold for LM_1, LM_x, SupF_1 and SupF_x. As usual, validity is formulated in terms of bootstrap p-values. The practical implication of Corollary 1 is that comparison of one of the original statistics, for example LM_1x, with a ξ-level empirical bootstrap critical value (calculated as the upper-tail ξ percentile from the order statistic formed from B independent simulated bootstrap LM*_1x statistics), which we will denote by cv_{ξ,B}, will result in a bootstrap test that, under H_0, will have asymptotic size that for sufficiently large B will be as close as desired to the given nominal level ξ. Size in this context is understood to mean the rejection frequency in a thought experiment where the bootstrap test is applied to a large number of data samples constituting different realisations of the regressor {x_t}. This is distinct from the interpretation of the stronger results (also derived in the proof of Corollary 1) that P*_{T,LM} | x and P*_{T,F} | x converge weakly in probability to U[0, 1] under H_0; these results can be interpreted as also establishing the asymptotic validity of the bootstrap for fixed realisations of {x_t}. Under local alternatives, cv_{ξ,B} will remain as under H_0 (at least in the limit), while the distribution of LM_1x conditional on x will vary with g_α and g_β, and so the asymptotic local power of the bootstrap tests will be a function of those drift parameters. In what follows, as a matter of shorthand notation, we will denote by LM^B_1x the fixed regressor wild bootstrap procedure outlined in Algorithm 1, whereby the original statistic is compared to its empirical bootstrap critical value, cv_{ξ,B}.

Asymptotic local power
We now turn to a consideration of the asymptotic local power of the fixed regressor wild bootstrap procedures. In accordance with the interpretation given to the results in Corollary 1, we focus on asymptotic power understood as the rejection rate in a thought experiment with a large number of different realisations of the process B 1 . We simulate the functionals in the limit distributions using 3000 Monte Carlo replications with different Brownian motion processes in each replication, approximated as random walks with IID N(0, 1) increments over a grid of 1000 points. For each replication, the simulated limit bootstrap critical value for ξ = 0.10 is obtained by simulating the appropriate bootstrap limit distribution using B = 499 bootstrap replications, conditioning on the simulated B 1 for that Monte Carlo replication.
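The simulation device described above, in which Brownian motions are approximated by scaled random walks on a fine grid and Ornstein-Uhlenbeck-type limit processes are built from their increments, might be sketched as follows; the function name `simulate_ou_limit` and the Euler-type recursion are assumptions of this sketch:

```python
import numpy as np

def simulate_ou_limit(c, n=1000, rng=None):
    """Sketch: approximate a standard Brownian motion W on [0,1] by a scaled
    random walk with n IID N(0,1) increments, and build the associated
    Ornstein-Uhlenbeck process M_c(r) = int_0^r exp(-c(r-s)) dW(s) by a
    simple exponential-decay recursion on the same grid."""
    rng = np.random.default_rng(rng)
    dW = rng.standard_normal(n) / np.sqrt(n)   # Brownian increments on the grid
    W = np.cumsum(dW)
    r = np.arange(1, n + 1) / n
    M = np.empty(n)
    acc = 0.0
    for i in range(n):
        acc = acc * np.exp(-c / n) + dW[i]     # Euler recursion for the OU process
        M[i] = acc
    return r, W, M
```

Setting the mean-reversion parameter `c = 0` recovers the Brownian motion itself, which is a convenient sanity check on the recursion.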
In calculating asymptotic powers, in D_t we abstract from any role that non-stationary volatility plays by setting d_it = 1, for all i and t. We induce a correlation of −0.8 between ϵ_xt and ϵ_yt by setting h_21 = −4/3; the other non-diagonal elements of H are set to 0. We also set c_x = 10. As regards the various alternatives, using a 30-step grid of values denoted g between 0 and 50, under stochastic parameter variation, S, we have in H^S: g_α = 3g/5 for H^S_1, g_β = 3g for H^S_x, and g_α = 3g/5, g_β = 3g for H^S_1x, and we consider c_α = c_β = {0, 10}. Under non-stochastic parameter variation, N, we have in H^N: g_α = g/5 for H^N_1, g_β = g for H^N_x, and g_α = g/5, g_β = g for H^N_1x, and we consider s_αt = s_βt = I(t > ⌊τ_0 T⌋) with the break fractions τ_0 = {1/2, 3/4}, with τ_L = 0.1 and τ_U = 0.9. Here, the strength of the alternatives increases with g, the null being true for g = 0.

Fig. 1(a)-(c) report results for the stochastic parameter variation of H^S with c_α = c_β = 0. In Fig. 1(a), where the alternative is H^S_1 (intercept variation), SupF^B_x and especially LM^B_x, the two procedures that do not permit intercept variation of either type, perform significantly worse than those that do. Fig. 1(b) shows results for the alternative H^S_x (slope parameter variation), where LM^B_x might be expected to perform best. Here there is little difference between this procedure and LM^B_1x, SupF^B_x and SupF^B_1x. We also observe that LM^B_1 and SupF^B_1 perform much worse, with LM^B_1 being the least powerful of all. In Fig. 1(c) the alternative is H^S_1x (intercept and slope parameter variation). The two best procedures are LM^B_1x and SupF^B_1x, and there is little to choose between them. None of the other procedures performs particularly poorly, however. Fig. 1(d)-(f) repeat this analysis for c_α = c_β = 10. The powers of all procedures are now lower than when c_α = c_β = 0, as would be expected. Otherwise, broadly speaking, the comments made for Fig. 1(a)-(c) apply here also.

In Fig. 2(a)-(c) we give results for the non-stochastic parameter variation of H^N for a mid-sample break, τ_0 = 1/2. For Fig. 2(a) the alternative is H^N_1 (intercept variation only). While we might expect SupF^B_1 to provide most power, it is clear that this role is actually fulfilled by LM^B_1, followed by SupF^B_1 and LM^B_1x, and then SupF^B_1x. As regards LM^B_x and SupF^B_x, the procedures that do not permit intercept variation, their power is again very low in comparison to the others. Fig. 2(b), where the alternative is H^N_x (slope parameter variation), reveals LM^B_x to be the best performing procedure, outperforming SupF^B_x. Here it is the power of LM^B_1 and SupF^B_1 that is the lowest by some margin. In Fig. 2(c) the alternative is H^N_1x (intercept and slope parameter variation), and we see that the best procedure is LM^B_1x, followed by SupF^B_1x. The others have noticeably lower power compared to these two, though none of them performs badly. The analysis is repeated in Fig. 2(d)-(f) for a late break, τ_0 = 3/4. Fig. 2(d), where the alternative is H^N_1 (intercept variation), reveals that all the procedures that include this alternative now have fairly similar power; the power advantage previously seen for LM^B_1 over SupF^B_1 is no longer in evidence, with both showing similar levels of power. Likewise, in Fig. 2(e) under the alternative H^N_x (slope parameter variation), we see that LM^B_x and SupF^B_x also now have similar power levels. For the alternative H^N_1x (intercept and slope parameter variation) in Fig. 2(f), SupF^B_1x generally appears more powerful than LM^B_1x, thereby reversing the previous ranking. Once more, we see that procedures which exclude parameter variation (of either type) perform badly when it is present in the alternative.
Summarising the findings of Figs. 1 and 2, what is clear throughout is that procedures which incorrectly exclude the possibility of parameter variation associated with a particular regressor when it is present in the alternative in either form will lose power compared to those procedures that do permit one or other form of variation in that parameter. This is not really surprising. What is perhaps more surprising is that employing a procedure that specifies the correct form of parameter variation for a given alternative (i.e. stochastic or non-stochastic) does not always yield higher power than the corresponding procedure which specifies the incorrect form. In fact, the incorrectly specified procedure may have the higher power, as seen most obviously in the context of non-stochastic variation when the break fraction is τ 0 = 1/2; here the LM-based procedures are consistently more powerful than their SupF -based counterparts.

Weak dependence
Thus far we have assumed that the noise, ϵ_xt, driving x_t is serially uncorrelated, by virtue of e_t being a m.d.s. More generally, we might consider a linear process assumption for ϵ_xt of the MA(∞) form, where v_{x,t} denotes the first element of H D_t e_t, and with the usual summability conditions on the moving average coefficients satisfied. Under homoskedasticity, this would include all stationary and invertible ARMA processes. Notice that under this structure ϵ_yt remains uncorrelated with the lagged increments of x_t at all lags.
In this case, it may be shown that the limiting results given in this paper would continue to hold provided that in (7) and (10) we add in the regressors ∆x_{t−1}, . . . , ∆x_{t−p}, where p satisfies the standard rate condition that 1/p + p³/T → 0 as T → ∞, and where the population coefficients on these added lags are those of the AR(∞) process obtained by inverting the MA(∞) process above.^5 Similarly to Breitung and Demetrescu (2015), we would also need to restrict the amount of serial dependence allowed in the conditional variances via a cross-product moment assumption involving a uniform (in i, j ≥ 1) bound, with ⊗ denoting the Kronecker product. As is standard in the PR literature, we maintain the assumption that ϵ_yt is serially uncorrelated, which is why, unlike in the setting considered in Shin (1994), we need only include lags of ∆x_t, rather than both leads and lags thereof.

^5 These regressors would not need to be added to the bootstrap analogues of (7) and (10) because the ê_t used to construct y*_t are free of any effects arising from weak dependence in ϵ_xt.
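A sketch of the lag augmentation just described, assuming an illustrative default p = ⌊T^{1/4}⌋ (which satisfies 1/p + p³/T → 0 as T → ∞; any rate between these bounds would do) and a hypothetical helper name:

```python
import numpy as np

def augmented_regressors(y, x, p=None):
    """Sketch: build the regressor matrix for a fitted regression augmented
    with lagged differences Delta x_{t-1}, ..., Delta x_{t-p} to absorb weak
    dependence in eps_xt. Columns are: intercept, x, Delta x_t, and the p
    lagged differences (zero-padded at the start of the sample)."""
    T = len(y)
    if p is None:
        p = int(np.floor(T ** 0.25))       # illustrative rate choice
    dx = np.diff(x, prepend=x[0])          # Delta x_t, same length as x
    cols = [np.ones(T), x, dx]
    for j in range(1, p + 1):
        lag = np.concatenate([np.zeros(j), dx[:-j]])   # Delta x_{t-j}
        cols.append(lag)
    return np.column_stack(cols), p
```

In an application one might instead select p by an information criterion, as is done in the empirical section below.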

Multiple predictors and deterministic components
The parameter constancy tests developed in the context of (1)-(3) with a single predictive regressor, x t−1 , and an intercept can be straightforwardly generalised to the case where the PR contains multiple predictors and/or a general deterministic component of the form considered in section 3.2 of Breitung and Demetrescu (2015).
Specifically, we may consider the case where the deterministic component in (1) is of the form α_t + τ′f_t, with α_t specified as before, and where f_t is as defined in section 3.2 of Breitung and Demetrescu (2015), but is such that it does not span the space of a constant; an obvious example is the linear trend case, which obtains for f_t := t. To allow for multiple predictors, replace x_{t−1} in (1) by the k × 1 vector of predictive regressors x_{t−1} := (x_{1,t−1}, . . . , x_{k,t−1})′, where each x_{i,t} is generated by equations of the form given in (2) and (3), and where the former can also include the additional deterministic variables in f_t. We would then correspondingly construct the LM structural instability statistics (which could be for single or joint parameter restrictions) with the residuals ê_t now obtained from the regression of y_t onto an intercept, f_t, x_{t−1} and ∆x_{t−1} (and lags of ∆x_{t−1} in the case considered in Section 6.1), and setting a_t := [1, x′_{t−1}]′ in the calculation of (6). The bootstrap analogues of these statistics discussed in Section 4 would use the residuals from the regression of y*_t (the wild bootstrap analogue of y_t) onto an intercept, f_t and x_{t−1}. For the SupF-type statistics, the additional set of residuals ê_t(τ) needed to compute F(τ) in (9) are obtained from the regressions above but augmented with D_t(⌊τT⌋) and/or D_t(⌊τT⌋)x_{t−1}, and computed for each possible τ. For both the LM and SupF-type statistics, doing so alters the form of the limit distributions given in Theorem 1, but would not alter the primary conclusion given in Corollary 1, that the fixed regressor wild bootstrap implementations of the instability tests are asymptotically valid.
In particular, the process A(·), along with the Brownian-based processes which appear in Theorem 1, would need to be appropriately re-defined to reflect the deterministic component being considered. A(·) would now contain k OU-derived processes, analogous to M̄_{ηx,c_x}(·), corresponding to each of the k elements of x_{t−1}, while Q(·) would also now contain additional terms, analogous to M_{ηβ,c_β}(·)M_{ηx,c_x}(·) under scheme S and M_{ηx,c_x}(·) under scheme N, corresponding to each of the k elements of x_{t−1}.
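A sketch of the residual construction with multiple predictors, assuming a hypothetical helper `pr_residuals` that regresses y_t on an intercept, optional deterministic terms f_t, the k lagged predictors and their first differences:

```python
import numpy as np

def pr_residuals(y, X_pred, f=None):
    """Sketch: residuals e_hat_t from the regression of y_t on an intercept,
    optional deterministic terms f_t, the k lagged predictors (columns of
    X_pred), and their first differences, as used to build the LM statistics
    in the multiple-predictor case."""
    T, k = X_pred.shape
    # First differences of each predictor (first row set to zero).
    dX = np.diff(X_pred, axis=0, prepend=X_pred[:1])
    cols = [np.ones((T, 1)), X_pred, dX]
    if f is not None:
        cols.insert(1, np.asarray(f).reshape(T, -1))
    Z = np.hstack(cols)
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return y - Z @ beta
```

The bootstrap analogue would apply the same regression to y*_t but without the difference terms, mirroring step (ii) of Algorithm 1.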

Finite sample size and power
We now evaluate the finite sample size and power properties of the bootstrap procedures, on average over different realisations of x. We simulate the DGP (1)-(3), where we set µ = α = β = 0 and s_{x,0} = 0 and generate e_t ∼ IID N(0, I_4), for a sample size of T = 100.^6 The simulations are again conducted using 3000 Monte Carlo replications, B = 499 bootstrap replications, and setting ξ = 0.10. No lagged ∆x_t terms are incorporated into any fitted regression model for y_t.

^6 We also ran simulations for T = 200. These results were little different from those discussed here for T = 100 and so are omitted in the interests of brevity. These results can be obtained from the authors on request.
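The simulation DGP just described might be sketched as follows; the helper name, the recursive construction of x_t, and the optional single volatility-break argument (anticipating the heteroskedastic designs considered below) are assumptions of this sketch, which collapses the 4-dimensional e_t to the two shocks that matter for (y_t, x_t) under the null:

```python
import numpy as np

def simulate_pr_dgp(T=100, c_x=10.0, h21=-4/3, sigma_break=None, tau0h=0.5, rng=None):
    """Sketch of the simulation DGP under the null (alpha = beta = 0):
    a near-unit-root predictor x_t with persistence c_x and returns y_t whose
    errors are correlated with eps_xt through h21 (h21 = -4/3 gives
    corr(eps_x, eps_y) = -0.8). If sigma_break is given, both error processes
    are scaled by sigma_break after fraction tau0h of the sample."""
    rng = np.random.default_rng(rng)
    rho = 1.0 - c_x / T                    # local-to-unity AR coefficient
    e1, e2 = rng.standard_normal(T), rng.standard_normal(T)
    d = np.ones(T)
    if sigma_break is not None:
        d[int(tau0h * T):] = sigma_break   # single volatility shift
    eps_x = d * e1
    eps_y = d * (h21 * e1 + e2)
    x = np.empty(T)
    x[0] = eps_x[0]
    for t in range(1, T):
        x[t] = rho * x[t - 1] + eps_x[t]
    y = eps_y                              # alpha = beta = 0 under the null
    return y, x
```

With h21 = -4/3 the implied error correlation is h21 / sqrt(h21² + 1) = -0.8, matching the design used in the power experiments.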
In order to meaningfully compare finite sample results with the homoskedastic-case asymptotic results of the previous section, in the simulation DGPs we first employ exactly the same constellation of parameter settings as underpinned our reported asymptotic results. Figs. 3 and 4 report our results for the finite sample analogues of Figs. 1 and 2. Throughout, it is seen that each procedure has empirical size near to the nominal 0.10 level. It is also clear throughout that the finite sample powers generally bear a strong resemblance to their asymptotic counterparts in terms of the relative behaviour of the bootstrap procedures, and hence the comments given in the previous section apply here also (some discrepancies are simply due to small finite sample size differences). In absolute terms, the finite sample powers tend to be slightly lower than their asymptotic counterparts, although this is hardly noticeable in the cases of alternatives with non-stochastic parameter variation (Fig. 4).
We next consider the impact of unconditional heteroskedasticity, investigating the finite sample size and power of our bootstrap procedures when two of the error processes, those for ϵ_xt and ϵ_yt, are subject to a contemporaneous single break in volatility of equal magnitude. Specifically, we again simulate the DGP (1)-(3) with T = 100, letting d_it = 1 for t ≤ ⌊τ_0h T⌋ and d_it = σ for t > ⌊τ_0h T⌋, i = 1, 2, with τ_0h = {1/2, 3/4}, and we consider σ = {4, 1/4}, thus allowing for both upward and downward volatility shifts, with the chosen magnitudes being substantial for illustrative purposes. The other simulation DGP settings are as in Figs. 3 and 4; however, for brevity we now only consider a subset of the values for g given by g = {0, 15, 35}. The results are shown in Tables 1(a) and 1(b); these include the previously-considered homoskedastic case (obtained by setting σ = 1) as a benchmark for sizes and powers (note that in the tables, SupF is abbreviated to SF). Beginning with the empirical sizes of our procedures (g = 0), we see that heteroskedasticity has only a modest effect when compared to the benchmark homoskedastic case (particularly for the LM-based procedures). This suggests that the wild bootstrap is performing reasonably well in reproducing the patterns of heteroskedasticity present. Turning to finite sample power, in general terms we see that the upward volatility shift considered significantly decreases powers relative to the benchmark homoskedastic powers, while the downward shift considered has the opposite effect. These effects are observed for both volatility break timings considered. An examination of the power levels between the procedures reveals that under heteroskedasticity, the patterns of relative powers are generally similar to those observed in the homoskedastic case. For an alternative of H^S_x (slope parameter variation), from Panel B it appears that both upward and downward volatility shifts can lead to a decrease in power when compared to the homoskedastic benchmark (although this effect is rather small for the downward shift).

An empirical application
To illustrate how our proposed instability test procedures may be used in practice, we apply them to the U.S. annual equity series analysed in Welch and Goyal (2008), which is updated to cover the period 1926-2015 (T = 90) and is available at http://www.hec.unil.ch/agoyal/. Our y_t variables are R_t, the log of the total return (including dividends) on the S&P 500 stock market index from year t − 1 to t, and EP_t, the equity premium, which subtracts the corresponding risk-free rate (the Treasury Bill rate) from R_t. The x_t predictor variables (in each case included in the bivariate PR with a one-period lag) are: the dividend yield, DY_t, defined as the difference between the log of dividends and the log of one-period lagged prices; the dividend payout ratio, DE_t, defined as the difference between the log of dividends and the log of earnings; and the long-term rate of return, LTR_t. The number of lagged ∆x_t terms in (7) is determined using BIC selection starting from a maximum value of 6. The same number of lagged difference terms is employed for the SupF-based procedures in the fitted regression (10).
The entries in parentheses in the column labelled x_t are bootstrap p-values for a standard KPSS statistic applied to each predictor (with the long run variance estimate based on the quadratic spectral kernel with automatic bandwidth selection), obtained using the wild bootstrap method of Cavaliere and Taylor (2005) with 499 bootstrap replications. These p-values are small in all cases, implying rejection of the null of stationarity against the unit root alternative for each series. As is well known, the KPSS test also rejects stationarity with high probability when the series under test displays local-to-unit root behaviour, so at the very least these results are indicative of a high degree of persistence being present in each of the predictor series.
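As a rough illustration of the statistic underlying these p-values, a level-stationarity KPSS statistic might be computed as follows. Note that the paper's implementation uses the quadratic spectral kernel with automatic bandwidth selection, whereas this sketch substitutes a simple Bartlett kernel with a rule-of-thumb bandwidth:

```python
import numpy as np

def kpss_stat(x):
    """Sketch of the level-stationarity KPSS statistic: T^{-2} times the sum
    of squared partial sums of demeaned x, scaled by a long-run variance
    estimate (here Bartlett-kernel with a rule-of-thumb bandwidth)."""
    T = len(x)
    u = x - x.mean()
    S = np.cumsum(u)                                  # partial sums
    bw = int(np.floor(4 * (T / 100.0) ** 0.25))       # rule-of-thumb bandwidth
    lrv = np.sum(u * u) / T
    for j in range(1, bw + 1):
        w = 1.0 - j / (bw + 1.0)                      # Bartlett weights
        lrv += 2.0 * w * np.sum(u[j:] * u[:-j]) / T
    return np.sum(S ** 2) / (T ** 2 * lrv)
```

In the paper this statistic is referred to wild bootstrap critical values rather than the usual asymptotic ones, to guard against non-stationary volatility.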
The entries in parentheses underneath the main entries are the bootstrap p-values for the test statistics. For the full sample, evidence against H_0 is seen for the R_t-DY_t pairing and, to a lesser extent, for the EP_t-DY_t pairing via LM^B_x. No evidence against H_0 is seen (i.e. no rejection at conventional significance levels) for the DE_t and LTR_t predictors, regardless of whether R_t or EP_t is employed. Turning to the pre-crisis sub-sample, evidence against H_0 is again seen for R_t-DY_t (now also including SupF^B_1x at the 0.10-level). Some evidence is also found again for EP_t-DY_t, this time via the SupF^B_1x test rather than the LM^B_x test. The change of sample period has no effect on the lack of rejections when using the DE_t and LTR_t predictors.
Table 2(a) also reports, under |IV|, the absolute value of the heteroskedasticity-robust IV t-test of Breitung and Demetrescu (2015) for predictability of y_t by x_{t−1}. This statistic combines fractional and sine function instruments and tests the significance of the estimated coefficient on x_{t−1}, having a standard normal limit distribution under the null of no predictability. Its p-value is reported in parentheses. According to |IV|, there is strong evidence of predictability in both the R_t-LTR_t and EP_t-LTR_t relationships when the 1926-2007 sub-sample is considered. Interestingly, neither of these pairings was found to be subject to parameter instability according to our battery of bootstrap procedures.
To informally examine the extent to which parameter instability appears present in these PRs, Fig. 5 plots rolling window IV coefficient estimates and approximate 0.10-level standard error bounds. These are based on a rolling window length set at ⌊0.25T⌋ observations, and the horizontal axis dates correspond to the end of a given window sub-sample. Although it is difficult to draw any firm conclusions, on examining Fig. 5 we might tentatively conclude that the most pronounced parameter variation is associated with the R_t-DY_t and EP_t-DY_t pairings (Fig. 5(a) and 5(d)). This would be in line with our bootstrap test outcomes in Table 2(a). Also, it is plausible that the least pronounced parameter variation is associated with R_t-DE_t and EP_t-DE_t (Fig. 5(b) and 5(e)), which would tie in with the generally large p-values for the associated instability tests. The estimated parameter values are also generally fairly close to zero, which is in line with |IV| in Table 2(a) finding no evidence of predictability.
The parameter estimates for R t -LTR t and EP t -LTR t (Fig. 5(c) and 5(f)) display relative constancy at positive values over much of the sample period, which is compatible with our instability tests not rejecting, yet at the same time |IV | indicating predictability for the earlier sub-sample. That the rolling parameter estimates reduce to insignificant levels towards the end of the full sample period could explain why |IV | does not reject for the full sample; on the other hand, it appears from the instability test results that this change is not substantial enough, in either magnitude or duration, to be detected by our test procedures.
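A simplified sketch of such rolling-window estimates, substituting plain OLS for the IV estimator used in Fig. 5 and using a hypothetical helper name:

```python
import numpy as np

def rolling_slopes(y, x, frac=0.25):
    """Sketch: rolling-window slope estimates of y_t on x_{t-1} with window
    length floor(frac*T), an OLS stand-in for the rolling IV estimates; each
    estimate is indexed by the end date of its window."""
    T = len(y)
    w = int(np.floor(frac * T))
    out = []
    for end in range(w + 1, T + 1):
        ys = y[end - w:end]
        xs = x[end - w - 1:end - 1]        # one-period lag of the predictor
        xs_c = xs - xs.mean()
        out.append(float(np.dot(xs_c, ys - ys.mean()) / np.dot(xs_c, xs_c)))
    return np.array(out)
```

Plotting these estimates against the window end dates reproduces the qualitative exercise behind Fig. 5, albeit without the instrument-based robustness of the |IV| procedure.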
In Table 2(b) we consider instability tests allowing for multiple predictors, using two predictors together by combining DY_t, the predictor for which most evidence of instability was found, with either DE_t or LTR_t in the PR. For each test we use subscripts to denote the regressor coefficients permitted to vary under the alternative, with x_1 = DY and x_2 = DE or x_2 = LTR. For brevity we only show a subset of the possible statistics that could be computed (we do not report statistics that allow for variability in both the intercept and a single predictor alone). Interestingly, the pre-crisis 1926-2007 period shows little in the way of parameter instability whenever DY_t and LTR_t are combined together (LM^B_{x_1 x_2} being the exception). In Table 2(a), parameter instability was indicated when using DY_t alone as a predictor, but it appears that when LTR_t (which was identified as a potentially valid predictor for this period) is included in the PR, the appearance of parameter instability in the DY_t coefficient is removed, suggesting that the Table 2(a) results might be driven by under-specification of the PR. In the full sample, parameter instability is still detected for the DY_t and LTR_t combination; one possible explanation is the apparent late change in the LTR_t rolling coefficients observed in Fig. 5(c) and 5(f), the impact of which could prevent a stable PR incorporating DY_t and LTR_t from holding for the full sample period. We also see that the addition of DE_t to the R_t-DY_t and EP_t-DY_t regressions results in no evidence for instability, despite there being evidence for DY_t coefficient instability when considered in isolation. Given that there was no evidence for DE_t being a valid predictor, a possible interpretation is that the addition of this regressor has reduced the power of the instability tests. What is clear is that allowing multiple predictors opens the door for rather complex interactions in the parameter instability testing context.

Conclusions
We have developed asymptotically valid tests for structural change in the slope and/or intercept parameters of a PR model, based on the well-known SupF and Cramer-von-Mises type structural instability test statistics of Andrews (1993) and Nyblom (1989), respectively. To allow for an unknown degree of persistence in the predictors, and for both conditional and unconditional heteroskedasticity, a fixed regressor wild bootstrap test procedure was proposed and its asymptotic validity established. Our validity argument involved demonstrating that the asymptotic distributions of the bootstrap parameter constancy statistics, conditional on the data, coincide with the asymptotic null distributions of the corresponding statistics computed on the original data, conditional on the predictors. In doing so we have shown that the standard approach to asymptotic bootstrap validity, based on bootstrap consistency for the unconditional limiting distributions of the original test statistics, is not generally applicable in cases where the bootstrap procedure treats non-stationary regressors as fixed. Monte Carlo simulations suggested that our proposed methods work well in finite samples. An empirical illustration using well-known U.S. stock market data highlighted the potential value of our procedures in practice.
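The logic of the fixed regressor wild bootstrap can be sketched in a few lines. The sketch below is our own simplified illustration, assuming a single break in all coefficients, Gaussian wild draws, and the SupF statistic only; it is not the paper's implementation, which additionally handles near-unit-root predictors, the Cramer-von-Mises statistics, and demeaning details. The names `supf` and `fixed_regressor_wild_bootstrap` are illustrative.

```python
import numpy as np

def supf(y, X, trim=0.15):
    """SupF statistic for a one-time break in all coefficients of y on X,
    maximised over break dates in the trimmed interval."""
    T, k = X.shape
    e0 = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    ssr0 = e0 @ e0                                  # restricted (no-break) SSR
    best = -np.inf
    for tb in range(int(trim * T), int((1 - trim) * T)):
        # Augment X with post-break interactions to allow a break at tb.
        Xb = np.column_stack([X, X * (np.arange(T)[:, None] >= tb)])
        e1 = y - Xb @ np.linalg.lstsq(Xb, y, rcond=None)[0]
        ssr1 = e1 @ e1                              # unrestricted SSR
        F = ((ssr0 - ssr1) / k) / (ssr1 / (T - 2 * k))
        best = max(best, F)
    return best

def fixed_regressor_wild_bootstrap(y, X, n_boot=199, seed=0):
    """Bootstrap p-value for SupF: regressors are held fixed across
    bootstrap draws, and each bootstrap sample is the null residuals
    multiplied by i.i.d. standard normal (wild) draws."""
    rng = np.random.default_rng(seed)
    stat = supf(y, X)
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    count = 0
    for _ in range(n_boot):
        ystar = e * rng.standard_normal(len(y))     # bootstrap sample under the null
        count += supf(ystar, X) >= stat
    return stat, (1 + count) / (1 + n_boot)

# Illustration on simulated stable data: the test should typically not reject.
rng = np.random.default_rng(1)
T = 120
x = rng.standard_normal(T)
X = np.column_stack([np.ones(T), x])
y = 0.3 * x + rng.standard_normal(T)
stat, pval = fixed_regressor_wild_bootstrap(y, X, n_boot=99, seed=2)
```

Because the regressors are kept fixed and only the residuals are resampled, the bootstrap statistic mimics the null distribution of SupF conditional on the predictor path, which is the conditional notion of validity established in the paper.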
$T^{-p_\beta} g_\beta x_{t-1} s_{\beta,t+1}$, where $p_\alpha = 0$ and $p_\beta = 1/2$ for the stochastic specification S, and $p_\alpha = -1/2$ and $p_\beta = 0$ for the non-stochastic specification N, we can therefore write
$$y_t = \beta x_{t-1} + \beta_z z_{t-1} + \epsilon_{yt}, \quad t = 1, \dots, T, \qquad \text{(A.5)}$$
as in Eq. (1) of GHLT. Here $T^{-1/2} z_{\lfloor Tr \rfloor} \xrightarrow{w} g_\alpha M_{\eta_\alpha, c_\alpha}(r) + g_\beta M_{\eta_x, c_x}(r) M_{\eta_\beta, c_\beta}(r)$ for the stochastic specification S, and $T^{-1/2} z_{\lfloor Tr \rfloor} \xrightarrow{w} \{g_\alpha + g_\beta M_{\eta_x, c_x}(r)\}\, \mathbb{I}(r \ge \tau_0)$ for the non-stochastic specification N, in $D$ as $T \to \infty$. In either case we denote the weak limit by $Q(r)$ and the corresponding de-meaned process by $\bar{Q}(r) := Q(r) - \int_0^1 Q(s)\,ds$. Since $\beta_z = O(T^{-1})$, as in GHLT, and $T^{-1/2} z_{\lfloor Tr \rfloor}$ converges weakly in $D$, also as in GHLT (albeit to a different limit), the same argument as is used in the proof of Theorem 2 in GHLT (which is based on orders of magnitude and not on the exact distribution of the weak limit of $T^{-1/2} z_{\lfloor Tr \rfloor}$) applies. Regarding the variance estimators used in constructing the statistics, an order-of-magnitude argument as in the proof of Theorem 2 in GHLT yields (A.10). On the other hand, from the definition (9) of $F(r)$ and (A.4), it follows that $\hat{\sigma}^2(r) = \hat{\sigma}^2_{\lfloor Tr \rfloor} + o_p(T^{-1})$, uniformly in $r \in \Lambda$.
By combining the previous results and applying the CMT, the stated convergence follows. Before progressing to the proof of Theorem 2, we define the conditional convergence modes used in the remainder of Appendix A. Let $\xi_T$ (respectively, $\eta_T$) be random elements of a Polish space, defined on the same probability space as the original data (respectively, the original and the bootstrap data), and let $\xi$, $\eta$ be random elements of a Polish space defined on the same probability space as $B_1$. For weak convergence of random measures induced by conditioning, i.e., of the form $\xi_T | x \xrightarrow{w} \xi | B_1$ and $\eta_T | x, y \xrightarrow{w} \eta | B_1$, we write, respectively, $\xi_T \xrightarrow{w_x} \xi | B_1$ and $\eta_T \xrightarrow{w^*} \eta | B_1$, the definitions being $E\{f(\xi_T)|x\} \xrightarrow{w} E\{f(\xi)|B_1\}$ and $E\{g(\eta_T)|x,y\} \xrightarrow{w} E\{g(\eta)|B_1\}$ for all bounded continuous real functions $f$ and $g$ with matching domain. Importantly, we say that the $w_x$ and $w^*$ convergences are joint if $\left(E\{f(\xi_T)|x\},\, E\{g(\eta_T)|x,y\}\right)' \xrightarrow{w} \left(E\{f(\xi)|B_1\},\, E\{g(\eta)|B_1\}\right)'$ for the same class of functions $f, g$. This is the meaning of joint convergence in Theorems 2 and A.1. We note that it is distinct from two $w_x$ convergence results $\xi'_T \xrightarrow{w_x} \xi' | B_1$ and $\xi''_T \xrightarrow{w_x} \xi'' | B_1$ holding jointly, or equivalently, from $\xi_T \xrightarrow{w_x} \xi | B_1$ with $\xi_T = (\xi'_T, \xi''_T)$ and $\xi = (\xi', \xi'')$, where $E\{f(\xi'_T, \xi''_T)|x\} \xrightarrow{w} E\{f(\xi', \xi'')|B_1\}$ should hold for bounded continuous $f$ (and similarly for $w^*$). Finally, we recall that for random elements of a Polish space the existence of regular conditional measures is guaranteed.
We next report in Theorem A.1 some results from GHLT, adapted to the problem discussed here, which will subsequently be used in the proof of Theorem 2. Conditional on $B_1$, the remainders are $o_p^*(1)$ terms such that $o_p^*(1) \xrightarrow{w^*} 0$. We conclude that $\hat{\sigma}^{*2} \xrightarrow{w^*} \int_0^1 d_2^2(s)\,ds$ and, by the CMT, it follows that $LM^*_{1x}$ and $SupF^*_{1x}$ converge as asserted, jointly with $LM_{1x}$ and $SupF_{1x}$. □ Proof of Corollary 1. The random cdfs, conditional on $B_1$, of the conditional limit distributions given in Theorem 2 are continuous a.s. For the $LM_{1x}$ statistic this follows from the representation of the limit distribution conditional on $B_1$ as the distribution of an infinite weighted sum of independent $\chi^2(2)$ variables, similarly to Nyblom (1989) and Rao and Swift (2006, pp. 472-473), using the a.s. continuity of $V$. For $SupF_{1x}$, continuity of the limiting conditional cdf follows from Proposition 3.2 of Linde (1989) applied conditionally on $B_1$. The proof then proceeds as that of Corollary 1 in GHLT. □