The Informational Efficiency of European Natural Gas Hubs: Price Formation and Intertemporal Arbitrage

This study analyzes the informational efficiency of the European natural gas market by empirically investigating price formation and arbitrage efficiency between spot and futures markets. The econometric approaches account for nonlinearities induced by the low-liquidity framework and by technical constraints of the gas hubs considered. The empirical results reveal that price discovery generally takes place on the futures market. Thus, the futures market seems to be more informationally efficient than the spot market. The theory of storage seems to hold at all hubs in the long run. There is empirical evidence of significant market frictions hampering intertemporal arbitrage. The UK's NBP and Austria's CEGH seem to be the hubs at which arbitrage opportunities are exhausted most efficiently, although the degree of intertemporal arbitrage efficiency converges over time across the hubs investigated.


INTRODUCTION
The price signals of commodity spot and futures markets are of economic significance for market participants and various stakeholders, as they tend to ensure an efficient allocation of resources. However, the extent to which commodity spot and futures prices fulfil their function crucially depends on the informational efficiency of the respective market. Economic theory suggests that sufficient market liquidity facilitates the processing of information into valid price signals. Thus, the efficiency of markets that are still immature and suffer from a lack of liquidity may be questioned. This holds true for the natural gas wholesale markets within continental Europe. Spot markets for immediate delivery of natural gas as well as futures markets have emerged rather recently as a consequence of the natural gas directives of the European Parliament (EU, 2003; EU, 2009), which aim at an integrated and competitive European gas market. Liquidity on these markets, though rising, is still low compared to the mature gas markets in the UK or the U.S. The limited liquidity of both spot and futures markets at continental European gas hubs has entered the scientific debate, as European gas pricing is currently in transition from traditional oil-indexed pricing of long-term contracts (LTC) towards a greater significance of hub-based pricing.¹

According to the theory of storage, the futures price $F_t$ equals the spot price compounded at the interest rate $r_t$, $S_t(1 + r_t)$, plus the storage costs $w_t$ adjusted for the convenience yield $c_t$ (i.e., the economic benefit of physical ownership). This condition can be stated as
$$F_t = S_t(1 + r_t) + w_t - c_t \qquad (1)$$
Deviations from the intertemporal equilibrium may trigger arbitrage activity by market participants. In this context, arbitrage can be considered as the economic activity of generating risk-free profits by taking advantage of the substitutability between commodity spot and futures markets (Schwartz and Szakmary, 1994). As outlined by Huang et al. (2009), a long arbitrage position, i.e., buying the commodity on the spot market and selling a futures contract, is profitable if the basis, adjusted for the interest, exceeds the difference of warehouse costs and convenience yield:
$$F_t - S_t(1 + r_t) > w_t - c_t \qquad (2)$$
In contrast, a short arbitrage position, i.e., selling the commodity on the spot market and buying a futures contract, generates profits if
$$F_t - S_t(1 + r_t) < w_t - c_t \qquad (3)$$
The theory of storage has been empirically analyzed for different commodity markets by Fama and French (1987), and more recently by Considine and Larson (2001) and Huang et al. (2009). With regard to the European natural gas market, Stronzik et al. (2009) find significant deviations from the theory of storage equilibrium for three European hubs for the period 2005 to 2008 using indirect testing procedures. However, the efficiency of intertemporal arbitrage activity at European gas hubs has not yet been addressed in the existing literature. The subsequent sections seek to close this research gap.
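The two arbitrage conditions can be illustrated with a short sketch. It assumes the cost-of-carry form F = S(1 + r) + w - c, where w (storage cost) and c (convenience yield) are hypothetical per-unit inputs; the function name and all numbers are illustrative and not taken from the study.

```python
def arbitrage_signal(spot, futures, r, storage_cost, convenience_yield):
    """Classify the arbitrage position implied by a cost-of-carry deviation.

    Assumed equilibrium (illustrative): F = S * (1 + r) + w - c.
    """
    # Interest-adjusted basis: futures price minus the compounded spot price.
    adjusted_basis = futures - spot * (1.0 + r)
    # Net carrying cost: storage cost minus convenience yield.
    net_carry = storage_cost - convenience_yield
    if adjusted_basis > net_carry:
        return "long"   # buy spot, sell futures
    if adjusted_basis < net_carry:
        return "short"  # sell spot, buy futures
    return "none"       # prices are exactly at the assumed equilibrium


# Hypothetical prices: the futures price is high relative to carrying costs.
signal = arbitrage_signal(spot=20.0, futures=22.0, r=0.01,
                          storage_cost=0.5, convenience_yield=0.2)
```

In practice the carrying parameters are not directly observable, which is one reason the study relies on cointegration-based tests rather than on evaluating the conditions directly.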

III. SAMPLE DESCRIPTION AND PRELIMINARY DATA ANALYSIS
The sample comprises daily spot, one month-ahead (m + 1), two month-ahead (m + 2) and three month-ahead (m + 3) futures prices for the German hubs 'NetConnect Germany' (NCG) and 'Gaspool' (GP),³ the Dutch gas hub 'Title Transfer Facility' (TTF),⁴ the UK's 'National Balancing Point' (NBP),⁵ the French hub 'Point d'Echange de Gaz Nord' (PEGN)⁶ and the Austrian 'Central European Gas Hub' (CEGH)⁷ for the period October 2007 to August 2012.⁸ All prices represent the settlement prices of the respective trading day. Monthly futures contracts are preferred to quarterly or seasonal products to account for the tendency towards the trading of monthly contracts with short maturity (NMA, 2012). Descriptive statistics of the return series, computed as the differences in the logarithms of two consecutive daily settlement prices, are provided in Table 1.
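The return definition used here (differences in the logarithms of consecutive daily settlement prices) can be sketched as follows. The price values are hypothetical and serve only to show the computation.

```python
import math


def log_returns(prices):
    """Daily returns as differences in the logarithms of consecutive prices."""
    return [math.log(p1) - math.log(p0) for p0, p1 in zip(prices, prices[1:])]


# Hypothetical settlement prices on four consecutive trading days.
settlement = [20.0, 21.0, 21.0, 19.95]
returns = log_returns(settlement)
```

A series of n prices yields n - 1 returns, and an unchanged price gives a return of exactly zero.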
For the subsequent econometric analysis, the stationarity properties of all price series are investigated using the Augmented Dickey-Fuller (ADF) test and the nonparametric Phillips-Perron test to avoid misleading statistical inference. In general, the null hypothesis of a unit root cannot be rejected for the log-levels, whereas it is rejected for the first differences (i.e., the daily returns).⁹ Thus, the cost-of-carry hypothesis between the spot and futures markets at the considered hubs can be investigated using cointegration analysis as proposed by Johansen (1988).¹⁰ The null hypothesis of no cointegration between spot and month-ahead prices can be rejected for all hubs.¹¹ The relevant test statistics are presented in the Appendix.
11. In the following, this study focuses on the month-ahead contracts. This is in line with the fact that the trading of futures contracts at the European gas hubs is centered on these contracts. Test statistics for futures contracts with longer maturity are presented in the Appendix. However, the choice of maturity does not alter the main empirical findings significantly.

IV. THE ROLE OF LIQUIDITY AND STORAGE CAPACITY
Differences in the informational efficiency between the European gas hubs may be caused by different sources. Most notably, sufficient market liquidity is regarded as an important element for an efficient price formation process (see, e.g., Chordia et al., 2008). Besides, the availability of flexible storage capacity and a functioning third party access to these facilities may be a prerequisite for efficient intertemporal arbitrage activity. A direct empirical investigation of both potential efficiency determinants seems promising but suffers from the lack of suitable and comprehensive data sets. Nevertheless, the subsequent paragraphs present stylized facts on these potential determinants for the hubs analyzed in order to provide an indication.
The spot and futures markets of the gas hubs considered in this study differ significantly with respect to their liquidity. While the NBP hub can be considered as mature and liquid, the younger continental European hubs suffer from low liquidity despite steadily increasing trading volumes during the last years. The churn rate, defined as the ratio between the number of traded contracts and the number of contracts that result in physical delivery of the underlying asset, can be used to assess the degree of financialization of commodity markets. Table 2 illustrates the differences among the hubs with regard to their churn rates. The historical development of traded volumes is presented in Figure 1. There is no agreement as to which churn rate is required for a market to be considered as sufficiently liquid. However, a churn rate in the range from eight to fifteen is frequently regarded as critical (IEA, 2012a). As can be seen in Table 2, only the churn rate of NBP is situated within this range. Based on the superior liquidity of the British hub, information processing and thus price formation is expected to be more efficient at NBP compared to the continental European hubs.
Source: IEA (2012a), Gasunie (2011), NCG (2011). The figures presented refer to total hub trades (sum of trades in the "Over The Counter" (OTC) market and those via exchanges).

Figure 1: Trading Volumes at European Gas Hubs
Source: IEA (2012a)
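The churn rate definition can be made concrete with a minimal sketch. The volumes below are hypothetical placeholders rather than the values reported in Table 2, and the function names are illustrative; the range of eight to fifteen is the one cited from IEA (2012a).

```python
def churn_rate(traded, physically_delivered):
    """Ratio of traded contracts to contracts resulting in physical delivery."""
    if physically_delivered <= 0:
        raise ValueError("physically_delivered must be positive")
    return traded / physically_delivered


def in_critical_range(churn, low=8.0, high=15.0):
    """True if the churn rate lies within the range regarded as critical."""
    return low <= churn <= high


# Hypothetical hubs: one liquid, one with low liquidity.
liquid_hub = churn_rate(traded=1200.0, physically_delivered=100.0)
illiquid_hub = churn_rate(traded=300.0, physically_delivered=100.0)
```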
With regard to storage capacity, a first indicator is the ratio of aggregated working gas volume to annual gas consumption. In addition, the flexibility potential of the existing storage capacities is crucial for an efficient adjustment of storage flows in order to exploit arbitrage opportunities. Appropriate measures for the degree of gas storage flexibility are the ratios of aggregated injection capacity (IC) and aggregated withdrawal capacity (WC) to aggregated working gas volume (WGV). Table 3 presents data on WGV, measured in billion cubic meters (bcm), consumption (C, in bcm per year) and the three indicators for Germany, the Netherlands, the UK, France and Austria.
The data emphasize the ample storage capacity of the German, French and Austrian gas markets. In contrast, storage capacity in the UK is rather scarce since the WGV only amounts to approximately 5% of annual gas consumption. The Netherlands range between Germany and the UK in terms of this indicator. With regard to operational flexibility, Dutch gas storages seem most capable of adjusting operations to changing market conditions in the short run, while UK storage facilities are fairly inflexible. From a technical point of view, the storage indicators thus suggest that the storage market in the UK is less supportive of efficient intertemporal arbitrage activity.
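The three storage indicators discussed above are simple ratios, sketched below. The input values are hypothetical and not those of Table 3, and the units assumed for injection and withdrawal capacity (bcm per day) are an assumption made for illustration only.

```python
def storage_indicators(wgv, consumption, injection_capacity, withdrawal_capacity):
    """Capacity and flexibility ratios for a national gas storage market.

    wgv and consumption in bcm (per year for consumption); the capacity
    units (bcm per day) are an illustrative assumption.
    """
    return {
        "wgv_to_consumption": wgv / consumption,
        "ic_to_wgv": injection_capacity / wgv,
        "wc_to_wgv": withdrawal_capacity / wgv,
    }


# Hypothetical market whose working gas volume covers 5% of annual consumption.
indicators = storage_indicators(wgv=4.0, consumption=80.0,
                                injection_capacity=0.02,
                                withdrawal_capacity=0.08)
```

Higher values of the two flexibility ratios indicate that a given working gas volume can be filled or emptied faster, which is what matters for short-run arbitrage.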

V. PRICE FORMATION AT EUROPEAN GAS HUBS: LINEAR AND NONLINEAR CAUSALITY TESTING

a. Econometric Methodology and Economic Interpretation
The Fama (1970) hypothesis of simultaneous information processing of markets for the same asset implies that there should be no systematic relationship between price changes on spot and futures markets. Otherwise price returns of one market may be helpful in predicting price returns of another market, allowing for risk-free profits. As a consequence, tests for a systematic relationship, i.e., causality tests, can be used to empirically test the hypothesis of simultaneous information processing. In other words, causality tests are helpful in testing whether a market A is quicker in processing information and hence more informationally efficient than another market B. In this case, market B follows the price changes of market A, which acts as the leading market. The finding of causality from a market A to a market B thus represents evidence of market A providing price discovery for market B in this example. In contrast, the hypothesis of equal informational efficiency rules out systematic unidirectional causality between the markets considered. The most established econometric measure of causality is the concept of Granger causality (Granger, 1969). Granger causality exists if one variable is helpful in predicting future changes of another variable, i.e., the availability of current data on a certain variable reduces the forecast error of another variable. In statistical terms, a process $X_t$ is said to cause a process $Y_t$ in the sense of Granger if
$$\Sigma_Y(h \mid \Omega_t) < \Sigma_Y(h \mid \Omega_t \setminus \{X_s \mid s \le t\}) \quad \text{for at least one } h = 1, 2, \ldots \qquad (4)$$
where $\Sigma_Y(h \mid \Omega_t)$ is the optimal mean squared error of an h-step forecast of $Y_t$ based on the information set $\Omega_t$ reflecting all past and current information (Lütkepohl, 2005).
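A linear Granger causality test is commonly implemented as an F-test comparing a regression on own lags with one that adds lags of the other variable. The sketch below illustrates this idea on simulated data; it is not the VECM-based procedure used in the study, and the series names and parameters are hypothetical.

```python
import numpy as np


def granger_f_stat(cause, effect, lags=1):
    """F statistic for H0: lagged `cause` does not help predict `effect`."""
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    n = len(effect)
    y = effect[lags:]
    # Column i holds the series lagged by i + 1 periods.
    own = np.column_stack([effect[lags - i - 1:n - i - 1] for i in range(lags)])
    other = np.column_stack([cause[lags - i - 1:n - i - 1] for i in range(lags)])
    const = np.ones((n - lags, 1))

    def rss(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return float(resid @ resid)

    rss_restricted = rss(np.hstack([const, own]))
    rss_unrestricted = rss(np.hstack([const, own, other]))
    dof = (n - lags) - (2 * lags + 1)
    return ((rss_restricted - rss_unrestricted) / lags) / (rss_unrestricted / dof)


# Simulated returns in which the futures market leads the spot market by one day.
rng = np.random.default_rng(0)
n_obs = 500
futures_ret = rng.normal(size=n_obs)
noise = rng.normal(size=n_obs)
spot_ret = np.empty(n_obs)
spot_ret[0] = noise[0]
spot_ret[1:] = 0.5 * futures_ret[:-1] + noise[1:]

f_futures_to_spot = granger_f_stat(futures_ret, spot_ret)
f_spot_to_futures = granger_f_stat(spot_ret, futures_ret)
```

A large F statistic in one direction only corresponds to the unidirectional causality pattern that the text interprets as price discovery by the leading market.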
The Granger causality test outlined above is only capable of investigating linear relationships. However, there is empirical evidence suggesting nonlinearities in the relationship of commodity spot and futures markets, which is usually attributed to the nonlinearity of transaction costs and market microstructure effects such as minimum lot sizes (Bekiros and Diks, 2008; Chen and Lin, 2004; Silvapulle and Moosa, 1999) as well as to asymmetric information and heterogeneous expectations of market participants (Arouri et al., 2013). There are good reasons to believe that these drivers of nonlinear interaction are relevant for the continental European gas hubs, since low liquidity and technical constraints at these hubs may foster market frictions. As a consequence, the empirical investigation of price discovery at the European gas market should take nonlinearities into account.
For this purpose, the nonlinear causality test of Diks and Panchenko (2006), which builds on the Hiemstra-Jones test (Hiemstra and Jones, 1994), can be applied. Their methodology tests whether the whole current conditional distribution of a certain variable A has predictive power for the future conditional distribution of variable B. Thus, causality can be investigated not only in the first but also in higher moments of the distribution. From an economic perspective, this allows the empirical analysis of nonlinear interaction between the two markets caused by transaction costs such as information costs and bid-ask spreads or by technical constraints such as restricted network or storage capacity.
In statistical terms, the null hypothesis of absent nonlinear Granger causality between two series is tested using their conditional distributions. Assuming stationarity, the null hypothesis of no Granger causality of $X$ with respect to $Y$ implies that the conditional distribution of a variable $Z$ given its past realization $Y = y$ equals the conditional distribution of $Z$ given $Y = y$ and $X = x$. Thus, the joint probability density functions and their marginals can be used to state the null hypothesis as
$$\frac{f_{X,Y,Z}(x,y,z)}{f_Y(y)} = \frac{f_{X,Y}(x,y)}{f_Y(y)} \cdot \frac{f_{Y,Z}(y,z)}{f_Y(y)} \qquad (5)$$
Diks and Panchenko (2006) show that the null hypothesis can be reformulated as
$$q \equiv E\left[ f_{X,Y,Z}(X,Y,Z)\, f_Y(Y) - f_{X,Y}(X,Y)\, f_{Y,Z}(Y,Z) \right] = 0 \qquad (6)$$
As outlined by Diks and Panchenko (2005), the test statistic is corrected for possible size bias resulting from time-varying conditional distributions. Diks and Panchenko (2006) show that the adjusted test statistic is
$$T_n(\epsilon_n) = \frac{n-1}{n(n-2)} \sum_i \left( \hat{f}_{X,Z,Y}(X_i,Z_i,Y_i)\, \hat{f}_Y(Y_i) - \hat{f}_{X,Y}(X_i,Y_i)\, \hat{f}_{Y,Z}(Y_i,Z_i) \right) \qquad (7)$$
where $\hat{f}_W(W_i)$ is the estimator of the local density of a $d_W$-variate random vector $W$ with
$$\hat{f}_W(W_i) = \frac{(2\epsilon_n)^{-d_W}}{n-1} \sum_{j,\, j \neq i} I_{ij}^W, \qquad I_{ij}^W = I\left(\lVert W_i - W_j \rVert < \epsilon_n\right) \qquad (8)$$
where $\epsilon_n$ is the bandwidth depending on the sample size $n$ and $I(\cdot)$ is an indicator function. Diks and Panchenko (2006) demonstrate that the distribution of the test statistic equals
$$\sqrt{n}\, \frac{T_n(\epsilon_n) - q}{S_n} \xrightarrow{d} N(0,1) \qquad (9)$$
for a lag length of 1 and $\epsilon_n = C n^{-\beta}$ with $C > 0$ and $\beta \in (1/4, 1/3)$. $S_n^2$ is the estimator of the asymptotic variance of $T_n(\cdot)$ (Bekiros and Diks, 2008). Furthermore, Diks and Panchenko (2006) show that the optimal bandwidth (i.e., minimizing the mean squared error of $T_n$) is
$$\epsilon_n^* = C^* n^{-2/7} \qquad (10)$$
To sum up, causality tests are used in this study to investigate the Fama (1970) simultaneous information processing hypothesis and thus price discovery at European natural gas hubs. Besides linear causality patterns, nonlinear causality is explicitly addressed to account for the low-liquidity framework of the continental European gas hubs and for the technical specifics of the natural gas market.
12. In doing so, the cointegration relationship between spot and futures prices is explicitly accounted for to avoid misleading inference. Ignoring an existing cointegration relationship may lead to invalid results of linear and nonlinear Granger causality tests, as outlined by Chen and Lin (2004).
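The core of the Diks and Panchenko (2006) statistic can be sketched in a few lines. The code below is an illustration under stated assumptions: lag length 1, the supremum norm for distances, and a bandwidth of the form C n^(-2/7). It computes only the statistic T_n and omits the variance estimator needed for the standardized test, so it is not a complete implementation of the procedure; all function names and data are hypothetical.

```python
import numpy as np


def dp_bandwidth(n, c=8.0):
    """Bandwidth eps_n = C * n**(-2/7); the default C is an assumption."""
    return c * n ** (-2.0 / 7.0)


def dp_statistic(x, y, eps):
    """T_n(eps) for H0: X does not Granger-cause Y (lag length 1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X, Y, Z = x[:-1], y[:-1], y[1:]  # Z_t is the one-step-ahead value of Y
    n = len(X)

    def density(*cols):
        # Local density estimate: share of other points within eps (sup norm),
        # scaled by the volume (2 * eps)**d of the neighbourhood.
        W = np.column_stack(cols)
        dist = np.max(np.abs(W[:, None, :] - W[None, :, :]), axis=2)
        close = (dist < eps).astype(float)
        np.fill_diagonal(close, 0.0)  # exclude j == i
        return close.sum(axis=1) * (2.0 * eps) ** (-W.shape[1]) / (n - 1)

    terms = density(X, Y, Z) * density(Y) - density(X, Y) * density(Y, Z)
    return float((n - 1) / (n * (n - 2)) * terms.sum())


# Hypothetical independent series; the statistic is computed with eps = 1.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=200)
y_demo = rng.normal(size=200)
t_n = dp_statistic(x_demo, y_demo, eps=1.0)
```

The pairwise distance array grows with the square of the sample size, so for long daily series a chunked or tree-based neighbour count would be preferable to this direct approach.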

b. Empirical Results
The linear Granger causality test is carried out for the price returns within the vector error correction model (VECM) framework.¹² In addition, the VECM-filtered residuals are tested for any remaining linear causality pattern. Table 4 contains the results of the linear Granger causality tests for the spot and month-ahead return series. For the unfiltered return series, the null hypothesis of absent Granger causality can be rejected for the direction from futures to spot markets at all hubs. This means that the change in the month-ahead futures price has explanatory power for the next day's spot price change. In contrast, there is (weak) empirical evidence of Granger causality from the spot to the futures market only at GP. Consequently, information is not processed simultaneously by spot and futures market participants. In fact, information is first processed within the futures market and subsequently transmitted to the spot market. Thus, the futures market seems to be the dominant market in terms of price discovery. The finding of the futures market providing price discovery for the spot market is especially noteworthy in the context of natural gas markets, where the information sets of spot and futures markets partially differ from one another. Most notably, short-run influences such as weather conditions or infrastructure outages are expected to affect spot market returns significantly, whereas their impact on the futures market should be limited. However, despite these specific characteristics of the purely physical spot market, the futures market still has significant explanatory power for the subsequent outcome of the spot market.
The informational superiority of the futures market may result from the broader scope of market participants at this market. The opportunity to trade futures contracts multiple times before maturity (and thus close out the trading position without taking physical delivery) makes the futures market attractive for hedgers and speculators without interest in physical delivery of the underlying asset. These additional market participants may cause a greater efficiency of information processing of the futures market compared to the one of the spot market, as suggested by Silvapulle and Moosa (1999) and Bohl et al. (2012). Overall, the empirical evidence of the month-ahead natural gas futures market leading the corresponding spot market is in line with the findings of Root and Lien (2003) and Dergiades et al. (2012) for the U.S. natural gas market. For the VECM-filtered series, the null hypothesis of absent Granger causality cannot be rejected in any direction for all hubs (test statistics are provided in the Appendix). This suggests that all linear causality is captured by the VECM model.
The nonlinear causality testing procedure is applied to the VECM-filtered residuals to ensure that any detected causality can be attributed to nonlinear interaction of the spot and futures markets. Following Diks and Panchenko (2006), the constant term $C^*$ of the bandwidth $\epsilon_n^*$ is set to 8.¹³ As can be seen in Table 5, the null hypothesis of absent nonlinear Granger causality among spot and month-ahead return series can be rejected in both directions for all hubs except for CEGH.¹⁴ However, this finding should be interpreted cautiously: as pointed out by Silvapulle and Moosa (1999), conditional heteroskedasticity of both series may distort the size of the nonlinear causality test. Following this argument, a multivariate GARCH model, the diagonal BEKK GARCH model of Engle and Kroner (1995), is applied to capture the dynamics in the second moment of the distribution in both markets, filtering out conditional volatility effects.¹⁵ Subsequently, the nonlinear causality test of Diks and Panchenko (2006) is applied to the BEKK GARCH-filtered VECM residuals.
For all hubs except for Gaspool, the nonlinear causality from spot to futures markets disappears after BEKK-GARCH filtering, while for some hubs nonlinear causality from futures to the spot market remains. This suggests that the predictive power of spot return distributions for subsequent distributions of futures market returns is mainly due to conditional volatility effects and thus not a result of informational superiority. Overall, the performed causality analysis suggests that price formation at European gas hubs generally takes place on the more informationally efficient futures markets, with the less informationally efficient spot markets adjusting accordingly.

VI. LONG- AND SHORT-RUN DYNAMICS BETWEEN SPOT AND FUTURES MARKETS: THE EFFICIENCY OF INTERTEMPORAL ARBITRAGE

a. Econometric Methodology and Economic Interpretation
The theory of storage calls for a stable long-run equilibrium between the spot and the futures market for the same underlying asset. The finding of cointegration relationships for the spot and futures market price series at all hubs in Section III thus confirms that the theory of storage holds in the long run. From an economic perspective, this means that there is significant arbitrage activity at these hubs, which prevents deviations from the long-run equilibrium from persisting indefinitely. The intertemporal long-run relationship between spot and futures market can be written as
$$S_t = \beta_0 + \beta F_t + \epsilon_t \qquad (11)$$
Here, $S_t$ and $F_t$ are the spot and the futures prices, respectively. The coefficient $\beta$ represents the degree of price convergence in the long run and $\epsilon_t$ captures the deviations from the long-run relationship.¹⁶ As outlined by Arouri et al. (2013), long-run informational efficiency implies cointegration between both price series and full price convergence of spot and futures markets, i.e., $\beta = 1$. However, even if the market is informationally efficient in the long run, short-run inefficiencies may exist (Arouri et al., 2013). Such short-run inefficiencies are characterized by transitory deviations from the cost-of-carry conditions that are not immediately exploited by arbitrage activity. In order to assess the short-run informational efficiency of spot and futures markets, their short-term behavior can be modelled by the following VECM:
$$\Delta S_t = \alpha_S \epsilon_{t-1} + \sum_{i=1}^{k} \gamma_{S,i} \Delta S_{t-i} + \sum_{i=1}^{k} \delta_{S,i} \Delta F_{t-i} + u_{S,t}$$
$$\Delta F_t = \alpha_F \epsilon_{t-1} + \sum_{i=1}^{k} \gamma_{F,i} \Delta S_{t-i} + \sum_{i=1}^{k} \delta_{F,i} \Delta F_{t-i} + u_{F,t} \qquad (12)$$
where $\alpha$ is the adjustment coefficient representing the error correction of the series in case of any deviation from the long-run equilibrium (Lütkepohl, 2005) and $k$ denotes the lag length. To assess the efficiency of arbitrage, the $\alpha$ coefficients are of central interest because they measure the speed of error correction, i.e., the process of arbitrage activity eliminating the deviations from the intertemporal equilibrium. The greater the value of the adjustment coefficient in absolute terms, the more informationally efficient are the market participants in exhausting arbitrage opportunities. The $\alpha$ coefficients hence represent a measure of intertemporal arbitrage efficiency. The $\gamma$ and $\delta$ coefficients account for the autoregressive behaviour of the series and thus indicate whether lagged changes in the variables are significant for current changes; they are not of interest for the assessment of intertemporal arbitrage efficiency.
16. With regard to the cost-of-carry relationship, the intercept in Equation (11) contains the time-invariant spread between futures and spot prices that can be assigned to the convenience yield, storage costs and the interest rate. Assuming time-invariant carrying parameters, $\epsilon_t$ represents the deviation from the cost-of-carry relationship, triggering arbitrage trading between spot and futures markets. One should keep in mind that in case of time-varying carrying parameters (e.g., fluctuations of storage costs), $\epsilon_t$ may not completely reflect deviations from the equilibrium condition.
17. TVECMs have proved to be a useful approach for capturing arbitrage dynamics among spot and futures markets for financial assets (e.g., Anderson, 1997) and various commodities (Li, 2010; Huang et al., 2009; Root and Lien, 2003) by explicitly accounting for market frictions.
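The role of the adjustment coefficient can be illustrated with a two-step least-squares sketch: a long-run regression of the spot on the futures price, followed by a regression of spot price changes on the lagged residual and one lag of both price changes. This is a simplified stand-in, not the estimation procedure of the study; the long-run form S = b0 + b F + e, the simulated data and all names are assumptions. The half-life helper translates an adjustment coefficient into the number of periods needed to correct half of a deviation.

```python
import math

import numpy as np


def adjustment_coefficient(spot, futures):
    """Two-step estimate of (beta, alpha) for the spot equation, one lag."""
    s = np.asarray(spot, dtype=float)
    f = np.asarray(futures, dtype=float)
    # Step 1: long-run regression S_t = b0 + b * F_t + e_t.
    X_lr = np.column_stack([np.ones_like(f), f])
    coef, *_ = np.linalg.lstsq(X_lr, s, rcond=None)
    e = s - X_lr @ coef
    # Step 2: regress dS_t on e_{t-1}, dS_{t-1} and dF_{t-1}.
    ds, df = np.diff(s), np.diff(f)
    X_sr = np.column_stack([e[1:-1], ds[:-1], df[:-1]])
    beta_sr, *_ = np.linalg.lstsq(X_sr, ds[1:], rcond=None)
    return float(coef[1]), float(beta_sr[0])


def half_life(alpha):
    """Periods until half of a deviation is corrected, for -1 < alpha < 0."""
    return math.log(0.5) / math.log(1.0 + alpha)


# Simulated prices: the spot price corrects 20% of the deviation each day.
rng = np.random.default_rng(1)
n_obs = 2000
f_sim = 3.0 + np.cumsum(rng.normal(scale=0.02, size=n_obs))
s_sim = np.empty(n_obs)
s_sim[0] = f_sim[0]
for t in range(1, n_obs):
    deviation = s_sim[t - 1] - f_sim[t - 1]
    s_sim[t] = s_sim[t - 1] - 0.2 * deviation + rng.normal(scale=0.02)

beta_hat, alpha_hat = adjustment_coefficient(s_sim, f_sim)
```

As a point of reference, a hypothetical adjustment coefficient of -0.13 corresponds to a half-life of roughly five periods under this formula.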
The specified VECM assumes linearity in the error correction process. This implies that arbitrage activity starts instantaneously in case of any, arbitrarily small, deviation from the long-run equilibrium, thus neglecting any kind of market frictions. However, the exhaustion of arbitrage opportunities at European gas hubs may be constrained by significant transaction costs resulting from low liquidity and by physical constraints, e.g., limited injection and withdrawal capacity of storage facilities. Thus, intertemporal arbitrage may only be triggered if the deviation from the cost-of-carry equilibrium exceeds a certain threshold, such that the arbitrage traders are compensated for the incurred transaction or information costs (Li, 2010), resulting in a so-called "band of no arbitrage" around the long-run equilibrium.
In order to investigate whether intertemporal arbitrage is constrained by a "band of no arbitrage" due to market frictions, the threshold vector error correction model (TVECM) approach proposed by Granger and Lee (1989) can be applied.¹⁷ The economic intuition of a TVECM is that arbitrage activity may depend on the magnitude of the deviation from the equilibrium. Thus, the model allows arbitrage efficiency to vary between different regimes. The bivariate TVECM of order $k$ applied to the bivariate system of spot and futures market returns used in this study has the representation
$$\Delta S_t = \alpha_S^h I_t \epsilon_{t-1} + \alpha_S^l (1 - I_t) \epsilon_{t-1} + \sum_{i=1}^{k} \gamma_{S,i} \Delta S_{t-i} + \sum_{i=1}^{k} \delta_{S,i} \Delta F_{t-i} + u_{S,t}$$
$$\Delta F_t = \alpha_F^h I_t \epsilon_{t-1} + \alpha_F^l (1 - I_t) \epsilon_{t-1} + \sum_{i=1}^{k} \gamma_{F,i} \Delta S_{t-i} + \sum_{i=1}^{k} \delta_{F,i} \Delta F_{t-i} + u_{F,t} \qquad (13)$$
where $I_t$ denotes the regime indicator stating whether the lagged deviation from the long-run equilibrium is below or above a certain threshold (in absolute terms). The adjustment coefficient $\alpha^h$ ($\alpha^l$) represents the error correction dynamic for the case in which the absolute value of the deviation is higher (lower) than the threshold (Enders and Siklos, 2001). As a consequence, the coefficients $\alpha^h$ and $\alpha^l$ represent measures of regime-specific intertemporal arbitrage efficiency. Thus, comparing the significance and magnitude of the regime-specific adjustment coefficients allows for an assessment of the arbitrage efficiency in the different regimes and therefore of the relevance of market frictions. For instance, if $\alpha^l$ is insignificant in the model while $\alpha^h$ is significant, this points towards a "band of no arbitrage", i.e., arbitrage is not carried out due to market frictions unless the deviation from the intertemporal equilibrium crosses a certain threshold. In contrast, if $\alpha^l$ and $\alpha^h$ are both significant, this suggests that there is no empirical evidence of a "band of no arbitrage" as arbitrage dynamics are similar in both regimes.
18. Test statistics are provided in the Appendix.
19. Similar results are obtained from the VECM estimation for the interaction of spot prices and futures prices with longer maturity. The respective test statistics are presented in the Appendix.
20. For instance, the absolute value of the adjustment coefficient of the NCG spot return series implies a half-life period of error correction of about five days.
21. The small sample size of CEGH does not allow for valid statistical inference when splitting the data in various regimes as done in the TVECM.
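The regime-dependent error correction can be sketched by splitting the lagged deviation into a high and a low regime and estimating a separate adjustment coefficient for each. The code below is a simplified least-squares illustration for the spot equation with one lag and a symmetric threshold expressed as a multiple of the residual standard deviation; the simulated data, the assumed long-run form S = b0 + b F + e and all names are hypothetical, and no standard errors are computed.

```python
import numpy as np


def tvecm_spot_adjustment(spot, futures, threshold_scale=1.0):
    """Regime-specific adjustment coefficients (alpha_h, alpha_l) and threshold."""
    s = np.asarray(spot, dtype=float)
    f = np.asarray(futures, dtype=float)
    # Long-run regression S_t = b0 + b * F_t + e_t (assumed form).
    X_lr = np.column_stack([np.ones_like(f), f])
    coef, *_ = np.linalg.lstsq(X_lr, s, rcond=None)
    e = s - X_lr @ coef
    tau = threshold_scale * e.std()  # symmetric threshold in units of sigma_e
    ds, df = np.diff(s), np.diff(f)
    e_lag = e[1:-1]
    high = (np.abs(e_lag) > tau).astype(float)  # regime indicator I_t
    X = np.column_stack([high * e_lag, (1.0 - high) * e_lag, ds[:-1], df[:-1]])
    beta, *_ = np.linalg.lstsq(X, ds[1:], rcond=None)
    return float(beta[0]), float(beta[1]), float(tau)


# Simulated prices: the spot price only corrects deviations larger than 0.05,
# mimicking a "band of no arbitrage".
rng = np.random.default_rng(2)
n_obs = 3000
f_sim = 3.0 + np.cumsum(rng.normal(scale=0.02, size=n_obs))
s_sim = np.empty(n_obs)
s_sim[0] = f_sim[0]
for t in range(1, n_obs):
    dev = s_sim[t - 1] - f_sim[t - 1]
    pull = -0.5 * dev if abs(dev) > 0.05 else 0.0
    s_sim[t] = s_sim[t - 1] + pull + rng.normal(scale=0.02)

alpha_h, alpha_l, tau = tvecm_spot_adjustment(s_sim, f_sim)
```

In such simulated data the high-regime coefficient is clearly negative while the low-regime coefficient stays close to zero, which is the pattern the text interprets as evidence of a band of no arbitrage.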

b. Empirical Results
In order to investigate the efficiency of intertemporal arbitrage, linear VECMs as specified in Equation (12) are estimated for all hubs. Table 6 presents the estimated cointegration vector and the adjustment coefficients. The estimated $\beta$ coefficients are statistically significant and close to unity. For all hubs, the hypothesis of full price convergence of the spot and the futures market, i.e., $\beta = 1$, cannot be rejected using likelihood ratio tests.¹⁸ Thus, the hubs analyzed can be considered as informationally efficient in the long run.
Next, the potential short-run inefficiencies are investigated. The adjustment coefficient is statistically significant in all spot price return equations. Hence, deviations from the long-run relationship are corrected within the spot market at all hubs. In contrast, except for Gaspool where the futures price adjusts slightly, the futures price return series do not react to deviations from the equilibrium. This finding is in line with Huang et al. (2009), who obtain similar results for crude oil spot and futures markets in the period 1991 to 2001.¹⁹ The small absolute values of the adjustment coefficients imply a sticky error correction process and thus suggest a rather low efficiency of intertemporal arbitrage.²⁰ Although this means that none of the considered hubs can be regarded as fully informationally efficient in the short run, arbitrage opportunities seem to be exploited most efficiently at NBP and CEGH. However, the empirical results of CEGH should be interpreted with caution due to the small sample size of the respective price series. In turn, the finding of high arbitrage efficiency at NBP is noteworthy as physical storage flexibility is much smaller in the UK than in continental Europe (see Table 3) and may be a result of the superior liquidity of the British hub. However, the difference in the speed of adjustment and hence in the degree of arbitrage efficiency between the hubs is fairly moderate.
Notes: *** Denotes significance at the 99% level. A lag length of 1 for the VECM is selected based on the Schwarz Information Criterion for NCG, Gaspool, TTF, PEGN and CEGH, while the same criterion suggests including 2 lags for NBP. The autoregressive coefficients are not reported to conserve space.

The TVECM of Equation (13) is estimated using different thresholds for all hubs except for CEGH.²¹ The thresholds are assumed to be symmetric and their size is defined in terms of the standard deviation of $\epsilon_t$, the error term of the cointegration regression.²² Table 7 contains the estimates for the regime-specific adjustment coefficients of the TVECM.
22. The standard deviations of the different $\epsilon_t$ series range between 0.08 and 0.11. The thresholds selected for the TVECM estimation are $0.5\sigma_\epsilon$ and $\sigma_\epsilon$. In general, smaller and greater thresholds can be used to investigate the regime-dependent arbitrage dynamics. However, these threshold choices result in small regimes with large standard errors of the estimated coefficients, hindering valid statistical inference. The same problem occurs when estimating the thresholds endogenously following the procedure of Balke and Fomby (1997).
In the TTF spot price return equation, the adjustment coefficient is statistically significant in both regimes. Thus, for the threshold values tested, there is no empirical evidence of a "band of no arbitrage". In contrast, arbitrage at NCG, GP, NBP and PEGN does not start until the deviation from the long-run equilibrium exceeds a certain threshold (i.e., $\alpha^l$ is insignificant for at least one specification). Surprisingly, although NBP is the most liquid hub in the sample, it exhibits a rather broad "band of no arbitrage", indicating significant frictions hampering instantaneous arbitrage. To sum up, intertemporal arbitrage starts most instantaneously at TTF but is executed most efficiently at NBP once the deviation from the intertemporal equilibrium crosses a certain threshold. The first finding is in line with the high flexibility of Dutch gas storage (see Table 3), while the latter may be attributed to the superior liquidity of NBP (see Table 2).
Notes: The estimation is based on OLS using robust standard errors as proposed by Newey and West (1987). A lag length of 1 for the VECM is selected based on the Schwarz Information Criterion for NCG, Gaspool, TTF and PEGN, while the same criterion suggests including 2 lags for NBP. The autoregressive coefficients are not reported to conserve space.
23. Most notably, the Third Gas Market Directive of the European Union from 2009 comprises various efforts to improve access to gas infrastructure and thus facilitates the development of liquid natural gas hubs (EU, 2009).
24. As initial value of $\alpha$, zero is selected assuming informational inefficiency at the beginning of the sample period. The variance of the respective spot return series, , is selected as initial variance of and is set to . In line with the linear VECM specified above, one lag is included for NCG, GP, TTF, and PEGN while two lags are used in the specification for NBP. The small sample size of CEGH does not allow for a valid estimation of the state-space model for this hub.

VII. THE EVOLUTION OF ARBITRAGE EFFICIENCY: A KALMAN FILTER APPROACH

a. Econometric Methodology and Economic Interpretation
Various political and regulatory measures have been introduced to foster the liquidity of the continental European gas hubs.²³ As a consequence, one may expect informational efficiency at these hubs to have increased over time. To test this hypothesis, a dynamic state-space approach can be applied to capture the evolution of intertemporal arbitrage efficiency over time. Time-varying coefficient models have been used for the European gas market in different applications (see Neumann et al., 2006 and Growitsch et al., 2012). However, this paper is the first to apply the state-space methodology within an intertemporal context for the European gas markets. In doing so, the intertemporal arbitrage dynamic is investigated by estimating Equation (14):
$$\Delta S_t = \alpha_t \epsilon_{t-1} + \sum_{i=1}^{k} \gamma_i \Delta S_{t-i} + \sum_{i=1}^{k} \delta_i \Delta F_{t-i} + u_t, \qquad \alpha_t = \alpha_{t-1} + v_t \qquad (14)$$
where $\alpha_t$ represents the time-varying adjustment coefficient following a random walk and $\epsilon_{t-1}$ is the lagged error term of the linear cointegration regression. Thus, $\alpha_t$ represents a time-varying measure of intertemporal arbitrage efficiency at the respective hub. Based on the hypothesis of increasing informational efficiency over time at the continental European hubs due to the rise in liquidity, the absolute values of the respective $\alpha_t$ coefficients are expected to increase over time. In contrast, a decrease in the absolute value of $\alpha_t$ would imply a decrease in intertemporal arbitrage efficiency.
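The recursive estimation of a random-walk coefficient can be sketched with a scalar Kalman filter. The example below is a simplification under stated assumptions: it drops the lagged return terms, treats the observation and state variances as known inputs, and uses simulated data in which x stands in for the lagged cointegration residual and y for the spot return. Function and parameter names are hypothetical.

```python
import random


def kalman_tv_coefficient(y, x, obs_var, state_var, alpha0=0.0, p0=1.0):
    """Filtered path of alpha_t in y_t = alpha_t * x_t + u_t,
    with alpha_t = alpha_{t-1} + v_t (random walk)."""
    alpha, p = alpha0, p0
    path = []
    for y_t, x_t in zip(y, x):
        # Prediction step: the random walk keeps the mean and adds state noise.
        p = p + state_var
        # Update step: correct the prediction with the new observation.
        innovation = y_t - alpha * x_t
        innovation_var = x_t * p * x_t + obs_var
        gain = p * x_t / innovation_var
        alpha = alpha + gain * innovation
        p = (1.0 - gain * x_t) * p
        path.append(alpha)
    return path


# Simulated data with a constant true coefficient of -0.3; the filter starts
# from zero, i.e., from the assumption of no error correction.
random.seed(0)
true_alpha = -0.3
x_sim = [random.gauss(0.0, 1.0) for _ in range(500)]
y_sim = [true_alpha * x_t + random.gauss(0.0, 0.1) for x_t in x_sim]
alpha_path = kalman_tv_coefficient(y_sim, x_sim, obs_var=0.01, state_var=1e-6)
```

The ratio of state variance to observation variance controls how quickly the filtered coefficient reacts to new data: a larger state variance produces a more volatile path, a smaller one a smoother path.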

b. Empirical Results
The state-space model of Equation (14) is estimated using the recursive procedure suggested by Kalman (1960).24 Figure 2 presents the estimated time paths for the adjustment coefficients in the spot return equation.25 Some of the spikes in the plotted series can be attributed to the economic downturn in autumn 2008 and to gas market-specific shocks such as the extraordinary supply interruptions resulting from the Russian-Ukrainian crisis in January 2009 and the cold spell in February 2012.26 There is a distinctive pattern in the evolution of the relative informational efficiency of the hubs considered over time, as can be inferred from the time-varying coefficient estimates: As of the beginning of 2008, the two German hubs NCG and GP are the least informationally efficient hubs. However, the absolute value of their adjustment coefficients grows towards the end of the sample period, indicating an increase in informational efficiency. In contrast, the absolute value of the adjustment coefficient of NBP decreases over time, indicating a decline in the efficiency of intertemporal arbitrage. Similarly, intertemporal arbitrage efficiency has decreased at PEGN despite the growth in liquidity. For the Dutch TTF hub, informational efficiency is stable at a rather low level in the second half of the sample period. Overall, there is convergence in the degree of informational efficiency of the hubs considered, and only the informational efficiency of the two German hubs seems to have benefited from the increase in liquidity. Thus, as of 2012, the differences in intertemporal arbitrage efficiency of the hubs considered appear significantly reduced.
25. The evolution of the adjustment coefficient in the futures return equation is neglected due to statistical insignificance.
26. In the latter two periods, it seems reasonable to infer that the strong increase in spot price represents an immediate reaction to the physical supply and demand imbalance, independent from the futures market price. For a more detailed discussion of the economic impact of these events on German gas prices, see Nick and Thoenes (2013).

VIII. CONCLUSION
The objective of the paper was to analyze the informational efficiency of different European gas hubs by empirically investigating price discovery and arbitrage activity between spot and futures markets. For this purpose, linear and nonlinear econometric approaches were specified to explicitly account for the low-liquidity environment and the physical characteristics of the gas market.
Causality testing reveals that price formation generally takes place on the futures market. This finding is in line with the hypothesis that futures market participants react more efficiently to information than traders at spot markets (Silvapulle and Moosa, 1999; Bohl et al., 2011). It seems intuitive to attribute this finding to the broader scope of market participants on the futures market: Although the futures contracts considered result in physical delivery, the opportunity to trade the contract multiple times before maturity, and thus to close out the trading position without taking physical delivery, enables their use for hedging and speculation. Thus, in contrast to the purely physical spot market, the futures market is easily accessible for traders without interest in physical delivery. Apparently, this structural difference between the two markets makes the futures market significantly superior to the spot market in informational terms. In light of hub-based pricing of internationally traded gas, indexation to futures market prices rather than to spot market prices promises to provide more valid price signals.
The theory of storage seems to hold for all gas hubs considered in the long run, indicating the existence of arbitrage between the respective spot and futures markets. However, the error correction process is rather sticky and subject to significant frictions. From a dynamic perspective, the state-space estimations reveal a convergence in informational efficiency across the hubs during the sample period. With regard to the liquidity of the hubs investigated, the empirical results provide mixed evidence: On the one hand, intertemporal arbitrage opportunities are rather efficiently exploited at the liquid NBP, and the rise in liquidity seems to have fostered informational efficiency at NCG and GP. On the other hand, the detected frictions in the price formation process and arbitrage activities are similar for all hubs, regardless of their liquidity. Therefore, it seems reasonable to attribute these frictions at least partly to physical characteristics of the market, e.g., limited storage flexibility or inefficient allocation of storage capacity, rather than exclusively to market liquidity.
A promising field for further research could be the direct empirical analysis of potential determinants of informational inefficiency at the hubs analyzed, such as liquidity, storage utilization or network capacity. This approach, however, is currently hampered by the lack of comprehensive data sets of adequate frequency and is therefore left for future research.
Notes: *** denotes significance at the 99% level. A lag length of 1 for both VECMs is selected based on the Schwarz Information Criterion for NCG, GP, TTF and PEGN, while the same criterion suggests including 2 lags for NBP.

APPENDIX: TEST STATISTICS
Figure 2: Time-Varying Adjustment Coefficients of Spot Price Return Series

Table 4: Pairwise Linear Causality Tests for Gas Price Returns
The results of the nonlinear causality tests for the other pairs of return series are presented in the Appendix.
15. BEKK refers to the first letters of the names of Baba, Engle, Kroner and Kraft, who jointly developed the model.

Table A.8: Results of the Unit Root Tests
The unit root tests are specified with a constant but without a linear trend, as a time trend seemed inappropriate from a first inspection of the price series. The lag length of the ADF test equation was selected using the Akaike Information Criterion. The bandwidth for the Phillips-Perron test was selected using the Newey-West procedure with a Bartlett kernel.
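As a rough illustration of the test regression described above, the sketch below computes the Dickey-Fuller t-statistic for a specification with a constant and no trend. It is a simplification: no augmentation lags are included (so no AIC-based lag selection takes place), and the Phillips-Perron correction is not implemented. The function name `df_tstat` and the simulated series are hypothetical. The value of about -2.86 is the standard asymptotic 5% critical value for the constant-only case.

```python
import random


def df_tstat(y):
    """t-statistic on rho in: dy_t = c + rho * y_{t-1} + e_t.

    Constant included, no trend, no augmentation lags (a simplification).
    """
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    lag = y[:-1]
    n = len(dy)
    mean_lag = sum(lag) / n
    mean_dy = sum(dy) / n
    # Closed-form OLS for a regression with one regressor plus a constant.
    sxx = sum((v - mean_lag) ** 2 for v in lag)
    sxy = sum((v - mean_lag) * (d - mean_dy) for v, d in zip(lag, dy))
    rho = sxy / sxx
    c = mean_dy - rho * mean_lag
    rss = sum((d - c - rho * v) ** 2 for v, d in zip(lag, dy))
    se_rho = (rss / (n - 2) / sxx) ** 0.5
    return rho / se_rho


# Simulated illustration (not the paper's data): a random walk, standing in
# for a log price level, and its first difference, standing in for returns.
rng = random.Random(0)
level = [0.0]
for _ in range(500):
    level.append(level[-1] + rng.gauss(0.0, 1.0))
returns = [level[t] - level[t - 1] for t in range(1, len(level))]

CRIT_5PCT = -2.86  # asymptotic 5% critical value, constant-only case
for name, series in (("level", level), ("returns", returns)):
    t_stat = df_tstat(series)
    print(f"{name}: t = {t_stat:.2f}, reject unit root: {t_stat < CRIT_5PCT}")
```

The statistic must be compared with Dickey-Fuller critical values rather than the usual t-distribution, because the regressor is nonstationary under the null hypothesis.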

Table A.12: Pairwise Linear Causality Tests for NCG Returns
Notes: *** denotes significance at the 99% level. For the raw return series, Granger causality was investigated within the VECM framework, explicitly taking into account the cointegration relationship. For the VECM-filtered residuals, causality testing is based on a VAR model of the residuals, where the number of lags is optimized with respect to the Schwarz information criterion, suggesting the inclusion of one lag.

Table A.13: Pairwise Linear Causality Tests for GP Returns
Notes: *** (**) denotes significance at the 99% (95%) level. For the raw return series, Granger causality was investigated within the VECM framework, explicitly taking into account the cointegration relationship. For the VECM-filtered residuals, causality testing is based on a VAR model of the residuals, where the number of lags is optimized with respect to the Schwarz information criterion, suggesting the inclusion of one lag.

Table A.14: Pairwise Linear Causality Tests for TTF Returns
Notes: *** (**) denotes significance at the 99% (95%) level. For the raw return series, Granger causality was investigated within the VECM framework, explicitly taking into account the cointegration relationship. For the VECM-filtered residuals, causality testing is based on a VAR model of the residuals, where the number of lags is optimized with respect to the Schwarz information criterion, suggesting the inclusion of one lag.

Table A.15: Pairwise Linear Causality Tests for NBP Returns
Notes: *** denotes significance at the 99% level. For the raw return series, Granger causality was investigated within the VECM framework, explicitly taking into account the cointegration relationship. For the VECM-filtered residuals, causality testing is based on a VAR model of the residuals, where the number of lags is optimized with respect to the Schwarz information criterion, suggesting the inclusion of one lag.

Table A.16: Pairwise Linear Causality Tests for PEGN Returns
Notes: *** denotes significance at the 99% level. For the raw return series, Granger causality was investigated within the VECM framework, explicitly taking into account the cointegration relationship. For the VECM-filtered residuals, causality testing is based on a VAR model of the residuals, where the number of lags is optimized with respect to the Schwarz information criterion, suggesting the inclusion of one lag.

Table A.17: Pairwise Linear Causality Tests for CEGH Returns
Notes: *** denotes significance at the 99% level. For the raw return series, Granger causality was investigated within the VECM framework, explicitly taking into account the cointegration relationship. For the VECM-filtered residuals, causality testing is based on a VAR model of the residuals, where the number of lags is optimized with respect to the Schwarz information criterion, suggesting the inclusion of one lag.
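The VAR-based causality tests described in the notes above can be sketched as a comparison of a restricted and an unrestricted regression. The following minimal example assumes a plain one-lag VAR-type setup without an error correction term, so it corresponds more closely to the test on the VECM-filtered residuals than to the test on the raw returns. The function names `ols_rss` and `granger_f` and the simulated series are hypothetical and not taken from the paper.

```python
import random


def ols_rss(X, y):
    """Residual sum of squares from regressing y on the columns of X."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * v for r, v in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [u - f * w for u, w in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((v - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, v in zip(X, y))


def granger_f(cause, effect, lags=1):
    """F-statistic for H0: lagged `cause` does not help predict `effect`."""
    n = len(effect)
    y = effect[lags:]
    # Restricted model: constant and own lags only.
    own = [[1.0] + [effect[t - j] for j in range(1, lags + 1)]
           for t in range(lags, n)]
    # Unrestricted model: additionally includes lags of the other series.
    full = [row + [cause[t - j] for j in range(1, lags + 1)]
            for row, t in zip(own, range(lags, n))]
    rss_r = ols_rss(own, y)
    rss_u = ols_rss(full, y)
    dof = len(y) - len(full[0])
    return ((rss_r - rss_u) / lags) / (rss_u / dof)


# Simulated illustration (not the paper's data): futures returns lead spot
# returns by one period, and there is no feedback in the other direction.
rng = random.Random(0)
n = 300
fut = [rng.gauss(0.0, 1.0) for _ in range(n)]
spot = [0.0] + [0.8 * fut[t - 1] + rng.gauss(0.0, 0.1) for t in range(1, n)]

print(f"futures -> spot: F = {granger_f(fut, spot):.1f}")
print(f"spot -> futures: F = {granger_f(spot, fut):.1f}")
```

A large F-statistic in only one direction is the pattern that would be read as price discovery taking place on the leading market.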

Table A.18: Results of the Likelihood Ratio Test on the Cointegration Vector
The test was applied to the cointegration vector of the spot and the m + 1 futures prices. The null hypothesis of the LR test is: β = [1; -1].

Table A.19: Normalized Cointegration Vectors and Error Correction Coefficients (Spot-m + 2)
Notes: *** denotes significance at the 99% level. A lag length of 1 for both VECMs is selected based on the Schwarz Information Criterion for NCG, GP, TTF, PEGN and CEGH, while the same criterion suggests including 2 lags for NBP.