1 Introduction

The question of whether and how governments react to increases in public debt by adjusting the primary balance has attracted considerable attention over the last decades. In a seminal contribution, Bohn (1998) showed that a positive relationship between the primary balance/GDP and the debt/GDP ratios is a sufficient condition for sustainability of the debt, defined as the government’s ability to service it. Many empirical tests of debt sustainability have been carried out by estimating this relationship, known as the fiscal reaction function (FRF), either for individual economies or for panels of countries. The prevailing conclusion is that there is good support for the existence of FRFs and thus for sustainability (see, e.g. the comprehensive review by Checherita-Westphal and Žďárek 2017). However, there are good reasons for claiming that a large part of this evidence is unreliable, so that the question of the existence of a debt–primary balance relationship is still essentially open. The problem is that while the early studies (Bohn 1998, and before that, Trehan and Walsh 1991) carefully took into account the stochastic properties of the data, this is mostly not true of the more recent contributions, especially those taking a panel approach.

Given this background, our aim is to reach reliable conclusions on the FRF for the advanced economies. To this end, we will test for the existence of the FRF as a long-run relationship using adequate techniques, which for the non-stationary case differ for linear and nonlinear specifications, a point totally ignored in the literature. Following this approach, we shall find that, contrary to the commonly accepted conclusions, over the last five decades long-run FRFs existed only in a small number of advanced economies. The paper is organised as follows: in Sect. 2, we define the set-up; in Sect. 3, we describe the data and carry out some preliminary univariate analysis; in Sect. 4, we estimate the long-run FRFs; and finally, in Sect. 5, we draw some conclusions.

2 Fiscal reaction functions

2.1 Set-up

The FRF literature originates essentially from Trehan and Walsh (1991) and Bohn (1998) and is summarised, for instance, in Bohn (2008). The central relationship is the intertemporal budget constraint (IBC), which states that debt at the start of a given period t, say \(D_{t}^{*}\), must be backed by the expectation of the present value of all future primary surpluses (S):

$$\begin{aligned} D_{t}^{*}=\sum _{i=0}^{\infty }E_{t}(u_{t,i}S_{t+i}) \end{aligned}$$
(1)

where \(E_{t}\) is the conditional expectation operator and u a pricing kernel. Bohn (1998) showed that a sufficient condition for the IBC to hold is that the ratio of primary balances/GDP (hereafter pb) is an increasing function of the lagged debt/GDP ratio (hereafter d) and a bounded innovation. The simplest example of such a function is the linear FRF:

$$\begin{aligned} pb_{t}=\rho d_{t-1}+\mu _{t} \end{aligned}$$
(2)

where \(\rho >0\), the term \(\mu _{t}\), which may depend on other determinants, is assumed to be bounded, and the present value of future GDP is assumed to be finite.Footnote 1 Given the typical magnitudes of pb and d (the 1961–2019 averages are, respectively, 0.7% and 53%, see Table 1), the FRF coefficient \(\rho \) is expected to be quite small, most likely below 0.10.

Model-based sustainability analysis can be carried out estimating either directly equation (2) or an augmented version including a set of stationary variables Z capturing cyclical conditions, for instance, the output gap.Footnote 2 In either case, it is important to keep in mind that the existence of a stable relationship between pb and d is a sufficient, but by no means necessary, condition for sustainability. As remarked by Bohn (2008), the point is the degree of confidence markets have that a country will actually implement all future policies necessary to satisfy the budget constraint. Thus, empirical violations of the sustainable fiscal policy rule, as defined by (2), are possible if markets expect future policy shifts ensuring respect of the IBC.
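As a purely illustrative sketch of what estimating equation (2) directly involves, the snippet below fits the slope \(\rho \) by least squares without a constant. The debt path, the shocks and the value 0.05 are hypothetical numbers chosen for the example, not data from this paper, and plain OLS is used only for simplicity; it is not the estimator adopted later for non-stationary data.

```python
def ols_slope_no_intercept(y, x):
    """Least-squares slope of y on x without a constant term."""
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    return sxy / sxx


# Hypothetical debt/GDP path in ratio units (0.53 means 53%).
d = [0.40, 0.42, 0.45, 0.50, 0.55, 0.60, 0.63, 0.65, 0.64, 0.62]
# Hypothetical bounded shocks mu_t, one per period t = 1, ..., 9.
mu = [0.002, -0.001, 0.001, -0.002, 0.000, 0.001, -0.001, 0.002, -0.002]

true_rho = 0.05  # assumed value, inside the "below 0.10" range
# Equation (2): pb_t = rho * d_{t-1} + mu_t
pb = [true_rho * d[t - 1] + mu[t - 1] for t in range(1, len(d))]
d_lag = d[:-1]

rho_hat = ols_slope_no_intercept(pb, d_lag)
print(f"estimated rho: {rho_hat:.4f}")
```

With bounded shocks that are small relative to debt, the estimate stays close to the assumed coefficient; the statistical difficulties discussed in the text arise once d is non-stationary.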

An important development in the literature is due to Ghosh et al. (2013), who pointed out that the FRF should be generalised to account for the increasing difficulty governments may find to increase primary balances as debt grows, or “fiscal fatigue”. In practice, Ghosh et al. (2013) estimated a homogeneous panel model using a cubic specification of the type:

$$\begin{aligned} pb_{it}=\theta +\rho _{1}d_{it-1}+\rho _{2}d_{it-1}^{2} +\rho _{3}d_{it-1}^{3}+\beta Z_{it}+e_{it} \end{aligned}$$
(3)

finding significant nonlinear effects of the expected shape for their panel of 23 advanced economies for the period 1970–2007.
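To make the shape of the cubic specification (3) concrete, the following sketch locates the turning points of the debt polynomial and checks the sign of its slope. The coefficients are hypothetical, picked only so that the turning points fall at round numbers; they are not estimates from Ghosh et al. (2013) or from this paper, and the constant and Z terms are ignored because they do not affect the slope in d.

```python
import math


def frf_slope(d, rho1, rho2, rho3):
    """Derivative of rho1*d + rho2*d**2 + rho3*d**3 with respect to d."""
    return rho1 + 2.0 * rho2 * d + 3.0 * rho3 * d ** 2


def turning_points(rho1, rho2, rho3):
    """Real roots of the slope, sorted; empty list if there are none."""
    a, b, c = 3.0 * rho3, 2.0 * rho2, rho1
    disc = b * b - 4.0 * a * c
    if a == 0.0 or disc < 0.0:
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])


# Hypothetical coefficients (d in ratio units, 1.0 means 100% of GDP).
rho1, rho2, rho3 = -0.12, 0.21, -0.10

low, high = turning_points(rho1, rho2, rho3)
print(f"response turns positive above d = {low:.2f}")
print(f"response turns negative again above d = {high:.2f}")
```

In this example the primary balance responds positively to debt only between the two turning points; the negative slope beyond the upper one is the pattern the text calls fiscal fatigue.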

3 Primary balances/GDP and debt/GDP paths in the advanced economies

3.1 Data

Our empirical study will be based on the dataset assembled by Mauro et al. (2015), updated using IMF’s “World Economic Outlook Database” (WEO). The starting year is 1961, the earliest available for all advanced countries in the Mauro et al. (2015) data, while the final year is 2019. The panel includes 22 countries: G7 (Canada, France, Germany, Italy, Japan, United Kingdom and USA), all the other western and southern European economiesFootnote 3 (Austria, Belgium, Denmark, Finland, Greece, Iceland, Ireland, Netherlands, Norway, Portugal, Spain, Sweden and Switzerland), Australia and New Zealand.Footnote 4

The time-series plots of mean, median, first and third quartiles of the distribution of the pb and d ratios across countries are in Fig. 1, country plots in Figs. 2, 3 and 4, averages for both ratios in Table 1 and country-level statistics for d in Table 2.

In view of the strong synchronisation of the economic cycle across advanced economies (and, for the European countries, of the constraints imposed by the EU fiscal policy rules), we expect strong comovements of both variables across countries. This expectation is confirmed by the visual inspection of the summary plots in Fig. 1 and by the CD cross-dependence test of Pesaran (2015): the test statistics for pb and d are, respectively, 34.1 and 47.5, both rejecting with p-values close to zero the null hypothesis of weak cross-sectional dependence in favour of the alternative hypothesis of strong dependence.
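For readers unfamiliar with the CD statistic, a minimal sketch of the usual balanced-panel formula, \(CD=\sqrt{2T/(N(N-1))}\sum _{i<j}\hat{\rho }_{ij}\), is given below. The three toy series are invented and perfectly correlated by construction, so they only check the arithmetic; in applied work the statistic is typically computed on residuals or suitably transformed series.

```python
import math


def pearson(x, y):
    """Sample correlation between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cd_statistic(panel):
    """CD statistic for a balanced panel given as {unit: series}."""
    names = list(panel)
    n_units = len(names)
    t_obs = len(panel[names[0]])
    total = 0.0
    for i in range(n_units):
        for j in range(i + 1, n_units):
            total += pearson(panel[names[i]], panel[names[j]])
    return math.sqrt(2.0 * t_obs / (n_units * (n_units - 1))) * total


# Toy panel: three units, twelve periods, all pairwise correlations equal 1.
toy_panel = {
    "A": [float(t) for t in range(12)],
    "B": [2.0 * t + 1.0 for t in range(12)],
    "C": [3.0 * t - 5.0 for t in range(12)],
}
cd = cd_statistic(toy_panel)
print(f"CD = {cd:.2f}")
```

Under the null the statistic is compared with standard normal critical values, which is why values such as those reported above lead to a clear rejection.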

Fig. 1

Mean, median, first and third quartiles of primary balances/GDP (pb, left panel) and debt/GDP (d, right panel) in 22 advanced economies, 1961–2019. Countries: Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Italy, Japan, Netherlands, New Zealand, Norway, Portugal, Spain, Sweden, Switzerland, United Kingdom, USA

The second, and crucial, remark suggested by both the summary and the country plots is that until the late 1990s in most countries d appeared to have an upward trend, not shared by pb. The 2008 crisis (this year is marked in the plots by a vertical line) caused a combination of strong increases in expenditure and decline in revenues which produced immediate generalised, drastic falls in pb, often to historically low values, and significant growth of d. After a couple of years pb appeared to recover,Footnote 5 but in most cases d did not reverse its growing trend until the last years of the decade. As a result, at the end of the period the median of d was significantly higher than before the crisis (64.9% in 2019 vs. 57.8% in 2007). This pattern suggests that the fiscal consolidation efforts carried out after about 2010 were generally not large enough to overcome the burden of interest payments and possibly the impact of stock-flow adjustments increasing government debt, such as transactions linked to the bank bailouts.

In fact, stock-flow adjustments can be even more important than pb for d dynamics, see, e.g. Afonso and Jalles (2020). However, since our focus is on the possible existence of a relationship running in the opposite direction, from d to pb, we shall not pursue this issue any further.

The trending behaviour of d suggested by visual inspection is confirmed by standard ADF tests, which indicate that in all countries except Iceland (discussed in Sect. 4.1) d should be formally considered a unit root process (details in Table 10 in “Appendix”).

For countries belonging to the European Union (EU), this raises an interesting question, as the Maastricht Treaty, signed in 1992 (approximately the mid-point of our sample), introduced a 60% upper limit for d as one of the “convergence criteria” to be respected by Member States. The limit was later also included in the 1997 “Stability and Growth Pact”. Strict rules on the adjustment path to be followed in case of breach have also been established. Clearly, non-stationary behaviour is not compatible with a barrier of this type, nor with the adjustment rules. However, what matters in practice is how a rule is enforced, and here there is evidence that enforcement has been loose enough to make non-stationarity fully possible.

To begin with, at the end of 1997, when the European Commission formulated the recommendations for the third stage of the monetary union, d was below the 60% limit only in four countries (France, Luxembourg, Finland and the United Kingdom). In all the other countries (including those where it was more than twice as high, Belgium and Italy), its dynamics were nevertheless considered by the Commission satisfactory enough to grant a positive recommendation (European Commission 1998).

In the following years, the key policy tool on this matter has been the “Excessive Deficit Procedure” (EDP), which, according to Art. 126 of the “Treaty on the Functioning of the European Union”, can be launched by the European Commission against Member States whose d exceeds the limit and is “not diminishing at a satisfactory pace”. However, EU official documents remark that “Enforcement [...] was weak, resulting in serious fiscal imbalances in some EU countries, exposed when the economic and financial crisis struck in 2008.”Footnote 6 Only after the 2011 sovereign debt crisis was “a satisfactory pace” precisely defined to mean that “the gap between a country’s debt level and the 60% reference needs to be reduced by one 20th annually (on average over 3 years)”.Footnote 7 This requires (on average) positive primary surpluses, and it is easily seen to imply a deterministic autoregressive data generating process for d above 60%. Formally, denoting the 60% limit by \(d_{u}\) and ignoring the 3-year smoothing for the sake of simplicity, the rule can be written as

$$\begin{aligned} \Delta \left( d_{t}-d_{u}\right) =-0.05\left( d_{t-1}-d_{u} \right) \end{aligned}$$
(4)

which can be rearranged in the form of a stationary deterministic AR(1) equation in d:

$$\begin{aligned} d_{t}=0.95d_{t-1}+0.05d_{u}. \end{aligned}$$
(5)
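The speed of adjustment implied by equation (5) can be checked with a short simulation. The sketch below iterates the rule from a hypothetical starting ratio of 100% and compares the result with the closed form \(d_{t}-d_{u}=0.95^{t}(d_{0}-d_{u})\); the starting value and the 20-year horizon are assumptions made for the example.

```python
import math


def debt_path(d0, d_u=0.60, speed=0.05, years=20):
    """Iterate d_t = (1 - speed) * d_{t-1} + speed * d_u."""
    path = [d0]
    for _ in range(years):
        path.append((1.0 - speed) * path[-1] + speed * d_u)
    return path


# Hypothetical starting point: debt at 100% of GDP.
path = debt_path(d0=1.00)

# Closed form of the remaining gap after 20 years.
closed_form_gap_20 = (0.95 ** 20) * (1.00 - 0.60)

# Years needed to halve the gap between d and the 60% limit.
half_life = math.log(0.5) / math.log(0.95)

print(f"debt ratio after 20 years: {path[20]:.3f}")
print(f"half-life of the gap: {half_life:.1f} years")
```

Under strict application of the rule the gap to the limit would shrink every year and halve in roughly 13 to 14 years, which is the kind of mean-reverting behaviour a unit root in d rules out.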

In fact, things did not change much even after the introduction of this automatic deterministic rule. Priewe (2020) remarks: “Since 2013 [...] until 2018 the debt level of the 6 high-debt-countries [...] remained on average almost constant, with 3 large countries even increasing their debt levels (France, Italy, Spain)” (p. 14).

Summing up, the empirical evidence of a unit root in d which we found, although in principle at odds with EU rules for \(d>60\%\) since the introduction of the debt convergence criterion in 1992, is in practice explained by their loose enforcement.

Going back to the data, the high heterogeneity of debt dynamics suggested by inspection of the plots in Figs. 2, 3 and 4 is confirmed by the box plots in Fig. 5 and the statistics in Table 2. The median of d over time can be as low as about 20% (Australia, Finland) and as high as nearly 100% (Belgium, Italy). The range of the maximum values is even wider, from about 46% in Australia to 240% in Japan, with Greece second at 185%. To help classify, albeit approximately, the countries according to their debt profiles it may be useful to compare the individual medians and maximums over time with those of the “median country”, defined as a virtual country whose debt/GDP ratio is in each year equal to the median of the distribution of d across the 22 countries of the sample.Footnote 8 In Fig. 6, these appear as the horizontal and vertical lines dividing the (Median d–Max d) space into four quadrants. Applying a very simple rule, we may classify as “high debt” the countries falling in the right-top quadrant, which have both median and maximum greater than those of the median country. These are seven altogether: Belgium, Canada, Greece, Italy, Ireland, Japan and USA. Portugal and United Kingdom, in the left-top quadrant (medians lower than in the median country, but high maximums), may be included in this group as borderline cases. Australia, Denmark, Finland, Germany, Norway, Sweden, Switzerland, all in the left-bottom quadrant, are definitely “low debt”. New Zealand, Austria and the Netherlands, with maximums close to, or below, the threshold and medians higher than the threshold, but still lower than the 60% limit of the Maastricht Treaty, may be considered borderline “low-debt” countries. Finally, France, Iceland and Spain (left-top quadrant) are hard to classify; considering the very low medians, in a binary classification they may be considered more “low debt” than “high debt”.
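The quadrant rule can be written down in a few lines. The sketch below is only a schematic version of it: the reference values for the “median country” and the three example countries are hypothetical placeholders, not the figures of Table 2, and the borderline judgements made in the text are deliberately left out.

```python
def classify(median_d, max_d, ref_median, ref_max):
    """Quadrant rule: compare a country with the 'median country'."""
    if median_d > ref_median and max_d > ref_max:
        return "high debt"      # right-top quadrant
    if median_d < ref_median and max_d < ref_max:
        return "low debt"       # left-bottom quadrant
    return "mixed"              # remaining quadrants, judged case by case


# Hypothetical reference values for the median country (ratio units).
REF_MEDIAN, REF_MAX = 0.50, 0.70

# Hypothetical countries: (median of d, maximum of d).
examples = {"X": (0.90, 1.30), "Y": (0.25, 0.45), "Z": (0.40, 0.95)}

labels = {
    name: classify(med, mx, REF_MEDIAN, REF_MAX)
    for name, (med, mx) in examples.items()
}
print(labels)
```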

Fig. 2

d (solid line, left axis) and pb (dashed line, right axis), 1961–2019. Vertical line at 2008

Fig. 3

d (solid line, left axis) and pb (dashed line, right axis), 1961–2019. Vertical line at 2008

Fig. 4

d (solid line, left axis) and pb (dashed line, right axis), 1961–2019. Vertical line at 2008.

4 Long-run fiscal reaction functions

We now proceed to test for the actual existence of long-run stable FRFs, starting with the more general polynomial (or nonlinear) specification proposed by Ghosh et al. (2013). We will first test for the existence of a long-run polynomial d–pb relationship, and when such a relationship is found to exist, estimate it. For the countries where no polynomial relationship is found to hold, we then carry out an analogous test-estimation cycle for long-run linear d–pb relationships. Note that stationary cyclical variables, irrelevant for conclusions on long-run debt sustainability, will not be considered.

4.1 Searching for polynomial FRFs

With non-stationary variables, estimation of polynomial equations is a challenging task. The problem is that, as well-known (see, for instance, Ermini and Granger 1993) but often overlooked, powers of integrated variables are not difference stationary, for any order of differencing. This implies that the asymptotic results for the usual cointegration tests and estimators for cointegrating regressions, based on the assumption of difference stationary variables, cannot be used for polynomial non-stationary models. We need a completely different set of econometric tools, recently developed by Wagner and coauthors in a series of contributions (Wagner 2015; Wagner and Hong 2016, and the references therein).
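A standard textbook argument, not taken from the references above, makes the first claim concrete for the square of a driftless random walk. Assume \(x_{t}=x_{t-1}+\varepsilon _{t}\) with \(x_{0}=0\) and i.i.d. Gaussian innovations \(\varepsilon _{t}\sim N(0,\sigma ^{2})\); both assumptions are made for illustration only. Then

$$\begin{aligned} \Delta (x_{t}^{2})&=2x_{t-1}\varepsilon _{t}+\varepsilon _{t}^{2},\\ \mathrm {Var}\left[ \Delta (x_{t}^{2})\right]&=4(t-1)\sigma ^{4}+2\sigma ^{4}=(4t-2)\sigma ^{4}. \end{aligned}$$

The variance of the first difference grows linearly with t, so \(x_{t}^{2}\) is not stationary after differencing once, unlike \(x_{t}\) itself.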

The starting point is the concept of a stable polynomial relationship, or cointegrating polynomial regression (CPR), that is, an equation including powers of non-stationary variables and with stationary errors.Footnote 9 Existence of a CPR may be tested using three tests, hereafter “CPR tests”. The first test, \(P_{u}\), is a variance ratio test based on Phillips (1990b) for the null hypothesis of no cointegration. The other two tests use instead as a null hypothesis “cointegration”: the first is a generalisation of the Shin (1994) variance ratio test, hence referred to as “Shin test”, while the second is a LM specification test of the RESET class. The key point to be taken into account in the empirical implementation of these CPR tests is that simulation results in Wagner and Hong (2016) suggest that with our sample size the Shin and LM tests may suffer from low power. No evidence is instead available for the \(P_{u}\) test. To maximise the reliability of the results, we thus considered a joint test, concluding in favour of existence of a FRF if, and only if, the Shin and LM tests do not reject the null of cointegration and the \(P_{u}\) test rejects the null of no cointegration. Further, we choose significance levels for the individual tests yielding a conservative joint test of the FRF hypothesis. More specifically, considering individual significance levels in the traditional [0.01,0.10] range, we fix the significance level at the minimum, 0.01, for the Shin and LM tests (which have the null hypothesis that the FRF is a stable polynomial relationship) and at the maximum, 0.10, for the \(P_{u}\) test (which has the opposite null hypothesis that the FRF is not a stable polynomial relationship). With these individual significance levels, the Bonferroni upper bound of the family-wise error rate (FWER) of the joint Shin/LM/\(P_{u}\) test is 0.12.
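The joint decision rule and the Bonferroni bound quoted above amount to simple arithmetic, sketched below. The function names and the p-values passed to them are hypothetical; the significance levels 0.01, 0.01 and 0.10 are those stated in the text, with the Shin and LM tests having cointegration as the null and \(P_{u}\) having no cointegration as the null.

```python
def bonferroni_fwer_bound(alphas):
    """Upper bound on the family-wise error rate of a set of tests."""
    return min(1.0, sum(alphas))


def frf_supported(p_shin, p_lm, p_pu, alpha_coint=0.01, alpha_pu=0.10):
    """Joint rule: Shin and LM must not reject, P_u must reject."""
    shin_ok = p_shin > alpha_coint   # null of cointegration not rejected
    lm_ok = p_lm > alpha_coint       # null of cointegration not rejected
    pu_ok = p_pu < alpha_pu          # null of no cointegration rejected
    return shin_ok and lm_ok and pu_ok


bound = bonferroni_fwer_bound([0.01, 0.01, 0.10])
print(f"Bonferroni bound: {bound:.2f}")

# Hypothetical p-values for two imaginary countries.
print(frf_supported(p_shin=0.20, p_lm=0.15, p_pu=0.03))   # all three agree
print(frf_supported(p_shin=0.005, p_lm=0.15, p_pu=0.03))  # Shin rejects
```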

Finally, the LM test has been constructed using the fourth power of debt as a test variable, as in Wagner (2015) for the cubic specification of the environmental Kuznets curve, formally identical to our FRF.

Table 1 d and pb, averages 1961–2019
Table 2 Debt/GDP ratio: selected descriptive statistics 1961–2019
Fig. 5

Debt/GDP ratio: box plots 1961–2019. Box limits: first and third quartile; “whiskers”: 1.5\(\times \)interquartile range; horizontal line: median; “+”: mean; dots: outliers. Note that (i) the scale differs across panels; (ii) the countries appear in alphabetical order according to the full names. Country codes: see footnote 4

Fig. 6

Maximum versus median debt/GDP ratios, 1961–2019. Horizontal and vertical lines: values for the “median country” (see footnote 8); right-top quadrant: “high debt”; left-bottom quadrant: “low debt”. Country codes: see footnote 4

Since asymptotic results for the CPR tests are available only for models with right-hand side variables which are non-stationary without a drift (Wagner and Hong 2016, Assumption 1), we first of all need to check the time-series properties of the d series. This implies two steps: first, testing for a unit root; second, for the series found to have a unit root, testing for presence of a drift. The most natural way of carrying out this two-step check is (i) computing an ADF test and (ii), when the unit root is not rejected, testing for the significance of the constant in an ARIMA(p,1,0) model.
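A stripped-down version of this two-step check is sketched below in plain Python. It is a simplification made for illustration: the Dickey–Fuller regression has a constant but no lagged differences (so it is DF rather than ADF), the drift test is a t-test on the mean of the first difference (that is, p = 0), and the critical values -2.93 and 2.0 are approximate 5% values for samples of about 50 observations. The two example series are invented.

```python
import math


def ols_slope_tstat(y, x):
    """t-statistic of b in the regression y = a + b*x + e."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ssr = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return slope / math.sqrt(ssr / (n - 2) / sxx)


def df_tstat(series):
    """Step (i): Dickey-Fuller t-statistic with constant, no lags."""
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    return ols_slope_tstat(diffs, series[:-1])


def drift_tstat(series):
    """Step (ii): t-statistic of the mean of the first difference."""
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((v - mean) ** 2 for v in diffs) / (n - 1)
    return mean / math.sqrt(var / n)


def cpr_applicable(series, df_crit=-2.93, drift_crit=2.0):
    """True if a unit root is not rejected and no drift is detected."""
    unit_root = df_tstat(series) > df_crit
    no_drift = abs(drift_tstat(series)) < drift_crit
    return unit_root and no_drift


# Invented series: one strongly mean reverting, one with a clear drift.
mean_reverting = [0.0, 1.0, 0.2, 0.9, 0.1, 1.1, 0.0, 0.8, 0.3, 1.0]
trending = [0, 1, 4, 5, 8, 9, 12, 13, 16, 17, 20]

print(df_tstat(mean_reverting), drift_tstat(trending))
```

A series passes the check only when both conditions hold; either a rejected unit root or a significant drift excludes it from the CPR tests, as happens for the countries discussed next.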

As anticipated in Sect. 3.1, standard ADF tests (details in Table 10 in “Appendix”) always supported the unit root hypothesis for d except for Iceland. This case will be discussed below. In all the other countries, the tests for the significance of the drift in ARIMA(p,1,0) models suggested that a drift in d can be excluded in all cases except France, Greece, Italy and Japan (details also in Table 10). In these four countries, the CPR tests cannot be used, and we will proceed directly to testing existence of a linear FRF (Sect. 4.2).

As already anticipated, Iceland is the only case in which there is evidence against a unit root in d. This country was hit exceptionally hard by the 2008 recession, with GDP declining by 13% and debt trebling within two years (see OECD 2019). After the financial crisis, d had an extremely wide swing, rising from 29% in 2007 to 99% in 2011, and falling back to 37% in 2019. Under such strong heteroskedasticity, standard tests for a unit root are likely to be invalid (Cavaliere and Taylor 2009), so that the rejection of the unit root may be spurious. However, CPR testing procedures might also be affected. Considering the extremely small size of this economy (population 400,000, GDP in 2019 about 26 billion USD), rather than pursuing a specific, and presumably complex, modelling strategy we preferred to drop it from the analysis.

For the 17 countries which at this point are left in the panel, we compute the joint CPR tests. In spite of our FRF-conservative stance, even counting as favourable to the FRF some borderline cases, the hypothesis of polynomial FRF is supported only in six countries: Austria, Germany, Netherlands, Norway, Portugal and Switzerland (see Table 3). For these six countries, we can proceed to estimate cubic polynomial FRF regressions using the fully modified OLS estimator (Wagner and Hong 2016), which we refer to as FM-CPR. The results, reported in Table 4, are disappointing. In all cases but Germany, the estimates are not significant, suggesting very weak links of the third degree polynomial in the debt/GDP ratio with the primary balance. In these cases, following a “general to specific” strategy we tested a quadratic specification of the FRF. The rationale is that, although the quadratic FRF is nested within the cubic FRF and thus implicitly tested in the set of tests of the latter, we cannot exclude that in small samples, a more parsimonious specification excluding the third powers may yield different results. However, the CPR tests for the quadratic FRF and the FM-CPR estimates of the potentially cointegrating equations never support existence of a quadratic FRF (see Tables 5, 6), confirming Germany as the only country with a polynomial (cubic) FRF. From Fig. 7, we can see that the cubic FRF for this country is monotonically decreasing for levels of d smaller than about 35%, reached in the early 1980s. This appears to be a politically sensitive threshold triggering consolidation efforts, but only up to about 70%. Beyond this level, fiscal fatigue seems to appear, with higher levels of d associated with lower levels of pb.

We now move to the next step, testing for the existence of linear FRFs.

Table 3 Cubic FRFs: cointegrating polynomial regression tests, 1961–2019
Table 4 Cubic FRFs, FM-CPR estimates 1961–2019
Table 5 Quadratic FRFs: cointegrating polynomial regression tests, 1961–2019
Table 6 Quadratic FRFs, FM-CPR estimates 1961–2019
Fig. 7

FRF for Germany

4.2 Searching for linear FRFs

Given the panel set-up of our study, it is now natural to abandon the single country approach and use a panel test for linear cointegration which would grant higher power. However, the test should be chosen carefully. First of all, we need a test robust to the strong cross-country links in both variables (see Sect. 3.1). Second, we must keep in mind that our aim is to establish in which countries, if any, of our panel a linear FRF holds. Specifying the null hypothesis as in the traditional Engle–Granger approach, this means that we are interested in testing the null hypothesis “in no country of the given panel pb and d are linearly cointegrated” against the alternative hypothesis “pb and d are cointegrated in all countries of the given panel”. Considering a sequence of nested panels, we will in this way be able to identify, if it exists, the subset of countries where the FRF holds. Clearly, to ensure rejection of no cointegration if, and only if, all relations in the panel are cointegrating, we need to summarise the individual test statistics in the manner most favourable to the null of no cointegration. For Engle–Granger-type tests, this implies taking their maximum.
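The logic of taking the maximum can be illustrated with a few lines of code. Engle–Granger-type statistics reject no cointegration when they are sufficiently negative, so the panel rejects only if even its least negative statistic does. The sketch below uses hypothetical country statistics and a generic left-tailed bootstrap p-value; it is not an implementation of the bootstrap scheme used in this paper.

```python
def max_stat(stats):
    """Panel statistic: the least negative individual statistic."""
    return max(stats.values())


def bootstrap_pvalue(observed_max, bootstrap_maxima):
    """Share of bootstrap maxima at or below the observed maximum."""
    hits = sum(1 for b in bootstrap_maxima if b <= observed_max)
    return hits / len(bootstrap_maxima)


# Hypothetical individual statistics for three imaginary countries.
stats = {"A": -4.1, "B": -3.2, "C": -1.5}

full_panel = max_stat(stats)  # driven by C, the weakest evidence
nested_panel = max_stat({k: v for k, v in stats.items() if k != "C"})

# Hypothetical bootstrap distribution of the maximum under the null.
boot = [-4.0, -3.5, -3.0, -2.5]
print(full_panel, nested_panel, bootstrap_pvalue(nested_panel, boot))
```

Dropping the unit that gives the weakest evidence and recomputing the maximum is what produces a sequence of nested panels.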

Taking all these requirements into account, the bootstrap test Max(HEG) by Di Iorio and Fachin (2014), robust to short- and long-run dependence across units, appears fully suitable for our purposes. The results are reported in Table 7. We first analyse a panel including all the countries except Iceland, where d was found to be trend stationary, and Germany, where a cubic FRF was found to be a cointegrating relationship. For this panel of 20 countries, the null hypothesis that the linear FRF is not a cointegrating relationship is comfortably not rejected, with a p-value of 0.20 (5000 bootstrap redrawings). However, excluding Italy, which provides the strongest support to the null hypothesis of no cointegration, the p-value falls drastically to 0.04 (“panel 2”, second row). We can thus proceed to the estimation of linear FRFs for all these 19 countries. The results, obtained by FM-OLS (Phillips 1990a) and reported in Table 8, are, however, disappointing: in several cases, the estimated FRF coefficients are negative or, although positive, not significant. Excluding the countries whose coefficients have the wrong sign, we restrict the panel to ten countries, “panel 3” in Table 7. For this smaller panel, the null hypothesis that the linear FRF is not a cointegrating relationship is rejected. In fact, given the very limited sample size the p-value is small enough (0.04) to be considered strong evidence in favour of the FRF. Further restricting the panel to the countries with a positive and significant FRF coefficient, we are left with only five countries (Belgium, Greece, Norway, Portugal and Sweden, “panel 4” in Table 7). In view of the minimum dimension of this panel, the p-value of 0.14 can still be reasonably considered as lending support to the FRF hypothesis.
With the only exception of Norway (where, thanks to the oil revenues, the conditions of public finances are unique: it is the only country where pb has always been positive, reaching a maximum of 20% in 2008), all coefficients are between 0.04 and 0.08, a range consistent with a priori expectations and in line with the literature.Footnote 10

Table 7 Panel linear cointegration tests, 1961–2019
Table 8 Long-run linear FRFs, FM-OLS estimates 1961–2019

4.3 FRF: Where and when?

Looking at the results of our testing and estimation exercise, summarised in the top panel of Table 9, two questions naturally arise: first, what do the countries for which a FRF could be estimated for the period 1961–2019, “FRF countries” for short, have in common? Second, are these results reasonably robust to the estimation sample? For instance, do they continue to hold if we truncate the sample at 2007, excluding the decade after the 2008 financial crisis? We tackle the two questions in turn.

What do the FRF countries have in common? From the point of view of the debt/GDP ratio, these six countries seem to have very little in common: on the basis of the simple descriptive analysis of Sect. 3.1, three of them (Belgium, Greece and, with some caution, Portugal) can be described as “high debt” and three (Norway, Sweden and, with some caution, Germany) as “low debt”. Further, they do not seem to have any special features either. For instance, in the “Median d versus Maximum d” plot (Fig. 6) we find Germany, Belgium and Sweden to be very close, respectively, to Denmark, Italy and Switzerland, where the FRF does not hold. This picture, although it may appear puzzling, is nevertheless consistent with the FRF being a sufficient, but not necessary, condition for sustainability. As remarked by Bohn (2008), the point is the degree of confidence markets have that a country will actually implement all future policies necessary to satisfy the budget constraint. Empirical violations of the sustainable fiscal policy rule defined by (2) are possible if markets expect future policy shifts ensuring that the IBC is nevertheless respected. This said, there is one feature relatively more common among FRF countries, namely EU membership: five out of the 14 EU countries of our panel are FRF countries, while only one out of the eightFootnote 11 non-EU ones is. This is quite consistent with the fact that, as discussed above, since 1992 the EU has been developing a body of rules in order to “ensure that countries in the EU pursue sound public finances”.Footnote 12 Thus, although EU rules have not been enforced strictly enough to rule out non-stationarity in d (see Sect. 3.1), a fiscal policy stance systematically linking primary balance to debt appears to be somewhat more likely in EU countries than in non-EU ones (Tables 10, 11).

Table 9 FRFs: overview 1961–2019 and 1961–2007

Let us now move to the second question: are the results robust over time? In order to answer this question, we repeated the testing and estimation process described in Sects. 4.1 and 4.2 for the period 1961–2007. For obvious reasons of space, we shall not discuss here all the details (available in Tables 12, 13, 14, 15 and 16 in “Appendix”) but proceed directly to some considerations on the final result, the list of 1961–2007 FRF countries, reported in the lower panel of Table 9. Comparing the lists for the two periods, we first of all notice that before 2008 the number of FRF countries was significantly larger, ten instead of six. In five cases, the FRF held over 1961–2007 but collapsed after 2008. According to the classification of Fig. 6, four of these, Canada, Ireland, USA and Italy, are “high debt”, and one, Spain, borderline “low debt”. Thus, a “high-debt” profile is definitely more common among countries where the FRF collapsed after the 2008 crisis. However, we should be careful not to rush to conclusions: in other “high-debt” countries, Belgium, Greece and Portugal, FRFs held in both periods. This is also true for two “low-debt” ones, Norway and Sweden. In Germany (“low debt”), the FRF holds only on the longer time span: this is the only case going against the general tendency of a weakening of the d–pb relationship after the 2008 crisis.

In fact, comparing the 1961–2019 and 1961–2007 estimates of the linear FRFs (respectively, Tables 8, 16) we find that the coefficients are essentially constant, or even larger, for the longer sample in Sweden and Norway (both “low-debt” countries, see Figs. 5, 6 and Table 2) but smaller in Belgium, Greece and Portugal. In general, this is likely to be a consequence of the impact on public finances of the 2008 crisis. In the case of Belgium, the decline of interest rates (see Fig. 9) may also have been important, as with lower interest rates smaller reactions of the primary balance are required to obtain the same amount of debt stabilisation.Footnote 13 In Greece and Portugal, interest rates instead grew considerably in the last part of the sample as a consequence of the 2011 sovereign debt crisis, so that this channel is out of the question.

Finally, the 1961–2007 estimates confirm the scarce support for polynomial specifications, here found also in one case only, Italy. Although formally a cubic, the FRF for this country (Fig. 8) is essentially a slightly asymmetric convex parabola with a minimum at \(d=60\%\), showing no signs of fiscal fatigue. The first part of the curve can be associated with the fiscal policies prevailing from the 1960s to the early 1980s, and the second part with the 60% upper limit for d introduced by the Maastricht Treaty. This leads us back to the remark made above on the possible impact of this limit. Contrary to what was found for the full 1961–2019 sample (when the EU share is definitely higher), for the shorter 1961–2007 period the share of EU and non-EU countries implementing a FRF is essentially the same, respectively, seven out of 14 and three out of seven. On the one hand, this is not surprising. Introduced in 1992, the EU debt limit could influence the policy stances of the EU countries for only one-third of the 1961–2007 sample, as opposed to about one half of the full 1961–2019 sample. On the other hand, it is also interesting, as it suggests that after the 2008 crisis, the effects of the EU efforts to “ensure that countries in the EU pursue sound public finances” certainly did not weaken with respect to the pre-crisis years.Footnote 14
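The comparison of shares is easy to verify from the counts quoted in the text. The short sketch below only restates that arithmetic with exact fractions; the dictionary layout and names are choices made for the example.

```python
from fractions import Fraction

# Counts of FRF countries over panel countries, as quoted in the text.
shares = {
    "1961-2019": {"EU": Fraction(5, 14), "non-EU": Fraction(1, 8)},
    "1961-2007": {"EU": Fraction(7, 14), "non-EU": Fraction(3, 7)},
}

# Gap between the EU and non-EU shares in each sample.
gaps = {
    period: float(s["EU"] - s["non-EU"]) for period, s in shares.items()
}

for period, s in shares.items():
    print(period, f"EU {float(s['EU']):.2f}",
          f"non-EU {float(s['non-EU']):.2f}", f"gap {gaps[period]:.2f}")
```

The gap between the two shares is roughly three times larger in the full sample than in the pre-crisis one, which is the contrast discussed above.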

Fig. 8

FRF for Italy, 1962–2007

Fig. 9

Interest rates on government bonds: Belgium, Greece and Portugal. Vertical lines at 2008; data for Greece and Portugal not available for part of the period (source: IMF, International Financial Statistics)

5 Conclusions

Our aim was to carry out a careful assessment of model-based debt sustainability analysis for a group of 22 advanced economies from the early 1960s until 2019, thus including the decade after the 2008 financial crisis. The main lesson seems to be that the FRF is a deceptively simple model: inadequate techniques may lead to particularly misleading conclusions. In contrast to the widely reported evidence in support of FRFs (e.g. Mauro et al. 2015; Plödt and Reicher 2015; Everaert and Jansen 2018), using a wide range of specifications (cubic, quadratic, linear) and adequate tests and estimators we could find positive evidence in favour of the existence of long-run FRFs only in six countries (Belgium, Germany, Greece, Norway, Portugal and Sweden) out of the 22 of our panel. Further, the evidence for polynomial effects, with some sign of “fiscal fatigue”, is limited to Germany. Before the 2008 crisis, the group of countries implementing FRFs was significantly larger: to the countries listed above, we need to add Canada, Ireland, Italy, Spain and USA (but exclude Germany). Also, in the countries where FRFs hold in both periods, the d–pb link appears to have been stronger before the financial crisis. A point of some interest is that over the entire 1961–2019 period the share of EU countries implementing a FRF, although small, is definitely higher than that of non-EU countries (respectively, five out of 14 and only one out of eight). The two shares are instead approximately the same (about one half) for the period before the financial crisis, 1961–2007.

Summing up, there is no evidence that FRFs are generally implemented in the advanced economies, and this is especially true after the 2008 crisis. These results warn against the widespread practice of estimating homogeneous polynomial panel FRFs, as, for instance, those reported by D’Erasmo et al. (2016).