Consistent flexibility: Enforcement of deficit rules through political incentives

We study the optimal design of a deficit rule in a model in which the government is present-biased, shocks to tax revenues make rule compliance stochastic, and a rule violation reduces the payoff from holding office. We show that: i) the benchmark policy of the social planner can always be implemented via an optimal nonlinear deficit rule and, under certain conditions, even via a linear rule; ii) the optimal rule prescribes a zero structural deficit but only partially accounts for shocks; and iii) a government with a stronger ex-ante deficit bias should be granted a higher degree of flexibility. JEL classification: D02, D72, E6, H2, H6.


Introduction
Fiscal rules are widely used to constrain a government's fiscal policy and aim for moderate levels of budget deficits, debt or expenditure levels (see Davoodi et al. 2022, Budina et al. 2012, Yared 2019); at the same time, however, they should allow for enough flexibility in order to stabilize the economy in the presence of macroeconomic shocks. The optimal design of a fiscal rule that is consistent with both objectives is therefore a key challenge.
More specifically, there are two theoretical questions regarding the design of fiscal rules. The first question is how to optimally balance the benefit of committing the government against overspending versus the benefit of granting it discretion to react to shocks. The second question is how to provide incentives to run moderate budget deficits in a way that minimizes distortions on the fiscal policy composition and, in turn, prevents unintended market inefficiencies and output losses. Although the former question has been analyzed in the recent theoretical literature on fiscal rules (Amador et al. 2006, Halac and Yared 2014, 2022b), the second question has been largely overlooked. In this paper we aim to fill this gap by analyzing what an optimal deficit rule (sometimes also called a balance or budget rule) should look like when i) the political process leads to a deficit bias, ii) the government has full discretion regarding both a distortionary labor income tax and the level of a public good, and iii) monetary punishment mechanisms are absent. More specifically, we ask how restrictive (in terms of a maximum deficit limit) and how flexible (in terms of accommodating shocks to public finances) an optimally designed deficit rule should be.
The motivation to focus on deficit rules, rather than expenditure or revenue rules, is that deficit and, to some lesser extent, debt rules dominate in practice. Our main contribution consists in: i) showing that the maximum deficit limit should be zero and that shocks should be only partially accommodated, as well as ii) elucidating how the postulated tradeoff between discipline of elected politicians and flexibility of the fiscal rule may not exist at all.
These results are derived in a model that has the following three key features. First, a shock to tax revenues makes compliance with a fiscal rule uncertain, as we assume full commitment by politicians to their policy platforms. An adjustment of fiscal policy after the shock is ruled out. This is motivated in two parts. For one, both the submission of budgetary plans by competing politicians and the election of a policymaker take place before the shock is realized. 1 Uncertainty about government revenues and expenditures is a central feature of budgetary planning and forecasting. For another, adjusting fiscal policy after observing the fiscal shock is not always possible because the time to respond may be too short or spending commitments have already been legally implemented. Where adjustment is possible, voters would take it into account in the voting process, leading to less-than-full commitment and reduced incentives to draft prudent budgetary plans. Adjustment of fiscal policy after the shock therefore entails efficiency costs and weaker fiscal discipline, meaning that it may not be ex ante desirable for the regulatory authority and society to allow governments to achieve rule compliance through ex post fiscal policy correction.
The second feature concerns the type of punishment when fiscal rules are not complied with.
Our assumption is that monetary punishments of rule violations are absent. A number of reasons motivate this assumption. Monetary punishments may not be credible. They are either simply wasting resources ex-post or involve a pure transfer of resources from countries with high to those with low marginal utility of the public good. In addition, the punishment for a violation of a non-conditional deficit rule during a recession has a pro-cyclical effect and reduces the policymakers' ability to smooth public consumption over time. This may generate credibility issues, because the fiscal authority may step back on the commitment to punish violations during a recession. In light of these observations, it is not surprising that monetary punishments, such as the fine of up to 0.5% of GDP for violation of the EU's Stability and Growth Pact, have not been used.
Instead we assume that the violation of a fiscal rule leads to a loss in the rent of holding office in the next period, which may discipline politicians. The mechanism requires that stakeholders in the political process and the general public value compliance with fiscal rules, and that the media or other institutions such as fiscal councils make non-compliance public knowledge. The rent loss in our setup can thus be interpreted as a reputational cost, similar to Halac and Yared (2018).
Third, we restrict the designer of the fiscal rule to use solely deficit rules, meaning that a rule consists of a threshold on the realized deficit-to-output ratio. This implies that, for any given value of this ratio, the government has full policy discretion with respect to tax rates and public spending. This assumption is justified by the great prevalence (see Lledó et al. 2017) and the desirable properties of this class of rules relative to available alternatives (Gros and Jahn 2020). At the same time, we allow the elected policymaker to choose a distortionary labor tax (rather than assuming a fixed tax revenue, as in Halac and Yared (2014, 2018)).
The interaction between the deficit and tax choices is crucial for the design and the properties of the optimal deficit rule. Intuitively, with an endogenous tax rate the policy maker has an additional tool to manipulate the output level and thus the probability of non-compliance with the rule.
We analyze whether an optimally designed deficit rule can achieve the outcome a social planner would choose, and we characterize the optimal rule. A deficit rule consists of a function that maps the values of (1) the output and (2) the ratio of the tax shock to output into a maximum level of deficit to output. In order to characterize a deficit rule, we define two measures. The first is tightness, i.e., the level of the maximum (structural) deficit: the highest deficit level allowed under the rule if the tax shock takes its expected value of zero. The second is flexibility, i.e., the degree to which the tax shock modifies the maximum deficit level. The latter measure captures the extent to which fiscal rules accommodate macroeconomic circumstances, one-offs and other observable (and contractible) temporary circumstances. Our definition of rule flexibility draws on the macro-fiscal policy literature 2 and is distinct from the homonymous concept used in the mechanism design literature (Amador et al. 2006, Halac and Yared 2022b) as a synonym for the degree of policy discretion granted to policymakers to accommodate unobservable taste shocks.
Main Results. We derive four main results after initially showing that in our framework the expected budget deficit is rising in the political present bias (Prop. 1) and excessive in the absence of a fiscal rule (Prop. 2). First, the benchmark policy of the social planner can always be implemented via an optimally designed deficit rule even if the policymaker has access to a distortionary labor tax which allows her to influence the probability of rule compliance (Prop. 3). A deficit rule is therefore sufficient to deal with the joint issues of the political distortion and the stochastic nature of the budget process. Yet, the optimal rule is much more complex when a distortionary tax is available than when it is not.
Second, we characterize the class of deficit rules (under the distortionary tax) that implement the benchmark and satisfy some minimal conditions. We find that any such optimal rule prescribes a zero structural deficit (Prop. 4i). The intuition that underpins this result is simple.
Politicians' tax choices affect output by distorting labor supply decisions. The deficit rule is in the form of a threshold on the deficit/output ratio. This implies that the maximum level of deficit allowed by the rule is increasing in output at a rate equal to the value of the threshold itself. Thus, if the latter is set to zero, then there is no impact of output on the probability that a violation of the rule occurs; therefore imposing a zero structural deficit is sufficient to ensure that tax choices are not distorted.
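The step from a ratio threshold to an output-dependent deficit ceiling can be written out in one line. In the sketch below, $d_1$ (the deficit level) and a constant threshold $\kappa$ are symbols introduced here only for illustration, and $y_1$ denotes output.

```latex
\frac{d_1}{y_1} \le \kappa
\quad\Longleftrightarrow\quad
d_1 \le \kappa\, y_1 ,
\qquad
\frac{\partial (\kappa\, y_1)}{\partial y_1} = \kappa .
```

With $\kappa = 0$ the ceiling on $d_1$ is zero at every output level, so a tax-induced change in $y_1$ leaves the ceiling, and hence the compliance probability, unchanged through this channel.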
Moreover, we show that, typically, the optimal rule accounts only partially for the tax shock; that is, the maximum deficit under the rule is the target level minus a fraction lower than one of the tax shock relative to GDP (Prop. 4ii). A full consideration of tax shocks under the target of a balanced structural budget is typically not optimal for one of two reasons: either the marginal cost of increasing public debt becomes too large in terms of the expected cost of rule violation, so that the rule induces a debt level that is too small, or the probability of punishment approaches 1, implying that the politician faces a fixed expected cost of rule violation that does not affect her optimal choices.
Third, any optimal deficit rule prescribes more flexibility to governments that have, ceteris paribus, stronger incentives to run excessive deficits in the first period, as measured by the political present bias due to the neglect of the interest of future generations in the current political process (Prop. 5). The intuition is the following: because the shock is not observed at the moment in which the fiscal policy is chosen in the first period, the policymaker faces a probability of being punished in the next period. The more flexible the rule is, the greater the marginal effect of increasing the planned deficit on the probability of being punished. In other words, a more flexible rule is more effective in disciplining the politician because it implies a stronger link between current fiscal policy and the probability of future punishment. At the opposite extreme of the spectrum, under a very inflexible rule, the marginal effect of increasing [...]
Moreover, our analysis is complementary to that of Halac and Yared (2022b), who also examine the design of optimal fiscal rules under limited enforcement.
They investigate the tradeoff between the benefit of committing the government against overspending versus the benefit of granting it discretion to react to privately observed shocks by shifting government resources from nondistortionary sources across time periods. Conversely, we investigate the tradeoff between the benefit of reducing intergenerational transfers towards current generations due to an excessive public deficit and the cost of generating intratemporal inefficiencies due to distortionary taxation. Thus, in our framework compliance with a deficit rule is linked via the taxation decision to the efficiency of the market outcome.
In terms of results, Halac and Yared (2022b) look at the properties of the optimal fiscal rule and the punishment when the rule is violated. They show that the deficit limit is laxer than in a situation with perfect enforcement of a fiscal rule and that in case of violation the penalty should be maximal, which is in line with other work on optimal contracts in the presence of adverse selection. On the contrary, in our setup, the shock on tax revenues is fully observable and contractible by the regulator, such that the optimal deficit limit varies with its realization.
This feature allows us to characterize the optimal deficit rule (conditional on the realization of the shock) in terms of maximum structural deficit limit (tightness) and degree of responsiveness to shocks to public finances (flexibility).
Our results relate to the design and use of deficit rules in practice. First, the zero structural deficit is in line with those fiscal rules that require a (structurally) balanced budget or that target a balance near to that, such as balanced budget rules in the US (for an analysis see, for example, Asatryan et al. 2018) or the German debt brake. Second, although second-generation fiscal rules account for cyclical fluctuations, and are therefore considered advantageous from an economic perspective, they are often criticized on practical matters, because the output gap is difficult to estimate in real time. Our results indicate that full flexibility is not optimal even when the output gap estimation itself is not an issue. We discuss these and further policy aspects in Section 5.
Outline of Paper. The remainder of the paper is organized as follows. In Section 2 we describe the model and solve for the socially optimal policy in the absence of political economy considerations. In Section 3 we then introduce voting for candidates, which leads to a present bias in government spending. The existence and features of the optimal (linear and nonlinear) deficit rules are considered in Section 4. In Section 5 we discuss our theoretical results in light of the current debate in the EU on the design and flexibility of fiscal rules and present several extensions and robustness results. Section 6 concludes.
[...] and the low compliance of about 50% with the EU's Stability and Growth Pact over two decades (Larch and Santacroce 2020 and Reuter 2019) are indicative. Eyraud et al. (2018) report that lack of compliance is a worldwide problem.

Model
We study a small open economy that lasts for two periods $b = 1, 2$. The population of consumers-voters is a continuum of size 1 in each period. A share $\vartheta_1$ of the population is of type $T = Y$ and cares both about the current period and about the next period, while a share $(1 - \vartheta_1)$ is of type $T = O$ and only cares about the current period. One can think of the two types as "young" vs. "old" voters (an alternative interpretation could be "forward looking" and "myopic" voters). A young voter survives to period 2 with probability $\pi$. Thus, a share $\pi\vartheta_1$ of the population lives for two periods. The political present bias that we introduce later into the model, and that drives our results, is directly linked to this share. Given these assumptions, the individuals born at the beginning of period 2 represent a share $\vartheta_2 = 1 - \pi\vartheta_1$ of the total population in that period.
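As a quick consistency check on these shares, the period-2 population can be added up explicitly. The numerical values in the second line are purely illustrative and are not taken from the paper.

```latex
\underbrace{\pi\vartheta_1}_{\text{survivors}} + \underbrace{\vartheta_2}_{\text{newborns}}
  = \pi\vartheta_1 + (1 - \pi\vartheta_1) = 1 ,
\qquad
\text{e.g. } \vartheta_1 = 0.6,\ \pi = 0.5
  \ \Rightarrow\ \pi\vartheta_1 = 0.3,\ \vartheta_2 = 0.7 .
```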
All individuals work and consume a consumption good and a public good in both periods.
There are no savings. 5 The government collects taxes on labor income and provides public goods in periods 1 and 2. Tax revenues are stochastic in period 1.
At the beginning of period $b = 1$ two candidates run for elections. Each of them fully commits to a policy platform consisting of a linear income tax rate $t_1$ on labor income and a level of planned debt $D_1$. Because the government faces a budget constraint, each platform $(t_1, D_1)$ implies a corresponding level of provision of the public good. The actual level of debt is determined after a shock to tax revenues is realized given the policy package $(t_1, D_1)$ implemented by the winner of the election. At the beginning of period $b = 2$ the same two candidates run for elections. Each of them fully commits to a policy platform consisting of a linear income tax rate on labor income. There is no default, thus all debt must be repaid in period 2, and the public good level follows as a residual. An elected candidate always implements the platform he/she proposes before the elections.
5 In our model with utility being linear in consumption, we can show that allowing for savings and relaxing the small open economy assumption would not change our results, because there is a unique and fixed interest rate that clears the saving market given any potential amount of debt that the government needs to finance. This alternative setup is outlined in Section 5 and described in detail in the online appendix.
A deficit rule can be imposed in period 1, whose violation carries a cost for the government in period 2. The stochastic nature of tax revenues makes compliance with the deficit rule uncertain ex-ante.

Private sector
Consumers in each period $b \in \{1, 2\}$ derive utility from consumption of a private good $c_b$, which is produced using labor as the only input with a linear technology, and of a public good $g_b$. In each period $b \in \{1, 2\}$ individuals supply labor $l_b \in [0, \bar{l}]$ and are compensated at wage rate $w_b > 0$ (equal to their productivity). They face a strictly convex cost of labor $v(l_b)$ with $v''' \geq 0$. The wage at time 2 is assumed to be $w_2 \geq v'(\bar{l})$, which implies that the labor supply in period 2 is fully inelastic.
Income is taxed at a linear rate $t_b$, such that $c_b = (1 - t_b) w_b l_b$. Thus, the within-period utility of any type of consumer for $b \in \{1, 2\}$ is given by [...] where $u$ is strictly concave and satisfies $\lim_{g_b \to 0} u'(g_b) = +\infty$. The lifetime utility of a young household born in period 1 is therefore [...] where $\beta$ is the discount factor. Individuals born in period 2 live for one period only. Thus, the young generation born in period 2 enjoys utility $U(c_2, l_2, g_2)$. Note that the wage rate $w_b$ and the utility cost of labor $v(\cdot)$ are identical across young and old citizens in any given period, as is the quasi-linear utility function. Thus, the two types face the same tradeoff between utility from consumption and cost of labor. As a result, the optimal labor supply is the same across types. Because of that, for ease of notation we denote with $l_b$ the labor supply of a citizen of any type in period $b$.
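One functional form consistent with the verbal description (utility linear in consumption, a convex labor cost $v$, a concave public-good term $u$) is sketched below. It is an assumed reconstruction for illustration, not a quotation of the paper's displayed equations; the second line is the interior labor supply condition that follows when a voter takes $g_b$ as given.

```latex
% Assumed quasi-linear form (illustrative reconstruction)
U(c_b, l_b, g_b) = c_b - v(l_b) + u(g_b), \qquad c_b = (1 - t_b)\, w_b\, l_b ,
\\[4pt]
% Interior labor supply: maximize (1 - t_b) w_b l_b - v(l_b) over l_b
(1 - t_b)\, w_b = v'(l_b) .
```

Under this assumed form the labor supply condition involves only $t_b$, $w_b$ and $v$, none of which depends on the voter's type, which is in line with the statement that young and old citizens choose the same labor supply.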

Government sector
The government faces different decisions over time. In period 1 tax revenue has two components: $t_1 w_1 l_1 + \epsilon$, where $t_1 \in [0, 1]$ is the tax rate. The second component $\epsilon$ is the realization of an independently distributed shock with support $\epsilon \in [-a, a]$, and such that $E[\epsilon] = 0$. Specifically, we assume that the shock on tax revenues $\epsilon$ is distributed as a two-sided symmetrically truncated normal with c.d.f. $F(\epsilon)$: [...] where $\Phi(\cdot)$ is the c.d.f. of the standard normal distribution. The truncation is imposed to avoid problems such as negative public good supplies due to excessively large negative tax shocks.
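A zero-mean normal truncated symmetrically to $[-a, a]$ has the standard closed-form c.d.f. $F(\epsilon) = [\Phi(\epsilon/\sigma) - \Phi(-a/\sigma)] / [\Phi(a/\sigma) - \Phi(-a/\sigma)]$. The sketch below evaluates it numerically. The scale parameter `sigma`, the function names and the numerical values are assumptions made here for illustration; the paper's own parameterization may differ.

```python
import math


def std_normal_cdf(x: float) -> float:
    """C.d.f. of the standard normal distribution, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def truncated_normal_cdf(eps: float, a: float, sigma: float) -> float:
    """C.d.f. of a zero-mean normal with scale sigma, truncated to [-a, a].

    `sigma` is a hypothetical scale parameter used only in this sketch.
    """
    if a <= 0.0 or sigma <= 0.0:
        raise ValueError("a and sigma must be positive")
    if eps <= -a:
        return 0.0
    if eps >= a:
        return 1.0
    lower = std_normal_cdf(-a / sigma)
    upper = std_normal_cdf(a / sigma)
    # Renormalize the probability mass that remains inside [-a, a].
    return (std_normal_cdf(eps / sigma) - lower) / (upper - lower)


if __name__ == "__main__":
    for shock in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(shock, truncated_normal_cdf(shock, a=1.0, sigma=0.5))
```

By symmetry the median is zero, so the c.d.f. at a zero shock is one half, and all probability mass lies inside the truncation bounds.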
The government can borrow from abroad at a fixed interest rate $r$. 6 Let $D^{act}_1$ denote the stock of debt at the end of period 1, after the tax shock has realized. The intended debt level $D_1$ is the one planned prior to the realization of the tax shock. Thus, $D^{act}$ [...] defined as a positive tax revenue shock. In period 1, by assumption the government repays its existing debt inherited from the past, $D_0$. Before the shock is realized, the planned government budget in period 1 must satisfy [...] We assume in the following that the budget constraint holds with equality and write the public consumption good as a function of the tax rate and the intended debt level, $g_1(t_1, D_1)$. 7
6 All the results in Propositions 1-8 hold true in an alternative setup featuring a closed economy with endogenous interest rate, as outlined in Section 5. Full proofs are provided in the online appendix.
7 We impose the constraint $g_1(t_1, D_1) \geq 0$ to avoid negative public consumption, $D_0 \leq D_1 - a/(1 + r)$ to ensure that $g_1(t_1, D_1) \geq 0$ is feasible, and set $g_1 = 0$ for all $(t_1, D_1)$ such that $g_1(t_1, D_1) \leq 0$, if any exists. The assumption $u'(0) = +\infty$ ensures that this constraint is never binding.
The government budget constraint in period 2 is given by: [...] Similarly to period 1, we construct $g_2(t_2, \epsilon)$, using the budget constraint in period 2.
We assume that the value of productivity $w_2$ is large enough to ensure that the repayment of debt in period 2 can always be fully satisfied. Specifically, we impose [...] where $\overline{D}_1$ represents the maximum value of the intended debt in period 1. Moreover, we assume that the choice of planned debt level $D_1$ lies within the range $[\underline{D}_1, \overline{D}_1]$. Lastly, the bounds $\underline{D}_1$, [...]

Normative Benchmark: Social Planner's Problem
For this analysis we introduce a benevolent social planner who can set $D_1$ and $t_1$ optimally in period 1, from which the public good level in period 1 follows immediately from (4). Thus, the planner chooses $t_1$ and $D_1$ to maximize the sum of the utilities of all individuals over both periods, i.e., including the utility of the future generation. He/she discounts the utility of the future generation at rate $\beta$. In period 2, based on the actual debt level of period 1, the planner chooses the labor tax $t_2 \in [0, 1]$ and the public good level $g_2$ to maximize the sum of the utilities of all non-deceased individuals. Thus, the indirect utility of a young or old individual in period 2, which is also equal to the objective function of the social planner, writes: 9 [...]
Recall that labor supply is perfectly inelastic in period 2, which implies in conjunction with the separable utility function that $g_2$ is implicitly defined by the planner's first order condition for utility maximization in period 2, $u'(g_2) = 1$, and is independent of $D_1$. Therefore, for a given actual inherited public debt level from period 1, $D^{act}_1$, the planner's optimal tax rate in period 2 follows from the government budget constraint (5). These considerations allow us to move to the analysis of the planner's period 1 optimization problem (while anticipating the period 2 outcome). 10 Denote with $u^Y_1(t_1, D_1)$ the indirect expected lifetime utility enjoyed by a young voter in period 1 under policy $(t_1, D_1)$, and with $u^O_1(t_1, D_1)$ the one enjoyed by an old voter. The former writes: [...] where expectations are rational given history. The latter is given by: [...] Recall that the social planner maximizes the sum of the discounted utilities of all individuals over both periods. Thus, her objective function writes [...] where $\vartheta_2 = 1 - \pi\vartheta_1$ is the share of young individuals in period 2, as introduced above.
It is easy to show that the social planner's objective function is strictly concave in $(t_1, D_1)$. Substituting the formulas from (6)-(8) for $u^Y_1(t_1, D_1)$, $u^O_1(t_1, D_1)$, and $u^Y_2(t_2, \epsilon)$ into the above, we derive the planner's problem [...] The solution to (10), denoted by $(t^*_1, D^*_1)$, is called the optimal policy. Notice that the social planner's objective function is independent of $\vartheta_1$, $\vartheta_2$, and $\pi$. Rational expectations imply that in period 2 $t_2$ is chosen optimally given $D_1$ and $\epsilon$. Thus, the first order conditions are: [...] where $\eta_1(t_1)$ is the tax elasticity of labor supply at tax rate $t_1$. The assumptions on the function $u$ ensure that the solution of the planner's problem is interior. Condition (12) shows that $g_2$ is independent of $D_1$ (due to exogenous labor supply). Hence the social cost of an increase in $D_1$ by one unit is the discounted value of the repayment of debt and the interest on it.
10 The assumption of perfectly inelastic labor supply in period 2 is solely a matter of convenience. If labor supply in period 2 is not fully inelastic, the equilibrium conditions illustrating the optimal intertemporal allocation of resources change slightly, but the trade-offs underpinning the social planner's choice are qualitatively unchanged.
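The claim that the planner's objective does not depend on $\vartheta_1$, $\vartheta_2$ or $\pi$ can be checked with a short aggregation. The sketch assumes that a young voter weights expected period-2 utility by $\beta\pi$ and that everyone alive in period 2 obtains the same period-2 utility; $U_1$ and $E[U_2]$ are shorthand introduced here for within-period utility in period 1 and expected within-period utility in period 2.

```latex
\vartheta_1 \bigl( U_1 + \beta\pi\, E[U_2] \bigr)
  + (1 - \vartheta_1)\, U_1
  + \beta\, \vartheta_2\, E[U_2]
= U_1 + \beta\, (\pi\vartheta_1 + \vartheta_2)\, E[U_2]
= U_1 + \beta\, E[U_2] ,
```

where the last equality uses $\vartheta_2 = 1 - \pi\vartheta_1$. Under these assumptions the population shares cancel, leaving only the discount factor $\beta$.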

Deficit rule
In section 3 we assume that fiscal policy in any given period is not chosen by a social planner but by a policymaker who won the election in that period. Because policymakers focus on current voters, the well-being of future generations is ignored. This generates a present bias and leads to excessive deficits, against which a deficit rule may be put in place. In the remainder of section 2 we describe the structure of the fiscal rule whose optimal design will be considered in section 3.
A deficit rule $R$ is in place, defined by the real analytic function $R : S \times Y \to \mathbb{R}$. The government is compliant with the rule after the realization of the tax shock if and only if [...] where we have used (4) and the relationship between actual and planned deficit. Given a rule $R$, we define a threshold 11 of the shock on tax revenues $\bar{\epsilon}(t_1, D_1, y_1 \mid R)$, based on (13), below which the politician gets punished, as the one that solves: [...]
11 Note that, without further restrictions, the threshold $\bar{\epsilon}(t_1, D_1, y_1 \mid R)$ may not be unique given $R$. However, a unique $\bar{\epsilon}(t_1, D_1, y_1 \mid R)$ exists as long as the rule flexibility (as defined in (16)) is less than 1, which happens to be the case for any rule that is optimal. See Proposition 4.
Our setup presumes that the realization of the tax shock is fully observable and contractible [...] (2021), who argue that the availability of resources to a government can be measured, so it is possible to write contracts contingent on it. In practice, policymakers may have some room for manipulating the data if they have superior information. However, the presence of fiscal watchdogs and the widespread endorsement of a government's fiscal and economic projections by other, independent institutions make systematic manipulation unlikely. More generally, we believe that a noisy signal of the realization of the tax shock would not change our subsequent results, as long as voters and the rule designer take the signals rationally into account.
For any given rule $R$ we define the following concepts:
1. The tightness is the level of the rule $R$ at $s_1 = 0$, that is, in a "normal" situation where the shock is zero, i.e. $K(y_1 \mid R) \equiv R(0, y_1)$.
2. The flexibility is the marginal effect of a decrease in the shock-to-output ratio $s_1$ on the level of the rule $R$ evaluated at $s_1 = \bar{s}_1 \equiv \bar{\epsilon}_1(t_1, D_1, y_1 \mid R)/y_1$, 12 i.e. $\Delta(t_1, D_1, y_1 \mid R) \equiv -\partial R(s_1, y_1)/\partial s_1 \big|_{s_1 = \bar{s}_1}$.
An interesting case, also considered below in detail, is the one represented by a linear rule of the form $R(s_1, y_1) = \kappa - \delta s_1$ for parameters $\kappa \in [0, \bar{\kappa}]$ and $\delta \in \mathbb{R}$. In such a case the government is compliant with the rule if and only if: [...] Notice that in the case of a linear rule tightness and flexibility are equal to the values of the parameters $\kappa$ and $\delta$, respectively. Specifically, $K(y_1 \mid R) = \kappa$ and $\Delta(t_1, D_1, y_1 \mid R) = \delta$.
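To see how a linear rule turns a planned deficit into a compliance threshold, consider the following minimal sketch. It rests on a simplifying assumption that is not taken from the paper's displayed equations: the realized deficit-to-output ratio equals a planned ratio minus the shock-to-output ratio $s_1$. All function and argument names are hypothetical.

```python
def threshold_shock_ratio(planned_deficit_ratio: float, kappa: float, delta: float) -> float:
    """Shock-to-output ratio below which a linear rule kappa - delta * s1 is violated.

    Assumes realized deficit ratio = planned_deficit_ratio - s1, so compliance
    requires planned_deficit_ratio - s1 <= kappa - delta * s1.
    """
    if delta >= 1.0:
        raise ValueError("a unique threshold requires delta < 1")
    return (planned_deficit_ratio - kappa) / (1.0 - delta)


def is_compliant(planned_deficit_ratio: float, s1: float, kappa: float, delta: float) -> bool:
    """Check the linear rule for a given realized shock-to-output ratio s1."""
    realized_deficit_ratio = planned_deficit_ratio - s1  # illustrative assumption
    return realized_deficit_ratio <= kappa - delta * s1


def threshold_sensitivity(delta: float) -> float:
    """Change in the threshold per unit increase in the planned deficit ratio."""
    if delta >= 1.0:
        raise ValueError("a unique threshold requires delta < 1")
    return 1.0 / (1.0 - delta)


if __name__ == "__main__":
    # Zero tightness (kappa = 0) and partial flexibility (delta = 0.5).
    print(threshold_shock_ratio(0.02, kappa=0.0, delta=0.5))
    print(is_compliant(0.02, s1=0.05, kappa=0.0, delta=0.5))
    print(threshold_sensitivity(0.2), threshold_sensitivity(0.5))
```

Under this assumption the threshold moves by $1/(1-\delta)$ for each unit of additional planned deficit, so a larger $\delta$ (below one) ties the violation probability more tightly to the planned deficit.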
12 A more general definition of flexibility should consider the value of $-\partial R(s_1, y_1)/\partial s_1$ at all possible values of the shock-to-output ratio $s_1$. In order to pin down a unique measure, it is natural to evaluate the derivative at $\bar{s}_1$, which is the value of $s_1$ at which flexibility does matter to determine the principal's punishment decision, and in turn the agents' choices.
We now turn to a positive model of fiscal policy choices. In each period two candidates compete for the support of voters, and the elected winner implements her policy platform. We use a probabilistic voting model in the tradition of Lindbeck and Weibull (1987) and Banks and Duggan (2005). The equilibrium concept is Subgame-Perfect Nash Equilibrium.

Timing of events and choices
At the beginning of period 1 two candidates denoted by superscript I ∈ {A, B} run for elections.
Each of them fully commits to a policy platform $(t^I_1, D^I_1)$ consisting of a linear income tax rate and a level of planned debt. The (planned) level of public good follows from this policy proposal via the government budget constraint (4). The winner of the election implements her proposed platform. Voters observe the policy and choose their labor supply $l_1$ and consumption $c_1$. The government collects labor taxes and provides a public good $g_1$. At the end of period 1 a shock on tax revenues is realized and it is publicly observable. This realization determines the actual level of debt accumulated, $D^{act}_1$. At the beginning of the second period a new election takes place between the same two candidates. Each of them fully commits to a policy platform consisting solely of a linear income tax rate $t^I_2$, which via the government budget constraint defines public consumption, as there is by assumption no tax shock and no new borrowing in the second period. Then, if a deficit rule is in place, a supranational authority or an independent fiscal institution verifies whether a violation of the rule has occurred in period 1 and, if so, imposes a punishment on the politician in power.
Thus, the punishment is a cost imposed on the policymaker regardless of who was in power in the previous period. 13 The winner of the elections implements her proposed platform. The government collects taxes, provides a public good g 2 , and repays debt.
13 In this sense, the punishment affects the reputation of representative government institutions and not only that of an individual politician. Such a reputational loss typically translates into a relative empowerment of unelected officials within the administration and of competing institutions (e.g., expansion of the judicial power) and, in turn, into a lower capacity of elected politicians, whether previous incumbent of the post or not, to extract rent from holding office. Making the level of punishment conditional on the identity of the previous policymaker is left for future research.
As mentioned in the introduction, our setup is one of full commitment. Politicians are elected in period 1 on the basis of their policy platform, which is implemented once a person is voted into office. This is a reasonable assumption if the time to respond after the shock is too short.
Moreover, if policy adjustments were feasible ex-post voters would need to take this into account ex-ante.
The politician that holds the office in period $b \in \{1, 2\}$ enjoys an exogenous rent $W_b$. If a violation of the fiscal rule has occurred in period 1, then the rent enjoyed by the politician that holds the office in period 2, whether incumbent or not, is reduced by an exogenous amount $C < W_2$. 14 In period 2 politicians take as given the actual debt level inherited from period 1, and choose the tax rate to maximize the expected rent from office in that period. Conversely, in period 1, each politician wishes to maximize the weighted expected return of being in office in the two periods. For example, politician $A$ in period 1 maximizes [...] where $win^A_b$ denotes the event corresponding to a victory of candidate $A$ in the election at time $b$, and $nc$ denotes the event of non-compliance with the fiscal rule in period 2.
The outcome of elections is probabilistic and shaped by voters' preferences.
In each period, each voter casts her vote for candidate $A$ if the utility difference from electing $A$ vs. $B$, conditional on the platforms proposed by both candidates, is positive. The utility difference depends upon a deterministic and a stochastic component. Recall that $u^T_1(t_1, D_1)$ represents the indirect expected lifetime utility enjoyed by a type $T$ voter under policy $(t_1, D_1)$. The deterministic part consists of the difference between the utility induced by the policy platforms that each politician has proposed, i.e., $u^T_1(t^A_1, D^A_1) - u^T_1(t^B_1, D^B_1)$. The stochastic part is [...] Similarly, in period 2 a voter of type $T \in \{Y, O\}$ casts her vote for candidate $A$ if and only if [...]
14 Our assumption is consistent with the remit of several independent fiscal institutions, which often have little formal power but influence government fiscal policy by publicly exposing a government that is in danger of violating fiscal rules or that uses overoptimistic forecasts in its budgetary planning (Calmfors and Wren-Lewis 2011, Beetsma and Debrun 2016, Beetsma et al. 2018). Such an enforcement mechanism has the advantage that the problem of the credibility of the commitment to punish is typically less severe, at least as long as the enforcer of the fiscal rule is independent of the government. Moreover, because the punishment affects politicians rather than citizens, this mechanism is less prone to induce direct pro-cyclical and/or distributive fiscal effects (see Beetsma [...]

Voting Equilibrium and Equivalent Problem
It is well known that in a large class of probabilistic voting models the equilibrium policy outcome corresponds to the platform that maximizes a weighted average of the voters' expected utilities. In our setup the corresponding objective also features an expected punishment term, in which the equivalent reputational cost $C^e$ captures the expected reputational cost that the politician must face in period 2 as a consequence of the punishment that is imposed if a violation of the deficit rule $R$ occurs, and $\Pr(nc \mid (t_1, D_1), R)$ is the probability that the rule is violated ("nc" stands for non-compliance) given the policy implemented in period 1. The equivalent reputational cost $C^e$ is itself a function of the exogenous rent loss $C$ and of the endogenous probability of reelection faced by each politician.
The probabilistic nature of the voting process, together with the presence of the fiscal rule, implies that the candidates' equilibrium platforms are identical to the policy that a partially benevolent social planner would choose. The policy of this agent differs from a social planner's (see formula (9)) because of the possible cost of violating the fiscal rule and the lack of accounting for future young generations. We will refer to this fictive agent as the representative politician or, more simply, the politician.
15 The requirements are that both the variance $\sigma_\nu^2$ of the distribution of the voters' common taste shock and the rent $W_1$ from being in office in period 1 are sufficiently large. If these two conditions are satisfied, then the equivalence between the outcome of the electoral game and the choice of the representative politician holds true as long as the objective function of the representative politician is strictly concave. Details in Appendix A.
16 When no rule is in place, such equilibrium is also the unique equilibrium of the electoral game. Otherwise, multiplicity is possible. Details in the online appendix.
Similarly, in period 2, the equilibrium platform maximizes the weighted expected utility of period 2's voters. Formal proofs of these results are provided in Appendix A.
Using the formulas for the voters' indirect utilities, and abstracting from the parts that do not affect the optimal outcome, one can rewrite the politician's problem in a form that is directly comparable with the social planner's problem in formula (10). It is immediately evident that the two objective functions are identical, except for two aspects. First, the politician discounts future utility at rate $\beta\pi\vartheta_1$, while the social planner does so at rate $\beta$ only. Because of that, we call $B_1 = 1 - \pi\vartheta_1$ the political present bias, which is decreasing in the probability of survival and the size of the young cohort in period 1. Second, the politician's objective function includes a cost $C^e$ to be paid if the fiscal rule is violated.
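The definition of the present bias can be unpacked in a short worked calculation. The block below only rearranges the two discount rates stated above; the numerical values of $\pi$ and $\vartheta_1$ are illustrative assumptions and do not come from the paper.

```latex
% Requires amsmath. Illustrative values (assumed, not from the paper):
% \pi = 0.9 and \vartheta_1 = 0.5.
\begin{align*}
\underbrace{\beta \pi \vartheta_1}_{\text{politician's discount rate}}
  &= \beta\,(1 - B_1), \qquad B_1 \equiv 1 - \pi\vartheta_1, \\
\frac{\partial B_1}{\partial \pi} &= -\vartheta_1 < 0, \qquad
\frac{\partial B_1}{\partial \vartheta_1} = -\pi < 0, \\
\pi = 0.9,\ \vartheta_1 = 0.5 \;&\Longrightarrow\; B_1 = 1 - 0.45 = 0.55 .
\end{align*}
```

In this example the politician weights period 2 utility by $0.45\beta$ rather than $\beta$, and the bias vanishes only in the limit $\pi\vartheta_1 \to 1$.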
Before we turn to the characterization of the main results in section 4 on the optimal design of a fiscal rule it is useful to understand the effect of the political bias on fiscal policy. The first result holds both in the presence of a fiscal rule and with no fiscal rule.  The next result follows from Proposition 1 and implies that without a fiscal rule the political process leads to an inefficient outcome, because the voters, on average, do not care about the future as much as a benevolent social planner does.

Proposition 2.
In the absence of a fiscal rule the equilibrium level of deficit in period 1 is weakly larger than the optimal level.
Proof. See Appendix B.
An implication of Proposition 2 is that in the presence of a present bias the period 1 tax rate is lower than the socially optimal one.

The Design of the Fiscal Rule
In this section we characterize the optimal fiscal rule when fiscal policy is chosen via the described political process. Proposition 2 gives room for a fiscal rule to improve the outcome. However, it is far from clear whether a fiscal rule can implement the optimal allocation that would be induced by a social planner who chooses the tax rate and the debt level in period 1 directly (which we denoted by t * 1 , D * 1 ). In period 2 there is no political bias, thus the politician's policy choice is the same as the one of the social planner. But the period 2 choice is affected by the level of debt accumulated in period 1, thus it is typically different from the one that would prevail if the social planner had chosen the policy in period 1. In this sense, the equilibrium policy in period 1 spills over into period 2, even though there is no further shock in that period, a fiscal rule does not need to be satisfied, and taxation is non-distortionary because the labor supply is perfectly inelastic.
Before moving to the main results, we first define implementation via a deficit rule and then define two desirable properties of a deficit rule R. From the perspective of the principal a fiscal rule is optimal if it induces the agents to implement the same fiscal policy that the planner would choose if he/she could dictate his/her preferred policy in period 1. This concept is formally defined as follows.
Definition 1. A rule $R$ is said to implement the optimal policy $(t_1^*, D_1^*)$ for a given level of political present bias $B_1$ if there exists an SPNE of the electoral game such that the unique policy platform optimally chosen by both candidates in period 1 in the presence of such a rule is $(t_1^*, D_1^*)$.
Definition 1 clarifies that we adopt a weak concept of implementation which allows for the possibility of multiple equilibria of the electoral game. 17
Condition (TCO) states that the level of structural deficit to output prescribed by the fiscal rule should not vary with the per capita income level. (TCO) is trivially satisfied by any rule that is output-independent; i.e., such that $\partial R(s_1, y_1)/\partial y_1 = 0$. This property is deemed desirable whenever output is a variable that can potentially be misrepresented or manipulated by the politician. Moreover, it restricts attention to a class of rules that is arguably superior in terms of ease of adoption and implementation. For instance, output typically exhibits a positive time trend. Thus, a rule whose level depends on output is going to prescribe a different level of structural deficit over time. Lastly, (TCO) is satisfied by a large class of widely adopted deficit rules. For example, the linear rule described in section 2.4 trivially satisfies this condition because $R(0, y_1) = \kappa$, which is constant in $y_1$. This type of rule is further analyzed in sections 4.2-4.3.
Condition (LCE) requires (constrained) efficiency to hold locally, with the only admissible distortions being those that are unavoidable due to the use of distortionary taxation to collect revenue, regardless of the weights assigned to present and future citizens' utility by the representative politician relative to the benevolent social planner (a feature captured by the political present bias $B_1$).
That is, (constrained) efficiency in the consumption of private and public goods in period 1 should be ensured, at least locally, irrespective of the debt level that is optimal with respect to the specific social welfare function that a benevolent regulatory authority aims to maximize.
Lastly, in our setup (LCE) is satisfied by several widely adopted deficit rules. For example, the linear rule described in section 2.4 satisfies this condition if the tightness parameter $\kappa$ is set equal to zero; i.e., whenever the linear rule prescribes zero structural deficit.

Characterization of the Optimal Deficit Rule
Our first main result establishes that the optimal policy is implementable.

Proposition 3.
A fiscal rule R that implements the optimal policy (t * 1 , D * 1 ) and that satisfies conditions (TCO) and (LCE) always exists.
Proof. See Appendix B.
Proposition 3 ensures implementability. Now we characterize the family of rules that are optimal in the sense of Definition 1. We focus on the class of rules that satisfy the property stated in Definition 2.
Proposition 4. If a deficit rule $R$ satisfies (TCO) and (LCE) and implements the optimal policy $(t_1^*, D_1^*)$, then (i) the rule prescribes zero structural deficit, and (ii) the flexibility of the rule $\Delta(t_1^*, D_1^*, y_1^* \mid R)$ is lower than 1; i.e., the rule does not fully account for tax shocks.
Proof. See Appendix B.
The next step consists in studying the comparative statics of the optimal rule. Specifically, we are interested in studying how the optimal degree of flexibility of the deficit rule responds to a marginal increase in the political present bias. In order to perform this exercise we need to impose additional structure because the optimal deficit rule is typically not unique for any given level of political present bias $B_1$. Thus, we need a notion of monotonicity that accounts for this kind of multiplicity. We establish a criterion to compare the flexibility of any of the (possibly many) rules that are optimal at bias $B_1 = B_1'$ with that of any rule that is optimal at bias $B_1''$.
Informally, our approach consists in constructing all the possible parametric families that include $R'$ and that possess one (or more) members that implement the optimal policy at bias level $B_1''$. Then we evaluate the flexibility of any rule that is optimal at $B_1''$ and that is a member of one of those families. If the flexibility of all such rules is weakly higher than that of $R'$, and this result holds true for all rules $R'$ that are optimal at $B_1 = B_1'$, then we say that the flexibility of the optimal rule is weakly increasing in the political present bias $B_1$ in a neighborhood of $B_1 = B_1'$.
Formally, consider a family of rules $\rho_r$ defined by the real analytic function $r : S \times Y \times Z_r \to \mathbb{R}$ with $Z_r = [\underline{\zeta}_r, \overline{\zeta}_r]$. A rule $R$ is said to be a member of family $\rho_r$ (written $R \in \rho_r$) if there exists $\zeta \in Z_r$ such that $R(\cdot, \cdot) = r(\cdot, \cdot; \zeta)$. Lastly, let $\zeta^*(B_1)$ denote the value of $\zeta$ such that a rule $R$ with $R(\cdot, \cdot) = r(\cdot, \cdot; \zeta^*(B_1))$ implements the optimal policy $(t_1^*, D_1^*)$ given bias $B_1$. It is easy to show that a family $\rho_r$ such that $R \in \rho_r$ can be constructed for any possible rule $R$.
19 Moreover, it can be shown that, under mild restrictions, if $r(\cdot, \cdot; \zeta^*(B_1'))$ implements the optimal policy at $B_1 = B_1'$ for a family $\rho_r$, then the family also implements the optimal policy for all values of $B_1$ in a neighborhood of $B_1'$. 20 In such a case, we say that $r(\cdot, \cdot; \zeta^*(B_1))$ implements the optimal policy in a neighborhood of $B_1 = B_1'$. Our notion of monotonicity is then defined over all families $\rho_r$ such that $R \in \rho_r$ and such that for any value of $B_1$ in a neighborhood of $B_1 = B_1'$ there exists $\zeta \in Z_r$ such that $r(\cdot, \cdot; \zeta)$ implements the optimal policy $(t_1^*, D_1^*)$.
This definition delivers a very general notion of monotonicity. Namely, it applies to all possible families of rules that include R, and that ensure the implementation of the optimal policy within a neighborhood of B 1 = B ′ 1 . Using this notion of monotonicity, we can state the next main result of this paper.
Proposition 5. There exists a finite $\varsigma > 0$ such that if the variance of the tax shock is sufficiently large, $\sigma_\epsilon^2 \geq \varsigma^2$, then the flexibility of the optimal rule $\Delta(t_1^*, D_1^*, y_1 \mid R^*_{B_1})$ is weakly increasing in the political present bias $B_1$.

Proof. See Appendix B.
A more present-biased government requires, ceteris paribus, a larger reduction in the level of intended deficit in order to achieve the socially desirable outcome. Proposition 5 implies that such larger deficit reduction can be achieved through a more flexible deficit rule. Thus, it suggests that flexibility may actually encourage fiscal discipline, rather than jeopardizing it.
The intuition is the following. A more flexible fiscal rule reduces the weight of the shock on tax revenues in determining the probability of punishment, and increases the weight of the actual policy choices made by the politician. Therefore the marginal effect of running a larger expected deficit on the probability of punishment typically increases with the degree of flexibility. This holds true whenever the distribution of the tax shock is sufficiently "flat", i.e. if $\sigma_\epsilon^2$ is large enough. As a result, a more flexible fiscal rule tends to be more effective in disciplining the politician. Therefore a trade-off between fiscal discipline and flexibility may not always exist.
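The intuition can be traced in a stylized calculation. The sketch below is an illustration only and is not taken from the paper's proofs: it assumes the linear rule $\kappa - \delta s_1$ of section 2.4 with $\delta < 1$, a realized deficit-to-output ratio equal to an intended ratio $d$ minus the revenue shock $s_1$, and an untruncated normal shock. The symbols $d$, $z$, $\Phi$ and $\phi$ (the standard normal distribution and density functions) are notation introduced here.

```latex
% Requires amsmath. Stylized assumptions (not from the paper's proofs):
%   realized deficit/output = d - s_1, with d the intended ratio;
%   s_1 ~ N(0, \sigma_\epsilon^2), untruncated; rule \kappa - \delta s_1, \delta < 1.
\begin{align*}
nc \;&\iff\; d - s_1 > \kappa - \delta s_1
     \;\iff\; s_1 < \frac{d - \kappa}{1 - \delta}, \\
\Pr(nc) &= \Phi(z), \qquad
  z \equiv \frac{d - \kappa}{(1 - \delta)\,\sigma_\epsilon}, \\
\frac{\partial \Pr(nc)}{\partial d}
  &= \frac{\phi(z)}{(1 - \delta)\,\sigma_\epsilon}, \\
\frac{\partial}{\partial \delta}\,\frac{\partial \Pr(nc)}{\partial d}
  &= \frac{\phi(z)\,\bigl(1 - z^2\bigr)}{(1 - \delta)^2\,\sigma_\epsilon} > 0
  \;\iff\; |d - \kappa| < (1 - \delta)\,\sigma_\epsilon .
\end{align*}
```

Under these assumptions the marginal effect of the intended deficit on the probability of punishment rises with $\delta$ whenever $\sigma_\epsilon$ is large relative to $|d - \kappa|$, which mirrors the "flat" distribution condition. With a truncated-normal shock the expressions would differ in detail.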
Although results in this section are reassuring, one might be concerned that rules in practice are not complex enough to implement the socially optimal solution, with the possible consequence that the result on flexibility may no longer hold. We therefore turn to the case of a linear rule that appears to be much closer to actual fiscal rules.

Optimal Linear Rule
Consider a linear rule in the form $R(s_1, y_1) = \kappa - \delta s_1$, with $\kappa \in [0, \bar{\kappa}]$, $\delta \in \mathbb{R}$, as introduced in section 2.4. Given this rule, the government is compliant if $\frac{\text{deficit}_1}{\text{output}_1} \leq \kappa - \delta s_1$. Proposition 2 gives room for a fiscal rule to improve the outcome compared to no rule. However, a linear fiscal rule may not always be able to implement the optimal allocation $(t_1^*, D_1^*)$. Two problems may arise. Firstly, a linear rule may cause the politician's objective function to be non-concave, even if the social planner's objective function is concave. 21 Secondly, even if the rule can improve the outcome, it may not be able to achieve the optimal policy within the range of admissible parameter values. 22 Nevertheless, under certain conditions on the parameters of the model both these problems can be resolved. This finding is formalized in the following statement.
Proposition 6. The optimal policy is implementable via a linear rule (with appropriately chosen parameters $\kappa$, $\delta$) if the tax shock has enough variance: $\sigma_\epsilon^2 \geq \bar{\sigma}_\epsilon^2$ for some $\bar{\sigma}_\epsilon^2 \in (0, \infty)$.
Proof. See Appendix C.
For this purpose, we define a threshold $\bar{\delta} = 1 - \ldots$
Corollary 7. If the optimal policy $(t_1^*, D_1^*)$ is implementable by a linear rule $R = \kappa - \delta s_1$ for all $B_1$ within an interval $(B_1', B_1'')$, then: (i) the implementation occurs at $\kappa^* = 0$ and $\delta^* \leq \bar{\delta}$; …
It should be noted that any linear rule featuring $\kappa = 0$ is also deemed desirable on the grounds of efficiency considerations. Specifically, any such rule satisfies not only (LCE), but also global constrained-Pareto efficiency (GCE), meaning that it induces a constrained-Pareto efficient allocation for any possible value of the political present bias $B_1 \in (0, 1)$. The latter property provides a further rationale for the adoption of linear rules, given that the exact quantification of the political present bias may often be a conceptually and empirically difficult exercise.
21 The conditions for concavity to hold in the presence of a linear rule are non-trivial. If the tax shock is truncated-normally distributed, they require the variance of the distribution to be sufficiently large relative to the maximum deficit $\bar{D}_1 - D_0$. See Appendix C for details.
22 The linear rule implies a marginal probability of non-compliance which is typically strictly positive at all flexibility levels, as shown in Figure 2. Thus, if the political present bias and the variance of the tax shock are small, the rule may provide the politician with an excessive incentive to reduce public spending in order to avoid a potential punishment at all admissible values of the parameter $\delta$. In this case, the linear rule induces a policy platform featuring a suboptimally low level of intended debt. Note that this issue is a direct consequence of the assumption that the rent loss faced by the politician in case of rule violation is exogenous. If the regulatory authority is allowed to choose the size of such rent loss, a linear rule that implements the optimal policy always exists.
23 Note that the optimal deficit rule may feature negative flexibility, i.e. $\delta^* < 0$. This outcome corresponds to the case in which the cost of violating the rule $C^e$ is very large, such that the representative politician's optimal policy features a suboptimally low level of deficit for any rule with $\kappa = 0$ and $\delta \in [0, \bar{\delta})$. In other words, the rule provides too much fiscal discipline. As such, this case is unlikely to occur within the range of empirically relevant values of the parameters of the model. Moreover, we can show that $\delta^* \geq 0$ if $W_1$ is sufficiently large. The proof of this result is provided in the online appendix.

Linear rule: Non-Implementable Case
Suppose that the optimal policy described in Corollary 7 is not implementable through a linear fiscal rule $(0, \delta^*)$, because the condition $\sigma_\epsilon^2 \geq \bar{\sigma}_\epsilon^2$ that ensures implementability is not satisfied. In this case, we cannot rule out that the best linear rule that is consistent with the permissible parameter space $\kappa \in [0, \bar{\kappa}]$ could be worse than not having a linear fiscal rule at all, and hence would not be optimal. In the following, however, we characterize the optimal linear rule that maximizes the social planner's utility, assuming that it strictly improves with respect to the case in which no rule is in place. This allows us to get insights about the direction of the tightness and flexibility parameters when the optimal policy cannot be implemented but a fiscal rule still improves on the no-rule outcome.
Using the truncated-normal distribution it is easy to show that $\bar{\delta}(\cdot)$ … We recursively define $\delta_{\max}$ as $\delta_{\max} \equiv \bar{\delta}(D_1^{**})$, which satisfies $\delta_{\max} < 1$. Using this definition we can state the following result.
Proof. See Appendix C.
Proposition 8 provides a characterization that is far from complete. Yet, it delivers an important insight regarding the intuition that underpins the main results of this paper. If the conditions in Corollary 7 are not satisfied, it is not possible to induce sufficient fiscal discipline using flexibility only. Thus, the regulator can improve social welfare by manipulating the tightness of the fiscal rule κ. Whether this is optimal or not depends on the tradeoff between the benefit of reducing intertemporal distortions and the cost of generating intratemporal inefficiencies in labor supply decisions. The optimal tightness in this case depends upon the tax elasticity of labor supply.
If $\eta_1$ is large in magnitude, the distortions on labor supply are substantial. Thus, the optimal deficit rule prescribes $\kappa^* = 0$, even if this implies a suboptimal intertemporal allocation of resources. Conversely, if the labor supply is sufficiently inelastic, the regulator optimally allows for some intratemporal inefficiency in order to achieve stronger fiscal discipline. 24
24 Note that whenever $\kappa^* \neq 0$ the linear rule does not satisfy (LCE). As a consequence, the regulator is allowing for a pure efficiency loss in order to achieve some intertemporal redistribution from the current to the future generation.

Discussion of Model and Extensions
Our results depend on a number of simplifying assumptions. Firstly, our analysis abstracts from the possibility of asymmetric information between the regulatory authority, the politicians, and the voters. Our main results in Section 4.1 are fully robust to the introduction of asymmetric information between the regulatory authority and the politicians regarding the realization of a shock on the value of public spending, as in Halac and Yared (2014, 2022b).
Full details on this extension of our framework are provided in the online appendix (available at https://drive.google.com/file/d/1b7GlSbTxPGUQyiUd2SWE9BTy0CFAhOCN/view?usp=sharing).
In particular, we show that the optimally designed deficit rule implements the optimal policy regardless of the specific realization of the taste shock and that all the results in Propositions 1-5 carry over in this modified setup. However, other forms of asymmetric information may affect the predictions of our analysis. For instance, our results imply that the degree of flexibility of the optimal deficit rule depends upon the level of average present bias within the population of voters. Thus, if such level is not perfectly observable by the principal, then agents may have an incentive to misrepresent the extent of the bias in order to obtain a more favorable deficit rule. Although this is a theoretical possibility, we believe it is unlikely to be a key issue for the purposes of our application, because regulatory authorities are typically independent from the executive power. Thus, it is reasonable to assume that the principal infers the level of B 1 directly from the citizens' observable characteristics and behavior, such as sociodemographic composition and past electoral choices, rather than from information reported by the politicians in power. Voters are less likely to manipulate the design of the rule than politicians because the assumption of a very large number of voters implies that each of them has no strict individual incentive to misrepresent his or her preferences. Moreover, a collective action carried out by a large number of voters aiming to distort the principal's design of the deficit rule requires a substantial degree of coordination and sophistication in voters' choices, which seems implausible.
Secondly, our assumptions rule out a specific moral hazard problem that may arise if the realization of the tax shock depends upon some action by the politician that is not observable by the regulatory authority (e.g., private use of public resources and corruption). Although we acknowledge that this is a potentially important factor in shaping the optimal design of fiscal rules, we believe that it should not qualitatively affect the main tradeoff underpinning our results: a flexible rule is more effective in disciplining the politician than an inflexible one for the reasons illustrated in Section 3. In the same way, flexibility should help in disciplining the politician's behavior with respect to the choice of a hidden action that may affect her reputation.
Future research should look into the robustness of this intuition.
Thirdly, we assume that the voters' taste shocks are independently distributed over time.
Allowing for serial correlation does not qualitatively change our result, but it may have consequences on the optimal degree of flexibility. Specifically, in the presence of positive serial correlation, every rule tends to be-ceteris paribus-more effective in disciplining the politician, whereas the opposite is true if the correlation is negative. Because higher flexibility corresponds to stronger fiscal discipline in our model, the obvious consequence is that the optimal deficit rule prescribes less flexibility relative to the baseline result in the former case, and more flexibility in the latter. Positive serial correlation translates into an "incumbent effect" in period 2 because a candidate's electoral success today implies a higher probability that the same candidate will win the election tomorrow. Whenever the incumbent is more likely to be reelected, she assigns a greater weight to the payoff from being in office in period 2. Thus, each politician in period 1 is less "present biased" if the taste shock exhibits positive serial correlation.
Fourthly, we assume that voters and the regulatory authority possess perfect information regarding politicians' competence, preferences, and moral standards. As a result, voters' choices in period 2 are independent of the politicians' behavior in period 1; i.e., we rule out the possibility of retrospective voting aiming to punish incompetent or dishonest politicians. Although this is admittedly a strong restriction, it does not drive the key tradeoffs that underpin our results. Sixthly, we assume an exogenous interest rate. This assumption is imposed for the sake of simplicity and is fully innocuous: in the online appendix we show that all the results hold true under the alternative assumption of a closed economy with savings and endogenous interest rate.
In this alternative setup, the constrained-optimal allocation implemented through an optimal deficit rule implies perfect intertemporal smoothing of the marginal utility of the public good.
This finding is consistent with most traditional normative results on intertemporal allocation of resources.
Seventhly, we assume that the government faces a reputation loss for rule violation regardless of its policy decisions. One may argue that if a fiscal council assessed the ex ante fiscal plan to be in line with the fiscal target, then any ex post violation may just be attributed to bad luck and be excused by the voters and the public. While ex ante compliance certainly matters, in our view the interaction between budgetary planning, surveillance by fiscal councils, and ex post compliance is more complex. Assessment of ex ante rule compliance is not zero-one, but rather a probabilistic statement. For example, the European Commission evaluates the Draft Budgetary Plans of EU member states in the fall preceding the budget year in terms of how likely it is that the rules will be complied with. In 2019, the Commission said that ten states were "compliant" with the preventive arm of the Stability and Growth Pact, …
Lastly, we restrict the principal to use only a specific class of target-based rules (deficit rules), as opposed to instrument-based rules (see Halac and Yared (2022a) for a discussion). This means that the occurrence of a rule violation cannot be determined directly by a politician's actions, i.e. the implemented level of public spending $g_1$ or the tax rate $t_1$, implying that the government has full policy discretion. Instead, the rule takes the form of a threshold on the realized deficit-to-output ratio, which is allowed to vary with two outcomes: the realized shock-to-output ratio and the level of output per capita. The specific restriction to the class of deficit rules is motivated by their great prevalence (see IMF Fiscal Rules Data Set 2015). 25 More generally, the use of target-based rules is justified by normative arguments. Specifically, recent theoretical work shows that target-based rules are typically superior to instrument-based rules whenever the regulator's information is sufficiently precise (Halac and Yared 2022a).
In the online appendix we show that a similar finding holds true in our framework. Specifically, while our main results regarding the optimality of deficit rules are robust to the introduction of a standard type of asymmetric information (see above), the possibility of implementing the optimal policy via instrument-based rules is not: in the online appendix we prove that no instrument-based rule implements the optimal policy in that scenario. This result illustrates how in our theoretical framework the possibility of implementing the constrained-optimal allocation through simple deterministic rules (for instance, an upper limit on public spending and on the expected tax revenue) is solely the outcome of the simplifying assumption regarding the absence of information frictions, and vanishes as soon as such assumption is relaxed. Thus, the inclusion of information frictions in our setup helps in shedding light on the normative reasons underpinning the great prevalence of target-based rules, and in particular deficit rules, in policymaking.
25 Expenditure rules are discussed as an alternative, in part because governments control expenditure more directly than deficits. At the same time, expenditure rules are deemed to incentivize a distortionary use of tax expenditures, meaning that a government can favour certain groups or industries by offering them advantageous tax treatment in lieu of providing them direct funding (Gros and Jahn 2020). Consistently, an OECD report observes that "a number of countries putting in place an expenditure rule have simultaneously experienced a sharp increase in the number of tax expenditures" (OECD 2010).

Policy Relevance
Our results speak to the actual design and use of fiscal rules. First, the zero structural deficit is in line with those fiscal rules that require a (structurally) balanced budget or that target a balance near to that. For example, many countries, as well as states in the US, have balanced budget rules.
Second, as noted earlier, so-called first generation fiscal rules, such as the Maastricht criteria, do not account for business cycle effects and thus tend to have an undesirable procyclical effect.
The second generation of fiscal rules, such as the German debt brake or the Fiscal Compact, have been designed to account for cyclical fluctuations. Given the definition of a cyclical effect, which is often measured by the difference between potential and actual output (output gap), the two rules cited fully adjust for the business cycle. In practice, these rules are considered advantageous from an economic/conceptual perspective, but they are often criticized on practical matters because the output gap is hard to estimate in real time and is subject to substantial revision over short time periods-and may contribute to procyclical fiscal policies. Our results indicate that full flexibility is not optimal even when the output gap estimation itself is not an issue.

The interpretation of the Stability and Growth Pact by the European Commission (2015) has introduced further flexibility regarding the required fiscal adjustment towards the medium-term objective (MTO), which is a country-specific deficit target (often around 0.5-1%), when the MTO has not been reached. In particular, the European Commission demands lower fiscal adjustment as the current output gap worsens. Our model speaks indirectly to this issue because the Commission is concerned with the adjustment to the MTO when the fiscal target has not been reached, whereas our model concerns the level of the deficit target. A lower adjustment speed in case of a severe shock in the Commission's framework can be interpreted in our framework as a looser deficit target. However, since some EU fiscal rules do account for the business cycle, the additional flexibility seems to suggest more than full responsiveness to shocks, which is in contrast to our results. Interestingly, the European Fiscal Board (EFB) in a report (2019) calls for discarding the flexibility interpretation because the rules have "failed to generate differentiated recommendations that reconcile sustainability and stabilization objectives" (p. 74). … Table 2.7). Compliance with the preventive arm of the SGP matches well with the planned deficit in our model. We also note that the present bias in practice may be to some extent endogenous, whereas in our theoretical model it is exogenous due to the generational structure. Hence, recommending greater flexibility of rules could be problematic from a normative perspective if the present bias can be manipulated, for example, through changes in institutions that lead to less stable governments (larger common pool problems).
Nevertheless, our result is useful because it sheds new light on the nature of the tradeoff between flexibility and fiscal discipline.

Conclusion
Fiscal rules play an important role throughout the world. Their design is of crucial importance to meet the dual objective of achieving sustainable public finances while also leaving room for stabilizing the economy. In this paper, we analyze the optimal design of a fiscal rule in an environment in which the political process leads to a deficit bias and the government has full discretion regarding both a distortionary labor income tax and the level of a public good. Moreover, we assume that monetary punishments for rule violations are absent. Instead, the disciplining force comes from a loss in payoff when holding the office in the next period, which could capture a loss in reputation. Despite these constraints on the fiscal instruments and the nature and timing of the fiscal policymaking process, we show that an optimally designed fiscal rule goes a long way.
We show that the optimal rule always prescribes a zero structural deficit. This finding is in line with the heavy use of (nearly) balanced budget rules in the real world. The economic reasoning behind our theoretical result is presumably different from the arguments used in practice, wherein simplicity and presumed generational fairness often play a role. In our model, a zero structural balance is optimal because politicians' tax choices affect output by distorting labor supply decisions. In addition, we show that the optimal rule accounts only partially for the tax shock. A full consideration of tax shocks under the target of a balanced structural budget is typically not optimal because either the marginal cost of increasing public debt in terms of expected cost of rule violation becomes too large-and hence the rule induces a debt level that is too small-or the probability of punishment approaches 1, implying that the politician faces a fixed expected cost of rule violation independent of the politician's choices.
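The two cases in the last sentence can be seen in a stylized limit. The sketch below is an illustration under assumptions that are not taken from the paper's proofs: a linear rule with $\kappa = 0$ and flexibility $\delta < 1$, a realized deficit-to-output ratio $d - s_1$ where $d$ is the intended ratio, and an untruncated normal revenue shock $s_1$ with variance $\sigma_\epsilon^2$. The symbols $d$, $z$, $\Phi$ and $\phi$ are notation introduced here.

```latex
% Requires amsmath. Stylized assumptions (not from the paper's proofs):
%   \kappa = 0, \delta < 1, realized deficit/output = d - s_1,
%   s_1 ~ N(0, \sigma_\epsilon^2), untruncated.
\begin{align*}
\Pr(nc) &= \Pr\bigl(d - s_1 > -\delta s_1\bigr) = \Phi(z), \qquad
  z \equiv \frac{d}{(1 - \delta)\,\sigma_\epsilon}, \qquad
\frac{\partial \Pr(nc)}{\partial d} = \frac{\phi(z)}{(1 - \delta)\,\sigma_\epsilon}. \\
\intertext{Full accounting for the shock corresponds to $\delta \uparrow 1$:}
d = 0:&\quad \Pr(nc) = \tfrac{1}{2}, \qquad
  \frac{\partial \Pr(nc)}{\partial d}
  = \frac{\phi(0)}{(1 - \delta)\,\sigma_\epsilon} \to \infty, \\
d > 0:&\quad z \to \infty, \qquad \Pr(nc) \to 1, \qquad
  \frac{\partial \Pr(nc)}{\partial d} = \frac{z\,\phi(z)}{d} \to 0 .
\end{align*}
```

In this example the marginal cost of a higher intended deficit explodes around a balanced structural budget, while for any strictly positive intended deficit punishment becomes almost certain and no longer responds to the politician's choice.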
Lastly, our paper raises a number of new challenging questions regarding the optimal design of fiscal rules. In particular, future research should elucidate the potential role played by rule flexibility in mitigating excessive government deficits caused by mechanisms other than voters' present bias. Examples of such alternative mechanisms include: (i) the moral hazard problem that may arise if the size of public deficit depends upon some action by the politician that is not observable by the regulatory authority, such as the private use of public funds or bribery; and (ii) cross-country spillovers on public debt that may emerge across financially integrated groups of countries, such as the Eurozone. Although we conjecture that flexibility may help in disciplining the politician's behavior in these alternative scenarios, future research should look into the robustness of this intuition.
Description of the two-candidate electoral competition. Voters have preferences in period 1 given by formulas (7) and (8), and in period 2 by formula (6). Let $\vartheta_1$ and $\vartheta_2$ denote the share of young individuals in period 1 and 2, respectively.$^{26}$ We assume an exogenous birth process and a positive probability of survival from period 1 to period 2, denoted by $\pi$. Namely, a new generation of size $\vartheta_2$ is born in period 2, and a share $\pi$ of the young generation in period 1 survives and becomes the old generation in the following period. This implies that in the second period there are $\pi\vartheta_1$ old voters and $\vartheta_2$ young voters.
Lastly, we assume that the total size of the population remains constant in the two periods, which implies $\vartheta_2 = 1 - \pi\vartheta_1$. Therefore, the share of elderly voters in the economy increases between the two periods if $\pi \geq \frac{1-\vartheta_1}{\vartheta_1}$, and decreases otherwise. We adopt a modified version of Lindbeck and Weibull's (1987) [...] where [...]. We assume that a) $H^T_b$ is strictly [...] and some constant $k_h > 0$. Assumption a) implies that the share of votes for candidate A is strictly increasing in the utility difference induced by the policies proposed by candidate A and candidate B (standard). Assumption b) states that the two types of voters do not have ex-ante asymmetric preferences for the two candidates, meaning that if the two candidates propose the same platform, then the expected share of votes for each candidate is 0.5. Assumption c) ensures [...]

26 Note that the notation $\vartheta_b$ differs from $\theta_b$ defined in (27). Later in this section it will become clear that our theoretical framework implies $\theta_b = \vartheta_b$ for $b = 1, 2$.
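The population accounting stated above (a constant total population and the condition on $\pi$) can be written out step by step. The following is a sketch that assumes, as the use of shares suggests, that the total population is normalized to one, so that the period-1 share of old voters is $1 - \vartheta_1$:

```latex
\begin{align*}
\underbrace{\pi\vartheta_1}_{\text{old, period 2}} + \underbrace{\vartheta_2}_{\text{young, period 2}} = 1
  \quad &\Longrightarrow \quad \vartheta_2 = 1 - \pi\vartheta_1, \\
\underbrace{\pi\vartheta_1}_{\text{old share, period 2}} \geq \underbrace{1 - \vartheta_1}_{\text{old share, period 1}}
  \quad &\Longleftrightarrow \quad \pi \geq \frac{1 - \vartheta_1}{\vartheta_1}.
\end{align*}
```

The second line compares the share of old voters in the two periods, which gives the threshold on the survival probability $\pi$.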
27 Notice that, as in Banks and Duggan (2005), $H^T_2(x_2 + \nu_2)$ can also be interpreted as the probability that a voter of type $T$ votes for politician A conditional on $\nu_2$ and $x = x_2$. Note that the uncertainty in the electoral outcome is entirely due to the common shock $\nu_2$. This implies in turn that, for a large electorate, the presence of a common shock $\nu_2$ is necessary to have probabilistic voting. Without $\nu_2$ the electoral outcome would be deterministic for all values of $x_2$, except at the exact point at which $H^T_2(x_2) = 0.5$ and the two candidates win with equal probability.
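The role of the common shock described in this footnote can be illustrated numerically. The sketch below is only an illustration under assumed functional forms that are not taken from the paper: the vote share $H$ is logistic, the common shock is normal with standard deviation `sigma_nu`, and the names `vote_share` and `win_probability` are hypothetical.

```python
import math


def vote_share(x: float, nu: float) -> float:
    """Share of voters backing candidate A: a logistic H(x + nu) (assumed form)."""
    return 1.0 / (1.0 + math.exp(-(x + nu)))


def win_probability(x: float, sigma_nu: float) -> float:
    """Probability that candidate A obtains more than half of the votes.

    For the logistic H above, H(x + nu) > 0.5 holds iff nu > -x. With
    nu ~ N(0, sigma_nu**2) (assumed), that event has probability Phi(x / sigma_nu).
    """
    if sigma_nu == 0.0:
        # No common shock: the outcome is deterministic, with a tie only at H(x) = 0.5.
        if x > 0.0:
            return 1.0
        if x < 0.0:
            return 0.0
        return 0.5
    return 0.5 * (1.0 + math.erf(x / (sigma_nu * math.sqrt(2.0))))


if __name__ == "__main__":
    # Compare the deterministic case (sigma_nu = 0) with a noisy electorate.
    for x in (-0.2, 0.0, 0.2):
        print(x, win_probability(x, 0.0), round(win_probability(x, 1.0), 3))
```

With `sigma_nu = 0` the winning probability jumps from 0 to 1 at the point where the vote share equals one half, whereas a positive `sigma_nu` makes it a smooth function of the utility difference, which is what probabilistic voting requires.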
$\sigma_\nu$. The random variable $\nu_b$ represents a shift in voters' taste due to circumstances that cannot be foreseen by the candidates, and it is assumed to be common to all voters. Thus, the share of votes for candidate A in the whole population of voters in period 2 writes: [...] The probability of victory for candidate A vs. B in period 2 given $D_1, \epsilon$ is: [...] and the probability of victory for candidate B in period 2 is simply $P$ [...]. Politician A in period 2 maximizes her expected payoff, which is given by: [...] where $ph \in \{nc, c\}$ indicates whether the government was compliant with the fiscal rule in period 1 (if any was in place) and $W^{ph}_2 = W_2 - C \times \mathbb{1}[ph = nc]$. One can easily derive the expected payoff of candidate B using the formula for $P^B_2(t^A_2, t^B_2, D_1, \epsilon, \vartheta_2)$. Lastly, we define the weight $\theta_b$ for $b \in \{1, 2\}$ as follows: [...].

We omit the formal description of the candidates' optimal behavior in period 2 because it is a standard outcome of Lindbeck and Weibull's (1987) framework. Namely, in period 2 both candidates solve a standard two-candidate symmetric zero-sum game. The well-known results in Banks and Duggan (2005) and Lindbeck and Weibull (1987) apply. Specifically, if the distribution of the voters' taste shock $\nu_2$ has large enough variance,$^{28}$ then there exists a unique Nash equilibrium, which is in pure strategies and such that both candidates propose the same platform and win the elections with equal probability. Lastly, the equilibrium platform of both candidates is the policy that maximizes the expected utility of voters in period 2. We provide a detailed proof of these results in the online appendix.

Similarly to period 2, the probability of victory for candidate A vs. B in period 1 is: [...] where $H^Y_1, H^O_1$ represent the shares of citizens of type $Y, O$ voting for candidate A. The probability of victory for candidate B in period 1 is simply $P^B$ [...].
Candidate A in period 1 maximizes her expected payoff given $(t^B_1, D^B_1)$, which, using the assumption that $\epsilon$ and $\nu_1$ are independently distributed, has formula: [...] where $t^A_2(D_1, \epsilon)$ and $t^B_2(D_1, \epsilon)$ denote the perfect foresight values of $t^A_2$ and $t^B_2$, respectively, conditional on $(D_1, \epsilon)$ and under the assumption that candidates will play the unique NE strategies in period 2. One can easily derive the expected payoff of candidate B given $(t^A_1, D^A_1)$ by using the formula [...] i.e., the level of the common taste shock $\nu_1$ such that each of the two candidates obtains exactly half of the votes given policy platforms [...]. Note that the assumptions on $H^Y_1$ and $H^O_1$ ensure existence and uniqueness of such a level of $\nu_1$ given any [...] into the corresponding unique level of $\nu_1$ that satisfies (30). We introduce the simplified notation $\nu^A_1 =$ [...].

Both candidates and voters fully anticipate the unique NE outcome in period 2 conditional on the choices made in period 1 and the realization of the shock, and we know from the previous paragraph that each candidate in period 2 is elected with probability 0.5. Using the optimal proposals in period 2 (conditional on $D_1$ and the realization of the tax shock $\epsilon$) and the independence assumption over the distributions of $\epsilon, \nu_1, \nu_2$, the problem of candidate A writes: [...] and similarly for candidate B: [...]

We define $\bar{u}(t_1, D_1; \theta)$ to be the weighted average of period 1 voters' utilities; i.e.: [...] given a weight $\theta \in [0, 1]$ (note that $\theta$ does not need to be equal to $\theta_1$). We define for $I \in \{A, B\}$ the following: [...] and we assume that $W_1 > 0.5\beta C$ to ensure that $w^I(W_1, t_1, D_1, t^{-I}_1, D^{-I}_1) > 0$ for all $t_1, D_1, t^{-I}_1, D^{-I}_1$. Furthermore, we consider the matrix: [...] whose entries have formulas: [...] where [...]

Proof. Part (i).
We must show that there exists a symmetric NE in pure strategies given the expectation that politicians $I = A, B$ will play the unique NE in period 2. Because the distribution of $\nu_1$ is symmetric about zero, the optimization problems in (31) and (32) show that the game is symmetric. Nevertheless, the presence of the cost of punishment implies that [...] Thus, the proof of existence requires additional restrictions relative to that in those papers.
Existence. The FOCs of candidate A write: [...] and: [...] where the above conditions are binding for interior solutions; i.e., whenever the implicit constraints implied by the condition $(t_1, D_1) \in X$ are not binding. Candidate B solves a problem that mirrors that of candidate A. Notice that in any equilibrium it must be that $\nu^A$ [...] The proof of existence consists of four steps. For strict concavity it is sufficient to show that the Hessian matrices of [...] and if (b) the period 1 rent $W_1$, the variance of the taste shock $\sigma^2_\nu$, and the threshold $k_h$ are all sufficiently large.$^{29}$ Sufficient conditions for (a) to hold true are provided in Lemma A.2.

29 The condition on $\sigma^2_\nu$ corresponds to the restriction on $g'_1(\nu_1)/g_1(\nu_1)$ that ensures concavity in Lindbeck and Weibull (1987). The additional condition on $W_1$ is needed because of the interaction between the probability of winning the elections and the probability of entering a punishment phase in period 2 in each candidate's objective function, which is not an issue in traditional probabilistic voting models.
Specifically, if (a) is satisfied, then there exists a (possibly not unique) vector of thresholds [...] The detailed proof of this result is lengthy and relatively standard; thus, it is presented in the online appendix. [...] actions is non-empty and compact, and the objective function $\Pi^I_1$ for $I = A, B$ satisfies the following conditions:

[...] and (iii) $\Pi^I_1$ is bounded and weakly lower semi-continuous in $(t^I_1, D^I_1)$. In our application the set of players' actions $X \times X$ is non-empty and compact because $X = \{(t_1, D_1) \in [0, 1] \times [\underline{D}_1, \overline{D}_1] \mid g_1(t_1, D_1) \geq 0\}$ is non-empty and compact. Conditions (ii) and (iii) are satisfied because [...]. Thus, the game satisfies all the properties of Lemma 7 in Dasgupta and Maskin (1986), which implies that the game possesses a symmetric mixed strategy equilibrium. Details are provided in the online appendix.
3. All NE are in pure strategies. Proof. If each candidate $I$'s objective function is strictly concave in $(t^I_1, D^I_1)$, then all best responses to mixed strategies, and therefore all electoral equilibria, are in pure strategies (as in Banks and Duggan, 2005, proof of Theorem 2).
Part (ii) (Equivalent problem). Consider the equilibrium conditions of each candidate $I$ in a symmetric pure-strategy Nash equilibrium. In this type of equilibrium it must be true that [...] simplify as follows: [...] where $C^e = \frac{C}{4 W_1 g_1(0.5)}$. Notice that the FOCs above are the same as those of a partially benevolent representative politician solving: [...] with $const = W_1 g_1(0.5) > 0$. Because (a) the maximization problem in (42) [...]

Define $\varepsilon_R(t_1, D_1) \equiv \varepsilon(t_1, D_1, y_1(t_1) \mid R)$, where $\varepsilon_R$ always exists given the assumptions, and is locally unique if $\Delta < 1$. We obtain the following result.

Appendix B Proofs Nonlinear Rule
In this section we maintain the assumption that the conditions for equivalence between the outcome of the electoral game and that of the modified social planner problem stated in Proposition A.1 are satisfied. Thus, all the proofs make use of the latter. Proof. First we must solve the problem of the elected politician in period 2. From the previous section we know that in period 2 the problem is equivalent to that of a social planner that maximizes voters' expected utility. The problem is: [...] where [...]
Define the tax elasticity of labor supply as: [...] The FOC implies $[t_2]: w_2 \bar{l}_2 \{-1 + u'(g_2)\} = 0$, and the SOC is satisfied given the strict concavity of $u$. Notice that this equation implies: [...] which implies that $g_2$ is independent of $D_1$ (this is a consequence of linearity and of $l_2 = \bar{l}_2$).
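The step from the first-order condition to the independence of $g_2$ from $D_1$ can be spelled out as follows. This is a sketch that writes period-2 labor supply as $\bar{l}_2$, assumes a strictly positive tax base $w_2 \bar{l}_2 > 0$, and uses the strict concavity of $u$ so that $u'$ is invertible; the intermediate lines are a reconstruction and not a quotation of the original display:

```latex
\begin{align*}
w_2 \bar{l}_2 \{-1 + u'(g_2)\} = 0, \quad w_2 \bar{l}_2 > 0
  \quad &\Longrightarrow \quad u'(g_2) = 1 \\
  \quad &\Longrightarrow \quad g_2 = (u')^{-1}(1).
\end{align*}
```

Under these assumptions the period-2 level of the public good is pinned down by preferences alone, so it does not vary with the inherited debt $D_1$; any change in $D_1$ is absorbed by the period-2 tax rate.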
Thus, the problem of the representative politician in period 1 can be rewritten as follows: [...] First, notice that because $g_1(t_1, D_1)$ is a concave function, the set $X$ is a convex set. Moreover, the assumptions $u'(0) = +\infty$ and $D_0 \leq (D_1 - a)/(1+r)$ imply that the constraint $g_1(t_1, D_1) \geq 0$ embedded in the definition of $X$ is satisfied and never binding. Thus, we can ignore it in deriving the optimality conditions of each agent. Calculate the FOCs w.r.t. $t_1$ and $D_1$: [...] $M^B_O$ to be negative definite is that $u^Y_1(t_1, D_1)$ and $u^O_1(t_1, D_1)$ are strictly concave. For that to hold true we need the Hessian matrix of $u^Y_1$ to be negative definite. Let us denote with $u^I_{x,y}$ the second derivatives of $u^Y_1$. The conditions are: [...] This, together with the strict concavity of $u$, implies that all the conditions in (50) are satisfied. Thus, all the conditions of Proposition A.1 are satisfied as long as either condition 1., 2., or 3. of Lemma A.2 is satisfied. Note that $M^I_O$ negative definite also implies that the objective function of the representative politician is strictly concave. Thus, the optimal policy solves the FOCs and the sign of the comparative statics of interest can be obtained by differentiating the FOCs. Specifically, we calculate the cross derivatives of the representative politician's objective function $V$, using the notation $V_{xy} = \frac{\partial^2 V}{\partial x \partial y}$ and noticing that $Pr(nc \mid (t_1, D_1), R)$ is a function of $t_1, D_1, R$, but is invariant in $B_1$. Differentiating (48) and (49) we obtain: [...] Recall that by assumption $u''$ and $\partial^2 Pr(nc \mid (t_1, D_1), R)$ [...]

Proof. First we must derive the condition for the optimal choice of the social planner. This planner can decide $D_1$ and $t_1$ optimally (no need for the deficit rule). The social planner problem is stated in formula (10). In period 2, the platform chosen is the same as that of the politician, which corresponds to that of a planner that maximizes the sum of voters' utilities.
Thus, the problem in period 1 can be rewritten as follows: [...] interior, then there must be some $(t_1, D_1) \in X$ such that either (A) $V^{SP}$ [...] is the minimum feasible $D_1$ given $t_1$, and that $g_1(t_1, D_1(t_1)) = 0$.

Thus, the assumption [...]

Because $v$ is strictly increasing and strictly convex we get $v''(l_1(1))\, l_1(1) = 0$, [...]

Proposition 3. A fiscal rule $R$ that implements the optimal policy $(t^*_1, D^*_1)$ and that satisfies conditions (TCO) and (LCE) always exists.
Proof. Consider the rule R ζ (s 1 , y1 − s 1 for some constant $\zeta$. We show that for any finite $C^e > 0$ and $B_1 \in (0, 1)$ this rule (i) satisfies (TCO), (ii) implements the optimal policy for some value of $\zeta > 0$, and (iii) satisfies (LCE).

(i) First, tightness writes R ζ (0, y 1 ) = ζ F (0) y1 − ζ F (0) y1 = 0, which is constant in $y_1$. Thus, rule $R_\zeta$ satisfies condition (TCO).

(ii) Suppose $R_\zeta$ does not implement the optimal policy given some bias level $B'_1$. A violation of the rule occurs iff $\frac{D_1 - D_0}{y_1}$ [...] $- s_1$, which rewrites: [...] Because $F$ is strictly increasing over $[-a, a]$, it is invertible. Thus, we can rewrite condition (57) as follows: [...] Recall that the probability of non-compliance given policy $(t_1, D_1)$ and rule $R$ writes: [...] Thus, the expected cost of punishment becomes $C^e \zeta (D_1 - D_0) - C^e F(0)$. This rule trivially satisfies the condition of Lemma A.2 (1), which ensures that the solution to the electoral game is the same as that to the problem of the representative politician and that the objective function is strictly concave. Thus, we can use the FOCs of the politician (see the proof of Proposition 1), which write: [...] Secondly, the objective function of the social planner is strictly concave given the assumptions on $u$ and $v$. Thus, sufficient conditions for the solution to the politician's problem to be socially optimal are: [...] For the rule $R_\zeta$ we have $C^e \frac{\partial Pr(nc \mid (t_1, D_1), R_\zeta)}{\partial D_1} = C^e \zeta$ and $C^e \frac{\partial Pr(nc \mid (t_1, D_1), R_\zeta)}{\partial t_1} = 0$. Thus, the following value for $\zeta$: [...] solves both the equation in (62) and that in (63), implying that the principal's and the politician's FOCs are made equal to each other. Lastly, both the objective function of the principal and that of the politician are strictly concave in $(t_1, D_1)$ and the choice set $X$ is the same. Thus, the result above implies $(t^{**}_1, D^{**}_1) = (t^*_1, D^*_1)$. That is, setting $\zeta = \zeta^*$ as in (64), the rule $R_{\zeta^*}$ implements the optimal policy at bias level $B = B'_1$. This leads to a contradiction to the initial claim that the rule does not implement $(t^*_1, D^*_1)$.

(iii) Suppose the rule $R_\zeta$ does not satisfy (LCE). Then, there exists no neighborhood $N_d(B'_1)$ such that the allocation induced by policy $(t^{**}_1, D^{**}_1)$ is constrained-Pareto efficient for all $B_1 \in$ [...] The Lagrangian for this problem writes: [...] Because the optimization problem is strictly convex, for interior solutions and given $\bar{u}^Y_1, \bar{u}^Y_2$ the unique solution to this maximization problem, denoted by $(t^{CP}_1, D^{CP}_1)$, satisfies the FOCs: [...] plus the standard complementary slackness conditions. Consider a neighborhood $N_d(B'_1)$. The equilibrium allocation induced by a rule $R$ given bias [...] is set equal to its maximum feasible value; i.e., $\bar{u}^Y_2 = \max_{(t_1, D_1) \in X} u^Y_2(t_1, D_1)$, then the solution features $D^{CP}$ [...] Thus, for any given $D'_1 \in [\underline{D}_1, \overline{D}_1]$, the allocation induced by policy $(t'_1, D'_1)$ is constrained-Pareto efficient if the FOC w.r.t. $t_1$ is satisfied at $(t'_1, D'_1)$. In particular, for each [...] is the equilibrium level of $D_1$ given $B_1$ and rule $R$. Note that the FOC w.r.t. $t_1$ of the constrained-Pareto maximization problem is just a strictly positive value $1 + \lambda_1$ times a function of $(t_1, D_1)$. Thus, the solution to the equilibrium condition in (67) is unchanged if we divide both sides by $1 + \lambda_1$.
Evaluated at the debt level $D^{CP}_1 = D^{**}_1$ this leads to the condition: [...] Compare this with the FOC w.r.t. $t_1$ of the representative politician's maximization problem: [...] Recall that $\frac{dy_1(t_1)}{dt_1} \neq 0$ given the assumptions on agents' preferences. The last two conditions are identical, delivering the same solution $t^{CP}$ [...] $= 0$. Given the definition of (LCE) and the fact that the rule $R_\zeta$ implements the optimal policy by part (ii) of this proof, (LCE) is violated only if there exists a neighborhood $N_d(B'_1)$ such that [...] $= 0$ for all $(t_1, D_1) \in X$. Thus, the rule $R_\zeta$ satisfies (LCE).
This leads to a contradiction. Q.E.D.
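The marginal punishment costs used in part (ii) of this proof follow directly from the expected cost of punishment stated there. As a short check, taking that expression as given and treating $C^e$, $\zeta$, $D_0$, and $F(0)$ as constants with respect to the period-1 policy:

```latex
\begin{align*}
\frac{\partial}{\partial D_1} \Big[ C^e \zeta (D_1 - D_0) - C^e F(0) \Big] &= C^e \zeta, \\
\frac{\partial}{\partial t_1} \Big[ C^e \zeta (D_1 - D_0) - C^e F(0) \Big] &= 0.
\end{align*}
```

Under this rule the marginal cost of debt rises by the constant $C^e \zeta$ while the tax margin is left undistorted, which is consistent with the proof's claim that a single value of $\zeta$ can align both of the politician's first-order conditions with those of the planner.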
Proposition 4. If a deficit rule $R$ satisfies (TCO) and (LCE) and implements the optimal policy $(t^*_1, D^*_1)$ at $B_1 = B'_1$, then (i) the tightness of the rule $K(y^*_1 \mid R)$ is zero; i.e., the rule prescribes a zero structural deficit, and (ii) the flexibility of the rule $\Delta(t^*_1, D^*_1, y^*_1 \mid R)$ is lower than 1; i.e., the rule does not fully account for tax shocks.
Proof. Part (i). Let $D^{**}_1(B_1; R)$ denote the equilibrium level of $D_1$ given bias $B_1$ and rule $R$. Because $R$ satisfies (LCE), there exists $N_d(B'_1)$ such that the equilibrium allocation induced by the rule is constrained-Pareto efficient for all $B_1 \in N_d(B'_1)$. Consider $0 < h < d$ and let [...] is an interior solution. Thus, from step (iii) in the proof of Proposition 1, we get [...], the optimality of the rule w.r.t. $D_1$ requires that (62) is satisfied. Third, the optimality of the rule w.r.t. $t_1$ implies [...] These three results imply in turn that [...], called result (a). Lastly, because the socially optimal policy is an interior solution by Lemma [...] R)). This means that $n$ is constant and equal to zero for all the values of $\epsilon$ within the non-degenerate interval $[\varepsilon'_R, \varepsilon''_R]$ at $y_1 = y^*_1$. Because by assumption the function $R$ is real analytic and has finite derivatives, $n(\epsilon, y^*_1)$ is real analytic over its domain, and $\epsilon = \varepsilon_R(t^{**}_1, D^{**}_1(B_1; R))$ is an accumulation point of $[\varepsilon'_R, \varepsilon''_R]$. Then by the identity theorem for holomorphic functions we obtain $n(\epsilon, y^*_1) =$ [...] $R_1$ [...]. Third, consider the definition of tightness: $K(y^*_1 \mid R) = R(0, y^*_1)$. The condition (TCO) implies $R_2(0, y^*_1) = 0$. Thus, the result (b) implies $R_1(0, y^*_1) \times 0 - 0 \times y^*_1 - R(0, y^*_1) = 0$, which implies in turn $K(y^*_1 \mid R) = R(0, y^*_1) = 0$.

Part (ii). First, notice that using the notation $\Delta^* = \Delta(t^*_1, D^*_1, y^*_1 \mid R)$, the optimality condition for $D_1$ in (62) corresponds to the following necessary condition for implementation of the social optimum: [...] which is never satisfied for any $\Delta^* > 1$. Thus, any rule that implements the social optimum [...] $= +\infty$ and therefore condition (73) is not satisfied. Both cases imply that the rule cannot implement the optimal policy. Thus, [...]

Proposition 5. There exists a finite $\varsigma > 0$ such that if the variance of the tax shock is sufficiently large, $\sigma^2_\epsilon \geq \varsigma^2$, then the flexibility of the optimal rule $\Delta(t^*_1, D^*_1, y_1 \mid R^*_{B_1})$ is weakly increasing in the political present bias $B_1$.
Corollary 7. If the optimal policy $(t^*_1, D^*_1)$ is implementable by a linear rule $R = \kappa - \delta s_1$ for all $B_1$ within an interval $(B'_1, B''_1)$, then: (i) the implementation occurs at $\kappa^* = 0$ and $\delta^* \leq \bar{\delta}$; (ii) the optimal degree of flexibility $\delta^*$ is weakly increasing in the political present bias $B_1$ within such interval.
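To make the two parameters of the linear rule concrete, the sketch below evaluates $R = \kappa - \delta s_1$ numerically. It rests on assumptions that are not stated in this corollary: $s_1$ is read as the tax revenue shock expressed as a share of output (negative for a shortfall), the deficit limit is expressed in the same units, and compliance means that the realized deficit ratio does not exceed the limit. The function names and sample numbers are hypothetical.

```python
def deficit_limit(s1: float, kappa: float, delta: float) -> float:
    """Linear rule R = kappa - delta * s1.

    kappa is the limit at a zero shock (tightness) and delta is the share of
    the shock that the limit accommodates (flexibility). Assumed convention:
    s1 < 0 is a revenue shortfall as a share of output.
    """
    return kappa - delta * s1


def is_compliant(deficit_ratio: float, s1: float, kappa: float, delta: float) -> bool:
    """Assumed compliance test: the realized deficit ratio is within the limit."""
    return deficit_ratio <= deficit_limit(s1, kappa, delta)


if __name__ == "__main__":
    kappa, delta = 0.0, 0.5  # zero structural deficit, partial flexibility
    # Without a shock the limit equals kappa.
    print(deficit_limit(0.0, kappa, delta))
    # A shortfall of 2% of output relaxes the limit by delta times the shortfall.
    print(deficit_limit(-0.02, kappa, delta))
    print(is_compliant(0.005, -0.02, kappa, delta))
    print(is_compliant(0.015, -0.02, kappa, delta))
```

With `kappa = 0` and `delta < 1`, a revenue shortfall relaxes the limit by less than the full size of the shock, which is the sense in which such a rule only partially accounts for shocks.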
Proof. Part (i). The rule $R = \kappa - \delta s_1$ trivially satisfies (TCO). Moreover, at $\kappa = 0$ it also [...]

Step 2. We show that if $\kappa = 0$ and $D^{**}_1$ is interior it must be true that $V^{SP}_D \neq 0$ at $(t^{**}_1, D^{**}_1)$. Suppose $V^{SP}_D = 0$ at $(t^{**}_1, D^{**}_1)$. $\kappa = 0$ implies $V_t = V^{SP}_t$ for all $(t_1, D_1) \in X$. Thus, if $V^{SP}_D = 0$ at $(t^{**}_1, D^{**}_1)$, then $(t^{**}_1, D^{**}_1)$ satisfies the FOCs of the social planner, and given the strict concavity of the objective function this implies in turn that the socially optimal policy is implementable, leading to a contradiction. Thus, it must be that $V^{SP}_D \neq 0$.
Step 3 states that $\delta^*$ can be optimal only if $\delta^* = \delta_{\max}$.