1 Introduction

The Heisenberg spin chain can be obtained from 4-dimensional mixed topological–holomorphic Chern–Simons theory on \(\mathbb {R}^2 \times \mathbb {C}\) by introducing certain line defects along the topological plane \(\mathbb {R}^2\) for each site of the spin chain [12, 13, 15, 16]. This elegant description of the Heisenberg spin chain is ultimately possible because the integrable structure of the latter is underpinned by the quantum Yang–Baxter equation

$$\begin{aligned}&R_{{\mathfrak {12}}}(z_1, z_2) R_{{\mathfrak {13}}}(z_1, z_3) R_{{\mathfrak {23}}}(z_2, z_3)\\&\qquad = R_{{\mathfrak {23}}}(z_2, z_3) R_{{\mathfrak {13}}}(z_1, z_3) R_{{\mathfrak {12}}}(z_1, z_2) \end{aligned}$$
figure a

which is neatly encoded in the fact that the 4-dimensional Chern–Simons theory is topological in the \(\mathbb {R}^2\) direction. By contrast, the integrability of the (classical and quantum) Gaudin model [38, 39], or more generally the Hitchin system [43], is underpinned by the classical Yang–Baxter equation

$$\begin{aligned}{}[r_{{\mathfrak {12}}}(z_1, z_2), r_{{\mathfrak {13}}}(z_1, z_3)] = [r_{{\mathfrak {23}}}(z_2, z_3), r_{{\mathfrak {12}}}(z_1, z_2)] - [r_{{\mathfrak {32}}}(z_3, z_2), r_{{\mathfrak {13}}}(z_1, z_3)], \end{aligned}$$
(1)

whose topological origin is less clear.

On the other hand, affine Gaudin models, whose integrability is also underpinned by the classical Yang–Baxter equation (1), see [64], can be obtained [65] from the same 4-dimensional mixed topological–holomorphic Chern–Simons theory [17] on \(\mathbb {R}^2 \times \mathbb {C}P^1\), this time by introducing surface defects along \(\mathbb {R}^2\) placed at the marked points \(z_i \in \mathbb {C}\) of the affine Gaudin model. A natural question is therefore whether the ordinary Gaudin model, associated with a finite-dimensional semisimple Lie algebra \(\mathfrak {g}\) rather than an affine Kac–Moody algebra, can also be described in a similar way.

The purpose of this paper is to show that the tamely ramified Hitchin system on a Riemann surface C, in which the Higgs field has simple poles at certain marked points on C, can be described using a collection of various line defects in 3-dimensional mixed (topological–holomorphic) BF theory on \(\mathbb {R}\times C\). In particular, the Gaudin model is obtained as the special case when \(C=\mathbb {C}P^1\). More precisely, we perform a Hamiltonian analysis of the 3d mixed BF theory, whose fields are a partial connection 1-form A and a (1, 0)-form B, with suitably chosen line defects. Using the condition \(A_{\bar{z}} = 0\) to fix the gauge invariance, we find that the dynamics on the reduced phase space coincides with that of the tamely ramified Hitchin system. In particular, the (1, 0)-form B becomes meromorphic and gets identified with the Higgs field. This is completely analogous to the relationship found in [65] between 4d mixed topological–holomorphic Chern–Simons theory on \(\Sigma \times \mathbb {C}P^1\), with the cylinder \(\Sigma = \mathbb {R}\times S^1\) , and the affine Gaudin model. In other words, our analysis shows that 3d mixed BF theory is to the Gaudin model what 4d Chern–Simons theory is to the affine Gaudin model.

The plan of the paper is as follows.

In Sect. 2, we recall the definition of 3-dimensional mixed topological–holomorphic BF theory on \(\mathbb {R}\times C\) for some Riemann surface C, depending on a \(\mathfrak {g}\)-valued (1, 0)-form B and a \(\mathfrak {g}\)-valued connection 1-form A. In Sect. 2.3, we introduce two types of line defects, which we refer to as type A and B, respectively. Type A defects ensure that in the gauge \(A_{\bar{z}} = 0\) the field B is meromorphic on-shell, with poles at the location of the defects. This field then gets identified with the Lax matrix L of the Gaudin model, or the Higgs field \(\Phi \) of the tamely ramified Hitchin system. Adding the type B defect ensures that (the negative of) the time component of the connection 1-form A gets correctly identified with the matrix M in the Lax equation \(\partial _t L = [M, L]\). We also derive the 1d action for the Gaudin model along the lines of [11, 20].

In Sect. 3, we perform a Hamiltonian analysis of 3-dimensional mixed topological–holomorphic BF theory with the two types of defects introduced in Sect. 2.3. There is a first class constraint \(\widehat{\mu }\) generating the gauge symmetry of the theory. In the absence of type A defects, \(\widehat{\mu }\) coincides with the moment map \(\mu = \bar{\partial }_A \Phi \) of the Hitchin system, after identifying the Higgs field \(\Phi \) with the (1, 0)-form B of the 3-dimensional mixed topological–holomorphic BF theory. Upon imposing the condition \(A_{\bar{z}} = 0\) to fix the gauge symmetry generated by \(\widehat{\mu }\), the component \(B_z\) of the (1, 0)-form B becomes meromorphic with poles at the location \(z_i\) of the type A line defects and is shown to satisfy the Lax algebra with respect to the Dirac bracket. Moreover, the Hamiltonian on the reduced phase space coincides with that of the Hitchin system.

We end the paper with a brief discussion of possible future directions in Sect. 4.

2 \(\mathbf{3}\)d mixed BF theory on \(\mathbb {R}\times C\)

Let G be a semisimple Lie group over \(\mathbb {C}\), with Lie algebra \(\mathfrak {g}\) and fix a non-degenerate invariant symmetric bilinear form \(\langle \cdot , \cdot \rangle : \mathfrak {g}\times \mathfrak {g}\rightarrow \mathbb {C}\) on \(\mathfrak {g}\). Let C be a Riemann surface.

We shall consider the 3-dimensional classical mixed topological–holomorphic BF theory on \(\mathbb {R}\times C\), or 3d mixed BF theory for short—see e.g. [40, 41, 54] where the theory is discussed using the BV formalism. The field content of this theory consists of a \(\mathfrak {g}\)-valued (1, 0)-form B on \(\mathbb {R}\times C\), where the bigrading (pq) corresponds to the one induced by the complex structure on C, together with a \(\mathfrak {g}\)-valued connection 1-form A on \(\mathbb {R}\times C\). We denote the curvature of the latter as \(F(A) = d A+ \frac{1}{2} [A,A]\). The action of 3d mixed BF theory is then given by

$$\begin{aligned} S_\mathrm{3d}[A,B] = \frac{1}{2\pi \mathsf {i}} \int _{\mathbb {R}\times C} \langle B, F(A) \rangle . \end{aligned}$$
(2)

2.1 Gauge invariance

The 3d mixed BF action (2) is trivially invariant under gauge transformations of the form \(A \rightarrow A + \chi \) for any \(\mathfrak {g}\)-valued (1, 0)-form \(\chi \) on \(\mathbb {R}\times C\). Indeed, \(\chi \) drops out from the action since B is a (1, 0)-form. We can fix this invariance by requiring that A has no (1, 0)-component so that it locally takes the form \(A = A_{\bar{z}} d \bar{z} + A_t d t\) for some local coordinate t on \(\mathbb {R}\) and a local holomorphic coordinate z on C. From now on, we will always take A to be a partial connection of this form.

More interestingly, the action (2) is invariant under the action of any G-valued function g on \(\mathbb {R}\times C\) acting by gauge transformations on the connection 1-form A and by conjugation on the field B, namely

$$\begin{aligned} A&\longmapsto ^g A :=- \bar{\partial }g g^{-1} - d _\mathbb {R}g g^{-1} + g A g^{-1}, \end{aligned}$$
(3a)
$$\begin{aligned} B&\longmapsto g B g^{-1}, \end{aligned}$$
(3b)

where \(\bar{\partial }\) is the Dolbeault differential on C, i.e. the (0, 1)-component of the de Rham differential \(d _C = \partial + \bar{\partial }\) on C, and \(d _\mathbb {R}\) denotes the de Rham differential on \(\mathbb {R}\). Indeed, under such a transformation the curvature 2-form F(A) transforms by conjugation \(F(^g A) = g F(A) g^{-1}\) and so the invariance of the action follows from the adjoint G-invariance of the bilinear form \(\langle \cdot , \cdot \rangle \).

2.2 Equations of motion

To derive the equations of motion, we consider the variations \(B \rightarrow B + \epsilon \) and \(A \rightarrow A + \eta \) by an arbitrary (1, 0)-form \(\epsilon \) and 1-form \(\eta \) on \(\mathbb {R}\times C\). Varying the action, we find

$$\begin{aligned} \delta S_\mathrm{3d}[A,B]&:=S_\mathrm{3d}[A+\eta , B+\epsilon ] - S_\mathrm{3d}[A,B]\\&\, = \frac{1}{2 \pi \mathsf {i}} \int _{\mathbb {R}\times C} \big ( \langle \epsilon , F(A) \rangle + \langle B, d \eta + \tfrac{1}{2} [A,\eta ] + \tfrac{1}{2} [\eta ,A] \rangle + O(\eta ^2)\big )\\&\, = \frac{1}{2 \pi \mathsf {i}} \int _{\mathbb {R}\times C} \big ( \langle \epsilon , F(A) \rangle + \langle B, d \eta + [A,\eta ] \rangle + O(\eta ^2) \big )\\&\, = \frac{1}{2 \pi \mathsf {i}} \int _{\mathbb {R}\times C} \big ( \langle \epsilon , F(A) \rangle + \langle d B +[B,A], \eta \rangle + O(\eta ^2) \big ), \end{aligned}$$

where in the third equality we have used the fact that A and \(\eta \) are both \(\mathfrak {g}\)-valued 1-forms, so that \([\eta , A] = [A, \eta ]\). In the last equality, we used Stokes’s theorem, noting that \(\langle B, d \eta \rangle = \langle d B, \eta \rangle - d \langle B, \eta \rangle \), and the adjoint invariance of the bilinear form.

The equation of motion for B is therefore \(F(A) = 0\), or explicitly

$$\begin{aligned} \bar{\partial }A + d _\mathbb {R}A + \tfrac{1}{2} [A,A] = 0, \end{aligned}$$
(4a)

while the equation of motion for A reads

$$\begin{aligned} \bar{\partial }B + d _\mathbb {R}B + [B,A] = 0. \end{aligned}$$
(4b)

Letting z be a local holomorphic coordinate on C and t a global coordinate on \(\mathbb {R}\), and writing the two fields in components as \(B = B_z d z\) and \(A = A_{\bar{z}} d \bar{z} + A_t d t\), we can write the equations of motion (4) more explicitly in components as

$$\begin{aligned} \partial _{\bar{z}} A_t - \partial _t A_{\bar{z}}&= [A_t, A_{\bar{z}}], \end{aligned}$$
(5a)
$$\begin{aligned} \partial _{\bar{z}} B_z&= [B_z, A_{\bar{z}}], \end{aligned}$$
(5b)
$$\begin{aligned} \partial _t B_z&= [-A_t, B_z]. \end{aligned}$$
(5c)

The first key observation to make here is that the equation of motion (5c) is very reminiscent of the Lax equation

$$\begin{aligned} \partial _t L = [M, L]. \end{aligned}$$
(6)

However, to make this superficial resemblance more precise we would need \(B_z\) and \(- A_t\) to both be holomorphic (or more generally meromorphic) in order to identify them with the Lax pair L and M of an integrable system.

The second observation, based on the other two equations of motion (5a) and (5b), is that this can be achieved by working in the gauge where \(A_{\bar{z}} = 0\). Indeed, in this gauge the two equations (5a) and (5b) reduce to \(\partial _{\bar{z}} A_t = 0\) and \(\partial _{\bar{z}} B_z = 0\), respectively, which express the fact that \(A_t\) and \(B_z\) are both holomorphic on C.

2.3 Introducing line defects

In the Lax equation (6) of an integrable system, however, L and M are more generally \(\mathfrak {g}\)-valued meromorphic functions with poles at certain marked points. This is, in fact, necessary if C has genus zero, i.e. when \(C = \mathbb {C}P^1\). Moreover, as it stands there is no relation between \(B_z\) and \(- A_t\) in (5c), while in (6) the matrix M is typically built out of the Lax matrix L. We can fix both of these issues by introducing two different types of line defects in the action (2). We will refer to these as type A and type B line defects, since these will depend on the fields A and B, respectively.

2.3.1 Type A line defects

A particularly important class of integrable systems is given by the Gaudin model [38, 39], whose Lax pair is formed of two \(\mathfrak {g}\)-valued meromorphic functions L and M on \(\mathbb {C}P^1\) with L having poles at certain marked points \(z_i \in \mathbb {C}\) for \(i = 1, \ldots , N\). In order to view \(B_z\) and \(- A_t\) as such a Lax pair, but working on a more general Riemann surface C, we would like them to be meromorphic instead of holomorphic, with \(B_z\) having poles at certain marked points \(z_i \in C\). To this end, we pick and fix elements \(u_i \in \mathfrak {g}\) and introduce G-valued fields \(h_i\) on \(\mathbb {R}\) for \(i=1,\ldots , N\). Following [17], see also [11], we add to the action (2) the following sum of line defects

$$\begin{aligned} S_{A-\mathrm{def}}\big [ A, \{h_i\}_{i=1}^N \big ] = - \sum _{i=1}^N \int _{\mathbb R \times \{z_i\}} \big \langle u_i, h_i^{-1}(d _\mathbb {R}+ \iota _{z_i}^*A) h_i \big \rangle \end{aligned}$$
(7)

where \(\iota _{z_i} : \mathbb {R}\times \{z_i\} \hookrightarrow \mathbb {R}\times C\) is the embedding of the line defect at \(z_i\) into the total space. In particular, the pullback \(\iota _{z_i}^*A\) is just the evaluation of the component \(A_t d t\) at the point \(z_i \in C\) so that we can rewrite the defect action (7) more explicitly as

$$\begin{aligned} S_{A-\mathrm{def}}\big [ A, \{h_i\}_{i=1}^N \big ] = - \sum _{i=1}^N \int _\mathbb {R}\big \langle u_i, h_i^{-1}\big (\partial _t + A_t(z_i)\big ) h_i \big \rangle d t. \end{aligned}$$

In order to maintain the gauge invariance of the action (2) after adding (7) to it, we should require that the latter is itself gauge invariant. This can easily be achieved by supplementing the gauge transformations (3) of the fields A and B by the transformation

$$\begin{aligned} h_i \longmapsto g h_i \end{aligned}$$
(8)

for the G-valued fields \(h_i\), \(i = 1, \ldots , N\).

Consider now the extended action

$$\begin{aligned} \widetilde{S}\big [ A,B, \{h_i\}_{i=1}^N \big ] :=S_\mathrm{3d}[A,B] + S_{A-\mathrm{def}}\big [ A, \{h_i\}_{i=1}^N \big ]. \end{aligned}$$
(9)

Since the defect action (7) does not depend on B, the equations of motion (4a) for B are unchanged. On the other hand, the equation of motion (5b) for A in a local holomorphic coordinate z on an open neighbourhood U of the point \(z_i\) is now replaced by

$$\begin{aligned} \partial _{\bar{z}} B_z = [B_z, A_{\bar{z}}] - 2\pi \mathsf {i}\, \widehat{u}_i \delta _{z z_i}, \end{aligned}$$
(10)

where we introduced \(\widehat{u}_i :=h_i u_i h_i^{-1}\) for each \(i = 1,\ldots , N\) and \(\delta _{z z_i}\) denotes the Dirac \(\delta \)-distribution at the marked point \(z_i\) with the property that

$$\begin{aligned} \int _U f(z) \delta _{z z_i} d z \wedge d \bar{z} = f(z_i) \end{aligned}$$
(11)

for any function \(f : U \rightarrow \mathbb {C}\) on the neighbourhood \(U \subset C\) of \(z_i\) equipped with the local holomorphic coordinate z.

In the gauge \(A_{\bar{z}} = 0\), the modified equation of motion (10) reads

$$\begin{aligned} \partial _{\bar{z}} B_z = - 2 \pi \mathsf {i}\, \widehat{u}_i \delta _{z z_i}. \end{aligned}$$
(12)

Using the fact that \(\partial _{\bar{z}} (z-z_i)^{-1} = - 2 \pi \mathsf {i}\delta _{zz_i}\) we may rewrite this equation as

$$\begin{aligned} \partial _{\bar{z}} \bigg ( B_z - \frac{\widehat{u}_i}{z-z_i} \bigg ) = 0 \end{aligned}$$

which tells us that \(B_z\) has a simple pole at \(z_i\) with residue \(\widehat{u}_i\) there, i.e.

$$\begin{aligned} B = \frac{\widehat{u}_i}{z-z_i} d z + O(1) \end{aligned}$$
(13)

where O(1) denotes terms which are holomorphic at the point \(z_i\).

When \(C = \mathbb {C}P^1\), corresponding to the Gaudin model, if we fix a global coordinate z on \(\mathbb {C}\subset \mathbb {C}P^1\) and require B to have a simple pole also at infinity, then we can explicitly write B as the \(\mathfrak {g}\)-valued meromorphic (1, 0)-form

$$\begin{aligned} B = \sum _{i=1}^N \frac{\widehat{u}_i}{z - z_i} d z. \end{aligned}$$
(14)

By varying the action (9) with respect to \(h_i \rightarrow e^{\epsilon _i} h_i\) for some \(\mathfrak {g}\)-valued function \(\epsilon _i\) on \(\mathbb {R}\), we find N further equations of motion

$$\begin{aligned} \partial _t \widehat{u}_i = [ - A_t(z_i), \widehat{u}_i ] \end{aligned}$$
(15)

for \(i = 1, \ldots , N\). But given the meromorphic behaviour (13) of the (1, 0)-form B at each of the marked points \(z_i\), these are merely consequences of the equation of motion (5c) given by taking the residue at each \(z_i\), assuming that \(A_t\) is regular at \(z_i\), as will be the case in Sect. 2.3.2.

2.3.2 Type B line defects

The type A line defects introduced in Sect. 2.3.1 ensured that \(B_z\) is no longer holomorphic in the gauge \(A_{\bar{z}} = 0\) but rather meromorphic with poles at certain marked points \(z_i \in C\). The type B line defects will have a similar effect on the field \(A_t\). However, since \(- A_t\) is meant to play the role of M in the Lax pair (6), we want it to be built out of \(B_z\), which plays the role of the Lax matrix L.

Let \(P : \mathfrak {g}\rightarrow \mathbb {C}\) be a G-invariant polynomial on \(\mathfrak {g}\) and fix a point \(w \in C\) distinct from the marked points \(z_i \in C\) for \(i=1,\ldots , N\) at which the type A line defects are inserted in Sect. 2.3.1. We consider the following line defect

$$\begin{aligned} S_{B-\mathrm{def}}[B] = - \int _{\mathbb R \times \{w\}} P(B_z) d t = - \int _\mathbb {R}P\big ( B_z(w) \big ) d t \end{aligned}$$
(16)

where z is a local holomorphic coordinate around the point \(w \in C\) and, writing \(B = B_z d z\) in this coordinate, \(B_z(w)\) denotes the evaluation of \(B_z\) at the point w.

The G-invariance of the polynomial P ensures that the action (16) is gauge invariant. Therefore, adding it to the gauge invariant action (9) obtained so far, we obtain the full gauge invariant action

$$\begin{aligned} S\big [ A,B, \{h_i\}_{i=1}^N \big ] :=S_\mathrm{3d}[A,B] + S_{A-\mathrm{def}}\big [ A, \{h_i\}_{i=1}^N \big ] + S_{B-\mathrm{def}}[B]. \end{aligned}$$
(17)

Since the defect term (16) only depends on B, it will not modify the equations of motion for A. Only the equation of motion for B, namely (5a) which has so far remained unchanged, will be modified. To derive it, we note that the variation of the defect action (16), under the variation \(B \rightarrow B + \epsilon \) considered in Sect. 2.2 with \(\epsilon = \epsilon _z d z\) in the local holomorphic coordinate z, reads

$$\begin{aligned} \delta S_{B-\mathrm{def}}[B]&:=S_{B-\mathrm{def}}[B+\epsilon ] - S_{B-\mathrm{def}}[B]\\&= - \int _\mathbb {R}\Big ( P\big ( B_z(w) + \epsilon _z(w) \big ) - P\big ( B_z(w) \big ) \Big ) d t\\&= - \int _\mathbb {R}\Big ( \big \langle P'(B_z(w)), \epsilon _z(w) \big \rangle + O\big ( \epsilon _z(w)^2 \big ) \Big ) d t \end{aligned}$$

where in the third line we introduced the element \(P'(B_z(w)) \in \mathfrak {g}\) such that the linear map \(\langle P'(B_z(w)), \cdot \rangle : \mathfrak {g}\rightarrow \mathbb {C}\) is the derivative of \(P: \mathfrak {g}\rightarrow \mathbb {C}\) at \(B_z(w)\) and kept only the terms linear in \(\epsilon _z(w)\). It follows that (5a) is now replaced by

$$\begin{aligned} \partial _{\bar{z}} A_t - \partial _t A_{\bar{z}} = [A_t, A_{\bar{z}}] + 2 \pi \mathsf {i}\, P'\big ( B_z(w) \big ) \delta _{zw}. \end{aligned}$$
(18)

In the gauge \(A_{\bar{z}} = 0\), this simplifies to

$$\begin{aligned} \partial _{\bar{z}} A_t = 2 \pi \mathsf {i}\, P'\big ( B_z(w) \big ) \delta _{zw} \end{aligned}$$
(19)

or in other words,

$$\begin{aligned} \partial _{\bar{z}} \bigg ( A_t + \frac{P'\big ( B_z(w) \big )}{z - w} \bigg ) = 0. \end{aligned}$$

In the case \(C = \mathbb {C}P^1\), this tells us that the expression in brackets is a constant. Taking this constant to be zero, we therefore obtain

$$\begin{aligned} - A_t(z) = \frac{P'\big ( B_z(w) \big )}{z - w}, \end{aligned}$$
(20)

which coincides with the usual expression for \(M = - A_t\) in terms of \(L = B_z\), see, for instance, [3, (3.33)] in the case when \(\mathfrak {g}= \mathfrak {gl}_r\) and the polynomial \(P : \mathfrak {gl}_r \rightarrow \mathbb {C}\) is given by \(X \mapsto tr (X^n)\) for some \(n \in \mathbb {Z}_{\ge 1}\). Indeed, in this case we have \(P'(X) = n X^{n-1}\) for any \(X \in \mathfrak {gl}_r\) so that (20) becomes

$$\begin{aligned} - A_t(z) = n \frac{B_z(w)^{n-1}}{z - w}. \end{aligned}$$
(21)

In connection with the Hamiltonian analysis to be performed in Sect. 3, where the classical r-matrix \(r_{{\mathfrak {12}}}(z,w) = \frac{C_{{\mathfrak {12}}}}{w-z}\) will be introduced in (45), note that we can rewrite (21) in the more recognisable form

$$\begin{aligned} - A_t(z) = - n \, tr _{\mathfrak {2}} \big ( r_{{\mathfrak {12}}}(z,w) B_z(w)_{\mathfrak {2}}^{n-1} \big ). \end{aligned}$$

Substituting the expression (20) for \(A_t\) into the equation of motion (5c), we obtain the desired Lax equation

$$\begin{aligned} \partial _t B_z(z) = \bigg [ \frac{P'\big ( B_z(w) \big )}{z - w}, B_z(z) \bigg ], \end{aligned}$$
(22)

where we have explicitly written the dependence of \(B_z\) on the spectral parameters. We thus expect from the general theory of integrable systems, see, for instance, Proposition [3, p. 47], that the time coordinate t along the topological direction of the 3-dimensional space \(\mathbb {R}\times C\) is identified, through the introduction of the type B defect (16), with the time induced by the Hamiltonian

$$\begin{aligned} H^P_w :=P(B_z(w)). \end{aligned}$$
(23)

To confirm this, we will move to the Hamiltonian formalism in Sect. 3.

2.4 Unifying \(\mathbf{1}\)-dimensional action

We have now shown that the gauge fixed equations of motion for 3d mixed BF theory in the presence of type A and B defects correspond exactly to the Lax equation (22) of the Gaudin model with Lax matrix \(L(z) = B_z(z)\) given by (14), where the residues \(\widehat{u}_i = h_i u_i h_i^{-1}\) are coadjoint orbits through the fixed elements \(u_i \in \mathfrak {g}\) and parametrised by the dynamical G-valued variables \(h_i \in G\).

At this stage, it is therefore natural to proceed along the lines of [20], where a unifying 2d action for integrable field theories of affine Gaudin type was derived from the 4d Chern–Simons action of [17]. In a similar spirit, in the present context we would like to obtain a 1d action for the Gaudin model with Lax matrix (14) starting from the 3d mixed BF theory with both type A and type B defects. In fact, the procedure followed in [11] is closer in spirit to the present case since we do not have to deal with the presence of a meromorphic 1-form \(\omega \) having zeroes, as in the 4d Chern–Simons action considered in [20].

Following [11, §2.6], we will therefore substitute the solutions to the equations of motion (10) and (18) (but crucially not (5c)) in the gauge \(A_{\bar{z}} = 0\), namely (14) and (20), respectively, into the full action (17). We will do this for the three pieces in the action separately. For the bulk action (2) we find

$$\begin{aligned} S_\mathrm{3d}[A,B]&= \frac{1}{2 \pi \mathsf {i}} \int _{\mathbb {R}\times \mathbb {C}} \langle B_z, \partial _{\bar{z}} A_t - \partial _t A_{\bar{z}} - [A_t, A_{\bar{z}}] \rangle d z \wedge d \bar{z} \wedge d t\\&= \frac{1}{2 \pi \mathsf {i}} \int _{\mathbb {R}\times \mathbb {C}} \langle B_z, \partial _{\bar{z}} A_t \rangle d z \wedge d \bar{z} \wedge d t\\&= \int _{\mathbb {R}} \big \langle L(w), P'\big ( L(w) \big ) \big \rangle d t \end{aligned}$$

where in the second equality we used the gauge \(A_{\bar{z}} = 0\). In the last equality, we used the fact that \(B_z\) is identified with the Lax matrix L together with the identity (19) and then performed the integral over \(\mathbb {C}\) using the presence of the \(\delta \)-function.

For the type A defect action (7), we have

$$\begin{aligned} S_{A-\mathrm{def}}\big [ A, \{h_i\}_{i=1}^N \big ]&= - \sum _{i=1}^N \int _{\mathbb {R}} \langle u_i, h_i^{-1}\partial _t h_i \rangle d t - \sum _{i=1}^N \int _{\mathbb {R}} \langle \widehat{u}_i, A_t(z_i) \rangle d t\\&= - \sum _{i=1}^N \int _{\mathbb {R}} \langle u_i, h_i^{-1}\partial _t h_i \rangle d t - \sum _{i=1}^N \int _{\mathbb {R}} \bigg \langle \widehat{u}_i, \frac{P'(L(w))}{w - z_i} \bigg \rangle d t\\&= - \sum _{i=1}^N \int _{\mathbb {R}} \langle u_i, h_i^{-1}\partial _t h_i \rangle d t - \int _{\mathbb {R}} \big \langle L(w), P'\big ( L(w)\big ) \big \rangle d t, \end{aligned}$$

where in the second equality we used (20) evaluated at \(z = z_i\) and in the last line we recognised the sum over i in the second term as the expression for the Lax matrix \(L(w) = B_z(w)\) in (14). Note that the second term on the right-hand side exactly cancels the expression found above for the bulk action \(S_\mathrm{3d}[A,B]\).

Finally, the type B defect action (16) is simply \(S_{B-\mathrm{def}}[B] = - \int _\mathbb {R}H^P_w d t\) using the expression (23) for the Hamiltonian alluded to in Sect. 2.3.2 and to be confirmed in Sect. 3. Putting all the above together, we deduce that the full action (17) reduces to the simple form

$$\begin{aligned} S_\mathrm{1d}\big [ \{h_i\}_{i=1}^N \big ] = - \sum _{i=1}^N \int _{\mathbb {R}} \langle u_i, h_i^{-1}\partial _t h_i \rangle d t - \int _{\mathbb {R}} H^P_w d t, \end{aligned}$$
(24)

where we have suppressed the dependence on the fields A and B since these have now been expressed in terms of the dynamical variables \(h_i \in G\) and the fixed elements \(u_i \in \mathfrak {g}\) for \(i = 1, \ldots , N\). We recognise (24) as the first-order action

$$\begin{aligned} S\big [ \{h_i\}_{i=1}^N \big ] = \sum _{i=1}^N \int _{\mathbb {R}} \langle X_i, h_i^{-1}\partial _t h_i \rangle d t - \int _{\mathbb {R}} H^P_w d t, \end{aligned}$$

associated with the Hamiltonian \(H^P_w\) in (23) but where the conjugate momentum \(X_i \in \mathfrak {g}\) of \(h_i \in G\) has been fixed to the constant element \(X_i = - u_i\). This is consistent with the Hamiltonian analysis to be performed in the next section. Namely, we find in Sect. 3.1.2 that there is a primary constraint \(X_i + u_i \approx 0\) on the conjugate momentum \(X_i \in \mathfrak {g}\) of the dynamical variable \(h_i \in G\).

We can check directly that the equations of motion of the 1d action (24) are given by (15), with \(A_t\) as in (20), by varying it with respect to \(h_i \rightarrow e^{\epsilon _i} h_i\) for some arbitrary \(\mathfrak {g}\)-valued variable \(\epsilon _i\). Under this variation, the Lax matrix L(w) transforms to

$$\begin{aligned} \sum _{i=1}^N \frac{e^{\epsilon _i}\widehat{u}_i e^{-\epsilon _i}}{w-z_i} = L(w) +\sum _{i=1}^N\frac{[{\epsilon _i}, {\widehat{u}_i}]}{w-z_i} + O(\epsilon _i^2). \end{aligned}$$

Hence, using the explicit expression \(H= P(L(w))\) for the Hamiltonian, the variation in the action is given by

$$\begin{aligned} \delta S_\mathrm{1d}&:=S_\mathrm{1d}[\{e^{\epsilon _i} h_i\}_{i=1}^N] - S_\mathrm{1d}[\{h_i\}_{i=1}^N]\\&= - \sum _{i=1}^N \int _{\mathbb {R}} \big \langle u_i, h_i^{-1}e^{-\epsilon _i}\partial _t (e^{\epsilon _i}h_i) - h_i^{-1}\partial _t h_i \big \rangle d t\\&\qquad \qquad - \int _{\mathbb {R}} \bigg ( P\bigg (L(w) +\sum _{i=1}^N\frac{[{\epsilon _i}, {\widehat{u}_i}]}{w-z_i}\bigg ) - P\big (L(w)\big ) \bigg ) d t\\&= - \sum _{i=1}^N \int _{\mathbb {R}} \bigg ( \langle \widehat{u}_i, \partial _t \epsilon _i \rangle + \bigg \langle P'\big (L(w)\big ), \frac{[{\epsilon _i}, {\widehat{u}_i}]}{w-z_i} \bigg \rangle + O(\epsilon _i^2) \bigg )d t\\&= \sum _{i=1}^N \int _{\mathbb {R}} \bigg ( \bigg \langle \partial _t \widehat{u}_i - \bigg [ \frac{P'(L(w))}{z_i - w}, \widehat{u}_i \bigg ], \epsilon _i \bigg \rangle + O(\epsilon _i^2) \bigg ) d t, \end{aligned}$$

where in the last equality we have used Stokes’s theorem and the adjoint invariance of the bilinear form. The N equations of motion for the \(h_i\) are therefore

$$\begin{aligned} \partial _t \widehat{u}_i = \left[ {\frac{P'(L(w))}{z_i -w}},{\widehat{u}_i}\right] , \end{aligned}$$

and using (20), we do indeed recover the equations of motion (15) found previously from adding in type A defects in Sect. 2.3.1.

3 Hamiltonian analysis

Throughout this section, we shall work in some local coordinate z on some open subset of C. Our starting point is the Lagrangian density of the action (17) written in terms of the components of the \(\mathfrak {g}\)-valued bulk fields \(A = A_{\bar{z}} d \bar{z} + A_t d t\) and \(B = B_z d z\) and in terms of the G-valued defect variables \(h_i\) for \(i = 1, \ldots , N\) which we introduced at the type A defects in Sect. 2.3.1, namely

$$\begin{aligned} \mathcal {L}\big (A,B, \{h_i\}_{i=1}^N \big )&= \frac{1}{2 \pi \mathsf {i}}\big \langle B_z, \partial _{\bar{z}} A_t - \partial _t A_{\bar{z}} + [{A_{\bar{z}}}, {A_t}] \big \rangle \nonumber \\&\qquad \qquad - \sum _{i=1}^N \big \langle u_i, h_i^{-1}(\partial _t + A_t)h_i \big \rangle \delta _{z z_i} - P(B_z(w)) \delta _{zw}. \end{aligned}$$
(25)

3.1 Conjugate momenta and primary constraints

To move to the Hamiltonian formalism we first determine the conjugate momenta of the bulk fields \(A_{\bar{z}}\), \(A_t\) and \(B_z\) and the defect variables \(h_i\). We shall find various primary constraints, some of which will be second class. We shall impose the latter strongly at this stage by introducing a corresponding Dirac bracket. To alleviate the notation, all Dirac brackets computed in this section will ultimately be renamed simply as \(\{ \cdot , \cdot \}\) before moving on to Sect. 3.2 where we work out the secondary constraints.

We begin in Sect. 3.1.1 by considering the conjugate momenta of the bulk fields \(A_{\bar{z}}\), \(A_t\) and \(B_z\), as the conjugate momenta to the G-valued defect variables \(h_i\) will need to be handled with more care, as discussed in Sect. 3.1.2.

3.1.1 Bulk canonical fields

The conjugate momenta to the \(\mathfrak {g}\)-valued bulk fields \(A_{\bar{z}}\), \(A_t\) and \(B_z\) are the \(\mathfrak {g}\)-valued fields given, respectively, by

$$\begin{aligned} \Pi _t =\frac{\partial \mathcal L}{\partial (\partial _t A_t)} = 0, \quad \Pi _{\bar{z}} =\frac{\partial \mathcal L}{\partial (\partial _t A_{\bar{z}})} = - \frac{1}{2 \pi \mathsf {i}}B_z, \quad P_z =\frac{\partial \mathcal L}{\partial (\partial _t B_z)} = 0, \end{aligned}$$

which satisfy the canonical Poisson bracket relations

$$\begin{aligned} \{\Pi _{t{\mathfrak {1}}}(z), A_{t{\mathfrak {2}}}(z')\}&= C_{{\mathfrak {12}}}\delta _{zz'},\\ \{\Pi _{\bar{z}{\mathfrak {1}}}(z), A_{\bar{z}{\mathfrak {2}}}(z')\}&= C_{{\mathfrak {12}}}\delta _{zz'},\\ \{P_{z{\mathfrak {1}}}(z),B_{z{\mathfrak {2}}}(z') \}&= C_{{\mathfrak {12}}}\delta _{zz'}. \end{aligned}$$

Here and in what follows we use the standard tensor notation. In particular, if we fix dual bases \(\{ I^a \}\) and \(\{ I_a \}\) of \(\mathfrak {g}\) with respect to the non-degenerate invariant symmetric bilinear form \(\langle \cdot , \cdot \rangle : \mathfrak {g}\times \mathfrak {g}\rightarrow \mathbb {C}\) introduced in Sect. 2, then \(C_{{\mathfrak {12}}} = I^a \otimes I_a \) is the split Casimir. Also for \(\mathfrak {g}\)-valued fields F and G, which can be written in components as \(F = F_a I^a\) and \(G = G_a I^a\), we have \(\{ F_{\mathfrak {1}}(z), G_{\mathfrak {2}}(z') \} :=\{ F_a(z), G_b(z') \} I^a \otimes I^b\). The Dirac \(\delta \)-distribution \(\delta _{zz'}\) was defined in (11).

We have three primary constraints associated with the bulk fields, namely

$$\begin{aligned} \Pi _t \approx 0 , \qquad \mathcal C_{z} :=B_z + 2 \pi \mathsf {i}\Pi _{\bar{z}} \approx 0, \qquad P_z \approx 0. \end{aligned}$$
(26)

Throughout this paper, we use the conventional notation ‘\(\approx \)’ to denote equality on the constraint surface [42]. The first constraint in (26) is clearly first class and the latter two are second class with Poisson bracket

$$\begin{aligned} \{P_{z{\mathfrak {1}}}(z), \mathcal C_{z{\mathfrak {2}}}(z')\} = C_{{\mathfrak {12}}} \delta _{z z'}. \end{aligned}$$

We set these both strongly to zero immediately, by which we mean restricting to the submanifold of phase space specified by \(P_z = 0\) and \(\mathcal C_z = 0\) and replacing the Poisson bracket with the corresponding Dirac bracket [42]. With respect to the latter we still have the same relations between the remaining fields, i.e.

$$\begin{aligned} \{\Pi _{t{\mathfrak {1}}}(z), A_{t{\mathfrak {2}}}(z')\}&= C_{{\mathfrak {12}}}\delta _{z z'}, \end{aligned}$$
(27a)
$$\begin{aligned} \{\Pi _{\bar{z} {\mathfrak {1}}}(z), A_{\bar{z} {\mathfrak {2}}}(z')\}&= C_{{\mathfrak {12}}}\delta _{z z'}, \end{aligned}$$
(27b)

and hence, by an abuse of notation, we will continue to denote this Diract bracket as \(\{\cdot , \cdot \}\).

3.1.2 Defect canonical variables

We have yet to find the conjugate momenta to the G-valued variables \(h_i\), \(i = 1, \ldots , N\) introduced at the type A defects. This can be done by working in local coordinates \(\phi ^\alpha \) on the group G where \(\alpha \) ranges from 1 to \(\dim G\), the dimension of G. We refer the reader, for instance, to [44, §3.1.2] for details. Each variable \(h_i \in G\) can then be described locally in terms of the \(\dim G\) variables \(\phi ^\alpha _i :=\phi ^\alpha (h_i)\).

The relevant part of the Lagrangian in finding the conjugate momenta is

$$\begin{aligned} - \langle u_i, h_i^{-1} \partial _t h_i \rangle = - \langle u_i, \partial _t \phi ^\alpha _i h_i^{-1}\partial _\alpha h_i \rangle , \end{aligned}$$

which in the second expression we have rewritten in terms of the local coordinates \(\phi ^\alpha _i\), where \(\partial _\alpha \) denotes the partial derivative with respect to the coordinate \(\phi ^\alpha \). The corresponding conjugate momenta are therefore given by

$$\begin{aligned} \pi _{i, \alpha } = \frac{\partial \mathcal L}{ \partial (\partial _t \phi ^\alpha _i)} = - \langle u_i, h_i^{-1} \partial _\alpha h_i \rangle , \end{aligned}$$
(28)

and these have the usual canonical Poisson bracket relations

$$\begin{aligned} \{\phi ^\alpha _i,\phi ^\beta _j\} = 0, \qquad \{\pi _{i, \alpha },\pi _{j, \beta } \} = 0, \qquad \{\pi _{i, \alpha },\phi ^\beta _j\} = \delta ^\beta _\alpha \delta _{ij}. \end{aligned}$$

To return to a coordinate free description of the phase space, we define a matrix \(L^a_{\;\; \alpha }\) for some fixed basis \(\{I_a\}\) of \(\mathfrak {g}\) such that

$$\begin{aligned} h_i^{-1}\partial _\alpha h_i = L^a_{\;\; \alpha } I_a. \end{aligned}$$
(29)

This \(L^a_{\;\; \alpha }\) is invertable and we denote the inverse as \(L^\alpha _{\;\; a}\) following the conventions of [44, §3.1.2]. We can then introduce a coordinate-free \(\mathfrak {g}\)-valued variable \(X_i\) which encodes the conjugate momentum \(\pi _{i, \alpha }\) as

$$\begin{aligned} X_i :=L^\alpha _{\;\; a} \pi _{i, \alpha } I^a, \end{aligned}$$
(30)

where \(\{ I^a \}\) is the basis of \(\mathfrak {g}\) dual to \(\{ I_a\}\) with respect to the bilinear form \(\langle \cdot , \cdot \rangle \). We therefore have a coordinate free description of the phase space, parameterised by a pair of fields \((X_i, h_i)\) at each defect valued in \(TG \simeq \mathfrak {g}\times G\), with the canonical Poisson brackets in local coordinates being equivalent to, see, for instance, [44],

$$\begin{aligned} \{h_{i{\mathfrak {1}}},h_{j{\mathfrak {2}}}\}&= 0, \end{aligned}$$
(31a)
$$\begin{aligned} \{X_{i{\mathfrak {1}}}, h_{j{\mathfrak {2}}} \}&= h_{i{\mathfrak {2}}} C_{{\mathfrak {12}}} \delta _{ij},\end{aligned}$$
(31b)
$$\begin{aligned} \{X_{i{\mathfrak {1}}}, X_{j{\mathfrak {2}}} \}&= - [C_{{\mathfrak {12}}},X_{i{\mathfrak {2}}}]\delta _{ij}. \end{aligned}$$
(31c)

for each \(i, j = 1, \ldots , N\).

Using the definition of the matrix \(L^a_{\;\;\alpha }\) in (29), we have

$$\begin{aligned} L^\alpha _{\;\; a} \langle u_i, h_i^{-1} \partial _\alpha h_i \rangle I^a = \langle u_i, L^\alpha _{\;\; a} L^b_{\;\; \alpha } I_b \rangle I^a = \langle u_i, I_a \rangle I^a = u_i. \end{aligned}$$

It then follows from the expression (28) for \(\pi _{i, \alpha }\) above, derived from the Lagrangian, and the definition (30) of \(X_i\) that we have a primary constraint of the form

$$\begin{aligned} \mathcal C_i :=X_i + u_i \approx 0 \end{aligned}$$
(32)

for each defect \(i=1, \dots , N\). These N primary constraints are not entirely first or second class. Indeed, their Poisson brackets

$$\begin{aligned} \{\mathcal {C}_{i{\mathfrak {1}}},\mathcal {C}_{j{\mathfrak {2}}}\} = \{X_{i{\mathfrak {1}}},X_{j{\mathfrak {2}}}\} = - [C_{{\mathfrak {12}}},\mathcal C_{i{\mathfrak {2}}} - u_{i{\mathfrak {2}}}] \delta _{ij} \approx [C_{{\mathfrak {12}}}, u_{i{\mathfrak {2}}}] \delta _{ij}, \end{aligned}$$
(33)

are non-vanishing on the constraint surface (32) and are not generally invertible.

Let \(\{ v^i_p \}_{p=1}^{d_i}\) be a basis of the centraliser \(\mathfrak {g}^{u_i} :=\ker (ad _{u_i})\) of the element \(u_i \in \mathfrak {g}\), with \(d_i :=\dim \mathfrak {g}^{u_i}\) for each \(i = 1, \ldots , N\). The first class part of each \(\mathcal C_i\) is given by the set of constraints \(\mathcal C^p_i :=\langle v^i_p, \mathcal C_i\rangle \) for \(p = 1, \ldots , d_i\). These satisfy the relations

$$\begin{aligned} \{ \mathcal {C}^p_i, \mathcal {C}_j \} \approx \big \langle v^i_{p{\mathfrak {1}}}, [C_{{\mathfrak {12}}}, u_{i{\mathfrak {2}}}] \big \rangle _{\mathfrak {1}} \delta _{ij} = [v^i_p, u_i] \delta _{ij} = 0, \end{aligned}$$
(34)

for every \(i, j = 1, \ldots , N\) and \(p = 1, \ldots , d_i\), where the last equality uses the fact that \(v^i_p \in \mathfrak {g}^{u_i}\). In particular, we have \(\{ \mathcal {C}^p_i, \mathcal {C}^q_j \} \approx 0\) for any \(q=1, \ldots , d_j\) so that the set of constraints \(\mathcal C^p_i\) for \(p = 1, \ldots , d_i\), \(i = 1, \ldots , N\) are indeed first class. It also follows from (31b) that the first class constraints \(\mathcal C^p_i\) generate right multiplication of the \(h_i\) by elements \(e^{\epsilon v^i_p}\) of the centraliser \(G^{u_i}\) of \(u_i\) in G –note that under such transformations the \(\mathfrak {g}\)-valued variables \(\widehat{u}_i\) are invariant.

Let us extend the basis \(\{ v^i_p \}_{p=1}^{d_i}\) of the centraliser \(\mathfrak {g}^{u_i}\) to a basis \(\{ v^i_p \}_{p=1}^{d_i} \cup \{ \widetilde{v}^i_r \}_{r=1}^{c_i}\) of \(\mathfrak {g}\) where \(c_i :=\dim \mathfrak {g}- d_i\). We claim that the remaining constraints \(\widetilde{\mathcal C}^r_i :=\langle \widetilde{v}^i_r, \mathcal C_i\rangle \) for \(r = 1, \ldots , c_i\) contained in \(\mathcal C_i\) are second class. We need to show that the matrix \(\{ \widetilde{\mathcal C}^r_i, \widetilde{\mathcal C}^s_i \}\) for \(r,s = 1, \ldots , c_i\) is invertible on the constraint surface \(\mathcal C_i \approx 0\). If this were not the case then we would have \(\sum _{s=1}^{c_i} \{ \widetilde{\mathcal C}^r_i, \widetilde{\mathcal C}^s_i \} a_s \approx 0\) for some \(a_s \in \mathbb {C}\) with \(s = 1, \ldots , c_i\). On the other hand, we also know from (34) that \(\sum _{s=1}^{c_i} \{ \mathcal C^p_i, \widetilde{\mathcal C}^s_i \} a_s \approx 0\) for all \(p = 1, \ldots , d_i\). Combining these statements we have

$$\begin{aligned} 0 \approx \sum _{s=1}^{c_i} \{ \mathcal C_i, \widetilde{\mathcal C}^s_i \} a_s = \sum _{s=1}^{c_i} \big \{ \mathcal C_i, \langle \widetilde{v}^i_s, \mathcal C_i \rangle \big \} a_s \approx \sum _{s=1}^{c_i} \big \langle \widetilde{v}^i_{s{\mathfrak {2}}}, [C_{{\mathfrak {12}}} , u_{i{\mathfrak {2}}}] \big \rangle _{{\mathfrak {2}}} a_s = \bigg [ u_i, \sum _{s=1}^{c_i} a_s \widetilde{v}^i_s \bigg ], \end{aligned}$$

where in the third step we used (33). It follows that \(\sum _{s=1}^{c_i} a_s \widetilde{v}^i_s \in \mathfrak {g}^{u_i}\) which contradicts the assumption that \(\{ \widetilde{v}^i_r \}_{r=1}^{c_i}\) is the basis of some complement of \(\mathfrak {g}^{u_i}\) in \(\mathfrak {g}\).

We would like to impose suitable gauge fixing conditions \(\mathcal D^p_i \approx 0\), for \(p = 1, \ldots , d_i\), to fix the first class constraints \(\mathcal C^p_i\) and move to a Dirac bracket \(\{ \cdot , \cdot \}^*\) which fixes the constraints \(\mathcal C_i \approx 0\) strongly. In particular, we would like to compute the Dirac bracket \(\{ \widehat{u}_{i{\mathfrak {1}}}, \widehat{u}_{j{\mathfrak {2}}} \}^*\) of the \(\mathfrak {g}\)-valued variables \(\widehat{u}_i = h_i u_i h_i^{-1}\) for \(i=1, \ldots , N\). It turns out that the result is independent of the choice of gauge fixing condition \(\mathcal D^p_i \approx 0\). Indeed, consider the variables \(\widehat{X}_i :=h_i X_i h_i^{-1}\). One deduces from (31) that they have the Poisson brackets

$$\begin{aligned} \{ \widehat{X}_{i {\mathfrak {1}}}, \widehat{X}_{j {\mathfrak {2}}} \}&= [ C_{{\mathfrak {12}}}, \widehat{X}_{i{\mathfrak {2}}} ]\delta _{ij}, \end{aligned}$$
(35a)
$$\begin{aligned} \{ X_{i{\mathfrak {1}}}, \widehat{X}_{j {\mathfrak {2}}} \}&=0 \end{aligned}$$
(35b)

for each \(i, j = 1, \ldots , N\). In particular, it follows from (35b) that \(\{ \mathcal C_{i{\mathfrak {1}}}, \widehat{X}_{j{\mathfrak {2}}} \} = 0\) for any \(i,j = 1, \ldots , N\). Now the matrix of Poisson brackets between the set of all second class constraints \(\mathcal C^p_i\), \(\mathcal D^p_i\) for \(p = 1, \ldots , d_i\) and \(\widetilde{\mathcal C}^r_i\) for \(r = 1, \ldots , c_i\) is of the block form

$$\begin{aligned} \left( \begin{matrix} 0 &{}\quad *&{}\quad 0\\ *&{}\quad *&{}\quad *\\ 0 &{}\quad *&{}\quad *\end{matrix}\right) \end{aligned}$$
(36)

where the first, second and third block rows and columns correspond to the set of constraints \(\mathcal C^p_i\), \(\mathcal D^p_i\) and \(\widetilde{\mathcal C}^r_i\), respectively. Each ‘\(*\)’ denotes a possibly non-zero block matrix. The matrix (36) is invertible since the blocks in position (1, 2), (2, 1) and (3, 3) are all invertible by design. Its inverse is then schematically of the block form

$$\begin{aligned} \left( \begin{matrix} 0 &{}\quad *&{}\quad 0\\ *&{}\quad *&{}\quad *\\ 0 &{}\quad *&{}\quad *\end{matrix}\right) ^{-1} = \left( \begin{matrix} *&{}\quad *&{}\quad *\\ *&{}\quad 0 &{}\quad 0\\ *&{}\quad 0 &{}\quad *\end{matrix}\right) . \end{aligned}$$
(37)

Since \(\{ \mathcal C^p_i, \widehat{X}_j \} = \{ \widetilde{\mathcal C}^r_i, \widehat{X}_j \} = 0\) for all \(p = 1, \ldots , d_i\) and \(r = 1, \ldots , c_i\), the zero block in the middle of the right-hand side of (37) implies that the Poisson brackets (35a) will remain unchanged when passing to the Dirac bracket, i.e. we have

$$\begin{aligned} \{ \widehat{X}_{i {\mathfrak {1}}}, \widehat{X}_{j {\mathfrak {2}}}\}^*= [ C_{{\mathfrak {12}}}, \widehat{X}_{i{\mathfrak {2}}}]\delta _{ij}. \end{aligned}$$

Finally, using the fact that \(\widehat{X}_i = - \widehat{u}_i\) after imposing the constraint \(\mathcal C_i = 0\) strongly, we deduce that the \(\mathfrak {g}\)-valued variables \(\widehat{u}_i\) for \(i=1, \ldots , N\) satisfy N commuting copies of the Kostant–Kirillov bracket

$$\begin{aligned} \{ \widehat{u}_{i {\mathfrak {1}}}, \widehat{u}_{j {\mathfrak {2}}}\}^*= - [ C_{{\mathfrak {12}}}, \widehat{u}_{i{\mathfrak {2}}}]\delta _{ij}. \end{aligned}$$
(38)

To avoid overburdening the notation, and since we shall need to introduce a further Dirac bracket in Sect. 3.3, we will denote the Dirac bracket \(\{ \cdot , \cdot \}^*\) introduced above simply as \(\{ \cdot , \cdot \}\) from now on.

3.2 Hamiltonian and secondary constraints

The Hamiltonian density is defined as the Legendre transform of the Lagrangian density (25). However, since the field \(A_t\) is non-dynamical, i.e. there are no time derivatives of \(A_t\) in the action, we shall perform the Legendre transform only with respect to the dynamical fields \(A_{\bar{z}}\), \(B_z\) and the dynamical variables \(h_i\). So we define

$$\begin{aligned} \mathcal H&:=\langle \Pi _{\bar{z}}, \partial _t A_{\bar{z}} \rangle + \langle P_z, \partial _t B_z \rangle + \langle X_i, h_i^{-1}\partial _t h_i \rangle - \mathcal L\big ( A,B, \{h_i\}_{i=1}^N \big ) \\&= \frac{1}{2 \pi \mathsf {i}} \langle \mathcal C_z, \partial _t A_{\bar{z}} \rangle + \langle P_z, \partial _t B_z \rangle + \sum _{i=1}^N\langle \mathcal C_i, h_i^{-1}\partial _t h_i \rangle \\&\qquad \qquad \qquad - \frac{1}{2 \pi \mathsf {i}} \langle B_z, \partial _{\bar{z}} A_t + [A_{\bar{z}},A_t] \rangle + \sum _{i=1}^N \langle \widehat{u} _i, A_t \rangle \delta _{zz_i} + H^P_w \delta _{zw} \end{aligned}$$

where in the second line we have used the definition of the bulk constraint \(\mathcal C_z\) in (26) and of the defect constraints \(\mathcal C_i\) for \(i = 1, \ldots , N\) in (32). Since we have already set these along with \(P_z\) strongly to zero, we can drop the corresponding terms in the Hamiltonian density.

The Hamiltonian is the integral of the Hamiltonian density over C, namely

$$\begin{aligned} H&:=\int _C \mathcal H \, d z \wedge d \bar{z}\\&= - \frac{1}{2 \pi \mathsf {i}} \big \langle \bigg \langle B_z, \partial _{\bar{z}} A_t + [A_{\bar{z}},A_t] \big \rangle \bigg \rangle + \int _C \bigg ( \sum _{i=1}^N \langle \widehat{u} _i, A_t \rangle \delta _{zz_i} \bigg ) d z \wedge d \bar{z} + H^P_w, \end{aligned}$$

where in the first term of the right-hand side we introduced the notation

$$\begin{aligned} \big \langle \bigg \langle X, Y \big \rangle \bigg \rangle :=\int _C \langle X, Y \rangle \, d z \wedge d \bar{z} \end{aligned}$$

for any \(\mathfrak {g}\)-valued fields X and Y on C.

It is convenient to introduce the bulk \(\mathfrak {g}\)-valued field

$$\begin{aligned} \mu :=\frac{1}{2\pi \mathsf {i}} \big ( \partial _{\bar{z}}B_z + [{A_{\bar{z}}}, {B_z}] \big ) = - \partial _{\bar{z}}\Pi _{\bar{z}} - [A_{\bar{z}},\Pi _{\bar{z}}], \end{aligned}$$
(39)

where in the second equality we have used the constraint \(\mathcal C_z = 0\) in (26) which is now imposed strongly. Introducing also the \(\mathfrak {g}\)-valued field

$$\begin{aligned} \widehat{\mu }:=\mu + \sum _{i=1}^N \widehat{u}_i \delta _{zz_i}, \end{aligned}$$
(40)

the Hamiltonian can be rewritten succinctly as

$$\begin{aligned} H = \big \langle \bigg \langle \widehat{\mu }, A_t \big \rangle \bigg \rangle + H^P_w. \end{aligned}$$
(41)

At this point, it is interesting to note the similarity between the \(\mathfrak {g}\)-valued fields \(\mu \) and \(\widehat{\mu }\) just introduced and the moment map of the Hitchin system [43] (we refer the reader to [3, §7.11] for a concise review of Hitchin systems). The latter is the moment map on the cotangent bundle \(T^*\mathcal A\) of the space \(\mathcal A\) of (0, 1)-forms on the Riemann surface C, parameterised by the (0, 1)-form A and the Higgs field \(\Phi \), defined by \(\mu _\mathrm{Hit} :=\bar{\partial }_A \Phi = \bar{\partial }\Phi + [A, \Phi ]\). The phase space of the Hitchin system without marked points is defined as the symplectic reduction to the level surface \(\mu _\mathrm{Hit} = 0\). This would coincide exactly, upon identifying \(B_z\) with the Higgs field \(\Phi \) and the (0, 1)-form A with \(A_{\bar{z}}\), with the condition \(\mu = 0\). In the presence of marked points \(z_i\) the level of the moment map of the Hitchin system is chosen instead to be \(\sum _{i=1}^N \widehat{u}_i \delta _{zz_i}\). As we will see, this level surface corresponds exactly to the constraint \(\widehat{\mu } \approx 0\) coming from the gauge invariance in 3d mixed BF theory with the defects introduced in Sect. 2.3. Without the defects, the constraint would reduce to \(\mu \approx 0\) in pure BF theory.

3.2.1 Gauge invariance

We need to ensure that the remaining primary constraint, \(\Pi _t \approx 0\), is preserved under time evolution. That is,

$$\begin{aligned} \{H,\Pi _t\} = \widehat{\mu }\approx 0, \end{aligned}$$

giving rise to the secondary constraint \(\widehat{\mu }\approx 0\). We see from the canonical brackets (27) that \(-\widehat{\mu }\) is the generator of gauge transformations (3) on the fields \(A_{\bar{z}}\) and \(B_z\) since

$$\begin{aligned} \{\widehat{\mu }_{\mathfrak {1}}(z), A_{\bar{z}{\mathfrak {2}}}(z') \}&= - [C_{{\mathfrak {12}}}, A_{\bar{z}{\mathfrak {2}}}(z)]\delta _{zz'} - \partial _{\bar{z}}( C_{{\mathfrak {12}}} \delta _{zz'}) \end{aligned}$$
(42a)
$$\begin{aligned} \{\widehat{\mu }_{\mathfrak {1}}(z), B_{z{\mathfrak {2}}}(z')\}&= - [C_{{\mathfrak {12}}}, B_{z{\mathfrak {2}}}(z)] \delta _{zz'}, \end{aligned}$$
(42b)

where we have used the identity \([{C_{{\mathfrak {12}}}}, {A_{\bar{z}{\mathfrak {2}}}(z) +A_{\bar{z}{\mathfrak {1}}}}](z) = 0\) for (42a). Note that the moment map \(\mu \) satisfies the following Poisson bracket

$$\begin{aligned}&\{\mu _{\mathfrak {1}}(z), \mu _{\mathfrak {2}}(z') \} = \frac{1}{2\pi \mathsf {i}}\{\mu _{\mathfrak {1}}(z),\partial _{\bar{z}'}B_{z{\mathfrak {2}}}(z') \} + \frac{1}{2\pi \mathsf {i}}\big \{\mu _{\mathfrak {1}}(z), [A_{\bar{z}{\mathfrak {2}}}(z'),B_{z{\mathfrak {2}}}(z')] \big \}\\&\quad = \frac{1}{2 \pi \mathsf {i}} \Big ( -\partial _{\bar{z}'}[C_{{\mathfrak {12}}} \delta _{zz'},B_{z{\mathfrak {2}}}(z')] - \big [ A_{\bar{z}{\mathfrak {2}}}(z), [C_{{\mathfrak {12}}},B_{z{\mathfrak {2}}}(z')] \big ]\delta _{zz'}\\&\qquad \qquad \qquad \qquad \qquad \qquad - \big [ [C_{{\mathfrak {12}}}, A_{\bar{z}{\mathfrak {2}}}(z)]\delta _{zz'}+ \partial _{\bar{z}}(C_{{\mathfrak {12}}} \delta _{zz'}), B_{z{\mathfrak {2}}}(z') \big ] \Big )\\&\quad =\frac{1}{2\pi \mathsf {i}}\Big ( - [C_{{\mathfrak {12}}},\partial _{\bar{z}}B_{z{\mathfrak {2}}}(z)] \delta _{zz'} - \big [ C_{{\mathfrak {12}}},[A_{\bar{z}{\mathfrak {2}}}(z), B_{z{\mathfrak {2}}}(z')] \big ]\delta _{zz'} \Big ) \\&\quad = - [{C_{{\mathfrak {12}}}}, {\mu _{\mathfrak {2}}(z)}] \delta _{zz'}, \end{aligned}$$

where in the second equality we used the relations (26), which also trivially hold with \(\widehat{\mu }\) replaced by \(\mu \). In the third equality, we have used the Jacobi identity and the fact that \(\partial _{\bar{z}} \delta _{zz'} + \partial _{\bar{z}'} \delta _{zz'} = 0\), which follows using the identity \(\partial _{\bar{z}} (z-z')^{-1} = - 2 \pi \mathsf {i}\delta _{zz'}\).

The Poisson bracket of \(\widehat{\mu }\) with itself is therefore

$$\begin{aligned} \{\widehat{\mu }_{\mathfrak {1}}(z), \widehat{\mu }_{\mathfrak {2}}(z')\}&= \{\mu _{\mathfrak {1}}(z),\mu _{\mathfrak {2}}(z') \} + \sum _{i,j=1}^N \{ \widehat{u}_{i{\mathfrak {1}}}, \widehat{u}_{j{\mathfrak {2}}} \} \delta _{zz_i}\delta _{z'z_j} \\&= - [C_{{\mathfrak {12}}}, \mu _{\mathfrak {2}}(z)]\delta _{zz'} - \sum _{i=1}^N [ C_{{\mathfrak {12}}}, \widehat{u}_{i{\mathfrak {2}}} \delta _{zz_i} ] \delta _{z'z_i} = - [C_{{\mathfrak {12}}}, \widehat{\mu }_{\mathfrak {2}}(z)] \delta _{zz'} \end{aligned}$$

where in the second equality we have used (38) for the second term. This vanishes on the constraint surface so \(\widehat{\mu }\) is first class–we will set it strongly to zero with an appropriate gauge fixing condition in the following section.

The time evolution of \(\widehat{\mu }\) is given by

$$\begin{aligned} \{H, \widehat{\mu }(z) \}&\approx \frac{1}{2 \pi \mathsf {i}} \big \{ H^P_w, [A_{\bar{z}}(z), B_z(z)] \big \}\\&= \frac{1}{2 \pi \mathsf {i}} \big [ \{ H^P_w, A_{\bar{z}}(z) \}, B_z(z) \big ] = - [{P'(B_z(w))}, {B_z(z)}]\delta _{zw} = 0, \end{aligned}$$

and therefore, we have no tertiary constraints.

3.3 Gauge fixing and Lax formalism

Recall that so far we have fixed the pair of second class constraints \(P_z \approx 0\) and \(\mathcal C_z \approx 0\) by introducing the corresponding Dirac bracket in Sect. 3.1.1. We kept the notation \(\{ \cdot , \cdot \}\) for this Dirac bracket. In Sect. 3.1.2, we introduced a further Dirac bracket \(\{ \cdot , \cdot \}^*\) to fix the constraints \(\mathcal C_i \approx 0\). As mentioned at the end of that section, by abuse of notation we continued to call this Dirac bracket \(\{ \cdot , \cdot \}\) since the Dirac bracket of the bulk fields is unaffected. In this section, we start with the latter Dirac bracket and wish to fix the gauge invariance arising from the constraint \(\widehat{\mu }\approx 0\).

We will use the gauge fixing condition \(A_{\bar{z}} \approx 0\) and simultaneously impose this condition and the constraint \(\widehat{\mu }\approx 0\) strongly by defining a new Dirac bracket. To this end, recall that

$$\begin{aligned} \{\widehat{\mu }_{\mathfrak {1}}(z), A_{\bar{z}{\mathfrak {2}}}(z')\} = - [C_{{\mathfrak {12}}}, A_{\bar{z}{\mathfrak {2}}}(z)] \delta _{zz'} - \partial _{\bar{z}} (C_{{\mathfrak {12}}} \delta _{zz'}) \approx - \partial _{\bar{z}} (C_{{\mathfrak {12}}} \delta _{zz'}) \end{aligned}$$

where the first equality is (42a) and in the last step we have used the new constraint \(A_{\bar{z}} \approx 0\). This can be inverted, since

$$\begin{aligned} \left\langle \bigg \langle - \partial _{\bar{z}}( C_{{\mathfrak {12}}} \delta _{zz'}), \frac{1}{2\pi \mathsf {i}} \frac{C_{{\mathfrak {23}}}}{z' - z''} \right\rangle \bigg \rangle _{(z', {\mathfrak {2}})}= & {} \frac{\mathsf {i}}{2\pi } C_{{\mathfrak {13}}} \partial _{\bar{z}} \left( \frac{1}{z-z''} \right) =\frac{\mathsf {i}}{2\pi } C_{{\mathfrak {13}}}(-2\pi \mathsf {i}\delta _{zz''})\nonumber \\= & {} C_{{\mathfrak {13}}}\delta _{zz''}. \end{aligned}$$
(43)

Here the subscript \((z', {\mathfrak {2}})\) means that the pairing \(\langle \cdot , \cdot \rangle \) is taken in the second tensor space and the integration is with respect to \(z'\). We therefore define the new Dirac bracket, denoted \(\{\cdot , \cdot \}^\star \) for \(\mathfrak {g}\)-valued functions U and V on C, by the usual formula [42, §1.3.3], see also [65, §2.6] for the analogous derivation in the 4d Chern–Simons theory context, namely

$$\begin{aligned}&\{U_{\mathfrak {1}}(z), V_{\mathfrak {2}}(z')\}^\star = \{U_{\mathfrak {1}}(z), V_{\mathfrak {2}}(z')\} \\&\qquad - \big \langle \bigg \langle \{U_{\mathfrak {1}}(z), \widehat{\mu }_{\mathfrak {3}}(z'')\}, \big \langle \bigg \langle \frac{1}{2 \pi \mathsf {i}} \frac{C_{{\mathfrak {34}}}}{z'' - z'''}, \{A_{\bar{z}{\mathfrak {4}}}(z'''),V_{\mathfrak {2}}(z')\} \big \rangle \bigg \rangle _{(z''',{\mathfrak {4}})} \big \rangle \bigg \rangle _{(z'',{\mathfrak {3}})} \\&\qquad - \big \langle \bigg \langle \{U_{\mathfrak {1}}(z), A_{\bar{z}{\mathfrak {3}}}(z''')\}, \big \langle \bigg \langle \frac{1}{2\pi \mathsf {i}} \frac{C_{{\mathfrak {34}}}}{z'' - z'''}, \{\widehat{\mu }_{\mathfrak {4}}(z'''),V_{\mathfrak {2}}(z')\} \big \rangle \bigg \rangle _{(z''',{\mathfrak {4}})} \big \rangle \bigg \rangle _{(z'',{\mathfrak {3}})}. \end{aligned}$$

By construction, working with this Dirac bracket allows us to set the pair of constraints \(\widehat{\mu } \approx 0\) and \(A_{\bar{z}} \approx 0\) strongly to zero.

3.3.1 Lax algebra

We will show that the Dirac bracket of \(B_z\) with itself satisfies the Lax algebra

$$\begin{aligned} \{B_{z{\mathfrak {1}}}(z), B_{z{\mathfrak {2}}}(z')\}^\star = \big [ r_{{\mathfrak {12}}}(z,z'), B_{z{\mathfrak {1}}}(z) + B_{z{\mathfrak {2}}}(z') \big ], \end{aligned}$$
(44)

where \(r_{{\mathfrak {12}}}(z,z')\) is the standard classical r-matrix

$$\begin{aligned} r_{{\mathfrak {12}}}(z,z') = \frac{C_{{\mathfrak {12}}}}{z'-z}. \end{aligned}$$
(45)

To compute this Dirac bracket, we begin by noting that (42b) implies

$$\begin{aligned} \{B_{z{\mathfrak {1}}}(z), \widehat{\mu }_{\mathfrak {2}}(z')\} = [C_{{\mathfrak {12}}}, B_{z{\mathfrak {1}}}(z)] \delta _{zz'}. \end{aligned}$$

Using this and the bracket \(\{ B_{z {\mathfrak {1}}}(z), A_{\bar{z} {\mathfrak {2}}}(z') \} = -2 \pi \mathsf {i}C_{{\mathfrak {12}}} \delta _{zz'}\) which follows from (27b) along with the constraint \(\mathcal C_z = 0\) in (26), we find

$$\begin{aligned}&\{B_{z{\mathfrak {1}}}(z), B_{z{\mathfrak {2}}}(z')\}^\star \\&\qquad = - \big \langle \bigg \langle [C_{{\mathfrak {13}}},B_{z{\mathfrak {1}}}(z)] \delta _{zz''}, \big \langle \bigg \langle \frac{1}{2 \pi \mathsf {i}} \frac{C_{{\mathfrak {34}}}}{z'' - z'''}, 2 \pi \mathsf {i}C_{{\mathfrak {24}}}\delta _{z'z'''} \big \rangle \bigg \rangle _{(z''',{\mathfrak {4}})} \big \rangle \bigg \rangle _{(z'',{\mathfrak {3}})}\\&\qquad \qquad - \big \langle \bigg \langle - 2\pi \mathsf {i}C_{{\mathfrak {13}}}\delta _{zz''},\big \langle \bigg \langle \frac{1}{2 \pi \mathsf {i}}\frac{C_{{\mathfrak {34}}}}{z'' - z'''}, - [C_{{\mathfrak {24}}}, B_{z{\mathfrak {2}}}(z')] \delta _{z'z'''} \big \rangle \bigg \rangle _{(z''',{\mathfrak {4}})} \big \rangle \bigg \rangle _{(z'',{\mathfrak {3}})}\\&\qquad = - \big \langle \bigg \langle [{C_{{\mathfrak {13}}}}, {B_{z{\mathfrak {1}}}(z)}] \delta _{zz''}, \frac{ C_{{\mathfrak {23}}}}{z''-z'} \big \rangle \bigg \rangle _{(z'',{\mathfrak {3}})} - \big \langle \bigg \langle C_{{\mathfrak {13}}}\delta _{zz''}, [{ \frac{C_{{\mathfrak {23}}}}{z'' - z'}}, {B_{z{\mathfrak {2}}}(z')}] \big \rangle \bigg \rangle _{(z'',{\mathfrak {3}})}\\&\qquad = - \bigg [{\frac{C_{{\mathfrak {12}}}}{z-z'}}, {B_{z{\mathfrak {1}}}(z)}\bigg ] - \bigg [{\frac{C_{{\mathfrak {12}}}}{z-z'}}, {B_{z{\mathfrak {2}}}(z')}\bigg ] = \bigg [{\frac{C_{{\mathfrak {12}}}}{z'-z}}, {B_{z{\mathfrak {1}}}(z) + B_{z{\mathfrak {2}}}(z')}\bigg ]. \end{aligned}$$

In other words, we recover the Lax algebra (44).

3.3.2 Lax matrix

By definition of \(\widehat{\mu }\) in (40), it follows that setting this constraint and its gauge fixing condition to zero strongly, i.e. \(\widehat{\mu }= 0\) and \(A_{\bar{z}} = 0\), leads to the equation

$$\begin{aligned} \partial _{\bar{z}}B_z = - 2\pi \mathsf {i}\, \sum _{i=1}^N \widehat{u}_i \delta _{zz_i}, \end{aligned}$$
(46)

or \(\partial _{\bar{z}}B_z = - 2\pi \mathsf {i}\, \widehat{u}_i \delta _{zz_i}\) in a small neighbourhood of the point \(z_i\), which is equivalent to (12). This then leads to the local meromorphic behaviour (13) of the (1, 0)-form B, namely

$$\begin{aligned} B = \frac{\widehat{u}_i}{z-z_i} d z + O(1). \end{aligned}$$

The Kostant–Kirillov bracket (38) for the residues \(\widehat{u}_i\) obtained in Sect. 3.1.2 (recall that we are now denoting the Dirac bracket \(\{\cdot , \cdot \}^*\) of Sect. 3.1.2 simply as \(\{\cdot , \cdot \}\)) is equivalent to the Lax algebra (44) derived in Sect. 3.3.1.

3.3.3 Lax equation

At this point, we have now fixed all the constraints strongly except for the primary constraint \(\Pi _t \approx 0\). However, now that \(\widehat{\mu }= 0\) is imposed strongly, the Hamiltonian (41) no longer involves the field \(A_t\) and simply reduces to

$$\begin{aligned} H = H^P_w. \end{aligned}$$

In particular, together with the Dirac bracket (44) this now implies the Lax equation (22) in the Hamiltonian formalism

$$\begin{aligned} \big \{ H^P_w, B_z(z) \big \}^\star = \bigg [ \frac{P'\big ( B_z(w) \big )}{z - w}, B_z(z) \bigg ]. \end{aligned}$$
(47)

We deduce, as claimed at the end of Sect. 2, that the time flow \(\partial _t\) along the topological direction of the 3-dimensional space \(\mathbb {R}\times C\) is the one induced by the Hamiltonian \(H^P_w = P(B_z(w))\) with respect to the Dirac bracket, i.e. \(\partial _t f = \{ H^P_w, f \}^\star \) for any function f of the Lax matrix \(B_z\). Focusing on such observables, we are also free to set \(\Pi _t = 0\) strongly since these all Poisson commute with \(\Pi _t\) under the Dirac bracket \(\{ \cdot , \cdot \}^\star \) and so their bracket will remain unchanged after introducing a further Dirac bracket to fix the constraint \(\Pi _t \approx 0\).

3.3.4 Involution

It is well known [3] that the Lax algebra (44) implies the involution property

$$\begin{aligned} \big \{ H^P_w, H^Q_z \big \}^\star = 0, \end{aligned}$$
(48)

for any pair of G-invariant polynomials \(P, Q : \mathfrak {g}\rightarrow \mathbb {C}\) and distinct points \(w,z \in C\).

This can also be seen more directly from the above Hamiltonian analysis of 3d mixed BF theory as follows. Since \(H^P_w = P(B_z(w))\) only depends on the field \(B_z\), we have the involution property

$$\begin{aligned} \big \{ H^P_w, H^Q_z \big \} = 0 \end{aligned}$$
(49)

with respect to the Poisson bracket (more precisely, recall that \(\{ \cdot , \cdot \}\) denotes the Dirac bracket introduced in Sect. 3.1), for any polynomials \(P, Q : \mathfrak {g}\rightarrow \mathbb {C}\) and distinct points \(w, z \in C\). But since \(H^P_w\) is gauge invariant for G-invariant polynomials P and \(- \widehat{\mu }\) is the generator of gauge transformations, see (42), we have \(\{\widehat{\mu }(z), H^P_w\} = 0\). The involution property (49), for any polynomials \(P, Q : \mathfrak {g}\rightarrow \mathbb {C}\), therefore immediately implies the involution property (48), for any G-invariant polynomials \(P, Q : \mathfrak {g}\rightarrow \mathbb {C}\).

4 Discussion

In this article, we showed that Gaudin models associated with a finite-dimensional semisimple Lie algebra, and more generally tamely ramified Hitchin systems, can be obtained from 3d mixed BF theory in the presence of certain line defects by moving to the Hamiltonian framework and fixing the gauge symmetry using the gauge fixing condition \(A_{\bar{z}} \approx 0\).

This 3-dimensional gauge theoretic origin of finite-dimensional Gaudin models is exactly analogous to that of affine Gaudin models in terms of 4-dimensional mixed topological–holomorphic Chern–Simons theory [65].

We note in passing that it would be interesting to relate the connection between affine Gaudin models and 2d integrable field theories, in the context of the present work, to another description of 2d integrable field theories via an affine extension of the Hitchin system considered in [47].

4.1 Alternative realisations

The Lax matrix of the Gaudin model, or the Higgs field of the Hitchin system, arises from the (1, 0)-form B of the 3d mixed BF theory. In particular, after going to the gauge \(A_{\bar{z}} = 0\) the latter becomes meromorphic with simple poles (13) at each \(z_i\), the location of the type A line defects. The specific choice of line defect (7) led to the residues of B at these simple poles being coadjoint orbits \(\widehat{u}_i = h_i u_i h_i^{-1}\) of some fixed Lie algebra elements \(u_i \in \mathfrak {g}\). As is well known, and as we have rederived in the present setting in Sect. 3.1.2, such coadjoint orbits provide a realisation of the Kostant–Kirillov Poisson bracket (38).

It would be interesting to see if other realisations of the Kostant–Kirillov Poisson bracket can be obtained by making other choices of type A defects than (7). Indeed, since the field \(B_z\) satisfies the Lax algebra (44) regardless of the choice of type A line defects we make, the residues \(\widehat{u}_i\) at each simple pole \(z_i\) of \(B_z\) will necessarily satisfy the Kostant–Kirillov bracket. As mentioned in the affine case in [65, §4.1], it would be desirable to find the precise dictionary between the possible choices of type A line defects one can introduce in 3d mixed BF theory and different types of possible representations of the Kostant–Kirillov bracket.

4.2 Generalised Gaudin models

We have focused in this paper on the case when the Lax matrix of the Gaudin model, or the Higgs field of the Hitchin system, has simple poles at the marked points \(z_i\).

It would be interesting to consider also type A line defects which would give rise to higher-order poles in the Lax matrix in order to construct Gaudin models with irregular singularities [28, 29, 63]. In the affine setting, generalised surface defects in 4-dimensional Chern–Simons theory leading to affine Gaudin models with irregular singularities were recently considered in [8, 46].

Other generalisations of the Gaudin model which one could try to relate to 3d mixed BF theory, or some generalisation thereof, include cyclotomic Gaudin models [58, 62, 63] or dihedral Gaudin models (see [64] in the affine case), whose Lax matrices are equivariant under the action of cyclic or dihedral groups, respectively. In the affine case, such a generalisation was considered recently in [57] where the symmetric space \(\lambda \)-model, which can be described as a \(\mathbb {Z}_4\)-cyclotomic affine Gaudin model, was obtained along the lines of [65] starting from 4d Chern–Simons theory with a \(\mathbb {Z}_4\)-equivariance condition imposed on the gauge field.

As another direction for further possible generalisation, it would be interesting to understand whether the generalised Gaudin-type models of [10], which are based on solutions of the classical Yang–Baxter equation and classical N-reflection equations, can similarly be obtained from 3d mixed BF theory or some generalisation thereof.

4.3 Quantum Gaudin models

The 4-dimensional gauge theoretic origin of 2-dimensional integrable field theories, as proposed by Costello and Yamazaki in [17], has been extensively studied over the past couple of years, see for instance [1, 8, 9, 11, 14, 20, 21, 32,33,34,35,36, 45, 46, 56, 57, 59,60,61].

The proposal of [64], see also [19, 44], to reformulate non-ultralocal integrable field theories with twist functions as affine Gaudin models similarly provides a deeper origin, more algebraic in nature, of the integrable structure in these theories.

Both the gauge theoretic and algebraic approaches to 2-dimensional integrable field theories, which are of course intimately related [65], have been used to construct many new examples of 2-dimensional classical integrable field theories in recent years; see, for instance, [2, 4, 18, 19] in the affine Gaudin model setting and the references above in the 4d Chern–Simons theory setting. Finite Gaudin models, or equivalently 3d mixed BF theory, could similarly be used to extend the list of known finite-dimensional integrable systems.

However, the main interest in both approaches lies in their potential to offer new perspectives on various long-standing open problems in quantum integrable field theory, such as the problem of quantisation of non-ultralocal integrable field theories or the search for a deeper understanding of the celebrated ODE/IM correspondence [5, 6, 22, 26, 48]. Indeed, one of the main original motivations in [64] for reformulating non-ultralocal integrable field theories as affine Gaudin models was the remarkable observation made in [26], based on the example of quantum KdV theory, that this may provide an explanation for the ODE/IM correspondence in terms of some suitable affine generalisation of the geometric Langlands correspondence.

By contrast with the affine case, however, the quantisation of the finite Gaudin model, and more generally of the Hitchin system, is extremely well understood; see e.g. [7, 27,28,29,30,31, 49,50,51,52,53, 55]. The connection between 3d mixed BF theory and finite Gaudin models should therefore provide a useful toy model for further developing our understanding of the gauge theoretic approach to integrable models and more generally integrable field theories in the sense of [17]. In particular, it would be very desirable to understand the Bethe ansatz construction in quantum Gaudin models, and more generally the Gaudin/oper correspondence [28,29,30,31, 49, 50], from the point of view of quantum 3d mixed BF theory. In fact, the quantisation of 3d mixed BF theory and its embedding in 4d super-Yang–Mills theory was also recently described in [37], in relation to the analytic version of the geometric Langlands correspondence proposed in [23,24,25]. It is expected that the quantum Gaudin model, and more generally the quantisation of the Hitchin system, should arise from critical level quantum 3d Chern–Simons theory [37, 66], a certain deformation of quantum 3d mixed BF theory, in accordance with the central role played by the critical level in the quantum Gaudin model [28,29,30,31, 49, 50].