On circuit diameter bounds via circuit imbalances

We study the circuit diameter of polyhedra, introduced by Borgwardt, Finhold, and Hemmecke (SIAM J. Discrete Math. 29 (1), 113–121 (2015)) as a relaxation of the combinatorial diameter. We show that the circuit diameter of a system {x ∈ R^n : Ax = b, 0 ≤ x ≤ u} for A ∈ R^{m×n} is bounded by O(m min{m, n − m} log(m + κ_A) + n log n), where κ_A is the circuit imbalance measure of the constraint matrix. This yields a strongly polynomial circuit diameter bound if, e.g., all entries of A have polynomially bounded encoding length in n. Further, we present circuit augmentation algorithms for LPs using the minimum-ratio circuit cancelling rule. Even though the standard minimum-ratio circuit cancelling algorithm is not finite in general, our variant can solve an LP in O(mn² log(n + κ_A)) augmentation steps.


Introduction
The combinatorial diameter of a polyhedron P is the diameter of the vertex-edge graph associated with P. Hirsch's famous conjecture from 1957 asserted that the combinatorial diameter of a d-dimensional polytope (bounded polyhedron) with f facets is at most f − d. This was disproved by Santos in 2012 [San12]. The polynomial Hirsch conjecture, i.e., finding a poly(f) bound on the combinatorial diameter, remains a central question in the theory of linear programming.
The first quasipolynomial bound was given by Kalai and Kleitman [Kal92, KK92]; see [Suk17] for the best current bound and an overview of the literature. Dyer and Frieze [DF94] proved the polynomial Hirsch conjecture for totally unimodular (TU) matrices. For a system {x ∈ R^d : Mx ≤ b} with integer constraint matrix M, polynomial diameter bounds were given in terms of the maximum subdeterminant ∆_M [BDSE+14, BR13, EV17, DH16]. These arguments can be strengthened to use a parametrization by a 'discrete curvature measure' δ_M ≥ 1/(d∆²_M). The best such bound was given by Dadush and Hähnle [DH16] as O(d³ log(d/δ_M)/δ_M), using a shadow vertex simplex algorithm.
As a natural relaxation of the combinatorial diameter, Borgwardt, Finhold, and Hemmecke [BFH15] initiated the study of circuit diameters. Consider a polyhedron in standard equality form

P = {x ∈ R^n : Ax = b, x ≥ 0} (P)

for A ∈ R^{m×n}, b ∈ R^m; we assume rk(A) = m. For the linear space W = ker(A) ⊆ R^n, g ∈ W is an elementary vector if g is a support-minimal nonzero vector in W, that is, no h ∈ W \ {0} exists such that supp(h) ⊊ supp(g). A circuit in W is the support of some elementary vector; these are precisely the circuits of the associated linear matroid M(A). We remark that many papers on circuit diameter, e.g., [BSY18, BDLF16, BFH15, BDLFM18, KPS19], refer to elementary vectors as circuits; we follow the traditional convention of [Ful67, Roc67, Lee89]. We let E(W) = E(A) ⊆ W and C(W) = C(A) ⊆ 2^[n] denote the set of elementary vectors and circuits in the space W = ker(A), respectively. All edge directions of P are elementary vectors, and the set of elementary vectors E(A) equals the set of all possible edge directions of P in the form (P) for varying b ∈ R^m [ST97].
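To make the definitions concrete, here is a brute-force enumeration of the circuits of ker(A) for a tiny made-up matrix (not an example from the paper); a support-minimal dependent column set carries a one-dimensional kernel, and any nonzero vector in it is an elementary vector. This is exponential-time toy code, purely for illustration.

```python
from itertools import combinations

import numpy as np

def circuits(A, tol=1e-9):
    """Return the circuits of ker(A) as (support, elementary vector) pairs."""
    m, n = A.shape
    found = []
    for size in range(1, n + 1):
        for S in combinations(range(n), size):
            # A strict superset of a known circuit is not support-minimal.
            if any(set(C) <= set(S) for C, _ in found):
                continue
            AS = A[:, S]
            if np.linalg.matrix_rank(AS, tol=tol) < len(S):  # dependent columns
                g_S = np.linalg.svd(AS)[2][-1]  # spans the 1-dim kernel of A_S
                g = np.zeros(n)
                g[list(S)] = g_S
                found.append((S, g))
    return found

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
for S, g in circuits(A):
    print(S, np.round(g / g[0], 3))  # normalize the sign/scale for readability
```

Here the only circuit is {1, 2, 3} (0-indexed: (0, 1, 2)), carried by the elementary vector (1, 1, −1) up to scaling.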
A circuit walk is a sequence of points x^(0), x^(1), . . ., x^(k) in P such that for each i = 0, . . ., k − 1, x^(i+1) = x^(i) + α^(i) g^(i) for some g^(i) ∈ E(A) and α^(i) > 0, and further, x^(i) + αg^(i) ∉ P for any α > α^(i), i.e., each consecutive circuit step is maximal. The circuit diameter of P is the maximum length (number of steps) of a shortest circuit walk between any two vertices x, y ∈ P. Note that, in contrast to walks in the vertex-edge graph, circuit walks are non-reversible, and the minimum length from x to y may differ from that from y to x; this is due to the maximal step requirement. The circuit analogue of the Hirsch conjecture, formulated in [BFH15], asserts that the circuit diameter of a d-dimensional polyhedron with f facets is at most f − d; this may be true even for unbounded polyhedra, see [BSY18]. For P in the form (P), d = n − m and the number of facets is at most n; hence, the conjectured bound is m.
Circuit diameter bounds have been shown for some combinatorial polytopes such as dual transportation polyhedra [BFH15], and matching, travelling salesman, and fractional stable set polytopes [KPS19]. The paper [BDLF16] introduced several other variants of circuit diameter, and explored the relations between them. We note that [BDLF16, KPS19, DLKS22] consider circuits for LPs given in the general form {x ∈ R^n : Ax = b, Bx ≤ d}. In Section 8, we show that this setting can be reduced to the form (P).
Circuit augmentation algorithms Circuit diameter bounds are inherently related to circuit augmentation algorithms. This is a general algorithmic scheme to solve an LP

min ⟨c, x⟩ s.t. Ax = b, x ≥ 0. (LP)

The algorithm proceeds through a sequence of feasible solutions x^(t). An initial feasible x^(0) is required in the input. For t = 0, 1, . . ., the current x^(t) is updated to x^(t+1) = x^(t) + αg for some g ∈ E(A) such that ⟨c, g⟩ ≤ 0, and α > 0 such that x^(t) + αg is feasible. The elementary vector g is an augmenting direction if ⟨c, g⟩ < 0 and such an α > 0 exists; by LP duality, x^(t) is optimal if and only if no augmenting direction exists. The augmentation is maximal if x^(t) + α′g is infeasible for any α′ > α; α is called the maximal stepsize for x^(t) and g. Clearly, an upper bound on the number of steps of a circuit augmentation algorithm with maximal augmentations for an arbitrary cost c and starting point x^(0) yields an upper bound on the circuit diameter. The simplex method is a circuit augmentation algorithm that is restricted to using special elementary vectors corresponding to edges of the polyhedron. Many network optimization algorithms can be seen as special circuit augmentation algorithms. Bland [Bla76] introduced a circuit augmentation algorithm for LP that generalizes the Edmonds-Karp-Dinic maximum flow algorithm and its analysis; see also [Lee89, Proposition 3.1]. Circuit augmentation algorithms were revisited by De Loera, Hemmecke, and Lee in 2015 [DLHL15], analyzing different augmentation rules and also extending them to integer programming. De Loera, Kafer, and Sanità [DLKS22] studied the convergence of these rules on 0/1-polytopes, as well as the computational complexity of performing them. We refer the reader to [DLHL15] and [DLKS22] for a more detailed overview of the background and history of circuit augmentations.
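The scheme can be sketched on a made-up toy LP, min −x_1 s.t. x_1 + x_2 = 1, x ≥ 0, where ker(A) = span{(1, −1)} and the only elementary vectors are the multiples of (1, −1). The oracle and step-size rule below are hypothetical stand-ins hard-coded for this instance, not the general subroutines developed later.

```python
import numpy as np

c = np.array([-1.0, 0.0])
# Elementary vectors of ker([1, 1]), up to positive scaling: both signs.
E = [np.array([1.0, -1.0]), np.array([-1.0, 1.0])]

def augmenting_direction(x):
    """Return an elementary vector g with <c, g> < 0 along which x can move."""
    for g in E:
        if c @ g < 0 and all(g[i] >= 0 for i in range(2) if x[i] == 0):
            return g
    return None  # no augmenting direction exists: x is optimal

def maximal_step(x, g):
    """Largest alpha with x + alpha*g feasible (only x >= 0 can bind here)."""
    return min(-x[i] / g[i] for i in range(2) if g[i] < 0)

x = np.array([0.0, 1.0])  # initial feasible solution x^(0)
while (g := augmenting_direction(x)) is not None:
    x = x + maximal_step(x, g) * g  # maximal augmentation
print(x)  # the optimal vertex (1, 0)
```

A single maximal augmentation along (1, −1) moves x^(0) = (0, 1) to the optimum (1, 0).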
The circuit imbalance measure For a linear space W = ker(A) ⊆ R^n, the circuit imbalance κ_W = κ_A is defined as the maximum of |g_j/g_i| over all elementary vectors g ∈ E(W), i, j ∈ supp(g). It can be shown that κ_W = 1 if and only if W is a unimodular space, i.e., the kernel of a totally unimodular matrix. This parameter and related variants have been used implicitly or explicitly in many areas of linear programming and discrete optimization; see [ENV22] for a recent survey. It is closely related to the Dikin-Stewart-Todd condition number χ̄_W that plays a key role in layered-least-squares interior point methods introduced by Vavasis and Ye [VY96]. An LP of the form (LP) for A ∈ R^{m×n} can be solved in time poly(n, m, log κ_A), which is strongly polynomial if κ_A ≤ 2^poly(n); see [DHNV23, DNV20a] for recent developments and references.
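The definition can be checked numerically by brute force on a tiny made-up matrix: enumerate one elementary vector per circuit (κ is invariant under scaling) and take the largest ratio of nonzero entries. Exponential-time toy code, for illustration only.

```python
from itertools import combinations

import numpy as np

def kappa(A, tol=1e-9):
    """Brute-force circuit imbalance: max |g_j/g_i| over elementary vectors g."""
    m, n = A.shape
    best, found = 1.0, []
    for size in range(1, n + 1):
        for S in combinations(range(n), size):
            if any(set(C) <= set(S) for C in found):
                continue  # strict superset of a circuit: not support-minimal
            AS = A[:, S]
            if np.linalg.matrix_rank(AS, tol=tol) < len(S):
                found.append(S)
                g = np.linalg.svd(AS)[2][-1]       # spans the 1-dim kernel of A_S
                nz = np.abs(g[np.abs(g) > tol])
                best = max(best, nz.max() / nz.min())
    return best

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 2.0]])
print(kappa(A))  # ≈ 2: the single circuit carries the elementary vector (1, 2, -1)
```

For this matrix, the only circuit supports the elementary vector (1, 2, −1), so κ_A = 2.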

Imbalance and diameter
The combinatorial diameter bound O(d³ log(d/δ_M)/δ_M) from [DH16] mentioned above translates to a bound O((n − m)³ m κ_A log(κ_A + n)) for a system in the form (P), see [ENV22]. For circuit diameters, the Goldberg-Tarjan minimum-mean cycle cancelling algorithm for minimum-cost flows [GT89] naturally extends to a circuit augmentation algorithm for general LPs using the steepest-descent rule. This yields a circuit diameter bound O(n²m κ_A log(κ_A + n)) [ENV22], see also [GD21]. However, note that these bounds may be exponential in the bit-complexity of the input.

Our contributions
Our first main contribution improves the κ_A dependence of circuit diameter bounds to a log κ_A dependence.
Theorem 1.1. The circuit diameter of a system in the form (P) with constraint matrix A ∈ R^{m×n} is O(m min{m, n − m} log(m + κ_A)).

The proof in Section 3 is via a simple 'shoot towards the optimum' scheme. We need the well-known concept of conformal circuit decompositions. We say that x, y ∈ R^n are sign-compatible if x_i y_i ≥ 0 for all i ∈ [n]. We write x ⊑ y if they are sign-compatible and further |x_i| ≤ |y_i| for all i ∈ [n]. It follows from Carathéodory's theorem and the Minkowski-Weyl theorem that for any linear space W ⊆ R^n and x ∈ W, there exists a decomposition x = Σ_{j=1}^k h^(j) such that h^(j) ∈ E(W), h^(j) ⊑ x for all j ∈ [k], and k ≤ dim(W). This is called a conformal circuit decomposition of x (see also Definition 2.2 and Lemma 2.3 below).
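A conformal circuit decomposition can be computed by a simple sketch that leans on the existence result just quoted: every nonzero x ∈ ker(A) admits a conformal elementary vector, so repeatedly subtracting a maximal conformal multiple shrinks the support. The circuit enumeration is brute force and the instance is made up; this is toy code for tiny sizes only.

```python
from itertools import combinations

import numpy as np

def elementary_vectors(A, tol=1e-9):
    """All elementary vectors of ker(A): one per circuit, with both signs."""
    m, n = A.shape
    out, found = [], []
    for size in range(1, n + 1):
        for S in combinations(range(n), size):
            if any(set(C) <= set(S) for C in found):
                continue
            if np.linalg.matrix_rank(A[:, S], tol=tol) < len(S):
                found.append(S)
                g = np.zeros(n)
                g[list(S)] = np.linalg.svd(A[:, S])[2][-1]
                out += [g, -g]
    return out

def conformal_decomposition(A, x, tol=1e-9):
    """Write x in ker(A) as a sum of elementary vectors h with h conformal to x."""
    pieces, r = [], x.astype(float).copy()
    while np.linalg.norm(r) > tol:
        for h in elementary_vectors(A):
            in_supp = np.abs(h) > tol
            if np.all(np.abs(r[in_supp]) > tol) and np.all(h * r >= -tol):
                # supp(h) is inside supp(r) and h is sign-compatible with r
                alpha = np.min(r[in_supp] / h[in_supp])  # largest alpha*h ⊑ r
                pieces.append(alpha * h)
                r = r - alpha * h
                break
        else:
            raise ValueError("no conformal elementary vector: is x in ker(A)?")
    return pieces

A = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])
x = np.array([1.0, 2.0, -1.0, -2.0])   # lies in ker(A)
pieces = conformal_decomposition(A, x)
print(len(pieces))  # at most min{dim ker(A), |supp(x)|} terms
```

Each subtraction zeroes out at least one coordinate of the residual, so the loop ends after at most |supp(x)| rounds.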
Let B ⊆ [n] be a feasible basis and N = [n] \ B, i.e., x* = (A_B^{-1} b, 0_N) ≥ 0 is a basic feasible solution. This is the unique optimal solution to (LP) for the cost function c = (0_B, 1_N). Let x^(0) ∈ P be an arbitrary vertex. We may assume that n ≤ 2m, by restricting to the union of the supports of x* and x^(0), and setting all other variables to 0. For the current iterate x^(t), let us consider a conformal circuit decomposition x* − x^(t) = Σ_{j=1}^k h^(j). Note that the existence of such a decomposition does not yield a circuit diameter bound of n, due to the maximality requirement in the definition of circuit walks. For each j ∈ [k], x^(t) + h^(j) ∈ P, but there might be a larger augmentation x^(t) + αh^(j) ∈ P for some α > 1.
Still, one can use this decomposition to construct a circuit walk. Let us pick the most improving circuit from the decomposition, i.e., the one maximizing −⟨c, h^(j)⟩ = ‖h^(j)_N‖_1, and obtain x^(t+1) = x^(t) + α^(t) h^(j) for the maximum stepsize α^(t) ≥ 1. The proof of Theorem 1.1 is based on analyzing this procedure. The first key observation is that ⟨c, x^(t)⟩ = ‖x^(t)_N‖_1 decreases geometrically. Then, we look at two index sets: L_t, the indices i where x*_i is large compared to the current 'cost' ‖x^(t)_N‖_1, and R_t, the indices i where x^(t)_i is not much above x*_i; we show that indices may never leave these sets once they enter. Moreover, a new index is added to either set every O(m log(m + κ_A)) iterations. In Section 4, we extend this bound to the setting with upper bounds on the variables.
Theorem 1.2. The circuit diameter of a system in the form Ax = b, 0 ≤ x ≤ u with constraint matrix A ∈ R^{m×n} is O(m min{m, n − m} log(m + κ_A) + n log n).

There is a straightforward reduction from the capacitated form to (P) by adding n slack variables; however, this would give an O(n² log(n + κ_A)) bound. For the stronger bound, we use a preprocessing that involves cancelling circuits in the support of the current solution; this eliminates all but O(m) of the capacity bounds in O(n log n) iterations, independently of κ_A.
For rational input, log(κ_A) = O(size(A)), where size(A) denotes the total encoding length of A [DHNV23]. Hence, our result yields an O(m min{m, n − m} size(A) + n log n) diameter bound on Ax = b, 0 ≤ x ≤ u. This can be compared with the bound O(n size(A, b)) using deepest-descent augmentation steps in [DLHL15, DLKS22], where size(A, b) is the encoding length of (A, b). (Such a bound holds for every augmentation rule that decreases the optimality gap geometrically, including the minimum-ratio circuit rule discussed below.) Note that our bound is independent of b. Furthermore, it is also applicable to systems given by irrational inputs, in which case arguments based on subdeterminants and bit-complexity cannot be used.
In light of these results, the next important step towards the polynomial Hirsch conjecture might be to show a poly(n, log κ_A) bound on the combinatorial diameter of (P). Note that, in contrast with the circuit diameter, not even a poly(n, size(A, b)) bound is known. In this context, the best known general bound is O((n − m)³ m κ_A log(κ_A + n)), implied by [DH16].

Circuit augmentation algorithms
The diameter bounds in Theorems 1.1 and 1.2 rely on knowing the optimal solution x*; thus, they do not provide efficient LP algorithms. We next present circuit augmentation algorithms with poly(n, m, log κ_A) bounds on the number of iterations. Such algorithms require subroutines for finding augmenting circuits. In many cases, such subroutines are LPs themselves. However, they may be of a simpler form, and might be easier to solve in practice. Borgwardt and Viss [BV20] exhibit an implementation of a steepest-descent circuit augmentation algorithm with encouraging computational results.
We assume that a subroutine Ratio-Circuit(A, c, w) is available; this implements the well-known minimum-ratio circuit rule. It takes as input a matrix A ∈ R^{m×n}, c ∈ R^n, w ∈ (R_{++} ∪ {∞})^n, and returns a basic optimal solution to the system

min ⟨c, z⟩ s.t. Az = 0, ⟨w, z⁻⟩ ≤ 1.

Here, we use the convention w_i z_i = 0 if w_i = ∞ and z_i = 0. This system can be equivalently written as an LP using auxiliary variables. If bounded, a basic optimal solution is either 0 or an elementary vector z ∈ E(A) that minimizes ⟨c, z⟩/⟨w, z⁻⟩. Given x ∈ P, we use weights w_i = 1/x_i (with w_i = ∞ if x_i = 0). For minimum-cost flow problems, this rule was proposed by Wallacher [Wal89]; such a cycle can be found in strongly polynomial time for flows. The main advantage of this rule is that the optimality gap decreases by a factor 1 − 1/n in every iteration. This rule, along with the same convergence property, can be naturally extended to linear programming [MS00], and has found several combinatorial applications, e.g., [WZ99, Way02]; it has also been used in the context of integer programming [SW99].
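As an illustration, the following toy stand-in computes a minimum-ratio circuit by brute-force enumeration on a made-up instance (it is not the LP-based implementation of Ratio-Circuit): among signed elementary vectors z with ⟨c, z⟩ < 0 it minimizes ⟨c, z⟩/⟨w, z⁻⟩, returning the minimizer scaled so that ⟨w, z⁻⟩ = 1. Finite weights only.

```python
from itertools import combinations

import numpy as np

def elementary_vectors(A, tol=1e-9):
    m, n = A.shape
    out, found = [], []
    for size in range(1, n + 1):
        for S in combinations(range(n), size):
            if any(set(C) <= set(S) for C in found):
                continue
            if np.linalg.matrix_rank(A[:, S], tol=tol) < len(S):
                found.append(S)
                g = np.zeros(n)
                g[list(S)] = np.linalg.svd(A[:, S])[2][-1]
                out += [g, -g]
    return out

def ratio_circuit(A, c, w):
    best, best_ratio = None, 0.0
    for g in elementary_vectors(A):
        neg = np.maximum(-g, 0.0)      # g^-
        denom = w @ neg
        # (an improving g with denom == 0 would mean the system is unbounded;
        #  that case is ignored in this sketch)
        if c @ g < 0 and denom > 0:
            ratio = (c @ g) / denom
            if ratio < best_ratio:
                best, best_ratio = g / denom, ratio  # normalize <w, g^-> = 1
    return best  # None: no improving circuit, the current point is optimal

A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])       # ker(A) = span{(1, 1, 1)}
c = np.array([-1.0, 2.0, 2.0])
w = np.ones(3)
z = ratio_circuit(A, c, w)
print(z)  # -(1, 1, 1)/3: cost -1 with <w, z^-> = 1
```

Here the only circuit direction is ±(1, 1, 1); the negative orientation has ratio −1 and is returned after normalization.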
On the negative side, Wallacher's algorithm is not strongly polynomial: it does not terminate finitely for minimum-cost flows, as shown in [MS00]. In contrast, our algorithms achieve a strongly polynomial running time whenever κ_A ≤ 2^poly(n). An important modification is the occasional use of a second type of circuit augmentation step, Support-Circuit, which removes circuits in the support of the current (non-basic) iterate x^(t) (see Subroutine 2.1); this can be implemented using simple linear algebra. Our first result addresses the feasibility setting:

Theorem 1.3. Consider an LP of the form (LP) with cost function c = (0_{[n]\N}, 1_N) for some N ⊆ [n]. There exists a circuit augmentation algorithm that either finds a solution x such that x_N = 0 or a dual certificate that no such solution exists, using O(mn log(n + κ_A)) Ratio-Circuit and (m + 1)n Support-Circuit augmentation steps.
Such problems typically arise in Phase I of the simplex method, when auxiliary variables are added in order to find a feasible solution. The algorithm is presented in Section 6. The analysis extends that of Theorem 1.1, tracking large coordinates x^(t)_i. Our second result considers general optimization:

Theorem 1.4. Consider an LP of the form (LP). There exists a circuit augmentation algorithm that finds an optimal solution or concludes unboundedness using O(mn² log(n + κ_A)) Ratio-Circuit and (m + 1)n² Support-Circuit augmentation steps.
The proof is given in Section 7. The main subroutine identifies a new index i ∈ [n] such that x^(t)_i = 0 in the current iteration and x*_i = 0 in an optimal solution; we henceforth fix this variable to 0. To derive this conclusion, at the end of each phase the current iterate x^(t) will be optimal to (LP) with a slightly modified cost function c; the conclusion follows using a proximity argument (Theorem 5.4). The overall algorithm repeats this subroutine n times. The subroutine is reminiscent of the feasibility algorithm (Theorem 1.3), with the following main difference: whenever we identify a new 'large' coordinate, we slightly perturb the cost function.
Comparison to black-box LP approaches An important milestone towards strongly polynomial linear programming was Tardos's 1986 paper [Tar86] on solving (LP) in time poly(n, m, log ∆_A), where ∆_A is the maximum subdeterminant of A. Her algorithm makes O(nm) calls to a weakly polynomial LP solver for instances with small integer capacities and costs, and uses proximity arguments to gradually learn the support of an optimal solution. This approach was extended to the real model of computation to obtain a poly(n, m, log κ_A) bound [DNV20a]. The latter result uses proximity arguments with circuit imbalances κ_A, and eliminates all dependence on bit-complexity.
The proximity tool Theorem 5.4 derives from [DNV20a], and our circuit augmentation algorithms are inspired by the feasibility and optimization algorithms in that paper. However, using circuit augmentation oracles instead of an approximate LP oracle changes the setup. Our arguments become simpler since we proceed through a sequence of feasible solutions, whereas much effort in [DNV20a] is needed to deal with the infeasibility of the solutions returned by the approximate solver. On the other hand, we need to be more careful, as all steps must be implemented using circuit augmentations in the original system, in contrast to the higher degree of freedom in [DNV20a], where approximate solver calls can be made to arbitrary modified versions of the input LP.
Organization of the paper The rest of the paper is organized as follows. We first provide the necessary preliminaries in Section 2. In Section 3, we upper bound the circuit diameter of (P). In Section 4, this bound is extended to the setting with upper bounds on the variables. Then, we develop circuit augmentation algorithms for solving (LP). In particular, Section 6 contains the algorithm for finding a feasible solution, whereas Section 7 contains the algorithm for solving (LP) given an initial feasible solution. Section 8 shows how circuits in LPs of more general forms can be reduced to the notion used in this paper.

Preliminaries
Let [n] = {1, 2, . . ., n}. Let R_+ and R_{++} be the sets of nonnegative and positive real numbers, respectively. For α ∈ R, we denote α⁺ = max{0, α} and α⁻ = max{0, −α}. For a vector z ∈ R^n, we define z⁺, z⁻ ∈ R^n by (z⁺)_i = (z_i)⁺ and (z⁻)_i = (z_i)⁻ for i ∈ [n]. We use ‖·‖_p to denote the ℓ_p-norm; we simply write ‖·‖ for ‖·‖_2. For A ∈ R^{m×n} and S ⊆ [n], we let A_S ∈ R^{m×|S|} denote the submatrix corresponding to the columns in S. We denote rk(S) := rk(A_S), i.e., the rank of the set S in the linear matroid associated with A. A spanning subset of S is a subset T ⊆ S such that rk(T) = rk(S). The closure of S is defined as cl(S) := {i ∈ [n] : rk(S ∪ {i}) = rk(S)}. The dual of (LP) is

max ⟨b, y⟩ s.t. A^⊤y + s = c, s ≥ 0. (DLP)

Note that y uniquely determines s, and due to the assumption rk(A) = m, s also uniquely determines y. For this reason, given a dual feasible solution (y, s), we can just focus on y or s.
For A ∈ R^{m×n}, let W = ker(A). Recall that C(W) = C(A) and E(W) = E(A) are the sets of circuits and elementary vectors in W, respectively. Note that every circuit has size at most m + 1, because we assumed that rk(A) = m. The circuit imbalance measure of W is defined as

κ_W := κ_A := max{ |g_j/g_i| : g ∈ E(W), i, j ∈ supp(g) }

if E(W) ≠ ∅; otherwise, it is defined as κ_W := κ_A := 1. For a linear space W ⊆ R^n, let W^⊥ denote the orthogonal complement. Thus, for W = ker(A), W^⊥ = Im(A^⊤). According to the next lemma, circuit imbalances are self-dual, i.e., κ_W = κ_{W^⊥}.
For P as in (P), x ∈ P, and an elementary vector g ∈ E(A) \ R^n_+, we let aug_P(x, g) := x + αg, where α = max{ᾱ : x + ᾱg ∈ P}. We write x ⊑ y if they are sign-compatible and further |x_i| ≤ |y_i| for all i ∈ [n]. The following lemma shows that every vector in a linear space has a conformal circuit decomposition. It is a simple corollary of the Minkowski-Weyl and Carathéodory theorems.
Lemma 2.3. For a linear space W ⊆ R^n, every x ∈ W has a conformal circuit decomposition x = Σ_{j=1}^k h^(j) such that k ≤ min{dim(W), |supp(x)|}.

Circuit oracles
In Sections 4, 6, and 7, we use a simple circuit finding subroutine Support-Circuit(A, c, x, S) that identifies circuits in the support of a solution x (with support intersecting S). This can be implemented easily using Gaussian elimination. Note that the constraint ⟨c, z⟩ ≤ 0 is not restrictive, as −z is also an elementary vector for every elementary vector z.
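For illustration, here is a brute-force toy version of such a subroutine on a made-up matrix (the real implementation reduces to Gaussian elimination, as noted above): it returns an elementary vector g with supp(g) inside supp(x), supp(g) meeting S, and ⟨c, g⟩ ≤ 0.

```python
from itertools import combinations

import numpy as np

def support_circuit(A, c, x, S, tol=1e-9):
    n = A.shape[1]
    supp = [i for i in range(n) if abs(x[i]) > tol]
    for size in range(2, len(supp) + 1):
        for T in combinations(supp, size):
            if not set(T) & set(S):
                continue  # must intersect S
            AT = A[:, T]
            if np.linalg.matrix_rank(AT, tol=tol) < len(T) and all(
                np.linalg.matrix_rank(A[:, T2], tol=tol) == len(T2)
                for T2 in combinations(T, size - 1)
            ):  # T is a support-minimal dependent set, i.e. a circuit
                g = np.zeros(n)
                g[list(T)] = np.linalg.svd(AT)[2][-1]
                return g if c @ g <= 0 else -g  # -g is also elementary
    return None  # no circuit in supp(x) meets S

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
c = np.array([1.0, 0.0, 0.0])
g = support_circuit(A, c, x=np.array([1.0, 1.0, 1.0]), S={0})
print(g)  # a multiple of ±(1, 1, -1), oriented so that <c, g> <= 0
```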
The circuit augmentation algorithms in Sections 6 and 7 will use the subroutine Ratio-Circuit(A, c, w).

Subroutine 2.2. Ratio-Circuit(A, c, w)
For a matrix A ∈ R^{m×n} and vectors c ∈ R^n, w ∈ (R_{++} ∪ {∞})^n, the output is a basic optimal solution to the system

min ⟨c, z⟩ s.t. Az = 0, ⟨w, z⁻⟩ ≤ 1, (2)

and an optimal solution to the following dual program:

max −λ s.t. s = c − A^⊤y, 0 ≤ s ≤ λw, λ ≥ 0. (3)

Note that (2) can be reformulated as an LP using additional variables, and its dual LP can be equivalently written as (3). Recall that we use the convention w_i z_i = 0 if w_i = ∞ and z_i = 0. If (2) is bounded, then a basic optimal solution is either 0 or an elementary vector z ∈ E(A) that minimizes ⟨c, z⟩/⟨w, z⁻⟩. Moreover, observe that every feasible solution to (3) is also feasible to (DLP).
We will use the following lemma, a direct consequence of [DNV20b, Lemma 4.3].
This lemma, together with Lemma 2.1, allows us to assume that the optimal dual solution s returned by Ratio-Circuit satisfies (4). To see this, let (y, s, λ) be an optimal solution to (3). Then s′ := r′ + c is an optimal solution to (3) that satisfies the required bound, and thus (4) follows using Lemma 2.1, since κ_A ≥ 1. The following lemma is well-known; see, e.g., [MS00, Lemma 2.2].
Lemma 2.5. Let OPT be the optimal value of (LP), and assume that it is finite. Given a feasible solution x to (LP), let g be the optimal solution to (2) returned by Ratio-Circuit(A, c, 1/x).

(i) If ⟨c, g⟩ = 0, then x is optimal to (LP).

(ii) If ⟨c, g⟩ < 0, then x + g is feasible to (LP), and ⟨c, x + g⟩ − OPT ≤ (1 − 1/|supp(x)|)(⟨c, x⟩ − OPT).
Remark 2.6. It is worth noting that Lemma 2.5 shows that applying Ratio-Circuit to vectors x with small support gives better convergence guarantees. Algorithms 3 and 4 for feasibility and optimization in Sections 6 and 7 apply Ratio-Circuit to vectors x which, in general, have large support |supp(x)| = Θ(n). These algorithms could be reformulated so that one first runs Support-Circuit to reduce the support to size O(m) and only then runs Ratio-Circuit. The guarantees of Lemma 2.5 then imply that, to reduce the optimality gap by a constant factor, O(n) calls to Ratio-Circuit could be replaced by only O(m) calls. On the other hand, this comes at the cost of n additional calls to Support-Circuit for every call to Ratio-Circuit.

A norm bound
We now formulate a proximity bound asserting that if the columns of A outside N are linearly independent, then we can bound the ℓ_∞-norm of any vector in ker(A) by the norm of its coordinates in N. This can be seen as a special case of Hoffman-proximity results; see Section 5 for more such results and references.

Lemma 2.7. Let N ⊆ [n] be such that the columns of A_{[n]\N} are linearly independent. Then, for every z ∈ ker(A), we have ‖z‖_∞ ≤ κ_A ‖z_N‖_1.

Proof. Let z = Σ_{j=1}^k h^(j) be a conformal circuit decomposition. Since the columns of A_{[n]\N} are linearly independent, supp(h^(j)) ∩ N ≠ ∅ for each j; pick n_j ∈ supp(h^(j)) ∩ N. For any i ∈ [n], the definition of κ_A gives |h^(j)_i| ≤ κ_A |h^(j)_{n_j}|. Hence, by conformality, |z_i| ≤ Σ_j |h^(j)_i| ≤ κ_A Σ_j |h^(j)_{n_j}| ≤ κ_A Σ_j ‖h^(j)_N‖_1 = κ_A ‖z_N‖_1. The lemma follows by combining all the previous inequalities.

Estimating circuit imbalances
The circuit augmentation algorithms in Sections 6 and 7 explicitly use the circuit imbalance measure κ_A. However, this measure is NP-hard to approximate within a factor 2^O(n); see [Tun99, DHNV23]. We circumvent this problem using a standard guessing procedure, see e.g., [VY96, DHNV23]. Instead of κ_A, we use an estimate κ̂, initialized as κ̂ = n. Running the algorithm with this estimate either finds the desired feasible or optimal solution (which one can verify), or fails. In case of failure, we conclude that κ̂ < κ_A, and replace κ̂ by κ̂². Since the running time of the algorithms is linear in log(n + κ̂), the running time of all runs is dominated by the last run, giving the desired bound. For simplicity, the algorithm descriptions use the explicit value κ_A.
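The guessing scheme can be sketched as follows; `run` is a hypothetical stand-in for one execution of the circuit augmentation algorithm with a given estimate, and the "true" imbalance value is made up for the demonstration.

```python
def solve_with_estimate(run, n):
    """Run with estimate kappa, squaring the estimate on each failure."""
    kappa = n                      # initial estimate
    while True:
        result = run(kappa)        # a verified solution, or None on failure
        if result is not None:
            return result
        kappa = kappa ** 2         # failure certifies kappa < kappa_A

# Toy stand-in: "succeeds" once the estimate reaches the true imbalance 1000.
attempts = []
def run(kappa, true_kappa=1000):
    attempts.append(kappa)
    return "solved" if kappa >= true_kappa else None

print(solve_with_estimate(run, n=10), attempts)  # 'solved' after estimates 10, 100, 10000
```

Since each failed run squares the estimate, the cost log(n + κ̂) doubles each round, so the total work is dominated by the final run.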

The Circuit Diameter Bound
In this section, we show Theorem 1.1, namely the bound O(m min{m, n − m} log(m + κ_A)) on the circuit diameter of a polyhedron in standard form (P). As outlined in the Introduction, let B ⊆ [n] be a feasible basis and N = [n] \ B such that x* = (A_B^{-1} b, 0_N) is a basic solution to (LP). We can assume n ≤ 2m: the union of the supports of the starting vertex x^(0) and the target vertex x* has size at most 2m, and we can fix all other variables to 0. Defining ñ := |supp(x*) ∪ supp(x^(0))| ≤ 2m and restricting A to these columns, we show a circuit diameter bound O(ñ(ñ − m) log(m + κ_A)). This implies Theorem 1.1 for general n. In the rest of this section, we use n instead of ñ, but assume n ≤ 2m. The simple 'shoot towards the optimum' procedure is shown in Algorithm 1.
A priori, even finite termination is not clear. First, we show that the 'cost' ‖x^(t)_N‖_1 decreases geometrically. This is a consequence of choosing the most improving circuit g^(t) in each iteration.

Lemma 3.1. For every iteration t ≥ 0 of Algorithm 1, ‖x^(t+1)_N‖_1 ≤ (1 − 1/(n − m)) ‖x^(t)_N‖_1.

Algorithm 1: Diameter-Bound
Input: Polyhedron in standard form (P), a basis B ⊆ [n] with its corresponding vertex x* = (A_B^{-1} b, 0_N), and an initial vertex x^(0).
Output: Length of a circuit walk from x^(0) to x*.
1. t ← 0.
2. While x^(t) ≠ x*: take a conformal circuit decomposition x* − x^(t) = Σ_{j=1}^k h^(j); let g^(t) be a term h^(j) maximizing ‖h^(j)_N‖_1; set x^(t+1) ← aug_P(x^(t), g^(t)) and t ← t + 1.
3. Return t.
Proof. Let h^(1), . . ., h^(k) with k ≤ n − m be the conformal circuit decomposition of x* − x^(t) used in iteration t of Algorithm 1. Note that ‖x^(t)_N‖_1 = −⟨c, x* − x^(t)⟩ = Σ_{j=1}^k ‖h^(j)_N‖_1, where the last equality uses the conformality of the decomposition. Let α^(t) be such that x^(t+1) = x^(t) + α^(t) g^(t). Clearly, α^(t) ≥ 1 because x^(t) + g^(t) ∈ P. Hence, by the choice of g^(t),

‖x^(t+1)_N‖_1 = ‖x^(t)_N‖_1 − α^(t)‖g^(t)_N‖_1 ≤ (1 − 1/k)‖x^(t)_N‖_1 ≤ (1 − 1/(n − m))‖x^(t)_N‖_1.

Further, using 0 ≤ x^(t+1)_N, we see that α^(t)‖g^(t)_N‖_1 ≤ ‖x^(t)_N‖_1, and so for all i ∈ [n] we have |x^(t+1)_i − x^(t)_i| ≤ κ_A ‖x^(t)_N‖_1.

Our convergence proof is based on analyzing two index sets L_t and R_t (defined via thresholds in terms of ‖x^(t)_N‖_1 and κ_A). The set L_t consists of indices i where x*_i is much larger than the current 'cost' ‖x^(t)_N‖_1. On the other hand, the set R_t consists of indices i where x^(t)_i is not much above x*_i. The next lemma shows that the sets L_t and R_t are monotonically growing.

Lemma 3.2. For every iteration t ≥ 0, we have L_t ⊆ L_{t+1} and R_t ⊆ R_{t+1}.

Proof. ‖x^(t)_N‖_1 is monotonically decreasing by Lemma 3.1, which yields L_t ⊆ L_{t+1}. For j ∈ R_t, either x^(t+1)_j ≤ x^(t)_j, or the increase of x^(t+1)_j over x^(t)_j is bounded via Lemma 3.1. In both cases, we conclude that j ∈ R_{t+1}.
Our goal is to show that R_t or L_t is extended within O((n − m) log(n + κ_A)) iterations. By the maximality of the augmentation, we know that at least one variable is set to zero in every iteration t. The following lemma shows that these variables do not lie in L_t.
Clearly, any variable i that is set to zero in iteration t belongs to R_{t+1}. So, if i ∉ R_t, then we make progress as R_t ⊊ R_{t+1}. Note that this is always the case if i ∈ N. Since A_B has full column rank, every elementary vector h̄ has supp(h̄) ∩ N ≠ ∅, and so (6) follows. From (5), (6), and noting that ‖h̄_N‖_1 ≤ ‖g^(t)_N‖_1 by our choice of g^(t), we get the claimed bound. If ‖x^(t)_{T_t}‖_∞ is as large as in the assumption of the lemma, then x^(t)_i is well above x*_i, where the last inequality is due to i ∈ T_t by Lemma 3.3. It follows that i ∉ R_t, as desired.
We are ready to give the convergence bound. We have just proved that a large ‖x^(t)_{T_t}‖_∞ guarantees the extension of R_t. Using the geometric decay of ‖x^(t)_N‖_1 (Lemma 3.1), we now show that otherwise ‖x^(t)_N‖_1 drops sufficiently so that a new variable enters L_t.
Proof of Theorem 1.1. Recall that we assumed n ≤ 2m without loss of generality. In light of Lemma 3.2, it suffices to show that, within O((n − m) log(n + κ_A)) iterations, either R_t or L_t is extended, depending on the size of ‖x^(t)_{T_t}‖_∞. We may also assume that ‖x^(t)_N‖_1 > 0, as otherwise x^(t) = x*. By Lemma 3.1, there is an iteration r = t + O((n − m) log(n + κ_A)) such that ‖x^(r)_N‖_1 is sufficiently small compared to ‖x^(t)_N‖_1, where the second inequality is due to N ⊆ T_t by Lemma 3.2. Thus, a new index enters the set, and so L_t ⊊ L_r.

Circuit Diameter Bound for the Capacitated Case
In this section we consider diameter bounds for systems of the form

P_u = {x ∈ R^n : Ax = b, 0 ≤ x ≤ u}. (Cap-P)

The theory in Section 3 carries over to P_u at the cost of turning m into n, via the standard reformulation

{(x, s) ∈ R^{n} × R^{n} : Ax = b, x + s = u, x, s ≥ 0}. (7)

Corollary 4.1. The circuit diameter of a system in the form (Cap-P) with constraint matrix A ∈ R^{m×n} is O(n² log(n + κ_A)).

Proof. This follows straightforwardly from Theorem 1.1 together with the reformulation (7). Let Ā denote the constraint matrix of (7). It is easy to check that κ_Ā = κ_A, and that there is a one-to-one mapping between the circuits and maximal circuit augmentations of the two systems.
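The claim κ_Ā = κ_A for the standard reformulation can be sanity-checked numerically by brute force; the 1×2 matrix below is a made-up example, and the circuit enumeration is toy code.

```python
from itertools import combinations

import numpy as np

def kappa(M, tol=1e-9):
    """Brute-force circuit imbalance of ker(M)."""
    m, n = M.shape
    best, found = 1.0, []
    for size in range(1, n + 1):
        for S in combinations(range(n), size):
            if any(set(C) <= set(S) for C in found):
                continue
            if np.linalg.matrix_rank(M[:, S], tol=tol) < len(S):
                found.append(S)
                g = np.linalg.svd(M[:, S])[2][-1]
                nz = np.abs(g[np.abs(g) > tol])
                best = max(best, nz.max() / nz.min())
    return best

A = np.array([[1.0, 2.0]])
# Constraint matrix of the reformulation: Ax = b, x + s = u on variables (x, s).
Abar = np.block([[A, np.zeros((1, 2))],
                 [np.eye(2), np.eye(2)]])
print(kappa(A), kappa(Abar))  # both ≈ 2
```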
Intuitively, the polyhedron should not become more complex; related theory in [vdBLL+21] also shows how two-sided bounds can be incorporated into a linear program without significantly changing the complexity of solving it.
Theorem 1.2 is proved using a new procedure, which we outline below. A basic feasible point x* ∈ P_u is characterised by a partition B ∪ L ∪ H = [n], where A_B is a basis (has full column rank), x*_L = 0_L, and x*_H = u_H. In O(n log n) iterations, we fix all but 2m variables to the same bound as in x*; for the remaining system with 2m variables, we can use the standard reformulation.
Algorithm 2 starts with a preprocessing. We let S_t ⊆ L ∪ H denote the set of indices where x^(t) differs from x*, i.e., S_t = {i ∈ L : x^(t)_i > 0} ∪ {i ∈ H : x^(t)_i < u_i}. As long as |S_t| > m, we proceed as follows. We define the cost function c ∈ R^n by c_i = 0 for i ∈ B, c_i = 1/u_i for i ∈ L, and c_i = −1/u_i for i ∈ H. For this choice, we see that the optimal solution of the LP min_{x∈P_u} ⟨c, x⟩ is x*, with optimal value ⟨c, x*⟩ = −|H|.
Depending on the value of ⟨c, x^(t)⟩, we perform one of two updates. As long as ⟨c, x^(t)⟩ ≥ −|H| + 1, we take a conformal decomposition of x* − x^(t), and pick the most improving augmenting direction from the decomposition. If ⟨c, x^(t)⟩ < −|H| + 1, then we use a support circuit augmentation obtained from Support-Circuit(A, c, x^(t), S_t).
Let us show that whenever Support-Circuit is called, g^(t) is guaranteed to exist. This is because |S_t| > m and x^(t)_i > 0 for all i ∈ S_t, so the columns of A_{S_t} are linearly dependent and S_t ⊆ supp(x^(t)) contains a circuit. Indeed, if x^(t)_j = 0 for some j ∈ S_t, then j ∈ H from the definition of S_t. However, this implies that ⟨c, x^(t)⟩ ≥ −Σ_{i∈H\{j}} x^(t)_i/u_i ≥ −|H| + 1, which is a contradiction.
The cost ⟨c, x^(t)⟩ is monotone decreasing, and it is easy to see that ⟨c, x^(0)⟩ ≤ n for any initial solution x^(0). Hence, within O((n − m) log n) iterations we must reach ⟨c, x^(t)⟩ < −|H| + 1. Each support circuit augmentation sets x^(t+1)_i = 0 for some i ∈ L or x^(t+1)_i = u_i for some i ∈ H; hence, we perform at most n − m such augmentations. The formal proof is given below.
Proof of Theorem 1.2. We show that Algorithm 2 has the claimed number of iterations. As previously mentioned, ⟨c, x*⟩ = −|H| is the optimal value of the LP min_{x∈P_u} ⟨c, x⟩. Initially, ⟨c, x^(0)⟩ = Σ_{i∈L} x^(0)_i/u_i − Σ_{i∈H} x^(0)_i/u_i ≤ n. Similar to Lemma 3.1, due to our choice of g^(t) from the conformal circuit decomposition, we have ⟨c, x^(t+1)⟩ + |H| ≤ (1 − 1/(n − m))(⟨c, x^(t)⟩ + |H|). In particular, O((n − m) log n) iterations suffice to find an iterate t such that ⟨c, x^(t)⟩ < −|H| + 1.
Note that the calls to Support-Circuit do not increase ⟨c, x^(t)⟩, so from now on we never make use of the conformal circuit decomposition again. An augmentation resulting from a call to Support-Circuit sets at least one variable i ∈ supp(g^(t)) to either 0 or u_i. We claim that either x^(t+1)_i = 0 for some i ∈ L, or x^(t+1)_i = u_i for some i ∈ H, that is, we set a variable to the 'correct' boundary. To see this, note that if x^(t+1)_i hits the wrong boundary, then the gap between ⟨c, x^(t+1)⟩ and −|H| must be at least 1, a clear contradiction to ⟨c, x^(t+1)⟩ < −|H| + 1.
Thus, after at most n − m calls to Support-Circuit, we get |S_t| ≤ m, at which point we call Algorithm 1 with at most 2m variables, so the diameter bound of Theorem 1.1 applies.

Proximity Results
We now present Hoffman-proximity bounds in terms of the circuit imbalance measure κ_A. A simple such bound was Lemma 2.7; we now present additional norm bounds. These can be derived from more general results in [DNV20a]; see also [ENV22]. The references also explain the background and similar results in the previous literature, in particular, proximity bounds via ∆_A in, e.g., [Tar86] and [CGST86]. For completeness, we include the proofs.
The next technical lemma will be key in our arguments. See Corollary 5.2 below for a simple implication.
Before the proof, it is worth stating the useful special case L = ∅ and S = [n].
Corollary 5.2. Let x be a basic (but not necessarily feasible) solution to (LP). Then, for any z with Az = b, we have ‖x‖_∞ ≤ κ_A ‖z‖_1.
Proof of Lemma 5.1. First, we show that x_{S∩cl(L)} = 0 due to our assumption. Indeed, any i ∈ S ∩ cl(L) with x_i ≠ 0 gives rise to a circuit in L ∪ {i} ⊆ supp(x), contradicting the assumption in the lemma. It follows that ‖x_S‖_∞ = ‖x_{S\cl(L)}‖_∞; let j ∈ S \ cl(L) be such that |x_j| = ‖x_S‖_∞. Let z ∈ ker(A) + x be a minimizer of the right-hand side in the statement. We may assume that |x_j| > |z_j|, as otherwise we are done because κ_A ≥ 1.
By the conformality of the decomposition, |x_j − z_j| = Σ_{t∈R} |h^(t)_j|. According to Claim 5.3, we can bound each |h^(t)_j| for t ∈ R, and the lemma follows, where the last inequality is obtained by combining the previous equation and inequalities.
The following proximity theorem will be key to deriving x*_i = 0 for certain variables in our optimization algorithm; see [DNV20a] and [ENV22, Theorem 6.5]. For c ∈ R^n, we use LP(c) to denote (LP) with cost vector c, and OPT(c) for the optimal value of LP(c).
Theorem 5.4. Let c, c′ ∈ R^n be two cost vectors such that both LP(c) and LP(c′) have finite optimal values. Let s′ be a dual optimal solution to LP(c′). For all indices j ∈ [n] such that s′_j > (m + 1) κ_A ‖c − c′‖_∞, it follows that x*_j = 0 for every optimal solution x* to LP(c).
Proof. We may assume that c ≠ c′, as otherwise we are done by complementary slackness. Let x′ be an optimal solution to LP(c′). By complementary slackness, s′_j x′_j = 0, and therefore x′_j = 0. For the purpose of contradiction, suppose that there exists an optimal solution x* to LP(c) such that x*_j > 0. Let h^(1), . . ., h^(k) be a conformal circuit decomposition of x* − x′. Then, h^(t)_j > 0 for some t ∈ [k]. Since h^(t) is an elementary vector, |supp(h^(t))| ≤ m + 1, and so ‖h^(t)‖_1 ≤ (m + 1) κ_A h^(t)_j. Therefore,

⟨c, h^(t)⟩ = ⟨c′, h^(t)⟩ + ⟨c − c′, h^(t)⟩ ≥ s′_j h^(t)_j − ‖c − c′‖_∞ ‖h^(t)‖_1 ≥ (s′_j − (m + 1) κ_A ‖c − c′‖_∞) h^(t)_j > 0.

The first inequality here used Hölder's inequality together with ⟨c′, h^(t)⟩ = ⟨s′, h^(t)⟩ ≥ s′_j h^(t)_j, which holds since s′_i = 0 whenever h^(t)_i < 0 (as then x′_i > 0). Since x* − h^(t) is feasible to LP(c) by conformality, this contradicts the optimality of x*.
The following lemma provides an upper bound on the norm of the perturbation c − c ′ for which the existence of an index j as in Theorem 5.4 is guaranteed.
Lemma 5.5.Let c, c ′ ∈ R n be two cost vectors, and let s ′ be an optimal dual solution to LP(c ′ ).
where the last inequality is due to the orthogonality of s′ + r − c and c. This gives the desired bound, since κ_A ≥ 1.

A Circuit Augmentation Algorithm for Feasibility
In this section we prove Theorem 1.3: given a linear program (LP) with cost c = (0_{[n]\N}, 1_N) for some N ⊆ [n], find a solution x with x_N = 0 (showing that the optimum value is 0), or certify that no such solution exists. A dual certificate in the latter case is a vector y ∈ R^m such that A^⊤y ≤ c and ⟨b, y⟩ > 0.
Theorem 1.3 can be used to solve the feasibility problem for linear programs. Given a polyhedron in standard form (P), we construct an auxiliary linear program whose feasibility problem is trivial, and whose optimal solutions correspond to feasible solutions to (P). This is in the same vein as Phase I of the simplex method: for the constraint matrix Ā = (A  −A), it is easy to see that κ_Ā = κ_A, and that any solution of Ax = b can be converted into a feasible solution to (Aux-LP) via (y, z) = (x⁺, x⁻). Hence, if the subroutines Support-Circuit and Ratio-Circuit are available for (Aux-LP), then we can invoke Theorem 1.3 with N = {n + 1, n + 2, …, 2n} on (Aux-LP) to solve the feasibility problem of (P) in O(mn log(n + κ_A)) augmentation steps.
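The reduction in this paragraph is easy to make concrete. A small sketch (the function name is ours) that builds the doubled matrix (A | −A) and performs the split (y, z) = (x⁺, x⁻):

```python
def phase_one_auxiliary(A, x):
    """Build the constraint matrix (A | -A) of the auxiliary LP, and convert a
    (possibly sign-unrestricted) solution of Ax = b into a nonnegative feasible
    solution (y, z) = (x+, x-) of the doubled system."""
    A_aux = [row + [-a for a in row] for row in A]
    y = [max(v, 0) for v in x]   # x+ : positive parts
    z = [max(-v, 0) for v in x]  # x- : negative parts
    return A_aux, y + z

# Example: x = (2, -3) solves Ax = b for b = (5, -3); the auxiliary point
# (2, 0, 0, 3) is nonnegative and satisfies (A | -A)(y; z) = b.
A = [[1, -1], [0, 1]]
x = [2, -3]
b = [sum(a * v for a, v in zip(row, x)) for row in A]
A_aux, w = phase_one_auxiliary(A, x)
assert all(sum(a * v for a, v in zip(row, w)) == bi for row, bi in zip(A_aux, b))
assert all(v >= 0 for v in w)
```

Note that κ preservation (κ_Ā = κ_A) holds because every elementary vector of (A | −A) projects to a signed copy of an elementary vector of A.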
Our algorithm is presented in Algorithm 3. We maintain a set L_t of 'large' coordinates: whenever x^(t)_i ≥ 4mnκ_A‖x^(t)_N‖_1 for the current iterate x^(t), we add i to L_t. Note that once an index i enters L_t, it is never removed, even though x_i might drop below this threshold in the future. Still, we will show that L_t ⊆ supp(x^(t)) in every iteration.
Whenever rk(L_t) increases, we run Support-Circuit(A, c, x^(t), N) iterations as long as there exists a circuit in supp(x^(t)) intersecting N. Afterwards, we run a sequence of Ratio-Circuit iterations until rk(L_t) increases again. The key part of the analysis is to show that rk(L_t) increases within every O(n log(n + κ_A)) iterations.
Lemma 6.1. In every Ratio-Circuit iteration t, either ‖x^(t+1)_N‖_1 ≤ (1 − 1/n)‖x^(t)_N‖_1, or the algorithm terminates with a dual certificate.
Proof. The oracle returns g^(t) that is optimal to (2) and (y^(t), s^(t)) that is optimal to (3) with optimum value −λ. Thus, A^⊤y^(t) + s^(t) = c and 0 ≤ s^(t) ≤ λw. Recall that we use weights w = 1/x^(t); since −x^(t)/n is feasible to (2), we have ⟨c, g^(t)⟩ = −λ ≤ −⟨c, x^(t)⟩/n. This implies the lemma, noting that ⟨c, x^(t)⟩ = ‖x^(t)_N‖_1.

Next, we analyze what happens during Support-Circuit iterations.
Proof. We have g^(t)_i < 0 for some i ∈ N, because supp(g^(t)) ∩ N ≠ ∅ and ⟨c, g^(t)⟩ ≤ 0. Hence, ‖x^(t)_N‖_1 does not increase.
The following lemma shows that once a coordinate enters L_t, its value stays above a certain threshold.
Lemma 6.3. For every iteration t ≥ 0, we have x^(t)_j ≥ 2mnκ_A‖x^(t)_N‖_1 for all j ∈ L_t.
Proof. Fix an iteration t ≥ 0 and a coordinate j ∈ L_t. We may assume that ‖x^(t)_N‖_1 > 0, as otherwise the lemma trivially holds because x^(t) ≥ 0. Let r ≤ t be the iteration in which j was added to L_r; the lemma clearly holds at iteration r.
We analyze the ratio x^(t′)_j / ‖x^(t′)_N‖_1 for the iterations t′ = r, …, t. Consider an iteration r ≤ t′ < t that performs Ratio-Circuit.
The first inequality is due to Lemma 6.1 and the fact that x^(t′+1) − x^(t′) is an elementary vector whose support intersects N. This fact follows from ⟨c, g^(t′)⟩ < 0, because ‖x^(t′)_N‖_1 > 0 and ⟨b, y^(t′)⟩ ≤ 0. The second inequality uses the monotonicity of ‖x^(t′)_N‖_1 and the triangle inequality. The third inequality uses the assumption on x^(r). Hence, it suffices to bound the decrease of this ratio during Support-Circuit iterations, which is controlled by Lemma 6.2. Since Algorithm 3 performs at most (m + 1)n Support-Circuit iterations, the total decrease of this ratio is at most (m + 1)nκ_A ≤ 2mnκ_A. As the starting value is at least 4mnκ_A, it follows that the ratio never drops below 2mnκ_A.
Proof of Theorem 1.3. The correctness of Algorithm 3 is clear. If the algorithm terminates due to x^(t)_N = 0, then x^(t) is the desired solution to (LP). Otherwise, if the algorithm terminates due to ⟨b, y^(t)⟩ > 0, then y^(t) is the desired dual certificate, as it is feasible to (DLP).
Next, we show that if rk(L_t) = m, then the algorithm will terminate in some iteration r ≤ t + n with x^(r)_N = 0. Indeed, any i ∈ N ∩ supp(x^(t)) induces a circuit in L_t ∪ {i}, so Support-Circuit will be invoked. Since every call to Support-Circuit reduces supp(x^(t)), all the coordinates in N will be zeroed out in at most n calls.
It is left to bound the number of iterations of Algorithm 3. In the first iteration, and whenever rk(L_t) increases, we perform a sequence of at most n Support-Circuit cancellations. Let us consider an iteration t right after we are done with the Support-Circuit cancellations. Then, there is no circuit in supp(x^(t)) intersecting N. We show that rk(L_t) increases within O(n log(n + κ_A)) consecutive calls to Ratio-Circuit; this completes the proof.
By Lemma 6.1, within O(n log(nκ_A)) = O(n log(n + κ_A)) consecutive Ratio-Circuit augmentations, we reach an iterate r = t + O(n log(n + κ_A)) in which ‖x^(r)_N‖_1 is sufficiently small compared to ‖x^(t)_N‖_1. Since L_t ⊆ supp(x^(t)) and N ⊆ [n] \ L_t by Lemma 6.3, and there is no circuit in supp(x^(t)) intersecting N, applying Lemma 5.1 with x = x^(t) and z = x^(r) shows that some j ∈ [n] \ cl(L_t) must be included in L_r.
We define a new cost function c̃ by c̃_i := s_i if i ∈ S and c̃_i := 0 if i ∉ S. We perform Support-Circuit iterations as long as there exist circuits in supp(x) intersecting supp(c̃), and then perform a further O(n log(n + κ_A)) Ratio-Circuit iterations for the cost function c̃. If we now arrive at an iterate (x, s) = (x^(t′), s^(t′)) such that s_i < δ for every i ∈ supp(x), then we truncate s as before to an optimal dual solution to LP(c′′) for some vector c′′ with ‖c − c′′‖_∞ < 2δ. After that, Theorem 5.4 is applicable for the costs c, c′′ and said optimal dual solution. Otherwise, we continue with additional phases.
The algorithm formalizes the above idea, with some technical modifications. It comprises at most m + 1 phases; the main progress measure is that the rank of the large index set L increases in every phase. We show that if an index i ∉ cl(L) was added to L, then it must have s_i < δ at the beginning of every later phase. Thus, these indices can no longer be violating.

Algorithm 4: Variable-Fixing
Input: Linear program in standard form (LP), and an initial feasible solution x^(0). Output: Either an optimal solution to (LP), or a feasible solution x and ∅ ≠ N ⊆ [n] such that x_N = x*_N = 0 for every optimal solution x* to (LP).
We now turn to a more formal description of Algorithm 4. We start by orthogonally projecting the input cost vector c to ker(A); this does not change the optimal face of (LP). If c = 0, then we terminate and return the current feasible solution x^(0), as it is optimal. Otherwise, we scale the cost so that ‖c‖_2 = 1, and use Ratio-Circuit to obtain a feasible solution s̄^(−1) to the dual of LP(c).
The rest of Algorithm 4 consists of repeated phases, ending when ⟨s̄^(t−1), x^(t)⟩ = 0. In an iteration t, let S_t = {i ∈ [n] : s̄^(t−1)_i ≥ δ} be the set of coordinates with large dual slack. The algorithm keeps track of the set L_t of variables that were once large with respect to x. The first phase starts at t = 0, and we enter a new phase k whenever rk(L_t) > rk(L_{t−1}); such an iteration t is called the first iteration of phase k. At the start of the phase, we define a new modified cost c̃^(k) from the dual slack s̄^(t−1) by truncating entries less than δ to 0. This cost vector is used until the end of the phase. Then, we call Support-Circuit(A, c̃^(k), x^(t), supp(c̃^(k))) to eliminate circuits in supp(x^(t)) intersecting supp(c̃^(k)). Note that there are at most n such calls, because each call sets a primal variable x^(t)_i to zero. In the remaining part of the phase, we augment x^(t) using Ratio-Circuit(A, c̃^(k), 1/x^(t)) until rk(L_t) increases, triggering a new phase. In every iteration, Ratio-Circuit(A, c̃^(k), 1/x^(t)) returns a minimum cost-to-weight ratio circuit g^(t), where the choice of weights 1/x^(t) follows Wallacher [Wal89]. It also returns a feasible solution (y^(t), s^(t)) to the dual of LP(c̃^(k)). After augmenting x^(t) to x^(t+1) using g^(t), we update the dual slack to s̄^(t) := arg min {⟨s, x^(t+1)⟩ : s ∈ {c̃^(k), s^(t)}}. This finishes the description of a phase.
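On tiny instances, the Ratio-Circuit oracle with Wallacher weights w = 1/x^(t) can be emulated by brute force instead of solving the LPs (2)–(3): enumerate both orientations of every circuit and pick one minimizing ⟨c, g⟩/⟨w, g⁻⟩, where g⁻ denotes the negative part of g. The sketch below is ours (the enumeration replaces the paper's oracle, and all helper names are illustrative):

```python
from fractions import Fraction
from itertools import combinations

def kernel_vector(A, cols):
    """Return a nonzero g with Ag = 0 and supp(g) contained in cols,
    or None if the chosen columns are linearly independent."""
    m, n, k = len(A), len(A[0]), len(cols)
    M = [[Fraction(A[i][j]) for j in cols] for i in range(m)]
    piv, r = [], 0
    for c in range(k):  # exact Gauss-Jordan elimination
        p = next((i for i in range(r, m) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [v / M[r][c] for v in M[r]]
        for i in range(m):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        piv.append(c)
        r += 1
    free = [c for c in range(k) if c not in piv]
    if not free:
        return None
    g_loc = [Fraction(0)] * k
    g_loc[free[0]] = Fraction(1)
    for row, c in zip(M, piv):
        g_loc[c] = -row[free[0]]
    g = [Fraction(0)] * n
    for c, j in enumerate(cols):
        g[j] = g_loc[c]
    return g

def circuits(A):
    """All circuits of A: inclusion-minimal supports of nonzero kernel vectors."""
    n, found = len(A[0]), []
    for size in range(1, n + 1):
        for cols in combinations(range(n), size):
            if any(set(sup) <= set(cols) for sup, _ in found):
                continue
            g = kernel_vector(A, list(cols))
            if g is not None:
                found.append((cols, g))
    return found

def ratio_circuit(A, c, x):
    """Wallacher-style selection: over both orientations of every circuit,
    minimize <c, g> / <w, g->, with weights w_i = 1/x_i (requires x > 0)."""
    n = len(c)
    best, best_val = None, Fraction(0)
    for _, g in circuits(A):
        for s in (g, [-v for v in g]):
            neg = sum(-s[j] / Fraction(x[j]) for j in range(n) if s[j] < 0)
            if neg == 0:
                continue  # no decreasing coordinate: not a valid augmenting step
            val = sum(Fraction(c[j]) * s[j] for j in range(n)) / neg
            if val < best_val:
                best, best_val = s, val
    return best, best_val
```

For A = (1 1 1), c = (1, 0, 0), and x = (2, 1, 1), the rule selects g = (−1, 1, 0) with ratio −2: reducing the expensive coordinate x_1 is cheapest per unit of weighted decrease.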
Since rk(A) = m, clearly there are at most m + 1 phases. Let k and t be the final phase and iteration of Algorithm 4, respectively. As ⟨s̄^(t−1), x^(t)⟩ = 0, and x^(t), s̄^(t−1) are primal-dual feasible solutions to LP(c̃^(k)), they are also optimal. Now, it is not hard to see that c̃^(k) ∈ Im(A^⊤) + c − r for some 0 ≤ r ≤ (m + 1)δ1 (Claim 7.3). Hence, s̄^(t−1) is also an optimal solution to the dual of LP(c − r). The last step of the algorithm consists of identifying the set N of coordinates with large dual slack s̄^(t−1)_i. Then, applying Theorem 5.4 with c′ = c − r allows us to conclude that they can be fixed to zero.
In order to prove Theorem 1.4, we need to show that N ≠ ∅. Moreover, we need to show that there are at most T iterations of Ratio-Circuit per phase. First, we show that the objective value is monotone nonincreasing.
Lemma 7.1. For any two iterations r ≥ t in phases ℓ ≥ k ≥ 1, respectively, ⟨c̃^(ℓ), x^(r)⟩ ≤ ⟨c̃^(k), x^(t)⟩.
Proof. We proceed by induction on ℓ − k ≥ 0. For the base case ℓ − k = 0, iterations r and t occur in the same phase, so the objective value is nonincreasing by the definitions of Support-Circuit and Ratio-Circuit. Next, suppose that the statement holds for ℓ − k = d, and consider the inductive step ℓ − k = d + 1. Let q be the first iteration in phase k + 1; note that r ≥ q > t. Then, we have ⟨c̃^(ℓ), x^(r)⟩ ≤ ⟨c̃^(k+1), x^(q)⟩ ≤ ⟨s̄^(q−1), x^(q)⟩ ≤ ⟨c̃^(k), x^(q)⟩ ≤ ⟨c̃^(k), x^(t)⟩.
The first inequality uses the inductive hypothesis. In the second inequality, we use that c̃^(k+1) is obtained from s̄^(q−1) by setting some nonnegative coordinates to 0. The third inequality is by the definition of s̄^(q−1). The final inequality is by monotonicity within the same phase.
The following claim gives a sufficient condition for Algorithm 4 to terminate.
Claim 7.2. Let t be an iteration in phase k ≥ 1. If Ratio-Circuit returns an elementary vector g^(t) such that ⟨c̃^(k), g^(t)⟩ = 0, then Algorithm 4 terminates in iteration t + 1.
The next two claims provide some basic properties of the modified cost c̃^(k). For convenience, we define c̃^(0) := c.
Claim 7.3. For every phase k ≥ 0, we have c̃^(k) ∈ Im(A^⊤) + c − r for some 0 ≤ r ≤ kδ1.
Proof. We proceed by induction on k. The base case k = 0 is trivial. Next, suppose that the statement holds for k, and consider the inductive step k + 1. Let t be the first iteration of phase k + 1, i.e., c̃^(k+1)_i = s̄^(t−1)_i if s̄^(t−1)_i ≥ δ, and c̃^(k+1)_i = 0 otherwise. Note that s̄^(t−1) ∈ {c̃^(k), s^(t−1)}. Since both of these are feasible to the dual of LP(c̃^(k)), we have s̄^(t−1) ∈ Im(A^⊤) + c̃^(k). By the inductive hypothesis, c̃^(k) ∈ Im(A^⊤) + c − r for some 0 ≤ r ≤ kδ1. Hence, from the definition of c̃^(k+1), we have c̃^(k+1) ∈ Im(A^⊤) + c − r − q for some 0 ≤ q ≤ δ1, as required.
Claim 7.4. For every phase k ≥ 0, we have ‖c̃^(k)‖_∞ ≤ 2.
Proof. We proceed by induction on k. The base case k = 0 is easy, because ‖c‖_∞ ≤ ‖c‖_2 = 1. Next, suppose that the statement holds for k, and consider the inductive step k + 1. Let t be the first iteration of phase k + 1. If s̄^(t−1) = c̃^(k), then c̃^(k+1) is obtained from c̃^(k) by setting some coordinates to 0, so we are done by the inductive hypothesis. Otherwise, s̄^(t−1) = s^(t−1). We know that s^(t−1) is an optimal solution to (3) for Ratio-Circuit(A, c̃^(k), 1/x^(t−1)). Since c − r ∈ Im(A^⊤) + c̃^(k) for some 0 ≤ r ≤ kδ1 by Claim 7.3, s^(t−1) is also an optimal solution to (3) for Ratio-Circuit(A, c − r, 1/x^(t−1)). By (4), we obtain the required bound: the third inequality is due to ‖c‖_2 = 1, the fourth follows from the fact that there are at most m + 1 phases, and the last follows from the definition of δ.
We next show a primal proximity lemma that holds for iterates throughout the algorithm.
Lemma 7.5. Let t be the first iteration of a phase k ≥ 1. For any iteration r ≥ t, the following proximity bound holds.
Proof. Fix an iteration r ≥ t, and let ℓ ≥ k be the phase in which iteration r occurred. Consider the elementary vector g^(r). If it is returned by Support-Circuit, then g^(r)_i < 0 for some i ∈ supp(c̃^(ℓ)) by definition. If it is returned by Ratio-Circuit, we also have g^(r)_i < 0 for some i ∈ supp(c̃^(ℓ)), unless ⟨c̃^(ℓ), g^(r)⟩ = 0. Note that if ⟨c̃^(ℓ), g^(r)⟩ = 0, then the algorithm sets x^(r+1) = x^(r), which makes the lemma trivially true. Hence, we may assume that such an iteration does not occur.
By construction, we have x^(r+1) − x^(r) = αg^(r) for some α > 0. The second inequality uses that all nonzero coordinates of c̃^(ℓ) are at least δ. The third inequality is by Lemma 7.1, whereas the fourth inequality is by Claim 7.4 and supp(c̃^(k)) = S_t.
With the above lemma, we show that any variable which enters L_t at the start of a phase stays lower bounded in terms of poly(n, κ_A) and ‖x_{S_t}‖_1.
Lemma 7.6. Let t be the first iteration of a phase k ≥ 1, and let i ∈ L_t \ L_{t−1}. Then for any iteration t ≤ t′ ≤ t + 2(m + 1)T, the stated lower bound on x^(t′)_i holds.
Proof. By definition of L_t, x^(t)_i is large compared to ‖x^(t)_{S_t}‖_1. With Lemma 7.5, we obtain the claimed bound; the lower bound follows from T ≥ 1, as long as the constant in the definition of T is chosen large enough.
For any iteration t in phase k ≥ 1, let us define the set D_t ⊆ L_t of variables that entered L_{t′} at the start of a phase, over all t′ ≤ t (see (9)). Note that rk(D_t) = rk(L_t) holds. As a consequence of Lemma 7.6, D_t remains disjoint from the support of the modified cost c̃^(k).
Lemma 7.7. Let 0 ≤ t ≤ 2(m + 1)T be an iteration and let k ≥ 1 be the phase in which iteration t occurred. Let D_t ⊆ L_t be defined as in (9). If ⟨c̃^(k), x^(t)⟩ > 0, then D_t ∩ supp(c̃^(k)) = ∅.
Proof. For the purpose of contradiction, suppose that there exists an index i ∈ D_t ∩ supp(c̃^(k)). Let r ≤ t be the iteration in which i was added to L_r. By our choice of D_t, r is the first iteration of phase j for some j ≤ k, which implies that S_r = supp(c̃^(j)). Since ⟨c̃^(j), x^(r)⟩ ≥ ⟨c̃^(k), x^(t)⟩ > 0 by Lemma 7.1, we have ‖x^(r)_{S_r}‖_1 > 0. However, we then obtain a contradiction: the first inequality is by Lemma 7.6, the third inequality is by Lemma 7.1, while the fourth inequality is by Claim 7.4.
Proof of Theorem 1.4. We first prove the correctness of Algorithm 4. Suppose that the algorithm terminates in iteration t. We may assume that there is at least one phase, as otherwise x^(0) is an optimal solution to (LP). Let k ≥ 1 be the phase in which iteration t occurred. Since ⟨s̄^(t−1), x^(t)⟩ = 0, and x^(t), s̄^(t−1) are primal-dual feasible solutions to LP(c̃^(k)), they are also optimal. By Claim 7.3, we know that c̃^(k) ∈ Im(A^⊤) + c − r for some ‖r‖_∞ ≤ (m + 1)δ. Hence, s̄^(t−1) is also an optimal dual solution to LP(c′) for c′ := c − r. Thus, the algorithm returns N ≠ ∅. Moreover, for all j ∈ N, Theorem 5.4 allows us to conclude that x^(t)_j = x*_j = 0 for every optimal solution x* to LP(c).
Next, we show that if rk(L_t) = m in some phase k, then the algorithm will terminate in some iteration r ≤ t + n + 1. As long as ⟨c̃^(k), x^(t)⟩ > 0, we have D_t ⊆ [n] \ supp(c̃^(k)) by Lemma 7.7. Moreover, any i ∈ supp(c̃^(k)) ∩ supp(x^(t)) induces a circuit in D_t ∪ {i}, so Support-Circuit will be invoked. Since every call to Support-Circuit reduces supp(x^(t)), all the coordinates in supp(c̃^(k)) will be zeroed out in at most n calls. Let t ≤ t′ ≤ t + n be the first iteration when ⟨c̃^(k), x^(t′)⟩ = 0. Since Ratio-Circuit returns g^(t′) with ⟨c̃^(k), g^(t′)⟩ = 0, the algorithm terminates in the next iteration by Claim 7.2.
It is left to bound the number of iterations of Algorithm 4. Clearly, there are at most m + 1 phases. In every phase, there are at most n Support-Circuit iterations, because each call sets a primal variable to 0. It remains to show that there are at most T Ratio-Circuit iterations in every phase.
Fix a phase k ≥ 1, and assume that every phase ℓ < k consists of at most T Ratio-Circuit iterations. Let t be the first iteration in phase k. We may assume that rk(L_t) < m, as otherwise there is only one Ratio-Circuit iteration in this phase by the previous argument. Note that this implies ‖x^(t′)_{S_{t′}}‖_1 > 0 for all t′ ≤ t: otherwise, L_{t′} = [n] and rk(L_{t′}) = m, which would contradict rk(L_{t′}) ≤ rk(L_t) < m.
Let r ≥ t be the first Ratio-Circuit iteration in phase k. Let D_r ⊆ L_r be as defined in (9). By Lemma 7.6 and our assumption, we have x^(r)_{D_r} > 0. We claim that D_r ∩ supp(c̃^(k)) = ∅: this is clear if ⟨c̃^(k), x^(r)⟩ = 0, and otherwise it is given by Lemma 7.7. We also know that there is no circuit in supp(x^(r)) which intersects supp(c̃^(k)). Hence, we may apply Lemma 5.1 with L = D_r, S = supp(c̃^(k)), x = x^(r), and z = x^(r+T). Since the main circuit-augmentation algorithm consists of applying Algorithm 4 at most n times, we obtain the desired bound on the number of iterations.

Circuits in General Form
There are many instances in the literature where circuits are considered outside of the standard equality form. For example, [BDLF16, KPS19, DLKS22] defined circuits for polyhedra in the general form P = {x ∈ R^n : Ax = b, Bx ≤ d}. This form implicitly includes polyhedra in inequality form, which were considered by e.g. [BFH15, BSY18]. For this setup, they define g ∈ R^n to be an elementary vector if (i) g ∈ ker(A), and (ii) Bg is support-minimal in the collection {By : y ∈ ker(A), y ≠ 0}.
In the aforementioned works, the authors use the term 'circuit' also for elementary vectors. Let us assume that ker(A) ∩ ker(B) = {0} (assumption (11)). This assumption is needed to ensure that P is pointed; otherwise, there exists a vector z ∈ R^n, z ≠ 0, such that Az = 0 and Bz = 0, and thus the lineality space of P is nontrivial. Note that the circuit diameter is defined as the maximum length of a circuit walk between two vertices; this implicitly assumes that vertices exist, and therefore that the lineality space is trivial. Under this assumption, we show that circuits in the above definition are a special case of our definition in the Introduction, and explain how our results in the standard form are applicable. Consider the matrix M = (A 0; B I_{m_B}), obtained by introducing slack variables for the inequalities, and let W̃ := ker(M) ⊆ R^{n+m_B}. Let J denote the set of the last m_B indices, and let W := π_J(W̃) denote the coordinate projection to J. Assumption (11) guarantees that for each s ∈ W, there is a unique (x, s) ∈ W̃; further, x = 0 if and only if s = 0.
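As a concrete illustration of this slack embedding, assuming M has the natural block form M = (A 0; B I) (our reading of the construction; the code and function name are illustrative), the following sketch builds M and checks that a kernel pair (x, s) satisfies s = −Bx:

```python
def general_form_embedding(A, B):
    """Build M = (A 0; B I): adding slack variables s turns Bx <= d into
    Bx + s = d, so ker(M) = {(x, s) : Ax = 0, s = -Bx}."""
    m_B = len(B)
    M = [row + [0] * m_B for row in A]
    M += [row + [1 if j == i else 0 for j in range(m_B)]
          for i, row in enumerate(B)]
    return M

# A = (1 1), B = I_2: here ker(A) ∩ ker(B) = {0}, so P is pointed.
A = [[1, 1]]
B = [[1, 0], [0, 1]]
M = general_form_embedding(A, B)
x = [1, -1]                                              # x ∈ ker(A)
s = [-sum(b * v for b, v in zip(row, x)) for row in B]   # s = -Bx
w = x + s
assert all(sum(m * v for m, v in zip(row, w)) == 0 for row in M)
```

In this toy example, x = (1, −1) determines s = (−1, 1) uniquely, and x = 0 forces s = 0, matching the uniqueness statement above.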

x_i ≠ x*_i, i.e., we are not yet at the required lower and upper bounds. If |S_t| ≤ m, then we remove the indices in (L ∪ H) \ S_t, and use the diameter bound resulting from the standard embedding as in Corollary 4.1.

This bounds ‖x^(r+T)_S‖_1, where the last inequality follows from Lemma 7.8 by choosing a sufficiently large constant in the definition of T. Note that cl(D_r) = cl(L_r), because D_r is a spanning subset of L_r. Thus, there exists an index i ∈ [n] \ cl(L_r) which is added to L_{r+T}, showing that rk(L_{r+T}) > rk(L_r), as required.