Neighbourhood complexity of graphs of bounded twin-width

We give essentially tight bounds for $\nu(d,k)$, the maximum number of distinct neighbourhoods on a set X of k vertices in a graph with twin-width at most d. Using the celebrated Marcus-Tardos theorem, two independent works [Bonnet et al., Algorithmica ’22; Przybyszewski ’22] have shown the upper bound $\nu(d,k) \le \exp(\exp(O(d)))\,k$, with a double-exponential dependence on the twin-width. The work of [Gajarsky et al., ICALP ’22], using the framework of local types, implies the existence of a single-exponential bound (without explicitly stating such a bound). We give such an explicit bound, and prove that it is essentially tight. Indeed, we give a short self-contained proof that for every d and k, $\nu(d,k) \le (d+2)2^{d+1}k = 2^{d+\log d+\Theta(1)}k$, and build a bipartite graph implying $\nu(d,k) \ge 2^{d+\log d+\Theta(1)}k$, in the regime when k is large enough compared to d.


Introduction
The aim of this paper is to refine our understanding of how complex the neighbourhoods of graphs of bounded twin-width can be. We provide an improved bound on the neighbourhood complexity of such graphs, complemented by a construction showing that our bound is essentially tight. Improved bounds on neighbourhood complexity translate directly to better structural bounds and algorithms, in some contexts which are explained below.
Twin-width. Twin-width is a recently introduced graph invariant [10]; see Section 2 for a definition. It can be naturally extended to matrices over finite alphabets and binary structures [10,7,12]. Although classes of bounded twin-width are broad and diverse, they allow (most of the time, provided a witness is given as an input) improved algorithms, compared to what is possible on general graphs or binary structures.
Most prominently, it was shown [10] that, on n-vertex graphs given with a d-sequence (a witness that their twin-width is at most d), deciding if a first-order sentence φ holds can be solved in time $f(d,\varphi)\,n$, for some computable function f. In some special cases, such as for k-Independent Set or k-Dominating Set, single-exponential parameterised algorithms running in time $2^{O_d(k)}\,n$ are possible [5]. In the same setting, the triangles of an n-vertex m-edge graph can be counted in time $O(d^2 n + m)$ [21]. See [8,19,26] for more applications of twin-width with an algorithmic flavour.
Classes of binary structures with bounded twin-width include bounded treewidth, and more generally, bounded clique-width classes, proper minor-closed classes, posets of bounded width (that is, whose antichains are of bounded size), hereditary subclasses of permutations, as well as Ω(log n)-subdivisions of n-vertex graphs [10], and particular classes of (bounded-degree) expanders [6]. A rich range of geometric graph classes have bounded twin-width such as map graphs, bounded-degree string graphs [10], classes with bounded queue number or bounded stack number [6], segment graphs with no $K_{t,t}$ subgraph, and visibility graphs of simple polygons without large independent sets [4], to give a few examples.
While efficiently approximating the twin-width is a challenging open question in general, this is known to be possible for the above-mentioned classes (albeit a representation may be needed for the geometric classes) and for ordered graphs [7]. By that, we mean that there are two computable functions f, g and an algorithm that, for an input n-vertex graph G from the class and an integer k, in time $g(k)\,n^{O(1)}$, either outputs an f(k)-sequence (again, witnessing that the twin-width is at most f(k)) or correctly reports that the twin-width of G is larger than k.
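VC density and neighbourhood complexity. VC density is related to the celebrated VC dimension [30]. Given a set-system (or hypergraph) $\mathcal S$ on a domain X, the shatter function $\pi_{\mathcal S} : \mathbb N \to \mathbb N$ is defined as $\pi_{\mathcal S}(n) = \max_{A \subseteq X,\ |A| = n} |\{A \cap S : S \in \mathcal S\}|$. The Perles-Sauer-Shelah lemma states that $\pi_{\mathcal S}(n) = O(n^c)$ if the VC dimension of $\mathcal S$ (i.e., the supremum of $\{n : \pi_{\mathcal S}(n) = 2^n\}$) is a finite integer c. The VC density of $\mathcal S$ is then defined as $\inf\{\alpha \in \mathbb R : \pi_{\mathcal S}(n) = O(n^\alpha)\}$, and as +∞ if the VC dimension is unbounded. We define the VC density of an infinite class $\mathcal C$ of finite graphs as the VC density of the infinite set-system formed by the neighbourhood hypergraph of the disjoint union of the graphs of $\mathcal C$, that is, $\{N_G(v) : v \in V(G),\ G \in \mathcal C\}$, where $N_G(v)$ denotes the set of neighbours of v in G. The VC density is an important measure in finite model theory, often more tractable than the VC dimension (see for instance [1,2]). Tight bounds have been obtained for the VC density of (logically) definable hypergraphs from graph classes of bounded clique-width [25] (with monadic second-order logic), and more recently, of bounded twin-width [19] (with first-order logic).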
In structural graph theory and kernelisation [17] (a subarea of parameterised complexity [15]) the function $\pi_{\mathcal N(G)}$, where $\mathcal N(G)$ is the neighbourhood hypergraph of G, is often¹ called neighbourhood complexity. (See [3] for an algorithmic study of the computation of this notion.) In these contexts, obtaining the best possible upper bound for $\pi_{\mathcal N(G)}$ (and not just the exponent matching the VC density) translates to qualitatively better structural bounds and algorithms; see for instance [9,11,16,29].
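To make the notion concrete, the following is a minimal brute-force sketch in Python, assuming a graph given as a dictionary from each vertex to its set of neighbours. The function names `distinct_traces` and `shatter_value` are illustrative choices, not notation from the paper, and the enumeration over all k-subsets is exponential, so it only suits very small graphs.

```python
from itertools import combinations

def distinct_traces(adj, X):
    """Number of distinct sets N(v) ∩ X over all vertices v of the graph."""
    X = frozenset(X)
    return len({frozenset(adj[v]) & X for v in adj})

def shatter_value(adj, k):
    """Brute-force value of the neighbourhood complexity at k:
    the maximum of distinct_traces over all k-subsets X of the vertex set.
    Assumes 1 <= k <= number of vertices."""
    return max(distinct_traces(adj, X) for X in combinations(sorted(adj), k))

# Hypothetical toy input: the path 0 - 1 - 2 - 3 - 4.
PATH5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}

if __name__ == "__main__":
    print(distinct_traces(PATH5, {1, 3}))
    print(shatter_value(PATH5, 2))
```

On the toy path, the set X = {1, 3} already realises the four possible traces on a 2-element set, which is the largest value any 2-subset can give.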
The r-neighbourhood complexity of G is the neighbourhood complexity of $G^r$, the graph with the same vertex set as G and an edge between any two vertices at distance at most r in G. Reidl et al. [29] showed that among subgraph-closed classes, bounded expansion² is equivalent to linear r-neighbourhood complexity. Indeed, the more general nowhere dense classes [23] have almost linear r-neighbourhood complexity [16]: there is a function f : N × N → N such that for every ε > 0, $\pi_{\mathcal N(G^r)}(n) \le f(r,\varepsilon)\,n^{1+\varepsilon}$ for all n. On hereditary classes, i.e., classes closed under taking induced subgraphs, there is no known characterisation of linear neighbourhood complexity.
As we already mentioned in different terms, bounded twin-width classes have been proven to have linear neighbourhood complexity. See [9, Lemma 3] or [27, Section 3] for two independent proofs, both using the Marcus-Tardos theorem [22]. However, the dependence on the twin-width is doubly exponential in both papers. Setting $\nu(d,k)$ as the maximum number of distinct neighbourhoods on a set of size k within a graph of twin-width at most d, i.e., $\max\{\pi_{\mathcal N(G)}(k) : G \text{ has twin-width at most } d\}$, both papers show $\nu(d,k) \le \exp(\exp(O(d)))\,k$. There is a recent third proof not using the Marcus-Tardos theorem [19]. The authors tackle the more general problem of bounding the number of distinct first-order definable subsets within a fixed set. In the particular case of neighbourhoods, even though this is not made explicit in [19], their proof gives an upper bound on $\nu(d,k)$ similar to ours.

¹ Some authors define the neighbourhood complexity as $n \mapsto \pi_{\mathcal N(G)}(n)/n$.
² A notion from the Sparsity theory of Nešetřil and Ossona de Mendez [24] extending bounded degree and proper minor-free classes.
Our results. In this note, we give in Section 3 a short and self-contained proof (also not using the Marcus-Tardos theorem) that $\nu(d,k) \le 2^{d+\log d+\Theta(1)}k$. In Section 4, we complement that proof with a construction of a bipartite graph witnessing that $\nu(d,k) \ge 2^{d+\log d+\Theta(1)}k$, which makes our single-exponential upper bound in twin-width essentially tight.

Preliminaries
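We use standard graph-theoretic notation. In particular, $N_G(v)$, or simply N(v) when G is clear from the context, denotes the set of neighbours of a vertex v in G, and for a set X ⊆ V(G), the X-neighbourhood of v is N(v) ∩ X. We now define the twin-width of a graph, following the definition of [10].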
A trigraph is a triple G = (V(G), E(G), R(G)) where E(G) and R(G) are two disjoint sets of edges on V(G): the usual edges (also called black edges) and the red edges. Informally, a red edge between two vertices u and v means that some errors have been made between u and v. The red degree of a trigraph is the maximum degree of the graph (V(G), R(G)). Any graph G can be interpreted as a trigraph G = (V(G), E(G), ∅). Given a trigraph G and two vertices u, v ∈ V(G) (not necessarily adjacent), the trigraph G/u,v = G′ is obtained by contracting u and v into a new vertex w such that:
• the edges between vertices of V(G) \ {u, v} are the same in G′;
• the edges incident to w are set in the following way, for every x ∈ V(G) \ {u, v}: wx ∈ E(G′) if ux ∈ E(G) and vx ∈ E(G); wx is a non-edge of G′ if both ux and vx are non-edges of G; and wx ∈ R(G′) otherwise.
In other words, the common black neighbours of u and v are black neighbours of w. All the other neighbours of u or v are red neighbours of w. Red edges stay red, black edges stay black, and a red edge together with a black edge becomes red. Moreover, non-edges stay non-edges, a non-edge together with a red edge becomes a red edge, and a non-edge together with a black edge becomes a red edge. We say that G/u,v is a contraction of G. A d-sequence of an n-vertex graph G is a sequence of trigraphs $G = G_n, G_{n-1}, \dots, G_1$ such that each $G_i$ is obtained from $G_{i+1}$ by a contraction and has red degree at most d. The twin-width of G, denoted by tww(G), is the minimum integer d such that G admits a d-sequence. Note that an induced subgraph of G has twin-width at most the twin-width of G [10].
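The contraction rule can be sketched in a few lines of Python. This is only an illustration under assumed conventions: a trigraph is stored as a vertex set plus two sets of `frozenset` pairs (black and red edges), and the names `contract` and `red_degree` are hypothetical helpers rather than anything defined in the paper.

```python
def contract(vertices, black, red, u, v, w):
    """Contract u and v into a new vertex w.

    vertices: set of vertices; black, red: sets of frozenset({a, b}) edges.
    Returns the vertex set, black edges and red edges of the new trigraph.
    """
    def kind(a, b):
        e = frozenset((a, b))
        if e in black:
            return "black"
        return "red" if e in red else "none"

    rest = vertices - {u, v}
    # Edges not touching u or v are unchanged.
    new_black = {e for e in black if e <= rest}
    new_red = {e for e in red if e <= rest}
    for x in rest:
        ku, kv = kind(u, x), kind(v, x)
        if ku == kv == "black":
            new_black.add(frozenset((w, x)))   # common black neighbour
        elif not (ku == kv == "none"):
            new_red.add(frozenset((w, x)))     # any disagreement or red edge
    return rest | {w}, new_black, new_red


def red_degree(vertices, red):
    """Maximum number of red edges incident to a single vertex."""
    return max((sum(1 for e in red if x in e) for x in vertices), default=0)


if __name__ == "__main__":
    # Hypothetical toy input: the path 0 - 1 - 2 - 3, contracting 0 and 2.
    V = {0, 1, 2, 3}
    E = {frozenset(p) for p in [(0, 1), (1, 2), (2, 3)]}
    V2, E2, R2 = contract(V, E, set(), 0, 2, "w")
    print(sorted(map(sorted, E2), key=str), sorted(map(sorted, R2), key=str))
    print(red_degree(V2, R2))
```

In the toy run, vertex 1 is a common black neighbour of 0 and 2, so it stays a black neighbour of w, while vertex 3 is adjacent to 2 only, so the edge from w to 3 turns red.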
If u ∈ V($G_i$), then u(G) denotes the set of vertices of G eventually contracted to u in $G_i$. Instead of considering the trigraphs $G_i$, we might prefer to deal with the partitions of V(G) induced by the sets u(G) for u in $G_i$: $\mathcal P_i = \{u(G) : u \in V(G_i)\}$. In this setting, we say that u(G) is a part of $\mathcal P_i$. We say that there is a red edge, a black edge or a non-edge between two parts u(G) and v(G) of $\mathcal P_i$ if uv is a red edge, a black edge or a non-edge in $G_i$.
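The partition viewpoint can be checked directly on the original graph, using the standard fact that two parts are joined by a red edge exactly when the pairs between them are neither all edges nor all non-edges. The Python sketch below assumes an adjacency dictionary and a list of disjoint parts; `part_relation` and `partition_red_degree` are illustrative names.

```python
def part_relation(adj, P, Q):
    """Classify the pair of disjoint parts P, Q as 'black', 'non-edge' or 'red'."""
    hits = [q in adj[p] for p in P for q in Q]
    if all(hits):
        return "black"      # complete between P and Q
    if not any(hits):
        return "non-edge"   # no edge between P and Q
    return "red"            # mixed: some pairs adjacent, some not


def partition_red_degree(adj, parts):
    """Maximum, over parts P, of the number of parts Q with a red edge to P."""
    best = 0
    for a, P in enumerate(parts):
        count = sum(
            1
            for b, Q in enumerate(parts)
            if a != b and part_relation(adj, P, Q) == "red"
        )
        best = max(best, count)
    return best


if __name__ == "__main__":
    # Hypothetical toy input: the path 0 - 1 - 2 - 3 with parts {0, 2}, {1}, {3}.
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
    parts = [{0, 2}, {1}, {3}]
    print(part_relation(path, parts[0], parts[1]))
    print(part_relation(path, parts[0], parts[2]))
    print(partition_red_degree(path, parts))
```

Evaluating `partition_red_degree` on every partition of a candidate contraction sequence gives a direct way to confirm that the sequence is a d-sequence.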

Upper bound on the number of distinct neighbourhoods
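We state and prove our upper bound on the maximum number of distinct X-neighbourhoods in bounded twin-width graphs.
Theorem 1. Let G be a graph of twin-width at most d, and let X ⊆ V(G) be non-empty. Then the number of distinct X-neighbourhoods in G is at most $(d+2)2^{d+1}|X| = 2^{d+\log d+\Theta(1)}|X|$.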
Proof. Fix a non-empty X ⊆ V(G). First of all, for all vertices of V(G) \ X with the same X-neighbourhood, we keep only one representative. Note that the new graph G′′ is an induced subgraph of G, thus its twin-width is at most d. We further modify the graph G′′ by adding, for each v ∈ X, a new vertex u to G′′ so that N(u) = N(v), if such a vertex does not already exist in V(G′′) \ X. We do this one vertex at a time. The new graph is called G′ and it has the same twin-width as G′′.
Let $M = (d+2)2^{d+1} + 1$. We prove by induction on n that an n-vertex graph of twin-width at most d with a set X of k ≥ 1 vertices, where all vertices outside X have distinct X-neighbourhoods, satisfies n ⩽ kM. This will prove that G′ has at most kM vertices, and thus that in G, there are at most (M − 1)k distinct X-neighbourhoods.
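For the reader's convenience, the equality between the explicit constant M − 1 and its asymptotic form can be verified as follows. This is a short check, assuming logarithms in base 2 and d ⩾ 1 so that log d is defined.

```latex
\[
M - 1 \;=\; (d+2)\,2^{d+1}
      \;=\; 2^{\,d+1+\log_2(d+2)}
      \;=\; 2^{\,d+\log_2 d \,+\, 1 \,+\, \log_2\left(1+\frac{2}{d}\right)},
\qquad
0 \;<\; \log_2\!\left(1+\tfrac{2}{d}\right) \;\le\; \log_2 3 \quad (d \ge 1).
\]
```

Hence the additive term in the exponent lies between 1 and 1 + log₂ 3, which is the Θ(1) term in the bound $2^{d+\log d+\Theta(1)}k$.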
The statement is trivially true for n ⩽ 5 since M ⩾ 5, for all d ⩾ 0. Thus, assume n ⩾ 6. In particular, we have k > 1. Let x ∈ X. Let X′ = X \ {x} and let $T_x$ be the set of pairs of vertices outside X that are twins with respect to X′, i.e., $T_x = \{\{u,v\} : u, v \in V(G') \setminus X,\ u \ne v,\ N(u) \cap X' = N(v) \cap X'\}$.
Since every vertex of V(G′) \ X has a distinct neighbourhood in X, there are at most two vertices of V(G′) \ X with the same (possibly empty) neighbourhood N in X′; namely the vertices u, v ∈ V(G′) \ X with N(u) ∩ X = N and N(v) ∩ X = N ∪ {x} (if they exist). Hence, $T_x$ consists of pairwise-disjoint pairs of vertices. We prove the following claim.
Claim A. There exists a vertex x of X such that $|T_x| \le (d+2)2^{d+1}$.
Proof of Claim A. Fix a d-sequence of G′. Consider the last step $G'_i$ of the sequence where all the parts of $\mathcal P_i$ contain at most one vertex of X (that is, contrary to $\mathcal P_i$, some part of $\mathcal P_{i-1}$ contains two vertices of X).
Let P be a part of $\mathcal P_i$. Let x be the unique element of P ∩ X, if it exists. Then we claim that $|P \setminus X| \le 2^{d+1}$. Indeed, any two vertices of P \ X have some vertex in the symmetric difference of their X-neighbourhoods: either it is x, or some vertex x′ of X outside P. If that distinguishing vertex is some x′ that is not in P, then there has to be a red edge between P and the part that contains x′. There are at most d red edges with P as an extremity. Since all the elements of X are in distinct parts in $G'_i$, it means that d + 1 vertices of X are enough to distinguish all the X-neighbourhoods of vertices of P \ X, and thus $|P \setminus X| \le 2^{d+1}$.
We now consider the next contraction in the sequence, which leads to $G'_{i-1}$. By definition of $G'_i$, it must contract two vertices corresponding to two parts of $\mathcal P_i$ that both contain an element of X. Let $x_1$ and $x_2$ be these two elements of X. Let Q be the part of $\mathcal P_{i-1}$ that contains both $x_1$ and $x_2$. Let {u, v} be a pair of $T_{x_1}$ and let $T_{x_1}$ contain M′ pairs. Since u and v have the same neighbourhood in X \ {$x_1$}, they are either both adjacent or both non-adjacent to $x_2$, and exactly one of them is adjacent to $x_1$. Thus, necessarily, one vertex among the pair {u, v} is adjacent to exactly one vertex among {$x_1$, $x_2$}. In particular, if this vertex is not in Q, then there has to be a red edge between the part containing this vertex and the part Q in $G'_{i-1}$. Since $T_{x_1}$ contains M′ pairs (which are disjoint) and Q has at most $2^{d+2}$ vertices not in X, there are at least $M' - 2^{d+2}$ vertices not in X whose part in $G'_{i-1}$ has a red edge to Q. Since each other part has at most $2^{d+1}$ vertices not in X, this makes at least $(M' - 2^{d+2})/2^{d+1}$ red edges incident to Q. Thus, we must have $(M' - 2^{d+2})/2^{d+1} \le d$, that is, $|T_{x_1}| = M' \le (d+2)2^{d+1}$, which proves the claim with $x = x_1$.

Lower bound on the number of distinct neighbourhoods
Notice that when |X| and tww(G) are roughly the same, the bound from Theorem 1 cannot be sharp, since G′ has at most $2^{|X|} + |X|$ vertices. However, when |X| is large enough compared to tww(G), we next show that the bound is sharp up to a constant factor.
Proposition 2. There is a positive constant c such that for any integer d, there is a bipartite graph G of twin-width at most d, and a large enough set X ⊆ V(G), with at least $c \cdot d\,2^d |X| = 2^{d+\log d+\Theta(1)}|X|$ distinct X-neighbourhoods in G.
Proof. Observe that the claim is clearly true for any small d. Thus, we do not need to consider separately graphs whose twin-width is upper bounded by a constant. Hence, we assume from now on that d ≥ d′ where d′ is some positive constant (at least 3).
We construct the graph G as follows. Let $A, B, C \in \mathbb Z$ be three constants that will be given later (A and B will be roughly equal to $\sqrt d$ and C will be roughly equal to d). Let $X = \{x_1, \dots, x_k\}$ be an independent set of $k \ge d + 2\sqrt{d-2} + 1$ vertices. Our goal is to construct G so that each vertex in V(G) \ X has a unique X-neighbourhood. For any integers i, j, t with $1 \le i \le j \le i + A - 1$, $j + 2 \le t \le j + 1 + B$ and $t \le k - C$, we create a set $V_{i,j,t}$ of vertices as follows. Consider the set $X_t = \{x_{t+1}, \dots, x_{t+C}\}$. For every subset Y of $X_t$, let $Y' = \{x_i, \dots, x_j, x_t\} \cup Y$ and add a vertex $v_{Y'}$ to $V_{i,j,t}$, making it adjacent to the vertices of Y′. Each set $V_{i,j,t}$ has size $2^C$ and there are Θ(kAB) such sets (for fixed A, B and C, and growing k). Thus there are $\Theta(kAB2^C)$ vertices in the graph.
Any two vertices not in X have distinct X-neighbourhoods. Indeed, by considering the natural ordering of X induced by the indices, any vertex not in X is first adjacent to a consecutive interval of vertices from $x_i$ to $x_j$, then is not adjacent to the vertices from $x_{j+1}$ to $x_{t-1}$ (an interval which is not empty since t ⩾ j + 2), and then is adjacent to $x_t$. Thus, if two vertices have the same X-neighbourhood, they must be in the same set $V_{i,j,t}$. But then, they have distinct neighbourhoods in $\{x_{t+1}, \dots, x_{t+C}\}$.
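The construction and the distinctness argument can be checked experimentally on small parameters. The sketch below is an illustration only: it represents each vertex outside X by the hypothetical key (i, j, t, Y) and stores its X-neighbourhood as a set of indices of X; the function names are assumptions made for this example.

```python
from itertools import combinations

def build_neighbourhoods(k, A, B, C):
    """Map each vertex (i, j, t, Y) outside X to its X-neighbourhood,
    given as a frozenset of indices in {1, ..., k}."""
    nbh = {}
    for i in range(1, k + 1):
        for j in range(i, i + A):                 # i <= j <= i + A - 1
            for t in range(j + 2, j + B + 2):     # j + 2 <= t <= j + 1 + B
                if t > k - C:                     # require t <= k - C
                    continue
                X_t = range(t + 1, t + C + 1)     # indices of x_{t+1}..x_{t+C}
                base = frozenset(range(i, j + 1)) | {t}
                for size in range(C + 1):
                    for Y in combinations(X_t, size):
                        nbh[(i, j, t, Y)] = base | frozenset(Y)
    return nbh


def all_distinct(nbh):
    """True if no two vertices outside X share an X-neighbourhood."""
    return len(set(nbh.values())) == len(nbh)


if __name__ == "__main__":
    small = build_neighbourhoods(k=8, A=1, B=1, C=2)
    print(len(small), all_distinct(small))
    larger = build_neighbourhoods(k=12, A=2, B=2, C=3)
    print(len(larger), all_distinct(larger))
```

With A = B = 1 and C = 2 on k = 8 vertices, the admissible triples are (i, i, i+2) for i from 1 to 4, each contributing 2² vertices.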
We now prove that the twin-width of G is at most M = max{AB, C}+2.For that, we give a sequence of contractions with red degree at most M .
The contraction sequence is split into k − C steps. During these steps we first consider the vertices of X one by one, and in the last step we deal with the remaining vertices. Step 0 corresponds to the starting point, where each vertex is alone. Let ℓ ⩾ 1. After Step ℓ, there will be the following parts in the corresponding partition (vertices not in any of the mentioned parts are in singleton parts):
• Let i = ℓ. For each j, t such that $i \le j \le i + A - 1$ and $j + 2 \le t \le j + 1 + B$, there is a part $B_{j,t}$. The parts $B_{i,t}$ (parts with j = i) contain all the vertices of the sets $V_{i',j',t}$ such that $j' \le i$. The parts $B_{j,t}$ with j > i contain all the vertices of the sets $V_{i',j',t}$ such that $i' \le i$ and $j' = j$. Note that there are AB non-empty $B_{j,t}$ parts in total.
• There is a part $X_0$ that contains the vertices from $x_1$ to $x_\ell$ of X.
• There is a part T (for "trash") that contains all the vertices of the sets $V_{i',j,t}$ with $t \le \ell + 1$.
All the other vertices are not yet contracted. This corresponds to the vertices from $x_{\ell+1}$ to $x_k$ of X and to the vertices of the sets $V_{i',j,t}$ with $i' > i = \ell$. Indeed, if $i' \le i$ and $t \le i + 1$, then the vertices of $V_{i',j,t}$ are in T. If $t \ge i + 2$ but $j \le i$, then they are in the part $B_{i,t}$. If j > i, then they are in the part $B_{j,t}$.
We first prove that the red degree after Step ℓ is at most M. Then, we explain how to get from Step ℓ to Step ℓ + 1 while keeping the red degree at most M.
Consider the part $B_{j,t}$ at the end of Step ℓ. A vertex in this part belongs to some set $V_{i',j',t}$ with $i' \le i = \ell$, and with $j' = j$ if j > i, or $j' \le i$ otherwise. In particular, any two vertices of $B_{j,t}$ are adjacent to all the vertices between $x_{i+1}$ and $x_j$, to no vertex between $x_{j+1}$ and $x_{t-1}$, to $x_t$, and to no vertex after $x_{t+C}$. Thus, the only possible red edges incident with $B_{j,t}$ are a red edge to the part $X_0$ and C red edges to the vertices $\{x_{t+1}, \dots, x_{t+C}\}$. Therefore, the number of red edges incident with $B_{j,t}$ is at most C + 1.
Consider now the part T. Vertices in T are adjacent only to vertices of X up to $x_{\ell+C+1}$. Since the vertices $x_1$ to $x_\ell$ are all in the part $X_0$, the red degree of T is at most C + 2.
Single vertices not in X have no incident red edges: indeed, they are all in some sets $V_{i',j,t}$ for $i' > i = \ell$ and thus are not adjacent to any vertex of $X_0$. For the same reason, the only red edges incident to $X_0$ go to T and to the parts $B_{j,t}$. Hence, the red degree of $X_0$ is at most AB + 1. Similarly, the red degree of $x_{i'}$, for $i' > i + 1$, is at most AB + 1. Moreover, the red degree of $x_{i+1}$ is at most one. Indeed, the only red edge is between $x_{i+1}$ and T.
Finally, the red degree after Step ℓ is at most max{AB + 1, C + 2} ⩽ M. Let ℓ ≥ 0. We now explain how we perform the contractions to go from Step ℓ to Step ℓ + 1.
1. (only if ℓ ≥ 1) Let i = ℓ. For any $i + 3 \le t \le i + 2 + B$, merge the part $B_{i,t}$ with the part $B_{i+1,t}$, resulting in the part $B_{i+1,t}$. The only new red edge this merging may lead to, when $B_{i,t}$ is non-empty, is between $B_{i+1,t}$ and $x_{i+1}$. Thus, we add only one red edge between $x_{i+1}$ and $B_{i+1,t}$. Thus, the red degree of $B_{i+1,t}$ is at most C + 2 and the red degree of $x_{i+1}$ is at most 2.
2. For each j, t, add all the vertices of $V_{i+1,j,t}$ to the part $B_{j,t}$ (which might be empty at this point). The red degree of $B_{j,t}$ is at most C + 2 since we might have a red edge between $B_{j,t}$ and $x_{i+1}$.
The number of non-empty parts $B_{j,t}$ at this point is at most AB + 1 (there is still the part $B_{i,i+2}$). Adding T, this gives at most AB + 2 red edges incident to a vertex in X (or to the part $X_0$).
3. Add $x_{\ell+1}$ to $X_0$. The part $X_0$ can have red edges only to non-empty parts $B_{j,t}$ and to T, but no red edges to the single vertices. Thus, it has red degree at most AB + 2.
4. Put the part $B_{i,i+2}$ into T. This part is only adjacent to vertices up to $x_{\ell+2+C}$, and thus has at most C + 2 red edges.
Thus, at each point, the red degree is at most M = max{AB, C} + 2.
The process ends at Step ℓ = k − C − 1. Then, all the vertices not in X are in some parts, and there are at most AB + 1 such parts. On the other side of the bipartition, we have the part $X_0$ and C + 1 single vertices. Thus, the graph is bipartite with both sides of size at most M. One can contract each side independently to finish the contraction sequence.
To conclude, taking $C = d - 2$ and $A = B = \lfloor \sqrt{d-2} \rfloor$, we have M ⩽ d and $kAB2^C = \Theta(kd2^d)$. Notice that we may assume that A, B and C are positive since d ≥ d′ where d′ was some well-chosen positive constant. This concludes the proof.
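The parameter choice can be sanity-checked numerically. The following Python sketch, with illustrative function names that are not from the paper, computes A, B, C from d, confirms that the red-degree bound max{AB, C} + 2 does not exceed d, and compares the per-vertex count AB·2^C of the construction with the upper-bound constant (d+2)2^{d+1}, ignoring boundary effects in k.

```python
import math
from fractions import Fraction

def choose_parameters(d):
    """Return (A, B, C) with C = d - 2 and A = B = floor(sqrt(d - 2)); needs d >= 3."""
    if d < 3:
        raise ValueError("the construction assumes d >= 3")
    C = d - 2
    A = B = math.isqrt(d - 2)   # integer floor of the square root
    return A, B, C


def red_degree_bound(A, B, C):
    """The bound M = max{AB, C} + 2 on the red degree of the contraction sequence."""
    return max(A * B, C) + 2


def ratio_to_upper_bound(d):
    """Exact ratio AB * 2^C / ((d + 2) * 2^(d + 1)) between the two constants."""
    A, B, C = choose_parameters(d)
    return Fraction(A * B * 2**C, (d + 2) * 2**(d + 1))


if __name__ == "__main__":
    for d in (3, 6, 18, 66):
        A, B, C = choose_parameters(d)
        print(d, (A, B, C), red_degree_bound(A, B, C), float(ratio_to_upper_bound(d)))
```

Since the ratio simplifies to AB / (8(d+2)) and AB is close to d − 2, it stays bounded away from zero as d grows, which is the sense in which the two bounds match up to a constant factor.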

Conclusion
We have given an essentially tight upper bound for the neighbourhood complexity of graphs of bounded twin-width together with a construction almost attaining this upper bound. Moreover, our method is simple and self-contained. A similar upper bound was implied by the techniques in [19] (though not stated explicitly).
It is known that the twin-width of $G^r$ can be upper-bounded by a function of the twin-width of G and r [10]. Thus, graphs of twin-width at most d have linear r-neighbourhood complexity. Recently, improved bounds were given for planar graphs and proper minor-closed graph classes in [20] (such graphs also have bounded twin-width). We leave as an interesting open problem to obtain an essentially tight twin-width dependence for the r-neighbourhood complexity.
We remark that the neighbourhood complexity is also related to identification problems on graphs such as identifying codes or locating-dominating sets, where one seeks a (small) set A of vertices of a graph such that all other vertices have a distinct neighbourhood in A [18]. Some works in this area on specific graph classes are equivalent to the study of the neighbourhood complexity of these graph classes: see for example [14,18,28]. Moreover, we note that for graph classes with VC density 1, since any solution has linear size, the natural minimisation versions of the above identification problems have a polynomial-time constant-factor approximation algorithm (trivially select the whole vertex set), while such an algorithm is unlikely to exist in the general case [14]. Thus, the bounds given in the current work imply a better approximation ratio for these problems, when restricted to input graph classes of bounded twin-width.
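Proof of Theorem 1 (conclusion). Let x be a vertex given by Claim A, let X′ = X \ {x}, and let Y be a set containing exactly one vertex of each pair of $T_x$. In the graph $G_Y = G' - (Y \cup \{x\})$, the X′-neighbourhoods of vertices outside X′ are distinct. The graph $G_Y$ has at least n − M vertices, and twin-width at most d. By induction, we have $n - M \le |V(G_Y)| \le (k-1)M$ and thus n ⩽ kM. Hence, once we recall that no vertex in X has a unique X-neighbourhood, there are at most (M − 1)k distinct X-neighbourhoods, which completes the proof.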