Bounds and algorithms for graph trusses

The $k$-truss, introduced by Cohen (2005), is a graph where every edge is incident to at least $k$ triangles. This is a relaxation of the clique. It has proved to be a useful tool in identifying cohesive subnetworks in a variety of real-world graphs. Despite its simplicity and its utility, the combinatorial and algorithmic aspects of trusses have not been thoroughly explored. We provide nearly-tight bounds on the edge counts of $k$-trusses. We also give two improved algorithms for finding trusses in large-scale graphs. First, we present a simplified and faster algorithm, based on an approach discussed in Wang & Cheng (2012). Second, we present a theoretical algorithm based on fast matrix multiplication; this converts a triangle-generation algorithm of Bjorklund et al. (2014) into a dynamic data structure.


Introduction
In a number of contexts, a group of interacting agents can be represented in terms of an undirected graph G = (V, E). For example, in a social network, the vertices may represent people, with an edge between two people if they know each other. One basic task is to find a cohesive subnetwork of G: a maximal subgraph whose vertices are "highly connected" [17]. This may represent a discrete community in the overall network, or another type of subgroup with a high degree of mutual relationship. We emphasize that since we are ultimately trying to understand a non-mathematical property of G, we cannot give an exact definition of a cohesive subnetwork.
A number of graph-theoretic structures can be used to find cohesive subnetworks in G. A clique is the most highly connected substructure. An alternate choice, suggested by [17], is the k-core, which is defined as a maximal connected subgraph in which each vertex has degree at least k.
Cohen [6,7] proposed a stronger heuristic based on triangle counts called the truss. Formally, a k-truss is defined to be a graph in which every edge is incident to at least k triangles and which has no isolated vertices. Note that a (k + 2)-clique is a k-truss. A k-truss-component of G is defined to be a maximal edge set L ⊆ E such that the edge-induced subgraph G(L) is a connected k-truss.
The k-truss has been rediscovered and renamed several times. The earliest example was its definition as a k-dense core [16], which was motivated by the goal of detecting dense communities where the k-core proved to be too coarse. It was also defined as a triangle k-core in [23] and used as a motif exemplar in graphs. Other names include k-community [20] and k-brace [18].
The task of determining the truss-components of G is called truss decomposition. The truss-components of G can be derived from an associated hypergraph H which is defined as follows: the vertex set of H is the edge set of G, and the edge set of H is the set of triangles of G. Each k-core of H corresponds to a k-truss-component of G.
The trussness of an edge e of G, denoted τ(e), is defined to be the maximal value k such that e is in a k-truss-component of G. Equivalently, τ(e) is the coreness of the edge e regarded as a node of H. Truss decomposition algorithms typically first compute τ(e) for all edges e. The k-truss-components (for any value of k) can then be found by depth-first search of G restricted to edges e with τ(e) ≥ k.
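The last step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `truss_components` and the dictionary `tau` (mapping each edge to its trussness) are hypothetical, and the trussness values are taken as given rather than computed.

```python
from collections import defaultdict

def truss_components(tau, k):
    """Group the edges with trussness >= k into connected components.

    tau maps an edge (u, v) to its trussness (assumed precomputed).
    Returns a list of edge sets, one per k-truss-component.
    """
    adj = defaultdict(list)
    for (u, v), t in tau.items():
        if t >= k:                      # keep only edges with tau(e) >= k
            adj[u].append(v)
            adj[v].append(u)
    seen, components = set(), []
    for root in adj:
        if root in seen:
            continue
        seen.add(root)
        stack, edges = [root], set()
        while stack:                    # iterative depth-first search
            x = stack.pop()
            for y in adj[x]:
                edges.add(frozenset((x, y)))
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        components.append(edges)
    return components
```

For example, with two disjoint triangles of trussness 1 and one pendant edge of trussness 0, calling `truss_components(tau, 1)` yields two components of three edges each.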
An appealing feature of trusses is that truss decomposition algorithms are relatively practical, making them feasible for large graphs. As a starting point, Cohen's original algorithm [7] was essentially an adaptation of a graph core decomposition algorithm of Matula & Beck [14] to the hypergraph H. This was improved by Wang & Cheng [21] and Huang et al. [12] by avoiding explicit generation of H. These algorithms compute truss decomposition in O(mα(G)) time and O(m) memory, where α(G) denotes the arboricity of G. Note that mα(G) is at most O(m^{3/2}).
The k-truss-component structure of graphs has become a key tool for pattern mining and community detection in a variety of scientific and social network studies, e.g. [18,4]. Additionally, the truss serves as a fast filter for finding cliques since a (k + 2)-clique is a k-truss. In these applications, the parameter k represents the anticipated size of a meaningful cohesive subnetwork. Typically, k may be much smaller than n. In this setting, it is often useful to compute a truncated truss decomposition up to some chosen parameter k_trunc. This entails computing τ(e) for edges e with τ(e) ≤ k_trunc; other edges e record that τ(e) > k_trunc but do not record the precise value. For instance, Verma et al. [19] discussed an algorithm for sparse graphs with runtime O(m k_trunc ∆), where ∆ is the maximum degree of G.

Our contributions and overview
Despite its simplicity and its use for understanding real-world networks, the truss has seen little formal analysis, either from a combinatorial or algorithmic point of view. We address these gaps in this paper.
Our main combinatorial subject of investigation is the minimum number of edges in a k-truss. Section 2 gives asymptotically tight bounds for this quantity. Section 3 analyzes a stricter notion of critical k-truss, which is a k-truss none of whose subgraphs are themselves k-trusses. We summarize the main results of these sections as follows:
Theorem 1.1. The minimum number of edges in a connected k-truss on n vertices is n(1
Section 4 describes a new simple algorithm for truss decomposition. We analyze this algorithm in terms of a graph parameter we refer to as the average degeneracy δ(G), defined as:
is a local property of the graph, which can be influenced by a few high-degree nodes, the parameter δ(G) is a global property and can be much smaller than α(G). To the best of our knowledge, this parameter has not been studied before. We show the following result:
This algorithm is inspired by Wang & Cheng [21] but uses much simpler and faster data structures; it should be practical for large-scale graphs.
Section 5 gives an alternative algorithm for truncated truss decomposition based on matrix multiplication. We present here a slightly simplified summary in terms of the linear algebra constant ω, i.e. the value for which multiplication of N × N matrices can be performed in N^{ω+o(1)} time. (The current best estimate [9] is ω ≈ 2.38.) The algorithm is theoretically appealing, but the algorithm in Section 4 is more likely to be useful in practice. Fast matrix multiplication has been used for previous triangle counting and enumeration algorithms [1,3]. For our decomposition algorithm, we turn these into dynamic data structures to maintain triangle lists as edges of G are removed. See Section 5 for more precise costing.

Truss combinatorics in the context of cohesive subnetworks
The truss is an interesting but somewhat obscure combinatorial object, and this is the primary reason for studying its combinatorial properties. In addition, there is an important connection between the extremal bounds for trusses and their use as a heuristic for cohesive subnetworks.
Intuitively, a cohesive subnetwork on n nodes should be highly connected. Thus the edge counts should be quite high, perhaps on the order of n^2. By contrast, our results in Section 2 show an example of a connected k-truss with edge counts as low as Θ(kn). It may be surprising that a connected k-truss can be extremely sparse despite its use in finding cohesive networks. These can be viewed as pathological cases where the truss heuristic does a poor job at discovering the underlying graph structure.
The extremal example consists of a series of (k + 2)-cliques connected at vertices. This is clearly a collection of multiple distinct cohesive subnetworks. In particular, it contains many subgraphs which are themselves k-trusses. This extremal example motivates us to define a stricter notion of critical k-truss as a heuristic for finding cohesive subnetworks: namely, a collection of edges which is a k-truss but which contains no smaller k-truss.
It seems reasonable that this restriction might give a more robust heuristic for cohesive subnetworks. Yet, we will show in Section 3 that this has similar extremal examples. The additional restriction of criticality does not significantly increase the minimum edge count. These results suggest that, despite their usefulness for real-world graphs, both the k-truss and the critical k-truss can be fallible heuristics.

Notation
We let n denote the number of vertices and m the number of edges of a graph G = (V, E). The neighborhood of a vertex v ∈ V is the set N(v) = {u : (u, v) ∈ E} and d(v) = |N(v)| is the degree of v. We also define N^+(v) = N(v) ∪ {v}. For simplicity, we assume throughout that G has no isolated vertices and m ≥ n/2.
We define a triangle to be a set of three vertices (v_1, v_2, v_3) where edges (v_1, v_2), (v_2, v_3), (v_3, v_1) are all present in G. We also write (e_1, e_2, e_3) for this triangle, where edges e_1, e_2, e_3 are given by e_1 = (v_1, v_2), e_2 = (v_2, v_3), e_3 = (v_3, v_1). We define △(e) and △(v) to be the number of triangles containing an edge e or vertex v respectively. We say edges e_1 and e_2 are neighbors if they share a vertex.
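As a concrete illustration of the per-edge triangle count, the following Python sketch counts, for each edge, the common neighbors of its endpoints. The function name `triangle_counts` and the adjacency-set representation are assumptions made for this example, not part of the paper's algorithms.

```python
from collections import defaultdict

def triangle_counts(edges):
    """Map each edge (as a frozenset) to the number of triangles containing it."""
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    counts = {}
    for u, v in edges:
        # Scan the smaller neighborhood and test membership in the larger one.
        a, b = (u, v) if len(nbrs[u]) <= len(nbrs[v]) else (v, u)
        counts[frozenset((u, v))] = sum(1 for w in nbrs[a] if w in nbrs[b])
    return counts
```

On the complete graph on four vertices every edge receives count 2, while a pendant edge attached to it receives count 0.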
For an integer t we define [t] = {1, ..., t}. We assume there is some fixed, but arbitrary, indexing of the vertices, and we define ID(v) ∈ [n] to be the identifier of vertex v.
For a vertex subset U ⊆ V, we define the vertex-induced subgraph G[U] to be the graph on vertex set U and edge set {(u, v) ∈ E : u, v ∈ U}. For an edge set L ⊆ E, we define the edge-induced subgraph G(L) to be the graph on edge set L and vertex set {v ∈ V : (u, v) ∈ L for some u ∈ V}.
The complete graph on n vertices (n-clique) is denoted by K_n.
We will analyze some algorithms in terms of the degeneracy of graph G, which we denote δ(G). See Appendix A for further definitions and properties.

Minimum edge counts for the k-truss
We begin by collecting a few simple observations on the vertex counts in a k-truss.
Proof. Let e be any edge on v. This edge e has at least k triangles, giving k other edges incident on v. Each of these k + 1 edges has at least k triangles in G[N^+(v)]; furthermore, each such triangle is counted by at most two edges incident on v. So △(v) is at least k(k + 1)/2. Finally, the number of edges in
Observation 2.2. The minimum number of vertices in a k-truss is exactly k + 2.
Proof. Each vertex must have degree at least k + 1, thus each k-truss contains at least k + 2 vertices. On the other hand, K_{k+2} is a k-truss with k + 2 vertices.
The properties can be used to bound the clustering coefficient, a common measure of graph density. Formally, the clustering coefficient of a vertex v is defined as $cc(v) = \triangle(v) / \binom{d(v)}{2}$. Thus, cc(v) is a real number in the range [0, 1].
We also get a simple bound on the edge counts. Let us define M_{n,k} to be the minimum number of edges for a connected k-truss on n vertices. Our main result in this section is to estimate this quantity M_{n,k}, showing the following tight bounds:
Theorem 2.5. For every k ≥ 1 and n ≥ k + 2, we have (n − 1)(1 + k/2) ≤ M_{n,k} ≤ n(1 + k/2) + O(k^2).
As an immediate corollary of Theorem 2.5, we also get a tight bound on triangle counts:
Corollary 2.6. A connected k-truss on n vertices must contain at least (n − 1)(k + 2)k/6 triangles.
Proof. The graph has m ≥ (n − 1)(1 + k/2) edges. Each edge has at least k triangles; since each triangle is incident to three edges, this implies there are at least mk/3 triangles.
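Written out, the arithmetic behind the corollary is as follows (a worked restatement of the proof, using only the bound m ≥ (n − 1)(1 + k/2) and the fact that each triangle is counted by three edges):

```latex
\[
  \#\{\text{triangles}\} \;\ge\; \frac{mk}{3}
  \;\ge\; \frac{(n-1)\left(1+\tfrac{k}{2}\right)k}{3}
  \;=\; \frac{(n-1)(k+2)k}{6}.
\]
```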
For the upper bound of Theorem 2.5, we use a construction based on vertex contraction. Namely, for a pair of graphs G, H, define G * H to be the graph obtained by contracting an arbitrary vertex of G to an arbitrary vertex of H. The resulting graph
This also shows that the bound of Corollary 2.6 is tight in this case.
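The extremal example described in the introduction, a series of (k + 2)-cliques connected at vertices, can be checked numerically. The sketch below is an illustration only: the helper names are hypothetical, and it glues the cliques in a path-like chain, which is one particular choice of the "arbitrary" contracted vertices.

```python
from itertools import combinations

def clique_chain(k, s):
    """Edges of s copies of K_{k+2}, where consecutive copies share one vertex."""
    edges = set()
    start = 0
    for _ in range(s):
        block = range(start, start + k + 2)
        edges.update(frozenset(p) for p in combinations(block, 2))
        start += k + 1          # last vertex of this block is reused by the next
    return edges

def min_triangles_per_edge(edges):
    """Smallest number of triangles through any edge of the edge-induced graph."""
    nbrs = {}
    for e in edges:
        u, v = tuple(e)
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return min(len(nbrs[u] & nbrs[v]) for u, v in map(tuple, edges))
```

With k = 3 and s = 4 the chain has n = s(k + 1) + 1 = 17 vertices and 40 edges, which equals (n − 1)(1 + k/2), and every edge lies in exactly k = 3 triangles.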
A slightly modified construction shows the upper bound M_{n,k} ≤ n(1 + k/2) + O(k^2) for arbitrary values of n. Namely, let n = s(k + 1) + r for integer r in the range k + 2 ≤ r < 2k + 3, and consider
We now turn to prove the lower bound of Theorem 2.5. Let G be a connected k-truss. Since G is connected, it has a spanning tree T, which we may take to be a rooted tree (with an arbitrary root). For a triangle t of G containing at least one edge of T, we say that t is single-tree if exactly one of its edges is in T; otherwise exactly two of its edges are in T and it is double-tree. (If all three edges were in T, this would be a cycle on T.) If an edge e ∈ E − T participates in a double-tree triangle, we say that e is double-tree-compatible; otherwise it is double-tree-incompatible.
Proposition 2.7. Let (u, v, w) be a single-tree triangle where (u, v) ∈ T. Then either (u, w) or (v, w) is double-tree-incompatible.
Proof. If (u, w), (v, w) are both double-tree-compatible, this would imply that there are vertices x, y with (u, x), (x, w), (v, y), (w, y) ∈ T. Since (u, w) ∈ E − T, we must have y ≠ u. But then u, x, w, y, v, u is a cycle on the tree T, which is a contradiction.
Our proof strategy will be to construct a function F : T × [k] → E − T, and then argue that F is at most 2-to-1. This will show that E − T has cardinality at least |T|k/2, and so |E| = |T| + |E − T| ≥ |T|(1 + k/2) = (n − 1)(1 + k/2), which is the lower bound we need to show.
To define F, consider an edge e = (x, y) ∈ T, where y is a T-child of x. Arbitrarily select k triangles t_{e,1}, ..., t_{e,k} involving e. For i = 1, ..., k, we define F(e, i) as follows:
• If t_{e,i} is double-tree, then F(e, i) is the unique off-tree edge of t_{e,i}.
• If t_{e,i} is single-tree and exactly one off-tree edge f of t_{e,i} is double-tree-incompatible, then F(e, i) = f.
• If t_{e,i} is single-tree and both off-tree edges of t_{e,i} are double-tree-incompatible, then F(e, i) is the off-tree edge of t_{e,i} containing y.
In light of Proposition 2.7, this fully defines the function F. We refer to an edge e as a preimage of f if F(e, i) = f for some index i ∈ [k]. Since any triangle is determined by two of its edges, such index i is uniquely determined by e and f.
Proposition 2.8. Suppose that edge f = (u, v) ∈ E − T is double-tree-incompatible, and (u, x) is a preimage of f where x is a T-child of u. Then edge (x, v) must be double-tree-compatible. Furthermore, f does not have a preimage (u, y) where y is a T-child of u distinct from x.
Proof. For the first result, suppose that F((u, x), i) = f and consider the triangle t_{(u,x),i}. Since f is double-tree-incompatible, necessarily t_{(u,x),i} is single-tree. If the other edge (x, v) in this triangle were also double-tree-incompatible, then since x is a T-child of u we would have F((u, x), i) = (x, v) ≠ f, a contradiction.
For the second result, suppose that (u, x) and (u, y) are preimages of f. By the argument in the preceding paragraph the edges (x, v) and (y, v) are both double-tree-compatible. So there are vertices r, s with (x, r), (r, v), (y, s), (s, v) ∈ T. One can then check that v, s, y, u, x, r, v is a cycle on T, a contradiction.
Proposition 2.9. Every edge f = (u, v) ∈ E − T has at most two preimages under F.
Proof. Case I: f is double-tree-compatible. The only possible preimages to f would come from double-tree triangles in which f is the unique off-tree edge. There can only be a single such triangle; for, if (u, v, x) and (u, v, y) were two such triangles, then u, x, v, y, u would be a cycle on T. So the only possible preimages of f are the two tree-edges in this double-tree triangle.
Case II: f is double-tree-incompatible. We first claim that f cannot have three preimages (u, x), (u, y), (u, z). For, in this case, at least two vertices, say without loss of generality x, y, must be T-children of u. This is ruled out by Proposition 2.8.
So suppose that f has three preimages (u, x), (u, y), (v, z). By Proposition 2.8, it cannot be that both x and y are T-children of u. So assume without loss of generality that y is the T-parent of u and x is a T-child of u.
By Proposition 2.8, the edge (x, v) is double-tree-compatible. So there is some vertex r such that (x, r), (r, v) ∈ T. Since u is the T-parent of x, this implies that x is the T-parent of r and r is the T-parent of v and v is the T-parent of z.
Since (v, z) is a preimage of f and v is the T-parent of z, by Proposition 2.8 the edge (z, u) must be double-tree-compatible. So we have (z, s), (s, u) ∈ T for some vertex s. It can be seen that u, x, r, v, z, s, u is a cycle on T, a contradiction.
This completes the proof of the lower bound of Theorem 2.5.

Critical connectivity of the k-truss
We define a critical k-truss as follows: a k-truss G = (V, E) is critical if there is no edge set L with ∅ ≠ L ⊊ E such that G(L) is a k-truss. It is clear that critical k-trusses are connected. The extremal graphs A_1 * ⋯ * A_s for the upper bound in Theorem 2.5 are far from critical, as each subgraph A_i is a k-truss. Arguably, critical k-trusses are more relevant for community detection: if a graph contains smaller k-trusses, then it is a conglomeration of communities rather than a single cohesive subnetwork of its own.
We begin with some simple observations.
Observation 3.2. There is no critical k-truss with exactly k + 3 vertices. In a critical k-truss with more than k + 3 vertices, every vertex has degree at least k + 2.
, and in particular G contains a copy of the k-truss K k+2 .
Proof. Suppose that G is a critical 1-truss, and let e_1 be an edge of G. So e_1 is contained in some triangle with edges (e_1, e_2, e_3). Then G({e_1, e_2, e_3}) is a 1-truss.
Let us define M*_{n,k} to be the minimum number of edges in a critical n-node k-truss; we have M*_{n,k} = ∞ if no such critical k-truss exists. To avoid edge cases covered by Observations 3.2 and 3.3, we assume that k ≥ 2 and n ≥ k + 4.
Theorem 3.4. For n ≥ 6, we have M*_{n,2} = 3n − 6.
Proof. For the upper bound, consider the graph consisting of a cycle C of length n − 2, plus two new vertices x_1, x_2 with edges to every vertex in C. This is a critical 2-truss with n vertices and 3n − 6 edges.
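For small n, the upper-bound construction can be verified directly. The following Python sketch builds the cycle with two apex vertices and tests criticality by brute force over all edge subsets; the function names are hypothetical, and the exhaustive search is exponential in the number of edges, so it is meant only as a sanity check for tiny instances.

```python
from itertools import combinations

def two_apex_cycle(n):
    """Cycle on n - 2 vertices plus two apex vertices joined to every cycle vertex."""
    c = n - 2
    edges = {frozenset((i, (i + 1) % c)) for i in range(c)}
    for apex in (c, c + 1):
        edges |= {frozenset((apex, i)) for i in range(c)}
    return edges

def is_truss(edges, k):
    """True if the edge-induced subgraph on a non-empty edge set is a k-truss."""
    nbrs = {}
    for e in edges:
        u, v = tuple(e)
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return bool(edges) and all(
        len(nbrs[u] & nbrs[v]) >= k for u, v in map(tuple, edges))

def is_critical(edges, k):
    """Brute force: a k-truss with no non-empty proper edge subset inducing a k-truss."""
    if not is_truss(edges, k):
        return False
    es = list(edges)
    return not any(is_truss(set(sub), k)
                   for size in range(1, len(es))
                   for sub in combinations(es, size))
```

For n = 6 the construction has 12 = 3n − 6 edges, and the exhaustive check over its 4094 non-empty proper edge subsets confirms that none of them induces a 2-truss.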
For the lower bound, let G = (V, E) be a critical 2-truss with m edges and n vertices. Define 𝒱 to be the vector space of all functions from E to the finite field GF(2), and for any triangle t of G we define χ_t to be the characteristic function of t, i.e. χ_t(e) = 1 if and only if e ∈ t.
Select U to be any smallest set of triangles in G with the property that every edge is in at least two triangles of U. This is well-defined since G is a 2-truss. We claim that for every non-empty proper subset W ⊊ U, the sum ∑_{t∈W} χ_t is not identically zero.
For, suppose it is, and let L denote the set of edges appearing in the triangles t ∈ W. For any edge e ∈ L, we then have ∑_{t∈W} χ_t(e) = 0 in the field GF(2). Since the sum is taken modulo two, there are at least two triangles t_1, t_2 in G(L) containing e (by definition of L, there is at least one). Thus G(L) is a 2-truss. Since G is a critical 2-truss, we must have L = E. Thus, every edge of G is covered by at least two triangles in W. This contradicts minimality of U.
We have shown that there is at most one linear dependency among the functions χ_t for t ∈ U (namely, corresponding to W = U). Each χ_t lies in the cycle space of G, the subspace of 𝒱 spanned by the characteristic functions of cycles; a standard result (see e.g. [8]) is that this subspace has dimension m − n + 1 for a connected graph. This implies that |U| − 1 ≤ m − n + 1. On the other hand, every edge is in at least two triangles in U and so 3|U| ≥ 2m. Putting these inequalities together gives m ≥ 3n − 6.
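For completeness, the final step combines the two inequalities as follows (a worked restatement, introducing no new assumptions):

```latex
\[
  \frac{2m}{3} \;\le\; |U| \;\le\; m - n + 2
  \quad\Longrightarrow\quad
  2m \;\le\; 3m - 3n + 6
  \quad\Longrightarrow\quad
  m \;\ge\; 3n - 6 .
\]
```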
Our main result in this section is to estimate M*_{n,k}, showing a result similar to Theorem 2.5. Specifically, we will show that
(A more precise estimate is shown in Theorem 3.8.)
Lemma 3.5. For any integers k ≥ 2, n ≥ k + 4 we have
Proof. To show the first bound, let G = (V, E) be a critical k-truss with n vertices and m = M*_{n,k} edges. Create a new graph G′, which has all the vertices and edges of G, plus two new vertices x_1, x_2. We add a new set F of edges connecting x_1, x_2 to the previous vertices, where F is chosen so that G′ = (V ∪ {x_1, x_2}, E ∪ F) has the properties that (i) G′ is a (k + 2)-truss and (ii) F is inclusion-wise minimal with this property.
To show this is well-defined, we need to show that property (i) is satisfied when F is the set of all possible edges between the new vertices and the old ones. In this case, G′ has no isolated vertices, as G is a k-truss and has none. Also, any edge e ∈ E has k triangles from G and two new triangles using edges of F. Finally, for each edge e = (x_i, v) where v ∈ V, there is a triangle in G′ for each neighbor of v in G. By Observation 3.2, this implies that △(e) ≥ k + 2.
Clearly G′ has n + 2 vertices and has m + |F| ≤ m + 2n edges. To show G′ is critical, suppose there are edge subsets E′ ⊆ E, F′ ⊆ F such that G′(E′ ∪ F′) is a (k + 2)-truss. Then G(E′) must be a k-truss, as removing the edges incident to x_1, x_2 can only remove 2 triangles per edge. Since G is critical, this implies E′ = ∅ or E′ = E. If E′ = ∅, then G′(F′) must be a (k + 2)-truss. However, G′(F′) itself has no triangles, and so we must have F′ = ∅, i.e. E′ ∪ F′ = ∅. On the other hand, if E′ = E, then G′(E ∪ F′) is a (k + 2)-truss; by definition of F, this implies that F′ = F and hence E′ ∪ F′ = E ∪ F.
The second bound is essentially identical, except that we add only a single vertex instead of two vertices.
Proof. Suppose first that k is even. From Theorem 3.4, we have M*_{i,2} = 3i − 6 for any integer i ≥ 6. By repeated applications of Lemma 3.5 we get
and so
We are now ready for the main construction to show the upper bound on M*_{n,k}. This uses a type of graph embedding in the torus; we describe the construction in more detail in Appendix B.
Lemma 3.7. Suppose there exists a graph T embedded in the torus with r faces, where each edge appears in two distinct faces, and each face F has s_F ≥ 4 edges. Let g = ∑_F s_F. Then for k ≥ 3 there is a critical k-truss with r(k − 2) + g/2 vertices and $r\binom{k-1}{2} + (k - 1 + 1/2)g$ edges.
Proof. For each face F, let us define T(F) to be the corresponding subgraph of T. We form the graph G by starting with T. For each face F, we insert a copy of K_{k−1}, which we denote by C(F). We add an edge from every vertex of C(F) to every vertex in T(F), and we let H(F) denote these edges. We also define G(F) = C(F) ∪ H(F) ∪ T(F). See Figure 1.
Figure 1: This shows G(F) for a single face F (here, the outside square), for k = 4. The triangle inside the square is C(F). The square on the outside is T(F). Each vertex of C(F) is connected to each vertex of face T(F), via an edge of H(F).
We first compute the number of vertices and edges in G. First, since each edge of T appears in exactly two faces, T has (1/2) ∑_F s_F = g/2 edges. By Euler's formula, T therefore has g/2 − r vertices. Each face F of T gives k − 1 vertices and $\binom{k-1}{2}$ edges in C(F) and (k − 1)s_F edges in H(F). So G has (g/2 − r) + r(k − 1) = r(k − 2) + g/2 vertices and has $g/2 + r\binom{k-1}{2} + (k-1)g$ edges, as we have claimed.
Let us check that G is a k-truss. For an edge e of T, the two corresponding faces include copies of K_{k−1}, so e has at least 2k − 2 ≥ k triangles. An edge of C(F) has k − 3 triangles within C(F) and at least s_F ≥ 4 triangles from vertices of T(F), for a total of at least k + 1 triangles. An edge of H(F) has 2 triangles in T(F) and k − 2 triangles within C(F), a total of k triangles.
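The per-face triangle counts used in this check can be reproduced on a single gadget G(F) in isolation. The sketch below is an illustration with hypothetical helper names; it models one face only, so an edge of T(F) sees just the k − 1 triangles from its own C(F), and the remaining k − 1 come from the neighboring face in the full construction.

```python
from itertools import combinations

def face_gadget(k, s):
    """Edge sets (C, H, T) of one gadget: a K_{k-1} joined completely to an s-cycle."""
    clique = [("c", i) for i in range(k - 1)]
    cycle = [("t", i) for i in range(s)]
    C = {frozenset(p) for p in combinations(clique, 2)}
    T = {frozenset((cycle[i], cycle[(i + 1) % s])) for i in range(s)}
    H = {frozenset((c, t)) for c in clique for t in cycle}
    return C, H, T

def triangle_counts(edges):
    """Map each edge to the number of triangles containing it."""
    nbrs = {}
    for e in edges:
        u, v = tuple(e)
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return {e: len(nbrs[tuple(e)[0]] & nbrs[tuple(e)[1]]) for e in edges}
```

For k = 4 and a square face (s = 4, as in Figure 1), every edge of H(F) lies in exactly k = 4 triangles, every edge of C(F) in (k − 3) + s = 5, and every edge of T(F) in k − 1 = 3 within this single gadget.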
We next need to show that G is critical. Suppose that G(L) is a k-truss for L ⊆ E; we need to show that L = E or L = ∅. We do this in four stages.
(a) We first claim that, for every face F, either L contains all the edges in H(F), or none of them. For, suppose L omits an edge e = (u, v) ∈ H(F), where u ∈ T(F) and v ∈ C(F).
Every other edge e′ in H(F) incident to u would then have at most k − 1 triangles in L, and so also e′ ∉ L. Thus, L contains no edges of H(F) incident to u.
Next, consider a vertex u′ adjacent to u in T(F). Any edge e′ ∈ H(F) incident on u′ can now have at most k − 1 triangles in L, since one of its triangles in G(F) used an edge of H(F) incident on u. Thus, L must omit all the edges of H(F) incident on u′ as well.
Continuing this way around the cycle T(F), we see that H(F) ∩ L = ∅.
(b) We next claim that for every face F, if L omits any edge e ∈ G(F) then it omits all the edges in H(F). For, note that this edge e participates in a triangle with some edge e′ ∈ H(F).
Thus e′ has at most k − 1 triangles in L and must be omitted. By part (a), this implies that all the edges in H(F) are omitted.
(c) We next claim that for every face F, either L contains all the edges in G(F), or none of them.
From parts (a) and (b), we see that if L omits any such edge, then it omits all the edges in H(F). This implies that each edge e ∈ C(F) has at most k − 2 triangles in L, and so e ∉ L. Similarly, each edge e ∈ T(F) has at most k − 1 triangles in L, coming from the graph C(F′) where F′ is the other face touching e; thus also e ∉ L.
(d) Finally, we claim that either L = E or L = ∅. For, suppose that L omits an edge e ∈ G(F) for some face F. By part (c), L omits every edge in G(F). Now note that if F′ touches F in T, then L omits an edge of G(F′), namely, the common edge of F and F′. By part (c) this implies G(F′) ∩ L = ∅. Since T is connected, continuing this way around T we see that L = ∅.
We now get our final estimate for M*_{n,k}:
Theorem 3.8. For k ≥ 2 and n ≥ k + 4, we have
Proof. The lower bound is an immediate consequence of Observation 3.2. Lemma 3.6 shows that
and we are done. Similarly, Theorem 3.4 already shows this result when k = 2.
So suppose that k ≥ 3 and n > 2k, and we want to show the upper bound on M*_{n,k}. We write n = ik + j, where j ∈ {0, ..., k − 1} and i ≥ 2. As we show in Lemma B.1, for these parameters there is a toroidal embedding satisfying the conditions of Lemma 3.7, whose faces consist of two (j + 4)-cycles and i − 2 four-cycles.
The sum of edge counts for this embedding is given by ∑_F s_F = 4(i − 2) + 2(j + 4) = 4i + 2j. By Lemma 3.7, there is a critical k-truss with ik + j = n vertices and m

Practical truss decomposition algorithm
We now present Algorithm 2 to compute the trussness τ(e) of every edge e in a graph. Recall that τ(e) is the maximal value k such that e is in a k-truss and that, after this has been computed, we can easily compute the k-truss-components of G by discarding all edges e with τ(e) < k and finding the connected components of the resulting graph.
Our algorithm here is inspired by the Wang & Cheng [21] and Huang et al. [12] algorithms, but uses simpler data structures. To explain, let us provide a brief summary of their algorithms. At each stage, they find an edge with the fewest incident triangles in the remaining graph, remove this edge, and then update the triangle counts for all neighboring edges. If an edge e has k triangles when it is removed, then it has τ(e) = k + 1. This process continues until all the edges have been removed from the graph.
While this is conceptually simple, it can be cumbersome to implement in practice. In particular, this requires relatively heavy-weight data structures to maintain the edges sorted in increasing order of triangle counts. For example, Wang & Cheng use a method of [2] based on a four-level hierarchy of associative arrays. (See [5] for related data structures.) While this could certainly be implemented, it is also clearly more complex than primitive data structures such as arrays.
The key idea of our new algorithm is that, instead of sorting the edges by triangle count, we only maintain an unordered list of edges whose triangle count is below a given threshold. We can afford to periodically re-scan the graph for edges with few triangles.
Let us begin by recalling the standard simple algorithm to enumerate the triangles in G:
Algorithm 1
for all w ∈ N(u) do
Instead of statically listing triangles, as in Algorithm 1, our truss decomposition Algorithm 2 keeps track of them as edges are removed from the graph. The main data structure is the array △, which stores the number of triangles in the residual graph containing any given edge e. We also use a sentinel value denoted ∅ to indicate that edge e is no longer present in the residual graph. Other data structures include a stack S and a list L of edges which need to be processed.
Algorithm 2
1: Initialize △(e) to the number of triangles containing e, for every edge e
2: Initialize L ← E and S ← empty stack
3: for k = 1, 2, ... while L is non-empty do
4: for all edges e ∈ L do
5: if △(e) = k − 1 then push e onto S
6: if △(e) = ∅ then remove e from L
7: end for
8: while S is non-empty do
9: Pop edge e from S. Let e = (u, v) such that d(u) ≤ d(v)
10: △(e) ← ∅
11: for all w ∈ N(u) do
12: if (v, w) ∈ E and △(u, w) ≠ ∅ and △(v, w) ≠ ∅ then
13: △(u, w) ← △(u, w) − 1; if △(u, w) = k − 1 then push (u, w) onto S
14: △(v, w) ← △(v, w) − 1; if △(v, w) = k − 1 then push (v, w) onto S
15: end if
16: end for
17: Output τ(e) = k − 1
18: end while
19: end for
We refer to each iteration of the loop at line (3) of Algorithm 2 as round k. Let us first remark on the implementation of L. At first glance, it would appear to require a linked list, since we need to remove edges e from L in line (6). However, we only delete elements while we are iterating over the entire list, and so we can instead store L as a simple array. When we want to delete e from L, we just swap it to the end of the buffer instead.
We also note that round k = 1 can be simplified: for an edge e with △(e) = 0, we can immediately output τ(e) = 0 and we do not need to push e onto the stack. This optimization can be useful for graphs which have a relatively small number of triangles.
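The round-based peeling idea described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than a faithful rendering of Algorithm 2: the function name `truss_decomposition` is hypothetical, vertices are assumed to be comparable (e.g. integers), `None` plays the role of the sentinel ∅, and the list L is rebuilt each round with a list comprehension instead of the in-place swap-delete.

```python
from collections import defaultdict

def truss_decomposition(edges):
    """Return {edge: trussness}, with each edge keyed as a sorted vertex pair."""
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    key = lambda a, b: (a, b) if a <= b else (b, a)
    # tri[e] is the triangle count in the residual graph; None marks a removed edge.
    tri = {key(u, v): len(nbrs[u] & nbrs[v]) for u, v in edges}
    live, tau, k = list(tri), {}, 1
    while live:
        live = [e for e in live if tri[e] is not None]   # drop removed edges
        stack = [e for e in live if tri[e] == k - 1]     # edges that fail round k
        while stack:
            e = stack.pop()
            u, v = e
            if len(nbrs[u]) > len(nbrs[v]):
                u, v = v, u                              # scan the smaller side
            tri[e] = None
            for w in nbrs[u]:
                if w in nbrs[v]:
                    f, g = key(u, w), key(v, w)
                    if tri[f] is not None and tri[g] is not None:
                        for h in (f, g):                 # triangle (u, v, w) disappears
                            tri[h] -= 1
                            if tri[h] == k - 1:
                                stack.append(h)
            tau[e] = k - 1
        k += 1
    return tau
```

As a usage example, on a 4-clique glued at one vertex to a triangle that carries a pendant edge, the clique edges receive trussness 2, the triangle edges 1, and the pendant edge 0.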
We now show Algorithm 2 has the claimed complexity and correctly computes the values τ(e). At any given point in the algorithm, we define the set of edges e ∈ E with △(e) ≠ ∅ as the residual edges and denote them by R.
Proposition 4.2. Any edge e gets added to S at most once over the entire lifetime of Algorithm 2.
Proof. If e gets added to S in round k, then line (10) ensures that △(e) = ∅ by the end of round k, so that e never gets added in subsequent rounds. Also, an edge e can be added to S at most once in a given round, since before adding e to S in line (13) or (14) we first decrement △(e).
Proof. Since R = E initially, these properties are satisfied at line (2). Now suppose these properties are satisfied at the end of round k − 1; we want to show they remain satisfied in round k as well.
From property (b), we know that △(e) ≥ k − 1 for all edges e ∈ R at the beginning of round k. Lines (4)-(7) maintain the properties. Now, consider the state just before line (9), where we are processing some edge e ∈ S. Property (c) is maintained since e does not appear elsewhere in S. For property (a), since e gets removed from R at line (10), the counts in △ must be updated for all triangles of R involving e. Lines (13)-(14) are reached for each such triangle and △(f) is indeed properly adjusted for each edge f neighboring e. Finally for property (b), note that if any such edge f was placed into S, it would necessarily have f ∈ R and △(f) = k − 1.
Proposition 4.4. An edge e is in R at the end of round k if and only if τ(e) ≥ k.
Proof. By Proposition 4.3(b), every edge f ∈ R has either △(f) ≥ k or f ∈ S. Since S is empty after line (18), this implies that each e ∈ R has at least k triangles in G(R), and hence τ(e) ≥ k.
Conversely, suppose that some edge e with τ(e) ≥ k gets removed from R before round k + 1. Let f be the first such edge removed and let E′ denote the k-truss-component containing f. Consider the state before line (9) when f is removed. Since f is the first such removed edge, all the other edges in E′ remain in R and so f still has at least k triangles in R. This contradicts Proposition 4.3(c).

Truncated truss decomposition using matrix multiplication
We now develop a theoretically more efficient algorithm for truncated truss decomposition up to some given bound k_trunc. This allows us to compute the k-truss-components for any value k ≤ k_trunc, by discarding edges with τ(e) < k and running depth-first search on the resulting graph.
The algorithm here, like Algorithm 2, is based on removing edges and updating triangle counts. The crux of the algorithm is the following observation. When an edge is removed, it must be incident on fewer than k triangles in the residual graph. If we could enumerate these triangles efficiently, then we could potentially update their edges in only O(k) time.
There is a long history of fast matrix multiplication for triangle enumeration [1,22,3]. These prior algorithms are inherently static: they treat the graph G as a fixed input, and the output is the triangle count or list of triangles. To compute the values τ(e), by contrast, we must dynamically maintain the triangle lists as edges are removed. We develop an algorithm based on methods of [11,3] which reduce triangle enumeration to finding witnesses for boolean matrix multiplication.
Our algorithm uses multiplication of rectangular matrices; see [13] for further details. To measure the cost of this operation, we define the function Γ : [0, 1] → R^+ as: Γ(s) = inf{p ∈ R : there exists an O(t^p)-time algorithm for multiplying a t × t^s matrix by a t^s × t matrix}. It is shown in [10] that Γ(s) = 2 for s ≤ 0.31, and it is conjectured that Γ(s) = 2 for all s ∈ [0, 1]. The value Γ(1) is also known as the linear-algebra constant ω. By standard reductions, there is a single randomized algorithm to multiply t × t^s by t^s × t matrices in time t^{Γ(s)+o(1)} for all s.

Algorithm description
We begin by generating $L = 10 k_{\mathrm{trunc}} \log n$ random vertex sets $X_1, \dots, X_L$, where each vertex goes into each $X_\ell$ independently with probability $q = 1/k_{\mathrm{trunc}}$. We also define $Y_v = \{\ell \in [L] : v \in X_\ell\}$ for each vertex $v$. The algorithm maintains two data structures corresponding to these sets, keeping track of the following information for each edge $e = (u, v)$:
1. The total triangle count $\triangle(e)$
2. $S(e, \ell) = \sum_{w \in N(u) \cap N(v) \cap X_\ell} \mathrm{ID}(w)$ for each $\ell \in [L]$
An outline is shown in Algorithm 3. We provide more detail below on the implementation and runtimes of the steps. The proof of correctness is essentially the same as Theorem 4.5, so we do not provide it here.
while $\triangle(e) < k$ for some edge $e \in G$ do
    Remove $e$ from $G$ and appropriately update $S$, $\triangle$
We first compute the contribution to $S$ coming from triangles with at least one light vertex. We do this by looping over light vertices and pairs of their incident edges. For each such triangle $t = (u, v, w)$, we update $S((u, v), \ell) \leftarrow S((u, v), \ell) + \mathrm{ID}(w)$ for each $\ell \in Y_w$; we similarly update the values $S((u, w), \ell)$ and $S((v, w), \ell)$. This simple algorithm has expected runtime $m^{2-b+o(1)}$.
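To make the two data structures concrete, the following sketch builds the random index sets $Y_v$, the triangle counts, and the sums $S(e, \ell)$ by direct neighborhood intersection. It is a naive illustration only and does not use the light/heavy split or matrix multiplication. The names `build_sketches`, `count` and `seed` are hypothetical, and two conventions are assumed rather than taken from the text: $\mathrm{ID}(w) = w + 1$ (so that an empty sum is 0) and the natural logarithm in $L$.

```python
import math
import random


def build_sketches(n, edges, k_trunc, seed=0):
    """Naively build Y_v, the triangle counts and the sums S(e, l).

    Assumptions (not from the paper): vertices are 0..n-1, ID(w) = w + 1,
    and L uses the natural logarithm.
    """
    rng = random.Random(seed)
    L = max(1, math.ceil(10 * k_trunc * math.log(max(n, 2))))
    q = 1.0 / k_trunc

    # Y[v] lists the indices l for which v was sampled into X_l.
    Y = [[l for l in range(L) if rng.random() < q] for _ in range(n)]

    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    count = {}  # edge -> number of triangles containing it
    S = {}      # edge -> list of L sums of IDs of sampled common neighbors
    for u, v in edges:
        e = (min(u, v), max(u, v))
        common = adj[u] & adj[v]
        count[e] = len(common)
        sums = [0] * L
        for w in common:
            for l in Y[w]:
                sums[l] += w + 1  # add ID(w)
        S[e] = sums
    return Y, count, S
```

Whenever exactly one common neighbor $w$ of an edge lands in $X_\ell$, the entry `S[e][l]` equals `w + 1`, which is what the later enumeration step relies on.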
We next consider the triangles with only heavy vertices. For each index $\ell$ we form two matrices $B_\ell$ and $B'_\ell$ of dimensions $|H| \times |H \cap X_\ell|$. For an edge $e = (u, v)$ on heavy vertices $u, v$, the contribution to $S(e, \ell)$ from heavy vertices $w$ is read off from the product of these two matrices. This matrix multiplication can be computed in $O(m^{2b})$ memory. As $|H \cap X_\ell|$ is a binomial random variable with mean $q|H| \le 2m^{b-a}$, the total expected time for this over all indices $\ell$ is $m^{a + b\Gamma(1-a/b) + o(1)}$.
Line (4). We use similar data structures to Algorithm 2 for this. For each round $k$, we check if $\triangle(e) < k$ for each edge $e$. Whenever we process an edge $e$ in lines (4)-(9), we check if $\triangle(f) = k-1$ for some neighboring edge $f$ and, if so, add $f$ to a stack.
Line (6). We use the following primary algorithm to enumerate triangles containing edge $e = (u, v)$: for each $\ell \in [L]$ with $S(e, \ell) \le n$, let $x_\ell$ be the vertex with $\mathrm{ID}(x_\ell) = S(e, \ell)$ and test if there are edges $(u, x_\ell), (v, x_\ell)$ in the residual graph. If so, then output a triangle $(u, v, x_\ell)$.
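For the heavy-vertex step, one choice of the two matrices that is consistent with the definition of $S(e, \ell)$ is written out below. This is an assumed reconstruction for illustration, not necessarily the exact form used by the authors; the names $B_\ell$ and $B'_\ell$ and the use of a transpose are assumptions. Rows are indexed by heavy vertices and columns by the sampled heavy vertices $w \in H \cap X_\ell$.

```latex
% Assumed reconstruction: u, v range over H, and w ranges over H \cap X_\ell.
B_\ell[u, w] =
  \begin{cases} 1 & \text{if } (u, w) \in E \\ 0 & \text{otherwise} \end{cases}
\qquad
B'_\ell[v, w] =
  \begin{cases} \mathrm{ID}(w) & \text{if } (v, w) \in E \\ 0 & \text{otherwise} \end{cases}

\bigl( B_\ell \, {B'_\ell}^{\top} \bigr)[u, v]
  = \sum_{w \in H \cap X_\ell} B_\ell[u, w] \, B'_\ell[v, w]
  = \sum_{w \in N(u) \cap N(v) \cap H \cap X_\ell} \mathrm{ID}(w)
```

Under this choice the product has shape $|H| \times |H \cap X_\ell|$ by $|H \cap X_\ell| \times |H|$. With $|H| \le 2m^b$ and $|H \cap X_\ell|$ of expected size about $m^{b-a}$, this is the $t \times t^s$ by $t^s \times t$ setting with $t = m^b$ and $s = 1 - a/b$, which matches the exponent $b\,\Gamma(1 - a/b)$ in the stated runtime.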
Let us define $A$ to be the set of vertices $w$ where the edges $(u, w)$ and $(v, w)$ remain in the current residual graph, and let $W \subseteq A$ denote the set of all triangle vertices $x_\ell$ enumerated in the primary algorithm. Note that we also maintain the residual triangle count $\triangle(e) = |A|$. If $|W|$ is equal to the known value $\triangle(e)$, then we have found all the triangles containing $e$. If $|W| < \triangle(e)$, then we use a second, slower, fallback option: we simply loop over all vertices in $V$.
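The primary enumeration and its fallback can be sketched as below. This is a hedged illustration: the function name `triangles_on_edge` and its argument layout are hypothetical, the residual graph is assumed to be stored as a list of neighbor sets, and the same assumed convention $\mathrm{ID}(x) = x + 1$ is used so that a zero sum means no sampled common neighbor.

```python
def triangles_on_edge(u, v, sums, adj, expected_count):
    """Return the third vertices of all triangles containing edge (u, v).

    sums[l] plays the role of S(e, l); adj is the residual adjacency
    (list of sets); expected_count is the maintained triangle count.
    Assumes ID(x) = x + 1, which is a convention chosen for this sketch.
    """
    n = len(adj)
    found = set()
    # Primary algorithm: a sum that is a valid ID is a candidate witness.
    for s in sums:
        if 1 <= s <= n:
            x = s - 1
            # A sum of several IDs may collide with a valid ID, so the
            # candidate is accepted only if both edges really exist.
            if x in adj[u] and x in adj[v]:
                found.add(x)
    # Fallback: if some triangle was missed, scan every vertex.
    if len(found) < expected_count:
        found = {w for w in range(n) if w in adj[u] and w in adj[v]}
    return sorted(found)
```

The adjacency test makes every reported vertex a genuine triangle, so the randomness affects only how often the slower fallback runs, not correctness.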
We argue now that $|W| = \triangle(e)$ with high probability. Note first that the update sequence in Algorithm 3 is not affected by the randomness in the sets $X_\ell$. So $A$ can be regarded as a deterministic quantity. For an arbitrary vertex $w \in A$, observe that if $A \cap X_\ell = \{w\}$ for some index $\ell$, then $S(e, \ell) = \mathrm{ID}(w)$ and hence $w$ will go into $W$. For each $\ell$, we have $A \cap X_\ell = \{w\}$ with probability $q(1-q)^{|A|-1}$. As $|A| \le k \le k_{\mathrm{trunc}}$ and $q = 1/k_{\mathrm{trunc}}$ and $L = 10 k_{\mathrm{trunc}} \log n$, this shows that every $w \in A$ goes into $W$ with high probability.
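A calculation consistent with these parameters is sketched below. It is a reconstruction under stated assumptions, not the original display: the logarithm is taken to be natural, $k_{\mathrm{trunc}} \ge 2$, and the final step uses a union bound over the at most $n$ vertices of $A$, together with independence of the sets $X_\ell$ across indices.

```latex
% Single index \ell: w is the only sampled element of A.
\Pr[A \cap X_\ell = \{w\}] = q(1-q)^{|A|-1}
  \ge \frac{1}{k_{\mathrm{trunc}}}
      \Bigl(1 - \frac{1}{k_{\mathrm{trunc}}}\Bigr)^{k_{\mathrm{trunc}}-1}
  \ge \frac{1}{e \, k_{\mathrm{trunc}}}

% All L independent indices fail to isolate w.
\Pr[w \notin W]
  \le \Bigl(1 - \frac{1}{e \, k_{\mathrm{trunc}}}\Bigr)^{L}
  \le \exp\Bigl(-\frac{10 \, k_{\mathrm{trunc}} \ln n}{e \, k_{\mathrm{trunc}}}\Bigr)
  = n^{-10/e}

% Union bound over the vertices of A.
\Pr[W \ne A] \le |A| \cdot n^{-10/e} \le n^{1 - 10/e} < n^{-2.6}
```

The constant 10 in $L$ is what leaves an exponent below $-2$ after the union bound, so the fallback scan is expected to run only rarely.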

Acknowledgments
Thanks to Michael Murphy, Noah Streib, Lowell Adams, Tad White and Randy Dougherty for ideas and discussion. Thanks to the anonymous journal and conference reviewers for helpful suggestions and for pointing us to some useful references.

A Properties of degeneracy and average degeneracy
The degeneracy of graph $G$, denoted $\delta(G)$, is the minimum value such that for every vertex set $U \subseteq V$, there is a vertex in $G[U]$ of degree at most $\delta(G)$. Many graph classes have bounded degeneracy; for example, planar graphs have degeneracy at most 5.
We remark that there is a connection between degeneracy and graph trussness, stated in Proposition A.1.
A number of previous triangle and trussness algorithms such as [5,12] have analyzed runtime in terms of degeneracy as well as a closely related graph parameter known as arboricity, denoted $\alpha(G)$. This is also related to a parameter known as the pseudo-arboricity $\alpha^*(G)$. See [15] for definitions and background. The following are some standard and well-known bounds:
Proposition A.2. Consider a graph $G = (V, E)$ with $m = |E|$ edges.
• $G$ has an edge-orientation in which every vertex has out-degree at most $\alpha^*(G)$.
• $G$ has an acyclic edge-orientation in which every vertex has out-degree at most $\delta(G)$.
Note that, in light of the last two bounds in Proposition A.2, asymptotic runtime bounds are equivalent for $\alpha$, $\alpha^*$, and $\delta$.
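The degeneracy and the acyclic orientation from Proposition A.2 can both be obtained by repeatedly deleting a minimum-degree vertex. The sketch below is a simple quadratic-time illustration of that standard peeling idea, not an optimized implementation and not code from the paper; the function name `degeneracy_orientation` is hypothetical.

```python
def degeneracy_orientation(n, edges):
    """Return (degeneracy, oriented_edges) by min-degree peeling.

    Each edge is oriented from the endpoint deleted earlier to the one
    deleted later, so the orientation is acyclic and every out-degree
    is at most the degeneracy.
    """
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    deg = [len(a) for a in adj]  # residual degrees
    removed = [False] * n
    order = []
    degeneracy = 0
    for _ in range(n):
        # Pick a remaining vertex of minimum residual degree.
        v = min((x for x in range(n) if not removed[x]), key=lambda x: deg[x])
        degeneracy = max(degeneracy, deg[v])
        removed[v] = True
        order.append(v)
        for w in adj[v]:
            if not removed[w]:
                deg[w] -= 1

    pos = {v: i for i, v in enumerate(order)}
    oriented = [(u, v) if pos[u] < pos[v] else (v, u) for u, v in edges]
    return degeneracy, oriented
```

On the complete graph $K_4$ this returns degeneracy 3, and on a 5-cycle it returns 2. A bucket queue would bring the running time down to linear, at the cost of a longer listing.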
For any edge $e = (u, v)$, define $h(e) = \min(d(u), d(v))$. With this notation, we recall the definition of average degeneracy as $\bar\delta(G) = \frac{1}{m} \sum_{e \in E} h(e)$. The following result shows some intuition behind the term "average degeneracy." To state it informally, "most" edges of $G$ have degeneracy "not much larger" than $\bar\delta(G)$.
Proof. The first result (in a slightly weaker form) was shown by [5]; for completeness, we provide a version of their proof here. By Proposition A.2 there exists an orientation of $E$ such that every vertex has out-degree at most $\alpha^*(G)$. Now compute:

B Constructions of toroidal graph embedding
Lemma B.1. For any integers $i \ge 0$, $t \ge 4$, there is a graph embedding in the torus whose faces consist of two $t$-cycles and $i$ four-cycles, and where each edge is in two distinct faces.
Proof. There are two cases depending on the parity of $t$, as shown in Figure 2.
Case I: $t = 2s$. We view each $t$-cycle as a rectangle with side lengths $s-1$ and 1. They are stacked vertically on top of $i$ unit-length squares. Overall, we have one large rectangle with vertical sides of length $i + 2(s-1)$ and horizontal sides of length 1.
To get a torus, we first identify the top and bottom edges (marked X) to form a cylinder with circular circumference $i + 2(s-1)$. We next identify the left side of the cylinder with a twisted version of the right side; namely, we rotate one of the sides by $s-1$. Thus, the two edges marked Y are identified.
Case II: $t = 2s+1$. We view each $t$-cycle as a trapezoid with base length 1 and side lengths $s-1$ and $s$, joined to form a rectangle of height $2s-1$. We put the $i$ squares below them, and use a similar twisting process to join them into a torus.

Observation 2.1.
Any vertex $v$ in a $k$-truss $G$ must have degree at least $k+1$. Furthermore, the graph $G[N^+(v)]$ has at least $\binom{k+1}{2}$ triangles and $\binom{k+2}{2}$ edges.

Algorithm 2
1: Initialize an empty stack $S$, and initialize an edge-list $L = E$
2: Using Algorithm 1, compute the triangle counts $\triangle(e)$ for all $e \in E$
3: for $k = 1, \dots,$

Proposition 4.3.
Algorithm 2 maintains the following loop invariants on the data structures:
(a) The array $\triangle$ correctly records triangle counts for the graph $G(R)$.
(b) For every edge $e \in R$, either $\triangle(e) \ge k$ or $e \in S$.
(c) Every edge $e \in S$ satisfies $e \in R$ and $\triangle(e) < k$.

Theorem 4.5.
Algorithm 2 correctly computes $\tau(e)$ for all edges $e$.
Proof. Suppose that line (17) outputs $\tau(e) = k-1$ for some edge $e$. By Proposition 4.3(c), this implies that $e$ must have been in $R$ at the beginning of round $k$, and hence by Proposition 4.4 we have $\tau(e) \ge k-1$. On the other hand, $e$ got removed from $R$ at line (9), and so by Proposition 4.4 we have $\tau(e) < k$. Thus, $\tau(e)$ is indeed $k-1$. Note that the termination condition $k = \sqrt{2m}$ of line (3) follows from Observation 2.4.
Theorem 4.6. Algorithm 2 runs in $O(m \bar\delta(G))$ time and $O(m)$ memory.
Proof. The array $\triangle$ can be referenced or updated in $O(1)$ time. Bearing in mind our remarks about implementing $L$ as an array, all operations on $S$ and $L$ take $O(1)$ time. Observation 4.1 shows that line (2) takes $O(m \bar\delta(G))$ time and $O(m)$ memory. The data structures $\triangle$, $S$, $L$ are indexed by edges, and so overall take $O(m)$ memory.
Next, let $L_k$ denote the value of list $L$ at the beginning of round $k$. By Proposition 4.4, $L_k$ consists solely of edges with $\tau(e) \ge k-1$. The runtime of the loop at lines (4)-(7) at round $k$ is linear in the length of $L_k$, and so the total work over all rounds is a constant factor times
$$\sum_k |L_k| \le \sum_k |\{e \in E : \tau(e) \ge k-1\}| \le \sum_e (\tau(e) + 1).$$
For any edge $e = (u, v)$ with $\tau(e) = k$, we have $\triangle(e) \ge k$, so clearly $d(u) \ge k$ and $d(v) \ge k$. Thus $\tau(e) \le k \le \min(d(u), d(v))$, and so
$$\sum_k |L_k| \le \sum_e \bigl(1 + \min(d(u), d(v))\bigr) \le m + m \bar\delta(G) \le O(m \bar\delta(G)).$$
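The peeling idea behind these results can be illustrated with a plain edge-removal loop. The sketch below is not Algorithm 2 itself, whose full listing is not reproduced here, and it makes no attempt to meet the stated runtime bound. It computes $\tau(e)$ under the convention used in this paper (every edge of a $k$-truss lies in at least $k$ triangles); the names `trussness` and `support` are hypothetical.

```python
def trussness(n, edges):
    """Compute tau(e) for every edge by repeated edge peeling.

    At level k, every edge whose residual triangle count is at most k
    is removed and assigned tau = k; removals cascade via a stack.
    """
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def key(u, v):
        return (u, v) if u < v else (v, u)

    # support[e] = number of triangles containing e in the residual graph
    support = {key(u, v): len(adj[u] & adj[v]) for u, v in edges}
    tau = {}
    k = 0
    while support:
        stack = [e for e, c in support.items() if c <= k]
        if not stack:
            k += 1  # residual graph is a (k+1)-truss; move to next level
            continue
        while stack:
            e = stack.pop()
            if e not in support:
                continue  # already removed via an earlier stack entry
            u, v = e
            del support[e]
            tau[e] = k
            adj[u].discard(v)
            adj[v].discard(u)
            # Each common neighbor w loses the triangle (u, v, w).
            for w in adj[u] & adj[v]:
                for f in (key(u, w), key(v, w)):
                    support[f] -= 1
                    if support[f] <= k:
                        stack.append(f)
    return tau
```

On $K_4$ every edge receives $\tau = 2$, and on a triangle with one pendant edge the triangle edges receive 1 and the pendant edge 0, in line with the bound $\tau(e) \le \min(d(u), d(v))$ used in the proof.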

end for
10: for all remaining edges in $G$ do
    Output $\tau(e) \ge k_{\mathrm{trunc}}$
We write $k_{\mathrm{trunc}} = m^a$; since $\tau(e) \le \sqrt{2m}$ for all edges $e$, we assume that $k_{\mathrm{trunc}} \le \sqrt{2m}$ and hence $a \in [0, \tfrac{1}{2} + o(1)]$. We also use the notation $\tilde{O}(x) = x \times \mathrm{polylog}(x)$ for any quantity $x$.
Line (1). With standard sorting methods, this takes $\tilde{O}(Ln)$ time and memory in the worst case.
Line (2). We use a method based on [22] for this step. The calculation of $\triangle(e)$ is similar to $S(e, \ell)$, so we only show the latter. We divide the vertices into two classes: the heavy vertices $v$ (if $d(v) > m^{1-b}$) and the light vertices (if $d(v) \le m^{1-b}$). Here, $b$ is a parameter in the range $[a, 1]$ that we will set later. We let $H$ denote the set of heavy vertices, and observe that $|H| \le 2m/m^{1-b} = 2m^b$.

Proposition A.1.
For any edge $e$ we have $\tau(e) \le \delta(G) - 1$.
Proof. Consider any $k$-truss-component $E'$, and let $U$ denote the set of vertices with at least one edge in $E'$. Then $G[U]$ contains $G(E')$. By Observation 2.1, every vertex in $G(E')$ has degree at least $k+1$. So $G[U]$ has minimum degree at least $k+1$, which shows that $\delta(G) \ge k+1$.
$$\sum_{e \in E} h(e) \le \sum_{\substack{e = (u,v) \in E \\ e \text{ oriented to } v}} d(u) = \sum_{u \in V} \text{out-degree}(u) \times d(u) \le \sum_{u \in V} \alpha^*(G) \times d(u) = 2m\alpha^*(G)$$
For the second result, let $L$ denote the set of edges $e$ with $h(e) \le s = \bar\delta(G)/x$. Since $\sum_e h(e) = m \bar\delta(G)$, we must have $|E - L| s \le m \bar\delta(G)$, i.e. $|L| \ge (1-x)|E|$. Now consider a vertex set $U \subseteq V$ and let $G' = G(L)[U]$. Each edge $e \in G'$ has an endpoint $u \in U$ with $d(u) \le s$. Thus, the total number of edges in $G'$ is at most $\sum_{u \in U : d(u) \le s} d(u) \le |U| s$. Since this holds for all $U$, the graph $G(L)$ has degeneracy at most $s$.
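The average degeneracy used in this argument is straightforward to compute directly from its definition as the mean of $h(e) = \min(d(u), d(v))$ over all edges. The short sketch below is illustrative only; the function name `average_degeneracy` is hypothetical and the graph is assumed to be simple with at least one edge.

```python
def average_degeneracy(n, edges):
    """Mean over all edges of min(d(u), d(v)).

    Assumes a simple graph on vertices 0..n-1 with at least one edge.
    """
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return sum(min(deg[u], deg[v]) for u, v in edges) / len(edges)
```

A star with four leaves has maximum degree 4 but average degeneracy 1, since every edge has a degree-1 endpoint; the complete graph $K_4$ has average degeneracy 3.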

Figure 2: Two $t$-cycles packed on top of $i$ four-cycles. The case of even values of $t$ is shown on the left side and the case of odd values of $t$ on the right side.