Covering a Graph with Clubs

Finding cohesive subgraphs in a network has been widely investigated. Several alternative formulations of cohesive subgraph have been proposed; a notable one is the s-club, which is a subgraph whose diameter is at most s. Here we consider a natural variant of the well-known Minimum Clique Cover problem, where we aim to cover a given graph with the minimum number of s-clubs, instead of cliques. We study the computational and approximation complexity of this problem, when s is equal to 2 or 3. We show that deciding if there exists a cover of a graph with three 2-clubs is NP-complete, and that deciding if there exists a cover of a graph with two 3-clubs is NP-complete. Then, we consider the approximation complexity of covering a graph with the minimum number of 2-clubs and 3-clubs. We show that, given a graph G = (V, E) to be covered, covering G with the minimum number of 2-clubs is not approximable within factor $O(|V|^{1/2-\varepsilon})$, for any ε > 0, and covering G with the minimum number of 3-clubs is not approximable within factor $O(|V|^{1-\varepsilon})$, for any ε > 0. On the positive side, we give an approximation algorithm of factor $2|V|^{1/2}\log^{3/2}|V|$ for covering a graph with the minimum number of 2-clubs.

Submitted: September 2018. Reviewed: February 2019. Revised: February 2019. Accepted: March 2019. Final: March 2019. Published: April 2019.
Article type: Regular paper. Communicated by: P. Ferragina.
A preliminary version of this paper appeared in [9]. Research partially supported by the project “ESIGMA” (ANR-17-CE23-0010).
E-mail addresses: riccardo.dondi@unibg.it (Riccardo Dondi), mauri@disco.unimib.it (Giancarlo Mauri), florian.sikora@dauphine.fr (Florian Sikora), zoppis@disco.unimib.it (Italo Zoppis)


Introduction
The quest for modules inside a network is a well-known and deeply studied problem in network science, with applications in different fields, for example the analysis of biological or social networks. A highly investigated problem is that of finding cohesive subgroups inside a network (for example see [26]), which in graph theory translates into finding highly connected subgraphs. A common approach is to look for cliques, that is graphs whose vertices are all pairwise connected. Several combinatorial problems based on cliques have been considered, notable examples being the Maximum Clique problem ([12, GT19]), the Minimum Clique Cover problem ([12, GT17]), and the Minimum Clique Partition problem ([12, GT15]). The latter is a classical problem in theoretical computer science that, given a graph, asks for a partition of the vertices into the minimum number of cliques. The Minimum Clique Partition problem has been deeply studied since the seminal paper of Karp [17], and its complexity has been investigated in several graph classes, like cubic graphs [5], unit-disk graphs [6,24,10] and bounded clique-width graphs [11].
When analyzing networks, asking for a complete subgraph is sometimes too restrictive, as interesting highly connected graphs do not always have connections between all pairs of vertices, for example due to noise in the data considered.
To overcome this limitation of the clique approach, alternative definitions of highly connected graphs have been proposed, leading to the concept of relaxed clique [18]. A relaxed clique is a graph G = (V, E) whose vertices satisfy a property which is a relaxation of the clique property. Indeed, a clique is a subgraph whose vertices are all at distance one from each other, that is the diameter of the graph is one. Moreover, the vertices of a clique all have the same degree (the number of vertices in the clique minus one). Different definitions of relaxed clique are obtained by modifying one of the properties of a clique. Some variants relax the distance between the vertices of the subgraph sought, thus leading to distance-based relaxed cliques, other variants relax the degree of the subgraph sought, leading to degree-based relaxed cliques, and so on (see [18] for a survey on different definitions of relaxed clique and their algorithmic properties).
In this paper, we focus on a distance-based relaxation. In a clique all the vertices are required to be at distance at most one from each other. Here this constraint is relaxed, so that the vertices have to be at distance at most s, for an integer s ≥ 1. A subgraph whose vertices are all at distance at most s is called an s-club (notice that, when s = 1, an s-club is exactly a clique). The identification of s-clubs inside a network has been defined for the analysis of networks [21,1] and has been recently applied to the analysis of social networks [20,22,27] and biological networks [23,3]. Interesting recent studies have shown the relevance of finding s-clubs in a network [20,22], in particular focusing on finding 2-clubs in real networks like DBLP or a European corporate network.
Contributions to the study of s-clubs mainly focus on the Maximum s-Club problem, that is the problem of finding an s-club of maximum size. Maximum s-Club is known to be NP-hard, for each s ≥ 1 [4]. Even deciding whether there exists an s-club larger than a given size in a graph of diameter s + 1 is NP-complete, for each s ≥ 1 [3]. The Maximum s-Club problem has been studied also in the approximability and parameterized complexity frameworks. A polynomial-time approximation algorithm with factor $O(|V|^{1/2})$ for every s ≥ 2 on an input graph G = (V, E) has been designed [2]. This is optimal, since the problem is not approximable within factor $O(|V|^{1/2-\varepsilon})$, on an input graph G = (V, E), for each ε > 0 and s ≥ 2 [2].
JGAA, 23(2) 271-292 (2019)
Maximum s-Club has been studied also in the parameterized complexity framework. Maximum s-Club, unlike the problem of finding a clique of maximum size, is known to be fixed-parameter tractable when parameterized by the size of an s-club [25,19,7]. The Maximum s-Club problem has been investigated also for structural parameters and specific graph classes [15,14].
In this paper, we consider a different combinatorial problem, where we aim at covering the vertices of a network with a set of subgraphs. Similar to Minimum Clique Partition, we consider the problem of covering a graph with the minimum number of s-clubs such that each vertex belongs to an s-club. We denote this problem by Min s-Club Cover, and we focus in particular on the cases s = 2 and s = 3. We show some analogies and differences between Min s-Club Cover and Minimum Clique Partition. We start in Section 3 by considering the computational complexity of the problem of covering a graph with two or three s-clubs. This is motivated by the fact that Clique Partition is known to be in P when we ask whether there exists a partition of the graph consisting of two cliques, while it is NP-hard to decide whether there exists a partition of the graph consisting of three cliques [13], since Clique Partition is equivalent to Graph Coloring on the complementary graph. As for Clique Partition, we show that it is NP-complete to decide whether there exist three 2-clubs that cover a graph. On the other hand, we show that, unlike Clique Partition, it is NP-complete to decide whether there exist two 3-clubs that cover a graph. These two results also imply that, unless P = NP, Min 2-Club Cover and Min 3-Club Cover do not belong to the class XP for the parameter "number of clubs" in a cover. Notice that when we ask for the existence of a single s-club that covers a graph, we simply have to check in polynomial time whether the given graph is an s-club.
Then, we consider the approximation complexity of Min 2-Club Cover and Min 3-Club Cover. We recall that, given an input graph G = (V, E), Minimum Clique Partition is not approximable within factor $O(|V|^{1-\varepsilon})$, for any ε > 0, unless P = NP [28]. Here we show that Min 2-Club Cover has a slightly different behavior, while Min 3-Club Cover is similar to Clique Partition. Indeed, in Section 4 we prove that Min 2-Club Cover is not approximable within factor $O(|V|^{1/2-\varepsilon})$, for any ε > 0, unless P = NP, while Min 3-Club Cover is not approximable within factor $O(|V|^{1-\varepsilon})$, for any ε > 0, unless P = NP. In Section 5, we present a greedy approximation algorithm that has factor $2|V|^{1/2}\log^{3/2}|V|$ for Min 2-Club Cover, which almost matches the inapproximability result for the problem.
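The remark that partitioning into two cliques is polynomial-time solvable can be illustrated with a small sketch. Two cliques partition a graph exactly when the complementary graph is 2-colorable, so a breadth-first 2-coloring of the complement suffices. The function name `two_clique_partition` and the input format (a vertex list and an edge list) are assumptions made for this illustration only.

```python
from collections import deque

def two_clique_partition(vertices, edges):
    """Return two vertex sets (A, B), each a clique, that partition the graph,
    or None if no such partition exists. One of the sets may be empty."""
    vertices = list(vertices)
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Complement graph: u and w are joined when they are NOT adjacent in G,
    # so they cannot be placed in the same clique.
    comp = {v: {u for u in vertices if u != v and u not in adj[v]}
            for v in vertices}
    side = {}
    for start in vertices:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in comp[u]:
                if w not in side:
                    side[w] = 1 - side[u]
                    queue.append(w)
                elif side[w] == side[u]:
                    return None  # odd cycle in the complement
    part_a = {v for v in vertices if side[v] == 0}
    part_b = {v for v in vertices if side[v] == 1}
    return part_a, part_b
```

For a path on three vertices this returns one edge and one single vertex, while for a cycle on five vertices (whose complement is again an odd cycle) it returns `None`.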
We start the paper by giving in Section 2 some definitions and by formally defining the problem we are interested in.

Preliminaries
Given a graph G = (V, E) and a subset V' ⊆ V, we denote by G[V'] the subgraph of G induced by V'. Given two vertices u, v ∈ V, the distance between u and v in G, denoted by d_G(u, v), is the length of a shortest path from u to v in G.
The diameter of a graph G = (V, E) is the maximum distance between two vertices of V. Given a graph G = (V, E) and a vertex v ∈ V, we denote by . We may omit the subscript G when it is clear from the context. Now, we give the definition of s-club, which is fundamental for the paper. Recall from the introduction that an s-club is a subgraph G[V'], with V' ⊆ V, whose diameter is at most s.
Notice that an s-club must be a connected graph. We present now the formal definition of the Minimum s-Club Cover problem we are interested in.
Problem 1 Minimum s-Club Cover (Min s-Club Cover)
Input: a graph G = (V, E) and an integer s ≥ 2.
Output: a minimum cardinality collection S = {V_1, ..., V_h} such that, for each i with 1 ≤ i ≤ h, V_i ⊆ V, G[V_i] is an s-club, and, for each vertex v ∈ V, there exists a set V_j, with 1 ≤ j ≤ h, such that v ∈ V_j.
We denote by s-Club Cover(h), with 1 ≤ h ≤ |V|, the decision version of Min s-Club Cover that asks whether there exists a cover of G consisting of at most h s-clubs.
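The definitions above can be checked mechanically. The following is a minimal sketch, assuming the graph is given as a dictionary `adj` mapping each vertex to the set of its neighbors; the helper names `induced_distances`, `is_s_club` and `is_s_club_cover` are hypothetical and introduced only for illustration. Distances are computed inside the induced subgraph G[V'], as required by the definition of s-club.

```python
from collections import deque

def induced_distances(adj, subset, source):
    """BFS distances from source inside the subgraph induced by subset."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in subset and w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def is_s_club(adj, subset, s):
    """True if the subgraph induced by subset is connected with diameter <= s."""
    subset = set(subset)
    if not subset:
        return False
    for v in subset:
        dist = induced_distances(adj, subset, v)
        # A vertex missing from dist is unreachable inside the induced subgraph.
        if len(dist) < len(subset) or max(dist.values()) > s:
            return False
    return True

def is_s_club_cover(adj, cover, s):
    """True if every set in cover induces an s-club and every vertex is covered."""
    covered = set().union(*cover)
    return covered == set(adj) and all(is_s_club(adj, c, s) for c in cover)
```

Both checks run in polynomial time, which is the observation used later for membership in NP of the decision versions.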
Notice that in Minimum Clique Partition we can assume that the cliques that cover a graph G = (V, E) partition V, hence the cliques are vertex disjoint. Indeed, it can be shown that there exist h cliques that cover a graph if and only if there exist h cliques that partition the vertices of the graph. Obviously, h cliques that partition the vertices of a graph also cover the graph. On the other hand, if there exist h cliques that cover the vertices of a graph, we can compute h cliques that partition the graph: if two cliques share a vertex, we can remove it from one of the cliques.
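The argument above is constructive, and can be sketched as follows. The function name `cover_to_partition` is an assumption for this illustration. Each clique keeps only the vertices not already assigned to an earlier clique; since being a clique is hereditary, every remaining set is still a clique.

```python
def cover_to_partition(cliques):
    """Turn a list of (possibly overlapping) cliques into pairwise disjoint cliques.

    Each vertex stays in the first clique that contains it. Sets that become
    empty are dropped, so the result has at most as many sets as the input.
    """
    assigned = set()
    partition = []
    for clique in cliques:
        part = set(clique) - assigned
        assigned |= part
        if part:
            partition.append(part)
    return partition
```

The same procedure is not valid for s-clubs with s ≥ 2, because removing a shared vertex may disconnect the remaining vertices or increase their distances.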
We cannot make the assumption that covering and partitioning a graph is essentially the same problem for s-clubs. Indeed, in a solution of Min s-Club Cover, a vertex may be covered by more than one s-club, in order to have a cover consisting of the minimum number of s-clubs. Consider the example of Fig. 1. The two 2-clubs induced by {v_1, v_2, v_3, v_4, v_5} and {v_1, v_6, v_7, v_8, v_9} cover G, and both these 2-clubs contain vertex v_1. However, if we ask for a partition of G, we need at least three 2-clubs (for example the 2-clubs induced by {v_1, v_2, v_3, v_4, v_5}, {v_6, v_7} and {v_8, v_9}). This difference between Minimum Clique Partition and Min s-Club Cover is due to the fact that, while being a clique is a hereditary property, this is not the case for being an s-club. If a graph G is an s-club, then a subgraph of G may not be an s-club (for example a star is a 2-club, but the subgraph obtained by removing its center is no longer a 2-club).

Figure 1: A graph and a cover consisting of two 2-clubs (induced by the vertices in the ovals). Notice that the 2-clubs of this cover must both contain vertex v_1. If v_1 is contained only in one 2-club, for example in the 2-club induced by {v_1, v_2, v_3, v_4, v_5}, then two 2-clubs are needed to cover {v_6, v_7, v_8, v_9}, since the subgraph induced by {v_6, v_7, v_8, v_9} is not a 2-club (v_6 and v_9 have distance 3 in this subgraph).

A problem related to Min s-Club Cover is that of partitioning a graph G = (V, E) into the minimum number of s-clubs, denoted by Min s-Club Partition. Notice that, unlike the case of cliques, while a solution of Min s-Club Partition is also a solution of Min s-Club Cover, the opposite is not true. For example, the cover of Fig. 1 consisting of two 2-clubs is not a solution of Min s-Club Partition. Moreover, an optimal solution of Min s-Club Partition on the example of Fig. 1 consists of three 2-clubs. An optimal solution of Min s-Club Cover on a graph G contains at most as many s-clubs as an optimal solution of Min s-Club Partition on G.
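The gap between covering and partitioning can be verified by brute force on a small graph. The sketch below uses a hypothetical graph that is consistent with the description of Fig. 1 (two cycles on five vertices sharing vertex 1); the exact edge set of the figure is not reproduced in the text, so this choice is an assumption. The helper names `is_s_club`, `all_s_clubs` and `min_family` are likewise introduced only for illustration, and the exhaustive search is meant for tiny graphs only.

```python
from collections import deque
from itertools import combinations

def is_s_club(adj, subset, s):
    """True if the subgraph induced by subset is connected with diameter <= s."""
    for source in subset:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w in subset and w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) < len(subset) or max(dist.values()) > s:
            return False
    return True

def all_s_clubs(adj, s):
    """Enumerate every vertex subset inducing an s-club (exponential time)."""
    verts = sorted(adj)
    return [frozenset(c)
            for k in range(1, len(verts) + 1)
            for c in combinations(verts, k)
            if is_s_club(adj, set(c), s)]

def min_family(adj, s, disjoint):
    """Minimum number of s-clubs covering (or, if disjoint, partitioning) the graph."""
    clubs = all_s_clubs(adj, s)
    everything = frozenset(adj)
    for h in range(1, len(adj) + 1):
        for family in combinations(clubs, h):
            if frozenset().union(*family) != everything:
                continue
            # A covering family is a partition exactly when sizes add up to |V|.
            if disjoint and sum(len(c) for c in family) != len(everything):
                continue
            return h

# Assumed example: cycles 1-2-3-4-5-1 and 1-6-7-8-9-1 sharing vertex 1.
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1),
         (1, 6), (6, 7), (7, 8), (8, 9), (9, 1)]
adj = {v: set() for v in range(1, 10)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

cover_size = min_family(adj, 2, disjoint=False)
partition_size = min_family(adj, 2, disjoint=True)
```

On this assumed graph the minimum cover uses two 2-clubs, while the minimum partition needs three, matching the discussion of Fig. 1.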

Computational Complexity
In this section we investigate the computational complexity of 2-Club Cover and 3-Club Cover. We show that 2-Club Cover(3), that is deciding whether there exists a cover of a graph G with three 2-clubs, and 3-Club Cover(2), that is deciding whether there exists a cover of a graph G with two 3-clubs, are NP-complete.

2-Club Cover(3) is NP-complete
In this section we show that 2-Club Cover(3) is NP-complete by giving a reduction from the Clique Partition(3) problem, that is the problem of deciding whether there exists a partition of a graph G_p = (V_p, E_p) into three cliques. Given an instance G_p = (V_p, E_p) of Clique Partition(3), we construct an instance G = (V, E) of 2-Club Cover(3) (see Fig. 2). The vertex set V is defined as follows: The set E of edges is defined as follows: Before giving the main result of this section, we prove a property of G.
Lemma 1 Let G_p = (V_p, E_p) be an instance of Clique Partition(3) and let G = (V, E) be the corresponding instance of 2-Club Cover(3). Then, given two vertices v_i, v_j ∈ V_p and the corresponding vertices w_i, w_j ∈ V: and only if there exists a vertex w_{i,j} (or w_{j,i}), which is adjacent to both w_i and w_j. But then, by construction,
We are now able to prove the main properties of the reduction.
Lemma 2 Let G_p = (V_p, E_p) be an input graph of Clique Partition(3) and let G = (V, E) be the corresponding instance of 2-Club Cover(3). Then, given a solution of Clique Partition(3) on G_p = (V_p, E_p), we can compute in polynomial time a solution of 2-Club Cover(3) on G = (V, E).
Proof: Consider a solution of Clique Partition(3) on G_p = (V_p, E_p), and let We conclude the proof observing that, by construction, since
Based on Lemma 1, we can prove the following result.
For each d, with 1 ≤ d ≤ 3, V^p_d is defined as: , it is easy to compute in polynomial time a partition of G_p into three cliques (since being a clique is a hereditary property).
We are now able to prove the main result of this section.
Proof: By Lemma 2 and Lemma 3 and from the NP-hardness of Clique Partition(3) [17], it follows that 2-Club Cover(3) is NP-hard. Membership in NP follows easily from the fact that, given three subsets of vertices of G, it can be checked in polynomial time whether they induce 2-clubs and whether they cover all vertices of G.

3-Club Cover(2) is NP-complete
In this section we show that 3-Club Cover(2) is NP-complete by giving a reduction from a variant of Sat called 5-Opposite-Sat. Recall that a literal is positive if it is a non-negated variable, while it is negative if it is a negated variable.
Problem 2 5-Opposite-Satisfiability (5-Opposite-Sat)
Input: a collection of clauses C = {C_1, ..., C_p} over the set of variables X = {x_1, ..., x_q}, where each C_i ∈ C, with 1 ≤ i ≤ p, contains exactly five literals and does not contain both a variable and its negation.
Output: a truth assignment f to the variables in X such that each clause C_i, with 1 ≤ i ≤ p, contains a positive and a negative literal satisfied by f.
A clause C_i is opposite-satisfied by a truth assignment f to the variables X if there exist a positive literal and a negative literal in C_i that are both satisfied by f. Notice that we assume that there exist at least one positive literal and at least one negative literal in each clause C_i, with 1 ≤ i ≤ p, otherwise C_i cannot be opposite-satisfied. Moreover, we assume that each variable in an instance of 5-Opposite-Sat appears both as a positive literal and as a negative literal in the instance. Notice that if this is not the case, for example if a variable appears only as a positive literal, we can assign a true value to the variable, as assigning it false does not contribute to opposite-satisfying any clause. First, we show that 5-Opposite-Sat is NP-complete, which may be of independent interest.
Theorem 2 5-Opposite-Sat is NP-complete.
Proof: We reduce from 3-Sat, where given a set X_3 of variables and a set C_3 of clauses, each of which is a disjunction of 3 literals (a variable or the negation of a variable), we want to find an assignment to the variables such that all clauses are satisfied. Moreover, we assume that each clause in C_3 does not contain both a variable x and its negation ¬x, since such a clause is obviously satisfied by any assignment. The same property holds also for the instance of 5-Opposite-Sat we construct.
Given an instance (X_3, C_3) of 3-Sat, we construct an instance (X, C) of 5-Opposite-Sat as follows. Define X = X_3 ∪ X_N, where X_3 ∩ X_N = ∅ and X_N is defined as follows: ), where l_{i,p}, with 1 ≤ p ≤ 3, is a literal, that is a variable (a positive literal) or a negated variable (a negative literal), we define two clauses C_{i,1} and C_{i,2} as follows: The set C of clauses is defined as follows: We claim that (X_3, C_3) is satisfiable if and only if (X, C) is opposite-satisfiable.
Assume that (X_3, C_3) is satisfiable and let f be an assignment to the variables of X_3 that satisfies C_3. Consider a clause C_i in C_3, with 1 ≤ i ≤ |C_3|. Since it is satisfied by f, it follows that there exists a literal l_{i,p} of C_i, with 1 ≤ p ≤ 3, that is satisfied by f. Define an assignment f' on X that is identical to f on X_3 and, if l_{i,p} is positive, assigns value false to both x_{C,i,1} and x_{C,i,2}, while if l_{i,p} is negative, assigns value true to both x_{C,i,1} and x_{C,i,2}. It follows that both C_{i,1} and C_{i,2} are opposite-satisfied by f'.
Assume that (X, C) is opposite-satisfied by an assignment f'. Consider two clauses C_{i,1} and C_{i,2}, with 1 ≤ i ≤ |C_3|, that are opposite-satisfied by f'. We claim that there exists at least one literal of C_{i,1} and C_{i,2} not in X_N which is satisfied. Assume this is not the case; then, if C_{i,1} is opposite-satisfied, it follows that x_{C,i,1} is true and x_{C,i,2} is false, thus implying that C_{i,2} is not opposite-satisfied. Then, the assignment f that is identical to f' restricted to X_3 satisfies each clause in C_3. Now, since 3-Sat is NP-complete [17], it follows that 5-Opposite-Sat is NP-hard. Membership in NP follows from the observation that, given an assignment to the variables of X, we can check in polynomial time whether each clause in C is opposite-satisfied or not.
Let us now give the construction of the reduction from 5-Opposite-Sat to 3-Club Cover(2). Consider an instance of 5-Opposite-Sat consisting of a set C of clauses C_1, ..., C_p over the set X = {x_1, ..., x_q} of variables. We assume that it is not possible to opposite-satisfy all the clauses by setting at most two variables to true or to false (this can be easily checked in polynomial time).
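The polynomial-time check used for membership in NP can be sketched directly from the definition of opposite-satisfaction. The encoding below is an assumption made for illustration: a literal is a pair `(variable, is_positive)`, a clause is a list of literals, and an assignment is a dictionary from variables to booleans. The function names are hypothetical.

```python
def opposite_satisfied(clause, assignment):
    """True if the clause has a satisfied positive literal and a satisfied
    negative literal under the given assignment."""
    # A positive literal is satisfied when its variable is true.
    has_pos = any(is_pos and assignment[var] for var, is_pos in clause)
    # A negative literal is satisfied when its variable is false.
    has_neg = any((not is_pos) and (not assignment[var]) for var, is_pos in clause)
    return has_pos and has_neg

def is_opposite_satisfying(clauses, assignment):
    """True if every clause is opposite-satisfied by the assignment."""
    return all(opposite_satisfied(c, assignment) for c in clauses)
```

The check is linear in the total number of literals, so a candidate assignment is a valid polynomial-size certificate.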
Before giving the details, we present an overview of the reduction. Given an instance (X, C) of 5-Opposite-Sat, for each positive literal x_i, with 1 ≤ i ≤ q, we define vertices x^T_{i,1}, x^T_{i,2} and for each negative literal ¬x_i, with 1 ≤ i ≤ q, we define a vertex x^F_i. Moreover, for each clause C_j ∈ C, with 1 ≤ j ≤ p, we define a vertex v_{C,j}. We define other vertices to ensure that some vertices have distance not greater than three and to force the membership to one of the two 3-clubs of the solution (see Lemma 4). The construction implies that for each i with 1 ≤ i ≤ q, x^T_{i,1} and x^F_i belong to different 3-clubs (see Lemma 5); this corresponds to a truth assignment to the variables in X. Then, we are able to show that each vertex v_{C,j} belongs to the same 3-club as a vertex x^T_{i,1}, with 1 ≤ i ≤ q, and as a vertex x^F_h, with 1 ≤ h ≤ q, adjacent to v_{C,j} (see Lemma 7); these vertices correspond to a positive literal x_i and a negative literal ¬x_h, respectively, that are satisfied by a truth assignment, hence C_j is opposite-satisfied.
Now, we give the details of the reduction. Let (X, C) be an instance of 5-Opposite-Sat; we construct an instance G = (V, E) of 3-Club Cover(2) as follows (see Fig. 3). The vertex set V is defined as follows: The edge set E is defined as follows: We start by proving some properties of the graph G.
Lemma 4 Consider an instance (C, X) of 5-Opposite-Sat and let G = (V, E) be the corresponding instance of 3-Club Cover(2). Then, (1)
Proof: We start by proving (1). Notice that any path from r' to y must pass through r T, r * T or r F. Each of r T, r * T or r F is adjacent to vertices x T i,1, x T i,2 and x F i, with 1 ≤ i ≤ q (in addition to r'), and none of these vertices is adjacent to y, thus concluding that d G (r', y) > 3. Moreover, observe that for each vertex v C,j, with 1 ≤ j ≤ p, there exists a vertex x T i,1, with 1 ≤ i ≤ q, or x F h, with 1 ≤ h ≤ q, that is adjacent to v C,j, thus d G (r', v C,j) = 3, for each j with 1 ≤ j ≤ p. As a consequence of (1), it follows that (2) holds, that is d G (r, y) > 3. Since d G (r', v C,j) = 3, for each j with 1 ≤ j ≤ p, it holds (3) d G (r, v C,j) > 3.
Finally, we prove (4). Notice that N 2 G (r) = {r', r * T, r T, r F} and that none of the vertices in N 2 G (r) is adjacent to r F and r T, thus d G (r, r F) > 3.
are two 3-clubs of G that cover G.As a consequence of Lemma 4, it follows that r and r are in exactly one of Next, we show a crucial property of the graph G built by the reduction.
Lemma 5 Given an instance (C, X) of 5-Opposite-Sat, let G = (V, E) be the corresponding instance of 3-Club Cover(2). Then, for each i with
Proof: Consider a path π of minimum length that connects x T i,1 and x F i, with 1 ≤ i ≤ q. First, notice that, by construction, the path π after x T i,1 must pass through one of these vertices: r T, r T, x T i,2 or v C,j, with 1 ≤ j ≤ p. We consider the first case, that is the path π after x T i,1 passes through r T. Now, the next vertex in π is either r or x T h,1, with 1 ≤ h ≤ q. Since both r and x T h,1 are not adjacent to x F i, it follows that in this case the path π has length greater than three.
We consider the second case, that is the path π after x T i,1 passes through r T. Now, after r T, π passes through either y 1 or x T h,1, with 1 ≤ h ≤ q. Since both y 1 and x T h,1 are not adjacent to x F i, it follows that in this case the path π has length greater than three.
We consider the third case, that is the path after x T i,1 passes through x T i,2. Now, the next vertex of π is either r * T or y 1 or x F h, with 1 ≤ h ≤ q and h ≠ i. Since r * T, y 1 and x F h are not adjacent to x F i, it follows that in this case the path π has length greater than three.
We consider the last case, that is the path after x T i,1 passes through v C,j, with 1 ≤ j ≤ p. We have assumed that x i and its negation do not belong to the same clause, thus by construction x F i is not adjacent to v C,j. It follows that after v C,j, the path π must pass through either y or x T h,1, with 1 ≤ h ≤ q, or x F z, with 1 ≤ z ≤ q and z ≠ i. Once again, since y, x T h,1 and x F z are not adjacent to x F i, it follows that also in this case the path π has length greater than three, thus concluding the proof.
Now, we are able to prove the main results of this section.
Lemma 6 Given an instance (C, X) of 5-Opposite-Sat, let G = (V, E) be the corresponding instance of 3-Club Cover(2).Then, given a truth assignment that opposite-satisfies C, we can compute in polynomial-time two 3-clubs that cover G.
Proof: Consider a truth assignment f on the set X of variables that oppositesatisfies C. In the following we construct two 3-clubs , for each i with 1 i q, and d G[V1] (r , x F i ) = 2, for each i with 1 i i q.As a consequence, it holds that r T , r T and r F have distance at most three in , and from each vertex x F i .Since r, r T , r * T and r F are in N (r ), it follows that r, r , r T , r * T and r F are at distance at most 2 in G[V 1 ].Hence, we focus on vertices x T i,1 , with 1 i q, x T h,2 , with 1 h q and x F j , with 1 j q.Since there exists a path that passes trough , since by construction h = j and {x T h,2 , x F j } ∈ E. Finally, x T i,1 and x F j are at distance two in G[V 1 ], since there exists a path that passes trough We now consider G[V 2 ].We recall that, for each i with 1 Furthermore, we recall that we assume that each x i appears as a positive and a negative literal in the instance of 5-Opposite-Sat, thus each vertex x T i,1 , with 1 i q, and each vertex x F h , with 1 h q, are connected to some V C,j , with 1 j p.
First, notice that vertex y is at distance at most three in G[V 2] from each vertex of V 2, since it has distance one in G[V 2] from each vertex v C,j, with 1 ≤ j ≤ p, thus distance two from x T i,1, with 1 ≤ i ≤ q, and x F h, with 1 ≤ h ≤ q, and three from x T i,2, with 1 ≤ i ≤ q, r T and r F. Since y is adjacent to y 2, it has distance one from y 2 and two from y 1. Now, consider a vertex v C,j, with 1 ≤ j ≤ p. Since f opposite-satisfies C, it follows that there exist two vertices in V 2, x T i,1, with 1 ≤ i ≤ q, and x F z, with 1 ≤ z ≤ q, which are connected to v C,j. It follows that v C,j has distance 2 in G[V 2] from r T and from r F, and at most 3 from each x T h,1 ∈ V 2, with 1 ≤ h ≤ q, and from each x F z ∈ V 2, with 1 ≤ z ≤ q. Furthermore, notice that, since v C,j is adjacent to x F z and x F z is adjacent to each x T h,2 ∈ V 2, with 1 ≤ h ≤ q and h ≠ z, then v C,j has distance at most two in G[V 2] from each x T h,2 ∈ V 2. Finally, since v C,j is adjacent to y, it has distance two and three respectively, from y 2 and y 1, in We have already shown that it has distance at most three in G[V 2] from any v C,j, with 1 ≤ j ≤ p, and two from y.
Since x T i,1 is adjacent to r T , it has distance at most two from each other vertex x T h,1 , with 1 h q, and three from each other vertex x T h,2 of G[V 2 ].Moreover, it has distance two from y 1 and three from y 2 and r F .Since x T i,2 is adjacent to every vertex x F z ∈ V 2 , with 1 z q, as z = i, it follows that x T h,1 has distance at most two from every vertex x F z ∈ V 2 .Consider a vertex x T i,2 ∈ V 2 , with 1 i q.We have already shown that it has distance at most two from each , it has distance three from y and two from r , with 1 i q, since by construction they are both adjacent to y 1 .Since x T i,2 is adjacent to y 1 , thus it has distance at most two from , and thus distance two from y 1 and three from , with 1 i q, thus it has distance two from each x T i,1 and distance three from r T in G[V 2 ].Since by construction there exists at least one v C,j , with 1 j p, adjacent to x F h , thus x F h has distance two from y and three from each Finally, we consider vertices r T , r F , y 1 and y 2 .Notice that it suffices to show that these vertices have pairwise distance at most three in G[V 2 ], since we have previously shown that any other vertex of V 2 has distance at most three from these vertices in G[V 2 ].Since r T , r F , y 2 ∈ N (y 1 ), they are all at distance at most two.It follows that G[V 2 ] is a 3-club, thus concluding the proof.
Lemma 7 Given an instance (C, X) of 5-Opposite-Sat, let G = (V, E) be the corresponding instance of 3-Club Cover(2).Then, given two 3-clubs that cover G, we can compute in polynomial time a truth assignment that opposite-satisfies C.
for each j with 1 ≤ j ≤ p. Moreover, by Lemma 5 it follows that for each i with 1 ≤ i ≤ q, x T i,1 and x F i do not belong to the same 3-club, that is exactly one belongs to V 1 and exactly one belongs to V 2.
By construction, each path of length at most three from a vertex v C,j, with 1 ≤ j ≤ p, to r F must pass through some x F h, with 1 ≤ h ≤ q. Similarly, each path of length at most three from a vertex v C,j, with 1 ≤ j ≤ p, to r T must pass through some x T i,1. Assume that v C,j, with 1 ≤ j ≤ p, is not adjacent to a vertex It follows that v C,j is only adjacent to y and to vertices x F w, with 1 ≤ w ≤ q (x T u,1, with In the first case, notice that y is adjacent only to v C,z, with 1 ≤ z ≤ p, and y 2, none of which is adjacent to r T (r F, respectively), thus implying that this path from v C,j to r T (to r F, respectively) has length at least 4. In the second case, x F w (x T u,1, respectively) is adjacent to r F, r F, v C,j and x T i,2 (r T, r T, v C,j, x T u,2, respectively), none of which is adjacent to r T (r F, respectively), implying that also in this case the path from v C,j to r T (to r F, respectively) has length at least 4. Since r T, r F, v C,j ∈ V 2, it follows that, for each v C,j, the set V 2 contains a vertex x T i,1, with 1 ≤ i ≤ q, and a vertex x F h, with 1 ≤ h ≤ q, connected to v C,j. By Lemma 5 exactly one of x T i,1, x F i belongs to V 2, thus we can construct a truth assignment f as follows: The assignment f opposite-satisfies each clause of C, since each v C,j is connected to a vertex x T i,1, for some i with 1 ≤ i ≤ q, and a vertex x F h, for some h with 1 ≤ h ≤ q.
We can now state the main result of this section.
Proof: By Lemma 6 and Lemma 7, and from the NP-hardness of 5-Opposite-Sat (see Theorem 2), it follows that 3-Club Cover(2) is NP-hard. Membership in NP follows easily from the fact that, given two subsets of vertices of G, it can be checked in polynomial time whether they induce 3-clubs and cover all vertices of G.

Hardness of Approximation
In this section we consider the approximation complexity of Min 2-Club Cover and Min 3-Club Cover and we prove that Min 2-Club Cover is not approximable within factor $O(|V|^{1/2-\varepsilon})$, for each ε > 0, and that Min 3-Club Cover is not approximable within factor $O(|V|^{1-\varepsilon})$, for each ε > 0, unless P = NP.

Hardness of Approximation of Min 2-Club Cover
The proof for Min 2-Club Cover is obtained with a reduction very similar to that of Section 3.1. We present a preserving-factor reduction from Minimum Clique Partition to Min 2-Club Cover. Let G_p = (V_p, E_p) be an input graph of Minimum Clique Partition; we build in polynomial time a corresponding instance G = (V, E) of Min 2-Club Cover as in Section 3.1. In what follows we prove the following results that are useful for the reduction.

Hardness of Approximation of Min 3-Club Cover
We show that Min 3-Club Cover is not approximable within factor $O(|V|^{1-\varepsilon})$, for each ε > 0, unless P = NP, by giving a preserving-factor reduction from Minimum Clique Partition.
Given an instance G_p = (V_p, E_p) of Minimum Clique Partition, we construct an instance G = (V, E) of Min 3-Club Cover by adding a pendant vertex connected to each vertex of V_p. Formally, V = {u_i, w_i : We prove now the main properties of the reduction.
Then, for each h, with 1 ≤ h ≤ k, define the following subset V_h ⊆ V: that cover G. First, we show that for each V_h, with 1 ≤ h ≤ k, and for each w_i, w_j ∈ V_h, with 1 ≤ i, j ≤ |V_p|, it holds that u_i, u_j ∈ V_h. Indeed, notice that N(w_i) = {u_i} and N(w_j) = {u_j}, and by the definition of a 3-club we must have d_{G[V_h]}(w_i, w_j) ≤ 3; it follows that u_i, u_j ∈ V_h. Hence, we can define a set of cliques of G_p. For each V_h, with 1 ≤ h ≤ k, define a set V^p_h: h, then w_i, w_j ∈ V_h, and this implies {v_i, v_j} ∈ E_p. Notice that the cliques V^p_1, ..., V^p_k may overlap, but starting from V^p_1, ..., V^p_k, we can easily compute in polynomial time a clique partition of G_p consisting of at most k cliques.
Lemma 10 and Lemma 11 imply the following result.
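The pendant-vertex construction of this section can be sketched as follows. The formal definition of V and E is truncated in the text, so the sketch assumes the reading suggested by the proof: each v_i of G_p yields a vertex u_i, the edges among the u_i mirror E_p, and w_i is a pendant vertex adjacent only to u_i. Vertices are encoded as pairs `("u", i)` and `("w", i)`; this encoding and all function names are assumptions for illustration.

```python
from collections import deque

def pendant_instance(vertices_p, edges_p):
    """Build the adjacency map of G from G_p by attaching a pendant w_i to each u_i."""
    adj = {}
    for v in vertices_p:
        adj[("u", v)] = {("w", v)}
        adj[("w", v)] = {("u", v)}
    for a, b in edges_p:
        adj[("u", a)].add(("u", b))
        adj[("u", b)].add(("u", a))
    return adj

def club_from_clique(clique):
    """Vertex set of G associated with a clique of G_p (both u_i and w_i)."""
    return {("u", v) for v in clique} | {("w", v) for v in clique}

def induced_distance(adj, subset, a, b):
    """Distance between a and b inside the subgraph induced by subset, or None."""
    dist = {a: 0}
    queue = deque([a])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y in subset and y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist.get(b)
```

Under this reading, two pendant vertices w_i and w_j are within distance three exactly when u_i and u_j are adjacent, which is why the sets built from cliques induce 3-clubs and why the 3-clubs of a cover yield cliques of G_p.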

An Approximation Algorithm for Min 2-Club Cover
In this section, we present an approximation algorithm for Min 2-Club Cover that achieves an approximation factor of 2|V | 1/2 log 3/2 |V |.Notice that, due to JGAA, 23(2) 271-292 (2019) 287 Theorem 4, the approximation factor is almost tight.We start by describing the approximation algorithm, then we present the analysis of the approximation factor.
Algorithm 1: Club-Cover-Approx is similar to the textbook greedy approximation algorithm for Minimum Dominating Set and Minimum Set Cover.While there exists an uncovered vertex of G, the Club-Cover-Approx algorithm greedily defines a 2-club induced by the set N [v] of vertices, with v ∈ V , such that N [v] covers the maximum number of uncovered vertices (notice that some of the vertices of N [v] may already be covered).While for Minimum Dominating Set the choice of each iteration is optimal, here the choice is suboptimal.Notice that indeed computing a maximum 2-club is NP-hard.
Clearly the algorithm returns a feasible solution for Min 2-Club Cover, as each set $N[v]$ picked by the algorithm is a 2-club and, by construction, each vertex of $V$ is covered. Next, we show the approximation factor yielded by the Club-Cover-Approx algorithm for Min 2-Club Cover.
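The feasibility claim rests on the fact that any two vertices of $N[v]$ are at distance at most 2 through $v$. A small checker makes this concrete; it is a sketch assuming the graph is stored as a dictionary of neighbour sets, and `is_s_club` is a hypothetical helper that measures distances inside the induced subgraph by breadth-first search.

```python
from collections import deque


def is_s_club(adj, subset, s):
    """True if the subgraph induced by subset has diameter at most s."""
    subset = set(subset)
    for source in subset:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            x = queue.popleft()
            for y in adj[x] & subset:   # stay inside the induced subgraph
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        # some vertex unreachable, or farther than s from source
        if len(dist) < len(subset) or max(dist.values()) > s:
            return False
    return True


# path a - b - c - d
p4 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
closed_b = p4["b"] | {"b"}              # N[b] = {a, b, c}
```

Here `closed_b` induces a 2-club, while the whole path is a 3-club but not a 2-club.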
First, consider the set $V_D$ of vertices $v \in V$ picked by the Club-Cover-Approx algorithm, so that $N[v]$ is added to $S$. Notice that $|V_D| = |S|$ and that $V_D$ is a dominating set of $G$, since, at each step, the vertex $v$ picked by the algorithm dominates each vertex in $N[v]$, and each vertex in $V$ is covered by the algorithm, so it belongs to some $N[v]$, with $v \in V_D$.
Let $D$ be a minimum dominating set of the input graph $G$. By the property of the greedy approximation algorithm for Minimum Dominating Set, the set $V_D$ has the following property [16]:

$|V_D| \le 2|D| \log |V|$    (1)

The size of a minimum dominating set in graphs of diameter bounded by 2 (hence 2-clubs) has been considered in [8], where the following result is proven.

Conclusion
There are some interesting directions for the problem of covering a graph with s-clubs. From the computational complexity point of view, the main open problem is whether 2-Club Cover(2) is NP-complete or is in P. Moreover, it would be interesting to study the computational/parameterized complexity of the problem in specific graph classes, as done for Minimum Clique Partition [5,6,24,10]. For example, Minimum Clique Partition is polynomial-time solvable for graphs of bounded clique-width [11]. Finally, from the approximation complexity point of view, there is a small gap between the inapproximability result and the approximation factor for Min 2-Club Cover; reducing this gap is an open problem.

Figure 2: An example of a graph $G_p$ input of Clique Partition(3) and the corresponding graph $G$ input of 2-Club Cover(3).

Lemma 8
Let $G_p = (V_p, E_p)$ be a graph input of Minimum Clique Partition and let $G = (V, E)$ be the corresponding instance of Min 2-Club Cover. Then, given a solution of Minimum Clique Partition on $G_p = (V_p, E_p)$ consisting of $k$ cliques, we can compute in polynomial time a solution of Min 2-Club Cover on $G = (V, E)$ consisting of $k$ 2-clubs.

Proof: Consider a solution of Minimum Clique Partition on $G_p = (V_p, E_p)$ where $\{V^p_1, V^p_2, \ldots, V^p_k\}$ is the set of $k$ cliques that partition $V_p$. We define a solution of Min 2-Club Cover on $G = (V, E)$ consisting of $k$ 2-clubs as follows. For each $d$, $1 \le d \le k$, let [...]

Lemma 9
Let $G_p = (V_p, E_p)$ be a graph input of Minimum Clique Partition and let $G = (V, E)$ be the corresponding instance of Min 2-Club Cover. Then, given a solution of Min 2-Club Cover on $G = (V, E)$ consisting of $k$ 2-clubs, we can compute in polynomial time a solution of Minimum Clique Partition on $G_p = (V_p, E_p)$ with $k$ cliques.

Proof: Consider the 2-clubs $G[V_1], \ldots, G[V_k]$ that cover $G$. As for the proof of Lemma 3, the result follows from the fact that by Lemma 1, given $w_i, w_j \in V_d$, for each $d$ with $1 \le d \le k$, it holds that $\{v_i, v_j\} \in E$. As a consequence, we can define a solution of Minimum Clique Partition on $G_p = (V_p, E_p)$ consisting of $k$ cliques as follows, for each $d$, $1 \le d \le k$: $V^p_d = \{v_i : w_i \in V_d\}$.

Theorem 4
Unless P = NP, Min 2-Club Cover is not approximable within factor $O(|V|^{1/2-\varepsilon})$, for each $\varepsilon > 0$.

Proof: The inapproximability of Min 2-Club Cover follows from Lemma 8 and Lemma 9, and from the inapproximability of Minimum Clique Partition, which is known to be inapproximable within factor $O(|V_p|^{1-\varepsilon})$ [28] (where $G_p = (V_p, E_p)$ is an instance of Minimum Clique Partition). Hence Min 2-Club Cover is not approximable within factor $O(|V_p|^{1-\varepsilon})$, for each $\varepsilon > 0$, unless P = NP. By the definition of $G = (V, E)$, it holds $|V| = |V_p| + |E_p| \le |V_p|^2$; hence, for each $\varepsilon > 0$, Min 2-Club Cover is not approximable within factor $O(|V|^{1/2-\varepsilon})$, unless P = NP.
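The change of exponent in the last step can be spelled out as follows. This is only a sketch of the arithmetic: $c$ stands for a hypothetical constant hidden in the $O$-notation, and $0 < \varepsilon < 1/2$ is assumed so that both exponents are positive.

```latex
\begin{align*}
|V| = |V_p| + |E_p| \le |V_p|^{2}
  \;&\Longrightarrow\;
  |V|^{1/2-\varepsilon} \le \left(|V_p|^{2}\right)^{1/2-\varepsilon}
  = |V_p|^{1-2\varepsilon}, \\
c\,|V|^{1/2-\varepsilon} \;&\le\; c\,|V_p|^{1-2\varepsilon}.
\end{align*}
```

Since Lemma 8 and Lemma 9 preserve the number $k$ of sets in both directions, an approximation of Min 2-Club Cover within factor $c\,|V|^{1/2-\varepsilon}$ would give an approximation of Minimum Clique Partition within factor $c\,|V_p|^{1-\varepsilon'}$ with $\varepsilon' = 2\varepsilon$, which is excluded by [28] unless P = NP.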

Lemma 10
Let $G_p = (V_p, E_p)$ be an instance of Minimum Clique Partition and let $G = (V, E)$ be the corresponding instance of Min 3-Club Cover. Then, given a solution of Minimum Clique Partition on $G_p = (V_p, E_p)$ consisting of $k$ cliques, we can compute in polynomial time a solution of Min 3-Club Cover on $G = (V, E)$ consisting of $k$ 3-clubs.

Proof: Consider a solution of Minimum Clique Partition on [...] thus concluding the proof.

Lemma 11
Let $G_p = (V_p, E_p)$ be a graph input of Minimum Clique Partition and let $G = (V, E)$ be the corresponding instance of Min 3-Club Cover. Then, given a solution of Min 3-Club Cover on $G = (V, E)$ consisting of $k$ 3-clubs, we can compute in polynomial time a solution of Minimum Clique Partition on [...]

Theorem 6
Let $OPT$ be an optimal solution of Min 2-Club Cover; then Club-Cover-Approx returns a solution having at most $2|V|^{1/2}\log^{3/2}|V|\,|OPT|$ 2-clubs.

Proof: Let $D$ be a minimum dominating set of $G$ and let $OPT$ be an optimal solution of Min 2-Club Cover. We start by proving that $|D| \le 2|OPT|\,|V|^{1/2}\log^{1/2}|V|$. For each 2-club $G[C]$, with $C \subseteq V$, that belongs to $OPT$, by Lemma 12 there exists a dominating set $D_C$ of size at most $1 + \sqrt{|C|\ln(|C|)} \le 2\sqrt{|C|\ln(|C|)}$. Since $|C| \le |V|$, it follows that each 2-club $G[C]$ that belongs to $OPT$ has a dominating set of size at most $2\sqrt{|V|\ln(|V|)}$. Consider, now, $D' = \bigcup_{C \in OPT} D_C$. It follows that $D'$ is a dominating set of $G$, since the 2-clubs in $OPT$ cover $G$. Since $D'$ contains $|OPT|$ sets $D_C$ and $|D_C| \le 2\sqrt{|V|\ln(|V|)}$, for each $G[C] \in OPT$, it follows that $|D'| \le 2|OPT|\sqrt{|V|\ln(|V|)}$. Since $D$ is a minimum dominating set, it follows that $|D| \le |D'| \le 2|OPT|\sqrt{|V|\ln(|V|)}$. By Equation (1), it holds $|V_D| \le 2|D|\log|V|$, thus $|V_D| \le 2|V|^{1/2}\ln^{1/2}|V|\,\log|V|\,|OPT| \le 2|V|^{1/2}\log^{3/2}|V|\,|OPT|$.

Notice that, starting from a solution $S$ of Algorithm 1, we can compute in polynomial time an approximated solution for the Min 2-Club Partition problem on input $G$ having at most $2|V|^{1/2}\log^{3/2}|V|\,|OPT'|$ 2-clubs, where $OPT'$ is an optimal solution of Min 2-Club Partition on $G$. Indeed, first observe that $|OPT| \le |OPT'|$, as observed in Section 2. Recall that $S$ consists of 2-clubs $N[v]$, with $v \in V_D \subseteq V$. Then, starting from $S$, compute a solution $S'$ of Min 2-Club Partition by greedily assigning the shared vertices to exactly one 2-club of $S'$, such that if there exists a 2-club $N[u] \in S$, then there exists a 2-club $C_u \in S'$ with $C_u \subseteq N[u]$. Notice that each vertex $u \in V_D$ is part only of the 2-club $C_u \subseteq N[u]$ and it is not assigned to any other 2-club of $S'$. $S'$ is a solution of Min 2-Club Partition on input $G$ and contains as many 2-clubs as $S$. Thus $|S'| = |S| \le 2|V|^{1/2}\log^{3/2}|V|\,|OPT| \le 2|V|^{1/2}\log^{3/2}|V|\,|OPT'|$.
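The conversion from a cover to a partition can be illustrated with a short sketch. The order in which shared vertices are assigned is an assumption made here (first center in list order wins); the text only requires that each $C_u \subseteq N[u]$ and that each center $u$ stays in its own part. The function name and the dictionary-of-sets graph representation are hypothetical.

```python
def cover_to_2club_partition(adj, centers):
    """Split V into one star-shaped 2-club per center.

    centers is the list V_D of distinct vertices picked by the greedy
    algorithm; the sets N[v], v in centers, are assumed to cover V.
    """
    assigned = set(centers)             # a center belongs only to its own part
    parts = {}
    for v in centers:
        free = adj[v] - assigned        # neighbours not yet placed elsewhere
        parts[v] = {v} | free           # C_v is a subset of N[v], a star around v
        assigned |= free
    return parts


# path a - b - c - d - e with V_D = [b, d]
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
        "d": {"c", "e"}, "e": {"d"}}
parts = cover_to_2club_partition(path, ["b", "d"])
```

Each part consists of a center and some of its neighbours, so any two of its vertices are at distance at most 2 through the center, and the number of parts equals $|V_D| = |S|$.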