Community Structure Recovery and Interaction Probability Estimation for Gossip Opinion Dynamics

We study how to jointly recover the community structure and estimate the interaction probabilities of gossip opinion dynamics. In this process, agents interact pairwise at random, and stubborn agents never change their states. Such a model illustrates how disagreement and opinion fluctuation arise in a social network. Each agent is assumed to be assigned one of two community labels, and the agents interact with probabilities depending on their labels. The considered problem is to jointly recover the community labels of the agents and estimate the interaction probabilities between agents, based on a single trajectory of the model. We first study stability and limit theorems of the model, and then propose a joint recovery and estimation algorithm based on a trajectory. It is verified that the community recovery can be achieved in finite time, and that the interaction estimator converges almost surely. We derive a sample-complexity result for the recovery and analyze the estimator's convergence rate. Simulations are presented to illustrate the performance of the proposed algorithm.


Introduction
Networks appear across domains from biology to sociology. Real networks often exhibit community structure, where subsets of nodes are densely connected internally but sparsely connected to the rest of the network [1]. Community detection aims to partition nodes according to the network topology. There is growing interest in community detection based on state observations of dynamics [2][3][4][5][6][7]. The lack of topology data makes this problem harder than classic ones. In particular, it is unclear how to recover communities from a single trajectory of opinion dynamics [8].

Related work
In this subsection, we first review key community definitions and classic detection approaches [1], and then discuss the recently emerged line of work on detection based on dynamics. The problem there is to recover communities only from state observations of a dynamical process. The main difference between this problem and the classic ones, especially dynamics-based methods, is that the network is not available. The papers [2,7] apply maximum likelihood methods to cascade data. The paper [2] also proposes a two-step procedure, first constructing a network and then clustering agents based on that network. The authors of [4][5][6] introduce the blind community detection method, using sample covariance matrices of agent states for recovery. The author of [3] investigates simultaneously reconstructing the topology and the community structure for epidemics and the Ising model. The paper [15] studies recovery for an Ising blockmodel.
We study how to jointly recover the community structure and estimate the interaction probabilities of gossip opinion dynamics. The problem arises from recent investigations of learning interpersonal influence from dynamics [8]. Network data is useful for decision making, but directly collecting such data can be hard, due to topic specificity [16], consistency issues [17], and privacy concerns [18]. Learning large-scale networks may be computationally expensive, so recovering communities as a coarse description of a network is a good option. The gossip update rule captures the random nature of individual interactions. It is a fundamental element of many opinion models [19] and has been studied extensively [20]. Stubborn agents, such as media and opinion leaders, play a crucial role in opinion formation [21]. The paper [22] shows that the existence of stubborn agents can explain opinion oscillation. A generalization of stubborn agents is to assume that each agent has some level of stubbornness with respect to its initial belief; this generalization is considered by the Friedkin-Johnsen model and its extensions [23,24].

Contributions
We consider jointly recovering the communities and estimating the interaction probabilities of gossip opinion dynamics. Each agent is assigned one of two community labels, and the agents interact with probabilities depending on their labels. Our contributions are as follows:
1. We study properties of the model by leveraging results on Markov chains and stochastic approximation (SA) (Theorem 1). It is shown that the regular-agent states converge in distribution to a unique stationary distribution, and that the time average of the agent states converges almost surely. An explicit expression for the mean of the stationary distribution is given (Proposition 2).
2. We develop a joint algorithm (Algorithm 1) that recovers the community structure and estimates the interaction probabilities, based on Polyak averaging and SA techniques. The algorithm recovers the communities in finite time and then estimates the interaction probabilities consistently (Theorem 2).
3. We theoretically analyze the developed algorithm. A concentration inequality for Markov chains (Lemma 1) is obtained and used in the sample-complexity analysis of the recovery step (Theorem 3); the result shows that the probability of unsuccessful recovery decays exponentially over time. Additionally, we analyze the convergence rate of the interaction estimator via an SA argument (Theorem 4).
The obtained results indicate that a Polyak averaging technique can be useful for recovering communities based on a single trajectory. In addition, we establish a sample-complexity result for successful recovery (recovering all community labels correctly), providing a quantitative dependence of the successful recovery probability on the model parameters. These two points make our paper different from [4][5][6], which use covariance matrices of samples from several trajectories, and from [25], which considers learning a sparse characterization of the network from the gossip model. The considered problem is different from classic system identification (e.g., [26]), because stubborn agents normally have fixed states, which do not satisfy the input conditions required for system identification, and also because community recovery cannot be obtained directly from parameter estimates. The major differences between this paper and its conference version [27] are that we clarify our assumptions in more detail, characterize the sample complexity and convergence rate of the algorithm, and add more numerical experiments to illustrate its performance.

Outline
The rest of the paper is organized as follows. Section 2 formulates the problem. Analysis of the model is given in Section 3, and a joint recovery and estimation algorithm is proposed in Section 4. Section 5 presents convergence results of the algorithm, and Section 6 provides several numerical experiments. Finally, Section 7 concludes the paper. Some proofs are postponed to appendices.
Notation. Denote the n-dimensional Euclidean space by R^n, the set of n × m real matrices by R^{n×m}, the set of nonnegative integers by N, and the set of positive integers by N_+. Let 1_n be the all-one vector of dimension n, e_i be the unit vector whose i-th entry is one, I_n be the n × n identity matrix, and 0_{n,m} be the n × m all-zero matrix. Define 1_{n1,n2} := 1_{n1} 1_{n2}^T, where A^T represents the transpose of a matrix A. Denote the Euclidean norm of a vector by ‖·‖, and denote the maximum absolute column sum norm, the spectral norm, and the maximum absolute row sum norm of a square matrix by ‖·‖_1, ‖·‖, and ‖·‖_∞, respectively. Denote by diag{x} the diagonal matrix with the elements of a vector x on the main diagonal.
For a vector x ∈ R^n, denote its i-th component by x_i, and for a matrix A = [a_{ij}]_{1≤i,j≤n} ∈ R^{n×n}, denote its (i,j)-th entry by a_{ij} or [A]_{ij}. The matrix A is said to be row stochastic if a_{ij} ≥ 0 and A1 = 1, and substochastic if a_{ij} ≥ 0 and the row sums of A are at most one. Denote the spectral radius of A by ρ(A). The cardinality of a set Ω is denoted by |Ω|. The function I[property] is the indicator function, equal to one if the property in the bracket holds and to zero otherwise. For two sequences {a_k} and {b_k} with a_k ∈ R^n and 0 ≠ b_k ∈ R, k ≥ 1, a_k = O(b_k) means that ‖a_k/b_k‖ ≤ C for all k and some C > 0, and a_k = o(b_k) means that lim_{k→∞} ‖a_k/b_k‖ = 0. An event happens almost surely (a.s.) if it happens with probability one. E{X} is the expectation of the random vector X. The notation →_d represents convergence in distribution.
To define a Markov chain taking values in (R^n, B(R^n)), where B(R^n) is the Borel σ-field, we first define the transition probability kernel P(x, A), x ∈ R^n, A ∈ B(R^n), satisfying that for each A ∈ B(R^n), P(·, A) is a nonnegative measurable function on R^n, and for each x ∈ R^n, P(x, ·) is a probability measure on B(R^n). A (homogeneous) Markov chain {X(t), t ∈ N} on R^n satisfies, for all t ∈ N, A ∈ B(R^n), and x ∈ R^n,
P(X(t+1) ∈ A | X(0), . . . , X(t) = x) = P(x, A).
Using the transition probability kernel P(·,·), we can define the n-step transition probability of {X(t)} inductively by
P^n(x, A) = ∫_{R^n} P(x, dy) P^{n−1}(y, A),
and P^0(x, A) = I[x ∈ A], for all x ∈ R^n and A ∈ B(R^n). A stationary distribution of a Markov chain {X(t)} with transition probability kernel P(·,·) is a probability measure π on B(R^n) such that π(A) = ∫_{R^n} π(dx) P(x, A), A ∈ B(R^n).

Problem Formulation
This section introduces the considered model and the definition of communities, and formulates the problem.

Gossip Model with Stubborn Agents
The gossip model is a random process over an undirected graph G = (V, E) with agent set V, edge set E, and no self-loops. The agents have two types, regular and stubborn, denoted by V_r and V_s, respectively (V = V_r ∪ V_s, V_r ∩ V_s = ∅). Each agent i has a state X_i(t) ∈ R, and the state vector at time t ∈ N is X(t) ∈ R^n. Stubborn agents do not change their states during the process.
An interaction probability matrix W = [w_{ij}] ∈ R^{n×n} captures agent interactions, where w_{ij} = w_{ji} ≥ 0, w_{ij} > 0 ⇔ {i,j} ∈ E, i,j ∈ V, and 1^T W 1/2 = 1. At time t, edge {i,j} is selected with probability w_{ij}, independently of previous updates, and the agents update as follows, with averaging weight q ∈ [0, 1):
X_k(t+1) = (1 − q) X_k(t) + q X_l(t), if {k, l} = {i, j} and k ∈ V_r,
X_k(t+1) = X_k(t), otherwise. (1)
Introducing, for each edge {i,j} ∈ E, the matrix R_{ij} encoding this update, and a sequence of independent and identically distributed random matrices {R(t), t ∈ N} with P(R(t) = R_{ij}) = w_{ij}, the update rule (1) can then be written as
X(t+1) = R(t) X(t). (2)
Since stubborn agents never change their states, we rewrite (2) to end up with the following compact form of the gossip model with stubborn agents:
X^r(t+1) = A(t) X^r(t) + B(t) X^s(t), (3)
where X^r(t) and X^s(t) are the state vectors obtained by stacking the states of regular and stubborn agents, respectively, X^s(t) ≡ X^s(0), and [A(t) B(t)] is the matrix obtained by stacking the rows of R(t) corresponding to regular agents. Hence {[A(t) B(t)], t ∈ N} is a sequence of i.i.d. random matrices. For simplicity, assume that the initial vector X(0) is fixed. If X(0) is random, we can study the model by conditioning on realizations of X(0).
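As a concrete illustration of update rule (1), the following sketch performs one random gossip update. The function name and data layout are our own assumptions for illustration, not code from the paper:

```python
import numpy as np

def gossip_step(x, edges, probs, is_regular, q=0.5, rng=None):
    """One step of the gossip model: select edge {i, j} with probability
    w_ij; each *regular* endpoint averages toward the other endpoint with
    weight q, while stubborn endpoints keep their states (update rule (1))."""
    rng = rng if rng is not None else np.random.default_rng()
    i, j = edges[rng.choice(len(edges), p=probs)]
    xi, xj = x[i], x[j]
    if is_regular[i]:
        x[i] = (1 - q) * xi + q * xj
    if is_regular[j]:
        x[j] = (1 - q) * xj + q * xi
    return x
```

Iterating this step and averaging the regular states over time approximates the mean x^r of the stationary distribution, by Theorem 1 below.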

Communities
We follow the framework of SBMs [12] and Ising blockmodels [15], and assume that agents have pre-assigned community labels. We define a community as the set of agents that have the same label.
In particular, we consider the scenario where the network has two disjoint communities, V_1 and V_2, with n_k := |V_k|, k = 1, 2. Denote the community label of agent i by C(i), so C(i) = k for i ∈ V_k, k = 1, 2. We call C the community structure of the network. We further assume that the interaction probability of agents i and j with i ≠ j is
w_{ij} = w_s, if C(i) = C(j); w_{ij} = w_d, otherwise, (4)
where w_s, w_d ∈ (0, 1) and w_s ≠ w_d; that is,
W = [ w_s(1_{n1,n1} − I_{n1})  w_d 1_{n1,n2} ; w_d 1_{n2,n1}  w_s(1_{n2,n2} − I_{n2}) ]. (5)
Thus agents in the same community (in different communities) interact with probability w_s (w_d). Fig. 1(a) illustrates the two different interaction regimes, via a simulation where a gossip model defined by (4) is run for 2000 iterations and the number of interactions between agents is counted.

[Fig. 1(a): The left (right) graph demonstrates the case w_s > w_d (w_s < w_d), in which agents within the same community interact more (less) often than agents in different communities. The width of an edge is proportional to the number of interactions. Fig. 1(b): The adjacency matrix of a graph generated from SBM(n, ν_1, ν_2, p_s, p_d) with n = 5000, ν_1 = 0.4, ν_2 = 0.6, p_s = 5 log n/n, p_d = log n/n. Dots represent nonzero entries, so the block structure of the matrix is clearly visible.]
The following example illustrates how the preceding assumption arises naturally from an SBM. It shows that a graph generated from an SBM defines an interaction probability matrix close to an averaged version with the same structure as (5).
Example 1 Consider an SBM with two communities, commonly studied in community detection [12]. Such an SBM is a random graph, denoted by SBM(n, ν 1 , ν 2 , p s , p d ).
Here n is the number of agents, ν_1 ∈ (0, 1) (resp. ν_2 ∈ (0, 1)) is the portion of agents with community label 1 (resp. label 2), where ν_1 + ν_2 = 1 and ν_1 n and ν_2 n are integers, and p_s, p_d ∈ (0, 1) are the link probabilities between agents in the same and in different communities, respectively. We assume C(i) = 1 for 1 ≤ i ≤ ν_1 n, and C(i) = 2 for ν_1 n < i ≤ n. The SBM(n, ν_1, ν_2, p_s, p_d) randomly generates an undirected graph; denote its adjacency matrix by A and its number of edges by α := 1^T A 1/2. In Appendix A we prove a concentration inequality implying that, if the network of the gossip model is generated from the SBM, then the interaction probability matrix of the gossip model is close to E{A}/E{α} when n is large. Note that E{A}/E{α} has exactly the same structure as W in (5), with n_k = ν_k n, k = 1, 2, w_s = p_s/E{α}, and w_d = p_d/E{α}. Fig. 1(b) demonstrates this concentration phenomenon with an obvious two-block structure. The concentration indicates that the behavior of the gossip model over a graph generated from the SBM may not deviate too far from that of the gossip model over the averaged graph, when n is large. In fact, in [28] we show that the expected stationary states of the two models are close if log n/n = o(min{p_s, p_d}). This result indicates that the gossip model over the averaged graph can be considered as an approximation of the model over the SBM, and results for the former model can be extended to the latter.
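The SBM construction in Example 1 can be checked empirically. The sketch below (our own helper, not code from the paper) samples a two-community SBM adjacency matrix and verifies that the empirical within- and between-community link densities track p_s and p_d:

```python
import numpy as np

def sample_sbm(n, nu1, ps, pd, rng=None):
    """Sample SBM(n, nu1, 1-nu1, ps, pd): agents 1..nu1*n get label 1, the
    rest label 2; each pair {i, j}, i < j, is linked independently with
    probability ps (same labels) or pd (different labels)."""
    rng = rng if rng is not None else np.random.default_rng()
    n1 = int(nu1 * n)
    labels = np.array([1] * n1 + [2] * (n - n1))
    same = labels[:, None] == labels[None, :]
    link_prob = np.where(same, ps, pd)
    upper = np.triu(rng.random((n, n)) < link_prob, k=1)  # pairs i < j only
    A = upper.astype(float)
    return A + A.T, labels  # symmetric adjacency, zero diagonal
```

Averaging A over samples and normalizing by the edge count illustrates the concentration of A/α around E{A}/E{α} discussed above.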
Remark 1 A general assumption for community labels in the SBM is that each agent gets a label k with probability ν k independently of each other, k = 1, 2. This is essentially equivalent to the label assignment with deterministic node portions when n → ∞ (Remark 3 of [12]). Note that it is possible to extend the fixed-label assumption considered in Example 1 to the deterministic-portion assumption, by conditioning on each assignment and using the law of total probability. The condition log n/n = O(min{p s , p d }) implies that the expected agent degree is at least O(log n). In this case, the SBM generates connected graphs with high probability. The difference between p s and p d has to be large enough to make exact recovery possible [12]. Here we consider the dynamics over the averaged graph, so the detectability only requires w s ̸ = w d (Assumption 1 (ii)). Future work will study detectability in the SBM case.

Community Recovery and Interaction Estimation
The considered problem is to recover the community structure and to estimate the interaction probabilities based on state observations, as follows.
Problem. Given a trajectory of the gossip model with the interaction matrix (5), develop an algorithm to jointly recover the community structure C and estimate the interaction probabilities w s and w d .

Remark 2
In the preceding problem, we assume that the developed algorithm uses data coming from the gossip model over the averaged graph. A natural question is how the algorithm performs if it instead uses a trajectory of the gossip model over a graph sampled from an SBM. In Section 6, we illustrate through simulation that the algorithm performs well in the SBM case as well. Such performance is explained by the fact that the two processes behave similarly in terms of their stationary states, as discussed in Example 1. We use "community recovery" instead of "community detection" to avoid ambiguity, following the terminology of [15], because here agent behavior depends directly on the community structure.
We further sort the agents as follows: V_{rk} (resp. V_{sk}) is the set of regular (resp. stubborn) agents in community k, k = 1, 2. Denote n_{rk} := |V_{rk}|, n_{sk} := |V_{sk}|, n_r := |V_r|, and n_s := |V_s|. In the considered problem, the total number of agents is known in advance, the network has two communities, and the stubborn-agent states are observable. But the problem remains difficult, since n_k, n_{rk}, n_{sk}, k = 1, 2, and the interaction information are unknown. The interaction information cannot be obtained in general situations (e.g., agent states are observed only at some time steps, or observations are corrupted by noise, as discussed in Remark 7).

Model analysis
This section studies the model behavior and provides an explicit expression for the mean of the stationary distribution. The assumptions are summarized as follows.

Assumption 1 (i) The agents are sorted so that V_1 = {1, . . . , n_1} and V_2 = {n_1 + 1, . . . , n}. (ii) w_s, w_d ∈ (0, 1), w_s ≠ w_d, and (n_1(n_1 − 1) + n_2(n_2 − 1)) w_s + 2 n_1 n_2 w_d = 2 (so that 1^T W 1/2 = 1). (iii) The regular agents start from the set S := [s, s̄]^{n_r}, (7)
where s := min_{1≤i≤n_s} {x^s_i}, s̄ := max_{1≤i≤n_s} {x^s_i}, x^s := X^s(0) = [(x^{s1})^T (x^{s2})^T]^T is the stubborn-state vector, and x^{sk} is the vector of stubborn states in community k, k = 1, 2.

Remark 3
In Assumption 1 (i), the agents are sorted for convenience, but we do not know which community each agent belongs to before community recovery. It is necessary to assume w_s ≠ w_d; otherwise, W has no block structure. Regular agents are assumed to start from S, which is reasonable and intuitively means that regular states lie between the extreme stubborn states.
Before studying the model behavior, we explicitly write the block structures of Ā := E{A(t)} and B̄ := E{B(t)} in the following proposition, which says that the block structure of W results in similar agent updates within the same community.
Proposition 1 Suppose that Assumption 1 holds. Then Ā and B̄ have the block structure (8): their entries depend only on the community labels of the corresponding agents.

PROOF. Let r̄_{ij} (resp. w_{ij}) be the (i,j)-th entry of R̄ := E{R(t)} (resp. W). If i and j are in the same community, then w_{ij} = w_s; otherwise, w_{ij} = w_d. The values of the other off-diagonal entries of R̄ can be obtained by following the same argument and the definition of R_{ij}. For the diagonal entries of R̄, note that R(t) is row stochastic a.s., so r̄_{ii} = 1 − Σ_{j≠i} r̄_{ij}. By comparing R(t) with A(t) and B(t) in the gossip model, we can obtain (8). □

Corollary 1 If Assumption 1 holds and there exists at least one stubborn agent in the network (i.e., n_r < n), then Ā is Schur stable, namely, ρ(Ā) < 1.
PROOF. We know from Proposition 1 that Ā has the form (8). If n_r < n, there exists at least one row of Ā with row sum less than one. So the corollary follows from Lemma 4 in Appendix D. □

Now we provide the stability and limit theorems of the gossip model.

Theorem 1 (Stability and limit theorems) Suppose that Assumption 1 holds and that there exists at least one stubborn agent in the network (i.e., n_r < n). The following results hold for the gossip model with stubborn agents.
(i) The model has a unique stationary distribution π with mean x^r, and X^r(t) →_d π as t → ∞.
(ii) The expectation of the state vector converges to x^r: lim_{t→∞} E{X^r(t)} = x^r.
(iii) The time average of the states converges to x^r a.s.: lim_{t→∞} (1/t) Σ_{k=0}^{t−1} X^r(k) = x^r a.s.

Remark 4 The first two results show that the agent states, although they may not converge a.s., converge in distribution to a unique stationary distribution, and their expectations converge to the mean of the stationary distribution. The third result indicates that we can obtain the value of x^r by computing the state time average.
The next proposition shows that x r also has a block structure, indicating that regular agents in the same community behave similarly on average.
Proposition 2 Under the conditions of Theorem 1, x^r satisfies
x^r = (I − Ā)^{−1} B̄ x^s. (9)
As a result, all regular agents in community k have the same entry of x^r, which is a weighted average of the stubborn states, where 1^T_{n_sk} x^{sk} is defined to be zero if n_{sk} = 0, k = 1, 2.

Remark 5
In Appendix B, we study the block structure of x r in a multiple-community case, as a generalization of Proposition 2.
The above proposition means that regular agents in the same community have the same expected stationary state, which is a weighted average of the stubborn states. Hence it is possible to separate the regular agents by computing the state time average. However, we are unable to do so if only one community has stubborn agents, or if the stubborn states are similar. The following condition rules out these cases.
Assumption 2 Both communities have stubborn agents (i.e., n_{s1} n_{s2} > 0), and the average stubborn states of the two communities differ:
1^T_{n_s1} x^{s1}/n_{s1} ≠ 1^T_{n_s2} x^{s2}/n_{s2}.
This assumption has a practical meaning: stubborn agents are distributed among communities, and agents from different communities are more likely to have distinct opinions. Under Assumption 2, we have the following result, indicating that the presence of stubborn agents enhances the separation of regular agents: the expected stationary states of regular agents in different communities are nonidentical if and only if Assumption 2 holds. PROOF. It suffices to note Proposition 2. □ This result shows that Assumption 2 is a necessary and sufficient condition for regular agents from different communities to have nonidentical expected stationary states. Note that 1^T_{n_s1} x^{s1}/n_{s1} ≠ 1^T_{n_s2} x^{s2}/n_{s2} is generic (i.e., it holds for almost all x^s ∈ R^{n_s}).
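The block structure of x^r can be checked numerically: build R̄ = E{R(t)} from W, extract Ā and B̄, and solve x^r = Ā x^r + B̄ x^s. The sketch below is our own construction under the update rule of Section 2, not code from the paper:

```python
import numpy as np

def stationary_mean(W, is_regular, x, q=0.5):
    """Compute x^r = (I - Abar)^{-1} Bbar x^s, where [Abar Bbar] are the
    regular-agent rows of Rbar = E{R(t)} = sum_{i<j} w_ij R_ij."""
    n = W.shape[0]
    Rbar = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            Rij = np.eye(n)  # agents outside {i, j} keep their states
            if is_regular[i]:
                Rij[i, i], Rij[i, j] = 1 - q, q
            if is_regular[j]:
                Rij[j, j], Rij[j, i] = 1 - q, q
            Rbar += W[i, j] * Rij
    reg, stub = np.where(is_regular)[0], np.where(~is_regular)[0]
    Abar = Rbar[np.ix_(reg, reg)]
    Bbar = Rbar[np.ix_(reg, stub)]
    return np.linalg.solve(np.eye(len(reg)) - Abar, Bbar @ x[stub])
```

On a six-agent network with one stubborn agent per community holding states ±1, the returned vector has exactly two distinct values, one per community, consistent with Proposition 2 and Assumption 2.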

Joint Recovery and Estimation Algorithm
In this section, we design a joint recovery and estimation algorithm (Algorithm 1) to address the considered problem. We assume that the following connections between stubborn and regular agents are known. This information amounts to prior knowledge about stubborn agents, which may be gathered from other sources in practice.
Assumption 3 For every stubborn agent i ∈ V_s, it is known to Algorithm 1 that there exists a regular agent j_i ∈ V_r such that i and j_i are in the same community (i.e., C(i) = C(j_i)).

Now we are ready to introduce Algorithm 1, in which we denote the estimates at time t of the community label C(i) and the interaction probabilities w_s and w_d by Ĉ(i,t), ŵ_s(t), and ŵ_d(t), respectively. In addition, for simplicity we use S^r_i(t) to represent the (i − n_1 + n_{r1})-th entry of S^r(t), for i ∈ V_{r2} = {n_1 + 1, . . . , n_1 + n_{r2}}. Note that both n_1 and n_{r1} are unknown to the algorithm. In the gossip model, agents randomly interact and update their states. Algorithm 1 partitions the agents and estimates the interaction strength between them from these state observations, without interaction information.

Remark 6
The difficulty of recovery lies in finding a quantity that reveals the community structure. Algorithm 1 exploits the trajectory data through the time average S^r(t). From Proposition 2 we know that the entries of S^r(t) converge to two distinct values corresponding to the two communities, so clustering methods (Line 4 of Algorithm 1, or other methods such as k-means) can be used. For the estimation of the interaction probabilities, the key is to find consistent parameter equations. Here we use the property of the stationary states. Note that, from (9), it follows that x^r satisfies x^r = Ā x^r + B̄ x^s, which implies a linear relation between w_s and w_d under Assumptions 1 and 2. From the definition of W, w_s and w_d also satisfy the relation given in Assumption 1 (ii). Therefore, the following system of linear equations in (x y)^T, where χ_k denotes the common expected stationary state of regular agents in community k,
(n_{s1} χ_1 − 1^T_{n_s1} x^{s1}) x + (n_2 χ_1 − n_{r2} χ_2 − 1^T_{n_s2} x^{s2}) y = 0,
(n_1(n_1 − 1) + n_2(n_2 − 1)) x + 2 n_1 n_2 y = 2, (13)
has the unique solution (w_s w_d) under Assumptions 1 and 2, for fixed n_k, n_{rk}, and n_{sk}. But these quantities are unknown, so we leverage SA techniques to estimate them, as presented in Line 5 of Algorithm 1. Note that the algorithm does not need to know the averaging weight q.

[Algorithm 1 (joint recovery and estimation), pseudocode: at each time t, Line 4 performs the community recovery by clustering the entries of S^r(t) and assigning labels using the regular agents j_i defined in Assumption 3; Line 5 updates the estimates ŵ_s(t) and ŵ_d(t) via SA; Line 6 ends the loop.]
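A minimal sketch of the recovery idea in Remark 6 (our own simplification; the paper's Line 4 may differ in detail): since the entries of S^r(t) concentrate around two values, splitting the sorted entries at the largest consecutive gap separates the two communities.

```python
import numpy as np

def split_by_largest_gap(S_r):
    """Cluster the entries of the time average S_r into two groups by
    cutting the sorted values at the largest consecutive gap."""
    order = np.argsort(S_r)
    gaps = np.diff(np.sort(S_r))
    cut = int(np.argmax(gaps)) + 1
    labels = np.empty(len(S_r), dtype=int)
    labels[order[:cut]] = 1
    labels[order[cut:]] = 2
    return labels
```

The returned labels 1 and 2 are arbitrary up to a flip; the anchor agents j_i from Assumption 3 fix the orientation.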

Convergence Analysis
This section studies the performance of Algorithm 1.
The following result shows that the communities can be recovered in finite time, and that the interaction probability estimates are convergent.
Theorem 2 (Convergence of Algorithm 1) Suppose that Assumptions 1-3 hold.
(i) The community recovery is achieved in finite time: there exists a positive integer-valued random variable T such that Ĉ(i,t) = C(i), for all i ∈ V and t > T.
(ii) The interaction estimator converges a.s., namely, lim_{t→∞} (ŵ_s(t), ŵ_d(t)) = (w_s, w_d) a.s.

Remark 7 Since Algorithm 1 uses the property (10), it can also handle situations where state observations are corrupted. For example, one may not observe the whole trajectory but only sample the states at some time steps. The ergodic property ensures that the time average of the sampled states still converges, provided the sampling process is independent of the update process and the number of samples tends to infinity [25,29]. Another situation is that the observations are disturbed by i.i.d. zero-mean noise independent of the process. The law of large numbers guarantees that the influence of the noise vanishes over time.
Now we investigate the sample complexity of the community recovery, and the convergence rate of the interaction estimator. The following result is useful for studying the sample complexity of the community recovery.

Lemma 1 (Concentration inequality) Under the conditions of Theorem 2, the time average S^r(t) concentrates around x^r: the probability that an entry of S^r(t) deviates from the corresponding entry of x^r by a fixed amount decays exponentially in t.

PROOF. See Appendix F. □
Remark 8 Similar concentration results to Lemma 1 have been obtained in the literature for other models. One class of results leverage Markov chain approaches and normally require stability such as uniform ergodicity [30,31] or explicit bounds of the derivative of the initial measure with respect to the stationary measure [32]. It is hard to derive these properties for Markov chains without continuous distributions [33], as in our case. Another line of research studies concentration of Polyak averages, and contains step-size conditions [34], which cannot be applied to our problem either.
Using the preceding lemma, we can determine when the differences between the entries of S^r(t) and x^r are small enough that agents in different communities have distinct state time averages. As a result, we obtain a sample-complexity result for the community recovery. The next theorem shows that the probability of successfully recovering the communities depends on the network, the interaction probabilities, and the stubborn-agent states, and that this probability tends to one as t goes to infinity.

Theorem 3 (Sample complexity)
Under the conditions of Theorem 2, for the community recovery step of Algorithm 1, it holds for t > t_0 that the probability of unsuccessful recovery decays exponentially in t, with an exponent determined by the constants c_Ā, c_{n_r}, c_s, and c_w discussed in Remark 9, where δ is given in Proposition 2, and s and s̄ are given in (7).

PROOF. See Appendix G. □

Remark 9
This result provides a sample-complexity characterization for recovering communities from a single trajectory. Multiple-trajectory sample complexity is investigated by [4][5][6]. The parameter δ reflects the combined effect of the numbers of stubborn and regular agents and the interaction probabilities. c_Ā captures the "speed" of information diffusion and increases with ρ(Ā). c_{n_r} depends on the number of regular agents. c_s increases with the range of the states and decreases with the difference of the averaged stubborn states in different communities, and c_w measures the difference between the interaction probabilities within and between communities. Smaller δ, c_Ā, n_r, and max{|s|, |s̄|} make the recovery easier, and so do larger c_w and |n_{s1} 1^T_{n_s2} x^{s2} − n_{s2} 1^T_{n_s1} x^{s1}|. For the gossip model over a graph sampled from an SBM, Example 1 indicates that the algorithm can recover most of the community labels, as illustrated in Section 6.
We have the following result for the convergence rate of the interaction estimator. It shows that the convergence rate also depends on the model parameters, and a large enough step-size parameter a ensures that the rate can achieve O(1/ √ t).

Theorem 4 (Convergence rate) Under the conditions of Theorem 2, the estimation error of (ŵ_s(t), ŵ_d(t)) depends on the model parameters through a constant η, and if the step-size parameter satisfies a ≥ 1/(2|η|), then the error is O(1/√t).

PROOF. See Appendix H. □
Remark 10 In the theorem, η increases with the combined effect of the number of agents and the interaction probabilities (i.e., (w s n 2 + w d n 1 )/δ) and with the disagreement of stubborn agents, and decreases with the cardinality of each community. When a ≥ 1/(2|η|), the estimator achieves its optimal rate. Larger η provides a wider selection range. Simulation in Section 6 shows that the algorithm using a trajectory from the gossip model over an SBM can estimate the ratio of the link probabilities.

Numerical Simulation
This section illustrates the performance of Algorithm 1, conducts an algorithm comparison, and applies Algorithm 1 to the SBM case and a real network.
To illustrate the performance of Algorithm 1 under Assumptions 1-3, consider a network of twelve agents. The two communities each have five regular agents and one stubborn agent. Set the interaction probabilities to w_s = 5/186 and w_d = 1/186. The stubborn agent in community 1 (resp. community 2) has state 1 (resp. −1). The initial states of the regular agents are drawn from the uniform distribution on (−1, 1). The averaging weight is set to q = 1/2 in all experiments. Fig. 2(a) shows that Algorithm 1 recovers the communities in finite time, where the accuracy at time t is defined by
accuracy(t) := max_{σ∈S_2} (1/n) Σ_{i∈V} I[σ(Ĉ(i,t)) = C(i)],
where σ is a permutation function (to prevent a reversed assignment of labels), S_2 is the group of permutations on {1, 2}, C(i) is agent i's community label, Ĉ(i,t) is the estimate of agent i's label at time t, and n = 12. The consistency of the interaction estimator with step-size parameter a = 1 is demonstrated in Fig. 2(b). These results validate Theorem 2.
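The accuracy metric above can be computed with a small helper (our own, matching the definition; the flip ambiguity is handled by maximizing over the two permutations of {1, 2}):

```python
from itertools import permutations

def recovery_accuracy(est, true):
    """Fraction of correctly recovered labels, maximized over the two
    permutations of {1, 2}, so a globally flipped labeling scores 1."""
    n = len(true)
    best = 0.0
    for perm in permutations((1, 2)):
        mapped = [perm[c - 1] for c in est]  # relabel estimates via perm
        best = max(best, sum(m == t for m, t in zip(mapped, true)) / n)
    return best
```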
We now examine the sample complexity of the community recovery (Theorem 3) and compare the recovery step with the k-means, k-means++ [35], and spectral clustering methods [12]. This experiment considers the gossip model under Assumptions 1-3 with n = 400, n_1 = 150, and n_{s1} = n_{s2} = 8. Let w_s/w_d = 5 and solve the two parameters from (6). Let stubborn agents in community 1 (resp. community 2) have state 1 (resp. −1), and generate the initial states of the other agents from the uniform distribution on (−1, 1). By running the algorithms 200 times, we obtain the relative frequency with which the algorithms recover all community labels, defined by p_t := (Σ_{k=1}^N max_{σ∈S_2} I[σ(Ĉ_k(i,t)) = C(i), ∀i ∈ V])/N, where N = 200 and Ĉ_k(i,t) is the estimate of agent i's label at time t in the k-th run. After computing the time average S^r(t), we use k-means and k-means++ with k = 2 instead of Line 4 of Algorithm 1 to recover communities. To implement the spectral clustering method, assume that edge activation is known, and use the activation information to estimate the interaction probability matrix W; applying spectral clustering to the estimate of W yields community estimates. Fig. 3 shows that the probabilities of unsuccessful community recovery of all approaches tend to zero exponentially over time. The spectral clustering method performs much better than the other algorithms, because it directly uses interaction information, but the required time is still of the same order as for the other algorithms. The k-means and k-means++ methods perform similarly to each other, and also similarly to Algorithm 1. This observation indicates that the major challenge of the considered problem is how to use agent states to recover communities without topological information. We now consider the case where trajectories of the gossip model over graphs sampled from SBMs are given to Algorithm 1. We use three SBMs with sizes n = 100, 300, 900 and with two equal-sized communities (ν_1 = ν_2 = 0.5).
Set n_{r1} = n_{r2} = 0.45n and n_{s1} = n_{s2} = 0.05n. Let the link probability within a community be p_s = (log n)²/n and the link probability between communities be p_d = (log n)/n. For each SBM, we generate 20 graph samples, and for each graph sample we run Algorithm 1 20 times. Regular states are generated as before, and stubborn agents in community 1 (resp. community 2) have state 1 (resp. −1). Fig. 4(a) shows that Algorithm 1 has high community recovery accuracy, increasing with n. This phenomenon results from the concentration discussed in Example 1. Algorithm 1 outputs ŵ_s(t) and ŵ_d(t) as estimates of the two distinct nonzero values of E{A}/E{α}. Note that (cp_s, cp_d) defines the same E{A}/E{α} for all c > 0, so we can only estimate the ratio p_s/p_d without knowing the expected number of edges of the SBM. Fig. 4(b) shows that the median of the estimation error for trajectory samples from each SBM is close to zero and decreases with n.
Zachary's karate club network [36], presented in Fig.  5(a), is used to demonstrate an application of Algorithm 1. An edge represents frequent interaction between the two agents. The strength of interactions between agents is modeled by a weighted adjacency matrix (see matrix C in [36]). A conflict between agents 1 and 34 results in a fission of the club. In the experiment, we assume that only the opinions can be observed, instead of interactions between agents. The process is modeled by the gossip model with stubborn agents. Agents 1 and 34 are set to be stubborn agents holding different opinions.
In addition, one edge in Fig. 5(a) is selected at each time with probability proportional to the interaction strength given in [36]. The goal is to partition the agents into communities based only on state observations. Note that the network structure departs from our assumptions, but the result shown in Fig. 5(b) indicates that our algorithm can still recover the community structure as time increases, without topological or interaction information.
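The weighted edge-activation dynamics described above can be sketched as a toy simulation. This is an illustrative stand-in for the experiment, not the authors' implementation; `simulate_gossip`, `W`, and `stubborn` are hypothetical names, and the update (both activated endpoints move to their average unless stubborn) follows the gossip model with stubborn agents described in the paper.

```python
import numpy as np


def simulate_gossip(W, stubborn, x0, T, rng=None):
    """At each step, one edge (i, j) is activated with probability
    proportional to its weight W[i, j]; both endpoints move to the average
    of their states, except that stubborn agents keep their states."""
    rng = np.random.default_rng(rng)
    n = W.shape[0]
    edges = [(i, j) for i in range(n) for j in range(i + 1, n) if W[i, j] > 0]
    probs = np.array([W[i, j] for i, j in edges], dtype=float)
    probs /= probs.sum()
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(T):
        i, j = edges[rng.choice(len(edges), p=probs)]
        avg = (x[i] + x[j]) / 2.0
        if i not in stubborn:
            x[i] = avg
        if j not in stubborn:
            x[j] = avg
        traj.append(x.copy())
    return np.array(traj)


W = np.array([[0.0, 2.0, 1.0], [2.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
traj = simulate_gossip(W, stubborn={0}, x0=[1.0, -1.0, 0.0], T=500, rng=0)
```

The stubborn agent's state never changes, while the other states keep fluctuating inside the convex hull of the initial and stubborn states.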

Conclusion and Future Work
In this paper, we developed a joint algorithm to recover the community structure and to estimate the interaction probabilities for gossip opinion dynamics. It was proved that the community recovery is achieved in finite time and that the interaction estimator converges almost surely. We analyzed the sample complexity of the recovery and the convergence rate of the estimator. Future work includes studying the case where all regular agents have the same stationary expectation, and analyzing the community detection problem for dynamics over the SBM.

A Proof of Example 1
In this section, we prove the inequality given in Example 1. We need the following two results, whose proofs can be found in Sections 2.3 and 4.5 of [37], respectively.
Lemma 2 (Chernoff inequality) Let X = Σ_{i=1}^{n} X_i, where the X_i are independent Bernoulli random variables with expectations p_i, and let µ := E{X} = Σ_{i=1}^{n} p_i. Then for all δ ∈ (0, 1) and all a > µ,

P{X ≤ (1 − δ)µ} ≤ e^{−δ²µ/2}  and  P{X ≥ a} ≤ e^{−µ}(eµ/a)^a.
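As a quick numerical sanity check of the lower-tail bound in the multiplicative Chernoff inequality, the following sketch compares the empirical tail of a Binomial sum of Bernoulli variables with the bound exp(−δ²µ/2); the parameters are arbitrary illustrative choices.

```python
import numpy as np

# Empirically check P{X <= (1 - delta) * mu} <= exp(-delta^2 * mu / 2)
# for X ~ Binomial(n, p), a sum of i.i.d. Bernoulli variables.
rng = np.random.default_rng(0)
n, p, delta = 500, 0.3, 0.4
mu = n * p
samples = rng.binomial(n, p, size=200_000)
empirical = np.mean(samples <= (1 - delta) * mu)
bound = np.exp(-delta**2 * mu / 2)
```

The empirical tail frequency stays below the Chernoff bound, as the lemma guarantees.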

Lemma 3 (Matrix Bernstein inequality) Let X_1, …, X_m ∈ R^{n×n} be independent symmetric random matrices such that E{X_i} = 0 and ∥X_i∥ ≤ L a.s. for all i ∈ [m]. Then for all a ≥ 0,

P{∥Σ_{i=1}^{m} X_i∥ ≥ a} ≤ 2n exp(−a²/(2σ² + 2La/3)),

where σ² := ∥Σ_{i=1}^{m} E{X_i²}∥.
Now we prove the inequality given in Example 1. We first bound the terms related to α. Setting δ = 1/2, we obtain a first concentration bound from Lemma 2; utilizing Lemma 2 again with δ = 1/√n gives a second. Next, we decompose A − E{A} as a sum of independent random matrices, so from Lemma 3 a tail bound on ∥A − E{A}∥ follows. Let a = C_1 √β_1 log n for some constant C_1; the right-hand side of the resulting bound tends to zero as n → ∞ when the constant C_1 is large enough and log n/n = O(max{p_s, p_d}).
Finally, we need a bound for ∥A∥. Note that ∥A∥ ≤ ∥E{A}∥ + ∥A − E{A}∥. Since we have already bounded the second term, it suffices to study the spectrum of E{A}. Observe that E{A} + p_s I_n has rank 2, and its two eigenvectors corresponding to the nonzero eigenvalues are [1_{ν1n}^T  1_{ν2n}^T]^T and [1_{ν1n}^T  −1_{ν2n}^T]^T. We can compute the nonzero eigenvalues and obtain an upper bound on their absolute values, so ∥E{A}∥ ≤ β_2 + p_s. To sum up, the stated inequality holds with the claimed probability.
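The rank-2 structure used in the spectral step can be verified numerically. The sketch below builds the mean adjacency matrix of a two-community SBM (zero diagonal) and checks that adding p_s I restores a rank-2 block-constant matrix; community sizes and probabilities are illustrative choices, not the paper's.

```python
import numpy as np

# E{A} has p_s within communities and p_d across, with zero diagonal;
# E{A} + p_s * I is block-constant and has rank 2 when p_s != p_d.
n1, n2, p_s, p_d = 30, 50, 0.4, 0.1
J = np.ones
M = np.block([[p_s * J((n1, n1)), p_d * J((n1, n2))],
              [p_d * J((n2, n1)), p_s * J((n2, n2))]])
EA = M - p_s * np.eye(n1 + n2)   # expected adjacency, zero diagonal
rank = np.linalg.matrix_rank(EA + p_s * np.eye(n1 + n2))
```

The check confirms that the perturbation argument only needs to control the two nonzero eigenvalues of the block-constant part.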

B Result on multiple-community case
In this section, we provide a result on the expression of (I − Ā)^{-1}B̄ in the multiple-community case without proof, as a counterpart of Proposition 2, and briefly discuss how to generalize Algorithm 1 to the multiple-community case.

B.2 Result and discussion
We consider the scenario where the agents can be partitioned into multiple groups V_1, …, V_K (i.e., V = ∪_{k=1}^{K} V_k and V_k ∩ V_l = ∅ for all k ≠ l ∈ [K]). To ease notation, we assume that V_1 = {1, …, n_1}, V_2 = {n_1 + 1, …, n_1 + n_2}, …, and Σ_{k=1}^{K} n_k = n. Also, sort the regular and stubborn agents in each community in the following way. Here, V_ri (resp. V_si) is the set of regular (resp. stubborn) agents in community i. Let n_ri := |V_ri| and n_si := |V_si|. Finally, denote the numbers of regular and stubborn agents by n_r := |V_r| = Σ_{i=1}^{K} n_ri and n_s := |V_s| = Σ_{i=1}^{K} n_si, respectively. Assumptions in the multiple-community case, similar to Assumption 1, are summarized as follows. (2) The within-group and inter-group interaction probabilities are w_s, w_d > 0, respectively, with w_s ≠ w_d, and x_s = [x_{s1}^T · · · x_{sK}^T]^T is the stubborn state vector, where x_{sk} is the vector for community k, k ∈ [K].
The following proposition generalizes Proposition 2 to the multiple-community case under Assumption 4. The proof directly follows from the inverse formula of block matrices [38] and is omitted.
Proposition 4 Suppose that Assumption 4 holds and there exists at least one stubborn agent in the network (i.e., n_r < n). Then (I − Ā)^{-1} exists, and (I − Ā)^{-1}B̄ admits a block expression in which S is a set consisting of a finite number of distinct positive integers, for p = 0 the term Σ_{(j_1,…,j_p)∈S^p} Π_{l=1}^{p} n_{r,j_l}/(w_s n_{j_l} + w_d Σ_{k≠j_l} n_k) is defined to be 1, and d_A(∅) = e_A(∅) = 1.

Remark 11
The above proposition provides a result parallel to Proposition 2, and shows that (I − Ā)^{-1}B̄ has a block structure and hence so does x_r. The framework developed in the current paper indicates that studying a condition similar to Assumption 2 would be the key to addressing the problem in the general case, but such a condition would be more complex, and its investigation is beyond the scope of this paper. Note that it is possible to generalize Algorithm 1 to the multiple-community case by examining the structure of x_r more carefully, and the techniques developed in the current paper would contribute to the analysis of the generalized algorithm.

C Some results on stochastic approximation
In this section we introduce some results on a linear stochastic approximation algorithm, which are used throughout the later appendices. Consider deterministic matrices H(t), H ∈ R^{l×l}, random vectors z(t), e(t), v(t) ∈ R^l, and step sizes a(t) > 0, and define the following linear recursion:

z(t + 1) = z(t) + a(t)[H(t)z(t) + e(t + 1) + v(t + 1)].

The following results follow from Lemma 3.1.1 and Theorem 3.1.1 of [39].
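A recursion of this type can be simulated as a sanity check. The sketch below assumes the standard linear stochastic approximation form z(t+1) = z(t) + a(t)(H z(t) + e(t+1) + v(t+1)), with a(t) = 1/(t+1) and H = −I as these quantities are instantiated in Appendix D; the noise scale and bias are arbitrary illustrative choices. The iterate should converge to zero.

```python
import numpy as np

rng = np.random.default_rng(1)
l, T = 3, 50_000
H = -np.eye(l)
z = np.ones(l)
for t in range(T):
    a = 1.0 / (t + 1)
    e = rng.normal(scale=0.1, size=l)   # zero-mean (martingale-difference) noise
    v = np.ones(l) / (t + 2)            # bias vanishing as t grows
    z = z + a * (H @ z + e + v)
final_norm = np.linalg.norm(z)
```

With these choices the iterate is essentially a running average of the noise and bias, so its norm shrinks at roughly the 1/√T rate.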

D Proof of Theorem 1
From Theorem 2.1 of [40] we have the following stability result for Markov chains.
Proposition 6 Consider a Markov chain {Y(t)} on R^n defined by Y(t + 1) = A(t + 1)Y(t) + B(t + 1), where {[A(t) B(t)], t ∈ N} is a sequence of i.i.d. random matrices taking values in R^{n×(n+1)}, such that E{log⁺∥A(t)∥} < ∞, E{log⁺∥B(t)∥} < ∞, and lim_{t→∞} (1/t) log∥←Φ_A(1, t)∥ < 0 a.s., where ←Φ_A(s, t) := A(s) ··· A(t) for 0 ≤ s ≤ t, and ←Φ_A(s, t) = I for s > t. In addition, suppose the only invariant subspace of R^n is itself, where an invariant subspace is a linear subspace L ⊂ R^n such that P{Y(1) ∈ L | Y(0) = y} = 1 for all y ∈ L. Then the infinite random series Σ_{j=0}^{∞} ←Φ_A(0, j − 1)B(j) converges a.s., and its distribution is the unique stationary distribution of the Markov chain {Y(t)}.
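The coupling behavior behind this stability result can be illustrated numerically. The sketch below assumes the random affine recursion Y(t+1) = A(t+1)Y(t) + B(t+1) with i.i.d. contractive A(t) (so the Lyapunov-exponent condition holds); the matrix entries are arbitrary illustrative choices. Two chains driven by the same [A B] sequence but started from different initial conditions should couple, reflecting the unique stationary distribution.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T = 2, 2_000
y1, y2 = np.zeros(n), 10.0 * np.ones(n)
for _ in range(T):
    A = 0.5 * rng.random((n, n))        # nonnegative entries, norm below 1
    B = rng.normal(size=n)
    y1, y2 = A @ y1 + B, A @ y2 + B     # same [A B] drives both chains
gap = np.linalg.norm(y1 - y2)
```

The gap ∥y1 − y2∥ = ∥←Φ_A(1, T)(y1(0) − y2(0))∥ decays geometrically, which is exactly the mechanism that makes the backward series converge.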
We also need the following lemma for substochastic matrices to establish Corollary 1, which will be used in the proof of Theorem 1.
Lemma 4 Consider a substochastic matrix A = [a_ij] ∈ R^{n×n}. Suppose that for every row i, 1 ≤ i ≤ n, there exist an integer j, 1 ≤ j ≤ n, such that the j-th row sum of A is less than one, and a sequence of distinct integers k_1 = i, k_2, …, k_m = j, 1 ≤ m ≤ n, such that a_{k_1 k_2} a_{k_2 k_3} ··· a_{k_{m−1} k_m} > 0. Then ρ(A) < 1.
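Before the proof, a quick numerical illustration of the lemma with a hand-picked substochastic matrix: the third row has sum 0.9 < 1, and every row reaches it through positive entries (row 1 → row 2 → row 3), so the spectral radius must be strictly below 1.

```python
import numpy as np

A = np.array([
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.4],   # deficient row: sum 0.9 < 1
])
rho = max(abs(np.linalg.eigvals(A)))
```

Here ρ(A) is close to but strictly below 1, showing that the deficiency of a single reachable row is enough for the contraction.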
PROOF. From Theorem 8.3.1 of [41], we know that there is a nonnegative nonzero vector x ∈ R^n such that A^T x = ρ(A)x. Let T := {i : x_i > 0}. There must exist j ∈ T such that Σ_{i∈T} a_ji < 1. Otherwise, Σ_{i∈T} a_ji = 1 for all j ∈ T, which would force a_jk = 0 for all j ∈ T and k ∉ T, so no row in T could reach a row with sum less than one; this contradicts the assumption. So ρ(A) Σ_{i∈T} x_i < Σ_{j∈T} x_j, and ρ(A) < 1. 2

Proof of Theorem 1: We use Proposition 6 and Corollary 1 to verify the first part of (i). Note that the bound holds for some constant γ > 0, where the last equality follows from the Jordan canonical decomposition. Thus, from Jensen's inequality and Corollary 1, we obtain the negativity of the Lyapunov exponent required in Proposition 6.
Assumption (iii), the assumption on the existence of stubborn agents, and update rule (1) ensure that the only invariant subspace of R^n is R^n itself. So let U(t) := B(t)X_s(t); it follows from Proposition 6 that X_r^* := Σ_{j=0}^{∞} ←Φ_A(0, j − 1)U(j) converges a.s., and its distribution π is the unique stationary distribution of {X_r(t)}. In addition, from Markov's inequality and (D.1), the deviation probabilities are summable, so by the Borel-Cantelli lemma, ←Φ_A(0, t)X_r(0) converges to zero a.s. Hence the backward process converges a.s. to X_r^*. Since ←X_r(t) and X_r(t) have the same distribution, we have that X_r(t) →d π as t → ∞.
To prove (iii), write the time average in recursive form. Using the notation of Appendix C, let z(t) = S_r(t) − x_r, a(t) = 1/(t + 1), H(t) ≡ H = −I, e(t) = X_r(t) − E{X_r(t)}, and v(t) = E{X_r(t)} − x_r. So from Proposition 5 in Appendix C, to show z(t) → 0, it suffices to validate that Σ_{t=1}^{∞} a(t)e(t) < ∞ and v(t) → 0 as t → ∞. The latter follows from (ii). For the former, note that the sum can be decomposed into several terms. On the right-hand side of this decomposition, the first two terms are weighted sums of martingale difference sequences and converge by Theorem B.6.1 of [39], and the last term converges since X_r(t) is bounded and a(t) − a(t − 1) = −1/(t(t + 1)). Hence Σ_{t=1}^{∞} a(t)e(t) < ∞ is obtained by combining the left-hand side with the third term on the right, noting that (I − Ā)^{-1} exists by Corollary 1.

In this section, we prove Lemma 1 by leveraging the techniques introduced in [30]. We need the following result.

G Proof of Theorem 3
We use Lemma 1 to prove the theorem. For the gossip model with stubborn agents, define f_i(x) = x_i, x ∈ R^{n_r}, 1 ≤ i ≤ n_r. We then have a concentration bound for every ε > 0 and t > 2∥g_i∥_s/ε, where g_i(x) := Σ_{t=0}^{∞} E{X_r,i(t) − x_r,i | X(0) = x}, x ∈ S. Note that g_i(x), x ∈ S, is the i-th component of a vector G(x) that can be bounded as follows: the second inequality holds because Ā is symmetric, implying ∥Ā∥_2 = ρ(Ā), and the last inequality follows from x ∈ S. Since |g_i(x)| ≤ ∥G(x)∥, we know that ∥g_i∥_s ≤ s^*, and the concentration bound holds for 1 ≤ i ≤ n_r, t > 2s^*/ε, and ε > 0. Now let ε_0 = |χ_1 − χ_2|/(2n_r(n_r + 1)); it follows from the proof of Theorem 2 that the desired bound holds, which leads to the conclusion when combined with the explicit form of |χ_1 − χ_2| given in the proof of Proposition 3. 2

H Proof of Theorem 4
We follow the notation used in the proof of Theorem 2.