The Asynchronous DeGroot Dynamics

We analyze the asynchronous version of the DeGroot dynamics: In a connected graph $G$ with $n$ nodes, each node has an initial opinion in $[0,1]$ and an independent Poisson clock. When a clock at a node $v$ rings, the opinion at $v$ is replaced by the average opinion of its neighbors. It is well known that the opinions converge to a consensus. We show that the expected time $\mathbb E(\tau_\varepsilon)$ to reach $\varepsilon$-consensus is poly$(n)$ in undirected graphs and in Eulerian digraphs, but for some digraphs of bounded degree it is exponential. Our main result is that in undirected graphs and Eulerian digraphs, if the degrees are uniformly bounded and the initial opinions are i.i.d., then $\mathbb E(\tau_\varepsilon)=\text{polylog}(n)$ for every fixed $\varepsilon>0$. We give sharp estimates for the variance of the limiting consensus opinion, which measures the ability to aggregate information (``wisdom of the crowd''). We also prove generalizations to non-reversible Markov chains and infinite graphs. New results of independent interest on fragmentation processes and coupled random walks are crucial to our analysis.


Introduction
The DeGroot dynamics [10] is arguably the most prominent model of non-Bayesian social learning. According to the classic DeGroot dynamics, agents synchronously update their opinions at discrete time periods, each adopting a weighted average of the opinions of its neighbors. We analyze a natural variant of the classic model, the asynchronous DeGroot dynamics, according to which agents update their opinions at the rings of independent Poisson clocks.
The classic DeGroot model has been studied extensively (see the survey [12]). It seems realistic to assume that not all agents update their opinion at the very same time, but rather do so at different random times. However, few quantitative results for the asynchronous dynamics have been proved.
A fundamental difference between the asynchronous DeGroot dynamics and many other averaging dynamics is that in the other dynamics the (weighted) average of the opinions never changes, whereas in the asynchronous DeGroot dynamics no such invariant exists. This feature makes the analysis of the asynchronous DeGroot dynamics harder. It also raises questions about the distribution of the consensus which do not arise in the other models.
Most of the results regarding the classic model stem from its immediate relation to a well-studied mathematical object, Markov chains. The analysis of the asynchronous counterpart is more subtle. In addition to Markov chain methods, we employ a connection to a fragmentation process of independent interest.

1.1. Model and results.

Let $V$ be a finite or countable set of vertices. A network consists of a stochastic matrix $P$ with rows and columns indexed by $V$. The asynchronous DeGroot process associated with $P$ is defined as follows. Each vertex $v \in V$ holds an opinion $f_t(v) \in \mathbb R$ at every time $t \in \mathbb R_+$. At time 0, the initial opinions $\{f_0(v)\}_{v\in V}$ are either deterministic or random. Each vertex $v$ is endowed with an independent Poisson clock of rate 1. At a time $t$ in which the clock of vertex $v$ rings, its opinion is updated through the DeGroot updating rule
$$f_t(v) := \sum_{u\in V} P_{vu}\, f_{t-}(u), \qquad (1.1)$$
where $f_{t-}(u) := \lim_{s \nearrow t} f_s(u)$.
A special case of interest is when $P$ is the transition matrix of a simple random walk on a (locally) finite undirected graph $G = (V, E)$. In this case, each time a vertex rings, it updates its opinion to be the average of the opinions of its neighbors. We refer to this dynamics as the DeGroot dynamics on $G$.
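For readers who want to experiment, here is a minimal Python sketch of this dynamics (our own illustration; the choice of graph, tolerance, and all function names are ours, not the paper's). Superposing the $n$ independent unit-rate Poisson clocks gives a single rate-$n$ clock whose rings select a uniformly random vertex, so picking uniform vertices one at a time simulates the process.

```python
import random

def degroot_async(adj, f0, eps, seed=0, max_events=10**6):
    """Asynchronous DeGroot dynamics on an undirected graph.

    adj: dict vertex -> list of neighbors; f0: dict of initial opinions.
    Runs until the oscillation max f - min f drops below eps.
    Returns (final average opinion, number of clock rings used).
    """
    rng = random.Random(seed)
    f = dict(f0)
    verts = list(adj)
    for k in range(max_events):
        if max(f.values()) - min(f.values()) <= eps:
            return sum(f.values()) / len(f), k
        v = rng.choice(verts)  # a ring of the superposed rate-n clock
        f[v] = sum(f[u] for u in adj[v]) / len(adj[v])
    return sum(f.values()) / len(f), max_events

# Example: the 8-cycle with half the opinions at 0 and half at 1.
n = 8
adj = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
f0 = {i: float(i < n // 2) for i in range(n)}
val, steps = degroot_async(adj, f0, eps=0.01)
```

Note that, unlike synchronous averaging with a doubly stochastic matrix, the running average here is not conserved: `val` is random even though `f0` is deterministic.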
The following known theorem is the asynchronous version of DeGroot's classical result [10].
Theorem 1.1. If $V$ is finite and $P$ is irreducible, then there exists a random variable $f_\infty$ such that for every $v \in V$, we have $\lim_{t\to\infty} f_t(v) = f_\infty$ almost surely.
The focus of this paper is twofold: studying the rate of convergence and the concentration of $f_\infty$ in finite graphs, and generalizing the convergence result to infinite graphs.
When $V$ is finite, we have $\max_v f_t(v) - \min_v f_t(v) \to 0$, and the $\varepsilon$-consensus time of the dynamics is defined as the stopping time
$$\tau_\varepsilon := \inf\big\{t \ge 0 : \max_v f_t(v) - \min_v f_t(v) \le \varepsilon\big\}.$$
In general, $\mathbb E[\tau_\varepsilon]$ can be exponential in $|V|$. We find classes of stochastic matrices for which $\mathbb E[\tau_\varepsilon]$ is polynomial in $|V|$, e.g., simple random walks on graphs.
In Section 2 we prove refinements of these bounds that take into account the diameter of $G$, its eigenvalues, and the initial configuration. Specifically, Part (i) of Theorem 1.2 follows from Theorem 2.1, part (ii) from Corollary 2.4, and the last sentence from Remark (1) after Theorem 2.1.
If the graph $G$ is directed, the time to consensus can be exponential (see Claim 6.2), but when $G$ is a directed Eulerian graph (in-degree equals out-degree at each vertex) we prove a polynomial upper bound in Corollary 2.6.
Our main contribution is showing that when the initial opinions are i.i.d., rather than arbitrary, and the graph has bounded degree, convergence to consensus is much faster once $n$ is large enough: we give a polylogarithmic upper bound for the consensus time in this setting.
Theorem 1.3. Let $G$ be either a connected undirected graph or an Eulerian directed graph. Suppose that $G$ has $n$ vertices and maximal degree $\Delta$. Consider the DeGroot dynamics on $G$ with initial opinions that are $[0, 1]$-valued i.i.d. random variables. Then there exists a universal constant $C > 0$ such that for any $\varepsilon > 0$, $\mathbb E[\tau_\varepsilon]$ is polylogarithmic in $n$, as long as $n$ is sufficiently large depending on $\Delta$.
Without the assumption of bounded degree, the consensus time can be polynomial in $n$ (see Claim 6.5). Theorem 1.3 follows from a more general result, Theorem 5.1, and Remark 5.3. The polylogarithmic upper bound given in Theorem 1.3 cannot be improved in general. Indeed, in Claim 6.3 we show that on the cycle graph, $\mathbb E[\tau_\varepsilon] \ge C\varepsilon^{-4}\log^2 n$ for any $\varepsilon \ge n^{-1/9}$.

Theorem 1.1 does not hold for infinite networks as stated. Claim 6.7 shows that for certain initial opinions convergence may fail. The following theorem provides a class of infinite networks and initial opinions to which Theorem 1.1 does extend. A more general result is given in Theorem 5.2.
Theorem 1.4. Consider the DeGroot dynamics on an infinite, connected graph of bounded degree and suppose that the initial opinions are bounded i.i.d. random variables with expectation $\mu$. Then, for every $v \in V$, we have $\lim_{t\to\infty} f_t(v) = \mu$ almost surely.

The assumption of bounded degree cannot be removed. Claim 6.6 provides an example of a graph of unbounded degree on which the opinions do not converge, even when the initial opinions are bounded i.i.d. random variables.
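This aggregation effect ("wisdom of the crowd") is easy to observe numerically. The following sketch uses our own choice of graph, horizon, and seed (none of it from the paper): on a long cycle with i.i.d. Uniform$[0,1]$ opinions, the opinion at a fixed vertex drifts toward the mean $\mu = 1/2$.

```python
import random

rng = random.Random(7)
n = 200
# i.i.d. Uniform[0,1] initial opinions, so mu = 1/2
f = [rng.random() for _ in range(n)]

# Roughly 500 time units of the Poisson clocks: 500 * n uniform-vertex updates.
for _ in range(500 * n):
    v = rng.randrange(n)
    f[v] = 0.5 * (f[(v - 1) % n] + f[(v + 1) % n])

val = f[0]  # by this time f[0] averages over many initial opinions
```

Because the opinion at a vertex becomes a convex combination of many i.i.d. initial opinions, its fluctuation around $\mu$ shrinks as the mass spreads, in line with the fragmentation-process picture of Section 3.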
Remark 1.5. On finite networks, the DeGroot dynamics is well defined through the updating rule (1.1). On infinite networks, a more subtle definition is required, which is given in Definition 3.1.
1.2. Related work.

Several asynchronous averaging dynamics have been studied, e.g., dynamics in which an edge is chosen randomly and then the two vertices on that edge update their opinion to be the average of the two. Boyd, Ghosh, Prabhakar, and Shah [4] analyzed such models in the setting of gossip algorithms and proved spectral bounds for convergence times; these bounds are polynomial in $n$ when the edge is chosen uniformly.
Deffuant, Neau, Amblard and Weisbuch [9] proposed a model similar to that of Boyd et al., except that the vertices update their opinions only if the distance between the opinions is smaller than a certain fixed parameter. Deffuant et al. showed that the opinions converge to several clusters. Within each cluster there is consensus, but the clusters are so far apart in their opinions that they do not exchange any information.

Convergence Rate on Finite Graphs
The goal of this section is to estimate the rate of convergence of asynchronous DeGroot dynamics from arbitrary initial opinions to consensus on finite graphs.
Every irreducible transition matrix $P$ has a unique stationary distribution $\pi$: a row vector satisfying $\pi P = \pi$. The chain is called reversible if $\pi(v)P_{vw} = \pi(w)P_{wv}$ for all $v, w \in V$.
In that case, all its eigenvalues are in $[-1, 1]$. If $1 = \lambda_1 > \lambda_2$ are the two top eigenvalues of $P$, then $\gamma := 1 - \lambda_2$ is called the spectral gap of $P$. Let $\pi_{\min}$ denote the minimal element of $\pi$. If $P$ corresponds to a simple random walk (SRW) on a graph $G = (V, E)$, then $\pi_{\min} \ge \frac{1}{2|E|}$. For any function $f : V \to \mathbb R$ we define its oscillation by $\mathrm{osc}(f) := \max_v f(v) - \min_v f(v)$. The next theorem bounds the convergence time to consensus for reversible chains and SRW on graphs.
Theorem 2.1. Let $P$ be reversible and let $\gamma$ denote its spectral gap. Suppose $\mathrm{osc}(f_0) \le 1$. Consider the stopping time
$$\tau_\varepsilon := \inf\{t \ge 0 : \mathrm{osc}(f_t) \le \varepsilon\}.$$
Then, for $0 < \varepsilon < 1$, we have
(a) $\mathbb E[\tau_\varepsilon] \le C\gamma^{-1}\log\frac{1}{\varepsilon\,\pi_{\min}}$, and, for simple random walk on a graph $G = (V, E)$,
(b) $\mathbb E[\tau_\varepsilon] \le C\,R_{\max}|E|\log\frac{2}{\varepsilon}$, where $R_{\max} := \max_{v,w\in V} R(v \leftrightarrow w)$ and $C$ is an absolute constant.

We prove this theorem in a sharper form in the next two subsections. In Section 2.4, we bound the convergence time for irreducible chains that need not be reversible.

Remarks.
(1) This theorem implies part (i) of Theorem 1.2. To obtain the last sentence of that theorem, we use the known fact that the diameter of a regular graph with $n$ nodes and degree $\Delta$ is at most $3n/\Delta$; see, e.g., the proof of Proposition 10.16 in [19].
(2) Note that if $f_t(v) \in [a, a + \varepsilon]$ for all $v \in V$, then the limiting consensus $f_\infty$ will also be in this interval.
(3) On the $n$-cycle, we have $\gamma = (2 + o(1))\pi^2/n^2$ (see Section 12.3.1 in [19]), so the bound in (b) is better; but in most graphs, (a) is superior. For example, on expander graphs, where $\gamma$ is bounded away from 0, the bound in (a) is logarithmic in $n$, while the bound in (b) is larger than $n$.
(4) By taking the initial opinion vector $f_0$ to be the right eigenvector of $P$ corresponding to $\lambda_2$, we show in Section 5 that $\mathbb E(\tau_\varepsilon)$ is at least of order $\gamma^{-1}$ for constant $\varepsilon$, so the bound in (a) is sharp up to the logarithmic factor.
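The cycle asymptotics in Remark (3) can be checked directly: the eigenvalues of SRW on the $n$-cycle are $\cos(2\pi k/n)$, $k = 0, \dots, n-1$, so the gap is available in closed form (a small sketch of ours, not code from the paper).

```python
import math

def cycle_spectral_gap(n):
    """Exact spectral gap of simple random walk on the n-cycle.

    The eigenvalues of P are cos(2*pi*k/n), so the second largest is
    cos(2*pi/n) and gamma = 1 - cos(2*pi/n) ~ 2*pi**2 / n**2.
    """
    return 1.0 - math.cos(2.0 * math.pi / n)

n = 1000
gamma = cycle_spectral_gap(n)
approx = 2.0 * math.pi ** 2 / n ** 2  # the (2 + o(1)) * pi^2 / n^2 asymptotic
```

For $n = 1000$ the exact gap and the asymptotic formula agree to well under one percent, consistent with the $(2 + o(1))$ factor.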

2.1. Preliminaries.

We first recall some notation and results from Markov chain theory that can be found, e.g., in [23] and [19].
• We will use the $\pi$-weighted scalar product on $\mathbb R^V$, $\langle f, g\rangle_\pi := \sum_{v\in V}\pi(v)f(v)g(v)$, and the corresponding norm $\|f\| = \|f\|_\pi := \sqrt{\langle f, f\rangle_\pi}$.
• The time reversal of a Markov chain with transition matrix $P$ and stationary distribution $\pi$ is given by the matrix $P^*$, where $\pi(v)P^*_{vw} = \pi(w)P_{wv}$ for all $v, w \in V$, so $P^*$ is the adjoint of $P$ with respect to the $\pi$-weighted scalar product. The chain is reversible iff $P^* = P$. In general, since $P$ is irreducible, $P^*$ is also irreducible and has the same stationary distribution $\pi$. This also holds for the symmetrization $\tilde P := (P + P^*)/2$ (see, e.g., Lemma 13.6 in [19] for this identity in the reversible case, and [11] or [23] for the general case). Note that the time reversal $P^*$ and the symmetrization $\tilde P$ have the same energy functional as $P$:
$$\mathcal E_P(f) := \frac12\sum_{v,w\in V}\pi(v)P_{vw}\big(f(v) - f(w)\big)^2 = \mathcal E_{P^*}(f) = \mathcal E_{\tilde P}(f).$$
Denote by $\lambda_2(\tilde P)$ the second largest eigenvalue of the reversible matrix $\tilde P$, and let $\tilde\gamma := 1 - \lambda_2(\tilde P)$ denote its spectral gap.
• There is a variational formula for the spectral gap:
$$\tilde\gamma = \inf\Big\{\frac{\mathcal E_P(f)}{\mathrm{Var}_\pi(f)} : f\colon V\to\mathbb R,\ \mathrm{Var}_\pi(f) \ne 0\Big\}; \qquad (2.1)$$
in particular, for reversible $P$ this infimum equals $\gamma$. See, e.g., eq. (1.2.13) in [23] or Remark 13.8 in [19].

Definition. Let $\mathcal F_t$ denote the $\sigma$-algebra generated by $\{f_s\}_{s\le t}$. In this section $f_0$ is deterministic, so $\mathcal F_t$ is generated by the clock rings in $[0, t]$. Whenever the limit
$$D\,\Psi(t, f_t) := \lim_{h\downarrow 0}\frac1h\,\mathbb E\big[\Psi(t+h, f_{t+h}) - \Psi(t, f_t)\,\big|\,\mathcal F_t\big]$$
exists, we refer to it as the drift of the process $\{\Psi(t, f_t)\}$ at time $t$. In Markov process theory (see, e.g., [18, Chap. 6]), $D$ is the infinitesimal generator of the time-space process $\{(t, f_t)\}_{t\ge 0}$. In particular, if $D\,\Psi(t, f_t) \equiv 0$ for all $t \ge 0$, then $\{\Psi(t, f_t)\}_{t\ge 0}$ is a martingale, and if $D\,\Psi(t, f_t) \le 0$ for all $t \ge 0$, then $\{\Psi(t, f_t)\}_{t\ge 0}$ is a supermartingale.
The drift operator $D$ satisfies a version of the product rule for derivatives. Suppose that $\varphi : [0, \infty) \to \mathbb R$ is differentiable and $D\,\Psi(t, f_t)$ exists for all $t$. Then
$$\varphi(t+h)\Psi(t+h, f_{t+h}) - \varphi(t)\Psi(t, f_t) = \varphi(t)\big(\Psi(t+h, f_{t+h}) - \Psi(t, f_t)\big) + \big(\varphi(t+h) - \varphi(t)\big)\Psi(t+h, f_{t+h}).$$
Taking conditional expectation given $\mathcal F_t$, dividing by $h$, and letting $h \to 0$ gives
$$D\big[\varphi(t)\Psi(t, f_t)\big] = \varphi(t)\,D\,\Psi(t, f_t) + \varphi'(t)\,\Psi(t, f_t).$$
In the next lemma, we record the drifts of some key summaries of the opinion profile $f_t$.
Lemma 2.2. The empirical mean $M_t := \mathbb E_\pi(f_t)$ is a martingale, and the drift of $M_t^2$ is
$$D\,M_t^2 = \sum_{v\in V}\pi(v)^2\big((Pf_t)(v) - f_t(v)\big)^2. \qquad (2.2)$$
We also have
$$\mathbb E[M_t^2] = M_0^2 + \int_0^t \mathbb E\big[D\,M_s^2\big]\,ds. \qquad (2.3)$$

Proof. We have
$$\mathbb E[M_{t+h} - M_t \mid \mathcal F_t] = h\sum_{v\in V}\pi(v)\big((Pf_t)(v) - f_t(v)\big) + O(h^2) = h\,(\pi P - \pi)f_t + O(h^2) = O(h^2),$$
since $\pi P = \pi$. Dividing by $h$ and letting $h \to 0$, we see that $\{M_t\}$ has zero drift, so it is a martingale. Next, we consider the second moment of $M_t$. By orthogonality of martingale increments,
$$\mathbb E[M_{t+h}^2 - M_t^2 \mid \mathcal F_t] = \mathbb E\big[(M_{t+h} - M_t)^2 \mid \mathcal F_t\big] = h\sum_{v\in V}\pi(v)^2\big((Pf_t)(v) - f_t(v)\big)^2 + O(h^2).$$
Dividing by $h$ and passing to the limit yields (2.2), which implies (2.3) by integration.

Corollary 2.3. If $f_0$ is deterministic, then $\mathbb E[\mathcal E_P(f_t)] \le e^{-\tilde\gamma t}\,\mathcal E_P(f_0)$ for all $t \ge 0$, and
$$\pi_{\min}\,\mathcal E_P(f_0) \le \mathrm{Var}(f_\infty) \le \pi_{\max}\,\mathcal E_P(f_0).$$

Proof. If $f : V \to \mathbb R$ is updated at $v$ according to our dynamics, then the energy $\mathcal E_P(f)$ is decreased by
$$\pi(v)\big((Pf)(v) - f(v)\big)^2.$$
(This computation used reversibility; in the non-reversible case, the coefficient $\pi(v)P_{vw}$ in the first line would be replaced by its average with $\pi(w)P_{wv}$. In that case, the energy could increase in an update and need not be a supermartingale; see the first paragraph of Subsection 2.4.) Abbreviating $\mathcal E_t := \mathcal E_P(f_t)$, we infer from the preceding display that
$$\mathbb E[\mathcal E_{t+h} - \mathcal E_t \mid \mathcal F_t] = -h\sum_{v\in V}\pi(v)\big((Pf_t)(v) - f_t(v)\big)^2 + O(h^2).$$
Dividing by $h$ and letting $h \downarrow 0$, we get
$$D\,\mathcal E_t = -\sum_{v\in V}\pi(v)\big((Pf_t)(v) - f_t(v)\big)^2 = -\|(I - P)f_t\|_\pi^2. \qquad (2.4)$$
By the variational formula (2.1), we have $\mathcal E_P(f) \ge \tilde\gamma\,\mathrm{Var}_\pi(f)$. Observe that $\langle (I - P)f, \mathbf 1\rangle_\pi = 0$ for any $f : V \to \mathbb R$, so by Cauchy-Schwarz,
$$\mathcal E_P(f) = \langle (I - P)f, f\rangle_\pi = \langle (I - P)f,\ f - \mathbb E_\pi(f)\rangle_\pi \le \|(I - P)f\|_\pi\,\mathrm{Var}_\pi(f)^{1/2}.$$
Consequently, $\|(I - P)f\|_\pi^2 \ge \tilde\gamma\,\mathcal E_P(f)$, so (2.4) gives $D\,\mathcal E_t \le -\tilde\gamma\,\mathcal E_t$, which yields the first claim.

We may assume that $M_0 = \mathbb E_\pi(f_0) = 0$, since subtracting a constant from $f_0$ does not affect $\mathcal E_P(f_0)$ and $\mathrm{Var}(f_\infty)$. By Lemma 2.2 and (2.4), $D\big[M_t^2 + \pi_{\max}\,\mathcal E_P(f_t)\big] \le 0$, so $M_t^2 + \pi_{\max}\,\mathcal E_P(f_t)$ is a supermartingale. Thus, by Lebesgue's bounded convergence theorem,
$$\mathrm{Var}(f_\infty) = \lim_{t\to\infty}\mathbb E[M_t^2] \le M_0^2 + \pi_{\max}\,\mathcal E_P(f_0) = \pi_{\max}\,\mathcal E_P(f_0).$$
This proves the upper bound on $\mathrm{Var}(f_\infty)$; the lower bound is proved similarly, using the submartingale $M_t^2 + \pi_{\min}\,\mathcal E_P(f_t)$.
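The martingale property of the empirical mean is easy to verify numerically. In this toy sketch (our own setup), the cycle is regular, so $\pi$ is uniform and $\mathbb E_\pi(f_t)$ is the plain average; over many independent runs, its mean stays at $M_0$.

```python
import random

def run_once(rng, n, updates):
    """One run of asynchronous DeGroot on the n-cycle; returns E_pi(f_t),
    which for a regular graph is simply the average opinion."""
    f = [float(i < n // 2) for i in range(n)]  # half zeros, half ones
    for _ in range(updates):
        v = rng.randrange(n)
        f[v] = 0.5 * (f[(v - 1) % n] + f[(v + 1) % n])
    return sum(f) / n

rng = random.Random(1)
n, updates, runs = 10, 50, 2000
# M_0 = 0.5 exactly; since M_t is a martingale, E[M_t] = 0.5 for all t.
avg = sum(run_once(rng, n, updates) for _ in range(runs)) / runs
```

Individual runs fluctuate (the invariant average is lost, unlike in edge-averaging gossip models), but the Monte Carlo mean of $M_t$ stays at $M_0 = 0.5$ up to sampling error.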
Remark. The corollary above assumed that the initial opinion profile $f_0$ is deterministic. If $f_0$ is random, then the same argument gives
$$\mathrm{Var}(M_0) + \pi_{\min}\,\mathbb E\big[\mathcal E_P(f_0)\big] \le \mathrm{Var}(f_\infty) \le \mathrm{Var}(M_0) + \pi_{\max}\,\mathbb E\big[\mathcal E_P(f_0)\big].$$

2.3. The case of simple random walk.

In this section we prove Part (b) of Theorem 2.1. We begin by presenting a few well-known notions and results regarding simple random walks.
• Since we assume each edge in $E$ has unit resistance, the effective conductance $C(v \leftrightarrow w)$ from $v$ to $w$ is given by the Dirichlet principle
$$C(v \leftrightarrow w) = \min\Big\{\sum_{\{x,y\}\in E}\big(F(x) - F(y)\big)^2 : F(v) = 1,\ F(w) = 0\Big\}$$
(see, e.g., the solved Exercise 2.13 in [22]). The effective resistance is $R(v \leftrightarrow w) := \frac{1}{C(v \leftrightarrow w)}$.
• The commute time for simple random walk between $v$ and $w$ equals $2|E|\,R(v \leftrightarrow w)$ (see Prop. 10.7 in [19]).
Therefore, by (2.4) we have (2.7). Let $N(t)$ be the total number of clock rings in $[0, t]$. To verify the claim, we first observe that for each $k \ge 0$, the conditional expectation $a_k := \mathbb E[\mathcal E_t \mid N(t) = k]$ does not depend on $t$ and satisfies $|a_k| \le 1$. Therefore $\mathbb E[\mathcal E_t] = \sum_k a_k\,\mathbb P(N(t) = k)$ is a smooth function of $t$. Since the energy is nonincreasing, Fatou's lemma applies, and by (2.6) and Jensen's inequality, setting $t^* = 4R_{\max}|E|$ we obtain the required estimate, where the second inequality is Markov's inequality and the third inequality follows from (2.7). The same argument shows that (2.8) holds. Next, we use an iteration argument and argue inductively that (2.9) holds for all $k \ge 1$ and all initial opinion profiles $f_0$. Indeed, the base case $k = 1$ is just (2.8). For the induction step, observe that, by the Markov property, the dynamics restarted at time $(k-1)t^*$ is again a DeGroot dynamics with initial profile $f_{(k-1)t^*}$, so applying the induction hypothesis and additivity of expectation proves (2.9) for all $k \ge 1$. Taking $k = \lceil\log_2(1/\varepsilon)\rceil$ completes the proof of (b).
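The effective resistance used above can be computed by solving the discrete Dirichlet problem directly. The sketch below (our own illustration; names and the test graph are ours) grounds $w$, injects a unit current at $v$, and reads off $R(v \leftrightarrow w)$ as the potential at $v$; on the cycle, the two arcs in parallel give the known value $k(n-k)/n$.

```python
def effective_resistance(adj, s, t):
    """Effective resistance between s and t, unit resistance per edge.

    Ground t (potential 0) and inject one unit of current at s; the
    restricted graph Laplacian then satisfies L' phi = e_s, and
    R(s <-> t) = phi(s).  Dense Gauss-Jordan elimination, fine for
    small graphs.
    """
    verts = [v for v in adj if v != t]
    idx = {v: i for i, v in enumerate(verts)}
    m = len(verts)
    A = [[0.0] * (m + 1) for _ in range(m)]  # augmented system [L' | b]
    for v in verts:
        i = idx[v]
        A[i][i] = float(len(adj[v]))
        for u in adj[v]:
            if u != t:
                A[i][idx[u]] -= 1.0
        if v == s:
            A[i][m] = 1.0  # unit current injected at s
    for c in range(m):  # Gauss-Jordan with partial pivoting
        p = max(range(c, m), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(m):
            if r != c and A[r][c]:
                coef = A[r][c] / A[c][c]
                for j in range(c, m + 1):
                    A[r][j] -= coef * A[c][j]
    return A[idx[s]][m] / A[idx[s]][idx[s]]

# Cycle C_12: the two arcs between 0 and 4 act as resistors in parallel,
# so R = k*(n-k)/n, and the commute time is 2|E|R = 2k(n-k).
n, k = 12, 4
adj = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
R = effective_resistance(adj, 0, k)
```

This matches the series/parallel computation, and multiplying by $2|E| = 2n$ recovers the commute-time identity quoted above.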

2.4. Beyond reversible chains.
In digraphs, the energy can increase and need not be a supermartingale. A simple example is the directed walk on the $n$-cycle $\{0, 1, \ldots, n-1\}$, where $P_{i,(i+1)\bmod n} = 1$. This issue persists if we make $P$ lazy by averaging it with the identity, or if we change $P$ to satisfy $P_{ij} = 1/2$ iff $j - i \in \{1, 2\} \bmod n$. Instead of tracking the energy, we will track the empirical variance.
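A single update can indeed increase the energy on the directed cycle. Writing $a = f(v-1) - f(v)$ and $b = f(v) - f(v+1)$, the update $f(v) \leftarrow f(v+1)$ changes the (unweighted) energy sum by $(a+b)^2 - a^2 - b^2 = 2ab$, which is positive on a monotone run. A concrete check (our own toy example):

```python
def energy(f):
    """E_P(f) = (1/2) * sum_v pi(v) P_{v,v+1} (f(v) - f(v+1))^2 on the
    directed n-cycle, where pi is uniform and P_{v,v+1} = 1."""
    n = len(f)
    return sum((f[v] - f[(v + 1) % n]) ** 2 for v in range(n)) / (2 * n)

f = [0.0, 1.0, 2.0, 0.0]   # monotone run 0, 1, 2 along the orientation
before = energy(f)          # = (1 + 1 + 4 + 0) / 8 = 0.75
g = list(f)
g[1] = f[2]                 # DeGroot update at vertex 1: copy its successor
after = energy(g)           # = (4 + 0 + 4 + 0) / 8 = 1.0 > before
```

The update at vertex 1 strictly increases the energy, so the supermartingale argument of Section 2.2 genuinely fails here.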
In the next theorem, the matrix $P^*P$ will be important. This matrix could be reducible even if $P$ is irreducible (e.g., for the directed walk on the cycle, $P^*P = I$). Recall that $\tilde P = (P + P^*)/2$ and $\tilde\gamma = 1 - \lambda_2(\tilde P)$. Let $\gamma_{P^*P} = 1 - \lambda_2(P^*P)$. Note that $\gamma_{P^*P} > 0$ iff (2.10) holds. For any $f : V \to \mathbb R$, using the variational principle twice yields (2.11). By the product rule, $\Lambda_t := e^{\gamma t}\,\mathrm{var}_\pi(f_t)$ satisfies (2.12), and part (i) follows.
(ii) The hypothesis that $P_{vv} \ge \delta$ for all $v$ means that $P = \delta I + (1 - \delta)Q$ for some stochastic matrix $Q$. Then, for any $f : V \to \mathbb R$, we have (2.13). The variational principle yields that $\gamma_{P^*P} \ge 2\delta\tilde\gamma$, so the claim follows from (i).
Recall that a digraph is Eulerian iff it is connected and, for every vertex, its out-degree equals its in-degree.
Corollary 2.6. Let $P$ be the transition probabilities matrix of a simple random walk on an Eulerian digraph $G = (V, E)$ with $n$ vertices and $m$ edges, i.e., $P_{vw} = 1/\deg^+(v)$ for every edge $(v, w) \in E$. Then the bound of Theorem 2.5 applies with $1/\tilde\gamma \le Cnm$, where $C$ is an absolute constant; for a lazy random walk on $G$ (which can be obtained from $P$ by replacing it with $(P + I)/2$), a corresponding bound holds.

Proof. Since $G$ is Eulerian, the stationary measure is given by $\pi(v) = \deg^+(v)/m$. The time reversal of $P$ is SRW on $G$ with all edges reversed, and the symmetrization $\tilde P$ is SRW on the undirected graph obtained by ignoring the orientations of the edges. Thus $1/\tilde\gamma \le Cnm$; see, e.g., [21] or [19, Chap. 10]. The claim now follows from Theorem 2.5.
Next, we bound the variance of the consensus $f_\infty$. In the example of the directed cycle from the beginning of this section, if $f_0(k) = 0$ for $0 \le k < n/2$ and $f_0(k) = 1$ for $n/2 \le k < n$, then $\mathrm{Var}(f_\infty) = 1/4$, and the consensus is far from the initial empirical average. The following corollary ensures this does not occur if $\pi_{\max}$ is small and every node puts a substantial weight on its own opinion.
Proof. Without loss of generality, by subtracting a constant from $f_0$, we may assume that $M_0 = 0$. By Lemma 2.2 and (2.11), the drift of $M_t^2$ is controlled; on the other hand, (2.10) and (2.13) yield the complementary estimate.

Fragmentation Process and Related Results
In this section we introduce a process which is closely related to the DeGroot dynamics. Let $X(t)$ be a continuous-time Markov chain on $V$ with independent unit Poisson clocks on the vertices and transition probabilities matrix $P$. Let $\mathcal F_t$ be the $\sigma$-algebra generated by the clock rings up to time $t$. The fragmentation process originating at a vertex $o \in V$ is a collection of masses $\{m_t(o, v)\}_{v\in V,\,t\ge 0}$ defined below. Note that the $\{m_t(o, \cdot)\}_{o\in V}$ are all interdependent since they are all measurable w.r.t. the same clock rings. We will often abbreviate $m_t(v) := m_t(o, v)$ when $o$ is clear from the context or when we make a statement that refers to every $o \in V$. The process $\{m_t(v)\}_{v\in V,\,t\ge 0}$ is a step process adapted to the filtration $\{\mathcal F_t\}_{t\in\mathbb R_+}$ that satisfies the following Markovian rules:
• At time 0, $m_0(v) = \mathbf 1_{\{v = o\}}$.
• Suppose that the clock of vertex $v$ rings at time $t \ge 0$. Then
$$m_t(u) = m_{t-}(u) + P_{vu}\,m_{t-}(v) \ \text{ for every } u \ne v, \qquad m_t(v) = P_{vv}\,m_{t-}(v).$$
Namely, the mass of vertex $v$ is pushed to its neighbors proportionally to $v$'s row in $P$.

We now explain the relation between the DeGroot and the fragmentation processes. We first demonstrate this relation on finite $V$ and then use this relation to define the DeGroot process on infinite $V$. Suppose first that $V$ is finite. In this case, the DeGroot dynamics can be defined through the DeGroot updating rule (1.1). In what follows we identify a Poisson process with its cumulative function $N(s)$, which counts the number of rings in the time interval $[0, s]$. Suppose that the DeGroot dynamics uses the Poisson processes $N_v(s)$ for $v \in V$. Let $t > 0$ and define new Poisson processes $N^t_v(s)$ by running the clocks backwards from time $t$. Consider the fragmentation processes $m^t_s(o, v)$ generated from the Poisson processes $N^t_v(s)$. We have the following identity:
$$f_t(o) = \sum_{v\in V} m^t_s(o, v)\,f_{t-s}(v), \qquad 0 \le s \le t. \qquad (3.3)$$
This identity clearly holds for $s = 0$, and one can prove it using induction on the times of the clock rings in $[0, t]$. We use Equation (3.3) to define the DeGroot process in general ($V$ possibly infinite).

We obtain several results regarding the fragmentation process. In our results, we assume that the transition probabilities decay at least as fast as the inverse square root of the time.
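The mass-pushing rule is straightforward to simulate. The sketch below (our own illustration; graph and parameters are ours) runs the fragmentation process of SRW on a cycle: the total mass is conserved exactly, while the largest single mass shrinks as the process evolves, in the spirit of Theorem 3.3.

```python
import random

def fragmentation(adj, o, events, seed=0):
    """Fragmentation process for SRW on an undirected graph: unit mass
    starts at o; when the clock of v rings, v's mass is split equally
    among its neighbors (v's row of P).  Clocks are simulated by picking
    a uniformly random vertex per event."""
    rng = random.Random(seed)
    verts = list(adj)
    m = {v: 0.0 for v in verts}
    m[o] = 1.0
    for _ in range(events):
        v = rng.choice(verts)
        share = m[v] / len(adj[v])
        m[v] = 0.0                # P_vv = 0 for SRW without self-loops
        for u in adj[v]:
            m[u] += share
    return m

n = 16
adj = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
m = fragmentation(adj, 0, events=500)
total = sum(m.values())   # conserved: always 1 (up to float error)
peak = max(m.values())    # shrinks over time
```

Summing over all vertices at each push shows why $\sum_v m_t(v)$ is invariant, while no single vertex can retain a macroscopic fraction of the mass for long.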
Assumption 1 (square root decay of transition probabilities). We say that a network with matrix $P$ satisfies Assumption 1 with parameter $C_0 > 0$ if $P^k_{vu} \le C_0/\sqrt k$ for all $v, u \in V$ and all $1 \le k \le |V|^2$. For a family of finite networks to satisfy this assumption, the parameter $C_0$ must be the same for all networks in the family.
There are many interesting networks that satisfy Assumption 1. One example is simple random walks on connected bounded-degree graphs, finite or infinite.

On infinite networks that satisfy Assumption 1, the fragmentation process converges to zero uniformly almost surely. Theorem 3.3 below shows that after time $t$, it is very unlikely that a fraction of the total mass concentrates on one vertex.

Theorem 3.3. There is a universal constant $\beta > 0$ with the following property. Suppose that $|V| = \infty$ and Assumption 1 holds with parameter $C_0$. Then for every $\varepsilon > 0$, there exists $t_0 = t_0(C_0, \varepsilon)$ such that for every vertex $o \in V$ and all $t \ge t_0$, we have
$$\mathbb P\big(\exists v \in V,\ m_t(v) \ge \varepsilon\big) \le t^{-\beta},$$
where $\{m_t(\cdot)\}$ is the fragmentation process originating at $o$.
The exponent $\beta$ we obtain in the proof is far from optimal, and in fact we conjecture that the following holds.

Conjecture 3.4. For any $\varepsilon > 0$ there exists $c_\varepsilon > 0$ such that $\mathbb P(\exists v,\ m_t(v) \ge \varepsilon) < e^{-c_\varepsilon t}$.
The next three propositions will be used in our proofs of the theorems. These propositions provide bounds on certain moments of the fragmentation process. The proofs of the propositions are given in Section 4.
The first proposition bounds the second moment in infinite networks from above.
Proposition 3.5. Let $m_t(v)$ be the fragmentation process originating at an arbitrary vertex. Under Assumption 1, there exist a universal constant $\alpha > 0$ and a real number $t_0 = t_0(C_0)$ such that $\mathbb E\big[\sum_{v\in V} m_t(v)^2\big] \le t^{-\alpha}$ for all $t_0 \le t \le |V|^2/3$, where $V$ may be either finite or infinite.
Note that the constant $\alpha$ cannot be greater than $\frac12$, since if $P$ is the transition matrix of a symmetric random walk on the line graph $\mathbb Z$, then by Jensen's inequality,
$$\mathbb E\Big[\sum_v m_t(v)^2\Big] \ge \sum_v \big(P^{(t)}_{0v}\big)^2 = P^{(2t)}_{00} \ge \frac{c}{\sqrt t},$$
where $P^{(t)}$ is the transition probabilities matrix of the continuous-time symmetric random walk.
Our proof of Proposition 3.5 provides a certain $0 < \alpha < \frac12$. In Remark 4.13, we explain why we conjecture that the proposition should hold with $\alpha$ arbitrarily close to $\frac12$.

The next proposition studies the case of reversible chains. Under this assumption, we obtain the improved (and optimal) bound of $O(t^{-1/2})$.

Proposition 3.6. Let $V$ be a (finite or infinite) network satisfying Assumption 1. Suppose in addition that $P$ is the transition matrix of a reversible Markov chain on $V$ with stationary measure $\pi$. Let $\Delta = \sup_{v,u\in V} \pi_v/\pi_u$ and let $m_t(v)$ be the fragmentation process originating at an arbitrary vertex. Then, there exists a constant $C$ depending on $C_0$ and $\Delta$ such that for all $t \le |V|^2/3$ we have
$$\mathbb E\Big[\sum_{v\in V} m_t(v)^2\Big] \le \frac{C}{\sqrt t}.$$

The third proposition refers to higher moments of the fragmentation process.

Proposition 3.7. Let $0 < \alpha < 1$ and $C_1 > 1$. Suppose that $V$ is either finite or infinite and that the network satisfies Assumption 1. In addition, suppose that the network has the property that the fragmentation process originating at any vertex satisfies $\mathbb E\big[\sum_v m_t(v)^2\big] \le C_1 t^{-\alpha}$ for all $t \le |V|^2/3$. Then, there exists $t_0 = t_0(C_0, C_1, \alpha)$ such that for all $d \ge 2$ and all $\max(t_0, d^{10/\alpha}) \le t \le |V|^2/3$, a corresponding bound on $\mathbb E\big[\sum_v m_t(v)^d\big]$ holds.

Theorem 3.3 easily follows from Propositions 3.5 and 3.7. Indeed, let $\alpha$ be the constant from Proposition 3.5, let $t > 1$ be sufficiently large, and set $d := t^{\alpha/10}$. By Markov's inequality,
$$\mathbb P\big(\exists v,\ m_t(v) \ge \varepsilon\big) \le \varepsilon^{-d}\,\mathbb E\Big[\sum_v m_t(v)^d\Big],$$
and the bound of Proposition 3.7 yields the theorem.

Markov Chains with Shared Clocks
In this section, we prove Propositions 3.5, 3.6 and 3.7. The proofs begin by translating statements about the moments of the fragmentation process into statements about certain joint Markov chains.
Let $X_1(t), X_2(t), \ldots$ be continuous-time Markov chains on $V$ with the following joint Markovian law. They all use the same Poisson clocks but have independent trajectories (with transition probabilities given by $P$). Thus, each $X_i(t)$ has the same law as $X(t)$ from Equation (3.1), and they are correlated with each other only through sharing the same Poisson clocks.
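The shared-clock coupling is simple to simulate: a ring at vertex $v$ makes every chain currently sitting at $v$ take its own independent step. The sketch below (our own toy experiment, not from the paper) estimates the collision probability $\mathbb P(X_1(t) = X_2(t))$ for two such walks on a cycle, which by Claim 4.1 equals the second moment of the fragmentation process.

```python
import random

def collide_prob(n, t_events, trials, seed=0):
    """Two SRWs on the n-cycle driven by shared Poisson clocks: a ring at
    vertex v moves every chain currently at v by an independent +-1 step.
    Estimates P(X1 = X2) after t_events rings (both chains start at 0)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x1 = x2 = 0
        for _ in range(t_events):
            v = rng.randrange(n)          # the clock that rings
            if x1 == v:
                x1 = (x1 + rng.choice((-1, 1))) % n
            if x2 == v:
                x2 = (x2 + rng.choice((-1, 1))) % n
        if x1 == x2:
            hits += 1
    return hits / trials

p = collide_prob(n=20, t_events=1000, trials=300)
```

Note the coupling: when the two chains sit on the same vertex, a single ring moves both of them at once, which is exactly what makes the joint law differ from two fully independent walks.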
Claim 4.1. For every $d \in \mathbb N$ and $t \ge 0$, $\mathbb E\big[\sum_{v\in V} m_t(v)^d\big] = \mathbb P\big(X_1(t) = X_2(t) = \cdots = X_d(t)\big)$, where all the chains start at the origin $o$ of the fragmentation process.

Proof. Let $\mathcal F$ denote the $\sigma$-algebra generated by the clock rings. By the definition of the fragmentation process, $m_t(v) = \mathbb P(X(t) = v \mid \mathcal F)$. Let $d \in \mathbb N$ and $t \ge 0$. Conditioning on $\mathcal F$, the trajectories of the random walks are independent and identically distributed, and therefore
$$\mathbb P\big(X_1(t) = \cdots = X_d(t) \mid \mathcal F\big) = \sum_{v\in V}\mathbb P(X(t) = v \mid \mathcal F)^d = \sum_{v\in V} m_t(v)^d.$$
Claim 4.1 follows by taking expectations.
The reversible case is easier than the non-reversible case and therefore we begin with the proof of Proposition 3.6.
4.1. Proof of Proposition 3.6. Let $X_1(t)$ and $X_2(t)$ be continuous-time Markov chains with independent trajectories and shared Poisson clocks. By Claim 4.1, it suffices to bound the probability that $X_1(t) = X_2(t)$.
We start with some notation that will be useful throughout the proof. A path $p$ is a finite or infinite sequence of states of the Markov chain. We let $|p|$ be the number of steps taken by the path and let $p^{-1}$ be the reversed path. For an integer $n$ and a path $p$ with $|p| \ge n$, we denote by $p^n$ the path obtained from the first $n$ steps of $p$.
Next, let $A_1(p)$ be the event that the first $|p|$ steps of $X_1$ were along the path $p$, and similarly let $A_2(p)$ be the event that the first $|p|$ steps of $X_2$ were along the path $p$. Note that the events $A_1(p)$ and $A_2(p)$ are independent of each other and independent of the Poisson clocks on the vertices.

Finally, let $N_1(t)$ and $N_2(t)$ be the number of steps taken by the Markov chains $X_1$ and $X_2$ respectively up to time $t$. Clearly $N_1(t), N_2(t) \sim \mathrm{Poisson}(t)$. It is convenient to work inside the event $B := \{|N_1(t) + N_2(t) - 2t| \le t\}$, which holds with high probability by standard Poisson concentration. We have
$$\{X_1(t) = X_2(t)\} \cap B \subseteq \bigcup_{n_1, n_2}\ \bigcup_{q, q'} \big(A_1(q) \cap A_2(q')\big),$$
where the first union is over pairs of integers $n_1$ and $n_2$ with $|n_1 + n_2 - 2t| \le t$ and the second union is over pairs of paths $q$ and $q'$ with $|q| = n_1$ and $|q'| = n_2$ that start at $o$ and end at the same vertex. Next, let $C_n$ be the set of paths $p$ from $o$ to $o$ with $|p| = n$ in $V$. Clearly, the paths $q$ and $q'$ appearing in the union above can be concatenated to a path $p \in C_n$, where $n := n_1 + n_2$, such that $q = p^{n_1}$ and $q' = (p^{-1})^{n_2}$. Thus, by the union bound, we obtain an estimate in which, in the last inequality, we used (4.1) and the fact that the chain is reversible. We claim that for any $p \in C_n$, the event $A_1(p^{n_1}) \cap A_2\big((p^{-1})^{n_2}\big)$ is independent of the trajectory of the chain $X_1$ after $n_1$ steps and of the trajectory of $X_2$ after $n - n_1$ steps, and therefore we can extend these trajectories arbitrarily. Substituting this, we get (4.3). We need the following lemma in order to estimate the sum on the right-hand side of (4.3). The statement of the lemma requires additional notation. Let $0 = t_0 < t_1 < t_2 < \cdots$ be the times at which either one of the Poisson clocks $N_1(t)$ or $N_2(t)$ rings (i.e., the jump times of $N_1(t) + N_2(t)$). We also let $\mathcal Z_\infty$ denote the $\sigma$-algebra containing all the discrete information of the continuous-time Markov chain $(X_1(t), X_2(t))$.

Lemma 4.2. For all $t \le n \le 3t$ we have, almost surely, the bound (4.4).

In order to prove the lemma we will need the following technical claim. To this end, recall that
the convolution of two functions $\varphi, \psi : \mathbb R_+ \to \mathbb R$ is given by $\varphi * \psi(x) := \int_0^x \varphi(y)\psi(x - y)\,dy$.

Claim 4.3. Let $n \ge 2$ and let $k_1, k_2 > 0$ be such that $2k_1 + k_2 = n$. Let $\varphi_1, \varphi_2$ and $\varphi_3$ be the densities of $\mathrm{Gamma}(k_1, 1)$, $\mathrm{Gamma}(k_2, 2)$ and $\mathrm{Gamma}(\lfloor n/2\rfloor, 1)$ respectively. Then, for all $x \le n$, we have $\varphi_1 * \varphi_2(x) \le C\varphi_3(x)$ for some absolute constant $C > 0$.
Proof. Let $\varphi_4$ and $\varphi_5$ be the densities of $\mathrm{Gamma}(k_2/2, 1)$ and $\mathrm{Gamma}(n/2, 1)$ respectively. First, we claim that for all $x > 0$, $\varphi_2(x) \le C_2\varphi_4(x)$ for some $C_2 > 0$; this uses that $f(x) := x^a e^{-x}$ is maximal when $x = a$, together with Stirling's formula. Thus, using that $k_1 + k_2/2 = n/2$, we obtain (4.5). Finally, we have (4.6), where the first inequality holds trivially when $n$ is even, and when $n$ is odd it follows from Stirling's formula using that $x \le n$. The claim follows from (4.5) and (4.6) (with $C = C_2C_3$).
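Claim 4.3 is easy to sanity-check numerically for a specific small $n$ (our own verification sketch; the grid and parameters are ours). We evaluate the convolution by a trapezoid rule and compare it pointwise with $\varphi_3$.

```python
import math

def gamma_pdf(x, k, rate):
    """Density of Gamma(shape k, rate), via logs for numerical stability."""
    if x <= 0:
        return 0.0
    return math.exp(k * math.log(rate) + (k - 1) * math.log(x)
                    - rate * x - math.lgamma(k))

def conv(f, g, x, steps=2000):
    """(f * g)(x) = int_0^x f(y) g(x - y) dy, trapezoid rule."""
    h = x / steps
    s = 0.0
    for i in range(steps + 1):
        y = i * h
        w = 0.5 if i in (0, steps) else 1.0
        s += w * f(y) * g(x - y)
    return s * h

n, k1, k2 = 10, 3, 4  # satisfies 2*k1 + k2 = n
phi1 = lambda y: gamma_pdf(y, k1, 1.0)   # Gamma(k1, 1)
phi2 = lambda y: gamma_pdf(y, k2, 2.0)   # Gamma(k2, 2)
phi3 = lambda y: gamma_pdf(y, n // 2, 1.0)  # Gamma(floor(n/2), 1)

# Largest observed ratio (phi1 * phi2)(x) / phi3(x) over a grid of x <= n.
ratio = max(conv(phi1, phi2, x) / phi3(x) for x in (1.0, 2.5, 5.0, 7.5, 10.0))
```

For these parameters the ratio stays of order one on $x \le n$, consistent with the absolute constant $C$ in the claim (this is of course a spot check, not a proof).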
We now turn to prove Lemma 4.2.
Proof of Lemma 4.2. Condition on $\mathcal Z_\infty$. Without loss of generality, we may assume that $N_1(t_k) + N_2(t_k) = n$; indeed, otherwise the conditional probability on the left-hand side of (4.4) is 0 and the lemma follows. Suppose in addition that the inter-ring times decompose as in (4.7), where the $Z_i$ and $W_i$ are independent exponential variables. Note that $k, k_1, k_2$ are measurable in $\mathcal Z_\infty$, while the $Z_i$ and $W_i$ are independent of $\mathcal Z_\infty$. Moreover, note that $2k_1 + k_2 = n$, since $N_1(t_i) + N_2(t_i)$ jumps by 2 when $X_1(t_i) = X_2(t_i)$ and jumps by 1 otherwise. Conditioning on the value of $Z_{k_1+1}$, we obtain (4.8). Finally, by (4.7), the density of $t_k$ conditioned on $\mathcal Z_\infty$ is exactly the density $\varphi_1 * \varphi_2$ from Claim 4.3. Thus, if we let $T$ and $Z$ be independent random variables with $T \sim \mathrm{Gamma}(\lfloor n/2\rfloor, 1)$ and $Z \sim \mathrm{exp}(1)$, then by Claim 4.3, and using that $n \ge t$, we obtain (4.9), where $N$ is a $\mathrm{Poisson}(t)$ random variable and the last inequality is a standard estimate on the Poisson distribution. Substituting (4.9) into (4.8) finishes the proof of (4.4).
We can now conclude the proof of Proposition 3.6. The events $A_1(p)$ and $A_2(p)$ are measurable in $\mathcal Z_\infty$. Thus, substituting the bound (4.4) into (4.3), we obtain an estimate in which, in the second inequality, we used Assumption 1 and the fact that $t \le n \le 3t \le |V|^2$. This finishes the proof of the proposition using (4.2).

4.2. Proof of Proposition 3.5.

We first provide an informal overview of the proof. The event $\{X_1(t) = X_2(t)\}$ is partitioned into $O(t\log t)$ events, and Assumption 1 is used for estimating the probability of these events as follows. To estimate the right-hand side of (4.10), we analyze the $\mathbb Z_+^2$-valued process $\{N(s)\}_s = \{(N_1(s), N_2(s))\}_s$ conditioned on $\bar X_1, \bar X_2$, where $\bar X_i$ is the trajectory of $X_i$, namely, $X_i(t) = \bar X_i(N_i(t))$. The process $N(\cdot)$ is a continuous-time simple random walk on a directed graph whose vertex set is $\mathbb Z_+^2$. There are unit Poisson clocks on the edges. The initial state is $(0, 0)$, and the outgoing edges from each state $(z_1, z_2) \in \mathbb Z_+^2$ depend on $\bar X_1, \bar X_2$. For $n \in \mathbb N$, consider the time at which the random walk crosses the line $z_1 + z_2 = n$, denoted $\tau(n) := \min\{s > 0 : N_1(s) + N_2(s) = n\}$, and decompose according to the events $E_1$ and $E_2$. The event $E_1$ depends only on the trajectory of $N(\cdot)$. The probability of $E_2$ conditioned on the trajectory of $N(\cdot)$ is of order $O(t^{-1/2})$, by Lemma 4.2. Therefore, it remains to estimate the probability of $E_1$ conditioned on $\bar X_1, \bar X_2$. Specifically, it remains to show that, for every $t$ large enough (depending on $C_0$), every $x \approx t$ and every $y \in \mathbb Z$, the bound (4.11) holds for some universal constant $\alpha > 0$.
Most of the effort in the proof of Proposition 3.5 is devoted to proving (4.11). Note, first, that (4.11) does not hold for every realization of $\bar X_1, \bar X_2$. For example, if $\bar X_1 = \bar X_2$, then $N_1(s) = N_2(s)$ for every $s$. We identify a class of realizations of $\bar X_1, \bar X_2$ for which (4.11) does hold and call them "good trajectories." The notion of good trajectories is not binary; it is rather a spectrum. The goodness of the trajectories determines the value of $\alpha$: the better the trajectories, the larger an $\alpha$ can be guaranteed. It turns out that, to show that the trajectories are good with high probability, it is sufficient to bound from below the expected proportion of time $c$ in which $X_1(s) \ne X_2(s)$; the greater the lower bound, the greater the "goodness" guarantee. In our proof, we first obtain a naive lower bound on $c$ depending on $C_0$ and use it to obtain an $\alpha$ that depends on $C_0$. Then, from (4.11), we immediately get $c$ arbitrarily close to 1, and with it a universal $\alpha$.
We turn now to the formal proof of Proposition 3.5. Let us first introduce some notation and conventions.
• We work under Assumption 1 with parameter C 0 .
• A pair of vertices $v, u \in V$ induces a probability measure over $(X_1(t), X_2(t))$ by setting $X_1(0) = v$, $X_2(0) = u$. This probability measure is denoted $\mathbb P_{v,u}$ and the respective expectation is denoted $\mathbb E_{v,u}$.
• The (independent) trajectories of $X_1(t)$ and $X_2(t)$ ($t \ge 0$) are denoted $\bar X_1(n)$ and $\bar X_2(n)$ ($n = 0, 1, 2, \ldots$) respectively.
• The (unit) Poisson clock that counts the steps of each $X_i$ is denoted $N_i(t)$. That is, $X_i(t) = \bar X_i(N_i(t))$ ($i = 1, 2$, $t \ge 0$).
• The times at which either one of the chains progresses are denoted $(t_n)_{n=1}^\infty$, i.e., these are the step times of $N_1(t) + N_2(t)$.
• The joint trajectory $Z(n)$ is defined accordingly.
• The $\sigma$-algebra generated by $Z(0), \ldots, Z(n)$ is denoted $\mathcal Z_n$, and the $\sigma$-algebra generated by $\{Z(n)\}_{n=0}^\infty$ is denoted $\mathcal Z_\infty$.
• The set $\{1, \ldots, m\}$ is denoted $[m]$, and $n + [m]$ denotes the set $\{n + 1, \ldots, n + m\}$.

Lemma 4.4. Suppose there exist $\delta \in (0, 1)$ and $m \in \mathbb N$ such that $P^m_{vu} \le 1 - \delta$ for every $v, u \in V$. Then the stated bound holds. In the proof, adding together the two inequalities yields $\mathbb P(Y = 0) \le 1 - \delta$, which completes the proof for the case $v \ne u$; in the case $v = u$, the proof is concluded similarly.

Lemma 4.5. Suppose there exist $m \in \mathbb N$ and $c > 0$ such that the condition above holds for every $v, u \in V$. Then, for every $k \ge m$, every $\varepsilon > 0$ and every $v, u \in V$, the stated concentration bound holds; it follows by Markov's inequality, Azuma's inequality, and Markov's inequality again, which concludes the proof of Lemma 4.5.
Consider the events $A^{i,j}_k := \{y_k(\bar X_1^{\ge i}, \bar X_2^{\ge j}) \le c_1 k\}$. Lemma 4.5 ensures that $\mathbb P(A^{i,j}_k) < e^{-c_2 k}$ for every $i, j \in \mathbb N$. Outside the event $\bigcup_{0\le i,j\le 3t} A^{i,j}_k$, the trajectories $\bar X_1, \bar X_2$ are $(c_1, t, k)$-good; therefore, by the union bound, the conclusion holds for every $t$ large enough depending on $m$ and $\varepsilon$. This concludes the proof of Lemma 4.7.
Corollary 4.8. For every $c \in (0, 1)$ there exists $\alpha = \alpha(c) > 0$ such that for every $n \in \mathbb N$ and every martingale $S_1, \ldots, S_n$ satisfying the conditions below for every $1 \le i \le n$, the stated anticoncentration bound holds.

To proceed, we need some further notation.
Proof of Lemma 4.9. We condition on the event … We first assume that $c > 1/2$ and prove the lemma with $\alpha$ arbitrarily close to $\alpha(2c-1)$ of Corollary 4.8. For simplicity, we assume that $n$ is divisible by $k$. Consider the martingale $S_0, S_1, \dots, S_{n/k}$ defined by $S_i := S(\tau(ki))$.
[If $n$ is not divisible by $k$, then since $n > k^2$, one can find integers $k_1, k_2, \dots, k_{\lfloor n/k \rfloor}$ such that … The definition of $S_i$ is then amended as …] We would like to apply Corollary 4.8. Let us verify its conditions. The increments $|S_{i+1} - S_i|$ are bounded by $k$, since … for every $\ell \in \mathbb N$.
We now need to bound the variance of the increments of $S_i$. The quadratic variation of $S(n)$ between times $\tau(ki)$ and $\tau(ki + \ell)$ is denoted … for every $\ell \in \mathbb N$.
To complete the proof of the lemma, we relax the assumption that $c > 1/2$ and explain how to amend the proof so as to obtain $\alpha$ arbitrarily close to $\alpha(c/\sqrt 2)$. Amend the definition of $S_i$ to be $S_i := S(\tau(2ki))$. As before, it can be assumed that $n$ is divisible by $2k$. Fix $i$ and amend the definition of $\tau$ to be $\tau := \tau(2k(i+1)) - \tau(2ki)$. As before, we get $k \le \tau \le 2k + 1$, and therefore … Consider the normalized martingale $S_i/\sqrt{2k+1}$ and apply Corollary 4.8 to get the desired bound.
The following lemma is a weaker version of Proposition 3.5: here the number $\alpha$ depends on $C_0$ rather than being a universal constant.

Lemma 4.10. There exist numbers $t_0, \alpha > 0$ depending on $C_0$, where $\alpha$ depends on $C_0$ only through the number $c_1$ of Lemma 4.7, such that for any $t \in (t_0, |V|^2/2)$, … Furthermore, the exponent $\alpha$ can be taken arbitrarily close to the corresponding number from Lemma 4.9.

Proof. Let $t > t_0 > 0$, let $k := \lceil (\log t)^2 \rceil$, let $c_1 > 0$ be given by Lemma 4.7, and let $A_t$ be the event that $(\tilde X_1, \tilde X_2)$ are $(c_1, t, k)$-good trajectories. Let … where the last inequality follows from Assumption 1. By Lemma 4.2, there exists a universal constant $C$ such that for every $t > 0$, … Given $n_1, n_2 \in I_t$, … where $\alpha = \alpha(c_1) > 0$ is given by Lemma 4.9 and $C$ is the universal constant given by (4.16). The proof of Lemma 4.10 is concluded by plugging (4.17) into (4.15) for all the values of $n_1, n_2$ in the summation.
We can now show that the condition of Lemma 4.5 holds for every $c \in (0,1)$.

Lemma 4.11. For every $c \in (0,1)$, there exists $m$ (depending on $C_0$ and $c$) such that …

Proof of Lemma 4.11. Fix $v, u \in V$ and $c \in (0,1)$. Let $A := \{t \ge 0 : X_1(t) = X_2(t)\}$. By Lemma 4.10, there exist $\alpha > 0$ and $t_0$ (depending on $C_0$) such that … Let $S_{m+1}$ be the sum of $m+1$ independent unit exponential random variables. Since $S_{m+1}$ stochastically dominates $t_{m+1}$ (indeed, $S_{m+1}$ can be realized as the time of the $(m+1)$-th jump of $X_1(t)$), … for any $m$ large enough depending on $C_0$ and $c$.
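The domination step can be made quantitative with a standard Chernoff bound for $S_{m+1}$ (this computation is ours, included only for orientation): for $0 < \lambda < 1$ one has $\mathbb E\, e^{\lambda S_{m+1}} = (1-\lambda)^{-(m+1)}$, hence

```latex
\mathbb{P}(S_{m+1} > t)
  \;\le\; e^{-\lambda t}\,(1-\lambda)^{-(m+1)}
  \;\overset{\lambda = 1/2}{=}\; 2^{\,m+1}\, e^{-t/2},
```

so the $(m+1)$-th jump time of $X_1(t)$ exceeds $t$ with probability exponentially small in $t$ once $t \gg m$.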
Proposition 3.5 follows from Claim 4.1 and the following lemma.
Lemma 4.12. There exist a universal constant $\alpha > 0$ and a number $t_0 > 0$ depending on $C_0$ such that for any $t \in (t_0, |V|^2/2)$, …

Proof of Lemma 4.12. Lemma 4.11 ensures that there exists $m$ depending on $C_0$ such that … as long as $|V|^2 \ge 4m$. Plugging this into Lemma 4.5 provides constants $c_1, c_2 > 0$ such that … for every $k$ large enough depending on $C_0$ (e.g., $k \ge m^2$). The proof of Lemma 4.12 is concluded by applying Lemma 4.10 with the constant $c_1$.
Remark 4.13. In the proof of Lemma 4.12, we could have chosen $c_1$ arbitrarily close to $1$. Therefore, Lemma 4.12 and Proposition 3.5 hold with respect to any $\alpha < \alpha^* := \sup_{c \in (0,1)} \alpha(c/\sqrt 2)$, where $\alpha(c)$ is given by Corollary 4.8. The result of [13] provides some $\alpha^* < \frac{1}{2}$. We conjecture that the true value of $\alpha^*$ is $\frac{1}{2}$. An indication of this is a result in [2] (see Corollary 1.1 and equation (1.10) therein), which shows that Corollary 4.8 holds, under the stricter assumption that $|S_i - S_{i-1}| \le r$, with $\lim_{c \nearrow 1} \alpha(c) = \frac{1}{2}$, at the cost of a constant $C(r)$. It is not shown in [2] how $C(r)$ depends on $r$. We believe that proving that $C(r)$ is polynomial in $r$ should be sufficient for concluding that any $\alpha \in \left(0, \frac{1}{2}\right)$ can be used in Proposition 3.5.

4.3. Proof of Proposition 3.7.

Throughout the proof we think of $C_0$, $C_1$ and $\alpha$ as fixed and allow the constants $C$ and $c$ to depend on them. Let $X_1(t), \dots, X_d(t)$ be the Markov chains of Claim 4.1. By the assumption of the proposition and Claim 4.1 we have, for all $i < j \le d$, that … For $i < j \le d$ define the set of times $A_{i,j} := \{t > 0 : X_i(t) = X_j(t)\}$.
We start with the following lemma.

Lemma 4.14. For all $k$ sufficiently large (depending on $C_0, C_1, \alpha$) we have … where $|A_{i,j} \cap [0,t]|$ denotes the Lebesgue measure of the set $A_{i,j} \cap [0,t]$.
Proof. Let $i \ne j$ and $k \in \mathbb N$. We start by bounding the $k$-th moment of $|A_{i,j} \cap [0,t]|$. We have that … where the last equality is by symmetry. Now, by Proposition 3.5, for all $0 = s_0 \le s_1 \le \dots \le s_k$ we have that … Thus, we obtain that … By Markov's inequality, we have, for all $k \ge 1$, … where the last inequality holds as long as $C_2$ is sufficiently large. This completes the proof of the lemma.
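The application of Markov's inequality here is the usual $k$-th moment form (standard; stated only to fix the shape of the bound):

```latex
\mathbb{P}\bigl(|A_{i,j} \cap [0,t]| \ge x\bigr)
  \;\le\; \frac{\mathbb{E}\,|A_{i,j} \cap [0,t]|^{k}}{x^{k}},
  \qquad x > 0,\ k \ge 1.
```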
Let $\tilde X_i$ be the trajectory of $X_i$. Thus, as long as $t$ is sufficiently large (depending on $C_0, C_1, \alpha$), by Lemma 4.14 and a union bound we have … where in the last inequality we used that $d \le t^{\alpha/10}$. Thus, by Markov's inequality, … Let $P_t$ be the set of infinite random-walk trajectories $(p_1, \dots, p_d)$ for which … We have that
$$\mathbb P\bigl((\tilde X_1, \dots, \tilde X_d) \in P_t\bigr) \ge 1 - t^{-d}. \tag{4.18}$$

Lemma 4.15. Let $(p_1, \dots, p_d) \in P_t$ and recall that $N_i(t)$ is the number of steps taken by the walk $X_i$ up to time $t$. For any $n_1, \dots, n_d$ we have that …

Proof. Let $\tilde N_i$ for $i \le d$ be Poisson processes independent of each other and independent of $\tilde X_i$ for all $i$. We couple $N_i$ and $\tilde N_i$ in the following way. For any $s \notin A$ we let $N_i(s)$ jump whenever $\tilde N_i(s)$ jumps, and for $s \in A$ we let $N_i$ jump according to Poisson clocks on the vertices that are independent of $\tilde N_i$ and $\tilde X_i$ for $i \le d$. It is clear that the processes $N_i$ defined in this way have the right law. Moreover, for all $i \ne j$ the Poisson processes $\tilde N_i$ and $N_j$ are independent.
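The coupling of $N_i$ and $\tilde N_i$ can be sketched in code. In the sketch below, `A` is a fixed finite union of intervals standing in for the (random) collision set, and all names are our own illustrative choices; by thinning and superposition, the process `N` built this way is again a unit-rate Poisson process that agrees with `Ntilde` outside `A`.

```python
import random

def coupled_clocks(A, t_max, rng):
    """Couple two unit-rate Poisson processes N and Ntilde on [0, t_max]:
    outside the intervals in A they ring together; inside A, the rings of N
    come from an independent clock instead."""
    def in_A(s):
        return any(a <= s < b for a, b in A)

    def unit_poisson_rings():
        rings, t = [], 0.0
        while True:
            t += rng.expovariate(1.0)
            if t > t_max:
                return rings
            rings.append(t)

    shared = unit_poisson_rings()   # drives Ntilde everywhere, and N outside A
    extra = unit_poisson_rings()    # independent clock driving N inside A
    N_rings = sorted([s for s in shared if not in_A(s)]
                     + [s for s in extra if in_A(s)])
    Ntilde_rings = shared
    return N_rings, Ntilde_rings

rng = random.Random(2)
N, Nt = coupled_clocks([], 5.0, rng)              # A empty: clocks agree
N2, Nt2 = coupled_clocks([(0.0, 5.0)], 5.0, rng)  # A covers everything
```

When `A` is empty the two processes coincide ring-for-ring, and in general they differ only on `A`, which is the point used in the Freedman step below.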
Next, note that the process $N_i(s) - \tilde N_i(s)$ is a martingale and its predictable quadratic variation is $|A \cap [0,s]|$. Thus, by Freedman's inequality (see [25, Lemma 2.1]) we have that … Therefore, by the definition of $P_t$ we have … It follows that … where in the second-to-last inequality we used that $\mathbb P(\mathrm{Poisson}(t) = n) \le 1/(2\sqrt t)$ for all sufficiently large $t$ and all $n \in \mathbb N$. This finishes the proof of the lemma.
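We record a common formulation of Freedman's inequality (a standard statement; the constants in [25, Lemma 2.1] may be arranged differently): if $(M_s)$ is a martingale with jumps bounded by $R$ and predictable quadratic variation $\langle M \rangle_s$, then for all $x, \sigma^2 > 0$,

```latex
\mathbb{P}\bigl(\exists\, s \le t :\ M_s \ge x
      \ \text{and}\ \langle M \rangle_s \le \sigma^2\bigr)
  \;\le\; \exp\!\Bigl(-\frac{x^2}{2(\sigma^2 + Rx)}\Bigr).
```

Here the jumps of $N_i(s) - \tilde N_i(s)$ are bounded by $1$ and the quadratic variation $|A \cap [0,s]|$ is small on $P_t$, which yields the desired concentration.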
In what follows it is slightly easier to work with finite trajectories rather than infinite trajectories. We let $P'_t$ be the set of trajectories $(p'_1, \dots, p'_d)$ of length $\lfloor 2t \rfloor$ that can be extended to trajectories $(p_1, \dots, p_d) \in P_t$. We claim that, by Lemma 4.15, for all $(p_1, \dots, p_d) \in P'_t$ and all $n_1, \dots, n_d \le 2t$ we have that … where here $\tilde X_i = p_i$ is shorthand for $\tilde X_i(n) = p_i(n)$ for all $n \le 2t$. Indeed, the event $\{\forall i,\ N_i(t) = n_i\}$ is independent of the trajectories $\tilde X_i$ after $2t$ steps and therefore, when estimating the conditional probability, these trajectories can be extended arbitrarily.
We are now ready to prove Proposition 3.7.
Proof of Proposition 3.7. Let $I_t := \left[t - t^{1/2 + \alpha/18},\ t + t^{1/2 + \alpha/18}\right]$ and note that … Thus, by (4.18) and Lemma 4.15 we have that … where in the fourth inequality we used Assumption 1 and the fact that the $\tilde X_i$ are independent. This finishes the proof of the proposition using Claim 4.1.

5. IID Initial Opinions
In this section we work under the assumption that the initial opinions are bounded i.i.d. random variables. We note that the interval $[0,1]$ in Assumption 2 can be replaced with any other bounded interval; indeed, this follows by re-scaling and shifting all the opinions. Furthermore, even compact support is not crucial: most of our arguments work for distributions with a finite exponential moment, such as the normal and exponential distributions. For simplicity, we do not work in the most general setting.
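The rescaling claim is simply that averaging commutes with affine maps. If the initial opinions are supported on $[a,b]$, set

```latex
g_t(v) \;:=\; \frac{f_t(v) - a}{b - a} \;\in\; [0,1];
```

since every update replaces an opinion by an average of neighboring opinions, $(g_t)$ evolves by the same asynchronous dynamics (with the same clocks) and has i.i.d. initial opinions supported on $[0,1]$ with mean $(\mu - a)/(b - a)$.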
Under Assumption 2, we derive the next two theorems regarding convergence to consensus on finite and infinite networks, respectively.

Theorem 5.1. Suppose that Assumptions 1 and 2 hold and that $|V| = n$ is sufficiently large (depending on $C_0$). Then, there exist universal constants $C, c > 0$ such that for any $\varepsilon \ge n^{-c}$ we have …

Theorem 5.2. If $V$ is infinite and Assumptions 1 and 2 hold, then $\lim_{t \to \infty} f_t(v) = \mu$ almost surely, for every $v \in V$.

Remark 5.3. Theorem 1.4 follows from Theorem 5.2 using the fact that a simple random walk on an infinite connected graph of bounded degree satisfies Assumption 1 (see Remark 3.2).
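To illustrate the regime described by Theorem 5.1, here is a minimal simulation of the asynchronous dynamics (purely illustrative: the cycle, the horizon, and the seed are arbitrary choices of ours, and this simulation plays no role in the proofs).

```python
import random

def async_degroot(adj, f0, t_max, rng):
    """Asynchronous DeGroot dynamics: every vertex carries an independent
    unit-rate Poisson clock; when v's clock rings, f(v) is replaced by the
    average opinion of v's neighbours."""
    f = dict(f0)
    verts = list(adj)
    n = len(verts)
    t = 0.0
    while True:
        t += rng.expovariate(n)      # superposition of the n unit clocks
        if t > t_max:
            break
        v = rng.choice(verts)        # the ringing vertex is uniform
        f[v] = sum(f[u] for u in adj[v]) / len(adj[v])
    return f

rng = random.Random(1)
n = 30
adj = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}   # n-cycle
f0 = {i: rng.random() for i in range(n)}                  # i.i.d. Uniform[0,1]
f = async_degroot(adj, f0, t_max=300.0, rng=rng)
# Each update is an average, so max f is non-increasing and min f is
# non-decreasing: the spread can only shrink over time.
spread0 = max(f0.values()) - min(f0.values())
spread = max(f.values()) - min(f.values())
```

Note that, unlike in synchronous averaging models, the (weighted) average opinion is not preserved here, which is exactly the feature discussed in the introduction.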
Next, we explain how Theorem 1.3 follows from Theorem 5.1. Let $C, c$ be the constants from Theorem 5.1 and let $t_1 := (\varepsilon^{-1} \log n)^C$. Suppose first that $\varepsilon \ge n^{-c}$ and note that by Theorem 5.1 we have $\mathbb P(\tau_\varepsilon \ge t_1) \le e^{-\log^2 n}$. Next, observe that at time $t_1$ all the opinions are in $[0,1]$ and therefore, on the event $\{\tau_\varepsilon \ge t_1\}$, we can use either Theorem 1.2 in the undirected case or Corollary 2.6 in the Eulerian directed case to bound the expected additional time required after $t_1$ to reach an $\varepsilon$-consensus by $n^4 \log(1/\varepsilon)$, for all sufficiently large $n$. We get, for some universal constant $C'$, … where $\mathcal G_t$ is the sigma-algebra generated by the process up to time $t$. Next, suppose that $\varepsilon \le n^{-c}$. In this case Theorem 1.3 follows immediately from Theorem 1.2 and Corollary 2.6.

Both Theorem 5.1 and Theorem 5.2 rely on the following crucial lemma, which estimates the rate at which the distribution of the opinions concentrates around $\mu$.
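The chain of estimates behind the derivation of Theorem 1.3 above can be sketched as follows (a reconstruction in spirit, with $C'$ a universal constant):

```latex
\mathbb{E}[\tau_\varepsilon]
 \;\le\; t_1 + \mathbb{E}\Bigl[\mathbf 1_{\{\tau_\varepsilon \ge t_1\}}
        \,\mathbb{E}\bigl[\tau_\varepsilon - t_1 \,\big|\, \mathcal G_{t_1}\bigr]\Bigr]
 \;\le\; t_1 + n^4 \log(1/\varepsilon)\,\mathbb{P}(\tau_\varepsilon \ge t_1)
 \;\le\; (\varepsilon^{-1}\log n)^{C} + n^4 \log(1/\varepsilon)\, e^{-\log^2 n}
 \;\le\; (\varepsilon^{-1}\log n)^{C'},
```

where the last step uses that $e^{-\log^2 n} = n^{-\log n}$ decays faster than any polynomial in $n$.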
Lemma 5.4. Under Assumptions 1 and 2, there exist $t_0 > 0$ (depending on $C_0$) and a universal constant $\beta > 0$ such that for every $t \in (t_0, |V|^2/3)$, every $\varepsilon > 0$, and every $o \in V$, …

Proof of Lemma 5.4. Without loss of generality, suppose that $\varepsilon < 1/2$. Let $o \in V$ and consider the fragmentation process $m_t(v)$ originating from $o$. Recall that … Note that the fragmentation process is independent of the initial opinions.
Let $\mathcal F$ be the sigma-algebra generated by the fragmentation process. On the event $A := \{\forall v,\ m_t(v) \le a\} \in \mathcal F$ we have that … and therefore, by Azuma's inequality, on this event we have … Thus, … as needed.

Proof of Theorem 5.1. Let $\beta$ be the universal constant from Lemma 5.4 and let $\varepsilon \ge n^{-\beta/4}$. By Lemma 5.4 and a union bound we obtain $\mathbb P(\tau_\varepsilon > t) \le \mathbb P\bigl(\exists\, o \in V,\ |f_t(o) - \mu| \ge \varepsilon/2\bigr) \le 2n \exp(-\varepsilon^2 t^{\beta}/4)$, for any $t \in (t_0, n^2/3)$, where $t_0$ depends only on $C_0$. Letting $t_1 := (\varepsilon^{-1} \log n)^{4/\beta} \le n^2/3$ we have $\mathbb P(\tau_\varepsilon > t_1) \le e^{-\log^2 n}$.

We obtain that there exists $C_3 > 0$ such that the event … holds with probability at least $1/2$. Next, note that conditioning on the fragmentation process we have that … Thus, letting $\mathcal F$ be the sigma-algebra generated by the fragmentation process and using the fact that the tail $\mathbb P(X > x)$ for $X \sim N(0, \sigma^2)$ increases with $\sigma$, we obtain that on $A$, … This finishes the proof of the lemma using that $\mathbb P(A) \ge 1/2$.
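The Azuma step in the proof of Lemma 5.4 can be spelled out (a reconstruction of the omitted display, assuming as usual that the fragmentation weights satisfy $\sum_v m_t(v) = 1$): conditionally on $\mathcal F$, the centered opinion is a weighted sum of independent bounded variables, so

```latex
\mathbb{P}\Bigl(\Bigl|\sum_v m_t(v)\bigl(f_0(v) - \mu\bigr)\Bigr| \ge \varepsilon
      \;\Big|\; \mathcal F\Bigr)
  \;\le\; 2\exp\!\Bigl(-\frac{\varepsilon^2}{2\sum_v m_t(v)^2}\Bigr),
```

and on the event $A = \{\forall v,\ m_t(v) \le a\}$ we have $\sum_v m_t(v)^2 \le a \sum_v m_t(v) = a$, giving the bound $2e^{-\varepsilon^2/(2a)}$.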
We can now prove Claim 6.3.
Proof of Claim 6.3. Recall the definition of the fragmentation process $m^t_s(v,u)$ given in the discussion before equation (3.2). This process uses the clocks on the vertices in the DeGroot dynamics going backward in time from $t$ to $0$. Moreover, recall that in this coupling we have the identity $f_t(v) = \sum_u m^t_t(v,u) f_0(u)$. Let $t_1 := \delta \varepsilon^{-4} \log^2 n$, where $\delta > 0$ is sufficiently small (independently of $\varepsilon$ and $n$) and will be determined later. For a vertex $v$ in the cycle consider the event
$$B_v := \bigl\{\text{for all } t \le t_1 \text{ and any } u \in V \text{ with } d(u,v) \ge 2t_1 \text{ we have } m^{t_1}_t(v,u) = 0\bigr\}.$$
Note that the event $B_v$ depends only on the Poisson clocks of the vertices at distance at most $2t_1$ from $v$. Moreover, the complement of the event $B_v$ occurs only if there is a decreasing sequence of rings on $\lfloor 2t_1 \rfloor$ consecutive vertices along the cycle starting from $v$. It follows that $\mathbb P(B_v) \ge 1 - Ce^{-ct_1}$. Define $C_v := B_v \cap \{f_{t_1}(v) \ge \varepsilon\}$ and note that by Lemma 6.4 we have … where the last inequality holds as long as $\delta$ is sufficiently small. An important observation is that the event $C_v$ depends only on the initial opinions and the Poisson clocks of the vertices that are at distance at most $2t_1$ from $v$.
Let $k := \lfloor \sqrt n \rfloor - 1$ and let $v_1, \dots, v_k$ be a sequence of (roughly equally spaced) vertices along the cycle such that $d(v_i, v_j) \ge \sqrt n$ for all $i \ne j$. Using that $\varepsilon \ge n^{-1/9}$ we get that $4t_1 \le \sqrt n$ and therefore the $C_{v_i}$ are mutually independent. Thus, by (6.2) we have that … By symmetry we have $\mathbb P\bigl(\exists\, v,\ f_{t_1}(v) \le -\varepsilon\bigr) \ge 3/4$ and therefore $\mathbb P(\tau_\varepsilon \ge t_1) \ge 1/2$. This finishes the proof of the claim.
Sketch of proof. The graph is defined as follows. Let $S_1$ be a star graph with $n$ leaves and center $v_1$, and let $S_2$ be an identical copy of $S_1$ with center $v_2$. The centers $v_1$ and $v_2$ are connected by a path of length $n$. Let $t = \delta n^2$, where $\delta > 0$ is a constant that will be determined later. We claim that … Indeed, by (3.1) and … where $\kappa(\delta) > 0$ satisfies $\kappa(\delta) \to 0$ as $\delta \to 0$. The last inequality holds as long as $\delta$ is sufficiently small, using that $\pi_{v_1} = \frac{n+1}{3n} > \frac{1}{3}$. Similarly, … From the above claim, it follows that $\mathbb P\bigl(|f_t(v_1) - f_t(v_2)| > 1/4\bigr) > c$, for some constant $c > 0$.

In both of these examples the initial opinions $f_0$ will be bounded (in $[-1,1]$) and therefore $f_t$ as well. It follows that, for every $v \in V$, convergence of $f_t(v)$ in probability is equivalent to convergence in $L^1$ and $L^2$. Hence, it is sufficient to find examples for which there exists $v \in V$ such that $\mathbb E[f_t(v)]$ does not converge. By the definitions of the DeGroot dynamics (3.3) and the fragmentation process (3.1), $\mathbb E[f_t(v) \mid \text{initial opinions}] = \sum_{u \in V} P^t_{v,u} f_0(u)$, where $P^t$ is the time-$t$ transition matrix of a continuous-time Markov chain on $V$.

Claim 6.6. There exists a network with bounded i.i.d. initial opinions and a vertex $v$ such that the opinion of $v$ diverges in probability.
Proof. The matrix $P$ is the transition matrix of a simple random walk on a graph $G = (V, E)$ constructed as follows. The graph contains an infinite one-sided line. Each vertex $v_i$ on the line ($i \in \mathbb N$) is further connected to a set $L_i$ of $N_i = |L_i|$ leaves.
The numbers $N_1, N_2, \dots$ are defined recursively, together with times $t_1 < t_2 < \dots$, such that the following property holds. Let $X(t)$ be the continuous-time random walk on $G$ originating at $v_1$. Define hitting times $T_i := \inf\{t : X(t) = v_i\}$ and events … The following property should hold: … where $\delta > 0$ is a constant that will be fixed later.

Assumption 2 (i.i.d. initial opinions). The initial opinions are i.i.d. random variables with expectation $\mu$ and distribution supported on $[0,1]$.

6.3. Examples of non-convergence of opinions.

In this section we show that Assumptions 1 and 2 in Theorem 5.2 are crucial. We remark that our examples apply to the synchronous DeGroot model as well. We construct two examples: the first violates only Assumption 1 and the second violates only Assumption 2.
See, e.g., [3, Lemma B.0.2] or [20, Lemma 3.4]. In fact, these lemmas apply to a more general class of reversible Markov chains. Another class of Markov chains that satisfy Assumption 1 is lazy random walks on finite directed Eulerian graphs that are either regular or have bounded degree. See, e.g., [3, Lemma 2.4].