On parallel time in population protocols

The parallel time of a population protocol is defined as the average number of required interactions in which an agent in the protocol participates, i.e., the quotient between the total number of interactions required by the protocol and the total number n of agents, or just roughly the number of required rounds, where a round stands for a sequence of n consecutive interactions. This naming triggers an intuition that at least the expected number of parallel steps sufficient to implement a round is $O(1)$. In a single parallel step only mutually independent interactions can be involved. We show that when the transition function of a population protocol is treated as a black box then the expected maximum number of parallel steps necessary to implement a round is $\Omega(\log n / \log\log n)$. We also provide a combinatorial argument for a matching upper bound on the expected number of parallel steps under additional assumptions. Further, we extend these bounds by showing that the situation changes dramatically for sequences of $m = \Omega(n \log n)$ interactions. Then, the expected number of parallel steps required to implement such sequences is $\Theta(m/n)$ under the aforementioned additional assumptions. Thus, it asymptotically coincides with the notion of parallel time, i.e., $O(m/n)$, for sequences of interactions produced by protocols solving any non-trivial problems requiring $\Omega(n \log n)$ interactions.


Introduction
In this paper we consider the model of probabilistic population protocols. It was originally intended to model large systems of agents with limited resources [4]. In this model, the agents are prompted to interact with one another towards a solution of a common task. The execution of a protocol in this model is a sequence of pairwise interactions between agents chosen uniformly at random [4,6,10]. During an interaction, each of the two agents, called the initiator and the responder (the asymmetry assumed in [4]), updates its state in response to the observed state of the other agent following the predefined (global) transition function. The efficiency of population protocols is typically expressed in terms of the number of states used by agents and the number of interactions required by solutions (e.g., with high probability (w.h.p.) or in expectation). There is a vast literature on population protocols, especially for such basic problems as majority and leader election [3,6,7,10,12,14].
In general, parallel time has various definitions depending on the system and application, e.g., see [2,15]. In the literature on population protocols [6,10,12], the concept of parallel time as the number of required interactions divided by the number n of agents is widespread. In other words, one divides the sequence of interactions in an execution of a population protocol into consecutive subsequences of n interactions called rounds. Then one estimates the expected number of required rounds or the number of required rounds w.h.p. Population protocols for any non-trivial problem require $\Omega(n \log n)$ interactions [10]. Hence, the expressions resulting from dividing bounds on the number of interactions by n are not only simpler but also more focused on the essentials. Fast population protocols are commonly identified with those having poly-logarithmic parallel time.
When the transition function is a black box, an interaction depends on an earlier interaction in a given sequence of interactions if the two interactions share at least one agent. One can implement a group of interactions from the sequence in a single parallel step provided that no interaction in the group depends on another interaction in the group and all interactions in the sequence on which the interactions in the group depend have been already implemented.
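The grouping rule above can be sketched in a few lines of Python. This is only an illustration under the black-box assumption, not an implementation taken from the paper; the function name `parallel_levels` and the representation of agents as integers are assumptions made for this example. Each interaction is placed in the earliest parallel step that follows the latest earlier interaction of either of its two agents.

```python
from typing import Dict, List, Tuple

Interaction = Tuple[int, int]  # (initiator, responder); agents are numbered from 0


def parallel_levels(interactions: List[Interaction]) -> List[List[int]]:
    """Group interaction indices into parallel steps.

    Interactions in the same group share no agent, and every interaction
    on which a group member depends lies in an earlier group.
    """
    last_level: Dict[int, int] = {}  # agent -> latest step in which it took part
    levels: List[List[int]] = []
    for idx, (a, b) in enumerate(interactions):
        # Wait for the latest earlier interaction of either participant.
        level = max(last_level.get(a, -1), last_level.get(b, -1)) + 1
        if level == len(levels):
            levels.append([])
        levels[level].append(idx)
        last_level[a] = last_level[b] = level
    return levels


# Two disjoint pairs fit into a single parallel step.
print(parallel_levels([(0, 1), (2, 3)]))
# Consecutive interactions sharing an agent cannot be parallelized.
print(parallel_levels([(0, 1), (1, 2), (2, 3)]))
```

The number of returned groups is the number of parallel steps used for the given sequence. Note that the sketch reads the whole sequence in order, so it presupposes that the sequence is known in advance.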
Clearly, the average number of interactions in which an agent takes part is a lower bound on the number of parallel steps when the transition function of a population protocol is a black box. However, calling this trivial lower bound parallel time may mislead readers not familiar with or not recalling the definition. They may start to believe that by the random choice of a pair of agents for each interaction in a sequence of at least n interactions, there should be a lot of independent interactions in the sequence that could be implemented in parallel. Consequently, they could believe that the whole protocol could be implemented in parallel in time proportional to the number of rounds. The main result of this note is that this intuition is too optimistic for sequences of $O(n)$ interactions, though it is asymptotically correct for sequences of $\Omega(n \log n)$ interactions.
It is obvious that one can construct a round (i.e., a sequence of n interactions) that requires n parallel steps when the transition function of a population protocol is treated as a black box. We show that the expected maximum length of a dependency chain of interactions in a single round is $\Theta(\log n / \log\log n)$. The lower bound implies that when the transition function is treated as a black box and the update of the states of interacting agents requires $\Omega(1)$ steps then the expected maximum number of parallel steps necessary to implement a round is $\Omega(\log n / \log\log n)$.
The upper bound opens up the possibility of a matching, fast parallel implementation of a single round in expectation under additional assumptions. In particular, the parallel implementation relies on a decomposition of a dependency directed acyclic graph (DAG) whose vertices correspond to the interactions in the round into $O(\log n / \log\log n)$ levels of independent vertices (i.e., interactions) having the same maximum distance to the source vertices, i.e., vertices of indegree 0. We complement this analysis and show that the asymptotic discrepancy between the parallel time for a sequence of interactions (i.e., its length divided by n) and the expected number of parallel steps necessary to implement the sequence disappears for longer sequences. Namely, we prove that the expected maximum length of a dependency chain in a sequence of $m = \Omega(n \log n)$ interactions is $\Theta(m/n)$.

A lower bound on expected number of parallel steps required by a round
For a sequence S of interactions, we shall consider the dependency directed acyclic graph (DAG) D(S), where vertices correspond to interactions in the sequence and two vertices v, u are connected by the directed edge (v, u) if and only if the interaction corresponding to v precedes the interaction corresponding to u and the two interactions share at least one agent.
In each round R (of n interactions), there are 2n participant slots, so on average each agent participates in two interactions in R. Hence, the expected number of edges of the dependency DAG D(R) is at least linear. They can form long directed chains excluding the possibility of an efficient implementation of the round in parallel.

Remark 1.
There is a round R such that the dependency DAG D(R) includes a directed path of length n − 1 (i.e., it has depth ≥ n − 1). In consequence, any implementation of the round (when the transition function is treated as a black box and the update of the states of interacting agents takes one step) requires n parallel steps.
Proof. It is sufficient to let the i-th agent participate in the i-th and (i + 1)-th interactions for i ≤ n − 1. The dependency DAG of the round so specified includes a directed path of length n − 1.
Of course, the round specified in Remark 1 yielding a dependency chain of linear length is highly unlikely. However, the expected maximum length of a dependency chain is at least almost logarithmic in n.
Proof. Consider a sequence S of n pairwise interactions between the n agents picked uniformly at random. We shall show that the expected maximum number of interactions in S in which a single agent participates is $\Omega(\log n / \log\log n)$.
We have 2n balls, where for any k ∈ [n], the balls numbered 2k − 1 and 2k correspond to the k-th interaction in S, and n bins are in one-to-one correspondence with the n agents. Allocating the balls numbered 2k − 1, 2k into two distinct bins A and B, respectively, specifies the interaction between the agents corresponding to the bins A and B. To guarantee that the bins A and B are distinct, the ball numbered 2k cannot be allocated to the bin A. Since the pairwise interactions are performed between the n agents picked uniformly at random, the destinations of the balls are random with the aforementioned restriction on the even balls. Therefore, by [11], the expected maximum load of a bin with just odd balls in our model is governed by the inverse $\Gamma^{(-1)}$ of Euler's gamma function, which is known to satisfy $\Gamma^{(-1)}(n) = \frac{\log n}{\log\log n}(1 + o(1))$.
Hence, in expectation, there is an agent involved in at least $\frac{\log n}{\log\log n}(1 + o(1))$ interactions.
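The quantity bounded in this proof is easy to observe empirically. The following sketch, which is not part of the paper, draws random rounds and reports the largest number of interactions of a single agent next to the reference value log n / log log n. The function name `max_participation`, the seed, and the chosen values of n are assumptions made for illustration, and constant factors are not expected to match for such small n.

```python
import math
import random
from collections import Counter


def max_participation(n: int, rng: random.Random) -> int:
    """Largest number of interactions of one agent in a random round of n interactions."""
    counts: Counter = Counter()
    for _ in range(n):
        a, b = rng.sample(range(n), 2)  # two distinct agents, uniformly at random
        counts[a] += 1
        counts[b] += 1
    return max(counts.values())


rng = random.Random(0)  # fixed seed, only for reproducibility
for n in (10**2, 10**3, 10**4):
    trials = [max_participation(n, rng) for _ in range(3)]
    reference = math.log(n) / math.log(math.log(n))
    print(n, sum(trials) / len(trials), round(reference, 2))
```

Since every dependency chain must contain all interactions of a fixed agent, the printed averages are also empirical lower bounds on the depth of the dependency DAG of the sampled rounds.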

An upper bound on expected maximum length of a dependency chain in a round
The bound in Theorem 1 follows from the fact that one expects that at least one agent will be involved in $\Omega(\log n / \log\log n)$ interactions, which immediately implies that the expected maximum length of a directed path in the dependency DAG of a round is $\Omega(\log n / \log\log n)$. However, if one considers concurrently more agents, then perhaps the expected maximum length of a directed path in the dependency DAG can be significantly larger, that is $\omega(\log n / \log\log n)$?
In this section, we prove that this is not the case, implying that the lower bound in Theorem 1 is asymptotically tight. In order to derive our upper bound on the expected maximum length of a directed path in the dependency DAG of a round, we shall identify interactions with labeled edges in $K_n$. To model directed paths in the dependency DAG of a round, we need the following concept.
We will consider labeled undirected multigraphs, where each edge has a unique label. An interference path of length k in such a multigraph is any sequence of edges $e_1, \dots, e_k$ such that $e_i \cap e_{i+1} \neq \emptyset$ for every $1 \le i < k$, i.e., consecutive edges share at least one endpoint. We say an interference path is monotone if the labels on the interference path form a strictly increasing sequence.

Theorem 2. Let c be an arbitrary positive constant and let n be a sufficiently large integer. Consider the process of selecting n edges labeled 1, . . . , n in $K_n$ independently and uniformly at random. Then, for $k = (3+c) \frac{\log n}{\log\log n}$, with probability at least $1 - \frac{1}{n^c}$, the obtained multigraph has no monotone interference path of length k.
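Proof. The proof is by simple counting arguments. Let $G$ be the (random) multigraph constructed by our process. $G$ has $n$ vertices, $n$ edges (possibly with repetitions), and each edge has a distinct label from $[n]$.
Let $IP_k$ denote the set of all pairs $\langle e_1, \dots, e_k; L \rangle$, where $e_1, \dots, e_k$ is an interference path of length $k$ in $K_n$ and $L \subseteq [n]$ is a set of $k$ labels. The meaning here is that $\langle e_1, \dots, e_k; L \rangle$ corresponds to the interference path with edges $e_1, \dots, e_k$ and with labels such that $e_i$ has label equal to the $i$-th smallest element from $L$.
Let us observe that
$$|IP_k| \le \binom{n}{2} \cdot (2n)^{k-1} \cdot \binom{n}{k}. \qquad (1)$$
Indeed, we can choose any of the $\binom{n}{2}$ edges of $K_n$ as $e_1$, each subsequent edge has to share an endpoint with its predecessor, which leaves fewer than $2n$ choices, and there are $\binom{n}{k}$ choices for $L$.
Let us take an arbitrary interference path $P = \langle e_1, \dots, e_k; L \rangle \in IP_k$ and let $\chi_i$ denote the $i$-th smallest element of $L$. For $P$ to exist in $G$, for every $1 \le i \le k$, the process must have chosen edge $e_i$ in step $\chi_i$ of the algorithm. The probability for that to happen is equal to $1/\binom{n}{2}$ for every $1 \le i \le k$. These events are independent for different $i$, and therefore if we let $X_P$ be the indicator random variable that $P$ is a monotone interference path in $G$, then (for $n \ge 2$)
$$\mathbb{E}[X_P] = \binom{n}{2}^{-k} \le \left(\frac{4}{n^2}\right)^{k}. \qquad (2)$$
Let $E_k$ be the random event that $G$ has a monotone interference path of length $k$. By the inequalities (1) and (2), and by the union bound, we obtain
$$\Pr[E_k] \le \sum_{P \in IP_k} \mathbb{E}[X_P] \le |IP_k| \cdot \left(\frac{4}{n^2}\right)^{k} \le \frac{n^2}{2} \cdot (2n)^{k-1} \cdot \frac{n^k}{k!} \cdot \frac{4^k}{n^{2k}} \le \frac{8^k \cdot n}{k!}. \qquad (3)$$
Finally, we use the fact that the inverse of Euler's gamma function satisfies $\Gamma^{(-1)}(N) = \frac{\log N}{\log\log N}(1 + o(1))$. For $k = (3+c)\frac{\log n}{\log\log n}$ we then obtain $k \ge \Gamma^{(-1)}(n^{c+2})$, which implies $k! \ge n^{c+2}$ by straightforward calculations, and $8^k \cdot n = o(n^2)$. Consequently, we have
$$\Pr[E_k] \le \frac{8^k \cdot n}{k!} \le \frac{o(n^2)}{n^{c+2}} \le n^{-c}.$$
Since $E_k$ is the event that $G$ has a monotone interference path of length $k$, the bound above implies that with probability at least $1 - n^{-c}$ the random labeled multigraph $G$ has no monotone interference path of length $k = (c+3) \cdot \frac{\log n}{\log\log n} = \Theta\left(\frac{\log n}{\log\log n}\right)$.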
Note that each directed path of length k in the dependency DAG of a round corresponds to a monotone interference path of length k in the multigraph in Theorem 2. Hence, we obtain the following corollary from Theorem 2.
For i = 0, 1, 2, . . . , let the i-th level of the DAG denote the set of its vertices (i.e., interactions) whose maximum distance to a source vertex (i.e., a vertex of indegree 0) is i.
It follows that the expected number of levels is $O(\log n / \log\log n)$.
Consequently, if the decomposition of the DAG into its levels is given and the update of the states of interacting agents takes $O(1)$ steps then the expected number of parallel steps required to implement a round is $O(\log n / \log\log n)$.
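The passage from the high-probability statement of Theorem 2 to a bound in expectation can be spelled out as follows. This is a sketch rather than the authors' own derivation: the symbol $\Lambda$ for the number of levels is notation introduced here, and the only extra fact used is the trivial one that a round of n interactions has at most n levels.

```latex
% Assumes amsmath. k = (3+c) log n / log log n as in Theorem 2.
\begin{align*}
\mathbb{E}[\Lambda]
  &\le k \cdot \Pr[\Lambda < k] + n \cdot \Pr[\Lambda \ge k] \\
  &\le k + n \cdot n^{-c} \\
  &= (3+c)\,\frac{\log n}{\log\log n} + n^{1-c}
   = O\!\left(\frac{\log n}{\log\log n}\right)
   \quad \text{for any fixed } c \ge 1 .
\end{align*}
```

Choosing c = 1 already suffices, since the additive term $n^{1-c}$ then equals 1.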

$\Omega(n \log n)$ interactions
In this section, we estimate the expected maximum length of a dependency chain when the number $m$ of interactions is $\Omega(n \log n)$. First, let us observe that $m/n$ is a trivial lower bound for the maximum length of a dependency chain for a sequence of $m$ interactions, and therefore in what follows we show only an upper bound of $O(m/n)$.
We adopt the notation and the proof of Theorem 2. The difference from the proof of Theorem 2 is that now the set of labels becomes $[m]$. Consequently, the inequality (1) becomes
$$|IP_k| \le \binom{n}{2} \cdot (2n)^{k-1} \cdot \binom{m}{k}.$$
Next, following the proof of Theorem 2, if we plug this upper bound for the size of $IP_k$ in (3) then we obtain
$$\Pr[E_k] \le |IP_k| \cdot \left(\frac{4}{n^2}\right)^{k} \le \frac{n}{k!} \cdot \left(\frac{8m}{n}\right)^{k}.$$
Finally, if we take any positive $c$, then by setting $k = \max\{\frac{16em}{n}, (c+1)\log_2 n\} = O(\frac{m}{n} + \log n)$, we can use the well-known bound $k! \ge (\frac{k}{e})^k$ to obtain
$$\Pr[E_k] \le n \cdot \left(\frac{8em}{nk}\right)^{k} \le n \cdot 2^{-k} \le n^{-c}.$$
For $m = \Omega(n \log n)$, this value of $k$ is $O(m/n)$.
Suppose that the decomposition of the dependency DAG of $m$ interactions into its levels analogous to that of the dependency DAG of a round of $n$ interactions described in Section 3 is given and an update of the states of interacting agents takes $O(1)$ steps. Then, the expected number of parallel steps required to implement the sequence of $m$ interactions is $O(m/n)$, i.e., it corresponds to the parallel time.
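The contrast between short and long sequences can be observed in a small simulation. The sketch below is an illustration rather than an experiment from the paper; the function name `chain_depth`, the value n = 2000, the seed, and the chosen lengths m are all assumptions. It prints the number of levels of the dependency DAG together with its ratio to m/n.

```python
import math
import random


def chain_depth(n: int, m: int, rng: random.Random) -> int:
    """Number of levels of the dependency DAG of m random interactions among n agents."""
    level = [0] * n  # level[a]: level of the latest interaction of agent a (0 if none)
    depth = 0
    for _ in range(m):
        a, b = rng.sample(range(n), 2)  # two distinct agents, uniformly at random
        # The new interaction goes one level above its latest predecessor.
        level[a] = level[b] = max(level[a], level[b]) + 1
        depth = max(depth, level[a])
    return depth


rng = random.Random(0)  # fixed seed, only for reproducibility
n = 2000
for m in (n, int(n * math.log(n)), int(4 * n * math.log(n))):
    d = chain_depth(n, m, rng)
    print(f"m/n = {m / n:6.2f}   depth = {d:4d}   depth / (m/n) = {d / (m / n):.2f}")
```

The ratio in the last column is the quantity that the bounds of this section control for $m = \Omega(n \log n)$. A single run for one value of n is of course no substitute for the asymptotic statement.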

Final remarks
While we present our bounds to hold in expectation, our analysis holds w.h.p. In particular, the lower bound of Theorem 1 holds also with high probability, as do the upper bounds of Theorems 2 and 3. In case of Theorem 1, this observation relies on the high-probability bound for the balls-into-bins process [13] which is analogous to the one in expectation.
Our almost logarithmic lower bound on the expected maximum length of a directed path in the dependency DAG of a round in Theorem 1 is implied by the lower bound on the expected maximum number of interactions sharing a single agent in a round. It is a bit surprising that our upper bound on the expected maximum length of a directed path in the DAG of the round asymptotically matches the aforementioned lower bound. For example, in the round constructed in the proof of Remark 1, each agent takes part in O (1) interactions but the DAG of the round contains a directed path of length n − 1!
The problem of estimating the expected depth of random circuits raised and studied by Diaz et al. in [9] seems closely related. The motivation of Diaz et al. [9] was an estimation of how quickly a random circuit could be evaluated in parallel. Arya et al. improved the results of [9] by providing tight $\Theta(\log n)$ bounds on the expected depth of random circuits in [5]. Their improved results rely on Markov chain techniques.
The problem with the parallel implementation of a sequence of interactions based on the decomposition of the dependency DAG into the levels is that it requires the knowledge of the sequence in advance. To achieve a more on-line parallel implementation one can generalize the concept of an interaction between two agents to include that of a k-parallel interaction. It is defined as a sequence of k mutually independent interactions involving 2k agents in total. Then, a sequence of t(n) interactions composed of t(n)/k consecutive k-parallel interactions can be implemented in $O(t(n)/k)$ parallel steps. The related problem of designing a fast parallel randomized method of drawing k disjoint pairs of agents uniformly at random is also of interest in its own right. In a recent paper [8], Berenbrink et al. provide a method of forming several matchings between agents in order to simulate population protocols efficiently in parallel. For many known protocols, their batch simulator requires amortized sub-constant time per interaction and is fast in practice.
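For concreteness, a k-parallel interaction can be drawn sequentially as in the sketch below. It is neither the fast parallel method asked for above nor the batch simulator of [8]; it only shows what the sampled object looks like. The function name `k_parallel_interaction` and its parameters are assumptions made for illustration.

```python
import random
from typing import List, Tuple


def k_parallel_interaction(n: int, k: int, rng: random.Random) -> List[Tuple[int, int]]:
    """Draw k mutually independent (initiator, responder) pairs among n agents."""
    if 2 * k > n:
        raise ValueError("a k-parallel interaction needs at least 2k agents")
    # 2k distinct agents in uniformly random order; consecutive entries form a pair.
    agents = rng.sample(range(n), 2 * k)
    return [(agents[2 * i], agents[2 * i + 1]) for i in range(k)]


rng = random.Random(0)  # fixed seed, only for reproducibility
print(k_parallel_interaction(n=10, k=3, rng=rng))
```

Because `random.sample` returns a uniformly random ordered selection of distinct agents, every sequence of k pairwise disjoint ordered pairs is equally likely. No two of the returned interactions share an agent, so all k of them can be carried out in one parallel step.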

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
No data was used for the research described in the article.