Directed random graphs with given degree distributions

Given two distributions F and G on the nonnegative integers we propose an algorithm to construct in- and out-degree sequences from samples of i.i.d. observations from F and G, respectively, that with high probability will be graphical, that is, from which a simple directed graph can be drawn. We then analyze a directed version of the configuration model and show that, provided that F and G have finite variance, the probability of obtaining a simple graph is bounded away from zero as the number of nodes grows. We show that conditional on the resulting graph being simple, the in- and out-degree distributions are (approximately) F and G for large size graphs. Moreover, when the degree distributions have only finite mean we show that the elimination of self-loops and multiple edges does not significantly change the degree distributions in the resulting simple graph.


Introduction
In order to study complex systems such as the World Wide Web (WWW) we propose a model for generating a simple directed random graph with prescribed degree distributions. The ability to match degree distributions to real graphs is perhaps the first characteristic one would desire from a model, and although several models that accomplish this for undirected graphs have been proposed in the recent literature [8,10,11,19], not much has been done for the directed case. In the WWW example that motivates this work, vertices represent webpages and the edges represent the links between them. Empirical studies (e.g., [9,14]) suggest that both the in-degree and the out-degree, that is, the number of links pointing to a page and the number of outbound links of a page, respectively, follow a power-law distribution, a characteristic often referred to as the scale-free property.
The model we propose in this paper is closely related to the work in [8] for undirected graphs, where given a probability distribution F, the goal is to provide an algorithm to generate a simple random graph whose degree distribution is approximately F. Two of the models presented in [8], as well as the model in [24], are in turn related to the well-known configuration model [6,25], where vertices are given stubs or half-edges according to a degree sequence $\{d_i\}$ and these stubs are then randomly paired to form edges. To obtain a prescribed degree distribution, the degree sequence $\{d_i\}$ is chosen as i.i.d. random variables having distribution F. This method allows great flexibility in terms of the generality of F, which is very important in the applications we have in mind. The most general of the results presented here require only that the degree distributions have finite (1 + ε)th moment, and are therefore applicable to a great variety of examples, including the WWW.
For a directed random graph there are two distributions that need to be chosen, the in-degree and out-degree distributions, denoted respectively $F = \{f_k : k \ge 0\}$ and $G = \{g_k : k \ge 0\}$. The in-degree of a node corresponds to the number of edges pointing to it, while the out-degree is the number of edges pointing out. To follow the ideas from [8,24], we propose to draw the in-degree and out-degree sequences as i.i.d. observations from distributions F and G. In the undirected case the only main problem with this approach is that the sum of the degrees might not be even, which is necessary to draw an undirected graph; in the directed case the corresponding condition is that the sum of the in-degrees and the sum of the out-degrees be the same. Since the probability that two i.i.d. sequences will have the same sum, even if their means are equal, converges to zero as the number of nodes grows to infinity, the first part of the paper focuses on how to construct valid degree sequences without significantly destroying their i.i.d. properties. Once we have valid degree sequences the problem is how to obtain a simple graph, since the random pairing may produce self-loops and multiple edges in the same direction. This problem is addressed in two ways, the first of which consists in showing sufficient conditions under which the probability of generating a simple graph through random pairing is strictly positive, which in turn suggests repeating the pairing process until a simple graph is obtained. The second approach is to simply erase the self-loops and multiple edges of the resulting graph. In both cases, one must show that the degree distributions in the final simple graph remain essentially unchanged. In particular, if we let $f_k^{(n)}$ be the probability that a randomly chosen node from a graph of size n has in-degree k, and let $g_k^{(n)}$ be the corresponding probability for the out-degree, then we will show that $f_k^{(n)} \to f_k$ and $g_k^{(n)} \to g_k$ as n → ∞. We also prove a similar result for the empirical distributions.
The question of whether a given pair of in- and out-degree sequences $(\{m_i\}, \{d_i\})$ is graphical, i.e., whether it is possible to draw a simple directed graph from it, has been recently studied in [13,17], where algorithms to realize such graphs have also been analyzed. Random directed graphs with arbitrary degree distributions have been studied in [21] via generating functions, which can be used to formalize concepts such as "in-components" and "out-components" as well as to estimate their average size. Models of growing networks that can be calibrated to mimic the power-law behavior of the WWW have been analyzed using statistical physics techniques in [15,16]. The approach followed in this paper focuses on one hand on the generation of in- and out-degree sequences that are close to being i.i.d. and that are graphical with high probability, and on the other hand on providing conditions under which a simple graph can be obtained through random pairing. The directed configuration model with (close to) i.i.d. degree sequences, although not a growing network model, has the advantage of being analytically tractable and easy to simulate.
The rest of the paper is organized as follows. In Section 2 we introduce a model to construct in- and out-degree sequences that are very close to being two independent sequences of i.i.d. random variables having distributions F and G, respectively, but whose sums are the same; in the same spirit as the results in [1] we also show that the suggested method produces with high probability a graphical pair of degree sequences. In Subsection 3.1 we prove sufficient conditions under which the probability that the directed configuration model will produce a simple graph is bounded away from zero, and show that conditional on the resulting graph being simple, the degree sequences have asymptotically the correct distributions. In Subsection 3.2 we show that under very mild conditions, the process of simply erasing self-loops and multiple edges results in a graph whose degree distributions are still asymptotically F and G.

Graphs and degree sequences
As mentioned in the introduction, the goal of this paper is to provide an algorithm for generating a random directed graph with n nodes with the property that its in-degrees and out-degrees have some prespecified distributions F and G, respectively. Moreover, we would like the resulting graph to be simple, that is, it should not contain self-loops or multiple edges in the same direction. The two models that we propose are based on the so-called configuration or pairing model, which produces a random undirected graph from a degree sequence $\{d_1, d_2, \dots, d_n\}$. In [8,24] the prescribed degree distribution is obtained by drawing the degree sequence $\{d_i\}$ as i.i.d. random variables from that distribution. More details about the configuration model can be found in Section 3.
Following the same idea of using a sequence of i.i.d. random variables to generate the degree sequence of an undirected graph, the natural extension to the directed case would be to draw two i.i.d. sequences from given distributions F and G. We note that in the undirected setting the two main problems with this approach are: 1) that the sum of the degrees may be odd, in which case it is impossible to draw a graph, and 2) that there may not exist a simple graph having the prescribed degree sequence. The first problem is easily fixed by either sampling the i.i.d. sequence until its sum is even (which will happen with probability 1/2 asymptotically), or simply adding one to the last random number in the sequence. The second problem, although related to the verification of graphicality criteria (e.g., the Erdős-Gallai criterion [12]), turns out to be negligible as the number of nodes goes to infinity, as the work in [1] shows. For directed graphs a graphicality criterion also exists, and the second problem turns out to be negligible for large graphs just as in the undirected case. Nonetheless, the equivalent of the first problem is now that the potential in-degree and out-degree sequences must have the same sum, which is considerably harder to fix. Before proceeding with the formulation of our proposed algorithm we give some basic definitions which will be used throughout the paper.
One potential idea to fix the problem is to sample one of the two sequences, say the in-degrees, as i.i.d. observations $\{\gamma_1, \dots, \gamma_n\}$ from F and then sample the second sequence from the conditional distribution G given that its sum is $\Gamma_n = \sum_{i=1}^n \gamma_i$. This approach has the major drawback that this conditional distribution may be ill-behaved, in the sense that the probability of the conditioning event, the sum being equal to $\Gamma_n$, converges to zero in most cases. It follows that we need a different mechanism to sample the degree sequences. The precise algorithm we propose is described below; we focus on first sampling two independent i.i.d. sequences and then adding in- or out-degrees as needed to match their sums.
The following definition will be needed throughout the rest of the paper.
Definition 2.5. We say that a function L(·) is slowly varying at infinity if $\lim_{x \to \infty} L(tx)/L(x) = 1$ for all fixed t > 0. A distribution function F is said to be regularly varying with index α > 0 if $\overline{F}(x) = 1 - F(x) = x^{-\alpha} L(x)$ with L(·) slowly varying.
We will also use the notation ⇒ to denote convergence in distribution, $\xrightarrow{P}$ to denote convergence in probability, and $\mathbb{N} = \{1, 2, 3, \dots\}$ to refer to the positive integers.

The Algorithm
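We assume that the target degree distributions F and G have support on the nonnegative integers and have common mean µ > 0. Moreover, suppose that there exist slowly varying functions $L_F(\cdot)$ and $L_G(\cdot)$ such that
$$\overline{F}(x) = \sum_{k > x} f_k \le x^{-\alpha} L_F(x) \quad \text{and} \quad \overline{G}(x) = \sum_{k > x} g_k \le x^{-\beta} L_G(x) \qquad (2.1)$$
for all x ≥ 0, where α, β > 1.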
We refer the reader to [4] for all the properties of slowly varying functions that will be used in the proofs. However, we do point out here that the tail conditions in (2.1) ensure that F has finite moments of order s for all 0 < s < α, and G has finite moments of order s for all 0 < s < β. The constant $\kappa = \min\{1 - \alpha^{-1}, 1 - \beta^{-1}, 1/2\}$ will play an important role throughout the paper. The algorithm is given below.
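1. Fix $0 < \delta_0 < \kappa$.
2. Sample an i.i.d. sequence $\{\gamma_1, \dots, \gamma_n\}$ from distribution F.
3. Sample an i.i.d. sequence $\{\xi_1, \dots, \xi_n\}$ from distribution G, independent of $\{\gamma_i\}$.
4. Define $\Delta_n = \sum_{i=1}^n (\gamma_i - \xi_i)$. If $|\Delta_n| \le n^{1-\kappa+\delta_0}$ proceed to step 5; otherwise repeat from step 2.
5. Choose randomly $|\Delta_n|$ nodes $\{i_1, \dots, i_{|\Delta_n|}\}$ without replacement and let $N_i = \gamma_i + \tau_i$ and $D_i = \xi_i + \chi_i$ for $i = 1, \dots, n$, where $\chi_i = 1$ if $\Delta_n \ge 0$ and $i \in \{i_1, \dots, i_{\Delta_n}\}$, 0 otherwise, and $\tau_i = 1$ if $\Delta_n < 0$ and $i \in \{i_1, \dots, i_{|\Delta_n|}\}$, 0 otherwise.
Remark 2.6. (i) This algorithm constructs a bi-degree-sequence (N, D) having the property that $\sum_{i=1}^n N_i = \sum_{i=1}^n D_i$. (ii) Note that we have used the capital letters $N_i$ and $D_i$ to denote the in-degree and out-degree, respectively, of node i, as opposed to using the notation $m_i$ and $d_i$ from Definition 2.4; we do this to emphasize the randomness of the bi-degree-sequence itself. (iii) Clearly, neither $\{N_1, \dots, N_n\}$ nor $\{D_1, \dots, D_n\}$ is an i.i.d. sequence, nor are they independent of each other, but we will show in the next section that asymptotically as n grows to infinity they have the same joint distribution as $(\{\gamma_i\}, \{\xi_i\})$. (iv) We will also show that the condition in step 4 has probability converging to one. (v) Note that we always choose to add degrees, rather than fixing one sequence and always adjusting the other one, to avoid having problems with nodes with in- or out-degree zero.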
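The sampling procedure of this subsection (draw two independent i.i.d. sequences, accept them when $|\Delta_n| \le n^{1-\kappa+\delta_0}$, then add one unit of in- or out-degree to randomly chosen nodes until the sums agree) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the `sample_in` and `sample_out` callables, the `max_tries` cap, and passing `kappa` in as a plain argument are all assumptions made for the example.

```python
import random

def sample_bi_degree_sequence(n, sample_in, sample_out, kappa, delta0,
                              rng=None, max_tries=10_000):
    """Return (N, D), two lists of length n with equal sums.

    sample_in(rng) and sample_out(rng) are hypothetical callables that draw
    one observation from F and G, respectively. kappa and delta0 are supplied
    by the caller and should satisfy 0 < delta0 < kappa.
    """
    rng = rng or random.Random()
    threshold = n ** (1.0 - kappa + delta0)
    for _ in range(max_tries):
        gamma = [sample_in(rng) for _ in range(n)]   # i.i.d. draws from F
        xi = [sample_out(rng) for _ in range(n)]     # i.i.d. draws from G
        delta = sum(gamma) - sum(xi)
        if abs(delta) <= threshold:                  # acceptance condition
            break
    else:
        raise RuntimeError("no accepted draw within max_tries")

    # Pick |delta| distinct nodes; each receives one extra degree.
    chosen = rng.sample(range(n), abs(delta))
    N, D = list(gamma), list(xi)
    for i in chosen:
        if delta >= 0:
            D[i] += 1   # in-degree sum is larger: add out-degrees
        else:
            N[i] += 1   # out-degree sum is larger: add in-degrees
    return N, D
```

Degrees are only ever added, never removed, so a node with in- or out-degree zero never causes a problem, which matches the remark above.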

Asymptotic behavior of the degree sequence
We now provide some results about the asymptotic behavior of the bi-degree-sequence obtained from the algorithm we propose. The first thing we need to prove is that the algorithm will always end in finite time, and the only step where we need to be careful is Step 4, since it may not be obvious that we can always draw two independent i.i.d. sequences satisfying $|\Delta_n| \le n^{1-\kappa+\delta_0}$ in a reasonable amount of time. The first lemma we give establishes that this is indeed the case by showing that the probability of satisfying the condition $|\Delta_n| \le n^{1-\kappa+\delta_0}$ converges to one as the size of the graph grows. All the proofs in this section can be found in Subsection 4.1.
Since the sums of the in-degrees and out-degrees are the same, we can always draw a graph, but this is not enough to guarantee that we can draw a simple graph. In other words, we need to determine with what probability the bi-degree-sequence (N, D) will be graphical, and to do this we first need an appropriate criterion, e.g., a directed version of the Erdős-Gallai criterion for undirected graphs. The following result (Corollary 1 on p. 110 in [3]) gives necessary and sufficient conditions for a bi-degree-sequence to be graphical; the original statement is for more general p-graphs, where up to p parallel edges in the same direction are allowed. The notation |A| denotes the cardinality of set A.
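To make the notion of a graphical bi-degree-sequence concrete, the sketch below tests graphicality with the Fulkerson-Chen-Anstee inequalities. This is a standard characterization swapped in for illustration and is not necessarily the formulation of [3]. The function name is hypothetical, and the degrees are assumed to be nonnegative integers.

```python
def is_graphical(in_deg, out_deg):
    """Fulkerson-Chen-Anstee test: can a simple directed graph (no self-loops,
    no parallel edges in the same direction) realize these degrees?"""
    if len(in_deg) != len(out_deg) or sum(in_deg) != sum(out_deg):
        return False
    # Pairs (out-degree, in-degree) in nonincreasing lexicographic order.
    pairs = sorted(zip(out_deg, in_deg), reverse=True)
    n = len(pairs)
    for k in range(1, n + 1):
        lhs = sum(a for a, _ in pairs[:k])
        rhs = (sum(min(b, k - 1) for _, b in pairs[:k])
               + sum(min(b, k) for _, b in pairs[k:]))
        if lhs > rhs:
            return False
    return True
```

For example, in-degrees (2, 0) with out-degrees (0, 2) have equal sums but would require two parallel edges from the second node to the first, so the test rejects them.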
We now state a result that shows that for large n, the bi-degree-sequence (N, D) constructed in Subsection 2.1 is with high probability graphical. Related results for undirected graphs can be found in [1], which includes the case when the degree distribution has infinite mean.
Theorem 2.9. For the bi-degree-sequence (N, D) constructed in Subsection 2.1 we have $\lim_{n \to \infty} P((N, D) \text{ is graphical}) = 1$.
The second property of (N, D) that we want to show is that despite the fact that the sequences $\{N_i\}$ and $\{D_i\}$ are no longer independent nor individually i.i.d., they are still asymptotically so as the number of vertices n goes to infinity. The intuition behind this result is that the number of degrees that need to be added to one of the i.i.d. sequences $\{\gamma_i\}$ or $\{\xi_i\}$ to match their sum is small compared to n, and therefore the sequences $\{N_i\}$ and $\{D_i\}$ are almost i.i.d. and independent of each other. This feature makes the bi-degree-sequence (N, D) we propose an approximate equivalent of the i.i.d. degree sequence considered in [1,8,24] for undirected graphs.
To end this section, we give a result that establishes regularity conditions of the bi-degree-sequence (N, D) which will be important in the sequel.

The configuration model
In the previous section we introduced a model for the generation of a bi-degree-sequence (N, D) that is close to being a pair of independent sequences of i.i.d. random variables, yet has the property of being graphical with probability close to one as the size of the graph goes to infinity. We now turn our attention to the problem of obtaining a realization of such a sequence, in particular, of drawing a simple graph having (N, D) as its bi-degree-sequence.
The approach that we follow is a directed version of the configuration model. The configuration, or pairing model, was introduced in [6] and [25], although earlier related ideas based on symmetric matrices with {0, 1} entries go back to the early 1970s; see [7,26] for a survey of the history as well as additional references. The configuration model is based on the following idea: given a degree sequence $d = \{d_1, \dots, d_n\}$, to each node $v_i$, 1 ≤ i ≤ n, assign $d_i$ stubs or half-edges, and then pair half-edges to form an edge in the graph by randomly selecting with equal probability from the remaining set of unpaired half-edges. This procedure results in a multigraph on n nodes having d as its degree sequence, where the term multigraph refers to the possibility of self-loops and multiple edges. Although this algorithm does not produce a multigraph uniformly chosen at random from the set of all multigraphs having degree sequence d, a simple graph uniformly chosen at random can be obtained by choosing a pairing uniformly at random and discarding the outcome if it has self-loops or multiple edges [26]. The question that becomes important then is to estimate the probability with which the pairing model will produce a simple graph. For the undirected graph setting we have described, such results were given in [2,6,20,22,25] for regular d-graphs (graphs where each node has exactly degree d), and in [18,20,23] for general graphical degree sequences.
From the previous discussion, it should be clear that it is important to determine conditions under which the probability of obtaining a simple graph in the pairing model is bounded away from zero as n → ∞.
Such conditions are essentially bounds on the rate of growth of the maximum (minimum) degree and/or the existence of certain limits (see, e.g., [18,20,23]).The set of conditions given below is taken from [23], and we include it here as a reference for the directed version discussed in this paper.
Condition 3.1. Given a degree sequence $d = \{d_1, \dots, d_n\}$, let $D^{[n]}$ be the degree of a randomly chosen node, i.e., $P(D^{[n]} = k) = \frac{1}{n} \sum_{i=1}^n 1(d_i = k)$.
a) Weak convergence. There exists a finite random variable D taking values on the positive integers such that $D^{[n]} \Rightarrow D$ as n → ∞.
b) Convergence of the first moment. $E[D^{[n]}] \to E[D]$ as n → ∞.
c) Convergence of the second moment. $E[(D^{[n]})^2] \to E[D^2]$ as n → ∞.
Remark 3.2. It is straightforward to verify that if the degree sequence is chosen as an i.i.d. sample $\{D_1, \dots, D_n\}$ from some distribution F on the positive integers having finite first moment, then parts (a) and (b) of Condition 3.1 are satisfied, and if F has finite second moment then part (c) is also satisfied; the adjustment made to ensure that the sum of the degrees is even, if needed, can be shown to be negligible.
Condition 3.1 guarantees that the probability of obtaining a simple graph in the pairing model is bounded away from zero (see, e.g., [23]), in which case we can obtain a uniformly chosen simple realization of the (graphical) degree sequence $\{d_i\}$ by repeating the random pairing until a simple graph is obtained. When part (c) of Condition 3.1 fails, an alternative is to simply erase the self-loops and multiple edges. These two approaches give rise to the repeated and erased configuration models, respectively.
Having given a brief description of the configuration model for undirected graphs, we will now discuss how to adapt it to draw directed graphs. The idea is basically the same: given a bi-degree-sequence (m, d), to each node $v_i$ assign $m_i$ inbound half-edges and $d_i$ outbound half-edges; then, proceed to match inbound half-edges to outbound half-edges to form directed edges. To be more precise, for each unpaired inbound half-edge of node $v_i$ choose randomly from all the available unpaired outbound half-edges, and if the selected outbound half-edge belongs to node, say, $v_j$, then add a directed edge from $v_j$ to $v_i$ to the graph; proceed in this way until all unpaired inbound half-edges are matched. The following result shows that conditional on the graph being simple, it is uniformly chosen among all simple directed graphs having bi-degree-sequence (m, d). All the proofs of Section 3 can be found in Subsection 4.2.
Proposition 3.3. Given a graphical bi-degree-sequence (m, d), generate a directed graph according to the directed configuration model. Then, conditional on the obtained graph being simple, it is uniformly distributed among all simple directed graphs having bi-degree-sequence (m, d).
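The directed pairing described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names are hypothetical, nodes are labelled 0, ..., n − 1, and shuffling the list of outbound half-edges is used as one way to realize a uniformly random matching of inbound to outbound half-edges.

```python
import random

def directed_configuration_model(in_deg, out_deg, rng=None):
    """Return a list of directed edges (tail, head) of a random multigraph
    whose in- and out-degree sequences are in_deg and out_deg."""
    if sum(in_deg) != sum(out_deg):
        raise ValueError("in- and out-degree sums must agree")
    rng = rng or random.Random()
    # One list entry per half-edge, labelled by the node that owns it.
    in_stubs = [v for v, m in enumerate(in_deg) for _ in range(m)]
    out_stubs = [v for v, d in enumerate(out_deg) for _ in range(d)]
    rng.shuffle(out_stubs)  # uniform random matching of the half-edges
    # The i-th inbound half-edge is paired with the i-th shuffled outbound one.
    return list(zip(out_stubs, in_stubs))

def is_simple(edges):
    """True if there are no self-loops and no repeated directed edges."""
    return all(u != v for u, v in edges) and len(set(edges)) == len(edges)
```

An edge `(j, i)` in the output corresponds to a directed edge from $v_j$ to $v_i$, following the orientation used in the text.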
The question is now under what conditions the probability of obtaining a simple graph will be bounded away from zero as the number of nodes, n, goes to infinity. When this probability is bounded away from zero we can repeat the random pairing until we draw a simple graph: the repeated model; otherwise, we can always erase the self-loops and multiple edges in the same direction to obtain a simple graph: the erased model. These two models are discussed in more detail in the following two subsections, where we also provide sufficient conditions under which the probability of obtaining a simple graph will be bounded away from zero.
We end this section by mentioning that another important line of problems related to the drawing of simple graphs (directed or undirected) is the development of efficient simulation algorithms, see for example the recent work in [5] using importance sampling techniques for drawing a simple graph with prescribed degree sequence {d i }; similar ideas should also be applicable to the directed model.

Repeated Directed Configuration Model
In this section we analyze the directed configuration model using the bi-degree-sequence (N, D) constructed in Subsection 2.1. In order to do so we will first need to establish sufficient conditions under which the probability that the directed configuration model produces a simple graph is bounded away from zero as the number of nodes goes to infinity. Since this property does not directly depend on the specific bi-degree-sequence (N, D), we will prove the result for general bi-degree-sequences (m, d) satisfying an analogue of Condition 3.1. As one may expect, we will require the existence of certain limits related to the (joint) distribution of the in-degree and out-degree of a randomly chosen node. Also, since the sequences $\{m_i\}$ and $\{d_i\}$ need to have the same sum, we prefer to consider a sequence of bi-degree-sequences, i. let $(N^{[n]}, D^{[n]})$ denote the in-degree and out-degree of a randomly chosen node, i.e., a) Weak convergence. There exist finite random variables γ and ξ taking values on the nonnegative integers and satisfying b) Convergence of the first moments.
We now state a result that says that the number of self-loops and the number of multiple edges produced by the random pairing converge jointly, as n → ∞, to a pair of independent Poisson random variables. As a corollary we obtain that the probability of the resulting graph being simple converges to a positive number, and is therefore bounded away from zero. The proof is an adaptation of the proof of Proposition 7.9 in [23].
Consider the multigraph obtained through the directed configuration model from the bi-degree-sequence (m n , d n ), and let S n be the number of self-loops and M n be the number of multiple edges in the same direction, that is, if there are k ≥ 2 (directed) edges from node v i to node v j , they contribute (k − 1) to M n .
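The two counts just defined can be computed directly from an edge list. The sketch below is illustrative only: the function name is hypothetical, edges are assumed to be (tail, head) pairs, and repeated self-loops at the same node are assumed to count towards the self-loop total only, not towards the multiple-edge total.

```python
from collections import Counter

def count_self_loops_and_multi_edges(edges):
    """Return (S, M): the number of self-loops, and the number of multiple
    edges where k >= 2 parallel edges from i to j (i != j) contribute k - 1."""
    self_loops = sum(1 for u, v in edges if u == v)
    multiplicity = Counter((u, v) for u, v in edges if u != v)
    multi_edges = sum(k - 1 for k in multiplicity.values())
    return self_loops, multi_edges
```

A multigraph is simple exactly when both counts are zero.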
Proposition 3.5.(Poisson limit of self-loops and multiple edges) as n → ∞, where S and M are two independent Poisson random variables with means respectively.
Since the probability of the graph being simple is P (S n = 0, M n = 0), we obtain as a consequence the following theorem.
Theorem 3.6. Under the assumptions of Proposition 3.5, $\lim_{n \to \infty} P(\text{the graph obtained from } (m_n, d_n) \text{ is simple}) = e^{-\lambda_1 - \lambda_2} > 0$, where $\lambda_1$ and $\lambda_2$ denote the means of S and M, respectively.
It is clear from Proposition 2.11 that Condition 3.4 is satisfied by the bi-degree-sequence (N, D) proposed in Subsection 2.1 whenever F and G have finite variance. This implies that one way of obtaining a simple directed graph on n nodes is by first sampling the bi-degree-sequence (N, D) according to Subsection 2.1, then checking if it is graphical, and if it is, using the directed pairing model to draw a graph, discarding any realizations that are not simple. Alternatively, since the probability of (N, D) being graphical converges to one, one could skip the verification of graphicality and re-sample (N, D) each time the pairing needs to be repeated.
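The repeated model, in which the pairing is redrawn until a simple graph appears, can be sketched as follows for a fixed bi-degree-sequence. The function name and the `max_tries` cap are assumptions added for the example; the cap only guards against looping forever on a sequence that is not graphical.

```python
import random

def repeated_directed_configuration_model(in_deg, out_deg, rng=None,
                                          max_tries=1000):
    """Redraw the random pairing until the result is simple.

    Returns a list of directed edges (tail, head), or None if no simple
    graph was produced within max_tries attempts.
    """
    if sum(in_deg) != sum(out_deg):
        raise ValueError("in- and out-degree sums must agree")
    rng = rng or random.Random()
    in_stubs = [v for v, m in enumerate(in_deg) for _ in range(m)]
    out_stubs = [v for v, d in enumerate(out_deg) for _ in range(d)]
    for _ in range(max_tries):
        rng.shuffle(out_stubs)                   # fresh uniform pairing
        edges = list(zip(out_stubs, in_stubs))
        no_loops = all(u != v for u, v in edges)
        no_repeats = len(set(edges)) == len(edges)
        if no_loops and no_repeats:
            return edges                         # keep the first simple draw
    return None
```

When the probability of a simple outcome is bounded away from zero, the number of redraws needed is stochastically bounded by a geometric random variable, so the cap is rarely reached.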
The last thing we show in this section is that the degree distributions of the resulting simple graph will have with high probability the prescribed degree distributions F and G, as required. More specifically, if we let $(N^{(r)}, D^{(r)})$ be the bi-degree-sequence of the final simple graph obtained through the repeated directed configuration model with bi-degree-sequence (N, D), then we will show that the joint distribution converges to $f_i g_j$, and the empirical distributions converge in probability to $f_k$ and $g_k$, respectively. The same result was shown in [8] for the undirected case with i.i.d. degree sequence $\{D_i\}$.
Proposition 3.7.For the repeated directed configuration model with bi-degree-sequence (N, D), as constructed in Subsection 2.1, we have: and  b) for all k = 0, 1, 2, . . ., Remark 3.8.Note that by the continuous mapping theorem, (a) implies that the marginal distributions of the in-degrees and out-degrees, converge to f i and g j , respectively.The same arguments used in the proof also give that the joint empirical distribution converges to f i g j in probability.

Erased directed configuration model
In this section we consider the erased directed configuration model, which is particularly useful when the probability of drawing a simple graph converges to zero as the number of nodes increases, which could happen, for example, when Condition 3.4 (d) fails. Given a bi-degree-sequence (m, d), the erased model consists in first obtaining a multigraph according to the directed configuration model and then erasing all self-loops and merging multiple edges in the same direction into a single edge, with the result being a simple graph. Note that the graph obtained through this process no longer has (m, d) as its bi-degree-sequence.
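The erasing step can be sketched as a small post-processing function on an edge list, together with a helper that recomputes the degrees of the resulting simple graph. Both function names are hypothetical, and edges are assumed to be (tail, head) pairs on nodes 0, ..., n − 1.

```python
def erase(edges):
    """Drop self-loops and merge parallel edges in the same direction."""
    return sorted({(u, v) for u, v in edges if u != v})

def degree_sequences(edges, n):
    """Return (in_deg, out_deg) of a directed graph on nodes 0..n-1."""
    in_deg, out_deg = [0] * n, [0] * n
    for u, v in edges:
        out_deg[u] += 1
        in_deg[v] += 1
    return in_deg, out_deg
```

Comparing `degree_sequences` before and after `erase` shows directly how many degrees each node loses, which is the quantity that has to be negligible for the degree distributions to be preserved.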
As for the repeated model, let (N (e) , D (e) ) be the bi-degree-sequence of the simple graph obtained through the erased directed configuration model with bi-degree-sequence (N, D).Define the joint distribution k = j) i, j = 0, 1, 2, . . ., and the empirical distributions, The following result is the analogue of Proposition 3.7 for the erased model.

Proofs
In this section we give the proofs of all the results in the paper. We divide the proofs into two subsections, one containing those belonging to Section 2 and the other those belonging to Section 3. Throughout the remainder of the paper we use the following notation:

Degree Sequences
This subsection contains the proofs of Lemma 2.7, Theorems 2.9 and 2.10, and Proposition 2.11.
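Proof of Lemma 2.7. Let $Z_i = \gamma_i - \xi_i$ and note that the $\{Z_i\}$ are i.i.d. mean zero random variables. If $E[Z_1^2] < \infty$, then Chebyshev's inequality gives $P(|\Delta_n| > n^{1-\kappa+\delta_0}) \le n \, \mathrm{Var}(Z_1) \, n^{-2(1-\kappa+\delta_0)} \to 0$ as n → ∞.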

Suppose now that E[Z
By the union bound, as n → ∞, which converges to zero by basic properties of slowly varying functions (see, e.g., Proposition 1.3.6 in [4]).Next, note that since To estimate the integral note that where in the third step we used Proposition 1.5.10 in [4].Now note that from where it follows that as n → ∞.In view of this, we can use Chebyshev's inequality to obtain Finally, to see that this last bound converges to zero note that as n → ∞.This completes the proof.
Before giving the proof of Theorem 2.9 we will need the following preliminary lemma.
Lemma 4.1.Let {X 1 , . . ., X n } be an i.i.d.sequence of nonnegative random variables having distribution function V , and let X (i) denote the ith order statistic.Then, for any k ≤ n, where B(n, p) is a Binomial(n, p) random variable.Since the function u(t) = min{t, k} is concave, Jensen's inequality gives Proof of Theorem 2.9.Since by construction By conditioning on how many of the D i are larger than n (1+ǫ)/β we obtain that (4.3) is bounded by where D n was defined in Lemma 2.7.Now note that by the union bound we have as n → ∞, where the last step follows from Lemma 2.7 and basic properties of slowly varying functions (see, e.g., Chapter 1 in [4]).
Next, to analyze (4.2) let k n = ⌊n (1+ǫ)/β ⌋ and note that we can write it as where x (i) is the ith smallest of {x 1 , . . ., x n }.Now let a 0 = E[min{ξ 1 , 1}] = G(0) > 0 and split the last probability as follows To bound (4.5) use D i ≥ ξ i for all i = 1, . . ., n and Chebyshev's inequality to obtain while the union bound gives that (4.4) is bounded by where b n = a 0 n − n 1/2+ǫ .For the second probability the union bound again gives as n → ∞.Finally, by Markov's inequality and Lemma 4.1, x −α+ǫ dx the proof is complete.
The last two proofs of this section are those of Theorem 2.10 and Proposition 2.11.
Proof of Theorem 2.10.Let u : N r+s → [−M, M ], M > 0, be a continuous bounded function, and let ∆ n , D n be defined as in Lemma 2.7.Then, Let T = r t=1 τ it + s t=1 χ js .Since u is bounded then (4.6) is smaller than or equal to To compute the last expectations let F n = σ(γ 1 , . . ., γ n , ξ 1 , . . ., ξ n ) and note that and symmetrically, from where it follows that (4.6) is bounded by as n → ∞.To analyze (4.7) we first note that by Lemma 2.7, Therefore, (4.7) is equal to Proof of Proposition 2.11.Fix ǫ > 0 and let D n = {|∆ n | ≤ n 1−κ+δ0 }.For the first limit fix i, j = 0, 1, 2, . . .and note that by the union bound, where in the last step we used Chebyshev's inequality.Clearly, Var(1(γ and since by Lemma 2.7 P (D n ) → 1 as n → ∞, then the second term converges to zero.To analyze the first term note that at most one of χ k or τ k can be one, hence, Next, for the average degrees we have and since τ i χ i = 0 for all 1 ≤ i ≤ n, for any δ 0 < δ < κ.By Lemma 2.7, P (D n ) converges to one, and by the Weak Law of Large Numbers (WLLN) we have that each of (4.8), (4.9) and (4.10) converges to zero as n → ∞, as required.To see that (4.11) converges to zero use Markov's inequality to obtain which implies that (4.12) converges to zero.

Configuration Model
This subsection contains the proofs of Proposition 3.3, which establishes the uniformity of simple graphs, Propositions 3.5 and 3.7, which concern the repeated directed configuration model, and Proposition 3.9 which refers to the erased directed configuration model.
Proof of Proposition 3.3. Suppose m and d have equal sum $l_n$, and number the inbound and outbound half-edges by $1, 2, \dots, l_n$. The process of matching half-edges in the configuration model is equivalent to a permutation $(p(1), p(2), \dots, p(l_n))$ of the numbers $(1, 2, \dots, l_n)$ where we pair the ith inbound half-edge to the p(i)th outbound half-edge, with all $l_n!$ permutations being equally likely. Note that different permutations can actually lead to the same graph, for example, if we switch the position of two outbound half-edges of the same node, so not all multigraphs have the same probability. Nevertheless, a simple graph can only be produced by $\prod_{i=1}^n d_i! \, m_i!$ different permutations; to see this note that for each node $v_i$, i = 1, ..., n, we can permute its $m_i$ inbound half-edges and its $d_i$ outbound half-edges without changing the graph. It follows that since the number of permutations leading to a simple graph is the same for all simple graphs, then conditional on the resulting graph being simple, it is uniformly chosen among all simple graphs having bi-degree-sequence (m, d).
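The counting argument in this proof can be checked by brute force on a small example. The sketch below enumerates every permutation of the half-edges for a tiny bi-degree-sequence and records how many permutations produce each simple graph. The function name and the example degrees are assumptions chosen for illustration, and the enumeration is only feasible for a handful of half-edges.

```python
from collections import Counter
from itertools import permutations
from math import factorial, prod

def simple_graph_multiplicities(in_deg, out_deg):
    """Map each simple graph (as a sorted edge tuple) to the number of
    half-edge permutations that produce it."""
    in_stubs = [v for v, m in enumerate(in_deg) for _ in range(m)]
    out_stubs = [v for v, d in enumerate(out_deg) for _ in range(d)]
    counts = Counter()
    for p in permutations(range(len(in_stubs))):
        # Pair the i-th inbound half-edge with the p[i]-th outbound one.
        edges = tuple(sorted((out_stubs[p[i]], in_stubs[i])
                             for i in range(len(in_stubs))))
        no_loops = all(u != v for u, v in edges)
        no_repeats = len(set(edges)) == len(edges)
        if no_loops and no_repeats:
            counts[edges] += 1
    return counts

in_deg, out_deg = [1, 1, 2], [2, 1, 1]
counts = simple_graph_multiplicities(in_deg, out_deg)
expected = prod(factorial(m) * factorial(d) for m, d in zip(in_deg, out_deg))
# Every simple graph should be hit by the same number of permutations.
assert all(c == expected for c in counts.values())
```

Since every simple graph receives the same number of permutations, conditioning on simplicity yields the uniform distribution, as the proposition states.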
Next, we give the proofs of the results related to the repeated directed configuration model. Before proceeding with the proof of Proposition 3.5 we give the following preliminary lemma, which will be used to establish that under Condition 3.4 the maximum in- and out-degrees cannot grow too fast.

Lemma 4.2. Let $\{a_{nk} : 1 \le k \le n,\ n \in \mathbb{N}\}$ be a triangular array of nonnegative integers, and suppose there exist nonnegative numbers $\{p_j : j \in \mathbb{N} \cup \{0\}\}$ such that

Then, $\lim_{n\to\infty} \frac{\max_{1 \le k \le n} a_{nk}}{n} = 0$.

Proof. Define and note that $F$ and $F_n$ are both distribution functions with support on the nonnegative integers. Define the pseudoinverse operator $h^{-1}(u) = \inf\{x \ge 0 : u \le h(x)\}$ and let $X_n = F_n^{-1}(U)$ and $X = F^{-1}(U)$, where $U$ is a Uniform(0,1) random variable. It is easy to verify that $X_n$ and $X$ have distributions $F_n$ and $F$, respectively. Furthermore, the assumptions imply that as $n \to \infty$ and as $n \to \infty$, where the exchange of sums is justified by Fubini's theorem. Now note that by Fatou's lemma, which implies that lim Finally, from where it follows that

Proof of Proposition 3.5. Following the proof of Proposition 7.9 in [23], we define the random variable $\tilde{M}_n$ to be the total number of pairs of multiple edges in the same direction, e.g., if from node $v_i$ to node $v_j$ there are $k \ge 2$ edges, their contribution to $\tilde{M}_n$ is $\binom{k}{2}$. Note that $M_n \le \tilde{M}_n$, with strict inequality whenever there is at least one pair of nodes having three or more multiple edges in the same direction. We claim that $\tilde{M}_n - M_n \xrightarrow{P} 0$ as $n \to \infty$, which implies that if $(S_n, \tilde{M}_n) \Rightarrow (S, M)$, then $(S_n, M_n) \Rightarrow (S, M)$ as $n \to \infty$. To prove the claim, start by defining indicator random variables for each of the possible self-loops and multiple edges in the same direction that the multigraph can have. For the self-loops we use the notation $u = (r, t, i)$ to define

$I_u := 1(\text{self-loop from the } r\text{th outbound stub to the } t\text{th inbound stub of node } v_i)$,

and for the pairs of multiple edges in the same direction we use $w = (r_1, t_1, r_2, t_2, i, j)$ to define

$J_w := 1(r_s\text{th outbound stub of node } v_i \text{ paired to } t_s\text{th inbound stub of node } v_j,\ s = 1, 2)$.
The sets of possible vectors $u$ and $w$ are given by

Next, note that by the union bound,

$P(\tilde{M}_n - M_n \ge 1) \le P(\text{at least two nodes with three or more edges in the same direction}) \le \sum_{1 \le i \ne j \le n} P(\text{three or more edges from node } v_i \text{ to node } v_j)$

as $n \to \infty$, where for the last step we used Condition 3.4 and Lemma 4.2. It follows that $\tilde{M}_n - M_n \xrightarrow{P} 0$ as claimed.
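To make the two counts concrete, the sketch below pairs stubs uniformly at random and then computes the number of self-loops and the number of pairs of parallel edges in the same direction, where $k$ parallel edges contribute $\binom{k}{2}$. The function names are hypothetical, and the code assumes that parallel self-loops are counted only as self-loops (that is, pairs of multiple edges are taken between distinct nodes).

```python
import random
from collections import Counter
from math import comb


def pair_stubs(in_degrees, out_degrees, rng):
    """Uniformly pair outbound with inbound stubs; return edges as (tail, head)."""
    if sum(in_degrees) != sum(out_degrees):
        raise ValueError("in- and out-degree sums must be equal")
    in_stubs = [v for v, m in enumerate(in_degrees) for _ in range(m)]
    out_stubs = [v for v, d in enumerate(out_degrees) for _ in range(d)]
    rng.shuffle(out_stubs)  # uniform random permutation of the outbound stubs
    return list(zip(out_stubs, in_stubs))


def count_self_loops_and_multi_pairs(edges):
    """Return (self-loops, pairs of parallel edges between distinct nodes)."""
    self_loops = sum(1 for tail, head in edges if tail == head)
    multiplicity = Counter((t, h) for t, h in edges if t != h)
    multi_pairs = sum(comb(k, 2) for k in multiplicity.values())
    return self_loops, multi_pairs


# Hypothetical usage on a small bi-degree-sequence with a fixed seed.
edges = pair_stubs([1, 2, 1], [2, 1, 1], random.Random(0))
s_n, m_tilde_n = count_self_loops_and_multi_pairs(edges)
```

The multigraph is simple exactly when both returned counts are zero.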
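We now proceed to prove that $(S_n, \tilde{M}_n) \Rightarrow (S, M)$, where $S$ and $M$ are independent Poisson random variables with means $\lambda_1$ and $\lambda_2$, respectively. To do this we use Theorem 2.6 in [23], which says that if for any $p, q \in \mathbb{N}$

$$\lim_{n \to \infty} E\left[(S_n)_p (\tilde{M}_n)_q\right] = \lambda_1^p \lambda_2^q,$$

where $(X)_r = X(X-1)\cdots(X-r+1)$, then $(S_n, \tilde{M}_n) \Rightarrow (S, M)$ as $n \to \infty$. To compute the expectation we use Theorem 2.7 in [23], which gives

$$E\left[(S_n)_p (\tilde{M}_n)_q\right] = \sum_{u_1, \dots, u_p \in \mathcal{I}} \ \sum_{w_1, \dots, w_q \in \mathcal{J}} P\left(I_{u_1} = \cdots = I_{u_p} = J_{w_1} = \cdots = J_{w_q} = 1\right),$$

where the sums are taken over all the $p$-permutations, respectively $q$-permutations, of the distinct indices in $\mathcal{I}$, respectively $\mathcal{J}$.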
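The factorial-moment criterion rests on the fact that a Poisson random variable $X$ with mean $\lambda$ satisfies $E[(X)_r] = \lambda^r$, so that independent Poisson variables have joint factorial moments $\lambda_1^p \lambda_2^q$. The short Python sketch below checks this numerically by truncating the defining series; the function names and the truncation level are illustrative choices.

```python
from math import exp


def falling_factorial(x, r):
    """Return (x)_r = x (x - 1) ... (x - r + 1), with (x)_0 = 1."""
    result = 1
    for i in range(r):
        result *= x - i
    return result


def poisson_factorial_moment(lam, r, terms=200):
    """Approximate E[(X)_r] for X ~ Poisson(lam) using the first `terms` terms."""
    total = 0.0
    pmf = exp(-lam)  # P(X = 0)
    for k in range(terms):
        total += falling_factorial(k, r) * pmf
        pmf *= lam / (k + 1)  # P(X = k + 1) obtained from P(X = k)
    return total


# Hypothetical means; the joint moment of independent variables is the product.
lam1, lam2, p, q = 1.5, 0.7, 3, 2
joint = poisson_factorial_moment(lam1, p) * poisson_factorial_moment(lam2, q)
target = lam1**p * lam2**q
```

Here `joint` and `target` agree up to floating-point error, which is the limit the proof has to match for $E[(S_n)_p (\tilde{M}_n)_q]$.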
Next, by the fact that all stubs are uniformly paired, we have that unless there is a conflict in the attachment rules, i.e., one stub is required to pair with two or more different stubs within the indices $\{u_1, \dots, u_p\}$ and $\{w_1, \dots, w_q\}$, in which case Therefore, from (4.13) we obtain

$E[(S_n)_p (\tilde{M}_n)_q] \le \sum_{u_1, \dots, u_p \in \mathcal{I}} \ \sum_{w_1, \dots, w_q \in \mathcal{J}}$

where $|A|$ denotes the cardinality of set $A$. Now note that as $n \to \infty$. Hence, it follows from Condition 3.4 that as $n \to \infty$. Since $p$ and $q$ remain fixed as $n \to \infty$, we have

To prove the matching lower bound, we note that (4.14) occurs exactly when there is a conflict in the attachment rules. Each time a conflict happens, the numerator of (4.15) decreases by one. Therefore,

$\sum_{u_1, \dots, u_p \in \mathcal{I}} \ \sum_{w_1, \dots, w_q \in \mathcal{J}} 1(u_1, \dots, u_p, w_1, \dots, w_q \text{ have a conflict})$

$\sum_{u_1, \dots, u_p \in \mathcal{I}} \ \sum_{w_1, \dots, w_q \in \mathcal{J}} 1(u_1, \dots, u_p, w_1, \dots, w_q \text{ have a conflict}) + o(1)$

as $n \to \infty$. To bound the total number of conflicts note that there are three possibilities:
a) a stub is assigned to two different self-loops, or
b) a stub is assigned to a self-loop and a multiple edge, or
c) a stub is assigned to two different multiple edges.
We now discuss each of the cases separately.

For conflicts of type (a) suppose there is a conflict between the self-loops $u_a$ and $u_b$; the remaining $p - 2$ self-loops and $q$ pairs of multiple edges can be chosen freely. Then the number of such choices is bounded by $|\mathcal{I}|^{p-2} |\mathcal{J}|^q = O(n^{p+2q-2})$, hence it suffices to show that the total number of conflicting pairs $(u_a, u_b)$ is $o(n^2)$ as $n \to \infty$. To see that this is indeed the case, first choose the node $v_i$ where the conflicting pair is; if the conflict is that an outbound stub is assigned to two different inbound stubs, then we can choose the problematic outbound stub in $d_{ni}$ ways and the two inbound stubs in $m_{ni}(m_{ni} - 1)$ ways, whereas if the conflict is that an inbound stub is assigned to two different outbound stubs, then we can choose the problematic inbound stub in $m_{ni}$ ways and the two outbound stubs in $d_{ni}(d_{ni} - 1)$ ways. Thus, the total number of conflicting pairs is bounded by

For conflicts of type (b) suppose there is a conflict between the self-loop $u_a$ and the pair of multiple edges $w_b$; choose the remaining $p - 1$ self-loops and $q - 1$ multiple edges freely. Then, the number of such choices is bounded by $|\mathcal{I}|^{p-1} |\mathcal{J}|^{q-1} = O(n^{p+2q-3})$.

Finally, for conflicts of type (c) we first fix $w_a$ and $w_b$ and choose freely the remaining $p$ self-loops and $q - 2$ multiple edges, which can be done in less than $|\mathcal{I}|^p |\mathcal{J}|^{q-2} = O(n^{p+2q-4})$ ways. It then suffices to show that the number of conflicting pairs $(w_a, w_b)$ is $o(n^4)$ as $n \to \infty$. A similar reasoning to that used in the previous cases gives that the total number of conflicting pairs is bounded by

We conclude that in any of the three cases the number of conflicts is negligible, which completes the proof.
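The counting in case (a) can be verified by enumeration. The Python sketch below is a hypothetical illustration, assuming that conflicting pairs are counted as ordered pairs: it lists all self-loop indices $u = (r, t, i)$, counts the pairs that share exactly one stub, and compares the result with the count described in the text, namely $d_{ni} m_{ni}(m_{ni} - 1) + m_{ni} d_{ni}(d_{ni} - 1)$ summed over the nodes.

```python
from itertools import product


def self_loop_indices(in_degrees, out_degrees):
    """All u = (r, t, i): r-th outbound stub and t-th inbound stub of node i."""
    return [
        (r, t, i)
        for i, (m, d) in enumerate(zip(in_degrees, out_degrees))
        for r in range(d)
        for t in range(m)
    ]


def conflicting_self_loop_pairs(in_degrees, out_degrees):
    """Count ordered pairs of self-loops that require one stub to pair twice."""
    loops = self_loop_indices(in_degrees, out_degrees)
    count = 0
    for (r1, t1, i1), (r2, t2, i2) in product(loops, repeat=2):
        if i1 != i2:
            continue  # self-loops at different nodes never share a stub
        shares_outbound = r1 == r2 and t1 != t2
        shares_inbound = t1 == t2 and r1 != r2
        if shares_outbound or shares_inbound:
            count += 1
    return count


def conflict_count_formula(in_degrees, out_degrees):
    """Sum over nodes of d*m*(m-1) + m*d*(d-1), as described in case (a)."""
    return sum(
        d * m * (m - 1) + m * d * (d - 1)
        for m, d in zip(in_degrees, out_degrees)
    )


# Hypothetical degrees: in-degrees (2, 1, 3) and out-degrees (3, 2, 1).
brute = conflicting_self_loop_pairs([2, 1, 3], [3, 2, 1])
formula = conflict_count_formula([2, 1, 3], [3, 2, 1])
```

Both `brute` and `formula` evaluate to 26 for these degrees (18 + 2 + 6 from the three nodes).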
Proof of Proposition 3.7. Let $\mathcal{S}_n$ be the event that the resulting graph is simple, and note that the bi-degree-sequence $(\mathbf{N}^{(r)}, \mathbf{D}^{(r)})$ is the same as $(\mathbf{N}, \mathbf{D})$ given $\mathcal{S}_n$.
To prove part (a) note that for any $i, j = 0, 1, 2, \dots$, from where it follows that Theorem 2.10 gives that the second term converges to zero, and for the first term use Theorem 3.6 to obtain that both $P(\mathcal{S}_n)$ and $P(\mathcal{S}_n \mid \mathcal{G}_n)$ converge to the same positive limit, so by dominated convergence,

For part (b) we only show the proof for $g_k(n)$ since the proof for $f_k(n)$ is symmetrical. Note that $g_k(n)$ is a quantity defined on $\mathcal{S}_n$. Fix $\epsilon > 0$ and use the union bound to obtain which also converges to zero. This completes the proof.
Finally, the last result of the paper, which refers to the erased directed configuration model, is given below. Since the technical part of the proof is to show that the probability that no inbound or outbound stubs of a fixed node are removed during the erasing procedure converges to one, we split the proof of Proposition 3.9 into two parts.
The following lemma contains the more delicate step.

Proof. We only show the result for $E^+$ since the proof for $E^-$ is symmetric. Define the set and note that in order for all the inbound stubs of node $v_1$ to survive the erasing procedure, it must have been that they were paired to outbound stubs of $N_1$ different nodes from $\{v_2, \dots, v_n\}$. Before we proceed it is helpful to recall some definitions from Section 2, and $\mathcal{D}_n = \{|\Delta_n| \le n^s\}$, where $s = 1 - \kappa + \delta_0$; also, $\{\gamma_i\}$ and $\{\xi_i\}$ are independent sequences of i.i.d. random variables having distributions $F$ and $G$, respectively. Now fix $0 < \epsilon < 1 - s$ and let $\mathcal{G}_n = \sigma(N_1, \dots, N_n, D_1, \dots, D_n)$. Then, since $D_i = \xi_i + \chi_i \ge \xi_i$,

Next, condition on $\mathcal{F}_n = \sigma(\gamma_1, \dots, \gamma_n, \xi_1, \dots, \xi_n)$ and note that It follows that the expectation in (4.18) is equal to

$\sum_{(i_1, i_2, \dots, i_{\gamma_1}) \in \mathcal{P}}$

$\sum_{(i_1, i_2, \dots, i_{\gamma_1}) \in \mathcal{P}}$

It follows by Fatou's lemma, Lemma 2.7 and Theorem 2.10 that lim inf n→∞

Next, define the function $u_n^+ : \mathbb{N} \to [0, \infty)$ as and note that it only remains to prove that for all $t \in \mathbb{N}$, $\liminf_{n \to \infty} u_n^+(t) = 1$. Now let $0 < a < \mu$ and note that

The SLLN and bounded convergence give $\lim_{n \to \infty} P(\Gamma_{n-1} < an) = 0$ and lim sup n→∞ E 1(Γ n−1 ≥ an) (Γ n−1 + t)n t (Γ n−1 + t + n s ) t+1 −

Theorem 2.8. Given a set of $n$ vertices $V = \{v_1, \dots, v_n\}$, having bi-degree-sequence $(\mathbf{m}, \mathbf{d}) = (\{m_1, \dots, m_n\}, \{d_1, \dots, d_n\})$, a necessary and sufficient condition for $(\mathbf{m}, \mathbf{d})$ to be graphical is a)

i.e., $\{(\mathbf{m}_n, \mathbf{d}_n)\}_{n \in \mathbb{N}}$ where $(\mathbf{m}_n, \mathbf{d}_n) = (\{m_{n1}, \dots, m_{nn}\}, \{d_{n1}, \dots, d_{nn}\})$, since otherwise the equal sum constraint would greatly restrict the type of sequences we can use (e.g., $m_i = d_i$ for all $i \in \mathbb{N}$). The corresponding version of Condition 3.1 is given below.

Condition 3.4. Given a sequence of bi-degree-sequences $\{(\mathbf{m}_n, \mathbf{d}_n)\}_{n \in \mathbb{N}}$ satisfying

respectively. It follows from this notation that

$m_{ni}(m_{ni} - 1) d_{ni}(d_{ni} - 1)$. By Lemma 4.2 and Condition 3.4 we have

$\sum_{i=1}^n m_{ni}(m_{ni} - 1) d_{ni}(d_{ni} - 1) \le \max_{1 \le i \le n} m_{ni} d_{ni} \sum_{i=1}^n m_{ni} d_{ni} = o(n^2)$,

and it suffices to show that the number of conflicting pairs $(u_a, w_b)$ is $o(n^3)$ as $n \to \infty$. Similarly as in case (a), an outbound stub of node $v_i$ can be paired to a self-loop and a multiple edge to node $v_j$ in $d_{ni} m_{ni} m_{nj} (d_{ni} - 1)(m_{nj} - 1)$ ways, and an inbound stub of node $v_i$ can be paired to a self-loop and a multiple edge from node $v_j$ in $m_{ni} d_{ni} d_{nj} (m_{ni} - 1)(d_{nj} - 1)$ ways, and so the total number of conflicting pairs is bounded by

Lemma 4.3. Consider the graph obtained through the erased directed configuration model using as bi-degree-sequence $(\mathbf{N}, \mathbf{D})$, as constructed in Subsection 2.1. Let $E^+$ and $E^-$ be the number of inbound stubs and outbound stubs, respectively, that have been removed from node $v_1$ during the erasing procedure. Then, $\lim_{n \to \infty} P(E^+ = 0) = 1$ and $\lim_{n \to \infty} P(E^- = 0) = 1$.
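As an illustration of the quantities $E^+$ and $E^-$, the following Python sketch applies an erasing step to a list of directed edges and reports how many inbound and outbound stubs a given node loses. It assumes that erasing means deleting self-loops and merging parallel edges in the same direction into a single edge; the function names are hypothetical, and the construction of the bi-degree-sequence from Subsection 2.1 is not reproduced here.

```python
def erase(edges):
    """Delete self-loops and merge parallel edges; return the simple edge set."""
    return {(tail, head) for tail, head in edges if tail != head}


def removed_stubs(edges, node):
    """Return (E_plus, E_minus): inbound and outbound stubs of `node` erased."""
    simple = erase(edges)
    in_before = sum(1 for _, head in edges if head == node)
    out_before = sum(1 for tail, _ in edges if tail == node)
    in_after = sum(1 for _, head in simple if head == node)
    out_after = sum(1 for tail, _ in simple if tail == node)
    return in_before - in_after, out_before - out_after


# Hypothetical multigraph: one self-loop at node 0 and a double edge 1 -> 0.
edges = [(0, 0), (1, 0), (1, 0), (2, 0), (0, 2)]
e_plus, e_minus = removed_stubs(edges, 0)
```

In this example node 0 loses two inbound stubs (the self-loop and one copy of the double edge) and one outbound stub (the self-loop), so `(e_plus, e_minus)` is `(2, 1)`. The lemma states that for the degree sequences considered in the paper such losses at a fixed node vanish with probability tending to one.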