Throughput Capacity of Ad Hoc Networks with Route Discovery

Throughput capacity of large ad hoc networks has been shown to scale adversely with the network size n. However, the need for the nodes to find or repair routes has not been analyzed in this context. In this paper, we explicitly take route discovery into account and obtain the scaling law for the throughput capacity under general assumptions on the network environment, node behavior


INTRODUCTION
In wireless ad hoc networks, the terminals (nodes) communicate without the aid of any infrastructure. There are many challenges involved in the design of these networks. One particular challenge is involved with the routing of data packets. Typically, the source and destination nodes for a particular data packet are not within direct communication range. This leads to a multihop scenario where the packet must be routed and forwarded through other nodes in the network on the way to the destination. Many routing algorithms, like those found in [1][2][3][4], have been proposed for ad hoc networks.
In real networks, nodes may join and leave, some (or all) nodes are highly mobile, and node-to-node channels are subject to strong fading. In such cases, the problem of finding new routes and repairing old routes can present significant difficulties. In particular, there are situations when nodes have to resort to broadcasting. This causes the effect known as "broadcasting storm" that has been studied in the literature [5][6][7][8][9]. A quantitative analysis of the route discovery process based on broadcasting was given in [10] where the connection between the route discovery process arrival rate and the probability of its success was established by analytical means.
The subject of this paper is the effect of the route discovery process (RDP) on the throughput capacity of ad hoc networks. Previous results for the network capacity and throughput, like those found in [11][12][13][14] (see also [15][16][17][18][19] for an analysis of the effect of mobility on throughput), ignore the route discovery process and focus solely on the data traffic that ad hoc networks can support. On the other hand, under certain conditions (e.g., nodes leaving and joining) the route discovery process can consume a significant portion of network resources and become detrimental to overall network performance and stability. For example, if more route discovery processes are initiated than can be sustained, then they will likely fail resulting in more retransmissions. In this scenario, the network can become inundated with route request (RREQ) packets and the overall network throughput can significantly decrease.
In the following, we determine the impact of the route discovery process on network throughput (defined as in [11]) by determining its asymptotic behavior and scalability with the number of nodes for a network that carries both data and RDP transmissions. Let W be the number of bits that a node can successfully transmit per unit time. We characterize the throughput in terms of three additional basic RDP-related quantities.
(i) The average time that a route stays intact once established: τ(n).
(ii) The function G(·) (defined in the next section), which characterizes the efficiency of route discovery in the network.

EURASIP Journal on Wireless Communications and Networking
(iii) The "correction factor" κ(n) that describes how the dependence between different RDPs initiated by the same node affects the expected number of RDPs the node has to initiate in order to find a route to the destination.
We show that two qualitatively different situations can be distinguished.
(1) $(\tau(n)/\kappa(n))\,G(1/n) = o(1/\sqrt{n \log n})$. In this case, the RDP resource usage is severe enough to become the throughput bottleneck and to change its scaling compared to the case when all routes are known. The throughput scales as where the notation Θ(·) stands for "soft" asymptotic behavior which ignores powers of log n.
(2) $(\tau(n)/\kappa(n))\,G(1/n) = \Omega(\sqrt{\log n / n})$. In this case, the RDP does not affect the throughput significantly (in the order of magnitude sense) and the main limiting factor for the throughput is still the interference between data transmissions.
We apply these general results to some typical examples, with some specific but reasonable assumed models for τ(n) and G(1/n), to show that the actual scaling of the throughput can differ from the case where routing is ignored. In fact, for two of these cases we show which implies routing can cause even more severe throughput scaling problems in ad hoc networks. This occurs, for example, when new nodes join a network for which τ(n) is independent of n. On the other hand, later examples indicate that extremely efficient route repair can lessen, and maybe even eliminate, the just mentioned additional scaling problems.
The rest of this paper is organized as follows. In Section 2, we describe the system model, state the assumptions, and derive some preliminary results. Section 3 explores the auxiliary problem of ad hoc network capacity in the case when nodes cannot always transmit. In Section 4, we explore the bounds on ξ(n), the expected time it takes a node to find a route. In Section 5, we put the pieces together and present the main result of the paper: the throughput scaling in the presence of RDP. Section 6 contains conclusions.

SYSTEM MODEL, ASSUMPTIONS, AND PRELIMINARIES
We consider a wireless ad hoc network with n nodes distributed uniformly over a unit square area. Half of all nodes are sources and the other half are destinations. The source-destination correspondence is one-to-one. Each source node i can be in one of two states, state D and state N, depending on whether it knows a route to its destination d(i). In state D, it can transmit data to d(i), and in state N it cannot transmit due to lack of route knowledge. We characterize the network behavior with respect to the route knowledge by the following quantities.
(i) The length of time during which a node stays in the state D has an expected value of τ(n), which is assumed to be determined exogenously.
(ii) The length of time during which a node stays in the state N has an expected value of ξ(n), which is to be determined in the course of analysis.
Nodes can leave and join the network, but they always do so in pairs. We also assume that if a pair of nodes leaves the network, another pair joins so that the total node count n is unchanged. If a pair of nodes joins the network, the nodes appear at random locations uniformly distributed over the network area.
When a source node is in the N state, it tries to discover a route to its new destination. For that purpose, it broadcasts RREQ packets. Let $S_{\mathrm{RREQ}}$ be the size (in bits) of an RREQ packet. Recall that W is the number of bits that a node can successfully transmit per unit time. This implies that the transmission of an RREQ packet can be effected in a time of $\delta t = S_{\mathrm{RREQ}}/W$. In the following, we assume that all time is slotted with the slot size equal to δt. In any time slot a node can either (re)transmit a data packet of size equal to $S_{\mathrm{RREQ}}$ or (re)broadcast an RREQ packet. The maximum lifetime of an RDP is assumed to be equal to l, that is, we assume that a timeout for all RREQ packets is set to l time slots.
We also assume, without loss of generality, that half of the time slots are devoted to data transmission and that in the other half of the time slots only RDPs take place. During data slots, we assume that all nodes that are currently in the D state send data to their respective destinations according to some schedule that allows data transmission at a rate not exceeding the corresponding interference-limited capacity (much like in [11]). All nodes are assumed to have an unlimited buffer where packets can be stored until the schedule allows them to be transmitted. The sources in the N state as well as all destinations can act as relays.

Transmission success for any packet type (data or RREQ) is governed by the Protocol Model (see footnote 2), in which the transmission from node i to node j within distance r from i is successful if and only if there is no other transmitting node k within distance (1 + Δ)r from j. Here r is the transmission range, which cannot be less than $\sqrt{\log n/(\pi n)}$ to ensure that the network is connected with high probability [21]. We assume that the transmission range can be different for data and RREQ packets but is the same for all packets of the same type. We introduce the following notation for the quantities related to the RDP processes.
(i) $n_t(t)$: the number of nodes transmitting (or retransmitting) an RREQ packet in a given time slot t.
(ii) $n_{nt}(t)$ and $n_{rt}(t)$: the numbers of nodes transmitting a new RREQ packet and retransmitting (relaying) an RREQ packet, respectively, in time slot t. Note that $n_{nt}(t) + n_{rt}(t) = n_t(t)$.
(iii) $n_r(t)$: the number of nodes that successfully receive an RREQ packet in time slot t for the first time, that is, receptions of RREQ packets that the same node has already received at an earlier time do not count toward $n_r(t)$.
(iv) λ: the total arrival rate of RDPs for the whole network, that is, the rate of new RREQ packet generation in the network. Note that, in the notation introduced above, $\lambda = E(n_{nt})$.
(v) ν: the rate at which a node generates RREQ packets once it needs to (re)discover a route, that is, while it is in the N state. In order to make things more concrete, we assume that a node initiates RDPs at fixed time intervals equal to 1/ν until it finds the destination.
(vi) Q: the unconditional probability that an RDP is successful at discovering a route.
(vii) $f_k$: the fraction of all other nodes (except for the source) reached by RDP k. That is, if a total of $r_k$ nodes received the corresponding RREQ packet, then $f_k = r_k/(n - 1)$.
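The Protocol Model success condition stated before the notation list can be illustrated with a short sketch. The function name, the dictionary of node positions, and the argument layout are hypothetical choices made for this illustration only; the check itself follows the verbal definition (receiver within range r of the sender, no other transmitter within (1 + Δ)r of the receiver).

```python
import math

def protocol_model_success(positions, sender, receiver, transmitters, r, delta):
    """Return True if the transmission sender -> receiver succeeds.

    positions    : dict mapping node id -> (x, y) coordinates (illustrative layout)
    transmitters : ids of all nodes transmitting in the same time slot
    r            : common transmission range
    delta        : guard-zone parameter (Delta in the text)
    """
    def dist(a, b):
        return math.dist(positions[a], positions[b])

    # The receiver must lie within the transmission range of the sender.
    if dist(sender, receiver) > r:
        return False
    # No other simultaneous transmitter may lie within (1 + delta) * r of the receiver.
    guard = (1.0 + delta) * r
    return all(dist(k, receiver) > guard for k in transmitters if k != sender)


positions = {0: (0.0, 0.0), 1: (0.1, 0.0), 2: (0.15, 0.0), 3: (0.9, 0.9)}
far_case = protocol_model_success(positions, 0, 1, [0, 3], r=0.12, delta=0.5)
near_case = protocol_model_success(positions, 0, 1, [0, 2], r=0.12, delta=0.5)
```

In this toy layout, the far interferer (node 3) leaves the transmission intact, while the nearby one (node 2) blocks it.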
In order to make analytical derivations possible, we make the following regularity and stationarity assumptions.
(i) The processes $n_r(t)$, $n_t(t)$, $n_{nt}(t)$ and $n_{rt}(t)$ are (weakly) stationary with finite autocorrelation length. In particular, the corresponding expectations and variances exist and are independent of time t. The covariances vanish for lags exceeding h, for example, $\mathrm{Cov}(n_r(t), n_r(s)) = 0$ for |t − s| > h.
(ii) For a given node, the process of switching between states D and N is a stationary renewal process. Specifically, if we denote the durations of the periods when the node was in the D state by $u_i$ and the durations of the periods when the node was in the N state by $v_i$ for
Footnote 2: Note that we could easily generalize this model to take into account the effects of fading and shadowing by introducing random direction-dependent interference regions (in the terminology of [20]) instead of the circular interference regions considered in this paper. The main results would not change. We do not consider the more general case explicitly in order to keep the presentation technically simpler.

Preliminary results
Consider a time horizon of T time slots. Let $N_{RDP}(T)$ be the number of RDP processes initiated during this time. Clearly, $N_{RDP}(T) = \sum_{i} N_i(T)$, where $N_i(T)$ is the number of RDP processes initiated by the source node i, and the sum is over all n/2 source nodes. In turn, the quantity $N_i(T)$ can be written as $N_i(T) = \sum_{j=1}^{N_{DN,i}(T)} N_{s,i}^{j}$ (6), where $N_{DN,i}(T)$ is the number of D to N state changes (route losses) that the node i has during these T time slots, and $N_{s,i}^{j}$ is the number of times the node i has to initiate an RDP process after route loss j until it finds a valid route to the destination. The first auxiliary lemma establishes the asymptotic behavior of the variance of $N_i(T)$.
where α is a constant independent of T.
Proof. Since the random variables N s,i j for different values of j are i.i.d., we can use (6) to find the variance of N i (T), On the other hand, since the process of switching states from D to N and back is a renewal process, we can use the result in [22, Chapter XIII] stating that The expectation of N DN,i can also be found using the results in [22, Chapter XIII],

where α = 1/(E(u)+E(v)) and β is a constant independent of T. Now, using the Chebyshev inequality together with (10), we obtain Setting $z = T^{3/4}$ and dividing by T, we have Finally, using (9), (8) and taking the limit T → ∞, we obtain the statement of the lemma, with the lemma's constant equal to $\mathrm{Var}(N_s)/(E(u)+E(v))$.
The next lemma establishes the fact that the actual value of $N_{RDP}(T)$ (as opposed to its expected value) is well behaved for large values of the time horizon T.
On the other hand, since lim T→∞ E(N RDP (T)) = λT, we can apply the Chebyshev inequality to obtain that, for large enough T, Setting $z = (\lambda T)^{3/4}$ and dividing by λT, we arrive at for large enough T. Finally, taking the limit T → ∞, we obtain the statement of the lemma.
The following lemma expresses the overall RDP arrival rate λ via ν, ξ(n) and τ(n).
Proof. Consider a time horizon of T time slots. For a given source node i, the expected number of RDP processes initiated by this node during T time slots can be computed using (6) as $E(N_i(T)) = E(N_{DN,i}(T))\,E(N_s)$. Using the renewal property of the process of node state change, we can find (see [22,Chapter 8]) that $E(N_{DN,i}(T)) = T/(\tau(n)+\xi(n)) + C + \epsilon_T$, where C is a constant independent of T and $\lim_{T\to\infty} \epsilon_T = 0$. Since $E(N_s) = \nu E(v) = \nu\xi(n)$ and $E(N_{RDP}(T)) = (n/2)E(N_i(T))$, we can write $\lambda = \lim_{T\to\infty} E(N_{RDP}(T))/T = n\nu\xi(n)/(2(\tau(n)+\xi(n)))$.
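To make the relation between λ, ν, ξ(n) and τ(n) concrete, the following sketch evaluates the arrival rate implied by the proof steps above. It assumes the form λ = (n/2)·ν·ξ/(τ + ξ), which is what one obtains from E(N_s) = νξ(n), E(N_RDP(T)) = (n/2)E(N_i(T)) and a renewal rate of 1/(τ + ξ); the function name is hypothetical.

```python
def rdp_arrival_rate(n, nu, tau, xi):
    """Network-wide RDP arrival rate, assuming lambda = (n/2) * nu * xi / (tau + xi).

    n   : number of nodes (n/2 of them are sources)
    nu  : rate at which a node in state N initiates RDPs
    tau : expected duration of state D
    xi  : expected duration of state N
    """
    if tau < 0 or xi < 0 or tau + xi == 0:
        raise ValueError("tau and xi must be non-negative and not both zero")
    # Fraction of time a source spends searching for a route, times its RDP rate,
    # summed over the n/2 sources.
    return 0.5 * n * nu * xi / (tau + xi)


lam_balanced = rdp_arrival_rate(1000, 0.1, tau=10.0, xi=10.0)
lam_dormant = rdp_arrival_rate(1000, 0.1, tau=1.0, xi=9.0)
```

Under this form, whenever τ(n) < ξ(n) the rate stays between nν/4 and nν/2, which is the range used later in the analysis of strategy A.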

RDP success probability
The key measure of the effectiveness of a route discovery process is the probability that it succeeds in finding a route. So we have to be able to characterize the probability of success of an RDP in the given environment. We will do it using the following definition.
With this definition, we have that if f is the fraction of nodes that an RDP process has reached, the probability of a successful route discovery by the process conditioned on the fraction f (and not on anything else) is $Q_f = G(f)$. The unconditional probability of a successful route discovery can be found as $Q = E[G(f)]$.

Examples
(1) The "totally random" (TR) model. In this model, the probability of success of a given RDP is given simply by the fraction of nodes reached by this process: $G(f) = f$. This scenario can be realized, for example, in the situation where new nodes join the network and attempt to find routes to other newly joined nodes. Indeed, in this case, assuming that both source and destination locations are random, any node out of the f(n − 1) nodes reached by the RDP initiated by the source has an equal probability of being the destination.
(2) The "semirandom" (SR) model. Suppose a node i is attempting to find a destination d(i) that is already present in the network. If the nodes use multihop transmission with the hops mostly between nearest neighbors (e.g., for throughput maximization), then it is straightforward to show (see, e.g., [11]) that the number of other routes passing through a given node already present in the network is $\Theta(\sqrt{n/\log n})$. This implies that finding a route to d(i) is equivalent to finding any of the nodes in the set A(d(i)) that have their routes passing through d(i). It is clear that the number of such nodes will be $\Theta(\sqrt{n/\log n})$ as well. If we assume that the nodes in the set A(d(i)) are randomly distributed in the network, then it is easy to see that the probability of success of an RDP will behave as follows: A specific example of such a function is where c is a constant independent of n.
(3) The "completely local" (CL) model. In this model an RDP only needs to reach a fixed (independent of n) number of nodes so that the probability of success can approach 1. This model is appropriate for the case of "perfect" route repair algorithms in which a route between two nodes is repaired as soon as it is broken, and the effectiveness of the repair does not depend on the number of nodes in the network, that is, An example of such a function is where $c_1$ is a constant independent of n. This case can be looked upon as "the best" case, an idealization which can be realized under some rather restricted conditions whose analysis we postpone to future publications.
When thinking of possible shapes of the function G(·), it is reasonable to assume that the RDP processes are "totally random" (model TR) in the worst case. In other words, it is reasonable to exclude cases in which the probability of a node finding its destination is lower than the fraction of all nodes reached by the corresponding RDP process. The latter situation is in principle possible. For example, consider the situation in which the new nodes join the network in locations that are correlated with the locations of the corresponding destinations. If the correlation is such that the average distance between the source and destination exceeds the average distance in the network, it is possible to have G(f) < f for 0 < f < 1. However, it is fairly clear that such a situation is "unnatural" and we assume that nothing like this actually happens. This leads to the following assumption: G(f) ≥ f for 0 ≤ f ≤ 1.
Since, clearly, G(0) = 0 and G(1) = 1, it is also reasonable to assume that the function G( f ) is concave.
We would like to relate the unconditional probability of route discovery success Q to the function G(·) and the expected number of first-time RREQ packet receptions $E[n_r]$. First, let us introduce some useful notation.
The following auxiliary lemma relates the expected number of first-time receptions in a time slot, $E(n_r)$, and the expected fraction of nodes reached by an RDP process, E(f).

Lemma 4. $E(f) = E(n_r)/(\lambda(n-1))$.
Proof. Consider a time horizon of T time slots. Let $N_r(T) = \sum_{t=1}^{T} n_r(t)$ be the total number of first-time RREQ receptions during these T time slots. On the other hand, let $N_{RDP}(T)$ be the total number of RDP processes initiated in the network during these T time slots, and let $N'_r(T)$ be the total number of nodes reached by these RDP processes. Since the longest lifetime of an RDP process is equal to l, it is easy to see that Let us denote by $\bar{n}_r$ and $\bar{f}$ the sample means of the quantities $n_r(t)$ and $f_k$, respectively, We can bound the variance of $\bar{n}_r$ as follows: where we have used the finite covariance length assumption $\mathrm{Cov}(n_r(t), n_r(s)) = 0$ for |s − t| > h. In the same way, we can upper bound the variance of $\bar{f}$, Now, an application of the Chebyshev inequality yields for $\bar{n}_r$: where we have used the bound (30). Setting

In the same way, an application of the Chebyshev inequality and the use of (31) yields and, setting $z = N_{RDP}(T)^{-1/4}$, we obtain We can rewrite (28) as Now, combining Lemma 2 with (33) and (35), using (36) and the union bound and taking the limit T → ∞, we see that the relation has to hold with probability 1, which proves the lemma.
Now we can use Lemma 4 to establish a relationship between the unconditional probability Q of route discovery success and the function G(·).
Lemma 5. If route discovery is described by the function G(f), then the unconditional route discovery success probability Q can be upper bounded as $Q \le G\left(E(n_r)/(\lambda(n-1))\right)$.
Proof. Since $Q = E[G(f)]$, we can use the concavity of G(·) to see that Q ≤ G(E[f]). Then, using Lemma 4, we obtain the statement of the present lemma.
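The concavity step in the proof of Lemma 5 is an instance of Jensen's inequality, and a small numerical check may help fix ideas. The sketch below takes Q to be the average of G(f) over a set of RDPs and compares it with G evaluated at the average fraction. The square-root choice of G and the sample fractions are purely illustrative assumptions (any concave G with G(0) = 0 and G(1) = 1 would do), and the function names are hypothetical.

```python
import math

def unconditional_success(G, fractions):
    """Sample analogue of Q = E[G(f)]: average success probability over RDPs."""
    return sum(G(f) for f in fractions) / len(fractions)

def jensen_upper_bound(G, fractions):
    """Sample analogue of G(E[f]), which bounds Q from above when G is concave."""
    mean_f = sum(fractions) / len(fractions)
    return G(mean_f)

# Hypothetical concave success function; not one of the paper's models.
G_sqrt = math.sqrt
fractions = [0.01, 0.04, 0.25, 0.49]   # illustrative fractions of nodes reached

Q = unconditional_success(G_sqrt, fractions)     # (0.1 + 0.2 + 0.5 + 0.7) / 4
bound = jensen_upper_bound(G_sqrt, fractions)    # sqrt(0.1975)
```

For the TR model, where G(f) = f, the two quantities coincide, which is why the bound becomes an equality in that case.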
Note that, for the TR model, we can obtain a simple expression for the unconditional probability of success as a corollary to the above lemma.

Corollary 1. The unconditional success probability of an RDP for the TR model is given by $Q = E(n_r)/(\lambda(n-1))$.
Proof. Since in this case G(f) is simply f, we obtain $Q = E[f]$, and using Lemma 4, the corollary follows.

NETWORK CAPACITY WHEN NODES CANNOT ALWAYS TRANSMIT
In this section, we find upper and lower bounds on the throughput capacity of networks in which nodes spend a fraction of their time in the N state, during which they cannot transmit data packets to their destinations.

Upper bounds
First, let us consider the case when, for large n, the average length of active periods (when nodes are in the D state) is not much smaller than that of periods of "dormancy" (when nodes are in the N state). In the asymptotic notation, this means that $\tau(n) = \Omega(\xi(n))$. In this case, it is easy to see that the results on capacity reported in [11] are valid. Next, consider the case when the average length of active periods becomes negligible compared to the "dormant" ones as the network size n increases, that is, $\tau(n) = o(\xi(n))$. Let us denote by F the limit (if it exists) of the ratio $F_M$ as M → ∞. We can show that, under the assumptions made in Section 2, the limit indeed exists and is determined in a simple way by the expectations τ(n) and ξ(n).

Lemma 6. The limit $F(n) = \lim_{M\to\infty} F_M$ exists and $F(n) = \tau(n)/(\tau(n)+\xi(n))$ with probability 1.
Proof. Using the renewal assumption, we can determine the variance of the sums S (u) The use of the Chebyshev inequality and the above variances yields with probability 1, where we have used the inequalities (47).
With the above lemma, we can obtain a "dormancy induced" upper bound on the throughput.
Proof. Consider a long time T (measured in RDP time slots). According to Lemma 6, only $T\,\tau(n)/(\tau(n)+\xi(n))$ of these time slots can be used by any node for data transmission. During these time slots a node can send at most $T\,(\tau(n)/(\tau(n)+\xi(n)))\,S_{\mathrm{RREQ}}$ bits to its destination. So the inequality has to hold. Since $\delta t = S_{\mathrm{RREQ}}/W$, we obtain from (50) that which proves the theorem.
On the other hand, regardless of states of nodes, we have the following upper bound on the throughput induced by interference between simultaneous data transmissions.

Theorem 2. The per node throughput T (n) is upper bounded as
Proof. The proof can be found, for example, in [11].
Combining Theorems 1 and 2, and choosing the tighter bound depending on the behavior of the ratio τ(n)/ξ(n), we obtain the following corollary.
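As a rough numerical illustration of how the two upper bounds interact, the following sketch evaluates both and takes the smaller one. It assumes that the dormancy-induced bound has the form W·τ(n)/(τ(n) + ξ(n)), as the proof of Theorem 1 suggests, and that the interference-induced bound has the form c·W/√(n log n) known from [11]. The constant c and all function names are hypothetical.

```python
import math

def dormancy_bound(W, tau, xi):
    """Assumed form of the dormancy-induced bound: W * tau / (tau + xi)."""
    return W * tau / (tau + xi)

def interference_bound(W, n, c=1.0):
    """Assumed form of the interference-induced bound: c * W / sqrt(n log n).

    c is an unspecified constant; c = 1.0 is an arbitrary placeholder.
    """
    return c * W / math.sqrt(n * math.log(n))

def throughput_upper_bound(W, n, tau, xi, c=1.0):
    """Tighter of the two upper bounds on the per-node throughput."""
    return min(dormancy_bound(W, tau, xi), interference_bound(W, n, c))


# Routes break quickly relative to discovery time: dormancy dominates.
dormancy_limited = throughput_upper_bound(1.0, 10_000, tau=1.0, xi=1e6)
# Active and dormant periods are comparable: interference dominates.
interference_limited = throughput_upper_bound(1.0, 10_000, tau=1.0, xi=1.0)
```

Which of the two terms is smaller depends on how τ(n)/ξ(n) compares with 1/√(n log n), which is the dichotomy used in the achievability results of this section.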

Lower bounds
In order to show that the bounds of Corollary 2 are achievable up to a constant we will demonstrate that there exists a feasible transmission schedule that allows us to obtain the required per node throughput. To achieve that goal, we need to perform a few auxiliary steps which we do below.

Tessellation
The tessellation (which we will call $U_1$) of the square region that turns out to be convenient for our goals is the regular one: we divide it into identical smaller squares with side g each. Anticipating the transmission strategy to be employed below, we choose the parameter g in such a way that every cell can always directly communicate with 4 of its neighbors using the smallest common range of communication, which in turn is chosen to ensure connectivity with high probability (i.e., with probability that approaches 1 when n → ∞). As mentioned in Section 2, for connectivity, we have to employ the range $r(n) = \sqrt{c \log n / n}$, where c > 1/π. We chose c = 10 for simplicity. Then, to ensure that each cell can directly communicate with 4 neighbors, one needs to set the cell size to be $g(n) = r(n)/\sqrt{5} = \sqrt{2 \log n / n}$, since two points in edge-adjacent cells are at most $\sqrt{5}\,g$ apart.

Upper bound on the transmission schedule length
We call two cells interfering neighbors if there is a point in one cell within a distance of (2 + Δ)r(n) from a point in the other cell. It is easy to see that only transmissions from the cells that are interfering neighbors can interfere with each other. The following lemma is by now standard in the literature on ad hoc network capacity (see, e.g., [11]).

Lemma 7.
There exists a transmission schedule in which each cell can transmit in one of every c+1 time slots, where c depends only on the parameter Δ.
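One simple way to see why a schedule of bounded length exists is to color the cells periodically and check that cells sharing a color are never interfering neighbors. The sketch below does this on a finite grid without wraparound. It assumes r(n) = √5·g(n), so that the interference distance (2 + Δ)r equals (2 + Δ)√5 cell sides; the reuse period m and all function names are illustrative, and the resulting m² slots are not claimed to be the constant of Lemma 7.

```python
import math

def reuse_period(delta, r_over_g=math.sqrt(5)):
    """Smallest integer m with (m - 1) > (2 + delta) * r / g."""
    return math.floor((2 + delta) * r_over_g) + 2

def slot_of_cell(i, j, m):
    """Time slot (color) of cell (i, j) in an m x m periodic pattern."""
    return (i % m) * m + (j % m)

def min_cell_distance(a, b, g=1.0):
    """Smallest distance between a point of cell a and a point of cell b."""
    dx = max(abs(a[0] - b[0]) - 1, 0)
    dy = max(abs(a[1] - b[1]) - 1, 0)
    return g * math.hypot(dx, dy)

def schedule_is_interference_free(side, delta):
    """Check that no two same-slot cells of a side x side grid interfere."""
    m = reuse_period(delta)
    limit = (2 + delta) * math.sqrt(5)  # interference distance in units of g
    cells = [(i, j) for i in range(side) for j in range(side)]
    for a in cells:
        for b in cells:
            if a < b and slot_of_cell(*a, m) == slot_of_cell(*b, m):
                if min_cell_distance(a, b) <= limit:
                    return False
    return True


period = reuse_period(0.5)
ok = schedule_is_interference_free(14, 0.5)
```

With this pattern each cell is active once every m² slots, and m depends only on Δ, which mirrors the structure of the lemma.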

Number of nodes in a cell
To make the transmission schedule presented below feasible, we need to ensure that every cell contains at least one node with high probability. Given the square geometry we have chosen, this is easy to do. Indeed, let us compute the probability that a given cell does not have any nodes in it. If a single node is placed in the system, the probability that a cell does not contain that node is the ratio of the area outside the cell to the total area. For n nodes, this ratio is raised to the nth power. Since the area of a cell is $g(n)^2$, we have $P(\text{no node is in a cell}) = (1 - g(n)^2)^n \le e^{-n g(n)^2}$, which equals $n^{-2}$ for $g(n)^2 = 2\log n/n$, so P(no node is in a cell) ≤ $n^{-2}$.
We need to show that every cell contains at least one node whp, or equivalently, that the probability that some cell is empty vanishes as n → ∞. Since there are no more than $1/g^2$ cells in the network, an application of the union bound yields the following statement.

Lemma 8. The probability that there is a cell that does not contain a single node is upper bounded by
In other words, all cells contain at least one node with high probability.
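The occupancy argument can be checked numerically. The sketch below assumes the cell side g(n) = √(2 log n / n), which is the value consistent with the n^(−2) bound on the empty-cell probability; the union bound over at most 1/g² cells then evaluates to 1/(2 n log n). The function names are hypothetical.

```python
import math

def cell_side(n):
    """Assumed cell side g(n) = sqrt(2 log n / n)."""
    return math.sqrt(2.0 * math.log(n) / n)

def empty_cell_probability(n):
    """Probability that a fixed cell contains none of n uniformly placed nodes."""
    g2 = cell_side(n) ** 2
    return (1.0 - g2) ** n

def union_bound_some_cell_empty(n):
    """Union bound over at most 1/g^2 cells, each empty with probability <= n^-2."""
    g2 = cell_side(n) ** 2
    return (1.0 / g2) * n ** -2


n = 1000
p_empty = empty_cell_probability(n)
p_any_empty = union_bound_some_cell_empty(n)
```

Both quantities shrink as n grows, so under this choice of g every cell is occupied with high probability.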

Routes of packets between nodes
We organize transmission in the following way. The entire system is tessellated into square cells of area $g(n)^2$. The routing of packets between nodes proceeds as follows. To route a packet between two nodes, we employ at most two straight lines: one vertical and one horizontal (see footnote 3). Each time, a packet is transmitted from a node in a cell to some node in an adjacent cell. The direction of both the vertical and the horizontal part of the route is chosen randomly (recall that the network lives on a torus). In the final hop, the packet is transmitted to the destination from a node in a cell adjacent to the cell containing the destination. Now, let us consider a given cell $C_i$ and count the number of routes passing through it. Let us denote this number by $N_i$. The following lemma demonstrates that the maximum possible value of $N_i$ can be upper bounded with high probability.
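The cell-level routing rule described above can be sketched as follows. The code works on an m × m torus of cells and returns the sequence of cells a packet visits. Taking the vertical leg first is an assumption of this sketch (the text does not fix the order), the choice of the relaying node inside each cell is left out, and the function names are hypothetical.

```python
import random

def straight_segment(start, stop, m, direction):
    """Cell indices visited when moving from start to stop along one torus axis."""
    path, pos = [], start
    while pos != stop:
        pos = (pos + direction) % m   # wrap around the torus
        path.append(pos)
    return path

def route_cells(src, dst, m, rng=random):
    """Cells visited from src to dst: one vertical leg, then one horizontal leg.

    src, dst : (row, column) indices of the source and destination cells
    m        : number of cells per side of the torus
    rng      : object with a choice() method (the random module or random.Random)
    """
    (r0, c0), (r1, c1) = src, dst
    route = [src]
    v_dir = rng.choice((-1, 1))   # random direction of the vertical leg
    h_dir = rng.choice((-1, 1))   # random direction of the horizontal leg
    for r in straight_segment(r0, r1, m, v_dir):
        route.append((r, c0))
    for c in straight_segment(c0, c1, m, h_dir):
        route.append((r1, c))
    return route


m = 6
path = route_cells((1, 2), (4, 0), m, random.Random(0))
```

When source and destination share a row or a column, one of the two legs is empty, which corresponds to the case where a single straight line suffices.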

Lemma 9. The asymptotic relation
holds with high probability.
Proof. Consider vertical components of the packet routes passing through cell $C_i$. Let us denote their number by $V_i$. Because of the random choice of the routes' directions, the expected value of $V_i$ will be equal to half of the expected value of the number of nodes in the vertical strip formed by the "column" of cells above and below the cell $C_i$. The area of this strip is equal to g(n). So for the expected value of $V_i$ we obtain
Footnote 3: It is possible that only one straight line is needed.
The use of the Chernoff bound yields which proves the lemma.
On the other hand, we can show that the number of routes passing through every cell can be lower bounded. This is done in the next lemma.

Lemma 10. The asymptotic relation
holds with high probability.
Proof. We use the notation introduced in Lemma 9. As was shown in that lemma, We can now use the Chernoff bound to obtain Setting ε = 1/2 and using (69), we obtain In the same way, we obtain It is obvious that the same inequality will hold for the sum Since there are $1/g(n)^2$ This completes the proof of the lemma.

Achievable throughput
We can now find the achievable per node throughput. This is the subject of the next two theorems.

Theorem 3.
If $\tau(n)/\xi(n) = o(1/\sqrt{n \log n})$, then the per node throughput is achievable with high probability.
Proof. Consider a long time T. Since each source can generate data only in the D state, using Lemmas 6 and 9, we see that the number of packets $N_{T,i}$ that has to be served by the cell $C_i$ can be upper bounded as with high probability. Since $\tau(n)/\xi(n) = o(1/\sqrt{n \log n})$, we have that, with high probability, which is less than the number of time slots T/ c that, as shown in Lemma 7, each cell can be active in. This implies that the per node throughput of is achievable with high probability, which proves the theorem.
The meaning of the next theorem is that, if the ratio τ(n)/ξ(n) is large enough, the throughput limited by the interference between data transmissions can be achieved.

Theorem 4. If $\tau(n)/\xi(n) = \Omega(1/\sqrt{n \log n})$, then the per node throughput is achievable with high probability.
Proof. Consider a long time T. Since each source can generate data only in the D state, using Lemmas 6 and 10, we see that the number of packets $N_{T,i}$ to be served by the cell $C_i$ can be lower bounded as with high probability. Since $\tau(n)/\xi(n) = \Omega(1/\sqrt{n \log n})$, we have that, with high probability, which implies that there is enough data so that the cell can serve a packet in each slot in which it can become active (and the number of such slots is T/ c). Therefore the per node throughput of is achievable with high probability.

SCALING OF ξ(n)
A node that needs to find a route will initiate an RDP. Since it may not be successful, the node might have to initiate it several times. We assume that the node initiates RDPs with frequency of ν until the route is found. The next lemma computes a lower bound on the expected number of RDPs that a node will need to initiate in order to find the route.

Lemma 11. The expected number of RREQ transmissions, E(N s ), which is required by a node for a successful route discovery is lower bounded as
Proof. Let $f_i$ be the fraction of nodes reached by the ith RDP initiated by the source in question. Also, let $Q_j(f_j)$ be the conditional probability of the jth RDP finding the destination provided that all the previous ones failed to do so. Then the expected number of attempts conditioned on $f_1, f_2, \ldots$ can be written as Taking an expectation with respect to f and using mutual independence of the components of the random vector $f = (f_1, f_2, \ldots)$, we obtain

where Q is the unconditional probability of route discovery success, and $Q_i$ for i = 2, 3, . . . is the probability of route discovery success by the ith consecutive RDP provided all the previous ones have failed. Now note that, since an RREQ packet is more likely to reach a destination that is physically closer to the source, we will assume that the following inequalities (see footnote 5) hold: that is, a failure to reach the destination by the previous RREQ will not increase the probability of success for the next RREQ. Therefore, we have the following inequality
In order to obtain a more precise characterization of $E(N_s)$, more details of the protocol used as well as physical layer characteristics of the environment such as fading and shadowing are needed. This is an important task that falls beyond the scope of the present paper. Here, we will simply state that where κ(n) ≥ 1 is the "correction" factor due to dependence between RREQ belonging to the same RDP.
We leave the dependence of κ(n) on n undetermined although it is easy to see by comparing (88) with (90) that κ(n) ≥ 1.
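The effect of dependence between consecutive attempts on E(N_s) can be illustrated numerically. The sketch below computes the expected number of attempts from a sequence of conditional success probabilities using E(N_s) = Σ_{k≥0} P(N_s > k), truncated after the supplied terms. The probability values are arbitrary illustrations, the function name is hypothetical, and reading the ratio of the two results as a correction factor of the κ(n) type is an interpretation made for this example.

```python
def expected_attempts(cond_success_probs):
    """Expected number of RDP attempts until success (truncated sum).

    cond_success_probs[j] is the probability that attempt j + 1 succeeds
    given that all earlier attempts failed.
    """
    expected, survive = 0.0, 1.0
    for q in cond_success_probs:
        expected += survive        # adds P(N_s > k) for the current k
        survive *= 1.0 - q         # probability that attempts 1..k+1 all fail
    return expected


Q = 0.2
# Independent attempts with the same success probability: E(N_s) = 1/Q.
independent = expected_attempts([Q] * 2000)
# Later attempts are less likely to succeed after a failure: E(N_s) > 1/Q.
dependent = expected_attempts([Q] + [0.1] * 2000)
correction = dependent * Q   # plays the role of a factor >= 1
```

In the independent case the sum reproduces the geometric mean 1/Q, and any drop in the conditional success probabilities pushes the expectation above it.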
The expected duration of the time period during which a node stays in the N state searching for a route can be computed as We can use Lemma 3 and (91) to obtain the expression for the total RDP arrival rate λ:

Lower bound on ξ(n)
We would like to demonstrate that the average length ξ(n) of a node "inactivity" period is bounded from below and that the bound depends on the shape of the route discovery success function G(·).
Footnote 5: It may be possible to prove (88) starting from some assumptions on the RDP protocol and node mobility.

Theorem 5. The expected length of the time interval during which a node stays in the N state is
Proof. From Lemma 5 and the simple fact that E(n r ) ≤ n − 1, we have From Lemma 3, we have Now let us consider the cases τ(n)ν ≤ 1 and τ(n)ν > 1.
From (95), we can obtain Thus, and, therefore, where we have used the fact that ν ≤ 1 and concavity of the function G(·).

Upper bound on ξ(n)
Next, we would like to find an upper bound on the average length of "data inactivity" period ξ(n). Note that, in order to find a lower bound, it was sufficient to assume that all network resources were devoted to route discovery with no data transmission taking place. For an upper bound, we need to present a constructive network resource division scheme between RDP and data transmission.
We make the following assumption about the probability distribution of the fraction of nodes reached by an RDP.

Regularity condition 1
The probability distribution of the fraction of nodes reached by an RDP is such that m f ≥ γE( f ), where m f is the median of the distribution, and γ > 0 is a constant independent of n.
Note that the goal of making this assumption is to rule out "pathological" distributions of the fraction of nodes reached by an RDP. Thus it is not restrictive in that any distribution that can be realized in practice should satisfy it for an appropriate value of γ.
Lemma 12. If regularity condition 1 holds, then $Q \ge (1/2)\,G\left(\gamma E(n_r)/(\lambda(n-1))\right)$.
, by definition of a median, we have which proves the lemma.
Let us now demonstrate that it is in principle possible to organize transmissions in the network such that the expected number of first-time RDP receptions $E(n_r)$ can be equal to a fixed (albeit possibly small) fraction of n. The main idea is to make sure that the nodes do not transmit "too many" RDP packets, so that the network does not get overwhelmed with them.
To this end, let us construct a transmission and relaying strategy that has the desired property. Let all the nodes transmit only data packets in even time slots and only RDP packets in odd time slots. If a node receives an RDP, it will retransmit it in the next odd time slot during which it is not busy initiating its own RDP process, with probability p. We call this "strategy A." We can also assume, without loss of generality that τ(n) < ξ(n) since if it were not so, we would immediately come back to the situation where RDP does not have a significant effect on capacity. First, we prove an auxiliary lemma.
Proof. Since nodes generate RDP packets independently of each other, the distribution of the number of new RDPs will be Poisson with parameter λ. So, according to the Chebyshev inequality, On the other hand, since τ(n) < ξ(n), it is easy to see from Lemma 3 that $n\nu/4 \le \lambda \le n\nu/2$. Combining this with (104), we obtain that As to the number of RDP retransmissions in the same time slot, obviously, $E(n_{rt}) < np$. If it were the case that $E(n_{rt}) = np$, we would be able to use the Chernoff bound to obtain
$$\Pr\{n_{rt} > (1 + \delta)E(n_{rt})\} < e^{-\delta^2 E(n_{rt})/4}, \tag{106}$$
or, setting $\delta = 1$,
$$\Pr\{n_{rt} > 2np\} < e^{-np/4}. \tag{107}$$
But since, in fact, $E(n_{rt}) < np$, the inequality (107) will hold as well.
Finally, combining (105) and (107), and using the union bound we obtain the statement of the lemma.
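The Chernoff step in the proof can be checked numerically. The sketch below is an illustration only: it assumes, hypothetically, the worst case in which each of the n nodes retransmits independently with probability p, so that the number of retransmissions is Binomial(n, p) with mean np. The function names and the values of n and p are chosen for the example and do not come from the paper.

```python
import math


def binomial_upper_tail(n, p, threshold):
    """Exact Pr[X > threshold] for X ~ Binomial(n, p)."""
    first = math.floor(threshold) + 1  # smallest integer strictly above threshold
    return sum(
        math.comb(n, j) * p**j * (1 - p) ** (n - j)
        for j in range(first, n + 1)
    )


def chernoff_bound(mean, delta=1.0):
    """Upper bound exp(-delta^2 * mean / 4) on Pr[X > (1 + delta) * mean]."""
    return math.exp(-(delta**2) * mean / 4)


# Hypothetical parameters: 200 nodes, retransmission probability 0.05.
n, p = 200, 0.05
tail = binomial_upper_tail(n, p, 2 * n * p)  # Pr[n_rt > 2np], i.e. delta = 1
bound = chernoff_bound(n * p)                # e^{-np/4}
print(f"exact tail = {tail:.3e}, Chernoff bound = {bound:.3e}")
```

In this setting the exact tail lies well below the bound, which is consistent with the bound being loose but sufficient for the union-bound argument.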
We can now state the following proposition.

Proposition 1. If strategy A is followed for a sufficiently long time T then, for this period,
where c is a constant independent of n.
Proof. Let $k = \sqrt{5}(1 + \Delta)$. Let us introduce a second, coarser tessellation $U_2$ such that one cell of $U_2$ consists of $16k^2$ cells of the tessellation $U_1$ described before (so that one cell of $U_2$ is a $4k \times 4k$ array of cells of $U_1$). The number of cells in tessellation $U_2$ is then equal to $n/(32k^2 \log n)$. Now, let us set $\nu = c_1/\log n$, where $c_1$ is a constant independent of n. Then, according to Lemma 13, the expected number of nodes transmitting a new RDP packet is equal to $n\tilde{c}_1/\log n$ w.h.p. (where $(1/4)c_1 \le \tilde{c}_1 \le (3/2)c_1$ is another constant independent of n). Since the locations of these nodes are mutually independent, for sufficiently large n, the distribution of the number of such nodes in any cell of tessellation $U_2$ is close to Poisson with parameter $c_2 = 32k^2\tilde{c}_1$, and the probability that there is exactly one such node in a cell of tessellation $U_2$ is close to $c_2 e^{-c_2}$ and is, therefore, at least $\tilde{c}_2$, where $0 < \tilde{c}_2 < c_2 e^{-c_2}$ is also independent of n.
In the same way, using Lemma 13, we can set $p = c_3/\log n$ and convince ourselves that the probability that there are no nodes retransmitting an RDP packet in a given cell of tessellation $U_2$ is at least $\tilde{c}_3$, which is also independent of n. Therefore the probability that there is exactly one node transmitting a new RDP packet, and there are no nodes retransmitting an RDP packet, in a given cell of tessellation $U_2$ is at least $\tilde{c}_2\tilde{c}_3 = c_4$, which is also independent of n.
Let us now, inside each cell of tessellation $U_2$, highlight a square consisting of $4k^2$ small cells (those of tessellation $U_1$) in the center, so that there is a "guard zone" of width k small cell sizes around it. If a given big cell contains exactly one node transmitting a new RDP packet (and no nodes retransmitting RDP packets), then the probability that this single transmitting node lies in the highlighted square is equal to $4k^2/16k^2 = 1/4$. Now, consider a long time T. During this time there will be T/2 slots during which only RDP packets are transmitted. During these T/2 slots, there will be a total of at least $N_T = (T/2)\cdot(n/(32k^2\log n))\cdot c_4\cdot(1/4) = c_5 Tn/\log n$ transmissions of a new RDP packet by a node located in the highlighted $2k \times 2k$ square such that there are no other simultaneous transmissions in the same big cell of the $U_2$ tessellation. Now, note that the presence of the guard zone ensures that no transmission from other big cells interferes with receptions in the highlighted square. Therefore, all nodes in at least 2 small cells in the highlighted square will successfully receive that transmission of a new RDP packet. The expected number of such nodes will therefore be no less than $N_T \cdot 2 \cdot 2\log n = 4c_5 Tn$. Dividing this number by the total time (number of time slots) T, we obtain that the expected number of successful first-time RDP packet receptions per time slot will be no less than $4c_5 n$, where $c_5$ is independent of n. This proves the proposition.
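The counting at the end of the proof is plain arithmetic and can be replayed numerically. The sketch below is an illustration only: the function names and the sample values of n, T, k and c_4 are hypothetical, and the factor 2 log n is read here as the number of nodes per small cell of U_1, as the expression N_T · 2 · 2 log n suggests.

```python
import math


def c5_constant(k, c4):
    """c_5 implied by N_T = (T/2) * n/(32 k^2 log n) * c_4 * (1/4) = c_5 T n / log n."""
    return c4 / (256 * k**2)


def rdp_reception_lower_bound(n, T, k, c4):
    """Return (N_T, lower bound on first-time RDP receptions per time slot)."""
    rdp_slots = T / 2                               # odd slots carry RDP packets only
    big_cells = n / (32 * k**2 * math.log(n))       # number of cells of U_2
    p_clean_and_centered = c4 * (1 / 4)             # lone new RDP, inside highlighted square
    n_t = rdp_slots * big_cells * p_clean_and_centered
    nodes_per_small_cell = 2 * math.log(n)          # assumed reading of the 2 log n factor
    receptions = n_t * 2 * nodes_per_small_cell     # at least 2 small cells receive
    return n_t, receptions / T


# Hypothetical sample values, chosen only to exercise the formulas.
n, T, k, c4 = 10_000, 1_000, 3, 0.05
n_t, per_slot = rdp_reception_lower_bound(n, T, k, c4)
print(n_t, per_slot, 4 * c5_constant(k, c4) * n)
```

The per-slot bound is linear in n, since the log n in the number of big cells cancels against the log n nodes per small cell.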
We can find an upper bound on ξ(n).
We know, from the proof of Proposition 1, that if we set $\nu = c_1/\log n$ (so that $c_1 n/(2\log n) \le \lambda \le c_1 n/\log n$), then $E(n_r) \ge cn$. We can now use (112) to obtain Note that if $\gamma c/c_1 \ge 1$, we have $G(\gamma c/(c_1 n)) \ge G(1/n)$ and hence If $\gamma c/c_1 < 1$, then the concavity of G(·) implies that $G(\gamma c/(c_1 n)) \ge (\gamma c/c_1)\, G(1/n)$ and, therefore which completes the proof of the theorem.
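The concavity step used in the second case can be written out explicitly. The only assumption beyond concavity is $G(0) \ge 0$, which holds whenever $G$ is a probability. Here $a$ stands for $\gamma c/c_1 < 1$ and $x$ for $1/n$.

```latex
\begin{align*}
G(a x) &= G\big(a x + (1 - a)\cdot 0\big) \\
       &\ge a\, G(x) + (1 - a)\, G(0)
          && \text{(concavity of $G$ on $[0,1]$, $0 < a < 1$)} \\
       &\ge a\, G(x)
          && \text{($G(0) \ge 0$)}
\end{align*}
```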
Putting together the results of Theorems 5 and 6, we obtain a semitight (up to log n) asymptotic characterization of the quantity ξ(n) which we state as a corollary.

RDP LIMITED THROUGHPUT
In this section, we collect the pieces to obtain the main result of this paper: the scaling of the RDP-limited throughput of a random ad hoc network. The next theorem covers the case where the RDP plays the role of the throughput bottleneck.
Now that the general asymptotic behavior of the RDP-limited throughput is given, let us consider several simple examples and obtain the throughput scaling.

Examples
(1) The need for route discovery due to changing node membership. In this model, a new pair of nodes i and j join the network and need to discover a route to each other. They stay connected during a time period of $\tau_{ij}$, after which the connection is terminated and the nodes leave the network (turn themselves off). This process of pairs joining and leaving the network continues in such a way that the network is in equilibrium with respect to the number of nodes participating in it at any given time. Let us assume that $E(\tau_{ij}) = \tau$ for any i and j. In this case it is reasonable to assume that τ does not depend on n, as it depends only on the behavior pattern of individual nodes. Since both nodes that have just joined the network are "new" to it, the TR model of success probability has to be used. This implies that and therefore, Then it follows from Theorem 7 that the RDP-limited throughput scales as
(2) The node membership in the network is constant but the loss of routes is due to severe fading or excessive node mobility. In this case, it would be reasonable to assume that a route from a node i to d(i) is lost and has to be rediscovered whenever a link is broken. The number of links between i and d(i) is $\Theta(\sqrt{n/\log n})$, and the rate at which a link breaks depends only on the fading environment and the mobility characteristics and is therefore independent of n. Therefore, assuming that the links break independently, we obtain that the rate at which the nodes i and d(i) lose their route will be $\Theta(\sqrt{n/\log n})$. Hence, the average length of time during which the route stays intact is $\tau(n) = \Theta(\sqrt{\log n/n})$.
On the other hand, assume that for the success of route discovery the SR model has to be used. Indeed, since both the source and the destination are still present in the network, they are "known" to $\Theta(\sqrt{n/\log n})$ other nodes, which is the assumption under which the SR model is obtained. Therefore, and hence (Theorem 7),
(3) The situation is just as above, with the exception of the route discovery probability. Assume that the route repair algorithm is so good that it is able to repair the broken link immediately, so that the CL model is appropriate. Then as above, but and, therefore So, in case κ(n) = Θ(1), Theorem 8 gives In other words, the scaling is the same as in the case with no route discovery, meaning that under such conditions the main limitation is still data transmission.
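The route lifetime scaling used in examples (2) and (3) can be illustrated numerically. The sketch below assumes, hypothetically, that the number of links on a route is exactly hop_constant · sqrt(n / log n) and that a route is lost at a rate equal to the sum of the per-link break rates (independent links, as in example (2)). The function name, hop_constant and the sample break rate are introduced for the example only.

```python
import math


def route_lifetime(n, link_break_rate, hop_constant=1.0):
    """Mean time a route stays intact, tau(n), under the assumptions above."""
    hops = hop_constant * math.sqrt(n / math.log(n))  # Theta(sqrt(n / log n)) links
    route_loss_rate = hops * link_break_rate          # independent links: rates add
    return 1.0 / route_loss_rate


# Hypothetical per-link break rate; tau(n) shrinks like sqrt(log n / n).
for n in (10**3, 10**4, 10**5):
    tau = route_lifetime(n, link_break_rate=0.01)
    print(n, tau, tau / math.sqrt(math.log(n) / n))
```

The last printed column is constant across n (it equals 1 / link_break_rate when hop_constant is 1), which is the Θ(sqrt(log n / n)) behavior stated for τ(n).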

CONCLUSION
In this paper, we have explored the problem of the throughput of ad hoc networks in the presence of route discovery processes. Specifically, we assumed that nodes in a network do not always have the knowledge of routes to the corresponding destinations and have to find them. We consider the effect of RDP on the throughput explicitly, and obtain results that generalize the previously known scaling behavior of the throughput of random ad hoc networks. We find that, under certain conditions on the network environment and the algorithms used for route discovery or repair, the effect of RDP on the throughput starts dominating that of data transmission and the scaling of the throughput changes. We obtain both the conditions for the change and the scaling of the RDP-limited throughput. Note that we assumed the function G(f) to be concave on the interval [0, 1]. This assumption seems a natural one to make, and it makes some of the proofs in the paper easier. However, it is not critical to the results. In fact, if the assumption of concavity of G(f) did not hold, the results that rely on it (e.g., Lemma 5) would still hold if we made an assumption on the distribution of f similar to regularity condition 1 (guarding against "pathological" distributions and therefore not restrictive).