Random Beamforming in Millimeter-Wave NOMA Networks

This paper investigates the coexistence of two key enabling technologies for fifth generation (5G) mobile networks: non-orthogonal multiple access (NOMA) and millimeter-wave (mmWave) communications. In particular, random beamforming is applied to the addressed mmWave-NOMA scenario in order to avoid the requirement that the base station knows all the users' channel state information. Stochastic geometry is used to characterize the performance of the proposed mmWave-NOMA transmission scheme by exploiting the key features of mmWave systems, e.g., mmWave transmission is highly directional and potential blockages thin the user distribution. Two random beamforming approaches that further reduce the system overhead are also proposed for the addressed mmWave-NOMA communication scenario, and their performance is studied by developing analytical results for sum rates and outage probabilities. Simulation results are provided to demonstrate the performance of the proposed mmWave-NOMA transmission schemes and to verify the accuracy of the developed analytical results.


I. INTRODUCTION
Non-orthogonal multiple access (NOMA) has recently received considerable attention as a promising multiple access (MA) technique to be used in fifth generation (5G) mobile networks [1], [2]. Compared to conventional orthogonal multiple access (OMA), such as time division multiple access and/or frequency division multiple access, NOMA encourages spectrum sharing among multiple users, rather than serving a single user in one orthogonal bandwidth block [3], [4]. Sophisticated power allocation policies and detection methods, such as cognitive radio inspired power allocation, superposition coding and successive interference cancellation (SIC), are used to combat the co-channel interference, which is not present in OMA cases [5], [6]. It is worth pointing out that the use of NOMA can still effectively support massive connectivity and efficiently meet users' diverse QoS requirements, even if the users have similar channel conditions [7].
As a promising enabling technology for 5G networks, NOMA has been shown to be compatible with many other 5G techniques, such as massive multiple-input multiple-output (MIMO) and cognitive radio networks, as well as other types of MA techniques, such as orthogonal frequency division multiple access (OFDMA) [8]-[10]. The purpose of this paper is to investigate the coexistence between NOMA and another important 5G technique, millimeter-wave (mmWave) communications [11]-[15]. Even though more bandwidth resources are available at very high frequencies, the use of NOMA is still important for the following reasons:
• The highly directional feature of mmWave transmission implies that users' channels can be highly correlated, which potentially degrades the system performance. But such correlation is ideal for the application of NOMA.
• The combination supports massive connectivity in dense networks, e.g., where there are hundreds of users to be connected in a small area.
• The rapid growth of mobile Internet services, particularly emerging virtual reality (VR) and augmented reality (AR) services, will dwarf the radio spectrum gains obtained from the mmWave bands, which means that further improvement of the spectral efficiency is still important.
In this paper, we consider a mmWave-NOMA downlink scenario, in which a base station equipped with multiple antennas communicates with multiple single-antenna nodes. While MIMO-NOMA has been extensively studied in [16]-[18], the application of mmWave communications makes the addressed MIMO-NOMA scenario much different, mainly due to the characteristics of mmWave propagation. The contributions of this paper are four-fold:
• We first consider the application of random beamforming to the addressed mmWave-NOMA scenario, in which a single beam is randomly generated by the base station. While random beamforming does not require the base station to know all the users' channel vectors, conventional random beamforming still requires all the users to send their scalar channel gains to the base station, which can consume significant system overhead in a network with a large number of users. The fact that mmWave transmission is highly directional is used in this paper to avoid scheduling those users who are likely to have low signal strength, which reduces the number of users who need to feed their channel quality information back to the base station and hence reduces the system overhead. Stochastic geometry is applied to characterize the sum rate and the outage probabilities achieved by the proposed beamforming scheme, where the blockage feature of mmWave propagation is also used to model the user distribution more realistically.
• In a fast time-varying situation, in which the phases and the amplitudes of the users' channel gains change rapidly, a low-feedback transmission scheme is proposed by assuming that only the users' distance information is available to the base station. As a result, the users are ordered according to their path losses, instead of their effective channel gains. The impact of this partial channel state information (CSI) on the performance of the mmWave-NOMA downlink network is investigated.
• A one-bit feedback random beamforming scheme is also proposed in order to further reduce the system overhead. In particular, the base station sets a threshold which is broadcast to the users. Each user feeds one bit back to the base station to indicate its channel quality. The use of one-bit feedback can effectively reduce the amount of feedback, but will cause an ordering ambiguity at the base station. The impact of this ambiguity on the performance of the one-bit feedback transmission scheme is investigated. Furthermore, the effect of the threshold is also characterized, where the obtained analytical results show that a properly designed threshold can ensure that the full diversity gain is achievable by the user selected to be the NOMA strong user.
• The performance for the more challenging scenario in which the base station generates multiple orthonormal beams is also investigated. Compared to the case with a single beam, each user in the scenario with multiple beams suffers more interference, including intra-NOMA-group interference and inter-beam interference. Because mmWave transmission is highly directional, inter-beam interference can be effectively suppressed by scheduling the users whose channel vectors are aligned with the randomly generated beams. Exact expressions for the outage probabilities achieved by the random beamforming scheme and their approximations are developed in order to obtain greater insights.
Fig. 1. A system diagram for the addressed mmWave-NOMA scenario. $\bar{\theta}_n$ denotes a beamforming vector randomly generated by the base station. Only the users that fall into a specific wedge-shaped sector will be scheduled, which is to ensure that the maximal angle difference between a scheduled user's channel vector and its associated beam is $\Delta$.
II. SYSTEM MODEL
Consider a mmWave-NOMA downlink transmission scenario with one base station communicating with multiple users, as shown in Fig. 1. The base station is equipped with $M$ antennas and each user has a single antenna. Denote the disk covered by the base station by $D$. Assume that the base station is located at the origin of $D$ and denote the radius of the disk by $R_D$. Assume that users are randomly deployed in the disk following a homogeneous Poisson point process (HPPP) with density $\lambda$ [19]. Therefore, the number of users in the disk is Poisson distributed, i.e., $P(K \text{ users in } D) = \frac{\mu^K e^{-\mu}}{K!}$, where $\mu = \pi R_D^2 \lambda$. As discussed in [12] and [13], the mmWave channel model is quite different from those of conventional lower frequency cellular networks; in particular, the mmWave-based channel vector from the base station to user $k$ can be expressed as follows: where $L$ is the number of multi-paths, $d_k$ denotes the distance between the transceivers, $\alpha_{NLOS}$ and $\alpha_{LOS}$ denote the path loss exponents for the non-line-of-sight (NLOS) and line-of-sight (LOS) paths, respectively, $a_{k,l}$ denotes the complex gain for the $l$-th path and is complex Gaussian distributed, i.e., $a_{k,l} \sim \mathcal{CN}(0, 1)$, and $\theta_k^l$ denotes the normalized direction of the $l$-th path. We assume that the channel gains are independent from path to path. For notational simplicity, the normalized direction of a path is treated the same as its physical angle of departure; as shown in Section III-A, this simplification has no impact on the performance of the proposed mmWave-NOMA scheme.
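The user deployment described above can be sketched in a few lines. The following is a minimal illustration, not the authors' simulation code: the function names, the numerical values of the radius and density, and the exponential-arrival Poisson sampler are all assumptions chosen for the example.

```python
import math
import random


def poisson_pmf(k: int, mu: float) -> float:
    """P(K = k) for a Poisson variable with mean mu."""
    return mu ** k * math.exp(-mu) / math.factorial(k)


def sample_poisson(mu: float, rng: random.Random) -> int:
    """Draw a Poisson(mu) count by counting unit-rate exponential arrivals in [0, mu]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mu:
        count += 1
        t += rng.expovariate(1.0)
    return count


def sample_users_in_disk(radius: float, density: float, rng: random.Random):
    """Return (distance, angle) pairs for one HPPP realization in a disk."""
    mu = math.pi * radius ** 2 * density  # mean number of users in the disk
    k = sample_poisson(mu, rng)
    # Uniform over the area: the distance is radius * sqrt(U), the angle is uniform.
    return [(radius * math.sqrt(rng.random()), rng.uniform(-math.pi, math.pi))
            for _ in range(k)]


rng = random.Random(0)
users = sample_users_in_disk(radius=10.0, density=0.1, rng=rng)
print(len(users), "users drawn; mean is", math.pi * 10.0 ** 2 * 0.1)
```

Drawing the distance as the radius times the square root of a uniform variable is what makes the points uniform over the area rather than clustered near the origin.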
As discussed in [13] and [20], in mmWave communications, the effect of LOS links is dominant, compared to those of NLOS links, e.g., the gain of an LOS link can be 20 dB stronger than those of NLOS links. Therefore the first factor at the right-hand side of (1) is dominant, which yields the following simplified channel model: where the subscripts 0 and LOS have been omitted to simplify the notation. In practice, the direct path between the mmWave transceivers might be blocked by obstacles, which means that an LOS path does not always exist. As a result, in addition to path loss and fading attenuation, mmWave transmission also suffers potential blockages, which is an important feature to be captured. A simple way to model these blockages is to assume the existence of an LOS path if the distance between the transceivers is smaller than a threshold [14] and [21]. Alternatively, a more sophisticated way to model the probability of there being an LOS path for mm-Wave transmission has been introduced in [15], [22] as follows: where φ is determined by the building density, the shape of the buildings, etc. In this paper, we will use (4) for modelling blockages in the addressed mmWave communication scenario. It is important to point out that these blockages will thin the node distribution, which will be discussed in detail in the next section.
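The thinning effect of blockages can be illustrated numerically. The sketch below assumes an exponential LOS probability of the form $e^{-\phi d}$; this form is an assumption made for the example (it is consistent with the factor $e^{-\phi r}$ that appears in the thinned user density used later in Lemma 1), and the function names and parameter values are hypothetical.

```python
import math
import random


def los_probability(distance: float, phi: float) -> float:
    """Assumed blockage model: probability that a link of this length is LOS."""
    return math.exp(-phi * distance)


def thin_by_blockage(distances, phi: float, rng: random.Random):
    """Keep each user independently with its LOS probability (independent thinning)."""
    return [d for d in distances if rng.random() < los_probability(d, phi)]


rng = random.Random(1)
# Distances of users placed uniformly at random in a disk of radius 10.
distances = [10.0 * math.sqrt(rng.random()) for _ in range(10000)]
kept = thin_by_blockage(distances, phi=0.2, rng=rng)
print(f"kept {len(kept)} of {len(distances)} users")
print("mean distance before thinning:", sum(distances) / len(distances))
print("mean distance after thinning: ", sum(kept) / len(kept))
```

Because distant users are less likely to keep an LOS path, the surviving users are on average closer to the base station than the original population.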

III. RANDOM BEAMFORMING: A SINGLE-BEAM CASE
Many existing precoding and beamforming schemes for NOMA require that the base station has access to the users' CSI. These approaches can consume a substantial amount of system overhead, if there are many users in the system. In order to reduce the system overhead, we consider the application of random beamforming to mmWave-NOMA communication scenarios.

A. The Application of Random Beamforming to NOMA
In this section, we focus on the case in which a single beam, denoted by $p$, is generated at the base station. Note that in the context of mmWave communications, analog precoding is preferable compared to digital precoding since the amplitude of a signal is kept constant and only the phase is changed. Therefore, following [13] and [23], we use the following choice for beamforming: where $\bar{\theta}$ is uniformly distributed between $-1$ and $1$. This choice of precoding is analog precoding since it alters the signal phase only, and keeps the signal modulus constant. It is worth pointing out that this beamformer is also a special case of the hybrid precoding design in [24] with one radio frequency chain and $M$ antennas. One straightforward solution for user scheduling is to ask each user to feed its effective channel gain $|h_j^H p|^2$ back to the base station, and then the base station schedules the user with the strongest channel. However, such an approach will still consume considerable system overhead, particularly if there are many users in the cell.
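A small numerical sketch may help make the constant-modulus property concrete. It assumes that the beam is a uniform-linear-array steering vector with entries $e^{-j\pi m\bar{\theta}}/\sqrt{M}$; this specific form, like the function names below, is an assumption of the example rather than something stated explicitly in the text above.

```python
import cmath
import math
import random


def steering_vector(theta: float, num_antennas: int):
    """Assumed ULA steering vector for a normalized direction theta in [-1, 1]."""
    scale = 1.0 / math.sqrt(num_antennas)
    return [scale * cmath.exp(-1j * math.pi * m * theta) for m in range(num_antennas)]


def random_beam(num_antennas: int, rng: random.Random):
    """Draw the beam direction uniformly in [-1, 1]; return (direction, beam)."""
    theta_bar = rng.uniform(-1.0, 1.0)
    return theta_bar, steering_vector(theta_bar, num_antennas)


rng = random.Random(2)
theta_bar, p = random_beam(num_antennas=8, rng=rng)
moduli = [abs(x) for x in p]
norm_sq = sum(abs(x) ** 2 for x in p)
print("beam direction:", theta_bar)
print("entry moduli:", moduli)      # all equal: only the phases differ
print("squared norm:", norm_sq)
```

Every entry has the same modulus, so such a beam could be realized with phase shifters alone, which is the sense in which the precoder is analog.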
In the context of mmWave communications, a useful observation is that many users do not have to participate in the competition for access to the channel, as explained in the following. Without loss of generality, user $j$ is randomly chosen to be served on beam $p$. The effective channel gain of this user on the randomly generated beam, $|h_j^H p|^2$, can be written as follows: Following steps similar to those in [13], this effective channel gain can be rewritten as follows: where $F_M(x)$ denotes the Fejér kernel. Note that the Fejér kernel goes to zero quickly as its argument increases, i.e., $F_M(x) \to 0$ for increasing $x$. This means that a user can have a large effective channel gain on beam $p$ if this user's channel vector is aligned with the direction of the beam. Following this observation, we will schedule only the users who are located in the wedge-shaped sector served by the beam, as highlighted in Fig. 1. Particularly, this sector is denoted by $D_\theta$, and its central angle is $2\Delta$, which means that the maximal angle difference between a scheduled user's channel vector and the beam is $\Delta$, and $\Delta \to 0$ is required to ensure a large effective channel gain.
Note that, when $\Delta \to 0$, the use of the normalized direction of a path to replace its physical angle of departure has no impact on the performance of the proposed scheme, as illustrated in the following. Recall that the normalized direction $\theta$ is a function of the physical angle of departure, denoted by $\phi_\theta$, i.e., $\theta = \frac{2d\sin(\phi_\theta)}{\lambda}$, where $\lambda$ and $d$ are the carrier wavelength and the antenna separation distance, respectively. If $\Delta \to 0$, we have $|\bar{\theta} - \theta_j| \to 0$, and hence $|\phi_{\bar{\theta}} - \phi_{\theta_j}| \to 0$, which means that the two physical angles are very similar if the two normalized directions are similar. Furthermore, as $\Delta \to 0$, the application of Taylor series leads to $\theta_j - \bar{\theta} \approx \frac{2d\cos(\phi_{\bar{\theta}})}{\lambda}(\phi_{\theta_j} - \phi_{\bar{\theta}})$, and so our analytical results based on the normalized directions can be extended to the case with the physical angles in a straightforward manner.
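The decay of the Fejér kernel away from the beam direction can be checked numerically. The sketch below assumes the common definition $F_M(x) = \sin^2(Mx/2) / (M \sin^2(x/2))$ and the steering-vector form with entries $e^{-j\pi m\theta}/\sqrt{M}$; both are assumptions of this example, as are the function names.

```python
import cmath
import math


def fejer_kernel(x: float, m: int) -> float:
    """F_M(x) = sin^2(M x / 2) / (M sin^2(x / 2)), with the limit M where sin(x/2) = 0."""
    s = math.sin(x / 2.0)
    if abs(s) < 1e-12:
        return float(m)
    return math.sin(m * x / 2.0) ** 2 / (m * s ** 2)


def beam_gain(theta_user: float, theta_beam: float, m: int) -> float:
    """M |a(theta_beam)^H a(theta_user)|^2 computed directly from the array sum."""
    total = sum(cmath.exp(1j * math.pi * k * (theta_beam - theta_user)) for k in range(m))
    return abs(total) ** 2 / m


M = 16
for offset in (0.0, 0.02, 0.05, 0.125, 0.3):
    # Columns: direction offset, kernel value, direct array computation.
    print(offset, fejer_kernel(math.pi * offset, M), beam_gain(0.1 + offset, 0.1, M))
```

The two columns agree, and the gain peaks at $M$ for a perfectly aligned user and falls off within an offset of roughly $2/M$, which is why only users inside a narrow sector around the beam are worth scheduling.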

B. The Implementation of NOMA
Suppose that there are K users in the sector, D θ , and these users are ordered according to their effective channel gains as follows: Similarly to [6] and [16], we consider the case in which two users will be selected for the implementation of NOMA. Note that the implementation of NOMA in long term evolution advanced (LTE-A) is also based on the two-user case [25].
Since the aim of this paper is to study the impact of NOMA on mmWave communications, without loss of generality, we assume that user $i$ and user $j$ are paired together for NOMA transmission on a randomly generated beam. Note that $i$ and $j$ can be chosen arbitrarily, constrained by $1 \leq i < j \leq K$. As a result, the performance of mmWave-NOMA with different scheduled users can be investigated, and the insights obtained from the performance analysis can offer guidelines for the design of practical user scheduling algorithms. Therefore, the signal sent by the base station is given by where $\beta_i$ denotes the power allocation coefficient. Since $|h_i^H p|^2 < |h_j^H p|^2$, the application of NOMA means $\beta_i \geq \beta_j$, where $\beta_i^2 + \beta_j^2 = 1$. Therefore, user $i$ will receive the following observation: where $n_i$ denotes additive Gaussian noise. User $i$ will treat its partner's message as noise and directly decode its information with the following signal-to-interference-plus-noise ratio (SINR): where $\rho$ denotes the transmit signal-to-noise ratio (SNR). As a result, the outage probability for user $i$ to decode its information is given by which is conditioned on the number of users in $D_\theta$, where
User $j$ first tries to decode its partner's message with the following SINR: If $\mathrm{SINR}_{i \to j} \geq \epsilon_i$, the user can decode its own message with the following SNR: after removing its partner's information, a procedure known as SIC. Therefore, the outage probability experienced by user $j$ can be expressed as follows: which is again conditioned on $K$.
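The decoding order described above can be made concrete with a short sketch. It uses the standard two-user power-domain NOMA expressions with unit noise power; the exact formulas, the rate threshold $\epsilon = 2^{R} - 1$, the function name and the numerical values are assumptions of this illustration, not a transcription of the paper's equations.

```python
def noma_two_user(gain_i, gain_j, beta_i_sq, rho, rate_i, rate_j):
    """Return (sinr_i, sinr_i_at_j, snr_j, outage_i, outage_j) for one channel draw.

    gain_i, gain_j: effective gains |h^H p|^2 of the weak and the strong user.
    beta_i_sq: power fraction of the weak user; the strong user gets 1 - beta_i_sq.
    """
    beta_j_sq = 1.0 - beta_i_sq
    eps_i = 2.0 ** rate_i - 1.0  # assumed SINR threshold for target rate R_i
    eps_j = 2.0 ** rate_j - 1.0
    # Weak user: decodes its own message, treating the partner's signal as noise.
    sinr_i = gain_i * beta_i_sq / (gain_i * beta_j_sq + 1.0 / rho)
    # Strong user, SIC stage 1: decode the weak user's message first.
    sinr_i_at_j = gain_j * beta_i_sq / (gain_j * beta_j_sq + 1.0 / rho)
    # Strong user, SIC stage 2: decode its own message free of interference.
    snr_j = rho * beta_j_sq * gain_j
    outage_i = sinr_i < eps_i
    outage_j = (sinr_i_at_j < eps_i) or (snr_j < eps_j)
    return sinr_i, sinr_i_at_j, snr_j, outage_i, outage_j


result = noma_two_user(gain_i=0.05, gain_j=2.0, beta_i_sq=0.75, rho=100.0,
                       rate_i=1.0, rate_j=2.0)
print(result)
```

Note that the strong user is in outage if either SIC stage fails, so its outage event depends on both users' target rates.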
As a result, the outage sum rate achieved by the mmWave-NOMA transmission scheme can be expressed as follows: and the sum rate achieved by mmWave-OMA can be expressed similarly as follows: where $P^{n|K}_{OMA}$ denotes the conditional outage probability when OMA is used. The reason for using the OMA mode in (15) is that it is possible to have a single user in $D_\theta$. In this case, NOMA cannot be implemented and we simply use OMA, i.e., $P^{n|K}_{OMA} = P\left(\log(1 + \rho|h_n^H p|^2) < 2R_n\right)$, for $n \in \{i, j\}$.$^1$

C. Characterization of the Sum Rate and Outage Probabilities
In order to evaluate the sum rate shown in (15), it is important to find expressions for the outage probabilities, $P^o_{j|K}$ and $P^o_{i|K}$; these are related to the probability density function (pdf) of the ordered channel gain, $|h_j^H p|^2$, which is provided in the following lemma.
Lemma 1. Suppose that there are $K$ users in $D_\theta$. The pdf of the ordered channel gain, $|h_j^H p|^2$, is given by
where
$\times \frac{\lambda\phi^2 e^{-\phi r}}{2\Delta\lambda\gamma(2, R_D\phi)}\, r\, dr\, d\theta,$
and $\gamma(\cdot)$ denotes the incomplete gamma function.
$^1$ One can also use $P^{1|K}_{OMA} = P\left(\log(1 + \rho|h_1^H p|^2) < R_1\right)$ for the case $K = 1$, which will make the notation in (15) and (16) more complicated. It is worth pointing out that the probability of having $K = 1$ is very small and different designs for this trivial case do not cause much difference to the overall sum rate.
Proof. The density function in the lemma can be evaluated by first characterizing the unordered channel gains and then applying the theory of order statistics.
First we focus on an unordered channel gain, denoted by $|h_{\pi(j)}^H p|^2$. Denote the location of this node by $x_{\pi(j)}$, where its probability distribution and pdf are denoted by $P_{X_{\pi(j)}}$ and $p_{X_{\pi(j)}}$, respectively. In this case we can find the cumulative distribution function (CDF) of the unordered channel gain as follows: where $r(x)$ denotes the distance from the origin to point $x$.
Note that the conditioning on K has been omitted since it does not affect the CDF.
It is important to note that the nodes participating in NOMA no longer follow the original HPPP with parameter λ, because of potential blockages. Particularly, with the blockage model in (4), it is less likely for a user far away from the base station to have an LOS path. Therefore, following the discussions in [26], the effect of blockages is to thin the original homogeneous point process and this thinning process yields another PPP with the following intensity: Therefore, the mean measure for this new PPP, denoted by µ Φ2 (D θ ), can be obtained as follows: As a result, after considering potential blockages, the probability of having K users in the sector, D θ , can be obtained as follows: Since the intensity and the mean measure of the new PPP are known, the pdf of x π(j) can be written as follows: Accordingly, the CDF of the unordered channel gain can be written as follows: and by using polar coordinates, the expression for F π(j) (y) in the lemma can be obtained. After using the assumption that all the channel gains are independent and identically distributed and also applying the theory of order statistics [27], the proof is complete.
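The mean measure of the thinned process can be cross-checked numerically. The sketch below assumes a thinned intensity of $\lambda e^{-\phi r}$, which matches the factor $\lambda\phi^2 e^{-\phi r} / (2\Delta\lambda\gamma(2, R_D\phi))$ in Lemma 1; under that assumption the mean measure of the sector is $2\Delta\lambda\phi^{-2}\gamma(2, \phi R_D)$. The function names and parameter values are hypothetical.

```python
import math


def lower_gamma_2(x: float) -> float:
    """Lower incomplete gamma function gamma(2, x) = 1 - (1 + x) exp(-x)."""
    return 1.0 - (1.0 + x) * math.exp(-x)


def mean_measure_closed_form(delta, lam, phi, radius):
    """Assumed closed form: 2 * Delta * lambda * phi^-2 * gamma(2, phi * R)."""
    return 2.0 * delta * lam * lower_gamma_2(phi * radius) / phi ** 2


def mean_measure_numeric(delta, lam, phi, radius, steps=100000):
    """Midpoint-rule integral of lambda * exp(-phi r) * r over the sector."""
    h = radius / steps
    radial = sum(math.exp(-phi * (k + 0.5) * h) * (k + 0.5) * h for k in range(steps)) * h
    return 2.0 * delta * lam * radial


args = dict(delta=0.1, lam=1.0, phi=0.1, radius=10.0)
print(mean_measure_closed_form(**args), mean_measure_numeric(**args))
```

Dividing the thinned intensity by this mean measure gives a location density that integrates to one over the sector, which is the normalization used for the pdf of an unordered user's location.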
By applying the above lemma and also some algebraic manipulations, P o j|K and P o i|K can be obtained in the following corollary.
Corollary 1. By using the proposed mmWave-NOMA transmission scheme, the outage probability experienced by user $j$ conditioned on $K$ is given by The conditional outage probability for user $i$ is given by
By using the above corollary and substituting (21), (23) and (24) into (15) and (16), the sum rates achieved by mmWave-NOMA and mmWave-OMA can be calculated.

D. Asymptotic Performance Analysis
The obtained results shown in (23) and (24) are quite complicated, since they involve the calculation of double integrals. In order to obtain some insight, we will obtain approximations to these expressions. Particularly, our asymptotic studies are carried out by using the following two assumptions. One is that the central angle of the sector, 2∆, is small, i.e., ∆ → 0, and the other is the high SNR assumption. The use of these two assumptions leads to the following lemma.
Lemma 2. When ∆ → 0 and at high SNR, the conditional outage probabilities P o i|K and P o j|K can be approximated as follows: where k ∈ {i, j}. The diversity gain available at user k is k.
Proof. In order to use the assumption $\Delta \to 0$, recall that the Fejér kernel can be written as follows: Note that $|\bar{\theta} - \theta| \leq \Delta$. When $\Delta$ is small, the Fejér kernel can be approximated as follows: where the first approximation follows from $\sin(x) \approx x$ for $x \to 0$, and the second approximation is due to the following two facts: Therefore, the CDF of an unordered channel gain can be approximated as follows: After applying the assumption that $\Delta \to 0$, we will further apply the high SNR approximation. Note that at high SNR, both $\eta_i$ and $\eta_j$ go to zero, which means After some algebraic manipulations, the CDF for an unordered channel gain can be approximated as follows: By using the above approximations, the outage probability at user $i$ can be approximated as follows: which means that the diversity gain at user $i$ is $i$. The results for user $j$ can be obtained similarly, and the proof is complete.
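The link between user ordering and diversity gain can be illustrated with generic order statistics. The sketch below assumes only that the unordered gains are i.i.d. and that the unordered CDF scales like $1/\rho$ at high SNR, as in the proof above; it then measures how fast the CDF of the $j$-th smallest gain decays. The function names and the probe values are hypothetical.

```python
import math


def order_stat_cdf(base_cdf: float, j: int, k: int) -> float:
    """CDF of the j-th smallest of k i.i.d. variables, given F(y) = base_cdf."""
    return sum(math.comb(k, m) * base_cdf ** m * (1.0 - base_cdf) ** (k - m)
               for m in range(j, k + 1))


def diversity_slope(j: int, k: int, f_small=1e-3, f_smaller=1e-4) -> float:
    """Log-log slope of the ordered CDF versus the unordered CDF near zero."""
    hi = order_stat_cdf(f_small, j, k)
    lo = order_stat_cdf(f_smaller, j, k)
    return math.log(hi / lo) / math.log(f_small / f_smaller)


K = 5
for j in range(1, K + 1):
    print(j, round(diversity_slope(j, K), 3))
```

The slope for the $j$-th weakest user is close to $j$, which is the order-statistics mechanism behind the statement that the diversity gain available at user $k$ is $k$.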
Remark: Note that an implication of having a small ∆ is that the area of the sector becomes so small that there might be no user in it. But in many practical scenarios, such as in a sport stadium or a conference hall, the users are so densely deployed that it is always possible to find multiple users located in a sector even with a small ∆.

IV. RANDOM BEAMFORMING WITH LIMITED FEEDBACK
In the previous section, it is assumed that the base station has perfect knowledge of the users' effective channel gains. However, for a fast time varying situation, this assumption might not be realistic, since the phases of the channel vectors and their fading coefficients, θ k and a k , are changing rapidly. In this section, we investigate two random beamforming schemes with low system overhead.

A. With the Distance Information Available at the Base Station
Compared to the phases and fading coefficients of the channels, the users' distance information will change relatively slowly, which means that it is more realistic for the base station to have access to the users' distance information only. Therefore, in this subsection, we investigate the impact of this partial CSI on the performance of mmWave-NOMA.
Again assume that only the users that fall into the sector D θ will participate in the NOMA transmission. Assume that there are K users in this sector. Since the users' distances are known, the base station will order the users according to the following criterion: instead of using the effective channel gains which are not known to the base station. Similarly to the previous section, we schedule user i and user j for the NOMA transmission to act as the weak and strong users, respectively. Since a user with a shorter distance has a stronger channel condition, we take i > j.
Note that the density functions of the ordered distances have been found in [30] when the users are distributed randomly in a ball. The shape of the addressed area is a sector, but the steps provided in [30] are still applicable, as shown in the following. Particularly, the CDF of $d_k$ can be calculated from the probability of the event that there are fewer than $k$ users inside a sector with radius $r$, i.e., where $A(r)$ denotes a sector with radius $r$, and $E_i$ denotes the event that there are $i$ users in $A(r)$. Following steps similar to those for obtaining (20), the factor $\mu_{\Phi_2}(A(r))$ can be found as follows: Substituting the expression for $\mu_{\Phi_2}(A(r))$ into the CDF expression, the CDF of $d_k$ can be expressed as follows: As a result, the corresponding pdf for the $k$-th smallest distance can be found as follows: where we have used the fact that $\frac{d\gamma(2, r\phi)}{dr} = e^{-r\phi} r \phi^2$.
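The construction of the distance distribution can be sketched as follows. The code assumes the thinned intensity $\lambda e^{-\phi r}$, so that the mean number of users within distance $r$ in the sector is $2\Delta\lambda\phi^{-2}\gamma(2, \phi r)$, and it obtains the CDF of the $k$-th nearest distance from the Poisson probability of having fewer than $k$ users within $r$. The function names and parameter values are hypothetical.

```python
import math


def sector_mean_measure(r, delta, lam, phi):
    """Mean number of LOS users within distance r inside the sector."""
    gamma_2 = 1.0 - (1.0 + phi * r) * math.exp(-phi * r)  # gamma(2, phi r)
    return 2.0 * delta * lam * gamma_2 / phi ** 2


def kth_distance_cdf(r, k, delta, lam, phi):
    """P(d_k <= r) = 1 - P(fewer than k users lie within distance r)."""
    mu = sector_mean_measure(r, delta, lam, phi)
    return 1.0 - sum(math.exp(-mu) * mu ** i / math.factorial(i) for i in range(k))


def kth_distance_pdf(r, k, delta, lam, phi):
    """Derivative of the CDF, using d gamma(2, r phi)/dr = exp(-r phi) r phi^2."""
    mu = sector_mean_measure(r, delta, lam, phi)
    dmu_dr = 2.0 * delta * lam * r * math.exp(-phi * r)
    return dmu_dr * math.exp(-mu) * mu ** (k - 1) / math.factorial(k - 1)


params = dict(delta=0.2, lam=2.0, phi=0.1)
print(kth_distance_cdf(3.0, 2, **params), kth_distance_pdf(3.0, 2, **params))
```

When the CDF is differentiated, the Poisson terms telescope and only the term of order $k - 1$ survives, which is why the pdf has a single-term form.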
The difference between the above pdf expression and the one in [30] is due to the facts that the area for the addressed problem is not a ball and the addressed density is a function of r.
On the other hand, note that the angle of user $k$'s channel vector is independent of its distance, and it is uniformly distributed between $(\bar{\theta} - \Delta)$ and $(\bar{\theta} + \Delta)$. Therefore the CDF of user $k$'s channel gain can be obtained as follows: It is important to point out that the above CDF is valid only if we can find the $k$-th nearest node. In other words, if there is no boundary to $D_\theta$ and the nodes are spread throughout the plane, the above CDF can be applied. For the addressed scenario, the users are confined in $D_\theta$, i.e., $r \leq R_D$, which means that it is possible that the $k$-th nearest node does not exist, i.e., there are fewer than $k$ nodes in $D_\theta$. By using the result in (37) and also considering the possible choices for the number of users in $D_\theta$, we can obtain the following lemma for the outage probability and the sum rate.
Lemma 3. When only the users' distance information is available, the outage probability for the k-th nearest node can be written as follows: where k ∈ {i, j}. Moreover, the outage sum rate can be shown as follows: where the k-th nearest user has a targeted data rate of R k .
Remark 1: It is important to point out that the sum rate in (39) means that no transmission will take place if the i-th nearest user cannot be found in D θ , and the NOMA transmission is adopted even if the j-th nearest user can be found but the i-th one cannot. Note that other transmission strategies can also be used for these trivial cases which happen with low probabilities in a densely deployed network.
Remark 2: Note that one can also use a CDF expression conditioned on K to find the outage probability, but this is difficult to evaluate since the conditioning on K converts the Poisson point process to a Bernoulli one to which the result in (36) is not applicable.
Asymptotic performance analysis: While the expressions for the outage probability and the sum rate in Lemma 3 can be calculated numerically, approximations are still desirable in order to obtain greater insight. Following steps similar to those in the previous section, i.e., when ∆ approaches zero, the Fejér kernel can be simplified, which yields the following approximation: Furthermore notice that both η i and η j approach zero at high SNR, which yields the following approximation: and F j (η j ) can be obtained similarly. The integral over θ can be obtained by following steps similar to those in the previous section. In addition, define the integral over r, a factor not related to the transmit SNR, as follows: Therefore the outage probability can be approximated as follows: which demonstrates that the use of distance information only yields a diversity gain of one for all the users. This is expected since the base station has access to partial CSI only and the dynamics of the fading gains cannot be used.

B. With One-Bit Feedback
In the case in which the number of users in the sector D θ is very large, feeding these users' effective channel gains or distances back to the base station can still be very demanding. As an alternative, asking each user to feed only one bit about its channel quality back to the base station can substantially reduce the system overhead.
In particular, the base station will first set a threshold, ξ, ξ > 0, which will be broadcast to all the users prior to the downlink transmission. Each user in the sector will compare its effective channel gain with ξ and send 1 to the base station if its channel gain is larger than ξ, otherwise it will send 0 to the base station. As a result, the users in the sector will be divided into two sets, denoted by S 1 and S 2 , respectively.
Particularly, the users in $S_2$ are the ones which feed 1 back to the base station, i.e., and $S_1$ is defined similarly by grouping those users whose feedback is 0. When there is more than one user in $D_\theta$, i.e., $K \geq 2$, and $|S_n| \neq 0$, $n \in \{1, 2\}$, the base station will randomly select one user from $S_1$ to be paired with another user randomly selected from $S_2$. If all the $K$ nodes are in one group, the base station will randomly select two users from this group for the implementation of NOMA. If there is only one user in the sector, i.e., $K = 1$, this user will be served solely by the base station. No user will be served if both sets are empty, which happens only if $K = 0$. In the following, we will focus on the case with $K \geq 2$.
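The pairing protocol just described can be sketched as a small function. The function name, the convention of returning `None` for an unfilled role, and the treatment of a gain exactly equal to the threshold (reported as 0) are assumptions of this illustration.

```python
import random


def one_bit_pairing(gains, threshold, rng: random.Random):
    """Return (weak_index, strong_index); an entry is None when nobody fills the role."""
    s1 = [k for k, g in enumerate(gains) if g <= threshold]  # users feeding back 0
    s2 = [k for k, g in enumerate(gains) if g > threshold]   # users feeding back 1
    if len(gains) == 0:
        return None, None          # no user in the sector, nobody is served
    if len(gains) == 1:
        return None, 0             # a single user is served alone
    if s1 and s2:
        return rng.choice(s1), rng.choice(s2)
    # All users reported the same bit: pick two distinct users at random.
    # The base station cannot tell which of the two is actually stronger.
    weak, strong = rng.sample(s1 or s2, 2)
    return weak, strong


rng = random.Random(3)
print(one_bit_pairing([0.1, 0.2, 3.0, 4.0], threshold=1.0, rng=rng))
print(one_bit_pairing([2.0, 3.0, 4.0], threshold=1.0, rng=rng))
```

The last branch is where the ordering ambiguity arises: when every user reports the same bit, the roles of weak and strong user are assigned without knowledge of the actual channel ordering.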
The following lemma provides the outage probabilities for the users selected to act as the NOMA strong and weak users, respectively.
Lemma 4. Suppose that there are $K \geq 2$ users in $D_\theta$. When each user only feeds one bit back to the base station using the above protocol, the outage probability for the user selected to act as the weak user is given by and the outage probability for the user selected to act as the strong user is given by where $P(|S_2| = n) = \binom{K}{n} F_{\pi(j)}(\xi)^{K-n} \left(1 - F_{\pi(j)}(\xi)\right)^n$, $P(|S_1| = n) = P(|S_2| = K - n)$, $F_{S_1|K}(y) = \frac{F_{\pi(j)}(\min\{y, \xi\})}{F_{\pi(j)}(\xi)}$, $F_{S_2|K}(y) = \max\left\{0, \frac{F_{\pi(j)}(y) - F_{\pi(j)}(\xi)}{1 - F_{\pi(j)}(\xi)}\right\}$, and $\bar{R}_1$ and $\bar{R}_2$ denote the targeted rates for the users selected to act as the weak and strong users, respectively.
Proof. Since there are $K \geq 2$ users in the sector, the outage probability experienced by the user chosen to act as the strong user in NOMA can be expressed as follows: where $F_{S_2|K}(\cdot)$ denotes the CDF of the effective channel gain of a user randomly selected from $S_2$ and its expression will be evaluated later. The probability that $S_2$ is not empty is equal to $\sum_{n=1}^{K} P(|S_2| = n)$. Note that $F_{S_1|K}(\bar{\eta}_2)$, the CDF of the weak user's channel, is used for the case of $|S_2| = 0$ since the base station will select one user randomly from $S_1$ to act as the strong user with the targeted rate of $\bar{R}_2$ for the NOMA transmission. Similarly, the outage probability experienced by the user selected to act as the weak user can be expressed as follows:
$P^o_{S_1} = F_{S_1|K}(\bar{\eta}_1) \sum_{n=1}^{K} P(|S_1| = n) + F_{S_2|K}(\bar{\eta}_1) P(|S_1| = 0)$, (47)
where the sum is the probability that $S_1$ is not empty and the variables are defined similarly to their counterparts in (46). Given that there are $K$ users in the sector, the probability for the case of $|S_2| = n$ can be obtained as shown in the lemma, which is due to the fact that all the users' channels are independent and identically distributed.
The CDF of the effective channel gain of a user randomly selected from S 1 can be expressed as follows: Following steps similar to those in Section III, the addressed CDF can be obtained as follows: for ξ > 0.
On the other hand, the CDF of the effective channel gain of a user randomly selected from S 2 can be expressed as follows: if y > ξ, otherwise F S2|K (y) = 0. Again following steps similar to those in Section III, this CDF can be found as follows: if ξ < y, otherwise F S2|K (y) = 0. Substituting (49) and (51) into (47) and (46), the outage probabilities in the lemma can be obtained and the proof is complete.
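The quantities appearing in the lemma and its proof can be checked with a short sketch. It assumes that the conditional CDFs are the usual truncated forms, i.e., $F(\min\{y, \xi\})/F(\xi)$ for a user from $S_1$ and $\max\{0, (F(y) - F(\xi))/(1 - F(\xi))\}$ for a user from $S_2$, and it uses an exponential CDF purely as a hypothetical stand-in for $F_{\pi(j)}$.

```python
import math


def prob_s2_size(n, k, f_xi):
    """P(|S2| = n): n of k i.i.d. users exceed the threshold, where f_xi = F(xi)."""
    return math.comb(k, n) * f_xi ** (k - n) * (1.0 - f_xi) ** n


def cdf_s1(y, xi, base_cdf):
    """CDF of the gain of a user drawn from S1 (gain at most xi)."""
    return base_cdf(min(y, xi)) / base_cdf(xi)


def cdf_s2(y, xi, base_cdf):
    """CDF of the gain of a user drawn from S2 (gain above xi)."""
    return max(0.0, (base_cdf(y) - base_cdf(xi)) / (1.0 - base_cdf(xi)))


def exp_cdf(y):
    """Hypothetical stand-in for the unordered gain CDF."""
    return 1.0 - math.exp(-y)


xi = 0.5
for y in (0.1, 0.5, 0.9, 3.0):
    # Law of total probability: mixing the two conditional CDFs recovers F(y).
    mix = exp_cdf(xi) * cdf_s1(y, xi, exp_cdf) + (1.0 - exp_cdf(xi)) * cdf_s2(y, xi, exp_cdf)
    print(y, mix, exp_cdf(y))
```

The mixture check is a quick way to confirm that the two truncated CDFs are consistent with the unordered CDF they were derived from.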
By using the outage probabilities obtained in the above lemma, one can easily find an expression for the outage sum rate, which is omitted here due to space limitations.
Obviously the choice of ξ will have a significant impact on the performance of the addressed one-bit feedback scenario. To investigate this impact, we will first study the impact of ξ on the CDFs, F S1|K (η 1 ) and F S2|K (η 2 ).
1) The impact of the threshold on F S k |K (η k ): Becauseη 1 approaches zero at high SNR, min{η 1 , ξ} will also approach zero at high SNR, with a rate of decaying no smaller thanη 1 . Note that the outage probability of the user selected to act as the NOMA weak user is related to F S1|K (η 1 ), which can be approximated at high SNR as follows: In this paper, we are interested in the following two choices of ξ.
• If ξ is a constant and not a function of the transmit SNR, ρ, the following holds at high SNR: (53) • If ξ decreases at a rate of 1 ρ x , i.e., ξ∼ 1 ρ x , x > 0, we have the following approximation: On the other hand, the impact of ξ on F S2|K (η 2 ) can be demonstrated as follows.
• If ξ∼ 1 ρ x , x > 0, we have the following approximation: 2) The impact of ξ on the users' outage probabilities, P o S k : We first focus on the user selected to act as the weak user, whose diversity gain is shown in the following lemma.
Lemma 5. For the two considered choices of the threshold, i.e., either $\xi \sim \frac{1}{\rho^x}$ or $\xi$ is a constant, the diversity order of the user selected to act as the weak user is always one.
Proof. For notational simplicity, let It can be shown that the probability of having $n$ users in group $S_2$ can be approximated as follows: if $\xi$ approaches zero. If $\xi$ is not a function of $\rho$, neither is this probability.
• If $\xi$ is a constant, the outage probability of the user selected to act as the weak user can be simplified as follows: since $F_{S_2|K}(\eta_1) = 0$, $\sum_{n=1}^{K} P(|S_1| = n)$ is a constant and $F_{S_1|K}(\eta_1) \sim \frac{1}{\rho}$ as explained in (53). Therefore, the user's diversity order is one for this choice of $\xi$.
• If $\xi \sim \frac{1}{\rho^x}$, $x > 0$, the outage probability for the user selected to act as the weak user can be approximated as follows: which is always on the order of $\frac{1}{\rho}$, as explained in the following. If $x > 1$, $\min\{\eta_1, \xi\} = \xi$ and $\max\{0, \eta_1 - \xi\} \approx \eta_1$. These two observations lead to the following approximation: since $\eta_1$ is dominant. If $x = 1$, we have the following approximation: since both $\min\{\eta_1, \xi\}$ and $|\eta_1 - \xi|$ are on the order of $\frac{1}{\rho}$. Further, if $0 < x < 1$, we have the following approximation: Therefore, we can conclude that, as long as $\xi \sim \frac{1}{\rho^x}$, $x > 0$, the diversity order of the user selected to act as the weak user is one.
Since the user's diversity order is one for both cases, the proof is complete.
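The three regimes of $x$ used in the proof can be checked numerically. The following is a minimal sketch, assuming $\eta_1 = c/\rho$ with a hypothetical constant $c = 2$ (the text only requires that $\eta_1$ decays on the order of $\frac{1}{\rho}$) and $\xi = \rho^{-x}$; the function name is illustrative only.

```python
def threshold_terms(rho, x, c=2.0):
    """Return (min{eta1, xi}, max{0, eta1 - xi}) for eta1 = c / rho and xi = rho**(-x).

    The constant c is a hypothetical placeholder, not a value from the paper.
    """
    eta1 = c / rho
    xi = rho ** (-x)
    return min(eta1, xi), max(0.0, eta1 - xi)


if __name__ == "__main__":
    rho = 1e6  # a large transmit SNR (linear scale)
    for x in (2.0, 1.0, 0.5):
        smaller, gap = threshold_terms(rho, x)
        print(f"x = {x}: min = {smaller:.3e}, max(0, eta1 - xi) = {gap:.3e}")
```

For $x > 1$ the minimum equals $\xi$ and the gap is close to $\eta_1$; for $x = 1$ both terms scale as $\frac{1}{\rho}$; for $0 < x < 1$ the threshold exceeds $\eta_1$ at high SNR, so the minimum is $\eta_1$ and the gap is zero.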
However, the impact of $\xi$ on $P^o_{S_2}$ is more complicated, as illustrated in the following:
• If $\xi$ is a constant, the diversity order of the user selected to act as the strong user is one, since which is due to the following facts: $F_{S_2|K}(\eta_2) = 0$, $P(|S_2| = 0)$ is a constant and $F_{S_1|K}(\eta_2) \sim \frac{1}{\rho}$ as explained in (53).
• If $\xi \sim \frac{1}{\rho^x}$, $x \geq 1$, the outage probability for the user selected to act as the strong user can be approximated as follows:
As can be seen from (62), the choice of the threshold $\xi$ has a significant impact on the achievable diversity gain. For example, a full diversity gain of $K$ can be obtained by using the following choice of $\xi$:

A. System Model and Outage Performance
Consider a scenario in which the base station forms $N$, $1 < N \leq M$, orthonormal beams, denoted by $\mathbf{p}_m$, $1 \leq m \leq N$, where $\mathbf{p}_m^H \mathbf{p}_m = 1$ and $\mathbf{p}_m^H \mathbf{p}_n = 0$ if $m \neq n$. These beamforming vectors are predefined, and it is assumed that they are known to the base station and the users prior to transmission. Following [13] and [23], these $N$ orthonormal beamforming vectors can be constructed as follows: for $1 \leq m \leq N$, where $\zeta$ denotes a random variable following a uniform distribution between $-1$ and $1$. For notational simplicity, we denote $\zeta + \frac{2(m-1)}{N}$ by $\theta_m$. Again, this beamformer can also be viewed as a special case of the hybrid precoding design in [24], in which the fully-connected architecture is used with $N$ radio frequency chains, $M$ antennas and a digital precoding matrix set as an identity matrix.
Prior to downlink transmission, the base station will first broadcast pilot signals on these $N$ orthogonal beams. Similarly to $D_\theta$, define $D_{\theta_m}$ as the wedge-shaped sector around $\theta_m$ with a central angle of $2\Delta$, as shown in Fig. 1. Only the users that fall into the sector $D_{\theta_m}$ will participate in the NOMA transmission on beam $m$. Denote the number of users in $D_{\theta_m}$ by $K_m$ and the $k$-th user's channel by $\mathbf{h}_{m,k}$, $1 \leq k \leq K_m$. Each user will measure its effective channel gain on its corresponding beam, where user $k$'s effective channel gain on the $m$-th beam is given by $|\mathbf{h}_{m,k}^H \mathbf{p}_m|^2$. Without loss of generality, we assume that the base station schedules user $i$ and user $j$ on beam $m$, to act as the weak and strong users, respectively.
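The beam construction described above can be illustrated with a short sketch. It assumes a uniform linear array steering vector of the form $\mathbf{a}(\theta) = \frac{1}{\sqrt{M}}[1, e^{-j\pi\theta}, \ldots, e^{-j\pi(M-1)\theta}]^T$ and beams $\mathbf{p}_m = \mathbf{a}(\zeta + \frac{2(m-1)}{N})$; both the steering-vector form and the function names are assumptions made for illustration. In this sketch, exact orthogonality is only checked for the case in which $M$ is a multiple of $N$.

```python
import cmath
import math
import random


def steering_vector(theta, M):
    """Unit-norm steering vector a(theta) of an M-antenna array (assumed form)."""
    return [cmath.exp(-1j * math.pi * k * theta) / math.sqrt(M) for k in range(M)]


def random_beams(M, N, rng=random):
    """Build N beams p_m = a(zeta + 2(m-1)/N), with zeta ~ Uniform(-1, 1).

    The loop index m runs over 0..N-1, i.e. it plays the role of (m - 1).
    """
    zeta = rng.uniform(-1.0, 1.0)
    return [steering_vector(zeta + 2.0 * m / N, M) for m in range(N)]


def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


if __name__ == "__main__":
    beams = random_beams(M=8, N=4)
    for m, p_m in enumerate(beams):
        row = [round(abs(inner(p_m, p_n)), 6) for p_n in beams]
        print(m + 1, row)  # magnitudes of the Gram matrix, row by row
```

The common random offset $\zeta$ rotates all beams together, so it does not affect the inner products between them.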
Therefore, the base station will superimpose the two users' messages on each of the $N$ beams as follows: where $\beta_{m,1}^2 + \beta_{m,2}^2 = 1$.
Therefore, user $j$ on beam $m$ will receive the following observation: where $n_{m,j}$ denotes additive Gaussian noise. User $j$ on beam $m$ will first decode the message to user $i$ in the same pair, and then remove this message from its observation. Such SIC needs to be carried out before its own message is decoded. As a result, the SINR for user $j$ on beam $m$ to decode its partner's message can be expressed as follows: Define $R_{m,1}$ as the targeted rate for user $i$ on beam $m$ and $\epsilon_{m,1} = 2^{R_{m,1}} - 1$, where $\epsilon_{m,2}$ and $R_{m,2}$ are defined for user $j$ similarly. If $\mathrm{SINR}_{m,i\to j} \geq \epsilon_{m,1}$, intra-group interference can be cancelled and the user can decode its own information with the following SINR: User $i$ on beam $m$ will decode its own message directly with the following SINR:
Different from the case with one beam, the users' SINRs are functions not only of $|\mathbf{h}_{m,i}^H \mathbf{p}_m|^2$ but also of $|\mathbf{h}_{m,i}^H \mathbf{p}_n|^2$, $n \neq m$. In conventional non-NOMA scenarios, users can be scheduled according to their SINRs, i.e., the user with the strongest SINR on beam $m$ will be selected to be served on this beam. However, in the addressed scenario, one user can have two different SINR functions. For example, user $j$'s performance depends on two different SINR functions, $\mathrm{SINR}_{m,i\to j}$ and $\mathrm{SINR}_{m,j}$.
For the purpose of illustration, we focus on a simple user scheduling scheme based on distances, a strategy similar to the one proposed in Section IV-A. Therefore, we can order the users who will participate in the NOMA transmission on beam $m$ as follows: Furthermore, suppose that user $i$ has a distance larger than that of user $j$, i.e., $i > j$. The outage probability experienced by user $j$ can be expressed as follows: Again applying the mmWave channel model, $\mathrm{SINR}_{m,i\to j}$ can be written as follows: Similarly, $\mathrm{SINR}_{m,j}$ can be expressed as follows: Unlike the SINR functions in the previous sections, the SINRs for the case with multiple beams are more complicated.
An interesting observation is that the three factors in the numerator and denominator of $\mathrm{SINR}_{m,i\to j}$ share the same fading coefficient. In this case, the outage probability of user $j$ on beam $m$ can be expressed as shown in (74).
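The SIC procedure and the resulting outage event for the strong user can be sketched as follows. The SINR forms used here are the standard two-user power-domain NOMA expressions with an additive inter-beam interference term and normalized noise power $\frac{1}{\rho}$; they are an assumption made for illustration, and the function and argument names are hypothetical.

```python
def noma_sinrs(g_i, g_j, beta1_sq, beta2_sq, rho, interf_i=0.0, interf_j=0.0):
    """SINRs for weak user i and strong user j sharing one beam (assumed forms).

    g_i, g_j           -- effective gains |h^H p_m|^2 on the serving beam
    beta1_sq, beta2_sq -- power coefficients of users i and j (sum to one)
    interf_i, interf_j -- summed gains |h^H p_n|^2 over the other beams n != m
    rho                -- transmit SNR (linear scale)
    Returns (SINR_i, SINR_{i->j}, SINR_j).
    """
    noise = 1.0 / rho
    sinr_i = g_i * beta1_sq / (g_i * beta2_sq + interf_i + noise)
    sinr_i_at_j = g_j * beta1_sq / (g_j * beta2_sq + interf_j + noise)
    sinr_j = g_j * beta2_sq / (interf_j + noise)  # after successful SIC
    return sinr_i, sinr_i_at_j, sinr_j


def strong_user_outage(sinr_i_at_j, sinr_j, rate_i, rate_j):
    """User j is in outage if SIC fails or its own message cannot be decoded."""
    eps_i = 2.0 ** rate_i - 1.0
    eps_j = 2.0 ** rate_j - 1.0
    return sinr_i_at_j < eps_i or sinr_j < eps_j


if __name__ == "__main__":
    for interference in (0.0, 1.0):
        _, s_ij, s_j = noma_sinrs(0.1, 1.0, 0.75, 0.25, 1000.0, interf_j=interference)
        print(interference, strong_user_outage(s_ij, s_j, 1.0, 2.0))
```

The two calls differ only in the inter-beam interference seen by user $j$, which shows how interference from other beams alone can cause the SIC step to fail.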

B. Asymptotic Performance Analysis
Without loss of generality, we focus on the first beam, i.e., $m = 1$. In this case, the factor $F^m_{j,n}$ can be written as follows: where $2 \leq n \leq N$. We have the following Taylor series approximation: where the derivatives of $F$ exist for all orders. Assume that the beams are separated with sufficient gaps, so that one can expect $F_M\left(-\frac{2(n-1)\pi}{N}\right) \to 0$, for $2 \leq n \leq N$. Further assuming $\Delta \to 0$, $(\theta_n - \theta_{1,j})$ approaches zero. Therefore, the sum of the interference terms in the SINR expressions can be approximated as follows: where For the case $n = 1$, we have which is obtained from (27). As a result, at high SNR, the outage probability experienced by user $i$ can be expressed as follows: where $c_4 = \beta_{1,1}^2 - \epsilon_{1,1}\beta_{1,2}^2$. The outage probability for user $j$ can be obtained similarly. As a result, following steps similar to those in Section IV-A, the outage sum rate and the outage probabilities can be obtained.
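The behavior of the kernel at the beam offsets can be checked numerically. The sketch below assumes the Fejér kernel normalization $F_M(x) = \frac{\sin^2(Mx/2)}{M \sin^2(x/2)}$ with $F_M(0) = M$, and a user whose normalized direction deviates from the first beam by a small `delta`; the normalization and the helper names are assumptions for illustration.

```python
import math


def fejer_kernel(x, M):
    """F_M(x) = sin^2(M x / 2) / (M sin^2(x / 2)), with F_M = M where sin(x / 2) = 0."""
    s = math.sin(x / 2.0)
    if abs(s) < 1e-12:
        return float(M)
    return math.sin(M * x / 2.0) ** 2 / (M * s ** 2)


def beam_leakage(M, N, delta):
    """Kernel values seen from beams n = 2..N by a user offset by delta from beam 1."""
    return [fejer_kernel(math.pi * (2.0 * (n - 1) / N - delta), M)
            for n in range(2, N + 1)]


if __name__ == "__main__":
    M, N = 8, 4
    for delta in (0.0, 0.01, 0.2):
        main_lobe = fejer_kernel(math.pi * delta, M)
        print(delta, main_lobe, beam_leakage(M, N, delta))
```

With `delta = 0` and $M$ a multiple of $N$, the interfering beams fall exactly on the nulls of the kernel; a small `delta` keeps the leakage far below the main-lobe value, which is the effect exploited by the approximation above.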

VI. NUMERICAL STUDIES
In this section, the performance of the proposed mmWave-NOMA transmission schemes is evaluated by using computer simulations, where the accuracy of the developed analytical results will also be verified. The path loss exponent is set as $\alpha = 2$, since the focus is on line-of-sight links. The radius of $D$ is $R_D = 10$ m, the noise power is $-30$ dBm, the blockage parameter is set as $\phi = 0.1$, and $\beta_i^2 = \frac{3}{4}$ and $\beta_j^2 = \frac{1}{4}$ are used as the NOMA power allocation coefficients. It is worth pointing out that our analytical results are developed for arbitrary choices of these parameters, and using other choices of these parameters will lead to conclusions similar to those drawn in this section.
In Fig. 2, the performance of the proposed random beamforming scheme in mmWave-NOMA systems with perfect CSI is studied, where the mmWave-OMA scheme is used as a benchmark. Fig. 2(a) shows the outage sum rates achieved by the two MA schemes, and Fig. 2(b) shows the outage probabilities of the two transmission schemes. As can be observed from Fig. 2(a), the use of NOMA can yield a significant sum rate gain over the OMA scheme, and this gain increases when the targeted data rate of the strong user is increased. For example, for $R_j = 4$ bits per channel use (BPCU), the gain of mmWave-NOMA over mmWave-OMA is 1 BPCU, when the transmission power of the base station is 30 dBm. When $R_j$ is increased to 6 BPCU, the performance gain of the NOMA scheme over OMA becomes 5 BPCU. On the other hand, Fig. 2(b) shows that the mmWave-NOMA scheme can also effectively reduce the outage probability, compared to OMA, particularly for the user with the stronger channel. It is also important to point out that the developed approximation results for the sum rate and the outage probabilities are tight at high SNR, and the developed exact expressions match the simulation results perfectly.
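As a quick check of the parameters listed at the start of this section, the sketch below converts a targeted rate into its SINR threshold $\epsilon = 2^R - 1$ and tests whether a factor of the form $\beta_i^2 - \epsilon \beta_j^2$ is positive for the stated power allocation. Treating positivity of this factor as the condition for a non-trivial outage probability is a standard NOMA assumption rather than a statement taken from this section, and the helper names are hypothetical.

```python
def target_sinr(rate_bpcu):
    """SINR threshold epsilon = 2^R - 1 for a targeted rate R in BPCU."""
    return 2.0 ** rate_bpcu - 1.0


def allocation_feasible(rate_weak, beta_i_sq, beta_j_sq):
    """True if beta_i^2 - epsilon * beta_j^2 > 0 for the weak user's targeted rate."""
    return beta_i_sq - target_sinr(rate_weak) * beta_j_sq > 0.0


def snr_db(tx_power_dbm, noise_power_dbm):
    """Transmit SNR in dB from transmit and noise powers given in dBm."""
    return tx_power_dbm - noise_power_dbm


if __name__ == "__main__":
    print(snr_db(30.0, -30.0))  # transmit power of 30 dBm, noise power of -30 dBm
    for rate in (0.5, 1.0, 2.0):
        print(rate, allocation_feasible(rate, 0.75, 0.25))
```

With $\beta_i^2 = \frac{3}{4}$ and $\beta_j^2 = \frac{1}{4}$, the factor is positive only when $\epsilon < 3$, i.e., when the weak user's targeted rate is below 2 BPCU.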
In Fig. 3, the performance of the mmWave-NOMA and mmWave-OMA schemes is compared, for the situation in which the base station has access to the users' distance information only. The trivial cases in which the $i$-th and $j$-th nearest nodes do not exist can cause error floors in the outage probabilities. Therefore, we slightly change the definition of the outage probability by counting only the cases in which the two nodes can be found in $D_\theta$. Take the outage probability for user $i$ as an example. The outage probability curves are obtained by using $\frac{n_3}{n_1 - n_2}$, where $n_1$ denotes the total number of simulations, $n_2$ denotes the number of events in which user $i$ cannot be found in $D_\theta$, and $n_3$ denotes the number of outage events, excluding those caused by the case in which user $i$ cannot be found (i.e., $n_2$). This is consistent with (38) since the probability shown in the figure is equivalent to the following one: As can be observed from both figures, the use of NOMA can yield a significant performance gain in the sum rate and effectively reduce the outage probability, compared to the OMA scheme, even if only the distance information is available to the base station. Again, both figures also demonstrate the accuracy of the developed analytical results.
The choice of the threshold $\xi$ has a significant impact on the performance of the one-bit feedback scheme. As discussed in Section IV-B, the diversity gain of the strong user is particularly sensitive to the choice of the threshold, and a choice of $\xi = \eta_j - \frac{1}{\rho^K}$ yields a diversity gain of $K$, whereas the diversity gain of the weak user is always one for the discussed choices of $\xi$. Fig. 5 clearly confirms these analytical results and demonstrates the impact of $\xi$ on the diversity gain. For example, the slope of the strong user's outage probability curve becomes larger when increasing $K$, which demonstrates that the diversity gain of this user is an increasing function of $K$.
On the other hand, the slope for the other user's outage probability curve is always the same, which shows that the diversity gain of the weak user is not sensitive to the choice of the threshold.
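The modified outage estimate $\frac{n_3}{n_1 - n_2}$ used for Fig. 3 can be sketched as a simple counting routine. The input format, a sequence of `(found, outage)` flags per simulation run, is a hypothetical choice made for illustration.

```python
def conditional_outage(trials):
    """Estimate the outage probability as n3 / (n1 - n2).

    trials -- iterable of (found, outage) booleans, one pair per simulation run:
              found  is False when the user does not exist in the sector,
              outage is True when the run ended in an outage event.
    Returns None when the user was never found (n1 == n2).
    """
    n1 = n2 = n3 = 0
    for found, outage in trials:
        n1 += 1                 # total number of simulations
        if not found:
            n2 += 1             # user could not be found in the sector
        elif outage:
            n3 += 1             # outage events, excluding "not found" runs
    if n1 == n2:
        return None
    return n3 / (n1 - n2)


if __name__ == "__main__":
    runs = [(True, True), (True, False), (False, True), (True, False)]
    print(conditional_outage(runs))
```

Runs in which the user is missing are removed from both the numerator and the denominator, so they cannot create an error floor in the estimated curve.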
In Fig. 6, the performance of the proposed mmWave-NOMA scheme with multiple randomly generated beams is illustrated, where OMA is used as the benchmark again. Different from the previous cases with a single beam, the use of multiple beams means that users in the mmWave-NOMA system suffer more interference. Particularly, even if the strong user in a NOMA pair can use SIC to remove its partner's message, it still experiences interference from the users on other beams. However, the fact that mmWave propagation is highly directional can be used to effectively reduce such inter-beam interference. The reason is that the inter-beam interference, $\sum_{n \neq m} \frac{|a_{m,j}|^2}{1 + d_{m,j}^{\alpha}} F_M\left(\pi[\theta_n - \theta_{m,j}]\right)$, is a function of the angle difference between a user's channel vector and the interference beams. With a choice of $\Delta = 0.01$, i.e., the central angle is about 4 degrees, the inter-beam interference is significantly suppressed, as shown in the two figures. The superior performance of NOMA can also be clearly demonstrated by the fact that the outage probability for the strong user in OMA cannot be reduced to zero, regardless of how large the transmission power is. On the other hand, the use of NOMA can reduce the outage probability rapidly by increasing the transmission power, which is due to the fact that NOMA can realize better spectral efficiency.
Fig. 7. Sum-rate comparison between the proposed random beamforming transmission schemes. $M = 4$, $K = 5$, $\lambda = 1$, and $\Delta = 0.4$. The threshold is set as $\frac{1}{2}(\eta_1 + \eta_2)$.
Finally, we compare the mmWave-NOMA scheme with perfect CSI to the two schemes with limited CSI. Intuitively, the cases with limited CSI will result in some performance degradation, but the simulation results in Fig. 7 indicate that the schemes with limited feedback can yield an increase of the system throughput, as explained in the following. Take a four-user case as an example, where the users are ordered as in (8). Suppose that the perfect-CSI based scheme is to schedule user 1 and user 2, i.e., two users with poor channel conditions. Because of the ordering ambiguity caused by the use of partial CSI, the one-bit feedback scheme might schedule user 3 and user 4. According to the broadcast capacity region in [31], scheduling users with better channel conditions yields a larger sum rate, which means that it is possible for the schemes with partial CSI to outperform the one with perfect CSI. Fig. 7 clearly demonstrates this phenomenon. For example, given $K$ users, when the user with the worst channel condition is paired with the user with the second worst channel condition, the schemes with limited feedback can outperform the scheme with perfect CSI when the transmission power is 20 dBm. It is worth pointing out that a similar observation has been previously reported in [9] in the context of massive MIMO.
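The four-user argument above can be illustrated with a toy calculation. The sketch uses achievable two-user NOMA rates on a single beam rather than the outage sum rate analyzed in this paper, and the channel gains, SNR, and function name are hypothetical values chosen only to show the ordering effect.

```python
import math


def noma_sum_rate(g_weak, g_strong, beta1_sq, beta2_sq, rho):
    """Achievable sum rate (BPCU) of a two-user NOMA pair on a single beam.

    The weak user treats the strong user's signal as noise; the strong user
    is assumed to remove the weak user's message via SIC before decoding.
    """
    rate_weak = math.log2(1.0 + g_weak * beta1_sq / (g_weak * beta2_sq + 1.0 / rho))
    rate_strong = math.log2(1.0 + rho * beta2_sq * g_strong)
    return rate_weak + rate_strong


if __name__ == "__main__":
    gains = [0.01, 0.1, 1.0, 10.0]  # hypothetical ordered gains of users 1..4
    rho = 100.0                     # hypothetical transmit SNR (linear scale)
    print("users 1 and 2:", noma_sum_rate(gains[0], gains[1], 0.75, 0.25, rho))
    print("users 3 and 4:", noma_sum_rate(gains[2], gains[3], 0.75, 0.25, rho))
```

Under these assumed gains, the pair with better channel conditions achieves the larger sum rate, which is the mechanism by which a scheme with ordering ambiguity can end up with a higher throughput than the perfect-CSI scheme that deliberately serves the weakest users.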

VII. CONCLUSIONS
In this paper, we have investigated the coexistence between NOMA and mmWave communications. We have first considered the application of random beamforming to the addressed mmWave-NOMA scenario, by focusing on the case with a single beam generated at the base station. Stochastic geometry has been applied to characterize the performance of the mmWave-NOMA transmission scheme, by using the key features of mmWave networks, i.e., mmWave transmission is highly directional and potential blockages will thin the user distribution. Two beamforming approaches that can effectively reduce feedback have also been proposed for the addressed mmWave-NOMA communication networks, and the performance for the scenario with multiple beams has also been studied. The provided simulation results have demonstrated that the developed analytical results are accurate, and that the proposed mmWave-NOMA transmission schemes yield significant performance gains over conventional mmWave-OMA schemes.