Genetic Algorithm-Based Beam Refinement for Initial Access in Millimeter Wave Mobile Networks

Initial access (IA) is identified as a key challenge for the upcoming 5G mobile communication system operating at high carrier frequencies, and several techniques are currently being proposed. In this paper, we extend our previously proposed efficient genetic algorithm (GA)-based beam refinement scheme to include beamforming at both the transmitter and the receiver and compare the performance with alternative approaches in millimeter wave multiuser multiple-input-multiple-output (MU-MIMO) networks. Taking the millimeter wave communications characteristics and various metrics into account, we investigate the effect of different parameters such as the number of transmit antennas/users/per-user receive antennas, beamforming resolutions, and hardware impairments on the system performance employing different beam refinement algorithms. As shown, our proposed GA-based approach performs well in delay-constrained networks with multiantenna users. Compared to the considered state-of-the-art schemes, our method reaches the highest service outage-constrained end-to-end throughput with considerably less implementation complexity. Moreover, taking the users' mobility into account, our GA-based approach can remarkably reduce the beam refinement delay at low/moderate speeds when the spatial correlation is taken into account. Finally, we compare the cases of collaborative users and noncollaborative users and evaluate their difference in system performance.


Introduction
The next generation of cellular systems (5G) requires both higher data rates (in the order of 10-100 Gbps) and lower end-to-end latencies (down to 1 ms) than previous generations [1]. For this reason, 5G systems aim to use frequency bands in the 30-300 GHz range in order to obtain sufficiently large bandwidths/data rates. Due to power limitation and high path loss at these frequencies, the coverage range is typically small so that highly directional transmissions are required for such millimeter wave (MMW) communications. On the other hand, the physical size of antennas at the MMW band is relatively small, such that large-scale beamforming can be performed in practice [2,3]. Employing large-scale beamforming during the initial access (IA) procedure can be a good way to overcome the increased path loss experienced at higher frequencies (see Section 2 for a literature review of the IA systems).
One of the most challenging tasks of IA is that the base stations (BSs) make omnidirectional cell searches with directional beams and at the receiver side the users choose their best beam direction to detect the BSs. Successful access means, e.g., that the received power or the signal-to-noise ratio (SNR) is beyond certain thresholds. After a basic connection is established, the BSs and the users can begin exchanging messages and implement a beam refinement procedure to further improve the beam directions and do additional control actions [4].
For example, user mobility can be handled by beam refinement. With 5G, users are expected to access wireless networks not only at home or in the office, but also while moving, such as in a vehicle. In the moving scenario, the beam refinement process can keep tracking the beams by exploiting spatial correlations so that the computational delay can be remarkably reduced. Furthermore, for vehicular user equipment (VUEs), the system-level performance is improved if we allow a scheme using device-to-device (D2D) communications to enhance the links [5].
IA beamforming at MMW is different from the conventional one since it is hard to acquire the channel state information (CSI) at these frequencies. For this reason, codebook-based beamforming has been recently proposed as an efficient method to reduce the dependency on CSI estimation/feedback [6,7]. Also, several works have been presented on both physical layer and procedural algorithms of IA beamforming [8-15]. However, in these works either the algorithms are designed for special metrics, precoding/combining schemes, and channel models, or the implementation complexity grows significantly with the number of BSs/users. Moreover, the running delay of the algorithm has rarely been considered in the performance evaluation. On the other hand, generic machine learning-based schemes have been recently proposed which can be effectively applied for different channel models with acceptable implementation complexity [6,7,16-18].
Wireless Communications and Mobile Computing
In this paper, we study the effect of beam refinement on the performance of MMW networks. In our previous work, we proposed an efficient genetic algorithm (GA)-based beamforming approach [18] which reaches almost the same performance as the exhaustive search with low complexity. Based on [18], the contributions of this paper are as follows. (1) We include the GA-based beam refinement at both the transmitter and the receiver side. (2) We compare different machine learning-based analog beamforming approaches for the beam refinement during IA, including GA-based beamforming [18], Tabu search beamforming [16], link-by-link beamforming [17], and two-level codebook beamforming [6,7] in large-but-finite multiuser multiple-input-multiple-output (MU-MIMO) MMW communication systems. (3) We analyze the effect of various parameters such as the number of transmit/receive antennas, total power budget, and the power amplifier (PA) efficiency on the network performance. As opposed to the literature, we take the algorithm running delay into account. Thus, there is a trade-off between finding the optimal beamforming matrices and reducing the data transmission time slot, and the highest throughput may be achieved with few iterations. We study the system performance in terms of the end-to-end throughput with service outage constraints as well as the implementation complexity. (4) We evaluate and compare the performance of the considered algorithms under various user speeds. (5) Finally, we consider the case of collaborative users and compare the system performance in the cases with and without information exchange among users.
Our results demonstrate that the running delay of the algorithms and power amplifier inefficiency affect the system performance remarkably, which should be carefully considered in the system design. Moreover, our proposed GA-based approach outperforms the considered state-of-the-art schemes in terms of throughput, and reaches (almost) the same results as the exhaustive search-based approach with fewer iterations. Furthermore, when taking the user mobility into account, the GA-based approach can remarkably reduce the algorithm running delay based on the beamforming results in the previous time slots. With collaborative users, the end-to-end throughput can be improved due to the data exchange over D2D links. Thus, the GA-based beamforming approach can be an appropriate candidate for IA in future wireless networks.

Literature Review
In this part, we present some related research work on IA. The reader familiar with the research area can skip this section and go to Sections 3-5 where we present the system model, the algorithm descriptions, and the simulation results, respectively.
Beamforming techniques at MMW bands have been considered in standard developments IEEE 802.15.3c (TG3c) [19], IEEE 802.11ad (TGad) [20], and ECMA-387 [21]. The problem formulation for IA beamforming at MMW frequencies is introduced in [8] where a fast-discovery hierarchical search method is proposed. Moreover, several design options for MMW IA are presented in [22], where the basic steps in the 3rd-Generation Partnership Project (3GPP) Long-Term Evolution (LTE) standard are used as references, and the overall delay of each design option as a function of the system overhead is evaluated. Then, [11] compares three approaches, namely, exhaustive search, two-step, and context information-based, in terms of miss-detection probability and discovery time. Another comparison work is presented in [12], where it is shown that different IA protocols have a tradeoff between delay and average user-perceived throughput.
In [18], we introduce a genetic algorithm-based initial beamforming approach and evaluate the effect of the algorithm running delay on the network performance. There are also previous works using the GA-based selection approach in different communication networks. For instance, in [23] an efficient scheduling scheme is designed based on the genetic algorithm in the return-link of a multibeam satellite system. A turbo-like beamforming scheme based on the Tabu search algorithm is proposed in [16] to reduce both searching complexity and system overhead. A concurrent beamforming protocol, which we refer to as link-by-link beamforming, is presented in [17] to achieve high capacity in indoor MMW networks. Finally, for multistage beamforming, a tree-structured multilevel beamforming codebook is designed for MMW wireless backhaul systems in [6]. Also, in [7], a low-complexity multistage codebook is designed to support the IEEE 802.15.3c protocol. In [9], an exhaustive beam search method is proposed. Two beamforming schemes, namely, random-phase beamforming and directional beamforming, have been tested in [10] under the line-of-sight (LOS) channel conditions. A low-complexity beamforming scheme for initial user discovery is proposed in [13] where limited feedback-type codebooks are used. In [14], an accurate analytical framework for MMW system performance has been developed. The impact of obstacles on the cell search process is considered in [15] for the first time, and a geo-located context database is proposed to speed up the cellular attachment operations by storing and processing the information about the previous cell discovery attempts.

System Model
We consider a MU-MIMO setup with $M$ transmit antennas at a BS and $N$ multiantenna VUEs, each with $N_r$ antennas. As a result, there are $N_R = N \times N_r$ total antennas at the receiver side (see Figure 1). This is an extension of our work [18] with single receive antennas and allows for beamforming at the receiver side. We assume that each user has perfect CSI. Also, as a more explicit model compared with [24], VUEs are allowed to exchange data with each other by using D2D links, which is similar to the model in, e.g., [5]. We set $M > N_R$. At each time slot $t$, the aggregated received signal vector $\mathbf{y}(t)$ over the $N$ users after receive beamforming is described by (1), where $P$ is the total power budget and $\mathbf{H}(t) \in \mathbb{C}^{N_R \times M}$ is the channel matrix. Its $(i,j)$th element is the product of a path loss term and the small-scale fading $h_{i,j}(t)$, where the path loss term is determined by the distance $d_{i,j}$ between receiver antenna $i$ and transmitter antenna $j$ and by a path loss parameter $\beta$. Also, $\mathbf{x}(t) \in \mathbb{C}^{N \times 1}$ is the intended message signal, $\mathbf{V}(t) \in \mathbb{C}^{M \times N}$ is the precoding matrix at the BS, $\mathbf{U}(t) \in \mathbb{C}^{N \times N_R}$ is the aggregated combining matrix at the users' side, and $\mathbf{z}(t) \in \mathbb{C}^{N_R \times 1}$ denotes the independent and identically distributed (IID) Gaussian noise vector. We assume that the channels remain the same during the whole algorithm running procedure. In this way, we can drop the time index $t$ in the following. In our algorithm we assume that the users can share their received signals in order to reach the optimal performance; i.e., $y_i$ is known by user $j$ with $j \neq i$. However, we also compare this collaborative scheme with the case where the users do not collaborate; i.e., $y_i$ is not known by user $j$ with $j \neq i$.
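As a sketch only, one way to write the received-signal relation referred to as (1) that is consistent with the dimensions stated above is given below. The symbol names ($M$ transmit antennas, $N$ users, $N_R$ total receive antennas), the $\sqrt{P/M}$ power normalization, and the absence of a Hermitian transpose on $\mathbf{U}$ are assumptions made for illustration, with $\mathbf{U} \in \mathbb{C}^{N \times N_R}$, $\mathbf{H} \in \mathbb{C}^{N_R \times M}$, $\mathbf{V} \in \mathbb{C}^{M \times N}$, $\mathbf{x} \in \mathbb{C}^{N \times 1}$, and $\mathbf{z} \in \mathbb{C}^{N_R \times 1}$.

```latex
\mathbf{y}(t) = \sqrt{\frac{P}{M}}\,\mathbf{U}(t)\,\mathbf{H}(t)\,\mathbf{V}(t)\,\mathbf{x}(t) + \mathbf{U}(t)\,\mathbf{z}(t)
```

With these dimensions, $\mathbf{U}\mathbf{H}\mathbf{V}$ is an $N \times N$ matrix whose diagonal entries carry the desired signal of each user and whose off-diagonal entries carry the inter-user interference.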
Furthermore, the channel model $\mathbf{H}$ is described as $\mathbf{H} = \sqrt{\frac{k}{k+1}}\,\mathbf{H}_{\mathrm{LOS}} + \sqrt{\frac{1}{k+1}}\,\mathbf{H}_{\mathrm{NLOS}}$ (2), where $\mathbf{H}_{\mathrm{LOS}}$ and $\mathbf{H}_{\mathrm{NLOS}}$ denote the line-of-sight and the non-line-of-sight (NLOS) components of the channel, respectively, and the NLOS component is assumed to follow a complex Gaussian distribution. Also, $k$ controls the relative strength of the LOS and the NLOS components. In (2), setting $k = 0$ represents an NLOS condition while $k \to \infty$ gives a LOS channel. We use this model because most links in MMW systems have a LOS component.
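The LOS/NLOS mixture described above can be sketched in a few lines. This is a minimal illustration, assuming the standard Rician weighting with factor `k_factor`, an all-ones LOS component, and unit-variance complex Gaussian NLOS entries; the function name and argument names are hypothetical.

```python
import math
import random


def rician_channel(n_rx, n_tx, k_factor, rng):
    """Return an n_rx x n_tx channel matrix as nested lists of complex numbers.

    Assumed model: sqrt(k/(k+1)) * H_LOS + sqrt(1/(k+1)) * H_NLOS,
    with H_LOS an all-ones matrix and H_NLOS IID complex Gaussian.
    """
    los_weight = math.sqrt(k_factor / (k_factor + 1.0))
    nlos_weight = math.sqrt(1.0 / (k_factor + 1.0))
    std = math.sqrt(0.5)  # per real/imaginary part, so each entry has unit variance
    channel = []
    for _ in range(n_rx):
        row = []
        for _ in range(n_tx):
            nlos = complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
            row.append(los_weight * 1.0 + nlos_weight * nlos)
        channel.append(row)
    return channel
```

Setting `k_factor = 0` leaves only the Gaussian (NLOS) part, while a very large `k_factor` drives every entry toward the LOS value of one.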

Initial Beam Refinement
Procedure. Unlike a conventional beamforming procedure acquiring CSI, in MMW systems we suggest performing codebook-based beam refinement, which means selecting a precoding matrix $\mathbf{V}$ out of a predefined codebook $\mathcal{W}_T$ at the BS while selecting a combining matrix $\mathbf{U}$ out of a predefined codebook $\mathcal{W}_R$ at the receiver side, sending test signals, and finally making decisions on transmit/receive beam patterns based on the users' feedback about their performance metrics. As the final step of IA [4], the beam refinement procedure can obtain a refined beam alignment at the cost of computational delay. The time structure for a packet transmission can be seen in Figure 2, where part of the packet period is dedicated to designing appropriate beams in the IA procedure (mainly the beam refinement part) and the rest is used for data transmission. Thus, we need to find a balance between the beam design delay and the data transmission period by choosing an efficient approach.
Here, we use discrete Fourier transform (DFT)-based codebooks [25] at both sides, which are defined in (3) for the BS and in (4) for the users, where $N_{\mathrm{vec}}$, which is at least as large as the number of antennas at either side, is the number of codebook vectors. Note that since our algorithm is generic, one can apply it to different kinds of codebooks.
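To make the codebook idea concrete, the following sketch builds a DFT-style codebook in plain Python. The exact entry definition and the $1/\sqrt{\text{antennas}}$ normalization are assumptions for illustration and may differ from the definition in [25]; `dft_codebook` is a hypothetical helper name.

```python
import cmath
import math


def dft_codebook(n_antennas, n_vec):
    """Return an n_antennas x n_vec matrix (nested lists) of DFT beam vectors.

    Assumed entry (m, u): exp(-j*2*pi*m*u/n_vec) / sqrt(n_antennas),
    so every column (one candidate beam) has unit norm.
    """
    scale = 1.0 / math.sqrt(n_antennas)
    return [
        [scale * cmath.exp(-2j * math.pi * m * u / n_vec) for u in range(n_vec)]
        for m in range(n_antennas)
    ]
```

A beam selection then amounts to picking column indices of this matrix, which is why the search algorithms can operate on index sets rather than on the matrices themselves.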

Performance Metrics.
The machine learning-based schemes of [6,7,16-18] are generic, in the sense that they can be implemented for different metrics. For the simulations, however, we consider the service outage-constrained end-to-end throughput, the complexity, and the average number of required iterations as the system performance metrics. In some scenarios, it may be required to serve the users with some minimum required rates; otherwise service outage occurs. In the $n$-th iteration round of the algorithm, the service outage-constrained end-to-end throughput in bits per channel use (bpcu) is defined in (5). There, $r_i(n)$ denotes the achievable rate of user $i$ at the end of the $n$-th iteration. Also, the parameter $\alpha$ is the relative delay cost for running each iteration of the algorithm, which fulfills $\alpha N_{\mathrm{it}} < 1$ with $N_{\mathrm{it}}$ being the maximum possible number of iterations. Then, $\log_2(1 + \theta)$ is the minimum per-user rate while $\theta$ represents the minimum required signal-to-interference-plus-noise ratio (SINR) of each user. Also, $\mathrm{SINR}_i(n)$ is the SINR at the receiver of user $i$ in iteration round $n$. Hence, we define a satisfied user as one with $\mathrm{SINR}_i \geq \theta$. Here, $g_{i,j}$ is the $(i,j)$-th element of the matrix $\mathbf{G} = |\mathbf{U}\mathbf{H}\mathbf{V}|^2$, which is referred to as the channel gain throughout the paper. Moreover, $B$ is the system bandwidth and $N_0$ is the power spectral density of the noise. We set $B N_0 = 1$ to simplify the system so that the power $P$ (in dB, $10 \log_{10} P$) denotes the receiver-side SNR as well.
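The metric described above can be sketched as follows. This is an illustrative reading, not the paper's exact expression: it assumes that the throughput is the sum of the per-user rates of the satisfied users, scaled by a delay factor $(1 - \alpha n)$, and that the SINR uses a $P/M$ power scaling with unit noise power. The function and argument names are hypothetical.

```python
import math


def outage_throughput(gain, power, n_tx, theta, alpha, iteration, noise=1.0):
    """Service outage-constrained throughput (bpcu) after `iteration` rounds.

    gain[i][j] is the channel gain from stream j to user i (a real number).
    Users whose SINR falls below `theta` contribute zero rate (service outage).
    """
    scale = power / n_tx  # assumed per-stream power normalization
    total_rate = 0.0
    for i, row in enumerate(gain):
        interference = scale * sum(g for j, g in enumerate(row) if j != i)
        sinr = scale * row[i] / (noise + interference)
        if sinr >= theta:
            total_rate += math.log2(1.0 + sinr)
    # Each iteration consumes a fraction alpha of the slot, leaving less for data.
    return (1.0 - alpha * iteration) * total_rate
```

Under this reading the delay factor reaches zero at `iteration = 1 / alpha`, which matches the statement in the simulation discussion that the throughput converges to zero at that point.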
The optimization problem of (5) is formulated as As opposed to, e.g., [ [29,Eq. 5], we consider the algorithm running delay in the performance analysis. As seen in the following, there is a trade-off between optimizing beamforming matrices and reducing the data transmission period. In this case the optimal solution may be achieved by running the algorithms for a limited number of iterations.
where $P_{\mathrm{cons}}$, $P_{\mathrm{out}}$, and $P_{\max}$ denote the consumed power, the output power, and the maximum output power of the PA, respectively. Also, $\epsilon \in [0, 1]$ represents the power efficiency and $\vartheta \in [0, 1]$ is a parameter depending on the PA class. Setting $\epsilon = 1$, $P_{\max} = \infty$, and $\vartheta = 0$ in (10) represents the special case of an ideal PA.
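As a hedged illustration of how these PA parameters interact, the sketch below assumes the commonly used efficiency model $P_{\mathrm{out}}/P_{\mathrm{cons}} = \epsilon\,(P_{\mathrm{out}}/P_{\max})^{\vartheta}$, which is consistent with the effective-efficiency expression used in the simulation discussion, and solves it for the output power. The function name is hypothetical and the model form is an assumption, not a restatement of (10).

```python
def pa_output_power(p_cons, p_max, eps, vartheta):
    """Output power of a non-ideal PA under the assumed efficiency model.

    Solves P_out / P_cons = eps * (P_out / P_max) ** vartheta for P_out,
    which requires 0 <= vartheta < 1.
    """
    if not 0.0 <= vartheta < 1.0:
        raise ValueError("vartheta must lie in [0, 1) for this closed form")
    return (eps * p_cons / p_max ** vartheta) ** (1.0 / (1.0 - vartheta))
```

With `eps = 1`, `vartheta = 0`, and an unbounded `p_max`, the function returns the consumed power unchanged, which is the ideal-PA special case mentioned above. The sketch does not clip the result at `p_max`.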

On the Effect of User Mobility.
Beamforming solutions for mobile users at high carrier frequencies are important in 5G wireless mobile communications. Here, we use the following mobility model to evaluate the performance of our proposed GA-based beamforming approach and compare the results with those of the considered state-of-the-art schemes. Consider Figure 3 with $N = 4$ multiple-antenna VUEs that exchange data over D2D links. Here, we have two cases during the users' mobility.
Case 1. This case includes beam refinement with a random queen as initial guess (dashed-line VUEs in Figure 3).
Case 2. This case includes beam refinement using the queen in Case 1 as initial guess (solid-line VUEs in Figure 3).
With mobility, we exploit the spatial correlation by setting the queen of the previous time slot as one of the initial guesses of the next time slot. For $d_{i,j}$ in (1), we assume that we know the moving speed $v$ and the time duration of mobility $\Delta t$. In this way, we can estimate the user position in Case 2 as lying within a circle of radius $v \cdot \Delta t$ centered at the user position of the previous time slot.

Algorithm 1. In each time slot with instantaneous channel realization $\mathbf{H} \in \mathbb{C}^{N_R \times M}$, do the following:
(I) Initialization: Consider $K$, e.g., $K = 10$, sets of precoding matrices $\mathbf{V}_\ell$ and combining matrices $\mathbf{U}_\ell$, $\ell = 1, \ldots, K$, randomly selected from the predefined codebooks $\mathcal{W}_T$ and $\mathcal{W}_R$.
(II) Selection: For each $\mathbf{V}_\ell$ and $\mathbf{U}_\ell$, evaluate the instantaneous value of the objective metric $R_\ell$, $\ell = 1, \ldots, K$, for example the end-to-end throughput (5). Find the beamforming matrices that result in the best value of the considered metric, named the Queen, i.e., the pair that attains the best metric value over the $K$ sets.

Algorithm Description
In this study, we compare the performance of different IA beamforming methods as follows.
Extended GA-based search [18]: the algorithm starts by making $K$ possible beam selection sets at both transmitter and receiver, i.e., $K$ submatrices of each codebook. During each iteration, we choose the best set, named the Queen, based on the performance metrics (for example, (5)). Next, we keep the Queen and regenerate $J < K$ similar sets around the Queen by making small changes to it (in the simulations, we replace 10% of the Queen columns randomly without loss of generality). Finally, the other $K - J - 1$ beamforming matrices are selected randomly to prevent the algorithm from being trapped in a local optimum. Note that reducing $K$ for a given $J$ can increase the chance of being trapped. After $N_{\mathrm{it}}$ iterations (set by the designer), the Queen is returned as the beam selection result in the current time slot. In this way, this is an extended version of our GA-based approach with beamforming at both the transmitter and the receiver, the basic principles of which can be found in Algorithm 1.
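The iteration described above can be sketched compactly if a beam selection is represented as a tuple of codebook column indices and the metric is passed in as a function. This is a simplified illustration under those assumptions, not the authors' implementation: it treats one index tuple per candidate (rather than separate transmit and receive matrices), and all names (`ga_beam_search`, `k_sets`, `j_similar`, `init`) are hypothetical.

```python
import random


def ga_beam_search(objective, n_beams, n_vec, k_sets=10, j_similar=5,
                   n_it=100, rng=None, init=None):
    """Return the best index tuple (the Queen) found after n_it iterations."""
    rng = rng or random.Random()

    def random_set():
        return tuple(rng.randrange(n_vec) for _ in range(n_beams))

    def mutate(queen):
        # Replace roughly 10% of the Queen's columns (at least one) at random.
        child = list(queen)
        n_change = max(1, round(0.1 * n_beams))
        for pos in rng.sample(range(n_beams), n_change):
            child[pos] = rng.randrange(n_vec)
        return tuple(child)

    population = [random_set() for _ in range(k_sets)]
    if init is not None:
        population[0] = tuple(init)  # e.g., the Queen of the previous time slot
    queen = max(population, key=objective)

    for _ in range(n_it):
        # Keep the Queen, add J similar sets, fill the rest with random sets.
        population = ([queen]
                      + [mutate(queen) for _ in range(j_similar)]
                      + [random_set() for _ in range(k_sets - j_similar - 1)])
        queen = max(population, key=objective)
    return queen
```

Because the Queen is always carried over, the metric never decreases from one iteration to the next. Passing the previous Queen through `init` corresponds to the warm start used for mobile users in Case 2.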

Tabu search [16]:
The Tabu search approach [16] follows the same basic idea as the GA-based scheme, in which we choose and update the queen over successive iterations. The only difference is the evolution method of the queen in successive iterations. With Tabu, we use the definition of neighborhood in [16]: a matrix A is defined as a neighbor of another matrix B if (1) A has only one different column compared with B or (2) the index difference between the two corresponding columns in A and B is equal to one. To make beam selection sets, we change the queen from the previous round to its neighbors.
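One possible reading of this neighborhood, sketched below, generates neighbors by shifting a single codebook index of the queen by one position. This is an interpretation for illustration only: the wrap-around at the codebook edges and the helper name `index_neighbors` are assumptions, and the definition in [16] may admit further neighbors.

```python
def index_neighbors(selection, n_vec):
    """Return neighbors of `selection` that differ in one column by index +/- 1.

    `selection` is a tuple of codebook column indices; indices wrap around
    modulo n_vec (assumed, since DFT beam directions are circular).
    """
    neighbors = []
    for pos, idx in enumerate(selection):
        for step in (-1, 1):
            candidate = list(selection)
            candidate[pos] = (idx + step) % n_vec
            neighbors.append(tuple(candidate))
    return neighbors
```

For example, `index_neighbors((0, 5), 8)` returns `[(7, 5), (1, 5), (0, 4), (0, 6)]`. Since neighboring DFT beams point in adjacent directions, this kind of move suits slowly changing user positions.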

Link-by-link search [17]:
In this strategy, the beam design of the $N$ users is not optimized simultaneously. Instead, with a greedy approach, the beamforming solution is settled user-by-user by considering the interference from the other $N - 1$ links. The system performance improves in successive iterations until it converges to some (sub)optimal beamforming rules.
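The user-by-user idea can be illustrated with a small coordinate-wise search, in which each user's beam index is re-optimized while the other users' choices stay fixed. This is a generic greedy sketch under the assumption that a selection is a list of codebook indices and the metric is supplied as a function; it is not the protocol of [17], and the names are hypothetical.

```python
def link_by_link(objective, n_users, n_vec, n_rounds=3):
    """Greedy per-user beam selection; returns a tuple of codebook indices."""
    selection = [0] * n_users
    for _ in range(n_rounds):
        for user in range(n_users):
            # Try every beam for this user while the other links stay fixed.
            def score(idx, user=user):
                trial = selection[:user] + [idx] + selection[user + 1:]
                return objective(trial)
            selection[user] = max(range(n_vec), key=score)
    return tuple(selection)
```

Each round evaluates the metric `n_users * n_vec` times, which illustrates why the cost of this strategy grows with the number of users.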
Two-level search [6,7]: inspired by multistage beamforming techniques, e.g., [6,7], we design a two-level codebook search scheme for our system. In the first level, the BS transmits messages over wider sectors using the codebook with $N_{\mathrm{vec}}/2$ columns, while in the second level it searches for the optimal solution within the best such sector by steering narrower beams with an $N_{\mathrm{vec}}$-column codebook.
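For a single beam index, the two-level idea can be sketched as below. The sketch assumes that wide sector `c` of the coarse codebook covers the two narrow beams `2c` and `2c + 1` of the fine codebook; this mapping, the two metric callables, and the function name are illustrative assumptions.

```python
def two_level_search(coarse_metric, fine_metric, n_vec):
    """Pick the best wide sector first, then the best narrow beam inside it."""
    best_sector = max(range(n_vec // 2), key=coarse_metric)
    candidates = (2 * best_sector, 2 * best_sector + 1)  # assumed sector-to-beam mapping
    return max(candidates, key=fine_metric)
```

Under this mapping the search needs `n_vec // 2 + 2` metric evaluations instead of `n_vec`, at the risk of missing the best narrow beam when the coarse metric ranks the sectors incorrectly.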

On the Implementation Complexity.
To compare different methods, it is necessary to consider the implementation complexity of each algorithm. For this reason, we derive the per-iteration complexity of the different algorithms based on the fact that the product of matrices of size $n \times m$ and $m \times n$ has the complexity $O(n^2 m)$ in MATLAB. In this way, with $C_{\mathrm{GA}}$ denoting the per-iteration complexity of the GA-based approach, we have $C_{\mathrm{Tabu}} = C_{\mathrm{GA}}$, $C_{\text{link-by-link}} = N \times C_{\mathrm{GA}}$, and $C_{\text{two-level}} = 2 \times C_{\mathrm{GA}}$, where $C_{\mathrm{GA}}$ depends on $K$, the number of beam selection sets within each iteration.

On the Effect of User Collaboration.
In order to optimize the end-to-end system throughput (5), each user needs to share its received signal with the other users via the D2D links as mentioned in Section 2. Note that we do not consider the overhead of building up the D2D links in this work. We compare two cases regarding the user collaboration.
Case 1 (collaborative users (CUs)). Each user knows the received signals of the other users and the system throughput is optimal.
Case 2 (noncollaborative users (NCUs)). Each user only knows their own received message and the system throughput is suboptimal.
In Section 5, we evaluate the performance of the GA and the Tabu methods in these two cases and investigate the potential gains of collaboration.

Simulation Results
In the simulations, we use the channel model in (2) in the cases with $k = 0, 3$. We set $\mathbf{H}_{\mathrm{LOS}} = \mathbf{1}_{N_R \times M}$, where $\mathbf{1}_{N_R \times M}$ refers to the normalized all-ones complex matrix. Except for Figure 4, which shows an example of the GA-based procedure, for each point in the curves the results are obtained by averaging over $10^4$ different channel realizations. In all figures, we set $N_{\mathrm{it}} = 1000$ since it is a sufficiently large number of iterations after which no performance improvement is observed. Also, in all figures except for Figure 11, we use the normalized distance $d_{i,j} = d = 1$. Moreover, we set $K = 10$, $J = 5$, and $N_{\mathrm{vec}} = 128$. In all figures, except for Figure 9, we use the ideal PA; i.e., we set $P_{\max} = \infty$, $\vartheta = 0$, $\epsilon = 1$ in (10). In Figure 9 we study the effect of imperfect PAs. In Figures 4, 7, 9, and 10, we consider the service outage-constrained end-to-end throughput (5) as the performance metric with $\theta = -4$ dB. Finally, Table 1 shows the average number of required iterations in each algorithm to reach the (sub)optimal solution.
On the convergence behavior: Figure 4 gives an example of the GA performance in the cases with ($\alpha = 0.001$) and without ($\alpha = 0$) costs of running the algorithm, respectively (see (5)). Here, example means that we run our algorithm within one single channel realization. From Figure 4 we observe that very few iterations are required to reach the maximum throughput for the case with delay cost, which occurs at around $n = 130$. That is, considering the cost of running the algorithm, the maximum throughput is obtained by finding a suboptimal beamforming matrix and leaving the rest of the time slot for data transmission (see Figure 2). As a result, as the number of iterations increases, the cost of running the algorithm reduces the end-to-end throughput, which converges to zero at $n = 1/\alpha$ (see (5)). Note that the peak value in the case with delay cost is lower than in the case without it because of the delay cost.
If there is no running delay, on the other hand, the system performance improves monotonically with the number of iterations. However, the developed algorithm leads to (almost) the same performance as the exhaustive search-based scheme with a very limited number of iterations. For example, with the parameter settings of Figure 4, our algorithm reaches more than 90% of the maximum achievable throughput with fewer than 100 iterations. On the other hand, with the parameter settings of Figure 4, exhaustive search implies testing in the order of $10^{30}$ possible beamforming matrices. Note that we cannot guarantee that the results are exactly the same as the optimal ones, but because of the random part of the algorithm they become very close for large $N_{\mathrm{it}}$. The tradeoff between the performance and the delay cost is the concern here instead of the exact throughput value.
Finally, all considered schemes follow the same ladder-type convergence behavior as in Figure 4. This is because with the considered algorithms the system performance is not necessarily improved in each iteration and may be trapped in local optima. However, considering a couple of random solution checks in each iteration helps to escape the local optima as the number of iterations increases.
On the effect of service outage: Figure 5 demonstrates the service outage-constrained end-to-end throughput (5) for different values of the required received SNR threshold $\theta$ in (5). Also, Figure 6 studies the service outage probability in the cases optimizing (5). Here, the results are presented for $N = 8$, $N_R = 8$, $M = 32$, $k = 0$, $N_{\mathrm{vec}} = 128$, which means single-antenna users at the receiver side. As demonstrated in Figures 5 and 6, the service outage constraint affects the end-to-end and the per-user throughput significantly at low SNRs/severe service outage constraints. However, the effect of the service outage probability decreases as the SNR increases or $\theta$ decreases (Figures 5 and 6).
Comparison of schemes: in Figure 7, we compare the throughput (5) reached by the considered algorithms. It can be seen from the figure that for a broad range of SNRs the GA-based beamforming [18] leads to the best system throughput, followed by the link-by-link search [17], Tabu search [16], and two-level search [6,7].
Moreover, using the same parameter settings as in Figure 7, in Figure 8 we compare the cumulative distribution function (CDF) of the per-user throughput (5) reached by the considered algorithms. From the figure we can see that the GA-based beamforming [18] leads to the best per-user throughput distribution, which means that more users can be served with higher throughput, followed by the link-by-link search [17], Tabu search [16], and two-level search [6,7]. Table 1 shows the average number of iterations that is required in each scheme to reach a (sub)optimal solution.
Table 1: Average number of required iterations for the extended GA-based search, link-by-link [17], two-level [6,7], and Tabu [16] schemes.
Here, the results are presented for $k = 0$, $M = 32$, $N_R = 4, 8, 12$. We can see that, in all methods except for the link-by-link approach, the required number of iterations is almost insensitive to the number of receive antennas for the considered parameter setting of Table 1.
On the effect of imperfect power amplifier: Figure 9 evaluates the effect of the power amplifier on the throughput (5). We can see that the inefficiency of the PA affects the performance remarkably but this effect decreases with the SNR. This is reasonable because the effective efficiency of the PAs, $\epsilon_{\mathrm{effective}} = \epsilon\,(P_{\mathrm{out}}/P_{\max})^{\vartheta}$, increases with SNR.
On the effect of the number of receive antennas: Figure 10 shows the effect of the number of receive antennas per user on the throughput (5). As seen in the figure, the end-to-end throughput increases with the number of per-user antennas as expected, since multiantenna techniques can improve the data rate remarkably. Moreover, the relative performance gain of the GA-based and the link-by-link schemes, compared to the other considered schemes, increases with the number of receive antennas, which is an interesting point when designing large-scale networks.
On the effect of the user mobility: Figure 11 shows the effect of the users' mobility on the beam refinement delay for the considered algorithms. Inspired by [11], we evaluate the beam refinement delay (we assume that each iteration takes an overhead of $10^{-4}\,\Delta t$) of each algorithm in Cases 1 and 2 to check how suitable these algorithms are for mobile users. The algorithm running delays in Cases 1 and 2 of each method are all presented in the plot. Here, the results are presented with $M = 32$, $N = 4$, $N_r = 2$, $k = 0$, $\alpha = 0$, $P = 32$ dB, moving time $\Delta t = 1$ ms, and $\beta = -3.5$. As seen in the figure, both the GA-based algorithm and the Tabu-based algorithm can remarkably reduce the beam refinement delay for a broad range of user speeds, since they can use the beam refinement solution in Case 1 as the initial guess in Case 2 when the moving distance is not large. Note that Tabu search has the lowest delay in both cases since it simply changes the queen to its neighbors, which takes full advantage of the spatial correlations. However, for the GA-based scheme the beam refinement delay increases slightly as the user speed increases, intuitively because the spatial correlation between the positions in successive time slots decreases. Moreover, neither the link-by-link search nor the two-level search shows a noticeable performance gain.
On the effect of collaborative users: Figure 12 shows the effect of the users' collaboration on the end-to-end throughput for the GA and Tabu algorithms. Also, Table 2 presents the average number of required iterations for both the GA and the Tabu search in the cases with the CUs and the NCUs. Here, the results are presented with $M = 32$, $N = 4$, $N_r = 2$, $k = 0$, $\alpha = 0$. As seen in the figure, the performance of both the GA-based algorithm and the Tabu-based algorithm is reduced in the case of NCUs. Also, these reductions decrease as the SNR increases. On the other hand, in Table 2 it can be seen that the NCU case requires far fewer iterations than the CU case for different system configurations. Only one iteration is required for the case with ≤ 3.

Conclusion
We extended our previously proposed genetic algorithm (GA)-based beam refinement scheme to include beamforming at both the transmitter and the receiver, and we compared the performance with alternative beam refinement algorithms in an MU-MIMO system, in terms of the service outage-constrained end-to-end throughput and the implementation complexity. Particularly, our extended genetic algorithm-based scheme can reach almost the same throughput as the exhaustive search-based approach with relatively few iterations in delay-constrained systems. Also, compared to the considered state-of-the-art schemes, our scheme leads to the highest throughput/per-user throughput and the lowest per-iteration implementation complexity, and the relative performance gain increases with the number of receive antennas. Moreover, non-ideal power amplifiers affect the system performance remarkably, which should be carefully considered during the system design. Furthermore, the GA-based approach can exploit the spatial correlation and remarkably reduce the beam refinement delay for a broad range of user speeds, which means it is an appropriate approach for mobile users. Finally, collaborative users can improve the system-level performance at the expense of computational complexity. For future work, we will investigate our proposed algorithm with more realistic parameter settings/scenarios and compare the results with other structured beamforming methods.

Conflicts of Interest
The authors declare that they have no conflicts of interest.