A Semidynamic Bidirectional Clustering Algorithm for Downlink Cell-Free Massive Distributed Antenna System

Cell-free massive distributed antenna systems (CF-MDASs) can further reduce the access distance between mobile stations (MSs) and remote access points (RAPs), which brings lower propagation loss and higher multiplexing gain. However, the interference caused by the overlapping coverage areas of distributed RAPs severely degrades the system sum-rate. Since clustering RAPs can mitigate this interference, in this paper we investigate a novel clustering algorithm for a downlink CF-MDAS with limited-capacity backhaul. To reduce the backhaul burden and mitigate interference effectively, we propose a semidynamic bidirectional clustering algorithm based on long-term channel state information (CSI), which has low computational complexity. Simulation results show that the proposed algorithm achieves a higher sum-rate than static clustering and comes close to the sum-rate of a dynamic clustering algorithm that uses short-term CSI. Furthermore, the proposed algorithm shows a significant performance gain regardless of the network size.

1. Introduction
1.1. Background and Related Work. Recently, with the widespread adoption of smartphones and the popularity of multimedia services, mobile data traffic has been growing explosively. As current cellular networks are reaching their limits, there is an urgent need to develop innovative solutions [1]. In cell-free massive distributed antenna systems (CF-MDASs), the antennas are distributed over the intended coverage area. Such a system has a very large number of remote access points (RAPs), which can use direct measurements of the channel characteristics to serve all mobile stations (MSs) in the same frequency band [2]. It is expected to be a key technology enabler of sixth generation (6G) mobile communication systems [3]. In a CF-MDAS, a large number of MSs in the whole area are served simultaneously by a large number of separately distributed RAPs, which coordinate with the central processing unit (CPU) [4].
In contrast to the traditional DAS [5], CF-MDAS can further reduce the access distance between MSs and RAPs, which brings low path loss and high spatial multiplexing gain. However, CF-MDAS suffers from more serious inter-RAP interference than the conventional DAS, especially in overlapping coverage areas. Collaboration among RAPs can effectively improve the system sum-rate. Nevertheless, it requires the complete channel state information (CSI) of all RAPs to be processed jointly, strict synchronization across RAPs, and a backhaul with strong information exchange capability. RAP clustering is regarded as one of the promising techniques to combat inter-RAP interference in CF-MDAS [6,7]: it reduces the scale of collaboration and, because full CSI is shared only among a limited number of RAPs, it also reduces the backhaul burden.
Generally, clustering algorithms are classified into static clustering and dynamic clustering. Static clusters are formed according to the geographic locations of the BSs without any CSI [8,9]. The number of selected BSs in a cluster is fixed and does not change over time. Though static clustering is simple and does not rely on fast backhaul, MSs at the edge of clusters still suffer from serious intercluster interference. Several studies on dynamic clustering have tried to overcome these problems. Shi et al. proposed a dynamic user-centric cell clustering algorithm [10], which can not only cancel the intracluster interference jointly but also effectively reduce the overall and per-BS cooperation cost. However, it relies on short-term CSI. In [11], the authors proposed a clustering algorithm based on maximum coordination gain, which focuses on minimizing the intercell interference to the cell-edge MS. Nevertheless, this clustering algorithm ignores the bidirectional cooperation gain between RAPs. The authors in [12] proposed a bidirectional dynamic network (BDN) to achieve better spectral efficiency (SE) performance. It is worth noting that although dynamic clustering can achieve higher cooperative gains than static clustering, its complexity is very high.
It is also noted that most of the existing clustering algorithms are unidirectional: a cluster freely chooses the partner cluster that brings the highest channel gain to itself. At the same time, forming cooperative clusters dynamically on the basis of short-term CSI results in frequent cluster changes and leads to a large signaling overhead. Furthermore, in the CF-MDAS, owing to the limited-capacity backhaul, sharing the short-term CSI and data information of all RAPs is difficult.

1.2. Motivation and Contributions. In this paper, we investigate the clustering problem of the CF-MDAS with limited-capacity backhaul, aiming at maximizing the system sum-rate. Traditional dynamic clustering algorithms cannot be applied to the CF-MDAS directly, since they consider only unidirectional clustering and depend on short-term CSI. To this end, we propose a semidynamic bidirectional clustering algorithm using long-term CSI. The main idea of the algorithm is to cluster RAPs according to the bidirectional average rate gain among clusters. Our novelties and contributions can be summarized as follows:
(i) We develop a network model where each MS is randomly placed into the network. In contrast to most of the literature, where the number of MSs in each RAP is fixed, we allow the number of MSs to differ from cluster to cluster.
(ii) We derive a closed-form expression for the average rate per MS based on some approximation techniques, which can be computed with low computational complexity.
(iii) Based on the derived expression, we show that the semidynamic bidirectional clustering process is approximately equivalent to combining, in each iteration, the two clusters with the maximum bidirectional average rate gain per MS.
(iv) We propose a semidynamic bidirectional clustering algorithm for the downlink CF-MDAS. The proposed algorithm can reduce the backhaul burden and obtain a higher sum-rate with long-term CSI. Simulation results show that our proposed algorithm achieves a higher sum-rate than static clustering. Furthermore, it achieves a performance very close to that of the dynamic clustering algorithm with short-term CSI.
The remainder of this paper is organized as follows. In Section 2, the system model used in this study is described. A semidynamic bidirectional clustering algorithm is proposed in Section 3. Then, we discuss the simulation assumptions and compare the performance of different RAP clustering algorithms in Section 4. Finally, the paper is concluded in Section 5.
For notations, matrices and column vectors are denoted by bold capital letters $\mathbf{X}$ and bold lowercase letters $\mathbf{x}$, respectively. The transpose and Hermitian transpose are denoted by $(\cdot)^T$ and $(\cdot)^H$, respectively. The $F \times F$ identity matrix is denoted by $\mathbf{I}_F$. The vector 2-norm of $\mathbf{x}$ is represented by $\|\mathbf{x}\|$. The space of all $M \times N$ matrices with complex entries is represented by $\mathbb{C}^{M \times N}$. The number of combinations of $k$ elements taken from $n$ different elements is denoted by $\binom{n}{k}$. A complex Gaussian distribution with mean 0 and variance $\sigma^2$ is denoted by $\mathcal{CN}(0, \sigma^2)$. The Gamma distribution with shape parameter $\mu$ and scale parameter $\theta$ is denoted by $\Gamma(\mu, \theta)$. The cardinality of a set $\mathcal{U}$ is denoted by $|\mathcal{U}|$. The expectation operation is denoted by $E(\cdot)$.

2. System Model
In this section, the general system model for the CF-MDAS is introduced, including the network model, the channel model, and the signal model. Then, the ergodic achievable sum-rate is given.

2.1. Network Model. We consider a two-dimensional downlink CF-MDAS which consists of a set $\mathcal{M} = \{1, 2, \cdots, M\}$ of RAPs and a set $\mathcal{K} = \{1, 2, \cdots, K\}$ of MSs, as shown in Figure 1. The RAPs are located at the centers of the hexagons, and each RAP is equipped with $N_t$ antennas. Define $R_1$ as the distance between a RAP and any vertex of its hexagon; the distance between two nearest RAPs is therefore $\sqrt{3} R_1$. MSs are distributed randomly in the network, and each MS is equipped with a single antenna. We define the network size as the number of RAPs along each dimension, which can be varied. A simple illustration of clustering is given in Figure 1, where the RAPs in hexagons of the same color form a cluster. If a RAP is associated with no MS, it is assumed to be sleeping.
Let $\mathcal{V} = \{\mathcal{V}_1, \mathcal{V}_2, \cdots, \mathcal{V}_L\}$, with $\mathcal{V}_i \cap \mathcal{V}_j = \emptyset$ for all $i \neq j$, be the set of clusters, where $L$ is the number of clusters. Each MS is associated with the RAP that offers the largest large-scale fading coefficient. Denote the set of MSs in cluster $i$ as $\mathcal{U}_i$, where $\mathcal{U}_i = \{1, 2, \cdots, |\mathcal{U}_i|\}$. Then, the set of MSs can be finally defined as $\mathcal{U} = \{\mathcal{U}_1, \mathcal{U}_2, \cdots, \mathcal{U}_L\}$.

2.2. Channel Model. The channel vector between RAP $m$ in cluster $i$ and MS $k$ in cluster $i$ is denoted as
$$\mathbf{h}_{m_i k_i} = \sqrt{\beta_{m_i k_i}}\, \mathbf{g}_{m_i k_i}, \tag{1}$$
where $\mathbf{g}_{m_i k_i} \in \mathbb{C}^{N_t \times 1}$ denotes the small-scale fading vector with independent and identically distributed (i.i.d.) $\mathcal{CN}(0, 1)$ elements, and $\beta_{m_i k_i}$ denotes the large-scale fading, which can be modeled as
$$\beta_{m_i k_i} = f_{m_i k_i}\, d_{m_i k_i}^{-\eta}, \tag{2}$$
where $f_{m_i k_i}$ is a log-normal shadow fading variable between RAP $m$ and MS $k$, $\eta$ is the path loss exponent, and $d_{m_i k_i}$ is the distance between RAP $m$ and MS $k$.
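As an illustration of the channel model, the following minimal Python sketch draws one channel vector as the product of the square root of a log-normally shadowed path loss and an i.i.d. $\mathcal{CN}(0, 1)$ vector. The function names and the shadowing standard deviation of 8 dB are assumptions made for this example only; they are not taken from the simulation parameters of this paper.

```python
import math
import random


def large_scale_fading(distance, eta=4.0, shadow_std_db=8.0, rng=random):
    """Return beta = f * d^(-eta), where f is log-normal shadowing.

    shadow_std_db is an assumed value chosen for illustration.
    """
    shadow_db = rng.gauss(0.0, shadow_std_db)
    f = 10.0 ** (shadow_db / 10.0)
    return f * distance ** (-eta)


def small_scale_fading(n_t, rng=random):
    """Return n_t i.i.d. CN(0, 1) entries (real and imaginary variance 1/2 each)."""
    s = math.sqrt(0.5)
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n_t)]


def channel_vector(distance, n_t, eta=4.0, shadow_std_db=8.0, rng=random):
    """Return (beta, h) with h = sqrt(beta) * g for one RAP-MS link."""
    beta = large_scale_fading(distance, eta, shadow_std_db, rng)
    g = small_scale_fading(n_t, rng)
    return beta, [math.sqrt(beta) * x for x in g]


rng = random.Random(0)
beta, h = channel_vector(distance=100.0, n_t=2, rng=rng)
print(beta, h)
```

Only `beta` changes slowly over time, which is why a clustering rule built on it needs far less frequent updates than one built on `h`.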

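2.3. Signal Model. The received signal of MS $k$ in cluster $i$ is
$$y_{k_i} = \sqrt{P}\, \mathbf{h}_{k_i}^H \mathbf{w}_{k_i} s_k + \sqrt{P} \sum_{a \in \mathcal{U}_i,\, a \neq k} \mathbf{h}_{k_i}^H \mathbf{w}_{a_i} s_a + \sqrt{P} \sum_{j \neq i} \sum_{b \in \mathcal{U}_j} \mathbf{h}_{j k_i}^H \mathbf{w}_{b_j} s_b + z_{k_i}, \tag{3}$$
where $\mathbf{h}_{k_i} = [\mathbf{h}_{1_i k_i}^T, \mathbf{h}_{2_i k_i}^T, \cdots, \mathbf{h}_{|\mathcal{V}_i|_i k_i}^T]^T$ is the composite channel vector from all RAPs in cluster $i$ to MS $k$ in cluster $i$, $\mathbf{h}_{j k_i} = [\mathbf{h}_{1_j k_i}^T, \mathbf{h}_{2_j k_i}^T, \cdots, \mathbf{h}_{|\mathcal{V}_j|_j k_i}^T]^T$ is the composite channel vector from all RAPs in cluster $j$ to MS $k$ in cluster $i$, $\mathbf{w}_{k_i}$ is the beamforming vector assigned to MS $k$ in cluster $i$, and $s_k$ is the unit-variance data symbol destined for MS $k$. $z_{k_i}$ is the noise following the distribution $\mathcal{CN}(0, \sigma^2)$, and $P$ is the average transmit power of each RAP.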

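2.4. Ergodic Sum-Rate. The intracluster interference can be cancelled by using zero-forcing (ZF) beamforming, that is,
$$\mathbf{h}_{a_i}^H \mathbf{w}_{k_i} = 0, \quad \forall a \in \mathcal{U}_i,\ a \neq k, \tag{4}$$
where $\mathbf{w}_{k_i}$ is the normalized $k$-th column of $\mathbf{H}_i (\mathbf{H}_i^H \mathbf{H}_i)^{-1}$ and $\mathbf{H}_i = [\mathbf{h}_{1_i}, \mathbf{h}_{2_i}, \cdots, \mathbf{h}_{|\mathcal{U}_i|_i}]$ is the compound channel matrix between the RAPs in cluster $i$ and the MSs within the cluster. Therefore, the signal-to-interference-plus-noise ratio (SINR) of MS $k$ is
$$\mathrm{SINR}_{k_i} = \frac{P\, |\mathbf{h}_{k_i}^H \mathbf{w}_{k_i}|^2}{\sum_{j \neq i} \sum_{b \in \mathcal{U}_j} P\, |\mathbf{h}_{j k_i}^H \mathbf{w}_{b_j}|^2 + \sigma^2}. \tag{5}$$
Then, the downlink rate of MS $k$ can be expressed as
$$R_{k_i} = \log_2 (1 + \mathrm{SINR}_{k_i}). \tag{6}$$
With the above observations, the ergodic achievable sum-rate of the system can be presented as
$$R_{\mathrm{sum}} = E\Big( \sum_{i=1}^{L} \sum_{k \in \mathcal{U}_i} R_{k_i} \Big). \tag{7}$$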
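To make the ZF step concrete, the sketch below builds unit-norm ZF beams for one cluster from the pseudo-inverse $\mathbf{H}(\mathbf{H}^H\mathbf{H})^{-1}$ and evaluates a rate of the form $\log_2(1 + \mathrm{SINR})$. It uses only the Python standard library; the helper names (`zf_beams`, `downlink_rate`), the unit-norm beam normalization, and the toy dimensions (3 MSs, 2 RAPs with 2 antennas each) are assumptions made for illustration, and the intercluster interference is passed in as a plain number.

```python
import math
import random


def inner(x, y):
    """Return the Hermitian inner product x^H y."""
    return sum(a.conjugate() * b for a, b in zip(x, y))


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def zf_beams(channels):
    """channels[k] is the composite channel of MS k; return one unit-norm beam per MS."""
    k, n = len(channels), len(channels[0])
    gram = [[inner(channels[a], channels[b]) for b in range(k)] for a in range(k)]
    gram_inv = invert(gram)
    beams = []
    for b in range(k):
        # Column b of H (H^H H)^{-1}, where column a of H is channels[a].
        col = [sum(channels[a][i] * gram_inv[a][b] for a in range(k)) for i in range(n)]
        norm = math.sqrt(sum(abs(v) ** 2 for v in col))
        beams.append([v / norm for v in col])
    return beams


def downlink_rate(h, w, p, interference, noise):
    """Rate log2(1 + SINR) for a given intercluster interference power."""
    sinr = p * abs(inner(h, w)) ** 2 / (interference + noise)
    return math.log2(1.0 + sinr)


rng = random.Random(0)
s = math.sqrt(0.5)
channels = [[complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(4)]
            for _ in range(3)]
beams = zf_beams(channels)
leak = max(abs(inner(channels[a], beams[b]))
           for a in range(3) for b in range(3) if a != b)
print("largest intracluster leakage:", leak)
```

The printed leakage should be at the level of floating-point round-off, which is the numerical counterpart of the ZF condition.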

3. Semidynamic Bidirectional Clustering Algorithm
As stated previously, dynamic clustering based on short-term CSI leads to a large signaling overhead among RAPs and MSs, making it infeasible in practical systems. Therefore, we propose to form clusters based on long-term CSI. In this section, we first derive the asymptotic average rate per MS associated with long-term CSI, which will be used in the following clustering algorithm design. In what follows, we analyze the bidirectional cooperation willingness and the complexity of optimal clustering by the exhaustive search (ES) algorithm. Finally, a semidynamic bidirectional clustering algorithm is proposed.
3.1. Average Rate per MS. In this subsection, we first employ a Gamma approximation technique pioneered in [13,14] to obtain the distributions of both the signal and the interference terms. Based on the distributions, we derive the asymptotical average rate per MS in the high-SINR regime.
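The useful channel strength can be denoted as
$$\mathbf{h}_{k_i}^H \mathbf{h}_{k_i} = \sum_{m \in \mathcal{V}_i} \beta_{m_i k_i}\, \mathbf{g}_{m_i k_i}^H \mathbf{g}_{m_i k_i}, \tag{8}$$
where $\mathbf{g}_{m_i k_i}^H \mathbf{g}_{m_i k_i} = \|\mathbf{g}_{m_i k_i}\|^2$ is distributed as a chi-square random variable (RV) with $2N_t$ degrees of freedom scaled by 1/2 [14]; thus $\beta_{m_i k_i} \mathbf{g}_{m_i k_i}^H \mathbf{g}_{m_i k_i} \sim \Gamma(N_t, \beta_{m_i k_i})$. Therefore, $\mathbf{h}_{k_i}^H \mathbf{h}_{k_i}$ is a sum of independent Gamma RVs, which does not yield a mathematically tractable expression. Fortunately, the sum of independent nonidentically distributed Gamma RVs can be well approximated by employing the second-order matching technique shown in the following lemma.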
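Lemma 1 (see [13]). Assume $\{x_i\}$ are independent Gamma RVs with shape and scale parameters $\mu_i$ and $\theta_i$. The sum $\sum_i x_i$ can be approximated by another Gamma-distributed RV $Y \sim \Gamma(\mu, \theta)$ which has the same first- and second-order moments, with
$$\mu = \frac{\left(\sum_i \mu_i \theta_i\right)^2}{\sum_i \mu_i \theta_i^2}, \qquad \theta = \frac{\sum_i \mu_i \theta_i^2}{\sum_i \mu_i \theta_i}.$$
As a consequence of Lemma 1, the distribution of the useful channel strength can be approximated by $\Gamma(\mu_{k_i}, \theta_{k_i})$, wherein
$$\mu_{k_i} = N_t \frac{\left(\sum_{m \in \mathcal{V}_i} \beta_{m_i k_i}\right)^2}{\sum_{m \in \mathcal{V}_i} \beta_{m_i k_i}^2}, \qquad \theta_{k_i} = \frac{\sum_{m \in \mathcal{V}_i} \beta_{m_i k_i}^2}{\sum_{m \in \mathcal{V}_i} \beta_{m_i k_i}}. \tag{9}$$
Similarly, the interference channel strength can be denoted as
$$\mathbf{h}_{j k_i}^H \mathbf{h}_{j k_i} = \sum_{t \in \mathcal{V}_j} \beta_{t_j k_i}\, \mathbf{g}_{t_j k_i}^H \mathbf{g}_{t_j k_i}, \tag{10}$$
and its approximate distribution is $\Gamma(\mu_{k_j}, \theta_{k_j})$, where
$$\mu_{k_j} = N_t \frac{\left(\sum_{t \in \mathcal{V}_j} \beta_{t_j k_i}\right)^2}{\sum_{t \in \mathcal{V}_j} \beta_{t_j k_i}^2}, \qquad \theta_{k_j} = \frac{\sum_{t \in \mathcal{V}_j} \beta_{t_j k_i}^2}{\sum_{t \in \mathcal{V}_j} \beta_{t_j k_i}}. \tag{11}$$
From (9), it is easy to see that $\mu_{k_i} \leq |\mathcal{V}_i| N_t$, where the upper bound is attained when $\beta_{1_i k_i} = \beta_{2_i k_i} = \cdots = \beta_{|\mathcal{V}_i|_i k_i}$. In order to get tractable distributions, the authors in [13] used (9) and (11) to obtain an equivalent i.i.d. channel vector of MS $k$, i.e., they approximated the nonisotropic channel vector $\mathbf{h}_{k_i}$ as an isotropic vector with i.i.d. $\mathcal{CN}(0, \theta_{k_i})$ elements and the nonisotropic channel vector $\mathbf{h}_{j k_i}$ as an isotropic vector with i.i.d. $\mathcal{CN}(0, \theta_{k_j})$ elements. Besides, the authors in [12] proposed that each spatial dimension contributes $\mu_{k_i} / (|\mathcal{V}_i| N_t)$ and $\mu_{k_j} / (|\mathcal{V}_j| N_t)$ to the shape parameters of the distributions of the signal term and the interference term, respectively. Noting that each signal beam lies in a $(|\mathcal{V}_i| N_t - |\mathcal{U}_i| + 1)$-dimensional space and each interference beam spans a one-dimensional space [14-16], the shape parameter of the distribution of the signal term $|\mathbf{h}_{k_i}^H \mathbf{w}_{k_i}|^2$ becomes $(|\mathcal{V}_i| N_t - |\mathcal{U}_i| + 1)\, \mu_{k_i} / (|\mathcal{V}_i| N_t)$, and the shape parameter of the distribution of the interference term $|\mathbf{h}_{j k_i}^H \mathbf{w}_{b_j}|^2$ becomes $\mu_{k_j} / (|\mathcal{V}_j| N_t)$. Therefore, the distributions of the signal and interference terms can be written as, respectively,
$$|\mathbf{h}_{k_i}^H \mathbf{w}_{k_i}|^2 \sim \Gamma\!\left( (|\mathcal{V}_i| N_t - |\mathcal{U}_i| + 1) \frac{\mu_{k_i}}{|\mathcal{V}_i| N_t},\ \theta_{k_i} \right), \tag{12}$$
$$|\mathbf{h}_{j k_i}^H \mathbf{w}_{b_j}|^2 \sim \Gamma\!\left( \frac{\mu_{k_j}}{|\mathcal{V}_j| N_t},\ \theta_{k_j} \right), \tag{13}$$
with $\mu_{k_i}$, $\theta_{k_i}$, $\mu_{k_j}$, and $\theta_{k_j}$ as defined in (9) and (11).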
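The second-order matching of Lemma 1 is easy to check numerically. The short sketch below, with hypothetical function names, computes the matched shape and scale for a sum of independent Gamma RVs and specializes it to the useful channel strength, where every term has shape $N_t$ and scale equal to the corresponding large-scale fading coefficient.

```python
def match_gamma_sum(shapes, scales):
    """Return (mu, theta) of the Gamma RV matching the mean and variance of the sum."""
    mean = sum(m * t for m, t in zip(shapes, scales))      # sum of mu_i * theta_i
    var = sum(m * t * t for m, t in zip(shapes, scales))   # sum of mu_i * theta_i^2
    return mean * mean / var, var / mean


def useful_channel_params(betas, n_t):
    """Matched (mu, theta) for the sum over RAPs of Gamma(n_t, beta_m) terms."""
    return match_gamma_sum([n_t] * len(betas), betas)


# Equal large-scale fading: the shape reaches its upper bound |V_i| * N_t.
print(useful_channel_params([0.5, 0.5, 0.5], n_t=2))
# Unequal large-scale fading: the shape falls below |V_i| * N_t.
print(useful_channel_params([1.0, 0.1], n_t=2))
```

The second call illustrates why a distant RAP adds little to the effective number of spatial dimensions: its small coefficient barely changes the matched shape.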
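In [14], the authors assumed that the ZF beams designed at each interfering cluster are orthogonal and verified the accuracy of this approximation. Based on this, $\sum_{b \in \mathcal{U}_j} |\mathbf{h}_{j k_i}^H \mathbf{w}_{b_j}|^2$ is a sum of $|\mathcal{U}_j|$ independent Gamma RVs which have the same scale parameter. Therefore, the total interference power produced by cluster $j$ follows
$$\sum_{b \in \mathcal{U}_j} |\mathbf{h}_{j k_i}^H \mathbf{w}_{b_j}|^2 \sim \Gamma\!\left( |\mathcal{U}_j| \frac{\mu_{k_j}}{|\mathcal{V}_j| N_t},\ \theta_{k_j} \right). \tag{14}$$
Proposition 2. Based on (12) and (14) and in the high-SINR regime, the average rate of MS $k$ in cluster $i$ can be approximated in closed form, where $\beta_{i k_i} = (\sum_{m \in \mathcal{V}_i} \beta_{m_i k_i}) / |\mathcal{V}_i|$ and $\beta_{j k_i} = (\sum_{t \in \mathcal{V}_j} \beta_{t_j k_i}) / |\mathcal{V}_j|$ denote the average large-scale fading from MS $k$ to its own cluster $i$ and to the interfering cluster $j$, respectively.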

For proof, see Appendix A.

3.2. Bidirectional Cooperation. In general, $\gamma_{k_i} \neq \gamma_{b_j}$, which means that the cooperation desire can differ greatly between the two directions. As shown in Figures 2(a)-2(d), there are four different cooperation scenarios according to the locations of the MSs.

Figure 2: Cooperation scenarios between clusters $i$ and $j$. (a) $\gamma_{k_i}$ and $\gamma_{b_j}$ are both small: clusters $i$ and $j$ both want to cooperate with each other. (b) $\gamma_{k_i}$ is small and $\gamma_{b_j}$ is large: cluster $i$ wants to cooperate with cluster $j$, but cluster $j$ does not want to cooperate with cluster $i$. (c) $\gamma_{k_i}$ is large and $\gamma_{b_j}$ is small: cluster $i$ does not want to cooperate with cluster $j$, but cluster $j$ wants to cooperate with cluster $i$. (d) $\gamma_{k_i}$ and $\gamma_{b_j}$ are both large: neither cluster $i$ nor cluster $j$ wants to cooperate with the other.

In Figures 2(b) and 2(c), the cooperation desire of one side is strong while that of the other side is correspondingly weak. If only the unidirectional cooperation desire of one side is considered, the optimal clustering result cannot be obtained. To this end, we should consider the bidirectional cooperation between clusters $i$ and $j$.

3.3. The Problem of Clustering. Due to the limited-capacity backhaul, the maximum number of RAPs in a cluster is defined as $Q$. By restricting the value of $Q$, the information exchange among intracluster RAPs can be reduced. To maximize the system sum-rate, the clustering problem can be described as maximizing the ergodic sum-rate over all partitions of the RAPs into disjoint clusters that each contain at most $Q$ RAPs. This is a combinatorial optimization problem, and the optimal solution can be obtained by the exhaustive search (ES) algorithm. All possible clustering results whose cluster sizes do not exceed $Q$ form a set $\mathcal{G}$. Determining $|\mathcal{G}|$ requires two steps. The first step is to determine the cluster sizes $|\mathcal{V}_i|$, $i = 1, 2, \cdots, L$. The second step is to calculate the number of RAP combinations for these sizes. Even when all clusters are assumed to have the same size, i.e., $|\mathcal{V}_1| = |\mathcal{V}_2| = \cdots = |\mathcal{V}_L| = [M/L]$, the number of possible RAP cluster combinations grows factorially with $M$. Therefore, the complexity of clustering by the ES algorithm is $O(M \cdot M!)$, and for a large $M$ this method is not feasible. Alternatively, we propose a low-complexity semidynamic bidirectional clustering algorithm based on a greedy algorithm, discussed in the next subsection, to find a good suboptimal solution.
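To give a feeling for the size of the search space, the following sketch counts the ways to partition $M$ labelled RAPs into clusters of at most $Q$ RAPs each. It is an illustration under a simplifying assumption: every partition is counted, regardless of whether the RAPs in a cluster are neighbors, so it is not the expression for $|\mathcal{G}|$ used in this paper. The function name is hypothetical.

```python
from functools import lru_cache
from math import comb


def count_clusterings(num_raps, q_max):
    """Count partitions of num_raps labelled RAPs into clusters of size <= q_max."""

    @lru_cache(maxsize=None)
    def count(n):
        if n == 0:
            return 1
        # Fix the cluster containing one particular RAP: choose its s - 1 partners
        # among the other n - 1 RAPs, then partition the remaining n - s RAPs.
        return sum(comb(n - 1, s - 1) * count(n - s)
                   for s in range(1, min(q_max, n) + 1))

    return count(num_raps)


for m in (4, 9, 16):
    print(m, count_clusterings(m, q_max=4), m ** 3)
```

Comparing the second and third printed columns shows how quickly exhaustive search outgrows a cubic number of evaluations as the network size increases.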
3.4. Semidynamic Bidirectional Clustering Algorithm. Considering the bidirectional cooperation, we use the rate gain per MS of cluster $i$ after cooperating with cluster $j$ to measure its unidirectional desire to cooperate with cluster $j$. Assume that clusters $i$ and $j$ cooperate as a new cluster $l$; we then define $\alpha(i, j)$ and $\alpha(j, i)$ as the rate gains per MS of clusters $i$ and $j$ after cooperating. These gains are expressed in terms of $R_i = (\sum_{k \in \mathcal{U}_i} R_{k_i}) / |\mathcal{U}_i|$, the average rate per MS of cluster $i$, and of $R_j$ and $R_l$, which are defined analogously for clusters $j$ and $l$.
Using the greedy algorithm, the problem of clustering to maximize the system sum-rate while considering the bidirectional cooperation can be translated into combining, in each iteration, the two clusters with the maximum bidirectional average rate gain per MS. According to the analysis above, the proposed semidynamic clustering algorithm is summarized in Algorithm 1. The algorithm is semidynamic because the large-scale fading coefficients change slowly compared to the small-scale fading and can be easily tracked, e.g., over a few minutes; the period for updating the clusters therefore depends on how fast the large-scale fading changes.
In step 1, each MS is associated with one RAP, which is determined by the large-scale fading coefficients.
Step 3 describes the process of merging clusters, which uses the greedy algorithm to find suboptimal cooperative clusters under the cluster size limit, with the aim of reducing the computational complexity. The number of available cluster combinations in an iteration is at most $\binom{M}{2}$, and at most $M - 1$ merges can take place. Thus, the complexity of our proposed algorithm is $O(M^3)$, which is lower than that of clustering by the ES algorithm.
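The association and greedy merging steps described above can be sketched as follows. This is a minimal illustration and not a transcription of Algorithm 1: it assumes that the rate gain of a cluster is the difference between the average rate per MS of the merged cluster and that of the original cluster, that the bidirectional gain is the sum of the two unidirectional gains, and that merging stops when no feasible pair yields a positive gain. The callable `avg_rate` stands in for the closed-form average rate of Section 3.1, and `toy_avg_rate` with its `AFFINITY` table is a hypothetical placeholder used only to exercise the code.

```python
from itertools import combinations


def associate_ms(beta):
    """beta[m][k]: large-scale fading between RAP m and MS k; return the best RAP per MS."""
    num_raps, num_ms = len(beta), len(beta[0])
    return [max(range(num_raps), key=lambda m: beta[m][k]) for k in range(num_ms)]


def greedy_bidirectional_clustering(raps, avg_rate, q_max):
    """Greedily merge the pair of clusters with the largest bidirectional gain."""
    clusters = [frozenset([m]) for m in raps]
    while True:
        best_gain, best_pair = 0.0, None
        for ci, cj in combinations(clusters, 2):
            if len(ci) + len(cj) > q_max:
                continue  # the merged cluster would exceed the backhaul limit Q
            merged = ci | cj
            trial = [c for c in clusters if c not in (ci, cj)] + [merged]
            r_l = avg_rate(merged, trial)
            # Assumed bidirectional gain: alpha(i, j) + alpha(j, i).
            gain = (r_l - avg_rate(ci, clusters)) + (r_l - avg_rate(cj, clusters))
            if gain > best_gain:
                best_gain, best_pair = gain, (ci, cj)
        if best_pair is None:
            return clusters
        ci, cj = best_pair
        clusters = [c for c in clusters if c not in best_pair] + [ci | cj]


# Hypothetical stand-in for the average rate per MS (ignores the partition argument).
AFFINITY = {frozenset({0, 1}): 1.0, frozenset({2, 3}): 1.0}


def toy_avg_rate(cluster, partition):
    pairs = combinations(sorted(cluster), 2)
    return sum(AFFINITY.get(frozenset(p), 0.1) for p in pairs) / len(cluster)


clusters = greedy_bidirectional_clustering(range(4), toy_avg_rate, q_max=2)
print(sorted(sorted(c) for c in clusters))
```

Because `avg_rate` only needs long-term quantities, the loop has to be rerun only when the large-scale fading changes, not in every time slot.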

4. Numerical Results
In this section, we give numerical simulations to compare different clustering algorithms. In the simulations, we use (1) to generate the channels and set the path loss exponent to $\eta = 4$; each RAP is equipped with 2 antennas and each MS is equipped with a single antenna. In addition, the height of the transmit antennas is 20 m. We define the maximum number of RAPs in a cluster as $Q = 4$. We assume the noise power is -102 dBm, and the cell-edge received power is varied from -10 dBm to 10 dBm. The detailed simulation parameters are listed in Table 1. To average out the randomness of individual experiments, the system sum-rate is obtained by averaging over 50 drops of MS locations, each of which consists of 10,000 realizations of i.i.d. small-scale channels. To evaluate the proposed algorithm, we simulate the following seven algorithms:
(1) Clustering all RAPs: all RAPs are clustered to serve all MSs.
(2) Semidynamic bidirectional clustering algorithm (with legend "SBCA"): the algorithm proposed in Section 3.
(3) Dynamic bidirectional clustering algorithm (with legend "DBCA"): the algorithm uses short-term CSI and considers the bidirectional cooperation proposed in Section 3.2.
(4) Dynamic unidirectional clustering algorithm (with legend "DUCA") [11]: the algorithm uses short-term CSI but only considers unidirectional selection.
(5) No clustering: each MS is only served by the RAP with the strongest large-scale channel gain and suffers from interference from all the other RAPs.
(6) Static clustering (2 RAPs): an example for a 4 × 4 network topology is shown in Figure 3(a), where each MS is served by a cluster consisting of 2 RAPs. It can be applied to other network topologies.
(7) Static clustering (4 RAPs): an example for a 4 × 4 network topology is shown in Figure 3(b), where each MS is served by a cluster consisting of 4 RAPs. It can be applied to other network topologies.
Figure 4 shows the system sum-rates achieved by the different clustering algorithms versus the cell-edge received power when $K = 10$. It can be seen that the system sum-rates of all algorithms increase with the cell-edge received power, especially in the low cell-edge received power range. This is because the interference is small when the cell-edge received power is low, and the signal power increases faster than the interference power. Clustering all RAPs achieves the highest system sum-rate, but it is an ideal condition which is impossible to implement in reality. Without clustering, MSs suffer from serious inter-RAP interference and the system sum-rate is the lowest. Although the static clustering algorithms can improve the sum-rate, they cannot adapt to changes of the MS locations; the clusters do not change once formed. The proposed semidynamic bidirectional clustering algorithm exhibits a higher sum-rate than all the static clustering and dynamic unidirectional clustering algorithms. This is because the proposed algorithm uses the long-term CSI and considers the bidirectional cooperation. Though the dynamic bidirectional clustering algorithm performs better than the proposed algorithm, it causes more signaling overhead. Considering the limited-capacity backhaul and the computational complexity, the proposed algorithm is more practical. Figure 5 depicts the system sum-rates for different numbers of MSs when the cell-edge received power is 10 dBm. As the number of MSs increases, the proposed algorithm remains superior to the static clustering and dynamic unidirectional clustering algorithms. When the number of MSs is 25, the proposed algorithm increases the sum-rate by about 20% over the static algorithm (4 RAPs). At the same time, the sum-rate of the proposed algorithm is only 1% lower than that of the dynamic bidirectional clustering algorithm.
Figure 6 shows the average rate per MS for different network sizes when $K = 10$ and the cell-edge received power is 10 dBm. When the network size increases, the average rate per MS also increases, because the distance between an MS and its serving RAP becomes shorter and the large-scale attenuation experienced by the MSs becomes smaller. From Figure 6, it is easy to see that the proposed algorithm always yields a significant performance gain. The computational time in MATLAB of the different algorithms is presented in Table 2. Since the static clustering algorithms rely on the geographic locations of the RAPs without any CSI, they require no computational time. The dynamic clustering algorithms form clusters in each time slot, while the semidynamic clustering algorithm utilizes the long-term CSI, so the complexity of dynamic clustering is much higher than that of the proposed semidynamic bidirectional clustering algorithm. The dynamic bidirectional clustering algorithm considers bidirectional cooperation among clusters. It needs to calculate the average rate gain per MS of two clusters, while the dynamic unidirectional clustering algorithm only calculates the average rate gain per MS of one cluster in each iteration. Therefore, the simulation time of the dynamic bidirectional clustering algorithm is longer than that of the dynamic unidirectional clustering algorithm. It can be seen from Table 2 that the complexity of our proposed algorithm is low.

5. Conclusion
In this paper, we have studied RAP clustering for the downlink CF-MDAS and proposed a semidynamic bidirectional clustering algorithm with low computational complexity. We derived an approximate expression of the average rate per MS associated with the long-term CSI. Considering the limited-capacity backhaul of the CF-MDAS, it is impractical to share the short-term CSI and data information of all RAPs. For this reason, the proposed clustering algorithm relies only on long-term CSI. In addition, the proposed algorithm considers bidirectional cooperation among clusters. The results showed that the proposed algorithm exhibits a higher sum-rate than all the static clustering and dynamic unidirectional clustering algorithms. Meanwhile, the sum-rate provided by the proposed algorithm is close to that of the dynamic bidirectional clustering algorithm using short-term CSI. Moreover, the proposed algorithm performs well regardless of the network size.
Appendix A
Based on Lemma 3, we have (A.2) and (A.3). To simplify the expression, we define $\beta_{i k_i}$ as the average large-scale fading from MS $k$ to its own cluster $i$ and $\beta_{j k_i}$ as the average large-scale fading from MS $k$ to its interfering cluster $j$, as given in (A.4). In the high-SINR regime, $\sigma^2$ can be ignored; then, by substituting (A.2) and (A.3) into (A.1), we obtain (A.5).

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this paper.