Mitigating the noisy solution impact of mixed Gibbs sampling detector in high-order modulation large-scale MIMO systems

A neighborhood-restricted mixed Gibbs sampling (MGS)-based approach is proposed for low-complexity high-order modulation large-scale multiple-input multiple-output (LS-MIMO) detection. The proposed LS-MIMO detector applies a neighborhood limitation (NL) at a distance d on the noisy solution from the MGS, hence the name d-simplified MGS (d-sMGS), in order to mitigate its impact, which can be harmful when a high-order modulation is considered. Numerical simulation results for 64-QAM demonstrate that the proposed detection method substantially improves the convergence of the MGS algorithm without requiring extra computational complexity per iteration. The proposed d-sMGS-based detector further exhibits an improved performance × complexity tradeoff when the system loading is high, i.e., when $\frac{K}{N} \geq 0.75$. Also, as the number of dimensions increases, i.e., with a larger number of antennas and/or a higher modulation order, the looser restriction of 2-sMGS was shown to be a more interesting choice than 1-sMGS.


Introduction
In order to meet the demands of high transmission capacity, high reliability, and spectral and energy efficiency of modern wireless communication systems, the multiple-input multiple-output (MIMO) technique has been proposed and is considered an appropriate solution due to its ability to provide multiplexing and diversity gains without the need for additional spectral resources. These advantages are further enhanced by large-scale use, called large-scale MIMO (LS-MIMO), which has important applications in fifth-generation (5G) wireless communications. Such structures hold the same benefits as conventional MIMO, but on a larger scale. More precisely, LS-MIMO is defined as a transmission/reception design typically using several tens or even hundreds of antennas in at least one of the communication terminals, usually the base station (BS) [1,2]. This turns out to be convenient for the systems in question, since the reduced dimensions were shown to reduce the number of operations due to the lower triangular matrix feature. Furthermore, based on the concept of multiple random parallel Markov chains, the work in [18] proposes a multiple restarts (MR) strategy through parallel chains; this strategy reduced the algorithm's running time compared to MGS-MR, despite increasing the number of real operations per symbol.
The contributions of this work are as follows:
(i) A neighborhood limitation (NL) strategy is proposed to improve the MGS convergence rate under higher-order modulation in the large-scale MIMO regime. The proposed strategy, called d-sMGS (d-simplified MGS), applies an NL to the random solution coming from the mixture used by the MGS detector. As a result, the impact caused by this noisy solution is mitigated and the convergence is accelerated.
(ii) An analysis of the performance × complexity tradeoff is carried out among the proposed d-sMGS, the conventional MGS [10], and the aMGS (averaged MGS) [14]. The latter also aims to alleviate the impact caused by the random solution, but it relies on a multiple sampling (MS) strategy, which samples the estimated symbol multiple times and takes the mean to obtain the result.
The remainder of this paper is organized as follows. Section 2 presents the adopted large-scale MIMO system model. A review of the MGS technique is presented in Section 3, and the MGS-based approaches that reduce the impact of the noisy solution are discussed in Section 4: the aMGS approach is described in Section 4.1 and the proposed simplified MGS with NL detector for LS-MIMO is developed in Section 4.2. Computational complexity is presented in Section 5 and extensive numerical simulation results are analyzed in Section 6. Concluding remarks are provided in Section 7.

System model and problem formulation
We consider an uplink (UL) single-cell MIMO communication system operating in multiplexing gain mode with K active single-antenna users and N receive antennas at the base station (BS), as depicted in Fig. 1. We mainly investigate the performance × complexity tradeoff of suitable LS-MIMO detection schemes and, for simplicity, perfect channel state information is assumed to be available at the BS, which also allows the intrinsic efficiency of each detection technique to be assessed. Thus, the pilot training stage and the respective pilot contamination effect have not been taken into account in this context.
Moreover, for simplicity, the communication channel is assumed to be frequency-flat fading, described by the complex channel matrix $\mathbf{H}_c \in \mathbb{C}^{N \times K}$. The elements of $\mathbf{H}_c$ are independent complex Gaussian random variables with zero mean and unit variance, i.e., $H^c_{i,k} \sim \mathcal{CN}(0, 1)$, where $H^c_{i,k}$ denotes the element in the ith row and kth column of the matrix $\mathbf{H}_c$. Let $\mathbf{s}_c$ be the $K \times 1$ complex vector of the K M-QAM symbols transmitted by the single-antenna users, $\mathbf{s}_c \in \mathcal{A}_c^K$, where $\mathcal{A}_c$ denotes the QAM constellation adopted. The UL received signal, $y^c_i$, at the ith BS antenna can be written as:
$$y^c_i = \sum_{j=1}^{K} H^c_{i,j}\, s^c_j + \eta^c_i, \tag{1}$$
where $y^c_i$ denotes the ith element of the complex received signal vector $\mathbf{y}_c$ and $s^c_j$ is the jth element of $\mathbf{s}_c$. In matrix form, the received signal vector at the BS is rewritten as
$$\mathbf{y}_c = \mathbf{H}_c \mathbf{s}_c + \boldsymbol{\eta}_c, \tag{2}$$
where $\boldsymbol{\eta}_c$ denotes the additive white Gaussian noise (AWGN) vector, assumed to be complex Gaussian with zero mean and covariance $\mathbb{E}\left[\boldsymbol{\eta}_c \boldsymbol{\eta}_c^H\right] = \sigma^2 \mathbf{I}_N$, where $\sigma^2$ is the noise variance at each receive antenna.
The average received SNR at each receive antenna can be modeled as $\gamma = \frac{K P_s}{\sigma^2}$, where $P_s$ is the power of the received symbols. For simplicity, it is considered that the large-scale fading effect has been compensated in such a way that all K users' signals are received with equal power at the BS, with $K P_s$ denoting the total sum power available at the transmitters [19].
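As a small illustration of how the SNR definition is used in simulations, the sketch below converts an SNR in dB into the noise variance per receive antenna. It assumes the reading $\gamma = K P_s / \sigma^2$ of the definition above and takes $P_s$ as the average energy of a square M-QAM constellation with odd-integer levels; both the function names and that choice of $P_s$ are illustrative assumptions, not taken from the paper.

```python
def qam_average_power(order: int) -> float:
    """Average symbol energy of square M-QAM with levels ..., -3, -1, 1, 3, ...

    Assumption: unnormalized odd-integer constellation, energy 2(M - 1)/3.
    """
    return 2.0 * (order - 1) / 3.0


def noise_variance(snr_db: float, num_users: int, symbol_power: float) -> float:
    """Noise variance per receive antenna, assuming gamma = K * P_s / sigma^2."""
    snr_linear = 10.0 ** (snr_db / 10.0)
    return num_users * symbol_power / snr_linear


if __name__ == "__main__":
    p_s = qam_average_power(64)
    print("P_s for 64-QAM:", p_s)
    print("sigma^2 at 25 dB, K = 96:", noise_variance(25.0, 96, p_s))
```
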
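In this work, a real-valued system model corresponding to (2) is adopted, which is given by:
$$\mathbf{y} = \mathbf{H}\mathbf{s} + \boldsymbol{\eta}, \tag{3}$$
where $\mathbf{y} \in \mathbb{R}^{2N \times 1}$, $\mathbf{H} \in \mathbb{R}^{2N \times 2K}$, $\mathbf{s} \in \mathbb{R}^{2K \times 1}$, $\boldsymbol{\eta} \in \mathbb{R}^{2N \times 1}$, defined as:
$$\mathbf{y} = \begin{bmatrix} \Re(\mathbf{y}_c) \\ \Im(\mathbf{y}_c) \end{bmatrix}, \quad \mathbf{H} = \begin{bmatrix} \Re(\mathbf{H}_c) & -\Im(\mathbf{H}_c) \\ \Im(\mathbf{H}_c) & \Re(\mathbf{H}_c) \end{bmatrix}, \quad \mathbf{s} = \begin{bmatrix} \Re(\mathbf{s}_c) \\ \Im(\mathbf{s}_c) \end{bmatrix}, \quad \boldsymbol{\eta} = \begin{bmatrix} \Re(\boldsymbol{\eta}_c) \\ \Im(\boldsymbol{\eta}_c) \end{bmatrix}. \tag{4}$$
For the QAM complex alphabet $\mathcal{A}_c$, the elements of $\mathbf{s}$ assume integer values from the underlying pulse-amplitude modulation (PAM) alphabet $\mathcal{A}$, i.e., $\mathbf{s} \in \mathcal{A}^{2K}$.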
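The complex-to-real conversion can be checked numerically. The following minimal sketch assumes the usual stacking of real and imaginary parts (real parts on top, imaginary parts below) and verifies, on a small noiseless example, that the real-valued product matches the stacked complex product. The helper names (`to_real_model`, `matvec`, `demo`) are hypothetical.

```python
import random


def to_real_model(H_c, s_c):
    """Stack real/imaginary parts: returns H (2N x 2K) and s (length 2K)."""
    N, K = len(H_c), len(s_c)
    H = [[0.0] * (2 * K) for _ in range(2 * N)]
    for i in range(N):
        for k in range(K):
            h = H_c[i][k]
            H[i][k] = h.real          # top-left block: Re(H_c)
            H[i][k + K] = -h.imag     # top-right block: -Im(H_c)
            H[i + N][k] = h.imag      # bottom-left block: Im(H_c)
            H[i + N][k + K] = h.real  # bottom-right block: Re(H_c)
    s = [x.real for x in s_c] + [x.imag for x in s_c]
    return H, s


def matvec(A, x):
    """Plain matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def demo(seed=0):
    """Largest deviation between the real-valued and the complex model (noiseless)."""
    rng = random.Random(seed)
    N, K = 4, 3
    pam = [-3, -1, 1, 3]  # PAM alphabet underlying 16-QAM
    std = 0.5 ** 0.5      # unit-variance complex Gaussian entries
    H_c = [[complex(rng.gauss(0, std), rng.gauss(0, std)) for _ in range(K)]
           for _ in range(N)]
    s_c = [complex(rng.choice(pam), rng.choice(pam)) for _ in range(K)]
    y_c = [sum(H_c[i][k] * s_c[k] for k in range(K)) for i in range(N)]
    H, s = to_real_model(H_c, s_c)
    y = matvec(H, s)
    y_stacked = [v.real for v in y_c] + [v.imag for v in y_c]
    return max(abs(a - b) for a, b in zip(y, y_stacked))


if __name__ == "__main__":
    print("max deviation:", demo())
```

The noise vector stacks in the same way, so it is left out of the check.
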
The maximum likelihood (ML) decision rule is given by: $\hat{\mathbf{s}}_{\mathrm{ML}} = \arg\min_{\hat{\mathbf{s}} \in \mathcal{A}^{2K}} \|\mathbf{y} - \mathbf{H}\hat{\mathbf{s}}\|^2$. However, the complexity of the ML detector is exponential in K, which makes it prohibitive for large K · N, as is the case in LS-MIMO systems [19].
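To make the exponential cost concrete, the sketch below implements the ML rule by exhaustive search over all candidate vectors of the real-valued model. It is only usable for toy dimensions, since the search space has $|\mathcal{A}|^{2K}$ elements; the function names are illustrative.

```python
from itertools import product


def ml_cost(y, H, s):
    """Squared Euclidean distance ||y - H s||^2."""
    return sum((yi - sum(h * x for h, x in zip(row, s))) ** 2
               for yi, row in zip(y, H))


def ml_detect(y, H, alphabet):
    """Exhaustive ML search over alphabet^n, where n is the number of columns of H."""
    n = len(H[0])
    return min(product(alphabet, repeat=n), key=lambda s: ml_cost(y, H, s))


if __name__ == "__main__":
    pam = [-3, -1, 1, 3]
    H = [[1.0, 0.2], [0.1, 1.0], [0.3, -0.4]]
    s_true = (3, -1)
    y = [sum(h * x for h, x in zip(row, s_true)) for row in H]
    print("detected:", ml_detect(y, H, pam))
    # Search-space size for 64-QAM (8-PAM per real dimension) with K = 32 users:
    print("candidates for 2K = 64 dimensions:", 8 ** 64)
```
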

Conventional method: review of mixed Gibbs sampling detection
The mixed Gibbs sampling (MGS) LS-MIMO detector proposed in [10] is revisited in this section. It was motivated by the need to solve the stalling problem of the conventional Gibbs sampling (GS) detector.
To sample the estimated symbol at each position, a target distribution [20] is evaluated, which is given by:
$$p(\hat{s}_1, \dots, \hat{s}_{2K} \mid \mathbf{y}, \mathbf{H}) \propto \exp\left(-\frac{\|\mathbf{y} - \mathbf{H}\hat{\mathbf{s}}\|^2}{\alpha^2 \sigma^2}\right), \tag{5}$$
where $\hat{s}_i$ denotes the ith position of the estimated symbol vector $\hat{\mathbf{s}}$ and $\alpha$ denotes a positive parameter, also called the temperature, which tunes the mixing time of the Markov chain [20]. The conventional Gibbs sampling detector does not include the $\alpha$ parameter in its sampling process and thus can be viewed as the special case $\alpha = 1$. A larger temperature speeds up the mixing and aims to reduce the higher moments of the number of iterations needed to find the correct solution. However, as stated in [10], the stalling problem persists even with large $\alpha$.
The MGS detector utilizes a mixture of (a) conventional Gibbs sampling (i.e., $\alpha = 1$) and (b) the infinite-temperature version of (5) (i.e., $\alpha = \infty$), which results in a uniform random sample from all the possibilities, called a noisy or random solution in this paper. In this way, the MGS follows a sampling distribution given by:
$$p(\hat{s}_1, \dots, \hat{s}_{2K} \mid \mathbf{y}, \mathbf{H}) \sim (1 - q)\,\psi(\alpha_1) + q\,\psi(\alpha_2), \tag{6}$$
and
$$\psi(\alpha) = \exp\left(-\frac{\|\mathbf{y} - \mathbf{H}\hat{\mathbf{s}}\|^2}{\alpha^2 \sigma^2}\right), \tag{7}$$
where q denotes the mixing ratio. The MGS detector of [10] considers the combination $\alpha_1 = 1$, $\alpha_2 = \infty$, which results in near-ML performance, overcomes the stalling problem of the GS, and is also simple to implement. On the other hand, in high-order modulation, such as 64-QAM and 256-QAM, the noisy solution interferes with the algorithm's convergence: since the constellation contains a large number of symbols, a purely random solution in this signal space has a high probability of being far from the real solution, which causes the algorithm to require more iterations to converge. The proposed d-sMGS detector acts to mitigate this harmful effect. Regarding the mixing ratio parameter q, an analysis for low-order QAM constellations is carried out in [10], and its suitable value is found to be the inverse of the number of dimensions in the system, i.e., $q = \frac{1}{2K}$, which is also employed in the proposed detector during our numerical simulations.
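The claim that a uniformly random symbol tends to land far from the true one as the modulation order grows can be quantified with a short calculation. The sketch below is an illustrative aid rather than part of the original analysis: for the PAM alphabet underlying square M-QAM, it computes exactly the probability that a uniform random symbol hits a uniformly chosen true symbol, and the mean distance between them measured in constellation steps. Function names are hypothetical.

```python
from fractions import Fraction


def pam_alphabet(qam_order):
    """PAM alphabet of square M-QAM, e.g. 64-QAM -> [-7, -5, ..., 5, 7]."""
    side = int(round(qam_order ** 0.5))
    return [2 * i - (side - 1) for i in range(side)]


def random_symbol_stats(qam_order):
    """Exact (hit probability, mean distance in symbol steps) for uniform draws."""
    a = pam_alphabet(qam_order)
    n = len(a)
    hit = Fraction(1, n)
    # Adjacent PAM symbols differ by 2, so divide by 2 to count steps.
    mean_steps = Fraction(sum(abs(x - y) for x in a for y in a), 2 * n * n)
    return hit, mean_steps


if __name__ == "__main__":
    for order in (4, 16, 64, 256):
        hit, steps = random_symbol_stats(order)
        print(f"{order}-QAM: P(hit) = {hit}, mean distance = {float(steps):.3f} steps")
```

For 4-QAM the random symbol is on average half a step away, whereas for 64-QAM it is more than two and a half steps away per real dimension, which is consistent with the convergence penalty described above.
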
In the MGS algorithm, an initial solution $\hat{\mathbf{s}}^{(t=0)}$ is considered for the estimated symbol vector, where t denotes the current iteration. The initial solution may be chosen either as a random symbol vector or as the output of a low-complexity linear detector, such as zero forcing (ZF) or MMSE. The index i denotes both the position in the vector $\mathbf{s}$ and the coordinate of the MGS algorithm, with i = 1, 2, . . . , 2K. Therefore, each iteration requires 2K coordinate updates, which are performed by sampling the distributions given by:
$$\hat{s}_i^{(t+1)} \sim p\left(\hat{s}_i \mid \hat{s}_1^{(t+1)}, \dots, \hat{s}_{i-1}^{(t+1)}, \hat{s}_{i+1}^{(t)}, \dots, \hat{s}_{2K}^{(t)}, \mathbf{y}, \mathbf{H}\right). \tag{8}$$
One can notice from (8) that each updated coordinate is fed, in the same iteration, to the next coordinate. The probability of the ith symbol assuming the value $a_j \in \mathcal{A}$, $\forall j = 1, \dots, |\mathcal{A}|$, can be written as:
$$p\left(\hat{s}_i = a_j \mid \hat{\mathbf{s}}_{i-1}, \mathbf{y}, \mathbf{H}\right) = \frac{\exp\left(-\frac{\|\mathbf{y} - \mathbf{H}\hat{\mathbf{s}}_{i,j}\|^2}{\sigma^2}\right)}{\sum_{l=1}^{|\mathcal{A}|} \exp\left(-\frac{\|\mathbf{y} - \mathbf{H}\hat{\mathbf{s}}_{i,l}\|^2}{\sigma^2}\right)}, \tag{9}$$
where $|\mathcal{A}|$ is the cardinality of the set $\mathcal{A}$, while $\hat{\mathbf{s}}_{i,j}$ denotes the vector $\hat{\mathbf{s}}^{(t)}$ with its ith position changed to the symbol $a_j$. The sampling process based on (9) can run into numerical limitations due to the exponential function. For this reason, the implementation was carried out through a logarithmic intermediate step, in which the elements of f are taken in descending order, for i = 1, . . . , |A|. A practical and computationally efficient evaluation of the MGS target function is summarized in Algorithm 1.

Algorithm 1 MGS Target Distribution Function Calculation
    …
        for j = 1 to |A| do
            g_j = f_j − f
            p(ŝ_i = a_j | ŝ_{i−1}, y, H) = exp(g_j)
        end for
    end for
    //Terminate

The MGS algorithm ends after a certain number of iterations, and the vector of estimated symbols is chosen as the vector that presented the lowest ML cost over all iterations. In the next subsections, the additional strategy of multiple restarts (MR) [10] and the stopping criteria for the iterations and the restarts are addressed.
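The numerical limitation mentioned above can be illustrated with a short sketch. It is not necessarily the exact intermediate step of Algorithm 1: it uses the common shift-by-the-best-cost trick, assuming per-symbol costs $f_j = \|\mathbf{y} - \mathbf{H}\hat{\mathbf{s}}_{i,j}\|^2$ and probabilities proportional to $\exp(-f_j/\sigma^2)$. The names `stable_probabilities` and `sample_index` are hypothetical.

```python
import math
import random


def stable_probabilities(costs, sigma2):
    """Probabilities p_j proportional to exp(-f_j / sigma2), computed safely.

    Shifting by the smallest cost makes every exponent <= 0 and the largest
    one exactly 0, so the normalizing sum can never underflow to zero.
    """
    f_min = min(costs)
    g = [-(f - f_min) / sigma2 for f in costs]
    w = [math.exp(v) for v in g]
    total = sum(w)
    return [v / total for v in w]


def sample_index(probs, rng):
    """Draw an index according to probs by inverting the cumulative sum."""
    u = rng.random()
    acc = 0.0
    for j, p in enumerate(probs):
        acc += p
        if u < acc:
            return j
    return len(probs) - 1  # guard against rounding in the cumulative sum


if __name__ == "__main__":
    costs = [1e6, 1e6 + 1.0, 1e6 + 2.0]
    # The naive evaluation underflows: exp(-1e6) is 0.0 in double precision.
    print("naive weight:", math.exp(-costs[0]))
    print("stable probabilities:", stable_probabilities(costs, 1.0))
```
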

Multiple restarts
For medium-order QAM modulations, such as 16-QAM, the mixing strategy of MGS alone is unable to achieve near-optimal performance [21] in a reasonable number of iterations, while the MR procedure proposed in [10] has demonstrated promising results, leading the MGS-MR under 16-QAM to near-optimal performance.
The MR strategy is also incorporated in the aMGS and d-sMGS detectors, giving the aMGS-MR and d-sMGS-MR detectors. Thus, Algorithms 2 and 3 are run either a maximum of $R_{\max}$ times or until a stopping criterion is met, and the lowest-cost solution found over all restarts is taken as the final solution. As discussed in Section 6, the MR strategy can improve the convergence of the algorithm compared to the same number of iterations in a single execution, resulting in a better performance-complexity tradeoff.

Stopping criterion
Given that the mixing strategy provides the local minimum escaping feature, the evolution of the cost function values across iterations becomes unpredictable and the optimal solution can be found before the maximum number of iterations I has been reached [14]. In this sense, an efficient stopping criterion is paramount in reducing the complexity of the MGS detector.
Similarly, the decision to trigger a restart requires a criterion, since the optimal solution may already have been found, in which case an extra execution of the algorithm is unnecessary. Hence, the MR strategy must be balanced to achieve a better performance-complexity tradeoff.
Stopping criteria have been proposed in the literature. For instance, in [10], the stopping criterion is based on the difference between the best ML cost found so far and the noise variance. Moreover, the QAM constellation size could be taken into account. The main idea in [10] is to stop the detection iterations if a maximum number of iterations I is attained or if the number of iterations in stalling mode exceeds a maximum of s iterations.
Assume the estimated symbol vector in the tth iteration is $\hat{\mathbf{s}}^{(t)}$, with quality metric $\phi(\hat{\mathbf{s}}^{(t)})$ given by (11). The stalling limit for iterations, s, is given by (12), where $c_s$ is a constant depending on the M-QAM constellation size, which increases with M. Although (12) is suitable as a stopping criterion, a minimum number of iterations $c_{\min}$ must be defined to ensure the quality of symbol detection. Therefore, s can be rewritten as in (13), where $c_1$ is a tuning constant that defines the allowed number of iterations in stalling mode.
For the MR strategy, the criterion sets the allowable number of restarts r, which is also based on the quality metric $\phi(\hat{\mathbf{s}}^{(t)})$ through (14), where $c_2$ is the tuning constant adjusting the maximum number of restarts. At the end of each restart, r is computed and compared with the actual number of repetitions. If the actual number is less than r, another run of the algorithm is started; otherwise, the solution vector with the minimum cost so far is output as the final solution.
For the aMGS and d-sMGS detectors presented below, we also assume the stopping criteria described in this subsection.

Reducing the impact of noisy solution
Originally, the mixture between the target distribution function solution and the random solution, proposed in the MGS detector of [10], was intended to escape local minima that degrade system performance. In fact, this procedure was shown to significantly improve the performance, especially in low-order modulation scenarios, such as 4- or 16-QAM. On the other hand, in high-order modulation systems, the large number of symbols causes the random solution to degrade the convergence of the algorithm, since it is based on a coordinate update process which requires the global solution; thus, one or more positions that take a random solution (probably erroneous and far from the real solution) interfere with the convergence of the other positions and, consequently, of the global one. This condition is aggravated in high-dimension problems, i.e., those combining high-order modulations and a large number of antennas, which is the case of interest in this work.
Two approaches that try to alleviate the harmful impact of the noisy solution are described below. Figure 2 summarizes the coordinate update process of the aMGS and d-sMGS detectors. The multiple sampling strategy for mitigating the noisy solution also runs the risk of nullifying this solution if many samples are employed: the mean of many realizations of a r.v. that takes its two values with probabilities q and (1 − q), with q << (1 − q), tends to a value in which the term with probability q is nullified. In that case, the noisy solution would be ineffective and the stalling problem could reappear, since the mixing of the MGS is the strategy specifically intended to tackle it.

Approach #1: Averaged MGS LS-MIMO detector
The aMGS proposed in [14] is addressed herein and is based on the following improvements:
1. Averaged multiple sampling on each coordinate: unlike the single sampling strategy [10], the aMGS averages $L_e$ samples at each coordinate during the update process. By employing an averaged calculation, an intermediate (averaged) point between the target function symbol and the random symbol is more likely to be chosen, instead of a purely random symbol. As a result, the benefit of escaping local minima is maintained, whereas the negative impact on the algorithm's convergence is smoothed.
2. Target function simplification: to reduce the computational complexity of the target function calculation of (9), the aMGS adopts a minimum ML cost approach. This simplification performs fewer mathematical operations, since it relies only on the $\|\mathbf{y} - \mathbf{H}\hat{\mathbf{s}}\|$ computation that is already performed in (9). Thus, in the tth iteration, the aMGS target function (15) selects the symbol $a_j \in \mathcal{A}$ that minimizes the ML cost $\|\mathbf{y} - \mathbf{H}\hat{\mathbf{s}}_{i,j}\|^2$, where the estimated symbol vector holds the values updated in the tth iteration up to the (i − 1)th position, whereas the remaining i, (i + 1), . . . , 2K positions assume the values from the previous iteration. Compared with (9), the calculation of (15) performs fewer operations while achieving the same BER performance [14].

MS in coordinate update process
The coordinate update process of aMGS is defined by:
$$\hat{s}_i^{(t)} = \frac{1}{L_e} \sum_{m=1}^{L_e} \rho_{m,i}, \tag{16}$$
where $L_e$ is the number of samples (realizations), and the random variable (r.v.) $\rho_{m,i}$ is a mixture of two r.v. with weight given by the mixing ratio q: it takes the output of the target function (15) with probability (1 − q) and a uniformly random symbol from $\mathcal{A}$ with probability q. It is important to note that, since (15) is a deterministic function, it is calculated only once for the $L_e$ realizations on each coordinate, when m = 1. After that, each realization m only has the computational cost of generating a random number (relative to the mixing ratio).
At the end of the algorithm iterations, the vector with the lowest cost is assumed to be the best global solution. Due to the mean operation, a slicer for the M-QAM constellation is needed at the end of the detection procedure. Thus, the final estimated symbol vector $\hat{\mathbf{s}}_{\mathrm{best}}$ is obtained by slicing $\hat{\mathbf{s}}_{f\text{-best}}$, the "floating-best" solution, which represents the estimated vector related to the best global cost attained after I iterations. A pseudocode for the aMGS is described in Algorithm 2.
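The averaged coordinate update and the final slicer can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of [14]: the per-symbol ML costs are taken as given, the deterministic minimum-cost symbol is computed once and reused for all $L_e$ draws, and each draw is replaced by a uniform random symbol with probability q. The names `amgs_coordinate` and `slicer` are hypothetical.

```python
import random


def amgs_coordinate(costs, alphabet, q, num_samples, rng):
    """Mean of num_samples draws for one coordinate.

    costs[j] is the ML cost with the coordinate set to alphabet[j].
    Each draw equals the minimum-cost symbol with probability (1 - q)
    and a uniform random symbol with probability q.
    """
    # Deterministic part, evaluated only once per coordinate.
    best = alphabet[min(range(len(alphabet)), key=costs.__getitem__)]
    total = 0.0
    for _ in range(num_samples):
        if rng.random() > q:
            total += best
        else:
            total += rng.choice(alphabet)
    return total / num_samples


def slicer(value, alphabet):
    """Map a floating value to the nearest constellation symbol."""
    return min(alphabet, key=lambda a: abs(a - value))


if __name__ == "__main__":
    rng = random.Random(7)
    pam = [-7, -5, -3, -1, 1, 3, 5, 7]
    costs = [9.0, 4.0, 1.5, 0.2, 0.9, 3.0, 6.0, 10.0]  # illustrative values
    for L_e in (1, 2, 4, 8):
        value = amgs_coordinate(costs, pam, 0.5, L_e, rng)
        print(L_e, value, slicer(value, pam))
```

The mixing ratio 0.5 in the usage example is deliberately exaggerated so that random draws are visible; with $q = 1/2K$ most draws return the minimum-cost symbol.
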

Approach #2: Simplified MGS with neighborhood limitation LS-MIMO detector
We propose a different approach, based on a neighborhood limitation of distance d in the random solution, which is named the d-sMGS LS-MIMO detector. The term simplified refers to the simplified target function of Eq. 15, which is also employed in this scheme. The proposed d-sMGS detector acts on the symbol constellation by performing an NL, with distance d in relation to the symbol estimated in the previous iteration, when drawing the random symbol. This procedure was shown to significantly improve the convergence when a high-order modulation is considered, as presented in Section 6, and it has the lowest per-symbol complexity compared with MGS and aMGS, since it uses the simplified target function (fewer mathematical operations than the MGS) and performs a single sample (fewer than the multiple sampling aMGS), as shown in Section 5.

Algorithm 2 aMGS (excerpt)
    //Coordinate update process
    for i = 1 to 2K do
        //Simplified target function calculation
        for j = 1 to |A| do
            f_j = ||y − Hŝ …
        // L_e samples on each coordinate
        for m = 1 to L_e do
            generate …

NL in coordinate update process
The d-sMGS coordinate update process is based on a mixture between the simplified target function, Eq. 15, and a limited random solution. Thus, the estimated symbol in the tth iteration at the ith coordinate is given by the mixed r.v. $\chi_i(\cdot)$ with weight q (Eq. 20): with probability (1 − q) it takes the output of the simplified target function, and with probability q it takes a random symbol drawn from the d-limited set around the symbol estimated in the previous iteration, where $\kappa_d$ is the symbol distance function in the real-valued constellation considered and the resulting neighborhood set is written as $\mathcal{N} = \{n_1, \dots, n_{|\mathcal{N}|}\}$. A pseudocode for the proposed d-sMGS is described in Algorithm 3. The additional multiple restarts strategy is omitted, since it simply restarts the algorithm with another initial solution.
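A minimal sketch of the neighborhood-limited coordinate update is given below. Several details are assumptions made for illustration only: the distance d is counted in positions of the sorted PAM alphabet, the previous estimate itself is excluded from the d-limited set by default (`include_self=False`), and the per-symbol ML costs are taken as given. The function names are hypothetical and the snippet is not the paper's Algorithm 3.

```python
import random


def d_limited_set(alphabet, previous_symbol, d, include_self=False):
    """Symbols within d positions of previous_symbol in the sorted alphabet.

    Assumption: distance is measured in constellation steps; whether the
    previous symbol belongs to the set is a modeling choice (off by default).
    """
    ordered = sorted(alphabet)
    idx = ordered.index(previous_symbol)
    lo, hi = max(0, idx - d), min(len(ordered) - 1, idx + d)
    return [ordered[j] for j in range(lo, hi + 1) if include_self or j != idx]


def dsmgs_coordinate(costs, alphabet, previous_symbol, q, d, rng):
    """One coordinate update: minimum-cost symbol w.p. (1 - q), else a
    uniform draw from the d-limited neighborhood of the previous estimate."""
    if rng.random() > q:
        j = min(range(len(alphabet)), key=costs.__getitem__)
        return alphabet[j]
    return rng.choice(d_limited_set(alphabet, previous_symbol, d))


if __name__ == "__main__":
    pam8 = [-7, -5, -3, -1, 1, 3, 5, 7]  # PAM alphabet underlying 64-QAM
    print(d_limited_set(pam8, 1, 1))     # neighbors of 1 at distance 1
    print(d_limited_set(pam8, 7, 2))     # edge symbol, distance 2
    rng = random.Random(3)
    costs = [9.0, 4.0, 1.5, 0.2, 0.9, 3.0, 6.0, 10.0]  # illustrative values
    print(dsmgs_coordinate(costs, pam8, 1, 1.0 / 64, 1, rng))
```

With d = 1 the random symbol can move at most one step from the previous estimate, instead of anywhere in the eight-level alphabet, which is how the neighborhood limitation dampens the noisy solution.
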

Computational complexity
The computational complexity is described in terms of the number of real operations (rops), in which one rop denotes the computational complexity of one real mathematical operation: addition, subtraction, multiplication, or division. For the exponential and logarithmic functions, an approximation through a Taylor series with 18 terms has been considered to calculate the computational complexity. Table 1 describes the per-symbol computational complexity ($C_T$) involved in each step of the d-sMGS algorithm. Additionally, the total per-symbol complexity of the aMGS and the conventional MGS has been evaluated. The per-symbol complexity of the initial solution is denoted by $C_I$; in this work, the initial solution is the output of an MMSE detector, whose total complexity is also described in Table 1 [22]. From Table 1, one can notice that the d-sMGS, aMGS, and MGS algorithms have the same asymptotic per-symbol complexity order of $O(K^2)$, although the conventional MGS algorithm may require an additional complexity dependent on the constellation size due to the exponential function, which is represented by the cardinality $|\mathcal{A}|$. On the other hand, the additional complexity due to the averaged strategy of the aMGS has a negligible impact, since it requires only $(2L_e + 2)$ rops per iteration and does not depend on the problem size. The proposed d-sMGS algorithm combines the advantages of both by using a single sample, as in the MGS, and the simplified target function of the aMGS. The complexity increment due to the neighborhood constraint is considered negligible, since the symbol has already been estimated and the procedure amounts only to a random sampling in a restricted vector.

Algorithm 3 d-sMGS (excerpt)
    //Coordinate update process
    for i = 1 to 2K do
        //Evaluation of χ_i(·), Eq. 20
        generate u_i
        if (u_i > q) then
            //Simplified target function calculation, Eq. 15
            for j = 1 to |A| do
                f_j = ||y − Hŝ …
        else
            //Generation of the d-limited set
            …
From Table 1, it may be noted that the per-symbol complexity of the proposed d-sMGS is independent of the parameter d, so the use of larger neighborhoods in the random symbol generation has no impact on complexity.
It is important to emphasize that the complexity of the d-sMGS, aMGS, and MGS algorithms is defined by the number of iterations, which is controlled by the stopping criterion s, with the upper limit I. Similarly, the number of restarts is controlled by r, with an upper limit $R_{\max}$. In terms of complexity, the MR procedure can be interpreted as an extra number of iterations required by each new restart. In this sense, an $I_{\mathrm{eff}}$ is considered in Table 1, which denotes the total number of iterations (including all restarts) performed in each symbol period. Since the Monte Carlo method is employed in the simulations of Section 6, the mean value of $I_{\mathrm{eff}}$ over all realizations is evaluated and is called the effective number of iterations (ENI):
$$\mathrm{ENI} = \frac{1}{T} \sum_{i=1}^{T} I_{\mathrm{eff},i},$$
where T denotes the total number of realizations (symbol periods) in the Monte Carlo simulation and $I_{\mathrm{eff},i}$ denotes the $I_{\mathrm{eff}}$ in the ith realization.
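As a worked example of this bookkeeping, the sketch below accumulates the iterations spent in every restart of each realization to obtain $I_{\mathrm{eff}}$ and then averages over the realizations to obtain the ENI. The iteration counts are made-up illustrative values and the function names are hypothetical.

```python
def effective_iterations(iterations_per_restart):
    """I_eff of one realization: iterations summed over all restarts."""
    return sum(iterations_per_restart)


def effective_number_of_iterations(realizations):
    """ENI: mean of I_eff over all Monte Carlo realizations."""
    i_eff = [effective_iterations(r) for r in realizations]
    return sum(i_eff) / len(i_eff)


if __name__ == "__main__":
    # Three realizations with 2, 1, and 3 runs of the algorithm, respectively.
    runs = [[120, 80], [300], [150, 150, 100]]
    print("I_eff per realization:", [effective_iterations(r) for r in runs])
    print("ENI:", effective_number_of_iterations(runs))
```
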

Quality metric
Due to the large number of parameters involved in the presented LS-MIMO detectors, a simple performance-complexity tradeoff metric $\chi(\cdot)$ is considered [14], which aims to establish a fair comparison between different detection strategies. The metric is a function of $\mathrm{BER}_{\mathrm{dB}}$, the bit error rate in dB. Higher values of $\chi(\cdot)$ imply a more efficient and effective LS-MIMO detector.

Numerical results and discussion
In this section, the uncoded BER performance of the d-sMGS algorithm for LS-MIMO detection is evaluated through Monte Carlo simulations. The simulations are performed for a large-scale MIMO system operating in multiplexing mode, assuming that perfect channel state information is available at the receiver side. Table 2 summarizes the main system and channel parameter values deployed in this section. As proposed in [10], the mixing ratio parameter is adopted as the inverse of the number of dimensions in the system, i.e., $q = \frac{1}{2K}$. For the stopping criterion parameters, we have adopted $c_1 = 10$, $c_2 = 1.0$, and $c_{\min} = 10$ [14].
This numerical simulation section has been divided into two main parts: in Section 6.1, the mixing ratio q and number of samples L e parameters of the aMGS detector are discussed, as it denotes a technique that also aims at reducing the impact of the noisy solution; in Section 6.2, we present numerical results of performance and computational complexity of the proposed d-sMGS detector against the aMGS and MGS techniques, addressed in this work.

aMGS parameter discussion
The aMGS-MR BER performance for different mixing ratios q = {1/2K, 1/3K, 1/4K}, considering $R_{\max}$ = {1, 5, 10}, is presented in Fig. 3 for each fixed $L_e \in \{1, 2, 4, 8\}$ sample scenario [14]. The number of users is K = 96, with N = 128 BS antennas (β = 0.75). The system is operating under medium-high SNR, $\gamma_{\mathrm{dB}}$ = 25 dB. First, it is evident that the choice of mixing ratio impacts both performance and complexity (represented by the ENI at convergence). In addition, one can notice that a large number of samples, $L_e = 8$, becomes harmful to the algorithm, since convergence is achieved with a larger ENI. Among the other results, the best performance-complexity tradeoff is obtained with $L_e = 2$ samples and q = 1/4K, which results in $\chi|_{L_e=2} = 44.89$, against $\chi|_{L_e=4} = 37.85$ with 4 samples and q = 1/2K, and $\chi|_{L_e=1} = 39.66$ with 1 sample and q = 1/4K. A detailed analysis of the aMGS performance/complexity gain in relation to the mixing ratio and the number of samples can be found in [14]. It can also be concluded that, as the number of samples $L_e$ increases, the convergence of the curve for q = 1/2K improves, resulting in less complexity. That is, when the impact of the noisy solution is reduced, the choice q = 1/2K offers the best performance-complexity tradeoff. For this reason, the value q = 1/2K is adopted for the proposed d-sMGS detector.
Following the analysis performed in [14], the parameter values summarized in Table 3 have been adopted for the aMGS in the remainder of this work. For the MGS-MR, the following parameters have been adopted: q = 1/2K, $I = 8K\sqrt{M}$, $R_{\max} = 50$, $c_1 = 10$, and $c_2 = 0.5$ [10]. In Fig. 4, the convergence of the aMGS algorithm adopting the best q values from Table 3 is analyzed against the average rops complexity, with 96×128 antennas and 64-QAM [14]. For comparison purposes, a single sampling result using the optimal mixing ratio value proposed in [10], i.e., q = 1/2K (curve [E]), is also included. One can notice that fewer samples are beneficial in this LS-MIMO scenario, since the single sample case presented the best performance combined with the lowest asymptotic complexity, followed by the two-fold ($L_e = 2$) and four-fold ($L_e = 4$) sampling cases. Nevertheless, due to a slight convergence gain observed with $L_e = 2$ samples, the tradeoff metric for $L_e = 1$ is found to be $\chi|_{L_e=1} = 39.83$ against $\chi|_{L_e=2} = 44.22$ with $L_e = 2$ samples.

Analysis on the proposed d-sMGS
First of all, we focus on finding the maximum number of iterations I that maximizes the performance × complexity tradeoff. In the literature, the quantity $I = 8K\sqrt{M}$ adopted in [10] is quite reasonable, since it takes into account the number of active users and the modulation order. In this sense, Fig. 5 shows the performance convergence of the proposed algorithm as the maximum number of iterations increases. We considered K = N = 16 antennas in 64-QAM with NL distance d = {1, 2, 3} and used the parameter a to denote the maximum number of iterations, so that $I = aK\sqrt{M} = 128a$. It can be clearly seen that increasing the NL distance is not beneficial to the algorithm's performance in this configuration. This is easily explained by the fact that, with increasing d, the neighborhood of the random solution grows and approaches the condition of an unrestricted solution in the constellation, which restores its negative impact on the algorithm's convergence. Observing the 1-sMGS curve, it can be seen that convergence is reached with a equal to 8, which coincides with the value adopted in [10]. Therefore, $I = 8K\sqrt{M}$ will be adopted for the proposed d-sMGS detector in the remainder of this work.
Figure 6 shows the performance and computational complexity of the addressed detectors as a function of the SNR. A high system loading, i.e., β ≈ 0.9, in 64-QAM modulation is adopted. For the MGS-MR, $I = 8K\sqrt{M}$ and $R_{\max} = 50$ [10]; for the aMGS-MR, I = 3000, $R_{\max} = 5$, and the mixing ratio value is chosen according to the best option criterion of [14]. One can notice in Fig. 6a that both proposed detectors presented a significant performance gain in the high-SNR region in relation to the other detectors, equivalent to approximately one decade against the second-best detector, the aMGS-MR with $L_e = 8$ samples.
In contrast to the previous observation, the increase in the NL distance did not cause a loss of performance, since the 2-sMGS detector achieved a performance very close to that of the 1-sMGS. This indicates a tendency for the increase of the NL distance to be beneficial in scenarios with a greater number of antennas, such as LS-MIMO. Related to the computational complexity, it can be observed that […]
In Fig. 6b, the hypothesis that the increase of the NL distance results in a performance gain is reiterated. One can notice a significant performance gain in the 4-sample aMGS detector, surpassing the result with $L_e = 8$, which corroborates the hypothesis that a looser restriction on the noisy solution becomes beneficial as the number of antennas increases. In fact, in the high-SNR region, $\gamma_{\mathrm{dB}}$ = 25 dB, it can be seen that the 2-sMGS and the aMGS with $L_e = 4$ achieve similar performance, although in the medium-SNR region ($\gamma_{\mathrm{dB}}$ = 23 dB), the proposed d-sMGS still appears superior. With respect to the complexity in terms of rops, the 2-sMGS-MR and the aMGS-MR detectors with $L_e = 4$ and 8 samples presented nearly equal complexity at $\gamma_{\mathrm{dB}}$ = 25 dB; however, the least complexity is again reached by the aMGS, especially in the medium-SNR region ($\gamma_{\mathrm{dB}}$ = [21, 23] dB).
Therefore, it can be concluded that the proposed d-sMGS detection technique presented the best performance in both scenarios, and that the looser neighborhood restriction with d = 2 was a more interesting choice as the number of antennas increased. In addition, there was no significant increase in complexity compared to the multiple sampling aMGS detector; in other words, the complexity of the 2-sMGS detector was nearly equal to that of the lowest-complexity techniques, the aMGS with $L_e = 4$ and 8 samples.
A system loading analysis against BER and rops complexity is depicted in Fig. 7 under $\gamma_{\mathrm{dB}}$ = 25 dB.
It may first be noted that at high loading, i.e., β ≈ 0.9, the proposed detection scheme showed a significant gain in performance over the aMGS. In the other regions, no technique clearly stands out; however, a looser restriction on the noisy solution gave better results, as seen in the 2-sMGS outperforming the 1-sMGS and in the aMGS with $L_e$ = 1 or 2 outperforming the cases with $L_e$ = 4 and 8 samples. In relation to the computational complexity with N = 64 antennas (Fig. 7b), one can notice that in the medium-high loading region (β ≥ 0.75), the proposed d-sMGS strategy presented less complexity than both the multiple sampling aMGS and the conventional MGS. In the medium-low system loading results (β ≤ 0.5), multiple sampling schemes presented lower computational complexity. Therefore, one can highlight the superiority of the proposed strategy in both performance and complexity in medium-high loading configurations, demonstrating the potential of this strategy when the LS-MIMO system operates in crowded, high-loading scenarios. This can be explained as follows: as the number of mobile users increases, approaching the full-loading system condition β → 1, the set of possible symbol combinations becomes larger, such that the negative effect of the noisy solution from the mixture is aggravated and affects the algorithm's convergence. The NL strategy is able to mitigate this effect, which benefits the convergence and results in improved performance and reduced complexity.
With the number of antennas increased to N = 128, the system loading analysis reflects a clear superiority of the 2-sMGS detector in high loading configurations, both in performance and in complexity. This behavior corroborates the hypothesis raised in the discussion of Fig. 6 regarding the performance improvement with increasing NL distance. On the other hand, in medium-low loading, the complexity of the 2-sMGS was shown to be greater than that of the aMGS and the 1-sMGS, and comparable only to that of the conventional MGS-MR.

Conclusions
A neighborhood-limited d-sMGS detector for large-scale MIMO systems has been proposed based on the neighborhood constraint of the noisy solution at a distance of d.
The proposed LS-MIMO d-sMGS detection scheme demonstrated the ability to mitigate the impact caused by the noisy solution from the mixture, which is aggravated and can become harmful under full system loading or when a high-order modulation is implemented.
The modifications to the MGS technique proposed here have proven effective in improving the convergence of the detection algorithm, which resulted in significant gains in performance and complexity compared to both the multiple sampling aMGS technique and the conventional MGS. These advantages are obtained especially when the system loading is high and the number of antennas is large, a condition favorable to LS-MIMO. Moreover, as the number of dimensions increases, i.e., with a larger number of antennas and/or a higher modulation order, the looser restriction of 2-sMGS was shown to be a more interesting choice than 1-sMGS.
In addition, the NL strategy requires less complexity per iteration than the aMGS or the MGS, since only one sample is calculated and the simplified objective function is used. On the other hand, when a low system loading is considered, the NL strategy resulted in a slight increase in complexity.