A MUSIC-Based Algorithm for Blind User Identification in Multiuser DS-CDMA

A blind scheme based on the multiple-signal classification (MUSIC) algorithm is proposed for user identification in a synchronous multiuser code-division multiple-access (CDMA) system. The scheme is blind in the sense that it does not require prior knowledge of the spreading codes; both the spreading codes and the users' powers are acquired by the scheme. Eigenvalue decomposition (EVD) is performed on the autocorrelation matrix of the received signal, and then all the valid possible signature sequences are projected onto the resulting subspaces. However, this process also produces some false solutions, and the ambiguity seems unresolvable. Our approach is to apply to the received signal a transformation derived from the results of the subspace decomposition and then to inspect the statistics of the transformed signal. It is shown that the second-order statistics of the transformed signal provide a reliable means for removing the false solutions.


INTRODUCTION
CDMA-based systems are widely used in various wireless applications. In order to exploit the capacity of a CDMA system, multiuser detection techniques are essential. A large number of schemes and algorithms have been devised to enhance the performance and to reduce the complexity of a CDMA receiver in a multiuser environment. In most cases, some prior knowledge of the user parameters, for example, the spreading code, timing, and power, is assumed. However, in a real system, this may not be the case. Users enter and exit the system irregularly, and the base station has to keep track of the status of each user. Various methods could be used to transfer users' parameters to the base station; however, one way or another, they impose some overhead and reduce system capacity. Therefore, another important aspect of CDMA reception is to assist multiuser detection schemes by user identification. In other words, it is desired to know how many active users are operating at any given time and who they are. This enables the receiver to dynamically adapt itself to the multiuser environment. This capability has a twofold benefit for a CDMA multiuser system. First, the receiver will be able to maximize the cancellation of multiple-access interference (MAI), since it has updated information on the other active users. Second, the degree of complexity, which is almost directly proportional to the performance of the receiver, can be optimized against the number of active users. In other words, when there are a small number of users, the receiver will be able to select a more complex detection algorithm to achieve a lower bit error rate. This is an attractive feature for software-defined radio platforms.
Blind user identification enables the receiver to be more self-reliant and may also improve the system efficiency, since side information is not required. Moreover, a blind scheme that is capable of identifying users and their spreading sequences is very valuable for signal intercept and nonintrusive test applications.
Several user identification schemes have recently been introduced [1,2,3,4]. In [1,2,3], the outputs of different branches of a filter bank, each matched to a given signature sequence, are used to identify the active user. This implies the prior knowledge of the signature sequences.
Schemes based on subspace theory have been proposed for blind channel estimation as well as blind detection for a CDMA multiuser receiver [5,6]. The subspace concept has also been used for user identification in a CDMA system. In [4], a subspace approach based on the MUSIC algorithm is introduced that also requires prior knowledge of all the signature sequences. A blind subspace scheme based on recursive estimation of the signature sequences is suggested in [7]; however, it does not exhibit consistent convergence behavior.
In this paper, a scheme for blind user identification based on the MUSIC algorithm [4] is proposed. The scheme relies only on the second-order statistics. The main contribution of this work is that the proposed approach does not require the prior knowledge of the signature sequences. Spreading codes and users' powers are discovered and estimated by the proposed scheme.

SIGNAL MODEL
A synchronous direct-sequence (DS-) CDMA system with a processing gain of $N$ is considered. The received signal prior to chip-rate sampling can be modeled as

$$r(t) = \sum_{k=1}^{K} A_k b_k s_k(t) + n(t), \tag{1}$$

where $A_k$, $b_k$, and $s_k(t)$ denote the received amplitude, the transmitted bit, and the spreading sequence of the $k$th user, respectively, and $K$ is the number of active users. $A_k$ is assumed to be unknown but constant during the period of observation. $b_k$ is a random variable taking $\pm 1$ with equal probability. Spreading codes are assumed short, that is, their support is limited to the bit interval $T$. The white Gaussian noise with variance $\sigma^2$ is denoted by $n(t)$.
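After chip-rate sampling, (1) can be written in vector form as

$$\mathbf{r} = \sum_{k=1}^{K} A_k b_k \mathbf{s}_k + \mathbf{n}, \tag{2}$$

where $\mathbf{s}_k = (1/\sqrt{N})[s_{k1}\ s_{k2}\ \cdots\ s_{kN}]^T$ represents the normalized signature sequence of the $k$th user. The superscript $T$ denotes the transpose operation; $\mathbf{n}$ is a zero-mean white Gaussian noise vector with covariance matrix $\sigma^2 \mathbf{I}_N$, where $\mathbf{I}_N$ is the $N \times N$ identity matrix. For convenience, (2) can be rewritten as

$$\mathbf{r} = \mathbf{S}\mathbf{A}\mathbf{b} + \mathbf{n}, \tag{3}$$

where $\mathbf{S} = [\mathbf{s}_1\ \mathbf{s}_2\ \cdots\ \mathbf{s}_K]$, $\mathbf{A} = \mathrm{diag}(A_1, \ldots, A_K)$, and $\mathbf{b} = [b_1\ b_2\ \cdots\ b_K]^T$.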
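The chip-rate model above can be sketched numerically as follows. This is a minimal illustration, not the authors' simulator: the function name `simulate_received`, the random choice of codes, and the parameter values are all assumptions made for the example.

```python
import numpy as np

def simulate_received(S, amplitudes, sigma, num_bits, rng):
    """Draw num_bits received vectors r = S A b + n (one per column).

    S          : N x K matrix of unit-norm signature sequences
    amplitudes : length-K received amplitudes A_k
    sigma      : noise standard deviation per chip
    Returns (R, B) with R of shape N x num_bits and B of shape K x num_bits.
    """
    N, K = S.shape
    B = rng.choice([-1.0, 1.0], size=(K, num_bits))      # equiprobable +/-1 bits
    noise = sigma * rng.standard_normal((N, num_bits))   # white Gaussian noise
    R = S @ (np.asarray(amplitudes, dtype=float)[:, None] * B) + noise
    return R, B

rng = np.random.default_rng(0)
N, K = 16, 3                                             # illustrative sizes
S = rng.choice([-1.0, 1.0], size=(N, K)) / np.sqrt(N)    # normalized random codes
R, B = simulate_received(S, [1.0, 1.0, 0.5], sigma=0.05, num_bits=1000, rng=rng)
```

Each column of `R` is one bit interval of the received signal, so a sample autocorrelation matrix can later be formed as `R @ R.T / R.shape[1]`.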

SUBSPACE DECOMPOSITION AND MUSIC ALGORITHM
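The autocorrelation matrix of the received signal $\mathbf{r}$ is given by

$$\mathbf{C} = E\{\mathbf{r}\mathbf{r}^T\} = \mathbf{S}\mathbf{A}^2\mathbf{S}^T + \sigma^2 \mathbf{I}_N. \tag{4}$$

The eigenvalue and eigenvector matrices are obtained by performing EVD on the autocorrelation matrix $\mathbf{C}$:

$$\mathbf{C} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T, \tag{5}$$

where $\mathbf{U}$ and $\boldsymbol{\Lambda}$ are the general eigenvector and eigenvalue matrices. Performing EVD on the autocorrelation matrix of the received signal results in two orthogonal subspaces, the signal subspace and the noise subspace. The dimension of the signal subspace, that is, the number of active users, can be determined by examining the eigenvalues, since the smallest eigenvalue, $\sigma^2$, has multiplicity $(N - K)$ [4]. The signal and noise subspaces can be separated as follows:

$$\mathbf{C} = \begin{bmatrix}\mathbf{U}_s & \mathbf{U}_n\end{bmatrix} \begin{bmatrix}\boldsymbol{\Lambda}_s & \mathbf{0}\\ \mathbf{0} & \boldsymbol{\Lambda}_n\end{bmatrix} \begin{bmatrix}\mathbf{U}_s^T\\ \mathbf{U}_n^T\end{bmatrix},$$

where $\boldsymbol{\Lambda}_s$ contains the $K$ largest eigenvalues, $\mathbf{U}_s$ contains the corresponding eigenvectors, and $\boldsymbol{\Lambda}_n = \sigma^2 \mathbf{I}_{N-K}$. The MUSIC algorithm tests a candidate signature $\mathbf{s}_i$ by projecting it onto the two subspaces:

$$f_i = \|\mathbf{U}_n^T \mathbf{s}_i\|^2, \tag{6}$$

$$g_i = \|\mathbf{U}_s^T \mathbf{s}_i\|^2. \tag{7}$$

If $\mathbf{s}_i$ belongs to an active user, it lies in the signal subspace and $f_i$ is equal to zero; a nonzero $f_i$ indicates that the user corresponding to $\mathbf{s}_i$ is not active at this moment. By the same principle, if the $i$th user is active, $g_i$ equals one as a result of $\mathbf{s}_i$ residing in the signal subspace, and it is less than one otherwise.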
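The subspace split and the MUSIC projection test described above can be sketched as follows. The sketch assumes the exact autocorrelation matrix of a known test scenario rather than a sample estimate, and it assumes that the two metrics are the squared norms of the projections onto the noise and signal subspaces. The helper names (`code_from_index`, `split_subspaces`, `estimate_num_users`, `music_metrics`), the code indices, and the tolerance are hypothetical choices for illustration.

```python
import numpy as np

def code_from_index(m, N):
    """Map the N-bit integer m to a unit-norm +/-1 signature (bit 1 -> chip -1)."""
    bits = (m >> np.arange(N)) & 1
    return (1.0 - 2.0 * bits) / np.sqrt(N)

def split_subspaces(C, K):
    """EVD of C, sorted by decreasing eigenvalue, split into signal and noise parts."""
    lam, U = np.linalg.eigh(C)              # eigh returns ascending eigenvalues
    lam, U = lam[::-1], U[:, ::-1]
    return U[:, :K], lam[:K], U[:, K:], lam[K:]

def estimate_num_users(C, rel_tol=1e-6):
    """Count eigenvalues that clearly exceed the smallest one (the noise floor)."""
    lam = np.linalg.eigvalsh(C)
    return int(np.sum(lam > lam[0] * (1.0 + rel_tol)))

def music_metrics(s, U_s, U_n):
    """Return (f, g): energy of s in the noise and in the signal subspace."""
    return np.sum((U_n.T @ s) ** 2), np.sum((U_s.T @ s) ** 2)

N, K, sigma = 16, 3, 0.1
S = np.column_stack([code_from_index(m, N) for m in (0x0F0F, 0x3333, 0x1111)])
A = np.diag([1.0, 0.8, 0.6])
C = S @ A @ A @ S.T + sigma**2 * np.eye(N)  # exact autocorrelation, no sampling error

U_s, lam_s, U_n, lam_n = split_subspaces(C, estimate_num_users(C))
f_active, g_active = music_metrics(S[:, 1], U_s, U_n)
f_idle, g_idle = music_metrics(code_from_index(0x00FF, N), U_s, U_n)
```

With a sample autocorrelation matrix the noise eigenvalues are no longer exactly equal, so the tolerance in `estimate_num_users` and the thresholds applied to `f` and `g` would have to be much looser than in this idealized case.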

BLIND USER IDENTIFICATION
If the signature sequences of the users are not known, we have to examine the orthogonality of $\mathbf{S}$ and the noise subspace for all combinations of spreading sequences. Since the spreading code comprises $N$ chips, this examination calls for a complete search over $2^{N-1}$ different possible combinations of chips in a spreading code. However, there is one major problem with this approach that needs to be resolved. If there are $K$ active users in a system, then, depending on the cross-correlations between the active codes and on the thresholds set for (6)-(7), application of the MUSIC algorithm may return not only the active spreading codes in (8) but also, falsely, linear combinations of them. That is simply because the linear combinations of the codes also satisfy

$$\mathbf{U}_n^T \sum_{j} \alpha_j \mathbf{s}_j = \mathbf{0}. \tag{9}$$

Therefore, instead of $K$, we may obtain $K'$ mixed solutions ($K < K' < 2^{N-1}$). Depending on the selected thresholds for detection in (6)-(7), $K'$ might even be several times larger than $K$. As shown in Figure 1, the proposed approach comprises two steps: (1) applying the MUSIC algorithm and (2) resolving the ambiguity.
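The exhaustive search and the appearance of a false solution can be illustrated on a small, hand-built case. The sketch below assumes $N = 8$ and three hypothetical codes `a`, `b`, `c` chosen so that, at every chip, the three codes never all agree; their sum is then itself a valid binary code lying in the signal subspace. The exact autocorrelation matrix is used and the noise-subspace energy is taken as the test metric, both assumptions of the example.

```python
import numpy as np

N, K, sigma = 8, 3, 0.1
a = np.array([+1, +1, -1, -1, -1, +1, +1, -1], dtype=float)
b = np.array([+1, -1, +1, -1, +1, -1, +1, -1], dtype=float)
c = np.array([-1, +1, +1, +1, -1, -1, -1, +1], dtype=float)
S = np.column_stack([a, b, c]) / np.sqrt(N)
C = S @ S.T + sigma**2 * np.eye(N)       # exact autocorrelation, unit amplitudes

lam, U = np.linalg.eigh(C)               # ascending eigenvalues
U_n = U[:, : N - K]                      # eigenvectors of the N-K smallest eigenvalues

def candidates(threshold):
    """Return every binary code whose noise-subspace energy is below threshold."""
    found = []
    for m in range(2 ** (N - 1)):        # last chip fixed to +1: s and -s are equivalent
        bits = (m >> np.arange(N)) & 1
        s = (1.0 - 2.0 * bits) / np.sqrt(N)
        if np.sum((U_n.T @ s) ** 2) <= threshold:
            found.append(s)
    return found

def contains(found, v):
    """True if unit-norm vector v (up to sign) is among the found codes."""
    return any(abs(s @ v) > 1.0 - 1e-9 for s in found)

exact = candidates(1e-9)                 # codes lying exactly in the signal subspace
loose = candidates(0.2)                  # a relaxed threshold admits more candidates
false_code = (a + b + c) / np.sqrt(N)    # binary by construction, yet not an active code
```

Here `exact` holds the three active codes together with `false_code`, and relaxing the threshold can only enlarge the candidate list, which is the ambiguity the second step has to resolve.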
Since the received signal $\mathbf{r}$ comprises only $K$ authentic spreading codes, in order to resolve the ambiguity and distinguish between the authentic and false solutions, we have to inspect the relation of each solution to $\mathbf{r}$. Our approach is as follows. For every result from the MUSIC step, we apply a transformation to the received signal and then inspect the statistics of the result. The transformation has to be able to separate different users' signals to avoid their statistics being mixed up. A proper choice for this task is a decorrelating transformation. At first, this does not seem possible, since the spreading codes are not yet known. However, assuming prior knowledge of the signature sequences, in a synchronous CDMA system we can devise a decorrelator receiver for each active user based only on signal subspace information [5]. In our case, all the solutions resulting from the MUSIC projection can be regarded as the prior knowledge of signature sequences, and since the signal subspace information is already available from the first step, we can proceed to implement the decorrelator receiver $\mathbf{d}_i$ for each of the candidate solutions:

$$\mathbf{d}_i = \frac{1}{\mu_i}\,\mathbf{U}_s(\boldsymbol{\Lambda}_s - \sigma^2\mathbf{I}_K)^{-1}\mathbf{U}_s^T\mathbf{s}_i, \tag{10}$$

where $\mu_i$ is a nonzero normalizing factor [5]:

$$\mu_i = \mathbf{s}_i^T\mathbf{U}_s(\boldsymbol{\Lambda}_s - \sigma^2\mathbf{I}_K)^{-1}\mathbf{U}_s^T\mathbf{s}_i.$$

Depending on the nature of $\mathbf{s}_i$, application of (10) to the received signal produces different results. If $\mathbf{s}_i$ is an authentic solution, then $\mathbf{d}_i$ represents a single decorrelating function as stated in (10). However, if $\mathbf{s}_i$ is not an authentic solution, it results from a linear combination of active codes, and then $\mathbf{d}_i$ is a linear combination of the decorrelating functions of the active codes as well. If

$$\mathbf{s}_i = \sum_{j}\alpha_j\mathbf{s}_j,$$

where the $\alpha_j$'s are real numbers representing the combining factors, then, since (10) is linear in $\mathbf{s}_i$, the decorrelating transform is

$$\mathbf{d}_i = \frac{1}{\mu_i}\sum_{j}\alpha_j\,\mathbf{U}_s(\boldsymbol{\Lambda}_s - \sigma^2\mathbf{I}_K)^{-1}\mathbf{U}_s^T\mathbf{s}_j.$$

By applying (10) to the received signal, we have

$$z_i = \mathbf{d}_i^T\mathbf{r} = \mathbf{d}_i^T\mathbf{S}\mathbf{A}\mathbf{b} + w_i, \tag{16}$$

where $w_i = \mathbf{d}_i^T\mathbf{n}$ is Gaussian noise with variance $\sigma_{w_i}^2 = \sigma^2\|\mathbf{d}_i\|^2$. In both cases, the decorrelating transform results in noise enhancement. However, the results of the decorrelating transforms operating on the data part of (16) are significantly different.
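The contrast between an authentic and a false candidate can be checked on the noise-free data part with a small sketch. It assumes the subspace form of the decorrelator from [5], that is, `U_s (Lambda_s - sigma^2 I)^{-1} U_s^T s` scaled to unit gain on `s`; that normalization, the three hand-built codes, and the unit amplitudes are assumptions of this example rather than details confirmed by the text.

```python
import numpy as np
from itertools import product

N, K, sigma = 8, 3, 0.1
a = np.array([+1, +1, -1, -1, -1, +1, +1, -1], dtype=float)
b = np.array([+1, -1, +1, -1, +1, -1, +1, -1], dtype=float)
c = np.array([-1, +1, +1, +1, -1, -1, -1, +1], dtype=float)
S = np.column_stack([a, b, c]) / np.sqrt(N)
C = S @ S.T + sigma**2 * np.eye(N)            # exact autocorrelation, unit amplitudes

lam, U = np.linalg.eigh(C)                    # ascending eigenvalues
U_s, lam_s = U[:, N - K:], lam[N - K:]        # signal subspace: K largest eigenvalues
M = U_s @ np.diag(1.0 / (lam_s - sigma**2)) @ U_s.T

def decorrelator(s):
    """Subspace decorrelator for candidate s, scaled so that d @ s == 1 (assumed)."""
    v = M @ s
    return v / (s @ v)

# All 2^K bit patterns, one per column, and the corresponding noise-free signals.
bits = np.array(list(product([-1.0, 1.0], repeat=K))).T
data = S @ bits

z_true = decorrelator(S[:, 0]) @ data         # authentic candidate: user 1's code
z_false = decorrelator(S.sum(axis=1)) @ data  # false candidate: (a + b + c) / sqrt(N)
```

In this setup `z_true` reproduces the first user's bits exactly, while `z_false` equals the average of the three users' bits and therefore takes the four values $\pm 1$ and $\pm 1/3$.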
If we focus only on the data part of the received signal, then, where $\mathbf{s}_i$ is an original code, the decorrelator output reduces to $A_i b_i$, whereas for a false solution it is a weighted sum of the bits of all the constituent users. As shown in Figures 2 and 3, the distinct difference between the two cases lies in their statistics. For the case where $\mathbf{s}_i$ is an authentic solution, samples at the decorrelator output are clustered about $\pm A_i$. In Figure 2, the only source of perturbation of the samples is the additive noise; interference from other codes does not exist. However, when $\mathbf{s}_i$ is a false solution, the resulting samples are dispersed significantly. The amount of dispersion depends on the number of constituent codes, the corresponding data bits, the combining factors, and the receive amplitudes.
Based on this difference, we define a cost function $J(\mathbf{d}_i)$ that measures the deviation of the absolute value of the decorrelation results from its average:

$$J(\mathbf{d}_i) = E\{(|z_i| - E\{|z_i|\})^2\},$$

where $z_i$ denotes the decorrelator output for the candidate $\mathbf{s}_i$ and $E(\cdot)$ indicates expectation of the produced samples over all possible noise and data sequences. Another way to interpret the definition of the cost function is the following. The main difference between the two cases of a false or authentic solution is how the power of the signal is distributed over the amplitude samples. In the case of an authentic solution, the power is mainly concentrated over a small range of amplitudes in the vicinity of the mean absolute amplitude. However, in the case of a false solution, the values are irregularly spread over a wide range of amplitudes. Hence, the difference between the total power and the power of the mean absolute amplitude can be used to distinguish the two cases:

$$J(\mathbf{d}_i) = E\{z_i^2\} - E^2\{|z_i|\}.$$

For an authentic solution, $z_i = A_i b_i + w_i$. Assuming $A_i \gg \sigma_{w_i}$, we have $E\{|z_i|\} \approx A_i$ and $E\{z_i^2\} = A_i^2 + \sigma_{w_i}^2$, so that

$$J(\mathbf{d}_i) \approx \sigma_{w_i}^2.$$

Now, we consider the case when $\mathbf{s}_i$ is a false solution. In this case, since the interference from the other codes is the dominant contributor to the dispersion and the additive noise is much less significant, $z_i$ can be approximated by its data part. The probability density function of $z_i$ is a function of the combining factors, the receive amplitudes, and the information bits of the interfering users. Therefore, a general closed-form derivation does not seem to be easy to find.
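A sample version of the cost function can be sketched as below, taking it as the total power minus the squared mean absolute amplitude of the decorrelator output. The two synthetic outputs are assumptions made for illustration: the authentic one is a single bit stream plus noise, and the false one is the average of three users' bits plus the same noise level.

```python
import numpy as np

def cost(z):
    """Sample estimate of J: total power minus squared mean absolute amplitude."""
    z = np.asarray(z, dtype=float)
    return np.mean(z**2) - np.mean(np.abs(z)) ** 2

rng = np.random.default_rng(0)
L2 = 500                                   # observation length, as in the simulations
A_i, sigma_w = 1.0, 0.05                   # illustrative amplitude and output noise level

# Authentic candidate: samples cluster around +/- A_i.
z_auth = A_i * rng.choice([-1.0, 1.0], size=L2) + sigma_w * rng.standard_normal(L2)

# False candidate: an equal-weight mix of three users' bits (assumed weights).
mix = rng.choice([-1.0, 1.0], size=(3, L2)).sum(axis=0) / 3.0
z_false = mix + sigma_w * rng.standard_normal(L2)

J_auth, J_false = cost(z_auth), cost(z_false)
```

For the authentic candidate the estimate stays near the noise variance `sigma_w**2`, whereas the mixed output yields a cost more than an order of magnitude larger, so plotting `1 / J` separates the two classes clearly.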
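For the special case where there are many active users, the probability density function $p(z_i)$ can be approximated by a zero-mean Gaussian distribution by using the central limit theorem:

$$p(z_i) \approx \frac{1}{\sqrt{2\pi}\,\sigma_{z_i}}\exp\!\left(-\frac{z_i^2}{2\sigma_{z_i}^2}\right),$$

where $\sigma_{z_i}^2 = E\{z_i^2\}$ is the power of the interference term, which depends on the combining factors $\alpha_j$ and the receive amplitudes $A_j$. Then the mean of the absolute amplitude is

$$E\{|z_i|\} = \sigma_{z_i}\sqrt{2/\pi}.$$

Now the cost function can be evaluated:

$$J(\mathbf{d}_i) = \sigma_{z_i}^2 - \frac{2}{\pi}\sigma_{z_i}^2 = \left(1 - \frac{2}{\pi}\right)\sigma_{z_i}^2. \tag{27}$$

As (27) shows, even if the noise is removed, the interference term still remains. The only way to remove the interference term and to make (27) insignificant is to have all the combining factors $\alpha_j = 0$, but this contradicts the assumption of a false solution.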
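The Gaussian approximation can be checked numerically. The sketch below assumes a false solution built from 30 users with equal weights, values chosen only for illustration, and compares the sample cost with the value $(1 - 2/\pi)\sigma_z^2$ that follows from a zero-mean Gaussian output.

```python
import numpy as np

rng = np.random.default_rng(0)
K, n = 30, 100_000                          # many users, many noise-free samples
weights = np.full(K, 1.0 / K)               # assumed per-user weights in the false output

bits = rng.choice([-1.0, 1.0], size=(n, K)) # independent equiprobable bits
z = bits @ weights                          # interference-only decorrelator output

sigma_z2 = np.sum(weights**2)               # variance of z for independent +/-1 bits
J_emp = np.mean(z**2) - np.mean(np.abs(z)) ** 2
J_clt = (1.0 - 2.0 / np.pi) * sigma_z2      # Gaussian (central limit) prediction
```

The two values agree to within a few percent at this number of users; with only a handful of users the discrete distribution of `z` makes the Gaussian prediction correspondingly looser.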
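After finding the active spreading codes, user identification is completed by estimating the users' powers. An estimate of the users' powers can be obtained from (4) as follows:

$$\hat{\mathbf{A}}^2 = (\mathbf{S}^T\mathbf{S})^{-1}\mathbf{S}^T(\mathbf{C} - \sigma^2\mathbf{I}_N)\mathbf{S}(\mathbf{S}^T\mathbf{S})^{-1},$$

equivalently,

$$\hat{\mathbf{A}}^2 = (\mathbf{S}^T\mathbf{S})^{-1}\mathbf{S}^T\mathbf{U}_s(\boldsymbol{\Lambda}_s - \sigma^2\mathbf{I}_K)\mathbf{U}_s^T\mathbf{S}(\mathbf{S}^T\mathbf{S})^{-1},$$

where $\sigma^2$ is estimated from the initial subspace decomposition. Also, instead of a group estimation of powers, a given user's power can be independently estimated as

$$\hat{A}_i^2 = \left[\mathbf{s}_i^T\mathbf{U}_s(\boldsymbol{\Lambda}_s - \sigma^2\mathbf{I}_K)^{-1}\mathbf{U}_s^T\mathbf{s}_i\right]^{-1}.$$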
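Power estimation from the autocorrelation matrix can be sketched as follows. This is one way to invert the model C = S A^2 S^T + sigma^2 I once the codes are known, not necessarily the authors' exact estimator: the group estimate uses the pseudoinverse of `S`, the per-user estimate uses the signal-subspace quantities, and the noise variance is taken as the mean of the smallest eigenvalues. The codes and amplitudes are hypothetical test values.

```python
import numpy as np

N, K, sigma = 8, 3, 0.1
a = np.array([+1, +1, -1, -1, -1, +1, +1, -1], dtype=float)
b = np.array([+1, -1, +1, -1, +1, -1, +1, -1], dtype=float)
c = np.array([-1, +1, +1, +1, -1, -1, -1, +1], dtype=float)
S = np.column_stack([a, b, c]) / np.sqrt(N)           # identified codes (assumed known)
amps = np.array([1.0, 0.7, 0.4])                      # true amplitudes, for checking only
C = S @ np.diag(amps**2) @ S.T + sigma**2 * np.eye(N) # exact autocorrelation matrix

lam, U = np.linalg.eigh(C)                            # ascending eigenvalues
sigma2_hat = lam[: N - K].mean()                      # noise variance from noise eigenvalues
U_s, lam_s = U[:, N - K:], lam[N - K:]

# Group estimate: strip the noise floor, then undo S on both sides.
S_pinv = np.linalg.pinv(S)
A2_group = np.diag(S_pinv @ (C - sigma2_hat * np.eye(N)) @ S_pinv.T)

# Per-user estimate from signal-subspace quantities only.
M = U_s @ np.diag(1.0 / (lam_s - sigma2_hat)) @ U_s.T
A2_single = np.array([1.0 / (S[:, k] @ M @ S[:, k]) for k in range(K)])
```

With the exact matrix both estimates return the squared amplitudes; with a sample autocorrelation matrix their accuracy is limited by the accumulation length and by each user's SNR.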

SIMULATION RESULTS
Throughout the simulations, a processing gain of $N = 16$ is assumed. The accumulation length for evaluating the autocorrelation matrix, L1, and the observation length for inspecting the statistics of $z_i$, L2, are set to L1 = 5000 and L2 = 500 samples, unless specified otherwise. The accumulation lengths can be shortened to make them more appropriate for a dynamic communication environment. As will be shown, a trade-off between the accumulation lengths and the detection margin can be made. Since the spreading codes are not available in advance, signature sequences are generated by a $2^{N-1}$ counter and then projected onto the subspaces.
In the next simulation, signals from 10 users arrive at the receiver. As a result of the initial subspace decomposition and projection, 64 solutions are found. By inspecting the eigenvalues, it is learned that there are only 10 active users, so the remaining 54 solutions are false. In order to resolve the ambiguity, the cost function is measured for each solution and its inverse is plotted in Figure 6. As shown, solutions associated with active users have a significantly higher $J(\mathbf{d}_i)^{-1}$, and false solutions can be easily distinguished and eliminated by their low $J(\mathbf{d}_i)^{-1}$. The simulation is repeated for two different signal-to-noise ratio (SNR) conditions. In Figure 6a, it is assumed that all users are of equal power and have an equal SNR of 30 dB. For the second case, presented in Figure 6b, it is assumed that there is one weak user with SNR = 20 dB, while SNR = 30 dB for the remaining 9 users. This is a worst-case scenario for the weak user. Figure 6 demonstrates that, for both the equal-power and unequal-power cases, there is a considerable margin for correct discovery of the active users.
For a dynamic communication environment, it is essential that the processing delay for detection of the active users be reduced. In the following simulations, we investigate the effect of the observation lengths on the detection process. In the simulations, 10 equal-power users with SNR = 30 dB are assumed. Figure 7 presents the effect of L1, while L2 = 500. In principle, L1 has to be long enough to assure an accurate capture of the statistics of the received signal. Thus, in a system with $K$ active users, one may expect that L1 should be several times larger than $2^K$. As Figure 7 shows, although L1 = 50 causes a significant reduction in detection margin, a value of L1 = 500, while not being too long, can provide a significant margin for detection. Since the required L1 grows with the number of active users, in practice the selection of L1 can be done adaptively as follows. The process starts with a moderate value for L1, and then, once the number of active users has been obtained from the subspace decomposition, L1 can be adjusted for the next batch accordingly. For example, if the number of active users is found to be small, then L1 can be shortened. On the other hand, if $K$ is large, then L1 should be increased for accurate tracking of the users.
Figure 8 shows the effect of L2 on the detection process. L2 can be selected significantly smaller than L1, since $b_k$ takes only $\pm 1$. As Figure 8 demonstrates, the difference between L2 = 100 and L2 = 1000 is negligible. Therefore, in order to acquire an accurate estimate of the statistics of $z_i$, L2 needs to be only a few tens of bit periods long. It is also worthwhile to note that the main difference between L2 = 10 and L2 = 100 is in the floor level of the plots. A higher value of L2 results in a lower and more uniform floor for the $J(\mathbf{d}_i)^{-1}$ plot.
To summarize our observations from Figures 7 and 8, L1 mainly affects the peaks of the $J(\mathbf{d}_i)^{-1}$ plots, whereas L2 influences their floor level. Figure 9 shows the estimation error $(\sigma_{A_i}/A_i)$ of the receive amplitude in various users' power scenarios. In this case, we assume there are 8 active users in the system. After performing the identification, we estimate their powers. Users are grouped into one, two, four, and eight groups of equal powers with the following SNRs (dB) at the receiver side: As demonstrated in Figure 9, in any scenario, the estimation error for the users with the highest SNRs is very low. Also, it should be noted that the estimation error for a user with a certain SNR is about the same in any power scenario. For example, the estimation error for users with SNR = 20 dB, in any of the above scenarios, is in the same range of $5 \times 10^{-3}$ to $8 \times 10^{-3}$. Similarly, the estimation error for users with SNR = 38 dB is always in the vicinity of $1 \times 10^{-3}$. In other words, the estimation error is mainly a function of the signal-to-noise ratio of each user, and the interference from other users does not have a significant impact on it.

CONCLUSION
To increase the capacity of a DS-CDMA system, the use of multiuser detection schemes becomes essential. Multiuser detection schemes require some knowledge about each active user and their relevant parameters. Accurate knowledge of the active users and their parameters plays a significant role in the success of a multiuser detection scheme in canceling multiple-access interference. Since MAI varies dynamically in a multiuser environment, it is essential to perform user identification for better MAI cancellation as well as for optimization of the receiver structure. A blind MUSIC-based approach for user identification and power estimation in a multiuser synchronous CDMA environment has been proposed. It is shown that the algorithm is capable of blind user identification without prior knowledge of the signature sequences. The simulation results indicate the accuracy of the identification and power estimation process.