An Improved Method for Parametric Spatial Spectrum Estimation in Internet of Things

When massive numbers of wireless IoT devices are being deployed, cognitive spectrum management is critical to satisfy the explosive broadband requirements of IoT applications. Heterogeneous the target of the spatial spectrum estimation which is part of array signal processing is to achieve the evaluation of space signal parameters and source location, which result in that the spatial spectrum estimation becomes the most basic content of the array signal processing. It needs a method by which the large dimensional array still has its consistency. Therefore, this paper studies an improved large dimensional array parameterized spatial spectrum estimation method based on Pisarenko method, named G-Pisarenko method. Firstly, an improved estimator about the logarithm of the covariance matrix of a certain bilinear form is analyzed which is based on the theory of large dimension random matrix. We can find out a relatively better method, i.e., MW method. The method will become the primitive method for us to improve. Then, aimed at the relating covariance matrix in MW, we use an improved large dimensional array estimation method which can improve the logarithm of the covariance matrix estimation. Finally, we compare the improved method and the original method by simulation, and it can be seen the clear advantage of G-Pisarenko method when the sample number and observed dimensions are in the same order of magnitude.


Introduction
The development of Industrial Internet of Things (IIoT) has been limited due to the shortage of spectrum resources. Also, the traditional ground industrial Internet of Things (IIoT) cannot supply wireless interconnections anywhere due to its small-scale communication coverage, meanwhile the Internet of Things (IoT) is facing the shortage of spectrum resources due to the rapid growth of IoT terminals and big data services. Fifth generation (5G) network owns sufficient spectrum resources and supplies large data volume business, which can help to expand the communication resources of the IoT by combing IoT with 5G network [1][2][3].
Also known as spatial signal processing, array signal processing utilizes some properties of signals in space to process signals [4]. The specific implementation process is to put multiple sensors in different directions in space to form an array, then utilize the specific array of the sensors to receive and process signals, and extract interested signals and related parameters. At the same time, noise and other interference information are suppressed and eliminated. At present, this processing method has occupied a very important position in the field of signal processing, and it is also of great significance in the fields of radar and sonar [5]. Array signal processing has many advantages, which are strong antiinterference ability, higher gain, strong spatial superresolution, and flexible beam control. Therefore, array signal processing occupies a very important position in the current research [6,7].
The earliest spatial spectrum estimation algorithm is called conventional beamforming (CBF) method, which is based on the Fourier spectrum estimation method in time domain. It mainly replaces the data processed in the traditional time domain with the data received by each array element in the spatial domain. It is also called Bartlett beamforming method, but it has a fatal disadvantage. Namely, the resolution of the array angle would be limited by its physical aperture, and the target located in a beam width space could be often indistinguishable. Subsequently, there are many high-resolution estimation methods that extend nonlinear signal processing technology to spatial domain. There are Burg's maximum entropy method (MEM), Pisarenko's method, and Capon's minimum variance method (MVM) [8][9][10]. All these methods have a common premise. Namely, they are based on the continuous distribution of signal sources in space, but they are not satisfied in practice. Therefore, there are great limitations. Later, there would be algorithms to mathematically decompose the data received by the array, such as MUSIC algorithm, which divides the received data into two subspaces of mutually orthogonal signal and noise. Hereafter, the spatial spectrum peak is constructed to improve the resolution [11].
However, no matter what method is used for spatial spectrum estimation, multivariate statistical knowledge is needed. Moreover, in multivariate statistical analysis, there are two kinds of limit results which are quite different. One is that the assumed dimension is very small and the sample capacity is relatively much larger than the dimension. At this time, the theory of satisfaction is called classical limit theory. The other is the large-dimensional limit theory. When the dimension is very high, it might even fail according to the situation described by the classical limit theory. Therefore, the largedimensional limit theory is proposed. At this moment, it is necessary to combine the large-dimensional random matrix with the spatial spectrum estimation to find an improved estimation method, which could satisfy both the dimension and the observed value geometry. With the maturity of spatial spectrum estimation technology, spatial spectrum estimation based on large array would inevitably become the mainstream of development, and the application of random matrix theory in spectral estimation is also a great progress and breakthrough.
Random matrix theory (RMT) refers to the matrix, whose elements are real, complex, or quaternion of random variables, while the analytical and statistical properties of eigenvalues and eigenvectors are mainly researched in random matrix theory. In the recent decades, with the rapid development of computer science and technology, largescale data analysis is becoming more and more important. There are statistical problems of massive data in the fields of biological microarray data, wireless communication network, financial stock analysis, spatial spectrum estimation, and so on. However, the classical statistical method is based on the case that the dimension is much larger than the sample capacity. This is not satisfied when the dimension sample capacity tends to infinity at the same time, so the large dimension random matrix theory is used.
At the earliest, random matrix theory was researched by Wishart and Hsu. The main work was the joint distributions of matrix elements and eigenvalues in multivariate statistical analysis. Subsequently, physicists introduced the theory into atomic physics and thought that a random Hermitian matrix could replace the Shrodinger operator, and its eigenvalues were used to describe the observed energy level. Nowadays, there are more and more kinds of random matrix, and the common basic research object is the limit spectrum distribution. This is helpful to research the more precise properties of matrix eigenvalues, such as the increase and decrease of eigenvalues and the distribution of maximum and minimum eigenvalues. At present, many scholars have conducted more in-depth research.
Recently, based on the technique of contour integral in random matrix theory, an improved estimator for determining the logarithm of covariance matrix in bilinear form has been proposed [12]. This newly demonstrated estimator is applicable not only to high sample capacity (such as traditional estimator) but also to the case that both observed dimension and sample capacity increase at the same rate. In practical application, sample capacity and observed dimension always have the same order of magnitude. In this case, this feature shows very good performance. Better fitting estimation could be obtained through using this estimator in spatial spectrum estimation.
In conclusion, array signal processing has already become a very important branch of signal processing field. Thereinto, the spatial spectrum estimation technology could be used to estimate some parameters of signal source position or signal spatial domain, so it becomes the most basic content of array signal processing. At present, many spatial spectrum estimation methods have been proposed, but they have a common drawback. It is not considered that the number of samples and the observed dimension (the number of array elements) are in the same order of magnitude, but only the number of samples is much larger than the observed dimension under the assumption. Therefore, it is urgent to research an estimation method that could still be consistent in the case of large arrays. This is also the important research content in this paper.

Theoretical Analysis
Through the analysis of traditional Pisarenko algorithm, it is found that these estimates have a common feature, which is only for the case that the number of samples is far greater than the observed dimension. When the number of samples and the number of sensors is at the same level, there are large errors in all estimates, but the actual situation is like this. In this section, an improved estimator about the logarithm of the covariance matrix of a certain bilinear form is analyzed. This new estimator is applicable not only to high sample capacity (such as traditional estimator) but also to observed dimension and sample capacity increasing at the same rate. This estimate value is based on the theory of large dimension random matrix.

Mathematical Modeling.
A set of observations [13] with a definite zero mean of M × Ndimensions (M,Ndifferent) are considered. That is fy 1 y 2 ⋯ y n g and that is to say y n ∈ C M , n = 1, ⋯, N. It is assumed that these observations are equally distributed with zero mean and form a covariance matrix R m together. λ Wireless Communications and Mobile Computing The logarithm of the covariance matrix plays an important role in many signal processing applications, such as spectrum estimation, parameter identification, and spectrum representation. For example, in the spectrum estimation, Pisarenko proposed the following form of spectrum estimation: In this formula, hð⋅Þis a definite reversible function with inverse h −1 ð⋅Þ. s M ð f Þ is a column vector of M × 1, and its kth value is equal to the value of M −1/2 exp ½j2πkf . In the spectrum estimation, when there is hð⋅Þ = log hð⋅Þ, there would be power estimators: Namely, the power depends on the quadratic form of the covariance matrix logarithm of the observations.
In practice, the real covariance matrix is unknown and must be estimated from the samples. The common method is to construct a sample covariance matrix, such asR M = ð1/NÞ∑ N n=1 y n y H n , and then, log R M is estimated from logR M . If there is a process from N to infinity, the estimated value ofR M is always satisfied. It could be assumed that logR M is infinitely close to log R M , when N is large enough. However, in practice, the number of the sample N may not be very large compared with the actual dimension of the matrix R m , so the log R M accuracy is quite low in this case.
To solve this problem, a generalized progressive form would be focused on. Namely, M and N could be regarded as large but comparable numbers. The consistent estimation of definite bilinear form of log R M would be considered, when M and N increase to infinity at the same rate. More specifically, we would think that the numerical estimates like  It is strictly assumed that the number of samples is higher than the observed dimension. That is 0 < C M < 1. The following assumptions [14] are made: (i) (Hypothesis 1). According to the zero mean Gauss law of covariance matrix, the set fRe ðy n Þ, Im ðy n Þ, n = 1,⋯,Ng of M × 1 dimension vectors is distributed independently and identically (ii) (Hypothesis 2). The minimum eigenvalue of R m (or the maximum eigenvalue) is not equal to zero or infinite in M (iii) (Hypothesis 3). The fC M g sequence is not equal to 0 or 1 The main conclusion under this hypothesis is that the estimator of the universal log R M bilinear form is generally applicable, when M and N tend to infinity at the same rate. The above definition of sample covariance is considered.
The coefficient is Thereinto, the value of b μ ðMÞ 1 < ⋯<b μ ðMÞ M is equal to the value μ in the following equation: It is worth pointing out here that the new estimate value could converge to the classical value, if M is taken as a fixed value and N tends to infinity. Actually, in the observation in equation (5) [15] of real covariance matrix decomposition function. This is defined as a matrix ðR M − zI M Þ −1 , z ∈ C + = fz ∈ C : Im ðzÞ > 0g. Thereinto, the contour of the integral is parameterized by a very convenient function ω M ðzÞ, which would be described in the following content. This parameterization allows us to use a definite algebraic expression of consistent estimation of decomposition function ðR M − zI M Þ −1 of sample covariance matrix to express the integrand function. Finally, the expression could be obtained through using the classical Cauchy residue theorem.
Thereinto, C − includes the clockwise direction of all R m eigenvalues' oriented contour, and log ð⋅Þ is a holomorphic logarithmic branch above C \ R − . The contour is regarded as a curve in C − = fω M ðzÞ, z ∈ ∂R − g. For arbitrary y > 0, ∂R could satisfy: for the rectangle R = ½η inf , η sup × i½ − y, y's periphery, ω M ðzÞ is a result of the following polynomial equation: In particular, seen in Figure 1, it is worth noting that The parameterization of the contour C is used, and formula (8) is rewritten as follows: Thereinto, ω M ′ ðzÞ is the complex derivative of ω M ðzÞ. That 0 does not belong to the rectangle R could be seen (because of c M < 1), and there is periphery distance between of the distance 0 of ω M ðzÞ and the eigenvalue of R M . The both holomorphic functions m M ðzÞand q M ðzÞ could be defined as follows: makes, According to the convergence theorem, the results in the relevant data could be reused, and the random quantitieŝ m M ðzÞ andq M ðzÞ in the above integral sum could replace q M ðzÞ. Moreover, when M and N tend to infinity, it is certain that there could be: Thereinto, Therefore, it is necessary that I ðMÞ m in (14) could also be expressed as I ðMÞ m in (4), if the proof of Theorem 1 could be obtained.

2.3.3.
Processing the Produced Integral. Unfortunately, the nonisolated singularity could be appeared for the above integral in R(due to the appearance of logarithms). Moreover, the integral could not be processed with the conventional complex integration method. To solve this, the following realvalued function is defined:  [17], and has single and plural poles at the place b μ ðMÞ k . In k = 1, ⋯, M, the algebraic property could be utilized to know that I ðMÞ m ð0Þ is equal to I ðMÞ m defined in (4). Next, for any function I ðMÞ m ðθÞ that could satisfy θ ∈ ð0, c M Þ, this function is not only continuously differentiable but also satisfactory for: For any arbitrary θ ∈ ð0, c M Þ, this integral could also be calculated in the closed case, in which the conventional integration method is used. Actually, the above integral a meromorphic function, which has double complex poles Through calculating the remainder of these poles, it is easy to obtain the closest expression with respect to dI ðMÞ m ðθÞ/dθ, which is defined as follows: The first three terms of the above equation are a simple primitive function for θ. Thereinto, the appearance of the function b ν ðMÞ r ðθÞ makes the last item difficult to be processed. To simplify the last term, the variables in the above formula could be changed to make θ → b ν ðMÞ r ðθÞ. Afterwards, the polynomial parameters produced in b ν ðMÞ r ðθÞare expressed as the sum of simple fractions. The derived original function and the value I ðMÞ m ð0Þ derived above are used to obtain: Finally, θ = c M is taken, and some algebraic identities are used to obtain the value as shown in Theorem 1.
It could be seen that this method is indeed a good estimate, and it still has good properties in large-dimensional arrays.

G-Pisarenko Algorithm Design
In this section, the aforementioned logarithmic improved estimator is applied to the traditional Pisarenko algorithm to propose a new algorithm, namely, G-Pisarenko algorithm.
3.1. Signal Model. The array used in this paper is a linear array. With the origin as the reference point, the array element position is assumed to be x k ðk = 1, 2,⋯,MÞ. When the angle of the signal incidence is θ i ði = 1, 2,⋯,NÞ, there is τ ki = ðx k sin θ i /cÞðk = 1,⋯MÞthen. When the array is a uniform linear array, there is only a relative delay difference among the different array elements, which receive signals in the same direction. At this moment, the first array element is regarded as the reference array element, and there is τ ki = ððm − 1Þd 4. Simulation Results and Analysis 4.1. Spectrum Peak Search Results. In order to illustrate the validity of the proposed estimates, an array processing procedure with three signal sources is considered. The array element is composed of a uniform array. The distance between the array elements is half a wavelength. All three signal sources are subject to the standard of Gaussian distribution. The superimposed noise is a random process with a mean value being zero and a variance being one. The arrival direction of the signal source being relative to the periphery of the array is fixed to [-40, 0, 40] degree. The simulations are performed in the following cases: The first case is M = 10 and N = 6000. Namely, the number of samples is much larger than the observation dimension (the number of arrays), and it is run once. At this moment, the spectrum peak search results of traditional Pisarenko and G-Pisarenko methods are shown in Figures 2 and 3.
It could be observed that at this moment, the both have the spectrum peaks at -40 degrees, 0 degree, and 40 degrees. However, there would be more serious pseudo peaks under the traditional method. In comparison, there are obvious peak values in the improved method. If the angular interval is very small, the estimation result would be inaccurate. This is also one problem in the traditional Pisarenko's MW method. There are obvious peak values in the improved method, when the angular interval is small. It could be seen that in the traditional method, there are serious pseudo peaks, and the graphics are more chaotic around 0 degree. At -40 degrees and 40 degrees, its effect is not as clear as that of the improvement method also.
The second case is M = 10 and N = 20. The number of samples and the value of the observation dimension are smaller, but the size is in the same order of magnitude. At the same time, it is a constant, which satisfies M/N and is greater than 0 and less than 1. At this moment, the simulation results are shown in Figures 6 and 7.
It could be observed that in the traditional method, not only the pseudo peaks exist but also the values of the pseudo peaks are relatively larger in this case, which is similar to case 1. Its difference is not as obvious as that of the improved method. Therefore, the improved method still has its advantages in this case.
This experiment is repeated 50 times, and its simulation results are shown in Figures 8 and 9.
The third case is M = 10 and N = 20. The number of samples and the value of the observation dimension are larger, but the size is in the same order of magnitude. At the same time, it is a constant, which satisfies M/N and is greater than 0 and less than 1. Since the amount of calculation is especially large when the values M and N are larger, 100 and 200 are used to roughly fit the case of large dimension. At this moment, the simulation results are shown in Figures 10 and 11.
It could be observed that the pseudo peak values in Figure 9 are very close, and the distortion is serious. The spectrum peaks in Figure 10 are obvious. Due to the large amount of calculation, the simulation results after 50 repetitions would not be discussed here, but it could be inferred that there is not much difference between 50 times and 1 time. The estimated spectrum peaks are still more obvious in the improved algorithm.      Wireless Communications and Mobile Computing the improved method could be compared, when the different values are taken for the signal-to-noise ratio. Here, the arrival direction of the signal source could be fixed to the degree ½−40, 0, 40, which is relative to the periphery of the array. The signal-to-noise ratio is between -15 and 20. When running 100 times, the simulation results are as follows: The first case: when there are M = 20 and N = 60, the simulation results at this moment are shown in Figure 12. It could be found that when the number of array elements and the number of samples are in the same order of magnitude, the root mean square error of the G-Pisarenko method is much lower than that of the traditional Pisarenko method. As the signal-to-noise ratio increases, the mean square error of the improved method gradually approaches zero.
The second case: when there are M = 20 and N = 600, the simulation results at this moment are shown in Figure 13. It could be found that when the number of array elements is relatively small, the root mean square error of the G-Pisarenko method is still much lower than that of the traditional Pisarenko. As the signal-to-noise ratio increases, the root mean square error of the improved method gradually approaches zero. The spectrum estimation effect is very well.         Wireless Communications and Mobile Computing

Summary
In this paper, an improved Pisarenko parameterized spatial spectrum estimation method, namely, the G-Pisarenko method, is proposed based on random matrix theory. From the simulation, it could be found that when M and N values are very small, the spectrum peaks of the improved algorithm are more obvious, and there is no redundant pseudo peak. When the M and N values are larger, the spectral peaks in the traditional algorithm are chaotic, but the performance is still well in the improved algorithm. From the perspective of root mean square error, the error of the improved algorithm is much smaller than that of the traditional algorithm. So, this is a very good improved estimation. According to the principle of the improved estimation, the methods in this paper could be used in other algorithms to achieve an improved feature decomposition algorithm for largedimensional arrays. The scheme can be applied to the Internet of things and play an effective improvement. However, at the same time, it is not difficult to find that during the spectrum peak search, a large amount of calculation could be generated in the method proposed in this paper. The complexity of the calculation is so high that it is impossible to discuss the situation where the values of M and N tend to be infinite. Therefore, this paper also needs to be improved on this basis. A simpler and more effective algorithm could be found. In addition, the situations researched in this paper are based on the ideal situations of many assumptions, which could only be ideally approximated in practical occasions. The effectiveness needs to be improved.

Data Availability
The (data type) data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
This paper has clear and legal intellectual property rights or property rights.