Capacity of very noisy communication channels based on Fisher information

We generalize the asymptotic capacity expression for very noisy communication channels to now include coloured noise. For the practical scenario of a non-optimal receiver, we consider the common case of a correlation receiver. Due to the central limit theorem and the cumulative characteristic of a correlation receiver, we model this channel noise as additive Gaussian noise. Then, the channel capacity proves to be directly related to the Fisher information of the noise distribution and the weak signal energy. The conditions for occurrence of a noise-enhanced capacity effect are discussed, and the capacity difference between this noisy communication channel and other nonlinear channels is clarified.

idealized assumption of white noise is impractical, and coloured noise has practical significance [2][3][4]. Here we further derive a general asymptotic expression of the channel capacity for coloured noise, which applies not only to the optimum receiver but also to an arbitrary correlation receiver.
In the case of coloured noise and for very low SNR, the conditional probability function can be expanded to the first order  less than that of Fisher information matrix of non-Gaussian noise. This result extends the conclusion of equation (36) by Nirenberg 5, and also confirms that, in terms of the channel capacity, zero-mean Gaussian noise is the worst case given that the noise vector has a fixed covariance matrix 3,4.
However, we note that the channel capacity of equation (6) is achieved by the optimum receiver of equation (3). In many practical cases, the optimum receiver may not be implementable, because the noise distribution is unknown or has no closed form (e.g. α-stable noise 19). Thus, we further consider the generalized correlation receiver is not restricted to be memoryless. For the zero-mean vector of E Z [g(z)] = 0 (for a shift in mean) 6 under f Z and for very low SNR, g(x) can be expanded to the first-order Then, for a large observation size N, the statistic T m has the mean Using the Cholesky decomposition of the symmetric matrix V = E Z [g(z)g(z) T ] = LL T , the output SNR of the receiver can be calculated as . Then, we argue that, for sufficiently large observation times and with the constraint of weak signal energy, the receiver output tends to be Gaussian distributed, and the capacity can be approximately calculated as $C_g = \varepsilon\Lambda_g/2$ (11) and for the positive semidefinite matrix and the equality occurring for . This inequality (13) indicates that the eigenvalue Λ of J(f Z ) is not less than the eigenvalue Λ g of the matrix . Therefore, based on equations (6), (11) and (13), we find $C_g \le C$ (14), which extends the conclusion of ref. 5 to the case of coloured noise. In addition, the equality in equation (13) also demonstrates that the receiver of equation (8) is optimal when i.e. the optimum receiver of equation (3).
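As a rough numerical illustration of the Gaussian worst case, the sketch below evaluates the capacity per unit signal energy for zero-mean Gaussian coloured noise. It assumes the capacity takes the form C = εΛ/2, with Λ the largest eigenvalue of the Fisher information matrix, and uses the standard fact that this matrix equals the inverse covariance for Gaussian noise. The tridiagonal covariance, the function name and the parameter values are hypothetical choices made only for this example.

```python
import numpy as np

def gaussian_capacity_per_energy(cov):
    """Return (C/eps, unit-norm signal direction) for zero-mean Gaussian noise.

    Assumption: C = eps * Lambda / 2, where Lambda is the largest eigenvalue
    of the Fisher information matrix; for Gaussian noise that matrix is inv(cov).
    """
    fisher = np.linalg.inv(cov)
    eigvals, eigvecs = np.linalg.eigh(fisher)  # eigenvalues in ascending order
    return 0.5 * eigvals[-1], eigvecs[:, -1]

# Hypothetical coloured noise: unit variance, nearest-neighbour correlation 0.2
n, rho, sigma = 4, 0.2, 1.0
cov = sigma**2 * (np.eye(n) + rho * (np.eye(n, k=1) + np.eye(n, k=-1)))

c_per_eps, direction = gaussian_capacity_per_energy(cov)
print("C/eps for coloured Gaussian noise:", c_per_eps)
print("C/eps for white Gaussian noise   :", 0.5 / sigma**2)
```

Under these assumptions the best signal direction is the eigenvector of the covariance matrix with the smallest noise variance, so correlated Gaussian noise of a given per-sample variance is never worse than white Gaussian noise of the same variance.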
We argue that the asymptotic capacity expression of equation (11) has broader applicability, since it holds for an arbitrary correlation receiver operating in coloured or white noise environments. As a simple check of the consistency of the results from equation (11) to equation (14), we consider the case of white noise. Owing to the statistical independence of the components of g(z), the expectation matrices immediately reduce to $E_Z[\nabla g(\mathbf{z})] = E_z[g'(z)]\,I$ and $V = E_z[g^2(z)]\,I$. Here, the derivative g′(z) = dg(z)/dz and I is the unit matrix. Therefore, the matrix reduces to $\{E_z^2[g'(z)]/E_z[g^2(z)]\}\,I$, and the channel capacity becomes $C_g = \frac{1}{2}\,\varepsilon\,E_z^2[g'(z)]/E_z[g^2(z)] \le \frac{1}{2}\,\varepsilon\,J(f_z)$ (15), where the equality in equation (15) occurs when g(z) is proportional to $f_z'(z)/f_z(z)$, which specifies the optimum receiver in the presence of white noise 5.
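The white-noise check can be explored numerically. The sketch below assumes that the capacity per unit signal energy of a memoryless correlation receiver in white noise is E²[g′(z)] / (2 E[g²(z)]), and it evaluates E[g′(z)] through integration by parts, so that discontinuous nonlinearities such as sign(z) can be handled. The helper names, the grid size and the choice of unit-variance Gaussian noise are assumptions made for illustration only.

```python
import numpy as np

def _trapz(y, x):
    # Plain trapezoidal rule, written out to avoid NumPy version differences
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def capacity_per_energy(g, pdf, half_width=20.0, n=400001):
    """Assumed white-noise form: C_g/eps = E[g']^2 / (2 E[g^2])."""
    z = np.linspace(-half_width, half_width, n)
    f = pdf(z)
    df = np.gradient(f, z)
    mean_slope = -_trapz(g(z) * df, z)  # E[g'(z)] = -int g(z) f'(z) dz
    power = _trapz(g(z) ** 2 * f, z)    # E[g^2(z)]
    return 0.5 * mean_slope**2 / power

def gaussian_pdf(z, sigma=1.0):
    return np.exp(-z**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

linear = capacity_per_energy(lambda z: z, gaussian_pdf)  # matched to Gaussian noise
threshold = capacity_per_energy(np.sign, gaussian_pdf)   # mismatched receiver
print(linear, threshold)
```

For unit-variance Gaussian noise the linear receiver should reach the Fisher information bound of 1/2, while the threshold receiver should give about 1/π, which illustrates the inequality for a mismatched receiver.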
Conditions for noise-enhanced capacity. Since the emergence of the concept of stochastic resonance 20, the employment of noise in enhancing the performance of nonlinear systems has become an interesting option 13,14,21-36. Initially, the mechanism of stochastic resonance manifested itself as a time-scale matching condition between the noise-induced characteristic time of the system and the signal period 20,27. Later, the notion of stochastic resonance was widened to cover a number of different mechanisms, e.g. aperiodic stochastic resonance 22 and suprathreshold stochastic resonance 31. For such stochastic resonance effects 22,31, there is no matching time-scale that corresponds to the input aperiodic or information-carrying random signal, but the system performance still reaches a maximum at an optimal non-zero noise level. Therefore, the noise-enhanced effect, instead of stochastic resonance, becomes a more appropriate term for describing the enhancement of system responses via the addition of noise. Here, if the channel capacity reaches a maximum at an optimal non-zero noise level, then the noise-enhanced capacity effect occurs. Otherwise, the channel capacity decreases monotonically as the noise level increases, that is to say, the noise-enhanced capacity effect does not exist.
There are two approaches for varying the noise in stochastic resonance. One is to tune the noise level without changing the noise type, and the other is to add extra noise to a given noisy signal, where the extra noise type may be different from the original one. Next, we will demonstrate the conditions for the occurrence or non-occurrence of the noise-enhanced capacity effect under each of these two approaches.
First, we will prove that no noise-enhanced capacity effect exists for tuning the scaled noise level in an optimum receiver. For the scaled noise vector Z = DZ n , the covariance matrix ∑ Z can be factored as ∑ Z = DD T and the standardized noise vector Z n has a covariance matrix equal to the unit matrix, $E(\mathbf{z}_n \mathbf{z}_n^T) = I$ 7. A well-known scaling property of the Fisher information matrix 7,18,37-40 is $J(f_Z) = D^{-T} J(f_{Z_n}) D^{-1}$, where the largest eigenvalue of $J(f_{Z_n})$ is Λ n , a fixed quantity for Z n . For such a channel with its optimum receiver, equation (11) indicates that the channel capacity monotonically decreases as the noise intensity increases. Thus, no noise-enhanced capacity phenomenon will occur by tuning the noise level. For instance, we consider a threshold receiver based on the function g(x) = sign(x) and Laplacian white noise with the distribution $f_z(z) = \exp(-\sqrt{2}\,|z|/\sigma)/(\sqrt{2}\,\sigma)$. We note that the threshold receiver is optimum for the Laplacian noise, and satisfies the equality condition in equation (15). In this case, the channel capacity in equation (15) can be calculated as $C_g = \varepsilon/\sigma^2$, which monotonically decreases as the noise level σ increases. Thus, there is no noise-enhanced capacity effect.
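The effect of scaling the noise can be sketched numerically. The example below assumes the standard scaling property J(f_Z) = D^{-T} J(f_{Z_n}) D^{-1} for Z = D Z_n and the capacity form C = εΛ/2. The standardized noise is taken to have independent unit-variance Laplacian components, whose Fisher information matrix is 2I, and the 3 × 3 correlation matrix and the list of noise levels are hypothetical.

```python
import numpy as np

def scaled_fisher(fisher_unit, D):
    """Fisher information matrix of Z = D Z_n, given that of Z_n."""
    D_inv = np.linalg.inv(D)
    return D_inv.T @ fisher_unit @ D_inv

def capacity_per_energy(fisher):
    # Assumed form C/eps = Lambda / 2, with Lambda the largest eigenvalue
    return 0.5 * np.linalg.eigvalsh(fisher)[-1]

# Standardized noise: i.i.d. unit-variance Laplacian components, J = 2 I
fisher_unit = 2.0 * np.eye(3)

# Hypothetical correlation structure; D = sigma * chol(corr) sets the noise level
corr = np.array([[1.0, 0.2, 0.0],
                 [0.2, 1.0, 0.2],
                 [0.0, 0.2, 1.0]])
shape = np.linalg.cholesky(corr)

sigmas = [0.5, 1.0, 2.0, 4.0]
capacities = [capacity_per_energy(scaled_fisher(fisher_unit, s * shape))
              for s in sigmas]
for s, c in zip(sigmas, capacities):
    print(f"sigma = {s:3.1f}  C/eps = {c:.4f}")
```

Because D enters only through its inverse, doubling the noise level divides the largest eigenvalue by four, so under these assumptions the capacity of the optimum receiver can only fall as the noise level is tuned upwards.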
Secondly, we usually have a given signal corrupted by noise, and the initial noise level is not adjustable. We will prove that the addition of extra noise cannot further improve the channel capacity achieved by the optimum receiver. Under this circumstance, we add an extra noise vector W, independent of Z and S m , to the observation X, and the updated data vector is $\tilde{\mathbf{X}} = \mathbf{X} + \mathbf{W} = \mathbf{S}_m + \mathbf{U}$, where the composite noise vector U = Z + W has the distribution f U . In this case, we should employ the statistic to specify the optimum receiver, and the corresponding capacity is then given by $\tilde{C} = \varepsilon\tilde{\Lambda}/2$, with the largest eigenvalue $\tilde{\Lambda}$ of the Fisher information matrix J(f U ). For any nonsingular matrix $A \in \mathbb{R}^{N \times N}$, the Fisher information matrix inequality 3,37-40 holds for we then find that the largest eigenvalue Λ of J(f Z ) is not less than the largest eigenvalue $\tilde{\Lambda}$ of J(f U ) and $\tilde{C} \le C$. (20) This result of equation (20) clearly shows that stochastic resonance cannot further improve the channel capacity achieved by the optimum receiver, regardless of whether the added noise vector W is white or coloured.
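A scalar version of this statement can be checked numerically. The sketch below is an illustration under stated assumptions rather than the matrix argument itself: it takes unit-variance Laplacian background noise Z, adds independent Gaussian noise W with standard deviation 0.5, obtains the density of U = Z + W by discrete convolution, and compares the two Fisher informations. The grid, the noise parameters and the function name are hypothetical.

```python
import numpy as np

def fisher_information(f, z):
    """Numerical location Fisher information: integral of f'(z)^2 / f(z)."""
    df = np.gradient(f, z)
    integrand = np.zeros_like(f)
    mask = f > 1e-300                      # skip points where f underflows
    integrand[mask] = df[mask] ** 2 / f[mask]
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(z)))

z = np.linspace(-20.0, 20.0, 4001)         # symmetric grid with step 0.01
h = z[1] - z[0]
sigma_z, sigma_w = 1.0, 0.5                # hypothetical noise levels

f_z = np.exp(-np.sqrt(2) * np.abs(z) / sigma_z) / (np.sqrt(2) * sigma_z)
f_w = np.exp(-z**2 / (2 * sigma_w**2)) / (np.sqrt(2 * np.pi) * sigma_w)
f_u = np.convolve(f_z, f_w, mode="same") * h   # density of U = Z + W

J_z = fisher_information(f_z, z)
J_u = fisher_information(f_u, z)
print("J(f_Z) =", J_z, " J(f_U) =", J_u)
```

With these values the Fisher information of the composite noise should fall between the Cramér-Rao lower bound 1/(σ_z² + σ_w²) = 0.8 and the Fisher information inequality upper bound 1/(σ_z²/2 + σ_w²) ≈ 1.33, both below the value 2/σ_z² of the Laplacian noise alone.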
Scientific Reports | 6:27946 | DOI: 10.1038/srep27946
Thirdly, we note that the above two negative conditions of the noise-enhanced capacity effect arise with the optimum receiver matched to the distribution of the background noise. By contrast, if the generalized correlation receivers of equation (8) are not optimal for the background noise, stochastic resonance may play an important role in the enhancement of capacity. For example, we consider non-scaled Gaussian mixture noise vector W with its distribution 6,21,28,33  , the dependence among noise samples Z n will be weak 41. The signum function g(x) = sign(x) is adopted to construct the generalized correlation receiver of equation (8), which is not optimal for the coloured noise Z. The optimum receiver indicated in equation (3) for the coloured noise Z is rather complicated, since the distribution f Z does not have a tractable analytic expression 41. Using the approach developed in ref. 41, we have the expectation matrix W Z with the unit matrix I and , and the matrix V becomes tridiagonal with elements . Then, we calculate the largest eigenvalue of the matrix . In Fig. 2, we show the capacity per signal energy C g /ε = Λ g /2 in equation (11) versus the noise parameters μ and ζ in equation (21). Here, the correlation coefficients are ρ 1 = 0.2 and ρ 2 = 0 in the coloured noise model of equation (22). We regard the parameters ±μ as the peak locations of the Gaussian mixture distribution in equation (21), and the parameter ζ as the noise level. Fig. 2 clearly shows that, upon increasing ζ for a fixed value of μ (so that the noise variance σ w 2 also increases), the noise-enhanced capacity effect exists. The corresponding maxima of C g /ε and the optimal values of ζ are marked by squares in Fig. 2.
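A simplified white-noise analogue of this example shows how a non-monotonic capacity can arise for a mismatched receiver. The sketch below is not the coloured-noise computation behind Fig. 2. It assumes a symmetric two-component Gaussian mixture with peaks at ±μ and component standard deviation ζ, a sign receiver, and the white-noise form C_g/ε = E²[g′(z)] / (2 E[g²(z)]), which for g = sign reduces to 2 f(0)². All names and parameter values are hypothetical.

```python
import numpy as np

def threshold_capacity_per_energy(mu, zeta):
    """C_g/eps of a sign receiver in white two-peak Gaussian mixture noise.

    Assumed noise density: 0.5*N(mu, zeta^2) + 0.5*N(-mu, zeta^2).
    For g = sign, E[g'] = 2 f(0) and E[g^2] = 1, so C_g/eps = 2 f(0)^2.
    """
    f0 = np.exp(-mu**2 / (2 * zeta**2)) / (np.sqrt(2 * np.pi) * zeta)
    return 2.0 * f0**2

mu = 1.0
zetas = np.linspace(0.2, 3.0, 281)          # candidate noise levels
values = threshold_capacity_per_energy(mu, zetas)
best_zeta = zetas[np.argmax(values)]
print("capacity per energy is maximal near zeta =", best_zeta)
```

In this simplified setting the density at the threshold, and hence the capacity of the sign receiver, peaks at ζ = μ: a small ζ leaves almost no noise mass near the threshold, while a large ζ spreads the density too thinly.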
We emphasize that the above noise-enhanced capacity effect is an illustrative case of stochastic resonance that exists for a suboptimal receiver not matching the background noise. However, this mismatch condition is not the decision criterion for the occurrence of the noise-enhanced effect, since the example is given under the assumptions of a small signal and a correlation receiver with a large observation size. Beyond these restrictive assumptions, the noise-enhanced effect has been frequently observed 21,24,25,28-31. For instance, the noise-enhanced effect has been demonstrated for non-weak signals in threshold neurons 25,29,31, where an optimal matching condition is inapplicable to the neuronal model immersed in complex noisy environments. It is well recognized that an established criterion for the noise-enhanced effect is the observation of an optimal noise level at which the system response is optimized.
Figure 2. Capacity per signal energy C g /ε versus the noise parameters μ and ζ in equation (21). Here, the correlation coefficients are ρ 1 = 0.2 and ρ 2 = 0 in the coloured noise model of equation (22). The corresponding maxima of C g /ε versus optimal values of ζ are also marked by squares.

Discussion
In this paper, we analyse the capacity of a very noisy communication channel with correlation receivers. With the weak signal energy constraint and for very low SNR, we generalize an asymptotic expression of capacity achieved by the optimum receivers in a coloured noisy environment. Moreover, for the case when the optimum receiver is unavailable in practice, a capacity formula is presented for the communication channel with a generalized correlation receiver. We further discuss the occurrence condition of the noise-enhanced capacity effect in the considered communication channel.
A similar asymptotic expression of capacity has also been obtained for memoryless 10,11 and memory additive-noise channels 12-14. We emphasize that the asymptotic capacity expressions of equations (6) and (11) are different from those in the previous literature 10-14. In Fig. 1, for the channel output Y = g(X), these studies assume the conditional probability density f Y|S (y|s). Then, the Fisher information matrix is defined as 10-14 . Then, for the zero-mean signal vector E S (s) = 0 and the weak signal energy ε, the mutual information between the input space Φ and the output space Ψ is approximated as 10-14 $I(\Phi, \Psi) \approx \frac{1}{2} E_S[\mathbf{s}^T J(f_{Y|S})|_{\mathbf{s}=0}\,\mathbf{s}]$, which is different from the mutual information I(Φ, Ω) of equation (4) based on the Fisher information matrix J(f Z ) of the noise distribution f Z . It is shown in Fig. 1 that the receiver multiplies the nonlinear transformation g(x) by optimized coefficients, and obtains a cumulative statistic T m that decides whether the mth signal S m is sent or not. Then, the considered communication channel chooses an optimal signal S m from the signal space to maximize the average mutual information. Since the receiver collects the weighted nonlinear outputs as the statistic T m , the distribution of T m tends to be Gaussian for any nonlinear function g. This leads to the asymptotic capacity expressions of equations (6) and (11). We recognize that the asymptotic capacity expressions in equations (6) and (11) apply in the context of a very noisy communication channel with a correlation receiver. As a new analytical result for the channel capacity, it has theoretical significance and deserves some exposition.
We also note that, for the linear transfer function Y = Z + S, the conditional probability density is f Y|S (y|s) = f Z (y − s), and the Fisher information matrix of equation (27) reduces to J(f Z ), where the differentiation operator ∇ with respect to S is equivalent to differentiation with respect to Z 3. Therefore, for the linear additive-noise channel, the considered communication channel has the same capacity as that given in refs 10-14.
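The reduction for the linear channel can be written out in a few lines. The derivation below is a sketch that assumes the standard definition of the Fisher information matrix as the expected outer product of the score with respect to the signal vector, and the subscript notation on the gradient is introduced here only for clarity.

```latex
\begin{align}
f_{Y|S}(\mathbf{y}|\mathbf{s}) &= f_Z(\mathbf{y}-\mathbf{s}), \\
\nabla_{\mathbf{s}} \ln f_{Y|S}(\mathbf{y}|\mathbf{s})
  &= -\,\nabla_{\mathbf{z}} \ln f_Z(\mathbf{z})\Big|_{\mathbf{z}=\mathbf{y}-\mathbf{s}}, \\
J(f_{Y|S})
  &= E\!\left[\nabla_{\mathbf{s}} \ln f_{Y|S}\,
      \big(\nabla_{\mathbf{s}} \ln f_{Y|S}\big)^{T}\right]
   = E_Z\!\left[\nabla_{\mathbf{z}} \ln f_Z\,
      \big(\nabla_{\mathbf{z}} \ln f_Z\big)^{T}\right]
   = J(f_Z).
\end{align}
```

The sign change in the score cancels in the outer product, which is why differentiation with respect to the signal and with respect to the noise lead to the same Fisher information matrix.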
Besides the linear channel capacity defined and calculated by Shannon 1, only a few analytical results exist for the variety of nonlinear channel models. We argue that our asymptotic capacity expression for a nonlinear channel may be valuable for practical channels and for coding techniques developed to approach the established linear Shannon limit in communication applications, and deserves further extensive study. Here we only consider a single correlation receiver for detecting the weak signal; however, recent studies provide evidence that, besides an optimal noise intensity, an optimal network configuration exists at which the best system response can be obtained 22,31,42-46. Thus, an interesting extension for future work is to investigate the capacity of a very noisy communication channel with receivers connected in various network configurations.
Figure 1. [...] Then, a receiver multiplies the transformation g(X) by optimized coefficients, resulting in a cumulative statistic T m (X) for deciding whether the mth signal S m is sent or not.

Methods
The capacity C of a communication channel is given by the maximum of the mutual information I(Φ, Ω) between the input signal space Φ and the channel output space Ω, $C = \max_{f_S} I(\Phi, \Omega)$, where the maximization is with respect to the input distribution f S over the signal space Φ 1-5.