High-SNR capacity of wireless communication channels in the noncoherent setting: A primer

This paper is dedicated to Prof. Johannes Huber on the occasion of his 60th birthday.
https://doi.org/10.1016/j.aeue.2011.02.003

Abstract

This paper, mostly tutorial in nature, deals with the problem of characterizing the capacity of fading channels in the high signal-to-noise ratio (SNR) regime. We focus on the practically relevant noncoherent setting, where neither transmitter nor receiver know the channel realizations, but both are aware of the channel law. We present, in an intuitive and accessible form, two tools, first proposed by Lapidoth and Moser (2003) [12], of fundamental importance to high-SNR capacity analysis: the duality approach and the escape-to-infinity property of capacity-achieving distributions. Furthermore, we apply these tools to refine some of the results that appeared previously in the literature and to simplify the corresponding proofs.

Introduction

Most wireless communication systems operate in the noncoherent setting where neither transmitter nor receiver have a priori information on the realization of the underlying fading channel. As channel state information is typically acquired by allocating transmission time and/or bandwidth to channel estimation (a typical example is the use of pilot symbols [18]), a problem of significant practical relevance is to determine the optimal amount of resources to be used for this task. This problem can be addressed in a fundamental fashion by determining the Shannon capacity (i.e., the ultimate limit on the rate of reliable communication [4]) in the noncoherent setting. Unfortunately, corresponding analytical results are exceedingly difficult to obtain, even for simple channel models [1]; nevertheless, significant progress has been made during the past few years by studying the capacity behavior in the asymptotic regimes of high and low signal-to-noise ratio (SNR). Throughout this paper, we shall deal exclusively with the high-SNR regime. The capacity behavior at high SNR turns out to be very sensitive to the channel model used [12], [9], [6]. In this paper, we shall focus on a channel model—the correlated block-fading model [13], [15]—that is simple and yet rich enough to illustrate some of the possible asymptotic dependencies of capacity on SNR, namely, logarithmic with different pre-log factors [13], [9], [21], [7], or double-logarithmic [12]. The aim of this tutorial paper is two-fold:

  • We present, in an intuitive and accessible manner, two tools that turn out to be exceedingly useful in the characterization of capacity at high SNR: the duality approach and the escape-to-infinity property of capacity-achieving distributions. These tools were first introduced in [12].

  • We use these tools to refine a result that appeared previously in [13] and to provide an alternative and much simpler proof of a result in [21], [7]. Furthermore, we develop insights into the use of duality by exploiting the geometry of the correlated block-fading model.

Uppercase boldface letters denote matrices and lowercase boldface letters designate vectors. Uppercase sans-serif letters (e.g., $\mathsf{Q}$) denote probability distributions, while lowercase sans-serif letters (e.g., $\mathsf{r}$) are reserved for probability density functions. The superscripts $T$ and $H$ stand for transposition and Hermitian transposition, respectively. We denote the identity matrix of dimension $N \times N$ by $\mathbf{I}_N$; $\mathrm{diag}\{\mathbf{a}\}$ is the diagonal square matrix whose main diagonal contains the entries of the vector $\mathbf{a}$, and $\lambda_q(\mathbf{A})$ stands for the $q$th largest eigenvalue of the Hermitian positive-semidefinite matrix $\mathbf{A}$. For a random vector $\mathbf{x}$ with distribution $\mathsf{Q}$, we write $\mathbf{x} \sim \mathsf{Q}$. We denote expectation by $\mathbb{E}[\cdot]$, and use the notation $\mathbb{E}_{\mathbf{x}}[\cdot]$ or $\mathbb{E}_{\mathsf{Q}}[\cdot]$ to stress that expectation is taken with respect to $\mathbf{x} \sim \mathsf{Q}$. We write $D(\mathsf{Q}(\cdot) \,\|\, \mathsf{R}(\cdot))$ for the relative entropy between the distributions $\mathsf{Q}$ and $\mathsf{R}$ [4, Sec. 8.5]. Furthermore, $\mathcal{CN}(\mathbf{0}, \mathbf{R})$ stands for the distribution of a circularly symmetric [10, Def. 24.3.2] complex Gaussian random vector with covariance matrix $\mathbf{R}$. For two functions $f(x)$ and $g(x)$, the notation $f(x) = O(g(x))$, $x \to \infty$, means that $\limsup_{x \to \infty} |f(x)/g(x)| < \infty$, and $f(x) = o(g(x))$, $x \to \infty$, means that $\lim_{x \to \infty} |f(x)/g(x)| = 0$. Finally, $\log(\cdot)$ indicates the natural logarithm.

In our quest for simplicity of exposition, we chose to focus on the correlated block-fading channel model [13], [15]. In this model, the channel changes in an independent fashion across blocks of $N$ discrete-time samples and exhibits correlated fading within each block (with the same fading statistics for all blocks). The input–output (IO) relation corresponding to one such block is given by

$$\mathbf{y} = \mathrm{diag}\{\mathbf{h}\}\,\mathbf{x} + \mathbf{w}. \tag{1}$$

Here, $\mathbf{x} = [x_1 \,\cdots\, x_N]^T \in \mathbb{C}^N$ is the (random) input vector, which we assume to satisfy the average-power constraint

$$\frac{1}{N}\,\mathbb{E}\big[\|\mathbf{x}\|^2\big] \le \rho. \tag{2}$$

The vector $\mathbf{w} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_N)$ represents additive white Gaussian noise (AWGN), and $\mathbf{h} \sim \mathcal{CN}(\mathbf{0}, \mathbf{R})$ contains the fading channel coefficients. The vectors $\mathbf{x}$, $\mathbf{h}$, and $\mathbf{w}$ are mutually independent. We assume that $\mathbf{R}$ has rank $Q$ ($1 \le Q \le N$) and that the main-diagonal entries of $\mathbf{R}$ are all equal to 1. Throughout the paper, we consider the noncoherent setting where transmitter and receiver know the statistics of $\mathbf{h}$, but not its realizations.
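To make the block model concrete, the following minimal sketch draws one block of the IO relation y = diag{h}x + w using only the Python standard library. The particular rank-Q covariance used here (built from the first Q columns of the N-point DFT matrix) and all function names are illustrative assumptions, not choices made in the paper; any factor A with unit-norm rows and rank Q would serve equally well.

```python
import cmath
import math
import random


def complex_gaussian(rng):
    """Draw one CN(0, 1) sample (real and imaginary parts have variance 1/2)."""
    std = math.sqrt(0.5)
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def fading_factor(n_block, rank):
    """Return an N x Q matrix A such that R = A A^H has rank Q and unit diagonal.

    Hypothetical choice for illustration: the first Q columns of the N-point
    DFT matrix scaled by 1/sqrt(Q), so that every row of A has unit norm.
    """
    if not 1 <= rank <= n_block:
        raise ValueError("need 1 <= Q <= N")
    scale = 1.0 / math.sqrt(rank)
    return [
        [scale * cmath.exp(-2j * math.pi * n * q / n_block) for q in range(rank)]
        for n in range(n_block)
    ]


def simulate_block(x, factor, rng):
    """Return (h, y) for one block of y = diag{h} x + w."""
    rank = len(factor[0])
    g = [complex_gaussian(rng) for _ in range(rank)]  # Q i.i.d. CN(0, 1) variables
    h = [sum(a * gq for a, gq in zip(row, g)) for row in factor]  # h = A g ~ CN(0, A A^H)
    w = [complex_gaussian(rng) for _ in x]  # white noise, CN(0, I_N)
    y = [hn * xn + wn for hn, xn, wn in zip(h, x, w)]
    return h, y


# Example: N = 8, Q = 2, constant-amplitude input that meets (1/N) E||x||^2 <= rho.
rng = random.Random(0)
N, Q, rho = 8, 2, 100.0
A = fading_factor(N, Q)
x = [math.sqrt(rho)] * N
h, y = simulate_block(x, A, rng)
```

Because h is generated from only Q independent Gaussian variables, knowing Q suitable entries of h determines the remaining ones in the absence of noise, which is the property the rank assumption is meant to capture.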

The model we just described may seem contrived at first sight. Yet, it is of practical relevance for at least two reasons. First, it captures the essence of channel variations (in time) in an accurate but simple way: the rank Q of R corresponds to the minimum number of entries of h that need to be known to perfectly recover the whole vector (in the absence of noise); therefore, larger Q corresponds to faster channel variation. Second, when R is circulant, the IO relation (1) coincides with the IO relation—in the frequency domain—of a cyclic-prefix orthogonal frequency-division multiplexing system [16] operating over a frequency-selective channel with Q uncorrelated taps. In other words, the model in (1) can be thought of as the dual of the widely used intersymbol-interference channel model. Independence across blocks is a sensible assumption for systems employing time-division multiple access or frequency hopping [14]. Finally, we remark that for the special case Q=1, the channel model in (1) reduces to the piecewise-constant block-fading channel model previously used in numerous papers such as [14], [7], [21].

The capacity of the channel in (1) is given by

$$C(\rho) = \frac{1}{N}\,\sup_{\mathsf{Q}}\, I(\mathbf{x};\mathbf{y}). \tag{3}$$

Here, $I(\mathbf{x};\mathbf{y})$ denotes the mutual information [4, Sec. 8.5] between $\mathbf{x}$ and $\mathbf{y}$ in (1), and the supremum is taken over all distributions $\mathsf{Q}$ on $\mathbf{x}$ that satisfy the average-power constraint (2). Because the variance of the entries of $\mathbf{h}$ and $\mathbf{w}$ is normalized to one, we can interpret $\rho$ as the receive SNR.

The literature is essentially void of analytic expressions for C(ρ), even for the simplest case N=1. Nevertheless, as we shall see in the next section, the high-SNR behavior of C(ρ) can be characterized fairly well.

For the general case $1 \le Q \le N$, Liang and Veeravalli showed that [13, Props. 3 and 4]

$$C(\rho) = \frac{N-Q}{N}\,\log\rho + O(\log\log\rho), \quad \rho \to \infty. \tag{4}$$

This result is sufficient to characterize the capacity pre-log $\chi$, defined as the asymptotic ratio between capacity and the logarithm of SNR as SNR goes to infinity:

$$\chi = \lim_{\rho \to \infty} \frac{C(\rho)}{\log\rho}.$$

The pre-log can be interpreted as the fraction of signal-space dimensions that can be used for communication. From (4) we find the pre-log to be given by the difference of two terms, i.e., $\chi = 1 - Q/N$. The first term can be thought of as the capacity pre-log when the channel is known perfectly at the receiver (in this case, $\chi = 1$ [2]); the second term quantifies the loss in signal-space dimensions due to the lack of channel knowledge. Note that $Q/N$ is the smallest fraction of entries of the $N$-dimensional vector $\mathbf{h}$ that need to be known to reconstruct the whole vector in the absence of noise. Hence, we can further interpret the penalty term $Q/N$ as the fraction of signal-space dimensions in which pilot symbols need to be transmitted to allow the receiver to learn the channel.
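As a small worked illustration of the pre-log formula χ = 1 − Q/N, the sketch below computes it exactly with rational arithmetic. The function name is hypothetical and introduced only for this example.

```python
from fractions import Fraction


def prelog(n_block, rank):
    """Capacity pre-log chi = 1 - Q/N of the correlated block-fading channel."""
    if not 1 <= rank <= n_block:
        raise ValueError("need 1 <= Q <= N")
    return 1 - Fraction(rank, n_block)


# Blocks of N = 8 samples with a rank-2 correlation matrix: 2 of the 8
# dimensions are lost to learning the channel, leaving a pre-log of 3/4.
chi = prelog(8, 2)
```

For Q = N the function returns 0, which corresponds to the full-rank case in which capacity no longer grows logarithmically in SNR.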

When $Q = N$, i.e., the channel correlation matrix has full rank, (4) implies that the pre-log is equal to 0. It turns out that in this case the $O(\log\log\rho)$ term in (4) is tight and capacity grows double-logarithmically in SNR. This surprising result was proven in [13, Lem. 5]. In Section 3.2, we shall refine the result in [13, Lem. 5] by providing the following, more accurate, high-SNR capacity characterization:

$$C(\rho) = \log\log\rho - \gamma - 1 - \frac{1}{N}\sum_{q=1}^{N} \log\lambda_q(\mathbf{R}) + o(1), \quad \rho \to \infty. \tag{5}$$

This result characterizes capacity (for $Q = N$) up to a $o(1)$ term (i.e., a term that vanishes as $\rho \to \infty$). In contrast, the expression provided in [13, Lem. 5] agrees with capacity only up to a $O(1)$ term (i.e., a term that is bounded as $\rho \to \infty$).
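The following sketch evaluates the deterministic part of the full-rank expansion, assuming it reads log log ρ − γ − 1 − (1/N) Σ_q log λ_q(R), with γ the Euler–Mascheroni constant. The function name is hypothetical, and the o(1) term is simply dropped, so the value is meaningful only for large ρ.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def full_rank_capacity_approx(rho, eigenvalues):
    """High-SNR approximation of C(rho) in nats per channel use for Q = N.

    `eigenvalues` are the N (strictly positive) eigenvalues of R.
    The vanishing o(1) term of the expansion is omitted.
    """
    if rho <= 1.0:
        raise ValueError("log log rho requires rho > 1")
    if not eigenvalues or any(lam <= 0.0 for lam in eigenvalues):
        raise ValueError("a full-rank R has strictly positive eigenvalues")
    mean_log_eig = sum(math.log(lam) for lam in eigenvalues) / len(eigenvalues)
    return math.log(math.log(rho)) - EULER_GAMMA - 1.0 - mean_log_eig


# Uncorrelated fading (R = I_2) versus correlated fading with the same trace.
uncorrelated = full_rank_capacity_approx(1e6, [1.0, 1.0])
correlated = full_rank_capacity_approx(1e6, [1.5, 0.5])
```

Since the eigenvalues of R sum to N when its diagonal entries equal 1, the mean of their logarithms is nonpositive, so under this reading of the expansion correlation within a block can only increase the constant term.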

The most important tool in the proof of (5) is the duality approach, a technique first introduced in [12] to characterize the capacity of stationary ergodic fading channels with finite differential entropy rate. The essence of the duality approach is that it allows one to obtain tight upper bounds on C(ρ) by choosing appropriate distributions on the output y. Compared to the treatment in [12], our goal in Sections 2.1 and 3.2 is to provide the simplest and most accessible proofs for the main results underlying the duality approach. This comes at the cost of generality (in terms of noise and fading statistics).
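The idea behind duality can be checked numerically on a toy example. For a channel law W and any output distribution R, the mutual information induced by an input distribution Q satisfies I(x;y) ≤ E_Q[D(W(·|x) ‖ R(·))], with equality when R is the output distribution induced by Q. The sketch below verifies this on a finite-alphabet channel; the discrete setting, the binary symmetric channel used in the example, and all function names are assumptions made purely for illustration, since the paper deals with continuous fading channels.

```python
import math


def kl_divergence(p, q):
    """Relative entropy D(p || q) in nats between two finite distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def output_distribution(input_dist, channel):
    """Output distribution induced by input_dist through channel[x][y] = W(y|x)."""
    n_out = len(channel[0])
    return [
        sum(px * row[y] for px, row in zip(input_dist, channel))
        for y in range(n_out)
    ]


def mutual_information(input_dist, channel):
    """I(x; y) written as the average divergence to the induced output law."""
    induced = output_distribution(input_dist, channel)
    return sum(px * kl_divergence(row, induced) for px, row in zip(input_dist, channel))


def duality_bound(input_dist, channel, r):
    """Upper bound E_Q[D(W(.|x) || R)] for an arbitrary output distribution r."""
    return sum(px * kl_divergence(row, r) for px, row in zip(input_dist, channel))


# Binary symmetric channel with crossover probability 0.1 (toy example).
W = [[0.9, 0.1], [0.1, 0.9]]
Q = [0.3, 0.7]
exact = mutual_information(Q, W)
loose = duality_bound(Q, W, [0.2, 0.8])                   # some guess for R
tight = duality_bound(Q, W, output_distribution(Q, W))    # equals `exact`
```

The gap between the bound and the mutual information is the relative entropy between the true output distribution and the chosen R, which is why a well-chosen output distribution yields a tight bound.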

While finding a capacity characterization that, like (5), is tight up to a $o(1)$ term for all $Q$ with $1 \le Q \le N$ is an interesting open problem, for the special case $Q = 1$ (with $Q < N$) the following result was reported in [7], [21]:

$$C(\rho) = \frac{N-1}{N}\Big[\log\rho + \log N - \gamma - 1\Big] - \frac{\log\Gamma(N)}{N} + o(1), \quad \rho \to \infty. \tag{6}$$

Here, $\gamma$ denotes the Euler–Mascheroni constant, and $\Gamma(\cdot)$ is the Gamma function [12, Eq. (197)]. The proof of (6) provided in [7] is based on a rather technical argument and does not seem to explicitly exploit the geometry in the problem, i.e., the fact that $\mathbf{x}$ and $\mathbf{y}$ are collinear in the absence of noise. The proof in [21] does exploit this geometry through an apposite change of variables, and applies to the multiple-antenna setting as well.
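The rank-one expansion can be evaluated with a few lines of code. The sketch below assumes that the expansion reads ((N−1)/N)[log ρ + log N − γ − 1] − log Γ(N)/N; this grouping of the terms is an assumption, chosen because it recovers the coherent high-SNR value log ρ − γ as N grows. The function name is hypothetical and the o(1) term is omitted.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def rank_one_capacity_approx(rho, n_block):
    """High-SNR approximation of C(rho) in nats per channel use for Q = 1 < N.

    The vanishing o(1) term of the expansion is omitted.
    """
    if n_block < 2:
        raise ValueError("the rank-one expansion requires N >= 2")
    if rho <= 0.0:
        raise ValueError("rho must be positive")
    n = n_block
    bracket = math.log(rho) + math.log(n) - EULER_GAMMA - 1.0
    return (n - 1) / n * bracket - math.lgamma(n) / n  # lgamma(n) = log Gamma(n)


# Longer blocks waste a smaller fraction of dimensions on learning the channel.
short_block = rank_one_capacity_approx(1e4, 2)
long_block = rank_one_capacity_approx(1e4, 64)
```

For very long blocks the value approaches log ρ − γ, the high-SNR behavior of the Rayleigh-fading channel with perfect receiver channel knowledge, which serves as a sanity check on the assumed form.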

In Section 3.1, we present a simple, alternative proof of (6) that, unlike the proofs in [7], [21], is based on duality and exploits the geometry in the problem to motivate the choice of the output distribution. Our proof needs another tool put forward in [12]: the escape-to-infinity property of the capacity-achieving distribution. This property, which we review in Section 2.2, allows one to restrict the maximization in (3) to a smaller set of distributions.


The duality approach

To prove (5) and (6), we sandwich capacity between a lower and an upper bound that agree up to a $o(1)$ term. Establishing capacity lower bounds is, in principle, relatively simple: it suffices to evaluate the mutual information in (3) for an input distribution $\mathsf{Q}$ that satisfies the average-power constraint. Obviously, care must be exercised in choosing $\mathsf{Q}$, so as to ensure that the resulting bound is tight in the limit $\rho \to \infty$ (see Section 3.1.2 for a concrete example).

Capacity upper bounds are more

The rank-one case

When $Q = 1$, we can rewrite (1) in the following (more convenient) form

$$\mathbf{y} = s\,\mathbf{x} + \mathbf{w} \tag{11}$$

where $s \sim \mathcal{CN}(0, 1)$. The high-SNR capacity expansion (4) implies that the capacity pre-log of the channel in (11) is given by $1 - 1/N$. This is in agreement with the intuition we provided in Section 1.3: one pilot symbol per block is enough to learn the channel in the absence of noise. We next provide a different interpretation of this result, which is of geometric nature and sheds light on how to select input and output

Open problems

Duality is the main tool we used to establish the novel capacity expansion (5) for the full-rank case and to provide an alternative, simple proof of (6) for the rank-1 case (i.e., the piecewise-constant block-fading channel model). For the latter case, in particular, we showed how the geometry of the communication problem at hand can be used to guess an output distribution that yields an asymptotically tight capacity upper bound. Finding a o(1)-accurate capacity characterization when 1<Q<N is


References (21)

  • I.C. Abou-Faycal et al. The capacity of discrete-time memoryless Rayleigh-fading channels. IEEE Trans Inf Theory (2001)
  • E. Biglieri et al. Fading channels: information-theoretic and communications aspects. IEEE Trans Inf Theory (1998)
  • W.M. Boothby. An introduction to differentiable manifolds and Riemannian geometry (1986)
  • T.M. Cover et al. Elements of information theory (2006)
  • I. Csiszár et al. Information theory: coding theorems for discrete memoryless systems (1982)
  • Durisi G, Morgenshtern VI, Bölcskei H, Schuster UG, Shamai (Shitz) S. Information theory of underspread WSSUS channels....
  • B.M. Hochwald et al. Unitary space–time modulation for multiple-antenna communications in Rayleigh flat fading. IEEE Trans Inf Theory (2000)
  • A.P. Kannu et al. On the spectral efficiency of noncoherent doubly selective channels. IEEE Trans Inf Theory (2010)
  • A. Lapidoth. On the asymptotic capacity of stationary Gaussian fading channels. IEEE Trans Inf Theory (2005)
  • A. Lapidoth. A foundation in digital communication (2009)
There are more references available in the full text version of this article.


Giuseppe Durisi received the Laurea degree summa cum laude and the Doctor degree both from Politecnico di Torino, Italy, in 2001 and 2006, respectively. From 2002 to 2006, he was with Istituto Superiore Mario Boella, Torino, Italy. From 2006 to 2010 he was a postdoctoral researcher at ETH Zurich, Zurich, Switzerland. Since 2010 he has been an assistant professor at Chalmers University of Technology, Gothenburg, Sweden. He was a visiting researcher at IMST, Germany, University of Pisa, Italy, and ETH Zurich, Switzerland. Dr. Durisi’s research interests are in the areas of communication and information theory, and harmonic analysis.

Helmut Bölcskei was born in Mödling, Austria on May 29, 1970, and received the Dipl.-Ing. and Dr. techn. degrees in electrical engineering from Vienna University of Technology, Vienna, Austria, in 1994 and 1997, respectively. In 1998 he was with Vienna University of Technology. From 1999 to 2001 he was a postdoctoral researcher in the Information Systems Laboratory, Department of Electrical Engineering, and in the Department of Statistics, Stanford University, Stanford, CA. He was in the founding team of Iospan Wireless Inc., a Silicon Valley-based startup company (acquired by Intel Corporation in 2002) specialized in multiple-input multiple-output (MIMO) wireless systems for high-speed Internet access, and was a co-founder of Celestrius AG, Zurich, Switzerland. From 2001 to 2002 he was an assistant professor of Electrical Engineering at the University of Illinois at Urbana-Champaign. He has been with ETH Zurich since 2002, where he is a professor of Electrical Engineering. He was a visiting researcher at Philips Research Laboratories Eindhoven, The Netherlands, ENST Paris, France, and the Heinrich Hertz Institute Berlin, Germany. His research interests are in information theory, mathematical signal processing, and applied and computational harmonic analysis. He received the 2001 IEEE Signal Processing Society Young Author Best Paper Award, the 2006 IEEE Communications Society Leonard G. Abraham Best Paper Award, the 2010 Vodafone Innovations Award, the ETH “Golden Owl” Teaching Award, is a Fellow of the IEEE, and was an Erwin Schrödinger Fellow of the Austrian National Science Foundation (FWF). He was a plenary speaker at several IEEE conferences and served as an associate editor of the IEEE Transactions on Information Theory, the IEEE Transactions on Signal Processing, the IEEE Transactions on Wireless Communications, and the EURASIP Journal on Applied Signal Processing. 
He is currently editor-in-chief of the IEEE Transactions on Information Theory and serves on the editorial board of “Foundations and Trends in Networking”. He was TPC co-chair of the 2008 IEEE International Symposium on Information Theory and serves on the Board of Governors of the IEEE Information Theory Society.
