On the Gaussian multiway relay channel with intra-cluster links

This paper studies an extension of the multiway relay channel (introduced by Gündüz et al.) that adds intra-cluster links. In this model, multiple clusters of users communicate with the help of one relay, and the users within a cluster wish to exchange messages among themselves. Restricted encoders are considered; thus, the encoded messages of each user depend only on its own message, not on previously decoded ones. Cut-set bounds and achievable rates are given for the Gaussian case with and without time-sharing between clusters. Depending on the protocol considered, schemes based on random coding or nested lattice coding are proposed. The schemes are compared in terms of exchange capacity, that is, the equal-rate point in the capacity region of a symmetric multiway relay channel. It is shown that the gap between the cut-set bound and Compress-and-Forward, as well as Amplify-and-Forward, is independent of the transmit power constraints when time-sharing is used.


Introduction
The relay channel, introduced by van der Meulen [2] and studied in depth by Cover and El Gamal [3], is one of the major building blocks of wireless networks: one source node wishes to transmit data to a destination node with the help of a relay.
Information may thus flow along the direct or the relayed link. The key question in approaching capacity is how these two links should cooperate. Many different protocols have been proposed to communicate over the Gaussian relay channel, such as Decode-and-Forward (DF), Amplify-and-Forward (AF), and Compress-and-Forward (CF) ([4, Ch. 16], [5] and the references therein).
The Two-Way Relay Channel (TWRC), where two users wish to exchange messages with the help of a relay [6], is a natural extension of this three-node channel, and DF, AF, and CF have been adapted to this channel. The TWRC without direct links between the two users has been extensively studied under half-duplex ([7,8]) and full-duplex ([4, Ch. 19], [9,10]) operation. The TWRC with direct links [11][12][13] has also been studied. The Compute-and-Forward (CoF) protocol proposed by Nazer et al. [14], for which the relay decodes the sum of the messages instead of the individual messages, has also been proposed for this channel [15].
*Correspondence: anne.savard@telecom-lille.fr. Télécom Lille, Univ. Lille, CNRS, UMR 8520 - IEMN, F-59000 Lille, France. Full list of author information is available at the end of the article.
The multiway relay channel (mRC) is an extension of the TWRC that has been recently proposed by Gündüz et al. in [16,17]: they consider multiple clusters of users that wish to exchange messages locally within each cluster, with the help of a single relay. The achievable equal rate point for DF, CF, AF, and CoF is given for the so-called restricted model, in which the nodes' channel inputs depend only on their own messages, not on past symbols. The finite field multiway relay channel has been studied in [18].
The main difference between the multiway relay channel in [17] and the model considered in this paper is the inclusion of direct links between users of the same cluster. When users are close to each other (for example, in a sensor network), they can overhear signals sent by neighboring nodes; thus, adding direct links gives a more realistic model of that situation.
The multiway relay channel with direct links and one cluster has recently been studied by Su et al. in [19]. The authors characterized the cut-set bound and the rate region using DF and CF based on random coding arguments. A study of the gap between the cut-set bound and CF or DF is proposed when a specific power constraint at the relay applies.
The main differences between the work done by Su et al. and the work presented in this paper are the following: first of all, we concentrate on the Gaussian mRC and we propose achievable schemes using either AWGN coding or lattice coding. We also propose achievable AF and CoF schemes and a more general study of the gap between the cut-set bound and CF or AF and of the gap between CF and AF. Moreover, we also characterize the achievable equal rate point when there are L clusters with intra-cluster links without time-sharing at the relay; these are, to the best of our knowledge, the first results obtained for this model.
The focus of this paper is to provide rate limits for the Gaussian mRC with intra-cluster links. Users in a cluster broadcast their messages to both the relay and the other users within the cluster. The relay receives incoming messages on a Multiple Access Channel (MAC) and sends a function of its received message over a Gaussian broadcast channel to the users in order to help them decode the messages they wish to recover. This paper focuses on the equal-rate point in the capacity region of the symmetric network setup. The per-user rate is termed exchange rate, while the sum rate (the total throughput over the channel) is called total exchange rate [17].
In this setup, all users have the same power constraint and the noise powers at all nodes are the same. Moreover, the gain on the user-relay links is the same for all users and is denoted by g, and all intra-cluster links have unit gain. When g is small, the relay does not play a crucial part in the network, whereas if g is large, which can be obtained if the relay has a better antenna than the users (or more power and less noise), the relay is useful since it has a better observation than the users.
In this paper, we investigate two different setups: one with and one without time-sharing between clusters. We characterize the set of achievable symmetric rate tuples such that all users can multicast their messages to all other users in their cluster. We propose extensions of some well-known protocols for the Gaussian relay channel, namely DF, CF, AF, and CoF, based either on AWGN or lattice coding, as well as a cut-set bound. We also characterize gaps between the cut-set bound and AF or CF and between CF and AF for the multiway relay channel with time-sharing between clusters.
The rest of this paper is organized as follows: the system model is presented in Section 2. Lattices as a coding tool are briefly presented in Section 3. Our main contributions are given in Sections 4 and 5. Section 4 presents the cut-set bound, the total exchange rate obtained with AF, CF, and DF, as well as with CoF when time-sharing is performed between clusters. The asymptotic limits for large values of g are studied and compared with the results obtained by Gündüz et al., since under proper power scaling for large g, our model becomes equivalent to theirs. A weakened cut-set bound that is used to bound the gaps between proposed schemes and the cut-set bound is also presented. The section will be concluded by results obtained with our proposed protocols for selected examples. Section 5 presents the achievable total exchange rate obtained with AF, CF, and DF and example results without time-sharing among clusters. Section 6 gives some conclusions on the presented model and achievable rates.

System model
This paper considers a Gaussian multiway relay channel (mRC) in which N users, grouped into L clusters of K ≥ 2 users each (N = KL), exchange messages with the help of one relay. The K users in each cluster want to recover the messages of all other K − 1 users within the same cluster. We suppose that clusters are formed from users that are physically close and can therefore overhear each other's messages, and we model this through direct links between users of the same cluster (at the same time, cluster separation is assumed large enough to avoid interference). All nodes are full-duplex: they can receive and send at the same time. This situation is depicted in Fig. 1.
The full-duplex Gaussian mRC is modeled as

$$Y_{l,k} = \sum_{i=1,\, i \neq k}^{K} X_{l,i} + g X_R + Z_{l,k}, \qquad (1)$$

$$Y_R = g \sum_{l=1}^{L} \sum_{k=1}^{K} X_{l,k} + Z_R, \qquad (2)$$

where $X_{l,k}$, of power $P_{l,k}$, is the signal sent by user $k$ of cluster $l$; $X_R$, of power $P_R$, is the relay output signal; $Y_{l,k}$ is the received signal at user $k$ of cluster $l$; $Y_R$ is the received signal at the relay; and $Z_R$ and $Z_{l,k}$ are zero-mean, unit-variance Gaussian noises that are independent of each other and of the channel inputs. The difference with the model in [17] is the presence of the intra-cluster signal in (1). In this paper, we focus on a symmetric network with equal power constraints, respectively noise variances, at the users, i.e., $P_{l,k} = P$, $\forall l, k$.
We investigate the average throughput for different relaying schemes. We use the notation $C(x) = \frac{1}{2}\log_2(1 + x)$, $\bar{x} = 1 - x$ and $x^+ = \max(0, x)$. With time-sharing between clusters, each user has a codebook of rate $R_e/K$ (leading to an average throughput of $R_e/(KL)$), whereas without time-sharing, the user codebooks have rate $R_e/(KL)$, equal to the average throughput. Some of the presented schemes are based on lattice coding; we therefore give a brief overview of these techniques and refer interested readers to [20] for more details.
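The notation above can be written as a few small helpers. This is only an illustrative sketch: the function names `C`, `bar`, `pos` and `db_to_linear` are hypothetical and not taken from the paper.

```python
import math


def C(x: float) -> float:
    """Gaussian capacity function C(x) = 0.5 * log2(1 + x), in bits per channel use."""
    return 0.5 * math.log2(1.0 + x)


def bar(x: float) -> float:
    """Complement x_bar = 1 - x."""
    return 1.0 - x


def pos(x: float) -> float:
    """Positive part x^+ = max(0, x)."""
    return max(0.0, x)


def db_to_linear(p_db: float) -> float:
    """Convert a power given in dB to linear scale (helper assumed for convenience)."""
    return 10.0 ** (p_db / 10.0)
```

For instance, `C(db_to_linear(30.0))` evaluates the point-to-point Gaussian capacity at the power level used in the numerical examples of the paper.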

Lattices
The main advantage of lattice coding is its ability to exploit the network topology: instead of decoding messages individually, this construction allows the relay to directly compute the modulo sum of the messages, which relaxes for instance the decoding constraint of DF.
A lattice $\Lambda \subset \mathbb{R}^n$ is a discrete additive subgroup of $\mathbb{R}^n$. In other words, $\forall \lambda_1, \lambda_2 \in \Lambda$, $\lambda_1 + \lambda_2 \in \Lambda$ and $-\lambda_1 \in \Lambda$. The lattice quantizer $Q_\Lambda$ that maps any point $x \in \mathbb{R}^n$ to the closest lattice point is defined as $Q_\Lambda(x) = \arg\min_{\lambda \in \Lambda} \|x - \lambda\|$, and the modulo operation gives the quantization error as $[x] \bmod \Lambda = x - Q_\Lambda(x)$. The fundamental Voronoi region $\mathcal{V}_0$ of $\Lambda$ is the set of points that are closer to the origin than to any other lattice point: $\mathcal{V}_0 = \{x \in \mathbb{R}^n : Q_\Lambda(x) = 0\}$.
We briefly present some parameters that describe a lattice. The covering radius $r_{cov}$ is the radius of the smallest sphere that covers $\mathcal{V}_0$. The effective radius $r_{eff}$ is the radius of a sphere with the same volume as $\mathcal{V}_0$. The second moment per dimension $\sigma^2(\Lambda)$ defines the average power of the lattice $\Lambda$: $\sigma^2(\Lambda) = \frac{1}{nV} \int_{\mathcal{V}_0} \|x\|^2 \, dx$, where $V$ is the volume of $\mathcal{V}_0$. The normalized second moment of a lattice $\Lambda$ of dimension $n$ is defined as $G(\Lambda) = \sigma^2(\Lambda) / V^{2/n}$. It measures the efficiency of $\Lambda$ as a shaping region: the normalized second moment of a sphere in $\mathbb{R}^n$ approaches $1/(2\pi e)$ as $n$ grows, and the more $\mathcal{V}_0$ resembles a sphere, the closer $G(\Lambda)$ is to $1/(2\pi e)$.
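As a concrete illustration of the quantizer and the modulo operation, the sketch below uses the scaled integer lattice step·Z^n, for which the nearest lattice point is obtained by componentwise rounding. The choice of this lattice and the names `quantize` and `mod_lattice` are assumptions made for the example; the good lattices used in the paper are far less trivial.

```python
import math


def quantize(x, step=1.0):
    """Nearest point of the lattice step*Z^n: componentwise rounding (ties go up)."""
    return [step * math.floor(xi / step + 0.5) for xi in x]


def mod_lattice(x, step=1.0):
    """Quantization error x - Q(x); it always lies in the Voronoi region of the origin."""
    q = quantize(x, step)
    return [xi - qi for xi, qi in zip(x, q)]


point = [0.3, 1.7, -2.2]
nearest = quantize(point)      # closest point of Z^3
residual = mod_lattice(point)  # each component lies in [-0.5, 0.5)
```

For step·Z^n the Voronoi region of the origin is a cube of side `step`, so every component of the residual lies in [-step/2, step/2).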
Let us now consider a sequence of $n$-dimensional lattices $\Lambda^{(n)}$. The sequence is said to be Rogers-good [21] if $\lim_{n \to \infty} r_{cov}^{(n)} / r_{eff}^{(n)} = 1$, that is, if the covering radius approaches the effective radius. The sequence is said to be Poltyrev-good [22] (good for AWGN coding) if, for $Z \sim \mathcal{N}(0, \sigma^2 I_n)$, $\Pr\{Z \notin \mathcal{V}_0^{(n)}\} \le e^{-n(E_P(\mu) - o_n(1))}$, where $E_P$ is the Poltyrev exponent and $\mu$ is the volume-to-noise ratio (see [20] for details). The sequence is said to be good for mean-squared error quantization if $\lim_{n \to \infty} G(\Lambda^{(n)}) = \frac{1}{2\pi e}$. It can be shown that if a lattice is Rogers-good, then it is also good for mean-squared error quantization [23].
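To see why these goodness properties are not automatic, the following sketch evaluates the two figures of merit for the integer lattice Z^n, whose Voronoi region is the unit cube. The lattice choice and all function names are assumptions made for illustration. The ratio of covering radius to effective radius does not tend to 1, and the normalized second moment stays at 1/12, above 1/(2πe), so Z^n is neither Rogers-good nor good for quantization.

```python
import math


def cube_covering_radius(n: int) -> float:
    """Covering radius of Z^n: half the diagonal of the unit cube."""
    return math.sqrt(n) / 2.0


def log_unit_ball_volume(n: int) -> float:
    """Natural log of the volume of the unit ball in R^n: pi^(n/2) / Gamma(n/2 + 1)."""
    return (n / 2.0) * math.log(math.pi) - math.lgamma(n / 2.0 + 1.0)


def cube_effective_radius(n: int) -> float:
    """Radius of the n-ball with volume 1 (the volume of the unit cube)."""
    return math.exp(-log_unit_ball_volume(n) / n)


def rogers_ratio(n: int) -> float:
    """r_cov / r_eff for Z^n; a Rogers-good sequence would drive this to 1."""
    return cube_covering_radius(n) / cube_effective_radius(n)


G_CUBE = 1.0 / 12.0                            # normalized second moment of Z^n
G_SPHERE_LIMIT = 1.0 / (2 * math.pi * math.e)  # limiting value for spheres
```

Evaluating `rogers_ratio` for growing `n` shows the ratio increasing away from 1, which is consistent with its large-n limit sqrt(πe/2).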
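Good lattice codebooks are obtained with the help of a coarse lattice $\Lambda_c$ and a fine lattice $\Lambda_f$, such that $\Lambda_c \subseteq \Lambda_f$, with fundamental Voronoi regions $\mathcal{V}_c$ of volume $V_c$ and $\mathcal{V}_f$ of volume $V_f$, respectively. These lattices are chosen such that $\Lambda_f$ is Poltyrev-good and $\Lambda_c$ is both Rogers- and Poltyrev-good. The codebook is given by $\mathcal{C} = \Lambda_f \cap \mathcal{V}_c$. The second moment per dimension of $\Lambda_c$ is chosen to ensure a power constraint. The rate of this codebook is $R = \frac{1}{n} \log_2 |\mathcal{C}| = \frac{1}{n} \log_2 \frac{V_c}{V_f}$.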
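A toy instance of such a nested codebook can be enumerated directly. The sketch below assumes the fine lattice Z^n and the coarse lattice q·Z^n (a choice made purely for illustration; these are not the good lattices required by the coding theorems), and the helper names are hypothetical.

```python
import itertools
import math


def coarse_quantize(x, q):
    """Nearest point of the coarse lattice q*Z^n (componentwise, ties go up)."""
    return tuple(q * math.floor(xi / q + 0.5) for xi in x)


def nested_codebook(n, q):
    """Fine-lattice points (Z^n) that fall in the Voronoi region of the coarse lattice."""
    origin = (0,) * n
    candidates = itertools.product(range(-q, q + 1), repeat=n)
    return [p for p in candidates if coarse_quantize(p, q) == origin]


def rate_bits_per_dimension(codebook, n):
    """Codebook rate (1/n) * log2 |C|."""
    return math.log2(len(codebook)) / n


codebook = nested_codebook(n=2, q=4)
rate = rate_bits_per_dimension(codebook, n=2)  # equals log2(q) for this construction
```

Here the volume ratio of the two Voronoi regions is q^n, so the enumerated rate agrees with (1/n) log2(V_c / V_f).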

Total exchange rate for the symmetric network with time-sharing between clusters
We first focus on the model with time-sharing between clusters, as studied by Gündüz et al. Each cluster transmits only over a $1/L$ fraction of the time, which allows the power of each user to be increased up to $P' = LP$ while still satisfying the average user power constraint. For notational simplicity, we drop the cluster index, yielding

$$Y_k = \sum_{i=1,\, i \neq k}^{K} X_i + g X_R + Z_k, \qquad Y_R = g \sum_{k=1}^{K} X_k + Z_R.$$

The remainder of the section presents the cut-set bound as well as the achievable average throughput per user using CF, AF, DF, and CoF (only when $K = 2$). For each lower/upper bound, we also study the limit when $g$ grows large, i.e., when the direct links become negligible compared to the relayed links. This study allows a comparison with the average throughput per user obtained by Gündüz et al. In order to make a fair comparison, we must normalize the transmitted powers (at the users and the relay) by $g^2$.

Cut-set bound (outer bound on the exchange capacity of the symmetric Gaussian mRC)
Proposition 1 For a symmetric Gaussian mRC with direct links, L clusters of K users each and time-sharing between clusters, the cut-set bound (CSB) on the exchange capacity is given by and ρ is a correlation parameter.
Proof The proof extensively uses the fact that the mean-squared error of the linear MMSE estimate of Y given X is greater than or equal to the expected conditional variance E(Var(Y |X)), where E denotes the expectation operator.
The X k and X R are zero-mean and jointly Gaussian (see Appendix 16A of [4]).
A detailed proof can be found in [19].
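The inequality between the linear MMSE error and the expected conditional variance used in this proof can be checked on a small discrete example. The sketch below is an illustration only: it assumes Y = f(X) + N with N zero-mean noise independent of X, and the function names are hypothetical. Equality holds when f is linear, which mirrors the jointly Gaussian case.

```python
def lmmse_error(xs, probs, f, noise_var):
    """MSE of the best linear estimate of Y = f(X) + N from X: Var(Y) - Cov(X,Y)^2 / Var(X)."""
    ex = sum(p * x for x, p in zip(xs, probs))
    ey = sum(p * f(x) for x, p in zip(xs, probs))
    var_x = sum(p * (x - ex) ** 2 for x, p in zip(xs, probs))
    var_y = sum(p * (f(x) - ey) ** 2 for x, p in zip(xs, probs)) + noise_var
    cov = sum(p * (x - ex) * (f(x) - ey) for x, p in zip(xs, probs))
    return var_y - cov ** 2 / var_x


def expected_conditional_variance(noise_var):
    """E[Var(Y|X)]: given X, only the independent noise N is random."""
    return noise_var


xs = [-1.0, 0.0, 1.0]
probs = [1 / 3, 1 / 3, 1 / 3]
nonlinear_gap = lmmse_error(xs, probs, lambda x: x * x, 0.5) - expected_conditional_variance(0.5)
linear_gap = lmmse_error(xs, probs, lambda x: 2 * x, 0.5) - expected_conditional_variance(0.5)
```

With the quadratic map, X and Y are uncorrelated, so the linear estimate cannot exploit the dependence and the gap is strictly positive; with the linear map, the gap vanishes.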

Proposition 2
For a symmetric Gaussian mRC with direct links, $L$ clusters of $K$ users each, time-sharing between clusters and asymptotically large $g$, with powers $P$ and $P_R$ scaled by $1/g^2$, the cut-set bound on the exchange capacity is given by Note that this is identical to the CSB for Gündüz et al.'s model ([17], eq. (19)).
Proof By replacing $P$ by $P/g^2$ and $P_R$ by $P_R/g^2$ in (3) and (4) and by taking the limit, we obtain We see that both limits are decreasing in $\rho$; thus the optimum value is $\rho = 0$, which completes the proof.

Amplify-and-Forward
The easiest protocol to implement at the relay is Amplify-and-Forward. In this scheme, the relay amplifies its received signal within its power constraint and broadcasts the resulting signal to the receivers.
Within the time slot of each cluster, all users broadcast their message both to the relay and to all other destinations. The relay scales its received signal and broadcasts it to all users in the next time slot. Thus, the AF protocol transforms the channel into a unit-memory inter-symbol interference MAC. This observation has been made in [5].
This part is inspired by [24], where an AF scheme has been proposed for the Gaussian relay channel. Here we extend the approach in [24] to multiple users grouped into clusters.

Proposition 3 For a symmetric Gaussian mRC with direct links, L clusters of K users each and time-sharing between clusters, the total exchange rate (mRC throughput) R AF is achievable with AF relaying
Proof We use a block Markov encoding scheme. During block $b$, the relay sends The total noise power is $N_{eq}$ = Thus, the AF protocol transforms the channel into a unit-memory inter-symbol interference MAC whose achievable sum rate $(K - 1) r_{AF}$ is given by [25,26]: From the integral $\int_0^{2\pi} \log_2(x + y \cos z) \, dz = 2\pi \log_2 \frac{x + \sqrt{x^2 - y^2}}{2}$ found in ([27], 4.224.9), we obtain thus where $\alpha$, resp. $\beta$, is given in (5), resp. (6). Thus, the total throughput over the mRC, equal to $K L r_{AF}$, is given in Proposition 3.
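The closed form of the integral cited from [27] can be sanity-checked numerically. The sketch below compares a midpoint-rule evaluation with the closed form, assuming x > |y| so that the logarithm is well defined; the function names are hypothetical.

```python
import math


def integral_numeric(x: float, y: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of log2(x + y*cos(z)) over [0, 2*pi]."""
    h = 2 * math.pi / steps
    return h * sum(math.log2(x + y * math.cos((i + 0.5) * h)) for i in range(steps))


def integral_closed_form(x: float, y: float) -> float:
    """Closed form 2*pi*log2((x + sqrt(x^2 - y^2)) / 2), valid for x > |y|."""
    return 2 * math.pi * math.log2((x + math.sqrt(x * x - y * y)) / 2)


difference = abs(integral_numeric(3.0, 2.0) - integral_closed_form(3.0, 2.0))
```

Because the integrand is smooth and periodic, the midpoint rule converges very quickly and the two values agree to within floating-point accuracy.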

Proposition 4
For a symmetric Gaussian mRC with direct links, $L$ clusters of $K$ users each and asymptotically large $g$, with powers $P$ and $P_R$ scaled by $1/g^2$, the following total exchange rate (mRC throughput) is achievable with AF relaying: Note that this is identical to the rate for Gündüz et al.'s model ([17], eq. (20)).
Proof By replacing $P$ by $P/g^2$ and $P_R$ by $P_R/g^2$ in (5) and (6) and taking the limit, we obtain lim

Compress-and-Forward
In CF relaying, the relay quantizes its received signal and transmits the quantized version to all destinations. Since each destination knows its own message, which is correlated with the signal received at the relay, the relay can use Wyner-Ziv compression [28] that exploits the destination's own message and the messages received on the direct links as side information. The destinations then combine their received signal and the compressed version sent by the relay to recover the source messages.
Here, we extend the lattice-based CF scheme proposed in [29] to multiple nodes in a cluster.
Proposition 5 For a symmetric Gaussian mRC with direct links, $L$ clusters of $K$ users each, relay link gain $g > 0$, and time-sharing between clusters, the following total exchange rate (mRC throughput) $R_{CF}$ is achievable with CF relaying using lattice coding: with
Proof The encoding and decoding procedure is based on block Markov techniques.

Encoding
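The codebook for all users is the same and defined as $\mathcal{C}_T = \Lambda_{cT} \cap \mathcal{V}_T$, where $\Lambda_T \subseteq \Lambda_{cT}$ are nested lattices of dimension $n$, $\Lambda_T$ is both Rogers- and Poltyrev-good, and $\Lambda_{cT}$ is Poltyrev-good. To ensure the power constraints, we choose $\sigma^2(\Lambda_T) = LP$ and $\Lambda_{cT}$ such that $\frac{1}{n} \log_2 |\mathcal{C}_T| = r_{CF}$, the user codebook rate.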
In block b, user k sends codeword c Tk ∈ C T as The quantization codebook at the relay is given by The compression rate is thus The sending codebook for the relay is given by where R ⊆ c R are nested lattices of dimension n and R is both Rogers-and Poltyrev-good and c R is Poltyrev-good. To ensure the power constraints, we During block b, the relay sends is the compression index of the signal received during the previous block.
Decoding
At block $b$, the relay receives and quantizes it to where $U_{cQ}$ is a quantization dither uniformly distributed over $\mathcal{V}_{cQ}$. This can be rewritten as where $E_{cQ}$ is the quantization error.
It starts by decoding the quantization index, considering the signals received on the direct link as noise, which is possible if ([15], Lemma 6) Then, it can remove the message sent by the relay, form- is performed in Wyner-Ziv fashion using a MAC. First, user $i$ forms an estimate of the signal received at the relay during block $b - 1$. This is done in a Wyner-Ziv fashion, using $\tilde{Y}_i(b - 1)$ as side information. After this first step, user $i$ has two noisy versions of the sum $g \sum_{k \neq i} X_k(b - 1)$ that can be combined coherently in order to decode all individual messages as over a MAC. During the previous block, user $i$ formed $\tilde{Y}_i(b - 1)$, which is used in block $b$ as side information to estimate the received signal at the relay in the previous block: where the last equality is valid under perfect decoding, , the linear MMSE orthogonality principle requires that α is chosen as α = g(K−1)LP Combining this with the quantization rate constraint, the distortion is given by In order to recover all messages, user $i$ combines the two noisy observations $\hat{Y}_R(b - 1)$ and $\tilde{Y}_i(b - 1)$ as Thus, it can decode all messages if Thus, the total throughput over the mRC, equal to $K L r_{CF}$, is given in Proposition 5.

Proposition 6
For a symmetric Gaussian mRC with direct links, $L$ clusters of $K$ users each, time-sharing between clusters and asymptotically large $g$, with powers $P$ and $P_R$ scaled by $1/g^2$, the following total exchange rate (mRC throughput) is achievable with CF relaying using lattice codes: Note that this is identical to the rate for Gündüz et al.'s model ([17], eq. (22)).
Proof Replacing $P$ by $P/g^2$ and $P_R$ by $P_R/g^2$ in (7) and taking the limit yields the result.

Decode-and-Forward
In DF relaying, users send the superposition of the message from previous block and the current message. The relay decodes all individual new messages sent by the users and broadcasts them back to all users.
Here, we extend the DF scheme using AWGN superposition coding proposed in [30] to multiple users in a cluster.

Proposition 7
For a symmetric Gaussian mRC with direct links, $L$ clusters of $K$ users each and time-sharing between clusters, the total exchange rate (mRC throughput) $R_{DF}$ is achievable with DF relaying:
Proof The proof is based on sliding window encoding and decoding to take advantage of both the direct and the relayed links. During block $b$, the message $w_{k,b} \in \{1, \dots, 2^{n r_{DF}}\}$ of user $k$ is encoded into two codewords: $X_{k1}(w_{k,b})$, which is of power $P_R/K$, and $X_{k2}(w_{k,b})$, which is of power $LP$. During block $b$, user $k$ sends where $w_{k,1}$ is predetermined for all users. At block $b$, the relay receives During the previous block, the relay has decoded all $w_{k,b}$ for $k \in \{1, \dots, K\}$, so it can remove the first term from $Y_R(b)$ and decode all $w_{k,b+1}$ for $k \in \{1, \dots, K\}$ if User $i$ decodes all $w_{k,b}$ using two noisy observations, one of $X_{k2}(w_{k,b})$ and one of $X_{k1}(w_{k,b})$, using $Y_i(b - 1)$ and $Y_i(b)$. The first noisy observation of $w_{k,b}$ (through $X_{k2}(w_{k,b})$) is obtained using $Y_i(b - 1)$. Indeed, since all $w_{k,b-1}$ have already been decoded, one can remove them and form and considering all $w_{k,b+1}$ as additional noise. Thus, perfect decoding at user $i$ is possible as long as Thus, the total throughput over the mRC, equal to $K L r_{DF}$, is given in Proposition 7.

Proposition 8 For a symmetric Gaussian mRC with direct links, $L$ clusters of $K$ users each, time-sharing between clusters and asymptotically large $g$, with powers $P$ and $P_R$ scaled by $1/g^2$, the following total exchange rate (mRC throughput) is achievable with DF relaying:

The achievable total exchange rate with Gündüz et al. 's mRC model and DF relaying is given by
Proof By replacing $P$ by $P/g^2$ and $P_R$ by $P_R/g^2$ in (8) and (9) and taking the limit, we obtain $\lim_{g \to \infty} R_1(\rho) = C(\overline{\rho^2}\, KLP)$ and Since $R_1(\rho)$ is a decreasing function of $\rho$ and $R_2(\rho)$ is constant in $\rho$, the optimum value of $\rho$ is $\rho^* = 0$, which completes the proof.

Partial Decode-and-Forward
The above loss with respect to Gündüz et al.'s mRC model and DF scheme is due to the fact that they use Tuncel's "virtual binning" technique [31], in which the relay capacity is de facto shared among only K − 1 streams, whereas in the coherent DF scheme, the relay has to send all K streams. Thus, we also propose a non-coherent partial Decode-and-Forward (pDF) protocol, in which the relay can use Tuncel's scheme, as in [17].

Proposition 9 For a symmetric Gaussian mRC with direct links, L clusters of K users each and time-sharing between clusters, the total exchange rate (mRC throughput) R pDF is achievable with pDF relaying:
Proof The proof follows directly by extending the TWRC scheme in [13], using Tuncel's "virtual binning" [31] at the decoder. The source uses a superposition codebook with two parts, $X_{k,d}$ of rate $r_d$ and power $\alpha LP$, and $X_{k,r}$ of rate $r_r$ and power $\bar{\alpha} LP$. The relay only decodes $X_{k,r}$, $k = 1, \dots, K$, which requires the MAC constraint to be satisfied. The protocol is block-based. Once the relay has decoded block $b$, it uses a code of rate $K r_r$ to jointly encode $X_{k,r}(b)$, $k = 1, \dots, K$, as in [17,31], and sends the joint codeword in block $b + 1$ as $X_r(b + 1)$. Thanks to "virtual binning," this can be decoded as long as the channel from relay to destination reliably carries rate $(K - 1) r_r$.
Thus, the following MAC constraints have to be satisfied at the destination: Using the received signals from blocks b and b + 1, destination i can decode (X k,d (b), X k,r (b)), k = 1, . . . , i − 1, i + 1, . . . , K, if the above constraints are satisfied. The proof is based on joint typicality decoding from codeword lists formed using (X i,d (b), X i,r (b)) as side information, see [12,13,31] for details.
Thus, the total throughput over the mRC, equal to KL(r d + r r ), is given in Proposition 9.
Proposition 10 For a symmetric Gaussian mRC with direct links, L clusters of K users each, time-sharing between clusters and asymptotically large g, with powers P and P R scaled by 1/g 2 , the following total exchange rate (mRC throughput) is achievable with pDF relaying: R g→∞ pDF = min C(KLP),

This equals the total exchange rate for Gündüz et al. 's mRC model with DF relaying.
Proof By replacing P by P/g 2 and P R by P R /g 2 in (10) and taking the limit. The result follows by observing that α = 1 maximizes the first term.

Compute-and-Forward
For this subsection, we assume that there are only two users in each cluster. This part is based on [15], where a combination of CoF and DF has been proposed for the TWRC with unitary links between the relay and the two destination nodes.
The main advantage of CoF is the ability to compute directly the sum of the messages at the relay instead of decoding both messages individually.
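The benefit of decoding a sum rather than individual messages can be illustrated with a noiseless toy model. The sketch below replaces the lattice modulo-sum by addition of integers modulo q, which is an assumption made only to keep the example short; it is not the coding scheme of [15], and the function names are hypothetical.

```python
def relay_compute(t1: int, t2: int, q: int) -> int:
    """Relay output: the modulo-q sum of the two message symbols (noiseless toy model)."""
    return (t1 + t2) % q


def recover_other(relayed_sum: int, own: int, q: int) -> int:
    """A user subtracts its own symbol from the relayed sum to obtain the other user's symbol."""
    return (relayed_sum - own) % q


q = 7
t1, t2 = 5, 4
s = relay_compute(t1, t2, q)
decoded_at_user1 = recover_other(s, t1, q)  # user 1 recovers user 2's symbol
decoded_at_user2 = recover_other(s, t2, q)  # user 2 recovers user 1's symbol
```

The relay forwards a single symbol that serves both users, instead of forwarding both messages separately.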

Proposition 11 For a symmetric Gaussian mRC with direct links, L clusters of K = 2 users each and time-sharing between clusters, the total exchange rate (mRC throughput) R CoF is achievable with CoF relaying:
Proof The proof follows the one in [15].

Proposition 12
For a symmetric Gaussian mRC with direct links, $L$ clusters of $K = 2$ users and asymptotically large $g$, with powers $P$ and $P_R$ scaled by $1/g^2$, the following total exchange rate (mRC throughput) is achievable with CoF relaying using lattice codes: Note that this is identical to the rate for Gündüz et al.'s model ([17], eq. (31)).
Proof By replacing $P$ by $P/g^2$ and $P_R$ by $P_R/g^2$ in (11), and taking the limit, the result is straightforward.

Weakening the cut-set bound
We derive an upper bound on the cut-set bound which will be useful to analyze the performance of the proposed protocols. The goal is to obtain a bound based only on the system parameters and not on the optimization parameter ρ.

Proposition 13 For a symmetric Gaussian mRC with direct links, L clusters of K users each and time-sharing between clusters, the cut-set bound on the exchange capacity can be upper-bounded by:
Proof Recall that $R_{CSB} = \max_{\rho \in [0,1]} \min\{f_1(\rho), f_2(\rho)\}$, where $f_1(\rho)$, resp. $f_2(\rho)$, is given in (3), resp. (4).
The value f 1 (ρ) is an upper bound on the inner minimization. Since f 1 is strictly decreasing, setting ρ = 0 yields the desired bound:

Comparison between cut-set and proposed schemes
In this section, we characterize the gaps to the cut-set bound of the proposed schemes. In particular, we prove that the proposed schemes can achieve a finite gap that is independent of the transmit powers and number of clusters of the system. Thanks to the upper bound on the cut-set bound, the results in this section are more general than those by Su et al. [19], which are restricted to certain relay power regimes. We also prove that the AF protocol achieves a finite gap to the CF protocol.

Proposition 14
For a symmetric Gaussian mRC with direct links, $L$ clusters of $K$ users each and time-sharing between clusters, the CF protocol achieves rates within $\frac{K}{2(K-1)} \log_2(1 + g^2)$ bits of the exchange capacity.

Proposition 15
For a symmetric Gaussian mRC with direct links, $L$ clusters of $K$ users each and time-sharing between clusters, the AF protocol achieves rates within $\frac{K}{2(K-1)} \left(1 + \log_2(g^2 + 1)\right)$ bits of the exchange capacity.
Proof Recall that It can be shown that $\alpha^2 - \beta^2$ is a convex function of $P_R$ and that $\forall P_R \ge 0$, Further, one can show that $\alpha$ is an increasing function of $P_R$ and that $\forall P_R \ge 0$, $1 + (K - 1)LP \le \alpha$.
Using Proposition 13, we obtain The second inequality is obtained by observing that $\gamma$ is strictly increasing in $P$ and taking the limit $P \to \infty$ of the argument of the logarithm.

Comparison between Compress-and-Forward and Amplify-and-Forward
Proposition 16 For a symmetric Gaussian mRC with direct links, $L$ clusters of $K$ users each and time-sharing between clusters, the AF protocol achieves rates within $\frac{K}{2(K-1)} \log_2\left(2(1 + g^2)\right)$ bits of the CF protocol.
Proof We first study the achievable rate of CF as a function of $P_R$. It can be shown that it is an increasing function upper bounded by $\frac{K}{2(K-1)} \log_2\left(1 + (g^2 + 1)(K - 1)LP\right)$. Thus, $R_{CF} - R_{AF} \le \frac{K}{2(K-1)} \log_2\left(2(g^2 + 1)\right)$. The proof follows the same steps as for Proposition 15.
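The three gap expressions of Propositions 14 to 16 depend only on K and g, which makes them easy to tabulate. The sketch below simply evaluates the stated formulas; the function names are hypothetical.

```python
import math


def gap_cf_to_csb(K: int, g: float) -> float:
    """Proposition 14: gap (in bits) between CF and the exchange capacity."""
    return K / (2 * (K - 1)) * math.log2(1 + g * g)


def gap_af_to_csb(K: int, g: float) -> float:
    """Proposition 15: gap (in bits) between AF and the exchange capacity."""
    return K / (2 * (K - 1)) * (1 + math.log2(g * g + 1))


def gap_af_to_cf(K: int, g: float) -> float:
    """Proposition 16: gap (in bits) between AF and CF."""
    return K / (2 * (K - 1)) * math.log2(2 * (1 + g * g))


for K in (2, 4, 8):
    print(K, gap_cf_to_csb(K, 3.0), gap_af_to_csb(K, 3.0), gap_af_to_cf(K, 3.0))
```

Since log2(2(1 + g^2)) = 1 + log2(1 + g^2), the bounds of Propositions 15 and 16 take the same numerical value. Neither the transmit powers nor the number of clusters L appears in any of the three expressions.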

Examples
Let us first assume that the relay has a better observation of the transmitted messages than the users, thus we set the gain g on links between the relay and the users to be larger than 1 (this can be justified by more powerful hardware at the relay such as better antennas and/or higher power and less noise).
In Fig. 2, we plot the cut-set bound and the achievable total exchange rate for the mRC with L = 1 cluster and P = 30 dB as a function of K. We consider two cases: either the relay has constant power P R = P, or its power scales with the number of users as P R = KP. In both cases, for a moderate number of users per cluster, CF gives the best performance, within a finite gap to the cut-set bound. When the number of users increases, DF yields the best performance. For both power allocation setups, AF achieves rates within a finite gap to CF. We also observe that DF approaches the cut-set bound when the relay power does not scale with the number of users. Below 5 users, pDF outperforms DF, since it better exploits the direct links. For 5-7 users, pDF and DF perform equally. Above 7 users, pDF achieves lower rates than DF for P R = P. This is due to the fact that DF uses coherent signaling, which is beneficial in this "low relay power for many users" regime.
Fig. 2 Total exchange rate vs. K, P = 30 dB, g = 3, L = 1. Comparison of two power allocation setups: either the relay scales its power with the number K of users per cluster or not
In Fig. 3, we plot the cut-set bound and the achievable symmetric rate for the mRC with L = 8 clusters of K = 2 users as a function of P. We can note that for the chosen g, CoF gives the best performance among the proposed schemes, and that the gap between the cut-set bound and CoF vanishes for large power P. One can again note that DF yields poor performance (K = 2) and that CF achieves rates within a finite gap of the cut-set bound. For larger powers P, pDF does better than DF by only using the direct links.
Figure 4 represents the total exchange rate as a function of the gain g for a fixed power P = 30 dB and K = 4 users per cluster, with and without power scaling at the relay. In both cases, the total exchange rate increases with g. Note that for DF, the total exchange rate is unaffected by relay power scaling, since the rate is limited by the decoding at the relay. For small gains g, pDF does better than DF by only using the direct links.
Fig. 3 Total exchange rate vs. P, P R = 2LP, g = 5, K = 2, L = 8
Let us now examine the influence of the direct links. To have a fair comparison with the schemes proposed by Gündüz et al., let us assume that g = 1. Figures 5 and 6 display the total exchange rate as a function of the number of users K per cluster for the multiway relay channel with and without direct links for relay power P and KP, respectively. First note that if the relay does not scale its power with the number of users, adding the direct links increases the achievable rate for all relaying schemes (even for DF when the number of users K is large enough to be efficient), see Fig. 5; whereas when the relay scales its power with the number of users (Fig. 6), only CF achieves higher rates for the model with direct links. (AF and DF have performance close to the case without direct links.) The pDF protocol needs direct links to operate and achieves rates that match the CSB without direct links for this choice of parameters.

mRC without time-sharing between clusters
In this section, we study the Gaussian mRC with direct links when we relax the assumption that clusters are operated in a time-sharing fashion (as in [17]). In this case, the received signals are

At user $k$ of cluster $l$: $Y_{l,k} = \sum_{i=1,\, i \neq k}^{K} X_{l,i} + g X_R + Z_{l,k}$

At the relay: $Y_R = g \sum_{l=1}^{L} \sum_{k=1}^{K} X_{l,k} + Z_R$

where all users have a power constraint $P$ and the relay a power constraint $P_R$.
As in Section 4, we first derive an upper bound on the capacity using a cut-set argument and then propose various lower bounds using AF, DF or CF.

Cut-set bound
Proposition 17 For a symmetric Gaussian multiway relay channel with direct links, L clusters of K users each, with restricted encoders and without time-sharing, the CSB on the exchange capacity is given by:
Proof Along the lines of Proposition 1.

Decode-and-Forward
In the following, two versions of DF are proposed. In both versions, the relay decodes all messages from all clusters, before relaying them. In the first version, relayed messages from other clusters are treated as noise when recovering the messages for a given cluster, whereas in the second one, they are first decoded in order to remove them, before decoding the messages for a given cluster.

Proposition 18
For a symmetric Gaussian multiway relay channel with direct links, L clusters of K users each, for restricted encoders and without time-sharing (relayed messages from users in other clusters are treated as noise), the following total exchange rate (mRC throughput) is achievable with DF relaying: 1] min C g 2 LKPρ 2 ,  Proof The proof follows along the lines of the proof of Proposition 7. The main differences are the powers of each part of the codeword; here, X l,k,1 is of power P R /(KL) and X l,k,2 is of power P. Relayed messages from users in other clusters are treated as noise when decoding the K − 1 messages of a given cluster.

Proposition 19
For a symmetric Gaussian multiway relay channel with direct links, L clusters of K users each, for restricted encoders and without time-sharing (relayed messages from users in other clusters are decoded first to reduce the noise at each user), the following total exchange rate (mRC throughput) is achievable with DF relaying:
Proof The proof follows along the lines of the proof of Proposition 7. The main difference is the power of each part of the codeword: here, X_{l,k,1} has power P_R/(KL) and X_{l,k,2} has power P. The first rate constraint in (13) corresponds to the MAC constraint at the relay, where all KL codewords X_{l,k,1} are decoded. The second rate constraint corresponds to the decoding of the K − 1 codewords X_{l,k,2} after all X_{l,k,1} have been removed (the first term in the min corresponds to the decoding of all (L − 1)K codewords X_{l,k,1} of the other clusters, and the second one to the decoding of the K − 1 codewords X_{l,k,1} of a given cluster once these (L − 1)K codewords have been removed).

Amplify-and-Forward
In the following, the relay amplifies the signals received from all clusters, and relayed messages from other clusters are either treated as noise when recovering the messages of a given cluster or decoded first.

Proposition 20
For a symmetric Gaussian multiway relay channel with direct links, L clusters of K users each, for restricted encoders and without time-sharing (relayed messages from users in other clusters are treated as noise), the following total exchange rate (mRC throughput) is achievable with AF relaying: with
$$\alpha = 1 + \frac{(K-1)P\left[g^2\left(KLP + g^2 P_R\right) + 1\right]}{g^2\left(KLP + P_R + g^2 (L-1)KP\,P_R\right) + 1}, \qquad \beta = \frac{2(K-1)P\,g^2\sqrt{P_R\left(g^2 KLP + 1\right)}}{g^2\left(KLP + P_R + g^2 (L-1)KP\,P_R\right) + 1}.$$
Proof The proof follows along the lines of the proof of Proposition 3 using a unit-memory inter-symbol MAC of K − 1 users. The scaling factor at the relay equals $\sqrt{P_R/(g^2 KLP + 1)}$. Relayed messages from users in other clusters are treated as noise when decoding the K − 1 messages of a given cluster.
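As a numerical sanity check, the following Python sketch evaluates α and β for the case where other clusters are treated as noise. It then compares a closed form of the integral (1/2π)∫ ½ log2(α + β cos ω) dω, which typically arises for a unit-memory inter-symbol interference channel, with a midpoint-rule approximation. Reading the expressions for α and β as fractions, with a square root in β, is an assumption that is consistent with the stated relay scaling factor. The function names and the parameter values (P = 1, P_R = KLP) are illustrative only, and the rate expression of the proposition itself is not reproduced here.

```python
import math


def af_alpha_beta_noise(K, L, P, P_R, g):
    """alpha and beta when relayed signals of other clusters are treated as noise.

    Assumption: the flattened expressions are fractions sharing one denominator,
    and beta contains sqrt(P_R * (g^2 K L P + 1)).
    """
    denom = g**2 * (K * L * P + P_R + g**2 * (L - 1) * K * P * P_R) + 1
    alpha = 1 + (K - 1) * P * (g**2 * (K * L * P + g**2 * P_R) + 1) / denom
    beta = 2 * (K - 1) * P * g**2 * math.sqrt(P_R * (g**2 * K * L * P + 1)) / denom
    return alpha, beta


def isi_integral_closed_form(alpha, beta):
    """Closed form of (1/2pi) * int_0^{2pi} 0.5*log2(alpha + beta*cos w) dw, alpha > |beta|."""
    return 0.5 * math.log2((alpha + math.sqrt(alpha**2 - beta**2)) / 2)


def isi_integral_numeric(alpha, beta, n=2000):
    """Midpoint-rule approximation of the same integral."""
    step = 2 * math.pi / n
    total = sum(
        0.5 * math.log2(alpha + beta * math.cos((i + 0.5) * step)) for i in range(n)
    )
    return total * step / (2 * math.pi)


if __name__ == "__main__":
    # Hypothetical operating point: K = L = 4, g = 3, P = 1, P_R = K*L*P.
    alpha, beta = af_alpha_beta_noise(K=4, L=4, P=1.0, P_R=16.0, g=3.0)
    print(alpha, beta)
    print(isi_integral_closed_form(alpha, beta), isi_integral_numeric(alpha, beta))
```

The closed form only requires α > |β|, which holds here because α ± β equals one plus a non-negative signal-to-noise term.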

Proposition 21
For a symmetric Gaussian multiway relay channel with direct links, L clusters of K users each, for restricted encoders and without time-sharing (relayed messages from users in other clusters are decoded first to reduce the noise at each user), the following total exchange rate (mRC throughput) is achievable with AF relaying: with
$$\alpha = 1 + \frac{(K-1)P\left[g^2\left(KLP + g^2 P_R\right) + 1\right]}{g^2\left(KLP + P_R\right) + 1} \quad \text{and} \quad \beta = \frac{2(K-1)P\,g^2\sqrt{P_R\left(g^2 KLP + 1\right)}}{g^2\left(KLP + P_R\right) + 1}.$$
Proof The proof follows along the lines of the proof of Proposition 3 using a unit-memory inter-symbol MAC of K − 1 users. The scaling factor at the relay equals $\sqrt{P_R/(g^2 KLP + 1)}$. Relayed messages from other clusters are first decoded, yielding the second rate constraint in (14), and then removed. (This can be seen as successive decoding: even if users are not interested in these messages, they can decode them in order not to treat them as additional noise.) The K − 1 messages of a given cluster are then decoded using the (K − 1)-user MAC, yielding the first rate constraint in (14).
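To illustrate where α and β come from, the sketch below rebuilds them from a two-tap channel and compares the result with the closed-form fractions. The signal model is an assumption, since it is not restated in this section: unit-gain direct links, user-relay gain g, unit-variance noise at every node, and a relay that forwards its previous received symbol scaled by a. Once the other clusters' signals are removed, each user then sees the sum of its K − 1 cluster peers through taps 1 (direct path) and g²a (relayed path, one symbol late), with noise variance 1 + g²a². All function names are hypothetical.

```python
import math


def af_decode_first_from_model(K, L, P, P_R, g):
    """alpha, beta built from the assumed two-tap ISI model (interference removed)."""
    a = math.sqrt(P_R / (g**2 * K * L * P + 1))  # relay scaling factor
    h0, h1 = 1.0, g**2 * a                       # direct tap, relayed (delayed) tap
    noise = 1 + (g * a) ** 2                     # local noise + forwarded relay noise
    snr = (K - 1) * P / noise
    # 1 + snr * |h0 + h1 e^{-jw}|^2 = alpha + beta * cos(w)
    alpha = 1 + snr * (h0**2 + h1**2)
    beta = 2 * snr * h0 * h1
    return alpha, beta


def af_decode_first_closed_form(K, L, P, P_R, g):
    """alpha, beta read as fractions (square root in beta is an assumption)."""
    denom = g**2 * (K * L * P + P_R) + 1
    alpha = 1 + (K - 1) * P * (g**2 * (K * L * P + g**2 * P_R) + 1) / denom
    beta = 2 * (K - 1) * P * g**2 * math.sqrt(P_R * (g**2 * K * L * P + 1)) / denom
    return alpha, beta


if __name__ == "__main__":
    # Hypothetical operating point: K = L = 4, g = 3, P = 1, P_R = K*L*P.
    print(af_decode_first_from_model(4, 4, 1.0, 16.0, 3.0))
    print(af_decode_first_closed_form(4, 4, 1.0, 16.0, 3.0))
```

Under the same assumed model, treating the other clusters as noise would only change the noise variance to 1 + g²a²(1 + g²(L − 1)KP), which accounts for the extra term g²(L − 1)KP·P_R in the denominators of the preceding proposition.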

Compress-and-Forward

Proposition 22
For a symmetric Gaussian multiway relay channel with direct links, L clusters of K users each, relay link gain g > 0, for restricted encoders and without time-sharing (relayed messages from users in other clusters are treated as noise), the following total exchange rate (mRC throughput) is achievable with CF relaying:
Proof The proof follows along the lines of the proof of Proposition 5. The main difference is the second moment of the shaping lattice used for the quantization. Messages from users in other clusters are treated as noise when decoding the K − 1 messages of a given cluster.

Comparison of mRC with and without time-sharing
One interesting question concerning the multiway relay channel is whether it is more advantageous to perform time-sharing among the clusters or not. Figures 7 and 8 present numerical results for L = 4, K = 4, and g = 3 when P_R = KLP. Figure 7 presents the total exchange rate obtained when no time-sharing is performed, whereas Fig. 8 compares the total exchange rates obtained with and without time-sharing.
The cut-set bound without time-sharing is much higher than with time-sharing. For this scenario, only CF seems to perform better without time-sharing; all other protocols achieve rates that are either of the same order of magnitude or lower. Note that CF without time-sharing is better than any solution involving time-sharing, since it operates above the cut-set bound with time-sharing.
For this scenario, with P_R = KLP, DF with the additional signals decoded first gives results close to those obtained with time-sharing and clearly outperforms the version where the additional signals are treated as noise. The same observation holds for the two versions of AF: decoding the additional signals first yields a higher total exchange rate. However, when P_R = P, treating the additional signals as noise gives better results for both DF and AF. Note that DF with decoding of the additional signals achieves the same total exchange rate as DF with time-sharing whenever, in both cases, decoding at the relay is the bottleneck (the perfect decoding constraint at the relay is the same in these two cases).
CF seems to be the best-performing protocol: with time-sharing, it performs close to the cut-set bound, while without time-sharing, it is the only protocol to maintain a constant, albeit large, gap to the cut-set bound.
Overall, only CF seems able to achieve rates close to the cut-set bound. All protocols have difficulty dealing with the additional signals that are intended for other clusters and broadcast by the relay. The major issue is that these signals have a higher power than the signals that the users of a cluster want to recover.
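The power imbalance can be made concrete with a small calculation for AF. The sketch below assumes unit-gain direct links, user-relay gain g, unit-variance noise, and a relay scaling factor a with a² = P_R/(g²KLP + 1); the value P = 1 and the function name are illustrative choices, not taken from the simulations. It compares the received power of the desired signals (direct and relayed copies of the K − 1 cluster peers) with the power of the relayed signals of the (L − 1)K users of other clusters.

```python
def relayed_powers_af(K, L, P, P_R, g):
    """Received signal powers at one user under the assumed AF model.

    Returns (desired_direct, desired_relayed, interference).
    """
    a2 = P_R / (g**2 * K * L * P + 1)   # squared relay scaling factor
    relay_path_gain = g**4 * a2         # power gain user -> relay -> user
    desired_direct = (K - 1) * P
    desired_relayed = relay_path_gain * (K - 1) * P
    interference = relay_path_gain * (L - 1) * K * P
    return desired_direct, desired_relayed, interference


if __name__ == "__main__":
    # Hypothetical operating point: K = L = 4, g = 3, P = 1, P_R = K*L*P.
    direct, relayed, interference = relayed_powers_af(K=4, L=4, P=1.0, P_R=16.0, g=3.0)
    print(direct, relayed, interference)
    print(interference / (direct + relayed))
```

In this model, the ratio of interference power to relayed desired power is (L − 1)K/(K − 1) regardless of P, P_R, and g, so the imbalance grows with the number of clusters.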

Discussion and conclusions
In this paper, we considered an extension of the multiway relay channel proposed by Gündüz et al. in [17]. In the considered setup, multiple clusters of users with direct intra-cluster links communicate with the help of a single relay. Each user wishes to recover all messages within its cluster. We extended standard schemes such as CF, DF, and AF for this setup using results proposed for the Gaussian relay channel based on lattices [14,15,29] or standard AWGN coding/decoding [24,30]. We characterized the achievable total exchange rate for all these protocols with or without time-sharing among the clusters. When there are only two users per cluster and time-sharing is performed, we have also proposed an extension of CoF. Under the time-sharing assumption, we also studied the gaps to the cut-set bound these protocols can achieve, and proved that they only depend on the number of users K and on the gain g of user-relay links, and not on the transmit power. We also proved that AF performs at a finite gap from CF.

Fig. 8 Comparison of the proposed protocols with (w/ TS) or without time-sharing (interference (IF) is treated as noise or decoded) among the clusters, K = 4, L = 4, g = 3, P_R = KLP
For very large user-relay gain g, i.e., when the model becomes that of [17] up to scaling, our results reduce to the ones obtained by Gündüz et al.
We also noted that, for the general case without time-sharing, only CF performs clearly better than with time-sharing. The other protocols perform either close to their performance with time-sharing or worse. This degradation is due to the interference caused by signals intended for other clusters, which the relay broadcasts to all clusters.
So far, we only characterized the fully symmetric case. First results based on random coding arguments have been recently proposed for the asymmetric (in terms of link gains and powers) multiway relay channel with one cluster in [19]. The study of practical coding schemes for asymmetric networks (in terms of link gains, powers, and number of users per cluster) remains an open problem, as does the characterization of the entire achievable rate region for the multiway relay channel with L clusters. Another open issue regards full-duplex with a single send/receive antenna: in some cases, results obtained for full-duplex protocols can be translated to half-duplex protocols [5]. Moreover, this work only considered single-antenna nodes; having multiple antennas at the relay (and possibly the users) would achieve higher rates.