Primal Decomposition-Based Method for Weighted Sum-Rate Maximization in Downlink OFDMA Systems

. We consider the weighted sum-rate maximization problem in downlink Orthogonal Frequency Division Multiple Access (OFDMA) systems. Motivated by the increasing popularity of OFDMA in future wireless technologies, a low complexity suboptimal resource allocation algorithm is obtained for joint optimization of multiuser subcarrier assignment and power allocation. The algorithm is based on an approximated primal decomposition-based method, which is inspired from exact primal decomposition techniques. The original nonconvex optimization problem is divided into two subproblems which can be solved independently. Numerical results are provided to compare the performance of the proposed algorithm to Lagrange relaxation based suboptimal methods as well as to optimal exhaustive search-based method. Despite its reduced computational complexity, the proposed algorithm provides close-to-optimal performance.

Two main radio resource allocation (RRA) problems have been addressed in the literature. The first ones, consist of maximizing an increasing function of the user rates [5][6][7][8][9][10][11][12][13] subject to different power constraints, whilst the second ones consist of minimizing the transmit power subject to constraints on the minimum user rates [13][14][15][16][17]. A suboptimal greedy method for maximizing the smallest rate among the users has been proposed in [5]. A branch-andbound based algorithm for sum-rate maximization has been proposed in [6]. However, in practice the branch-and-bound method is still too computationally heavy for finding the global solution [23]. Computationally efficient algorithms for maximizing the sum-rate have been developed in [7,8]. A suboptimal method for characterizing the achievable rate region of the two-users frequency division multiple access (FDMA) channel have been presented in [10]. The general weighted sum-rate maximization problem has been used in [9] to characterize FDMA capacity region for a broadcast channel. Due to the nontractability of the original problem, a modified convex problem formulation, FDMA-time division multiple access (TDMA) was proposed (i.e., time sharing among users). Authors also considered algorithms to obtain optimal and suboptimal solutions to a particular variation of the original problem where the total power is evenly divided among the used set of subcarriers. Lagrangian relaxationbased approaches to obtain suboptimal algorithms for the weighted sum-power minimization problem has been introduced in [13,14]. A greedy algorithm is proposed in [15] to obtain and approximate solutions for the same problem. Recently, a Lagrangian relaxation-based method has been proposed in [13] for the weighted sum-rate maximization problem. A bisection search method was used to update the dual variable until the algorithm converges. Due to the nonconvexity of the optimization problem the optimality of the algorithm is not guaranteed.
In this paper, we propose an alternative method based on the primal decomposition technique [24,25]. Using numerical simulations, its performance is compared to Lagrangian relaxation based algorithm [13] as well as to the optimal exhaustive search algorithm. Numerical results show that the proposed algorithm converges very fast. Although, the optimality of the final value cannot be guaranteed due to the nonconvexity of the problem, the simulations show that rate-region achieved by the proposed algorithm exactly matches with the one obtained using optimal exhaustive search algorithm.
The rest of the paper is organized as follows. In Section 2 we present the system model and problem formulation. The proposed algorithm is presented in Section 3 and convergence properties are discussed in Section 4. Section 5 compares the complexity of the proposed algorithm to Lagrangian relaxation-based algorithm in [13] as well as to the optimal exhaustive search based method. The numerical simulation results are presented in Sections 6, and 7 concludes our paper.
Notations. |x| denotes the absolute value of complex number x. x T represents the transpose of vector x and e i denotes the ith standard unit vector. CN (c, σ 2 ) stands for circularly symmetric complex Gaussian distribution with mean c, variance σ 2 /2 per dimension. For any real number r, [r] + denotes max{0, r}.

System Model and Problem Formulation
Consider a single antenna (Note that, under OFDMA assumption, extension to the multiple antennas case is straightforward [26].) OFDMA downlink transmission with K users and M subcarriers as shown in Figure 1. The signal received by user k in subcarrier m can be expressed as where k is the user index, m is the subcarrier index, S k denotes the set of subcarriers allocated to user k, x km is the transmitted signal, p km is the power allocated, h km is channel frequency response, and w km is the received noise. We assume that h km is time-invariant and its value is available at the base station. The noise samples are assumed to be independent and identically distributed as w km ∼ CN (0, σ 2 km ). We denote by c km = |h km | 2 /σ 2 km the channel signal-to-noise ratio (SNR) of kth user in subcarrier m and by β k the weight associated with the rate of user k. The weighted sum-rate maximization problem subject to a sum-power constraint P T can be formulated as [13] maximize K k=1 m∈Sk β k log 2 1 + p km c km subject to K k=1 m∈Sk where variables are p km and S k . It is also useful to introduce a virtual system where each subcarrier can be used by all users in the same time. This results in a general OFDMA downlink channel where the signal received by user k in subcarrier m is given by and the second term in the right-hand side represents the interference from other users. Assuming independent channel coding across users at the transmitter and independent decoding at receivers, the weighted sum-rate maximization problem for the virtual system can be formulated as where the optimization variables are p km . The constraints associated with orthogonal subcarrier allocations in problem (2) have been dropped out and the interference among users allocated to the same subcarrier is reflected in the objective.
Here we can make several observations. First, any solution of problem (4) is such that the second constraint in problem (2) is automatically satisfied, for reasons that will be explained in the beginning of Section 3.2. In other words, any solution of problem (4) is feasible for problem (2). Moreover, at any of these solutions the objective function of problem (4) will be exactly the same as the objective function of problem (2). Based on these observations it can be concluded that any solution of the auxiliary problem (4) is a solution for the original problem (2) as well.
The original problem (2) is combinatorial and it requires exponential complexity to find a global optimum. Although problem (4) is still nonconvex it is noncombinatorial. Thus, in the following, we focus on solving problem (4) instead of solving the original problem (2). A similar approach has been used in [7] to solve the (nonweighted) sum-rate maximization problem, that is, for the particular case β k = 1, k = 1, . . . , K. However, the methods proposed there do not apply to the general case of arbitrary weights, for reasons that will become clear in Section 3.3. Due to the nonconvexity of problem (4) finding the global optimum is intractable. Thus, an approximative method inspired from the primal decomposition technique is presented in Section 3.

Primal Decomposition.
To reveal the complicating constraints [24], we introduce M new variables p m = K k=1 p km , m = 1, . . . , M, and reformulate the problem (4) as follows: where the optimization variables are p km and p m . Note that p m represents the total power on subcarrier m. Treating p m as complicating variables, problem (5) can be decomposed [24,25] into a master problem and M subproblems, one subproblem for each subcarrier m = 1, . . . , M. For a given subcarrier m, the subproblem is given by where variables are p km , k = 1, . . . , K. The master problem can be expressed as where variables are p m and f m (p m ) represents the optimal value of subproblem (6) for fixed p m .

Algorithm Derivation.
Let us denote by P the feasible set of problem (6). Note that subproblem (6) is not a convex optimization problem (Since we maximize a convex function.). However, its objective function is convex with respect to (w.r.t.) optimization variables p 1m , . . . , p Km , its feasible set is a nonempty convex polyhedral set (i.e., a simplex [27]) and its objective is bounded above on P . Thus, by following the approach of [7, Section III], from [28, Corollary 32.3.4] (If a convex function f is bounded above on a convex set X ⊆ dom f , then the maximum of f relative to X is attained at one of the finitely many extreme points of X.) it follows that the solutions of subproblems (6) must be achieved at one of the vertices of the polyhedral set P . Consequently, the solutions of the M subproblems can be expressed as where j m represents the index of the user allocated to mth subcarrier, that is, Solution (8) confirms that, even though in subproblems (6) all users are allowed to use all subcarriers, the optimal power allocation consists of allocating only one user per subcarrier. This guarantees that solution (8) is feasible for the original problem (2). By substituting (8) and (9) in the objective of (6), f m (p m ) can be expressed as

EURASIP Journal on Wireless Communications and Networking
We note that the index j m depends on p m according to (9) and the function f m (p m ) is the pointwise maximum of a set of concave functions. Therefore f m (p m ) is not a concave function w.r.t. p m [27] in general. Thus, standard convex optimization tools (e.g., subgradient-based methods) cannot be directly applied to solve master problem (7). We propose an iterative method, where at each iteration i we first solve the M subproblems (6) to obtain an user-tocarrier allocation j (i) m for a given subcarrier power allocation Then, the objective of the master problem is approximated by the following lower bound where P denotes the feasible set of the master problem (7). The lower bound is concave w.r.t. p 1 , . . . , p M and the solution of the approximated master problem can be found by multilevel waterfilling algorithm [9]. The resulting solution is used as subcarrier power allocation for the next iteration. The proposed algorithm can be summarized as follows.
(3) solve the following approximation of master problem and return the solution p m ; let p m (i+1) = p m .
(4) check a stopping criteria; if it is satisfied EXIT, otherwise let i = i + 1 and go to step (2).
The solution of problem (13) solved at step (3) is given by the following multilevel waterfilling expression [9]: where L is chosen such that the power constraint is satisfied with equality, that is,

Particularization to the Sum-Rate Maximization.
The problem of the sum-rate maximization (i.e., β k = 1 for all k = 1, . . . , K) in downlink OFDMA systems is solved in [7, Section III]. The solution method is exactly equivalent to only one iteration of the APD algorithm. Unlike the general weighted sum-rate maximization, in which user weights β k 's are different, in the sum-rate maximization (i.e., β k = 1 for all k = 1, . . . , K) the index j m will not depend on p m according to (9). Thus, by using (10) and (11) the function f m (p m ) can be found as f m (p m ) = log 2 (1 + p m c jmm ) = log 2 (1 + p m · max k c km ) which is concave w.r.t. p m (recall that the function f m (p m ) is not concave w.r.t. p m when the user weights β k 's are different). As a result, the inequality given in (12) holds with equality and solving problem (7) gives the optimal subcarrier power allocation [7, Section III].

Convergence Behavior and Exit Criterion
In this section, we start by investigating the monotonicity of the proposed algorithm. Then we provide a specific exit criterion which certifies that algorithm converged to a fixed power and subcarrier allocation followed by a simple graphical illustration.

Monotonic Behavior.
The following theorem states the monotonic behavior of the proposed algorithm.

Theorem 1. For any iteration
that is, the proposed APD method is an ascent algorithm.
Proof. From (10) and (11), it follows that the solution of (6) in iteration i is given by, Now we can write the following chain of relations, where the first inequality follows from (13), the second one follows trivially from the maximization over the users, and the last equality follows from (11) and (10), respectively.

Exit Criterion.
The exit criterion for such ascent algorithm is typically chosen heuristically, for example, the increasing in the objective between two successive iterations is below a certain predefined threshold. However, for the proposed algorithm we are able to find an exit criterion which certifies that algorithm converged to a fixed power and subcarrier allocation and further improvement is not possible. This is described by the following theorem.

Theorem 2.
If at iteration n +1, (n ≥ 1) we have j (n) m = j (n+1) m , m = 1, . . . , M then the following holds: That is, the algorithm converges to a fixed power and subcarrier allocation.
Proof. Since c km 's are continuous random variables, the probability to have multiple solutions for (9) is zero. Thus, in the following we assume that j m given by (9) is unique (Equation (9) has multiple solutions if and only if c km = c lm for some l / = k. When p m = 0 we assign j m any arbitrarily user index.).
Note that the objective function of (13) is strictly concave. Thus it has a unique solution [27]. Therefore, for all m, j (n) . Since (9) has a unique solution as well, p (n+1) , item (1) follows directly by induction. Furthermore, item (2) follows from item (1) by the uniqueness of the solution of problem (13). Finally, item (3) follows trivially from item (2).
Thus the exit criterion checks if the subcarrier allocation between two successive iterations remains unchanged. Such point is a local optimum (possible global) in the sense that the objective cannot be increased by changing the power allocation or subcarrier allocation only.
As a specific example, consider the simple OFDMA system with two subcarriers (i.e., m = 1, 2). By performing the variable transformations p 1 = (1 − t)P T and p 2 = tP T , t ∈ [0, 1], we can express the variation of 2 m=1 f m (p m ) on P as, which is plotted in Figure 2. According to Figure 2(a) global optimal is achieved at the iteration (i + 3). Achieving global optimality is not always possible because, quasiconcavity [27] of h(t) cannot be guaranteed with random channel SNR, c km . Consequently the APD algorithm can converge to a local optimal solution as shown in Figure 2(b).

Complexity Analysis
In this section, we analyze and compare the computational complexity of the proposed APD algorithm to Lagrangian  relaxation-based algorithm [13] as well as to the optimal exhaustive search algorithm. With K users and M subcarriers, altogether we have K M user-subcarrier combinations. Therefore finding optimal subcarrier and power assignment requires K M searches. Combined with multilevel waterfilling at each instance of subcarrier assignment, O(MK M ) operations are required to find the solution.

(a) Convergence to the global optimal solution
The algorithm proposed in [13] for the weighted sumrate maximization problem requires O(MK) operations to obtain a suboptimal solution. The proposed APD algorithm described in Section 3 requires O(MK) operations in step (2) and O(Mlog 2 M) operations (This is the number of operations required in ordering.) in step (3). In practice, it is reasonable to assume that K log 2 M (The assumption is reasonable since the number of users simultaneously serviced by the system can be very large. For example, in a Wi-Max system M can be up to 2048 [29]. However, the value of log 2 M will not become very large (in a WiMax system log 2 M = 11 at most).). Therefore the complexity of the APD algorithm can be approximated by O(MK).

Numerical Results
The performance of the proposed APD algorithm is compared to the dual decomposition-based algorithm proposed in [13], denoted as WSRmax, as well as to the optimal algorithm based on exhaustive search. The WSRMax algorithm uses a bisection search method to update the dual variable λ [13, Section IV]. For initializing the bisection search interval, 6 EURASIP Journal on Wireless Communications and Networking [λ min , λ max ], we exploit the fact that the subgradient of the dual function can be analytically computed. Since the dual function is convex, the sign of its subgradients changes as we pass through the minimum point of the dual function [13, equation (11)]. Therefore, we use a grid search (with step size 1) to identify the interval in which the subgradient of dual function changes its sign, and it is used as initial bisection search interval. Thus, the interval [λ min , λ max ] is guaranteed to contain the optimal value of the the dual function and the width of the initial interval is one, that is, (λ max − λ min ) = 1. The proposed APD algorithm is initialized by allocating equal power to all subcarriers.
In what follows, we compare the convergence behavior of the APD and the WSRMax algorithms. For a fair comparison, we define the following metric: the average normalized weighted sum-rate deviation, where C opt is the optimal weighted sum-rate value obtained using optimal exhaustive search, C subopt is the estimated objective value from either the APD algorithm or the WSRMax algorithm, and expectation E{·} is taken w.r.t. channel realization. An OFDMA system with M = 8 subcarriers and a uniform power delay profile with 4 channel taps is considered. We assume σ 2 km = σ 2 , k = 1, . . . , K, m = 1, . . . , M and define SNR per subcarrier as P T /(M · σ 2 ). Figure 3 shows the convergence behavior of the considered algorithms with SNR= 10 dB for K = 2 and K = 4 users. The weights of the users are [1,2] for K = 2 and [1, 2, 1, 2] for K = 4. The floor of the curves is due to the suboptimality of the algorithms. The results show that the APD algorithm converges faster than the WSRMax algorithm and provides smaller average normalized weighted sum-rate deviations. Specifically, for both cases, K = 2 and K = 4, the APD algorithm requires only 3 iterations on average to achieve an average normalized weighted sum-rate deviation of 10 −4 whilst the WSRMax algorithm requires around 15 iterations to reach the same accuracy level. It is intuitively obvious that the number of iterations required by the APD algorithm is sensitive to the nature of the surface of the objective function M m=1 f m (p m ) of problem (7), for example, see Figure 2. In general, it is a hard to quantify the number of iterations before convergence (or any bounds on the number of iterations) due to the nonconvexity of problem (5). However, the numerical results suggest that the APD algorithm often converges very fast in practice. It should be emphasized that the number of iterations required in the initialization of the WSRmax algorithm (i.e., the number of iterations required to find the initial bisection search interval) is not considered when drawing the curves. In particular, for the initialization process, the WSRMax algorithm requires a several number of steps (each step has complexity of O(MK)) and the proposed APD algorithm requires none. Moreover, it is hard to find good initialization methods for the WSRMax algorithm (i.e., initialization for bisection search method) compromising between the number of steps required in the initialization and the width of the initial searching interval (λ max − λ min ). Consequently, additional precautions are required and therefore, in practical implementations the APD algorithm is more favorable compared to the WSRMax algorithm.
In the sequel, we compare the behavior of the APD and the WSRMax algorithms using the following metric: the probability of missing the global optimal, where ε is a small number which quantifies the maximum admissible deviation between C opt and C subopt . It is considered that the global optimum is missed if C subopt is more than ε away from C opt . Figure 4 uses the same simulation setup as that in Figure 3 and depicts the variation of probability of missing the global optimal, P ε with the number of iterations. The floor of probability P ε is again due to the suboptimality of both algorithms. The influence of ε on P ε is totally indistinguishable in case of the APD algorithm. This behavior shows that the proposed algorithm APD can arrive very close-tooptimal solutions within a very small number of iterations and then it remains there. The results further show that the P ε evaluated using the WSRMax algorithm is highly dependent on ε. That is, the smaller the deviations in the C subopt from the optimal C opt , the larger the number of iterations required by the WSRMax algorithm to reach the expected target value P ε . Therefore, independent from the ε, the APD algorithm allows to find a suboptimal solution within a small number of iterations at the expense of a slight increase in P ε . These observations are very useful in practice since they carry significant information in the system design point of view. For example, consider a design requirement P 10 more the design requirement as P 10 −6 ≤ 0.3, then the number of iterations required by the WSRmax increases to 24. In contrast, the APD algorithm always requires just one iteration. Figure 5 shows the rate region (The standard way to characterize the boundary points in the 2-user rate region is by solving problem (4) for β 1 = α and β 2 = 1 − α, where α ∈ [0, 1] [30].) computed by using all considered algorithms. The same simulation setup as in [13]  T , respectively. Although the computational complexity of the proposed algorithm is much smaller compared to that of optimal exhaustive search-based method, Figure 5 indicates that the rate region obtained by the APD algorithm almost coincides with the optimal rate region. This behavior is expected since the average normalized weighted sum-rate deviation, (20) is in the order of 10 −4 as shown in Figure 3.
In the following we compare the behavior of the APD and the WSRmax algorithm for large number of subcarriers and users. Since, for large number of users and subcarriers the complexity of evaluating C opt is prohibitively high, the metrics defined in (20) and (21) are not used. The behavior of the APD algorithm is compared with that of the WSRmax algorithm.
In Figure 6, the evolution of the expected weighted sum-rate provided by the APD algorithm is compared to the resulting expected weighted sum-rate from the WSRmax algorithm, where the expectation is taken w.r.t. channel realization. An OFDMA system with M = 256 subcarriers, a uniform power delay profile with 128 channel taps, and K = 8, 16, 32, 64 users is considered. The weights of the users are taken from the sequence {1, 2, 1, 2, . . . , 1, 2}, (e.g., when K = 8, weights are [1, 2, 1, 2, 1, 2, 1, 2]). The SNR is assumed to be 10 dB. The results show that even for a large number of car-  riers, the APD algorithm converges very fast as compared to the WSRMax algorithm independent of the number of users.

Conclusions
A joint subcarrier and power allocation algorithm which is inspired from primal decomposition techniques has been proposed for maximizing the weighted sum-rate in multiuser OFDMA downlink systems. Although the original problem is nonconvex, the proposed APD algorithm finds fast a 8 EURASIP Journal on Wireless Communications and Networking suboptimal, but still very close-to-optimal solution with very high probability (i.e., more than 90% of the time). Unlike the recently proposed WSRMax algorithm [13], the APD algorithm requires no additional precautions in the initialization, and convergence to a suboptimal solution is possible within a very small number of iterations. Although the proposed primal decomposition-based solution method does not rely on zero duality gap for proving the optimality in the case of large number of subcarriers, our computational experience with larger number of subcarriers suggests that the proposed APD algorithm is capable of finding the same solution as the WSRmax algorithm (which is asymptotically optimal when the number of carriers grows to ∞) even with very few iterations.