Modulation and Multiple Access for 5G Networks

Fifth generation (5G) wireless networks face various challenges in supporting large-scale heterogeneous traffic and users, so new modulation and multiple access (MA) schemes are being developed to meet the changing demands. As this research space continues to grow, it becomes more important to analyze the various approaches. In this article, we therefore present a comprehensive overview of the most promising modulation and MA schemes for 5G networks. We first introduce the different types of modulation that show potential for orthogonal multiple access (OMA) schemes and compare their performance in terms of spectral efficiency, out-of-band leakage, and bit-error rate. We then pay close attention to various types of non-orthogonal multiple access (NOMA) candidates, including power-domain NOMA, code-domain NOMA, and NOMA multiplexing in multiple domains. From this exploration, we identify the opportunities and challenges that will have significant impact on the design of modulation and MA for 5G networks.

considered as an important service that should be supported by 5G networks [1]. These scenarios require massive connectivity with high system throughput and improved spectral efficiency (SE) and impose significant challenges to the design of general 5G networks. In order to meet these new requirements, new modulation and multiple access (MA) schemes are being explored.
Orthogonal frequency division multiplexing (OFDM) [3]-[5] has been adopted in fourth generation (4G) networks. With an appropriate cyclic prefix (CP), OFDM is able to combat the delay spread of wireless channels with simple detection methods, which makes it a popular solution for current broadband transmission. However, traditional OFDM is unable to meet many of the new demands of 5G networks. For example, in the mMTC scenario [1], [2], sensor nodes usually transmit different types of data asynchronously in narrow bands, whereas OFDM requires different users to be highly synchronized; otherwise there will be large interference among adjacent subbands.
To address the new challenges that 5G networks are expected to solve, various types of modulation have been proposed, using techniques such as filtering, pulse shaping, and precoding to reduce the out-of-band (OOB) leakage of OFDM signals. Filtering [6]-[9] is the most straightforward approach to reduce the OOB leakage; with a properly designed filter, the leakage over the stop-band can be greatly suppressed.
Pulse shaping [10]-[13] can be regarded as a type of subcarrier-based filtering that reduces overlaps between subcarriers even inside the band of a single user. However, it usually has a long tail in the time domain, according to the Heisenberg-Gabor uncertainty principle [14]. Introducing precoding [15]-[18] to transmit data before OFDM modulation is also an effective approach to reduce leakage. In addition to the aforementioned approaches to reduce the leakage of OFDM signals, some new types of modulation have also been proposed specifically for 5G networks. For example, to deal with high classes of users and applications in 5G networks, various NOMA schemes have been proposed.
As an alternative to OMA, NOMA introduces a new dimension by performing multiplexing within one of the classic time/frequency/code domains. In other words, NOMA can be regarded as an "add-on", which has the potential to be harmoniously integrated with existing MA techniques. The core of NOMA is to utilize the power and code domains in multiplexing to support more users in the same resource block. There are three major types of NOMA: power-domain NOMA, code-domain NOMA, and NOMA multiplexing in multiple domains. With NOMA, the limited spectrum resources can be fully utilized to support more users, so the capacity of 5G networks can be improved significantly, even though extra interference and additional complexity will be introduced at the receiver.
To address the various challenges of 5G networks, we can either develop novel modulation techniques to reduce multiple user interference for OMA or directly use NOMA. The rest of this article is organized as follows. In Section II, novel modulation candidates for OMA in 5G networks are compared. In Section III, various NOMA schemes are discussed. Section IV concludes the article.

II. NOVEL MODULATION FOR OMA
In this section, we will discuss new modulation techniques for 5G networks. Since OFDM is widely used in current wireless systems and standards, many potential modulation schemes for 5G networks are derived from OFDM for backward compatibility reasons. Therefore, we will first introduce traditional OFDM.

A. Traditional OFDM
Denote d_k, for k = 0, 1, ..., N − 1, to be the transmit complex symbols. Then the baseband OFDM signal can be expressed as

$$s(t) = \sum_{k=0}^{N-1} d_k \, e^{j 2\pi f_k t} \qquad (1)$$

for 0 ≤ t ≤ T_s, where f_k = k∆f, ∆f is the subcarrier bandwidth, and T_s is the symbol duration. To ensure that transmit symbols can be recovered without distortion, ∆f · T_s = 1, which is also called the orthogonal condition. It can be easily shown that

$$\frac{1}{T_s} \int_0^{T_s} s(t) \, e^{-j 2\pi f_k t} \, dt = d_k \qquad (2)$$

if the orthogonal condition holds.
To address the delay spread of wireless channels, a CP is usually used in OFDM. If the length of the CP is larger than the delay span (the duration between the first and the last taps/paths of a channel), then the demodulated OFDM signal can be expressed as

$$\hat{d}_k = H_k d_k + n_k \qquad (3)$$

where H_k is the frequency response of the wireless channel at f_k = k∆f and n_k is the impact of additive channel noise. Therefore, the channel distortion becomes a multiplication by the channel frequency response in OFDM systems, while it is a convolution in single-carrier systems, which makes the detection of the OFDM signal much easier.
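The per-subcarrier model above can be checked numerically with a small discrete-time sketch. It is only an illustration under stated assumptions: the block size, CP length, QPSK symbols, and channel taps are hypothetical values chosen for the example, the channel is noise-free, and a naive DFT stands in for the FFT.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) DFT; the inverse includes the 1/N factor."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

N = 8                     # subcarriers (assumed)
CP = 3                    # cyclic prefix length in samples (assumed)
h = [1.0, 0.5, 0.25j]     # hypothetical channel taps, delay span 2 <= CP
d = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1-1j, 1-1j, -1+1j]  # QPSK symbols

s = dft(d, inverse=True)  # time-domain OFDM symbol
tx = s[-CP:] + s          # prepend the cyclic prefix

# Noise-free linear convolution with the channel.
rx = [sum(h[l] * tx[n - l] for l in range(len(h)) if 0 <= n - l < len(tx))
      for n in range(len(tx) + len(h) - 1)]

y = rx[CP:CP + N]         # receiver drops the CP
Y = dft(y)                # demodulated symbols
H = dft(h + [0] * (N - len(h)))        # channel frequency response
d_hat = [Y[k] / H[k] for k in range(N)]  # one-tap equalization
```

Because the CP is at least as long as the delay span, each demodulated value `Y[k]` equals `H[k] * d[k]`, so a single complex division per subcarrier recovers the data.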
From the above discussion, OFDM can effectively deal with the delay spread of broadband wireless channels and FFT can be used to significantly simplify its complexity, therefore it has been widely used in the current wireless communication systems and standards. However, as we can see from (1) modulation techniques are expected to accept asynchronous scenarios.
3) Flexibility: The modulation parameters (e.g., subcarrier width and symbol period) for each user should be configured independently and flexibly to support users with different data rate requirements.
The modulation techniques for OMA mainly include pulse shaping, subband filtering, precoding design, guard interval (GI) shortening, and modulation in the delay-Doppler domain. In this section, we introduce these promising modulation techniques in turn.

B. Modulations based on Pulse Shaping
Pulse shaping, which is also regarded as subcarrier-based filtering, can effectively reduce OOB leakage. According to the Heisenberg-Gabor uncertainty principle [14], the time and frequency widths of the pulses cannot be reduced at the same time. Therefore, the waveforms based on pulse shaping are usually non-orthogonal in both time and frequency domains to maintain high SE. Compared with traditional OFDM, the transceiver structure supporting pulse shaped modulation is more complex. Here, we introduce two typical modulations based on pulse shaping, i.e., filter bank multicarrier (FBMC) [10], [11] and generalized frequency division multiplexing (GFDM) [12], [13].

1) FBMC: As shown in Fig. 1, FBMC [10], [11] consists of an IDFT and DFT together with synthesis and analysis polyphase filter banks. The prototype filter in FBMC performs the pulse shaping. There are two types of typical pulses: the pulse based on the isotropic orthogonal transform algorithm (IOTA) [20] and the pulse adopted in the PHYDYAS project [21]. The length of the pulse in the time domain is determined by the required performance and is usually several times the length of the symbol period. Unlike the pulse in traditional OFDM, which has a long tail, the bandwidth of the pulse is limited to within a few subbands. To achieve the best SE, offset quadrature amplitude modulation (OQAM) is usually applied to make FBMC real-domain orthogonal in time and frequency domains [10]. Therefore, the transmit signal over M/2 consecutive block periods¹ can be expressed as
where K and M are the numbers of subcarriers and symbols, respectively, d k,m is the transmit symbol at subcarrier k and symbol m, and g(n) is the prototype filter coefficient at the n-th time-domain sample.
It is worth noting that the transmit symbols here refer to the pulse amplitude modulation (PAM) symbols that are derived from the staggering of quadrature amplitude modulation (QAM) symbols.
Thus the interval between two adjacent blocks is only half of the block period due to the offset in OQAM. The parameter, θ k,m in (4), is defined as which is used to form the OQAM structure. With a properly designed prototype filter such as IOTA and the OQAM structure, the interference from the nearby overlapped symbols caused by a matched filter (MF) receiver becomes pure imaginary, which can be easily cancelled.
2) GFDM: Fig. 2 demonstrates the block diagram of GFDM. OFDM and single-carrier frequency division multiplexing (SC-FDM) can be regarded as two special cases of GFDM [13]. The unique feature of GFDM is to use circular shifted filters, rather than the linear filters used in FBMC, to perform pulse shaping. By carefully choosing the circular filter, the out-of-block leakage can be reduced even if the orthogonality is completely given up. We can flexibly adjust M frequency samples and K time samples for a GFDM block according to the application environment. The transmit signal for each GFDM block can be expressed as

$$s(n) = \sum_{k=0}^{K-1} \sum_{m=0}^{M-1} d_{k,m} \, g_{k,m}(n) \qquad (6)$$

for 0 ≤ n ≤ KM − 1, where d_{k,m} is the transmit symbol on subcarrier k at subsymbol m and g_{k,m}(n) is the circular time and frequency shifted version of the prototype pulse shaping filter. In (6),

$$g_{k,m}(n) = g\big((n - mK)_{KM}\big) \, e^{j 2\pi k n / K} \qquad (7)$$

where (·)_{KM} denotes the modulo-KM operation and g(n) is the prototype pulse shaping filter. Similar to traditional OFDM, the modulation and demodulation processes can be expressed by matrix operations. The IDFT and DFT matrices in traditional OFDM are replaced by specific matrices corresponding to the modulation and demodulation for GFDM. However, the transceiver structure of GFDM is significantly different from that of traditional OFDM.
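As a rough illustration of circular pulse shaping, the sketch below builds one GFDM block directly from the symbols. It assumes the commonly used form in which subsymbol m is circularly shifted by mK samples and subcarrier k is modulated by exp(j2πkn/K); the function name `gfdm_modulate`, the prototype filters, and the symbol values are hypothetical and chosen only to expose the two special cases mentioned in the text.

```python
import cmath

def gfdm_modulate(d, g, K, M):
    """Build one GFDM block of K*M samples.

    d[k][m] is the symbol on subcarrier k at subsymbol m and g is the
    prototype filter of length K*M. Each symbol is carried by a circularly
    time-shifted (by m*K) and frequency-shifted (by k) copy of g.
    """
    total = K * M
    x = [0j] * total
    for k in range(K):
        for m in range(M):
            for n in range(total):
                pulse = g[(n - m * K) % total] * cmath.exp(2j * cmath.pi * k * n / K)
                x[n] += d[k][m] * pulse
    return x

# Special case 1: one subsymbol and a rectangular prototype gives an
# (unnormalized) OFDM symbol.
d_ofdm = [[1 + 0j], [1j], [-1 + 0j], [0.5 - 0.5j]]
ofdm_like = gfdm_modulate(d_ofdm, [1.0] * 4, K=4, M=1)

# Special case 2: one subcarrier and an impulse prototype gives plain
# single-carrier transmission of the subsymbols.
d_sc = [[1 + 0j, -1 + 0j, 1j, -1j]]
sc_like = gfdm_modulate(d_sc, [1.0, 0.0, 0.0, 0.0], K=1, M=4)
```

The triple loop is written for readability rather than speed; a practical modulator would use the matrix or FFT-based formulation.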
Besides FBMC and GFDM, other modulations based on pulse shaping, such as pulse-shaped OFDM [22] and QAM-FBMC [23], have also been proposed for 5G networks. Generally, modulations based on pulse shaping try to restrict transmit signals within a narrow bandwidth and thus mitigate the OOB leakage so that they can work in asynchronous scenarios with a narrow guard band. FBMC also uses OQAM to achieve real-domain orthogonality, which saves the cost of the GI and interference cancellation. In addition, the circular shifted filters in GFDM avoid the long tail of the linear filters in the time domain, which makes GFDM suitable for sporadic transmission. Furthermore, GFDM is easily compatible with MIMO technologies [13].

C. Modulations based on Subband Filtering
Subband filtering is another technique to reduce the OOB leakage. Universal filtered multicarrier (UFMC) [6], [7] and filtered OFDM (f-OFDM) [8], [9] are two typical modulations based on subband filtering, which will be introduced next.

1) UFMC: Fig. 3 shows the transmitter and the receiver structures of UFMC [6], [7]. In UFMC, the subbands are of equal size, and each filter is a shifted version of the same prototype filter. OFDM is applied within a subband for this modulation, as shown in the figure. Since the bandwidth of the filter in UFMC is much wider than that of the modulations based on pulse shaping, its length in the time domain is much shorter. Therefore, interference caused by the tail of the filter can be easily eliminated by adopting a zero-padding (ZP) prefix with a reasonable length. Assuming that N subcarriers are divided into K subbands, each with L = N/K consecutive subcarriers, the transmit signal in UFMC can be expressed as
where f k (n) is the filter coefficient of subband k, and s k (n) is the OFDM modulated signal over subband k that can be expressed as with N g denoting the length of the ZP, M denoting the number of symbol blocks and s k,m (n) denoting the signal at subcarrier k and symbol m. In (9), s k,m (n) can be expressed as where d l,m denotes the l-th transmit symbol at the m-th symbol block.
At the receiver, the signal in each symbol interval has length N + N_g and is zero-padded to length 2N so that a 2N-point FFT can be performed. Note that only the even subcarriers are used for signal detection after the 2N-point FFT.
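A short sketch can show why keeping only the even outputs of the 2N-point FFT is sufficient. It is a simplified single-filter illustration, not the full UFMC chain: the filter taps `f`, the block size, and the symbols are hypothetical, the channel is ideal and noise-free, and a naive DFT replaces the FFT. The even bins of the 2N-point transform equal the N-point DFT of the block with its filter tail folded back onto its head, so each kept bin is the data symbol scaled by the filter response.

```python
import cmath

def dft(x, size=None):
    """Naive DFT of x zero-padded to `size` points."""
    size = size or len(x)
    padded = list(x) + [0] * (size - len(x))
    return [sum(padded[n] * cmath.exp(-2j * cmath.pi * k * n / size)
                for n in range(size))
            for k in range(size)]

N = 8
f = [0.5, 0.3, 0.2]            # hypothetical filter taps, length Ng + 1
Ng = len(f) - 1                # tail length absorbed by the ZP interval
d = [1, -1, 1j, -1j, 1, 1, -1, -1j]

# N-sample multicarrier symbol (inverse DFT with 1/N normalization).
s = [sum(d[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
     for n in range(N)]

# Linear filtering stretches the block to N + Ng samples.
y = [sum(f[l] * s[n - l] for l in range(len(f)) if 0 <= n - l < N)
     for n in range(N + Ng)]

Y2 = dft(y, 2 * N)             # zero-pad to 2N and transform
even = Y2[0::2]                # keep the even subcarriers only
F = dft(f, N)                  # filter response on the N-point grid
```

Under these assumptions `even[k]` matches `F[k] * d[k]`, so a per-subcarrier scaling is enough to undo the filter; by linearity the same holds when several filtered subbands are summed.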
2) f-OFDM: f-OFDM has a transmitter structure similar to that of UFMC [8], [9]. The main difference is that f-OFDM employs a CP and usually allows residual inter-symbol interference (ISI) [8]. Therefore, at the receiver, the MF is applied instead of the ZP and decimation. In addition, downsampling can be applied before the DFT operation, which can reduce complexity significantly. Since the CP can mitigate most of the interference caused by the tail of the filter, the residual interference has much lower power and can be treated as noise [9]. Thus, the filter in f-OFDM can be longer than that in UFMC and has better attenuation outside the band. With the aid of effective channel coding, the performance degradation caused by residual interference in f-OFDM can be negligible. Another difference from UFMC is that the subcarrier spacing and the CP length do not have to be the same for different users in f-OFDM.
The most widely used filter in f-OFDM is the soft-truncated sinc filter [8], which can be easily used in various applications with different parameters. Therefore, f-OFDM is very flexible in the frequency multiplexing.
Besides UFMC and f-OFDM, other modulations based on subband filtering have also been proposed.
For example, resource block f-OFDM (RB-f-OFDM) [24] utilizes filters based on resource block instead of the whole band of users in f-OFDM. In general, modulations based on subband filtering can effectively reduce OOB leakage and achieve better performance in comparison with the traditional OFDM.

D. Other Modulation Techniques
Apart from pulse shaping and subband filtering, there are also some other techniques to suppress the OOB leakage and meet the requirements of 5G networks. In the following, we mainly introduce three other modulations, including guard interval discrete Fourier transform spread OFDM (GI DFT-s-OFDM) [25], spectrally-precoded OFDM (SP-OFDM) [15], and orthogonal time frequency and space (OTFS) [19].

1) GI DFT-s-OFDM:
In GI DFT-s-OFDM [25], a known sequence is used as the GI instead of a CP. Several types of known sequences, such as the zero sequence [26] and a well-designed unique word [27], can be used. By using a fixed known sequence with constant amplitude in GI DFT-s-OFDM, the peak-to-average power ratio (PAPR) of the modulated signal can be reduced. Moreover, the known sequence can also be utilized to estimate parameters, such as the carrier frequency offset (CFO), in the synchronization process. By utilizing a proper sequence as the GI, the discontinuity between
For GI DFT-s-OFDM, the overall length of the GI and useful signal is the same for different users.
Thus, the DFT windows for different users at the receiver can still be aligned even if the lengths of the GIs are different. Therefore, the mutual interference due to asynchronization of users can be mitigated [25].
2) SP-OFDM: Fig. 4 shows the diagram of SP-OFDM [15]. As shown in the figure, it consists of an IDFT and DFT, a spectral precoder, and an iterative detector. Generally, the data symbols mapped on subcarriers are precoded by a rank-deficient matrix in order to project the signal into a properly selected lower dimensional subspace, so that the precoded signal can be high-order continuous and results in much lower leakage compared with traditional OFDM [16], [17]. Even though precoding by a rank-deficient matrix can reduce the capacity of the channel, the OOB leakage of the OFDM signals can be significantly suppressed at the cost of only a few reduced dimensions.
Compared to the modulations based on filtering, SP-OFDM has the following three advantages:
• The ISI caused by the tail of the filters can be removed without filtering. Therefore, the CP applied to combat the multipath of the wireless channels can be shorter, and SE is improved.
• When fragmented bands are used, SP-OFDM can easily notch specific well chosen frequencies without requiring multiple narrow subband filters [18].
• Furthermore, precoding and filtering can be combined to further improve the performance.
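The idea of precoding with a rank-deficient projection matrix can be illustrated with the simplest possible case. The sketch below is not the precoder of [15]; it assumes a single hypothetical constraint (the all-ones vector) and no CP, so that the projection P = I − aaᴴ/(aᴴa) has rank N − 1 and forces every OFDM symbol to start and end at zero amplitude. This removes the jump between adjacent symbols at the cost of one dimension.

```python
import cmath

def projection_precoder(n):
    """P = I - a a^H / (a^H a) for the all-ones constraint vector a (assumed)."""
    return [[(1.0 if r == c else 0.0) - 1.0 / n for c in range(n)]
            for r in range(n)]

def apply_matrix(P, d):
    """Matrix-vector product P d."""
    return [sum(P[r][c] * d[c] for c in range(len(d))) for r in range(len(P))]

def ofdm_sample(d, t_over_T):
    """Value of sum_k d_k exp(j 2 pi k t / T) at normalized time t / T."""
    return sum(d[k] * cmath.exp(2j * cmath.pi * k * t_over_T)
               for k in range(len(d)))

d = [1+1j, 1-1j, -1+1j, 1+1j, -1-1j, 1-1j, 1+1j, -1+1j]  # QPSK symbols
P = projection_precoder(len(d))
x = apply_matrix(P, d)          # precoded symbols, lie in an (N-1)-dim subspace

edge_start = ofdm_sample(x, 0.0)   # waveform value at the symbol start
edge_end = ofdm_sample(x, 1.0)     # waveform value at the symbol end
twice = apply_matrix(P, x)         # projections are idempotent: P(Pd) = Pd
```

Higher-order continuity would add further constraint vectors (one per derivative order), each removing another dimension; the receiver then has to account for the projection, for example with the iterative detector mentioned in the text.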

3) OTFS:
The structure of OTFS is similar to SP-OFDM, as can be seen in Fig. 4.

In addition, a number of modulations based on other techniques have also been proposed, such as windowed OFDM (W-OFDM) [28], which utilizes windowing to deal with the discontinuity between adjacent OFDM symbols.

E. Performance Comparison of Different Modulations
We compare the power spectral density (PSD) and bit-error rate (BER) of different modulations.

2) BER: In order to reduce the OOB leakage, many modulations utilize techniques, such as pulse shaping and subband filtering, which may introduce ISI and ICI. Hence, the BER performance of different modulations is compared here. FBMC is approximately orthogonal in the real domain and achieves good BER performance. The performance of UFMC, GFDM, and SP-OFDM is similar to that of FBMC, and is degraded slightly due to noise enhancement and low-projection precoding. However, f-OFDM introduces extra ISI that cannot be completely canceled, and as a result, it has slightly worse performance, especially in the high SNR region. GI DFT-s-OFDM and OTFS, which differ from the modulation schemes that directly map the symbols on subcarriers, apply spreading before mapping so that their performance does not approach that of OFDM. Since the fast-fading channel is difficult to estimate and track accurately, the performance of most modulation schemes degrades significantly, whereas OTFS can still achieve good performance due to its specific channel estimation method. Moreover, its performance in the high-mobility scenario is even better than that in the zero Doppler shift scenario because of Doppler diversity.

F. Open Issues
In this section, modulation techniques for 5G networks will be discussed.

The downlink transmission of NOMA for the two-user case is shown in Fig. 7, where the users are served in the same time/frequency/code resource block under a total power constraint. Specifically, the BS sends a superimposed signal containing the two signals for the two users. This differs from conventional power allocation strategies, such as water filling, as NOMA allocates less power to the users with better downlink CSI, to guarantee overall fairness and to utilize diversity in the time/frequency/code domains.
SIC is used for signal detection at the receiver. The user with more transmit power, that is, the one with smaller downlink channel gain, is first to be decoded while treating the other user's signal as noise.
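The two-user downlink detection described here can be sketched as follows. This is a minimal noise-free illustration under stated assumptions: BPSK symbols, perfect CSI, and hypothetical power split and channel coefficients. The rate expressions are the standard ones for this setting, with the far (weak-channel) user treating the near user's signal as noise and the near user decoding after perfect SIC; all function names are illustrative.

```python
import math

def noma_rates(p_far, p_near, g_far, g_near, noise):
    """Achievable rates in bit/s/Hz; g_* are channel power gains, g_near > g_far."""
    r_far = math.log2(1 + p_far * g_far / (p_near * g_far + noise))
    r_near = math.log2(1 + p_near * g_near / noise)   # after perfect SIC
    return r_far, r_near

def superimpose(b_far, b_near, p_far, p_near):
    """Superposition-coded transmit sample for two BPSK symbols."""
    return math.sqrt(p_far) * b_far + math.sqrt(p_near) * b_near

def bpsk_decision(z):
    return 1 if z.real >= 0 else -1

def far_user_detect(y, h):
    """Far user decodes its own high-power symbol, treating the rest as noise."""
    return bpsk_decision(y / h)

def near_user_sic(y, h, p_far):
    """Near user decodes the far user's symbol first, subtracts it, then decodes its own."""
    z = y / h
    far_hat = bpsk_decision(z)
    residual = z - math.sqrt(p_far) * far_hat
    return far_hat, bpsk_decision(residual)

P_FAR, P_NEAR = 0.8, 0.2                    # more power to the weaker user
H_FAR, H_NEAR = 0.3 + 0.1j, 0.9 - 0.4j      # hypothetical channel coefficients

results = []
for b_far in (1, -1):
    for b_near in (1, -1):
        x = superimpose(b_far, b_near, P_FAR, P_NEAR)
        results.append((b_far, b_near,
                        far_user_detect(H_FAR * x, H_FAR),
                        near_user_sic(H_NEAR * x, H_NEAR, P_FAR)))

rates = noma_rates(P_FAR, P_NEAR, abs(H_FAR) ** 2, abs(H_NEAR) ** 2, noise=0.01)
```

With noise added, an error in the first decision would propagate into the residual, which is the reason the first-decoded user needs enough power.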
Once the signal corresponding to the user with the larger transmit power is detected and decoded, its signal component will be subtracted from the received signal to facilitate the detection of subsequent users. It should be noted that the first detected user experiences the largest inter-user interference, and any detection error for the first user propagates to the other user, which is why sufficient power has to be allocated to the first user to be detected. The extension of NOMA from two to multiple user cases is straightforward.

controlled so that the received signal components at the BS corresponding to the users with better CSI have more power. At the receiver (the BS), the user with the best CSI is decoded first. After that, the corresponding component is removed from the received signal. The SIC receiver works in descending order of the CSI, which is the opposite of the downlink case.

By allocating different beams to different users in the same resource block, the quality of service (QoS) of each user can be guaranteed in multiple antenna based NOMA systems by forcing the beams to satisfy a predefined order. This type of multiple antenna based NOMA scheme was first proposed by Sun et al. in [40] to investigate power optimization to maximize the ergodic capacity.
This proposed multiple antenna based NOMA scheme has proved to be able to achieve significant performance improvement compared with conventional OMA schemes.
A cluster of users can share the same beam. The spatial channels of different users within the same cluster are considered to be highly correlated. Therefore, beams for different clusters should be carefully designed to guarantee that the channels for different clusters are orthogonal to each other in order to suppress the inter-cluster interference. For multiple-input-single-output (MISO) based NOMA, a two-stage multicast beamforming scheme has been proposed by Choi in [34], where ZF beamforming has been employed to mitigate interference from adjacent clusters first and then the optimal beamforming vectors have been designed to minimize the total transmit power within each cluster. For MIMO based NOMA, a scheme to simultaneously apply open-loop random beamforming and intra-beam SIC, has been proposed by Higuchi and Kishiyama in [31]. However, here the system performance is considerably degraded as the random beamforming can bring uncertainties at the user side. More recently, a precoding and detection framework with fixed power allocation has been proposed by Ding et al. [37] to solve these problems caused by random beamforming, and demonstrated that MIMO based NOMA can achieve better outage performance than MIMO based OMA even for users who experience strong co-channel interference.
A comprehensive summary for the state-of-the-art work on multiple antenna based NOMA is given in Table I, where "BF", "OP", "SU" and "MU" are used to represent beamforming, outage probability, two-user and multi-users cases, respectively.

3) Power Allocation in NOMA:
NOMA is capable of supporting unequal transmission rates for users experiencing various channel conditions by assigning them different transmit powers. Therefore, the power allocation mechanism for different users is critical to power-domain NOMA. In the downlink of NOMA, the power allocation is the opposite of the conventional schemes (e.g., water filling policy), and more power is allocated to the users with poor CSI in order to ensure that every user obtains reasonable receive signal power and therefore guarantee fairness. The optimization problem is normally modelled to maximize the individual/sum rate while considering this fairness issue. As the power allocation is based on the order of CSI, the cases with perfect and imperfect CSI are different and should be investigated separately, as in [45]. When perfect CSI is available, the optimization problem can be formulated to maximize the minimum achievable user rates. With average CSI rather than perfect CSI, the optimization problem can be formulated to minimize the maximum outage probability. A comprehensive summary of the state-of-the-art work on the power allocation in NOMA with different fairness strategies is given in Table II. It is worth noting that the most critical challenge for power allocation in NOMA comes from the non-convex property of the ordered power constraints, which makes the optimization problem intractable. Therefore, further research work on the optimal ordering design can be expected.

TABLE II
Give absolute priority to the fairness | High | High | [42], [45], [50]
Maximize the geometric mean of user rates | Moderate | Low | [51]-[54]
Give users weighted priorities | Low | Low | [55]
Tradeoff between throughput and fairness | - | - | [56]-[58]

TABLE III
CoMP+DF | OP+ergodic rate | Each stream has same diversity order | [63]
DF | Sum rate | Novel two-stage power allocation | [66]

4) Cooperative
mission rates for cell-edge users [59]. The scenario with users transmitting at different rates naturally matches the application scenarios typical of NOMA.
The basic idea of relay-assisted NOMA is to use the users with better CSI as decode-and-forward (DF) or amplify-and-forward (AF) relays to improve the transmission rates of the users with poor CSI. A cooperative NOMA model supporting M users with M time slots has been proposed in [60]. In the first time slot, the traditional non-cooperative NOMA scheme is conducted. In the second time slot, the user with the best CSI acts as the DF relay for the user with the second best CSI. In the following time slots, the user with the m-th best CSI works as the relay for the user with the next worse CSI to improve the transmission rates.
CoMP transmission, where multiple BSs support cell-edge users together, is capable of improving the performance of cell-edge users. NOMA was first applied to CoMP by Choi [61], where two coordinated BSs use the Alamouti code to support a cell-edge user in a NOMA channel. Subsequently, the effectiveness of NOMA in CoMP systems was further demonstrated by Tian et al. [62] in comparison with the conventional joint-transmission NOMA. A coordinated direct and relay transmission (CDRT) scheme has also been considered by Kim and Lee in [63], where the BS communicates with a near user and a relay simultaneously, invoking NOMA in the first time slot, while communicating with a far user with the aid of the relay in the following time slots. This NOMA based CDRT scheme solves the main challenge by using the inherent property of NOMA that allows a receiver to obtain side information, such as another user's signal, for interference cancellation.
A comprehensive summary of the existing work on NOMA in cooperative communications is given in Table III, where "OP" represents the outage probability.

5) Spectral and Energy Efficiency in NOMA:
Recall that when designing 5G networks, SE and energy efficiency (EE) are two important performance metrics. NOMA provides a promising solution due to its extra degree of freedom over the power domain, especially suited for IoT networks that require massive connectivity but with low power consumption at sensor nodes. More specifically, NOMA has been considered to be able to boost the SE of 5G networks [46], [63], [67], [68]. NOMA with the consideration of EE has also been investigated in [69], where a sub-optimal resource allocation algorithm has been proposed to maximize the EE of the systems. Moreover, the EE of a NOMA network, subject to a minimum required data rate for each user, can be maximized using the approach in [70]. Similar to most wireless networks [71], SE and EE cannot be achieved simultaneously in NOMA networks. Therefore we expect to see more work on SE and EE tradeoffs in NOMA in the future. A more comprehensive review on the power-domain NOMA can be found in [72].

B. Code-Domain NOMA
Code-domain NOMA can support multiple transmissions within the same time-frequency resource block by assigning different codes to different users. It provides a certain spreading gain and shaping gain at the cost of extra signal bandwidth in comparison with power-domain NOMA. Existing solutions to code-domain NOMA mainly include low-density spreading CDMA (LDS-CDMA) [73], low-density spreading OFDM (LDS-OFDM) [74], and sparse code multiple access (SCMA) [75], which will be introduced in the following.
1) LDS-CDMA: LDS-CDMA [73] is a novel type of CDMA. Its key feature is that a low-density signature, which has a similar form of the low-density parity-check (LDPC) matrix, is employed for the codebook construction. When the number of users is larger than that of samples per symbol period in the conventional CDMA, MAI is inevitable and optimal multiuser detection is extremely complex.
However, due to the sparse structure of the signature in LDS-CDMA, a low-complexity near-optimal multiuser detection scheme, based on a message passing algorithm (MPA), can be applied in the detection of LDS-CDMA, which significantly improves performance.
2) LDS-OFDM: LDS-OFDM [74] has similar properties to LDS-CDMA, except that the output of the signature is mapped into the subcarriers of OFDM rather than the time samples in CDMA. Therefore, a low-complexity MPA detector can be adopted. Compared to LDS-CDMA, LDS-OFDM utilizes multicarrier transmission, which makes it fit for wideband channels. Further, the strong compatibility with OFDM makes it flexible in resource allocation [74].
Although a part of the users share the same block, another block would be adopted to distinguish different users when collisions occur.
Besides the sparse spreading, SCMA utilizes multi-dimensional constellations to reduce the receiver complexity and further improve the SE. Owing to the multi-dimensional property, the constellation in one resource block can be projected into its subspace [76]. For example, a four-point QAM constellation can be projected to a three-point constellation. Even when two points collide in one resource block, that is, in one dimension, they can be distinguished in the other used blocks. Due to fewer constellation points, the receiver complexity can be reduced. Moreover, the constellation design can focus on improving the detection performance. For example, a design based on constellation rotation and interleaving has been proposed in [77], which is able to achieve better BER performance compared to the simple LDS-OFDM.
Due to the sparse structure of the spreading matrix and the large minimum distance of the multidimensional constellation, the detection performance of SCMA becomes excellent even when the resource blocks are overloaded. At the receiver, MPA, which is usually adopted in the decoding of LDPC, is applied in the detection [78], [79]. Due to the sparsity, MPA could achieve near-optimal performance with a much lower complexity compared to the optimal maximum likelihood (ML) and the BCJR algorithms. However, the complexity is still relatively high for user devices. Hence, SCMA also considers clustering the users based on the CSI and allocating different powers to different clusters.
When the transmit powers among different clusters vary, the signals of different clusters can be detected by using SIC, which is similar to the power-domain NOMA. Within each cluster, different users can be distinguished by using MPA. As a result, the combination of SIC and MPA can reduce the complexity of the receiver significantly.
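The complexity argument for MPA over ML detection can be made concrete with a small counting sketch. The numbers are assumptions for illustration only: a sparse indicator matrix with 6 users on 4 resource blocks (each user occupies 2 blocks, each block carries 3 users) and 4-point codebooks. The hypothesis counts are only a rough proxy for detector complexity.

```python
# Rows are resource blocks, columns are users; a 1 means the user transmits
# on that block. This particular sparse layout is an assumed example.
F = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]
CODEBOOK_SIZE = 4   # points per user codebook (assumed)

num_resources = len(F)
num_users = len(F[0])
users_per_resource = [sum(row) for row in F]
resources_per_user = [sum(F[r][u] for r in range(num_resources))
                      for u in range(num_users)]

overloading = num_users / num_resources

# ML detection searches over every joint combination of user codewords.
ml_hypotheses = CODEBOOK_SIZE ** num_users

# MPA only enumerates combinations of the users colliding on each block,
# once per block per iteration.
mpa_hypotheses_per_iteration = sum(CODEBOOK_SIZE ** deg
                                   for deg in users_per_resource)
```

In this example the joint search covers 4096 hypotheses, while one MPA iteration enumerates 64 combinations on each of the 4 blocks, which is why the sparsity of the spreading matters so much for the receiver.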

C. NOMA Multiplexing in Multiple Domains
Beyond multiplexing signals in the power or code domains, some solutions for NOMA have been proposed to multiplex in multiple domains, such as the power domain, the code domain, and the spatial domain, in order to support massive connectivity for 5G networks. In Section III.A.2, we discussed multiple antenna based NOMA, where NOMA multiplexes in the power and spatial domains. We now introduce another three typical NOMA schemes that multiplex in multiple domains: pattern division multiple access (PDMA) [80], building block sparse-constellation based orthogonal multiple access (BOMA), and lattice partition multiple access (LPMA) [81].

1) PDMA:
In PDMA [80], non-orthogonal patterns are allocated to different users to perform multiplexing. These patterns are carefully designed in the multiple domains of code, power, and space, to gain the SIC-amenable property. In the presence of this property, the low-complexity SIC based MPA multiuser detection method with reliable performance can be designed to run at the receiver side.
At the transmitter, similar to SCMA, the users in PDMA are also spread by a sparse signature matrix [80]. The main difference is that the number of resource blocks occupied by each user in PDMA can vary. For example, seven users can be multiplexed within three resource blocks through the following signature matrix By utilizing the sparse signature matrix, PDMA can increase the system capacity through overloading.
Moreover, users can also be multiplexed in other domains, such as power and space. In the same resource block, users can be distinguished by different powers, as in power-domain NOMA, or by different precoders if MIMO is applied [82].
At the receiver side, as in SCMA, MPA can be adopted for detection due to the sparsity of the signature matrix. When different clusters of users are multiplexed in the power and space domains, MPA-SIC can be applied. Users that are multiplexed in the same signature matrix are detected with MPA, which can provide excellent performance. Among different clusters in the power and space domains, SIC can be utilized to reduce the complexity. In addition, a turbo structure can be adopted to combine the detector with the decoder to further improve the performance [83].

2) BOMA:
This technique attaches the information from a user with good CSI to the symbols of a user with poor CSI. Thus the capacity of a multiuser system is increased significantly.
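A minimal sketch of this idea follows, under assumptions chosen purely for illustration: both users employ QPSK, the poor-CSI user's coarse constellation has half-distance 1.0, the good-CSI user's small constellation (the "building block") has half-distance 0.2, and the channel is noiseless. The constants and function names are hypothetical. The building block is tiled around each coarse point, the poor-CSI user decides only the coarse point, and the good-CSI user searches the composite constellation and keeps its own bits.

```python
import itertools

COARSE_HALF_DIST = 1.0  # assumed half of the coarse constellation's minimum distance
BLOCK_HALF_DIST = 0.2   # assumed half-width of the tiled building block (much smaller)

def sign(bit):
    return 1.0 if bit else -1.0

def boma_map(poor_bits, good_bits):
    """Tile the good-CSI user's small QPSK block around the poor-CSI user's coarse point."""
    center = complex(sign(poor_bits[0]), sign(poor_bits[1])) * COARSE_HALF_DIST
    offset = complex(sign(good_bits[0]), sign(good_bits[1])) * BLOCK_HALF_DIST
    return center + offset

def detect_poor_user(y):
    """Poor-CSI user: decide only the coarse point; the block offset acts as interference."""
    return (int(y.real >= 0), int(y.imag >= 0))

def detect_good_user(y):
    """Good-CSI user: search all 16 composite points and keep only its own two bits."""
    candidates = itertools.product((0, 1), repeat=4)
    best = min(candidates, key=lambda b: abs(y - boma_map(b[:2], b[2:])))
    return best[2:]

y = boma_map((1, 0), (0, 1))
print(detect_poor_user(y), detect_good_user(y))
```

Because the block is much smaller than the spacing of the coarse constellation, the offset never moves the composite point across a coarse decision boundary in this example.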
As shown in Fig. 10, in order to achieve the same BER performance as a user with good CSI, the user with poor CSI should apply a coarse constellation with a large minimum distance. Hence, a small building block that contains the data of the user with good CSI can be tiled in the constellation of the user with poor CSI [84], [85]. For the user with poor CSI, the center of the building block can be regarded as the constellation point and the tiled building block can be regarded as interference. When the size of the building block is much smaller than the minimum distance of the coarse constellation, the degradation of detection performance is minimal. Since the user with good CSI can detect the points in its own constellation, it can also detect all points in the tiled building block constellation and decode its own bits.

Table IV. Note that each NOMA scheme has its own advantages and limitations and thus fits different application scenarios. In practice, choosing among them is an adaptive configuration that trades off performance against implementation complexity.
For example, if users' channel conditions differ widely due to the near-far effect or in moving networks, power-domain NOMA with a SIC receiver can be used with relatively low complexity. On the other hand, if the application scenario requires high reliability, especially when channel conditions are poor or users are concentrated in one location, SCMA is a feasible solution due to its shaping gain and near-optimal MPA detection.

E. Future Work
Several NOMA schemes have been discussed in this section. Although they use different techniques, these schemes share the same spirit: they exploit non-orthogonality to increase the system capacity and support more users with limited resource blocks. Beyond the existing work, more research is necessary to improve the performance of these NOMA schemes in the following respects.
The MPA-SIC detection method is usually applied in SCMA and PDMA, where the user clustering mechanism significantly affects its performance. When users are asynchronous, those with similar time delays should be placed in the same cluster for better performance. If the delays vary widely among the users within the same cluster, interference among users becomes large and may break the sparse structure. The multi-branch technique [88] can be applied to improve the performance by regarding each cluster as a branch. By computing each branch in parallel and selecting the best result as the final one, the performance can be improved compared to the single-clustering approach.
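The branch-and-select principle can be sketched in a simplified form. The example below is not the method of [88]: it assumes real-valued BPSK users with known channel gains, treats each candidate SIC ordering as one branch (standing in for a candidate clustering), and selects the branch whose decisions leave the smallest residual energy. The function names and the selection metric are illustrative assumptions, and the branches are evaluated sequentially here although they are independent and could run in parallel.

```python
from itertools import permutations

def sic_branch(y, gains, order):
    """Run SIC over BPSK users in the given order.

    Returns the per-user decisions and the energy of the remaining residual.
    """
    decisions = [0.0] * len(gains)
    residual = y
    for user in order:
        # Decide this user's symbol, treating later users as noise.
        decisions[user] = 1.0 if residual * gains[user] >= 0 else -1.0
        # Cancel the reconstructed contribution of this user.
        residual -= gains[user] * decisions[user]
    return decisions, residual ** 2

def multi_branch_detect(y, gains):
    """Evaluate every branch independently and keep the one with the smallest residual."""
    branches = [sic_branch(y, gains, order)
                for order in permutations(range(len(gains)))]
    return min(branches, key=lambda result: result[1])

# Two users with gains 1.0 and 0.6 transmitting -1 and +1 (noiseless observation).
decisions, residual_energy = multi_branch_detect(-1.0 + 0.6, [1.0, 0.6])
print(decisions, residual_energy)
```

In this example the branch that detects the weaker user first produces wrong decisions and a large residual, so the selection step discards it in favor of the other ordering.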
The joint design of new modulation and NOMA schemes is an important direction to be explored in 5G networks. Some of the NOMA schemes, especially the LDS based code-domain NOMA, are based on OFDM, where the output of the sparse spreading matrix is mapped onto orthogonal subcarriers. In general, how to properly combine modulation and NOMA schemes is still under research. For example, when SCMA is combined with f-OFDM, the short CP of f-OFDM could introduce ISI and ICI when the subband is narrow and degrade the detection performance of SCMA. If the RISIC algorithm is adopted to cancel the interference introduced by f-OFDM, the multiuser detection of SCMA should be included in the iteration of CP reconstruction, which calls for joint design approaches for the receivers.
The design of modulation and MA schemes for high frequency bands (above 40 GHz) is beginning to receive increased interest. The millimeter-wave (mmWave) and terahertz (THz) bands appear to be good candidates for alleviating spectrum scarcity, given what current circuit design makes available [89], [90]. However, the propagation properties of the mmWave and THz bands have been shown to be quite poor, which brings new challenges for system design. For example, noise is the major limitation in the mmWave and THz bands, which makes the transmit power level extremely important and ultimately affects the classes of applications that can use them (e.g., IoT). Moreover, because these bands are noise-limited, high-level impairments, including carrier frequency offset (CFO) and phase noise, also need to be considered. Nevertheless, NOMA-based mmWave communications have already been studied [91], and we may see further analyses of such systems based on practical scenarios in the future.

IV. CONCLUSIONS
In this article, we have provided a comprehensive survey covering the major promising candidates for modulation and multiple access (MA) in fifth generation (5G) networks. From our discussion, we can see that new modulations for orthogonal MA can be adopted to reduce out-of-band leakage while meeting the diverse demands of 5G networks. Non-orthogonal MA is another promising approach that marks a deviation from the previous generations of wireless networks. By utilizing non-orthogonality, 5G networks will be able to provide enhanced throughput and massive connectivity with improved spectral efficiency.