Regarding a Pre-Distorted ADO-OFDM System as a DCO-OFDM System for Visible Light Communications

We provide an interpretation of a recently proposed pre-distorted asymmetrically clipped DC-biased optical orthogonal frequency division multiplexing (ADO-OFDM) system. Through our derivation, we find that the pre-distorted ADO-OFDM system can be regarded as a direct current biased optical OFDM (DCO-OFDM) system in which the odd-index and the even-index subcarriers are assigned different powers. For multipath channels where no channel side information (CSI) is available, power allocation and adaptive modulation are impractical. In this situation, state-of-the-art polar coding combined with an interleaver is appropriate. The performance of an interleaved polar-coded ADO-OFDM system with and without pre-distortion is simulated and compared.


I. INTRODUCTION
The associate editor coordinating the review of this manuscript and approving it for publication was Faissal El Bouanani.
In orthogonal frequency division multiplexing (OFDM) visible light communications, signals emitted from the transmitter are required to have both real and non-negative amplitudes. To meet the real-amplitude requirement, the data at indices $k$ and $N-k$ must satisfy the Hermitian symmetry constraint, i.e., $X_{N-k}^{*} = X_k$, where $*$ denotes the complex conjugate. The direct current biased optical OFDM (DCO-OFDM) system was first adapted from the wireless OFDM version. Its additional baseband processing adds a DC-bias current that raises most samples so that they are positive, and then clips the remaining negative samples to zero in order to meet the non-negative requirement. Later, the asymmetrically clipped optical OFDM (ACO-OFDM) system was proposed, which assigns data only to the odd-index subcarriers and then clips all the resulting negative time-domain samples to zero in order to meet the non-negative requirement. Hereafter, we omit the term ''OFDM'' for conciseness. The merit of the ACO system is that no DC-bias current is needed, which makes it more power-efficient. In contrast, its disadvantage is that its spectrum efficiency is only half that of the DCO system. An asymmetrically clipped DC-biased optical (ADO) system [1]-[4] was proposed by hybridizing the ACO and the DCO concepts to provide both power and spectrum efficiency. In the ADO system, the data at the odd indices follow the process of the ACO version, with the real and non-negative requirements, in its ACO branch, and the data at the even indices follow the process of the DCO version, with the real requirement, in its DCO branch.
The resulting time-domain samples of the two branches are added together with a DC-bias current, and a clipping process follows in order to meet the non-negative requirement, as shown in Fig. 1. At the receiver, the data at the odd indices are processed and determined first, and are then used to help determine the data at the even indices. Since negative samples are clipped at the transmitter, the resulting clipping noise falls on the even-index subcarriers; it can be alleviated if the data at the odd indices are correctly determined. Otherwise, if some detections are erroneous when processing the odd-index elements, error propagation degrades the performance of the even-index components. In [4] and [5], the authors compared the bit error rate (BER) performance of ACO, DCO, and ADO. The ADO system performs the worst because of error propagation, which motivates the search for improvements.
In [6], the authors proposed an iterative receiver (IR) structure designed to improve the BER performance of the system. Two pairing effects, ''pairwise clipping'' and ''pairwise averaging'', are applied iteratively to reduce the clipping noise. The pre-distorted ADO system [7]-[9] was recently proposed to eliminate in advance the clipping noise arising from the clipping process in the odd-index elements. It pre-distorts the data at the even indices at the transmitter by subtracting the clipping noise beforehand, so that the detection of the even-index data at the receiver is no longer affected by the odd-index elements. The performance of the pre-distorted ADO system shown in [7]-[9] is competitive with that of other systems. Moreover, the pre-distorted ADO system can process its odd-index and even-index data separately and in parallel at the receiver, which reduces the processing latency and complexity.
However, the authors did not support the pre-distortion method with a mathematical derivation that gives further insight into it. In this paper, we demonstrate that the pre-distorted ADO system can be regarded as a DCO version that has two different power groups, divided into odd and even indices. This result not only helps explain the merits of the pre-distorted ADO system but also provides more insight into its performance.
The remaining parts of this paper are organized as follows. The ADO and pre-distorted ADO systems are described, and their mathematical derivations are provided, in Section II. In Section III, state-of-the-art polar coding is introduced to the system to combat frequency selective fading channels. The construction of the code is based on calculating the input message reliabilities. An interleaver that follows the code is necessary if the CSI is not available. Simulation results are provided in Section IV, and concluding remarks are given in Section V.

II. ADO-OFDM AND PRE-DISTORTED ADO-OFDM
A. ADO-OFDM
Suppose that the ADO system consists of $N$ subcarriers, in which the $N/2$ odd-index subcarriers are used for the ACO branch and the other $N/2$ even-index subcarriers are used for the DCO branch, as shown in Fig. 1. The data $\mathbf{X} = (X_0, X_1, \ldots, X_{N-1})$ are divided into two groups, $\mathbf{X}_{\mathrm{odd}} = (0, X_1, 0, X_3, \ldots, 0, X_{N-1})$ and $\mathbf{X}_{\mathrm{even}} = (X_0, 0, X_2, 0, \ldots, X_{N-2}, 0)$, in the frequency domain. The Hermitian symmetry requirement is $X_{N-k}^{*} = X_k$, for $0 \le k < N/2$. In general, $X_0 = X_{N/2} = 0$ is required in order to avoid the DC-offset.
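The subcarrier split and the Hermitian symmetry constraint can be illustrated with a minimal sketch. It assumes a unitary inverse DFT, a toy size of $N = 8$, and arbitrary QPSK values; the helper names `idft` and `hermitian_frame` are hypothetical and chosen only for this illustration.

```python
import cmath

def idft(X):
    """Unitary inverse DFT (assumed scaling): x_n = (1/sqrt(N)) sum_k X_k e^{+j2*pi*n*k/N}."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * n * k / N) for k in range(N)) / N ** 0.5
            for n in range(N)]

def hermitian_frame(data):
    """Put data on subcarriers 1..N/2-1, mirror the conjugates, keep X_0 = X_{N/2} = 0."""
    N = 2 * (len(data) + 1)
    X = [0j] * N
    for k, d in enumerate(data, start=1):
        X[k] = d
        X[N - k] = d.conjugate()
    return X

# Hypothetical QPSK symbols, giving N = 8 subcarriers.
X = hermitian_frame([1 + 1j, -1 + 1j, 1 - 1j])
X_odd = [X[k] if k % 2 else 0j for k in range(len(X))]   # ACO branch input
X_even = [0j if k % 2 else X[k] for k in range(len(X))]  # DCO branch input

x_aco, x_dco = idft(X_odd), idft(X_even)
# Largest imaginary part over both branches; numerically zero if the outputs are real.
max_imag = max(abs(v.imag) for v in x_aco + x_dco)
```

Both branch outputs are real up to rounding error because each of `X_odd` and `X_even` is itself Hermitian symmetric.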
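Each sample of $\mathbf{x} = (x_0, x_1, \ldots, x_{N-1})$ in the time domain can be obtained from the frequency-domain samples $\mathbf{X}$, and vice versa, by using the $\mathrm{IFFT}_n(\cdot)$ and $\mathrm{FFT}_k(\cdot)$ functions, respectively. They are defined by
$$x_n = \mathrm{IFFT}_n(\mathbf{X}) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X_k e^{j2\pi nk/N}, \qquad X_k = \mathrm{FFT}_k(\mathbf{x}) = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x_n e^{-j2\pi nk/N},$$
for $0 \le n < N$ and $0 \le k < N$. Hereafter, a bold uppercase symbol represents a vector in the frequency domain, and a bold lowercase symbol represents a vector in the time domain. The subscript ''odd'' or ''even'' indicates that only the odd or even components are meaningful. As shown in Fig. 1, $\mathbf{X}_{\mathrm{odd}} = \mathbf{X}_{\mathrm{ACO}}$ passes through the upper ACO branch and $\mathbf{X}_{\mathrm{even}} = \mathbf{X}_{\mathrm{DCO}}$ passes through the lower DCO branch. After passing through the IFFT devices, the time-domain samples are respectively denoted as $x_{n,\mathrm{ACO}}$ and $x_{n,\mathrm{DCO}}$, for $0 \le n < N$, which are real values, but not all positive. Their relationships are
$$x_{n,\mathrm{ACO}} = \mathrm{IFFT}_n(\mathbf{X}_{\mathrm{ACO}}), \qquad x_{n,\mathrm{DCO}} = \mathrm{IFFT}_n(\mathbf{X}_{\mathrm{DCO}}).$$
Since the signals in the odd-index part are required to be non-negative, the clipping process, $x^{+}_{n,\mathrm{ACO}} = \max(0, x_{n,\mathrm{ACO}})$, is necessary. Note that half of the $x^{+}_{n,\mathrm{ACO}}$ are zero because of the negative symmetry [5],
$$x_{n,\mathrm{ACO}} = -x_{n+N/2,\mathrm{ACO}}, \quad 0 \le n < N/2. \tag{3}$$
Let $E[a]$ denote the expectation of the random variable $a$. We set $P_0$ as the average signal power per subcarrier, i.e., $E[|X_k|^2] = P_0$, and hence
$$E[|x_{n,\mathrm{ACO}}|^2] = E[|x_{n,\mathrm{DCO}}|^2] = \frac{P_0}{2}.$$
Two scalars, $2\sqrt{\alpha}$ and $\sqrt{2(1-\alpha)}$, are used for the power assignment so that the average powers per time sample of the ACO branch and the DCO branch have the ratio $\alpha/(1-\alpha)$. The ratio is defined by
$$\frac{E[|2\sqrt{\alpha}\, x^{+}_{n,\mathrm{ACO}}|^2]}{E[|\sqrt{2(1-\alpha)}\, x_{n,\mathrm{DCO}}|^2]} = \frac{\alpha}{1-\alpha}.$$
The ADO signal, $x_{n,\mathrm{ADO}}$, consists of three components, $2\sqrt{\alpha}\, x^{+}_{n,\mathrm{ACO}}$, $\sqrt{2(1-\alpha)}\, x_{n,\mathrm{DCO}}$, and the DC-bias $\sqrt{\mu P_0}$, and is written with a normalization factor $\sqrt{A}$ as
$$x_{n,\mathrm{ADO}} = \frac{1}{\sqrt{A}} \left( 2\sqrt{\alpha}\, x^{+}_{n,\mathrm{ACO}} + \sqrt{2(1-\alpha)}\, x_{n,\mathrm{DCO}} + \sqrt{\mu P_0} \right). \tag{6}$$
The DC-bias $\sqrt{\mu P_0}$ is used to raise most samples so that they are positive for transmission. The expected powers of the three components are in the ratio $\alpha : (1-\alpha) : \mu$, and the normalization factor is
$$A = 1 + \mu + 2\sqrt{\frac{\alpha\mu}{\pi}}. \tag{7}$$
The derivation of (7) is given in Appendix I. A clipping process, $x^{+}_{n,\mathrm{ADO}} = \max(0, x_{n,\mathrm{ADO}})$, is then needed. Assuming that the power of the negative portion of $x_{n,\mathrm{ADO}}$ is negligible, we have
$$E[|x^{+}_{n,\mathrm{ADO}}|^2] \approx E[|x_{n,\mathrm{ADO}}|^2] = P_0.$$
Fig. 1(b) illustrates the receiver operations. The received signal is $y_n = h_n \circledast x^{+}_{n,\mathrm{ADO}} + w_n$, where $h_n$ is the channel impulse response, $\circledast$ denotes circular convolution, and $w_n$ is white Gaussian noise with variance $\sigma^2$, for $0 \le n < N$. Let $\mathbf{y} = (y_0, y_1, \ldots, y_{N-1})$ and
$$Y_k = \mathrm{FFT}_k(\mathbf{y}) = H_k\, \mathrm{FFT}_k(\mathbf{x}^{+}_{\mathrm{ADO}}) + W_k, \tag{9}$$
where $H_k$ and $W_k$ are the frequency-domain channel coefficient and noise. Since $E[|x^{+}_{n,\mathrm{ADO}}|^2] \approx P_0$, we can define the signal-to-noise ratio (SNR) as $P_0/\sigma^2$. By taking the FFT of (6), we have
$$\mathrm{FFT}_k(\mathbf{x}_{\mathrm{ADO}}) = \frac{1}{\sqrt{A}} \left( 2\sqrt{\alpha}\, \mathrm{FFT}_k(\mathbf{x}^{+}_{\mathrm{ACO}}) + \sqrt{2(1-\alpha)}\, X_{k,\mathrm{DCO}} + \sqrt{\mu P_0}\, \mathrm{FFT}_k(\mathbf{1}) \right), \tag{10}$$
where $\mathbf{1} = (1, 1, \ldots, 1)$. Note that only the first term contributes to the components at both the odd-index and the even-index subcarriers; the other terms only affect the even-index subcarriers. We write $\mathrm{FFT}(\mathbf{x}^{+}_{\mathrm{ACO}}) = \mathbf{X}^{+}_{\mathrm{ACO}} = \mathbf{X}^{+}_{\mathrm{odd,ACO}} + \mathbf{X}^{+}_{\mathrm{even,ACO}}$. According to [4] or [10],
$$\mathbf{X}^{+}_{\mathrm{odd,ACO}} = \frac{1}{2} \mathbf{X}_{\mathrm{odd}}. \tag{11}$$
By combining (9), (10) and (11), we have
$$Y_k = \frac{\sqrt{\alpha}}{\sqrt{A}} H_k X_k + W_k, \quad \text{for an odd } k. \tag{12}$$
Hence, for an odd $k$, the estimate is $\hat{X}_k = \sqrt{A}\, Y_k / (\sqrt{\alpha}\, H_k)$. The receiver first demodulates $\hat{\mathbf{X}}_{\mathrm{odd}}$, the estimates of $\mathbf{X}_{\mathrm{odd}}$, by using (12).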
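The two ACO properties used above, the negative symmetry in (3) and the halving of the odd-index components after clipping, can be checked numerically. The sketch below assumes a unitary DFT, a toy size of $N = 8$, and arbitrary QPSK values on the odd subcarriers; the helper `dft` is a hypothetical name for this illustration only.

```python
import cmath

def dft(x, sign):
    """Unitary DFT (assumed scaling); sign=-1 is the forward FFT, sign=+1 the inverse."""
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * n * k / N) for n in range(N)) / N ** 0.5
            for k in range(N)]

N = 8
X_odd = [0j] * N
X_odd[1], X_odd[3] = 1 + 1j, -1 + 1j                 # hypothetical QPSK data
X_odd[7], X_odd[5] = X_odd[1].conjugate(), X_odd[3].conjugate()

x_aco = [v.real for v in dft(X_odd, +1)]             # real because X_odd is Hermitian
x_plus = [max(0.0, v) for v in x_aco]                # clipping at zero
X_plus = dft(x_plus, -1)

# Negative symmetry: x_n = -x_{n+N/2}.
antisym_err = max(abs(x_aco[n] + x_aco[n + N // 2]) for n in range(N // 2))
# Odd-index components of the clipped signal are half of the original data.
odd_err = max(abs(X_plus[k] - X_odd[k] / 2) for k in range(1, N, 2))
# Clipping noise magnitudes, which fall on the even-index subcarriers only.
even_noise = [abs(X_plus[k]) for k in range(0, N, 2)]
```

The halving follows from writing the clipped signal as half the sum of the signal and its absolute value: the absolute value repeats with period $N/2$, so it only occupies even-index subcarriers.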
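The next step is to obtain $\hat{\mathbf{x}}^{+}_{\mathrm{ACO}}$, the estimates of $\mathbf{x}^{+}_{\mathrm{ACO}}$. This can be done by passing $\hat{\mathbf{X}}_{\mathrm{odd}}$ through an IFFT device and a clipping device similar to those used for the ACO branch in the transmitter. The process then demodulates the estimates $\hat{X}_k$, for an even $k$, $k \ne 0$. After canceling the first term in (10), we have
$$Y_k - \frac{2\sqrt{\alpha}}{\sqrt{A}} H_k\, \mathrm{FFT}_k(\hat{\mathbf{x}}^{+}_{\mathrm{ACO}}) = \frac{\sqrt{2(1-\alpha)}}{\sqrt{A}} H_k X_k + W_k \tag{13}$$
for an even $k$. This can be achieved if $\hat{\mathbf{X}}_{\mathrm{odd}}$ is correctly demodulated. At this time, for an even $k$, the estimate $\hat{X}_k$ is obtained by multiplying the left-hand side of (13) by $\sqrt{A} / (\sqrt{2(1-\alpha)}\, H_k)$. According to (13), the demodulated estimates $\hat{\mathbf{X}}_{\mathrm{even}}$ are obtained, for an even $k$.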
In [6], the authors proposed an iterative receiver (IR) structure designed to improve the BER performance of the system. The main concept is to iteratively use the demodulated $\hat{\mathbf{X}}_{\mathrm{odd}}$ $(= \hat{\mathbf{X}}_{\mathrm{ACO}})$ and $\hat{\mathbf{X}}_{\mathrm{even}}$ $(= \hat{\mathbf{X}}_{\mathrm{DCO}})$ to obtain $\hat{x}_{n,\mathrm{ACO}}$ and $\hat{x}_{n,\mathrm{DCO}}$. By clipping the negative part of $\hat{x}_{n,\mathrm{ACO}}$, $\hat{x}^{+}_{n,\mathrm{ACO}}$ is obtained. Two symmetry properties [5] are shown in (3) and in the following,
$$x_{n,\mathrm{DCO}} = x_{n+N/2,\mathrm{DCO}}, \quad 0 \le n < N/2.$$
These properties are applied iteratively to reduce the effect of noise, since the pair $(\hat{x}^{+}_{n,\mathrm{ACO}}, \hat{x}^{+}_{n+N/2,\mathrm{ACO}})$ should be either $(\hat{x}^{+}_{n,\mathrm{ACO}}, 0)$ or $(0, \hat{x}^{+}_{n+N/2,\mathrm{ACO}})$, and the pair $(\hat{x}_{n,\mathrm{DCO}}, \hat{x}_{n+N/2,\mathrm{DCO}})$ should be $(\frac{1}{2}(\hat{x}_{n,\mathrm{DCO}} + \hat{x}_{n+N/2,\mathrm{DCO}}), \frac{1}{2}(\hat{x}_{n,\mathrm{DCO}} + \hat{x}_{n+N/2,\mathrm{DCO}}))$. These pairing effects are the so-called ''pairwise clipping'' and ''pairwise averaging''. A detailed description of the IR structure can be found in [6]. The IR structure can efficiently improve the BER performance when compared to the original receiver structure.
B. PRE-DISTORTED ADO-OFDM
In the pre-distorted ADO system, the ACO branch is similar to that of the original system. Because the clipped signal, $\mathbf{x}^{+}_{\mathrm{ACO}}$, in the ACO branch introduces the interference $\mathbf{X}^{+}_{\mathrm{even,ACO}}$ when the data of the DCO branch are demodulated in the original system, this interference can be dealt with by pre-distortion at the transmitter, in an approach similar to the action in (13). To make the pre-distorted ADO system easier to understand, the power assignment in the DCO branch is performed before the IFFT. As shown in Fig. 2(a), we can eliminate this interference in advance by adding an FFT device after the ACO output. The pre-distortion signals are reproduced after passing through the FFT device and are denoted by $2\sqrt{\beta}\, \mathbf{X}^{+}_{\mathrm{ACO}}$. Only the components at the even subcarriers, $2\sqrt{\beta}\, \mathbf{X}^{+}_{\mathrm{even,ACO}}$, need to be pre-distorted in the DCO branch.
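The output from the IFFT device in the DCO branch, denoted as $x_{n,\mathrm{DCO,pre}}$, is therefore
$$x_{n,\mathrm{DCO,pre}} = \mathrm{IFFT}_n\!\left( \sqrt{2(1-\beta)}\, \mathbf{X}_{\mathrm{even}} - 2\sqrt{\beta}\, \mathbf{X}^{+}_{\mathrm{even,ACO}} \right).$$
By adding the signals from the ACO branch, the pre-distorted DCO branch, and the DC-bias $\sqrt{\zeta P_0}$, the pre-distorted ADO signal with a normalization factor $\sqrt{D}$ is expressed as
$$\begin{aligned} x_{n,\mathrm{ADO,pre}} &= \frac{1}{\sqrt{D}} \left( 2\sqrt{\beta}\, x^{+}_{n,\mathrm{ACO}} + x_{n,\mathrm{DCO,pre}} + \sqrt{\zeta P_0} \right) \\ &= \frac{1}{\sqrt{D}} \left( \mathrm{IFFT}_n\!\left( 2\sqrt{\beta}\, \mathbf{X}^{+}_{\mathrm{odd,ACO}} + \sqrt{2(1-\beta)}\, \mathbf{X}_{\mathrm{even}} \right) + \sqrt{\zeta P_0} \right) \\ &= \frac{1}{\sqrt{D}} \left( \sqrt{\beta}\, x_{n,\mathrm{ACO}} + \sqrt{2(1-\beta)}\, x_{n,\mathrm{DCO}} + \sqrt{\zeta P_0} \right), \end{aligned} \tag{17}$$
where the last line uses $\mathbf{X}^{+}_{\mathrm{odd,ACO}} = \frac{1}{2}\mathbf{X}_{\mathrm{odd}}$. This indicates that the average power ratio for the terms in the last line of (17) is $\beta/2 : (1-\beta) : \zeta$. It is interesting to see from (17) that the pre-distorted ADO system is similar to a DCO system with different powers, $\frac{\beta}{2} P_0$ for the odd-index subcarriers and $(1-\beta) P_0$ for the even-index subcarriers. The normalization factor $\sqrt{D}$ can be shown to satisfy
$$D = 1 - \frac{\beta}{2} + \zeta. \tag{18}$$
The derivation of (18) is given in Appendix II. Let $x^{+}_{n,\mathrm{ADO,pre}} = \max(0, x_{n,\mathrm{ADO,pre}})$. Similarly,
$$E[|x^{+}_{n,\mathrm{ADO,pre}}|^2] \approx E[|x_{n,\mathrm{ADO,pre}}|^2] = P_0.$$
Note that for comparison, we may set $\alpha : (1-\alpha) = \beta/2 : (1-\beta)$. The demodulation at the receiver is less complicated than in the original ADO system. Let $y_{n,\mathrm{pre}} = h_n \circledast x^{+}_{n,\mathrm{ADO,pre}} + w_n$, where $\circledast$ denotes circular convolution. After passing through the FFT illustrated in Fig. 2(b), we obtain from (17)
$$Y_{k,\mathrm{pre}} = \begin{cases} \dfrac{\sqrt{\beta}}{\sqrt{D}} H_k X_k + W_k, & \text{for an odd } k, \\[2ex] \dfrac{\sqrt{2(1-\beta)}}{\sqrt{D}} H_k X_k + W_k, & \text{for an even } k,\ k \ne 0. \end{cases} \tag{20}$$
Accordingly, we can obtain the demodulated estimates $\hat{\mathbf{X}} = \hat{\mathbf{X}}_{\mathrm{odd}} + \hat{\mathbf{X}}_{\mathrm{even}}$ by using (20).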
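The central identity of this section, that the clipped ACO signal plus the pre-distorted DCO signal equals a DCO-like signal built from the unclipped branches, can be checked numerically. The sketch below is an illustration under stated assumptions: a unitary DFT, a toy size of $N = 8$, arbitrary QPSK data, $\beta = 2/3$, and the even-index set taken to include $k = 0$. The normalization by $\sqrt{D}$ and the DC-bias are omitted because they are common to both sides. The helper names are hypothetical.

```python
import cmath

def dft(x, sign):
    """Unitary DFT (assumed scaling); sign=-1 is the forward FFT, sign=+1 the inverse."""
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * n * k / N) for n in range(N)) / N ** 0.5
            for k in range(N)]

N, beta = 8, 2 / 3
X = [0j] * N
for k, d in zip((1, 2, 3), (1 + 1j, -1 - 1j, 1 - 1j)):   # hypothetical QPSK data
    X[k], X[N - k] = d, d.conjugate()
X_odd = [X[k] if k % 2 else 0j for k in range(N)]
X_even = [0j if k % 2 else X[k] for k in range(N)]

x_aco = [v.real for v in dft(X_odd, +1)]
x_dco = [v.real for v in dft(X_even, +1)]
x_plus = [max(0.0, v) for v in x_aco]                    # clipped ACO branch
X_plus = dft(x_plus, -1)                                 # extra FFT after the ACO output
X_plus_even = [0j if k % 2 else X_plus[k] for k in range(N)]

a, b = 2 * beta ** 0.5, (2 * (1 - beta)) ** 0.5
# Pre-distorted DCO branch: subtract the even-index clipping noise before the IFFT.
x_dco_pre = [v.real for v in dft([b * X_even[k] - a * X_plus_even[k] for k in range(N)], +1)]

tx = [a * x_plus[n] + x_dco_pre[n] for n in range(N)]               # first line of the sum
dco_like = [beta ** 0.5 * x_aco[n] + b * x_dco[n] for n in range(N)]  # DCO-style form
mismatch = max(abs(t - d) for t, d in zip(tx, dco_like))
```

A `mismatch` at rounding level shows that, before the common bias and scaling, the transmitted samples contain no clipped ACO term at all, which is why the even-index detection no longer depends on the odd-index decisions.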
We may also use the IR structure to assist with the demodulation. However, the system performance cannot be improved, because $x^{+}_{n,\mathrm{ACO}}$ in the first line of (17) has been replaced by other terms in the last line, and hence the pairing effects are no longer available.

III. CODING FOR MULTIPATH CHANNELS
Assume that the optical multipath channel is frequency selective. The frequency-domain channel coefficients $H_k$ can be returned to the transmitter as the channel side information (CSI) when a feedback channel exists. For an OFDM system where $Y_k = H_k X_k + W_k$ and $W_k$ is white Gaussian noise with variance $\sigma^2$, for $0 \le k < N$, the system may use a water-filling method that assigns more power to the subcarriers with larger values of $|H_k|^2/\sigma^2$ in order to approach the channel capacity. The modulation order of each subcarrier may then differ according to its assigned power, which is known as adaptive modulation. If the CSI is not available, or the channel does not remain unchanged for a sufficient length of time, this mechanism cannot be used. In this paper, we discuss the situation where no CSI is available, i.e., the power parameters $\alpha$ and $\beta$ of the original and the pre-distorted ADO systems are not adjusted, and no additional power allocation is applied to individual subcarriers.
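We need to adopt channel coding for transmission over multipath channels. In this paper, we consider state-of-the-art polar coding [11]. The code is constructed by cascading the basic components, as shown in Fig. 3(a). The codeword is
$$\mathbf{v} = \mathbf{u}\, \mathbf{G}_{2^t},$$
where $\mathbf{v} = (v_0, v_1, \ldots, v_{2^t-1})$ is the codeword of length $2^t$, $\mathbf{u} = (u_0, u_1, \ldots, u_{2^t-1})$ is the message, and $\mathbf{G}_{2^t}$ is the generator matrix of size $2^t \times 2^t$, which can be recursively derived from the following equations.
$$\mathbf{G}_2 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \qquad \mathbf{G}_{2^t} = \mathbf{G}_2 \otimes \mathbf{G}_{2^{t-1}},$$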
where $\otimes$ denotes the Kronecker product. The concept of polar codes is to polarize $\mathbf{u}$ into two parts, in which the reliable part carries information and the frozen part carries no information. Density evolution [12], Gaussian approximation [13] and Monte Carlo [14] methods can help determine the two parts. As indicated in Fig. 3(a), if the channel $\pi_k : X_k \to Y_k$ is assumed to be a binary-input AWGN channel with variance $\sigma^2$, then
$$\pi_k(Y_k \mid X_k) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\frac{(Y_k - X_k)^2}{2\sigma^2} \right).$$
Let $\{X_k \mid X_k = 1 - 2v_k\}$ be the binary mapping. The reliability of $v_k$ is defined as the log-likelihood ratio (LLR) by
$$L(v_k) = \ln \frac{\pi_k(Y_k \mid v_k = 0)}{\pi_k(Y_k \mid v_k = 1)} = \frac{2 Y_k}{\sigma^2}.$$
One can show that $L(v_k) \sim \mathcal{N}(2/\sigma^2, 4/\sigma^2)$ is Gaussian distributed with mean $2/\sigma^2$ and variance $4/\sigma^2$ when $X_k = 1$. The Gaussian approximation method [13] assumes that an all-zero codeword is encoded. As illustrated in Fig. 3(a), we can derive the reliabilities of $u_0$ and $u_1$ via belief propagation, i.e.,
$$L(u_0) = 2 \tanh^{-1}\!\left( \tanh\frac{L(v_0)}{2} \tanh\frac{L(v_1)}{2} \right), \tag{24}$$
$$L(u_1) = L(v_0) + L(v_1), \tag{25}$$
where the reliabilities, $L(u_0)$ and $L(u_1)$, can be tightly approximated by two Gaussian random variables. Now that $Y_k = H_k X_k + W_k$, an OFDM system can apply a one-tap zero-forcing equalizer to obtain $\tilde{Y}_k = X_k + W_k/H_k$, and the channel is redefined as $\tilde{\pi}_k\ (\equiv \pi_k/H_k) : X_k \to \tilde{Y}_k$. Suppose that $X_k$ is an $m$-bit signal (i.e., an $M$-ary signal with $M = 2^m$), for $0 < k < N/2$. The code length should be punctured from $2^t$ to $m \cdot (N/2 - 1)$ by following the puncturing procedures [15], such that $\mathbf{v}_p = (v_{p,0}, v_{p,1}, \ldots, v_{p,m\cdot(N/2-1)-1})$ is the punctured version of $\mathbf{v}$. Since $X_k$ is composed of the $m$ bits $(v_{p,(k-1)\cdot m}, v_{p,(k-1)\cdot m+1}, \ldots, v_{p,k\cdot m-1})$, the reliability of each of these bits $v_{p,\ell}$ is given by (26). Taking (26) as the initial values and applying (24) and (25) recursively, we can calculate the input reliabilities for all $u_i$, and assign the message bits to the positions with higher reliabilities and the frozen bits to the positions with lower reliabilities for an OFDM system. An example of an (8, 4) polar code with message bits $\{u_3, u_5, u_6, u_7\}$ and frozen bits $\{u_0, u_1, u_2, u_4\}$, using QPSK modulation, is illustrated in Fig. 3(b).
However, the code is built with knowledge of the CSI and without adaptive modulation in this case.
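The encoding rule and the reliability-based choice of message bits can be sketched for the (8, 4) example. The sketch makes several assumptions that are not taken from the source: the generator matrix is the plain Kronecker power without bit reversal, all channels are identical AWGN channels with $\sigma^2 = 1$ (no CSI, no puncturing), and the function $\phi$ of the Gaussian approximation is replaced by a commonly used two-piece closed-form approximation inverted by bisection. The function names are hypothetical.

```python
import math

def polar_encode(u):
    """v = u G over GF(2), with G the Kronecker power of [[1,0],[1,1]] (no bit reversal)."""
    n = len(u)
    if n == 1:
        return list(u)
    half = n // 2
    upper = polar_encode([a ^ b for a, b in zip(u[:half], u[half:])])
    lower = polar_encode(u[half:])
    return upper + lower

def phi(x):
    """Assumed closed-form approximation of the Gaussian-approximation function phi."""
    if x <= 0:
        return 1.0
    if x < 10:
        return min(1.0, math.exp(-0.4527 * x ** 0.86 + 0.0218))
    return math.sqrt(math.pi / x) * math.exp(-x / 4) * (1 - 10 / (7 * x))

def phi_inv(y, hi=1000.0):
    """Invert phi by bisection (phi is decreasing in x apart from a small step at x = 10)."""
    lo = 0.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if phi(mid) > y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def ga_means(m0, t):
    """Mean LLR of each u_i after t polarization stages, starting from mean m0."""
    means = [m0]
    for _ in range(t):
        nxt = []
        for m in means:
            nxt.append(phi_inv(1 - (1 - phi(m)) ** 2))  # check-node update (less reliable)
            nxt.append(2 * m)                           # sum update (more reliable)
        means = nxt
    return means

sigma2 = 1.0                                  # assumed noise variance
means = ga_means(2 / sigma2, 3)               # initial mean 2/sigma^2, code length 8
info_set = sorted(sorted(range(8), key=lambda i: means[i])[-4:])

u = [0] * 8
for i in info_set:
    u[i] = 1                                  # hypothetical all-ones message
codeword = polar_encode(u)
```

With these assumptions the four most reliable positions come out as the indices 3, 5, 6 and 7, in line with the (8, 4) example above.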
In [16] and [17], the authors proposed placing an interleaver after the generated coded bits $v_i$. The interleaver averages the effect of frequency selective fading, as if all the channels were roughly the same. When the code is constructed in this way, it is no longer necessary to consider the CSI in (26). The code still performs sufficiently well and is suitable for both the ADO and the pre-distorted ADO systems.
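A random interleaver of the kind described above can be sketched as a fixed pseudo-random permutation shared by the transmitter and the receiver. The seed, the block length of 126 bits, and the function names are assumptions made for this illustration; the interleaver designs in [16] and [17] may differ.

```python
import random

def make_interleaver(length, seed=0):
    """Build a fixed pseudo-random permutation known to both ends of the link."""
    rng = random.Random(seed)
    perm = list(range(length))
    rng.shuffle(perm)
    return perm

def interleave(bits, perm):
    """Output position i carries input bit perm[i]."""
    return [bits[p] for p in perm]

def deinterleave(values, perm):
    """Undo interleave(): put the value at position i back at index perm[i]."""
    out = [None] * len(perm)
    for pos, p in enumerate(perm):
        out[p] = values[pos]
    return out

coded_bits = [i % 2 for i in range(126)]      # hypothetical coded block
perm = make_interleaver(len(coded_bits), seed=7)
restored = deinterleave(interleave(coded_bits, perm), perm)
```

At the receiver, the same `deinterleave` step would be applied to the soft values (LLRs) before decoding, so that neighbouring code bits no longer share the same faded subcarrier.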

IV. SIMULATION RESULTS
The original ADO parameters used in the simulation are $N = 128$, $m = 2$ (QPSK), $\alpha = 0.5$ and $\mu = 2$ or $3$, such that the system assigns equal average powers to the time-domain samples of both branches. In Fig. 4, for the AWGN channel ($H_k = 1$ for all $k$), the uncoded BER of the original ACO branch is 3 dB worse than that of the original DCO branch in the high SNR $= P_0/\sigma^2$ region. This is reasonable because $\mathrm{SNR}_{\mathrm{ACO}} = \frac{\alpha}{A}\frac{P_0}{\sigma^2}$ and $\mathrm{SNR}_{\mathrm{DCO}} = \frac{2(1-\alpha)}{A}\frac{P_0}{\sigma^2}$ according to (12) and (13), while $\hat{\mathbf{X}}_{\mathrm{odd}}$ is determined almost entirely correctly at high SNR values. Since $\hat{\mathbf{X}}_{\mathrm{odd}}$ has a higher error rate at low to medium SNR levels, which affects the successive interference cancellation in (13), the performance gain of the original DCO branch is less than 3 dB there. For comparison, the pre-distorted ADO parameters are set to $\beta = 2/3$ and $\zeta = 2$, such that the system can be regarded as a DCO system that assigns equal average powers to the time-domain samples of both the odd-index (ACO) and the even-index (DCO) branches in (17). Compared to the original ADO system, the uncoded BER of the pre-distorted ADO version is better because of the correct pre-cancellation of the interference and its corresponding branch power ratio.
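The per-branch SNR offsets discussed above can be evaluated with a short calculation. The branch expressions $\alpha/A$ and $2(1-\alpha)/A$ are those quoted in the text; the closed forms used for the normalization factors, $A = 1 + \mu + 2\sqrt{\alpha\mu/\pi}$ and $D = 1 - \beta/2 + \zeta$, and the pre-distorted branch scalings $\beta/D$ and $2(1-\beta)/D$ are assumptions of this sketch, obtained by normalizing the average sample power to $P_0$. The function names are hypothetical.

```python
import math

def to_db(x):
    return 10 * math.log10(x)

def ado_branch_offsets_db(alpha, mu):
    """SNR offsets (relative to P0/sigma^2) of the ACO and DCO branches, original ADO."""
    A = 1 + mu + 2 * math.sqrt(alpha * mu / math.pi)   # assumed closed form of A
    return to_db(alpha / A), to_db(2 * (1 - alpha) / A)

def predistorted_branch_offsets_db(beta, zeta):
    """SNR offsets of the odd- and even-index subcarriers, pre-distorted ADO."""
    D = 1 - beta / 2 + zeta                            # assumed closed form of D
    return to_db(beta / D), to_db(2 * (1 - beta) / D)

aco_db, dco_db = ado_branch_offsets_db(alpha=0.5, mu=2)
gap_original = dco_db - aco_db                 # DCO-branch advantage with alpha = 0.5

odd_db, even_db = predistorted_branch_offsets_db(beta=2 / 3, zeta=2)
gap_predistorted = even_db - odd_db            # both groups equal with beta = 2/3

odd_half_db, even_half_db = predistorted_branch_offsets_db(beta=0.5, zeta=2)
gap_beta_half = even_half_db - odd_half_db     # power ratio 1 : 2 with beta = 1/2
```

The gaps depend only on the ratios $2(1-\alpha)/\alpha$ and $2(1-\beta)/\beta$, so they do not rely on the assumed forms of $A$ and $D$; the absolute offsets do.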
To gain additional insight, the parameters of the pre-distorted ADO system are set to $\beta = 1/2$ and $\zeta = 2$, such that the powers of the ACO branch and the DCO branch are in a ratio of 1 to 2. As we can see in Fig. 5, the BER of the ACO branch is 3 dB worse than that of the DCO branch at high SNR values. The BER of the DCO branch with $\beta = 1/2$ is better than that with $\beta = 2/3$ because of the different assigned powers. Nevertheless, the average BER of the system with $\beta = 1/2$ is worse than that with $\beta = 2/3$. In Fig. 6, the performance obtained with an iterative receiver [6] is provided. The IR structure improves the BER performance by about 2 dB. We then consider operating the systems over a frequency selective fading channel with four paths whose powers decay exponentially. That is, the $i$-th path in the time domain is Rayleigh distributed, for $0 \le i \le 3$, with a variance that decays exponentially with $i$. A 6-bit CRC-aided $(N, K) = (128, 58)$ polar code is randomly interleaved and constructed based on its operating SNR value. QPSK modulation is assumed at each subcarrier. Since $X_0$ does not carry information, a 6-bit CRC-aided (126, 58) polar code based on two-bit puncturing is used for performance evaluation. The coded performance based on successive cancellation list-8 (SCL-8) decoding is shown in Fig. 7 in terms of the block error rate (BLER). The IR structure cannot be applied to the original system here, since the pairing methods are disturbed by the multipath channel and are no longer available. The pre-distorted system is shown to be about 3.5 dB better than the original system.

V. CONCLUDING REMARKS
In this paper, we have shown that the pre-distorted ADO system can be regarded as a DCO system that has different power assignments on the odd-index and even-index subcarriers. This view is supported by both derivation and simulation, and the BER performance of the pre-distorted ADO system has been shown to be better than that of the original ADO system. For comparison, we also adopted an IR structure in order to improve the performance of the original ADO system. The advantage of the pre-distorted ADO system is that it cancels in advance the interference on the even-index subcarriers, which in the original ADO system has to be handled at the receiver. To combat a multipath channel, a polar code with an interleaver, constructed for a fixed modulation order and no available CSI, is adopted. If the CSI is available, a mechanism using power allocation and adaptive modulation can be exploited, which is left for future work.

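APPENDIX I
DERIVATION OF THE NORMALIZATION FACTOR $\sqrt{A}$
From the central limit theorem, when $N$ is large enough, either component $x_{n,\mathrm{ACO}}$ or $x_{n,\mathrm{DCO}}$ can be approximately regarded as a Gaussian variable with mean zero and variance $\rho^2 = P_0/2$, for $0 \le n < N$. Since $x^{+}_{n,\mathrm{ACO}}$ is the clipped version of $x_{n,\mathrm{ACO}}$, its mean, second moment and variance can be calculated as
$$E[x^{+}_{n,\mathrm{ACO}}] = \frac{\rho}{\sqrt{2\pi}}, \qquad E[(x^{+}_{n,\mathrm{ACO}})^2] = \frac{\rho^2}{2}, \qquad \mathrm{Var}[x^{+}_{n,\mathrm{ACO}}] = \frac{\rho^2}{2} - \frac{\rho^2}{2\pi}.$$
Since the factor $\sqrt{A}$ in (6) normalizes $E[|x_{n,\mathrm{ADO}}|^2]$ to $P_0$, (7) is obtained by using (6) and (A.3).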
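The moments of a zero-mean Gaussian variable clipped at zero can be checked by numerical integration. The sketch below uses a midpoint rule over the positive half-line with $P_0 = 1$, so $\rho^2 = 1/2$; the step count, the integration span and the function name are choices made for this illustration.

```python
import math

def clipped_gaussian_moments(rho, steps=200_000, span=12.0):
    """Midpoint-rule estimates of E[x+] and E[(x+)^2] for x ~ N(0, rho^2), x+ = max(0, x)."""
    h = span * rho / steps
    m1 = m2 = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        pdf = math.exp(-x * x / (2 * rho * rho)) / (rho * math.sqrt(2 * math.pi))
        m1 += x * pdf * h          # only x > 0 contributes, since x+ = 0 otherwise
        m2 += x * x * pdf * h
    return m1, m2

P0 = 1.0
rho = math.sqrt(P0 / 2)
mean_num, second_num = clipped_gaussian_moments(rho)
mean_closed = rho / math.sqrt(2 * math.pi)     # closed-form mean
second_closed = rho ** 2 / 2                   # closed-form second moment
```

With these values, the clipped ACO samples scaled by $2\sqrt{\alpha}$ carry power $4\alpha \cdot P_0/4 = \alpha P_0$, which is the ACO share of the power ratio used in Section II.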