Reconfigurable Signal Processing and Hardware Architecture for Broadband Wireless Communications

This paper proposes a broadband wireless transceiver that can be reconfigured to any type of cyclic-prefix (CP)-based communication system, including orthogonal frequency-division multiplexing (OFDM), single-carrier cyclic-prefix (SCCP), multicarrier code-division multiple access (MC-CDMA), multicarrier direct-sequence CDMA (MC-DS-CDMA), CP-based CDMA (CP-CDMA), and CP-based direct-sequence CDMA (CP-DS-CDMA). A hardware platform is proposed and the reusable common blocks in such a transceiver are identified. The emphasis is on the equalizer design for mobile receivers. It is found that, after the block despreading operation, MC-DS-CDMA and CP-DS-CDMA have the same equalization blocks as OFDM and SCCP systems, respectively; hardware and software sharing is therefore possible for these systems. The functional reconfigurable transceiver is also mapped onto the proposed hardware platform, and the functional entities required to perform the reconfiguration and realize the transceiver are explained.


This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

INTRODUCTION
A number of wireless standards govern personal wireless communications: to name a few, GSM and CDMA for 2G cellular networks; WCDMA, CDMA-2000, and TD-SCDMA for 3G cellular networks; IEEE 802.11a/b/g for wireless local area networks; IEEE 802.16 for wireless metropolitan area networks (WiMAX); and IEEE 802.15 for personal area networks (PAN). To satisfy customers' need for mobility, radio transceivers should not be tied to any specific network. Software-defined radios (SDRs) [1] and cognitive radios (CRs) [2] are well-suited implementations of such network-independent radios.
Different frame formats and modulation schemes have been proposed for different networks. For example, orthogonal frequency-division multiplexing (OFDM) [3] has been adopted in IEEE 802.11a, IEEE 802.11g, and IEEE 802.16. The single-carrier cyclic-prefix (SCCP) system is selected for IEEE 802.16 as well. Cyclic-prefix (CP)-based CDMA systems are considered for beyond-3G systems. For example, CP-CDMA and CP-based direct-sequence CDMA (CP-DS-CDMA) are candidate schemes for enhanced versions of 3G DS-CDMA systems [4,5,6,7]; multicarrier CDMA (MC-CDMA) and multicarrier direct-sequence CDMA (MC-DS-CDMA) [8] are being extensively studied for potential adoption in beyond 3G (B3G) or fourth-generation (4G) cellular systems [10,21]. The common feature of these systems is that they are block-based transmission systems in which a CP portion is inserted into each data block to suppress the interblock interference (IBI) and to simplify the receiver design. SDR structures and implementations have been explored in different ways in [11,12,13,14,15,16,17,18,19].
In this paper, we propose a universal mobile transceiver that is configurable to any type of CP-based system. We present a hardware platform for the proposed transceiver and identify the reusable common blocks. In particular, a study of the equalizer design at the mobile terminal is carried out for each system. Single-user environments are considered for OFDM and SCCP systems. For CDMA multiuser systems, we target a receiver architecture for the downlink only. It is found that, after the block despreading operation, MC-DS-CDMA and CP-DS-CDMA have the same equalization blocks as OFDM and SCCP systems, respectively, showing that hardware and software sharing is possible for these systems. More specifically, denoting N and G as the block size and processing gain, respectively, these two systems perform signal detection of N symbols over G consecutive time blocks, whereas MC-CDMA and CP-CDMA require only one time block to perform signal detection. The receiver complexity of MC-CDMA and CP-CDMA can, however, be much higher if near-ML detection is required [5], because N-dimensional signal detection must be carried out within each time block.
In [11], a reconfigurable architecture for multicarrier-based CDMA systems is proposed. Specifically, the systems discussed there include only the MC-CDMA, MC-DS-CDMA, and MT-CDMA schemes of [8]. The architecture proposed in our paper is more generic in the sense that it can be reconfigured to support both multicarrier and single-carrier systems, including SCCP, CP-CDMA, and CP-DS-CDMA. Furthermore, we propose a hardware platform to support the reconfigurable architecture. The functional reconfigurable transceiver is mapped onto the proposed hardware platform, and the functional entities required to perform the reconfiguration and realize the transceiver are explained.
This paper is organized as follows. In Section 2, the system models are described for various CP-based systems. In Section 3, the linear MMSE receivers are specified for the different systems. Section 4 proposes the reconfigurable transceiver, details the configurations for each system, and analyzes the complexity of the equalizer for each system. The hardware architecture for the proposed transceiver is presented in Section 5. Finally, conclusions are drawn in Section 6.

SYSTEM DESCRIPTION
In this section, we review the input-output relations for the CP-based systems, which serve as the basis for the design of reconfigurable signal processing and hardware architecture.
The transmission is on a block-by-block basis, with each block consisting of a CP sub-block of P symbols and a data sub-block of N symbols. Thanks to the use of the CP, the channel matrix is a circulant matrix, which can be represented as $H = W^H \Lambda W$, where $W$ is the N-point DFT matrix and $\Lambda = \mathrm{diag}\{\lambda_1, \ldots, \lambda_N\}$ is a diagonal matrix whose elements are the channel's frequency-domain responses on the individual subcarriers.
For multicarrier systems, the data symbol sub-block is pretransformed using the N-point IDFT matrix $W^H$, whereas for single-carrier systems the data symbol sub-block is sent out directly. At the receiver side, both single-carrier and multicarrier systems first transform the received data block with the DFT matrix $W$; the output is then used to recover the transmitted signals via either linear or nonlinear receivers.
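As an illustrative check of the circulant-channel property and of the two transmit options described above, the following Python sketch sends one block through a CP-protected multipath channel and compares the FFT output with the frequency-domain model. The helper names (`dft`, `idft`, `channel_with_cp`, `freq_response`), the three-tap channel, the test symbols, and the unitary DFT scaling are assumptions made for this example only; noise is omitted.

```python
import cmath

def dft(x):
    """Unitary N-point DFT (multiplication by W)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n)) / n ** 0.5
            for k in range(n)]

def idft(x):
    """Unitary N-point IDFT (multiplication by W^H)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n)) / n ** 0.5
            for m in range(n)]

def channel_with_cp(block, taps):
    """Prepend a CP of len(taps)-1 symbols, convolve with the channel, drop the CP."""
    n, p = len(block), len(taps) - 1
    tx = list(block[n - p:]) + list(block)
    return [sum(taps[l] * tx[p + m - l] for l in range(len(taps))) for m in range(n)]

def freq_response(taps, n):
    """Diagonal of Lambda: lambda_k = sum_l h_l * exp(-j 2 pi k l / N)."""
    return [sum(h * cmath.exp(-2j * cmath.pi * k * l / n) for l, h in enumerate(taps))
            for k in range(n)]

s = [1, -1, 1j, -1j, 1, 1, -1, -1j]      # one data sub-block (N = 8), hypothetical symbols
taps = [1.0, 0.5, 0.25j]                  # hypothetical channel, memory 2 -> CP length P = 2
lam = freq_response(taps, len(s))

# Multicarrier (OFDM): IDFT at the transmitter, DFT at the receiver -> y = Lambda s
y_ofdm = dft(channel_with_cp(idft(s), taps))
# Single carrier (SCCP): symbols sent directly, DFT at the receiver -> y = Lambda W s
y_sccp = dft(channel_with_cp(s, taps))

err_ofdm = max(abs(y - l * x) for y, l, x in zip(y_ofdm, lam, s))
err_sccp = max(abs(y - l * x) for y, l, x in zip(y_sccp, lam, dft(s)))
print(err_ofdm, err_sccp)
```

Both error values should sit at floating-point rounding level, which is the sense in which the CP turns the channel into a per-subcarrier multiplication.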
For the multiuser CDMA case, spreading and despreading are added in the transmitter and receiver, respectively. Since the transceiver design for the mobile terminal is of interest, we concentrate on the receiver architecture for the downlink only in the CDMA multiuser case. For OFDM and SCCP systems, single-user environments are considered.

OFDM
For OFDM systems, the input-output relation at the nth time block after FFT can be expressed as [3]

$$y(n) = \Lambda s(n) + u(n),$$

where $s(n)$ is the N × 1 input signal vector, $y(n)$ is the N × 1 FFT output of the received signal vector, and $u(n)$ is the N × 1 FFT output of the AWGN vector.

SCCP systems
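For SCCP systems, the input-output relation at the nth block after FFT can be expressed as [20]

$$y(n) = \Lambda W s(n) + u(n),$$

where $s(n)$, $y(n)$, and $u(n)$ are defined as in OFDM systems.
Next, we look at four CDMA-based systems: MC-DS-CDMA, MC-CDMA, CP-CDMA, and CP-DS-CDMA. Suppose all users have the same processing gain G. We introduce the following common notation for all systems: $T$ for the total number of users; $D(n) = \mathrm{diag}\{d(n;0), \ldots, d(n;N-1)\}$ for the long scrambling codes at the nth block; and $c_i$ for the short code of user $i$, with $c_i^H c_j = 0$ for $i \neq j$.

MC-DS-CDMA
MC-DS-CDMA performs block spreading as follows.
(1) The N symbols of each user are spread with that user's specific spreading code.
(2) The chip signals corresponding to the same symbol index from all users are summed and transmitted over the same subcarrier but in different time blocks. Thus it takes G consecutive blocks to transmit the whole chip sequences of the N symbols.
Mathematically, the nth received block after the FFT operation can be written as

$$y(n) = \Lambda D(n) \sum_{i=0}^{T-1} c_i(n)\, s_i + u(n), \quad n = 0, \ldots, G-1,$$

where $s_i = [s_i(0), \ldots, s_i(N-1)]^T$ is the symbol vector of user $i$, $c_i(n)$ is the nth element of $c_i$, and $u(n)$ is the FFT output of the AWGN vector. On the other hand, gathering the received signals of the kth subcarrier from the 0th block to the (G − 1)th block yields

$$y(k) = \lambda_k D(k) \sum_{i=0}^{T-1} c_i\, s_i(k) + u(k),$$

where $y(k) = [y(0;k), \ldots, y(G-1;k)]^T$ stacks the kth subcarrier outputs of the G blocks, $\lambda_k$ is the channel response of the kth subcarrier, and $D(k) = \mathrm{diag}\{d(0;k), \ldots, d(G-1;k)\}$. Despreading $y(k)$ using $D(k) c_i$, with unit-modulus scrambling codes and unit-norm short codes, we obtain

$$z_i(k) = [D(k) c_i]^H y(k) = \lambda_k s_i(k) + u_i(k).$$

Thus we have

$$z_i = \Lambda s_i + u_i, \tag{8}$$

where $z_i = [z_i(0), \ldots, z_i(N-1)]^T$, and $u_i$ is defined accordingly. Therefore, after block despreading, an MC-DS-CDMA system is equivalent to an OFDM system, and it does not experience any MAI or ISI.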
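The block spreading and despreading steps above can be sketched numerically. The Python example below is a minimal, noise-free illustration under stated assumptions: two users, length-4 unit-norm orthogonal short codes, ±1 scrambling chips, and arbitrary per-subcarrier channel gains, all of which are hypothetical values chosen for the example, as are the function names `spread_blocks` and `block_despread`.

```python
def spread_blocks(symbols, codes, scramble, lam):
    """Return the G noise-free FFT-domain blocks y(n)[k] of the downlink signal.

    symbols[i][k]  : k-th symbol of user i (N symbols per user)
    codes[i][n]    : n-th chip of the short code of user i (length G)
    scramble[n][k] : long scrambling chip d(n; k)
    lam[k]         : channel response of subcarrier k
    """
    G, N = len(codes[0]), len(lam)
    return [[lam[k] * scramble[n][k] * sum(c[n] * s[k] for c, s in zip(codes, symbols))
             for k in range(N)]
            for n in range(G)]

def block_despread(blocks, code, scramble):
    """Despread each subcarrier over the G blocks with the scrambled short code."""
    G, N = len(blocks), len(blocks[0])
    return [sum((scramble[n][k] * code[n]).conjugate() * blocks[n][k] for n in range(G))
            for k in range(N)]

# Hypothetical example values (G = 4 chips, N = 4 subcarriers, T = 2 users)
codes = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]   # orthogonal, unit norm
scramble = [[1, -1, 1, 1], [-1, 1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
lam = [1.0 + 0.5j, 0.3 - 0.2j, -0.8j, 0.6 + 0.0j]
symbols = [[1, -1, 1j, -1j], [-1, -1, 1, 1j]]

blocks = spread_blocks(symbols, codes, scramble, lam)
z0 = block_despread(blocks, codes[0], scramble)
# After despreading, user 0 sees an OFDM-like model: z0[k] = lam[k] * symbols[0][k]
mai_free_error = max(abs(z - l * s) for z, l, s in zip(z0, lam, symbols[0]))
print(mai_free_error)
```

The residual should be at rounding level, showing that the other user's signal is removed by the orthogonal codes when the channel is constant over the G blocks.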

MC-CDMA
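MC-CDMA performs spreading in the frequency domain. Denote $T$ as the total number of users and let $M = N/G$ be a positive integer. The nth received block after FFT can be written as

$$y(n) = \Lambda D(n) C s(n) + u(n), \tag{9}$$

where $C = \mathrm{diag}\{\bar{C}, \ldots, \bar{C}\}$ is the block-diagonal spreading matrix built from M copies of $\bar{C} = [c_0, \ldots, c_{T-1}]$, $s(n) = [s_1^T(n), \ldots, s_M^T(n)]^T$ stacks the M sub-blocks of user symbols, and $u(n)$ is the FFT output of the AWGN vector. Dividing $y(n)$ into M nonoverlapping short column vectors, each with G elements, (9) can be decoupled as

$$y_i(n) = \Lambda_i D_i(n) \bar{C} s_i(n) + u_i(n) \tag{12}$$

for $i = 1, \ldots, M$, where $\Lambda_i$ and $D_i(n)$ are the ith sub-blocks of $\Lambda$ and $D(n)$. From (12), MC-CDMA experiences MAI, but not ISI.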

CP-CDMA
CP-CDMA is the single-carrier dual of MC-CDMA. The $M = N/G$ symbols of each user are first spread with user-specific spreading codes, and the chip sequences of all users are then summed; the total chip signal of size N is passed to the CP inserter, which adds a CP. Using the duality between CP-CDMA and MC-CDMA, from (12), the nth received block of CP-CDMA after FFT can be written as [5]

$$y(n) = \Lambda W D(n) C s(n) + u(n),$$

where $C$ and $s(n)$ denote the spreading matrix and the symbol vector, as in MC-CDMA. From the above, it is seen that CP-CDMA experiences both MAI and ISI.

CP-DS-CDMA
CP-DS-CDMA is the single-carrier dual of MC-DS-CDMA. It performs block spreading as follows: the N symbols of each user are first spread with the user's own spreading code, and the chip sequences of all users are then summed; the total chip signals corresponding to different chip indices are transmitted in different time blocks, so it takes G blocks to send out the whole chip sequence of the N symbols.
Simplification is available if the same set of long scrambling codes is chosen for blocks 0 to (G − 1), or if long scrambling codes are not used ($D(n) = I$). We consider the case in which long scrambling codes are not used. Collecting the kth subcarrier's received signals from the 0th to the (G − 1)th block, we obtain

$$y(k) = \lambda_k \sum_{i=0}^{T-1} c_i\, W(k,:)\, s_i + u(k),$$

where $W(k,:)$ denotes the kth row of matrix $W$, $s_i$ is the N × 1 symbol vector of user $i$, and $y(k) = [y(0;k), \ldots, y(G-1;k)]^T$. Using $c_i$ (taken to have unit norm) to perform despreading, we obtain

$$z_i(k) = c_i^H y(k) = \lambda_k W(k,:)\, s_i + u_i(k).$$

Thus we have

$$z_i = \Lambda W s_i + u_i, \tag{18}$$

which is the single-carrier dual of the MC-DS-CDMA model (8). From (18), it is seen that CP-DS-CDMA does not experience MAI, but it does have ISI.

LINEAR EQUALIZERS
We first look at the generic multiple-input multiple-output (MIMO) channel model. Suppose the MIMO channel is described by the following input-output relation:

$$x = H s + n, \tag{19}$$

where $s$, $x$, and $n$ denote the transmitted signal vector, received signal vector, and received noise vector, respectively; $H$ is the channel matrix, representing the responses from the transmit antennas to the receive antennas. Without loss of generality, we assume that $s$, $x$, and $n$ are all N × 1 vectors, and $H$ is an N × N matrix. We further assume that $E[ss^H] = \sigma_s^2 I$, $E[nn^H] = \sigma_n^2 I$, and $E[sn^H] = 0$. Denote $P$ as the linear equalizer for the channel (19), which generates the output

$$z = P^H x.$$

Then, the minimum mean square error (MMSE) equalizer is given by

$$P = \sigma_s^2 \left[\sigma_s^2 H H^H + \sigma_n^2 I\right]^{-1} H.$$

Now, we are ready to specify the linear receivers for each of the CP-based systems.
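For readers who want the intermediate step, the MMSE equalizer for the generic model $x = Hs + n$ can be obtained from the orthogonality principle. The derivation below is a standard sketch rather than a reproduction of the authors' own steps; it assumes white, mutually uncorrelated signal and noise with variances $\sigma_s^2$ and $\sigma_n^2$, and an equalizer applied as $z = P^H x$.

```latex
% Orthogonality principle: the error s - P^H x is uncorrelated with the observation x
\begin{aligned}
E\!\left[(s - P^H x)\,x^H\right] = 0
  \;&\Longrightarrow\; P^H = E[s x^H]\,\bigl(E[x x^H]\bigr)^{-1},\\
E[s x^H] &= \sigma_s^2 H^H, \qquad
E[x x^H] = \sigma_s^2 H H^H + \sigma_n^2 I,\\
P &= \sigma_s^2\left[\sigma_s^2 H H^H + \sigma_n^2 I\right]^{-1} H .
\end{aligned}
% Special case H = \Lambda (diagonal): the equalizer decouples into one tap per subcarrier
\begin{aligned}
p_k = \frac{\sigma_s^2\,\lambda_k}{\sigma_s^2\,|\lambda_k|^2 + \sigma_n^2},
\qquad z_k = p_k^{*}\, y_k .
\end{aligned}
```

The diagonal special case is what makes a one-tap equalizer sufficient whenever the effective channel after the FFT (and block despreading, where applicable) is $\Lambda$ or $\Lambda W$.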

MC-DS-CDMA/OFDM
An MC-DS-CDMA system after block despreading is equivalent to an OFDM system. Thus both can employ the same one-tap equalizer to retrieve the transmitted signals: $(\sigma_s^2 [\sigma_s^2 |\Lambda|^2 + \sigma_n^2 I]^{-1} \Lambda)^H$. Note that MC-DS-CDMA performs symbol-level equalization.
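A one-tap MMSE equalizer of this kind reduces to one complex multiplication per subcarrier. The following Python sketch illustrates it on a noise-free OFDM-type model; the function name `one_tap_mmse`, the channel gains, and the variance values are assumptions chosen for illustration.

```python
def one_tap_mmse(y, lam, sigma_s2, sigma_n2):
    """Apply z_k = conj(p_k) * y_k, with
    p_k = sigma_s2 * lam_k / (sigma_s2 * |lam_k|^2 + sigma_n2)."""
    out = []
    for y_k, lam_k in zip(y, lam):
        p_k = sigma_s2 * lam_k / (sigma_s2 * abs(lam_k) ** 2 + sigma_n2)
        out.append(p_k.conjugate() * y_k)
    return out

lam = [1.0 + 0.5j, 0.3 - 0.2j, -0.8j, 0.05 + 0.0j]   # hypothetical subcarrier gains
s = [1 + 0j, -1 + 0j, 1j, -1j]                        # hypothetical symbols
y = [l * x for l, x in zip(lam, s)]                   # noise-free model y = Lambda s

z_zf = one_tap_mmse(y, lam, 1.0, 0.0)     # sigma_n2 = 0 gives the zero-forcing limit
z_mmse = one_tap_mmse(y, lam, 1.0, 0.1)   # with noise, weak subcarriers are attenuated
```

With a nonzero noise variance, each output equals the symbol scaled by $|\lambda_k|^2 / (|\lambda_k|^2 + \sigma_n^2/\sigma_s^2)$, so deeply faded subcarriers are de-emphasized instead of amplified.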

MC-CDMA
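For MC-CDMA, the signal separation is done for each sub-block. The channel matrix for the ith sub-block of the nth block is

$$H_i = \Lambda_i D_i(n) \bar{C},$$

where $\bar{C} = [c_0, \ldots, c_{T-1}]$ collects the short codes of the users. Thus the MMSE equalizer is given by

$$P_i = \sigma_s^2 \left[\sigma_s^2 H_i H_i^H + \sigma_n^2 I\right]^{-1} H_i.$$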

CP-CDMA
For CP-CDMA, the channel matrix for the nth block is given by

$$H = \Lambda W D(n) C.$$

Thus the MMSE equalizer is given by

$$P = \sigma_s^2 \left[\sigma_s^2 |\Lambda|^2 + \sigma_n^2 I\right]^{-1} \Lambda W D(n) C,$$

which is realized through a one-tap equalizer, $(\sigma_s^2 [\sigma_s^2 |\Lambda|^2 + \sigma_n^2 I]^{-1} \Lambda)^H$, followed by an IDFT operator $W^H$, followed by a despreader $[D(n)C]^H$. Note that CP-CDMA performs chip-level equalization.

CP-DS-CDMA/SCCP
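CP-DS-CDMA performs block despreading using G consecutive blocks; after block despreading, it is equivalent to an SCCP system. Both have the following channel matrix:

$$H = \Lambda W.$$

Thus the MMSE equalizer is given by

$$P = \sigma_s^2 \left[\sigma_s^2 |\Lambda|^2 + \sigma_n^2 I\right]^{-1} \Lambda W,$$

which is realized through a one-tap equalizer, $(\sigma_s^2 [\sigma_s^2 |\Lambda|^2 + \sigma_n^2 I]^{-1} \Lambda)^H$, followed by an IDFT operator $W^H$. Note that chip-level equalization is performed for CP-DS-CDMA.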
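The single-carrier receiver chain (one-tap equalizer followed by an IDFT) can be sketched in the same way. The Python example below is a noise-free illustration for the model $y = \Lambda W s$; the function name `sccp_mmse_receiver`, the unitary DFT scaling, and the channel gains are assumptions made for the example.

```python
import cmath

def dft(x):
    """Unitary N-point DFT (multiplication by W)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n)) / n ** 0.5
            for k in range(n)]

def idft(x):
    """Unitary N-point IDFT (multiplication by W^H)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n)) / n ** 0.5
            for m in range(n)]

def sccp_mmse_receiver(y, lam, sigma_s2, sigma_n2):
    """One-tap MMSE equalizer on the FFT output, then IDFT back to the symbol domain."""
    equalized = [(sigma_s2 * l / (sigma_s2 * abs(l) ** 2 + sigma_n2)).conjugate() * v
                 for v, l in zip(y, lam)]
    return idft(equalized)

s = [1, -1, 1, 1, -1, -1, 1, -1]                                       # hypothetical BPSK block
lam = [complex(1.0 + 0.1 * k, 0.2 * k - 0.5) for k in range(len(s))]   # hypothetical gains
y = [l * v for l, v in zip(lam, dft(s))]                               # noise-free y = Lambda W s

s_hat = sccp_mmse_receiver(y, lam, 1.0, 0.0)   # sigma_n2 = 0: exact recovery expected
recovery_error = max(abs(a - b) for a, b in zip(s_hat, s))
```

Compared with the OFDM case, the only additional block is the IDFT, which is the basis for sharing the one-tap equalizer between the multicarrier and single-carrier modes.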

PROPOSED TRANSCEIVER
Based on the input-output relations and the detailed equalizers presented in the previous section for each CP-based system, in this section, a universal transceiver is proposed which is configurable to any type of these systems.

Transceiver overview
The proposed transceiver is illustrated in Figure 1. It contains source coding, scrambling, channel coding, interleaving, and pilot insertion blocks, as well as the Modem Tx and burst formation blocks at the transmitter side; and the receiver front end (FE) and Modem Rx, as well as constellation demapper, descrambling, decoding, and deinterleaving blocks at the receiver side. The data symbols are passed to the reconfigurable Modem Tx block, where they are modulated according to the required specification. The burst formation and Tx front-end blocks format the processed symbols into the proper framing structure with suitable preambles; the symbols are then modulated onto a carrier and transmitted. After undergoing the channel and additive noise distortion, the received signal is downconverted in the Rx FE processing and fed to the configurable Modem Rx, which performs synchronization (frame, code, and frequency), CP removal, FFT, channel estimation, channel equalization, and phase compensation. The detected symbols are further processed in the subsequent blocks to generate the information bits.
In the reconfigurable transceiver, though each block has to be reconfigured to meet the particular standard's requirement, here we restrict our discussion to the Modem Tx and Modem Rx blocks.

System configurations
The Modem Tx and Modem Rx are shown in Figures 2 and 3, respectively, and are realized as parameterized functions. We define a flag parameter i(A) for block A: if i(A) = 1, the block is active; if i(A) = 0, the block is bypassed and acts as a unity function. We also use "CS" to denote the code spreading block; "BI" the block interleaving block; "IFFT" the IFFT block; "FFT" the FFT block; "BDI" the block despreading and deinterleaving block; "OTE" the one-tap equalizer; and "DS" the despreading block.
Denote G and N as the processing gain and the size of each block, respectively. Note that G = 1 for non-CDMA systems. The Modem Tx takes in a set of M data symbols, where M = N for OFDM, SCCP, MC-DS-CDMA, and CP-DS-CDMA, and M = N/G for MC-CDMA and CP-CDMA.
The flag parameters and the selection of M for Modem Tx are shown in Table 1 for the different systems. For non-CDMA systems, the CS block is a unity function. For MC-DS-CDMA and CP-DS-CDMA, the BI block translates the spread output of size NG into G serial blocks, each of size N. The BI block may also be used for MC-CDMA to achieve almost equal diversity for each transmitted symbol [9]. The IFFT processing block in Modem Tx is switched off in the case of single-carrier systems (SCCP, CP-CDMA, and CP-DS-CDMA). A CP insertion block is added to remove IBI and to turn the channel matrix into a circulant matrix. Finally, parallel-to-serial conversion is done for signal transmission.
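The Modem Tx parameterization described above can be captured in a small lookup table. The Python sketch below is one possible encoding: the flag values are inferred from the text of this section (the body of Table 1 is not reproduced here), BI is left off for MC-CDMA although the text notes it may optionally be enabled, and the names `TX_FLAGS` and `modem_tx_plan` are hypothetical.

```python
# scheme: (i(CS), i(BI), i(IFFT), True if M = N/G else M = N)
# Values inferred from the description in the text; treat them as an assumption.
TX_FLAGS = {
    "OFDM":       (0, 0, 1, False),
    "SCCP":       (0, 0, 0, False),
    "MC-DS-CDMA": (1, 1, 1, False),
    "CP-DS-CDMA": (1, 1, 0, False),
    "MC-CDMA":    (1, 0, 1, True),   # BI may optionally be switched on [9]
    "CP-CDMA":    (1, 0, 0, True),
}

def modem_tx_plan(scheme, n, g=1):
    """Return (M, list of active Tx blocks) for block size n and processing gain g."""
    cs, bi, ifft, m_is_n_over_g = TX_FLAGS[scheme]
    m = n // g if m_is_n_over_g else n
    chain = [name for name, flag in (("CS", cs), ("BI", bi), ("IFFT", ifft)) if flag]
    # CP insertion and parallel-to-serial conversion are common to all schemes
    return m, chain + ["CP insertion", "P/S"]

print(modem_tx_plan("MC-CDMA", 256, 16))
print(modem_tx_plan("SCCP", 64))
```

Keeping the flags in data rather than in control flow mirrors the idea of realizing the modem as a parameterized function: switching waveform is a table lookup, not a code change.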
At Modem Rx, a common synchronization (SYN) block is applied to each system. For the single-user case, this SYN block implements time and frequency synchronization as well as phase noise estimation. Code acquisition is needed for the CDMA case. The channel estimation block is also common to all systems; it can be implemented using preamble blocks, common pilot channels, or dedicated pilot channels, depending on the system being configured. The FFT is performed in Modem Rx for all systems. The other parameters are system dependent and are summarized in Table 2.
(i) For OFDM, only OTE is needed; for MC-DS-CDMA, both BDI and OTE are required.
(ii) For SCCP, the OTE and IFFT blocks function; for CP-DS-CDMA, the BDI, OTE, and IFFT blocks are required.
(iii) For MC-CDMA, both the OTE and DS blocks are needed; for CP-CDMA, the OTE, IFFT, and DS blocks are all required.

Table 2: Parameter configurations for Modem Rx.

Schemes      i(BDI)  i(OTE)  i(IFFT)  i(DS)
OFDM         0       1       0        0
MC-DS-CDMA   1       1       0        0
SCCP         0       1       1        0
CP-DS-CDMA   1       1       1        0
MC-CDMA      0       1       0        1
CP-CDMA      0       1       1        1
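The Modem Rx configurations listed in (i)-(iii) can likewise be expressed as data. The Python sketch below encodes the flags stated there and derives the active processing chain; the names `RX_FLAGS` and `modem_rx_chain`, and the assumption that active blocks run in the order BDI, OTE, IFFT, DS after the always-on FFT, are illustrative choices.

```python
RX_BLOCKS = ("BDI", "OTE", "IFFT", "DS")

# Flags as stated in items (i)-(iii): i(BDI), i(OTE), i(IFFT), i(DS)
RX_FLAGS = {
    "OFDM":       (0, 1, 0, 0),
    "MC-DS-CDMA": (1, 1, 0, 0),
    "SCCP":       (0, 1, 1, 0),
    "CP-DS-CDMA": (1, 1, 1, 0),
    "MC-CDMA":    (0, 1, 0, 1),
    "CP-CDMA":    (0, 1, 1, 1),
}

def modem_rx_chain(scheme):
    """FFT is always active; the remaining blocks are bypassed when their flag is 0."""
    flags = RX_FLAGS[scheme]
    return ["FFT"] + [block for block, flag in zip(RX_BLOCKS, flags) if flag]

def equalization_part(scheme):
    """Blocks left once block despreading (BDI) has been carried out."""
    return [b for b in modem_rx_chain(scheme) if b != "BDI"]

for name in RX_FLAGS:
    print(name, "->", " -> ".join(modem_rx_chain(name)))
```

Dropping BDI from the MC-DS-CDMA and CP-DS-CDMA chains leaves exactly the OFDM and SCCP chains, which is the sharing property the paper relies on.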

Complexity issues
The equalizer complexities of the different systems are compared. For non-CDMA systems, such as OFDM and SCCP, the block size N is usually small, say 64 in IEEE 802.11a. For CDMA systems, the block size N can be much larger, say at least 256. Thus we only compare the receiver complexities of the CDMA-based systems. Table 3 lists the required operations per block for each CDMA-based system. Note that for both the BDI and DS blocks, the number of despreaders of size G is considered. We use the number of complex multiplications (NCM) per block as the complexity metric. As the despreading operation may involve only additions, since the codes are usually BPSK or QPSK modulated, its complexity is ignored. Treat the required NCM for an N-dimensional OTE as $3N$, and the NCM for an N-point FFT (IFFT) as $N \log_2(N)$. For MC-DS-CDMA and CP-DS-CDMA, the blocks that follow block despreading (OTE, and IFFT for CP-DS-CDMA) are executed only once every G blocks. Then the required NCMs per block for each CDMA system are given as follows:

MC-DS-CDMA: $N \log_2(N) + 3(N/G)$;
CP-DS-CDMA: $(N + (N/G)) \log_2(N) + 3(N/G)$;
MC-CDMA: $N \log_2(N) + 3N$;
CP-CDMA: $2N \log_2(N) + 3N$.

It is seen that MC-CDMA and CP-CDMA have higher complexity than the other two. However, we point out that the lower complexity of MC-DS-CDMA and CP-DS-CDMA is achieved under the assumption that the wireless channel is static within the G consecutive blocks, which may not be the case in fast-fading environments. In those cases, MC-CDMA and CP-CDMA could be a better choice. Fortunately, the complexity increase for these two systems is only about a factor of two.
As mentioned earlier, we target the reconfigurable receiver architecture for the downlink only. We now quantify the gain of doing so with respect to a multimode receiver that simply gathers a separate implementation for each system. Adding the NCM for OFDM, which is $N \log_2(N) + 3N$, and the NCM for SCCP, which is $2N \log_2(N) + 3N$, to those for the CDMA systems, the required NCM per data block for the simple architecture supporting the six modes is $(8N + (N/G)) \log_2(N) + 12N + 6(N/G)$. With the reconfigurable receiver, the required blocks are FFT, BDI, OTE, IFFT, and DS. Again ignoring the complexity of BDI and DS, the total NCM required by the reconfigurable receiver per data block is $2N \log_2(N) + 3N$. For large processing gain, the required NCM for the reconfigurable receiver is about 25% of that for the multimode receiver.
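The complexity comparison can be reproduced with a few lines of arithmetic. In the Python sketch below, the per-scheme NCM expressions are an assumption: they are chosen so that their sum matches the six-mode total quoted above, with the OTE (and, for CP-DS-CDMA, the IFFT) of the block-spread systems counted once every G blocks. The function names and the example values N = 1024, G = 32 are hypothetical.

```python
from math import log2

def ncm_per_block(scheme, n, g):
    """Complex multiplications per block, ignoring despreading (BDI, DS)."""
    fft = n * log2(n)   # N-point FFT or IFFT
    ote = 3 * n         # N-dimensional one-tap equalizer
    table = {
        "OFDM":       fft + ote,
        "SCCP":       2 * fft + ote,
        "MC-DS-CDMA": fft + ote / g,           # OTE once per G blocks
        "CP-DS-CDMA": fft + (fft + ote) / g,   # OTE and IFFT once per G blocks
        "MC-CDMA":    fft + ote,
        "CP-CDMA":    2 * fft + ote,
    }
    return table[scheme]

SCHEMES = ("OFDM", "SCCP", "MC-DS-CDMA", "CP-DS-CDMA", "MC-CDMA", "CP-CDMA")

def multimode_ncm(n, g):
    """Six separate receivers, one per mode."""
    return sum(ncm_per_block(s, n, g) for s in SCHEMES)

def reconfigurable_ncm(n):
    """Shared FFT, OTE, and IFFT blocks."""
    return 2 * n * log2(n) + 3 * n

n, g = 1024, 32
ratio = reconfigurable_ncm(n) / multimode_ncm(n, g)
print(f"reconfigurable / multimode = {ratio:.3f}")
```

As G grows, the ratio tends to $(2\log_2 N + 3)/(8\log_2 N + 12) = 1/4$, consistent with the 25% figure.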
Finally, we list the typical values of block size N for different standards. In 802.11a/g, N = 64, and in the proposals for 802.11n, N = 64 or N = 128. N = 256 for the 802.16a/e OFDM mode, and N is as high as 1024 [21] for B3G systems. For CDMA-based systems, when the chip rate is fixed, the processing gain is usually variable, depending on the user's data rate [21].

HARDWARE ARCHITECTURE
The proposed transceiver architecture is mapped onto the hardware architecture shown in Figure 4. This figure shows the generic high-level functional subsystems and interfaces of the software-defined radio (SDR) transceiver system. The main entities of the architecture are the general-purpose processor (GPP) and the SDR modem subsystem.
The GPP is an x86-based processor with a PCI-based interface for the SDR modem subsystem. The processing resources available on the SDR modem subsystem are a TI TMS320C6416 DSP running at a clock speed of 600 MHz, a Xilinx FPGA (XCV2V6000), and the RF front end. Along with the analog RF circuitry, the RF front end also has the ADC (Analog Devices AD6544), the DAC (AD9777), and some reconfigurable logic for digital front-end processing.
The logical functions performed in these resources will be dependent on the type of system (OFDM, MC-DS-CDMA, SCCP, CP-DS-CDMA, MC-CDMA, CP-CDMA) that is being realized.
A set of logical entities should be realized on the hardware platform comprising the GPP and the SDR modem subsystem. The specific functions performed in these resources depend on the type of system being realized on the platform. The transient functions in the architecture, which are configuration dependent, are shown in dotted outlines in Figure 4. The functions that are always present, irrespective of the configured system, are shown in solid outlines.
The SDR modem subsystem shown in Figure 4 covers the RF and digital signal processing functions in the system. This subsystem interfaces to the GPP as a device (specifically, as a PCI device).
Burst formation, transmitter front-end (FE) processing, and receiver FE processing functions are realized in the RF front end. In the receive path, the RF signals (5 GHz band signals for 802.11a WLAN) are downconverted to a 70 MHz IF and quadrature-digitized into I and Q components for further processing by the baseband run-time reconfigurable (RTR) engine in the receive section. In the transmit section, the RF subsystem upconverts the baseband I and Q signals to the required frequency (5 GHz band for 802.11a WLAN) for transmission. The RF front end is set to the required configuration during the initial configuration process. The parameters for configuration are provided by the respective waveform driver.
The platform has a run-time reconfigurable (RTR) programmable logic hardware engine. It is used for the computation-intensive front-end signal processing of the Modem Tx shown in Figure 2 and the Modem Rx shown in Figure 3. This subsystem consists of reconfigurable hardware and a management firmware module that runs on the embedded TI DSP processor and manages the resources on the hardware. The major functions implemented in the RTR engine are channelization, synchronization and timing recovery, channel estimation and equalization, demodulation, despreading, and deinterleaving. These are typically computation-intensive processing tasks suitable for parallel hardware implementation. The corresponding reverse processing is done in the uplink.
The rest of the functions, such as source/channel coding, scrambling, pilot insertion, and so forth, and their corresponding reverse functions in the receiver path, are performed using the processing resources of the DSP. Again, the exact functions performed by these engines depend on the system applications that are configured.
The firmware engine, which is implemented in the DSP, represents the run-time reconfigurable firmware subsystem. It typically implements some of the Layer-1 control procedures, such as scheduling and Layer-1 management, and MAC processing (for example, in IEEE 802.11a). These functions also depend on the waveform application being configured, as the processing requirements for different CP-based systems are different.
An SDR executive provides the platform services needed to reconfigure and activate waveform applications, such as 802.11a for an OFDM system. It is independent of any other application running on the platform. The SDR executive consists of the RTOS, a basic set of resource and configuration management processes, and communication and data path management functions. The executive interfaces with the system and configuration manager (SCM) on the GPP through the SDR Modem driver and the waveform driver executed on the GPP. The SDR executive incorporates a command interpreter to receive and process the commands from the drivers. It also manages the communication channels between the GPP and the SDR Modem.
To enable instantiation of different waveform applications on the same hardware, the SDR executive supports dynamic loading of COFF modules and reconfiguration of FPGAs. The SDR executive is stored in nonvolatile storage and loaded at boot time. This module receives firmware and bit map information from the system and configuration manager and configures the platform subsystems.
The system and configuration manager (SCM) is the central management function of the architecture. The SCM uses the GPP resources to perform high-level functions such as system management, configuration/reconfiguration management of the platform, configuration storage, user interface, and so forth. This module performs its functions in conjunction with other platform functions, specifically the SDR executive, kernel/OS, user interface, and configuration database.
The SCM uses the SDR Modem driver to communicate with the SDR executive to download platform components as needed.
The SDR Modem driver, a kernel driver, interfaces the SDR modem subsystem platform to the system and configuration manager process and acts as the root device driver for the SDR Modem hardware. The driver is loaded when the SDR Modem device is detected, and it interfaces with the SDR executive to pass commands and responses between the SCM and the SDR executive. It also provides the facility to transfer the COFF files and FPGA bit maps used to configure the DSP and FPGA, respectively.
The kernel/OS block in Figure 4 is the operating-system kernel on the GPP. The operating system used is a general-purpose OS, Linux, with real-time extensions; a dedicated real-time OS such as LynxOS could also be used. The real-time extensions are necessary because part of a waveform application, depending on the application, may be implemented inside the kernel.
The waveform drivers (waveform driver n) are function-specific drivers in the host OS. They are shown in dotted outline to indicate that they are not always present. A driver is loaded into the host memory only when the corresponding waveform application is being instantiated, and it is unloaded when the application is unloaded. These drivers are technology specific, and some of them may be modified versions of standard drivers. For example, in the case of OFDM, the driver will be a modified, dynamically loadable version of the standard wireless LAN 802.11a drivers. For a CDMA application, it may be a specific driver developed for the purpose, incorporating the Layer-2 and higher protocols that interface with the Layer-1 functions inside the SDR Modem. These driver modules are loaded and unloaded by the SCM whenever the corresponding waveform application is created and destroyed. Upper-layer protocols and application software make use of the communication capabilities provided by the waveform drivers and interface with them.
A configuration database holds all the configuration and status information, including all the system images for each of the supported waveform applications. At least part of it will be nonvolatile (e.g., disk based). The SCM manages the configuration database and maintains the current status of the system; together with the SDR executive, it estimates the resource availability for the selected application. If sufficient resources are available, the waveform application is installed and feedback is given to the user. If resources are not available, the SCM aborts the loading and frees all the resources reserved for this waveform application. The loading process includes, in addition to loading the SDR Modem firmware modules and FPGA bit maps, loading any host driver modules. A loaded waveform application can be activated only upon explicit activation by the SCM. This involves activation of the RF subsystem and of the hardware and firmware components (loaded and configured during the loading process), as well as loading of the driver on the host side. The SCM implements error logging and diagnostic command support for development and testing. This includes the ability to log module-wise diagnostic messages into nonvolatile storage.
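The load, abort, and activate sequence described above can be illustrated with a toy model. The Python sketch below is not the authors' implementation: the class `SystemConfigManager`, its method names, and the resource keys (`fpga_slices`, `dsp_kb`) and amounts are all hypothetical, and real resource estimation would involve the SDR executive.

```python
class SystemConfigManager:
    """Toy model of the SCM bookkeeping for waveform applications."""

    def __init__(self, free_resources):
        self.free = dict(free_resources)   # currently available platform resources
        self.loaded = {}                   # application name -> reserved resources
        self.active = None                 # name of the activated application, if any

    def load(self, name, needs):
        """Reserve resources for an application; abort and free them on a shortage."""
        reserved = {}
        for key, amount in needs.items():
            if self.free.get(key, 0) < amount:
                for k, a in reserved.items():   # abort: release partial reservation
                    self.free[k] += a
                return False
            self.free[key] -= amount
            reserved[key] = amount
        self.loaded[name] = reserved
        return True

    def activate(self, name):
        """Activation is explicit and only possible for a loaded application."""
        if name not in self.loaded:
            return False
        self.active = name
        return True

    def unload(self, name):
        """Release the resources of a loaded application."""
        for k, a in self.loaded.pop(name).items():
            self.free[k] += a
        if self.active == name:
            self.active = None

scm = SystemConfigManager({"fpga_slices": 100, "dsp_kb": 512})
ok = scm.load("802.11a-OFDM", {"fpga_slices": 60, "dsp_kb": 200})
too_big = scm.load("MC-CDMA", {"fpga_slices": 30, "dsp_kb": 400})   # DSP memory shortage
activated = scm.activate("802.11a-OFDM")
```

The point of the sketch is the ordering of steps: resources are checked and reserved before anything is installed, a failed load leaves the platform unchanged, and activation is a separate explicit step.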
Finally, the user interface exposes the user features provided by the SCM to the user.

CONCLUSIONS
In this paper, we have proposed a broadband mobile transceiver and a hardware architecture that can be configured to any cyclic-prefix (CP)-based system. We have identified the reusable common blocks and studied the receiver complexity of the equalization block for the different systems. It is found that, after the despreading operation, MC-DS-CDMA and CP-DS-CDMA have the same equalization blocks as OFDM and SCCP systems, respectively, showing that both hardware and software sharing is possible for those systems. We also observed that, although MC-CDMA and CP-CDMA require only one block to perform signal detection, their receiver complexity is higher than that of the other two CDMA systems. However, MC-DS-CDMA and CP-DS-CDMA rely on the assumption that the wireless channel is static within the G blocks. For fast-fading environments, MC-CDMA and CP-CDMA may still be the better choice as a compromise between robustness to fast fading and complexity. Further, the functionality of the different functions in the proposed hardware architecture has been elaborated. It is seen that, although different functions can be realized on reconfigurable hardware, the major challenge is to have an efficient system configuration and management function that initiates and controls the reconfiguration according to the waveform application requirements.
Figure 1: Block diagram of the proposed transceiver configurable for CP-based systems.
Figure 2: Functional block of the reconfigurable Modem Tx.
Figure 3: Functional block of the reconfigurable Modem Rx.

Table 1: Parameter configurations for Modem Tx.

Table 3: Equalizer complexity per block for CDMA-based systems.