Implementation of LTE system on an SDR platform using CUDA and UHD

Bang, Saehee; Ahn, Chiyoung; Jin, Yong; Choi, Seungwon; Glossner, John; Ahn, Sungsoo

doi:10.1007/s10470-013-0229-1

Published: 19 November 2013

Volume 78, pages 599–610, (2014)

Analog Integrated Circuits and Signal Processing

Saehee Bang¹, Chiyoung Ahn¹, Yong Jin¹, Seungwon Choi¹, John Glossner² & Sungsoo Ahn³

Abstract

In this paper, we present an implementation of a Long Term Evolution (LTE) system on a software-defined radio (SDR) platform built from a conventional personal computer equipped with a graphics processing unit (GPU) and a Universal Software Radio Peripheral 2 (USRP2) with the USRP Hardware Driver (UHD), which serve as the SDR software modem and the radio frequency transceiver, respectively. The central processing unit executes C++ control code that accesses the USRP2 via the UHD. We adopted the Ettus Research UHD for its high degree of flexibility in the design of the transceiver chain. Taking advantage of this flexibility, we implemented a simple cognitive radio engine using libraries provided by the UHD. The software modem runs on the GPU, which is well suited to parallel computing because of its powerful arithmetic and logic units. We propose a parallel programming method that exploits the single instruction, multiple data (SIMD) architecture of the GPU. We focus on the implementation of the Turbo decoder because of its high computational requirements and the difficulty of parallelizing the algorithm. The implemented system is analyzed primarily in terms of computation time using the Compute Unified Device Architecture (CUDA) profiler. In our experimental tests, we measured the total processing time for a single frame of LTE data in both directions: 5.00 ms for transmit and 8.58 ms for receive. This confirms that the implemented system is capable of real-time processing of all the baseband signal processing algorithms required for LTE systems.

Acknowledgments

This research was supported by the MSIP (Ministry of Science, ICT & Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (NIPA-2013-H0301-13-1001) supervised by the NIPA (National IT Industry Promotion Agency).

Author information

Authors and Affiliations

Department of Electronics and Computer Engineering, Hanyang University, Seoul, Korea
Saehee Bang, Chiyoung Ahn, Yong Jin & Seungwon Choi
Optimum Semiconductor Technologies, Inc., Tarrytown, NY, USA
John Glossner
Department of Information and Communication, Myeongji College, Seoul, Korea
Sungsoo Ahn

Corresponding author

Correspondence to Seungwon Choi.

About this article

Cite this article

Bang, S., Ahn, C., Jin, Y. et al. Implementation of LTE system on an SDR platform using CUDA and UHD. Analog Integr Circ Sig Process 78, 599–610 (2014). https://doi.org/10.1007/s10470-013-0229-1

Received: 22 March 2013
Revised: 15 August 2013
Accepted: 09 November 2013
Published: 19 November 2013
Issue Date: March 2014
DOI: https://doi.org/10.1007/s10470-013-0229-1
