Teraflows over Gigabit WANs with UDT

https://doi.org/10.1016/j.future.2004.10.007

Abstract

The TCP transport protocol is currently inefficient for high speed data transfers over long distance networks with high bandwidth delay products (BDP). The challenge is to develop a protocol which is fast over networks with high bandwidth delay products, fair to other high volume data streams, and friendly to TCP-based flows. We describe here UDT (UDP-based Data Transfer), a UDP-based application level transport protocol that has these properties and is designed to support distributed data-intensive computing applications. UDT can utilize high bandwidth efficiently over wide area networks with high bandwidth delay products. Unlike TCP, UDT is fair to flows independently of their round trip times (RTT). In addition, UDT is friendly to concurrent TCP flows, which means it can be deployed not only on experimental research networks but also on production networks. To ensure these properties, UDT employs a novel congestion control approach that combines rate-based and window-based control mechanisms. In this paper, we describe the congestion control algorithms used by UDT and provide some experimental results demonstrating that UDT is fast, fair, and friendly.

Introduction

Data-intensive distributed and Grid applications often involve transporting, integrating, analyzing, or mining one or more very high volume data flows or, what we call in this paper, teraflows. Developing teraflow applications was impractical until recently when network infrastructures supporting 1 Gigabit per second (Gbit/s) and 10 Gbit/s links began emerging.

As networks with these bandwidths begin to connect computing and data resources distributed around the world, the limitations of current network protocols are becoming apparent. Currently deployed network transport protocols face several difficulties in effectively utilizing high bandwidth, especially over networks with high propagation times. These difficulties grow commensurately with the bandwidth delay product (BDP), which is the product of the bandwidth and the round trip time (RTT) of the path. In particular, it is an open problem to design network protocols for mixed traffic of teraflows and commodity traffic which are fast, fair, and friendly in the following senses:

  1. Fast: The protocol should be able to transfer data at very high speed. The throughput obtained by the protocol should only be limited by the physical characteristics of the network and some reasonable overhead of the lower level protocols. For example, a single flow of the protocol with no competing traffic should be able to use all of the available bandwidth B, even for B=1 and 10 Gbit/s links.

  2. Fair: The protocol has the ability to share bandwidth resources with other flows using the same protocol in the sense that m high speed flows on a high bandwidth delay product link with no competing commodity traffic will each use about B/m of the bandwidth.

  3. Friendly: The protocol can co-exist with commodity TCP flows. The protocol should not use up all the available bandwidth and prevent concurrent TCP flows from getting a fair allocation of bandwidth. More specifically, TCP flows should have approximately the same throughput in two situations: (1) only m+n TCP flows exist; (2) m TCP flows and n high speed flows exist.
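The quantities used in these definitions can be made concrete with a small sketch. The function names and the sample values (a 100 ms RTT, 4 flows) are assumptions chosen for illustration and are not taken from the paper's experiments.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth delay product in bytes: bandwidth (bit/s) times RTT (s), over 8."""
    return bandwidth_bps * rtt_s / 8


def fair_share_bps(bandwidth_bps: float, m: int) -> float:
    """Ideal per-flow throughput B/m when m flows of one protocol share a link."""
    if m <= 0:
        raise ValueError("m must be positive")
    return bandwidth_bps / m


def friendliness_ratio(tcp_with_highspeed_bps: float, tcp_only_bps: float) -> float:
    """TCP throughput in situation (2) divided by TCP throughput in situation (1).

    A value close to 1.0 means the high speed flows are friendly in the sense
    of the third property.
    """
    return tcp_with_highspeed_bps / tcp_only_bps


if __name__ == "__main__":
    # Assumed example: a 1 Gbit/s path with a 100 ms round trip time.
    print(bdp_bytes(1e9, 0.1))      # bytes in flight needed to fill the path
    # Assumed example: 4 flows sharing a 10 Gbit/s link.
    print(fair_share_bps(10e9, 4))  # ideal share per flow in bit/s
```

On such a path, a sender has to keep roughly 12.5 MB in flight to use the full bandwidth, which is why the difficulties scale with the BDP.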

Although TCP is still dominant on today’s Internet, its performance becomes less than optimal as network bandwidth and delay increase. During its congestion avoidance phase, TCP increases its congestion window, and hence its sending rate, by one segment per RTT as long as no congestion event, signaled by three duplicate acknowledgements, occurs; each time there is a congestion event, the window is halved. This approach, called additive increase multiplicative decrease (AIMD), is very inefficient for networks with high BDPs: the time to recover from a single lost packet can be very long. It is also unfair to competing flows, in the sense that flows with shorter RTTs can obtain more bandwidth. In this paper, we concentrate on the congestion avoidance behavior of TCP; slow start, fast retransmit, and fast recovery are outside its scope. A detailed description of the shortcomings of TCP in high BDP networks can be found in [6], [29].
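The cost of AIMD recovery on a high BDP path can be estimated with a minimal sketch. It models only the idealized congestion avoidance rule (halve on loss, then add one segment per RTT); the 1500-byte segment size, the 1 Gbit/s bandwidth, and the 100 ms RTT are assumptions for illustration.

```python
def full_window_segments(bandwidth_bps: float, rtt_s: float, segment_bytes: int) -> int:
    """Number of whole segments needed in flight to fill the path (BDP / segment size)."""
    return int((bandwidth_bps * rtt_s / 8) // segment_bytes)


def aimd_recovery_rtts(window_segments: int) -> int:
    """Count the RTTs needed to regrow to window_segments after one halving."""
    cwnd = window_segments // 2      # multiplicative decrease on a congestion event
    rtts = 0
    while cwnd < window_segments:
        cwnd += 1                    # additive increase: one segment per RTT
        rtts += 1
    return rtts


if __name__ == "__main__":
    rtt_s = 0.1                                       # assumed 100 ms RTT
    window = full_window_segments(1e9, rtt_s, 1500)   # assumed 1 Gbit/s, 1500-byte segments
    rtts = aimd_recovery_rtts(window)
    print(window, rtts, rtts * rtt_s)                 # segments, RTTs, seconds to recover
```

Under these assumptions the full window is 8333 segments, so a single loss costs 4167 round trips, or about 417 seconds, before the flow returns to full rate. Because the increase is paced per RTT, a flow with a shorter RTT regrows faster in wall-clock time, which is the source of the RTT unfairness.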

In this paper, we show that a new protocol called UDT (for UDP-based Data Transfer) can be used to support teraflows for distributed and grid computing applications. We also show experimentally that UDT is fast, fair, and friendly in the senses given above, even over networks with high BDPs. In addition, since UDT is implemented at the application level, it can be deployed today without any changes to the current network infrastructure. Of course, UDT can also be deployed more efficiently by modifying the operating system kernel.

UDT uses UDP packets to transfer data and retransmits lost packets to guarantee reliability. Its congestion control algorithm combines rate-based and window-based approaches to tune the inter-packet time and the congestion window, respectively.
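How an inter-packet time and a congestion window jointly bound throughput can be sketched with generic arithmetic. This is not UDT's control algorithm, which is described in Section 2; the function names and sample values are assumptions for illustration only.

```python
def rate_limit_bps(inter_packet_time_s: float, packet_bytes: int) -> float:
    """Upper bound from pacing: one packet is sent every inter_packet_time_s seconds."""
    return packet_bytes * 8 / inter_packet_time_s


def window_limit_bps(window_packets: int, rtt_s: float, packet_bytes: int) -> float:
    """Upper bound from the window: at most window_packets unacknowledged per RTT."""
    return window_packets * packet_bytes * 8 / rtt_s


def effective_limit_bps(inter_packet_time_s: float, window_packets: int,
                        rtt_s: float, packet_bytes: int) -> float:
    """The sender can go no faster than the tighter of the two bounds."""
    return min(rate_limit_bps(inter_packet_time_s, packet_bytes),
               window_limit_bps(window_packets, rtt_s, packet_bytes))


if __name__ == "__main__":
    # Assumed values: 1500-byte packets, 12 microsecond spacing, 100 ms RTT.
    print(rate_limit_bps(12e-6, 1500))
    print(window_limit_bps(8333, 0.1, 1500))
    print(effective_limit_bps(12e-6, 8333, 0.1, 1500))
```

In this reading, the inter-packet time shapes how smoothly packets leave the sender, while the window caps how much unacknowledged data can be outstanding at once.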

It is not a goal of UDT to replace TCP in the Internet. Instead, protocols like UDT can be used to supplement TCP in networks with large BDPs, where some applications require one or several teraflows and these must co-exist with commodity flows.

The rest of this paper is organized as follows. Section 2 describes the control mechanism of UDT. Section 3 presents experimental results. Section 4 looks at related work. Section 5 contains concluding remarks.

Description of UDT protocol

We begin this section with an overview of the UDT protocol. Next, we describe the reliability and congestion control mechanisms used in UDT.

Experimental studies

In this section, we describe the testbeds and the experimental studies we performed.

Related work

Currently, there are four approaches being investigated for high performance data transport: employing parallel TCP connections, modifying standard TCP, creating new protocols based upon UDP, and developing entirely new network transport layer protocols [15].

Using parallel TCP connections to achieve higher performance is intuitive [10] and also widely available because the current version of GridFTP employs this approach [2]. However, it poses several problems. For example, careful tuning is

Conclusion

In this paper, we have described a new application level protocol called UDT. UDT is built over UDP. UDT employs a new congestion control algorithm that is designed to achieve intra-protocol fairness in the presence of multiple high volume flows. UDT is designed to be fast, fair, and friendly as defined in Section 1.

We showed in our experimental studies that UDT can effectively utilize the high bandwidth of networks with high BDPs in situations where currently deployed versions of TCP are not

Acknowledgements

This work was supported by NSF grants ANI-9977868, ANI-0129609, and ANI-0225642.

References (29)

  • R.L. Grossman et al., Proceedings of Supercomputing, IEEE (1999)
  • R.L. Grossman, M. Mazzucco, H. Sivakumar, Y. Pan, Q. Zhang, SABUL—simple available bandwidth utilization library for...
  • T. Hacker et al., The end-to-end performance effects of parallel TCP sockets on a lossy wide-area network
  • E. He et al., Reliable blast UDP: predictable high performance bulk data transfer, Proceedings of the IEEE International Conference on Cluster Computing (Sept. 2002)

    Robert L. Grossman is the Director of the Laboratory for Advanced Computing and the National Center for Data Mining at the University of Illinois at Chicago, where he has been a faculty member since 1988. He is also the President of Open Data Partners, which provides consulting and outsourced services focused on data. He has published over 75 papers in refereed journals and proceedings on data mining, distributed computing, high-performance networking, business intelligence, and related areas, and lectured extensively at conferences and workshops.

    Yunhong Gu is a fourth-year PhD candidate in the Computer Science Department at the University of Illinois at Chicago. His research interests include transport protocols, Internet congestion control, and distributed systems. Previously, he received his MS degree in computer science from Beijing University of Aeronautics and Astronautics, China.

    Xinwei Hong is a postdoctoral research associate at the National Center for Data Mining at University of Illinois at Chicago. He received a PhD degree in electronics and information engineering from Huazhong University of Science and Technology, China, in 1988. His current research interests include developing high-speed data delivery over high-performance wide area networks.

    Antony Antony is a researcher at NIKHEF, The Netherlands. He received a BE in Electrical Engineering from University of Bombay, India. Over the past years, he has been involved in several advanced networking projects, including DataTAG and NetherLight. His research interests include the dynamics of transport protocols over long-distance high-speed networks and inter-domain routing protocols such as BGP and GMPLS.

    Johan Blom graduated in 1992 from Utrecht University, The Netherlands, with a thesis entitled “Topological and Geometrical Aspects of Image Structure”. Next, he worked at the same university in the field of particle tracking using a Radon transform embedded in a multi-resolution structure. In 1996, he moved to computer-guided education and was involved in a collaboration between the Universities of Amsterdam and Utrecht. In 2001, he joined University of Amsterdam, where he develops software for automated testing and monitoring of Gigabit networks.

    Freek Dijkstra received an MSc degree in applied physics from University of Utrecht, The Netherlands, in 2002. He is a researcher and a PhD student at University of Amsterdam. Freek’s research interests focus on Optical Networking, especially multi-domain aspects.

    Cees de Laat is an associate professor at University of Amsterdam, The Netherlands. He received a PhD degree in physics from University of Delft. His research interests include optical networking, lambda switching and provisioning, policy-based networking, and the Authorization, Authentication and Accounting architecture. He is responsible for the research on the Lambda switching facility (NetherLight). He implements research projects in the GigaPort Networks area in collaboration with SURFnet.
