An efficient multipath routing for distributed computing systems with data replication
Introduction
This paper presents a routing algorithm for time-critical applications that need to retrieve remote high-bandwidth data files in a LAN-based distributed computing system (DCS). The network uses a connection-oriented data forwarding technique, the virtual circuit [4], [6], [7], which allows bandwidth to be reserved for the lifetime of a connection. Since the inherent link delay (including propagation delay and transmission delay) is insignificant for bulk data transfer in a high-speed LAN, the design objective of the routing algorithm is to maximize the data transfer rate (or throughput). For a source-to-destination traffic session, throughput can frequently be improved by splitting the traffic over several paths. The technique of using multiple paths between a source–destination pair is called multipath routing.
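The throughput gain from splitting traffic can be illustrated with a minimal sketch. It assumes that each virtual circuit reserves the bottleneck (smallest) link capacity along its path and that the paths share no link, so the reserved bandwidths simply add up. The node names, capacities, and function names are hypothetical and are not taken from the paper.

```python
def path_bottleneck(path, capacity):
    """Bandwidth reservable on one path: the smallest link capacity along it."""
    return min(capacity[frozenset((u, v))] for u, v in zip(path, path[1:]))


def multipath_throughput(paths, capacity):
    """Total reservable bandwidth over link-disjoint paths (assumption: no shared links)."""
    used = set()
    total = 0
    for path in paths:
        links = {frozenset(edge) for edge in zip(path, path[1:])}
        if used & links:
            raise ValueError("paths share a link; bandwidths would not simply add")
        used |= links
        total += path_bottleneck(path, capacity)
    return total


# Hypothetical undirected network: link -> available capacity (e.g. in Mb/s).
capacity = {
    frozenset(("s", "a")): 10,
    frozenset(("a", "t")): 6,
    frozenset(("s", "b")): 4,
    frozenset(("b", "t")): 8,
}
paths = [["s", "a", "t"], ["s", "b", "t"]]

single = path_bottleneck(paths[0], capacity)    # one path only
multi = multipath_throughput(paths, capacity)   # both paths together
file_size = 120                                 # hypothetical file size in Mb
print(single, multi, file_size / single, file_size / multi)
```

In this toy network the single path s-a-t is limited by its 6-unit link, while adding the second path raises the reservable rate to 10 units and shortens the transfer accordingly.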
Based on the multipath routing scheme, the problem of optimal routing can be described as a multicommodity flow problem if the number of paths in a source–destination session is unlimited and the inherent link delay and control overhead are negligible. Linear programming is hence a feasible technique for solving it. However, since data files in a DCS are often replicated to improve system (or program) reliability, the program may acquire data from any replica [1], [3]. The problem thus becomes a complex combinatorial one. An exhaustive approach could of course find the optimal solution, but only at a high computational cost. Our previous work developed an alternative, the critical-cut algorithm, to solve the optimal routing problem [2]. In most experimental cases the algorithm yields short execution times, but occasionally it still takes exponential time. This paper therefore proposes a heuristic routing algorithm that obtains acceptable routing paths. Its short and stable execution time makes the proposed algorithm applicable to existing systems.
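To see why replication makes the problem combinatorial, the sketch below enumerates every way of choosing one holder node per file. It is only an illustration of the size of the search space, not the exhaustive or critical-cut method of [2]; the file names, node names, and `replica_assignments` helper are hypothetical.

```python
from itertools import product
from math import prod


def replica_assignments(replicas):
    """Yield every way to pick one holder node for each requested file."""
    files = sorted(replicas)
    for choice in product(*(replicas[f] for f in files)):
        yield dict(zip(files, choice))


# Hypothetical placement: file -> nodes that hold a replica of it.
replicas = {
    "f1": ["n2", "n5"],
    "f2": ["n3", "n4", "n6"],
    "f3": ["n2", "n6"],
}

assignments = list(replica_assignments(replicas))
count = prod(len(nodes) for nodes in replicas.values())
print(count, assignments[0])
```

The number of candidate source combinations is the product of the replica counts, and an exhaustive search would still have to solve a routing problem for each candidate, which is where the high computational cost comes from.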
Assumptions: We assume that a distributed computing system can be described as an undirected graph with nodes representing computing sites and edges representing communication links. We assume that virtual circuits are used for communication between a source and a destination, and that the virtual circuit scheme supports multipath routing. Intermediate nodes do not buffer packets, but simply forward all received packets immediately. Inherent link delay is negligible. A given file cannot be split and distributed to multiple nodes. A communication path cannot have loops.
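Under these assumptions, the simplest case (one file held by a single source, with no limit on the number of paths) reduces to a classical maximum-flow problem on the undirected graph. The following sketch uses the standard Edmonds-Karp algorithm for that special case only; it is not the critical-cut or heuristic algorithm of this paper, and the network and names are hypothetical.

```python
from collections import defaultdict, deque


def max_flow(links, source, target):
    """Edmonds-Karp maximum flow; each undirected link is usable in both directions."""
    if source == target:
        raise ValueError("source and target must differ")
    residual = defaultdict(lambda: defaultdict(int))
    for (u, v), cap in links.items():
        residual[u][v] += cap
        residual[v][u] += cap
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if target not in parent:
            return total  # no augmenting path is left
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = target
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = target
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical network: undirected link -> available capacity.
links = {("s", "a"): 10, ("a", "t"): 6, ("s", "b"): 4, ("b", "t"): 8, ("a", "b"): 3}
best_rate = max_flow(links, "s", "t")
print(best_rate)
```

Modelling each undirected link as two opposite arcs of equal capacity is a common convention; the multi-file, replicated case discussed in the text needs more than this single-commodity computation.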
Section snippets
Critical-cut algorithm
This section presents the critical-cut algorithm for finding the optimal routing. Proofs of the theorems behind the algorithm can be found in [2].
Notations
source: the node which holds data files
target: the node which issues the request for data files
cut: a set of edges whose removal separates a connected graph into two disconnected subgraphs
the cut separating the nodes in set X from the other nodes (i.e., the nodes in ). Note that in presenting a cut by the notation , target is
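The definition of a cut can be made concrete with a short sketch that lists the edges separating a node set X from the remaining nodes and sums their capacities. By the max-flow min-cut theorem, the capacity of any such cut bounds the rate at which data can cross it. The function names and the example network are hypothetical, and the code does not implement the critical-cut algorithm itself.

```python
def cut_edges(links, X):
    """Edges with exactly one endpoint in X, i.e. the cut separating X from the rest."""
    X = set(X)
    return [(u, v) for (u, v) in links if (u in X) != (v in X)]


def cut_capacity(links, X):
    """Total capacity of the links that cross the cut around X."""
    return sum(links[edge] for edge in cut_edges(links, X))


# Hypothetical network: undirected link -> available capacity.
links = {("s", "a"): 10, ("a", "t"): 6, ("s", "b"): 4, ("b", "t"): 8, ("a", "b"): 3}

crossing = sorted(cut_edges(links, {"s", "a"}))
print(crossing, cut_capacity(links, {"s", "a"}), cut_capacity(links, {"s"}))
```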
Heuristic routing algorithm
The performance of the critical-cut method depends on the cut tree and the distribution of files. In cases such as the example in Section 2.5, the cut tree is a linear array and the algorithm therefore yields a very short execution time. In other cases, however, the critical-cut method may take exponential time to obtain the optimal routing. In practical environments, a routing algorithm with a short and stable execution time is important for real-time applications. We thus propose a heuristic
Simulation results
To evaluate the performance of the proposed heuristic routing algorithm, we compared the routes found by the heuristic algorithm with the optimal routing paths, which were obtained by the critical-cut method described in Section 2. Many factors influence the performance, such as the network topology, the available link capacities, the number of replicated copies of each data file, and the dispersal of the program and the data files. In order to fairly
Conclusion
In this paper, we propose a heuristic multipath routing algorithm for virtual-circuit-based DCSs. Such a routing scheme is appropriate for the high-bandwidth data transfers issued by real-time applications. To evaluate how well the algorithm performs, we ran experiments on various networks to measure the algorithm's execution time and the communication capacities of the routing paths found. These experimental data are compared with the results obtained by the critical-cut algorithm,
References (7)
- et al., Network support for multimedia: A discussion of the Tenet approach, Comput. Networks and ISDN Systems (1994)
- et al., Concurrency Control and Recovery in Database Systems (1987)
- P.Y. Chang, D.J. Chen, Optimal routing for distributed computing systems with data replication, in: Proceedings of IEEE...