An efficient processing of a chain join with the minimum communication cost in distributed database systems

Lin, Xuemin; Orlowska, Maria E.

doi:10.1007/BF01263657


Published: January 1995

Distributed and Parallel Databases, Volume 3, pages 69–83 (1995)

Xuemin Lin¹ &
Maria E. Orlowska¹


Abstract

This paper investigates the optimization problem of executing a join in a distributed database environment. The optimization criterion is the minimization of the communication cost of sending data through links. We explore the approach of judiciously using join operations as reducers in distributed query processing. In general, this problem is computationally intractable. Restricting the execution of a join to a pre-defined combinatorial order leads to a possible solution in polynomial time. An algorithm for chain query computation has been proposed in [21]. The time complexity of that algorithm is O(m²n² + m³n), where n is the number of sites in the network and m is the number of relations (fragments) involved in the join. In this paper, we first present a proof of the intuitively well-understood fact that the “eigenorder” of a “chain” join is the best pre-defined combinatorial order for implementing the algorithm in [21]. Second, we show a necessary and sufficient condition for a chain query with the eigenordering to be a “simple” query. For processing the class of simple queries, we show a significant reduction of the time complexity from O(m²n² + m³n) to O(mn + m²). It is encouraging that, in practice, the most frequent queries belong to the category of simple queries.
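To give a feel for the size of the reduction stated in the abstract, the two bounds can be compared numerically. The sketch below is only an illustration: it evaluates the leading terms of the two bounds with all constant factors ignored, and the function names and sample values of m and n are assumptions made here, not taken from the paper. It does not implement the algorithm of [21].

```python
def general_bound(m: int, n: int) -> int:
    """Leading terms of O(m^2 n^2 + m^3 n); constant factors ignored."""
    return m**2 * n**2 + m**3 * n


def simple_bound(m: int, n: int) -> int:
    """Leading terms of O(mn + m^2); constant factors ignored."""
    return m * n + m**2


# Hypothetical sample sizes: m relations (fragments), n sites.
for m, n in [(5, 10), (10, 50), (20, 100)]:
    g = general_bound(m, n)
    s = simple_bound(m, n)
    # Algebraically, m^2 n^2 + m^3 n = mn * (mn + m^2),
    # so the ratio of the two expressions is exactly m * n.
    print(f"m={m} n={n} general={g} simple={s} ratio={g // s}")
```

Because m²n² + m³n factors as mn(mn + m²), the two expressions differ by exactly a factor of mn, so the saving for simple queries grows with both the number of relations and the number of sites.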


References

[1] P.M.G. Apers, A. Hevner, and S.B. Yao, “Optimization Algorithms for Distributed Queries,” IEEE Transactions on Software Engineering, SE-9(1), pp. 57–68, 1983.
[2] Y. Bartal, A. Fiat, and Y. Rabani, “Competitive Algorithms for Distributed Data Management,” 24th Annual ACM Symposium on the Theory of Computing, pp. 39–49, 1992.
[3] P.A. Bernstein and D. Chiu, “Using Semi-Joins to Solve Relational Queries,” Journal of the ACM, 28(1), pp. 25–40, 1981.
[4] P.A. Bernstein, N. Goodman, E. Wong, C.L. Reeve, and J.B. Rothe, “Query Processing in a System for Distributed Database (SDD-1),” ACM Transactions on Database Systems, 6(4), pp. 602–625, 1981.
[5] J.A. Bondy and U.S.R. Murty, Graph Theory with Applications, The Macmillan, 1978.
[6] D. Chiu, P.A. Bernstein, and Y. Ho, “Optimizing Chain Queries in a Distributed Database System,” SIAM Journal on Computing, 13(1), pp. 116–134, 1984.
[7] M.-S. Chen and P.S. Yu, “Interleaving a Join Sequence with Semijoins in Distributed Query Processing,” IEEE Transactions on Parallel and Distributed Systems, 3(5), pp. 611–621, 1992.
[8] M.-S. Chen and P.S. Yu, “Using Join Operations as Reducers in Distributed Query Processing,” Databases in Parallel and Distributed Systems, pp. 116–123, 1990.
[9] T.H. Cormen, C.E. Leiserson, and R.L. Rivest, Introduction to Algorithms, The MIT Press, 1990.
[10] C.J. Date, An Introduction to Database Systems, 2, Addison-Wesley, 1982.
[11] S. Ganguly, W. Hasan, and R. Krishnamurthy, “Query Optimization for Parallel Execution,” SIGMOD Record, 21(2), pp. 9–18, 1992.
[12] M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Company, 1978.
[13] A.R. Hevner and S.B. Yao, “Query Processing in Distributed Database Systems,” IEEE Transactions on Software Engineering, SE-5(3), pp. 177–187, 1979.
[14] T. Ibaraki and T. Kameda, “On the Optimal Nesting Order for Computing N-Relations Joins,” ACM Transactions on Database Systems, 9, pp. 482–502, 1984.
[15] Y.E. Ioannidis and S. Christodoulakis, “On the Propagation of Errors in the Size of Join Results,” Proceedings of the 1991 SIGMOD International Conference on Management of Data, pp. 268–277, 1991.
[16] R. Krishnamurthy, H. Boral, and C. Zaniolo, “Optimization of Nonrecursive Queries,” Proceedings of VLDB 86, pp. 128–137, 1986.
[17] H. Lu, M.C. Shan, and K.L. Tan, “Optimization of Multi-Way Join Queries for Parallel Execution,” Proceedings of VLDB 91, pp. 549–560, 1991.
[18] S. Pramanik and D. Vineyard, “Optimizing Join Queries in Distributed Databases,” IEEE Transactions on Software Engineering, 14(9), pp. 1319–1326, 1988.
[19] D. Maier, Theory of Relational Databases, Computer Science Press, 1993.
[20] Y. Mansour and B. Patt-Shamir, “Greedy Packet Scheduling on Shortest Paths,” Proceedings of the 10th Annual ACM Symposium on Principles of Distributed Computing, pp. 165–176, 1991.
[21] M.W. Orlowski, “On Optimisation of Joins in Distributed Database Systems,” Future Databases 92, World Scientific, pp. 106–114, 1992.
[22] M.W. Orlowski, Private Communication.
[23] D. Shasha and T.L. Wang, “Optimizing Equijoin Queries in Distributed Databases Where Relations are Hash Partitioned,” ACM Transactions on Database Systems, 16(2), pp. 279–308, 1991.
[24] A. Swami and A. Gupta, “Optimization of Large Join Queries: Combining Heuristics and Combinatorial Techniques,” Proceedings of SIGMOD 89, pp. 367–376, 1989.
[25] A.E. Taylor, Advanced Calculus, Ginn, 1955.
[26] M. Templeton, et al., “Mermaid-Experiences with network operation,” Proceedings of IEEE Data Engineering Conference, 1986.
[27] J.D. Ullman, Principles of Database Systems, Computer Science Press, Rockville, MD, 1982.
[28] C.P. Wang, “The Complexity of Processing Tree Queries in Distributed Databases,” 2nd IEEE Symposium on Parallel and Distributed Processing, pp. 604–611, 1990.
[29] C.P. Wang, V.O.K. Li, and A.L.P. Chen, “One-shot Semi-Join execution strategies for processing distributed queries,” 7th IEEE Data Engineering Conference, pp. 756–763, 1991.
[30] C.P. Wang, A.L.P. Chen, and S.-C. Shyu, “A Parallel Execution Method for Minimizing Distributed Query Response Time,” IEEE Transactions on Parallel and Distributed Systems, 3(3), pp. 325–333, 1992.
[31] E. Wong, “Dynamic Rematerialization: Processing Distributed Queries Using Redundant Data,” IEEE Transactions on Software Engineering, SE-9(3), pp. 228–232, 1983.
[32] C.T. Yu and C.C. Chang, “Distributed Query Processing,” ACM Computing Surveys, 16(4), 1984.
[33] C.T. Yu, Z.M. Ozsoyoglu, and K. Lam, “Optimization of Distributed Tree Queries,” Journal of Computer and System Sciences, 29, pp. 399–433, 1984.


Author information

Xuemin Lin
Present address: Department of Computer Science, The University of Western Australia, 6009, WA, Australia

Authors and Affiliations

Department of Computer Science, The University of Queensland, 4072, QLD, Australia
Xuemin Lin & Maria E. Orlowska


Additional information

Editor: Peter Apers


About this article

Cite this article

Lin, X., Orlowska, M.E. An efficient processing of a chain join with the minimum communication cost in distributed database systems. Distrib Parallel Databases 3, 69–83 (1995). https://doi.org/10.1007/BF01263657


Received: 17 June 1993
Revised: 30 June 1994
Issue Date: January 1995
DOI: https://doi.org/10.1007/BF01263657
