Abstract
The assumptions of uniformity and independence of attribute values in a file, uniformity of queries, constant number of records per block, and random placement of qualifying records among the blocks of a file are frequently used in database performance evaluation studies. In this paper we show that these assumptions often result in predicting only an upper bound of the expected system cost. We then discuss the implications of nonrandom placement, nonuniformity, and dependencies of attribute values on database design and database performance evaluation.
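The upper-bound claim can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes a simplified model in which each of `k` qualifying records falls independently into block `i` with probability `probs[i]`, and the function name `expected_blocks_hit` and the example numbers are hypothetical. Under that model the expected number of distinct blocks touched is the sum of `1 - (1 - p_i)^k`, and because each term is concave in `p_i`, the uniform placement gives the largest value.

```python
def expected_blocks_hit(probs, k):
    """Expected number of distinct blocks hit by k independent record draws.

    probs[i] is the (assumed) probability that a qualifying record
    lies in block i; the probabilities must sum to 1.
    """
    return sum(1.0 - (1.0 - p) ** k for p in probs)


m, k = 100, 50  # illustrative values: 100 blocks, 50 qualifying records

# Uniform (random) placement: every block is equally likely.
uniform = [1.0 / m] * m

# Nonuniform placement: half of the blocks hold 90% of the probability mass.
skewed = [0.9 / 50] * 50 + [0.1 / 50] * 50

uniform_cost = expected_blocks_hit(uniform, k)
skewed_cost = expected_blocks_hit(skewed, k)

print(f"uniform placement: {uniform_cost:.2f} blocks expected")
print(f"skewed placement:  {skewed_cost:.2f} blocks expected")
```

In this toy setting the uniform estimate is larger than the skewed one, which is the sense in which a uniformity assumption tends to overestimate the number of block accesses.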
Index Terms
- Implications of certain assumptions in database performance evaluation