Abstract
In this paper, we continue our research on data analysis, building on our previous work on similarity search problems. PanKNN [13] is a novel technique that explores the meaning of K nearest neighbors from a new perspective, redefines the distances between data points and a given query point Q, and efficiently and effectively selects the data points closest to Q. It can be applied in various data mining fields. Here we present our approach to improving the scalability of the PanKNN algorithm. The proposed approach can help improve the performance of existing data analysis technologies, such as data mining approaches in bioinformatics.
References
White, D.A., Jain, R.: Similarity Indexing with the SS-tree. In: Proceedings of the 12th Intl. Conf. on Data Engineering, New Orleans, Louisiana, pp. 516–523 (February 1996)
Achtert, E., Böhm, C., Kröger, P., Kunath, P., Pryakhin, A., Renz, M.: Efficient reverse k-nearest neighbor search in arbitrary metric spaces. In: SIGMOD 2006, pp. 515–526. ACM, New York (2006)
Aggarwal, C.C.: Towards meaningful high-dimensional nearest neighbor search by human-computer interaction. In: ICDE (2002)
Aggarwal, C.C., Hinneburg, A., Keim, D.A.: On the surprising behavior of distance metrics in high dimensional space. In: Van den Bussche, J., Vianu, V. (eds.) ICDT 2001. LNCS, vol. 1973, p. 420. Springer, Heidelberg (2000), citeseer.nj.nec.com/aggarwal01surprising.html
Bay, S.D.: The UCI KDD Archive, University of California, Irvine, Department of Information and Computer Science, http://kdd.ics.uci.edu
Berchtold, S., Keim, D.A., Kriegel, H.-P.: The X-tree: An index structure for high-dimensional data. In: VLDB 1996, Bombay, India, pp. 28–39 (1996)
Beyer, K., Goldstein, J., Ramakrishnan, R., Shaft, U.: When is “nearest neighbor” meaningful? In: International Conference on Database Theory 1999, Jerusalem, Israel, pp. 217–235 (1999)
Cui, B., Shen, H., Shen, J., Tan, K.: Exploring bit-difference for approximate KNN search in high-dimensional databases. In: Australasian Database Conference (2005)
Fagin, R., Kumar, R., Sivakumar, D.: Efficient similarity search and classification via rank aggregation (2003)
Gionis, A., Indyk, P., Motwani, R.: Similarity search in high dimensions via hashing. The VLDB Journal, 518–529 (1999)
Hinneburg, A., Aggarwal, C.C., Keim, D.A.: What is the nearest neighbor in high dimensional spaces? The VLDB Journal, 506–515 (2000)
Seidl, T., Kriegel, H.-P.: Optimal multi-step k-nearest neighbor search. SIGMOD Rec. 27(2), 154–165 (1998)
Shi, Y., Zhang, L.: A dimension-wise approach to similarity search problems. In: The 4th International Conference on Data Mining, DMIN 2008 (2008)
Weber, R., Schek, H.-J., Blott, S.: A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In: Proc. 24th Int. Conf. Very Large Data Bases, VLDB, pp. 194–205 (1998)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shi, Y., Kling, T. (2010). Improving the Ability of Mining for Multi-dimensional Data. In: Zhang, Y., Cuzzocrea, A., Ma, J., Chung, Ki., Arslan, T., Song, X. (eds) Database Theory and Application, Bio-Science and Bio-Technology. BSBT DTA 2010. Communications in Computer and Information Science, vol 118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17622-7_30
DOI: https://doi.org/10.1007/978-3-642-17622-7_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17621-0
Online ISBN: 978-3-642-17622-7
eBook Packages: Computer Science (R0)