Improving the Ability of Mining for Multi-dimensional Data

Shi, Yong; Kling, Tyler

doi:10.1007/978-3-642-17622-7_30


Part of the book series: Communications in Computer and Information Science (CCIS, volume 118)


Abstract

In this paper, we continue our research on data analysis, building on our previous work on similarity search problems. PanKNN [13] is a novel technique that explores the meaning of K nearest neighbors from a new perspective, redefines the distances between data points and a given query point Q, and efficiently and effectively selects the data points closest to Q. It can be applied in various data mining fields. We present our approach to improving the scalability of the PanKNN algorithm. The proposed approach can help improve the performance of existing data analysis technologies, such as data mining approaches in bioinformatics.
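The abstract does not spell out how PanKNN redefines distance, so the sketch below is not the PanKNN algorithm. It only illustrates the K-nearest-neighbor query the abstract refers to: a brute-force Euclidean baseline, followed by a hypothetical per-dimension rank-sum scoring that is loosely suggested by the "dimension-wise" wording in the title of [13]. The function names, the rank-sum rule, and the sample data are all assumptions made for illustration.

```python
import heapq
import math
from typing import List, Sequence

Point = Sequence[float]


def knn_euclidean(data: List[Point], q: Point, k: int) -> List[int]:
    """Brute-force baseline: indices of the k points closest to q
    under ordinary Euclidean distance."""
    dists = [(math.dist(p, q), i) for i, p in enumerate(data)]
    return [i for _, i in heapq.nsmallest(k, dists)]


def knn_dimension_wise(data: List[Point], q: Point, k: int) -> List[int]:
    """Hypothetical dimension-wise variant (an assumption, not PanKNN):
    rank every point by its closeness to q in each dimension separately,
    sum the ranks, and return the k points with the smallest total."""
    n = len(data)
    scores = [0] * n
    for d in range(len(q)):
        # Order the points by absolute difference from q in dimension d.
        order = sorted(range(n), key=lambda i: abs(data[i][d] - q[d]))
        for rank, i in enumerate(order):
            scores[i] += rank
    ranked = [(s, i) for i, s in enumerate(scores)]
    return [i for _, i in heapq.nsmallest(k, ranked)]


# Illustrative data only; not taken from the paper.
sample = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.2), (9.0, 0.0)]
query = (0.0, 0.0)
nearest_baseline = knn_euclidean(sample, query, 2)
nearest_dimwise = knn_dimension_wise(sample, query, 2)
```

Both functions scan every point for every query, so neither addresses the scalability concern the abstract raises; they serve only as a reference point for what "selecting the data points closest to Q" means.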


References

[1] White, D.A., Jain, R.: Similarity Indexing with the SS-tree. In: Proceedings of the 12th Intl. Conf. on Data Engineering, New Orleans, Louisiana, pp. 516–523 (February 1996)
[2] Achtert, E., Böhm, C., Kröger, P., Kunath, P., Pryakhin, A., Renz, M.: Efficient reverse k-nearest neighbor search in arbitrary metric spaces. In: SIGMOD 2006, pp. 515–526. ACM, New York (2006)
[3] Aggarwal, C.C.: Towards meaningful high-dimensional nearest neighbor search by human-computer interaction. In: ICDE (2002)
[4] Aggarwal, C.C., Hinneburg, A., Keim, D.A.: On the surprising behavior of distance metrics in high dimensional space. In: Van den Bussche, J., Vianu, V. (eds.) ICDT 2001. LNCS, vol. 1973, p. 420. Springer, Heidelberg (2000), citeseer.nj.nec.com/aggarwal01surprising.html
[5] Bay, S.D.: The UCI KDD Archive, University of California, Irvine, Department of Information and Computer Science, http://kdd.ics.uci.edu
[6] Berchtold, S., Keim, D.A., Kriegel, H.-P.: The X-tree: An index structure for high-dimensional data. In: VLDB 1996, Bombay, India, pp. 28–39 (1996)
[7] Beyer, K., Goldstein, J., Ramakrishnan, R., Shaft, U.: When is “nearest neighbor” meaningful? In: International Conference on Database Theory 1999, Jerusalem, Israel, pp. 217–235 (1999)
[8] Cui, B., Shen, H., Shen, J., Tan, K.: Exploring bit-difference for approximate KNN search in high-dimensional databases. In: Australasian Database Conference (2005)
[9] Fagin, R., Kumar, R., Sivakumar, D.: Efficient similarity search and classification via rank aggregation (2003)
[10] Gionis, A., Indyk, P., Motwani, R.: Similarity search in high dimensions via hashing. The VLDB Journal, 518–529 (1999)
[11] Hinneburg, A., Aggarwal, C.C., Keim, D.A.: What is the nearest neighbor in high dimensional spaces? The VLDB Journal, 506–515 (2000)
[12] Seidl, T., Kriegel, H.-P.: Optimal multi-step k-nearest neighbor search. SIGMOD Rec. 27(2), 154–165 (1998)
[13] Shi, Y., Zhang, L.: A dimension-wise approach to similarity search problems. In: The 4th International Conference on Data Mining, DMIN 2008 (2008)
[14] Weber, R., Schek, H.-J., Blott, S.: A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In: Proc. 24th Int. Conf. Very Large Data Bases, VLDB, pp. 194–205 (1998)


Author information

Authors and Affiliations

Department of Computer Science and Information Systems, Kennesaw State University, 1000 Chastain Road, Kennesaw, GA, 30144
Yong Shi & Tyler Kling


Editor information

Editors and Affiliations

Victoria University, 8001, Melbourne, VIC, Australia
Yanchun Zhang
University of Calabria, Via P. Bucci, 41C, I-87036, Rende, Cosenza, Italy
Alfredo Cuzzocrea
Hosei University, 3-7-2, Kajino-cho, Koganei-shi, 184-8584, Tokyo, Japan
Jianhua Ma
Information Security Infrastructure Research Group, Electronics and Telecommunications Research Institute, 161, Gajeong-Dong, Yuseong-Gu, Daejeon, Korea
Kyo-il Chung
Engineering and Electronics, Edinburgh University, King’s Buildings, Faraday, rm 3.101, Mayfield Road, EH9 3JL, Edinburgh, UK
Tughrul Arslan
Nanjing University of Aeronautics and Astronautics, 210016, Jiangsu, Nanjing, China
Xiaofeng Song


About this paper

Cite this paper

Shi, Y., Kling, T. (2010). Improving the Ability of Mining for Multi-dimensional Data. In: Zhang, Y., Cuzzocrea, A., Ma, J., Chung, Ki., Arslan, T., Song, X. (eds) Database Theory and Application, Bio-Science and Bio-Technology. BSBT DTA 2010. Communications in Computer and Information Science, vol 118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17622-7_30


DOI: https://doi.org/10.1007/978-3-642-17622-7_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17621-0
Online ISBN: 978-3-642-17622-7
eBook Packages: Computer Science, Computer Science (R0)
