Abstract
Uyghur, Kazak, and Kyrgyz have no language identifier, and some of their letters share code points in Unicode. As a result, it is difficult to tell Uyghur, Kazak, and Kyrgyz letters apart in information exchange, automatic word segmentation, and retrieval applications, which leads to linguistic ambiguity. In addition, in the code region ordered by the Arabic alphabet, the Uyghur, Kazak, and Kyrgyz letters are out of alphabetical order, which greatly complicates indexing, query processing, and sorting of Uyghur, Kazak, and Kyrgyz multilingual data. This paper studies these practical problems and proposes solutions: for linguistic ambiguity, a Relocated Unicode Format (RuniForm) encoding method; for multilingual indexing, an indexing technique based on MD5 hashing, together with a related query processing approach, for the Uyghur, Kazak, and Kyrgyz information retrieval system (UKKIRS). Experimental results indicate that the proposed algorithms solve the problems above well and are well suited to UKKIRS.
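The abstract does not describe how the MD5-based index is built, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes each term is hashed together with a language tag, so that identical code point sequences from different languages get distinct, fixed-length index keys. The names `term_key` and `HashedIndex`, the language tags, and the sample string are hypothetical.

```python
import hashlib
from collections import defaultdict


def term_key(lang: str, term: str) -> str:
    """Hypothetical helper: fixed-length key for a (language, term) pair.

    The language tag is hashed with the term, so the same code point
    sequence yields different keys for different languages.
    """
    payload = f"{lang}\x00{term}".encode("utf-8")
    return hashlib.md5(payload).hexdigest()


class HashedIndex:
    """Toy inverted index keyed by MD5 digests (an assumption, not the paper's design)."""

    def __init__(self):
        self.postings = defaultdict(set)  # digest -> set of document ids

    def add(self, doc_id: int, lang: str, terms):
        for term in terms:
            self.postings[term_key(lang, term)].add(doc_id)

    def lookup(self, lang: str, term: str):
        # .get avoids creating empty posting lists for unknown terms
        return sorted(self.postings.get(term_key(lang, term), ()))


# An arbitrary Arabic-script string, deliberately reused for two languages.
TERM = "\u0643\u0649\u062a\u0627\u0628"

index = HashedIndex()
index.add(1, "ug", [TERM])  # document 1 tagged as Uyghur
index.add(2, "kk", [TERM])  # document 2 tagged as Kazak
```

In this sketch, `index.lookup("ug", TERM)` and `index.lookup("kk", TERM)` resolve to different posting lists even though the stored code points are identical. Because the keys are fixed-length digests, lookup does not depend on the code point order of the letters, but the digests carry no alphabetical order of their own, so sorting results would still need a separate collation step.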
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Tursun, D., Tohti, T., Hamdulla, A. (2009). Research on Multilingual Indexing and Query Processing in Uyghur, Kazak, and Kyrgyz Multilingual Information Retrieval System. In: Lee, R., Hu, G., Miao, H. (eds) Computer and Information Science 2009. Studies in Computational Intelligence, vol 208. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01209-9_24
DOI: https://doi.org/10.1007/978-3-642-01209-9_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01208-2
Online ISBN: 978-3-642-01209-9
eBook Packages: Engineering, Engineering (R0)