Abstract
With the rapid growth of online information, a simple search query may return thousands or even millions of results. Users need a flexible way to access and identify the relevant information among them. This paper describes a methodology that automatically maps web search results into user-defined categories. This allows users to focus on the categories that interest them, helping them find relevant information in less time. A text classification algorithm is used to map search results into categories. The paper focuses on the feature selection method and the term weighting measure, with the aim of training a simple yet effective category model from a relatively small number of training texts. Experimental evaluations on real-world data collected from the web show that our classification algorithm gives promising results and can potentially be used to classify search results returned by search engines.
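The abstract does not say which feature selection method or term weighting measure the paper uses, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes TF-IDF weighting, keeps the top-k weighted terms per category as the selected features, and assigns a search-result snippet to the category with the highest weighted term overlap. The names `tokenize`, `train`, `classify` and `top_k`, and the sample data, are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(categories, top_k=5):
    """Build one small term-weight profile per user-defined category.

    categories: dict mapping category name -> list of training texts.
    Assumption: TF-IDF weighting and top-k selection stand in for the
    paper's unspecified term weighting and feature selection methods.
    """
    docs = [(name, tokenize(text))
            for name, texts in categories.items() for text in texts]
    n_docs = len(docs)

    # Document frequency of each term over all training texts.
    df = Counter()
    for _, tokens in docs:
        df.update(set(tokens))
    idf = {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}

    model = {}
    for name in categories:
        tf = Counter()
        for doc_name, tokens in docs:
            if doc_name == name:
                tf.update(tokens)
        weights = {term: tf[term] * idf[term] for term in tf}
        # Feature selection: keep the top_k highest-weighted terms
        # (ties broken alphabetically so the result is deterministic).
        top = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
        norm = math.sqrt(sum(w * w for _, w in top)) or 1.0
        model[name] = {term: w / norm for term, w in top}
    return model


def classify(model, snippet):
    """Return the best-matching category, or None if no feature matches."""
    tf = Counter(tokenize(snippet))
    scores = {name: sum(tf[term] * w for term, w in feats.items())
              for name, feats in model.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Hypothetical user-defined categories with a few training texts each.
training = {
    "sports": ["football match score goal", "tennis match player score"],
    "finance": ["stock market price share", "bank interest rate market"],
}
model = train(training)
```

Keeping only a handful of terms per category keeps the model small, which matches the abstract's stated goal of a simple category model trained from few texts; a snippet that matches none of the selected terms is left uncategorized rather than forced into a category.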
Copyright information
© 2005 International Federation for Information Processing
About this paper
Cite this paper
Tan, S.S., Hoon, G.K., Yong, C.H., Kong, T.E., Lin, C.S. (2005). Mapping Search Results into Self-Customized Category Hierarchy. In: Shi, Z., He, Q. (eds) Intelligent Information Processing II. IIP 2004. IFIP International Federation for Information Processing, vol 163. Springer, Boston, MA. https://doi.org/10.1007/0-387-23152-8_41
DOI: https://doi.org/10.1007/0-387-23152-8_41
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-23151-8
Online ISBN: 978-0-387-23152-5
eBook Packages: Computer Science, Computer Science (R0)