Mapping Search Results into Self-Customized Category Hierarchy

Tan, Saravadee Sae; Hoon, Gan Keng; Yong, Chan Huah; Kong, Tang Enya; Lin, Cheong Sook

doi:10.1007/0-387-23152-8_41

Part of the book series: IFIP International Federation for Information Processing (IFIPAICT, volume 163)

Included in the following conference series:

International Conference on Intelligent Information Processing

Abstract

With the rapid growth of online information, a simple search query may return thousands or even millions of results. Users need help accessing and identifying relevant information in a flexible way. This paper describes a methodology that automatically maps web search results into user-defined categories. This allows users to focus on the categories that interest them, helping them find relevant information in less time. A text classification algorithm is used to map search results into categories. The paper focuses on the feature selection method and the term weighting measure, with the goal of training an optimal yet simple category model from a relatively small number of training texts. Experimental evaluations on real-world data collected from the web show that our classification algorithm gives promising results and can potentially be used to classify search results returned by search engines.
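
The abstract does not say which feature selection method or term weighting measure the paper uses, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes TF-IDF weighting and a simple top-k term cut-off as stand-ins, and every name in it (`tokenize`, `train`, `classify`, `example_training`) is hypothetical. Each user-defined category is trained from a few short texts, and each search-result snippet is mapped to the category whose selected terms it matches best.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(category_docs, top_k=20):
    """Build one weighted term vector per category.

    category_docs maps a category name to a small list of training texts.
    Assumption: TF-IDF weighting plus top-k selection stand in for the
    paper's (unspecified) term weighting and feature selection steps.
    """
    docs = [set(tokenize(d)) for texts in category_docs.values() for d in texts]
    n_docs = len(docs)
    # Document frequency: in how many training texts each term occurs.
    df = Counter(t for doc in docs for t in doc)

    models = {}
    for category, texts in category_docs.items():
        tf = Counter(t for d in texts for t in tokenize(d))
        weights = {t: c * math.log(n_docs / df[t]) for t, c in tf.items()}
        # Feature selection: keep only the top_k highest-weighted terms.
        selected = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
        # Normalize to unit length so categories are comparable.
        norm = math.sqrt(sum(w * w for _, w in selected)) or 1.0
        models[category] = {t: w / norm for t, w in selected}
    return models


def classify(snippet, models):
    """Return the best-matching category, or None if no selected term matches."""
    tokens = tokenize(snippet)
    scores = {c: sum(m.get(t, 0.0) for t in tokens) for c, m in models.items()}
    best = max(scores, key=scores.get, default=None)
    return best if best is not None and scores[best] > 0 else None


# Hypothetical training texts for two user-defined categories.
example_training = {
    "sports": ["football match score goal", "tennis match player score"],
    "finance": ["stock market price shares", "bank interest rate market"],
}

if __name__ == "__main__":
    models = train(example_training)
    for snippet in ["latest football score", "bank interest rate", "weather forecast"]:
        print(snippet, "->", classify(snippet, models))
```

Returning `None` for a snippet that matches no selected term is a design choice of this sketch: it leaves such results uncategorized rather than forcing them into an arbitrary category.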

Author information

Authors and Affiliations

Computer Aided Translation Unit (UTMK), School of Computer Sciences, Universiti Sains Malaysia, 11800, Penang, Malaysia
Saravadee Sae Tan, Gan Keng Hoon, Tang Enya Kong & Cheong Sook Lin
Grid Computing Lab, School of Computer Sciences, Universiti Sains Malaysia, 11800, Penang, Malaysia
Chan Huah Yong

Editor information

Editors and Affiliations

Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100080, China
Zhongzhi Shi & Qing He

About this paper

Cite this paper

Tan, S.S., Hoon, G.K., Yong, C.H., Kong, T.E., Lin, C.S. (2005). Mapping Search Results into Self-Customized Category Hierarchy. In: Shi, Z., He, Q. (eds) Intelligent Information Processing II. IIP 2004. IFIP International Federation for Information Processing, vol 163. Springer, Boston, MA. https://doi.org/10.1007/0-387-23152-8_41

DOI: https://doi.org/10.1007/0-387-23152-8_41
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-23151-8
Online ISBN: 978-0-387-23152-5
eBook Packages: Computer Science, Computer Science (R0)
