Elsevier

Computer Networks

Volume 31, Issues 11–16, 17 May 1999, Pages 1653-1665
Adding support for dynamic and focused search with Fetuccino

https://doi.org/10.1016/S1389-1286(99)00045-6

Abstract

This paper proposes two enhancements to existing search services over the Web. One enhancement is the addition of limited dynamic search around results provided by regular Web search services, in order to correct part of the discrepancy between the actual Web and its static image as stored in search repositories. The second enhancement is an experimental two-phase paradigm that allows the user to distinguish between a domain query and a focused query within the dynamically identified domain. We present Fetuccino, an extension of the Mapuccino system that implements these two enhancements. Fetuccino provides an enhanced user-interface for visualization of search results, including advanced graph layout, display of structural information and support for standards (such as XML). While Fetuccino has been implemented on top of existing search services, its features could easily be integrated into any search engine for better performance. A light version of Fetuccino is available on the Internet at http://www.ibm.com/java/fetuccino.

Section snippets

Introduction and motivation

The Web keeps growing at a phenomenal rate. From the perspective of search technology, this growth has two important characteristics. First, the 'update' policy is totally uncontrolled, with millions of users creating, modifying and deleting content at will, and linking it to other content in an unstructured manner. Second, Web growth is increasingly fueled by the addition of dynamically and automatically generated content [18].

These characteristics impact the major criteria of evaluation for

The approach

Dynamic search is primarily distinguished from conventional static search in that the former involves fetching the actual documents at the time the query is issued and analyzing their relevance to the search query on the fly, while the latter is based on evaluating the query against pre-computed repositories. Obviously, dynamic search cannot be employed 'from scratch' because of the high cost of text analysis and the huge search space, and therefore it requires a starting point for recursive
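The distinction drawn above can be illustrated with a minimal sketch, which is not Fetuccino's actual algorithm. It assumes a hypothetical in-memory `TOY_WEB` in place of live HTTP fetches, a crude term-overlap `relevance` score, and a simple threshold rule that stops expansion at irrelevant pages. The names `fetch`, `relevance`, `dynamic_search`, `max_depth` and `threshold` are all illustrative assumptions. The seeds play the role of results returned by a static search service.

```python
from collections import deque

# Hypothetical stand-in for the live Web: URL -> (page text, outgoing links).
TOY_WEB = {
    "a": ("pasta recipes and sauces", ["b", "c"]),
    "b": ("fresh pasta dough recipes", ["d"]),
    "c": ("car repair manual", ["e"]),
    "d": ("pasta shapes glossary", []),
    "e": ("pasta recipes behind an irrelevant page", []),
}


def fetch(url):
    """Simulate fetching a page at query time; None means the page is gone."""
    return TOY_WEB.get(url)


def relevance(query, text):
    """Fraction of query terms present in the text (a deliberately crude score)."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(t in words for t in terms) / len(terms) if terms else 0.0


def dynamic_search(query, seeds, max_depth=2, threshold=0.5):
    """Explore outward from static-search seeds, scoring pages on the fly."""
    seen = set(seeds)
    queue = deque((url, 0) for url in seeds)
    results = []
    while queue:
        url, depth = queue.popleft()
        page = fetch(url)
        if page is None:
            continue  # stale repository entry: the page no longer exists
        text, links = page
        score = relevance(query, text)
        if score < threshold:
            continue  # assumption: irrelevant pages are not expanded further
        results.append((url, score))
        if depth < max_depth:
            for link in links:
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    # Highest score first; ties broken by URL for a stable order.
    return sorted(results, key=lambda r: (-r[1], r[0]))


if __name__ == "__main__":
    print(dynamic_search("pasta recipes", ["a", "gone"]))
```

In this toy setup the stale seed `gone` is silently dropped, which mirrors the validation of old information, while pages `b` and `d` are discovered even though they were never in the seed set. The depth bound is what keeps the exploration limited rather than open-ended.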

A two phase approach

A major problem when conducting searches over a large heterogeneous and uncontrolled document set such as the Web is with the quality of the results for ambiguous queries. Indeed, it happens quite often that the user discovers only after issuing a search that the query expression s/he picked has a different interpretation in a totally irrelevant domain. This might happen even if the user is an expert at expressing clear and precise queries, simply because of the large scope of the Web. This

Conclusion and future work

The accelerated growth of the Web might cause traditional, purely static search engines to become less accurate and less effective over time, even if their indexing, retrieval, and storage techniques are likely to improve. Fetuccino addresses this shortcoming by augmenting static search services with text-based dynamic exploration in the vicinity of the search results, thereby discovering new relevant information and validating old information. The key to making an effective use of the

Acknowledgements

We thank Jon Kleinberg and Ron Fagin for useful discussions on Hubs and Authorities (and Ron Pinter for putting us in contact). We are also grateful to Prabhakar Raghavan for letting us use the Clever system. Finally, we are indebted to Dirk Nicol for hosting Mapuccino and Fetuccino on the IBM Corporate Java Site. This research was done while Dan Pelleg was an extern student at the IBM Haifa Research Laboratory.

References (22)

  • K. Bharat, A. Broder, M. Henzinger, P. Kumar and S. Venkatasubramanian, The connectivity server: fast access to linkage...
  • S. Brin and L. Page, The anatomy of a large-scale hypertextual Web search engine, in: Proc. 7th International World...
  • J. Carriere and R. Katzman, WebQuery: searching and visualizing the Web through connectivity, in: Proc. of the 6th...
  • S. Chakrabarti, B. Dom, D. Gibson, S.R. Kumar, P. Raghavan, S. Rajagopalan and A. Tomkins, Experiments in topic...
  • S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, P. Raghavan and S. Rajagopalan, Automatic resource list compilation by...
  • D.R. Cutting, D.R. Karger, J.O. Pedersen and J.W. Tukey, Scatter/gather: a cluster-based approach to browsing large...
  • P. De Bra, G.-J. Houben, Y. Kornatzky and R. Post, Information retrieval in distributed hypertexts, in: Proc. of...
  • D. Gibson, J. Kleinberg and P. Raghavan, Inferring Web communities from link topology, in: Proc. of the 9th ACM...
  • M. Herscovici, M. Jacovi, Y.S. Maarek, D. Pelleg, M. Shtalhaim and S. Ur, The shark-search algorithm: an application:...
  • J. Kleinberg, Authoritative sources in a hyperlinked environment, in: Proc. of the 9th ACM-SIAM Symposium on Discrete...
  • O. Liechti, M.J. Sifer and T. Ichikawa, Structured graph format: XML metadata for describing Web site structure, in:...
    Issy Ben-Shaul is a faculty member in the department of Electrical Engineering at the Technion — Israel Institute of Technology, and a consultant in the Information Retrieval and Organization Group at the IBM Haifa Research Lab. He received his BSc in Mathematics and Computer Science from Tel Aviv University in 1988, and his MS and PhD in Computer Science from Columbia University in 1991 and 1995, respectively. During 1995, before joining the Technion, Ben-Shaul was a research staff member at the IBM research laboratory in Haifa, and worked on applications and extensions of clustering technology to the Internet. He is leading the Distributed Systems Group and the associated software systems laboratory at the Technion. His research interests include distributed and mobile systems, software engineering, information retrieval, Web, advanced transactions, workflow management systems and electronic commerce. He has published over 30 papers in refereed journals and conference

    Michael Herscovici is a Research Staff Member at the IBM Haifa Research Lab in Haifa, Israel and belongs to the 'Information Retrieval and Organization' Group. His research interests include Internet applications and parsing techniques. Mr. Herscovici received his B.Sc. in Computer Science from the Technion, Israel Institute of Technology in Haifa, in 1998. He joined IBM in 1997 and has since worked on the dedicated robot component of Mapuccino, a Web site mapping tool.

    Michal Jacovi is a Research Staff Member at the IBM Haifa Research Lab in Haifa, Israel, and belongs to the 'Information Retrieval and Organization' Group. Her research interests include Internet applications, user interfaces, and visualization. She received her M.Sc. in Computer Science from the Technion, Haifa, Israel, in 1993. Ms. Jacovi joined IBM in 1993 and has worked on several projects involving user interfaces and object-oriented technology, some of which have been published in journals and conferences. Since the emergence of Java, she has been involved in the conception and implementation of Mapuccino, a Web site mapping tool written in Java that is being integrated into several IBM products.

    Yoelle S. Maarek is a Research Staff Member at the IBM Haifa Research Lab in Haifa, Israel and manages the 'Information Retrieval and Organization' Group, which has about 15 members. Her research interests include information retrieval, Internet applications, and software reuse. She graduated from the 'Ecole Nationale des Ponts et Chaussees', Paris, France, and received her D.E.A. (graduate degree) in Computer Science from Paris VI University in 1985. She received a Doctor of Science degree from the Technion, Haifa, Israel, in January 1989. Before joining IBM Israel, Dr Maarek was a research staff member at the IBM T.J. Watson Research Center for about 5 years. She serves on the program committees of several international conferences and is a member of the Review Board of the WebNet Journal. She has published over 25 papers in refereed journals and conferences.

    Dan Pelleg received his B.A. in 1995 and his MSc in 1998 from the Department of Computer Science, Technion, Haifa, Israel. His Master's thesis topic was 'Phylogeny Approximation via Semidefinite Programming'. He is currently a PhD candidate in the CS Dept. at Carnegie Mellon University. His research interests include computational biology, combinatorial optimization and Web-based software agents. During the summers of 1997 and 1998, Dan worked as an extern student at the IBM Haifa Research Laboratory.

    Menachem Shtalhaim is a Research Staff Member at the IBM Haifa Research Lab in Haifa, Israel and belongs to the 'Information Retrieval and Organization' Group. His research interests include Internet applications, communication protocols and heuristic algorithms. Mr. Shtalhaim joined IBM in 1993 and has worked on several projects involving morphological analysis tools, a network design and analysis tool (IBM product NetDA/2) and the AS/400 logical file system layer. In the past, Mr. Shtalhaim has worked on medical diagnostic systems. He is the author of the dedicated robot component of Mapuccino, a Web site mapping tool.

    Vladimir Soroka is a Research Staff Member at the IBM Haifa Research Lab in Haifa, Israel and belongs to the 'Information Retrieval and Organization' Group. His research interests include Internet applications and information organization. Mr. Soroka received his B.Sc. in Computer Science from the Technion, Israel Institute of Technology in Haifa, Israel in 1996. Before joining IBM, Mr. Soroka worked on Internet-based fax servers. He joined IBM in 1998 and has since worked on various applications such as Mapuccino, a Web site mapping tool.

    Sigalit Ur is a Research Staff Member at the IBM Haifa Research Lab in Haifa, Israel, working on Mapuccino, a Web site mapping tool, written in Java, that is being integrated into several IBM products. She received a Master of Science degree in Intelligent Systems from the University of Pittsburgh in 1993. Before joining IBM, Ms. Ur was involved in projects in a wide variety of fields, including data processing, databases, cognitive science, multi-agent planning and image processing, some of which have been published in journals and conferences.
