Information Representation and Retrieval in the Digital Age (2nd ed.)

Maja Žumer (Department of Library and Information Science and Book Studies, University of Ljubljana, Slovenia)

Journal of Documentation

ISSN: 0022-0418

Article publication date: 18 January 2011

Citation

Žumer, M. (2011), "Information Representation and Retrieval in the Digital Age (2nd ed.)", Journal of Documentation, Vol. 67 No. 1, pp. 201-202. https://doi.org/10.1108/00220411111105524

Publisher: Emerald Group Publishing Limited

Copyright © 2011, Emerald Group Publishing Limited


Information retrieval is a fast-changing area, and it is no surprise that Professor Chu decided to revise her book after nine years. To reflect the developments of the last decade, several topics have been added in this edition, such as social tagging, taxonomies, folksonomies, ontologies, web 2.0, semantic web, and natural language processing.

In the preface the author describes the book as a general, systematic, nontechnical overview of information retrieval, aimed particularly at beginners in the field. Information retrieval is discussed from the perspective of library and information science rather than computer science, and information representation is included only in the context of information retrieval, not as a separate parallel field.

The first three chapters set the stage: history of information representation and retrieval, key concepts and major components, followed by basic approaches to representation of content (indexing, categorisation, summarisation) and information representation in general (metadata, full text and representation of multimedia).

Chapter 4 deals with the use of natural language and controlled vocabularies in information representation and query formulation. Thesauri, subject headings lists and classification schemes are discussed and compared to the use of natural language. A description of taxonomies, folksonomies and ontologies concludes the chapter.

In chapters 5‐7 the author discusses the retrieval process. Chapter 5 is about common techniques, from basic (Boolean queries, truncation, proximity searching) to advanced (fuzzy searching, weighted searching, query expansion). Precision and recall are introduced and techniques for improving them are mentioned. Retrieval approaches, searching and browsing, are presented in Chapter 6, while Chapter 7 discusses theoretical retrieval models, particularly the Boolean model, vector space model, probability model and their extensions.

In Chapter 8 types of information retrieval systems are described: online systems, CD‐ROM systems, library OPACs and internet retrieval systems, including but not limited to web search engines.

Chapter 9 gives a short overview of multilingual information retrieval, but mostly focuses on still image, moving image and sound information retrieval. Chapter 10 follows with the user focus, discussing users and their information needs and theoretical models of user‐system interaction.

Chapter 11 is about evaluation and covers general evaluation measures and specific evaluation criteria for different types of information retrieval systems. It concludes with a detailed description of major evaluation projects such as the Cranfield tests and the TREC series.

The final Chapter, 12, looks into the future and introduces natural language processing, the semantic web and use of artificial intelligence in information retrieval.

This is indeed a comprehensive work, providing a general overview for newcomers to the field. They will appreciate the broad coverage of topics, but may, on the other hand, be confused by the fact that the same topic, for example evaluation or Boolean searching, is discussed several times in different chapters.

Anybody looking for more in‐depth information will have to look further and the list of references at the end of each chapter may be a starting point. While several new references have been added in this edition, the majority are from the 1990s and are in many cases rather dated.

Any book on such a broad topic will necessarily deal with some topics in passing, or even oversimplify them. Some errors in the text may therefore be attributed to such superficial treatment. For example, “ofness” is defined incorrectly in several places. Instead of its real meaning, depiction (exemplified in the statement “this is a picture of a lion”), which is mentioned on p. 168, it is defined as metadata such as the creator's name or publication year assigned to the resource (pp. 49, 51, 74). Another example is the sentence “But it is impossible to get search results high in both precision and recall because of the inverse relationship between the two measurements” (p. 81), which is simply wrong. While it is true that mechanisms that improve one usually decrease the other, there is no strict inverse relationship, and it is therefore possible to get search results that are high (or low) on both criteria. Fortunately the error is somewhat corrected on p. 211.
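The point about precision and recall can be checked with a small worked example. The sketch below is only an illustration and is not taken from the book: the function name `precision_recall`, the document identifiers and the relevance judgements are all hypothetical. It uses the standard definitions, precision as the share of retrieved documents that are relevant and recall as the share of relevant documents that are retrieved.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single result set.

    precision = relevant retrieved / all retrieved
    recall    = relevant retrieved / all relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical relevance judgements for one query (an assumption for this sketch).
relevant = {"d1", "d2", "d3", "d4"}

# A result set containing exactly the relevant documents: both measures are high.
both_high = precision_recall(["d1", "d2", "d3", "d4"], relevant)

# Ten documents retrieved, only one of them relevant: both measures are low.
both_low = precision_recall(
    ["d1", "d5", "d6", "d7", "d8", "d9", "d10", "d11", "d12", "d13"],
    relevant,
)

print(both_high)  # (1.0, 1.0)
print(both_low)   # (0.1, 0.25)
```

Neither result set would be possible if the two measures were strictly inversely related; the trade-off usually observed in practice is a tendency, not a mathematical necessity.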

Despite some critical remarks, I have to emphasise that Heting Chu has successfully managed to chart the map of information representation and retrieval, dealing with many intertwined facets of the field. This book is a good introduction to the foundations and key concepts, which will serve as the starting point for further exploration of the area. While it will probably not be used as the only textbook for an advanced information retrieval course, this book can be particularly recommended to educators as a comprehensive list of topics and sources, especially the more traditional ones.
