
1 Motivation

In the legal domain, the amount of information available digitally is rapidly increasing. Legal scholars and professionals have to navigate this information to find the case law and articles relevant to them. Dutch legal information retrieval (IR) systems currently base their ranking primarily on matching queries and documents. To help users find the most relevant documents, these ranking algorithms should be expanded.

A possible solution is bibliometric-enhanced information retrieval, in which citations are measured and used as a proxy for impact in the ranking algorithm. The legal domain, however, differs from other research domains due to the often strong interconnection between research and practice. In the Dutch legal domain this is demonstrated by the lack of a distinction between legal scientific and professional publications. This is one of the reasons why bibliometrics, which in other fields measures the impact of scientific publications, has not yet been established in the Dutch legal domain.

This research will cover both the theory (such as relevance factors in legal publications and the meaning of citations in legal publications) and experimentation with applying bibliometrics to legal documents. The result will be a new bibliometrics-enhanced ranking algorithm for legal IR systems.

2 Research Questions

The main question in this research is: can we use citation and usage (click) metrics to improve ranking in legal IR? This question comprises five sub-questions:

  1. What factors influence the perception of relevance of users of legal IR systems?

  2. What does a citation signify in legal publications?

  3. What is the relation between user interactions with documents (usage) and citations in legal publications?

  4. What is the right balance between text-based relevance and user-based relevance in ranking algorithms for legal IR?

  5. What is the appropriate user-focused rank evaluation metric for legal IR?

3 Related Work

The notion of using bibliometrics to enhance information retrieval is inspired by the groundwork of Garfield [5] on the theory of citations as a measure of impact, and by the more recent work of Beel and Gipp [4]. The normalization of citations is based on the work of Waltman et al. [18]; following their work, we have decided on time, field, and document-type normalization. The context of Dutch legal publications and their citation culture is provided by Stolker [16] and Snel [15].
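In concrete terms, time, field, and document-type normalization amounts to dividing a document's citation count by the mean count of its peer group. The sketch below illustrates this idea in the spirit of the normalized indicators of Waltman et al. [18]; the record layout and field names are our illustrative assumptions, not taken from their work.

```python
from collections import defaultdict

def normalized_citations(docs):
    """Divide each document's citation count by the mean count of its
    (year, field, doc_type) peer group.

    `docs`: list of dicts with illustrative keys
    {"id", "year", "field", "doc_type", "citations"}.
    """
    # Group citation counts by the three normalization dimensions.
    groups = defaultdict(list)
    for d in docs:
        groups[(d["year"], d["field"], d["doc_type"])].append(d["citations"])
    means = {key: sum(counts) / len(counts) for key, counts in groups.items()}

    scores = {}
    for d in docs:
        mean = means[(d["year"], d["field"], d["doc_type"])]
        # A score of 1.0 means "cited exactly as often as its peers".
        scores[d["id"]] = d["citations"] / mean if mean > 0 else 0.0
    return scores
```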

This research is inspired by work of Van Opijnen and Santos [10], who translate the theory of Saracevic [13] on the spheres of relevance to the legal domain. The work of Barry and Schamber [2, 3] describes the different indicators of relevance as identified by users.

The choice for a user-focused evaluation metric is influenced by the work of Järvelin and Kekäläinen [6]. The decision to use click data as a form of implicit feedback is based on work of Joachims et al. [7, 8].

4 Preliminary Work

4.1 What Factors Influence the Perception of Relevance of Users of Legal IR Systems?

The goal of our study [17] was to make explicit which document characteristics users consider as factors of relevance when assessing a search result (a document representation such as title and snippet) in legal IR. To this end, we conducted a user questionnaire in which we showed users of a legal IR system a query and two search results. Each user chose which of the two results they would like to see ranked higher for the query and was asked to provide a reasoning for that choice. The questionnaire contained eleven pairs of search results spread over two queries. A total of 43 legal professionals participated in our study.

The identified relevance factors were title relevance, document type, recency, level of depth, legal hierarchy, law area (topic), authority (credibility), bibliographical relevance, source authority, usability, whether the document is annotated, and the length of the document. These factors confirm previous research, such as the work of Barry and Schamber [2, 3]. They also suggest that there are document characteristics (e.g. authority, legal hierarchy, and whether the document is annotated) that are usually grouped under cognitive or situational relevance, and thereby considered personal, but on which users in the legal domain agree. This agreement within the field means that these factors can be grouped under domain relevance as described by Van Opijnen and Santos [10].

4.2 What Does a Citation Signify in Legal Publications?

In our next paper [19] we examined citations in legal information retrieval. Citation metrics can be a factor of relevance in the ranking algorithms of IR systems. The challenge in legal bibliometrics, and therefore in legal bibliometric-enhanced IR, is that the legal domain differs from other research domains in two ways: (1) its strong national ties and (2) the often strong interconnection between research and practice [16].

First, we contrasted citations in the legal domain with citations in the hard sciences, based on the literature on scholarly citations. Second, we applied a quantitative analysis of legal documents and citations to test whether the theory described in the literature (particularly the distinction between scholarly and practitioner-oriented publications) is confirmed by the data.

An analysis of 52 cited (seed) documents and 3196 citing documents showed no strict separation in citations between documents aimed at scholars and documents aimed at practitioners. Our results suggest that citations in legal documents do not measure the impact on scholarly publications and scholars, but measure a broader scope of impact, or relevance, for the legal field.

5 Proposed Methods

5.1 What Is the Relation Between User Interactions with Documents (Usage) and Citations in Legal Publications?

Based on the outcome of the above research question, we wish to create a boost function for the bibliometric-enhanced ranking algorithm. Because citations in legal documents measure a broader form of impact, citations alone are unlikely to provide a complete picture of this broader impact; other factors, such as usage (measured through click data), can make the picture more complete.

However, the work of Perneger [12] suggests that there might be a correlation between usage and citations. We therefore took all documents published and added to the Legal Intelligence system (the largest legal IR system in the Netherlands) in February 2017, which yielded a set of 43,218 documents.

For each document, identified by its unique document ID, we collected all click data (usage) until 2019. We then computed the Spearman correlation between usage and citations. We chose the Spearman correlation because the data, like all citation data, is not normally distributed. The results show a Spearman correlation of 0.57 (\(p < 0.0001\)): a moderate positive correlation between citations and usage, which is highly significant.
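For illustration, the correlation can be computed with SciPy; the arrays below are placeholders, not the actual Legal Intelligence data.

```python
import numpy as np
from scipy.stats import spearmanr

# One entry per document, aligned by document ID (placeholder values).
usage = np.array([120, 4, 37, 0, 512, 9])    # clicks collected until 2019
citations = np.array([15, 0, 4, 1, 60, 2])   # citation counts

# Spearman is rank-based and therefore makes no normality assumption,
# which suits the heavily skewed distributions of citation and click data.
rho, p_value = spearmanr(usage, citations)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.4g})")
```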

For the ranking algorithm, this correlation means that two separate boost functions would boost some results too much and others too little. A single harmonized boost function therefore appears to be the better choice.
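A minimal sketch of what such a harmonized function could look like: a single log-dampened boost over a blended signal. The blend weight and the logarithmic dampening are illustrative assumptions, not a settled design.

```python
import math

def harmonized_boost(citations: int, clicks: int, alpha: float = 0.5) -> float:
    """One boost over a blended citation/usage signal instead of two
    separate boosts. `alpha` weighs usage against citations and is a free
    parameter; log1p dampens the skewed tails of both distributions."""
    return math.log1p(citations + alpha * clicks)
```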

5.2 What Is the Right Balance Between Text-Based Relevance and User-Based Relevance in Ranking Algorithms for Legal IR?

Besides balancing citations and usage, the boost function as a whole will also have to be balanced against the current text-based relevance score (TF-IDF, BM25, or similar). Furthermore, the usage (click) and citation data of a document are not reliable from the moment of its publication. Such new documents will have to be given the benefit of the doubt, for example through a freshness score. The optimal balance of these variables will be tuned using the evaluation metric.
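The sketch below shows one way these variables could be combined, assuming a text score normalized to [0, 1] and the harmonized boost of Sect. 5.1 on a comparable scale; the linear blend, the exponential freshness decay, and all parameter values are illustrative assumptions to be tuned, not the final algorithm.

```python
import math

def ranking_score(text_score: float, boost: float, age_days: int,
                  lam: float = 0.8, half_life_days: float = 365.0) -> float:
    """Blend a text-based relevance score (e.g. BM25) with the
    bibliometric boost. A decaying freshness score gives new documents,
    whose click and citation data are not yet reliable, the benefit of
    the doubt: the user-based signal never drops below it."""
    freshness = math.exp(-math.log(2) * age_days / half_life_days)
    return lam * text_score + (1 - lam) * max(boost, freshness)
```

Here `lam` sets the balance between text-based and user-based relevance and would be tuned, together with the other parameters, using the evaluation metric of Sect. 5.3.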

5.3 What Is the Appropriate User-Focused Rank Evaluation Metric for Legal IR?

One of the challenges for legal information retrieval is finding the correct evaluation metric. This has several causes:

  1. The small user group compared to web search, which makes A/B testing difficult.

  2. Differences in results lists: because of differences in journal subscriptions, two users may not see the same results list when they issue the same query.

  3. The high fees of legal experts, which make a golden answer set prohibitively expensive to create and maintain.

A metric based on an implicit feedback model is of particular interest, as it is affordable and user-focused. We have made a first attempt to create an implicit-feedback DCG model, but data sparsity poses a problem: it is not possible to find queries for which the entire results list has relevance judgments. Possible solutions will be sought in the fields of patent search and e-discovery.
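A minimal sketch of such an implicit-feedback DCG, following Järvelin and Kekäläinen [6]: relevance grades are derived from click behaviour, and `None` marks results without a judgment. Simply skipping those results, as done here, is a naive placeholder that makes the sparsity problem visible rather than solving it.

```python
import math

def implicit_dcg(grades):
    """Discounted cumulative gain over a ranked list of click-derived
    relevance grades; `None` marks a result with no implicit judgment."""
    score = 0.0
    for rank, grade in enumerate(grades, start=1):
        if grade is None:
            continue  # data sparsity: no judgment for this result
        score += (2 ** grade - 1) / math.log2(rank + 1)
    return score

# One query's results list; the None entries show why it is hard to
# find queries for which the entire list has relevance judgments.
print(implicit_dcg([3, None, 2, 0, None, 1]))
```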