Abstract

Key n-grams are useful in the analysis of legal discourse because they bring recurrent key expressions to the fore and help explain the patterning of legal language. This paper generates, analyses and compares the key n-grams of two legal corpora: a corpus of European directives on distance consumer contracts and a corpus of UK national legislation on the same subject matter. Each corpus is used in turn as the focus corpus and as the reference corpus. In this way, keyness, i.e., the terminology that makes each corpus unique, is revealed for both corpora. The findings bring to the fore five main patterns: differences in the key n-grams due to institutional or country-related factors; legalese influences; typical n-grams of Eurolect; dichotomy in the terminology used (albeit applying the same legal principles); and polysemy (i.e., similar words with different applications in various genres). The analysis confirms that key n-grams offer useful insight into the impact of disciplinary conventions on legal language.
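
The focus/reference procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the author's pipeline: the tokenisation, the per-million normalisation, the smoothing constant and the ratio-based keyness score are all assumptions chosen for illustration, and the two toy "corpora" are invented sentences rather than data from the paper. The function names (`ngrams`, `per_million`, `key_ngrams`) are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n_min=3, n_max=4):
    """Yield every n-gram of n_min..n_max words as a space-joined string."""
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def per_million(tokens, n_min=3, n_max=4):
    """Relative frequency of each n-gram, normalised per million tokens (assumed unit)."""
    counts = Counter(ngrams(tokens, n_min, n_max))
    size = len(tokens)
    return {gram: count * 1_000_000 / size for gram, count in counts.items()}


def key_ngrams(focus_tokens, reference_tokens, smoothing=1.0, top=5):
    """Rank focus-corpus n-grams by a smoothed frequency ratio against the reference.

    Assumed score: (focus_fpm + smoothing) / (reference_fpm + smoothing).
    """
    focus = per_million(focus_tokens)
    reference = per_million(reference_tokens)
    scores = {
        gram: (fpm + smoothing) / (reference.get(gram, 0.0) + smoothing)
        for gram, fpm in focus.items()
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


# Invented toy texts standing in for the two corpora.
eu = "the consumer shall have a right of withdrawal and the right of withdrawal shall expire".split()
uk = "the consumer has a right to cancel and the right to cancel is at an end".split()

eu_key = key_ngrams(eu, uk)  # EU text as focus, UK text as reference
uk_key = key_ngrams(uk, eu)  # roles swapped

print(eu_key[0][0])
print(uk_key[0][0])
```

Swapping the arguments is what makes each corpus serve alternately as focus and reference; an n-gram that is frequent in one text and absent from the other rises to the top of that text's list.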

Notes

  1. In particular, the Consumer Contracts (Information, Cancellation and Additional Charges) Regulations 2013.

  2. For example, in the BLaRC (British Law Report Corpus, [33]) “infringement” collocates with the following words (in order of frequency): “article”, “privacy”, “copyright”, and “right”, whereas “breach” collocates with “contract”, “duty”, “section”, and “article”.

References

  1. Belvisi, Nicole Mariah Sharon, Naveed Muhammad, and Fernando Alonso-Fernandez. 2020. Forensic authorship analysis of microblogging texts using n-grams and stylometric features. Proceedings of the 8th international workshop on biometrics and forensics, IWBF, Porto, Portugal. https://doi.org/10.48550/arXiv.2003.11545.

  2. Bhatia, Vijay K. 2010. Textbook on legal language and legal writing. New Delhi: Universal Law Publishing Co., Pvt. Ltd.

  3. Biber, Douglas, and Susan Conrad. 1999. Lexical bundles in conversation and academic prose. In Out of Corpora: Studies in Honour of Stig Johansson, ed. Hilde Hasselgård and Signe Oksefjell, 181–190. Amsterdam: Rodopi.

  4. Biber, Douglas, Susan Conrad, and Viviana Cortes. 2004. If you look at … : Lexical bundles in university teaching and textbooks. Applied Linguistics 25: 371–405.

  5. Biel, Łucja. 2015. Phraseological profiles of legislative genres: Complex prepositions as a special case of legal phrasemes in EU law and national law. Fachsprache — International Journal of Specialized Communication 37 (3–4): 139–160.

  6. Biel, Łucja. 2018. Lexical bundles in EU law: The impact of translation process on the patterning of legal language. In Phraseology in legal and institutional settings: A corpus-based interdisciplinary perspective, ed. Stanisław Goźdź-Roszkowski and Gianluca Pontrandolfo, 11–26. Abingdon: Routledge.

  7. Breeze, Ruth. 2011. Disciplinary values in legal discourse: A corpus study. Ibérica 21: 93–115.

  8. Breeze, Ruth. 2019. Part-of-speech patterns in legal genres. In Corpus-based research on variation in English legal discourse, ed. Teresa Fanego and Paula Rodríguez-Puente, 79–104. Amsterdam: John Benjamins.

  9. Clear English. Tips for Translators. 2014. European Commission. https://commission.europa.eu/system/files/2020-06/clear-english-tips-translators_en.pdf. Accessed November 2023.

  10. Crossley, Scott A., and Max M. Louwerse. 2007. Multi-dimensional register classification using bigrams. International Journal of Corpus Linguistics 12 (4): 453–478.

  11. Décary, Robert. 1989. Une loi ‘à la moderne’ interprétée ‘à l’ancienne’. Le Journal du Barreau, Montréal.

  12. Duckworth, Mark, and Arthur Spyrou. 1995. Legal words: 30 essays on legal words and phrases. Sydney: Centre for Plain Legal Language, University of Sydney.

  13. Dyevre, Arthur. 2021. Text-mining for Lawyers: How machine learning techniques can advance our understanding of legal discourse. Erasmus Law Review, Aflevering 1: 7–23.

  14. Gabrielatos, Kostas. 2018. Keyness analysis: Nature, metrics and techniques. In Corpus approaches to discourse: A critical review, ed. Charlotte Taylor and Anna Marchi, 225–258. Oxford: Routledge.

  15. Giampieri, Patrizia. 2021. An analysis of the “right of termination”, “right of cancellation” and “right of withdrawal” in off-premises and distance contracts according to EU directives. Comparative Legilinguistics 47: 105–133.

  16. Giampieri, Patrizia. 2022. How (un)readable are the European and UNESCO cultural conventions in the digital era? IJLLD 10 (2): 22–42.

  17. Giampieri, Patrizia. in press. The use of comparable corpora on (general) terms and conditions as a pedagogical tool in translation training between English and Italian. PhD thesis. Malta: University of Malta.

  18. Goffin, Roger. 1994. L’Eurolecte: Oui, Jargon Communautaire: Non. Meta 39 (4): 636–642.

  19. Goźdź-Roszkowski, Stanisław. 2012. Discovering patterns and meanings: Corpus perspectives on phraseology in legal discourse. Roczniki Humanistyczne 8: 47–68.

  20. Goźdź-Roszkowski, Stanisław. 2021. Corpus linguistics in legal discourse. International Journal for the Semiotics of Law 34: 1515–1540. https://doi.org/10.1007/s11196-021-09860-8.

  21. Hollingsworth, Charles. 2012. Syntactic stylometry: Using sentence structure for authorship attribution. Master Thesis, Athens, Georgia: University of Georgia.

  22. Ishihara, Shunichi. 2017. Strength of linguistic text evidence: A fused forensic text comparison system. Forensic Science International 278: 184–197.

  23. Jacometti, Valentina, and Barbara Pozzo. 2018. Traduttologia e linguaggio giuridico. Milan: Wolters Kluwer.

  24. Jarvis, Scott, and Magali Paquot. 2012. Exploring the role of n-grams in L1 identification. In Approaching transfer through text classification: Explorations in the detection-based approach, ed. Scott Jarvis and Scott A. Crossley, 71–105. Bristol: Multilingual Matters.

  25. Katz, Daniel Martin, Michael J. Bommarito, Julie Seaman, and Eugene Agichtein. 2011. Legal n-grams? A simple approach to track the evolution of legal language. Proceedings of JURIX 2011: The 24th international conference on legal knowledge and information systems, Vienna. https://ssrn.com/abstract=1971953 or https://doi.org/10.2139/ssrn.1971953.

  26. Kilgarriff, Adam. 1997. Using word frequency lists to measure corpus homogeneity and similarity between corpora. Proceedings 5th ACL workshop on very large corpora. Beijing and Hong Kong, 231–245.

  27. Kilgarriff, Adam, Pavel Rychlý, Pavel Smrž, and David Tugwell. 2004. ITRI-04-08: The Sketch Engine. Information Technology. http://www.sketchengine.eu.

  28. Lauchman, Richard. 2002. Plain language. A handbook for writers in the U.S. Federal Government. Rockville: Lauchman Group. https://www.lauchmangroup.com/PDFfiles/PLHandbook.PDF. Accessed November 2023.

  29. Mac Aodha, Máirtín. 2017. Review of Biel, Łucja. 2014. Lost in the Eurofog: The textual fit of translated law. Frankfurt: Peter Lang. Meta 62 (3): 648–651. https://doi.org/10.7202/1043956ar.

  30. Martínez, Eric, Francis Mollica, Yufei Liu, Anita Podrug, and Edward Gibson. 2021. What did I sign? A study of the impenetrability of legalese in contracts. Proceedings of the Annual Meeting of the Cognitive Science Society 43: 140–146. https://escholarship.org/uc/item/5k09w2td.

  31. Mori, Laura. 2018. Observing Eurolects: Corpus analysis of linguistic variation in EU law. Amsterdam: John Benjamins.

  32. Noreika, Mantas, and Inesa Seškauskienė. 2017. EU Regulations: Tendencies in translating lexical bundles from English into Lithuanian. Vertimo studijos 100: 156–174. https://doi.org/10.15388/VertStud.2017.10.11302.

  33. Rizzo, Camino Rea, and María José Marín Pérez. 2012. Structure and design of the British Law Report Corpus (BLRC): A legal corpus of judicial decisions from the UK. Journal of English Studies 10: 131–145.

  34. Römer, Ute. 2009. English in Academia: Does nativeness matter? Anglistik: International Journal of English Studies 20: 89–100.

  35. Sánchez Abril, Patricia, Francisco Oliva Blázquez, and Joan Martínez Evora. 2018. The right of withdrawal in consumer contracts: A comparative analysis of American and European law. InDret 3: 1–56.

  36. Schonlau, Matthias, Nick Guenther, and Ilia Sucholutsky. 2017. Text mining with n-gram variables. The Stata Journal 17 (4): 866–881.

  37. Scott, Mike. 1997. PC analysis of key words - and key key words. System 25 (2): 233–245.

  38. Stubbs, Michael, and Isabel Barth. 2003. Using recurrent phrases as text-type discriminators: A quantitative method and some findings. Functions of Language 10 (1): 61–104.

  39. Tiersma, Peter. 1999. Legal language. Chicago: The University of Chicago Press.

  40. Torikai, Shinichiro. 2017. Multi-word sequences in legal discourse. Language, Culture, and Communication 9: 113–147.

  41. Wang, Chen, Keping Bi, Yunhua Hu, Hang Li, and Guihong Cao. 2012. Extracting search-focused key n-grams for relevance ranking in web search. Proceedings of the fifth ACM international conference on Web search and data mining (WSDM ’12). Association for Computing Machinery, New York, NY, USA, 343–352. https://doi.org/10.1145/2124295.2124338.

  42. Williams, Christopher. 2005. Vagueness in legal texts: Is there a future for shall? In Vagueness in normative texts, ed. Maurizio Gotti, Vijay Bhatia, Jan Engberg, and Dorothee Heller, 201–224. Bern: Peter Lang.

  43. Williams, Christopher. 2023. The impact of plain language on legal English in the United Kingdom. London / New York: Routledge.

Funding

The author has no financial or proprietary interests in any material discussed in this article.

Author information

Corresponding author

Correspondence to Patrizia Giampieri.

Ethics declarations

Declarations

The author did not receive support from any organization for the submitted work and no funding was received to assist with the preparation of this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix 1

First 100 key n-grams generated from the EU Directive corpus and the UK national legislation corpus on off-premises consumer contracts.

| No. | Item (EuD-FC > UkL-RC) | Relative frequency | Item (UkL-FC > EuD-RC) | Relative frequency |
|---|---|---|---|---|
| 1 | Right of withdrawal | 2,382.70166 | Consumer rights Act | 1,104.68152 |
| 2 | The right of withdrawal | 1,509.04431 | Consumer rights Act 2015 | 1,092.71753 |
| 3 | Member States shall | 1,211.20667 | Rights Act 2015 | 1,092.71753 |
| 4 | The withdrawal period | 655.24292 | Of the Act | 534.39465 |
| 5 | Member States may | 655.24292 | Weights and measures | 534.39465 |
| 6 | The trader shall | 635.38708 | The consumer rights Act | 490.52646 |
| 7 | By the following | 615.53125 | Be treated as | 466.59833 |
| 8 | Replaced by the following | 615.53125 | To be treated | 462.61032 |
| 9 | This directive should | 536.10785 | To be treated as | 438.68219 |
| 10 | In accordance with article | 536.10785 | Likely to be | 406.77805 |
| 11 | Accordance with article | 536.10785 | Right to reject | 406.77805 |
| 12 | The consumer shall | 516.25201 | Enterprise Act 2002 | 386.83795 |
| 13 | Is replaced by | 496.39615 | Part 1 of | 386.83795 |
| 14 | To in paragraph 1 | 496.39615 | The consumer protection | 378.86191 |
| 15 | In this directive | 496.39615 | The grey list | 354.93378 |
| 16 | Is replaced by the | 496.39615 | 2 of the | 334.99368 |
| 17 | By this directive | 476.54031 | Local weights and | 327.01764 |
| 18 | Provisions of this directive | 456.68448 | A term which | 327.01764 |
| 19 | Member States should | 456.68448 | Local weights and measures | 327.01764 |
| 20 | The Commission shall | 436.82861 | A contract to | 319.0416 |
| 21 | Shall ensure that | 436.82861 | To a contract | 315.05359 |
| 22 | From the day | 416.97278 | A breach of | 311.06555 |
| 23 | Of the right of | 397.11694 | Part 2 of | 307.07755 |
| 24 | His right of withdrawal | 397.11694 | Of the guidance | 303.08951 |
| 25 | Prior express consent | 397.11694 | Weights and measures authority | 291.12546 |
| 26 | His right of | 397.11694 | And measures authority | 291.12546 |
| 27 | Level of consumer | 357.40524 | 1 of the | 287.13745 |
| 28 | Is not supplied on | 357.40524 | Of this Act | 287.13745 |
| 29 | Not supplied on | 357.40524 | An offence under | 287.13745 |
| 30 | Not supplied on a | 357.40524 | Secretary of state | 279.16141 |
| 31 | Level of consumer protection | 357.40524 | For breach of | 279.16141 |
| 32 | Content which is not | 357.40524 | A term that | 271.18536 |
| 33 | Which is not supplied | 357.40524 | The secretary of state | 271.18536 |
| 34 | The supply of water | 337.54938 | The enterprise Act | 271.18536 |
| 35 | Referred to in article | 317.69354 | Unfair contract terms | 271.18536 |
| 36 | Member States shall ensure | 317.69354 | The secretary of | 271.18536 |
| 37 | To this directive | 317.69354 | Part 1 of the | 267.19733 |
| 38 | This directive shall | 317.69354 | Is likely to | 263.20932 |
| 39 | To in article | 317.69354 | England and Wales | 263.20932 |
| 40 | States shall ensure that | 317.69354 | Part 2 of the | 259.22131 |
| 41 | States shall ensure | 317.69354 | Right to a | 255.23328 |
| 42 | Of returning the | 297.83771 | Enhanced consumer measures | 255.23328 |
| 43 | For in article | 297.83771 | Unfair trading regulations | 251.24525 |
| 44 | Laid down in | 297.83771 | The trader must | 247.25723 |
| 45 | Of returning the goods | 297.83771 | Unfair trading regulations 2008 | 247.25723 |
| 46 | To withdraw from | 297.83771 | Trading regulations 2008 | 247.25723 |
| 47 | Provided for in article | 297.83771 | Protection from unfair | 247.25723 |
| 48 | Providers of online | 277.98184 | From unfair trading regulations | 247.25723 |
| 49 | Cost of returning the | 277.98184 | Of schedule 2 | 247.25723 |
| 50 | Be without prejudice to | 277.98184 | From unfair trading | 247.25723 |
| 51 | Without prejudice to the | 277.98184 | Protection from unfair trading | 247.25723 |
| 52 | This directive and | 277.98184 | Consumer protection from unfair | 247.25723 |
| 53 | A high level | 277.98184 | Consumer protection from | 247.25723 |
| 54 | Model withdrawal form | 277.98184 | At an end | 235.29318 |
| 55 | Consumer under an | 277.98184 | If it is | 235.29318 |
| 56 | A high level of | 277.98184 | Object or effect | 231.30516 |
| 57 | The European Union | 277.98184 | Repair or replacement | 231.30516 |
| 58 | Of the first | 277.98184 | As at an end | 231.30516 |
| 59 | Having regard to the | 277.98184 | As at an | 231.30516 |
| 60 | Of article 6 | 277.98184 | Is to be treated | 227.31714 |
| 61 | Prejudice to the | 277.98184 | A local weights and | 227.31714 |
| 62 | Or undertakes to | 277.98184 | A local weights | 227.31714 |
| 63 | The distance contract | 277.98184 | An officer of | 227.31714 |
| 64 | The first paragraph | 277.98184 | In Northern Ireland | 223.32912 |
| 65 | Be without prejudice | 277.98184 | Object or effect of | 223.32912 |
| 66 | Of the European Union | 277.98184 | A consumer contract | 223.32912 |
| 67 | High level of | 277.98184 | The object or | 219.34109 |
| 68 | Means of communication | 258.12601 | The object or effect | 219.34109 |
| 69 | The rules on | 258.12601 | Of that Act | 219.34109 |
| 70 | And digital services | 258.12601 | Of a term | 215.35307 |
| 71 | Gas or electricity | 258.12601 | The contract as | 215.35307 |
| 72 | Of this article | 258.12601 | The Enterprise Act 2002 | 215.35307 |
| 73 | A right of withdrawal | 258.12601 | Breach of the | 803.27.00 |
| 74 | Defined in point | 258.12601 | Contract to supply | 211.36507 |
| 75 | Of the online | 258.12601 | Apply to a | 211.36507 |
| 76 | Of article 16 | 258.12601 | Contract as at an | 207.37704 |
| 77 | As defined in point | 258.12601 | Contract as at | 207.37704 |
| 78 | With this directive | 258.12601 | Treated as included | 203.38902 |
| 79 | High level of consumer | 258.12601 | A term of | 203.38902 |
| 80 | The consumer under | 258.12601 | Right to cancel | 203.38902 |
| 81 | Are not put | 238.27016 | The consumer protection from | 199.401 |
| 82 | Are not put up | 238.27016 | Is subject to | 199.401 |
| 83 | The provider of | 238.27016 | Of part 1 | 199.401 |
| 84 | Content and digital services | 238.27016 | Terms and notices | 199.401 |
| 85 | Content and digital | 238.27016 | Which has the | 195.41298 |
| 86 | Limited volume or set | 238.27016 | Contracts for goods | 191.42496 |
| 87 | Digital content and digital | 238.27016 | Be treated as included | 191.42496 |
| 88 | Digital content or digital | 238.27016 | At the end | 191.42496 |
| 89 | Content or digital | 238.27016 | Of this part | 191.42496 |
| 90 | Be considered as | 238.27016 | Has the object | 191.42496 |
| 91 | Of the supplier | 238.27016 | Term which has the | 187.43694 |
| 92 | Or set quantity | 238.27016 | In breach of | 187.43694 |
| 93 | Volume or set | 238.27016 | Digital content and services | 187.43694 |
| 94 | Volume or set quantity | 238.27016 | Which has the object | 187.43694 |
| 95 | The consumer under an | 238.27016 | Content and services | 187.43694 |
| 96 | Where they are not | 238.27016 | Term which has | 187.43694 |
| 97 | Of district heating | 238.27016 | Has the object or | 187.43694 |
| 98 | Other Member States | 238.27016 | Term or notice | 187.43694 |
| 99 | Not put up | 238.27016 | Of this schedule | 187.43694 |
| 100 | Directive should be | 238.27016 | This part of | 183.44891 |

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Giampieri, P. Key n-Grams in EU Directives and in the UK National Legislation on Consumer Contracts. Int J Semiot Law 37, 59–75 (2024). https://doi.org/10.1007/s11196-023-10087-y

  • DOI: https://doi.org/10.1007/s11196-023-10087-y
