Abstract
Key n-grams are useful in the analysis of legal discourse because they bring recurrent key expressions to the fore and help explain the patterning of legal language. This paper generates, analyses and compares the key n-grams of two legal corpora: a corpus of European directives on distance consumer contracts and a corpus of UK national legislation on the same subject matter. Each corpus serves in turn as the focus corpus and as the reference corpus, so that keyness, i.e., the terminology that makes each corpus distinctive, is revealed for both. The findings bring five main patterns to the fore: differences in the key n-grams due to institutional or country-related factors; legalese influences; typical n-grams of Eurolect; dichotomy in the terminology used (albeit applying the same legal principles); and polysemy (i.e., similar words with different applications in various genres). The analysis confirms that key n-grams are a useful and insightful means of understanding the impact of disciplinary conventions on legal language.
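As an illustration of the two-way comparison, the following minimal Python sketch extracts 3- and 4-grams from two token lists and ranks them by a smoothed ratio of relative frequencies (per million tokens), first with one corpus as focus and then with the roles swapped. The scoring formula, the smoothing value, the function names and the two toy "corpora" are assumptions made for illustration; they are not the paper's data or its exact settings.

```python
from collections import Counter

def ngram_counts(tokens, n_min=3, n_max=4):
    """Count all contiguous n-grams with n_min <= n <= n_max."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts

def per_million(count, corpus_size):
    """Relative frequency: occurrences per million tokens."""
    return count * 1_000_000 / corpus_size

def key_ngrams(focus_tokens, ref_tokens, smoothing=1.0, top=5):
    """Rank focus-corpus n-grams by an assumed smoothed frequency ratio."""
    focus = ngram_counts(focus_tokens)
    ref = ngram_counts(ref_tokens)
    scored = []
    for gram, count in focus.items():
        focus_fpm = per_million(count, len(focus_tokens))
        ref_fpm = per_million(ref.get(gram, 0), len(ref_tokens))
        # Smoothing avoids division by zero for n-grams absent from the reference.
        scored.append((gram, (focus_fpm + smoothing) / (ref_fpm + smoothing)))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top]

# Toy token lists (invented, lower-cased, whitespace-tokenised).
eu_tokens = ("the consumer shall have a right of withdrawal and member states "
             "shall ensure that the right of withdrawal is respected").split()
uk_tokens = ("the trader must give the consumer a right to cancel and the "
             "contract is to be treated as at an end").split()

eu_key = key_ngrams(eu_tokens, uk_tokens)   # EU as focus, UK as reference
uk_key = key_ngrams(uk_tokens, eu_tokens)   # UK as focus, EU as reference
print(eu_key[0][0])
print([gram for gram, _ in uk_key])
```

Swapping the arguments is all that is needed to reverse the roles of focus and reference corpus, which is why the same n-gram can be key in one direction and absent in the other.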
Notes
In particular, the Consumer Contracts (Information, Cancellation and Additional Charges) Regulations 2013.
For example, in the BLaRC (British Law Report Corpus, [33]) “infringement” collocates with the following words (in order of frequency): “article”, “privacy”, “copyright”, and “right”, whereas “breach” collocates with “contract”, “duty”, “section”, and “article”.
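A collocate ranking of this kind can be approximated by counting the words that occur within a fixed window around a node word. The sketch below is a simplified illustration under stated assumptions: the window size, the stop-word set, the function name and the sample sentence are invented for the example and are not taken from the BLaRC or from the tool used to query it.

```python
from collections import Counter

def collocates(tokens, node, window=2, stopwords=frozenset()):
    """Count words occurring within `window` tokens of each occurrence of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            # Skip the node itself and any function words.
            if j != i and tokens[j] not in stopwords:
                counts[tokens[j]] += 1
    return counts

# Invented sample text, not corpus data.
sample = ("the breach of contract was clear and the breach of duty was pleaded "
          "while the infringement of copyright and the infringement of privacy "
          "were denied").split()
function_words = frozenset({"the", "of", "was", "and", "while", "were"})

breach_collocates = collocates(sample, "breach", stopwords=function_words)
infringement_collocates = collocates(sample, "infringement", stopwords=function_words)
print(breach_collocates.most_common())
print(infringement_collocates.most_common())
```

Raw co-occurrence counts are the simplest option; corpus tools typically also offer association measures that correct for the overall frequency of each collocate.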
References
Belvisi, Nicole Mariah Sharon, Naveed Muhammad, and Fernando Alonso-Fernandez. 2020. Forensic authorship analysis of microblogging texts using n-grams and stylometric features. Proceedings of the 8th international workshop on biometrics and forensics, IWBF, Porto, Portugal. https://doi.org/10.48550/arXiv.2003.11545.
Bhatia, Vijay K. 2010. Textbook on legal language and legal writing. New Delhi: Universal Law Publishing Co., Pvt. Ltd.
Biber, Douglas, and Susan Conrad. 1999. Lexical bundles in conversation and academic prose. In Out of Corpora: Studies in Honour of Stig Johansson, ed. Hilde Hasselgård and Signe Oksefjell, 181–190. Amsterdam: Rodopi.
Biber, Douglas, Susan Conrad, and Viviana Cortes. 2004. If you look at … : Lexical bundles in university teaching and textbooks. Applied Linguistics 25: 371–405.
Biel, Łucja. 2015. Phraseological profiles of legislative genres: Complex prepositions as a special case of legal phrasemes in EU law and national law. Fachsprache — International Journal of Specialized Communication 37 (3–4): 139–160.
Biel, Łucja. 2018. Lexical bundles in EU law: The impact of translation process on the patterning of legal language. In Phraseology in legal and institutional settings: A corpus-based interdisciplinary perspective, ed. Stanisław Goźdź-Roszkowski and Gianluca Pontrandolfo, 11–26. Abingdon: Routledge.
Breeze, Ruth. 2011. Disciplinary values in legal discourse: A corpus study. Ibérica 21: 93–115.
Breeze, Ruth. 2019. Part-of-speech patterns in legal genres. In Corpus-based research on variation in English legal discourse, ed. Teresa Fanego and Paula Rodríguez-Puente, 79–104. Amsterdam: John Benjamins.
Clear English. Tips for Translators. 2014. European Commission. https://commission.europa.eu/system/files/2020-06/clear-english-tips-translators_en.pdf. Accessed November 2023.
Crossley, Scott A., and Max M. Louwerse. 2007. Multi-dimensional register classification using bigrams. International Journal of Corpus Linguistics 12 (4): 453–478.
Décary, Robert. 1989. Une loi ‘à la moderne’ interprétée ‘à l’ancienne’. Le Journal du Barreau, Montréal.
Duckworth, Mark, and Arthur Spyrou. 1995. Legal words: 30 essays on legal words and phrases. Sydney: Centre for Plain Legal Language, University of Sydney.
Dyevre, Arthur. 2021. Text-mining for Lawyers: How machine learning techniques can advance our understanding of legal discourse. Erasmus Law Review, Issue 1: 7–23.
Gabrielatos, Kostas. 2018. Keyness analysis: Nature, metrics and techniques. In Corpus approaches to discourse: A critical review, ed. Charlotte Taylor and Anna Marchi, 225–258. Oxford: Routledge.
Giampieri, Patrizia. 2021. An analysis of the “right of termination”, “right of cancellation” and “right of withdrawal” in off-premises and distance contracts according to EU directives. Comparative Legilinguistics 47: 105–133.
Giampieri, Patrizia. 2022. How (un)readable are the European and UNESCO cultural conventions in the digital era? IJLLD 10 (2): 22–42.
Giampieri, Patrizia. in press. The use of comparable corpora on (general) terms and conditions as a pedagogical tool in translation training between English and Italian. PhD thesis. Malta: University of Malta.
Goffin, Roger. 1994. L’Eurolecte: Oui, Jargon Communautaire: Non. Meta 39 (4): 636–642.
Goźdź-Roszkowski, Stanisław. 2012. Discovering patterns and meanings: Corpus perspectives on phraseology in legal discourse. Roczniki Humanistyczne 8: 47–68.
Goźdź-Roszkowski, Stanisław. 2021. Corpus linguistics in legal discourse. International Journal for the Semiotics of Law 34: 1515–1540. https://doi.org/10.1007/s11196-021-09860-8.
Hollingsworth, Charles. 2012. Syntactic stylometry: Using sentence structure for authorship attribution. Master's thesis. Athens, Georgia: University of Georgia.
Ishihara, Shunichi. 2017. Strength of linguistic text evidence: A fused forensic text comparison system. Forensic Science International 278: 184–197.
Jacometti, Valentina, and Barbara Pozzo. 2018. Traduttologia e linguaggio giuridico. Milan: Wolters Kluwer.
Jarvis, Scott, and Magali Paquot. 2012. Exploring the role of n-grams in L1 identification. In Approaching transfer through text classification: Explorations in the detection-based approach, ed. Scott Jarvis and Scott A. Crossley, 71–105. Bristol: Multilingual Matters.
Katz, Daniel Martin, Michael J. Bommarito, Julie Seaman, and Eugene Agichtein. 2011. Legal n-grams? A simple approach to track the evolution of legal language. Proceedings of JURIX 2011: The 24th international conference on legal knowledge and information systems, Vienna. https://ssrn.com/abstract=1971953 or https://doi.org/10.2139/ssrn.1971953.
Kilgarriff, Adam. 1997. Using word frequency lists to measure corpus homogeneity and similarity between corpora. Proceedings 5th ACL workshop on very large corpora. Beijing and Hong Kong, 231–245.
Kilgarriff, Adam, Pavel Rychlý, Pavel Smrž, and David Tugwell. 2004. ITRI-04-08: The Sketch Engine. Information Technology. http://www.sketchengine.eu.
Lauchman, Richard. 2002. Plain language. A handbook for writers in the U.S. Federal Government. Rockville: Lauchman Group. https://www.lauchmangroup.com/PDFfiles/PLHandbook.PDF. Accessed November 2023.
Mac Aodha, Máirtín. 2017. Review of Biel, Łucja. 2014. Lost in the Eurofog: The textual fit of translated law. Frankfurt: Peter Lang. Meta 62 (3): 648–651. https://doi.org/10.7202/1043956ar.
Martínez, Eric, Francis Mollica, Yufei Liu, Anita Podrug, and Edward Gibson. 2021. What did I sign? A study of the impenetrability of legalese in contracts. Proceedings of the Annual Meeting of the Cognitive Science Society 43: 140–146. https://escholarship.org/uc/item/5k09w2td.
Mori, Laura. 2018. Observing Eurolects: Corpus analysis of linguistic variation in EU law. Amsterdam: John Benjamins.
Noreika, Mantas, and Inesa Seškauskienė. 2017. EU Regulations: Tendencies in translating lexical bundles from English into Lithuanian. Vertimo studijos 100: 156–174. https://doi.org/10.15388/VertStud.2017.10.11302.
Rizzo, Camino Rea, and María José Marín Pérez. 2012. Structure and design of the British Law Report Corpus (BLRC): A legal corpus of judicial decisions from the UK. Journal of English Studies 10: 131–145.
Römer, Ute. 2009. English in Academia: Does nativeness matter? Anglistik: International Journal of English Studies 20: 89–100.
Sánchez Abril, Patricia, Francisco Oliva Blázquez, and Joan Martínez Evora. 2018. The right of withdrawal in consumer contracts: A comparative analysis of American and European law. InDret 3: 1–56.
Schonlau, Matthias, Nick Guenther, and Ilia Sucholutsky. 2017. Text mining with n-gram variables. The Stata Journal 17 (4): 866–881.
Scott, Mike. 1997. PC analysis of key words - and key key words. System 25 (2): 233–245.
Stubbs, Michael, and Isabel Barth. 2003. Using recurrent phrases as text-type discriminators: A quantitative method and some findings. Functions of Language 10 (1): 61–104.
Tiersma, Peter. 1999. Legal language. Chicago: The University of Chicago Press.
Torikai, Shinichiro. 2017. Multi-word sequences in legal discourse. Language, Culture, and Communication 9: 113–147.
Wang, Chen, Keping Bi, Yunhua Hu, Hang Li, and Guihong Cao. 2012. Extracting search-focused key n-grams for relevance ranking in web search. Proceedings of the fifth ACM international conference on Web search and data mining (WSDM ’12). Association for Computing Machinery, New York, NY, USA, 343–352. https://doi.org/10.1145/2124295.2124338.
Williams, Christopher. 2005. Vagueness in legal texts: Is there a future for shall? In Vagueness in normative texts, ed. Maurizio Gotti, Vijay Bhatia, Jan Engberg, and Dorothee Heller, 201–224. Bern: Peter Lang.
Williams, Christopher. 2023. The impact of plain language on legal English in the United Kingdom. London / New York: Routledge.
Funding
The author has no financial or proprietary interests in any material discussed in this article.
Ethics declarations
Declarations
The author did not receive support from any organization for the submitted work and no funding was received to assist with the preparation of this manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix 1
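First 100 key n-grams generated from the EU Directive corpus and the UK national legislation corpus on off-premises consumer contracts. In the column headings, EuD and UkL denote the EU Directive and UK legislation corpora, FC and RC denote focus corpus and reference corpus, and "A > B" indicates that corpus A is compared against corpus B.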
No. | Item (EuD-FC > UkL-RC) | Relative frequency | Item (UkL-FC > EuD-RC) | Relative frequency |
---|---|---|---|---|
1 | Right of withdrawal | 2,382.70166 | Consumer rights Act | 1,104.68152 |
2 | The right of withdrawal | 1,509.04431 | Consumer rights Act 2015 | 1,092.71753 |
3 | Member States shall | 1,211.20667 | Rights Act 2015 | 1,092.71753 |
4 | The withdrawal period | 655.24292 | Of the Act | 534.39465 |
5 | Member States may | 655.24292 | Weights and measures | 534.39465 |
6 | The trader shall | 635.38708 | The consumer rights Act | 490.52646 |
7 | By the following | 615.53125 | Be treated as | 466.59833 |
8 | Replaced by the following | 615.53125 | To be treated | 462.61032 |
9 | This directive should | 536.10785 | To be treated as | 438.68219 |
10 | In accordance with article | 536.10785 | Likely to be | 406.77805 |
11 | Accordance with article | 536.10785 | Right to reject | 406.77805 |
12 | The consumer shall | 516.25201 | Enterprise Act 2002 | 386.83795 |
13 | Is replaced by | 496.39615 | Part 1 of | 386.83795 |
14 | To in paragraph 1 | 496.39615 | The consumer protection | 378.86191 |
15 | In this directive | 496.39615 | The grey list | 354.93378 |
16 | Is replaced by the | 496.39615 | 2 of the | 334.99368 |
17 | By this directive | 476.54031 | Local weights and | 327.01764 |
18 | Provisions of this directive | 456.68448 | A term which | 327.01764 |
19 | Member States should | 456.68448 | Local weights and measures | 327.01764 |
20 | The Commission shall | 436.82861 | A contract to | 319.0416 |
21 | Shall ensure that | 436.82861 | To a contract | 315.05359 |
22 | From the day | 416.97278 | A breach of | 311.06555 |
23 | Of the right of | 397.11694 | Part 2 of | 307.07755 |
24 | His right of withdrawal | 397.11694 | Of the guidance | 303.08951 |
25 | Prior express consent | 397.11694 | Weights and measures authority | 291.12546 |
26 | His right of | 397.11694 | And measures authority | 291.12546 |
27 | Level of consumer | 357.40524 | 1 of the | 287.13745 |
28 | Is not supplied on | 357.40524 | Of this Act | 287.13745 |
29 | Not supplied on | 357.40524 | An offence under | 287.13745 |
30 | Not supplied on a | 357.40524 | Secretary of state | 279.16141 |
31 | Level of consumer protection | 357.40524 | For breach of | 279.16141 |
32 | Content which is not | 357.40524 | A term that | 271.18536 |
33 | Which is not supplied | 357.40524 | The secretary of state | 271.18536 |
34 | The supply of water | 337.54938 | The enterprise Act | 271.18536 |
35 | Referred to in article | 317.69354 | Unfair contract terms | 271.18536 |
36 | Member States shall ensure | 317.69354 | The secretary of | 271.18536 |
37 | To this directive | 317.69354 | Part 1 of the | 267.19733 |
38 | This directive shall | 317.69354 | Is likely to | 263.20932 |
39 | To in article | 317.69354 | England and Wales | 263.20932 |
40 | States shall ensure that | 317.69354 | Part 2 of the | 259.22131 |
41 | States shall ensure | 317.69354 | Right to a | 255.23328 |
42 | Of returning the | 297.83771 | Enhanced consumer measures | 255.23328 |
43 | For in article | 297.83771 | Unfair trading regulations | 251.24525 |
44 | Laid down in | 297.83771 | The trader must | 247.25723 |
45 | Of returning the goods | 297.83771 | Unfair trading regulations 2008 | 247.25723 |
46 | To withdraw from | 297.83771 | Trading regulations 2008 | 247.25723 |
47 | Provided for in article | 297.83771 | Protection from unfair | 247.25723 |
48 | Providers of online | 277.98184 | From unfair trading regulations | 247.25723 |
49 | Cost of returning the | 277.98184 | Of schedule 2 | 247.25723 |
50 | Be without prejudice to | 277.98184 | From unfair trading | 247.25723 |
51 | Without prejudice to the | 277.98184 | Protection from unfair trading | 247.25723 |
52 | This directive and | 277.98184 | Consumer protection from unfair | 247.25723 |
53 | A high level | 277.98184 | Consumer protection from | 247.25723 |
54 | Model withdrawal form | 277.98184 | At an end | 235.29318 |
55 | Consumer under an | 277.98184 | If it is | 235.29318 |
56 | A high level of | 277.98184 | Object or effect | 231.30516 |
57 | The European Union | 277.98184 | Repair or replacement | 231.30516 |
58 | Of the first | 277.98184 | As at an end | 231.30516 |
59 | Having regard to the | 277.98184 | As at an | 231.30516 |
60 | Of article 6 | 277.98184 | Is to be treated | 227.31714 |
61 | Prejudice to the | 277.98184 | A local weights and | 227.31714 |
62 | Or undertakes to | 277.98184 | A local weights | 227.31714 |
63 | The distance contract | 277.98184 | An officer of | 227.31714 |
64 | The first paragraph | 277.98184 | In Northern Ireland | 223.32912 |
65 | Be without prejudice | 277.98184 | Object or effect of | 223.32912 |
66 | Of the European Union | 277.98184 | A consumer contract | 223.32912 |
67 | High level of | 277.98184 | The object or | 219.34109 |
68 | Means of communication | 258.12601 | The object or effect | 219.34109 |
69 | The rules on | 258.12601 | Of that Act | 219.34109 |
70 | And digital services | 258.12601 | Of a term | 215.35307 |
71 | Gas or electricity | 258.12601 | The contract as | 215.35307 |
72 | Of this article | 258.12601 | The Enterprise Act 2002 | 215.35307 |
73 | A right of withdrawal | 258.12601 | Breach of the | 803.27.00 |
74 | Defined in point | 258.12601 | Contract to supply | 211.36507 |
75 | Of the online | 258.12601 | Apply to a | 211.36507 |
76 | Of article 16 | 258.12601 | Contract as at an | 207.37704 |
77 | As defined in point | 258.12601 | Contract as at | 207.37704 |
78 | With this directive | 258.12601 | Treated as included | 203.38902 |
79 | High level of consumer | 258.12601 | A term of | 203.38902 |
80 | The consumer under | 258.12601 | Right to cancel | 203.38902 |
81 | Are not put | 238.27016 | The consumer protection from | 199.401 |
82 | Are not put up | 238.27016 | Is subject to | 199.401 |
83 | The provider of | 238.27016 | Of part 1 | 199.401 |
84 | Content and digital services | 238.27016 | Terms and notices | 199.401 |
85 | Content and digital | 238.27016 | Which has the | 195.41298 |
86 | Limited volume or set | 238.27016 | Contracts for goods | 191.42496 |
87 | Digital content and digital | 238.27016 | Be treated as included | 191.42496 |
88 | Digital content or digital | 238.27016 | At the end | 191.42496 |
89 | Content or digital | 238.27016 | Of this part | 191.42496 |
90 | Be considered as | 238.27016 | Has the object | 191.42496 |
91 | Of the supplier | 238.27016 | Term which has the | 187.43694 |
92 | Or set quantity | 238.27016 | In breach of | 187.43694 |
93 | Volume or set | 238.27016 | Digital content and services | 187.43694 |
94 | Volume or set quantity | 238.27016 | Which has the object | 187.43694 |
95 | The consumer under an | 238.27016 | Content and services | 187.43694 |
96 | Where they are not | 238.27016 | Term which has | 187.43694 |
97 | Of district heating | 238.27016 | Has the object or | 187.43694 |
98 | Other Member States | 238.27016 | Term or notice | 187.43694 |
99 | Not put up | 238.27016 | Of this schedule | 187.43694 |
100 | Directive should be | 238.27016 | This part of | 183.44891 |
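Several entries in the table are fragments of a longer entry with the same relative frequency (for example, "Local weights and" and "Local weights and measures", both at 327.01764). When reading such a list, it can help to collapse these fragments into the longer n-gram. The following Python sketch shows one possible way to do so; it is a reading aid suggested here, not a step reported in the paper, and the function names and tolerance value are assumptions.

```python
def is_contiguous_part(short, long):
    """True if token list `short` occurs as a contiguous run inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))

def collapse_nested(rows, tol=1e-6):
    """Drop n-grams contained in a longer n-gram that has the same frequency."""
    kept = []
    for gram, freq in rows:
        tokens = gram.lower().split()
        absorbed = any(
            len(other.split()) > len(tokens)
            and abs(other_freq - freq) <= tol
            and is_contiguous_part(tokens, other.lower().split())
            for other, other_freq in rows
        )
        if not absorbed:
            kept.append((gram, freq))
    return kept

# A few rows copied from the UK column of the table above.
rows = [
    ("Weights and measures", 534.39465),
    ("Local weights and", 327.01764),
    ("A term which", 327.01764),
    ("Local weights and measures", 327.01764),
    ("Object or effect", 231.30516),
    ("Object or effect of", 223.32912),
]

collapsed = collapse_nested(rows)
for gram, freq in collapsed:
    print(f"{gram}\t{freq}")
```

Only fragments with an identical frequency are dropped: "Weights and measures" is kept because it is more frequent than "Local weights and measures", which suggests it also occurs in other contexts.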
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Giampieri, P. Key n-Grams in EU Directives and in the UK National Legislation on Consumer Contracts. Int J Semiot Law 37, 59–75 (2024). https://doi.org/10.1007/s11196-023-10087-y
DOI: https://doi.org/10.1007/s11196-023-10087-y