Abstract
Topic modeling of legal texts is a challenging task because of their complicated language structures and technical features. The number of legislative documents has increased sharply in recent years, which makes it very difficult for legal experts to keep up with legislation, such as implementing acts, and to analyze cases. In some contexts, the importance of topics is affected by how legal texts are processed and presented. The aim of this work is to discover the topics in legal opinions from cases heard by the Supreme Court of the United States and in legal judgments from cases heard by the Supreme Court of India. In this study, we used different language models to create sentence embeddings from these legal text datasets. We then apply the BERTopic technique and a baseline approach to discover significant topics in the legal opinions and legal judgments.
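As a rough illustration of how BERTopic turns clusters of embedded documents into topics, the sketch below computes a class-based TF-IDF weight, tf(t, c) · log(1 + A / f(t)), where tf(t, c) is the count of term t in cluster c, f(t) is its count across all clusters, and A is the average number of words per cluster. It is a simplified, standard-library-only sketch and not the authors' implementation: the function names `c_tf_idf` and `top_terms`, the cluster labels, and the toy tokens are all hypothetical, the cluster assignments are assumed to be given (in practice they would come from clustering the sentence embeddings), and term-frequency normalization is omitted.

```python
import math
from collections import Counter


def c_tf_idf(docs_by_cluster):
    """Return {cluster: {term: weight}} using a class-based TF-IDF weighting.

    docs_by_cluster maps a cluster label to a list of tokenized documents.
    Cluster assignments are assumed to be given (hypothetical input).
    """
    # Treat all documents in a cluster as one bag of words.
    cluster_counts = {
        label: Counter(token for doc in docs for token in doc)
        for label, docs in docs_by_cluster.items()
    }
    # Frequency of each term across all clusters.
    total_counts = Counter()
    for counts in cluster_counts.values():
        total_counts.update(counts)
    # Average number of words per cluster.
    avg_words = sum(total_counts.values()) / len(cluster_counts)

    return {
        label: {
            term: tf * math.log(1 + avg_words / total_counts[term])
            for term, tf in counts.items()
        }
        for label, counts in cluster_counts.items()
    }


def top_terms(weights, k=3):
    """Return the k highest-weighted terms per cluster (ties broken alphabetically)."""
    return {
        label: [
            term
            for term, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
        ]
        for label, scores in weights.items()
    }


# Toy, made-up tokens for illustration only (not taken from the paper's datasets).
clusters = {
    0: [["court", "appeal", "court"], ["appeal", "judgment"]],
    1: [["tax", "income", "tax"], ["income", "court"]],
}
weights = c_tf_idf(clusters)
top = top_terms(weights, k=2)
print(top)
```

A term such as "court" that appears in both clusters receives a lower weight than a term of equal frequency that is specific to one cluster, which is why the highest-weighted terms tend to characterize a single topic.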
Acknowledgments
This work was supported by the Google PhD Fellowships program and by Google Cloud Platform (GCP).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hammami, E., Faiz, R. (2024). Topic Modelling of Legal Texts Using Bidirectional Encoder Representations from Sentence Transformers. In: Saad, I., Rosenthal-Sabroux, C., Gargouri, F., Chakhar, S., Williams, N., Haig, E. (eds) Advances in Information Systems, Artificial Intelligence and Knowledge Management. ICIKS 2023. Lecture Notes in Business Information Processing, vol 486. Springer, Cham. https://doi.org/10.1007/978-3-031-51664-1_24
DOI: https://doi.org/10.1007/978-3-031-51664-1_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-51663-4
Online ISBN: 978-3-031-51664-1
eBook Packages: Computer Science, Computer Science (R0)