Authors: Khadim Dramé 1 ; 2 ; Gorgoumack Sambe 1 ; 2 and Gayo Diallo 3

Affiliations: 1 Laboratoire d’Informatique et d’Ingénierie pour l’Innovation, Ziguinchor, Senegal ; 2 Université Assane Seck de Ziguinchor, Ziguinchor, Senegal ; 3 SISTM - INRIA, BPH INSERM 1219, Univ. Bordeaux, Bordeaux, France

Keyword(s): Text Duplication, Semantic Sentence Similarity, Multilayer Perceptron, French Clinical Notes.

Abstract: Detecting similar sentences or paragraphs is a key issue when dealing with text duplication. This is particularly the case in the clinical domain, for instance when identifying multiple occurrences of the same event. Due to the lack of resources, this task remains a key challenge for French clinical documents. In this paper, we introduce CONCORDIA, an approach for computing semantic similarity between sentences in French clinical texts, based on supervised machine learning algorithms. After briefly reviewing various semantic textual similarity measures reported in the literature, we describe the approach, which relies on Random Forest, Multilayer Perceptron and Linear Regression algorithms to build supervised models. These models are then used to determine the degree of semantic similarity between clinical sentences. CONCORDIA is evaluated using the Spearman correlation and EDRM, two classical evaluation metrics, on the standard benchmarks provided in the context of the DEFT 2020 text mining challenge. According to the official DEFT 2020 challenge results, the CONCORDIA Multilayer Perceptron based model achieves the best performance among all participating systems, reaching an EDRM of 0.8217.
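To make the evaluation setting in the abstract more concrete, here is a minimal sketch of how predicted similarity scores could be compared with gold scores using the Spearman correlation. It is not the CONCORDIA implementation: the token-level Jaccard feature, the function names, and the example sentence pairs and gold scores are all assumptions made for illustration, and the supervised models (Random Forest, Multilayer Perceptron, Linear Regression) and the EDRM metric are not reproduced.

```python
from math import sqrt


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard overlap (an illustrative feature, not the paper's)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y) -> float:
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("Spearman correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical sentence pairs and gold similarity scores (invented for this sketch).
pairs = [
    ("le patient présente une fièvre", "le patient présente une fièvre"),
    ("le patient présente une fièvre", "le patient a une toux"),
    ("douleur thoracique aiguë", "examen normal"),
]
gold = [5.0, 2.0, 0.0]

predicted = [jaccard(a, b) for a, b in pairs]
correlation = spearman(predicted, gold)
```

In a supervised setting such as the one the abstract describes, features of this kind would be inputs to a trained regressor rather than the final score; the rank-based evaluation step would stay the same.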

CC BY-NC-ND 4.0


Paper citation in several formats:
Dramé, K.; Sambe, G. and Diallo, G. (2021). CONCORDIA: COmputing semaNtic sentenCes for fRench Clinical Documents sImilArity. In Proceedings of the 17th International Conference on Web Information Systems and Technologies - WEBIST; ISBN 978-989-758-536-4; ISSN 2184-3252, SciTePress, pages 77-83. DOI: 10.5220/0010687500003058

@conference{webist21,
author={Khadim Dramé and Gorgoumack Sambe and Gayo Diallo},
title={CONCORDIA: COmputing semaNtic sentenCes for fRench Clinical Documents sImilArity},
booktitle={Proceedings of the 17th International Conference on Web Information Systems and Technologies - WEBIST},
year={2021},
pages={77-83},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010687500003058},
isbn={978-989-758-536-4},
issn={2184-3252},
}

TY - CONF

JO - Proceedings of the 17th International Conference on Web Information Systems and Technologies - WEBIST
TI - CONCORDIA: COmputing semaNtic sentenCes for fRench Clinical Documents sImilArity
SN - 978-989-758-536-4
IS - 2184-3252
AU - Dramé, K.
AU - Sambe, G.
AU - Diallo, G.
PY - 2021
SP - 77
EP - 83
DO - 10.5220/0010687500003058
PB - SciTePress