Paper

Authors: Almuth Müller and Achim Kuwertz

Affiliation: Fraunhofer IOSB, Fraunhoferstraße 1, 76131 Karlsruhe, Germany

Keyword(s): Entity Resolution, Record Linkage, Deduplication, Natural Language Processing, Fuzzy Matching.

Abstract: This paper presents a concept for a two-tier, semi-automated approach to entity resolution in business data. Resolving entity names is relevant in many settings, e.g. in business intelligence. In practice, several difficulties have to be considered, such as deviations in the name of an organization. Two types of deviation can be distinguished. First, names can differ due to typos, native special characters or transformation errors. Second, an organization name can differ because an outdated designation is used or because it is given in another language. A further aspect is data sovereignty. Analyzed data sources can be under direct control, e.g. in one's own data storage systems, and can thus be kept clean. Other sources of relevant data, however, may only be publicly available. Copying such data is generally not recommended, e.g. because of its volume and the data duplication issues this creates. The proposed two-tier approach to entity resolution therefore considers not only the different kinds of name deviations but also data sovereignty issues. Although still work in progress, it has the potential to reduce the effort required compared to manual approaches, and it can possibly be applied in other areas where there is a significant need for harmonized data and externally curated systems are not feasible.

CC BY-NC-ND 4.0

Paper citation in several formats:
Müller, A. and Kuwertz, A. (2022). A Two-tire Approach for Organization Name Entity Resolution. In Proceedings of the 11th International Conference on Data Science, Technology and Applications - DATA; ISBN 978-989-758-583-8; ISSN 2184-285X, SciTePress, pages 484-491. DOI: 10.5220/0011307000003269

@conference{data22,
author={Almuth Müller and Achim Kuwertz},
title={A Two-tire Approach for Organization Name Entity Resolution},
booktitle={Proceedings of the 11th International Conference on Data Science, Technology and Applications - DATA},
year={2022},
pages={484-491},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011307000003269},
isbn={978-989-758-583-8},
issn={2184-285X},
}

TY - CONF

JO - Proceedings of the 11th International Conference on Data Science, Technology and Applications - DATA
TI - A Two-tire Approach for Organization Name Entity Resolution
SN - 978-989-758-583-8
IS - 2184-285X
AU - Müller, A.
AU - Kuwertz, A.
PY - 2022
SP - 484
EP - 491
DO - 10.5220/0011307000003269
PB - SciTePress