Published February 9, 2022 | Version v1
Journal article Open

Interpretable Ontology Extension in Chemistry - Supplementary data

  • 1. Otto von Guericke University Magdeburg

Description

Supplementary data for our submission "Interpretable Ontology Extension in Chemistry". We present an approach to ontology extension that uses structural information to train a transformer-based model that predicts new subsumption relations. The ELECTRA model was pre-trained on a combination of molecules from the ChEBI ontology and a selection of molecules from the PubChem database (chebai/data/SWJpre/raw/smiles.txt). The pre-trained model was then fine-tuned on a selection of ChEBI classes and applied to a set of previously unseen chemicals from PubChem (hazardous.txt). The resulting predictions were used to extend the ChEBI ontology. The extended ontology is provided as an OWL file in 'chebi-slim-extended.owl.gz' and as a plot in 'classif-hazardous.png'. This extended ontology was inconsistent because some of the predicted subsumption relations violated disjointness axioms. Those subsumption relations were removed; the corrected ontology is provided in 'chebi-slim-extended-fixed.owl.gz'. The README.md file describes how to reproduce our results.
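To illustrate what removing predictions that violate disjointness axioms can look like, here is a minimal sketch in plain Python. It is not the procedure used to produce 'chebi-slim-extended-fixed.owl.gz'; the greedy, score-ordered strategy, the function names (`ancestors`, `filter_predictions`) and the toy class hierarchy are all assumptions made for illustration only.

```python
from itertools import product


def ancestors(cls, parents):
    """Return cls together with all of its transitive superclasses."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def filter_predictions(predictions, parents, disjoint):
    """Greedily keep predicted superclasses that do not clash with ones already kept.

    predictions: list of (class_name, score) pairs for a single molecule
    parents:     dict mapping a class to its direct superclasses
    disjoint:    set of frozenset({a, b}) pairs declared disjoint
    """
    kept, kept_ancestors = [], set()
    # Assumption: higher-scoring predictions take priority over lower-scoring ones.
    for cls, _score in sorted(predictions, key=lambda p: p[1], reverse=True):
        candidate = ancestors(cls, parents)
        clash = any(
            frozenset((a, b)) in disjoint
            for a, b in product(candidate, kept_ancestors)
        )
        if not clash:
            kept.append(cls)
            kept_ancestors |= candidate
    return kept


# Toy hierarchy with hypothetical class names (not taken from ChEBI).
parents = {
    "carboxylic acid": ["organic compound"],
    "metal salt": ["inorganic compound"],
    "organic compound": ["chemical entity"],
    "inorganic compound": ["chemical entity"],
}
disjoint = {frozenset(("organic compound", "inorganic compound"))}
predictions = [("carboxylic acid", 0.93), ("metal salt", 0.41)]

kept = filter_predictions(predictions, parents, disjoint)
print(kept)
```

In this toy case the lower-scoring prediction is dropped because one of its superclasses is declared disjoint with a superclass of the prediction that was already kept. Other repair strategies, such as dropping both conflicting predictions, are equally conceivable.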

Files

interpretable_ontology_extension.zip (44.0 MB)
md5:350a9eab7d3171c6508a64de2e46658c
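The MD5 checksum listed for the archive can be used to check that a download is complete. The following is a small sketch using only the Python standard library; the helper names `md5_of_file` and `verify` are illustrative, and the local path passed to `verify` is an assumption about where the archive was saved.

```python
import hashlib

# Checksum as listed for interpretable_ontology_extension.zip on this record.
EXPECTED_MD5 = "350a9eab7d3171c6508a64de2e46658c"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file at path matches the expected checksum."""
    return md5_of_file(path) == expected


# Example call, assuming the archive sits in the current directory:
# verify("interpretable_ontology_extension.zip")
```

Reading in chunks keeps memory use low for larger files; MD5 is suitable here for detecting corrupted downloads, not as a security guarantee.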