Issue 21, 2021

Melting point prediction of organic molecules by deciphering the chemical structure into a natural language

Abstract

Establishing quantitative structure–property relationships for the rational design of small molecule drugs at the early discovery stage is highly desirable. Using natural language processing (NLP), we propose a machine learning model that processes the line notation of small organic molecules to predict their melting points. Prediction accuracy benefits from training on different canonicalized SMILES forms of the same molecules: when a combination of two different canonicalized SMILES forms is used to train the model, the accuracy improves. Unlike previous fragment-based or descriptor-based models, the prediction accuracy of this NLP-based model does not decrease with increasing size, complexity, and structural flexibility of the molecules. By representing chemical structure as a natural language, this NLP-based model offers a potential tool for quantitative structure–property prediction in drug discovery and development.
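
To illustrate what treating a SMILES line notation as a natural language can look like in practice, the following is a minimal sketch, not the authors' pipeline. It assumes a regex-based tokenizer of the kind commonly used for SMILES and a simple integer vocabulary; the names `tokenize_smiles`, `build_vocab`, and `encode` are hypothetical, and the abstract does not specify which canonicalized forms or which tokenization the model uses.

```python
import re

# Assumption: a regex tokenizer of the kind commonly used for SMILES.
# Bracket atoms and two-letter halogens are matched before single letters
# so that "Cl" is not split into "C" and "l".
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|%\d{2}|\d|[()=#+\-\\/.:~@?*$])"
)


def tokenize_smiles(smiles):
    """Split a SMILES string into tokens ("words" of the chemical sentence)."""
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    # Reject strings containing characters the pattern does not cover.
    if "".join(tokens) != smiles:
        raise ValueError(f"Unrecognized characters in SMILES: {smiles!r}")
    return tokens


def build_vocab(token_lists):
    """Map each distinct token to an integer id; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(tokens, vocab):
    """Convert a token list into the integer sequence a sequence model would consume."""
    return [vocab[token] for token in tokens]


# Two different but equivalent SMILES strings for ethanol, standing in for
# "different SMILES forms of the same molecule" (illustrative only).
forms = ["CCO", "OCC"]
token_lists = [tokenize_smiles(s) for s in forms]
vocab = build_vocab(token_lists)
encoded = [encode(tokens, vocab) for tokens in token_lists]
```

Both strings describe the same molecule but yield different token sequences, which is why training on more than one form exposes a sequence model to more varied input for the same melting point label.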

Article information

Article type: Communication
Submitted: 10 Nov 2020
Accepted: 11 Jan 2021
First published: 11 Jan 2021

Chem. Commun., 2021, 57, 2633-2636

W. Mi, H. Chen, D. Zhu, T. Zhang and F. Qian, Chem. Commun., 2021, 57, 2633, DOI: 10.1039/D0CC07384A
