Applying Large Graph Neural Networks to Predict Transition Metal Complex Energies Using the tmQM_wB97MV Data Set

Machine learning (ML) methods have shown promise for discovering novel catalysts but are often restricted to specific chemical domains. Generalizable ML models require large and diverse training data sets, which exist for heterogeneous catalysis but not for homogeneous catalysis. The tmQM data set, which contains properties of 86,665 transition metal complexes calculated at the TPSSh/def2-SVP level of density functional theory (DFT), provided a promising training data set for homogeneous catalyst systems. However, we find that ML models trained on tmQM consistently underpredict the energies of a chemically distinct subset of the data. To address this, we present the tmQM_wB97MV data set, which filters out several structures in tmQM found to be missing hydrogens and recomputes the energies of all other structures at the ωB97M-V/def2-SVPD level of DFT. ML models trained on tmQM_wB97MV show no pattern of consistently incorrect predictions and much lower errors than those trained on tmQM. The ML models tested on tmQM_wB97MV were, from best to worst, GemNet-T > PaiNN ≈ SpinConv > SchNet. Performance consistently improves when using only neutral structures instead of the entire data set. However, while models saturate with only neutral structures, more data continue to improve the models when including charged species, indicating the importance of accurately capturing a range of oxidation states in future data generation and model development. Furthermore, a fine-tuning approach in which weights were initialized from models trained on OC20 led to drastic improvements in model performance, indicating transferability between ML strategies of heterogeneous and homogeneous systems.


Effects of Removed Structures on tmQM Statistics
Figure 15: Parity plot for the validation set of a GemNet-T model trained on 80% of all of tmQM, excluding the data points for structures that were removed from tmQM when generating tmQM_wB97MV. Of the 8,666 structures in the validation set, 166 had an absolute error of at least 0.1 Hartree, and 31 had an absolute error of at least 0.5 Hartree. Thirteen structures in this set were removed when generating tmQM_wB97MV.
Figure 16: Parity plot for the testing set of a GemNet-T model trained on 80% of all of tmQM, excluding the data points for structures that were removed from tmQM when generating tmQM_wB97MV. Of the 8,666 structures in the testing set, 157 had an absolute error of at least 0.1 Hartree, and 32 had an absolute error of at least 0.5 Hartree. Twenty-three structures in this set were removed when generating tmQM_wB97MV.
Figure 17: Parity plot for a GemNet-T model trained on 80% of all of tmQM, showing only the data points for structures that were removed from tmQM when generating tmQM_wB97MV (note that this includes structures from across the training, validation, and testing sets). Of the 155 removed structures, 29 had an absolute error of at least 0.1 Hartree, and 1 had an absolute error of at least 0.5 Hartree. We note that of the 155 removed structures, 27 were charged and 128 were neutral, a ratio very similar to that of charged to neutral structures in the entire data set (71,173 neutral structures out of 86,665).

tmQM_wB97MV MAE and EwT Tables

Figure 1: Energy distributions of tmQM before preprocessing (top) and after preprocessing (bottom). Both histograms plot the count versus the electronic energy in hartrees.

Figure 2: Energy distributions of the neutral subset of tmQM before preprocessing (top) and after preprocessing (bottom). Both histograms plot the count versus the electronic energy in hartrees.

Figure 3: Energy distributions of tmQM_wB97MV before preprocessing (top) and after preprocessing (bottom). Both histograms plot the count versus the electronic energy in hartrees.

Figure 4: Energy distributions of the neutral subset of tmQM_wB97MV before preprocessing (top) and after preprocessing (bottom). Both histograms plot the count versus the electronic energy in hartrees.

Figure 6 :
Figure 5: Learning curves for models using all of tmQM (left) and the neutral structures only (right), plotting test set MAE (in meV/atom) versus the percentage of the data used for training.
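
The split sizes reported in the captions of this section (for example, 69,333 training, 8,666 validation, and 8,666 testing structures out of 86,665) are consistent with holding out 10% of the data each for validation and testing, rounded down, and training on the remainder. A minimal Python sketch of such a split, with nested training subsets for a learning curve, is given below. The function names, the random seed, and the way smaller training subsets are drawn from the training pool are illustrative assumptions and are not taken from the original workflow.

```python
import random


def split_indices(n_structures, val_percent=10, test_percent=10, seed=0):
    """Shuffle indices, hold out validation and test sets, keep the rest for training."""
    indices = list(range(n_structures))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary, assumed choice
    n_val = n_structures * val_percent // 100    # rounded down
    n_test = n_structures * test_percent // 100  # rounded down
    val = indices[:n_val]
    test = indices[n_val:n_val + n_test]
    train_pool = indices[n_val + n_test:]
    return train_pool, val, test


def training_subset(train_pool, percent, pool_percent=80):
    """Take a nested subset of the training pool (assumed scheme).

    `percent` is the nominal share of the full data set used for training;
    the full pool corresponds to `pool_percent` (80% here).
    """
    n_train = len(train_pool) * percent // pool_percent
    return train_pool[:n_train]


# Sizes for all of tmQM (86,665 structures) and its neutral subset (71,173)
train_all, val_all, test_all = split_indices(86665)
train_neutral, val_neutral, test_neutral = split_indices(71173)
learning_curve_sizes = {p: len(training_subset(train_all, p)) for p in (20, 40, 60, 80)}
```

Because each smaller subset is a prefix of the same shuffled pool, every point on the learning curve is evaluated against an identical test set, which keeps the MAE values comparable across training set sizes.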

Figure 7: Parity plots for the test set of models trained on 40% of tmQM.

Figure 8: Parity plots for the test set of models trained on 60% of tmQM.

Figure 9: Parity plots for the test set of models trained on 80% of tmQM.

Figure 10: Parity plots for the test set of models trained on 20% of the neutral subset of tmQM.

Figure 11: Parity plots for the test set of models trained on 40% of the neutral subset of tmQM.

Figure 12: Parity plots for the test set of models trained on 60% of the neutral subset of tmQM.

Figure 13: Parity plots for the test set of models trained on 80% of the neutral subset of tmQM.

Figure 14: Parity plot for the training set of a GemNet-T model trained on 80% of all of tmQM, excluding the data points for structures that were removed from tmQM when generating tmQM_wB97MV. Of the 69,333 structures in the training set, 362 had an absolute error of at least 0.1 Hartree, and 172 had an absolute error of at least 0.5 Hartree. In total, 119 structures in this set were removed when generating tmQM_wB97MV.

Figure 18: Parity plot for the training set of a GemNet-T model trained on 80% of the neutral subset of tmQM, excluding the data points for structures that were removed from tmQM when generating tmQM_wB97MV. Of the 56,939 structures in the training set, 287 had an absolute error of at least 0.1 Hartree, and 139 had an absolute error of at least 0.5 Hartree. In total, 104 structures in this set were removed when generating tmQM_wB97MV.

Figure 19: Parity plot for the validation set of a GemNet-T model trained on 80% of the neutral subset of tmQM, excluding the data points for structures that were removed from tmQM when generating tmQM_wB97MV. Of the 7,117 structures in the validation set, 78 had an absolute error of at least 0.1 Hartree, and 32 had an absolute error of at least 0.5 Hartree. Ten structures in this set were removed when generating tmQM_wB97MV.

Figure 20: Parity plot for the testing set of a GemNet-T model trained on 80% of the neutral subset of tmQM, excluding the data points for structures that were removed from tmQM when generating tmQM_wB97MV. Of the 7,117 structures in the testing set, 72 had an absolute error of at least 0.1 Hartree, and 36 had an absolute error of at least 0.5 Hartree. Fourteen structures in this set were removed when generating tmQM_wB97MV.
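
The outlier counts quoted in the parity plot captions (structures with an absolute error of at least 0.1 or 0.5 Hartree) amount to a simple thresholded tally over per-structure errors. The Python sketch below illustrates that tally under the assumption that predicted and reference energies are available as parallel lists in Hartree; the function name and the toy energies are hypothetical and are not values from tmQM.

```python
def count_large_errors(predicted, reference, thresholds=(0.1, 0.5)):
    """Count structures whose absolute energy error meets or exceeds each threshold (Hartree)."""
    abs_errors = [abs(p - r) for p, r in zip(predicted, reference)]
    return {t: sum(err >= t for err in abs_errors) for t in thresholds}


# Toy energies in Hartree (made-up values for illustration only)
predicted = [-100.00, -250.30, -75.95, -310.00]
reference = [-100.02, -250.00, -76.60, -310.05]

counts = count_large_errors(predicted, reference)
# Absolute errors are roughly 0.02, 0.30, 0.65, and 0.05 Hartree
```

Note that the two counts are cumulative: every structure counted at the 0.5 Hartree threshold is also counted at the 0.1 Hartree threshold.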

Figure 21: Parity plots for the test set of models trained on 20% of tmQM_wB97MV.

Figure 22: Parity plots for the test set of models trained on 40% of tmQM_wB97MV.

Figure 23: Parity plots for the test set of models trained on 60% of tmQM_wB97MV.

Figure 24: Parity plots for the test set of models trained on 80% of tmQM_wB97MV.

Figure 25: Parity plots for the test set of models trained on 20% of the neutral subset of tmQM_wB97MV.

Figure 26: Parity plots for the test set of models trained on 40% of the neutral subset of tmQM_wB97MV.

Figure 27: Parity plots for the test set of models trained on 60% of the neutral subset of tmQM_wB97MV.

Figure 28: Parity plots for the test set of models trained on 80% of the neutral subset of tmQM_wB97MV.

Table 1: Atomic energies used for reference correction on the entire tmQM data set.

Table 2: Atomic energies used for reference correction on the neutral subset of the tmQM data set.

Table 3: Atomic energies used for reference correction on the entire tmQM_wB97MV data set.

Table 4: Atomic energies used for reference correction on the neutral subset of the tmQM_wB97MV data set.
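
Reference correction subtracts a per-element atomic energy, weighted by the number of atoms of that element, from each total energy so that the model learns a smaller residual. One common way to obtain such per-element energies is a linear least-squares fit of total energies against element counts. The Python sketch below illustrates that approach; whether the atomic energies in Tables 1 to 4 were obtained this way is an assumption here, and the function names and toy compositions are hypothetical.

```python
import numpy as np


def fit_atomic_energies(compositions, total_energies, elements):
    """Least-squares fit of one reference energy per element (assumed procedure)."""
    # Rows are structures, columns are element counts
    counts = np.array(
        [[comp.get(el, 0) for el in elements] for comp in compositions], dtype=float
    )
    energies = np.asarray(total_energies, dtype=float)
    coeffs, *_ = np.linalg.lstsq(counts, energies, rcond=None)
    return dict(zip(elements, coeffs))


def reference_correct(composition, total_energy, atomic_energies):
    """Subtract the composition-weighted atomic energies from a total energy."""
    baseline = sum(n * atomic_energies[el] for el, n in composition.items())
    return total_energy - baseline


# Toy data: energies built exactly from made-up atomic energies (not real DFT values)
elements = ["H", "C", "O"]
compositions = [
    {"H": 2, "O": 1},
    {"C": 1, "O": 2},
    {"C": 1, "H": 4},
    {"H": 2, "C": 1, "O": 1},
]
total_energies = [-76.0, -187.8, -39.8, -113.8]

atomic_energies = fit_atomic_energies(compositions, total_energies, elements)
corrected = [
    reference_correct(c, e, atomic_energies)
    for c, e in zip(compositions, total_energies)
]
```

In this toy case the total energies are exactly additive, so the corrected energies vanish; for real data the corrected energies are the residuals that the model is trained to predict. Fitting separately on the full data set and on the neutral subset would give a different set of atomic energies for each, which is consistent with the four separate tables.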

Table 5: Test set mean absolute error (MAE, in meV/atom) for all models trained on all of tmQM.

Table 6: Test set mean absolute error (MAE, in meV/atom) for models trained on the neutral subset of tmQM.

Table 7: Test set energy within threshold (EwT, %) for models trained on the entirety of tmQM.

Table 8: Test set energy within threshold (EwT, %) for models trained on the neutral subset of tmQM.

Table 9: Test set mean absolute error (MAE, in meV/atom) for all models trained on all of tmQM_wB97MV.

Table 10: Test set mean absolute error (MAE, in meV/atom) for models trained on the neutral subset of tmQM_wB97MV.

Table 11: Test set energy within threshold (EwT, %) for models trained on the neutral subset of tmQM_wB97MV.
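
The two metrics reported in these tables can be sketched in a few lines of Python. The MAE is normalized per atom and expressed in meV, and the EwT is the percentage of structures whose absolute energy error falls within a fixed threshold. The 0.02 eV threshold below follows the convention used for OC20 and is an assumption here, as is applying it to the total (rather than per-atom) error; the function names and toy values are hypothetical.

```python
def mae_mev_per_atom(predicted_ev, reference_ev, n_atoms):
    """Mean absolute error per atom, in meV, from energies given in eV."""
    per_atom_errors = [
        abs(p - r) / n * 1000.0  # eV -> meV, normalized by atom count
        for p, r, n in zip(predicted_ev, reference_ev, n_atoms)
    ]
    return sum(per_atom_errors) / len(per_atom_errors)


def energy_within_threshold(predicted_ev, reference_ev, threshold_ev=0.02):
    """Percentage of structures with absolute error at or below the threshold.

    The default threshold of 0.02 eV is an assumed value (OC20 convention).
    """
    hits = sum(abs(p - r) <= threshold_ev for p, r in zip(predicted_ev, reference_ev))
    return 100.0 * hits / len(predicted_ev)


# Toy values in eV (made up for illustration only)
predicted_ev = [-10.00, -20.05, -30.50, -40.01]
reference_ev = [-10.01, -20.00, -30.00, -40.00]
n_atoms = [10, 20, 50, 40]

mae = mae_mev_per_atom(predicted_ev, reference_ev, n_atoms)
ewt = energy_within_threshold(predicted_ev, reference_ev)
```

The two metrics are complementary: the MAE can be dominated by a few large outliers, whereas the EwT reports how often a prediction is accurate enough to be useful regardless of how badly the remaining predictions fail.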