Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials

Efficiently creating a concise but comprehensive data set for training machine-learned interatomic potentials (MLIPs) is an under-explored problem. Active learning (AL), which uses either biased or unbiased molecular dynamics (MD) simulations to generate candidate pools, aims to address this objective. Existing biased and unbiased MD simulations, however, are prone to miss either rare events or extrapolative regions, i.e., areas of the configurational space where unreliable predictions are made. Simultaneously exploring both regions is necessary for developing uniformly accurate MLIPs. In this work, we demonstrate that MD simulations, when biased by the MLIP's energy uncertainty, effectively capture extrapolative regions and rare events without the need to know a priori the system's transition temperatures and pressures. Exploiting automatic differentiation, we enhance bias-forces-driven MD simulations by introducing the concept of bias stress. We also employ calibrated ensemble-free uncertainties derived from sketched gradient features to yield MLIPs with similar or better accuracy than ensemble-based uncertainty methods at a lower computational cost. We use the proposed uncertainty-driven AL approach to develop MLIPs for two benchmark systems: alanine dipeptide and MIL-53(Al). Compared to MLIPs trained with conventional MD simulations, MLIPs trained with the proposed data-generation method more accurately represent the relevant configurational space for both atomic systems.


Introduction
Computational techniques are invaluable for exploring the complex configurational and compositional spaces of molecular and material systems. Their accuracy and efficiency, however, depend on the chosen computational methods. Ab initio molecular dynamics (AIMD) simulations using density-functional theory (DFT) provide accurate results but are computationally demanding. Atomistic simulations with classical force fields (FFs) offer a faster alternative but often lack accuracy. Machine-learned interatomic potentials (MLIPs) aim to combine the accuracy of ab initio methods with the efficiency of classical FFs.2,3 An essential component of any MLIP is the accurate encoding of the atomic system by a local representation, which depends on configurational (atomic positions) and compositional (atomic types) degrees of freedom.4 Recently, a wide range of machine learning approaches has been introduced, including linear and kernel-based models,5-8 Gaussian approximation potentials,9,10 and neural network (NN) interatomic potentials,11-18 all demonstrating remarkable success in atomistic simulations.
The effectiveness of MLIPs, however, crucially relies on training data sufficiently covering the relevant configurational and compositional spaces.19,20 Without such training data, MLIPs cannot faithfully reproduce the underlying physics. An open challenge, therefore, is the generation of comprehensive training data sets for MLIPs, covering the relevant configurational and compositional spaces and ensuring that the resulting MLIPs are uniformly accurate across these spaces. This objective must be realized while reducing the number of expensive DFT evaluations, which provide reference energies, atomic forces, and stresses. The challenge is further complicated by limited knowledge of the physical conditions, such as temperature and pressure, at which configurational changes occur.
To address this challenge, iterative active learning (AL) algorithms can be used to improve the accuracy of MLIPs by providing an augmented data set;21-26 see Fig. 1 (a). They select the data most informative to the model, that is, atomic configurations with higher energy and force uncertainties, as estimated by the model. This data is drawn from the configurational and compositional spaces explored during, e.g., molecular dynamics (MD) simulations. Reference DFT energies, atomic forces, and stresses are evaluated for the selected atomic configurations. Furthermore, energy and force uncertainties indicate the onset of extrapolative regions (regions where unreliable predictions are made), prompting the termination of the MD simulation and the evaluation of reference DFT values. In such a naive AL setting, covering the configurational space and exploring extrapolative configurations might require running longer MD simulations and defining the physical conditions for observing slow configurational changes (rare events).

Figure 1. (a) An AL experiment begins with training an MLIP in the first iteration using a small set of randomly perturbed initial configurations. The current MLIP is employed in each iteration to run parallel MD simulations. Each simulation continues until it reaches a predefined uncertainty threshold. Then, a batch of configurations is selected from all trajectories. Reference energies and forces of these samples are evaluated using a DFT solver, updating the training data set. The updated data set is employed for training the MLIP in the next iteration. (b) Adaptive biasing strategies like metadynamics enhance the exploration of the configurational space. In metadynamics, exploration along manually defined CVs is facilitated by adding Gaussian functions to a history-dependent bias (areas filled by blue, orange, and red colors). However, even for well-defined CVs, exploring the configurational space of interest may require long simulation times due to the diffusive motion along these CVs. (c) Uncertainty-biased MD aims to minimize the uncertainty u (grey shaded area), which is related to the actual error, thereby facilitating the exploration of the configurational space. In uncertainty-biased MD, we subtract the MLIP's energy uncertainty from the predicted energy (continuous black line) and run MD simulations using the altered energy surface (dashed black line). Curved lines denote distinct MD trajectories. Unlike metadynamics, uncertainty-biased MD operates without defining CVs and drives MD simulations toward high-uncertainty regions in each iteration.
Exploration of the configurational space can be enhanced by employing adaptive biasing strategies such as metadynamics;27-32 see Fig. 1 (b). However, metadynamics requires manually selecting a few collective variables (CVs) that are assumed to describe the system. The limited number of CVs restricts exploration, as they might miss relevant transitions and parts of the configurational space. In contrast, MD simulations biased toward regions of high uncertainty can enhance the discovery of extrapolative configurations.33,34 A related line of work utilizes uncertainty gradients with respect to atomic positions for adversarial training of MLIPs.35 To obtain MLIPs that are uniformly accurate across the relevant configurational space, however, efficient exploration of both rare events and extrapolative configurations is necessary. The extent to which uncertainty-biased MD can achieve this objective remains an unexplored research area.
In this work, we demonstrate the capability of uncertainty-biased MD to explore the relevant configurational space, including fast exploration of rare events and extrapolative regions; see Fig. 1 (c). We achieve this by exploring the CVs of alanine dipeptide, a widely used model for protein backbone structure. To assess the coverage of the CV space, we introduce a measure using a tree-based weighted recursive space partitioning. Furthermore, we extend existing uncertainty-biased MD simulations by automatic differentiation (AD) and propose a novel biasing technique that utilizes bias stresses obtained by differentiating the model's uncertainty with respect to infinitesimal strain deformations. We assess the efficiency of the proposed biasing technique by running MD simulations at constant pressure (NpT statistical ensemble) and exploring the cell parameters of MIL-53(Al), a flexible metal-organic framework (MOF) featuring closed- and large-pore stable states.
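The core biasing idea, subtracting the scaled energy uncertainty from the predicted energy and obtaining bias forces by automatic differentiation, can be sketched as follows. This is a minimal illustration, not the paper's implementation: `energy_fn` and `uncertainty_fn` are toy stand-ins for the MLIP's energy and calibrated uncertainty, and the biasing strength `tau` mirrors the τ used later in the text.

```python
import torch

def biased_energy_and_forces(positions, energy_fn, uncertainty_fn, tau=0.25):
    """Biased potential E_b = E - tau * u; the bias forces follow directly
    from automatic differentiation, F_b = -dE_b/dR."""
    pos = positions.clone().detach().requires_grad_(True)
    e_biased = energy_fn(pos) - tau * uncertainty_fn(pos)
    forces = -torch.autograd.grad(e_biased, pos)[0]
    return e_biased.detach(), forces.detach()

# Toy stand-ins for the MLIP (assumptions, not the paper's model):
# a harmonic energy around a reference structure and a quadratic,
# distance-based "uncertainty" that grows away from the reference.
ref = torch.zeros(3, 3)
energy_fn = lambda r: 0.5 * ((r - ref) ** 2).sum()
uncertainty_fn = lambda r: ((r - ref) ** 2).sum(dim=1).max()
```

Because the uncertainty enters with a negative sign, its gradient pushes the dynamics toward high-uncertainty (extrapolative) regions instead of away from them.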
A key ingredient of any uncertainty-driven AL algorithm is a sensitive measure of local structural changes for detecting the onset of extrapolative regions. However, MLIP uncertainties often underestimate actual errors, resulting in the exploration of unphysical regions, which negatively affects MLIP training. Thus, calibrated uncertainties are crucial for learning high-quality MLIPs in a data-driven manner. In our setting, we demonstrate that conformal prediction (CP) helps align the highest force error with its corresponding uncertainty value. This approach effectively prevents MLIPs from underestimating force errors, which is important for keeping MD simulations from exploring unphysical configurations. Thus, CP-based uncertainty calibration helps set reasonable uncertainty thresholds without limiting the exploration of the configurational space. In contrast, conventional approaches drive MD away from high-uncertainty regions, which can hinder exploration.36 Our uncertainty measures are derived from gradient features.37-39 These features can be interpreted as the sensitivity of a model's output to changes in its parameters. We demonstrate that gradient features can be used to define uncertainties of total and atom-based properties, such as energy and atomic forces. To make gradient-based uncertainties computationally efficient, we employ the sketching technique40 and reduce the dimensionality of gradient features.
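The combination of sketched gradient features with a posterior-style uncertainty can be sketched as below. This is an illustrative stand-in under stated assumptions: a Gaussian random projection is one standard choice of sketching operator (the paper's exact operator may differ), and the posterior form u(x)² = g(x)ᵀ (GᵀG + λI)⁻¹ g(x) is the usual last-layer/Laplace-style construction; `G` holds (sketched) gradient features of the training set.

```python
import numpy as np

rng = np.random.default_rng(0)

def sketch_features(G, k):
    """Project D-dimensional gradient features onto k dimensions with a
    Gaussian random sketch (one standard sketching operator; assumed here)."""
    D = G.shape[1]
    S = rng.normal(size=(D, k)) / np.sqrt(k)
    return G @ S

def posterior_uncertainty(G_test, G_train, lam=1e-6):
    """Unitless posterior-style uncertainty
    u(x)^2 = g(x)^T (G^T G + lam I)^{-1} g(x); CP supplies units later."""
    k = G_train.shape[1]
    A = G_train.T @ G_train + lam * np.eye(k)
    sol = np.linalg.solve(A, G_test.T)              # shape (k, n_test)
    return np.sqrt(np.sum(G_test.T * sol, axis=0))
```

Features lying outside the span of the training features produce large uncertainties, which is exactly the behavior needed to flag extrapolative configurations.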
We further enhance configurational space exploration and improve the computational efficiency of uncertainty-driven AL by employing batch selection algorithms.38,39 These algorithms simultaneously select multiple atomic configurations from trajectories generated during parallel MD simulations. Batch selection algorithms enforce the informativeness and diversity of the selected atomic structures, ensuring the construction of maximally diverse training data sets.
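One simple way to trade off informativeness against diversity, sketched here as an illustration rather than the paper's specific algorithm, is a greedy farthest-point selection: seed the batch with the most uncertain structure, then repeatedly add the candidate farthest (in feature space) from everything already chosen.

```python
import numpy as np

def select_batch(features, uncertainties, batch_size):
    """Greedy batch selection: start from the most uncertain candidate,
    then add the candidate with the largest distance to the current batch.
    A diversity heuristic standing in for the cited batch selection
    algorithms; `features` could be, e.g., sketched gradient features."""
    chosen = [int(np.argmax(uncertainties))]
    dists = np.linalg.norm(features - features[chosen[0]], axis=1)
    while len(chosen) < batch_size:
        nxt = int(np.argmax(dists))           # farthest from current batch
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(features - features[nxt], axis=1))
    return chosen
```

Because each new pick maximizes the minimum distance to the batch, near-duplicate frames from the same trajectory segment are naturally skipped.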

Results
In the following, we first demonstrate, using the example of MIL-53(Al), the necessity of uncertainty calibration to constrain MD simulations to physically reasonable regions of the configurational space. Then, we present two complementary analyses demonstrating the improved data efficiency of MLIPs obtained by our uncertainty-driven AL approach, developing MLIPs for alanine dipeptide and MIL-53(Al). Furthermore, we investigate how uncertainty-biased MD enhances the exploration of the configurational space, utilizing bias forces and bias stress. To benchmark our results, we draw a comparison with MD run at elevated temperatures and pressures, as well as with metadynamics simulations. Details on the ensemble-free uncertainties (distance- and posterior-based ones derived from sketched gradient features) and on uncertainty-biased MD can be found in the Methods section.

Calibrating uncertainties with conformal prediction
Total and atom-based uncertainties are typically poorly calibrated,41 meaning that they often underestimate actual errors. The underestimation of errors is particularly dangerous when dynamically generating candidate pools, as it may result in exploring unphysical configurations. Specifically, poor calibration complicates defining an appropriate uncertainty threshold for prompting the termination of MD simulations and the evaluation of reference DFT energies, atomic forces, and stresses. To address this issue, we utilize inductive CP, which computes a re-scaling factor based on predicted uncertainties and prediction errors on a calibration set. The confidence level 1 − α in CP is defined such that the probability of underestimating the error is at most α on data drawn from the same distribution as the calibration set. The detailed procedure can be found in the Methods section.
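The inductive CP re-scaling described above can be sketched as follows. This is one common construction (an assumption on details the text defers to Methods): take the ratio error/uncertainty as the nonconformity score on the calibration set and return its conformal quantile, so that the probability of the error exceeding factor × uncertainty is at most α.

```python
import numpy as np

def cp_rescale_factor(cal_errors, cal_uncertainties, alpha=0.05):
    """Inductive conformal re-scaling factor. Multiplying raw (possibly
    unitless) uncertainties by this factor yields calibrated uncertainties
    in the units of the calibration errors (e.g., eV/A for forces)."""
    scores = np.sort(np.asarray(cal_errors) / np.asarray(cal_uncertainties))
    n = len(scores)
    # conformal quantile index: ceil((n + 1) * (1 - alpha)), 1-based
    idx = min(int(np.ceil((n + 1) * (1 - alpha))), n) - 1
    return scores[idx]
```

Note how this also fixes the units issue mentioned later: for unitless posterior- or distance-based uncertainties, the factor carries the units of the calibration errors.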
Figure 2 demonstrates the correlation of maximal atom-based uncertainties with maximal atomic force RMSEs for the MIL-53(Al) test data set from Ref. 32. In the figure, transparent hexbins represent uncertainties calibrated with a lower confidence (α = 0.5; see Methods), while opaque ones depict those calibrated with a higher confidence (α = 0.05). The presented uncertainties are derived from gradient features or from an ensemble of three MLIPs (see Methods) and calibrated using CP with atomic force RMSEs. For posterior- and distance-based uncertainties, which are unitless, the re-scaling with CP ensures that the resulting uncertainties are provided in correct units, i.e., eV/Å. Ensemble-based uncertainty quantification already provides correct units, which are preserved by CP. Equivalent results for alanine dipeptide can be found in the Supplementary Information. Calibrating uncertainties with a high confidence level helps align the largest actual error with the corresponding uncertainty, shifting the hexbin points to or below the red diagonal line. This alignment is crucial for identifying unreliable predictions and prompting the termination of MD simulations. Part of the calibration samples correspond to randomly perturbed MIL-53(Al) structures, while the remaining 450 are generated using metadynamics combined with the incremental learning approach.32 The latter is an iterative learning algorithm that improves MLIPs by training on configurations generated sequentially over time, explicitly using the last simulation frame of atomistic simulations.
We observe that uncertainties calibrated with a lower confidence level often underestimate actual errors. In this case, MD simulations can explore unphysical regions before reaching the uncertainty threshold, especially when the correlation between uncertainties and actual errors is small. By employing CP with higher confidence, we help align the largest prediction error with the corresponding uncertainty, thereby improving its ability to identify the onset of extrapolative regions. This alignment becomes apparent in Fig. 2, where CP shifts the hexbin points to lie on or below the diagonal line.
In Fig. 2 (top), we find that even training and calibrating models with a small set of randomly perturbed atomic configurations is sufficient for robust identification of unreliable predictions. This result is crucial, as we rely on such data sets to initialize our AL experiments, eliminating the need for predefined data sets.33,34 Furthermore, we observe that calibrated uncertainties from model ensembles tend to overestimate the actual error to a greater extent than gradient-based approaches. While overestimation is not critical for avoiding unphysical configurations, it can be wasteful, leading to a premature termination of MD simulations. This trend is consistent across all training and calibration data sizes. Lastly, the results provided here and in the Supplementary Information demonstrate that all uncertainty quantification methods perform comparably in terms of Pearson and Spearman correlation coefficients.

Performance of bias-forces-driven active learning
Exploring the configurational space of complex molecular systems, particularly those with multiple stable states, is essential for developing accurate and robust MLIPs. We apply bias-forces-driven MD simulations combined with AL to develop MLIPs for alanine dipeptide in vacuum. This dipeptide exhibits two stable conformers characterized by the backbone dihedral angles φ and ψ (as shown in the inset of Fig. 3): the C7eq state with φ ≈ −1.5 rad and ψ ≈ 1.19 rad, and the Cax state with φ ≈ 0.9 rad and ψ ≈ −0.9 rad.42 We use unbiased MD simulations as the baseline to create the candidate pool for AL. We employ the AMBER ff19SB force field for reference energy and force calculations,43 as implemented in the TorchMD package using PyTorch.44,45 Each AL experiment starts with training an MLIP on eight alanine dipeptide configurations randomly perturbed from its initial configuration in the C7eq state. Trained MLIPs are then used to run eight parallel MD simulations, initialized from the initial configuration or from configurations selected in later iterations. Each MD simulation runs until reaching an empirically defined uncertainty threshold of 1.5 eV/Å. A lower threshold value may result in slower CV space exploration, while a larger one would lead to the exploration of unphysical configurations. The maximum data set size, comprising training and validation data, is limited to 512 configurations. Biased (bias-forces-driven) and unbiased MD simulations are performed using the canonical (NVT) statistical ensemble within the ASE simulation package.46 Unbiased MD simulations are run with the Langevin thermostat at temperatures of 300 K, 600 K, and 1200 K, whereas biased simulations are performed at a constant temperature of 300 K.
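The run-until-threshold logic described above can be sketched with a toy propagation loop. This is a deliberately simplified stand-in (unit masses, plain velocity Verlet, no Langevin thermostat; the actual simulations use ASE), and `force_fn`/`uncertainty_fn` are placeholders for the MLIP:

```python
import numpy as np

def run_until_threshold(positions, velocities, force_fn, uncertainty_fn,
                        dt=5e-4, threshold=1.5, max_steps=20000):
    """Toy velocity-Verlet loop mirroring the stopping rule used here:
    propagate until the maximal atom-based uncertainty crosses the
    threshold, then hand the trajectory over to candidate selection."""
    traj = [positions.copy()]
    f = force_fn(positions)
    for _ in range(max_steps):
        velocities = velocities + 0.5 * dt * f
        positions = positions + dt * velocities
        f = force_fn(positions)
        velocities = velocities + 0.5 * dt * f
        traj.append(positions.copy())
        if uncertainty_fn(positions).max() > threshold:
            break           # onset of an extrapolative region detected
    return np.array(traj)
```

In the actual AL experiments, the frames collected this way from all eight parallel simulations form the candidate pool for batch selection.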
We have chosen an integration time step of 0.5 fs and set a maximum of 20,000 steps for an MD simulation. A biasing strength of τ = 0.25 was chosen for biased AL experiments. In reference calculations, we employ a force threshold of 20 eV/Å to exclude unphysical structures, mainly encountered at high biasing strengths (equivalently, a smaller integration time step could be used).

Table 1. CV space coverage, atomic energy (E-) and atomic force (F-) RMSEs, as well as position (Pos.) and uncertainty (Unc.) ACTs for alanine dipeptide experiments conducted with posterior-based uncertainties. CV space coverage and E- and F-RMSEs are reported for MLIPs obtained at the end of each experiment, while ACTs are computed using the entire trajectory obtained throughout the experiment. E-RMSE is given in meV/atom, while F-RMSE is in eV/Å. ACTs are provided in ps. For biased MD, we compare two cases: one with (w.) biasing hydrogen atoms and one without (w/o.).

All AL experiments have been repeated five times. Figure 3 demonstrates the performance of MLIPs obtained for alanine dipeptide depending on the number of acquired configurations. Table 1 presents error metrics evaluated for MLIPs at the end of each experiment. Here, we provide results for the posterior-based uncertainty. Equivalent results for other uncertainty methods are presented in the Supplementary Information. Figure 3 (a) presents the coverage of the CV space defined by the two dihedral angles (φ and ψ). We measure the coverage of the respective space by a tree-based weighted recursive space partitioning; see Methods. AL experiments combined with unbiased MD simulations at 1200 K serve as the upper performance limit for MLIPs in the case of alanine dipeptide, achieving the highest coverage of 0.97 after acquiring 512 configurations. Increasing the temperature even further, while using interatomic potentials that allow for bond breaking and formation, might lead to the degradation of the molecule. Uncertainty-biased MD simulations at 300 K result in slightly lower coverage values, surpassing the coverages achieved by their unbiased counterparts at 300 K and 600 K.
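A coverage measure in the spirit of the recursive partitioning mentioned above can be sketched as follows. This is an illustrative simplification (the exact tree construction and weighting are in the paper's Methods, which this sketch does not reproduce): the CV domain is split into progressively finer dyadic cells, and the fraction of occupied cells per depth is averaged.

```python
import numpy as np

def coverage(samples, bounds, max_depth=6):
    """Simplified coverage score: average, over depths d = 1..max_depth,
    the fraction of occupied cells in a 2^d-per-dimension partition of
    the bounded CV domain (e.g., the phi/psi torus mapped to a box)."""
    samples = np.asarray(samples, dtype=float)
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    unit = (samples - lo) / (hi - lo)            # map samples into [0, 1)^d
    unit = np.clip(unit, 0.0, 1.0 - 1e-12)
    scores = []
    for depth in range(1, max_depth + 1):
        n_bins = 2 ** depth
        cells = np.floor(unit * n_bins).astype(int)
        occupied = len({tuple(c) for c in cells})
        scores.append(occupied / n_bins ** samples.shape[1])
    return float(np.mean(scores))
```

Averaging over depths rewards both broad exploration (coarse cells) and fine-grained filling of the space (deep cells), so a single well-sampled basin cannot reach a coverage near 1.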
Furthermore, biased MD simulations outperform unbiased dynamics at 1200 K, efficiently covering the CV space before acquiring about 200 configurations. This observation is attributed to the gradual increase in driving forces induced by the uncertainty bias, resulting in a more gradual distortion of the atomic structure. In contrast, high-temperature unbiased simulations perturb the system more strongly and rapidly enter extrapolative regions without exploring relevant configurational changes. Thus, high-temperature simulations may cause the degradation of the investigated atomic systems, unlike uncertainty-biased dynamics.
Figures 3 (b) and (c) present energy and force RMSEs evaluated on the alanine dipeptide test data set; see Methods. Consistent with the findings in Fig. 3 (a), AL approaches combined with biased MD simulations at 300 K outperform their unbiased counterparts at 300 K and 600 K once they acquire about 100 configurations. Biased AL experiments achieve an energy RMSE of 1.97 meV/atom, close to the values observed for high-temperature MD simulations and surpassing the others by a factor of more than 13. A similar trend is observed for the force RMSE. Biased AL experiments achieve an RMSE of 0.071 eV/Å, outperforming their counterparts at 300 K and 600 K by factors of 2.1 and 1.6, respectively. These results demonstrate the efficiency of uncertainty-biased dynamics in exploring the configurational space and developing accurate and robust MLIPs.
Biased AL experiments achieve this performance without a priori knowledge of the transition temperatures between stable states; see Fig. 3 (d). Determining transition temperatures would require running MD simulations at different temperatures to explore the relevant configurational space without degrading the atomic system. In contrast, the performance of our approach does not depend strongly on the biasing strength (see the Supplementary Information), which only has to be chosen within a moderate range. Our results offer evidence of rare-event exploration (for alanine dipeptide, the exploration of both stable states) through uncertainty-biased dynamics. The following section presents a detailed analysis of the exploration rates.
Additionally, we have identified how to further improve our biased MD simulations by making the biasing strengths species-dependent; see the Supplementary Information. The results presented in this section, achieved with a biasing strength of zero for hydrogen atoms, outperform settings where all atoms are biased equally, with improvements by a factor of 1.08 in coverage and 1.15 in force RMSE; see Table 1. Thus, a more sophisticated data-driven redistribution of biasing strengths can further enhance the performance of bias-forces-driven MD simulations.

Exploration rates for collective variables of alanine dipeptide
We have observed that uncertainty-biased MD simulations effectively explore the configurational space of alanine dipeptide, defined by its CVs. Figure 4 evaluates the extent to which the introduced bias forces accelerate this exploration. In Fig. 4 (a), we present the coverage of the CV space as a function of simulation time, i.e., as a function of the effective number of MD steps. The figure demonstrates that uncertainty-driven AL experiments at 300 K outperform unbiased experiments at 300 K and 600 K. They achieve the same coverage in considerably shorter simulation times, thereby enhancing exploration rates by a factor larger than two. At the same time, biased MD simulations yield results comparable to those obtained from unbiased MD simulations at 1200 K. Thus, we can argue that uncertainty-biased MD explores the relevant configurational space at a rate similar to that of unbiased MD conducted at 1200 K.
The exploration rates estimated from Fig. 4 (a) provide an approximate measure of how uncertainty-biased dynamics accelerate the exploration of the configurational space. To offer a more thorough assessment, we examine auto-correlation functions (ACFs) computed for both position and uncertainty spaces in Figures 4 (b) and (c), where a faster decay corresponds to a faster exploration of the respective space. We compute ACFs using concatenated MD trajectories from all AL iterations, as they cover the configurational space explored during the entire experiment. Additionally, we calculate the auto-correlation time (ACT) for each experiment. For the definitions of ACF and ACT, we refer to Methods. Table 1 presents ACTs for all AL experiments. Smaller ACTs correspond to a faster decay of ACFs, indicating a faster exploration of the respective spaces.
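Standard estimators for the ACF and the integrated ACT can be sketched as below. These are textbook constructions (the exact definitions used here are deferred to Methods, so details such as the windowing may differ): a normalized autocovariance and an integrated autocorrelation time with a self-consistent truncation window.

```python
import numpy as np

def acf(x):
    """Normalized auto-correlation function of a 1-D series
    (e.g., an uncertainty trace or one positional coordinate)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    c = np.correlate(x, x, mode='full')[n - 1:]   # lags 0 .. n-1
    return c / c[0]

def act(x, c=5.0):
    """Integrated auto-correlation time,
    tau = 1 + 2 * sum_t rho(t), truncated with a standard
    self-consistent window (stop once window >= c * tau)."""
    rho = acf(x)
    tau = 1.0
    for window in range(1, len(rho)):
        tau = 1.0 + 2.0 * np.sum(rho[1:window + 1])
        if window >= c * tau:
            break
    return tau
```

A rapidly mixing (white-noise-like) signal yields an ACT near one step, while a slowly exploring trajectory yields a much larger ACT, which is how Table 1 quantifies exploration speed.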
Using the computed ACTs, we conclude that biased AL experiments at 300 K explore position and uncertainty spaces two to six times faster than unbiased MD simulations at 300 K and 600 K. Compared to unbiased dynamics at 1200 K, they achieve comparable exploration rates in the position space and rates lower by a factor of two in the uncertainty space. Furthermore, we observed that biasing hydrogen atoms results in reduced uncertainty ACTs compared to the experiments where hydrogen atoms remained unbiased. However, explicitly biasing hydrogen atoms is less efficient in exploring the position space by a factor of three. Thus, the shorter uncertainty ACTs of unbiased MD simulations at 1200 K can be attributed to a stronger distortion of bonds, including those involving hydrogen atoms, resulting in fast exploration of extrapolative regions. While this effect is unfavorable for the enhanced exploration of slow modes, such as the CVs of alanine dipeptide, in a biased MD simulation it may still be necessary to incorporate small, non-zero biasing strengths for hydrogen atoms to ensure the robustness of MD simulations at elevated temperatures. Interestingly, we observe that uncertainty-biased MD simulations manage to sample the two slow modes of alanine dipeptide even though only 27 degrees of freedom (corresponding to the heavy C, N, and O atoms) were effectively biased, demonstrating their remarkable efficiency.
To gain insight into the exploration of the CV space during the AL experiments, we refer to Figs. 4 (d) and (e), which illustrate the time evolution of the maximal atom-based uncertainty and the coverage of the sampled CV space for selected AL iterations. Biased MD simulations consistently explore configurations with higher uncertainty values than their unbiased counterparts at 300 K and 600 K. Furthermore, bias forces not only drive the exploration toward both stable states of alanine dipeptide but also facilitate transitions between them. These results are on par with unbiased MD simulations at 1200 K, indicating that MD simulations driven by bias forces reduce the uncertainty level uniformly across the relevant configurational space. Due to the direct correlation between uncertainties and actual errors, we can argue that uncertainty-driven AL generates MLIPs that are uniformly accurate across the relevant configurational space.

Performance of bias-stress-driven active learning
Generating training data for bulk material systems with large unit cells and multiple stable states poses a significant challenge in developing MLIPs. Therefore, we assess the performance of bias-stress-driven AL applied to MIL-53(Al), a flexible MOF that undergoes reversible, large-amplitude volume changes under external stimuli, such as temperature and pressure (see the inset of Fig. 5). MIL-53(Al) features two stable phases: the closed-pore state with a unit cell volume of V ∼ 830 Å³ and the large-pore state with V ∼ 1419 Å³. For reference energy, force, and stress calculations, we use the CP2K simulation package (version 2023.1)47 and DFT at the PBE-D3(BJ) level.48,49 Our baseline for generating the candidate pool for AL involves unbiased MD and metadynamics,32 which uses an adaptive biasing strategy for the cell parameters of MIL-53(Al).
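The bias-stress construction introduced earlier, differentiating the model's uncertainty with respect to an infinitesimal strain, can be sketched with automatic differentiation as follows. This is a minimal illustration in the spirit of that construction, not the paper's implementation: positions and cell are deformed by a symmetric strain ε via r → (1 + ε)r, h → (1 + ε)h, and the gradient of the scaled uncertainty at ε = 0 yields the bias contribution to the stress. `uncertainty_fn` is a stand-in for the MLIP's energy uncertainty.

```python
import torch

def bias_stress(positions, cell, uncertainty_fn, tau=0.5):
    """Bias stress from autodiff through an infinitesimal symmetric
    strain applied to atomic positions and the simulation cell.
    Returned as the contribution to be added to the virial stress."""
    eps = torch.zeros(3, 3, requires_grad=True)
    sym = 0.5 * (eps + eps.T)                        # symmetric strain
    deform = torch.eye(3) + sym
    strained_pos = positions @ deform.T
    strained_cell = cell @ deform.T
    u = uncertainty_fn(strained_pos, strained_cell)
    volume = torch.det(cell).abs()
    grad = torch.autograd.grad(tau * u, eps)[0]      # d(tau * u)/d(eps) at eps = 0
    return -grad / volume
```

As a sanity check, if the uncertainty were simply the strained cell volume, the resulting bias stress would be an isotropic −τ·I, i.e., a uniform pull toward volume changes.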
In each AL experiment, we start with 32 MIL-53(Al) configurations randomly perturbed around its closed-pore state, with 90 % reserved for training. Trained MLIPs are then used to perform 32 parallel MD simulations, each running until it reaches an uncertainty threshold of 1.0 eV/Å. The maximum data set size is limited to 512 configurations, comprising training and validation data. Both biased (bias-stress-driven) and unbiased MD simulations use the isobaric-isothermal form of the Nosé-Hoover dynamics.50,51 Unbiased MD simulations are carried out at 600 K and 0 MPa, as well as at ±250 MPa (half of the simulations each), while biased simulations are performed at 600 K and 0 MPa. The characteristic time scales of the thermostat and barostat are set to 0.1 ps and 1 ps, respectively. We have chosen an integration time step of 0.5 fs and set a maximum of 20,000 MD steps for an MD simulation. A stress-biasing strength of τ = 0.5 is used in biased AL experiments. In reference calculations, we employ a force threshold of 20 eV/Å to exclude strongly distorted structures. We use the data set from Ref. 32 as a metadynamics-generated baseline and select the first 500 sequentially generated configurations.

Figure 4. Ramachandran plots are presented for unbiased MD simulations at 300 K and 1200 K and biased MD simulations at 300 K. Simulation time refers to the effective number of MD steps (× 0.5 fs) required to reach the final coverage, while lag time denotes the time interval between two successive MD frames. Biased MD simulations at 300 K exhibit at least two times higher exploration rates than their unbiased counterparts at 300 K and 600 K. Their exploration rates are comparable to those of unbiased MD simulations at 1200 K, with the advantage of gradually distorting the molecule, reducing the risk of its degradation.
Figure 5. Shaded areas denote the standard deviation across three independent runs, except for metadynamics, where shaded areas denote the standard deviation across three randomly initialized MLIPs. (d) Volume distribution for atomic configurations acquired during MD at 600 K, along with volume-dependent energy, force, and stress RMSEs. (e) Volume distribution for configurations acquired during MD at 300 K, along with volume-dependent energy, force, and stress RMSEs. We employ a temperature of 300 K to reduce the probability of exploring the large-pore state of MIL-53(Al). Bias-stress-driven MD simulations outperform metadynamics-based simulations with adaptive biasing of the cell parameters. Metadynamics aims to cover the volume space uniformly. In contrast, uncertainty-biased MD generates training data sets that reduce force and stress RMSEs uniformly. Additionally, biased MD simulations enhance the exploration of the closed- and large-pore states of MIL-53(Al), shown in the inset of (d).
Table 2. Atomic energy (E-), atomic force (F-), and stress (S-) RMSEs, as well as position (Pos.) and uncertainty (Unc.) ACTs for MIL-53(Al) experiments conducted with posterior-based uncertainties. E-, F-, and S-RMSEs are reported for MLIPs obtained at the end of each experiment, while ACTs are computed using the entire trajectory sampled throughout the experiment. E-RMSE is given in meV/atom, F-RMSE in eV/Å, and S-RMSE in MPa. For metadynamics, we train three MLIPs initialized using different random seeds.

Figures 5 (a)-(c) demonstrate the performance of MLIPs developed for MIL-53(Al) depending on the number of acquired configurations. Table 2 presents error metrics evaluated for MLIPs at the end of each experiment. Here, we present results for the posterior-based uncertainty. Equivalent results for other uncertainty quantification methods are presented in the Supplementary Information. We observe that MLIPs trained with configurations generated using metadynamics outperform the others for data set sizes below 200 samples. This difference in performance can be attributed to how perturbed configurations are generated and to the differing experimental settings between incremental learning and the AL applied here. Bias-stress-driven AL outperforms metadynamics-based incremental learning in terms of force and stress RMSEs after acquiring about 200 atomic configurations. Metadynamics-based experiments achieve performance on par with unbiased AL experiments conducted at 0 MPa once they reach a data set size of 200 configurations. For uncertainty-biased MD, the force RMSE improves by a factor of 1.14, and the stress RMSE improves by a factor of two. Furthermore, AL experiments employing biased MD simulations outperform unbiased MD simulations at 250 MPa in terms of stress RMSE. Therefore, we can argue that bias-stress-driven MD generates a data set that better represents the relevant configurational space of flexible MOFs, without a priori knowledge of the transition pressure, than conventional MD and metadynamics simulations.
Figures 5 (d) and (e) show the main advantage of biased MD simulations over unbiased and metadynamics-based approaches. In Fig. 5 (e), we reduce the temperature to 300 K and initiate the AL experiments with 256 configurations, each having a unit cell volume below 1200 Å³ (drawn from Ref. 32).
Using a lower temperature and learning the configurational space around the closed-pore state is required to decrease the probability of MD simulations exploring the large-pore stable state of MIL-53(Al). In contrast, we found that using randomly perturbed atomic configurations can lead to energy barriers being underestimated by MLIPs, thus facilitating the transition between the two stable states in initial AL iterations.
While exploring the large-pore state less frequently than their metadynamics-based counterparts, bias-stress-driven MD simulations span a broader range of volumes and reduce energy, force, and stress RMSEs uniformly across the entire volume space. Compared to zero-pressure unbiased MD simulations, the bias stress effectively drives the dynamics toward the large-pore state. However, since this state can be modeled using atomic environments from the closed-pore state, the bias stress does not favor its exploration; instead, it drives the dynamics toward smaller volumes, where all other approaches tend to predict energies, forces, and stresses with higher errors.
These results show that uncertainty-biased MD simulations reduce errors across the relevant configurational space and accelerate the simultaneous exploration of extrapolative regions and transitions between stable states. In contrast, metadynamics may require longer simulation times to generate equivalent candidate pools, as it focuses on producing configurations uniformly distributed in the CV space, which is unnecessary for developing MLIPs. Note that metadynamics was not originally designed for generating training data sets, whereas uncertainty-biased MD offers an excellent tool for this task.

Exploration rates for cell parameters of MIL-53(Al)
Figure 6 assesses the extent to which uncertainty-biased (bias stress) MD simulations enhance the exploration of the extensive volume space of MIL-53(Al). In Fig. 6 (a), we observe a higher frequency of transitions between stable states for biased MD simulations than for their zero-pressure MD counterparts. Additionally, uncertainty-biased simulations favor the exploration of smaller MIL-53(Al) volumes, in line with the results shown in Fig. 5. Figures 6 (b) and (c) present ACFs for both position and uncertainty spaces, with estimated ACTs provided in Table 2. These results indicate that bias-stress-driven MD is at least as efficient as high-pressure MD simulations in exploring both spaces. As for alanine dipeptide, a faster decay of the ACF corresponds to a smaller ACT, indicating a faster exploration of the respective space. Figure 6 (d) demonstrates the time evolution of energy, force, and stress RMSEs and reveals that local atomic environments in the large-pore state are effectively represented by those in the closed-pore state, explaining the stronger preference for smaller volumes observed in Fig. 6 (a) and Figs. 5 (d) and (e). This effect is evident from the low force and stress RMSEs in the early iterations for the large-pore state, even though this state has not yet been explored. Furthermore, uncertainty-biased simulations consistently outperform their counterparts from the early stages onward by uniformly reducing errors across the test volume space.
From these results and the findings in Fig. 5 (d), we conclude that bias-stress-driven MD simulations significantly enhance the exploration of the relevant configurational space, including rare events (i.e., transitions between stable phases). However, in Table 2, we obtained longer ACTs for biased dynamics at 300 K than for their unbiased counterparts, which seems to contradict our previous arguments. When examining the ACF shown in Fig. 7, it becomes evident that the stronger correlation in position space results from the volume fluctuations induced in MIL-53(Al) by the bias stress. These fluctuations can be represented by a sine wave with additive random noise and a period twice the simulation's length; see Methods. This observation implies that the bias stress induces correlated motions in the MIL-53(Al) system, causing it to alternately expand and contract for half of the simulation time. This phenomenon results in a periodic exploration of the boundaries of both small and large volumes within the configurational space.
In contrast to conventional approaches, including bias-forces-driven MD simulations, which aim for uncorrelated, random-walk-like behavior of predetermined CVs to capture configurational changes, our method introduces correlated motion that explores the entire configurational space without prior knowledge. Increasing the amplitude of the random noise in the sine wave reduces the amplitude of these fluctuations in the ACF, similar to raising the temperature in an atomic system. This decrease in amplitude explains why the effect is not observed in Fig. 6 (b).

Discussion
Our present study investigates a new paradigm for data set generation, facilitating the development of high-quality MLIPs for chemically complex atomic systems. We employ uncertainty-biased MD simulations to generate candidate pools for AL algorithms. Our results show, for the first time, that applying an uncertainty bias facilitates the simultaneous exploration of extrapolative regions and rare events. Efficient exploration of both is crucial for constructing comprehensive training data sets and, consequently, for developing uniformly accurate MLIPs. In contrast, classical enhanced sampling techniques (e.g., metadynamics) and unbiased MD simulations at elevated temperatures and pressures cannot simultaneously explore extrapolative regions and rare events. Enhanced sampling techniques were designed to ensure the reconstruction of the underlying Boltzmann distribution; this property, however, is unnecessary for data set generation and limits their effectiveness in this context.
Furthermore, the performance of enhanced sampling techniques depends on the manual definition of hyper-parameters, e.g., CVs for metadynamics. Setting them requires expert knowledge because a wrong choice can limit the range of explored configurations. In contrast, uncertainty-biased MD simulations only require defining an uncertainty threshold and a biasing strength. Both parameters influence the exploration rate of the configurational space without constraining the space that can be explored. The biasing strength is advantageous over defining transition temperatures and pressures, as it reduces the risk of degrading the atomic system. Furthermore, employing species-dependent biasing strengths can restrict biasing in sensitive configurational regions, e.g., the biasing of hydrogen atoms. Simultaneously, it enables targeted biasing of pre-defined atom groups, similar to metadynamics.
We compare uncertainty quantification methods, including the variance of an ensemble of MLIPs and ensemble-free methods derived from sketched gradient features, focusing on configurational space exploration rates and on generating uniformly accurate potentials; see Supplementary Information. Overall, gradient-based approaches yield MLIPs with performance similar to those created using ensemble-based uncertainties while significantly reducing the computational cost of uncertainty quantification. For MIL-53(Al), we find that ensemble-based uncertainties overestimate the force error more strongly than gradient-based approaches, resulting in earlier termination of MD simulations and potentially worse configurational space exploration. For alanine dipeptide, using an ensemble of MLIPs improves their robustness during MD simulations, facilitating CV space exploration. Therefore, improving the robustness of a single MLIP during an MD simulation, 52 combined with the proposed ensemble-free techniques, is a promising research direction.
While this study thoroughly investigates AL with uncertainty-biased MD for generating candidate pools, further research is still necessary. For example, one should analyze how well uncertainty-biased MD explores a configurational space with multiple stable states and how it identifies the respective slow modes using solely the uncertainty bias. Also, assessing the uniform accuracy of the resulting MLIPs and the enhanced exploration in higher-dimensional CV spaces remains challenging. Although uncertainty-biased MD proves efficient, it comes with an additional computational cost, increasing inference times by a factor of 1.3 to 1.7 compared to unbiased MD, mainly due to the calculation of uncertainty gradients. In cases with known CVs or transition conditions, this increase in computational cost might exceed the benefits. Lastly, unlike MD, Monte Carlo simulations generally allow significant configurational changes, eliminating the need to explore intermediate transition paths. Combined with the uncertainty bias, they might avoid exploring intermediate, low-uncertainty transition regions, improving the efficiency of uncertainty-driven data generation.

Machine-learned interatomic potentials
We define an atomic configuration as S = {r_i, Z_i}_{i=1}^{N_at}, where r_i ∈ R³ are the Cartesian coordinates and Z_i ∈ N is the atomic number of atom i, with a total of N_at atoms. Our focus lies on interatomic NN potentials, which map an atomic configuration to a scalar energy E. The mapping is denoted as f_θ: S → E ∈ R, where θ denotes the trainable parameters. By assuming the locality of interatomic interactions, we decompose the total energy of the system into individual atomic contributions 11

E(S, θ) = Σ_{i=1}^{N_at} E_i(S_i, θ), (1)

where S_i is the local environment of atom i, defined by the cutoff radius r_c. The trainable parameters θ are learned from atomic data sets containing atomic configurations and their energies, atomic forces, and stress tensors.
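As a sketch, the locality assumption reduces the total-energy evaluation to a loop over per-atom neighborhoods. Here `atomic_energy_fn` is a hypothetical stand-in for the trained network, not the authors' implementation:

```python
import numpy as np

def total_energy(positions, numbers, atomic_energy_fn, r_cut=6.0):
    """E(S) = sum_i E_i(S_i): sum atomic contributions, where each local
    environment S_i contains the neighbors within the cutoff radius r_cut."""
    energy = 0.0
    for i in range(len(positions)):
        disp = positions - positions[i]        # displacement vectors r_ij
        dist = np.linalg.norm(disp, axis=1)
        mask = (dist > 0.0) & (dist < r_cut)   # neighbors of atom i
        energy += atomic_energy_fn(disp[mask], numbers[mask])
    return energy
```

With a toy atomic-energy function that counts half a unit per neighbor, two atoms 1 Å apart contribute to each other, while an isolated atom contributes nothing.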

Gradient-based uncertainties
We quantify the uncertainty of a trained MLIP by expanding its energy per atom, E_at = E/N_at, around the locally optimal parameters θ*, 37-39

E_at(S, θ) ≈ E_at(S, θ*) + φ(S)ᵀ (θ − θ*), (2)

where S denotes an atomic configuration as defined in the previous section. The gradient features φ(S) = ∇_θ E_at(S, θ*) ∈ R^{N_feat} can be interpreted as the sensitivity of the energy to small perturbations of the parameters. Here, N_feat is the number of trainable parameters of the MLIP, and N_at is the number of atoms. We employ the energy per atom E_at in Eq. (2), as it accounts for the extensive nature of the energy, which scales proportionally with the system size. This choice ensures that uncertainties defined using gradient features do not favor the selection of larger structures. Gradient features can also be expressed as the mean of their atomic contributions: φ = Σ_{i=1}^{N_at} φ_i / N_at. For atomic gradient features φ_i, using the energy per atom in Eq. (2) is unnecessary. Here, we use φ = φ(S) and φ_i = φ_i(S_i), with S_i denoting the local environment of atom i, to simplify the notation. Thus, gradient features can be used to quantify uncertainties in total and atom-based properties of an atomic system, such as the energy and atomic forces, respectively.
In particular, we define the atom-based model's uncertainty (atomic forces) by employing squared distances between atomic gradient features,

u(S_i) = min_{φ' ∈ Φ_train} ||φ_i(S_i) − φ'||₂². (3)

Alternatively, we consider Bayesian linear regression in Eq. (2) and compute the posterior uncertainty as

u(S)² = λ² φ(S)ᵀ (Φ_trainᵀ Φ_train + λ² I)⁻¹ φ(S), (4)

where λ is the regularization strength. Here, we define Φ_train = φ_i(X_train) ∈ R^{(N_at · N_train) × N_feat}, with X_train denoting the local atomic environments of configurations in the training set of size N_train. In this work, we refer to these uncertainties as distance- and posterior-based uncertainties. Equivalent results can be obtained for total (energy) uncertainties, employing the gradient features φ = Σ_{i=1}^{N_at} φ_i / N_at. Calculating uncertainties using gradient features is computationally challenging, especially for the posterior-based approach, for which a single uncertainty evaluation scales as O(N_feat²). Therefore, we employ the sketching technique 40 to reduce the dimensionality of gradient features, i.e., φ_i^rp = U φ_i ∈ R^{N_rp}, with N_rp and U ∈ R^{N_rp × N_feat} denoting the number of random projections and a random matrix, respectively. 38,39 In previous work, 38 we have observed that uncertainties derived from sketched gradient features demonstrate a better correlation with RMSEs of related properties than those based on last-layer features. 37,53,54 More details on sketched gradient features can be found in the following sections. Atom-based uncertainties, defined by distances between gradient features, scale linearly with both the system size and the number of training structures, i.e., as O(N_at N_train). Consequently, they require an additional approximation to ensure computational efficiency. To address this, we employed a batch selection algorithm that maximizes distances within the training set, allowing us to identify the most representative subset of atomic gradient features; see the following sections.
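A minimal NumPy sketch of the two uncertainty measures, using random toy vectors in place of real (sketched) gradient features; function names are illustrative:

```python
import numpy as np

def posterior_uncertainty(phi, Phi_train, lam):
    """Posterior-based uncertainty of Bayesian linear regression in
    (sketched) gradient-feature space:
    u(S)^2 = lam^2 * phi^T (Phi^T Phi + lam^2 I)^(-1) phi."""
    A = Phi_train.T @ Phi_train + lam**2 * np.eye(Phi_train.shape[1])
    return lam**2 * phi @ np.linalg.solve(A, phi)

def distance_uncertainty(phi_i, Phi_train):
    """Distance-based atomic uncertainty: squared Euclidean distance to
    the closest training feature vector."""
    return ((Phi_train - phi_i) ** 2).sum(axis=1).min()
```

A feature vector that coincides with a training point has zero distance-based uncertainty, while features far outside the training set yield large values for both measures.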

Uncertainty-biased molecular dynamics
Following previous works, 33,34 we define a biased energy as

Ẽ(S, θ) = E(S, θ) − τ u(S, θ), (5)

where τ denotes the biasing strength. The negative sign ensures that the negative gradients of the bias term with respect to atomic positions (bias forces) drive the system toward high-uncertainty regions; see Fig. 1 (c). In this work, we use AD to compute the bias force acting on atom i, −∇_{r_i}[−τ u(S, θ)] = τ ∇_{r_i} u(S, θ), with atomic positions r_i. The total biased force on atom i reads

F̃_i(S, θ) = −∇_{r_i} E(S, θ) + τ ∇_{r_i} u(S, θ). (6)

These biased forces can be used in MD simulations, e.g., in the canonical (NVT) statistical ensemble, to bias the exploration of the configurational space.
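The following toy sketch illustrates how the bias force pulls the dynamics toward high uncertainty. The Gaussian `uncertainty` surface and the value of `TAU` are illustrative, and the gradient is taken by central finite differences instead of the AD used in practice:

```python
import numpy as np

TAU = 0.5                              # biasing strength (illustrative)
R_HIGH = np.array([1.0, 0.0, 0.0])     # toy high-uncertainty location

def uncertainty(r):
    """Toy stand-in for the MLIP uncertainty u(S, theta): a Gaussian
    bump centered on a high-uncertainty region."""
    return np.exp(-np.sum((r - R_HIGH) ** 2))

def bias_force(r, eps=1e-6):
    """Bias force tau * grad_r u(r); real implementations obtain the
    gradient with automatic differentiation."""
    grad = np.zeros(3)
    for k in range(3):
        dr = np.zeros(3)
        dr[k] = eps
        grad[k] = (uncertainty(r + dr) - uncertainty(r - dr)) / (2 * eps)
    return TAU * grad
```

An atom at the origin feels a bias force with a positive x-component, i.e., it is pulled toward the high-uncertainty region.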
In the case of bulk atomic systems, the configurational space often includes variations in the cell parameters, which define the shape and size of the unit cell, necessitating their enhanced exploration. For this purpose, we propose the concept of bias stress, defined by

σᵇ(S, θ) = −(τ/V) ∂u(S, θ)/∂ε |_{ε=0}, (7)

with V denoting the volume of the periodic cell. This expression is motivated by the definition of the stress tensor. 55 Here, u(S, θ) denotes the uncertainty after a strain deformation of the bulk atomic system with the symmetric tensor ε ∈ R^{3×3}, i.e., r' = (1 + ε)·r. The calculation of the bias stress is straightforward with AD. The total biased stress reads

σ̃(S, θ) = (1/V) ∂E(S, θ)/∂ε |_{ε=0} + σᵇ(S, θ).

The bias stress tensor in Eq. (7) effectively reduces the internal pressure in the bulk atomic system. We propose combining the bias stress tensor with MD simulations conducted under constant-pressure conditions (NpT statistical ensemble) to enhance the data-driven exploration of cell parameters and pressure-induced transitions in bulk materials.
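A sketch of the bias stress as a strain derivative, again with finite differences standing in for the AD used in practice; `uncertainty_fn` and all names are hypothetical placeholders:

```python
import numpy as np

def bias_stress(positions, cell, uncertainty_fn, tau, eps=1e-6):
    """Bias stress sigma^b = -(tau/V) * d u / d strain at zero strain,
    evaluated with symmetric finite differences of the deformation
    r -> (1 + strain) r; serves as a numerical check of an AD result."""
    volume = abs(np.linalg.det(cell))
    sigma = np.zeros((3, 3))
    for a in range(3):
        for b in range(a, 3):
            strain = np.zeros((3, 3))
            strain[a, b] = strain[b, a] = eps
            u_plus = uncertainty_fn(positions @ (np.eye(3) + strain).T)
            u_minus = uncertainty_fn(positions @ (np.eye(3) - strain).T)
            sigma[a, b] = sigma[b, a] = -(tau / volume) * (u_plus - u_minus) / (2 * eps)
    return sigma
```

For a toy uncertainty u = Σ_i ||r_i||² in a unit cell, the diagonal components can be checked analytically (d/dε of (1+ε)² r_a² gives 2 r_a² at ε = 0).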
Uncertainty gradients exhibit different magnitudes than energy gradients. Thus, re-scaling the uncertainty gradients is necessary to ensure consistent driving toward uncertain regions. Building upon the approach introduced in Ref. 34, we implement a re-scaling technique that monitors the magnitudes of both the actual and the bias forces (alternatively, the actual and bias stresses) over N steps and then computes the ratio between them. To re-scale the bias forces, we use the following expression

τ ∇_{r_i} u ← [Σ_{n=1}^{N} ||F⁽ⁿ⁾||₂ / Σ_{n=1}^{N} ||τ ∇_r u⁽ⁿ⁾||₂] τ ∇_{r_i} u,

where F⁽ⁿ⁾ and ∇_r u⁽ⁿ⁾ collect the actual forces and uncertainty gradients of all atoms at step n. An equivalent expression is applied for the bias stresses.
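A sketch of the monitoring-and-rescaling scheme; the class name and window handling are illustrative, not the authors' implementation:

```python
import numpy as np
from collections import deque

class BiasRescaler:
    """Re-scale bias forces by the ratio of mean magnitudes of actual
    and bias forces over the last n_steps MD steps; all past steps in
    the window are weighted equally, as described in the text."""

    def __init__(self, n_steps=100):
        self.true_norms = deque(maxlen=n_steps)
        self.bias_norms = deque(maxlen=n_steps)

    def __call__(self, forces, bias_forces):
        self.true_norms.append(np.linalg.norm(forces))
        self.bias_norms.append(np.linalg.norm(bias_forces))
        ratio = np.mean(self.true_norms) / (np.mean(self.bias_norms) + 1e-12)
        return ratio * bias_forces
```

After rescaling, the bias-force magnitude matches the running magnitude of the actual forces, so the bias drives the dynamics without overwhelming the physical interactions.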
The re-scaling of uncertainty gradients is reminiscent of the AdaGrad algorithm, 56 which dynamically adjusts the learning rate (analogous to the biasing strength) based on historical gradients from previous iterations. While incorporating momentum through exponential moving averages can improve the AdaGrad approach, treating all past gradients with equal weight is essential within the context of this study. Our attempts to damp learning along directions with high curvature (high-frequency oscillations), similar to the Adam optimizer, 57 did not yield improved performance. We further find that employing species-dependent biasing strengths for the bias forces, τ → τ_{Z_i}, with a particular emphasis on damping the biasing of hydrogen atoms, improves the efficiency of biased MD simulations.
We employ biased MD simulations to generate a candidate pool for AL, as depicted in Fig. 1 (a). To further enhance the exploration of the relevant configurational space and improve the computational efficiency of AL, we employ multiple parallel MD simulations. We expect biased MD simulations to have relatively short auto-correlation times, obtained from position and uncertainty auto-correlation functions. Short auto-correlation times imply that the generated candidates are less correlated than those generated with unbiased MD simulations. However, we cannot guarantee the generation of uncorrelated samples with biased MD simulations throughout AL, particularly in later AL iterations when the uncertainty level is reduced. Therefore, we propose to use batch selection algorithms (see later sections) that select N_batch > 1 samples at once. These algorithms enforce the informativeness and diversity of the selected atomic configurations and of the resulting training data set.
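The overall AL loop can be sketched schematically with toy stand-ins for training, biased MD, batch selection, and reference labeling (all function bodies here are hypothetical placeholders, chosen only to make the skeleton runnable):

```python
import random
random.seed(0)

# Toy stand-ins: the "MLIP" is a mean predictor, "biased MD" draws
# perturbed samples, labeling is an exact toy function.
def train_mlip(data):
    return sum(x for x, _ in data) / len(data)

def run_biased_md(mlip, threshold, n=5):
    # threshold is unused in this stub; a real walker would stop once
    # the uncertainty exceeds it.
    return [mlip + random.uniform(-2, 2) for _ in range(n)]

def label(x):
    return x * x

def select_batch(pool, n_batch):
    # Stand-in for diversity/uncertainty-aware selection: pick extremes.
    return sorted(pool, key=abs, reverse=True)[:n_batch]

def active_learning(initial_data, n_iter=3, n_walkers=2, n_batch=2):
    train_set = list(initial_data)
    for _ in range(n_iter):
        mlip = train_mlip(train_set)       # retrain on current data
        pool = []
        for _ in range(n_walkers):         # parallel biased MD walkers
            pool.extend(run_biased_md(mlip, threshold=1.5))
        batch = select_batch(pool, n_batch)
        train_set.extend((x, label(x)) for x in batch)
    return train_set
```

Each iteration grows the training set by `n_batch` labeled configurations, mirroring the loop in Fig. 1 (a).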

Gaussian moment neural network
This work uses the Gaussian moment neural network (GM-NN) approach for modeling interatomic interactions. 15,17 GM-NN employs an artificial NN to map a local atomic environment S_i to the atomic energy E_i(S_i, θ); see Eq. (1). It uses a fully connected feed-forward NN with two hidden layers, 15,17

y_i(G_i, θ) = 0.1 b⁽³⁾ + (1/√d_2) W⁽³⁾ x⁽²⁾, with x⁽ˡ⁾ = φ(0.1 b⁽ˡ⁾ + (1/√d_{l−1}) W⁽ˡ⁾ x⁽ˡ⁻¹⁾) for l = 1, 2 and x⁽⁰⁾ = G_i, (9)

with W⁽ˡ⁺¹⁾ ∈ R^{d_{l+1} × d_l} and b⁽ˡ⁺¹⁾ ∈ R^{d_{l+1}} representing the weights and biases of layer l + 1. In this work, we employ an NN with d_0 = 910 input neurons (corresponding to the dimension of the input feature vector), d_1 = d_2 = 512 hidden neurons, and a single output neuron, d_3 = 1. The network's weights W⁽ˡ⁺¹⁾ are initialized by selecting entries from a normal distribution with zero mean and unit variance.
The trainable bias vectors b⁽ˡ⁺¹⁾ are initialized to zero. To improve the accuracy and convergence of the GM-NN model, we implement a neural tangent parameterization (the factors of 0.1 and 1/√d_l in Eq. (9)). 58 For the activation function φ, we use the Swish/SiLU function. 59,60 To aid the training process, we scale and shift the output of the NN,

E_i(S_i, θ) = c ρ_{Z_i} y_i(G_i, θ) + μ_{Z_i},

where the trainable shift parameters μ_{Z_i} are initialized by solving a linear regression problem, and the trainable scale parameters ρ_{Z_i} are initialized to one. The per-atom RMSE of the regression solution determines the constant c. 17 GM-NN models employ a Gaussian moment (GM) representation to encode the invariance of the total energy with respect to translations, rotations, and permutations of atoms of the same species. 15 By computing pairwise distance vectors r_ij = r_i − r_j and splitting them into radial and angular components, denoted as r_ij = ||r_ij||₂ and r̂_ij = r_ij/r_ij, respectively, we obtain GMs as follows

Ψ_{i,L,s} = Σ_{j≠i} R_{Z_i,Z_j,s}(r_ij, β) r̂_ij^{⊗L}, (11)

where r̂_ij^{⊗L} = r̂_ij ⊗ ⋯ ⊗ r̂_ij is the L-fold outer product. The nonlinear radial functions R_{Z_i,Z_j,s}(r_ij, β) are defined as a sum of Gaussian functions Φ_{s'}(r_ij) (N_Gauss = 9 in this work), 17

R_{Z_i,Z_j,s}(r_ij, β) = (1/√N_Gauss) Σ_{s'=1}^{N_Gauss} β_{Z_i,Z_j,s,s'} Φ_{s'}(r_ij).

The factor 1/√N_Gauss impacts the effective learning rate, inspired by the neural tangent parameterization. 58 The radial functions are centered at equidistantly spaced grid points ranging from r_min = 0.5 Å to r_c, set to 5.0 Å and 6.0 Å for alanine dipeptide and MIL-53(Al), respectively. The radial functions are re-scaled by a cosine cutoff function 11 to ensure a smooth dependence on the number of atoms within the cutoff sphere. Chemical information is embedded in the GM representation through the trainable parameters β_{Z_i,Z_j,s,s'}, with the index s iterating over the number of independent radial basis functions (N_basis = 7 in this work).
Features invariant to rotations, G_i, are obtained by computing full tensor contractions of the tensors defined in Eq. (11), e.g., 15,17

G_{i,1} = Ψ_{i,0,s}, G_{i,2} = (Ψ_{i,1,s₁})_a (Ψ_{i,1,s₂})_a, G_{i,3} = (Ψ_{i,2,s₁})_{ab} (Ψ_{i,2,s₂})_{ab},

where we use Einstein notation, i.e., the right-hand sides are summed over a, b ∈ {1, 2, 3}. Specific full tensor contractions are defined by using generating graphs. 61 In a practical implementation, we compute all GMs at once and reduce the number of invariant features based on the permutational symmetries of the respective graphs. All parameters θ = {W, b, β, ρ, μ} of the NN are optimized by minimizing the combined squared loss on the training data,

L(θ) = Σ_k [ C_e |E(S⁽ᵏ⁾, θ) − E_ref⁽ᵏ⁾|² + C_f Σ_{i=1}^{N_at⁽ᵏ⁾} ||F_i(S⁽ᵏ⁾, θ) − F_{i,ref}⁽ᵏ⁾||₂² + C_s ||σ(S⁽ᵏ⁾, θ) − σ_ref⁽ᵏ⁾||₂² ].

We have chosen C_e = 1.0, C_f = 4.0 Å², and C_s = 0.01 to balance the relative contributions of energies, forces, and stresses, respectively. Using AD, we compute atomic forces as negative gradients of the total energy with respect to atomic coordinates, F_i(S⁽ᵏ⁾, θ) = −∇_{r_i} E(S⁽ᵏ⁾, θ).
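A toy illustration of Gaussian moments and their full tensor contractions, showing the rotational invariance the representation encodes. A single fixed Gaussian radial function replaces the trainable, species-dependent bases of the real model, and function names are illustrative:

```python
import numpy as np

def moments(disp, gamma=1.0):
    """Toy Gaussian moments Psi_L for one atom, L = 0, 1, 2:
    Psi_L = sum_j R(r_ij) * rhat_ij^{(x)L} with R a single Gaussian."""
    r = np.linalg.norm(disp, axis=1)
    rhat = disp / r[:, None]
    R = np.exp(-gamma * r ** 2)
    return {
        0: R.sum(),
        1: (R[:, None] * rhat).sum(axis=0),
        2: (R[:, None, None] * rhat[:, :, None] * rhat[:, None, :]).sum(axis=0),
    }

def invariants(psi):
    """Example full tensor contractions; each is invariant under
    rotations of the local environment."""
    return np.array([
        psi[0],                                # scalar moment
        psi[1] @ psi[1],                       # vector-vector contraction
        np.einsum("ab,ab->", psi[2], psi[2]),  # matrix-matrix contraction
    ])
```

Rotating all neighbor displacement vectors leaves the contracted features unchanged, which is exactly the invariance the GM representation is built to provide.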
Furthermore, we use AD to compute the stress tensor, defined by 55

σ(S⁽ᵏ⁾, θ) = (1/V_k) ∂E(S⁽ᵏ⁾, θ)/∂ε |_{ε=0},

where E(S⁽ᵏ⁾, θ) is the total energy after a strain deformation with the symmetric tensor ε ∈ R^{3×3}, i.e., r' = (1 + ε)·r, and V_k is the volume of the periodic cell. As the stress tensor is symmetric, we use only its upper triangular part in the loss function.
We employ the Adam optimizer 57 to minimize the loss function. The respective parameters of the optimizer are β₁ = 0.9, β₂ = 0.999, and ε = 10⁻⁷. Usually, we work with a mini-batch of 32 molecules. However, smaller mini-batches were used in the initial AL iterations because the training data sizes were less than 32. The layer-wise learning rates are decayed linearly. The initial values are set to 0.03 for the parameters of the fully connected layers, 0.02 for the trainable representation, and 0.05 and 0.001 for the shift and scale parameters of atomic energies, respectively. The training is performed for 1000 epochs. To prevent overfitting during training, we employ the early stopping technique. 62 All models are trained using PyTorch. 45

Sketched gradient features
We obtain atomic gradient features by computing gradients of Eq. (1) with respect to the parameters of the fully connected layers in Eq. (9). In particular, we make use of the product structure of atomic gradient features. To obtain the latter, we re-write the network in Eq. (9) as follows

z⁽ˡ⁾ = 0.1 b⁽ˡ⁾ + (1/√d_{l−1}) W⁽ˡ⁾ x⁽ˡ⁻¹⁾, x⁽ˡ⁾ = φ(z⁽ˡ⁾),

where z⁽ˡ⁾ and x⁽ˡ⁾ denote the pre- and post-activation vectors of layer l and x⁽⁰⁾ = G_i. The gradients with respect to the weights then factorize into outer products of the backpropagated pre-activation derivatives and the layer inputs, and the atomic gradient features read

∂E_i/∂W⁽ˡ⁾ = (1/√d_{l−1}) (∂E_i/∂z⁽ˡ⁾) (x⁽ˡ⁻¹⁾)ᵀ,

concatenated over the layers l. To make the calculation of gradient features computationally tractable, we employ the random projections (sketching) technique, 40 as proposed in Refs. 38, 39. For atomic gradient features φ_i(S_i) ∈ R^{N_feat} and a random matrix U ∈ R^{N_rp × N_feat}, with N_feat and N_rp representing the number of atomic features and random projections, respectively, we can define randomly projected atomic gradient features as

φ_i^rp(S_i) = U φ_i(S_i) ∈ R^{N_rp}.

While a Gaussian sketch could be employed, where the elements of U are drawn from standard normal distributions, we use a tensor sketching approach that is more runtime and memory efficient. 39 Specifically, denoting the element-wise (Hadamard) product as ⊙ and exploiting the outer-product structure of the per-layer gradients, we compute

φ_i^rp(S_i) = Σ_l (U⁽ˡ'¹⁾ ∂E_i/∂z⁽ˡ⁾) ⊙ (U⁽ˡ'²⁾ x⁽ˡ⁻¹⁾), (20)

where U⁽ˡ'¹⁾ and U⁽ˡ'²⁾ are independent random matrices. For atom-based uncertainties, we can directly use the sketched atomic gradient features. For (total) uncertainties per atom, we need to work with the mean φ(S) = Σ_{i=1}^{N_at} φ_i(S_i)/N_at. Since the individual projections (rows of Eq. (20)) are linear in the features, we obtain for the (total) gradient features 38 φ^rp(S) = Σ_{i=1}^{N_at} φ_i^rp(S_i)/N_at, given that all individual random projections use the same random matrices.
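A sketch of feature sketching with a plain Gaussian random matrix; the tensor sketch used in the paper is more runtime and memory efficient but shares the linearity exploited below. Toy random vectors stand in for real gradient features:

```python
import numpy as np

rng = np.random.default_rng(1)
n_feat, n_rp, n_at = 2048, 256, 8

# Toy atomic gradient features (random stand-ins for phi_i(S_i)).
phi = rng.normal(size=(n_at, n_feat))

# Gaussian sketch: a dense random matrix U maps N_feat-dimensional
# features to N_rp dimensions while roughly preserving distances.
U = rng.normal(size=(n_rp, n_feat)) / np.sqrt(n_rp)
phi_rp = phi @ U.T                   # sketched atomic features

# Because the projection is linear, the total (per-atom) feature can be
# sketched by averaging the sketched atomic features.
phi_total_rp = phi_rp.mean(axis=0)
```

Pairwise feature distances, the quantity the distance-based uncertainty relies on, are approximately preserved by the sketch.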

Ensemble-based uncertainty quantification
The variance of the predictions of individual models in an ensemble of MLIPs can be used to quantify their uncertainty. Thus, we define the variance of the predicted energy as

u_E²(S) = (1/M) Σ_{m=1}^{M} (E_m(S) − Ē(S))²,

where M is the number of models in the ensemble. The variance of the atomic forces reads

u_F²(S_i) = (1/(3M)) Σ_{m=1}^{M} ||F_{i,m}(S) − F̄_i(S)||₂².

Here, Ē and F̄_i denote the arithmetic means of the predictions of the individual models. Our experiments demonstrated that M = 3 is sufficient to obtain good performance. Using larger ensembles would make ensemble-based uncertainty quantification even more computationally inefficient relative to the gradient-based alternatives.

Batch selection methods
The simplest batch selection method queries points only by their uncertainty values. Specifically, given the already selected structures X_batch from an unlabeled pool X_pool, we select the next point by

S = argmax_{S' ∈ X_pool \ X_batch} u(S'),

until N_batch > 1 structures are selected. In this work, we use this selection method in combination with ensemble-based uncertainties.
For the posterior-based uncertainty, we can constrain the diversity of the selected batch by using the posterior covariance between structures,

Cov(S, S') = λ² φ(S)ᵀ (Φ_trainᵀ Φ_train + λ² I)⁻¹ φ(S'), (25)

with Φ_train = φ(X_train). The corresponding method greedily selects structures, i.e., one structure per iteration, such that the determinant of the covariance matrix is maximized, 38,39,63

S = argmax_{S' ∈ X_pool \ X_batch} det Cov(X_batch ∪ {S'}).

For the distance-based uncertainty, we ensure the diversity of the acquired batch by greedily selecting structures with a maximum distance to all previously selected and training data points. The respective selection method reads 38,39,64

S = argmax_{S' ∈ X_pool \ X_batch} min_{S'' ∈ X_train ∪ X_batch} ||φ(S') − φ(S'')||₂².

We also applied this batch selection method to define the most representative subset of atomic gradient features when calculating atom-based uncertainties using feature-space distances.
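The greedy distance-based selection can be sketched as follows (illustrative names; a quadratic-time reference scan is used for clarity rather than efficiency):

```python
import numpy as np

def select_batch_max_distance(Phi_pool, Phi_ref, n_batch):
    """Greedy distance-based batch selection: repeatedly pick the pool
    structure whose feature vector has the largest minimum squared
    distance to all training and already-selected features."""
    ref = list(Phi_ref)                  # training + selected features
    selected = []
    remaining = list(range(len(Phi_pool)))
    for _ in range(n_batch):
        d2 = [min(np.sum((Phi_pool[i] - r) ** 2) for r in ref)
              for i in remaining]
        best = remaining[int(np.argmax(d2))]
        selected.append(best)
        ref.append(Phi_pool[best])       # enforce diversity of the batch
        remaining.remove(best)
    return selected
```

The first pick is the farthest point from the training set; subsequent picks also avoid the points already chosen, which enforces diversity within the batch.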

Figure 1. A schematic overview of an AL algorithm for MLIP training. Training structures are selected from data gathered during biased or unbiased MD simulations. (a) An AL experiment begins with training an MLIP in the first iteration using a small set of randomly perturbed initial configurations. The current MLIP is employed in each iteration to run parallel MD simulations. Each simulation continues until it reaches a predefined uncertainty threshold. Then, a batch of configurations is selected from all trajectories. Reference energies and forces of these samples are evaluated using a DFT solver, updating the training data set. The updated data set is employed for training the MLIP in the next iteration. (b) Adaptive biasing strategies like metadynamics enhance the exploration of the configurational space. In metadynamics, exploration along manually defined CVs is facilitated by adding Gaussian functions to a history-dependent bias (areas filled by blue, orange, and red colors). However, even for well-defined CVs, exploring the configurational space of interest may require long simulation times due to the diffusive motion along these CVs. (c) Uncertainty-biased MD aims to minimize the uncertainty u (grey shaded area) related to the actual error, thereby facilitating the exploration of the configurational space. In uncertainty-biased MD, we subtract the MLIP's energy uncertainty from the predicted energy (continuous black line) and run MD simulations using the altered energy surface (dashed black line). Curved lines denote distinct MD trajectories. Unlike metadynamics, uncertainty-biased MD operates without defining CVs and drives MD simulations toward high-uncertainty regions in each iteration.

Figure 2 (top) demonstrates results for MLIPs trained on 45 MIL-53(Al) configurations, while five samples were additionally used for early stopping and uncertainty calibration.

Figure 2 (bottom) shows the results for MLIPs trained and validated on 450 and 50 MIL-53(Al) configurations, respectively. In both experiments, the training and validation samples were selected from the data sets provided by Ref. 32.

Figure 2. Correlation of maximal atom-based uncertainties with maximal atomic force RMSEs for MIL-53(Al). The results are presented for the test data set from Ref. 32. All uncertainty quantification methods are calibrated using CP and atomic force RMSEs. The top row shows the results of MLIPs trained using 45 atomic configurations, while five are additionally used for early stopping and uncertainty calibration. The bottom row shows the results obtained with 450 and 50 MIL-53(Al) configurations, respectively. The training and validation data are taken from Ref. 32. Transparent hexbin points represent uncertainties calibrated with α = 0.5 (low confidence; see Methods), while opaque ones denote uncertainties calibrated with α = 0.05 (high confidence). Calibrating uncertainties with a high confidence level helps align the largest actual error with the corresponding uncertainty, shifting the hexbin points to or below the red diagonal line. This alignment is crucial for identifying unreliable predictions and prompting the termination of MD simulations.

Figure 3. Comparison of AL approaches employing biased and unbiased MD simulations to generate the candidate pool of atomic configurations for alanine dipeptide. Results are provided for the posterior-based uncertainty quantification derived from sketched gradient features. Unlike unbiased MD simulations, which rely on atom-based uncertainties to terminate MD simulations, biased MD simulations use both total and atom-based uncertainties for biasing MD simulations and prompting their termination, respectively. We use three metrics to assess the performance of our AL approaches: (a) coverage of the CV space; (b) energy RMSE; and (c) force RMSE. All RMSEs are evaluated on the alanine dipeptide test data set; see Methods. Shaded areas denote the standard deviation across five independent runs. The alanine dipeptide molecule, including its CVs, is shown as an inset in (a). The color code of the inset molecule is C grey, O red, N blue, and H white. (d) Ramachandran plots demonstrating the CV spaces explored by the four AL experiments. Biased MD simulations achieve exceptional performance, close to that of MD simulations conducted at 1200 K, without a priori knowledge of the transition temperatures between stable states. The CV space covered by uncertainty-biased MD simulations at 300 K matches that of unbiased simulations at 1200 K, significantly outperforming the coverage achieved by unbiased MD simulations at 300 K and 600 K.

Figure 4. Evaluation of CV space exploration rates using biased and unbiased MD simulations for alanine dipeptide. Here, MD simulations generate candidate pools of atomic configurations for AL algorithms. Results are provided for the posterior-based uncertainty quantification derived from sketched gradient features. Unlike unbiased MD simulations, which rely on atom-based uncertainties to terminate MD simulations, biased MD simulations use both total and atom-based uncertainties for biasing MD simulations and prompting their termination, respectively. We use three metrics to assess the exploration rates: (a) coverage of the CV space; (b) auto-correlation functions of atomic positions; and (c) auto-correlation functions of atom-based uncertainties. Shaded areas denote the standard deviation across five independent runs. (d) Time evolution of the maximal atom-based uncertainty within an AL iteration and the entire experiment. Time evolution is shown for one of the eight MD simulations. The dashed gray line represents the uncertainty threshold of 1.5 eV/Å. The insets show configurations that reached the uncertainty threshold for uncertainty-biased MD. (e) Ramachandran plots illustrate the exploration of the CV space over AL iterations and the entire experiment. Ramachandran plots are presented for unbiased MD simulations at 300 K and 1200 K and biased MD simulations at 300 K. Simulation time refers to the effective number of MD steps (× 0.5 fs) required to reach the final coverage, while lag time denotes the time interval between two successive MD frames. Biased MD simulations at 300 K exhibit at least two times higher exploration rates than their unbiased counterparts at 300 K and 600 K. Their exploration rates are comparable to those of unbiased MD simulations at 1200 K, with the advantage of gradually distorting the molecule, reducing the risk of its degradation.

Figure 5. Comparison of AL approaches employing biased and unbiased MD simulations to generate the candidate pool of atomic configurations for MIL-53(Al). Results are provided for the posterior-based uncertainty quantification derived from sketched gradient features. Unlike unbiased MD simulations, which rely on atom-based uncertainties to terminate MD simulations, biased MD simulations use both total and atom-based uncertainties for biasing MD simulations and prompting their termination, respectively. We use three metrics to assess the performance of our AL approaches: (a) energy RMSE; (b) force RMSE; and (c) stress RMSE. All RMSEs are evaluated on the MIL-53(Al) test data set. 32 Shaded areas denote the standard deviation across three independent runs, except for metadynamics, for which shaded areas denote the standard deviation across three randomly initialized MLIPs. (d) Volume distribution for atomic configurations acquired during MD at 600 K, along with volume-dependent energy, force, and stress RMSEs. (e) Volume distribution for configurations acquired during MD at 300 K, along with volume-dependent energy, force, and stress RMSEs. We employ a temperature of 300 K to reduce the probability of exploring the large-pore state of MIL-53(Al). Bias-stress-driven MD simulations outperform metadynamics-based simulations with adaptive biasing of the cell parameters. Metadynamics aims to cover the volume space uniformly; in contrast, uncertainty-biased MD generates training data sets that reduce force and stress RMSEs uniformly. Additionally, biased MD simulations enhance the exploration of the closed- and large-pore states of MIL-53(Al) shown in the inset of (d).

Figure 6. Evaluation of configurational space exploration rates using biased and unbiased MD simulations for MIL-53(Al). Here, MD simulations generate candidate pools of atomic configurations for AL algorithms. Results are provided for the posterior-based uncertainty quantification derived from sketched gradient features. Unlike unbiased MD simulations, which rely on atom-based uncertainties to terminate MD simulations, biased MD simulations use both total and atom-based uncertainties for biasing MD simulations and prompting their termination, respectively. We use three metrics to assess the exploration rates: (a) volume distribution of configurations sampled throughout the experiment; (b) auto-correlation functions for positions; and (c) auto-correlation functions for atom-based uncertainties. Shaded areas denote the standard deviation across three independent runs. (d) Time evolution of the volume distribution of configurations acquired during training and of energy, force, and stress RMSEs evaluated on the test data set depending on the unit cell volume. Bias-stress-driven MD simulations achieve exploration rates comparable to those of high-pressure unbiased MD simulations. They efficiently reduce RMSEs uniformly across the entire volume space, even in the early stages of AL, surpassing the performance of unbiased simulations.

Figure 7. ACF for positions obtained by running biased and unbiased MD simulations at 300 K for MIL-53(Al). Shaded areas denote the standard deviation across three independent runs. We employ a temperature of 300 K to reduce the probability of exploring the large-pore state of MIL-53(Al). The ACF exhibits strongly correlated motions attributed to the volume fluctuations induced by the bias stress. These fluctuations can be modeled by a sine wave with a period twice the length of the simulation. The red line denotes a sine wave with a larger noise amplitude than the one denoted by the blue line.