Accelerating off-lattice kinetic Monte Carlo simulations to predict hydrogen vacancy-cluster interactions in $\alpha$-Fe

We present an enhanced off-lattice kinetic Monte Carlo (OLKMC) model, based on a new method for tolerant classification of atomistic local-environments that is invariant under Euclidean-transformations and permutations of atoms. Our method ensures that environments within a norm-based tolerance are classified as equivalent. During OLKMC simulations, our method guarantees to elide the maximum number of redundant saddle-point searches in symmetrically equivalent local-environments. Hence, we are able to study the trapping/detrapping of hydrogen from up to five-vacancy clusters and simultaneously the effect hydrogen has on the diffusivity of these clusters. These processes occur at vastly different timescales at room temperature in body-centred cubic iron. We are able to predict the diffusion pathways of clusters/complexes without a priori assumptions of their mechanisms, not only reproducing previously reported mechanisms but also discovering new ones for larger complexes. We detail the hydrogen-induced changes in the clusters' diffusion mechanisms and find evidence that, in contrast to mono-vacancies, the introduction of hydrogen to larger clusters can increase their diffusivity. We compare the effective hydrogen diffusivity to Oriani's classical theory of trapping, finding general agreement and some evidence that hydrogen may not always be in equilibrium with traps, when the traps are mobile. Finally, we are able to compute the trapping atmosphere of meta-stable states surrounding non-point traps, opening new avenues to better understand and predict hydrogen embrittlement in complex alloys.


Introduction
It has been known for over 100 years [1,2] that the presence of hydrogen (H) in metalsparticularly steels -can severely reduce ductility, leading to catastrophic failure below the yieldstress.The processes that cause these effects are collectively termed hydrogen embrittlement (HE).Despite a century of research, the core mechanisms of HE have yet to be fully understood and are still a topic of active research/debate [3].The difficulty in understanding HE stems from its multi-scale nature; a full description of HE requires understanding of H-adsorption, H-diffusion/transport, and (most crucially) H interaction/influence with/on crystal defects.These processes span many orders of length/time scales, which presents challenges when isolating/connecting the impact of H at the atomistic scale to the macroscopic results.
A breadth of mechanisms for HE have been proposed, most of them revolve around the interactions between H and crystal defects.For a more complete description see Ref. 4 and Ref. 3. A few of the most prominent mechanisms are: Hydrogen-induced decohesion (HID) [5][6][7], Adsorption-induced decohesion (AIDE) [8], Hydrogen-enhanced localised plasticity (HELP) [9][10][11][12][13] and Hydrogen-enhanced strain-induced vacancy (HESIV) formation [14][15][16].Many of these mechanisms are supported by bodies of experimental work.As few are orthogonal to each other, it is likely that a full description of HE contains a combination of two/three of these mechanisms (alongside some yet undiscovered).
Theoretical and computational modelling play a crucial role in the study of the Fe-H system due to the inherent difficulty in experimental observations of atomic H [17].The low solubility and high diffusivity [18] of H in body-centred cubic (BCC) iron (Fe), combined with the small 'nucleus' and low electron density, make direct experimental observations extremely challenging.Instead, techniques such as thermal desorption analysis (TDA) [19], electro-permeation (EP) experiments [20] and atom probe tomography (APT) [21] are employed.Many of these methods (with the notable exception of APT) are unable to directly investigate H diffusion and trapping behaviour within metals on the atomic scale thus, we must fall back to computation/theory to unravel the atomic mechanisms that cause HE.Different computational modelling techniques have been used to investigate HE over varied assumptions and time/length scales.On the smallest length-scales, density functional theory (DFT) is used to study H binding sites [22][23][24] and occasionally combined with molecular dynamic (MD) in ab initio MD to study H diffusion at the highest accuracy [25].Additionally, work has been done using path-integral MD [26,27] to explore H diffusion in iron while incorporating quantum effects, which are known to be important at low temperatures [26].Nevertheless, progress has been made modelling much-larger systems using classical approaches, the most popular of these is MD and its accelerated-variants using semi-empirical potentials.This has enabled the study of H-defect kinetics, such as grain-boundaries [28,29] and dislocations [30,31].Molecular dynamics simulations must resolve atomic vibrations in order to accurately track the dynamics of atom-scale systems.This imposes a significant computational effort as the integration time-step must be ofthe-order of these vibrations.Hence, even using today's computers, MD simulation timescales rarely exceed O (100µs).Continuing to the longest/largest scales, Monte-Carlo (MC) [32] and kinetic Monte-Carlo (KMC) [33] methods overcome this barrier by ignoring the explicit phasespace trajectory, instead focusing on state→state transitions.This can significantly accelerate simulations.However, these methods are confined to discrete representation and require knowledge of mechanisms a priori.
Off-lattice 1 kinetic Monte Carlo (OLKMC) [34], an extended KMC method, is a general and unbiased tool (discovering mechanisms without any a priori input) being successfully applied to study the kinetics of various systems, e.g.Fe/Cu/Al, BBC/FCC, disordered systems, extended defects, point defects, etc. [35][36][37][38][39]. Off-lattice KMC allows for the exploration of continuous systems, at previously inaccessible timescales, at atomic fidelity.As such, it is the perfect tool to explore the uncertain and complex mechanisms controlling HE.
In this paper we develop enhancements to the OLKMC method.The motivation for our research is building a general simulation framework capable of modelling the complex interactions between crystal defects and H in iron, into the timescales required to study the mechanisms of HE.Our main contribution is an error-tolerant atomic local-environment (LE) identification/matching process to elide saddle-point searches.
We apply our OLKMC implementation to study the diffusion of vacancy clusters in the presence of H.We demonstrate OLKMC is capable of reaching embrittlement timescales, of-the-order-of seconds, while simultaneously resolving the atomic motion of H-atoms.With OLKMC we can study the atomic mechanisms through which H affects the diffusion of vacancy clusters.These are important first steps towards modelling the more complex H-defect interactions required to gain a full understanding of HE.

Background: off-lattice kinetic Monte Carlo
The OLKMC method [34] operates on a system of n atoms in continuous space and encodes atomic interactions via a potential-energy function: Local minima (basins) of U are states of the system and are linked together, into a Markov chain, by the mechanisms between them and their corresponding rate constants: The rejection-free n-fold way KMC algorithm [33] is used to advance the system forward; with the system in state i, the next state k is selected as the solution to: where ρ 1 ∈ (0, 1] is a uniform random number and j, k ∈ {1, 2, . . ., n}.The rate-constants model a Poisson process and therefore the time elapsed during a single MC step is [40]: where ρ 2 ∈ (0, 1] is a second uniform random number.In traditional KMC, these rate constants are typically pre-computed and the catalogue of mechanisms known a priori.By applying the harmonic transition state theory (HTST) approximation, the rate constant connecting basins i → j via the single, first-order, saddle-point (SP) ‡ is described by the Arrhenius equation [41]: where β = 1 k B T , νij is the attempt frequency or Arrhenius prefactor and E ‡ , E i are the energies of the system at the SP and state i, respectively.Hence, it is possible to build/adapt the atomistic mechanism catalogue on-the-fly by searching U for these SPs.

Saddle-point searches
The process of finding SPs, called the saddle-point search (SPS) procedure, is critical to the efficiency of OLKMC simulations.Minimum-mode following methods [42] for single ended searches (starting from a single basin) find SPs of the potential energy surface (PES) by climbing from a local minima to an adjacent SP.This is achieved by inverting the component of the force parallel to the minimum eigenmode of the PES.Translating along this force maximises the energy along the minimum-mode and minimises the energy along all other modes hence, converging to a local SP [43].
Several minimum-mode following algorithms were unified under one mathematical framework in Ref. 44 and compared.All investigated methods are bounded in efficiency by the Lanczos method [45].We choose to use the superlinear dimer-method [43], owing to it requiring fewer force evaluations but converging almost as fast as the Lanczos method.The superlinear dimer-method contains several optimisations over the original dimer-method [46] -notably the improvements of Ref. 47.We discuss a minor modification in Section 3.3.

Saddle-point recycling
Each SPS requires many hundreds of calls to the force-field and many SPS must be carried out to ensure the completeness of the KMC catalogue.Due to the local nature of mechanisms, most of these SPS are unnecessary.For example, in a section of perfect lattice the LE around each atom is identical hence, the mechanisms that can occur at each atom are identical.Secondly, consider two atoms sufficiently far apart; a local mechanism centred on one will likely not change the LE around the second therefore, its accessible mechanisms remain the same.Finally, many atoms are in LEs differing only by an Euclidean transformation (of the form r → Rr + c, with R an orthogonal matrix) hence, their mechanisms are related by the same transformation.
Multiple methods have been developed to reduce the cost of building the KMC catalogue by exploiting this locality.The simplest of these are system-wide methods, which attempt to reuse SPs discovered at the previous step [48] however, these do not exploit any relevant symmetries.Due to this inefficiency, they will not be discussed further.Alternatively, local methods seek to classify the LE around each atom in the system.It is then possible to associate mechanisms entirely within a LE.Mechanisms can then be cached and, when an equivalent LE is discovered, instead of launching new SPS, the mechanisms can be reconstructed from the cached information.If the LE classification is suitably invariant, these methods can account for all relevant symmetries hence, the focus shifts to LE classification, of which a number of methods have been proposed.Space discretisation.Discrete pattern recognition methods for LE classification have been explored [49,50] however, these often fail to account for (continuous) symmetries and, because of the discretisation of space, are sensitive to small changes in atomic positions (due to inexact energy minimisations).

Norm-based.
Moving toward a tolerant classification, Ref. 51 presents a system that stores the LE of an atom at r i as {r ij |r ij < r env }.Local environments are considered equivalent when each atom in two superimposed LEs have a corresponding atom in the second environment within tolerance ∆r tol .This method gracefully allows for error on the positions of atoms in a LE.However, no method is presented for determining this equivalence between arbitrarily permuted/transformed LEs.
Topological.Graph-based topological methods, introduced in Ref. 37, fully exploit the symmetry of LEs.Atoms in the LE are used to draw a graph; atoms become nodes and atoms considered bonded (closer than some distance) are connected with an edge.LEs are equivalent if their graph representations are isomorphic.This is, in general, a problem in its own complexity class GI ∈ NP which is not known to be in either P or NP-complete [52].Fortunately, there exists implementations such as the nauty2 software [53] which can solve this problem in polynomial time for many graphs.Although powerful, topological methods rely on a one-to-one correspondence between topology and geometry that may breakdown.Furthermore, they lose the tolerance of norm-based methods.
In Section 3.2, we introduce our own norm-based LE classification that combine the desirable properties of many of the previous methods.

Superbasins and the low-barrier problem
A common issue encountered during OLKMC simulations is the low-barrier problem (LBP) [37,54].This occurs when a collection of basins -often called a superbasin -are connected by a series of fast mechanisms.It requires many MC steps to escape from a superbasin.As the rate-sum, n j=1 Γ ij in Eq. ( 4), is very large during this period, the simulated time advances very slowly.The simplest methods to overcome the LBP effectively combine states connected by fast mechanisms into a single state and ignore all internal superbasin kinetics [55] -this is clearly not exact.Alternatively, TABU-like [56] methods that ban recent-transitions have been employed [37,57].These have been shown to be thermodynamically sound providing the total number of KMC steps is much greater than the oldest banned transition.Two exact solutions to the LBP are presented in Ref. 58; the key insight is the partitioning of states into transient and absorbing sets, followed by analytically solving the motion inside the transient states.A similar exact solution, the mean-rate method (MRM) [59], has been extended to OLKMC to form the basin auto-constructing MRM (bac-MRM) [60], which constructs superbasins on-the-fly.We discuss minor extensions in Section 3.5.

Interatomic potentials
In order to reach HE timescales -of-the-order-of seconds -we employ embedded atom method (EAM) potentials [61].These are short-range, fast, well tested, semi-empirical models of the potential energy of a collection of atoms.Although they are not without criticism [62], EAM potentials have become well established in the literature, particularly for metallic systems.We use the variation presented and fitted in Ref. 63, which generalise the EAM embedding function and are fit to first-principles (DFT) measurements and experimental data.Also, fit to a wide variety of targets, the potentials provide good reproduction of several crystal defect structures.We also include the modifications of Ref. 30, the introduction of additional H-H repulsion, to reduce the H clustering observed in the original potentials.

New invariant and tolerant local-environment classification
In previous work [64], we adopted a topological classification methodology however, this relied on the aforementioned one-to-one correspondence between topology and geometry.We found this       1b (grey shapes) via a permutation of the labels and a rigid-body rotation (transformation).Shape (square/circle) denotes the colour of each point and arrows act as a guide to the eyes for the permutation.In Fig. 1e we see all the points/atoms are close enough that the LEs can be considered equivalent.correspondence to break-down in the Fe-H system due to the small size of the H-atom and small displacements during mechanisms.We tried to overcome this problem by allowing the bonding distance to vary with the species and colouring each atom as the pair formed from the atoms' atomic number and the local sum: with c a problem-dependant scaling constant and G ij elements of the adjacency matrix.This encodes much more of the information contained within {r ij } into the coloured graph.Unfortunately, with the above modifications, infinitesimal perturbations in position are more likely to result in many keys representing the same geometry.Finally, there is no quantitative/qualitative link between topological keys and the similarity of LEs.

Norm-based definition of equivalence
We require a notion of equivalence between LEs that is invariant under: 1. Infinitesimal-perturbations of atomic positions.2. Permutation of identical atoms.
3. Euclidean transformations of the group of atoms.
In order to overcome the aforementioned difficulties when dealing with the small errors in the atomic positions we move away from the graph-based representation of the atoms.Instead, we represent atoms as coloured points (2-tuples): and a LE centred on the point (p 1 , p 1 ) as the point cloud: where (without loss of generality) we set the centroid of the points in P to the origin and p 1 − p i < r env for all i with r env the radius of the LE.The question of determining if two LEs, P and Q (of the same size), are equivalent is the same as asking if there exists a transformation matrix O and permutation π such that: subject to the constraints: where δ is the maximum 2 norm or distance between the point-cloud as well as the maximum inter-point separation: The choice of δ controls how similar two LEs must be before they are considered equivalent.This equivalence is represented graphically in Fig. 1.By design, relabelling a pair of identical points will always result in an equivalent environment.A desirable property, is to ensuring this relabelling results in a corresponding change in π (instead of being absorbed into δ 2 ) hence, for a useful definition of equivalence, we require: where r min is the minimum intra-point separation: r ij = r i − r j .This ensures a consistent correspondence between points in equivalent LEs.Construction of a point-cloud centred on a point (by selecting points within r env from some larger set), is not fully tolerant of point perturbations near the edge of the LE, which could move atoms into/out of the LE.This is acceptable as the LEs will be used for mechanism reconstruction which requires a 1-to-1 correspondence between points.Otherwise, unbalanced equivalence is a natural extension, by simply adding the square of the distance between unmatched points and the boundary of the LE to Eq. ( 9).

Connection to the potential energy
An inequality on δ can be established by Taylor-expanding the potential energy, U , about a converged extrema: where we have applied the eigendecomposition [65, p. 80] to the real symmetric Hessian, H, forming Λ the diagonal matrix of eigenvalues and Q, the orthogonal matrix of eigenvectors.Noting an orthogonal transformation does not change the magnitude of a vector.A (weak) upper-bound on Diagram showing the orientation of two pairs of points i and j in point-clouds P (dark grey) and Q (white) that maximises ∆ 2 i + ∆ 2 j , the sum of the square inter-point separations, under the constraint ∆ 2 i + ∆ 2 j ≤ δ 2 from Eq. ( 9).
∆U near a minima can be constructed from Eq. ( 13): where λ max is the maximum eigenvalue of H and the third line follows from Eq. ( 9).If two LEs are equivalent and δ is small enough (such that the mechanisms are transferable), then the energy barrier(s) of the reconstructed mechanism(s) should be of-the-order-of ∆U off the true energy barrier(s).Ultimately, choosing a smaller value of δ increases the accuracy of the simulation, at the expense of increasing the number of SPS required.Hence, the largest value of δ that ensures Eq. ( 12) and that ∆U is much less than the minimum relevant energy-barrier should be chosen.

Point-cloud registration
Simultaneously determining the orthogonal transformation and permutation that minimises the 2 norm between LEs is a variation of the well-studied rigid point-cloud registration problem [66,67].
In general, this is not possible at the speeds required by OLKMC.However, we are only interested in finding a specific permutation/transformation that satisfies Eq. ( 9) and Eq. ( 10).Here we present a greedy method, that leverages the problem-specific distribution of points.
To find the minimising permutation, we first realise a relationship between the inter-point separations and intra-point separations p ij = p i − p j and q ij = q i − q j of the respective point-clouds P and Q. Studying Fig. 2 we see: Then maximising ∆ i + ∆ j , subject to the constraint from Eq. ( 9), we find our intra-point tolerance criterion: which can be used to match pairs of points in Q to P by recursively ordering Q.At each recursion we search for a point that ensures Eq. ( 16) holds for all previously ordered points in Q.As this method only requires the intra-point separations in P and Q (which are invariant under Euclidean transformations of P and Q), this method can match the order of the points in LEs that are related by arbitrary rotations/reflections before solving for the rotation/reflection.Once the order of the points in P and Q match we must solve: this is equivalent to the orthogonal Procrustes problem [68] which can be efficiently solved using the singular value decomposition (SVD) see Appendix A.
Algorithm 1 Function _ and its subroutines attempt to permute elements of Q such that it is equivalant to P (Eq.( 9) holds); returns True if P and Q are equivalent otherwise False.
Require: P and Q contain the same number of points, n > 1. function for j ← i, . . ., n do if p i = q j then Swap points i and j in The full permutation/ordering and equivalence-testing method is detailed in Algorithm 1; once the algorithm has matched the orders and colours it checks if the permutation satisfies Eq. ( 9).This is required as there may be degenerate permutations satisfying Eq. ( 16) ∀i, j but not satisfying Eq. (9).A key consideration for the usefulness of Algorithm 1 is its complexity; for two randomly permuted point-clouds, we show in Appendix B (under moderate assumptions) provided: r min (18) the average-case time complexity is O (n 2 ).

Choosing δ values
Typically in the Fe-H system (using the perfect lattice as an order of magnitude estimation) . Therefore, according to Eq. ( 14), we choose δ = 0.01Å resulting in an energy tolerance of approximately ∆U ≤ 5 × 10 −4 eV.In practice we expect ∆U 5 × 10 −4 eV as Eq. ( 14) assumes ∆x is parallel to the largest eigenvector of H which is unlikely.We see ∆U is much less than the energy barrier for H diffusion, around 5 × 10 −2 eV, typically the fastest mechanism in the Fe-H system.
The choice of δ is continuously validated during a simulation.If δ is too large then, following a mechanism reconstruction, a relaxation of the lattice will result in a large energy change.If/when this is detected δ can be adjusted.Conversely, if no such energy changes are detected δ can be increased to try and increase the performance of the simulation.Furthermore, if we encounter a local environment that breaks Eq. ( 12) or Eq. ( 18) then δ can be reduced.

Heuristics
With the algorithms detailed thus far, a catalogue could be built that satisfies our requirements for LE classification however, although we ensure the condition of Eq. ( 18), a call to Algorithm 1 still takes ≈ 10µs with a LE containing 65 atoms.Hence, as we may call Algorithm 1 for every LE in the catalogue when encountering a new LE, this becomes prohibitively expensive.To reduce the search-space we partition the catalogue into sub-catalogues each indexed by a key, k: with n α the number of points in P of the α th colour.Due to its discrete nature, k can be used as the key to a hash-table (or other suitable key-value store) enabling O (1) look-up of the sub-catalogues.The sub-catalogues may still become very large, especially in systems where all points are the same colour.As a simulation progresses, we can sort the order of the LEs in each sub-catalogue by its occurrence count.This significantly decreases the look-up time for a typical LE as many systems have most points in the same LE and only a small number of "active" points (e.g near defects) in rare LEs.
To further accelerate searches of the sub-catalogues we introduces a second discriminator, the fingerprint f , a collection of sorted sets/lists: where p ij denotes the intra-point distances between points i and j and the superscripts α, β indicate the colour of the points in the point pair.For example, in the H-Fe system there are two possible point colours hence, f contains five ordered lists.Two each containing the intra-point distances between points of a particular colour and the central point; a further three lists containing the intrapoint distances between pairs of atoms coloured H-H, Fe-H/H-Fe, and Fe-Fe.By construction, f is invariant under Euclidean transformations and permutations of the points in P .Two fingerprints can be compared for equivalence as follows: Algorithm 2 Compare two fingerprints for equivalence under Eq.( 9) subject to the constraints of Eq. (10).
Require: P and Q have matching keys.function By construction, it is necessary, but not sufficient, for two LEs fingerprints to be equivalent for Algorithm 1 to return True.In practice, equivalence of fingerprints is a very strong pre-conditioner for Algorithm 1.As comparison of fingerprints is orders of magnitude faster (typically taking tens of nanoseconds), this substantially accelerates searching the sub-catalogues.

Searching a catalogue of LEs
The full method for classifying a LE, represented by the point-cloud Q, and reconstructing the mechanisms discovered by previous SPS at an equivalent LE proceeds as follows:

Saddle-point searches
In our implementation, we diverge slightly from the original formulation of the superlinear dimer method [43] during the dimer translation step.We still use the L-BFGS algorithm [69,70] for determining the translation direction and step size but, introduce a trust-radius based approach to limit the step-size.The maximum step size, s trust , is scaled according to the success of the previous steps; the projection of the effective gradient on the search direction is calculated after a step: where p is the approximate Newton step, computed using the L-BFGS method, and F eff is the effective force acting on the dimer.An ideal step length would have P = 0. Hence, we increase s trust when P < −P tol and decrease s trust when P > P tol .Additionally, we bound s trust such that s min < s trust < s max .

Kinetic prefactors
Most OLKMC implementations apply the constant pre-factor approximation, νij = v, to Eq. ( 5) [37,54] however, Ref. 71 identifies large variations in νij during Al adatom diffusion events.This is to be expected when dealing with heterogeneous systems and evidence to start calculating νij , rather than relying on constant approximation.Applying HTST and assuming the PES is quadratic near the SP, we can calculate νij for each mechanism [41,72]: where ν ‡ k , ν i k are the real normal-mode frequencies at the SP and state i respectively.This requires computing the full mass-weighted Hessian for each LE.This can be done most efficiently analytically; the procedure is described in the supplementary material.

Superbasin caching
We elect to further extend the bac-MRM to incorporate superbasin caching.We follow Ref. 60 however, when a superbasin is exited, instead of discarding the superbasin, it is stored into a buffer.The implementation then checks if this new "exit" state is in any of the cached superbasins, if so the corresponding superbasin is loaded from the cache.This is particularly critical as it avoids reexploring a superbasin that is re-entered immediately after it has been exited.We also dynamically set the tolerance (which the forward and reverse barriers of a mechanism must be less than) for exit state classification.This is achieved by lowering the tolerance when a superbasin gets too large and increasing it when the number of superbasins in the cache exceeds some user-defined threshold.

Measuring diffusivity
The diffusion coefficient of a collection of atoms over a time t, can be extracted from the mean-squared displacement (MSD): of the atoms and the Einstein equation [73]: Figure 3 Vacancy clusters diffusing in a perfect 6 3 unit-cell supercell at 300K, dashed lines are fit to a constant.
As we use a single H atom throughout our simulations, we compute the diffusivity of H by dividing the trajectory into intervals and averaging the diffusivity computed in each interval [74,75].Due to the complexity of explicitly tracking the position of vacancies during a simulation, we use the MSD of the Fe atoms to calculate the simulated diffusivity, D sim , and effective diffusivity, D eff of individual defects.These are related to the MSD via [76]: with x d the defect concentration.This has the additional benefit of averaging over many Fe atoms.

Vacancy cluster diffusion
We construct a vacancy-cluster, V n , in a (otherwise perfect) 6 3 unit-cell BCC supercell.After a series of mechanisms, the cluster is moved to an equivalent state (just rotated/translated).In combination with the periodic boundary conditions, our norm-based caching recognises this symmetry, reconstructs saddle-points and elides new searches.Hence, SP searches were only required during the initial learning phase of the simulation.
The V n diffusion results are presented in Fig. 3 and summarised in Table 1.The energy profiles for the identified mechanisms are presented in Fig. 4 alongside the mechanisms themselves in Fig. 5.In Fig. 3, we see convergence to diffusive behaviour for all clusters, verifying that we are reaching diffusive timescales.These timescales are a property of each system and range between 10 −5 s and 1s.The expected behaviour for cluster diffusivity is for larger clusters to become less mobile.This trend is visible in Table 1 but, the diffusivity for V 3 seems to reverse this trend.We shall now discuss each cluster in detail to fully understand this behaviour.
V 1 .The single vacancy diffuses by 1  2 111 vacancy hops (a feature common to all the clusters and complexes) with an activation energy of 0.65eV and kinetic pre-factor 7.44 × 10 13 Hz.The energy barrier and mechanism are in good agreement with the literature [75,76].
V 2 .The minimum-energy configuration (MEC) for V 2 is the second nearest neighbour (NN) pair followed by the 1 st NN then 4 th NN orientations, this matches the literature [77].The predominant V 2 diffusion mechanism was oscillations between the 2 nd NN and 4 th NN states, with an energy barrier of 0.65eV.The 1 st NN pathway may be expected to be the dominant mechanisms, as one may predict the transition to the lower-energy 1 st NN state to have a lower energy-barrier.However, the 2 nd NN to 1 st NN transition has an energy barrier of 0.72eV, making it kinetically less-favourable.The V 2 diffusion barrier is very close to the diffusion barrier for V 1 , this may be an artefact of the EAM potential used, as ab initio studies typically predict an energy barrier 0.05-0.11eVlower [78,79].Nevertheless, this agrees with other works that use similar semi-empirical potentials [75] hence, this discrepancy is an artefact of the potential and would be resolved with improved potentials.V 2 has a diffusivity approximately half V 1 , this is due to the combination of a near-identical energy barrier but requiring two vacancy-hops to diffuse.V 3 .As previously hinted, V 3 defies the expectation and is the most mobile cluster with a diffusivity more than an order of magnitude higher than V 1 at 300K.This is due to the MEC permitting a vacancy hop with an energy barrier of 0.48eV, that almost immediately reforms the MEC just displaced/rotated.This means, similarly to V 1 , V 3 can diffuse without changing its shape.The simulated MEC matches theoretical predictions, as does the mechanism [79].
V 4 .The mobility of V 4 resumes the decreasing trend.This is predominantly due to the high energy barrier, 0.73eV, required to break apart the MEC.Reference 79 used DFT to compute the V 4 diffusion mechanism, they obtained a lower energy barrier of 0.48eV however, their mechanism matches ours.This could be due to image interactions introduced by the small 4 3 unit-cell supercell used or indicate work is required on the EAM potential.Nevertheless, the match in mechanism pathway is reassuring and indicates the key physics is being captured by the potential.

Discussion
Off-lattice KMC has successfully been applied to study the diffusion of vacancy clusters.The mechanisms predicted and diffusivity-trends match those seen in the literature [75,77,79].However, they have all been predicted, without a priori assumptions, by the highly general OLKMC framework.This is exemplified by the counter-intuitive diffusion mechanisms of V 2 that could easily be misidentified if using simpler models e.g.final-to-initial-state-energy (FISE) [75].Although, this could have been captured with traditional KMC and careful DFT analysis, here it arises naturally without any special consideration or bias.The range in kinetic pre-factors, spanning almost two orders of magnitude, emphasises the need to compute pre-factors even in single element systems.
During our experiments at 300K, no dissociation of the vacancy clusters occurred.This confirms the dissociation barrier is higher than the diffusion barrier, which is in line with the literature [75].Exploring diffusion across a range of (higher) temperatures with OLKMC would offer an opportunity to confirm the energy barriers with a fit to the Arrhenius equation and study the dissociation behaviour.If larger clusters become increasingly immobile, dissociation may become Figure 5 Diffusion mechanisms for vacancy-cluster diffusion in the α-Fe lattice.White circles represent an occupied lattice site; small symbols indicate an unoccupied BCC lattice site; arrows mark the path of an atom during a mechanism and transparent grey planes act as a guide to the eye containing the atomic path.Small perturbations away from lattice sites have been omitted for clarity.See Fig. 4 for the corresponding energy profiles.Note for V 2 only the lower energy mechanism (Fig. 4b) has been sketched.the predominant form of diffusion.

V n H complex diffusion
To build upon the cluster diffusivity results, we add a single H atom into each of the clusters described in Section 4.1, forming V n H complexes.The cluster acts as a trap for the H atom, which can then detrap, diffuse rapidly through the lattice and re-trap at another (possibly the same) cluster.Alternatively, the complex itself can diffuse.We refer to the H trapping sites within the vacancy clusters as deep trapping-sites.
The parallel of Fig. 3 is presented in Fig. 6, for the representative V 2 H complex.We see timescales range from 10 −13 s to 10s, this is required to resolve the distinct regimes visible in Fig. 6.Firstly, below 10 −9 s, the H atom explores the deep-trapping states within vacancy clusters.Analytical superbasin acceleration kicks in once all the states are explored, solves the flickering problem between the deep trapping-states and triggers the discontinuity to 10 −5 s as the H atom escapes then rebinds to the cluster.Next, near 10 −4 s, the H atom detraps and diffuses to another cluster.Finally, near 1 × 10 −2 s, the complex displaces and beyond converges to diffuse behaviour.Like the experiments in Section 4.1, SP searches are only required during the learning phase of the simulations.
The complex's diffusivities and energy barriers are summarised in Table 2; the energy profiles for the identified mechanisms are presented in Fig. 8 and the mechanisms themselves in Fig. 9.We expect higher energy barriers for small complexes, especially V 1 H, due to the larger relative steric hindrance provided by the H atom in the smaller clusters.As the clusters get larger, one would predict that the 'regular' H-free mechanisms could occur whilst the H atom occupies the opposite side of the cluster, interacting minimally.Hence, larger complexes are expected to diffuse similarly to their corresponding clusters.Interestingly, we see for the larger clusters investigated that the introduction of H lowers the energy barrier, indicating a different mechanism may be occurring.We shall now discuss each complex in detail to fully understand this behaviour.

Figure 7
Histogram of energy barriers for V 2 H complex diffusion in a perfect 6 3 unit-cell supercell at 300K.
V 1 H.As predicted by the steric argument, V 1 H experiences the largest increase in diffusion barrier.The H atom can occupy 6 deep trapping-sites in the vacancy, close to the adjacent octahedral sites.The H atom must be 'pushed' out of the vacancy by an Fe atom during a 1  2 111 hop which increases the energy barrier, demonstrating co-dependence.The partially-escaped H atom then rebinds with the newly formed vacancy.This mechanism is asymmetric; from the perspective of the reverse direction, the H atom first escapes and then pushes a Fe atom into the vacancy.The energy barrier, energy profile and mechanism closely match the EAM results of Ref. 76, validating that our norm-based classification generalises to small interstitials and heterogeneous systems.In the literature it is suggested that the reverse direction is the preferred mechanism [23,76] however, in our experiments this only occurs ≈ 30% of the time.This could be due to a kinetic bias towards the forward direction at 300K.
V 2 H.The H atom can occupy 14 deep trapping-sites in the V 2 2 nd NN cluster, again close to the surrounding octahedral sites.The larger V 2 H complex has a diffusion mechanism more similar to the H free case than V 1 H; the vacancies first move into a 4 th NN coordination leaving the H atom in the original vacancy, the H atom then jumps out and into the moved vacancy.Finally, the original vacancy hops and reforms a displaced 2 nd NN complex.Less interaction between the H atom and moving Fe atom(s) results in a lower total energy barrier of 0.71eV, compared to V 1 H.This explains the corresponding ten-fold increase in diffusivity compared to V 1 H.While for V 1 H only one diffusive mechanism occurred, for V 2 H a group of diffusive mechanisms, with energy barriers 0.71-0.95eVwere found.Figure 7 depicts the distribution of diffusion-mechanism energy-barriers identified.The most prolific mechanism at 300K had an energy barrier of 0.78eV.The minimum barrier mechanism was probably suppressed by the high likelihood of the vacancy hopping back to the initial configuration before the H atom could hop to the new vacancy.In contrast the 0.78eV mechanism progressed through a 1 st NN intermediate, which is equally likely to move to a new state or backtrack as the H atom can freely move between the two vacancies.This highlights the significance of kinetics vs energetics at room temperature.V 3 H.The introduction of H to V 3 has little effect on the diffusion mechanisms, only fractionally increasing the exceptionally low-barrier V 3 mechanism to 0.49eV.This results in V 3 H not disassociating for the entirety of the simulation.The V 3 cluster provides 18 deep trapping-sites for H. Similarly to V 2 H, a range of diffusion mechanisms were identified with energy barriers 0.49-0.57eV.This time the minimum energy mechanism accounted for a substantial fraction of the observed mechanisms but, was still not the most frequent.
V 4 H.The V 4 cluster provides 20-32 deep trapping-sites for H (depending on the cluster configuration).For the first time, introduction of an H atom decreases the energy barrier and increases the diffusivity of the complex, compared to the cluster alone.Studying Fig. 5 and Fig. 9, we see the motion of Fe atoms remains the same as V 4 .Although the H atom has extra space in the large cluster to 'avoid' the hopping Fe atom, as predicted by the steric argument, instead it remains close and lowers the energy of the saddle-point(s).This effect can be likened to the pathway provided by the partially-escaped H atom pushing a Fe atom in V 1 H.However, due to the increased size of the cluster and its shape, the H atom does not need to escape before providing this push.Furthermore, the intermediate Fe configuration (steps 3-7 in Fig. 8d) is connected, meaning the H atom can easily reach all the deep traps in the cluster (unlike V 2 H) hence, the forwards and reverse directions are equally likely.
V 5 H.The V 5 cluster provides 28-36 deep trapping-sites for H and continues the V 4 H trend of decreasing the energy barrier compared to its corresponding cluster.In Fig. 9 we see (again contrary to the steric hypothesis) the H atom remains close to the hopping Fe atom, with the first and last hops very similar to the corresponding steps of the V 4 H mechanism.Despite the lower Diffusion mechanisms for V n H 1 complexes in the α-Fe lattice.Small red circles mark a H atom; white circles represent an occupied lattice site; small symbols indicate an unoccupied BCC lattice site; arrows mark the path of an atom during a mechanism and transparent grey planes act as a guide to the eye containing the atomic path.Small perturbations away from lattice/octahedral/tetrahedral sites have been omitted for clarity.See Fig. 8 for the corresponding energy profiles.Note for V 4 H only the frames corresponding to steps 0-3 and 7-10 in Fig. 8d

Figure 10
Comparison between V n and V n H diffusion barriers vs their corresponding effective diffusivities -expectation is Eq. ( 26) where we assume equal Arrhenius pre-factors.energy barrier, V 5 H diffusivity is reduced compared to V 5 .This can be attributed to the additional complexity of the mechanism (visible in Fig. 8e) and additional backtracking, all of which reinforces the importance of kinetic effects at room temperature.To the authors knowledge, this is the first-time mechanisms for V 5 H have been reported.

Discussion
As we move from V n clusters to V n H complexes, we may expect the diffusivities to scale with the energy barriers according to the classical Arrhenius behaviour: with β = 1 k B T and superscripts denoting the complexes.This assumes both the cluster and the complex have the same Arrhenius pre-factor, this assumption can be motivated by the equivalent pathways of Fe atoms during diffusion in Fig. 5 and Fig. 9.The comparison is drawn in Fig. 10, where we see a general conformance to expectation.We see the largest deviations for V 2 H and V 5 H, these complexes require three high-barrier steps during their diffusion mechanism hence, these deviations are probably due to the increased likelihood of backtracking.
For all the complexes beyond V 1 H, a number of alternative mechanisms were accessible at 300K.Off-lattice KMC was able to discover these on-the-fly; traditional KMC simulation that use (typically small) pre-determined lists of mechanisms could easily omit mechanisms that contribute to interesting behaviour.Capturing only the lowest energy barrier mechanisms may not be sufficient.Figure 7 exemplifies this for V 2 H, which is the simplest example of this increasing complexity.This effect will become worse at higher temperatures where alternate, higher-energy mechanisms become ever more common.
Similarly to Section 4.1, the vacancies in the complexes did not dissociate during the simulation.This suggests the dissociation barrier remains higher than the complex's diffusion barriers but, sheds little light on the effect of H on the dissociation barrier.Extending the simulations across a range of temperatures would offer an opportunity to confirm the energy barriers with a fit to the Arrhenius equation and (through comparisons with the H free case) study the effect of H on the dissociation barrier.
It has also been found that the interaction between H and vacancy-clusters has not followed the pattern if extrapolating from V 1 H; perhaps the most interesting behaviour is the reduction in diffusion barrier, upon introduction of H into the larger clusters V 4 H and V 5 H.A similar phenomenon, dubbed the hydrogen lubrication effect, has been explored computationally in FCC metals [80].If this trend continues and the gap continues to increase, it could help to explain the experimentally observed H-induced nano-void migration [81] and could have implications for HE by contributing towards vacancy agglomeration during the HESIV mechanism [14,82,83].A similar atomic mechanism that lowers the diffusion barrier for Vacancy-H complexes could be responsible for the predicted increase in dislocation velocity/mobility [84,85], which could directly support the HELP mechanism [9] of HE.As ever, we should be wary of extrapolating; nano-sized cluster and dislocations could be simulated with OLKMC and would need to be studied before drawing conclusions about macroscopic phenomena.

Effective hydrogen diffusivity vs classical approaches
Alongside the diffusion of the complexes in Section 4.2, we also investigate the motion of the H atom between the network of defects.To explore these results we make comparisons with Oriani's theory of equilibrium-trapping [86], which predicts the effective diffusivity of H, D Or , is related to the diffusivity in the perfect lattice, D, via [40]: where θ x , θ L are the fractional occupancy of available point-trap and regular lattice sites respectively, and n x , n L the corresponding number of sites.Oriani's theory assumes the traps are stationary hence, D Or only takes account for H displacement while the H atom moves between traps.
To obtain the average of θ x and θ L over the course of the entire simulation we identify the locations of the deep trapping-sites in each frame and determine their centroid.Then, defining r as the distance from the H atom to the centroid, we partition the frames into two sets: trapped frames with the r < r x , a cut off trapping radius, and lattice frames with r ≥ r x .We now discuss how r x can be chosen and give us information about the size of the defect.Associating trapping sites to the location of the H atom at trapped frames and similarly for lattice sites/frames, the total time spent in each partition, t x and t L , are related directly to θ and n via: and we assume n x and n L are approximately constant.As we expect the H atom to spend most of its time bound to the vacancy cluster, we apply t L t x to obtain: In order to determine the appropriate value of r x for each complex, we plot the fraction of the time, F , the H atom spent outside a sphere of radius r centred on the vacancy cluster, as a

Figure 11
Fraction of the total-time the H atom spent outside a sphere of radius r centred on the vacancy cluster at 300K.function of r.The results are presented in Fig. 11: for small r, as we expect, F → 1 as no trap/lattice sites are inside the sphere.Conversely, at large r, F levels-out as all the states outside the sphere are linked by approximately-equal low-barrier mechanisms and the H atom must diffuse an approximately constant (but slowly decreasing hence the trail off in Fig. 11) distance before rebinding.Between the two regions is a sharp discontinuity which marks the bounding sphere that contains all the trapping sites hence, this discontinuity defines r x , an effective size for each defect or trapping atmosphere.This is recorded in Table 3 alongside the diffusivity of the H atom and the relevant energy barriers.
Oriani's theory was postulated assuming traps as single points in the lattice interacting with H.In the present work, we are able not only to predict D eff but also calculate the effective trapping-distance or atmosphere of traps with arbitrary structure i.e. non-point traps.
Table 3 Summary of V n H binding energies and detrapping barriers in the α-Fe lattice extracted from OLKMC simulations at 300K.Additionally, the diffusivity of the H atom is included alongside the trapping radii and Oriani diffusivities.Note: V 3 H did not detrap during the simulation hence, the corresponding values are omitted.

Discussion
As expected, in Table 3 we see increasing the number of vacancies leads to a higher r x ; V 2 H has a larger jump in r x than the rest of the clusters, this is due to extended size of the 4 th NN intermediate state.All of the clusters have a larger effective size than the cluster radius would suggest.This is well demonstrated by V 1 H, where we see r x = 3.9Å is between the 2 nd and 3 rd NN distances, indicating the tetrahedral sites close to the vacancy also act as a weak trap for H.If we examine the energy profile of the dissociation mechanism in Fig. 12, we see a collection of metastable states just below r = 3.9Å.These metastable states join the vacancy's superbasin; as the state-to-state dynamics are not preserved inside superbasins the time fractions at r < r x in Fig. 11 do not correspond to the Boltzmann distribution.Understanding the depth and density of hydrogen's weak-trapping/metastable states introduced by defects is of key importance to designing HE resistant steels that utilise H traps.Our results show that even the simplest defects have a complex secondary-structure of surrounding metastable states.Furthermore, the large effective size of the defects means they will alter the diffusive cross section to a greater extent than would be expected for point defects and such variations are not monotonic with the defect complexity (i.e. the number of vacancies).
The detrapping barriers and binding energies in Table 3 increase with the number of vacancies and the two are separated by ≈ 0.047eV for each complex.Therefore, larger clusters act as deeper traps, this trend should level-out as the inside of the cluster approaches a surface.
The diffusivities of the H atom follow a more complex trend.In general, D H and D Or fall in accordance with ∆E detrap and ∆E B rising but, D Or under-predicts the diffusivity.This is because D Or assumes the traps are immobile while in reality the energy barrier for the complex diffusing can be very close to the detrapping barrier.This would present a minor contribution to the diffusivity of the H atom if the mean free path (MFP) in the lattice was large, as the diffusion in the lattice is very fast.However due to the high defect-concentration limiting the MFP, the two effects are competing.Furthermore, D Or becomes a worse estimator of D H as the defect size increases, this is because the approximations made by Oriani (point trapping, single trapping barrier and no change in diffusive cross-section) become less true as the defect's size increases.The trend is broken by V 4 H, despite having a higher detrapping barrier, the H atom diffuses faster than in V 1 H and V 2 H.This could be partially explained by fact that V 4 H has the smallest gap between ∆E of the complex and ∆E detrap thus, complex diffusion is contributing more to D H .The corresponding rise in D Or (which should not take into account complex diffusion) could then be attributed to the H atom not having enough time to reach equilibrium between the lattice/defect before the defect diffuses again.Nevertheless, the detrapping barriers for V 1 H and V 2 H are still lower than the complex-diffusion barrier of V 4 H; perhaps the difference is made up by the kinetic pre-factors of the rate limiting step(s).
Interpolating Table 3 we would expect the detrapping barrier for V 3 H to be 0.67-0.70eV,this is much higher than the diffusion barrier, which explains the lack of detrapping.If we compare the H diffusivities between the complexes, we see that V 3 H complex-diffusion offers a pathway for H transport that is about an order of magnitude faster at 300K compared to trapping and detrapping from the other complexes.This is partly due to the aforementioned high defect-concentrationdue to the periodic nature of the system -limiting the MFP of lattice-diffusing H. Nevertheless, this unexpected result could explain elevated H transport in some scenarios.Furthermore, with the energy barrier for H transport via V 3 H complex-diffusion being so much lower, this could be the predominant H transport mechanism at lower temperatures.

Conclusions
We have developed and implemented a tolerant norm-based LE classification method for OLKMC, that is invariant under Euclidean-transformations and permutations of atoms.This has enabled the simulation of the Fe-H system into HE timescales at room temperature, predicting features not previously reported using a single modelling framework.This system, with its small interstitials, multi-stage mechanisms, frequent flickering-problems, varied harmonic pre-factors and sensitive energy-barriers, presents many challenges to model with OLKMC.Nevertheless, many of these have been overcome and OLKMC has proved an invaluable and capable tool for the study of these systems with no a priori assumptions of the underlying mechanisms.This is extremely promising for the future of modelling H-defect interactions and improving our understanding of HE at the atomic scale.Specifically, we have: • Investigated the diffusion of small (less than six) vacancy-clusters, with and without the addition of H and found evidence that H can increase the diffusivity of larger clusters.
• Fully classified the diffusion pathways of these cluster/complexes (energetically and mechanistically) and understood how H changes their diffusion mechanisms.
• Simultaneously, obtained the trapping/detrapping barrier(s) of H from -and its effective diffusivity in -the presence of these clusters.We have made comparisons to Oriani's theory, testing the equilibrium hypothesis in the presence of mobile traps and expanded the conclusions to also include predicting trapping atmospheres in arbitrarily defined non-point traps.
• Quantified the trapping atmospheres surrounding vacancy clusters and begun to demonstrate the kinetic effects of shallow traps surrounding point-defects.
• Found harmonic pre-factors can vary by up-to two orders-of-magnitude suggesting, the constant pre-factor approximation should always be carefully verified (particularly in multielement systems).
Finally, OLKMC is a materials-agnostic method.Our norm-based classification should find applications in many materials systems requiring long-term atomistic predictions.
hence, the expected number of points inside the volume of tolerant space, m, is approximately: In order to avoid exponential complexity, we require m ≤ 1. Rearranging Eq. (B.4) and using the approximate nature to simplify the constants, this requires: For the case of α-Fe with r min ≈ 1Å, this requires δ 0.4Å.This matches our empirical experiments with typical LEs in α-Fe where, we observe the runtime of Algorithm 1 blowing-up beyond δ 0.4Å.

Figure 1
Figure 1Demonstration of the norm-based equivalence between reference point-cloud Fig.1a(opaque shapes) and unclassified point-cloud Fig.1b(grey shapes) via a permutation of the labels and a rigid-body rotation (transformation).Shape (square/circle) denotes the colour of each point and arrows act as a guide to the eyes for the permutation.In Fig.1ewe see all the points/atoms are close enough that the LEs can be considered equivalent.

Algorithm 3
Search a catalogue of reference LEs for an equivalent LE. function _ (Q) k ← the key of Q s ← the sub-catalogue corresponding to k in the catalogue for each P in s do if (P , Q) and _ (P , Q) then return P an equivalent LE, whose mechanisms can be reconstructed onto Q by multiplying their atomic displacement-vectors by O T (computed during _ ) return NULL If no match is found then Q represents a new LE; append Q to the sub-catalogue and launch SPS centred on the LE in order to discover any mechanisms associated with it.

Figure 6
Figure6V 2 H complex diffusing in a perfect 6 3 unit-cell supercell at 300K, dashed line is a fit to a constant.

Figure 8
Figure 8 Energy profiles for the cluster-H diffusion mechanisms sketched in Fig. 9 -extracted from OLKMC simulations at 300K.The individual barriers and kinetic pre-factors are omitted for brevity.
Figure 9Diffusion mechanisms for V n H 1 complexes in the α-Fe lattice.Small red circles mark a H atom; white circles represent an occupied lattice site; small symbols indicate an unoccupied BCC lattice site; arrows mark the path of an atom during a mechanism and transparent grey planes act as a guide to the eye containing the atomic path.Small perturbations away from lattice/octahedral/tetrahedral sites have been omitted for clarity.See Fig.8for the corresponding energy profiles.Note for V 4 H only the frames corresponding to steps 0-3 and 7-10 in Fig.8dare shown and similarly for V 5 H with steps 0-2, 7-9 and 18-20 from Fig.8e.
Fraction of time, , spent outside sphere

Figure 12
Figure12 Energy profile of the V 1 H dissociation pathway as a function of the distance of the H atom from the centre of the vacancy.

3 δ r min 3 (B. 4 )
Each iteration will on average: call _ m times for points that will match, each with O (n) runtime; trigger m recursions; call _ O (n) times for points that will not match, each with constant runtime.Hence, we can rewrite Eq. (B.1):O (R) = nm + m (nm + m (. ..))

Table 1
Summary of vacancy-cluster diffusion results in the α-Fe lattice at 300K.All diffusivities have a fractional error less than one part in one hundred.Quoted kinetic pre-factors for multi-step mechanisms is that of the highest barrier step.

Table 2
Summary of cluster-H diffusion results in the α-Fe lattice at 300K.All diffusivities have a fractional error less than one part in one hundred.Quoted kinetic pre-factors for multi-step mechanisms is that of the highest barrier step.