Imeall: A Computational Framework for the Calculation of the Atomistic Properties of Grain Boundaries

We describe the \texttt{Imeall} package for the calculation and indexing of atomistic properties of grain boundaries in materials. The package provides a structured database for the storage of atomistic structures and their associated properties, equipped with a programmable application interface to interatomic potential calculators. The database adopts a general indexing system that allows storing arbitrary grain boundary structures for any crystalline material. The usefulness of the \texttt{Imeall} package is demonstrated by computing, storing, and analysing relaxed grain boundary structures for a dense range of low index orientation axis symmetric tilt and twist boundaries in $\alpha$-iron for various interatomic potentials. The package's capabilities are further demonstrated by carrying out automated structure generation, dislocation analysis, interstitial site detection, and impurity segregation energies across the grain boundary range. All computed atomistic properties are exposed via a web framework, providing open access to the grain boundary repository and the analytic tools suite.

Licensing provisions: Apache-2.0
Programming languages: Python, Fortran, JavaScript
Nature of problem: Determining the minimum energy structure for a specific grain boundary and interatomic force field involves extensive searches in configuration space. Duplication of this effort should be avoided, which calls for a single database that gathers the resulting structures. Accurately cataloguing the chemical and electronic environments associated with the interface atoms in the grain boundary furthermore requires a usable interface between a database of grain boundary structures and an expandable set of interatomic potential calculators.
Solution method: We introduce a standard indexing convention that allows the integration of grain boundary structures for arbitrary materials, generated by different users or research groups, into a normalised database that can be easily queried and used as a starting point for further research projects. The Imeall package accomplishes this by specifying a standard naming convention for the grain boundary database and by providing the software routines necessary to populate and query such a database, as well as an interface to interatomic calculators and analysis tools.

Introduction
A grain boundary is formed wherever an extended planar region can be identified that separates two crystalline regions of a homophase material differing in the relative orientation of the adjoining crystal lattices. Grain boundaries play a significant role in determining the mechanical properties of materials. Having access to a standardised database of grain boundary structures is an important prerequisite for any atomic-scale ("chemo-mechanical") analysis, e.g., to investigate how impurities trapped at grain boundaries influence the mechanical strength of the material. Evaluating several properties of significant engineering interest requires access to the atomistic structures of grain boundaries. These include, but are by no means limited to, the diffusivity of impurities at grain boundary interfaces Ref. [1], the segregation energies of impurities to interfaces Ref. [2], and the interaction and slip transmission of dislocations across boundaries Refs. [3,4]. While a number of systematic investigations of grain boundary structures have been performed, see Refs. [5,6,7] and references therein, a single repository equipped with the appropriate tools to generate, archive and analyse the boundaries is still missing. The Imeall package and the suite of routines described in this work make a step in this direction by providing a framework for constructing and cataloguing grain boundary structures, determining their minimum energy, and calculating an array of quantities of interest.
The paper is organised as follows. Sec. 2 describes the procedure and routines for systematically generating tilt and twist grain boundaries. Sec. 3 introduces a unique naming convention which is reflected in the structure of the Imeall repository. This scheme allows the database to incorporate grain boundary structures for any reference crystalline material, computed using any interatomic potential, in a consistent and physically intuitive fashion. The information contained in the database is mirrored by a normalised SQL database to allow rapid queries via the command line or a web interface. Sec. 4 describes the overall layout of the computational package, highlighting the key routines needed to perform the atomistic calculations and analysis, and those used for storing this information in the database and exposing it via a web framework. The following sections describe some applications of the package, to illustrate the comparative information that can be exposed. In particular, Sec. 5 uses the package to investigate how grain boundary energetics and structural topologies vary upon using different interatomic potential parameterisations. Finally, Sec. 6 describes how the package handles calculations addressing point-defect energetics, specifically interstitial hydrogen located at bcc Fe grain boundaries.

Generating Grain Boundary Structures
The full macroscopic specification of a grain boundary can be accomplished using five degrees of freedom [8]. To generate a boundary, two identical crystals, which we differentiate here by referring to them as the 'red crystal' and the 'blue crystal' (see Fig. 1), are initialised in the same orientation. The red crystal is then rotated by an angle θ (a single degree of freedom) around a rotation axis specified by the vector $\hat{N}$ (associated with two more degrees of freedom). After this rotation, a boundary plane is chosen, specified by its normal vector $\hat{bp}$ expressed in the unrotated coordinate system. This final vector exhausts the remaining two degrees of freedom and concludes the macroscopic specification of the grain boundary. All red crystal atoms located below the boundary plane and all blue crystal atoms located above the boundary plane are at this point removed. We will refer to the choice of orientation axis, misorientation angle, and boundary plane as a complete canonical macroscopic specification of the grain boundary, i.e., before atomistic relaxations are considered. It is useful for what follows to define two main grain boundary geometries. If the rotation axis is parallel to the boundary plane normal ($\hat{N}$ parallel to $\hat{bp}$), the grain boundary is referred to as a twist boundary. If the rotation axis is orthogonal to the boundary plane normal ($\hat{N}$ orthogonal to $\hat{bp}$), the grain boundary is referred to as a tilt boundary. In the case where the Miller indices of the boundary plane are the same in the coordinate systems of both grains, the grain boundary is referred to as symmetric.
In the Imeall package, generating an exhaustive array of symmetric tilt and twist grain boundary structures is accomplished by the methods defined in imeall.slabmaker.slabmaker and imeall.slabmaker.gengb_from_quat. These routines make use of quaternion algebra to systematically generate a full array of symmetric tilt and twist boundaries (see Refs. [9,10] for other applications of quaternion algebra to the study of grain boundaries; an overview of quaternion algebra is given in Ref. [11]). Quaternions are frequently used in engineering and graphic design contexts to handle vector rotations. For the present purposes, quaternions can be thought of as four dimensional objects with one scalar component and a three-dimensional vector component. The rotation of a vector $\hat{v}$ by an angle θ around a unit vector $\hat{N}$ is accomplished by conjugation of $\hat{v}$ by the quaternion q, i.e., by multiplying the vector on the left and the right as $q\hat{v}q^{-1}$, where $q^{-1}$ denotes the inverse of q. In the context of grain boundaries the four components of the quaternion, $q = (\theta, \hat{N})$, are converted to $q = (\cos(\theta/2), \sin(\theta/2)\hat{N})$ so that the necessary rotations may be obtained. These components are directly obtained from the physical parameters defining the grain boundary, i.e. the misorientation angle and the orientation axis. Interestingly, if the quaternion q can be reduced to a primitive form, i.e. all its entries are integers apart from a scaling factor, the rotation is guaranteed to produce a coincident site sublattice (cf. Refs. [12,13]), that is, a periodic sublattice of space points where red and blue atoms coincide. In this case, given q, closed expressions exist that readily provide a set of basis vectors for the coincident site sublattice Ref. [10].
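The conjugation rule described above can be illustrated with a minimal, standard-library sketch. It is not the code used in imeall.slabmaker; the helper names quat_mul, rotation_quaternion and rotate are hypothetical and chosen only for this illustration, and quaternions are assumed to be stored as (scalar, x, y, z) tuples.

```python
import math

def quat_mul(p, q):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )

def rotation_quaternion(theta, axis):
    """Build q = (cos(theta/2), sin(theta/2) * N); theta in radians.

    The axis is normalised here, so any vector along N may be passed.
    """
    norm = math.sqrt(sum(c * c for c in axis))
    s = math.sin(theta / 2.0) / norm
    return (math.cos(theta / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def rotate(v, q):
    """Rotate vector v by conjugation q v q^-1.

    q is assumed to have unit length, so its inverse is its conjugate.
    """
    q_inv = (q[0], -q[1], -q[2], -q[3])
    _, x, y, z = quat_mul(quat_mul(q, (0.0, *v)), q_inv)
    return (x, y, z)

# Rotate the x axis by 90 degrees about [001].
q = rotation_quaternion(math.radians(90.0), (0, 0, 1))
rotated = rotate((1.0, 0.0, 0.0), q)
```

With the right hand rule, rotating the x axis by 90 degrees about [001] should return the y axis to within floating point rounding, which is a convenient sanity check on the sign conventions of the product.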
Besides making it easy to perform rotations with no need for rotation matrices, the quaternion scheme has an additional favourable aspect for generating symmetric tilt grain boundaries. This is illustrated in Fig. 1, which presents a schematic representation of the determination of the boundary planes for a symmetric tilt grain boundary using quaternion algebra. The two crystals, red and blue, are initially superimposed. In Fig. 1 the unit vector denoting the axis from which the misorientation angle is measured, according to a right hand rule, is denoted $\hat{N} \times \hat{v}$. Here we have assumed that $\hat{N}$ is normal to a lattice plane (the coordinates of $\hat{N}$ will thus be three integers apart from a scaling factor) and $\hat{v}$ is a lattice vector of the plane, orthogonal to $\hat{N}$. The right quaternion ("half rotation") product $(\hat{N} \times \hat{v})q$ can be used in this geometry to define the boundary plane normal which in the original reference frame corresponds to the symmetric tilt grain boundary plane, i.e., the grain boundary associated with the misorientation angle θ and axis $\hat{N}$ encoded as components of q. All red lattice vectors $\hat{v}$ can be obtained at this point by a left and right quaternion product ("full rotation"), as $q\hat{v}q^{-1}$. Once all atomic positions of the grain boundary interface have been generated, the unit cell is doubled by reflecting it through the boundary plane, to produce a periodic structure (cf. Fig. 1). In the Imeall package the gen_sym_tilt method generates an array of angle and boundary plane pairs, and the build_tilt_sym_gb method generates the bicrystal unit cells according to these specifications. Equivalent routines exist for twist boundary structures.
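The "half rotation" product can be checked on a familiar case with a short sketch. The function names below are hypothetical and this is not the gen_sym_tilt implementation; it only evaluates the right product of the pure quaternion built from N × v with a primitive integer quaternion. The example assumes the [001] axis, the in-plane lattice vector [100], and the integer quaternion (3, 0, 0, 1), for which tan(θ/2) = 1/3.

```python
import math

def quat_mul(p, q):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def sym_tilt_plane_normal(axis, v, q):
    """Vector part of the right product (N x v) q.

    Because N x v is orthogonal to N, the scalar part of the product
    vanishes and the result is again an ordinary vector.
    """
    u = cross(axis, v)
    _, x, y, z = quat_mul((0, *u), q)
    return (x, y, z)

# [001] tilt axis with the primitive (unnormalised) quaternion (3, 0, 0, 1).
normal = sym_tilt_plane_normal((0, 0, 1), (1, 0, 0), (3, 0, 0, 1))
theta_deg = math.degrees(2.0 * math.atan2(1, 3))
```

Working with the unnormalised integer quaternion keeps the plane normal in integer Miller-index form. For this input the product gives (1, 3, 0) at a misorientation of about 36.87 degrees, i.e. a plane of the {310} family, which is the boundary plane usually quoted for the Σ5 symmetric tilt boundary about [001].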

Microscopic Search Parameters
While the macroscopic degrees of freedom of the grain boundary are completely specified by the geometric considerations, determining its structurally relaxed atomistic structure requires a systematic search through a large space of initial configurations. This search requires initialising the misoriented grains in a reference frame defined by a two dimensional grid of unique positions, $\hat{x}$, $\hat{y}$, the rigid body translations, representing the in-plane translation vectors, and a lattice expansion vector e, normal to the boundary plane, which allows for atomistic relaxation of the initial grain boundary structure. For further details regarding the requirements of performing this search see Ref. [14] and Refs. [7,15,16,17]. The mechanics of carrying out the grain boundary structural relaxation search and the routines required for performing the atomistic relaxations are detailed in Sec. 3 and Sec. 5.
Figure 1: The general coordinate system for determining tilt and twist boundary planes and orientations using quaternion algebra. Hats indicate regular three dimensional vectors, considered to be quaternions with a 0 component for the scalar part. The quantity q, without a hat, represents a quaternion with a vector part determined by the orientation axis $\hat{N}$ and a scalar part determined by the misorientation angle θ. The effect of quaternion multiplication on the vectors normal to the orientation axis plane is represented schematically. Upper right inset: the total relative rotation θ of the two misoriented grains is depicted by the red and blue crystals with their coordinate systems rotated. Lower right inset: the bicrystal generated by reflecting the grain boundary to double the number of grain boundaries in a unit cell. This guarantees periodicity perpendicular to the grain boundary plane, allowing for the use of periodic boundary conditions in the atomistic relaxation.
The methods needed to generate the atomistic grain boundary bicrystal and the graphical representation of coincident site lattices are handled in the imeall.slabmaker.slabmaker module. All computations above defining the atomistic cells are performed using the quippy library [1]. The resulting structures are stored as extended .xyz files in the grain boundary database, which is described in the following section.
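The microscopic search described in this section amounts to enumerating a grid of starting configurations. The sketch below lists candidate in-plane rigid body translations, expressed as fractions of the in-plane cell vectors, combined with the atom deletion cutoff that the directory naming convention also records. The function name search_grid, the grid density and the cutoff values are assumptions made for illustration only; they are not taken from the package.

```python
from itertools import product

def search_grid(n_x, n_y, deletion_cutoffs):
    """Enumerate (tx, ty, d) starting points for the relaxation search.

    tx, ty are fractional in-plane translations on an n_x by n_y grid,
    d is the nearest-neighbour distance (in Angstrom) below which one
    atom of an overlapping pair would be deleted.
    """
    xs = [round(i / n_x, 6) for i in range(n_x)]
    ys = [round(j / n_y, 6) for j in range(n_y)]
    return list(product(xs, ys, deletion_cutoffs))

# Hypothetical search: a 10 x 10 translation grid and three cutoffs.
candidates = search_grid(10, 10, [1.4, 1.5, 1.6])
```

Each tuple would seed one independent relaxation, so the cost of the search grows with the product of the three grid sizes; this is the reason an inexpensive potential is attractive for the initial screening.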

Grain Boundary Hierarchy
In the Imeall package each grain boundary is assigned an identifying label, an "id" tag built up in analogy with the physical specification of the grain boundary. Due to the large number of additional conventions that become necessary when discussing non-cubic systems, we have elected not to use the "mean boundary plane" formalism for specifying the grain orientation [8]. Rather, we choose an alternative specification that arises naturally from the procedure described in the previous sections, which involves initializing two superimposed crystals, applying rotations via a right hand rule, and following an intuitive geometric procedure for initializing the configuration. For this, the orientation axis $\hat{N}$, the angle θ (in degrees), and the boundary plane normal vector $\hat{bp}$ are serialised as absolute value integers and concatenated into a string. For instance, a symmetric tilt boundary specified by the orientation axis [110], with a misorientation angle of 13.44 degrees and separated by the (1 1 12) boundary plane is identified by the grain boundary id 11013441112. The grain boundary at this description level is referred to as the 'canonical grain'. The database is constructed by adopting a nested hierarchy starting from the common root position, which is followed by the material directory, the orientation axis directory, and finally the canonical grain id. As an example, assuming that the material is α-Fe and the database root is stored at '/', the canonical grain mentioned above would be accessed at '/alphaFe/110/11013441112'. To complete the canonical grain descriptor level (at which there are still no assumptions about the interatomic potential used for detailed atomistic characterisation), any data determined purely by geometric considerations, e.g. the orientation axis or the grain boundary Σ number (i.e. the inverse ratio of the number of coincident sites to lattice sites), is included in a file called gb.json.
The unrelaxed canonical grain is at the same time stored as an extended .xyz file, along with an image of the grain, and a schematic vector graphics image of two planes of the coincident site lattice at the interface. Below the canonical grain descriptor level, the hierarchy proceeds in a general way intended to capture the full range of accessible microscopic structures. For this, the output of a series of subsequent microscopic initialisation-relaxation procedures, set up according to the requirements described in Sec. 2.1, is stored in a unique Potential directory. For example, a researcher may have studied a particular grain boundary using an embedded atom potential and a tight binding model. An appropriate layout for the output of this research, all naturally located below the canonical grain branching level in our database tree, would then be .../11013441112/EAM1/ for the first embedded atom potential, and .../11013441112/TB for the structures computed using the tight binding model. Continuing down the hierarchy, the different microscopic structural initialisations-relaxations can be constructed and labeled systematically within a subdirectory structure branching underneath each potential directory: see Table 1 for a description of the labeling convention and an example grain specification. The appropriate number of microscopic initialisations-relaxations to be stored will depend on the application. The typical pattern will involve a large screening set obtained using a computationally inexpensive potential, with selected microscopic geometries ready to be pushed forward to a higher accuracy calculation scheme (e.g., density functional theory-based) in an appropriately labeled sister potential directory. The usefulness of this variable precision procedure is demonstrated by its ability to handle issues previously identified in Refs. [18,19], where classical potentials identify non-physical minimum energy grain boundary structures.
For instance, embedded atom potentials tend to favour minimum energy structures that preserve the coordination number at the interface. This propensity for over-bonding introduces the risk of creating structures with an unrealistic angular distortion. Quantum mechanical models can, however, restore the physically correct angular contributions to the total energy, and thus identify the 'true' structure, to be confirmed by experiment. Constructing the database in the described manner facilitates the direct tabulation and comparison of quantities for a definite structure with an exact geometric specification, i.e. a grain boundary, as predicted by different interatomic potential models of arbitrary accuracy. The full microscopic initialisation is determined by the supercell size of the subgrain, the rigid body translations, and the atom deletion criterion distance. The combination of a given potential and fully specified subgrain structure is referred to as a SubGrainBoundary.
Table 1: The directory naming convention in Imeall with an example grain. The example column has been chosen to uniquely specify the minimum energy structure for the [1 1 0] 13.44° (1 1 12) canonical grain boundary. The chosen interatomic potential is of the embedded atom type and is specified in Ref. [20]. The full path for this grain boundary would be 11013441112_v6bxv2_tv0.4bxv0.2_d1.4z, denoting a 6 × 2 supercell with in-plane rigid body translations of 0.4, 0.2 (referring to the fractional translation in the basis of the lattice vectors of the primitive unit cell for an orthorhombic grain boundary), and an atom deletion criterion for nearest neighbours below 1.4 Å.
The predicted atomistic structures of a grain boundary will inevitably vary, depending on the interatomic potential model used in the calculation. Organising the database of microscopic structures according to potential models implies that the database can seamlessly accommodate newly predicted structures for any grain boundary, contributed by researchers using different interatomic potentials. The database also makes provisions for some extra flexibility not needed for standard grain boundaries. In particular, the 'orientation axis' [000] is reserved for isolated dislocations and fracture geometries. A letter then identifies the defect type: edge dislocation (e), screw dislocation (s), mixed dislocation (d), or fracture geometry (f). In the case of an isolated dislocation, the Burgers vector and the dislocation line in serialised form follow the alphabetic specification. For example, an isolated edge dislocation line oriented along the [100] line with a Burgers vector pointing along [010] would be specified as e100010. As an example of fracture geometry, f111112 specifies a fracture with the cleavage plane (111) and crack front oriented along the [1 1 -2] direction. Again, a gb.json file specifies the essential parameters of the canonical dislocation, with subgrain database entries being resolved according to the interatomic potential used.
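The naming convention of this section can be summarised in a few lines of code. The sketch below is an illustration rather than the package's own routine: the function names canonical_gb_id and subgrain_name are hypothetical, the angle is assumed to be written with two decimals and the decimal point removed, and underscores are assumed as separators between the fields of the subgrain directory name.

```python
import posixpath

def canonical_gb_id(axis, angle_deg, plane):
    """Concatenate axis, angle and boundary plane as absolute-value integers.

    Assumption: the angle is serialised with two decimals and no
    decimal point, e.g. 13.44 -> '1344'.
    """
    axis_part = "".join(str(abs(int(c))) for c in axis)
    angle_part = f"{angle_deg:.2f}".replace(".", "")
    plane_part = "".join(str(abs(int(c))) for c in plane)
    return axis_part + angle_part + plane_part

def subgrain_name(gb_id, supercell, translation, cutoff):
    """Directory name encoding supercell, rigid body translation, cutoff."""
    nx, ny = supercell
    tx, ty = translation
    return f"{gb_id}_v{nx}bxv{ny}_tv{tx}bxv{ty}_d{cutoff}z"

gb_id = canonical_gb_id((1, 1, 0), 13.44, (1, 1, 12))
subgrain = subgrain_name(gb_id, (6, 2), (0.4, 0.2), 1.4)
# Root '/', material 'alphaFe' and potential 'EAM1' follow the text's examples.
path = posixpath.join("/", "alphaFe", "110", gb_id, "EAM1", subgrain)
```

Because the id is a plain concatenation of digits, it is meant to be read together with its position in the tree (material and orientation axis directories) rather than parsed back into its components on its own.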

Closure Tree
While a hierarchical database layout is a natural choice for organising our microscopic structure database, executing repeated recursive searches of the directory tree to extract information can become computationally expensive. For this reason, the grain boundary directory hierarchy is also 'flattened' within Imeall, using a closure tree that stores all the intermediate paths to grain boundary structures, which can be rapidly queried. This mirrored database tracks the physical database layout and ensures the grain boundary database remains properly normalised: i.e. there is no duplication of canonical grain structures or of SubGrainBoundary objects for a given potential, and there is minimal redundancy in the storage of grain boundary properties. Integrity keys in the database ensure a unique entry for a given potential and a given grain boundary id. This serialised database can also be queried rapidly without having to resort to a recursive search strategy. The imeall.gb_models and imeall.models modules define the database schema for the SQLite database and provide a number of objects and methods for rapidly querying and retrieving data. It is through this serialised database that the web framework for the Imeall package retrieves data for the connected user. Using this bimodal approach to storage allows the database to coexist in the form of an intuitive physical hierarchical layout, where a researcher can manually extend and work within a directory tree, and a complementary serialised database which can be queried in a structured and time-efficient manner.
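The idea of flattening a directory tree so that it can be queried without recursion can be sketched with a generic closure table in SQLite. This is an illustration of the technique only: the table name, columns and helper functions below are assumptions and do not reproduce the schema defined in the package's models.

```python
import sqlite3

def build_closure(paths):
    """Store every (ancestor, descendant) pair for the given directory paths."""
    conn = sqlite3.connect(":memory:")
    # The primary key plays the role of an integrity constraint:
    # each ancestor/descendant pair can be stored only once.
    conn.execute(
        "CREATE TABLE closure (ancestor TEXT, descendant TEXT, depth INTEGER, "
        "PRIMARY KEY (ancestor, descendant))"
    )
    for path in paths:
        parts = path.strip("/").split("/")
        nodes = ["/" + "/".join(parts[: i + 1]) for i in range(len(parts))]
        for i, ancestor in enumerate(nodes):
            for j in range(i, len(nodes)):
                conn.execute(
                    "INSERT OR IGNORE INTO closure VALUES (?, ?, ?)",
                    (ancestor, nodes[j], j - i),
                )
    return conn

def descendants(conn, ancestor):
    """All nodes below `ancestor`, found with a single non-recursive query."""
    rows = conn.execute(
        "SELECT descendant FROM closure "
        "WHERE ancestor = ? AND depth > 0 ORDER BY descendant",
        (ancestor,),
    )
    return [row[0] for row in rows]

conn = build_closure([
    "/alphaFe/110/11013441112/EAM1",
    "/alphaFe/110/11013441112/TB",
])
found = descendants(conn, "/alphaFe/110")
```

The trade-off is extra storage (one row per ancestor and descendant pair) in exchange for lookups that are a single indexed SELECT, which suits a read-heavy web interface.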

Package Layout
The Imeall package follows the standard template for a Python package. Fig. 3 shows the directory structure of the package and the key routines contained as .py files. The Imeall package can operate in a distributed or a local mode. In the distributed case the host server can be queried and precomputed structures can be checked out or inspected via the web interface. Alternatively, the entire Imeall package can be downloaded and installed by any user on a local machine; running the runserver.py script then starts a local instance of the server that connects to whatever portion of the grain boundary database the user chooses to store locally.
Figure 3: The core directory structure and routines of the Imeall package. The directories static and templates contain the JavaScript and HTML templates for the web interface; the potentials directory contains the parameterisation files for the interatomic potentials present in the database. The models.py and gb_models.py routines define the methods and schema for traversing the grain boundary directory tree and synchronising the directory database and the SQL database. These routines provide the skeleton of the database. Routines prefixed calc_... contain logic for various types of analysis, e.g. calculating elastic dipole tensors, performing atomistic relaxations, probing interstitial energetics, etc. The slabmaker module contains routines for initializing grain boundary structures.

Energetics of Relaxed Grain Boundary Structures
As a first application of the package, we describe the automated procedure for generating and relaxing a comprehensive array of grain boundary microscopic configurations. The generation of these job arrays is handled by run_gb_net.py. This script generates a set of initial microscopic structures for a given canonical grain boundary geometry, as required to approximately span all microscopic initializations at the boundary, and then calls the desired atomistic calculator in quippy to handle the structural minimisations. If the desired interatomic potential is not supported in the quippy package, the routines in Imeall will still generate the structures, which can then be manually forwarded to the calculator of choice. The relaxation method is customisable, with the default set to the FIRE algorithm of Ref. [21]. The structural relaxation proceeds until a desired force tolerance is achieved. A strain filter is attached to the atoms object during the structural minimisation so that the lattice vector orthogonal to the grain boundary plane can vary during the relaxation. This allows the structural minimisation to be carried out at zero pressure, or under any desired uniaxial load. Fig. 4 reports the grain boundary energies for the minimum energy structures obtained for the [001] orientation axis and a broad range of misorientation angles, using four different interatomic potentials.
Figure 4: [...] [23], Mishin Ref. [24], and using the Gaussian Approximation Potential Refs. [25,26]. The database contains the energetics for each potential along with the configuration space searched to obtain the energetic minimum structure. The GAP model is only applied to the low angle boundaries where there are distinct dislocation networks.
The interfacial energies are determined by a combination of the elastic strain energy and the chemical reconstructions that take place Ref. [27].
In the low angle regime there is a clear distinction between the two contributions, corresponding to a picture of sections of coherent crystal populated by an array of sparse, parallel dislocation cores. As the dislocation core spacing approaches the lattice plane spacing, the decomposition into separated elastic and chemical contributions to the energy becomes more problematic. Fig. 5 illustrates the relaxed dislocation core structures associated with the same grain boundary using four different potentials. The local atomic environments of the atoms at the dislocation core are differentiated using the method of Ref. [28]. The substantial structural variation obtained from the different potentials significantly affects the predicted properties of the grain boundary. This motivates the construction of the present single repository as a prerequisite to rationalize the differences.
Figure 5: [...] [28] to determine the local atomic environment. Namely, blue corresponds to body centered cubic, red to hexagonal close packed, green to face centered cubic, and gold to icosahedral coordination. Each potential predicts the same spacing between dislocation cores, but there is a significant variety in the local atomic environment predicted by these potentials at each dislocation core. This is accompanied by a corresponding variety of the geometry of interstitial trap sites, which is relevant for point defect diffusivity. Structures a, b, c have been generated using potentials from Refs. [23,20,24] respectively; structure d was determined using a Gaussian approximation potential Refs. [25,26].
The equilibrium spacing between dislocations also has a significant effect on the elastic properties of the interface. In low angle grain boundaries the dislocation spacing d is governed by Frank's formula:
$$ d = \frac{b}{2\sin(\theta/2)} $$
where b is the Burgers vector and the angle θ is measured from the nearest coincident site lattice. Frank's formula makes it straightforward to check whether or not the atomic potential is correctly describing the long range elastic strain field and the Burgers vector of isolated dislocations. The computed dislocation spacings and Burgers vectors are compared with Frank's formula in Fig. 6. The dislocation character and spacing are determined using the DXA technique developed in Refs. [29,30]. The Imeall package also allows for the calculation of the Nye tensor using the technique described in Ref. [31] to identify isolated dislocations. Knowledge of the structure of the dislocations and their spacing allows the construction of analytical models describing the elastic properties of interfaces. The accessibility of such data for a range of grain boundaries and force models provided further motivation for developing the Imeall database and tools.
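As a quick numerical check of the kind described above, the spacing expected from Frank's formula can be evaluated directly. The sketch assumes the common form d = b / (2 sin(θ/2)), with b the magnitude of the Burgers vector; the function name frank_spacing and the numerical values are illustrative and not taken from the package or from Fig. 6.

```python
import math

def frank_spacing(burgers, theta_deg):
    """Dislocation spacing d = b / (2 sin(theta / 2)).

    burgers: Burgers vector magnitude (any length unit),
    theta_deg: misorientation angle in degrees, measured from the
    nearest coincident site lattice orientation.
    """
    theta = math.radians(theta_deg)
    return burgers / (2.0 * math.sin(theta / 2.0))

# Illustrative values only: a Burgers vector of 2.86 Angstrom and a
# 5 degree low angle boundary.
spacing = frank_spacing(2.86, 5.0)
```

For small angles the expression reduces to d ≈ b/θ (θ in radians), so a spacing measured from a relaxed structure that departs strongly from this value would point to a problem with the identified Burgers vector or with the relaxation.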

Interstitial Sites and Segregation Energies
The possibility of cataloguing trapping sites and trap depths for interstitials is a prerequisite for calculating the diffusivity and equilibrium concentrations of point defects in a material: of particular interest for iron-based materials is hydrogen diffusivity [32], notably in relation to the steel-embrittlement problem. A number of diffusivity models require parameterizations reliant on knowing trapping and segregation energies for boundaries [33,34,35,36]. Frequently the distribution of trap site energies is taken to be Gaussian with the variance and center value of the Gaussian fit to experimental data [34] but this so far had to be assumed rather than calculated with appropriate configuration space statistics.
The Imeall database is equipped with the capability of cataloguing trapping sites and calculating the point defect interactions for all individual boundaries across the entire misorientation range and for different axes. Indexing the possible segregation sites at the boundary and calculating realistic distributions for their associated trapping potentials is important for determining equilibrium interstitial occupancies at the interfaces [37]. The interstitial sites in a bulk BCC lattice are represented schematically in Fig. 7. These sites can be determined automatically for any given atomistic structure using a Delaunay triangulation.
Figure 7: Automatic determination of interstitial sites in a bulk BCC lattice using the Delaunay triangulation method. In a bulk BCC lattice the octahedral sites can be determined as the circumcenter of the sphere represented with the solid and dashed red lines. The solid red circle is centred in an octahedral interstitial site (solid red dot), and the blue triangular faces comprise a tetrahedral site, located at the centre of an irregular tetrahedron having vertices on four lattice sites. The right panel provides the magnitude of the forces induced by a hydrogen point defect in eV/Å. These forces are required for calculation of the elastic dipole tensor.
Due to the intrinsic lattice distortions induced by the geometric boundary between material grains and its associated peculiar pattern of possible atomic relaxations, each grain boundary typically hosts a variety of non-equivalent interstitial sites, differing in coordination and volume from the reference bulk lattice values. The routine hydrogenate.py contains the Hydrogenate class for indexing the interstitial sites in a lattice, decorating a lattice with interstitial hydrogens, and computing the volume of the interstitial sites. An alternative, and more complete, framework described in Ref. [38], which possesses extended functionality, can also be used.
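The geometric step behind this site detection can be illustrated on a single simplex: the circumcentre of a tetrahedron of host atoms is a candidate interstitial position. The sketch below is not the Hydrogenate class; the function names are hypothetical, the linear system is solved with Cramer's rule to stay within the standard library, and the four vertices are chosen by hand as the BCC atoms surrounding one tetrahedral site (a full implementation would obtain the simplices from a triangulation library).

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )

def circumcenter(p0, p1, p2, p3):
    """Point equidistant from four non-coplanar vertices.

    Solves 2 (p_k - p_0) . c = |p_k|^2 - |p_0|^2 for k = 1, 2, 3.
    """
    rows, rhs = [], []
    for p in (p1, p2, p3):
        rows.append([2.0 * (p[k] - p0[k]) for k in range(3)])
        rhs.append(sum(p[k] ** 2 - p0[k] ** 2 for k in range(3)))
    d = det3(rows)
    centre = []
    for col in range(3):
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        centre.append(det3(m) / d)
    return tuple(centre)

# Four BCC lattice sites (two corner atoms, two body-centre atoms),
# in units of the lattice parameter a.
a = 1.0
site = circumcenter(
    (0.0, 0.0, 0.0),
    (a, 0.0, 0.0),
    (a / 2, a / 2, a / 2),
    (a / 2, a / 2, -a / 2),
)
```

For this tetrahedron the circumcentre is (a/2, a/4, 0), the usual tetrahedral interstitial position of the BCC lattice. At a grain boundary the same construction would be applied to distorted simplices, which is why the resulting sites differ in volume and coordination from the bulk values.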
For each interstitial site of a given grain boundary it is possible to define an elastic dipole tensor (cf. Refs. [39,40,41]), a local quantity of particular significance because it allows modelling the coupling of a point defect to the strain fields of isolated dislocations, e.g., in the manner prescribed by Ref. [42]. The elastic dipole tensor elements $G_{ij}$ are defined (cf. Ref. [40]) in terms of the volume averaged stress tensor $\sigma_{ij}$ and the point defect (e.g., hydrogen) concentration $n_d$.
To compute these components we implemented in the imeall.calc_elast_dipole module the "defect force" scheme described in Ref. [41]. In this procedure, a point defect is introduced inside a unit cell of the host material, and the structure is relaxed to its equilibrium zero-force geometry. The defect is at this point removed from the model system keeping all other atoms fixed, and the resulting forces on the ions previously surrounding the interstitial are calculated without allowing any relaxation. These forces and the ion position vectors can then be related to the dipole tensor via the formula of Ref. [41], in which d, m are the position vectors of the defect and host atoms, respectively, $f^{[m,d]}$ is the force vector induced on the m-indexed atom by the removal of the d defect, and the subscripts i, j refer to spatial dimension components. We note that the defect force method does not require a single-Hamiltonian description (or an explicit total energy expression) for the system, so that forces can be computed using mixed ("embedding") schemes, i.e., combinations of descriptive potentials in the region of near crystallinity and more accurate quantum mechanical models at or very close to the point defect, for instance using the 'Learn On the Fly' scheme described in Ref. [43]. The absolute magnitude of the defect forces induced by a hydrogen interstitial in the BCC lattice is illustrated in Fig. 7, right panel, where as expected the majority of the calculated effect is limited to the metal ions neighbouring the interstitial site analysed. The availability of the grain boundary database enables screening calculations on many defect geometries (e.g., distorted tetrahedral trapping sites located in the neighbourhood of the grain boundary) using fast classical potentials. It is therefore interesting to determine what accuracy level could be expected from the fast force models used in this screening step.
We thus computed the dipole tensor for a tetrahedral site in bulk BCC Fe using an EAM force field [20] and a reference DFT calculation. The DFT calculation was performed using the VASP package, using the PBE functional approximation to treat exchange and correlation [44], a PAW pseudopotential [45], a 3×3×3 k-point mesh, and a 45 Ry energy cutoff on the plane wave wavefunction expansion. The periodic unit cell contained 250 Fe atoms with the H defect atom placed in a tetrahedral interstitial site in the middle of the unit cell. Comparing the two computed tensors suggests a ∼12% underestimation of the magnitude of the tensor elements by the EAM potential. Such a first order error could have a significant impact on a theoretical analysis of the interaction of the point defect with an elastic strain field present in the metal matrix. However, this concern can be addressed by computing more precise values with higher accuracy DFT-based calculations only for the subset of most relevant cases revealed by an initial EAM-based high-throughput screening, since systematic absolute errors of ∼10% can be tolerated, counting on error cancellation, in the initial screening used to identify that subset. The ability to catalogue quantities such as the elastic dipole tensor according to the potential used, and to identify artefacts resulting from the model (e.g., by comparing different models and pointing out outliers), is in fact a useful additional function of the Imeall package. Similar calculations on the energetics of interstitials can be performed on any desired grain boundary structures and will be reported elsewhere.
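The defect force construction described in this section can be sketched in a few lines. The example assumes that the dipole tensor is the first moment of the defect forces, G_ij = Σ_m (m_i − d_i) f_j^[m,d]; the overall sign convention differs between references and should be checked against Ref. [41] before use. The function name dipole_tensor and the toy geometry are hypothetical and unrelated to the EAM and DFT calculations reported above.

```python
def dipole_tensor(defect_pos, host_positions, forces):
    """First moment of the defect forces about the defect position.

    defect_pos: position d of the removed defect,
    host_positions: positions m of the host atoms,
    forces: force vectors f^[m,d] on the host atoms after removing
    the defect with all host atoms held fixed.
    With positions in Angstrom and forces in eV/Angstrom the result is in eV.
    """
    g = [[0.0] * 3 for _ in range(3)]
    for pos, force in zip(host_positions, forces):
        for i in range(3):
            for j in range(3):
                g[i][j] += (pos[i] - defect_pos[i]) * force[j]
    return g

# Toy configuration: two host atoms on either side of the defect along x,
# each with a force of magnitude 0.5 directed away from the defect site.
G = dipole_tensor(
    (0.0, 0.0, 0.0),
    [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.5, 0.0, 0.0), (-0.5, 0.0, 0.0)],
)
```

Only forces and positions enter the sum, which reflects the point made in the text that no total energy expression is needed, so the forces may come from an embedding scheme as well as from a single potential.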

Conclusion
We have described the structure and function of the Imeall package. The introduction of a naming convention, together with the overall structure of the database and the specification of the data models, provides a convenient framework for a computational resource relating to interfacial structures in materials. The capabilities of the resource have been demonstrated with reference to various properties of symmetric tilt boundaries and pure crystalline α-Fe. The resource is offered online as fully open-access and is extensible by any user. The code repository can be found at https://github.com/kcl-tscm/imeall. Links to the full structure database, which is hosted on the NOMAD servers, and to the web framework can be found in the documentation of the package at http://kcl-tscm.github.io/imeall/index.html.

Acknowledgements
We would like to thank Prof. Adrian Sutton for illuminating conversations on grain boundaries and the physics of strain fields in their vicinity. We would also like to thank Dr. Thomas Daff and Prof. G. Csányi for making their Fe GAP potential available for comparison. The GAP software is available for non-commercial use at www.libatoms.org. Financial support was provided by the Engineering and Physical Sciences Research Council under the HEmS program grant EP/L014742/1 and grant EP/P002188/1. This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-