Neural Cellular Automata for Solidification Microstructure Modelling

We propose Neural Cellular Automata (NCA) to simulate the microstructure development during the solidification process in metals. Based on convolutional neural networks, NCA can learn essential solidification features, such as preferred growth direction and competitive grain growth, and are up to six orders of magnitude faster than conventional Cellular Automata (CA). Notably, NCA also deliver reliable predictions outside their training range, which indicates that they learn the physics of the solidification process. While in this study we employ data produced by CA for training, NCA can be trained on any microstructural simulation data, e.g. from phase-field models.


Introduction
The microstructure of metals collectively denotes their grains and metallurgical phases, their size, relative amount, shape/morphology and crystallographic orientation (texture). The microstructure strongly influences the physical and mechanical response of metallic materials, and a wide range of properties can be achieved by manipulating the microstructural features through thermomechanical processing. Therefore, correlating processing conditions and the resulting microstructure has always been an important research field. Understanding the influence of the solidification process conditions on the features of the resulting microstructure has attracted even more attention since the advent of metal additive manufacturing (MAM) processes that allow site-specific control of the solidification conditions [1,2]. However, achieving the goal of microstructure control, e.g. fabricating functionally graded materials, via MAM requires reliable solidification microstructure models that enable quantitative prediction of the process-microstructure relationship.
Microstructure formation during the solidification process comprises two phenomena: nucleation and grain growth [3]. Nucleation is the formation of stable nuclei in the molten metal and typically requires that the melt is cooled below its melting temperature (i.e. is undercooled). Growth is the enlargement of the nuclei (or existing grains) depending on the temperature field and microstructure state in the vicinity of the solidification interface [3][4][5][6]. The formation of solid material during growth is accompanied by the release of latent heat of fusion, which must be extracted from the solidification interface to allow further growth. From a thermal point of view, this indicates that the growth direction aligns with the maximum temperature gradient direction, where the fastest heat dissipation through conduction takes place [3]. While crystallographic orientation considerations are not relevant for nucleation and formed nuclei have random orientations [7,8], the growth phenomenon is anisotropic and nuclei/grains grow faster along certain crystallographic orientations (i.e. preferred growth directions, <100> for BCC/FCC crystals) [3]. Thermal and crystallographic factors collectively result in the selective growth of nuclei/grains, the so-called competitive grain growth phenomenon: the nuclei/grains whose preferred growth directions are aligned with the maximum temperature gradient grow faster and become the dominant feature of the microstructure [3]. Consideration of the above phenomena is an essential part of solidification microstructure modelling strategies [3][9][10][11][12][13][14][15][16][17].
Besides simple analytical/empirical models [3,9,10], the two 'physics-based' microstructure modelling techniques are Phase Field Modeling (PFM) and Cellular Automata (CA). The main focus of these two modelling strategies is on representing the grain growth phenomena, while they often adopt empirical probabilistic or ad-hoc considerations for the nucleation phenomena [7,18]. PFM for microstructure modelling [19,20] represents the microstructure by a set of conserved, e.g. for solute atom concentration, and non-conserved phase field parameters, e.g. for the state (solid/liquid) and grain ID (grains with various crystallographic orientation). PFM defines the free energy of the system as a function of the conserved and non-conserved phase field parameters, their gradients, temperature, and possibly additional variables. An anisotropic grain boundary energy is considered to account for the preferred growth direction. This energy term is maximum for grain boundaries perpendicular to the crystal preferred growth directions so that grains tend to grow anisotropically along these directions [20,21]. The Cahn-Hilliard [22,23] and Allen-Cahn [19,24,25] equations are respectively used to govern the evolution of conserved and non-conserved phase-field parameters such that the free energy of the system tends to its minimum. The kinetics of the evolution depends on the reduction rate of the system's free energy and on temperature-dependent mobility factors. PFM is a very detailed microstructural modelling strategy that not only well represents important phenomena such as preferred growth directions and competitive grain growth, but also can provide predictions about microscale features such as secondary dendrite growth morphology and solute atom distribution [11,12,26]. 
However, the need for space and time resolutions in the orders of 0.01-1 µm and 0.1-10 µs leads to a very high computational cost for PFM, which limits the simulation domain size to hundreds of micrometres [11][12][13][14][26][27][28][29]. As an example, Yan et al. [13] report a computational time of 13 days for the solidification microstructure simulation of a 350×350×150 µm³ domain using an NVIDIA Tesla M2090 GPU.
The CA method, developed by Gandin and Rappaz [15,16], evaluates the microstructure and temperature state locally in the vicinity of the solidification interface and combines crystallographic considerations and experimentally or theoretically driven dendrite tip growth rate equations to predict the grain growth phenomenon [4][5][6]. Numerous examples of the application of CA for investigating the microstructure development during casting [16,17,30] and MAM [31][32][33] demonstrate the experimental relevance of the CA method and its ability for representing the preferred grain growth and competitive grain growth phenomena. CA is often used to predict grain size, morphology, and crystallographic texture [31,32,34], although there are a few studies that extend CA for predicting solute atom distribution [35,36]. The typical space and time resolutions for CA simulations are 0.1-10 µm and 0.5-100 µs, respectively. In comparison with PFM, the coarser discretization and the rather local and simple calculations lead to lower computational cost for CA, such that the simulation domain size can be up to several cubic millimetres for a few days of computations [31][32][33][34]. Still, it is obvious that even CA cannot be used for large-scale simulations or, importantly, for optimization problems where numerous simulation runs are required to find the optimum set of process conditions for a desired type of microstructure. Therefore, there is a substantial need to develop novel microstructure simulation techniques which retain physical relevance and reliability but reduce the computational cost by orders of magnitude.
Deep learning has recently emerged as a central tool for scientific computing, and there are many successful examples where a deep neural network (DNN) is trained to serve as a highly efficient simulation tool [37][38][39]. A common DNN-based strategy for microstructure modelling is to project the microstructure onto a low-dimensional space by utilizing statistical descriptors and/or dimensionality reduction techniques, and to use DNNs to describe their evolution in this latent space [40][41][42]. However, the employment of simple statistical descriptors, such as spatial correlation functions of phases or concentration distributions, may compromise the accuracy of predictions and impede the back transformation to microstructure images [40][41][42][43][44]. Alternatively, advanced auto-encoders can be used to reduce the dimensionality and reconstruct from the latent space back to the original microstructure with minimal loss of information [45,46]. However, modelling the development of the microstructure in a latent space precludes the direct integration of physical constraints and laws into the DNN, resulting in black-box solutions with limited applicability beyond their trained domain.
In this paper, we propose a new DNN-based approach, denoted as neural cellular automata (NCA), to predict microstructure development during the solidification process. It is similar to the classical microstructure modelling frameworks (e.g. CA and PFM) that 'graphically' describe the evolution of the microstructure, and it allows for the direct incorporation of physical concepts. Our NCA idea is inspired by the work of Mordvintsev et al. [47,48], who integrated a fully connected neural network into the CA framework to reconstruct or repair dedicated image patterns seen during the training. Similar to their work [47,48], our NCA is an extension of the conventional CA, where a convolutional neural network (CNN) acts as a highly efficient and flexible 'brain' for governing the evolution of cells.

From Cellular Automata (CA) to Neural Cellular Automata (NCA)
The CA microstructural modelling framework discretizes the simulation domain into a series of cubic/square cells, each containing a set of microstructure-related parameters such as phase state, crystallographic orientation, temperature, and possibly more [31,32]. The phase state can be either solid (S), liquid (L), or growing (G). The (G) cells are located between the (S) and (L) cells, representing the solidification interface. Three Euler angles describe the crystallographic orientation of (S) and (G) cells; these are the angles between the global reference coordinate system and the principal directions of the crystal at the cells (i.e. the <100> crystallographic orientations, which for BCC/FCC coincide with the preferred growth directions). A number of simple rules then govern the evolution of the parameters of each cell, based on information from the cell itself and from the cells in its local neighbourhood. The rules are devised such that the physics of grain growth during solidification is reproduced. For example, a (G) or (S) cell with a temperature higher than the melting point transforms to the (L) state, and a (G) cell with no (L) cell in its neighbourhood changes to the (S) state. As an essential constituent, CA employ the so-called 'decentred octahedron method' [15,16] to account for the preferred growth directions. For BCC/FCC systems, this method assumes an octahedron for each (G) cell whose diagonals are parallel to the <100> directions of the cell's crystal. A theoretically or empirically derived equation for the dendrite tip growth rate takes the cell's temperature and defines the growth of the octahedron along its diagonals. The octahedron envelope finally grows into neighbouring cells, and when it captures the centre of a neighbouring liquid cell, the state of that cell transforms from (L) to (G). The new growing cell inherits the crystallographic orientation of the parent cell and starts to grow its own octahedron. The growth of octahedra along the <100> crystallographic orientations and the dependency of their growth rate on the local temperature ensure that CA reproduce the phenomena of preferred growth directions and competitive grain growth in solidification microstructure modelling. The 'decentred octahedron method' schematic and the detailed algorithm of CA are given in Appendix A.
Two factors constrain the computational efficiency of CA. First, the incremental progress of CA simulations should be small to ensure that the 'capture' of the neighbouring cells (i.e. the transformation of the (L) neighbouring cells into the (G) state due to growth) is accurately modelled. The time increment is often limited to $\Delta t = \alpha d / v_{\max}$, where $d$ is the cell size, $\alpha$ is a constant in $(0, 1]$, and $v_{\max}$ is the maximum growth rate among all growing cells in the simulation domain. Second, the consideration of information from neighbouring cells is typically limited to their immediate surroundings and is implemented through a 'for loop' over the cells.
NCA enhances the computational efficiency and, potentially, the flexibility and physical relevance of CA through the power of CNNs. Figure 1 illustrates a schematic representation of the NCA modelling strategy. Similar to CA, NCA discretises the simulation domain into cubic/square pixels and incrementally (i.e. frame after frame) updates the microstructure based on the temperature and microstructure states in the local neighbourhoods of individual pixels. The input data required to initialise NCA are the locations of the nuclei, their size and their crystallographic orientation. Also required is the temperature evolution within the simulation domain during the solidification process; the temperature data are fed to the network as the undercooling magnitude (i.e. $\Delta T = T_{\mathrm{melting}} - T$). A continuous variable $P \in [0,1]$ is considered for each pixel to represent its phase state, where $P = 0$ and $P = 1$ indicate fully liquid and fully solid states, respectively. Similar to CA, the crystallographic orientations of the pixels are defined with three Euler angles $\theta = (\alpha_1, \alpha_2, \alpha_3)$, i.e. the angles between the principal directions of the solidified pixel and the reference coordinate system.
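The time-increment limit ∆t = αd/v_max that constrains conventional CA can be written as a short function. This is only an illustrative sketch: the function name, the default value of α and the growth rates used in the usage line are assumptions, not values taken from the CA code used in this study.

```python
import numpy as np

def ca_time_increment(cell_size_um, growth_rates_um_per_s, alpha=0.5):
    """Return the CA time step (in seconds) from dt = alpha * d / v_max.

    cell_size_um: cell size d in micrometres.
    growth_rates_um_per_s: growth rates of all growing cells (micrometres per second).
    alpha: constant in (0, 1]; the default 0.5 is an arbitrary illustrative choice.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    v_max = float(np.max(growth_rates_um_per_s))
    if v_max <= 0.0:
        raise ValueError("no growing cell has a positive growth rate")
    return alpha * cell_size_um / v_max

# Hypothetical numbers: 1 um cells, fastest cell growing at 5e4 um/s, alpha = 0.4
dt = ca_time_increment(1.0, [1.0e4, 5.0e4], alpha=0.4)
```

Because v_max is taken over the whole domain, a single fast-growing cell dictates the step for every cell, which is one reason why the limit is costly for large domains.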
NCA predicts the sequential microstructure evolution by recurrently employing a trained CNN. The CNN takes the temperature, phase state, three Euler angles and six hidden channels as inputs and delivers pixel-wise increments of the phase state, Euler angles and hidden channels (i.e. ∆P, ∆θ and ∆H, respectively) as output. In addition to the phase state and Euler angles, the hidden channels allow the CNN to pass information between frames and to better track the evolution of microstructural features. The physical meaning of the hidden channels is not explicitly defined, and they are set to zero at the start of the analysis. The number of hidden channels and the CNN architecture are chosen based on a hyperparameter tuning exercise (see Section 3.4). Accordingly, the NCA uses six hidden channels and 1+5+1 convolutional layers (input, hidden and output layers) with the ReLU activation function and 96 neurons per hidden layer; the input and hidden layers use 3×3 kernels and the output layer uses 1×1 kernels. The details of the adopted CNN and of the NCA training are given in Sections 3.2 and 3.3, respectively.
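The recurrent use of the CNN described above can be sketched as a simple roll-out loop. The channel ordering (undercooling, P, three Euler angles, six hidden channels), the function names and the clipping of P to [0, 1] are assumptions made for illustration; `predict_increment` stands in for the trained CNN and is not defined here.

```python
import numpy as np

N_HIDDEN = 6  # number of hidden channels, as stated in the text

def nca_rollout(phase, euler, undercooling_frames, predict_increment):
    """Apply the network recurrently, one frame per undercooling field.

    phase: (H, W) array of P values; euler: (3, H, W) array of Euler angles.
    undercooling_frames: iterable of (H, W) undercooling fields, one per frame.
    predict_increment: callable mapping an (11, H, W) input to a (10, H, W)
        increment ordered as (dP, d_theta[3], dH[6]); placeholder for the CNN.
    """
    hidden = np.zeros((N_HIDDEN,) + phase.shape)  # hidden channels start at zero
    history = []
    for d_t in undercooling_frames:
        x = np.concatenate([d_t[None], phase[None], euler, hidden], axis=0)
        delta = predict_increment(x)
        # Clipping keeps P in [0, 1]; this is an assumption of the sketch.
        phase = np.clip(phase + delta[0], 0.0, 1.0)
        euler = euler + delta[1:4]
        hidden = hidden + delta[4:]
        history.append((phase.copy(), euler.copy()))
    return history
```

The temperature is supplied externally at every frame and is never updated by the network, which is why the output has one channel fewer than the input.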
To improve the physical relevance of the algorithm, the NCA integrates a 'physics-based activation function' at the output layer to prevent the CNN from violating explicitly known physical laws of solidification. This 'activation function' directly encodes the knowledge that grain growth mainly takes place at the solidification interface; it enforces no change of the microstructure states of a pixel when: i) $P < 0.1$ and $T > T_{\mathrm{melting}}$, i.e. the liquid pixel is at a temperature above the melting point; ii) there is no neighbour pixel with $P \geq 0.1$, i.e. the pixel is not in the vicinity of the solidification interface; iii) $P > 0.99$ and $T < T_{\mathrm{melting}}$, i.e. the solid pixel is at a temperature below the melting point. The values of 0.1 and 0.99 are the thresholds of the phase state $P$ indicating liquid and solid, respectively. They can be seen as network hyperparameters and are taken from [47,48].
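The three conditions of the 'physics-based activation function' can be expressed as a boolean mask that multiplies the CNN output. The following is a minimal sketch: the 3×3 neighbourhood (including the pixel itself), the zero padding at the domain edge and all function names are assumptions; the temperature is expressed through the undercooling ∆T, so T > T_melting corresponds to ∆T < 0.

```python
import numpy as np

def neighbourhood_max(p):
    """Maximum of p over each pixel's 3x3 neighbourhood (zero padding at the edges)."""
    padded = np.pad(p, 1, mode="constant", constant_values=0.0)
    h, w = p.shape
    shifted = [padded[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    return np.max(shifted, axis=0)

def update_mask(phase, undercooling, liquid_thr=0.1, solid_thr=0.99):
    """True where a pixel is allowed to change, False where it is frozen."""
    hot_liquid = (phase < liquid_thr) & (undercooling < 0.0)   # condition i
    far_from_front = neighbourhood_max(phase) < liquid_thr     # condition ii
    cold_solid = (phase > solid_thr) & (undercooling > 0.0)    # condition iii
    return ~(hot_liquid | far_from_front | cold_solid)

def apply_physics_mask(delta, phase, undercooling):
    """Zero the (C, H, W) increments of all frozen pixels."""
    return delta * update_mask(phase, undercooling)[None]
```

For a single solid pixel in an undercooled melt, only its eight liquid neighbours remain free to change: the solid pixel itself is frozen by condition iii and all remote liquid pixels by condition ii.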
The adoption of multi-layer convolution enables NCA to implicitly consider information from large neighbourhoods and integrate complex and highly nonlinear rules for governing the evolution of the microstructural parameters of the pixels. This leads to an advantage of NCA over CA in encoding the complexity of the solidification process, e.g. when an NCA model is trained based on data from PFM. Section 5 also discusses the significantly higher computational speed of NCA over CA.
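The size of the neighbourhood that a multi-layer CNN can 'see' follows from simple arithmetic. The sketch below assumes stride-1 convolutions without dilation and reads the 1+5+1 architecture as six 3×3 layers (input plus five hidden) followed by a 1×1 output layer; the function name is illustrative.

```python
def receptive_field(kernel_sizes):
    """Side length, in pixels, of the receptive field of stacked stride-1 convolutions."""
    return 1 + sum(k - 1 for k in kernel_sizes)

# Input layer + five hidden layers with 3x3 kernels, then a 1x1 output layer
baseline = receptive_field([3] * 6 + [1])        # 13 pixels per side
# Variant with a 9x9 kernel in the input layer
wide_input = receptive_field([9] + [3] * 5 + [1])  # 19 pixels per side
```

Under these assumptions each pixel update can draw on a 13×13 pixel window, compared with the immediate neighbours used by conventional CA rules.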

Materials
In this work, we choose Hastelloy X as a model alloy to investigate the applicability of NCA in predicting its microstructure development under various solidification conditions. The generation of NCA training and validation data through CA necessitates information about the preferred growth direction and dendrite growth kinetics. Hastelloy X is a solid-solution-strengthened nickel-based alloy with an FCC crystal structure and hence has its preferred growth direction along the <001> orientation [3,49,50]. The Kurz-Giovanola-Trivedi (KGT) model [5] is used to define the dendrite tip growth rate of Hastelloy X as a function of undercooling. Further details about the derivation of this KGT model for Hastelloy X are described in Appendix B.

Convolutional neural networks for NCA
The architecture of the CNN in NCA is composed of fully connected convolutional layers, as depicted in Figure 2. Each convolutional layer consists of a kernel that performs pixel-wise convolution. The mathematical transformation in a convolutional layer is:
$$x^{l+1} = f\left(w * x^{l} + b\right) \qquad (2)$$
where $w$ and $b$ denote the trainable convolutional kernel weights and bias, $f$ represents a non-linear activation function, $x^{l}$ is the input of the current convolutional layer and $x^{l+1}$ is its output, which serves as the input for the next layer.
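A single convolutional layer of the kind described above, i.e. an activation applied to the convolution of the input with the kernel weights plus a bias, can be sketched in plain NumPy. The zero padding, the cross-correlation convention used by most deep-learning libraries, the odd kernel size and the function name are assumptions of this sketch.

```python
import numpy as np

def conv_layer(x, w, b, activation=lambda z: np.maximum(z, 0.0)):
    """One stride-1 convolutional layer with 'same' zero padding.

    x: (C_in, H, W) input; w: (C_out, C_in, k, k) kernel with odd k; b: (C_out,) bias.
    Returns the (C_out, H, W) output after the activation (ReLU by default).
    """
    c_out, _, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, wd = x.shape[1:]
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Contribution of kernel tap (i, j), summed over the input channels
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], xp[:, i:i + h, j:j + wd])
    return activation(out + b[:, None, None])
```

The loop runs over the k×k kernel taps rather than over the pixels, so the per-pixel work is vectorised; this is the kind of operation that replaces the cell-by-cell loop of conventional CA.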
In brief, the CNN can be considered a neural network 'function' that decides the evolution of the state of a pixel by analysing the information from its neighbouring pixels. The weights and biases of the network are determined through network training, where a high-dimensional optimization problem is solved to minimize the difference between the CNN predictions and the ground truth data generated by the CA (as explained in Section 3.3). To achieve the best performance, the hyperparameters of the CNN, such as the number of hidden layers and the activation function type, need to be optimized (as discussed in Section 3.4).
As listed in Table 1, different variants of NCA are trained to represent the Hastelloy X solidification microstructure under different conditions. Ground truth data are created using CA simulations with a domain size of 55×55 µm² for each solidification condition. The data are split in a 9:1 ratio into training and validation sets [51,52]. Additionally, ten further CA simulations are performed to evaluate the generalization error of the NCA variants for unseen cases, e.g. larger domain sizes, longer solidification durations, different temperature fields and nucleation settings. The orientation and location of the nuclei are randomly assigned in the CA simulations for generating training, validation, and testing data unless explicitly stated otherwise (i.e. single-grain growth). The CA simulations adopt a time increment of ∆t = 8 µs, except for the quasi-3D cases, where ∆t = 40 µs.

NCA training & testing
During training, the CA-predicted phase state and Euler angles, along with undercooling values, are used as inputs for the NCA (the Euler angles and undercooling are normalized by 90° and 100 K, respectively). The CNN is initialized with zero biases and random weights and kernel parameters (except for the output layer). The training uses the Adam optimizer (learning rate of $10^{-3}$ decaying to $9\times10^{-5}$) for 7000 epochs and the L2 loss defined by equation 3. To achieve faster convergence and following [47,48], the weight and bias of the output layer are initialized with zero and the gradients of the loss are normalized by their Frobenius norms at every epoch (to prevent gradient explosion).
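Here $N_{\mathrm{Frame}}$ and $N_{\mathrm{Solid\,pixel}}$ are the number of time frames and the number of solid pixels in each frame (pixels with $P > 0.99$), respectively. Finally, the accuracy of NCA is evaluated by comparing its results with the CA predictions:
$$\mathrm{Accuracy} = \left(1 - \frac{N_{\Delta\phi(\mathrm{CA,NCA})>15^{\circ}}}{N_{\mathrm{total}}}\right) \times 100\% \qquad (4)$$
where $N_{\mathrm{total}}$ is the total number of pixels and $N_{\Delta\phi(\mathrm{CA,NCA})>15^{\circ}}$ is the number of pixels with an orientation error $\Delta\phi$ exceeding 15° (i.e. the threshold for establishing a grain boundary).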
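Two pieces of the training and evaluation procedure lend themselves to a short sketch: the normalization of inputs and gradients, and the orientation-based accuracy. The code below makes several assumptions that are not stated in the text: gradients are normalized per parameter array, the accuracy is taken as the percentage of pixels whose orientation error does not exceed 15°, all pixels are counted, and the orientation error in the 2D case (a single in-plane angle) is computed under fourfold symmetry. All function names are illustrative.

```python
import numpy as np

def normalise_inputs(euler_deg, undercooling_k):
    """Scale Euler angles by 90 degrees and undercooling by 100 K."""
    return euler_deg / 90.0, undercooling_k / 100.0

def normalise_gradients(grads, eps=1e-8):
    """Scale each gradient array to unit Frobenius norm (eps avoids division by zero)."""
    return [g / (np.linalg.norm(g) + eps) for g in grads]

def misorientation_2d(a_deg, b_deg):
    """Smallest angle between two in-plane orientations under 90-degree symmetry."""
    d = np.abs(a_deg - b_deg) % 90.0
    return np.minimum(d, 90.0 - d)

def orientation_accuracy(alpha_ca, alpha_nca, threshold_deg=15.0):
    """Percentage of pixels whose CA/NCA orientation error is within the threshold."""
    n_bad = np.count_nonzero(misorientation_2d(alpha_ca, alpha_nca) > threshold_deg)
    return 100.0 * (1.0 - n_bad / alpha_ca.size)
```

With the symmetry assumption, angles of 80° and 5° differ by only 15° and are therefore not counted as an error, whereas 10° and 40° are.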

Hyperparameter tuning
A hyperparameter tuning exercise is conducted to find the optimal CNN architecture. This involves training and evaluating multiple neural networks with different architectures, varying the number of hidden convolutional layers and neurons per hidden layer. We keep other details, such as kernel size, learning rate, and activation function, consistent with references [47,48].
The hyperparameter tuning exercise is conducted solely for non-isothermal solidification data, and the CNN architecture that yields the best results is used for all other conditions. To reduce the computational costs of hyperparameter tuning, we limit the number of epochs for training each model to 2500 and use a fast-decaying learning rate schedule (70% decay every 1000 epochs). This approach ensures that each model is trained in less than 0.5 hours. Table 2 shows that the best training and validation accuracy is achieved with five hidden layers and 96 neurons per hidden layer; this architecture is hence adopted for this study. The impact of the number of hidden input channels on the CNN accuracy is also examined, and six hidden channels are found to produce the best results (Table 3).
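The fast-decaying schedule used during tuning can be illustrated with a step-decay function. This sketch assumes that '70% decay' means the learning rate is multiplied by 0.3 every 1000 epochs and that the initial rate is $10^{-3}$, as quoted for the full training; neither assumption is confirmed by the text, and the function name is illustrative.

```python
def step_decay_lr(epoch, lr0=1e-3, decay=0.7, step=1000):
    """Learning rate after removing a fraction `decay` of it every `step` epochs."""
    return lr0 * (1.0 - decay) ** (epoch // step)

# Learning rates at the start of each 1000-epoch block of a 2500-epoch run
rates = [step_decay_lr(e) for e in (0, 1000, 2000)]
```

Under these assumptions the rate in the final block is about 9×10⁻⁵, which coincides with the final learning rate quoted for the full training; this is consistent with the reading above but does not confirm it.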

Isothermal solidification: 2D single-grain growth
Initially, the NCA model is trained with 2D CA isothermal simulation data for a domain size of 55×55 µm² (55×55 pixels). A single nucleus with a random $\alpha_1$ and $\alpha_2 = \alpha_3 = 0$ is placed in the centre of the domain with a uniform and steady temperature field representing an undercooling of 20 K. The model achieves training and validation accuracies of 98.0% and 97.7%, respectively, which indicates that it learns the kinetics of grain growth and the preferred growth direction mechanism well. Furthermore, the trained model is used to predict the solidification microstructure for a 3× larger domain and a 5× longer solidification duration with the nucleus located off the centre. As presented in Figure 3, the NCA can predict such cases with an accuracy of 99.6±0.2%.

Isothermal solidification: 2D multi-grain growth
In the second stage, the data from additional multi-grain CA simulations are added to the previous ones for training and validation of the NCA. In these data, nuclei are placed randomly in the 55×55 µm² domain with a nuclei density in the range of 8,260-11,570 mm⁻². Similar to the previous simulations, the 2D nature of the simulations allows considering $\alpha_2 = \alpha_3 = 0$ while $\alpha_1$ takes a random value. The assessment of the NCA model for multi-grain simulations indicates training and validation accuracies of 99.9% and 96.5%, respectively. The trained model further shows a test accuracy of 98.0±0.9% in predicting multi-grain growth in a larger domain. Figure 4 illustrates NCA and CA results for a test case, showing that NCA reproduces the microstructure evolution and the effect of nuclei density on the final grain size well. For both single- and multi-grain growth cases, the main inconsistency between NCA and CA is observed at the grain boundaries and solid-liquid interfaces. While CA represents the grain boundaries as a sharp interface, the involvement of the convolution operator in the NCA leads to a diffuse interface at the grain boundaries.
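The quoted nuclei densities translate into a small number of nuclei per training domain, which a short unit conversion makes explicit. The function name is illustrative; the densities and domain size are those given above.

```python
def expected_nuclei(density_per_mm2, side_um):
    """Number of nuclei in a square domain of side `side_um` (micrometres)."""
    area_mm2 = (side_um * 1e-3) ** 2  # convert the side to millimetres, then square
    return density_per_mm2 * area_mm2

low = expected_nuclei(8260, 55)    # about 25 nuclei
high = expected_nuclei(11570, 55)  # about 35 nuclei
```

The density range therefore corresponds to roughly 25 to 35 nuclei in each 55×55 µm² domain.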
To obtain grain size distributions from the microstructures generated by NCA (shown in Figure 4g), it was essential to establish a clear boundary between grain domains and hence, a post-processing algorithm was devised to address this requirement. The algorithm corrects the Euler angle of each pixel by replacing it with the Euler angle of a nearby nucleus that displays the most similar orientation. This correction serves to convert the diffuse interface into a well-defined sharp boundary. For more comprehensive information about this post-processing technique, refer to Appendix E.
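The idea of the post-processing step, i.e. replacing each pixel's orientation with that of a nearby nucleus of most similar orientation, can be sketched for the 2D case with a single in-plane angle. This is not the algorithm of Appendix E: the search radius, the fourfold-symmetric orientation difference, the rule of leaving pixels without a nearby nucleus unchanged and the function name are all assumptions made for illustration.

```python
import numpy as np

def snap_to_nuclei(alpha, nuclei_xy, nuclei_alpha, radius):
    """Replace each pixel angle by that of the most similar nucleus within `radius`.

    alpha: (H, W) predicted angles in degrees.
    nuclei_xy: (N, 2) integer row/column positions of the nuclei.
    nuclei_alpha: (N,) nucleus angles in degrees.
    radius: search distance in pixels (hypothetical parameter).
    """
    h, w = alpha.shape
    rows, cols = np.mgrid[0:h, 0:w]
    # Distance of every pixel to every nucleus, shape (N, H, W)
    dist = np.hypot(rows[None] - nuclei_xy[:, 0, None, None],
                    cols[None] - nuclei_xy[:, 1, None, None])
    d = np.abs(alpha[None] - nuclei_alpha[:, None, None]) % 90.0
    miso = np.minimum(d, 90.0 - d)
    miso = np.where(dist <= radius, miso, np.inf)  # ignore nuclei that are too far
    best = np.argmin(miso, axis=0)
    has_candidate = np.isfinite(miso).any(axis=0)
    # Pixels with no nucleus in range keep their predicted angle
    return np.where(has_candidate, nuclei_alpha[best], alpha)
```

Because every corrected pixel takes exactly one of the nucleus orientations, neighbouring pixels either share an orientation or differ by a finite jump, which turns the diffuse interface into a sharp boundary.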

Isothermal solidification: towards quasi-3D multi-grain growth
Without changing the NCA architecture, we now examine its applicability for a scenario similar to quasi-3D grain growth simulation. A quasi-3D simulation, where the 3D domain is replaced by three perpendicular 2D cross-sections [54,55], serves as a computationally cheaper alternative for a 3D microstructure simulation. Such simulations require coupling between the simulations in the three perpendicular 2D cross-sections and allow for the evolution of all three Euler angles. Here, as a preliminary study, we allow for the evolution of all three Euler angles in the 2D simulations.
A new set of CA simulations, comprising 1,658 2D runs with three evolving Euler angles for isothermal solidification at an undercooling of 20 K, is conducted to generate the training and validation datasets for NCA. Comparison of CA and NCA for such simulations indicates training, validation and test accuracies of 93.7%, 87.9% and 89.7±1.8%, respectively. Figure 5 presents CA and NCA outcomes of an examined test case, showing that the trained model is capable of simulating 2D multi-grain growth with three evolving Euler angles. Compared with the previous cases in Section 4.2, a higher level of inconsistency between CA and NCA is observed, particularly at the grain boundaries. This indicates that a more complex NCA architecture may ultimately be required for quasi-3D microstructure simulations with NCA; however, this development is outside the scope of the present study.

Non-isothermal solidification: 2D multi-grain growth
To adopt the proposed simulation strategy for practical applications, we need to demonstrate its accuracy in predicting the solidification microstructure under non-isothermal conditions. In this section, the NCA model is trained based on CA simulations for 2D steady non-isothermal solidification. Training, validation and test accuracies of 95.8%, 95.4% and 96.2±1.1% are obtained, respectively. Figure 6 shows that the NCA model predicts slower and faster growth rates for the areas with lower and higher levels of undercooling, respectively. Furthermore, as shown in Figure 7, NCA also predicts the competitive growth mechanism well, i.e. the grains which have one of their preferred growth directions aligned with the maximum temperature gradient grow faster than the others.
Without further training, the NCA model is finally used for predicting solidification under a transient non-isothermal temperature field, i.e. under continuous cooling. As shown in Figure 8, although no cases with continuous cooling are included in the training, the model accurately predicts the growth kinetics and the final microstructure (test accuracy of 96.9±0.8%).
In summary, the results presented in this section demonstrate the accuracy of the NCA microstructure modelling strategy and its ability to learn critical physical mechanisms of solidification such as preferred growth direction and competitive growth under various solidification conditions.

NCA computational speed
This section compares the computational speed of NCA and CA for a set of benchmark cases. As noted in the introduction, of the two most common physics-based microstructural modelling approaches, CA has a lower computational cost than PFM. Therefore, comparing the computational speed of NCA with that of CA serves as a conservative estimate for the speed increase achieved by NCA over the conventional microstructure simulation strategies. The trained NCA model in Section 4.2 is used here, and its computational speed is evaluated for simulating eight different domain sizes between 40² and 880² µm² with a nuclei density of 625 mm⁻². A constant time increment of 8 µs is adopted for both CA and NCA for the primary assessments. The measured CA and NCA runtimes on an Intel Xeon Gold 6248R CPU for various simulation domain sizes are shown as the red and yellow curves in Figure 9. An acceleration factor of 6 to 37 is observed for NCA, depending on the domain size. While most CA codes for microstructure modelling are only compatible with CPUs [15,31,32], NCA can easily exploit the computational power of GPUs (without extra coding). As seen in Figure 9, GPU computation further accelerates NCA and makes it four orders of magnitude faster than CA. Notably, the runtime of NCA on GPU is not sensitive to the simulation domain size, at least within the examined range. Therefore, even higher acceleration factors can be expected for larger domain sizes.
Adopting a larger time increment of the simulation reduces the computational cost for both NCA and CA. To evaluate how much this deteriorates accuracy, NCA is trained for a time increment increased by a factor of K in the range of 1-50, based on a subset (one frame every K) of the CA simulation database described in Section 4.2. Furthermore, the time increment for CA is similarly increased K times and the corresponding results, along with the results of NCA, are compared with the reference CA results (those for K=1). Figure 10 shows the accuracy of both CA and NCA for different acceleration factors in comparison with the results of the reference CA simulation (with K=1). Expectedly, the accuracy of CA significantly decreases with the increase of K. Notably, NCA are found to be very accurate up to the acceleration factor of 20 since the multi-layer CNN implicitly considers a larger neighbourhood and utilises more complex functions for grain growth prediction than CA.
The accuracy of the NCA model drops for larger K values (e.g. K=50). However, as shown in Figure 10, employing a more complex CNN architecture (e.g. with a 9×9 convolution kernel for the input layer or with 9 hidden convolution layers) leads to an excellent accuracy even for K=50. The ability of NCA to accurately reproduce the microstructure evolution using significantly larger time increments than CA originates from adopting a multi-layer convolution operator in NCA, which delivers the change in the pixel states based on information from a larger neighbourhood. These results show that NCA is able to relax the requirement for small time increments, which limits the computational speed of conventional microstructure modelling techniques; hence, NCA can significantly outperform such simulations in computational speed.
To summarize, NCA exhibits up to six-orders-of-magnitude improvement in computational efficiency compared to CA (see Figure 9). The fast computational speed of NCA comes from three contributions. Firstly, NCA replaces the 'for' loop in conventional CA with efficient convolutions, leading to over 10 times speed-up without GPUs. Secondly, the high efficiency of GPUs for parallel computation further accelerates the NCA by around three orders of magnitude. Finally, NCA can benefit from the possibility of adopting larger time increments, up to 50 times larger than the conventional CA in the shown cases.
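The way the three contributions combine can be checked with a line of arithmetic. The sketch assumes that the individual factors are independent and simply multiply, which the text does not state explicitly; the function name is illustrative and the factors are the rounded values quoted above.

```python
import math

def combined_orders_of_magnitude(*factors):
    """Base-10 logarithm of the product of independent speed-up factors."""
    return math.log10(math.prod(factors))

# Convolution instead of 'for' loops (~10x), GPU (~1000x), larger time increment (50x)
typical = combined_orders_of_magnitude(10, 1e3, 50)   # about 5.7
# Using the upper CPU acceleration factor of 37 instead of 10
upper = combined_orders_of_magnitude(37, 1e3, 50)     # about 6.3
```

Under this multiplicative assumption the combined acceleration lies between roughly 5.7 and 6.3 orders of magnitude, in line with the quoted figure of up to six orders.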

Conclusions
In this work, we have presented NCA, a deep-learning-based approach for modelling solidification microstructures. NCA leverages a multi-layer CNN to replace the simple rules of CA, providing higher computational efficiency and flexibility. The convolution operator in NCA is particularly relevant to the physics of solidification, which is a time-progressive phenomenon governed by temperature and microstructure state in the vicinity of the solidification interface. Importantly, NCA allows for encoding physical knowledge into the framework by adopting a 'physics-based activation function', ensuring the satisfaction of solidification physics while easing the training process and reducing the required data size.
We have demonstrated that NCA effectively learns from a relatively small number of CA simulations, since each CA simulation generates a very large dataset by exploring the spatiotemporal evolution of microstructure states for each cell. Interestingly, the trained NCA is independent of boundaries/initial conditions and can accurately predict solidification microstructures even for unseen scenarios during training. This is evident from the high test accuracy when testing the models for larger domain sizes, longer solidification durations, various initial nuclei settings and diverse temperature fields. Moreover, NCA is proven to successfully reproduce the solidification with continuous nucleation (see Appendix C), where nuclei are activated at various time steps, while all the nuclei in the training data are activated at the initial step.
NCA is shown to maintain a high level of accuracy and is up to six orders of magnitude faster than conventional CA, thanks to the efficiency of the adopted computational algorithm, compatibility with GPU computing, and ability to adopt larger time increments for simulations.
Several future extensions of the NCA framework are worth exploring. Firstly, NCA can be trained by data from other high-fidelity microstructure simulations, such as those based on phase-field models, to tackle more complex problems, e.g. the solute atom distribution during phase transition. Training such an NCA model can be achieved by introducing solute concentration as additional input and output channels and implementing a physics-informed loss constraint based on solute atom mass conservation.
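A minimal sketch of what such a mass-conservation constraint could look like is given below. It assumes a closed domain (no solute flux across the boundary) and a uniform cell area; the function names and the relative squared-error form are hypothetical choices for illustration, not part of the present NCA implementation.

```python
def solute_mass(concentration, cell_area=1.0):
    """Total solute content of a 2D concentration field given as a list of rows."""
    return sum(sum(row) for row in concentration) * cell_area


def mass_conservation_penalty(c_before, c_after, cell_area=1.0):
    """Squared relative change in total solute between two increments.

    ASSUMPTION: closed domain, so any change in total solute is unphysical.
    """
    m0 = solute_mass(c_before, cell_area)
    m1 = solute_mass(c_after, cell_area)
    return ((m1 - m0) / m0) ** 2


# Illustrative fields: pure redistribution versus a prediction that loses solute.
before = [[1.0, 1.0], [1.0, 1.0]]
redistributed = [[0.5, 1.5], [1.2, 0.8]]
leaky = [[0.5, 1.5], [1.2, 0.7]]

penalty_ok = mass_conservation_penalty(before, redistributed)
penalty_bad = mass_conservation_penalty(before, leaky)
```

In training, a term of this kind would typically be added to the data loss with a tunable weight, so that predictions which create or destroy solute are penalised.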
Secondly, optimizing the training data and CNN architecture can further improve the predictability and efficiency of NCA. For instance, incorporating larger domain sizes in the training data or using a CNN with larger neighbourhood sizes can capture grain growth under larger undercoolings or time increments, as demonstrated in Appendix D and Section 5. Adopting a 3D-CNN in NCA can extend the current model into 3D microstructure simulations. Finally, further computational acceleration might be achieved by replacing the CNN with a Fourier Neural Operator (FNO) [56], which employs the Fast Fourier Transform to compute convolutions more efficiently. In summary, we believe NCA bears great potential as a universal method for fast and accurate microstructure simulation.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments
Financial support by the Swiss National Science Foundation (SNSF; grant number 200551) is gratefully acknowledged.

Data availability
The models and scripts used in this study are available on GitHub.

Appendices
The NCA simulation strategy is developed to capture the physics of the solidification microstructure formation, in which a CNN is trained by the outcomes of CA simulations. The details of the CA method and the generated data for training and validation of NCA are presented in Appendix A. Appendix B explains the derivation of the dendrite growth rate of Hastelloy X for consideration in the CA simulations. Appendix C and Appendix D demonstrate the applicability of NCA for simulating consecutive nuclei activation and solidification with larger undercooling ranges. Finally, Appendix E elaborates a post-processing method for NCA and its effect on the predicted microstructure.

Appendix A. CA simulations for training & validation data
CA simulations are used to generate data for training, validating and ultimately testing NCA. Algorithm 1 gives the pseudocode of the CA simulations performed in this study. As mentioned in Section 2 of the main text, the CA domain is discretized into cells, each containing information such as cell state P, Euler angles θ, temperature T, etc. The evolution of a cell's information over each time increment δt is governed by the information of the cells within its 3×3 neighbourhood in the previous increment.
At the beginning of the simulation, the nuclei at locations nuc_pos are activated in an n_x × n_y liquid domain by assigning them the cell state 'growing' and dedicated Euler angles nuc_ea [31]. Based on the 'decentred octahedron method' developed by Gandin and Rappaz [15], a growing octahedron of size 0 is placed at the centre of each nucleus cell, with its diagonals oriented at the angles nuc_ea relative to the reference coordinate system. The octahedron centre location Oct and size λ are tracked during the CA simulations. The incremental growth of the nuclei is simulated by extending the diagonals of the octahedra at a rate given by the 'dendrite growth velocity'. The dendrite growth velocity is calculated from the cell undercooling (i.e. ∆T = T_melting − T) at each increment, as described in Appendix B.

Figure 11: Schematic of the 'decentred octahedron method' for grain growth in CA simulations.

Figure 11 illustrates the decentred octahedron method for representing the movement of the solidification interface. At each increment, the liquid cells in the neighbourhood of each growing cell are checked: if the centre of a liquid cell falls inside the octahedron, the state of that cell is changed to growing and a new octahedron is assigned to it. The new octahedron shares one corner with the previous octahedron and hence has the same spatial orientation (i.e. the Euler angles of the new growing cell are inherited from the parent cell). The size of the new octahedron is derived from the cell length l and the distance a between the cell centre and the corner of the octahedron, see Figure 11. The new octahedron then grows independently of the parent, based on its local undercooling.
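The capture test described above can be sketched in 2D, where the octahedron reduces to a square whose diagonals follow the grain orientation. The sketch assumes that the octahedron size is measured as the half-diagonal and that a single in-plane angle describes the orientation; all function and argument names are hypothetical, and the sizing of the newly assigned octahedron is not covered.

```python
import math


def inside_octahedron(point, centre, half_diagonal, angle_deg):
    """Return True if `point` lies inside the 2D section of the octahedron.

    In the frame rotated by `angle_deg`, the square with diagonals along the
    axes is the set |u| + |v| <= half_diagonal.
    """
    dx = point[0] - centre[0]
    dy = point[1] - centre[1]
    a = math.radians(angle_deg)
    u = dx * math.cos(a) + dy * math.sin(a)
    v = -dx * math.sin(a) + dy * math.cos(a)
    return abs(u) + abs(v) <= half_diagonal


def captured_neighbours(cell, centre, half_diagonal, angle_deg, liquid, cell_len=1.0):
    """Liquid cells in the 3x3 neighbourhood whose centres fall inside the octahedron."""
    i, j = cell
    hits = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            n = (i + di, j + dj)
            if n != cell and n in liquid:
                centre_n = ((n[0] + 0.5) * cell_len, (n[1] + 0.5) * cell_len)
                if inside_octahedron(centre_n, centre, half_diagonal, angle_deg):
                    hits.append(n)
    return hits


# Growing cell (1, 1) surrounded by liquid, octahedron centred on the cell centre.
liquid_cells = {(i, j) for i in range(3) for j in range(3)} - {(1, 1)}
aligned = captured_neighbours((1, 1), (1.5, 1.5), 1.2, 0.0, liquid_cells)
rotated = captured_neighbours((1, 1), (1.5, 1.5), 1.2, 45.0, liquid_cells)
```

With the same size, the axis-aligned octahedron captures the four edge neighbours while the one rotated by 45° captures none of them yet, which illustrates how the orientation biases the growth direction.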
A growing cell becomes solid when it has no neighbour in the liquid state. The simulation finishes when no liquid cell remains in the domain. Our Python CA code can be accessed from GitHub, and readers are referred to Ref. [31] for more details about CA solidification microstructure modelling.
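The state transition and stopping criterion can be illustrated with a short sketch. The integer state codes, the function names and the clipping of the neighbourhood at the domain edge are assumptions made for this example and may differ from the published CA code.

```python
LIQUID, GROWING, SOLID = 0, 1, 2  # hypothetical state codes


def neighbours(i, j, nx, ny):
    """Indices of the 3x3 neighbourhood of (i, j), clipped at the domain edge."""
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if (di, dj) != (0, 0) and 0 <= i + di < nx and 0 <= j + dj < ny:
                yield i + di, j + dj


def solidify_enclosed_cells(state):
    """Turn every growing cell without a liquid neighbour into a solid cell."""
    nx, ny = len(state), len(state[0])
    new_state = [row[:] for row in state]
    for i in range(nx):
        for j in range(ny):
            if state[i][j] == GROWING and not any(
                state[a][b] == LIQUID for a, b in neighbours(i, j, nx, ny)
            ):
                new_state[i][j] = SOLID
    return new_state


def simulation_finished(state):
    """The simulation ends once no liquid cell remains in the domain."""
    return all(cell != LIQUID for row in state for cell in row)


# One liquid cell left in the top-right corner of a 3x3 domain.
grid = [
    [GROWING, GROWING, LIQUID],
    [GROWING, GROWING, GROWING],
    [GROWING, GROWING, GROWING],
]
updated = solidify_enclosed_cells(grid)
```

Only the three growing cells that still touch the remaining liquid cell stay in the growing state; all others become solid.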
Various CA simulations were performed to generate training, validation and test data for the NCA. Except for the single-grain growth study (Section 4.1), the training of the NCA uses CA simulation data from both single-nucleus and multi-nuclei growth. The accuracy of NCA in predicting multi-grain growth improves when the training data includes both single-nucleus and multi-nuclei CA simulations, especially for cases with a time increment factor K>1. For example, when K=5, the NCA model trained with the dataset in Section 4.2 obtains a better test accuracy (96.8±0.8%) than the model trained on the same amount of data from multi-nuclei CA simulations only (95.3±0.9%). However, the performance of the NCA model trained with only multi-nuclei data can be improved by increasing the training data size, e.g. the NCA model trained with 80 multi-nuclei CA simulations obtains a test accuracy of 97.7±0.5% for K=5.

Appendix B. Dendrite growth rate of Hastelloy X
Although there are examples of experimental measurement of the dendrite growth rate, e.g. in [57], it is often estimated from empirical models [4][5][6] or phase-field simulations [58]. One of the prevailing empirical models, the Kurz-Giovanola-Trivedi (KGT) model [4,5], relates the dendrite tip radius R to its growth velocity v through two equations, in which G, G_c, P_c, D, m_l and Γ are the mean thermal gradient at the dendrite tip, the concentration gradient in the liquid near the dendrite tip, the Péclet number, the liquid diffusivity, the liquidus slope, and the Gibbs-Thomson coefficient, respectively. The thermal gradient G is neglected in the dendritic growth regime [7].
The concentration gradient G_c and the dimensionless variable ϵ_c are functions of the Péclet number P_c and the growth rate v [5,59], and involve the equilibrium partition coefficient k, the solute atom concentration c_0, and the Ivantsov function I_v(x). The constitutional undercooling, which results from solute atom segregation, is often considered the main contributor to the total undercooling [4][5][6], and is determined for a given Péclet number P_c. This set of equations can be solved for a range of v·R values, from which the dependence of the dendrite growth rate v on the undercooling ∆T is derived. For simplicity, a polynomial is ultimately fitted to describe this dependence, i.e. v(∆T).
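The solution procedure can be sketched for a binary alloy as follows. The sketch assumes the form of the KGT relations commonly given in the literature, with G neglected: R = 2π·sqrt(Γ / (m_l·G_c·ϵ_c)), P_c = v·R / (2D), G_c = −(1−k)·v·c_0 / (D·[1 − (1−k)·I_v(P_c)]), ϵ_c = 1 − 2k / (sqrt(1 + (2π/P_c)²) − 1 + 2k), and ∆T = m_l·c_0·[1 − 1 / (1 − (1−k)·I_v(P_c))]. Whether these match the exact form used in this study is an assumption, and the parameter values are purely illustrative rather than those of Table 4.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def exp_integral_e1(x, terms=60):
    """Exponential integral E1(x) from its power series (adequate for 0 < x <= 5)."""
    if not 0.0 < x <= 5.0:
        raise ValueError("series implementation is only used for 0 < x <= 5")
    total, power_over_fact = 0.0, 1.0
    for n in range(1, terms + 1):
        power_over_fact *= x / n  # now equals x**n / n!
        total += (-1) ** (n + 1) * power_over_fact / n
    return -EULER_GAMMA - math.log(x) + total


def ivantsov(p):
    """Ivantsov function Iv(P) = P * exp(P) * E1(P)."""
    return p * math.exp(p) * exp_integral_e1(p)


def kgt_point(p, D, gamma, m_l, c0, k):
    """Return (undercooling, velocity) for one Peclet number, with G neglected."""
    a = 1.0 - (1.0 - k) * ivantsov(p)
    eps_c = 1.0 - 2.0 * k / (math.sqrt(1.0 + (2.0 * math.pi / p) ** 2) - 1.0 + 2.0 * k)
    # Tip radius obtained by substituting G_c and v = 2 D P / R into the stability relation.
    radius = 2.0 * math.pi ** 2 * gamma * a / (p * m_l * c0 * (k - 1.0) * eps_c)
    velocity = 2.0 * D * p / radius
    undercooling = m_l * c0 * (1.0 - 1.0 / a)
    return undercooling, velocity


# HYPOTHETICAL binary-alloy parameters (SI units, c0 in wt%), not those of Table 4.
params = dict(D=3e-9, gamma=1e-7, m_l=-2.0, c0=5.0, k=0.5)
peclet_numbers = [10 ** (-3 + 0.25 * i) for i in range(13)]  # 1e-3 to 1
curve = [kgt_point(p, **params) for p in peclet_numbers]
```

Sweeping the Péclet number is equivalent to sweeping v·R, since P_c is proportional to that product. A polynomial v(∆T) can then be fitted to the resulting pairs, for instance with `numpy.polyfit`.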
For the application of the KGT model to multi-component systems, the diffusion fields of the individual solute species are superimposed, and the undercooling ∆T and dendrite tip radius R are evaluated accordingly [4,5,60]. We treat Hastelloy X as a Ni-20Cr-20Fe-10Mo alloy and use the parameters summarised in Table 4 to derive Equation 1 (in the main text) for its dendrite growth rate.

Appendix C. Solidification with consecutive nuclei activation
The results reported in the main text concern conditions in which all nuclei are activated at the start of the simulation. However, solidification microstructure modelling for processes such as MAM requires consideration of continuous nucleation. We examined the ability of the NCA trained in Section 4.2 to predict microstructure development when nuclei are activated consecutively during the solidification process. Figure 12 compares the NCA model predictions with the CA results, showing that the trained NCA represents the microstructure evolution with consecutive nucleation well.

Appendix D. Solidification with larger undercooling ranges
The main text describes training and validation of NCA for undercoolings of up to 25 K. Applying NCA to fast solidification during MAM requires consideration of larger undercoolings. This section examines the effectiveness of NCA for solidification microstructure simulation with undercoolings of up to 45 K. This involves using CA simulations with a larger domain size of 128×128 µm² for training NCA. The CA simulates solidification under a temperature gradient with undercoolings in the range of 15 to 45 K and nuclei densities of 1831-3662 mm⁻² to generate data for training and validation of NCA. The NCA is trained with a fast-decaying learning rate (decaying 70% every 100 steps) for 250 epochs. The model obtains accuracies of 97.7%, 97.4%, and 96.2% for training, validation and testing, respectively, see Figure 14.
The trained model can also simulate solidification under continuous cooling for the mentioned undercooling range, as presented in Figure 15.
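Two of the settings quoted in this appendix can be made concrete with a short calculation. The sketch below converts the nuclei density into a number of nuclei per domain and evaluates a step-decay schedule. It assumes that 'decaying 70%' means the learning rate is multiplied by 0.3 at each decay; the initial learning rate of 1e-3 is a hypothetical value, as it is not stated here.

```python
def nuclei_count(density_per_mm2, side_um):
    """Expected number of nuclei in a square domain of side `side_um` (micrometres)."""
    area_mm2 = (side_um * 1e-3) ** 2
    return density_per_mm2 * area_mm2


def step_decay(lr0, step, drop=0.7, every=100):
    """Learning rate after `step` steps, reduced by a fraction `drop` every `every` steps.

    ASSUMPTION: a 70% decay is read as multiplying the rate by (1 - 0.7) = 0.3.
    """
    return lr0 * (1.0 - drop) ** (step // every)


print(round(nuclei_count(1831, 128)), round(nuclei_count(3662, 128)))  # 30 60
print(step_decay(1e-3, 250))  # two decays applied: about 9e-05
```

Under these assumptions, the quoted density range corresponds to roughly 30 to 60 nuclei in each 128×128 µm² domain.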

Appendix E. Post-processing of NCA results
This appendix describes a post-processing technique to remove artefacts in the NCA predictions, e.g. the fluctuations of Euler angles within the grains (see Section 4.2). As each grain inherits its orientation from its nucleus, the main idea of this post-processing method is to replace the Euler angles of each pixel with those of the nearby nucleus that has the most similar orientation. For a pixel located at (x_i, y_i) with Euler angles θ_i and a nucleus n_j in its vicinity, we define D_ij as:

D_{ij} = \dots \left\{ (x_i - x_{n_j})^2 + (y_i - y_{n_j})^2 \right\} \qquad (E.1)

D_ij is an index representing the differences in Euler angles and location between the pixel and the nucleus. In the post-processing of NCA results, we replace the pixel's Euler angles with those of the nucleus with the smallest D_ij. Figure 16 shows a comparison between the CA and the post-processed NCA results from Figure 4. Most of the artefacts are removed after post-processing, with only a few inconsistencies at the grain boundaries compared with the CA results.
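The reassignment step can be sketched as follows. The exact form of D_ij is an assumption here: the sketch uses the squared angle difference plus a weighted squared distance, with a hypothetical weight, a single in-plane angle per pixel, and no treatment of crystal symmetry. The function and argument names are likewise illustrative.

```python
def reassign_orientation(pixel_xy, pixel_angle, nuclei, weight=1.0):
    """Return the angle of the nucleus that minimises a combined mismatch index.

    `nuclei` is a list of (x, y, angle) tuples for the nuclei near the pixel.
    ASSUMPTION: index = angle_difference**2 + weight * squared_distance.
    """
    def index(nucleus):
        x, y, angle = nucleus
        dist2 = (pixel_xy[0] - x) ** 2 + (pixel_xy[1] - y) ** 2
        return (pixel_angle - angle) ** 2 + weight * dist2

    return min(nuclei, key=index)[2]


# Two nuclei with orientations of 10 and 40 degrees.
nearby = [(0.0, 0.0, 10.0), (5.0, 0.0, 40.0)]
cleaned_a = reassign_orientation((1.0, 0.0), 12.0, nearby)  # close to the first nucleus
cleaned_b = reassign_orientation((4.0, 0.0), 12.0, nearby)  # closer to the second in space
```

With this choice of index, a pixel whose predicted angle fluctuates slightly around 10° is snapped back to the first nucleus even when the second nucleus is spatially closer, which is the behaviour needed to suppress fluctuations inside a grain.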