The variant graph approach to improved parent grain reconstruction

The variant graph is a new, hybrid algorithm that combines the strengths of the established global grain graph and local neighbor level voting approaches, while alleviating their shortcomings, to reconstruct parent grains from orientation maps of partially or fully phase-transformed microstructures. The variant graph algorithm is versatile and capable of reconstructing transformation microstructures from any parent-child combination by clustering together child grains based on a common parent orientation variant. The main advantage of the variant graph over the grain graph is its inherent ability to detect prior austenite grain boundaries more accurately. A critical examination of Markovian clustering and neighbor level voting as methods to reconstruct prior austenite orientations is first conducted. Following this, the performance of the variant graph algorithm is showcased by reconstructing the prior austenite grains and boundaries from an example low-carbon lath martensite steel microstructure. Programmatic extensions to the variant graph algorithm for specific morphological conditions and the merging of variants with small mutual disorientation angles are also proposed. The reconstruction accuracy and computational performance of the variant graph algorithm are on par with, or better than, those of alternative methods for parent grain reconstruction. The variant graph algorithm is implemented as a new addition to the functionalities for phase transformation analysis in MTEX 5.8 and is freely available for download by the community.


Introduction
Many alloys undergo partial or complete phase transformation from a metastable parent to a stable child phase when exposed to thermo-mechanical stimuli. Alloy design begins with optimizing the high temperature thermo-mechanical processing regime of the parent microstructure. Here, the morphology and crystallographic texture of the parent phase after high temperature processing affect the final microstructure, phase fractions and mechanical properties of the alloy upon transformation during cooling. Alternatively, phase transformation is also an effective means to improve the mechanical properties of alloys by refining the grain size and crystallographic texture of the final child phase(s) [1,2]. In the specific case of partial phase transformation, where some of the parent phase is retained, multi-phase parent-child microstructures with optimal strength-ductility combinations are designed [3]. Since microstructure analyses are usually undertaken at room temperature, it is generally only possible to characterize the transformed child phase(s), or a combination of remnant parent and child phase(s). Thus, various methods to reconstruct the high temperature parent phase microstructure from experimental phase and orientation maps have been developed [4,5,6,7].
A particular difficulty in reconstructing parent grains arises in alloys in which both parent and child phases exhibit a high degree of crystallographic symmetry. Perhaps the most challenging example in this category is low-carbon lath martensite formed from high-temperature austenite during quenching. Typically, and depending on the operative orientation relationship (OR), up to 24 individual child orientation variants are generated from a single parent austenite orientation [8]. A second difficulty for reconstruction is the high boundary fraction of Σ3 annealing twins in the parent austenite microstructure. Twinned grains share a mutual (111) plane across the twin boundary and, consequently, the six martensitic variants formed on this particular habit plane (in both twins) are crystallographically close to equivalent [5]. A single prior austenite orientation may, and often does, have as many as three twins, all sharing variants across twin boundaries. The third major difficulty is that martensite variants with low misorientations form side-by-side as thin laths, bundles of which make up the well-known block structure in martensite [9]. The low misorientation angle between the laths within a block makes it difficult to obtain crystallographic information from the microstructure at the level of individual martensite variant orientations. Thus, despite considerable work towards improving reconstruction algorithms in recent years, the accurate reconstruction of annealing twin boundaries in low-carbon steels at the level of individual variants remains a major challenge.
Generally speaking, there are two classes of algorithms for parent grain reconstruction that operate on a grain level, namely, grain graph [10,8,11] and local neighbor voting types, of which the nucleation-growth approach [4,6,12] is the most prominent example. Grain graph algorithms consider a graph in which all child grains in a microstructure are represented as nodes. The boundaries between a grain and its neighbors are represented as edges that connect the nodes. By assigning probabilities of belonging to a common parent grain to these edges, clusters of child grains that are likely to belong to the same parent grain are identified and reconstructed. The approach is computationally efficient and considers all grains within the microstructure. A drawback of this method is that it utilizes scalar probabilities to describe the likelihood of having a common parent orientation. These probabilities are extended to higher order neighboring grains without checking whether or not they apply to the same parent orientation variant, thereby leading to the erroneous clustering of child grains in some localized areas [8,11]. As a consequence, varying degrees of additional processing steps are often required following the application of the grain graph approach.
As stated previously, the nucleation-growth approach is the most prominent example of local neighbor level voting algorithms. It identifies nuclei, i.e. local groups of child grains that have a common parent orientation variant, and allows these nuclei to grow into the surrounding parent phase. The determination of the common (and by extension, correct) parent orientation variant is facilitated by a voting mechanism of the participating child grains. The main advantage of this algorithm is that it is relatively robust. A drawback is that some microstructures may not contain enough nuclei to cover all parent orientations. Furthermore, the growth part of the algorithm may lead to the reversion of parent orientation variants that may not be the best fit. In recent work [13], the growth stage of nuclei is replaced by sectioning the map into square grids and applying a voting algorithm to the grains in each square grid separately to determine parent orientations.
In this study, we introduce a new, hybrid variant graph algorithm for parent grain reconstruction. The algorithm combines the advantages of the established grain graph and nucleation-growth approaches while alleviating their shortcomings. The variant graph enhances and improves on the grain graph algorithm by enabling transitivity conditions to be satisfied, i.e. the product of two edge probabilities agreeing with the probability that the two outer grains belong to the same parent grain. To achieve this, we do not associate each child grain with only one node in the graph but rather, each child grain is associated with one node for each of the potential parent orientation variants. Consequently, the number of nodes associated with a child grain is the number of parent orientation variants allowed by an orientation relationship.
Iterating via the variant graph approach automatically generates the most likely parent orientation for each grain such that no clustering is required. The use of sparse matrices for the probability graphs and a streamlined implementation of matrix multiplication results in an efficient and robust reconstruction algorithm suitable for large orientation microscopy data sets. The new approach to parent grain reconstruction is implemented in MTEX 5.8 as a programmatic addition to the previously established phase transformation analysis framework [7] and is freely available to the community.

The example data set: Low-carbon lath martensite steel
The new, hybrid algorithm is introduced by applying it to a low-carbon lath martensite steel microstructure. The electron backscatter diffraction (EBSD) map consists of 4 million data points at approximately 0.5 µm spacing. All processing prior to the application of reconstruction algorithms was undertaken in MTEX 5.8 [14] and entailed: (i) the removal of single-pixel orientation measurements, leaving 83% indexed points, (ii) the defining of a grain map with an angular threshold of 3°, and (iii) the determination of a refined orientation relationship between austenite and martensite using the method described in Ref. [15].
The map shown in Fig. 1(a) comprises band contrast overlaid with individual pixels colored by their inverse pole figure (IPF) orientation. The prior austenite grain boundaries determined by reconstruction in Section 5.5 are overlaid on the map as black boundaries to highlight the wide distribution of parent austenite grain sizes. Fig. 1(b) highlights a small region of the map containing a single, heavily twinned prior austenite grain. The grain morphology is representative of the hierarchical structure typical of lath martensite [9]. The (001) pole figure in Fig. 1(c) shows the martensite orientations of this grain, illustrated by the same IPF colors.
In subsequent sections, this prior austenite grain is also used as a representative example to review and showcase the performance of the presented reconstruction algorithms. For example, as shown in Section 3.3, there are multiple instances where the martensite variant orientations of twinned austenite grains almost completely overlap.

Established methods of parent grain reconstruction
In this section we review the two most well established grain level parent grain reconstruction methods. Both methods take advantage of an initial segmentation of the EBSD data into grains $G_n$, $n = 1, \ldots, N$, cf. [16]. Each grain is associated with a parent or a child phase and a mean grain orientation $g_n$.
Assuming a parent-to-child orientation relationship $g_{p \to c}$, all potential parent orientation variants of a child orientation $g_n$ are given by

$$g_n^j = g_n \, S_j^c \, g_{p \to c}, \tag{3.1}$$

where $S_j^c$, $j = 1, \ldots, |S^c|$, enumerates all child symmetries. In the case of a degenerate orientation relationship, for example the Nishiyama-Wassermann OR [17], some of the parent orientation variants may be symmetrically equivalent. To keep the notation simple in such situations, we assume that the index $j$ runs over a maximum set of child symmetries $S_j^c$ such that none of the parent orientation variants $g_n^j$ are symmetrically equivalent. The goal of parent grain reconstruction is to compute the true parent orientation of each child grain $G_n$ from all possible parent orientations $g_n^j$. The true parent microstructure of a transformed microstructure may be revealed by targeted surface etching techniques [10]. Alternatively, the true parent microstructure of recrystallized austenite reconstructed from individual child grains is generally identified by:
• Several clusters of neighboring child grains voting for the same parent orientation.
• The morphology of a high-temperature parent phase microstructure, typically comprising equiaxed grains.
• An annealing twin structure exhibiting a boundary morphology typical of coherent Σ3 related twin boundaries [18].

Grain graph based parent grain reconstruction
The method describing the successful identification of clusters of similar parent orientation variants was first introduced by Gomes and Kestens [11] and relies on the so-called grain graph. The grain graph is a mathematical description of the adjacency relationships of the grains $G_n$. Each grain $G_n$ corresponds to exactly one node in the graph. Two nodes are connected by an edge if the corresponding grains share a common grain boundary. During the reconstruction algorithm, the edges are labeled with weights $P_{m,n}$ describing the probability that two grains $G_m$ and $G_n$ originate from a common parent grain.
An example of a grain graph is displayed in Figure 2(a). The nodes labeled $G_1, \ldots, G_9$ represent the grains. Adjacent grains are connected by solid lines that are referred to as edges. In order to simplify the example schematic, we assume an OR with only three possible parent variants, i.e., every child grain $G_n$ corresponds to only three potential parent orientations $g_n^j$, $j = 1, 2, 3$. The three possible parent variants of each child grain are visualized by differently colored sectors.

Figure 2: Schematic example of a grain graph: (a) initial grain graph, (b) grain graph after the first expansion step. The circular nodes represent child grains $G_1, \ldots, G_9$. Each child orientation is associated with three potential parent variants, each of which is in turn illustrated by a differently colored sector. The square node $G_6$ represents a retained parent grain with a "cyan" orientation. In the initial grain graph (a), grains with a common grain boundary are connected by an edge. The darkness or brightness of an edge corresponds to the probability associated with the misorientation angle between best fitting parent variants. For example, grains $G_4$ and $G_8$ are connected by a dark high probability edge since they share a very similar yellow parent orientation. In contrast, grains $G_8$ and $G_5$ are connected by a bright low probability edge since the green parent orientations are slightly different for both child grains. Following the first expansion step of the Markovian clustering algorithm, all second neighbors are then connected by edges. The brightness (and probability) of each of the new edges is computed as the product of the brightness of the individual segments.

Grain $G_6$ is a retained parent grain and, hence, is represented by a square filled with one color. From the colors, the guess is that:
• Grains … and $G_8$ originate from a common yellow parent grain.
• Grains $G_2$, $G_5$ and $G_8$ originate from a common green parent grain.
• Grains $G_3$, $G_5$, $G_6$ and $G_9$ originate from a common cyan parent grain.
Obviously, the three guesses cannot simultaneously be true. The challenging task for any parent grain reconstruction algorithm is to identify the physically most likely solution.
In the case of grain graph based parent grain reconstruction, random walks are simulated through the graph to identify strongly connected clusters. More precisely, the algorithm consists of the following steps:
i) Constructing the grain graph as described above.
ii) Assigning a weight $P_{m,n}$ to each edge, which describes the probability that the two adjacent grains $G_m$, $G_n$ originate from a common parent grain.
iii) Clustering the graph into strongly connected components, for example, via the Markovian clustering algorithm [19].
iv) Determining the best fitting parent orientation for each cluster.
The processes of grain graph construction and clustering are detailed further as an understanding of these two items is crucial to the introduction of the new, hybrid variant graph approach in Section 4.
Constructing the grain graph. The most critical part in constructing the grain graph is the computation of edge weights. Edges are either of parent-child or child-child types. For a parent-child grain pair $G_m$-$G_n$, where $g_m$ is the parent orientation of $G_m$ and $g_n^j$ are the potential parent orientations of the child grain $G_n$ (see Equ. (3.1)), the misorientation angle $\omega_{m,n}$ between the parent orientation $g_m$ and the best fitting parent variant $g_n^j$ is given by

$$\omega_{m,n} = \min_j \, \omega_p(g_m, g_n^j),$$

where $\omega_p(g_1, g_2)$ denotes the angular distance in orientation space modulo parent symmetry.
For two neighboring child grains $G_m$, $G_n$, the misorientation angle $\omega_{m,n}$ between the best fitting combination of parent variants is

$$\omega_{m,n} = \min_{i,j} \, \omega_p(g_m^i, g_n^j).$$

In order to translate the misfits $\omega_{m,n}$ into probabilities $P_{m,n} = \Psi(\omega_{m,n})$, a function $\Psi$ is used that is close to 1 for very small misfits and decays to zero as the misfit exceeds a certain threshold $\delta$. A common choice for such a function is the Gauss error function $\operatorname{erf}(x)$ scaled by the constants $\sigma$ and $\delta$, where the scaling is chosen such that $\Psi(\delta - \sigma) = 0.75$, $\Psi(\delta) = 0.5$ and $\Psi(\delta + \sigma) = 0.25$. The grain graph is efficiently stored as a sparse $N \times N$ adjacency matrix $P$, where $N$ is the number of grains, containing at row $m$ and column $n$ the entry $P_{m,n}$ if the grains $G_m$ and $G_n$ are connected by an edge with non-zero probability. All remaining matrix entries are zero.
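The mapping from misfit angle to probability can be sketched as follows. This is only a minimal illustration and not necessarily the exact expression used by the authors: the constant `ERFINV_HALF` (the value c with erf(c) = 1/2) is an assumption derived solely from the three stated conditions on $\Psi$, and the default values of `delta` and `sigma` are the parameters quoted later for the example data set.

```python
import math

# Assumed scaling constant c with erf(c) = 1/2, chosen so that
# psi(delta - sigma) = 0.75 and psi(delta + sigma) = 0.25.
ERFINV_HALF = 0.4769362762044699


def psi(omega, delta=5.0, sigma=1.5):
    """Map a misfit angle omega (degrees) to a probability between 0 and 1."""
    return 0.5 * (1.0 - math.erf(ERFINV_HALF * (omega - delta) / sigma))


for omega in (0.0, 3.5, 5.0, 6.5, 10.0):
    print(f"omega = {omega:4.1f} deg -> psi = {psi(omega):.3f}")
```

Small misfits map to values close to one, the threshold $\delta$ maps to exactly one half, and misfits well above $\delta$ map to values close to zero.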
In Fig. 2(a), the edge weights are the numbers attached to each of the edges. Furthermore, the darkness of the edges is chosen proportional to the edge weights. In the more complicated follow up graphs, the numbers are omitted.
It should be noted that because $\omega_{m,n} = \omega_{n,m}$, the matrix $P$ is symmetric and the initial grain graph is not directed. This property is lost during the subsequent clustering step.
Clustering of the grain graph. While different clustering algorithms may be applied to group grains with a common parent orientation, the following discussion is limited to the commonly used Markovian clustering algorithm [19]. The latter algorithm simulates random walks within a grain graph by iterating the following steps:
1) Expansion: the matrix $P$ is replaced by the product matrix $\tilde{P} = P \cdot P$.
2) Inflation: every entry of $\tilde{P}$ is raised to the power of the inflation parameter $\alpha$.
3) Normalization: the entries associated with each node are rescaled such that they sum to one.
4) Pruning: entries that fall below a small threshold are set to zero.
The key assumption of the expansion step 1 is as follows: if the probability to walk from node $m$ to node $o$ is $P_{m,o}$ and the probability to walk from node $o$ to node $n$ is $P_{o,n}$, then the probability to walk from $m$ via $o$ to $n$ is $P_{m,o} \cdot P_{o,n}$. Accordingly, summing over all possible intermediate nodes $o$, we obtain the total probability to walk from $m$ to $n$ as

$$\tilde{P}_{m,n} = \sum_o P_{m,o} \, P_{o,n}.$$

This is exactly the $(m, n)$-th entry of the product matrix $P \cdot P$ in step 1. As the values $\tilde{P}_{m,n}$ may exceed 1, they are interpreted as probabilities only after the normalization step 3. The purpose of the inflation step 2 just before the normalization step is to emphasize higher probability edges over lower probability edges. More precisely, the ratio between two entries in the matrix $P$ becomes more pronounced when higher values of the inflation parameter $\alpha$ are used. The pruning in the last step 4 attempts to keep the matrix $P$ as sparse as possible.
The grain graph of Fig. 2(a) after a single run of the steps 1-4 is depicted in Fig. 2(b). It is observed that by connecting all second order neighbors, the grain graph becomes much more connected. While new edges arise within the yellow, green or cyan clusters, they also unfortunately arise between grains that do not share a common parent orientation. Most notably, the grains $G_4$ and $G_5$ get connected with a high probability edge even though they cannot possibly originate from a common parent grain. This occurs because $G_4$ is connected to $G_5$ by two routes through $G_2$ and $G_8$, respectively. Although both routes contain low probability edges, they sum up to a high probability edge between $G_4$ and $G_5$. As a consequence, all nodes in our example appear as a single cluster. A complex situation as described above is one of the core reasons why the Markovian clustering algorithm sometimes generates very large clusters of child grains that do not necessarily agree on a common parent orientation.
By iterating this process, i.e., by considering $P^2$, $P^4$, $P^8$, up to second, fourth and eighth order neighbors are included in the probability matrix. As a consequence, the matrix $P$ includes more and more non-zero elements and, hence, becomes less sparse. The combination of the inflation and normalization steps ensures that during the iteration process for each node, higher probability edges become stronger while lower probability edges decay to zero. This keeps the matrix sparse and eventually ensures that the matrix separates into disconnected components by converging to an idempotent matrix with all entries being either 0 or 1.
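The following sketch illustrates one pass of expansion, inflation, normalization and pruning on a small dense matrix, and how two weak routes add up during expansion. The edge weights are hypothetical and are not taken from Fig. 2. The row-wise normalization, the pruning threshold `eps` and the use of dense lists instead of a sparse matrix are simplifying assumptions made for illustration only.

```python
def expand(P):
    """Expansion: matrix product P * P, summing over all intermediate nodes o."""
    n = len(P)
    return [[sum(P[m][o] * P[o][k] for o in range(n)) for k in range(n)]
            for m in range(n)]


def mcl_step(P, alpha=1.6, eps=1e-3):
    """One pass of expansion, inflation, normalization and pruning."""
    Q = expand(P)
    # Inflation: raise every entry to the power alpha.
    Q = [[q ** alpha for q in row] for row in Q]
    # Normalization: rescale each row to sum to one (row-wise is an assumption).
    normalized = []
    for row in Q:
        total = sum(row)
        normalized.append([q / total if total > 0 else 0.0 for q in row])
    # Pruning: drop entries below eps to keep the matrix sparse.
    return [[q if q >= eps else 0.0 for q in row] for row in normalized]


# Hypothetical weights: G4 is strongly linked to G2 and G8,
# which are both only weakly linked to G5.
names = ["G2", "G4", "G5", "G8"]
P = [[0.0, 0.9, 0.3, 0.0],
     [0.9, 0.0, 0.0, 0.9],
     [0.3, 0.0, 0.0, 0.3],
     [0.0, 0.9, 0.3, 0.0]]

i4, i5 = names.index("G4"), names.index("G5")
P2 = expand(P)
print("G4-G5 before expansion:", P[i4][i5])
print("G4-G5 after expansion: ", P2[i4][i5])  # two routes of 0.27 each

P_next = mcl_step(P)
print("row sums after one step:", [round(sum(row), 6) for row in P_next])
```

In this toy graph, $G_4$ and $G_5$ are not adjacent, yet the two routes via $G_2$ and $G_8$ each contribute 0.9 · 0.3 = 0.27 to the expanded entry, which is twice the weight of either route alone.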
Computing parent orientations. After the grain graph converges to an idempotent binary matrix, parent grains are constructed by merging all pairs of child grains $G_m$, $G_n$ with $P_{m,n} = 1$. The parent orientation $g_p$ of a grain merged from child grains $G_{m_1}, G_{m_2}, \ldots, G_{m_X}$ is usually computed by minimizing the mean misorientation angle to the best fitting potential parent orientations $g_{m_k}^j$ of the child grains. This is written as the optimization problem

$$\min_{g_p} \; \frac{1}{X} \sum_{k=1}^{X} \min_j \, \omega_p\bigl(g_p, g_{m_k}^j\bigr).$$

Unfortunately, the grain graph does not contain the information about the best fitting parent orientations $g_{m_k}^j$. This makes the optimization problem computationally expensive to solve. Usually, this step is the most time consuming in the grain graph approach to parent grain reconstruction.

Local neighbor level voting based parent grain reconstruction
In local neighbor level voting based reconstruction algorithms, votes for each potential parent orientation are collected from the neighboring grains of each child grain. In the schematic example in Fig. 2(a), grain $G_2$ collects a yellow vote from $G_1$ and a green vote from $G_3$, while $G_3$ collects a yellow vote from $G_2$ and two cyan votes from $G_4$ and $G_5$.
In order to formalize the voting, the misorientation angles between potential parent orientation variants of the child grains are considered. If a grain pair comprising a child grain $G_m$ and a neighboring parent grain $G_n$ is considered, the misorientation angles $\omega_{m,n}^i$ between the potential parent orientations $g_m^i$ of $G_m$ and the parent orientation $g_n$ of $G_n$ are denoted by

$$\omega_{m,n}^i = \omega_p(g_m^i, g_n).$$

Similar to the grain graph approach, we use a function $\Psi$ to transform the misorientation angles $\omega_{m,n}^i$ into values $P_{m,n}^i = \Psi(\omega_{m,n}^i)$ that approach zero if the misorientation angle exceeds a certain threshold or are close to one for small misorientation angles. The value $P_{m,n}^i$ is interpreted as the voting weight for the parent orientation $g_m^i$ of the child grain $G_m$. In this sense, the neighboring parent grain $G_n$ generates a voting weight $P_{m,n}^i$ for each potential parent orientation $g_m^i$ of the child grain $G_m$. Depending on the threshold of the function $\Psi$, most of these weights will be zero.
In the case of a pair $G_m$, $G_n$ of neighboring child grains, we have the misorientation angles

$$\omega_{m,n}^{i,j} = \omega_p(g_m^i, g_n^j)$$

between all possible combinations of potential parent orientations $g_m^i$ of $G_m$ and potential parent orientations $g_n^j$ of $G_n$. Again, the function $\Psi$ is used to transform the misorientation angles into voting weights. More specifically, for a potential parent orientation $g_m^i$ of grain $G_m$, the grain $G_n$ generates a vote whose weight $P_{m,n}^i$ is obtained by applying $\Psi$ to the misorientation angles $\omega_{m,n}^{i,j}$. Depending on the threshold angle of the function $\Psi$ and the orientation relationship between the two child orientations $g_m$ and $g_n$, most of the voting weights will be zero.
By summing the voting weights $P_{m,n}^i$ separately for each potential parent orientation $g_m^i$ of $G_m$ over all neighboring grains $G_n$, we obtain the final voting weights

$$P_m^i = \sum_n P_{m,n}^i.$$

If it is momentarily assumed that the function $\Psi$ is simply a cutoff function, i.e., $\Psi(\omega) = 0$ if $\omega$ is above a certain threshold and $\Psi(\omega) = 1$ otherwise, then $P_m^i$ simply counts the number of neighboring grains that have a potential parent orientation close to $g_m^i$. In order to make the voting weights comparable across different child grains, the voting weights are often normalized by

$$P_m^i \leftarrow \frac{P_m^i}{\sum_{i'} P_m^{i'}},$$

such that the sum of all votes for any child grain $G_m$ satisfies $\sum_i P_m^i = 1$. In its simplest form, a voting algorithm requires the selection of the parent orientation $g_m^i$ with the highest voting weight $P_m^i$ for each child grain $G_m$. More involved schemes may assign parent orientations only to those child grains for which the highest voting weight exceeds a certain threshold, or for which the difference between the highest and second highest voting weight is sufficiently large. After parent orientation assignment, the process is iterated with equal or relaxed criteria. In fully transformed microstructures, the iteration process is often started by considering only child-child grain pairs, which is termed the nucleation step, followed by several growth steps where only parent-child grain pairs are considered.
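The summation, normalization and selection described above can be sketched as follows, using the cutoff form of $\Psi$ for simplicity. The input layout `misfits[n][i]`, the function names and the acceptance criterion `min_share` are hypothetical choices made for illustration. The misfit angles are invented and assumed to be precomputed for each neighbor and each potential parent orientation.

```python
def cutoff(omega, threshold=5.0):
    """Cutoff version of Psi: 1 up to the threshold angle (degrees), else 0."""
    return 1.0 if omega <= threshold else 0.0


def vote(misfits, weight=cutoff, min_share=0.5):
    """Return (winning variant index or None, normalized voting weights).

    misfits[n][i] is the misfit angle between potential parent orientation i
    of the child grain and the closest parent orientation offered by neighbor n.
    """
    n_variants = len(misfits[0])
    # Sum the voting weights over all neighbors, separately for each variant.
    totals = [sum(weight(row[i]) for row in misfits) for i in range(n_variants)]
    norm = sum(totals)
    if norm == 0:
        return None, totals
    # Normalize so that the votes of this child grain sum to one.
    shares = [t / norm for t in totals]
    best = max(range(n_variants), key=lambda i: shares[i])
    # Only assign a parent orientation if the winner is sufficiently dominant.
    return (best if shares[best] >= min_share else None), shares


# Four neighbors voting on three potential parent orientations (toy values).
misfits = [[2.0, 40.0, 55.0],
           [3.1, 38.0, 4.0],
           [1.2, 60.0, 47.0],
           [50.0, 2.5, 33.0]]
winner, shares = vote(misfits)
print(winner, shares)
```

With the cutoff function, the unnormalized weights simply count supporting neighbors (three, one and one in this toy case), so the first potential parent orientation wins with a normalized weight of 0.6.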

Advantages and drawbacks of established parent grain reconstruction methods
The global grain graph and the local neighbor level voting approaches to parent grain reconstruction described in the above two sub-sections have complementary advantages and limitations. While the grain graph algorithm considers only the best fitting combination of parent orientations for each pair of neighboring grains, the voting based algorithm considers all possible parent orientations for a child grain and accordingly, all possible fits to the neighboring grains. Furthermore, while the grain graph approach also considers higher-order neighbors, i.e. grains that are further away, the voting based approach only considers first-order neighbors and hence makes its choice based on local and immediate neighbor information.
The aim of the new and hybrid variant graph approach is to combine the advantages of the two approaches while overcoming their shortcomings. In this regard, Section 3.1 refers to one of the central assumptions of the conventional Markovian clustering algorithm, which is as follows.
Consider that $P_{m,o}$ is the probability that the two child grains $G_m$ and $G_o$ belong to the same parent grain and $P_{o,n}$ is the probability that the child grains $G_o$ and $G_n$ belong to the same parent grain. In this case, the product $P_{m,o} \cdot P_{o,n}$ is the probability that the grains $G_m$ and $G_n$ belong to the same parent grain. Generally speaking, this assumption is not true. It may happen that $G_m$ and $G_o$ agree on a common parent orientation which may be completely different to the parent orientation that $G_n$ and $G_o$ agree on. In such a situation, the true probability $P_{m,n}$ is much lower than $P_{m,o} \cdot P_{o,n}$. The consequence of this condition not being satisfied is frequently observed when the conventional Markovian clustering algorithm detects potential parent grains where no reasonable grain orientation is assigned [7].
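The failure of this assumption can be illustrated with a toy calculation. In the sketch below, orientations are replaced by scalar angles and the misorientation by their absolute difference, which is a deliberate simplification of orientation space. The variant angles, the helper names and the error-function scaling of `psi` are all hypothetical.

```python
import math


def psi(omega, delta=5.0, sigma=1.5):
    """Assumed error-function weighting: 0.5 at delta, 0.75 at delta - sigma."""
    return 0.5 * (1.0 - math.erf(0.4769362762044699 * (omega - delta) / sigma))


def best_misfit(variants_a, variants_b):
    """Smallest misfit over all pairs of potential parent orientations."""
    return min(abs(a - b) for a in variants_a for b in variants_b)


# Toy potential parent orientations (scalar angles in degrees) of three grains.
variants = {
    "Gm": [10.0, 100.0, 200.0],
    "Go": [11.0, 150.0, 250.0],   # agrees with Gm near 10 and with Gn near 150
    "Gn": [60.0, 151.0, 300.0],
}

p_mo = psi(best_misfit(variants["Gm"], variants["Go"]))
p_on = psi(best_misfit(variants["Go"], variants["Gn"]))
p_mn = psi(best_misfit(variants["Gm"], variants["Gn"]))

print(f"P_mo * P_on = {p_mo * p_on:.3f}")
print(f"direct P_mn = {p_mn:.3f}")
```

Here $G_o$ fits $G_m$ through one of its potential parent orientations and $G_n$ through a different one, so the product of the two edge probabilities is high although $G_m$ and $G_n$ share no potential parent orientation.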
This issue is highlighted by showing the result of applying Markovian clustering to segment the grain graph generated from the example data set in Fig. 3. The edge weights for the grain graph were determined by Equ. (3.2) using the parameters δ = 5° and σ = 1.5°. The graph was then clustered in two separate calculations, first with the inflation parameter α = 1.2 and then with α = 1.6. After convergence, the parent orientations were computed in a separate step as outlined in Section 3.1. Grains that were successfully reconstructed to a parent orientation are shown using inverse pole figure colors while the grains for which no parent orientation was found are shown in white. The clusters formed by the algorithm are marked by thick black boundaries.
Fig. 3(b) clearly shows the effect of using a very low value for the inflation parameter. Under-clustering results in several clusters that extend beyond the boundaries of the parent austenite grain, and the determined orientations are insufficient to account for the observed martensite orientations. The latter is also shown by the pole figure below in Fig. 3(e). Conversely, using a larger inflation parameter results in over-clustering of the parent austenite structure (Fig. 3(c)). As opposed to Fig. 3(b), over-clustering enables the calculation of a larger number of parent orientations, with more than 97% of child grains undergoing reconstruction. The latter also account for the observed martensite orientations in Fig. 3(e). However, the biggest disadvantage of over-clustering is the increase in the number of optimization problems that need to be resolved in the second reconstruction step. In turn, this issue increases the amount of computational resources needed for reconstruction. A second consideration is that extreme over-clustering results in clusters comprising only one or two grains, resulting in poorly defined parent orientations.
On the other hand, local neighbor level voting based reconstruction applies variant level information between neighboring grain pairs, thus ensuring a locally viable solution for each reconstructed parent austenite grain. However, this locally viable solution is reached after applying either: (i) a nominal orientation relationship referenced from the literature, or (ii) an average orientation relationship determined for the entire data set. Previous work has shown that there is considerable local variation in the orientation relationship between austenite and martensite [20,21] such that applying a representative average orientation relationship could result in the local misindexing of parent austenite variants.
Fig. 4 shows the result of reconstruction using the voting algorithm. For each child grain a parent orientation was determined based on the votes of neighboring child grains, as outlined in Section 3.2. The voting based reconstruction returned a high fraction of local twinned orientations (pink boundaries), seen as inclusions within larger grains. In addition, although several parent austenite orientations are indexed as they produce locally viable solutions, they do not correspond well with the observed martensite orientations. This aspect is clearly seen in the pole figure in Fig. 4(b).
In summary, while the grain graph algorithm is able to efficiently cluster the data into segments, it is prone to under- and over-clustering as it lacks information at the level of individual parent variants. Meanwhile, applying variant level information locally in voting based algorithms results in frequent misindexing, especially when nominal or averaged orientation relationships are used. As detailed next in Section 4, the new, hybrid variant graph approach eliminates the shortcomings identified in these two algorithms by combining variant level information with larger-scale clustering.

The variant graph
The variant graph is a generalization of the grain graph. In the grain graph each child grain is represented by a single node $G_m$. Alternatively, the variant graph contains as many nodes $g_m^i$ as the number of potential parent orientation variants allowed by an orientation relationship for each child grain $G_m$. In both methods, the parent grains appear as single nodes $g_m$.
Two nodes in the variant graph are connected by an edge if two conditions are met: (i) they correspond to adjacent grains, and (ii) their misorientation angle is below a certain threshold. The weighting of these edges is computed from the misorientation angle between the node orientations using the function $\Psi$, cf. Equ. (3.2). More precisely, for two adjacent child nodes $g_m^i$, $g_n^j$, the edge weight is

$$P_{m,n}^{i,j} = \Psi\bigl(\omega_p(g_m^i, g_n^j)\bigr)$$

and for a pair $g_m^i$, $g_n$ of adjacent child-parent nodes, the edge weight is

$$P_{m,n}^{i} = \Psi\bigl(\omega_p(g_m^i, g_n)\bigr).$$

Figure 5: The variant graph corresponding to the grain graph in Fig. 2(a). Each child grain $G_n$ is represented by three nodes $g_n^1$, $g_n^2$ and $g_n^3$ which refer to the three potential parent orientations. An exception is the square parent grain $G_6$ which appears as a single parent orientation $g_6$. In Fig. 5(a) the nodes are connected by an edge if the child grains are adjacent and the misorientation angle between the potential parent orientations is below a certain threshold. Edges are not restricted to the same variant number, as the edge between the orientations $g_5^3$ and $g_9^2$ illustrates. Fig. 5(b) displays the variant graph after the first expansion step.

The variant graph for the example in Fig. 2(a) is shown in Fig. 5(a). Since this simplified example only considers three parent variants, every child grain appears as three nodes $g_n^1$, $g_n^2$ and $g_n^3$. Meanwhile, the number of edges has only increased by one. The additional edge stems from the two edges $g_5^1 - g_9^2$ and $g_9^3 - g_5^3$ connecting the grains $G_5$ and $G_9$. Using the variant graph approach, the clusters of green, yellow and cyan parent orientations are far more separated compared to the grain graph.
Here the crucial distinction in the theories behind the grain and variant graph approaches is emphasized:
Grain graph: In the grain graph approach, two adjacent grains $G_m$ and $G_n$ are connected by an edge with a weight $P_{m,n}$ if there is a pair of parent orientations $g_m^i$, $g_n^j$ among all possible parent orientations of grain $G_m$ and all possible parent orientations of grain $G_n$ such that the misorientation angle $\omega_{m,n}^{i,j} = \omega_p(g_m^i, g_n^j)$ is below a certain threshold. However, the critical information on the specific best fitting pair $i, j$ is not stored.
Variant graph: In the variant graph approach, all possible parent orientations $g_m^i$ for each child grain $G_m$ are stored. Two possible parent orientations $g_m^i$ and $g_n^j$ are connected by an edge with a weight $P_{m,n}^{i,j}$ if the corresponding grains $G_m$ and $G_n$ are adjacent and the misorientation angle $\omega_p(g_m^i, g_n^j)$ is below a certain threshold.
In a sense, the variant graph is a generalization of the grain graph: the latter may be derived directly from the former by collapsing the many variant nodes $g_m^i$ into single nodes $G_m$ and by replacing the edge probabilities $P_{m,n}^{i,j}$ between the variant nodes with their maximum over all possible variant pairs, which in turn gives the edge probabilities of the grain graph. At first glance, the variant graph may appear to be much larger than the grain graph and hence may be construed as numerically unwieldy to work with. However, the number of stored edges in the variant graph is of similar magnitude to that of the grain graph. More specifically, if the misorientation angle threshold is chosen such that for any pair of neighboring grains $G_m$, $G_n$ only the best fitting pair $g_m^i$, $g_n^j$ of parent orientations is connected by an edge in the variant graph, then the number of edges in the variant graph is equal to the number of edges in the grain graph. Since the amount of memory required to store a graph as a sparse adjacency matrix depends only on the number of edges, it is safely concluded that the variant graph is computationally manageable.
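The collapse of a variant graph into a grain graph described above can be sketched in a few lines. The dictionary layout, the function name `collapse_to_grain_graph` and the edge weights are assumptions made for illustration only.

```python
def collapse_to_grain_graph(variant_graph):
    """Collapse variant nodes (grain, variant) into plain grain nodes.

    variant_graph maps ((m, i), (n, j)) to an edge probability. The grain
    graph keeps, for each grain pair (m, n), the maximum over all (i, j).
    """
    grain_graph = {}
    for ((m, _i), (n, _j)), prob in variant_graph.items():
        if m == n:
            continue  # edges between variants of one grain have no grain-graph counterpart
        pair = (min(m, n), max(m, n))
        grain_graph[pair] = max(grain_graph.get(pair, 0.0), prob)
    return grain_graph

# Hypothetical variant graph: grains 5 and 9 are linked by two variant pairs.
variant_graph = {
    ((5, 1), (9, 2)): 0.99,
    ((5, 3), (9, 3)): 0.92,
    ((4, 2), (5, 2)): 0.80,
}
grain_graph = collapse_to_grain_graph(variant_graph)
print(grain_graph)
```

In this toy case the variant graph holds exactly one edge more than the grain graph, while the information on which variant pair produced each edge is lost after the collapse.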
One of the main advantages of the variant graph is that information about second and third best fits is utilized as well. The inclusion of higher order fits significantly improves the quality of the reconstructed parent grains. However, this comes at a price, with higher memory usage requirements that scale linearly with the number of considered fits.
Similar to the grain graph, the variant graph is symmetric at the beginning as $P_{m,n}^{i,j} = P_{n,m}^{j,i}$, but this property is lost during the clustering process.

Generalization of the Markovian clustering algorithm
The idea behind the Markovian clustering algorithm is to compute the probabilities of random walks throughout the grain graph. Therefore, two steps are crucial: (i) the expansion step 1, and (ii) the normalization step 3. These two steps are applied to the variant graph in a straightforward manner.
In the expansion step 1 of the grain graph, a probability $\tilde{P}_{m,n}$ that two grains $G_m$, $G_n$ belong to a common parent grain is computed by summing the products of such probabilities $P_{m,o} P_{o,n}$ with respect to all middle grains $G_o$. In the variant graph, by contrast, only those products of probabilities $P_{m,o}^{i,k} P_{o,n}^{k,j}$ that agree with the parent orientation $g_o^k$ of the middle grain are summed up. Since all middle grains and all parent variants need consideration, the sum enlarges to

$$\tilde{P}_{m,n}^{i,j} = \sum_{o} \sum_{k} P_{m,o}^{i,k} \, P_{o,n}^{k,j}. \qquad (4.1)$$

It should be noted that although Equ. (4.1) contains two summation indices, the expansion step is simply the matrix product $\tilde{P} = P \cdot P$.
In order to turn the expanded matrix $\tilde{P} = P^2$ again into a probability matrix, the normalization step 3 of the conventional Markovian clustering algorithm is required. For the variant graph this is

$$P_{m,n}^{i,j} \leftarrow \frac{\tilde{P}_{m,n}^{i,j}}{\sum_{n'} \sum_{i',j'} \tilde{P}_{m,n'}^{i',j'}}.$$

The normalization ensures that for any grain $G_m$, the total sum of probabilities for its potential parent orientations and all neighbouring potential parent orientations is one, i.e.,

$$\sum_{n} \sum_{i,j} P_{m,n}^{i,j} = 1.$$

The purpose of the remaining two steps of the Markovian clustering algorithm, (iii) the inflation step 2, and (iv) the pruning step 4, is to keep the grain graph sparse and enforce its convergence to an idempotent matrix. Both steps are generalized to the variant graph as

$$P_{m,n}^{i,j} \leftarrow \left(P_{m,n}^{i,j}\right)^{\alpha} \quad \text{and} \quad P_{m,n}^{i,j} \leftarrow 0 \ \text{ if } \ P_{m,n}^{i,j} < \delta.$$
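The four steps can be put together in a minimal sketch of one clustering iteration. It is written under several assumptions: a dense list-of-lists matrix is used for readability (a practical implementation would rely on sparse matrices), rows and columns are indexed as grain × number of variants + variant, the pruning threshold is called `prune` to keep it apart from the angular threshold, and the starting weights are invented.

```python
def mcl_step(P, n_variants, alpha=1.05, prune=1e-4):
    """One generalized Markovian clustering iteration on a dense matrix.

    ASSUMED layout: row/column index = grain * n_variants + variant.
    """
    size = len(P)
    # Step 1, expansion: plain matrix product P * P.
    E = [[sum(P[r][k] * P[k][c] for k in range(size)) for c in range(size)]
         for r in range(size)]
    # Step 2, inflation: element-wise power alpha.
    E = [[x ** alpha for x in row] for row in E]
    # Step 3, normalization: all entries that belong to one grain sum to one.
    for start in range(0, size, n_variants):
        total = sum(sum(E[r]) for r in range(start, start + n_variants))
        if total > 0.0:
            for r in range(start, start + n_variants):
                E[r] = [x / total for x in E[r]]
    # Step 4, pruning: drop entries below the threshold.
    return [[x if x >= prune else 0.0 for x in row] for row in E]

# Two child grains with two candidate variants each. Variant 0 of grain 0
# is linked to variant 1 of grain 1 (hypothetical weights).
P0 = [
    [0.5, 0.0, 0.0, 0.9],
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0],
    [0.9, 0.0, 0.0, 0.5],
]
P1 = mcl_step(P0, n_variants=2)
```

After one iteration the connected variants carry most of the probability mass of their grains, while the isolated variants retain only their self loops.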

Computing parent orientations
Unlike the grain graph approach, no optimization problems need to be solved in order to determine parent orientations. In fact, once the generalized Markovian clustering algorithm has converged, the edge probabilities $P_{m_0,n}^{i,j}$ for a certain child grain $G_{m_0}$ are all zero except for a single entry $P_{m_0,n^*}^{i^*,j^*} = 1$. This single edge indicates that the grains $G_{m_0}$ and $G_{n^*}$ belong to a common parent grain with the parent orientation of $G_{m_0}$ being $g_{m_0}^{i^*}$ and the parent orientation of $G_{n^*}$ being $g_{n^*}^{j^*}$. In fact, even the requirement of full convergence for the generalized Markovian clustering may be relaxed. In this case, for all possible parent orientations $g_m^i$ of a fixed child grain $G_m$, the most robust solution is to compute the sum of the edge probabilities to all other grains, i.e.,

$$P_m^i = \sum_{n} \sum_{j} P_{m,n}^{i,j}, \qquad (4.2)$$

and select the parent orientation $g_m^i$ corresponding to the largest value of $P_m^i$. When normalized to one by setting

$$P_m^i \leftarrow \frac{P_m^i}{\sum_{i'} P_m^{i'}},$$

the values are interpreted as probabilities voting for the parent orientation $g_m^i$ of the child grain $G_m$. In this respect, the variant graph directly resembles the voting based parent grain reconstruction approach from Section 3.2.
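The vote accumulation can be sketched as follows. The dense matrix layout (index = grain × number of variants + variant), the function names and the matrix entries are assumptions. The sketch also reads "all other grains" literally and skips the block belonging to the grain itself, which is an interpretation rather than a statement of the implemented behavior.

```python
def parent_votes(P, n_variants):
    """Normalized votes for each candidate parent variant of each child grain."""
    votes = []
    for start in range(0, len(P), n_variants):
        own = range(start, start + n_variants)
        # Sum the edge probabilities of each variant to all other grains.
        raw = [sum(x for c, x in enumerate(P[r]) if c not in own) for r in own]
        total = sum(raw)
        votes.append([v / total if total > 0.0 else 0.0 for v in raw])
    return votes

def best_variant(vote):
    """Index of the candidate parent variant with the largest vote."""
    return max(range(len(vote)), key=lambda i: vote[i])

# Hypothetical, not yet converged matrix for two grains with two variants each.
P = [
    [0.3, 0.0, 0.0, 0.4],
    [0.0, 0.2, 0.1, 0.0],
    [0.0, 0.1, 0.2, 0.0],
    [0.4, 0.0, 0.0, 0.3],
]
votes = parent_votes(P, n_variants=2)
winners = [best_variant(v) for v in votes]
print(winners)
```

Because the votes are kept as probabilities, grains whose two largest votes are nearly equal can be flagged as ambiguous instead of being forced into a single parent orientation.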
This final step of parent orientation determination is much faster than the corresponding step in the grain graph approach.

Variant graph clustering of the example data set
Fig. 6 shows the results of parent grain reconstruction using the variant graph algorithm, along with the reconstructed orientations and calculated martensite variants on (001) pole figures. The edge weights for the variant graph were determined in a similar manner as the grain graph using Equ. (3.2) and the parameters δ = 5° and σ = 1.5°. The variant graph was then run through the Markovian clustering algorithm using an inflation parameter α = 1.05. The graph was used to reconstruct parent austenite orientations after 3 and 10 iterations.
Using the variant graph algorithm, all the grains in the original map have been successfully assigned a parent austenite orientation. In addition, there is no spilling over of clusters to neighboring parent grains as seen when using the grain graph in Fig. 3(b). Since the variant graph stores the orientations as well as the probability value for each edge, there is no chance of such spillover. Inspection of the pole figure indicates that the variants calculated from the reconstructed parent orientations account for each martensite orientation in the original grain map after only 3 iterations. Fig. 6(c) shows the result of parent grain reconstruction after 7 more iterations, i.e., for a total of 10 iterations. At this point, all child grains are assigned parent austenite orientations and all martensite orientations are accounted for. There is, however, a slight difference in the regions assigned to the various parent orientations.
The progress of the variant graph algorithm for a selected grain in the martensite grain map is shown in Fig. 7. The upper figure shows a selected grain in the upper left-hand region of the prior austenite grain shown in Fig. 6, as well as the neighboring grains for which variant pairs are found that meet the threshold requirement for the creation of connecting edges. The edges are IPF color coded corresponding to the parent austenite orientation of each edge. The lower figure shows the IPF colored prior austenite orientations assigned to each grain after accumulating votes according to Section 4.3.
Initially, as shown in Fig. 7(a), the selected grain is only connected to its immediate neighbors by edges formed by pairs of variants. At this stage, 22 nodes are connected by a total of 162 edges, implying that multiple variant combinations were found to satisfy the threshold criterion for each pair of nodes. In lath martensite, it is typical that in each grain, the threshold requirement is met by two variants with low misorientation that make up its block structure (V1-V4, V2-V5 and V3-V6 pairing according to Ref. [9]). In addition, the variant combinations corresponding to an annealing twin for the best fitting candidate orientation typically meet the threshold requirement, as well as its closest variant pairs. This indicates that as many as sixteen edges corresponding to different variant combinations may potentially be found that connect each pair of nodes in the graph.
After 3 iterations, new edges are created to account for second- and third-degree neighbors. The number of edges has decreased relative to the number of nodes, as the weaker connections are pruned by the inflation step of the algorithm. Observation of the IPF coloring of the edges indicates that two colors appear to dominate and correspond to twinned prior austenite orientations. After 10 iterations, tenth-order neighbors would nominally be considered for the selected grain. However, this is not possible because the edges must be connected by variants that produce a mutually acceptable parent orientation. At this point, all possible edges for the selected grain have been found and additional iterations only result in a gradual pruning of weaker edges. Thus, it is clear that the clusters of orientations cannot grow too large even if the inflation parameter is set to 1. The IPF coloring of the edges after 10 iterations indicates that the selected grain would give a good match with either of the twinned prior austenite grains in the immediate vicinity.
As opposed to the grain graph, the variant graph is able to find and preserve the information on every potential prior austenite solution for each original grain in the graph. A simple accumulation of votes as outlined in Section 3.2 is then enough to produce a prior austenite orientation map that is, based on visual observation, morphologically sound when it comes to non-twinned boundaries, as well as being able to reliably satisfy each individual martensite grain with a suitable parent orientation. As shown by the prior austenite orientation map in Fig. 7(c), some ambiguously indexed, small prior austenite grains remain in local regions when twinned prior austenite grains that are in close proximity share martensite variant orientations in the graph.
Fig. 8 shows the convergence behavior of the number of non-zero edges at different inflation parameters. As mentioned previously, for an inflation parameter of 1, the convergence of the Markovian graph is not enforced. For this value, the number of edges increases quickly, reaches its maximum after 11 iterations and then slowly decays towards a constant value. This value is much larger than the number of child grains, meaning that the convergence is not towards a single parent variant per child grain but rather towards a steady state in the computation.
Slightly increasing the inflation parameter to 1.02 or 1.05 leads to convergence to a low number of non-zero edges. This means the algorithm eventually decides on one possible parent orientation per martensite grain. Increasing the inflation parameter even further to 1.1 and then up to 1.4 leads to extremely rapid convergence for a low maximum value of non-zero edges. Also, the maximum number of non-zero edges is reached after fewer iterations with higher inflation parameter values. This scenario corresponds to a locally restricted search for the best possible parent orientation.
The range of optimal inflation parameter values, denoting a good trade-off between computational efficiency and an accurate reconstruction that considers many neighboring grains and alternative solutions, is between 1.05 and 1.1. No significant changes in the reconstructed parent microstructure are visible for continued iterations after the maximum number of non-zero edges is reached. Further iterations mostly lead to the deletion of already unfavorable edges and therefore do not affect the final result. Stopping the computation when the maximum number of edges is reached is therefore recommended to avoid unnecessary additional iterations and to keep the information on alternative solutions for the best fitting parent orientation.

Discussion
In this section, a couple of technical details are discussed, following which extensions to the variant graph approach are presented. The computational performance of the new, hybrid method is also reported.

Pure random walks
As noted in Section 4.2, the parent orientations are computed even if the Markovian clustering algorithm has not converged. In the specific case that the inflation parameter α is set to 1, the sum $P_m^i$ of the edge weights $P_{m,n}^{i,j}$ derived in Equ. (4.2) is interpreted as a generalization of the voting weights defined in Equ. (3.5). The major difference is that in conventional local neighbor level voting based parent grain reconstruction, the weights include only first order neighbors, while in the variant graph approach, neighbors up to order $2^n$, where $n$ is the number of iterations, are considered. This explains why the variant graph approach gives better results even for very small $n$. In the corner case of $n = 0$, it simply resembles the conventional neighbor level voting based algorithm.
The advantage of running the variant graph algorithm with an inflation parameter α = 1 and small $n$ is that the resulting voting weight $P_m^i$ is used efficiently to identify child grains when multiple parent orientations are assigned with similar probability. The drawback of setting α = 1 is that the number of non-zero edges in the variant graph increases with the number of iterations. This makes the algorithm infeasible for large grain maps and many iterations.
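The doubling of the neighbor order with each expansion can be checked on a toy example. The sketch below does not use a variant graph; it assumes a simple chain of 20 nodes, each linked to itself and to its two direct neighbors, and applies only expansion and normalization (α = 1, no pruning). All names are illustrative.

```python
def matmul(A, B):
    """Dense matrix product of two square list-of-lists matrices."""
    size = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]

def normalize_rows(P):
    """Scale every row so that it sums to one."""
    return [[x / total for x in row] for row, total in ((r, sum(r)) for r in P)]

def neighbor_order_reached(n_iter, size=20, start=0):
    """Largest neighbor order linked to node `start` after n_iter expansions."""
    # Chain of nodes: each node is linked to itself and its direct neighbors.
    P = [[1.0 if abs(r - c) <= 1 else 0.0 for c in range(size)]
         for r in range(size)]
    for _ in range(n_iter):
        P = normalize_rows(matmul(P, P))  # expansion, then normalization
    return max(c for c, x in enumerate(P[start]) if x > 0.0) - start

orders = [neighbor_order_reached(n) for n in range(5)]
print(orders)  # [1, 2, 4, 8, 16]
```

The case of zero iterations reaches first order neighbors only, which corresponds to the conventional neighbor level voting mentioned above.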

Diagonal entries
The diagonal entries $P_{m,m}^{i,i}$ of the variant graph matrix correspond to self loops of each node of the variant graph. In the grain graph, the weights of these self loops are initialized to 1, thus making it equally probable to start a random walk at each node. In the case of the variant graph, the weights are initialized to $P_{m,m}^{i,i} = 1/|S_c|$, which makes it equally probable to start a random walk with every possible parent orientation variant.
The variant graph also contains so-called pseudo diagonal entries $P_{m,m}^{i,j}$, which correspond to edges connecting different parent variants of the same child grain. Those edges are interpreted as follows: assuming a certain probability $P_m^i$ for parent orientation variant $g_m^i$ and a second, very similarly oriented parent orientation variant $g_m^j$, then a certain probability for $g_m^j$, namely $P_m^i \cdot P_{m,m}^{i,j}$, also exists. Pairs of similarly oriented parent variants appear frequently for experimentally derived and refined irrational orientation relationships that are close to but do not coincide with an ideal rational orientation relationship.
Although these pseudo diagonal entries possess a physical interpretation, their inclusion in the variant graph is not preferred.

Specific morphological conditions
Up to this point, the initial probability $P_{m,n}^{i,j}$ that two child grains belong to a common parent grain is based solely on the misorientation angle $\omega_p(g_m^i, g_n^j)$ between the potential parent orientations. However, morphological information may also contribute to this probability.
As an example, consider the boundary curvature $\kappa$ and denote by $\kappa_{m,n}$ its average over all boundary segments separating the grains $G_m$ and $G_n$. The misorientation based weights $P_{m,n}^{i,j}$ are then updated according to the curvature $\kappa_{m,n}$, where $\beta$ is a modelling parameter that controls the influence of the curvature on the final weight. The effect of this modification is that straight boundaries are more likely than curved boundaries to be chosen by the algorithm as parent boundaries. This is especially helpful if the algorithm is presented with an ambiguous situation of finding the correct twin boundary. It is emphasized that any other morphological criterion can also be implemented, as the framework for parent grain reconstruction is a programmatic and fully customizable implementation [7].

Merging variants with small mutual disorientation angles
An effective means of reducing memory requirements and computation time is merging closely related variants, i.e., variants that have a small misorientation to each other. For the example data set of austenite to α martensite transformation, the number of variants is reduced from 24 to 12, which reduces the number of edges by a factor of four. This four-fold reduction in the number of edges reduces the computation time and memory requirements by roughly a factor of four as well. While it may seem that valuable information about the microstructure is discarded by this step, it is worth noting that child variants with low misorientation to each other are usually paired as neighboring laths and are often not detected during grain reconstruction by the angular threshold criterion in the first place. Inspection of the morphology of the grain map in Fig. 6 reveals many irregularly shaped grains, which significantly deviate from a lath-like morphology. This suggests that rather than individual laths, the grains represent blocks comprising multiple variants with a low mutual disorientation angle. This observation is made even for a relatively low angular threshold value of 3°. If it is assumed that variant level precision is lost when the initial grain map is constructed, it follows that the incorporation of full variant level precision into the edges of a variant graph for clustering is simply not necessary. While this assertion definitely bears out in the present and prominently difficult case of α martensite microstructures in steel, it should be re-examined on other data sets with different parent-child symmetry combinations.

Fig. 9 shows the effect of merging closely related parent variants on the number of edges and the final reconstruction result for the example data set. Using a threshold value δ = 8.5° to determine candidate parent variants, a total of 8 edges are established between grains A and B (Fig. 9(a)). Fig. 9(b) shows the variant indices and the misorientation angle between the parent variants for each individual edge. Closer inspection reveals that the edges represent closely related parent variants forming two clusters of twin-related parent orientations (see (111) pole figure in Fig. 9(c)). In Fig. 9(d) and (e), merging closely related parent variants reduces the number of edges between grains A and B from eight to two. In the present case, the lowest misorientation angle between the edges formed by closely related variants represents the strength of the combined edge.
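The merging of edges between two grains can be sketched as follows. The variant numbers, the grouping into merged labels and the angles are invented for illustration, and `merge_edges` is a hypothetical helper. Only the rule that the combined edge keeps the lowest misorientation angle follows the description above.

```python
def merge_edges(edges, merged_label):
    """Merge edges whose end points map to the same merged parent variants.

    edges maps (i, j) to the misorientation angle (degrees) between parent
    variant i of grain A and parent variant j of grain B. The combined edge
    keeps the lowest angle among the edges it replaces.
    """
    merged = {}
    for (i, j), angle in edges.items():
        key = (merged_label[i], merged_label[j])
        merged[key] = min(angle, merged.get(key, float("inf")))
    return merged

# Hypothetical grouping of closely related parent variants.
merged_label = {1: "P1", 2: "P1", 3: "P2", 4: "P2"}

# Eight hypothetical edges between grains A and B forming two clusters.
edges_AB = {
    (1, 1): 1.2, (1, 2): 3.9, (2, 1): 4.4, (2, 2): 2.0,
    (3, 3): 0.8, (3, 4): 5.1, (4, 3): 6.3, (4, 4): 2.7,
}
merged_AB = merge_edges(edges_AB, merged_label)
print(merged_AB)
```

In the same spirit, halving the number of candidate variants at both ends of an edge quarters the maximum number of variant pairs (24 × 24 versus 12 × 12), which is consistent with the factor of four quoted above.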
When closely related variants are merged according to this procedure, the resulting reconstruction returns a pair of closely related parent orientations for each child grain. The determination of the final parent orientation therefore requires an additional step to restore variant level precision. The voting algorithm outlined in Section 3.2 is efficient in choosing the locally best fitting parent variant out of the two options determined by the clustering algorithm. Fig. 9(f) shows the result of applying this procedure to the variant graph reconstruction result of the example data set. The threshold value for edge detection is δ = 5°, and the above merging procedure of closely related parent variants is applied. The inflation parameter is α = 1.05 and clustering is allowed to proceed for 10 iterations. The variant level detail is restored by applying the neighbor level voting algorithm from Section 3.2 to choose the locally best fitting variant from the two closely related variants. Fig. 9(f) shows good agreement with the reconstruction using all 24 variants in Fig. 6(c).

Performance of the variant graph algorithm
Comparing the performance of the grain graph method with the performance of the variant graph method is not straightforward. While the number of edges is significantly smaller in the grain graph, the calculation of parent orientations in the grain graph method requires a separate step and the additional post processing steps are generally non-optional. If only the orientation clusters are of interest, the grain graph algorithm is a quick and robust way of obtaining them. However, as shown in Fig. 3, these clusters rarely correspond to prior austenite grains. With the variant graph approach, the parent orientations are automatically obtained from the graph and only minor post processing is necessary. Running the variant graph algorithm with a large threshold value (as in the current example) yields a good result. However, when working with large data sets, hundreds of millions of edges are created, which in turn translates to high memory usage. In this case, a significant reduction in memory usage is obtained when the extension from Section 5.4, which merges variants with small mutual disorientation angles, is applied.

Fig. 10 shows the memory usage versus time for reconstructing the full example data set, a map of 4 million pixels (see Fig. 11). The plot compares the performance of the variant graph approach using all 24 variants versus 12 variants as per the adaptation from Section 5.4. As expected, the memory usage is significantly reduced when only 12 variants are used. The 24 variant implementation peaked at 62.2 GB whereas the 12 variant implementation only required 27.5 GB at its peak. The 12 variant implementation took 416 s on an Intel(R) Core(TM) i9-10920X processor and was thus 16 % faster than the 24 variant implementation.
The amount of memory required is directly proportional to the number of non-zero edges in the graph. The performance metrics presented in Fig. 10 are just an indicator of the code performance, with all reconstruction parameters kept constant for the different runs. In reality, parameters require tuning for optimal performance with each algorithm. While the code requires a fair amount of memory, its computational performance and efficiency make it an ideal candidate for virtual memory use when physical memory is insufficient. As a last resort, large data sets could be sectioned and reconstructed in smaller subsets to avoid out-of-memory errors.
Fig. 11 shows the reconstruction of the full example data set, consisting of 4 million data points. The reconstruction was undertaken using the variant graph approach with the extension presented in Section 5.4 and an inflation parameter α of 1.05. The reconstructed prior austenite grains are shown in IPF coloring superimposed onto the α martensite band contrast map with twin boundaries highlighted in white. No post processing was applied.
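The proportionality between memory and edge count can be made concrete with a rough estimate. The sketch assumes a compressed sparse column layout with 8-byte values and 8-byte indices. These sizes, the function name and the edge and node counts are assumptions, and the estimate covers a single stored matrix only, not the temporaries created during matrix products, which is why it falls well below the peak values reported above.

```python
def sparse_matrix_bytes(n_edges, n_nodes, value_bytes=8, index_bytes=8):
    """Rough size of one sparse adjacency matrix in bytes.

    ASSUMPTION: compressed sparse column storage, i.e. one value and one
    row index per stored edge plus one column pointer per node (+1).
    """
    return n_edges * (value_bytes + index_bytes) + (n_nodes + 1) * index_bytes

# Hypothetical graph sizes: doubling the edges roughly doubles the memory.
small = sparse_matrix_bytes(100_000_000, 1_000_000)
large = sparse_matrix_bytes(200_000_000, 1_000_000)
print(small / 1e9, large / 1e9)
```

The node-dependent term is negligible once the graph holds many more edges than nodes, so the edge count dominates the estimate.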

Conclusions
This study introduces the new variant graph algorithm for improved parent grain reconstructions from orientation maps of partially or fully phase transformed microstructures. Using the well-known and challenging example of low-carbon lath martensite steel, the algorithm's inherent accuracy in reconstructing parent austenite grains and boundaries is showcased and its computational performance is assessed for a 4 million point data set.
The variant graph is capable of reconstructing transformation microstructures from any parent-child combination. It is a hybrid algorithm that combines the generalization of the global grain graph with the strengths of the local neighbor level voting based approach. The key advantage of the variant graph is its ability to store all possible parent orientations for each child grain such that following parent grain reconstruction, no additional post processing steps are necessary.
The unique advantages offered by programmatic extensions to the variant graph algorithm include: (i) the ability to account for specific morphological conditions other than the misorientation angle, like boundary curvature, when reconstruction algorithms are faced with ambiguous microstructures, and (ii) the merging of variants with small mutual disorientation angles as an effective means of reducing memory requirements and computation time for parent grain reconstruction.
The variant graph algorithm is implemented as an addition to the phase transformation analysis module in MTEX 5.8 and is freely available for download by the community.

Figure 1: (a, b) Band contrast maps overlaid with (a) the initial martensite inverse pole figure colors. The reconstructed parent austenite grain boundaries (sans Σ3 boundaries) are shown in black. (b) A single, heavily twinned parent austenite grain extracted from the map (marked by a white rectangle in (a)). (c) A (001) pole figure showing the martensite orientations in (b).

Figure 3: (a, b, c) Band contrast maps overlaid with (a) the initial martensite inverse pole figure colors for a single, heavily twinned parent austenite grain. The parent austenite grain map reconstructed using the grain graph algorithm after clustering with (b) α = 1.2 and (c) α = 1.6. (d, e, f) (001) pole figures showing (d) martensite orientations and (e, f) grain graph based parent austenite orientations with calculated martensite orientations overlaid onto measured martensite orientations for (e) α = 1.2 and (f) α = 1.6.

Figure 4: (a) Band contrast map overlaid with the parent austenite grain map reconstructed using the voting algorithm. (b) (001) pole figure showing martensite orientations, the reconstructed austenite orientations and the theoretical martensite orientations calculated from these.

Figure 5: The variant graph corresponding to the grain graph in Fig. 2(a). Each child grain $G_n$ is represented by three nodes $g_n^1$, $g_n^2$ and $g_n^3$ which refer to the three potential parent orientations. An exception is the square parent grain $G_6$ which appears as a single parent orientation $g_6$. In Fig. 5(a) the nodes are connected by an edge if the child grains are adjacent and the misorientation angle between the potential parent orientations is below a certain threshold. Edges are not restricted to the same variant number, as the edge between the orientations $g_5^3$ and $g_9^2$ illustrates. Fig. 5(b) displays the variant graph after the first expansion step.

Figure 6: (a, b, c) Band contrast overlaid with (a) the initial martensite grain map for a single, heavily twinned parent austenite grain and parent austenite grain map reconstructed with the variant graph algorithm after (b) 3 and (c) 10 iterations, along with (d, e, f) the (001) pole figures showing (d) martensite orientations and parent austenite orientations after (e) 3 and (f) 10 iterations with calculated martensite orientations overlaid on measured martensite orientations.

Figure 7: (a, b, c) Edges and nodes in a variant graph for a selected martensite grain and (d, e, f) the parent orientations from the variant graph algorithm after (a, d) 0, (b, e) 3 and (c, f) 10 iterations. The edges in the top row are IPF colored according to the parent orientation of each edge. The area marked with a dashed line in (a) is examined more closely in Fig. 9.

Figure 8: The convergence behavior of the number of non-zero edges in the variant graph as a function of the inflation parameter.

Figure 9: (a) A small section of the example data set (see Fig. 7(a)), showing IPF colored edges according to the mean parent orientation corresponding to the edge. (b) The IPF colored edges between neighboring child grains A and B labelled with the parent variant numbers for the two grains, separated by the angular deviation of the candidate parent variants. (c) (111) pole figure showing the candidate parent variant orientations. The shared (111) plane is highlighted with a red circle. (d) The edges between grain A and its neighboring grain after merging of closely related variants. (e) The number of edges between grains A and B has been reduced from eight to two by the merging process. (f) The final reconstruction result after 10 iterations of clustering following the merging of closely related variants, showing excellent agreement with the parent grain reconstruction from 24 variants in Fig. 6(c).

Figure 10: Comparison of the memory usage vs. time for parent grain reconstruction of the 4 million pixel data set from Fig. 11 using the variant graph algorithm with 24 and 12 variants as discussed in Section 5.4.

Figure 11: Reconstructed prior austenite grains with IPF coloring overlaid on the α martensite band contrast map. Σ3 boundaries are highlighted in white.