Introduction

Additive manufacturing (AM) continues to gain attention in a wide variety of industrial sectors for its ability to create complex geometries and integrate multiple materials into one product. Even with the interest in AM, there are still several unknowns that limit its application. Process–structure–property (PSP) linkages define the relationships between materials, their manufacturing processes, and their subsequent performance. These relationships remain nascent for many additive manufacturing processes. A better understanding of PSP linkages leads to more effective and efficient material designs for specific applications and manufacturing processes. A common process for developing PSP relationships requires experimental generation of batches of physical samples and destructive testing/characterization, leading to high costs (both in time and money) associated with material and process development. Collection of this data is usually done in the 2D space to mitigate some of the time and cost issues mentioned but 3D data is needed for digital analysis of material properties. If the collected 2D data could be used to estimate a 3D structure for materials testing, studies into the PSP linkages of additive manufacturing could be moved to a digital environment. This research looks to create a digital tool that estimates 3D generation parameters from 2D data and utilizes said 3D generation parameters to simulate a statistically equivalent 3D structure for finite element analysis. Using 2D data to estimate 3D data would assist in the studying of process parameter effects on the structure while generating 3D data from said 2D observations would facilitate the effects of structure on the material properties. The material of choice for testing the framework, chopped carbon fiber polymer composites, was obtained from the Air Force Research Laboratory from their Direct Ink Writing (DIW) additive manufacturing process.

This paper introduces the first iteration of a statistical matching algorithm, guided in part by stereological principles, that can be applied to generate 3D structures from 2D observations. “Background” section covers the background of topics such as stereology, material simulation, DIW, characterization of microstructures, and statistical tests. “Methods” section introduces the methods for completing and testing the model mentioned above. “Results” section analyzes the robustness of the framework through the lens of three case studies with varying fiber orientation along with collected experimental data. “Discussion” section discusses implications of the results relative to modeling of 3D structures from 2D data. Finally, “Conclusion” section concludes with future considerations to advance the framework.

Background

An additively manufactured (AM) short carbon fiber composite was chosen for this demonstration with the hope that the framework could be applied to other particulate structures as the Air Force Research Laboratory showed interest in the material and AM process. The AM process, Direct Ink Writing (DIW), deposits ink onto a build plate and prints 3D structures. One of the major differences is the ink, which has varying flow mechanics when compared to other 3D printing materials such as filaments [1]. Another is the ability of DIW to print thermoset composites for better thermal and mechanical properties [1]. Direct Ink Writing can be used to make polymer and ceramic composite matrices with Kevlar and carbon fibers for reinforcement of the matrix [2, 3]. This greatly improves the applicability of DIW as the polymer alone does not have the necessary material properties. Even with the advantages of DIW, the effects of its process parameters on the structure and properties are still under extensive research.

A digital framework that utilizes collected 2D data to estimate 3D geometries could facilitate the continued study of DIW. Geometric model packing algorithms for particulate structures driven by statistical distributions are common in the material science community and have been demonstrated in tools created from years of research and development [4,5,6]. A popular method is the use of two-point correlation functions as a metric for estimating 3D data from 2D observations [4, 7,8,9]. One study utilized two-point correlation functions and simulated annealing to model inclusion secondary phases of anisotropic materials [4]. Another study used focused ion beam and scanning electron microscopy (SEM) to collected 2D images of halloysite nanotube polypropylene composite and applied an iterative optimization loop by comparing two-point correlation functions of the current synthetic structure to the experimental structure [7]. This process resembles the research’s current process with the main differences in how the fiber features were extracted along with how they were compared to the synthetic ground truth or experimental images. An issue with the two-point correlation is the loss of information and difficulty in determining which features of interest were captured in the two-point correlation [7, 10]. A study looked to apply phase-recovery algorithms to try and solve this problem for particulate structures with some success as well as limitations mentioned that are inherent to any n-point correlation function [8].

Additionally, many of these approaches use 2D views to approximate 3D microstructure properties [5]. This specific instance tried to pack ellipsoids to represent particles of an aluminum alloy by calculating pair correlation functions, a function of distance and angular orientation [5]. Another study used serial sectioning to collect 2D data of a 3D particle reinforced metal matrix composite and applied a reconstruction software to obtain an approximation of the 3D structure for material property analysis [11]; serial sectioning is used to collect data in this research project. While this study is able to estimate 3D structure from 2D observations, it only recreates the data collected instead of creating a statistically similar structure; this limits its applicability in material exploration. Many simulation studies into composites focus on the fiber orientation and alignment as this directly affects the observed material properties in different loading directions [12,13,14]. The use of SEM images and scale-invariant feature transformations (SIFT) allowed matching of points of interest in 2D to create a 3D point cloud for reconstruction of the fiber orientation distribution [12]. Optical microscopy can also be used to collect 2D micrographs for measurement of fiber orientation; a variation in fiber shape caused issues with this method [14]. A more researched additive manufacturing process, injection molding, continues to investigated the fiber orientation and effects of process parameters on said alignment [15]. In terms of image processing, template matching and Voronoi distances allow for calculation of fiber orientations in unidirectional thermoplastic composites as long as the orientation is calculated over a distance far larger than the diameter of the fiber [16].

Classical statistics approaches based on simple geometric models (e.g., spheres or ellipsoids) [4, 5, 8, 17, 18] can lack realism in generating microstructures and cause issues with simulation of digital materials. Microscale features (i.e., fibers, grains, etc.) are the focus of the presented work. Empirical distributions of important feature metrics such as shape, size, area/volume fraction, and morphological orientation are statistically matched in an iterative manner. Many human-defined statistics and parameters have been explored, such as n-point statistics, grain diameter, misorientation angle, and aspect ratios [6, 19]. These methods use statistical measures to generalize the distribution of the aforementioned parameters, resulting in software like Dream.3D, IMOD, Fovea Pro, and MicroImager that can create synthetic microstructures [19, 20]. Another tool, Rocpack, uses a packing algorithm to simulate microstructure (a particulate composite) with different packing fractions for different phases [21]. These tools/approaches allow for easy interpretation through statistical distributions drawn from experimental microstructure observations. Some issues with this classical approach include representing and reconstructing outliers, capturing the mesoscale fluctuations in materials, and successfully reconstructing intricacies associated with shapes in the microstructure [6]. Many of the statistical models fail to effectively include extrema of the microstructure due to the use of distributions like the lognormal distribution; these features usually determine the fatigue and fracture properties of the material [19]. Further, the exact form of these statistical distributions requires large numbers of observations, ideally in 3D. However, collection of 3D data remains a costly and time-consuming challenge, as this data requires more complex collection and post-processing methods[22]. While use of 3D data better defines the material across varying scales, the trade-off with said benefits and data collection/processing remains difficult to balance [22].

The approximation of 3D parameters from 2D data (stereology) has been applied within the material science research community [23]. Stereology is based on statistical probabilities of 2D observations given 3D objects, which are often analytically derived for simple objects like spheres. An example of determining the size, shape, and orientation of elliptical observations given ellipsoidal objects can be found in [24] and is shown in Fig. 1. In the paper by Ferguson, the dominant axes of an ellipsoid sliced in any 2D section plane were found, which aids in approximation of orientation based solely off 2D section images [24]. An application of stereology to cylindrical structures, akin to the short carbon fibers, estimated the orientation distribution from a single 2D section [25]. Another study in the field of concrete applied X-ray projectional radiography for in situ 2D data collection and estimated fiber volume fraction and spatial distributions with stereological principles [26]. These studies show the applicability of stereology to short carbon fiber composite modeling and its ability to improve the usability of 2D data for 3D problems.

Fig. 1
figure 1

Stereological method of 2D Slice of 3D ellipsoid for determination of new dominant axes of ellipse in 2D observed in [24]. This method allows for estimation of the 2D angular distributions in a plane, which can be used to update the 3D angular distribution

An important aspect of any synthetic material generation process is comparing the generated structure to the target. In the case of geometric model packing approaches, this generally means comparing statistical distributions. Several methods can be used, with the Student’s T Test and its variations being the most popular. However, the Kolmogorov–Smirnov (KS) test better suits the similarity comparison problem. This test compares the differences between two cumulative distribution functions (CDFs) and outputs a p value that describes the absolute value of the maximum of all differences between those CDFs [27]. Another potential test is a variation of the KS test known as the Anderson–Darling (AD) test, which takes the weighted sum of all the differences of two CDFs. Representations of both tests are presented in Fig. 2 [27]. The AD test allows for improved measurement of variance differences within the CDFs, which could lead to more realistic similarity measuring results. Finding the right statistical test allows for more concrete analysis of the validity of the results while also finding statistically similar structures.

Fig. 2
figure 2

Visualization of Kolmogorov–Smirnov (KS), maximum difference between CDFs, and Anderson–Darling test, weighted sum of differences between CDFs. Both tests return a statistic and corresponding p value. Comparison of CDFs allows for a better comparison of similarity of the whole distribution [27]

Methods

A known ground truth structure was needed to effectively test the framework; however, there are limitations to using real 3D data for this purpose. First, the ground truth would potentially have some bias introduced by the processing of the experimental data given that the actual 3D distributions would not be known analytically. Second, it would be difficult to obtain multiple ground truth structures with differing 3D distributions to test the robustness of the framework. For these reasons, a synthetic structure was created in Dream.3d by packing ellipsoids into a 3D volume to approximate fibers. The controlling parameters for the size, shape distribution, and volume fraction of the fibers were selected arbitrarily. The mean (\(\mu ^{3D}\)) and standard deviation (\(\sigma ^{3D}\)) of the size distribution, along with the aspect ratio (\(r^{3D}\)), orientation distribution, and volume fraction (\(f^{3D}\)) were used to control the distributions of the features of interest. Based on previous work, these parameters relate directly to material properties in particulate microstructures [19, 20]. The ground truth structure was sized in order to fit enough fibers, at least 1000, to properly sample the population distribution of fibers. For the ground truth statistical set, ten goal structures of size \(256\times 256\times 256\) pixels were generated for proper sampling of the fiber feature space. 2D slices were then taken in three orthogonal planes (XY, XZ, YZ) and used for collection of fiber statistics. A set of example slices is shown in Fig. 3. The 2D statistical distributions of the target (\(\mu _{gt}^{2D}\), \(\sigma _{gt}^{2D}\), \(r_{gt}^{2D}\), \(f_{gt}^{2D}\)) drive the loss associated with the iterative optimization of the 3D generation parameters needed to emulate the observations of the ground truth structure.

Fig. 3
figure 3

XY (Left) and YZ (Right) 2D slices of Aligned X case study goal structure synthetically created with Dream.3d. The chopped fibers were approximated by packing ellipsoids and 2D views compared to the experimental data showed promise

Experimental data collected from DIW printed parts was applied to the workflow to ensure its application to real microstructures. The 3D dataset was not collected for this experimental data so the only results were collected in the 2D, limiting the comparisons of this case to omly 2D; see “Experimental Data Case Study” section. The algorithm works to match the statistical distributions of the 2D target observations (the ground truth) structure. An overview of this algorithm is shown in “Optimization Loop” section. The next section will cover the fiber orientation control methodology before expanding on the rest of the optimization loop.

Orientation Control

Matching the morphological orientation distribution of features remains an important aspect of the generator and impacts the ability to match other statistical measures of the 2D observations, \(f(\theta )\). Unlike the other microstructure descriptors discussed above, there is no scalar value that adequately captures the orientation distribution of fibers: furthermore, it cannot be approximated as a classic statistical distribution. Without scalar parameter(s), there is not an intuitive way to adjust the estimated 3D distributions, \(F(\theta )\), during the optimization process to match the observations of the target structure. A process to map the fiber orientations observed in the 2D observations to the 3D orientation distribution is needed.

There are many different reference frames for representing the orientation distribution of a structure [28]. Dream.3d utilizes the homochoric representation, which is a generalization of the Lambert equal-area mapping [28]. The homochoric representations for rotations, \(r(\theta )\) are shown in Eq. 1:

$$\begin{aligned} r(\theta ) = \left[ \frac{3}{4}(\theta - \sin (\theta )\right] ^\frac{1}{3} \ for \ \theta \in [0, \pi ] \end{aligned}$$
(1)

Working in the homochoric space, in theory, makes it easier to equally sample the orientation space [28]. Limiting the acceptable range of angles for this representation also reduces issues with symmetry of the ellipsoids. By making it easier to sample the orientation space equally, Dream.3d applies the homochoric space to create equal volumetric bins. By controlling each of the probabilities of an ellipsoid having an orientation that corresponds to a bin, a probability distribution for the full orientation space is created. This method decomposes the observed 2D orientation distribution into a weighted sum of the many characteristic ones associated with the 3D orientation bins; comparable to a Gaussian mixture model. The algorithm effectively estimates the 3D orientation distribution by iteratively adjusting model weights based on direct comparison of the ground truth to the synthetic 2D orientation distributions. The discrete 3D orientation distribution is directly transferred to Dream.3d, which uses approximately 5° resolution when binning the space. This resolution results in 36 bins along each dimension of the homochoric space for a total of 46,656 total bins/weights.

For each of the 3D orientation bins, the 2D dominant axes were calculated for a set of ellipsoids (of varying aspect ratios) for each of the 2D observation planes [24]. The results were stored in lookup tables, one for each 2D observation plane, to be used in the orientation control loop’s update step. The aspect ratio of a 3D ellipsoid will affect the 2D dominant axes. To account for this, an initial population of random ellipsoids are sectioned within each 3D orientation bin to build the lookup tables. 100 ellipsoids was a sufficient number to capture the distribution of 2D orientations observed due to 3D ellipsoid aspect ratio. As the optimization framework progresses, the estimated 3D aspect ratio distribution converges and the lookup tables are periodically recalculated reflecting the current estimate of the 3D aspect ratio distribution. Regeneration of the lookup tables allows for better convergence of the orientation distribution and accounts for the effects of the aspect ratio on the orientation.

Each iteration of the orientation control loop compares the observed 2D orientation distributions of the current estimated structure and the target structure on each 2D observation plane. The ratio of each normalized histogram bin in the 2D distributions was computed to determine which 2D orientations are under- or over-observed on each plane when compared to the target structure. The set of ratios is then multiplied (as a dot product) with the lookup table for each of the 46,656 3D orientation bins. The resultant product is used to adjust the weight of each bin. If the product is above 1, then the the 3D orientation bin is contributing to over-observed 2D orientations and vice versa. Each iteration introduced a decay in the magnitude of the update steps, resulting in an initial broad search with more refined convergence over time.

Optimization Loop

The optimization loop consists of four important subsections: (1) Create/update the synthetic structure; (2) extract statistical distributions and measures from the structure; (3) compare observed distributions against observations from the goal structure; and (4) update generation parameters until convergence. The first two steps of the optimization loop utilized existing capabilities in Dream.3d presented elsewhere [29]. Steps 3 and 4 are the focus of this work and a more detailed algorithm workflow for the optimization loop is shown in Fig. 4.

Fig. 4
figure 4

Generation workflow for creation of statistically similar structure from goal 2D fiber feature information. Comparison of KS test statistics was stopping criteria for generation of synthetic structure

Initialization of the 3D structure generation parameters was performed by taking the averages of the ground truth 2D sizes (\(\mu _{gt}^{2D}\), \(\sigma _{gt}^{2D}\)), aspect ratios (\(r_{gt}^{2D}\)), and area fractions (\(f_{gt}^{2D}\)) from all three orthogonal section and applying stereological principles, such as:

$$\begin{aligned} \mu ^{3D} = \log \big ( \bar{\mu _{gt}^{2D}} * 4/\pi \big ) \end{aligned}$$
(2)

The 3D size parameters (\(\mu ^{3D}, \sigma ^{3D}\)) define a log-normal distribution for the equivalent sphere diameters of fibers in Dream.3d. The \(4/\pi \) constant for the mean of the size distribution is a common stereological constant used when comparing 2D and 3D size distributions [30]; the other scaling values, \(s_f\), were estimated by trial and error for values between 0 and 1. This range was chosen as the mean and standard deviation of the size along with the aspect ratio increase when observing projections of a 3D structure in 2D [24]. Calculations for \(r^{3D}\) and \(f^{3D}\) were done in the same manner as the size standard deviation, \(\sigma ^{3D}\), but not on the logarithm scale used for the size distribution; see Eqs. 35.

$$\begin{aligned} \sigma ^{3D}= & {} \log (\bar{\sigma _{gt}^{2D}} * s_f) \end{aligned}$$
(3)
$$\begin{aligned} r^{3D}= & {} \bar{r_{gt}^{2D}} * s_f \end{aligned}$$
(4)
$$\begin{aligned} f^{3D}= & {} \bar{f_{gt}^{2D}} * s_f \end{aligned}$$
(5)

Dream.3D determines the size, aspect ratio, and orientation of each individual fiber on each 2D observation plane. Comparisons were performed with observed 2D statistics of the target structure to calculate the needed updates to the 3D structure generation parameters. The 3D structure generation parameters, along with the orientation distribution weight array, were updated after performing statistical testing on the features of interest.

The optimization step of the algorithm used a similar process to a thermal control system; if the temperature is too hot or cold, add or remove heat from the system. In this instance, the 2D features of interest to change were the size (\(\mu _c^{2D}\), \(\sigma _c^{2D}\)), shape (\(r_c^{2D}\)), and area fraction (\(f_c^{2D}\)) of the fibers; if any of these parameters was different from the ground truth values, increase or decrease said parameter accordingly. While these parameters are not independent of each other, each of the parameters was updated independently at the same time as it would be difficult to integrate knowledge from one parameter to another in the framework. Updates to the structure generation parameters were based on the relative change, C, between the 2D statistics of the target (\(\mu _{gt}^{2D}\), \(\sigma _{gt}^{2D}\), \(r_{gt}^{2D}\), \(f_{gt}^{2D}\)), \(x_t\), and current structure (\(\mu _{c}^{2D}\), \(\sigma _{c}^{2D}\), \(r_{c}^{2D}\), \(f_{c}^{2D}\)), \(x_c\), as shown in the equation below:

$$\begin{aligned} C = \frac{x_c - x_t}{x_t} \end{aligned}$$
(6)

Upper and lower thresholds were placed on the relative change of 5% to ensure parameters did not increase or decrease sporadically from iteration to iteration. Update steps were further scaled by a monotonically decreasing function, \(I_\mathrm{{step}}\), related to the current iteration number, \(i_c\), allowing for more precision as the algorithm converged on a value; the iteration cap, \(i_f\), was set at a predetermined number of 200.

$$\begin{aligned} I_\mathrm{{step}} = \frac{i_f - i_c}{i_f} \end{aligned}$$
(7)

Statistical testing, KS test, determined whether the synthetic structure’s 2D observations were statistically equivalent to the target structure. Various sampling sizes were investigated and it was determined that 1000 samples were sufficient when performing the KS test. It was determined that the percent change of the moving average (\(M_a\)) of the KS test and percent error for the area fraction provided appropriate termination conditions. The selection of \(M_a\) of the KS test statistic reduced inherent noise in the measure caused by inter-dependencies in feature parameters and random sampling. The convergence speed for the threshold of change in \(M_a\) varied considerably and will be discussed in “Discussion” section; however, it is noted here that the \(M_a\) of the KS test statistics were weighted to emphasize current values. A set of moving average weights, \(M_aW\), are introduced below. For the KS statistics, the \(M_a\) window length was set at 10 iterations for the size, shape, and angle features while the area fraction error was set at 30 iterations; these numbers were driven by observing previous results and testing. The objective function of this optimization was the decrease of the relative change of the moving average of the KS statistic, \(C_{M_a}\), for the size, shape, and area fraction parameters for all three 2D faces of the structure, N, below a threshold value, T; see Eq. 10.

$$\begin{aligned} \text {Prod}_{ma}= & {} {\vec{M}_a} \cdot {(\vec{M}_a W)} \end{aligned}$$
(8)
$$\begin{aligned} M_a= & {} \frac{Prod_{ma}}{\sum (M_a W)} \end{aligned}$$
(9)
$$\begin{aligned}{} & {} \mathop {\mathrm {arg\,max}}\limits _x \sum _{i=1}^{N}{x} \ \text {s.t.} \ x = \text {if}(C_{M_a} < T) \ 1, \text {else} \ 0 \end{aligned}$$
(10)

An example for how 3D generation parameters are updated based on 2D observations is shown in Algorithm 1 for one parameter, the size mean \(\mu ^{3D}\). The current synthetic 2D value and ground truth 2D value, \(\mu _{c}^{2D}\) and \(\mu _{gt}^{2D}\), will be used as the two values for major comparison in terms of relative change and KS test.

Algorithm 1
figure a

Example of update step for 3D generation parameters in Dream.3d. Comparisons in 2D control how of each 3D generation parameter change during each update step. Disagreements in varying views of 2D statistics causes a smaller update step to ensure a smoother convergence. Threshold placed to ensure update steps do not exceed an absolute value of 5 percent

Examples of the 2D statistics, the KS test analyses, and other visualizations are shown in “Results” section.

Three Case Studies

Three case studies were selected to test the effectiveness of the algorithm. The size, aspect ratio, and volume fraction of fibers were held mostly constant, but the orientation distribution of the fibers within the structure was (1) randomized (Random), (2) highly aligned along x-axis (Aligned X), or (3) highly aligned along the body diagonal of the structure (Off Body Diagonal, OBD). Each case poses challenges with observing the fibers on 2D planes, comparing the statistical distributions, and determining appropriate updates during the optimization loop. Examples of 3D structures for each orientation distribution are shown in Fig. 5; colors represent unique index given to each fiber during packing in Dream.3d. All three target structures were created in Dream.3D in order to have ‘ground truth’ 3D structure generation parameters to compare to the results of the optimization framework.

Fig. 5
figure 5

(From left to right) Random, Aligned X, and OBD 3D structures generated from Dream.3d. The axis in the lower left corner shows the orientation of the 3D structures. Each structure had the same size and shape distributions with the only variation in the 3D orientation distribution of the chopped fibers

Experimental Data Case Study

To test the applicability of the framework to experimental data, data for one DIW specimen was collected. Serial sectioning collected the data on three orthogonal planes. A direct threshold applied to raw image separated the fibers from matrix and voids for use in the current workflow; see Fig. 6. Given the lack of 3D data, results were drawn from KS test comparisons and visual inspection of synthetic data created from experimental data as the baseline.

Fig. 6
figure 6

2D Slices collected from Direct Ink Writing part through serial sectioning with half of image thresholded as example of image processing needed for experimental data

Results

As previously discussed, the KS statistic and area fraction error are the two main metrics for determining convergence of the optimization loop. The KS statistic for the size, aspect ratio, and orientation distributions were plotted for each case study by iteration. The plots show the progress toward convergence as the framework updates the 3D structure generation parameters to match the observed 2D statistics. Figure 7 shows the set of plots for an instance of a complete optimization of the OBD case. KS statistics are restricted to iterations greater than 10 due to calculation of \(M_a\), while the area fraction error begins after 30 iterations. While the KS statistic was tracked as a termination criteria, the p value at the end of each run was calculated to check whether the synthetic and goal structure were statistically indistinguishable; p values in Table 1 under \(1*10^-4\) were reported as 0.

Fig. 7
figure 7

\(M_a\) of the size, shape, and 2D angle distribution KS test statistics and area fraction percent error progression of OBD case. Moving average is calculated based on 10 observations at time; therefore, the first 10 iterations do not have a moving average value; the area fraction used 30 observations for moving average calculation

Table 1 p values for all feature parameter and angle distributions for all case studies on 2D faces

Figure 7 shows the convergence of the KS statistic during the optimization process, but it does not directly show how the observed 2D statistical distributions are evolving. Figure 8 shows the evolution of the size, aspect ratio, and area fraction distributions on the XZ plan for the OBD test case. The plots show the convergence to the 2D statistics of the goal structure. A Gaussian mixture model (GMM) of 1000 samples from each of the 2D distributions for the size, shape, and area fraction parameters were applied to smooth the lines.

Fig. 8
figure 8

(Left to right) Size, shape, and area fraction line plots for OBD case for XZ plane at every 50 iterations. As the optimization loop iterates, the feature distributions in the 2D case become more similar to the goal 2D feature distributions

Figure 9 shows the evolution of the 3D orientation distribution for each of the case studies. The plots display the position of the dominant axes (major, secondary, and minor) of 1000 fibers randomly sampled from the 3D orientation distribution. Figure 10 shows the evolution of the observed 2D orientation distributions, created by applying a GMM similar to the process used in Fig. 8.

Fig. 9
figure 9

3D angular distribution evolution for OBD case. As the optimization loop iterates, the 3D angular distribution resembles the goal angular distribution within 100 iterations

Fig. 10
figure 10

2D angle distribution line plots for every 50 iterations of (left to right) XY, XZ, and YZ plane for OBD. As the algorithm iterates, the observed 2D angular distribution becomes more similar to the goal 2D angular distribution for all three case studies

Fig. 11
figure 11

Visualization of final 3D structure (left) to ground truth (right) for case studies (top to bottom) Random, Aligned X, and OBD. 3D structures were generated from Dream.3d with the generation parameters updated during each iteration to emulate the ground truth

Figure 11 shows the final, optimized 3D structures. The absolute values of the 2D and 3D error were also calculated for all feature parameters (Tables 2 and 3). All errors were reported from the 200th iteration, and the 2D errors reported were averaged from the absolute errors on each 2D observation plane. The 2D and 3D errors were calculated from 20 different test runs in order to test the repeatability of the framework for each case study.

Table 2 Average and standard deviation 2D observed error for each case study
Table 3 Average and standard deviation 3D generation parameter error for each case study

Moving on to the experimental case study, the convergence of the KS statistic for the size, shape, area fraction, and angular distribution is shown in Fig. 12. Figure 13 shows the evolution of the size, aspect ratio, and area fraction distributions on the XZ plan for the experimental data. The plots show the convergence to the 2D statistics of the experimental structure. A Gaussian mixture model (GMM) of 1000 samples from each of the 2D distributions for the size, shape, and area fraction parameters was applied to smooth the lines.

Fig. 12
figure 12

\(M_a\) of the size, shape, and 2D angle distribution KS test statistics and area fraction percent error progression of the experimental data. Moving average is calculated based on 10 observations at time; therefore, the first 10 iterations do not have a moving average value; the area fraction used 30 observations for moving average calculation

Fig. 13
figure 13

(Left to right) Size, shape, and area fraction line plots for Experimental case for XZ plane at every 50 iterations. As the optimization loop iterates, the feature distributions in the 2D case become more similar to the goal 2D feature distributions

Since there is no 3D data collected for the experimental case, only convergence in the 2D case for the angular distribution can be shown. Figure 14 shows the evolution of the observed 2D orientation distributions, created by applying a GMM similar to the process used in Fig. 13. Figure 15 shows a 2D comparison of the synthetic structure to the experimental data on the XY and XZ observation planes. A quantitative comparison of the 2D observed errors of the features parameters for the experimental case is given in Table 4.

Fig. 14
figure 14

2D angle distribution line plots for every 50 iterations of (left to right) XY, XZ, and YZ plane for Experimental case. As the algorithm iterates, the observed 2D angular distribution becomes more similar to the goal 2D angular distribution for all three case studies

Fig. 15
figure 15

Synthetic (left) and Experimental (right) 2D slices of XY and XZ plane for visual comparison of synthetic to ground truth experimental data. The chopped fibers were approximated by packing cylinders to emulate the fibers observed in the experimental data

Table 4 Absolute average and standard deviation 2D observed error for experimental data

Discussion

This work demonstrates the practicality of using observed 2D statistics of a target microstructure to iteratively search for 3D statistics similar to a target structure. However, the selected convergence criteria was not met within the allotted maximum number of iterations (200) in any of the studied cases. The simplest case (Random) was also conducted using only one face for statistical comparison and led to convergence in the XY and XZ plane within the same number of iterations and YZ in a few more iterations. Using all three planes in parallel increases the complexity of updating the 3D structure generation parameters and tracking convergence. The inter-dependencies between shape and orientation, for example, introduce inherent noise into all of the microstructure generation parameters. Implementing \(M_a\) as a measure for tracking the improvement of the convergence metrics mitigated or reduced this issue. While \(M_a\) reduced high-frequency variations, the convergence rules do not properly capture the actual convergence behavior of the algorithm. Using the p value does not solve this convergence issue (Table 1). A majority of the p values do not pass a 0.05 significance test and the fluctuations between iterations were unstable. Even without convergence, the allotted amount of iterations permitted the workflow to generate similar structures when compared to the goal, as observed in Fig. 11 and the KS statistic plots in Fig. 7.

For all three synthetic case studies, \(M_a\) of the KS statistics for size, aspect ratio, and orientation distribution generally converged within the first 100 iterations. Iterations after this either marginally improved the KS statistic or the 2D parameter error. The \(M_a\) of the aspect ratio and orientation distribution for the random case (Fig. 8) appear to have the most noise. This noise might have been caused by Dream.3D’s random seed process and the inter-dependencies of the orientation and aspect ratio distributions. Even so, the 2D and 3D feature parameter errors were small and the p values were on the order of the other two case studies. The line plots for the feature parameters and the 2D orientation distributions further support the algorithm’s convergence. After 100 iterations, all three case studies’ 2D distributions closely resemble the goal 2D distributions. Figure 9 shows the efficacy of the orientation control of the optimization loop The OBD case matches the goal distribution the best of the three case studies with a minimal increase in spread. Note that it is not detrimental to have views that do not observe a primary axis, and in this case, it is actually beneficial.

The calculated 2D and 3D errors show a noticeable difference in magnitude, with the 3D errors generally larger. These results stem from multiple effects. First, the algorithm maximizes the similarity of the statistical 2D distributions of the goal and current synthetic structure. Second, there is no guarantee of a unique 3D distribution that yields the observed 2D distribution, so a small deviation in 3D distributions is likely. Finally, 2D errors are calculated on the distribution parameters (i.e., mean and standard deviation) of the actual observed values. However, 3D errors are calculated on the 3D structure generation parameters, which are the fitting parameters for the functional forms of the distributions. In the case of the size distribution, the functional form is a log-normal distribution and the fitting parameters are approximately an order of magnitude smaller than the feature sizes themselves. This is likely contributing to the larger apparent error in the 3D comparisons. The two parameters with the lowest error in the 2D and 3D case for all three case studies are the \(\mu _c^{2D}\), \(\mu ^{3D}\) and \(f_c^{2D}\), \(f^{3D}\). This makes sense as the area fraction is a simple metric to calculate for images, and the \(\mu ^{3D}\) is relatively simple to track and update relative to other parameters. The \(\sigma ^{3D}\) and \(r^{3D}\) errors are larger, which could be concerning if the physical significance of the \(\sigma ^{3D}\) and \(r^{3D}\) were not considered. For the \(\sigma ^{3D}\), the errors relate to a goal value of 0.1; therefore, an error of 50\(\%\) could mean a sigma of 0.05 or 0.15, which does not dramatically affect the overall size distribution. The same applies for the \(r^{3D}\), as the difference between the goal of 0.1 and 0.08 (20\(\%\)) creates a minimal difference in the axes ratios of the ellipsoid with a 10/1 ratios versus an 8/1 ratio. Therefore, these results support the hypothesis that using a similarity metric to drive comparison of 2D statistical distributions leads to producing synthetic structures with comparable 3D parameters to a goal structure.

Looking at the experimental data shows similar results seen in the virtual case studies with some differences caused by the increase of variability expected in experimentally collected structures. The \(M_a\) of the KS statistic for the size, shape, and angular distribution do not converge to values that suggest the structures are as statistically similar as the synthetic test cases were. When observing a slice of the generated structure compared to a subset of the experimental data, the fiber populations appear similar in size, shape, and orientation. The 2D error for all parameters is higher than any of the errors observed in the three synthetic case studies. During the implementation, many issues were discovered that could explain the reduce performance of the methodology when applying it to this specific material data.

The first issue encountered was caused by the circular cross section of the experimental fibers, which constrained the b and c axis to be equal when packing equivalent ellipsoids. Dream.3d currently does not have this capability as the aspect ratios for the minor and secondary axes are sampled separately on their corresponding beta distribution. Therefore, it is almost impossible to generate an ellipsoid with the same aspect ratio and therefore size on the minor and secondary axes. This was discovered when observing the 2D slices of one of the planes of the synthetic emulation of the experimental data where the minor and secondary axes were visible; only ellipses were observed in this plane. This issue did not occur with the other synthetic case studies as the ground truth synthetic data was generated in Dream.3d. The group was able to compensate for this by limiting the spread of the aspect ratio beta distributions for the b and c axis; this limited the sampling range for both axes.

Another issue encountered in the experimental case study was the assumption of one particulate population in the images. For the synthetic case studies, it was assumed the fibers in the structure were sampled from one population. Given the lack of 3D data, it is possible the experimental data contains different particulate populations. The effects of the printing in DIW on the fibers themselves are still under investigation. Therefore, the group hypothesized a possibility of fibers breaking during printing in such a way that a new population of particulates was created with a different set of feature parameter distributions. This could explain some of the difficulties observed in the generator algorithm as it tried to fit a single population to particulates with several different populations.

The algorithm appears to reproduce the bulk statistics chosen for comparisons and generates visually similar 3D structures. The novelty of this approach is in the comparison with the KS test statistic of the structures using solely the observed 2D data and without the direct estimation of the 3D statistics from the 2D data. While the structures produced for all three cases appear to visually replicate and statistically match the goal structures, the convergence criteria remains imprecise. Figure 7 clearly shows that the \(M_a\) of the KS statistics converge around 70–100 iterations for all three cases; the termination criteria implemented does not properly detect this. Determining when a parameter/metric converges remains subjective, which makes finding a preferable set of rules and thresholds challenging. Inherent noise and inter-dependencies within the statistics observed increases the difficulty for framing objective metrics and convergence rules. Future work will address this issue and investigate solutions to a multi-objective scheme. The experimental data posed more of a challenge than any of the other case studies because of the increased variability in the observed structure along with the more complex angular distribution. Future work will address improvements in the methodology to combat the issues outlined for the experimental data case study.

Conclusion

The objective of this study—to create a framework for generating 3D microstructure volumes from 2D planar observations—largely succeeded for the three case studies specified above. Application to the experimental case study showed promising results with some of the issues outlined for future research. The use of the KS statistic as a metric for similarity leads to an effective and efficient iterative optimization framework. Fiber composites were chosen as a starting point for the development of the workflow, but the process should allow for studies in a variety of particulate material microstructures. This statistical synthetic generation framework allows for studies in the PSP linkage space and facilitates a deeper dive into material studies with machine learning.

While the general framework appears to produce a useful generator, an effective convergence criteria or function remains missing. Future work will further investigate this issue and determine whether the metrics, such as the KS statistic, need substitutes or whether a more complex convergence process is required. There also may need to be an increase in the resolution at which the current synthetic generator operates, allowing for more precise capture of the feature statistics. The issues mentioned for the experimental data need further studying to improve the performance of the generator. Finally, investigation into mesoscale features of the material will enhance the generator and improve the material property studies of varying particulate microstructures.