Additive manufacturing: A machine learning model of process-structure-property linkages for machining behavior of Ti-6Al-4V

Prior studies in metal additive manufacturing (AM) of parts have shown that various AM methods and post-AM heat treatment result in distinctly different microstructure and machining behavior when compared with conventionally manufactured parts. There is a crucial knowledge gap in understanding this process-structure-property (PSP) linkage and its relationship to material behavior. In this study, the machinability of metallic Ti-6Al-4V AM parts was investigated to better understand this unique PSP linkage through a novel data science-based approach, specifically by developing and validating a new machine learning (ML) model for material characterization and material property, that is, machining behavior. Heterogeneous material structures of Ti-6Al-4V AM samples fabricated through laser powder bed fusion and electron beam powder bed fusion in two different build orientations and post-AM heat treatments were quantitatively characterized using scanning electron microscopy, electron backscattered diffraction, and residual stress measured through X-ray diffraction. The reduced dimensional representation of material characterization data through chord length distribution (CLD) functions, 2-point correlation functions, and principal component analysis was found to be accurate in quantifying the complexities of Ti-6Al-4V AM structures. Specific cutting energy was the response variable for the Taguchi-based experimentation using force dynamometer. A low-dimensional S-P linkage model was established to correlate material structures of metallic AM and machining properties through this novel ML model. It was found that the prediction accuracy of this new PSP linkage is extremely high (>99%, statistically significant at 95% confidence interval). Findings from this study can be seamlessly integrated with P-S models to identify AM processing conditions that will lead to desired material behaviors, such as machining behavior (this study), fatigue behavior, and corrosion resistance.


Introduction
Metal additive manufacturing (AM) technologies provide a flexible, efficient, and rapid means to fabricate complex and customized products through a layer-by-layer approach. Two main categories of metal AM technologies predominantly used in industrial applications are powder bed fusion (PBF) and directed energy deposition (DED). These processes constitute over 90% of production-grade metal AM systems installed worldwide. PBF can be divided into two categories depending on the energy source and processing conditions. Electron beam PBF (EB-PBF) requires a vacuum environment, and laser PBF (L-PBF) operates under an inert atmosphere. In addition, Frazier (2014) highlighted the major differences in the cooling rates across metal AM processes, ranging from 10³ K/s in DED to 10⁴-10⁶ K/s in L-PBF and EB-PBF. Interactions between as-AM, post-processing conditions, and final properties have always been of interest, since different cooling rates lead to varying material structures and mechanical properties [1]. Trelewicz et al. (2016) found that inherent differences in metal AM processing conditions directly impact the cooling rate and thermal gradient during the build, resulting in highly heterogeneous and AM process-specific dominant textured microstructure (e.g., dominant columnar microstructure and microsegregation) [2].
In metal AM production cycles, post-processing steps such as heat treatment and machining are often necessary to achieve desired material properties, tolerances, and surface finish. However, it should be noted that the unique heterogeneous microstructure and resulting mechanical properties of metal AM also significantly affect their machinability. Hence, it is critical to investigate this complex interaction between grain morphology (size, density, orientation, residual stress, and phase fraction) of metal AM parts and resulting material behavior, that is, specific cutting power during machining, which is of interest in this study. The goal of this study is to establish a validated PSP linkage that could be extended to other AM material properties, such as corrosion behavior, wear, and mechanical strength.
A low-dimensional process-structure-property (PSP) linkage that captures the effects of processing conditions on the critical material structure that ultimately governs material behavior has long been of interest to the research and manufacturing communities. Bostanabad et al. (2016) provided a stochastic microstructure characterization and reconstruction method using supervised machine learning (ML) [3]. Their microstructure reconstruction method indicates that correlation feature extraction methods represent microstructure characterization statistically in an accurate and efficient manner. Moreover, Kalidindi (2015) summarized the data-based methods related to the accelerated development of advanced hierarchical materials [4]. However, this research reveals that the correlation functions and physical descriptors incur a high computational cost; hence, a low-dimensional method needs to be explored in the PSP linkage framework. Popova et al. (2017) investigated the correlations between AM process parameters and microstructure based on a reduced-order ML method [5]. Their research developed a process-structure (P-S) linkage with a low-dimensional representation of the heterogeneous AM microstructure. However, the mechanical and machining behavior of metal AM parts was not investigated. Greitemeier et al. (2015) noticed that the chemical composition, microstructure, and mechanical properties in AM Ti-6Al-4V are highly dependent on the AM processes [6]. Furthermore, the uncertainty in the material and mechanical properties of metal AM parts needs to be understood. Hence, Markl and Körner (2016) stated that it is critical to establish a novel modeling framework based on data science and numerical methods that can bridge the critical knowledge gap between process, material microstructure, and material behavior to overcome the current limitations due to uncertainties in PSP linkages [7].
The present study aims to develop a novel PSP ML model for metal Ti-6Al-4V AM alloys that can accurately predict material behavior in post-processing, that is, machining behavior. Li et al. (2020) clarified that titanium alloys are widely used in multiple mechanical, aerospace, and biomedical applications due to their high strength-to-density ratio and excellent corrosion resistance [8] . However, titanium alloys are often difficult to cast and process through subtractive machining (e.g., strain hardening). Liu and Shin (2019) showed that metal AM technologies provide an alternative near-net-shape fabrication capability that allows for titanium product manufacturing to become relatively more cost effective for high-performance applications [9] . Hence, to better understand the titanium alloys performance in the AM field, a PSP linkage is required to reveal the structure and machining behavior correlation of AM titanium alloys.
To build the PSP linkages, descriptors are critical components to the necessary data science-based ML approaches. These descriptors quantitatively represent recorded information on material microstructure using statistical methods and appear as correlation functions, physical descriptors, and spectral density functions for a given grain morphology. Corson (1974) described the 2-point correlation function representing heterogeneous material microstructure [10] . However, the limitation of 2-point correlations in the representation of heterogeneity reduces its direct application for metal AM. To better represent the AM microstructure, a combination of statistical methods needs to be developed. Lu and Torquato (1992) provided lineal-path function and cluster correlation function, which are widely used to extract microstructure information of materials and generate statistical data that can represent the heterogeneity of a multiphase structure [11] . Hence, in the present study, multiple types of statistic models have been used to capture information from AM microstructure.
In the case of P-S linkage, Gan et al. (2019) reported the challenges in establishing a validated relationship based on the experimental measurements of multiple AM process phenomena across multiple spatial-temporal scales [12] . Hence, numerical simulation methods become a tool to build connections between microstructure and the AM process parameters. Thijs et al. (2010) showed that in the PBF process, the AM process (e.g., EB-PBF vs. L-PBF) and the corresponding AM processing conditions, such as beam power density and scan strategies, significantly influence both the overall grain morphology and local microstructure [13] . This can be attributed to the effects of input energy density, which affects the growth direction of the elongated grains as a function of build height and the resulting cooling rate that influences the phase transformation of the material. For instance, Li et al. (2017) noted that the temperature fields on DED processing created by the moving heat power source and material absorption conditions can directly affect the microstructural evolution due to cyclic thermal processing, which results in complex solidification and "in-build" thermal cycling of previous layers [14] . Due to the significant differences between PBF and DED processes, the present study focused on Ti-6Al-4V PBF material and machining properties.
For structure-property (S-P) linkages, Paulson et al. (2019) explored the correlation between grain morphology and mechanical properties, such as microhardness, tensile, fatigue behavior, and elastic localization, based on the correlation function [15]. However, considering the inherent layer-by-layer characteristic of metal AM processing, the resulting heterogeneous microstructure will need to be considered to achieve high accuracy in the S-P correlation function. Hence, Fernandez-Zelaia et al. (2019) established S-P linkages based on spatial statistical metric feature descriptors [16]. However, different AM fabrication processes (e.g., L-PBF and EB-PBF) and parameters lead to drastic differences between parts, so the resulting large datasets could challenge conventional methods. Yang et al. (2018) provided S-P linkages using ML methods with limited success due to challenging high-dimensional data representation with limited quantification of grain morphology metrics and statistical evidence [17]. However, that effort focused on homogeneous materials; additional investigation is needed to establish S-P linkages for AM materials. Manogharan et al. (2015) illustrated that post-processing steps, such as hot isostatic pressing, machining, and/or surface finishing operations, are required to create functional metal AM surfaces [18]. Hence, establishing and validating novel S-P linkages that can accurately capture the machining behavior of metal AM parts is imperative for metal AM development. Zhang et al. (2019) indicated that in machining of wrought and AM Ti-6Al-4V, cutting conditions including cooling strategy, cutting tool geometry, tool coating, and machining parameters strongly affect the machined surface and related mechanical properties of metal AM parts [19]. Due to the presence of fine microstructure and martensitic phase distribution, PBF Ti-6Al-4V samples exhibit a higher hardness when compared to the wrought parts. Hence, a higher wear rate was observed during the machining of AM Ti-6Al-4V. Liu and Shin (2019) found that due to a higher thermal gradient and cooling rate when compared with traditional parts, PBF Ti-6Al-4V usually shows higher ultimate tensile stress, yield stress, and lower elongation rate [9], indicating that the conventional machining parameters might need to be optimized for AM parts, and a valid S-P linkage would be critical for AM material research.
In addition, Edwards and Ramulu (2014) found that inherently large temperature gradients in Ti-6Al-4V AM parts result in higher residual stress, which increases with an increase in AM processing time, that is, number of AM layers [20] . Studies have shown that residual stresses are larger along the scan direction than perpendicular direction due to the larger thermal gradient, which creates an anisotropic residual stress distribution, ultimately affecting the mechanical and machining behavior of AM parts.
In summary, a reduced-order computational and analytical method is required to thoroughly explore the mechanical and machining behavior for various metallic AM materials. Such a high-throughput computational data science-based analysis of material structure and resulting material behavior will connect a massive dataset that stores microstructural characteristics to the properties of materials to gain critical insights into this complex PSP linkage. To this end, Matouš et al. (2017) provided several ML and deep learning based predictive models on material databases for building PSP linkages and estimating the material properties where no experimental data may be available [21] . In the case of machining research, Leo (2001) provided ML tools that could be further developed to predict the relationship between cutting conditions, temperature, grain size, grain fraction, and hardness data on Ti-6Al-4V for generic microstructures derived from conventional manufacturing [22] . Hence, high efficiency and validated PSP linkages for metal AM processing, which capture the AM part machining behavior and material properties, need to be developed.

Materials Science in Additive Manufacturing
This study builds on the prior work described. Gong et al. (2020) showed statistically significant differences in the machining behavior of Ti-6Al-4V across build directions in EB-PBF specimens, and between as-AM and heat-treated L-PBF specimens (21% lower specific cutting power in L-PBF specimens) [23]. In addition, Ren et al. (2019) found that visual evaluation of material characterization data showed textured differences in microstructure, residual stress, and crystallographic information among different PBF parts [24]. A recent study by Goh et al. (2021) identified the need to establish standards for sharing large datasets of AM processing conditions to accelerate advancements in ML applications to improve AM [25]. A recent review by Nasiri and Khosravani (2021) presented opportunities for applying ML methods to understand the fracture behavior of AM parts [26]. In addition, a recent report by Sing et al. (2021) established opportunities to integrate ML methods in both upstream (i.e., part design and file preparation) and downstream (i.e., in-process monitoring) stages [27].
It is evident that there is a need for a systematic framework to quantify the heterogeneity in Ti-6Al-4V material structures processed through varied AM and post-AM conditions to better understand the PSP linkages. In this study, statistical functions are used to represent scanning electron microscopy (SEM) and electron backscatter diffraction (EBSD) microstructure information, as well as residual stress captured from X-ray diffraction (XRD). Building on the method of Chen and Guestrin (2016), a novel ML tool was developed to construct a reusable S-P linkage to predict the machining behavior of as-AM and heat-treated PBF Ti-6Al-4V [28]. In addition to the novel PSP linkage, this study employed a comprehensive dataset to generate an aggregate database that is reflective of all PBF processing of Ti-6Al-4V alloys, as shown in Figure 1: (1) 200 SEM images per material (L-PBF with and without heat treatment (HT), and EB-PBF) per surface (XY and XZ with respect to the build orientation); (2) 3 XRD measurements per material per surface; and (3) 3 EBSD measurements per material per surface. Measurements were conducted on three different samples from the same processing conditions.
In summary, the overall goal of the study is to provide a novel framework for AM Ti-6Al-4V machining by developing PSP linkages to link microstructures to the corresponding machining behavior, based on the specific cutting energy. The success of this approach depends on the critical definition of a suitable reduced-order form of descriptors for heterogeneous grain morphology across a wide range of microstructures for given material composition. Figure 1 presents the aims, scope, and methodology of this study, which are detailed in subsequent sections.

Sample preparation
In the case of Ti-6Al-4V EB-PBF specimens, 25 × 25 × 50 mm Ti-6Al-4V blocks were fabricated in an Arcam A2 electron beam melting machine with 50 µm layer thickness using standard Ti-6Al-4V 50 µm preheat and melt parameters provided by Arcam. In the case of L-PBF specimens, Ti-6Al-4V blocks of similar dimensions were fabricated in an EOSINT M280 system using a fiber laser power of 200 W and a spot size of 80 µm, reaching a power density of ~40 kW/mm², with standard EOS Ti-6Al-4V parameters, raster scanning, and a hatch distance of 100 µm. All specimens had build orientations along the Z-axis, which is the direction of the smallest dimension. Additional L-PBF samples in the same build direction were fabricated for heat treatment and for residual stress relief. As per AM standards, samples were heated under vacuum to a temperature within the range of 899-927 ± 14°C (1650-1700 ± 25°F), held for 2-4 h, and argon cooled to below 427°C (800°F), then heated again to 538 ± 14°C (1000 ± 25°F) for 4 h in vacuum, followed by argon cooling to room temperature. To eliminate the potential effects of part location across build plates, specimen locations were randomized. Furthermore, to eliminate the effects of proximity to the build plate, samples for characterization were harvested from the center of the part from the topmost layer (Figure 3).
Representative AM Ti-6Al-4V specimens were sectioned from the build platform using wire EDM. Samples representing the surfaces parallel and vertical to the build orientation on all EB-PBF, as-AM L-PBF, and heat-treated L-PBF specimens were mounted in epoxy for mechanical polishing to achieve a 0.5 µm grade surface finish. A final ion milling step was applied to prepare the sample surfaces for EBSD testing using a Thermo Scientific™ Apreo SEM with an Oxford Instruments EBSD detector. Kroll's reagent was then applied for 30 s to etch all samples, revealing all grain boundaries for SEM microstructure observation. Other samples from the same fabrication batch representing all the surfaces described above were also used in X-ray residual stress measurements.

SEM data extraction
To develop a robust PSP model, it is critical to statistically represent grain morphology that can accurately represent AM material heterogeneity. In this study, a large number of microstructures (e.g., 200 SEM images per material surface) were captured. The rationale for this data-intensive approach is to capture a statistically representative set of local material structures for a given material surface. Previous research has shown that the representative volume element (RVE) represents a range where the material properties would not be sensitive to the bulk material properties. Przybyla and Mcdowell (2012) indicated that smaller statistical volume elements (SVEs) that capture material properties could be used to achieve a feasible computational cost and time [29] . The volume requirement of SVEs is effective in achieving the key features of a given grain morphology. To achieve high efficiency and low computational cost, SVE sets were collected from all material samples in this study.
To statistically represent quantitative descriptors of each microstructure, low-order spatial correlations such as 2-point correlation functions can be developed to capture the structural variability. In this study, the 2-point correlation function f(m,m'|r) represents the conditional probability density of finding the phase features m and m' at the head and tail of a vector r randomly placed in the microstructure and is formally expressed as:

$$f(m,m'|\mathbf{r}) = \frac{1}{\mathrm{Vol}(\Omega)} \int_{\Omega} p(m,\mathbf{x})\, p(m',\mathbf{x}+\mathbf{r})\, d\mathbf{x} \qquad \text{(I)}$$

where the function p(m,x) represents the probability density of finding the local state m at the spatial location x, and Ω is the microstructure domain over which the integral is evaluated. In this function, m and x can be treated as either continuous variables or discretized descriptors. In this study, the Ti-6Al-4V phase information is considered the local state space. Therefore, the function f(m,m'|r) denotes the conditional probability of finding different phases (α/β) in spatial bins across a Ti-6Al-4V microstructure, which are separated by the vector set r.
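As an illustration of the discretized 2-point statistics described above, the following minimal sketch computes the autocorrelation of one phase in a binarized micrograph. It is not the batch code used in this study: it assumes periodic boundary conditions (which makes an FFT-based evaluation possible), a binary phase mask as input, and the hypothetical function name `two_point_autocorrelation`.

```python
import numpy as np

def two_point_autocorrelation(phase_mask):
    """Periodic 2-point autocorrelation of a binary phase mask.

    Entry r of the result is the fraction of pixel positions x for which
    both x and x + r (wrapped periodically) belong to the phase.
    The zero vector is shifted to the array center.
    """
    m = np.asarray(phase_mask, dtype=float)
    spectrum = np.fft.fftn(m)
    # Correlation theorem: autocorrelation = IFFT(|FFT(m)|^2)
    corr = np.fft.ifftn(spectrum * np.conj(spectrum)).real / m.size
    return np.fft.fftshift(corr)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mask = rng.random((64, 64)) < 0.3          # synthetic stand-in for an SEM phase map
    stats = two_point_autocorrelation(mask)
    center = tuple(s // 2 for s in stats.shape)
    print("value at r = 0:", stats[center], "phase fraction:", mask.mean())
```

At the zero vector, the autocorrelation of a binary mask equals the phase volume fraction, which offers a quick sanity check on any implementation.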
According to Liu and Shin (2019), it is important to include multiple descriptors to capture all the elements of heterogeneous grain morphology for developing a robust microstructure data science PSP linkage [9] . Therefore, besides the correlation functions, chord length distributions (CLDs) which are direct descriptors of the grain morphology are also applied to the workflow. The rationale behind this approach is to capture additional microstructure features of interest in metal AM parts: Grain size, shape distribution, and their anisotropy. In addition, Turner et al. (2016) have shown that the chord length directly connects to the material's plastic properties, which could be a valid statistical method for this workflow [30] . The computational cost of CLDs is relatively low and is, therefore, ideal for this study that aims to analyze a large ensemble of microstructures.
In this study, a chord is defined as a line segment that begins and ends at the boundaries of a single grain contained within the microstructure. CLDs describe the probability of locating a chord of a specific length within microstructure SVEs. CLDs were computed in the X and Y directions, as all microstructure images were collected in the same X and Y plane orientation from surfaces that share the same X and Y coordinates. The next step of the microstructure data extraction focused on reducing the dimensionality of the representation of each SVE using spatial data statistics and principal component analysis (PCA). PCA is a data-driven linear transformation of the extracted data to an orthogonal space that captures the variance within the dataset with the minimum number of dimensions. Kalidindi (2015) noted that the rationale behind this approach is to reduce the dimensions of the dataset, which significantly increases computational efficiency, and to identify the salient features of the microstructure needed to establish the PSP model [4]. Multiple studies have shown that PCA is an effective tool to produce low-order, high-value representations of microstructures that are valuable for building PSP linkages across a range of material categories. In addition, Paulson et al. (2018) have shown that only a few basis functions contribute to the efficacy of the S-P linkage after PCA [31]. In this study, the CLDs computed in both orthogonal orientations (x and y) were concatenated with the 2-point correlation function data to build a large feature vector for each microstructure SVE (i.e., each SEM image). In summary, the PCA input comprises the microstructure information collected from SEM characterization.
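The chord definition above can be sketched for a binarized image as follows. This is an illustrative example rather than the study's implementation: it assumes a binary mask in which True marks the grain phase, measures chords in pixels, counts chords that touch the image border, and uses the hypothetical names `chord_lengths` and `chord_length_distribution`.

```python
import numpy as np

def chord_lengths(mask, axis):
    """Lengths (in pixels) of consecutive runs of True along one direction.

    axis=1 scans each row (chords along X); axis=0 scans each column (chords along Y).
    """
    m = np.asarray(mask, dtype=bool)
    if axis == 0:
        m = m.T                      # scan columns by iterating rows of the transpose
    lengths = []
    for line in m:
        padded = np.concatenate(([0], line.astype(np.int8), [0]))
        edges = np.diff(padded)      # +1 where a chord starts, -1 just after it ends
        starts = np.flatnonzero(edges == 1)
        ends = np.flatnonzero(edges == -1)
        lengths.extend(ends - starts)
    return np.array(lengths, dtype=int)

def chord_length_distribution(mask, axis, max_length):
    """Probability of each chord length 1..max_length (longer chords are ignored)."""
    lengths = chord_lengths(mask, axis)
    counts = np.bincount(lengths, minlength=max_length + 1)[1:max_length + 1]
    total = counts.sum()
    return counts / total if total else counts.astype(float)

if __name__ == "__main__":
    mask = np.array([[1, 1, 0, 1],
                     [0, 0, 0, 0],
                     [1, 1, 1, 1]], dtype=bool)
    cld_x = chord_length_distribution(mask, axis=1, max_length=4)
    cld_y = chord_length_distribution(mask, axis=0, max_length=3)
    feature_vector = np.concatenate([cld_x, cld_y])  # 2-point statistics would be appended here
    print(feature_vector)
```

Concatenating the X and Y distributions (and, in the full workflow, the flattened 2-point statistics) yields one feature vector per SVE, which is the form of input that PCA expects.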

XRD data extraction
In addition to the influence of microstructures collected from the SEM microstructure, Hansen (1958) indicated that other characterization features, such as residual stress, also have direct effects on mechanical properties and machining behavior [32] . AM fabrication inherently leads to a rapid cooling rate and therefore large thermal gradients that cause phase transformations, and generates significant residual stresses. Telrandhe et al. (2017) have shown that near-surface residual stresses have a significant impact on mechanical behavior, such as fatigue and microhardness [33] . During the post-processing of metal AM parts, machining is often required to achieve desired geometric dimensioning and tolerancing (GD&T), as well as desired surface finish. Therefore, the near-surface residual stresses play a key role in the post-processing machining behavior of AM surfaces. XRD is one of the most widely used non-destructive near-surface residual stress measurement methods. XRD directly measures the strain due to the distortion of the crystalline lattice structure from residual stress.
In this study, the near-surface residual stress was considered an independent input to the S-P linkage model. Specimens representing each PBF process, heat treatment condition, and surfaces vertical and horizontal to the build direction were used to measure residual stress in the axial direction. All measurements were made using the sin²ψ technique in an X-ray diffractometer with a Cu K-α source (1.5406 Å). Strain and residual stresses were calculated from the d-spacing at 10 unique ψ tilts on the {1 0 3} crystallographic plane using Equation (II), in which ε_φψ represents the strain at the specific ψ and φ tilt. The Ti-6Al-4V elastic constant of 119 GPa was used to calculate the residual stress. The stress-free d-spacing is not necessary to calculate the residual stress in this method. Luo and Yang (2017) have shown that the relationship between d-spacing and sin²ψ might not be linear [34]. When shear stresses are present, they are manifested in the d-spacing-sin²ψ plot. Hence, strain was calculated for the different ψ tilt angles in this study using Equation (II), and a linear regression analysis was performed to calculate the principal stress of each specimen.
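The regression step can be illustrated with the textbook biaxial sin²ψ relation, in which the stress is proportional to the slope of d-spacing against sin²ψ. The sketch below is an assumption-laden example, not the authors' procedure: it treats the 119 GPa constant as Young's modulus, assumes a Poisson's ratio of 0.34 (not stated in the text), uses the fitted d-spacing at ψ = 0 in place of the stress-free value, and generates synthetic data.

```python
import numpy as np

def residual_stress_sin2psi(psi_deg, d_spacing, youngs_modulus, poisson_ratio):
    """Estimate residual stress from a linear fit of d-spacing versus sin^2(psi).

    Assumed relation: sigma = E / (1 + nu) * (1 / d_ref) * d(d_spacing)/d(sin^2 psi),
    with d_ref taken as the fitted d-spacing at psi = 0.
    The stress is returned in the units of youngs_modulus.
    """
    x = np.sin(np.radians(psi_deg)) ** 2
    slope, intercept = np.polyfit(x, d_spacing, 1)
    d_ref = intercept
    stress = youngs_modulus / (1.0 + poisson_ratio) * slope / d_ref
    return stress, slope, intercept

if __name__ == "__main__":
    E, nu = 119.0, 0.34                     # GPa (from the text); nu is an assumed value
    psi = np.linspace(0.0, 45.0, 10)        # 10 tilts; the angle range is hypothetical
    d0, true_stress = 1.3320, -0.30         # angstrom, GPa (synthetic values)
    d = d0 * (1.0 + (1.0 + nu) / E * true_stress * np.sin(np.radians(psi)) ** 2)
    stress, _, _ = residual_stress_sin2psi(psi, d, E, nu)
    print(f"recovered stress: {stress:.3f} GPa")
```

A clearly curved or split d-spacing versus sin²ψ plot would indicate that this linear form is inadequate, which is the non-linearity concern raised in the text.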

EBSD data extraction
Grain orientation during machining of polycrystalline metals influences the machining response, that is, cutting force and surface finish quality. Demir and Mercan (2018) have shown that the effect of elastic and plastic anisotropy of the material on cutting force cannot be neglected [35] . Hence, several models have been developed for deformation mechanisms and stress in polycrystalline alloys like Ti-6Al-4V to predict and simulate slip, twinning, and detwinning in hexagonal (HCP) unit cell in the past decades.
The Schmid factor (SF) was selected in this study to capture the local slip and twinning activity in the microstructure, which has direct effects on the related machining behavior at the macroscale. SF is defined as the ratio of the shear stress on the slip system, τ_s, to the applied stress, σ_n, and can also be expressed as the product of the cosines of the angle φ between the applied stress axis and the slip plane normal and the angle λ between the applied stress axis and the slip direction:

$$\mathrm{SF} = \frac{\tau_s}{\sigma_n} = \cos\phi \, \cos\lambda \qquad \text{(III)}$$

It has been indicated that the Schmid-based model assumes that the macroscopic stress is used in the calculation of the local resolved shear stress in a specific slip and twinning system, and that the evolution of the critical resolved shear stress depends on the local stress and plastic strain, that is, the applied slip system and current grain orientation [36], which is directly related to the machining behavior of the material.
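The cosine-product form of the Schmid factor is straightforward to evaluate once the loading axis, slip plane normal, and slip direction are expressed as Cartesian vectors in a common frame. The sketch below assumes that conversion (for example, from HCP Miller-Bravais indices and EBSD orientations) has already been done, reports the magnitude of the factor, and uses the hypothetical function name `schmid_factor`.

```python
import numpy as np

def schmid_factor(load_axis, plane_normal, slip_direction):
    """Magnitude of cos(phi) * cos(lambda) for one slip system.

    phi is the angle between the load axis and the slip plane normal;
    lambda is the angle between the load axis and the slip direction.
    All inputs are Cartesian vectors in the same reference frame.
    """
    load = np.asarray(load_axis, dtype=float)
    normal = np.asarray(plane_normal, dtype=float)
    slip = np.asarray(slip_direction, dtype=float)
    load = load / np.linalg.norm(load)
    normal = normal / np.linalg.norm(normal)
    slip = slip / np.linalg.norm(slip)
    return abs(np.dot(load, normal) * np.dot(load, slip))

if __name__ == "__main__":
    # Plane normal and slip direction both at 45 degrees to the load axis
    print(schmid_factor([0, 0, 1], [1, 0, 1], [-1, 0, 1]))
    # Load applied along the plane normal: no resolved shear on this system
    print(schmid_factor([0, 0, 1], [0, 0, 1], [1, 0, 0]))
```

The factor is bounded by 0.5, reached when both angles are 45°, and vanishes when the load is parallel or perpendicular to the slip plane normal.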
In Ti-6Al-4V, the dominant α phase has an HCP crystal structure, which provides several families of slip systems. It has been shown that the dominant slip systems in Ti-6Al-4V are basal, prismatic, and first-order pyramidal [37]. These are the three slip systems considered in this study.
While the SF is based on the isostress assumption, another descriptor, the Taylor factor (M), is based on an isostrain assumption and measures the work done on the crystal for a given orientation and deformation. It can be represented as:

$$M = \frac{dW}{\tau_{CRSS}\, d\varepsilon} \qquad \text{(IV)}$$

where τ_CRSS is the crystal critical resolved shear stress, dW is the rate of work done, and dε is the incremental strain.
Demir (2008) pointed out that, for a given transformed state of strain in a crystal, the Taylor factor reflects the minimum work done against the slip resistances among the combinations of five out of 12 slip systems [38]. Since this study aims to predict the machining behavior from the material structure, general shear stress and strain conditions were applied to the SF and Taylor factor calculations. The SF and Taylor factor values were calculated on three replicates of EBSD data for every material surface and were treated as independent input variables in the S-P linkage workflow.

Machine set up
The machining experiment was performed on a Haas VF2SS three-axis vertical CNC machining center. Peripheral milling was applied to the specimen blocks mounted on a custom-built vise. The cutting tools selected for the experiments were 6.35 mm (1/4 inch) nominal diameter six-flute carbide end mills with KC635M TiAlN coating (Model HPFT250S6075), which is recommended for machining titanium alloys. The tools have a flat square end geometry, with a zero radial rake angle and a 45° axial rake angle, which simplified the modeling of tool geometry for the specific cutting energy calculation. The vise and workpiece were mounted on a Kistler 9257A dynamometer, which was connected to a Kistler Type 5010 charge amplifier, as shown in Figure 2.
The cutting force signals were collected and stored by a Data Physics Quattro Dynamic Signal Analyzer using a sampling frequency of 2560 Hz. After each tool setup, a dial indicator with a resolution of 0.00127 mm was used to measure the tool runout to ensure that each machining path had an acceptable tolerance.
Based on the Kennametal tool manufacturer's recommendation for Ti-6Al-4V alloys milling, three levels of machining parameters were selected for the machining experiments, as shown in Table 1.
As described above, for EB-PBF, as-AM L-PBF, and heat-treated L-PBF specimens, two feed directions, XY and XZ, as shown in Figure 3, that are perpendicular and parallel to the build orientation respectively, were utilized in the experiment. The radial immersion was 50% for the experiment.

Specific cutting energy
A normalized specific cutting energy was used in this study for cutting force analysis and machining behavior representation. A mechanistic force model proposed by Kline and DeVor (1983) was used to calculate effective specific cutting energy from average cutting force data collected [39] . The cutting feed direction is shown in Figure 3 across all material surfaces (i.e., XY and XZ feed in EB-PBF, as-AM L-PBF, and HT L-PBF).
Every cutting edge was treated as a stack of small disk elements with height dz up to the axial depth of cut. The angular engagement of the cutter is written in the generalized form A(i,j,k), where i indicates the number of disk elements, j indicates the angular positions of the cutter θ(j), and k defines the number of flutes. Nf is the total number of flutes, H is the helix angle, and R is the nominal radius of the end mill. The tangential force element dFt and radial force element dFr are expressed in terms of Tc, the chip thickness for the current cutting condition, and of Kt and Kr, the specific cutting coefficients, which are used to calculate the specific cutting energy. Considering the cutting forces collected in the X and Y directions, the tangential and radial elements are resolved into the force elements dFx and dFy. The total force applied to the material is the sum of the dFx and dFy element forces from all disks within the axial depth of cut.
The final expressions for the cutting force and the related specific cutting energy are given by Equations (VIII) and (IX). In these expressions, the indicator with indices (i,j,k) shows whether the cutting edge element is engaged with the workpiece, and Fx and Fy are the average cutting forces over one cutter rotation in the X and Y directions. The specific cutting energy value Kt can be calculated from Equations (VIII) and (IX) based on the instantaneous cutting forces Fx and Fy collected from the dynamometer (using MATLAB in the present study).
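The disk-element procedure described above can be sketched numerically. The example below is a hedged illustration, not the MATLAB code used in this study: it assumes the commonly cited mechanistic form dFt = Kt·Tc·dz and dFr = Kr·dFt, a chip thickness Tc = (feed per tooth)·sin(φ), up milling with entry at φ = 0, and one particular sign convention for resolving forces into X and Y. The feed, depth, and helix values and all function names are hypothetical.

```python
import numpy as np

def average_forces(Kt, Kr, feed_per_tooth, radius, helix_deg, n_flutes,
                   axial_depth, radial_depth, n_disks=40, n_angles=3600):
    """Average Fx and Fy over one cutter rotation from summed disk elements."""
    dz = axial_depth / n_disks
    z = (np.arange(n_disks) + 0.5) * dz                    # disk mid-heights
    theta = np.arange(n_angles) * 2 * np.pi / n_angles     # cutter rotation angles
    lag = z * np.tan(np.radians(helix_deg)) / radius       # helix lag of each disk
    pitch = 2 * np.pi / n_flutes
    phi_exit = np.arccos(1.0 - radial_depth / radius)      # exit angle (up milling)
    # Immersion angle for every (disk i, rotation angle j, flute k) element
    phi = (theta[None, :, None] + pitch * np.arange(n_flutes)[None, None, :]
           - lag[:, None, None]) % (2 * np.pi)
    engaged = phi <= phi_exit                              # engagement indicator
    chip = feed_per_tooth * np.sin(phi) * engaged          # chip thickness Tc
    dFt = Kt * chip * dz
    dFr = Kr * dFt
    dFx = -dFt * np.cos(phi) - dFr * np.sin(phi)           # assumed sign convention
    dFy = dFt * np.sin(phi) - dFr * np.cos(phi)
    return dFx.sum(axis=(0, 2)).mean(), dFy.sum(axis=(0, 2)).mean()

def fit_coefficients(Fx_meas, Fy_meas, **geometry):
    """Recover Kt and Kr from measured average forces (model is linear in Kt, Kt*Kr)."""
    ax, ay = average_forces(1.0, 0.0, **geometry)          # force per unit Kt
    bx, by = average_forces(1.0, 1.0, **geometry)
    bx, by = bx - ax, by - ay                              # force per unit Kt*Kr
    Kt, KtKr = np.linalg.solve([[ax, bx], [ay, by]], [Fx_meas, Fy_meas])
    return Kt, KtKr / Kt

if __name__ == "__main__":
    geometry = dict(feed_per_tooth=0.05, radius=3.175, helix_deg=45.0,
                    n_flutes=6, axial_depth=1.0, radial_depth=3.175)  # mm, degrees
    Fx, Fy = average_forces(2000.0, 0.4, **geometry)       # synthetic "measurement"
    print(fit_coefficients(Fx, Fy, **geometry))
```

Because the average forces are linear in Kt and in the product Kt·Kr, two measured averages suffice to solve for both coefficients; with forces in N and lengths in mm, Kt has units of N/mm², which is equivalent to J/cm³.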
The specific cutting energy is also related to the machining parameters involved in each run. Therefore, the machining parameters feed, cutting speed, and depth of cut are three independent input variables along with the microstructure characterization input data (Section 2.2). The specific cutting energy is computed as the output of the S-P linkage workflow.

Material microstructure representation
In this study, 200 sets of SVEs were used to represent each of the AM Ti-6Al-4V microstructures. Each SVE must capture enough information to avoid area sensitivity. For titanium, Priddy et al. (2017) pointed out that the influence coefficient decayed to around zero within a ~210 µm region [40]. Therefore, square SVEs with side lengths of 210 µm were employed in data collection. For every AM surface (L-PBF XY HT, L-PBF XZ HT, L-PBF XY NHT, L-PBF XZ NHT, EB-PBF XY, and EB-PBF XZ), 200 microstructure images were collected through SEM characterization, for a total of 1200 images.
MATLAB batch processing programs were used to compute 2-point correlation functions and CLDs for all SEM images. Figure 4a shows an example of the EB-PBF Ti-6Al-4V microstructure collected. Small needle-shaped columnar α grains that grow along the β boundaries to form a typical alternating α+β AM Ti-6Al-4V microstructure can be observed. As observed in Figure 4b, the normalized 2-point correlation reflects the small grain size and particle-to-particle distance, that is, grain density. Figure 4c shows that the CLD functions in the X and Y directions indicate a longer grain shape distribution in the X-direction than in the Y-direction, which was expected in this EB-PBF XZ plane.
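The two descriptors can be illustrated with a short Python sketch. This is not the MATLAB batch code used in the study. It assumes a segmented (binary) image in which 1 marks the phase of interest, periodic boundaries for the 2-point autocorrelation, and chord lengths counted in pixels. The function names are hypothetical.

```python
import numpy as np

def two_point_autocorrelation(phase):
    """Periodic 2-point autocorrelation of a binary phase map via FFT.
    The zero-shift value (image center after fftshift) equals the phase fraction."""
    phase = np.asarray(phase, dtype=float)
    spectrum = np.fft.fft2(phase)
    corr = np.fft.ifft2(spectrum * np.conj(spectrum)).real / phase.size
    return np.fft.fftshift(corr)

def chord_length_counts(phase, axis):
    """Histogram of chord lengths (runs of 1s) measured along the given axis.
    axis=1 scans along rows (X-direction), axis=0 scans along columns (Y-direction).
    counts[n] is the number of chords that are n pixels long."""
    phase = np.asarray(phase, dtype=int)
    lines = phase if axis == 1 else phase.T
    counts = np.zeros(lines.shape[1] + 1, dtype=int)
    for line in lines:
        run = 0
        for value in line:
            if value:
                run += 1
            elif run:
                counts[run] += 1
                run = 0
        if run:                      # chord that touches the image edge
            counts[run] += 1
    return counts

# Tiny hypothetical example image.
example = np.array([[1, 1, 0, 1],
                    [0, 0, 0, 0]])
corr = two_point_autocorrelation(example)
chords_x = chord_length_counts(example, axis=1)
chords_y = chord_length_counts(example, axis=0)
```

Flattening the correlation map and concatenating it with the X and Y chord histograms gives one descriptor vector per SVE, which is the kind of input the dimensionality reduction step operates on.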
Since the total dimensions of the correlation function and CLDs for one SVE microstructure vector were over 3000, zero and near-zero tail ends of the distribution that would not affect the accuracy of the statistical functions were not included in the S-P linkage model.
Subsequently, PCA was applied to reduce the dimension of the microstructure descriptor matrix. For each microstructure SVE vector, all variables denoted by the statistical functions can be decomposed into a combination of orthogonal basis vectors, named principal components (PCs), and weights (PC variance scores). Figure 5 shows the individual variance for each PC. The cumulative variance increases with the number of PCs and exceeds 99% when 238 PCs are retained. Therefore, the total dimension of the microstructural representation was reduced to 238, which reduces the dimension of the descriptor matrix by more than 92% when compared with the original input matrix. This was critical for developing an S-P linkage model that was both computationally feasible and accurate in predicting material response.
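The selection of the number of PCs from a cumulative variance threshold can be sketched as follows. This is a minimal NumPy illustration on synthetic data, not the study's descriptor matrix. The function name, the 99% target argument, and the synthetic low-rank data are assumptions made for the example.

```python
import numpy as np

def pca_reduce(descriptors, target=0.99):
    """Project rows of `descriptors` onto the fewest PCs whose cumulative
    explained variance reaches `target`. Returns (scores, cumulative, n_pcs)."""
    centered = descriptors - descriptors.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    cumulative = np.cumsum(explained)
    # First index where the cumulative variance reaches the target, capped at the rank.
    n_pcs = min(int(np.searchsorted(cumulative, target)) + 1, len(cumulative))
    scores = u[:, :n_pcs] * s[:n_pcs]     # PC scores for each sample
    return scores, cumulative, n_pcs

# Synthetic stand-in: 200 samples, 50 raw descriptors driven by 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
raw = latent @ mixing + 1e-3 * rng.normal(size=(200, 50))

scores, cumulative, n_pcs = pca_reduce(raw)
```

Applied to the figures quoted above, keeping 238 of more than 3000 descriptor dimensions corresponds to a reduction of 1 - 238/3000 ≈ 92%.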
In this study, XRD residual stresses were measured along the direction of machining feed. The X-ray penetration depth varied between 25 and 50 µm, depending on the tilt angle. Using XRD crystal strain measurements and Equation (II), the average surface residual stresses for all six AM material surfaces are presented in Table 2.
A positive residual stress value indicates tensile stress, while a negative value indicates compressive stress. The residual stress on L-PBF surfaces after heat treatment shows a decrease in shear stress. However, heat treatment does not heavily influence the near-surface crystal principal stress in the Y and Z directions in L-PBF specimens. The as-AM top surface of the EB-PBF sample shows the largest compressive stress in the Y-axis direction and the smallest principal stress in the Z-direction, which indicates tensile stress and large shear stress along the build orientation in the EB-PBF sample.
Based on the EBSD measurements, the SF and Taylor factor (M values) were calculated from the crystal orientation distribution using a MATLAB program. Average SF and M values for each SVE were added to the PCA descriptor matrix. Zhang et al. (2019) indicated that SVEs with high variants of active twinning and detwinning should have high SF values [19]. Considering the major slip systems in α-Ti-6Al-4V, the basal, prismatic, and first-order pyramidal slip systems were applied in the Schmid model, with a critical resolved shear stress (CRSS) ratio of 1:0.7:3 for these three slip systems. Demir (2009) showed that the deformation strain (DS) in the Taylor factor calculation for a peripheral milling process can be expressed as a sum of pure shear and angular shear, as shown in Equation (XII) [41]. Since the strain varies with the machining parameters, three different strain rates were selected in the Taylor model calculation. Figure 6 shows an example of an EBSD observation on the heat-treated L-PBF XY plane sample, and Table S1 in the Supplementary File shows the average SF for the different slip systems and the Taylor factor at the different strain rates for the different AM material surfaces.
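For readers unfamiliar with the Schmid factor, the underlying relation SF = cos(φ)·cos(λ) can be sketched in a few lines of Python. This is a generic illustration and not the MATLAB routine used in the study. It assumes that the loading direction, slip plane normal, and slip direction are already expressed as Cartesian vectors in a common frame, and it omits the conversion from hexagonal indices, the rotation by EBSD orientations, and the CRSS weighting. The function name is hypothetical.

```python
import math

def schmid_factor(load, plane_normal, slip_direction):
    """Schmid factor |cos(phi) * cos(lambda)| for one slip system, where phi is
    the angle between load and plane normal and lambda the angle between load
    and slip direction. Inputs are Cartesian vectors of any length."""
    def unit(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        return [x / norm for x in vector]

    l, n, d = unit(load), unit(plane_normal), unit(slip_direction)
    cos_phi = sum(a * b for a, b in zip(l, n))
    cos_lambda = sum(a * b for a, b in zip(l, d))
    return abs(cos_phi * cos_lambda)

# Slip plane and direction both inclined 45 degrees to the load: the maximum value.
sf_max = schmid_factor([0, 0, 1], [0, 1, 1], [0, -1, 1])
# Load normal to the slip plane: no resolved shear stress on this system.
sf_zero = schmid_factor([0, 0, 1], [0, 0, 1], [1, 0, 0])
```

Averaging such values over the orientations measured within an SVE would give one SF feature per slip system, which is the form in which SF enters the descriptor matrix described above.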

ML model and cross-validation results
The goal of this study is to establish the S-P linkage by training a regression ML model that relies on machining parameters, microstructure functions, residual stress, and SF inputs to predict the specific cutting energy.
To construct a robust ML model, the first step is data preparation. In this study, the dataset contains 14,400 sample points with 72 different machining parameter combinations and 200 sets of microstructure data for every AM material surface. After PCA of the SEM microstructure, this was reduced to one response variable (specific cutting power) and 262 features: 238 PCA-processed features representing the microstructure functions, three features for the residual stress, nine features for the SF inputs, nine features for the Taylor factor inputs, and three features for the machining parameter combinations (feed, speed, and depth of cut). For each machining parameter combination, three replicates were conducted, and the average specific cutting energy was treated as the response variable.
The next step is to train a regression model on a randomly selected 80% of the dataset and to hold out the remaining 20% for testing. This study employed XGBoost (eXtreme Gradient Boosting, a tree-based approach) and linear regression models for training. Chen and Guestrin (2016) presented the XGBoost model, which has been widely used due to its accuracy and interpretability [28]. XGBoost is a regularized gradient boosting machine that controls overfitting by employing a regularized objective function that incorporates both a convex loss function and a penalty term for the regression tree functions. The classical linear regression model is used as the baseline model. Both models were implemented using Sklearn packages in Python. A grid search approach was applied to tune the hyperparameters, including the maximum depth of each subtree and the number of subtrees. A grid search evaluates different combinations of hyperparameters by cross-validation and selects the best hyperparameter set to train the estimator of a learning model.
During validation, the test set (20%) was applied to the best XGBoost estimator and the linear model to validate their accuracies with the root mean square error (RMSE) being the evaluation metric for the accuracy of the ML model.
To better understand the influence of machining parameters, microstructure functions, residual stress, and EBSD feature inputs on model accuracy, five different combinations of features were designed for training and testing. The first combination uses only machining parameters (MP) as the ML inputs; the second combination considers machining parameters and EBSD features (MP+EBSD); the third combination considers machining parameters and residual stress (MP+XRD); the fourth design considers machining parameters and SEM microstructure functions (MP+SEM); and the last design integrates all the input features (All = MP+SEM+EBSD+XRD). The rationale for this approach is to understand the individual and interaction effects of machining conditions, grain size, grain density, grain orientation, and residual stress on machining behavior.
As is common in ML studies, training and testing of all five feature combinations were repeated multiple times (ML runs = 10), each time with a randomly selected training set (80%) and test set (20%). ML results for predicting specific cutting power during machining of PBF Ti-6Al-4V AM surfaces are presented in Figure 7, where the Y-axis represents the average RMSE and the corresponding standard deviations are shown for each feature combination. It should be noted that when all features are used, the standard deviation for the XGBoost ML model is minimal (0.5% RMSE with ±0.025% standard deviation).
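The repeated random-split evaluation over feature combinations can be sketched as follows. This is an illustrative NumPy example with a least-squares linear model and synthetic data, not the study's pipeline. The column indices assigned to each feature group are hypothetical placeholders.

```python
import numpy as np

def holdout_rmse(X, y, seed, test_fraction=0.2):
    """Fit a linear model on a random 80% of the rows and return the RMSE
    on the remaining 20%."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_test = int(round(test_fraction * len(y)))
    test, train = order[:n_test], order[n_test:]
    a_train = np.column_stack([np.ones(len(train)), X[train]])
    a_test = np.column_stack([np.ones(len(test)), X[test]])
    coef, *_ = np.linalg.lstsq(a_train, y[train], rcond=None)
    residual = y[test] - a_test @ coef
    return float(np.sqrt(np.mean(residual**2)))

# Hypothetical column layout of the feature matrix.
groups = {"MP": [0, 1, 2], "EBSD": [3, 4], "XRD": [5], "SEM": [6, 7, 8]}
combinations = {
    "MP": ["MP"],
    "MP+EBSD": ["MP", "EBSD"],
    "MP+XRD": ["MP", "XRD"],
    "MP+SEM": ["MP", "SEM"],
    "All": ["MP", "SEM", "EBSD", "XRD"],
}

# Synthetic data in which every column carries some signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 9))
y = X @ rng.normal(size=9) + 0.1 * rng.normal(size=300)

summary = {}
for name, parts in combinations.items():
    columns = [c for part in parts for c in groups[part]]
    scores = [holdout_rmse(X[:, columns], y, seed) for seed in range(10)]  # 10 runs
    summary[name] = (float(np.mean(scores)), float(np.std(scores)))
```

Each entry of `summary` holds the mean and standard deviation of the RMSE over the ten runs, which is the pair of quantities plotted per feature combination in a chart such as Figure 7.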

Discussion
As shown in Figure 7, predictors trained with only machining parameters achieve about 90% accuracy in both the XGBoost and linear models. When the SF and Taylor factor features are independently included in the ML models, the XGBoost model's accuracy increases to 94.6%, with a larger variance (±5%), whereas the linear regression model does not improve. One possible reason is that the linear model is less capable of synthesizing high-dimensional data. It is evident that the incorporation of residual stress and crystal orientation increased the accuracy of the ML model in predicting machining behavior. The larger variance shows that additional descriptors that represent other features of a microstructure should be considered. It is also evident that linear regression models are not sufficient to accurately predict the machining behavior of complex microstructures produced through metal AM. The accuracy of the ML model increased to 97% when machining parameters and SEM spatial functions were considered. This can be attributed to the relatively smaller grain size and higher grain density in PBF AM systems, which are captured in the SEM descriptors (2-point correlation function and CLDs). Finally, when all the design features are included, the XGBoost ML model achieved an accuracy of 99.5% ± 0.025%.
It can be deduced that the XGBoost ML model developed in this study is consistently more accurate than the linear regression model, which is widely used in this field. The accuracy of the linear regression model decreases in a high-dimensional dataset, such as MP+SEM. Zhang et al. (2018) indicated that overfitting might occur in ML models when the dataset has a large number of input factors and a relatively small number of responses [42]. Due to the smaller dataset volume (72 data samples) in both combinations of machining parameters with residual stresses and EBSD, the variance of the testing accuracy is large for both ML models, which could be attributed to overfitting in the XGBoost model. It is inferred that with a larger dataset volume, the XGBoost model becomes more accurate in prediction performance.
To isolate individual effects, a feature importance analysis was performed on all the features considered in this XGBoost ML model, namely, residual stresses from XRD measurements, SFs and Taylor factors captured from the EBSD map, and features from PCA of the SEM images. In the XGBoost ML model, the weight (importance) of a feature increases each time it participates in the creation of a tree during the forest building stage. Dhaliwal and Nahid (2018) reported that as a tree grows, the importance of a feature increases with every gain from splits that use that feature [43]. Hence, the feature score (F-score) is introduced to evaluate the importance of a feature by counting the number of times the feature appears in a tree, and it is presented in Figure 8.
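A split-count importance of this kind can be sketched by counting how often each feature is used as a split variable across the boosted trees. The example below is an assumption-laden illustration: it uses scikit-learn's `GradientBoostingRegressor` as a stand-in for XGBoost and synthetic data in which only the first feature carries signal. In the xgboost package, the analogous counts are reported by `Booster.get_score(importance_type="weight")`.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic data: only feature 0 drives the response (hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 5.0 * X[:, 0] + 0.1 * rng.normal(size=300)

model = GradientBoostingRegressor(
    n_estimators=40, max_depth=2, random_state=0).fit(X, y)

# Count the number of splits that use each feature over all trees.
split_counts = np.zeros(X.shape[1], dtype=int)
for tree in model.estimators_.ravel():
    split_features = tree.tree_.feature
    split_features = split_features[split_features >= 0]   # negative values mark leaves
    split_counts += np.bincount(split_features, minlength=X.shape[1])

ranking = np.argsort(split_counts)[::-1]   # most frequently used feature first
```

A count-based score rewards features that are split on often, regardless of how much each split improves the fit, so it can rank features differently from a gain-based importance.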
As expected, machining parameters play the most important role in the S-P linkage construction. Based on the F-scores of the last model, where all features are included, the surface residual stresses (RSs) have the highest impact among all material structure features, which can be observed in Figure 8e, followed by the SF in the prismatic slip systems and the small-strain Taylor factor calculated from EBSD measurements. PCs calculated from the SEM microstructure data also show a positive effect on predicting the specific cutting energy. If only SEM information with the machining parameters was used to train the XGBoost model, PCs with higher variance (the initial few PCs) could positively influence the prediction accuracy of specific cutting energy, as shown in Figure 8d. However, incorporating additional PCs of the SEM microstructure results in lower prediction accuracy. This could be due to the collinearity of SEM data with XRD and EBSD measurements. In addition, additional PC elements of microscale structural information may not represent useful data for the specific prediction goal of machining behavior. SFs and Taylor factors calculated from all three slip systems also possess some importance in predicting the specific cutting energy. Since the Taylor factor measurement depends strongly on EBSD data quality, zero solutions in the EBSD data could reduce the accuracy of the Taylor factor.
In summary, incorporating microstructure functions, SFs and Taylor factors extracted from EBSD data, and residual stress calculated from XRD measurements was the most effective approach for accurately predicting machining behavior when compared with a baseline model in which only machining parameters were incorporated, for both the linear regression and XGBoost models. It is also evident that a high-dimensional and large-volume dataset greatly improves the accuracy of prediction, but advanced ML approaches are necessary to handle such complex datasets. Due to the relatively small data volume of residual stress, SFs, and Taylor factors, the MP+EBSD and MP+XRD models show a lower accuracy than the MP+SEM model. However, the model with all features (All) is highly accurate in predicting AM Ti-6Al-4V machining behavior across different AM surfaces and build orientations, where the complex thermal gradients in PBF processes lead to different residual stress conditions and crystal structures. This can be attributed to the incorporation of grain orientation information and residual stress alongside SEM and machining parameters. Future work will extend this model to similar alloys for AM (e.g., Ti6242, Ti-6Al-2Sn-4Zr-2Mo-0.08Si) and to other AM processes for Ti-6Al-4V (e.g., wire- or powder-based, laser, electron beam, or plasma-based DED processes).
This study developed a new workflow to establish and validate high-accuracy ML models for S-P linkages based on reduced-order grain morphology information for the machining behavior of PBF Ti-6Al-4V alloys. The data extraction methods were efficient and validated. In addition, Paulson et al. (2019) have established P-S linkages that connect metal AM microstructure with several AM processing conditions based on laser power density and scanning strategies [15]. Based on the findings from this study, a full metal AM PSP linkage can be built to link AM processing with final post-processing machining behavior based on the material characterization and reduced-order data science approach.

Conclusion
A novel Ti-6Al-4V AM workflow is presented to build an S-P linkage between microstructure evolution and machining behavior through the use of advanced data science techniques. This workflow revealed the influence of multiple features of the grain morphology created by the PBF process on specific cutting power during post-AM machining. Five steps were introduced in this approach: (1) microstructure data processing, (2) dimensionality reduction, (3) machining data extraction, (4) extraction and evaluation of S-P linkages, and (5) feature importance analyses. Due to the large dataset used in this workflow, PCA and ML tools were developed to overcome the difficulties in conventional material science analysis. A comprehensive set of fully functional programs was created to batch process the large SEM, EBSD, XRD, and cutting force datasets for structure and properties.
This novel workflow was highly accurate (>99%) in predicting the machining behavior of PBF Ti-6Al-4V microstructures. Grain morphology features included in the workflow were microstructure spatial correlation functions, CLDs, residual stress, SFs, and Taylor factors, which significantly improved the accuracy of machining behavior prediction. This study provides a feasible routine for predicting the machining behavior of metal AM parts in post-processing.
Although the S-P linkage showed excellent results, this study was limited to PBF AM parts. Additional studies using wrought material and other AM processes, such as wire- and powder-fed DED technologies, are still needed.

Fx: Average cutting force parallel to feed direction (N)
Fy: Average cutting force normal to feed direction (N)