1 Introduction

Topology optimization derives the optimal material densities in a target domain for a given set of load and boundary conditions. In contrast to other typical structural optimization results, topology optimization provides a unique and organic shape. Therefore, this method is widely studied in various applications that require novel designs. The objective of topology optimization (Bendsøe and Kikuchi 1988; Bendsøe 1989) is to minimize the overall weight of structure while satisfying constraints such as volume, displacement, stress, etc.

The initial topology optimization theory was established by Bendsøe and Kikuchi (1988); Bendsøe (1989); Zhou and Rozvany (1991). They introduced a penalty value to derive the optimum material density layouts in the domain to augment the common optimization algorithm, called Solid Isotropic Material with Penalization (SIMP). This advanced optimization theory was improved and formalized between 1990 and 2000 through further studies by Diaz and Sigmund (1995); Sigmund (1997); Bendsøe and Sigmund (1999). Another popular topology optimization approach is the level-set method initially proposed by Sethian and Wiegmann (2000). Subsequently, many variations to the original level-set method have been proposed by Wang et al. (2003); Allaire et al. (2004). This method utilizes a flexible implicit representation of the interface (or boundary) described by a signed distance function. During the optimization, the level-set equations describing the boundary shapes are updated to minimize structural compliance, similar to the SIMP method.

However, the conventional topology optimization method has two major disadvantages; (1) the resulting organically shaped structures are often difficult to fabricate, (2) the topological optimization process is computationally expensive. Previously, it was challenging to manufacture the complex organic shapes derived by topology optimization due to technical limitations or the need for considerable resources to materialize the designs. However, these manufacturing challenges have been overcome by technological developments such as advanced computer numerical control (CNC) (Newman and Nassehi 2007), computer-aided design (CAD) (Groover and Zimmers 1983; Ji and Marefat 1997), 3-D printing, and additive manufacturing (Wong and Hernandez 2012; Frazier 2014; Yan et al. 2018; Shahrubudin et al. 2019).

The second disadvantage was addressed through various types of topology optimization, which have algorithmic differences (Bendsøe and Kikuchi 1988; Bendsøe 1989; Zhou and Rozvany 1991; Sankaranarayanan et al. 1994; Diaz and Sigmund 1995; Sigmund 1997; Bendsøe and Sigmund 1999; Wang et al. 2003; Allaire et al. 2004; Sigmund 2007) in the mathematical process. However, the improved algorithms, with iterative finite element method (FEM) calculations, still had common flaws that required significant computational effort to determine the optimum structures. Recently, various studies have employed deep-learning-based algorithms to improve or reduce the expensive calculations required for structural optimization. Specifically for topology optimization, deep convolutional neural networks (CNN) being widely applied since they are suitable for processing spatial physics-based data in the domain. Sosnovik and Oseledets (2017) first attempted to use a trained network to improve the resolution of the final material layout in a 2-D domain. Banga et al. (2018) proposed a method to predict optimal 3-D structures using encoders and decoders, which can interpret multi-dimensional data and provide similar dimensional outputs (i.e., image, video, etc.) by inputting the static analysis results using the initial geometry. Zhang et al. (2019) tested various combinations of static analysis results from the baseline geometry as an input to predict the final material layouts using Unet (Ronneberger et al. 2015), which consists of complex U shape architecture including a set of encoders and decoders. They were able to develop an accurate surrogate model predicting the results as almost the same as the ground truth results. Oh et al. (2019) proposed a deep-learning-based framework by integrating topology optimization and generative design. Yu et al. (2019) developed a CNN-based encoder and decoder network to predict the optimum structures in both low-resolution and high-resolution. In other studies, the approaches were extended by considering complex optimization processes such as stress constraints (Abueidda et al. 2020), manufacturing constraints (Chandrasekhar and Suresh 2020), and metamaterial design (Chandrasekhar and Suresh 2020). Nie et al. (2021) proposed a new-data-driven topology optimization algorithm enhancing the generality by replacing the fixed boundary or load conditions with varying vectors. Qiu et al. (2021) developed a surrogate model using combinations of such advanced neural networks as Unet and recurrent neural network (RNN), to implement the optimum model predictions for 2-D and 3-D structures. Xue et al. (2021) utilized super-resolution CNN (SCNN) to implement high-resolution topology optimization in both 2-D and 3-D domains. Recently, Muñoz et al. (2022) suggested a hybrid structural optimization by linking the two structural optimization processes: topology optimization and typical parametric shape optimization. Furthermore, they developed an ML-based communication model which can share the geometrical information between topology and parametric shape optimization modules to derive the manufacturable optimum topologies. Kazemi et al. (2022) employed generative adversarial networks (GANs) to derive the conceptual designs considering coupled physics such as structural load and heat conduction. They designed a map to connect the input as the strain energy density field and output as the optimum topologies having the same configurations, which can be derived by the level-set method. Seo and Kapania (2022) provided a new framework for various applications using the commercial FEM software, Abaqus, and provided more general applications such as stiffened panel design. This previous research was able to reduce the total calculation time by 98 % for deriving the optimum global structures using the network trained with datasets from local example cases.

However, these ML-based topology optimization methods (Banga et al. 2018; Yu et al. 2019; Zhang et al. 2019; Abueidda et al. 2020; Chandrasekhar and Suresh 2020; Nie et al. 2021; Xue et al. 2021; Seo and Kapania 2022) still had a limitation that proposed surrogate models are applicable only for very limited cases, namely simple geometries, load, and boundary conditions (i.e., cantilever, simply supported and MBB beams). Therefore, the present study suggests a new framework to overcome these restrictions of limited applicability by proposing a different and additional mapping process. The proposed mapping process makes it possible to transfer the nodal or elemental data on any arbitrary mesh to tensor format by assigning the averaged nodal or elemental values to the corresponding tensor element. These tensor form data can be directly used to train the network. Therefore, we can develop a surrogate model that can be employed for diverse cases using any shape of structures. Various mapping resolutions are tested to determine the efficient size of the grid map. Predicted results on the map are transferred onto the actual mesh bypassing the inverse mapping function to conduct the structural analysis and evaluate the performance of the predicted structure. In addition, the classification network is employed to develop the surrogate model for predicting the optimum topologies rather than the regression model.

The classification model provides a set of final results clearly divided into structural and void regions in a short time compared with the previous regression model approach (Seo and Kapania 2022). In addition, the data augmentation technique was first applied for the physical system to efficiently increase the amount of data without actual calculations. Finally, a high-performance network was developed through parametric studies concerning the amount of data and the complexity of the network. The classification and regression models are tested with 1,000 samples to evaluate the proposed framework. The proposed method provides 5 % more accurate results than the previous regression approach.

Various network architectures are explored by varying the number of data points to derive the best-performed network. Using 4,000 test cases, the selected best-performed network showed a 90 % accuracy based on the IoU score, and most cases showed more than 80 % accuracy. In addition, the trained network was applied for local layout optimization of practical examples such as single crank, curved crank, and multiple column-wise structures. In particular, it was possible to reduce calculation time by 98 % when using the proposed method compared to the conventional SIMP-based topology optimization for the repeated column structure design problem. Therefore, the proposed method in this study overcomes the limitations and weaknesses of previous studies in terms of prediction accuracy and calculation efficiency.

This paper is organized as follows: Sect. 2 introduces the basic theory of the conventional SIMP-based topology optimization. Section 3 presents the proposed method to develop a general-purpose surrogate model to predict optimum material layouts. This section will explain the overall development process and application of the trained model, including the data generation, network training, data flow for network utilization. Section 4 presents the results of case studies to evaluate the proposed method. Finally, Sect. 5 provides the conclusions and future research opportunities.

2 SIMP-based topology optimization

Topology optimization determines the optimum material layout in the domain when the structure is subjected to specified load and boundary conditions. One of the widely used methods is SIMP-based topology optimization, which was proposed by Bendsøe and Kikuchi (1988); Bendsøe (1989). Zhou and Rozvany initially proposed the iterative continuum-type optimality criteria (COC) (Rozvany and Zhou 1991) and discrete COC (DCOC) (Zhou and Rozvany 1992) to solve the SIMP formulation for the ‘MBB beam’ problem and provided exact analytical truss solutions. SIMP method utilizes finite element analysis (FEA) to calculate the static response of the structure to determine the material density value, between 0 and 1, for each discretized element to minimize the global compliance. In the initially proposed method, the stiffness value can become zero, resulting in a singular stiffness matrix. Therefore, Sigmund (1997) added \(E_{\min}\) to the stiffness calculation to avoid this singularity issue. The final equation for the elemental stiffness (\(E_e\)) can be described in terms of each penalized elemental density (\(d_e^p\)) as:

$$\begin{aligned} E_e (d_e) = E_{\min} + d_e^p (E_0 - E_{\min}), \quad d_e \in [0,1] \end{aligned}$$
(1)

The objective function to minimize the compliance can be formulated as:

$$\begin{aligned} \begin{aligned} \min _{x}:&c({d_e})={\mathbf {U^{T} K U}}=\sum _{e=1}^{N} E_{e}\left( d_{e}\right) u_{e}^{T} k_{0} u_{e} \\ \text{ subject } \text{ to: }&\frac{V(d)}{V_{0}}={V_f} \\&{\mathbf {KU=F}}, \qquad 0 \le {d_e} \le 1 \end{aligned} \end{aligned}$$
(2)

The achieved material densities describe the final structural configuration with minimum compliance (c) and satisfying the volume constraint. During the optimization process, the iterative FEM calculations are used to obtain the compliance (c), calculated using the displacement vector (U), and stiffness matrix (K) as described in Eq. 2. Additionally, various other constraints can be considered, such as stress, strain, or even manufacturing constraints. However, here we only consider volume constraint to reduce the complexity of the data generation and training process. Typically, the stiffness calculation with Eq. 2 results in a material discontinuity in a local domain, called checkerboard pattern, during the optimization step, resulting in non-physical outcomes. This checkerboard pattern occurs when the optimum material densities vary significantly from one element to another, resulting in an alternating black and white pattern on the visualized contours. Various filtering methods have been studied and actively applied to prevent the checkerboard pattern issue. One of the popular filtering methods was proposed by Sigmund (2007). We utilize the commercial topology optimization solver, which is integrated into the Abaqus/CAE package. Abaqus/Tosca (Simulia 2015) already includes filtering schemes to eliminate the checkerboard patterns. The parameters for topology optimization are selected as: the volume fraction (\(V_f\)), which is fixed as 0.5 for all cases, and the filtering radius as 1.3 times of the minimum element size in the target domain.

3 Proposed method

3.1 Advanced convolutional neural network, Unet and Unet++

In this research, we built a neural network using the proven CNN encoder–decoder architecture, Unet (Ronneberger et al. 2015) and Unet++ (Zhou et al. 2018). Ronneberger et al. (2015) originally proposed Unet for feature recognition to segment biomedical images.

Fig. 1
figure 1

Advanced encoder–decoder architecture, a 4-level Unet, b 4-level Unet++ (Zhou et al. 2018)

Unet is sequentially constructed with the encoders and decoders, and each of them are linked at the same level with the skip connection as shown in Fig. 1a. The encoder is a convolutional block employing a pooling, called down-sampling, to encode the input data into a feature map. The decoder interprets the encoded information to project the features derived by a series of encoders semantically. The decoder is composed of up-sampling and concatenation, called skip connection, with a regular convolution operation. One of the objectives of up-sampling is to make the encoded map size as the concatenation blocks from the same level encoder. In Fig. 1, the dotted arrows indicate the data pipeline for the concatenation. In this regard, the primary contribution of Unet is that, while up-sampling in the network, we also concatenate the higher resolution feature maps from the encoder network with the up-sampled features in order to learn representations with the following convolutions more effectively.

From the network connectivity in Unet and Unet++, as shown in Fig. 1, a convolutional module (\(X_{i,j}\)) provides output (\(x_{i,j}\)). Where ‘i’ denotes the down-sampling layer along with the encoder and ‘j’ indexes the convolution layer of the dense block to employ a skip connection. So, we can formulate the output value, \(x_{i,j}\):

$$\begin{aligned} x_{i, j}= {\left\{ \begin{array}{ll}{\mathcal {H}}\left( {\mathcal {D}}\left( x_{i-1, j}\right) \right) &{} j=0 \\ {\mathcal {H}}\left( \left[ \left[ x_{i, k}\right] _{k=0}^{j-1}\right] , {\mathcal {U}}\left( x_{i+1, j-1}\right) \right) &{} j>0\end{array}\right. } \end{aligned}$$
(3)

where \({\mathcal {H}}(\cdot )\) is a convolution process that is followed by an activation function. \({\mathcal {D}}(\cdot )\) and \({\mathcal {U}}(\cdot )\) denote a down-sampling and up-sampling, respectively. The caption operator, [], denotes the concatenation of connected layers. So, \(j =0\) represents the down-sampling process, and these modules receive a single input from the previous level encoder. Modules at \(j=1\) level receive two inputs with the skip connection between the same level encoder and up-sampling from the sub-level module, as shown in Fig. 1b. For \(j >0\), modules receive \(j+1\) inputs from the outputs of the previous modules at ‘j level’ and skip connections from the same level modules. Each convolution module consists of multiple convolutional layers, and the Leaky Relu activation function (Xu et al. 2015) is applied for the final output.

Despite the architectural benefits of Unet, Zhou et al. (2018) found that the network accuracy did not improve, when used for a complex picture segmentation, as the depth of the network was increased. Therefore, they suggested an improved architecture, Unet++, that is the modified Unet by increasing the convolutional condense of skip connection without increasing the depth of the network as shown in Fig. 1b. Another key characteristic of Unet++ is the use of hybrid loss as shown in Fig. 1b. The conventional loss value is calculated for the final output only, but the hybrid loss is a summed value from the losses at the top level of the convolutional modules. This approach is called deep supervision and has an advantage considering the intermediate loss values to optimize the network more efficiently. More details are described in reference (Zhou et al. 2018). The cross-entropy function (Zhang and Sabuncu 2018) was employed as the loss function, which is commonly used in the classification model development in both Unet and Unet++.

To prevent the overfitting issues, a batch normalization layer and dropout layer (p=0.3) are included. More detailed architectural information, such as padding, kernel, filter, is given in Appendix A.

3.2 Arbitrary mesh to trainable tensor mapping

To use the extracted physics-based data from the finite element (FE) model, an additional process is required to convert the mesh-oriented data to a trainable tensor format. Previous study (Seo and Kapania 2022) utilized square or rectangular mesh grids to directly transform the physics-based data to the tensor form, as shown in Fig. 2.

Fig. 2
figure 2

Regularized grid domain to transform to the trainable tensor (Seo and Kapania 2022)

This research similarly employs rectangular frame to extract the physics-based data, but this frame can be used to specify the local domain for topology optimization and to transfer the data from FEA results on any arbitrary mesh to a trainable tensor. Therefore, both triangular and rectangular mesh elements can be used to build the FE model in these local observation windows. The nodal (or elemental) values are mapped to the tensor grid by averaging the nodal values on each divided region (\({\hat{x}}_{ij}\)), as shown in Fig. 3a. The tensor form tables can be achieved by identifying the representative values on the finite interval. As shown in Fig. 3a, many nodal values can be averaged out when using the relatively low-resolution grid map, and these averaged representative values may not reflect the original physics-based data. As the mapping grid is more refined, more information can be transferred to the tensor forms, preventing data loss due to the additional mapping process for creating trainable data, as shown in Fig. 3b. However, if the mapping resolution is increased, the specific transformed value (\({\hat{x}}_{32}\)) can become ‘0’ and result in a discontinuous field even though the material region is continuous. This artificial discontinuous layout is similar to the typical checkerboard pattern that is often observed in general topology optimization.

Fig. 3
figure 3

Mapping process to convert the arbitrary mesh to grid tensor, a low-resolution, b high-resolution

Therefore, a typical median filter (Bankman 2008) is applied to prevent additional checkerboard pattern due to the mapping process shown in Fig. 4. The median filter is widely used in image-processing applications to simply replace the discontinued pixels, considered noise, with the median value at a specific observation area. The applied median filter identifies the discontinuity region and assigns the median value of adjoint elements, as shown in Fig. 4. The final mapped results can be converted into a tensor form suitable to apply further processing to create a training set.

Fig. 4
figure 4

Filtering process to remove additional mapping-induced checkerboard patterns

For inverse mapping, the estimated material layouts can also be transferred back to the FE mesh to evaluate the structural performance of the predicted structure. The final design is rebuilt by replacing the elemental densities with the predicted values from the network application. Hence, the inversely mapped mesh can be directly used for static analysis to evaluate the structural performances using the predicted optimum material layout by comparing the results from the ground truth and estimated optimum structure.

3.3 Pre-processing for input data

From the previous literature, the output of CNN-based topology optimization is typically defined as the optimum material layouts, but the definition of the input data varies from study to study depending on the specific research objective. For example, Xue et al. (2021) input the target volume fraction, load, and boundary conditions, constructing different channels to predict the optimal structure. On the other hand, previous works (Zhang et al. 2019; Seo and Kapania 2022), which are the basis for this study, used the static analysis results from the initial geometry considering the load and boundary conditions. They constructed input data sets using strains (or stresses), and displacements in each direction, which were entered as values for multiple channels, as shown in Table 1.

Table 1 Comparison of data and network architectures for different input application

However, in this study, we theoretically utilized the physics-based data by manipulating the strain energy calculation terms as a part of the objective function in the conventional topology optimization. The strain energy (U) can be calculated by the multiplication of strain, \([\epsilon ]\), and elastic modulus, [E], in the 2-D domain (\(\Omega\)).

$$\begin{aligned} U =\frac{1}{2} \int _{\Omega }\{\epsilon \}^{T}[E]\{\epsilon \} d \Omega \end{aligned}$$
(4)

Suppose we assume that the structure is designed with homogeneous material. In that case, the strain energy density (\({\overline{U}}\)) can be derived as:

$$\begin{aligned} {\overline{U}}=\frac{E_{e}}{2\left( 1-\nu ^{2}\right) }\left( \epsilon _{x}^{2}+\epsilon _{y}^{2}+2 \epsilon _{x} \epsilon _{y}\right) +\frac{E_{e}}{(1+\nu )} \epsilon _{x y}^{2} \end{aligned}$$
(5)

where \(E_e\) is the elemental elastic Young’s modulus, \(\nu\) is Poisson ratio, \(\epsilon _x\), \(\epsilon _y\), and \(\epsilon _{xy}\) are strains.

When considering the effect of each term in the strain energy calculation, the elemental strain energy can be calculated with the elemental strain energy density multiplied by the element area in the 2-D domain. All the elemental area after the mapping has the same unit value as shwon in Fig. 3, so the elemental strain energy density terms can represent the initial strain energy distribution due to the load and boundary conditions. Therefore, the input data of the ML-based surrogate model can be structured with the second-order terms of strain energy density. In this study, we used the commercial FEM software, Abaqus/Standard and Abaqus/Tosca, for structural analysis and topology optimization. Abaqus package always reports engineering shear strain (\(\gamma _{xy}\)), not tensor shear strain (\(\epsilon _{xy}\)). The engineering shear strain is simply converted into tensor shear by multiplying by 0.5 (\(\epsilon _{xy}=1/2 \gamma _{xy}\)) for a linear and elastic material.

The terms of strain energy density include constants, the elastic modulus and the Poisson ratio. However, a typical convolutional process requires a data normalization, which renders the actual constants magnitude negligible (e.g., \(\frac{E_e}{2(1-\nu ^2)}\) for \(\epsilon _x^2\)). The main objective of data normalization is to improve training performance, which can be supported with two features as follows. First, the order difference between each strain channel can cause convergence issues that decrease the training performance since it is hard to find the optimum weights and biases along all input channels. Second, the strain field values are close to zero, but not exactly zero (i.e., \(\overline{\epsilon _x^2}\) =\(10^{-8}\)), which may result in singularities during the convolutional process. Therefore, data normalization was performed prior to the convolutional process to prevent the training divergence issue and improve the training performance. The normalized value (\({\bar{x}}\)) is calculated using the lower (a) and upper (b) bounds as:

$$\begin{aligned} {\overline{x}}=(b-a) \frac{x-x_{\min }}{x_{\max }-x_{\min }}+a \end{aligned}$$
(6)

In this study, various [ab] combinations were tested by varying the parameters with log scale, such as [1,10], [1,100], ...[10,100], ..., etc. In particular, in the lower bound case, a positive value was always selected to distinguish the physics-based data on the structural region from the void region, data having the value of ‘0’. So, initial geometry density (\(d_0\)) can be mapped to the observation window concerning the structural region as ‘1’ and the void region as ‘0’. For the variations of the [ab] combination, since no significant difference was seen, the following study was conducted by selecting the value of [1,10] with the fastest conversion speed for the training process. Therefore, the five normalized physics-based data will construct five channels of input, including \(d_0\), \(\overline{\epsilon _x^2}\), \(\overline{\epsilon _y^2}\), \(\overline{\epsilon _x \epsilon _y}\), and \(\overline{\epsilon _{xy}^2}\).

3.4 Pre-processing for output data: segmentation

The representative types of ML-based prediction methods are regression and classification. Both approaches have similar characteristics that estimate the output values, but the regression model provides continuous values, while the classification model provides discrete labels. In some cases, the regression problem can be transformed to the classification by labeling the outputs using an interval. For instance, the student’s exam grade can be described with the exact scores. But, it can also be described with the scaled grading system categorized with respect to the range of scores (i.e., A: over 95, A-: 95-90, B: 90-85, etc.) The output data for the regression model built with the original topology optimization outputs, which are continuous values that vary between 0 and 1, as shown in Fig. 5.

Fig. 5
figure 5

Actual mapping process to generate the labeled density distribution data (label 0: no structure, label 1: structure)

Another output data format for segmentation requires the labeling procedure to distinguish between the different objects in the domain. The domain was divided into material elements and no material elements labeled with two different integers, ‘1’ and ‘0’, respectively. For the labeling tasks, if the derived elemental density was greater than 0.5, this element requires the material on that location to satisfy the topology optimization objectives and constraints. Therefore, the output data is labeled as segmentation of the mapped results, as shown in Fig. 5.

Another advantage of employing the segmentation algorithm is the obvious ways to evaluate the predicted results. For example, we have to compare the predictions with a truth model on 2-D or higher dimensional domains for the regression approach, so the accumulated loss value (type of error) can be used to evaluate the prediction quality. However, using this loss value to determine the performance of the network is not straightforward since the loss value depends on the order and range of the output. Nonetheless, for a classification type problem, there are many other ways to assess the accuracy of the trained network (Fig. 6).

Fig. 6
figure 6

IoU score calculation

This study applies the intersection over union (IoU) score, most widely used in computer vision applications. This IoU metric is calculated by comparing the labeled regions between the predictions and the truth. The IoU score is defined as the percentage of the overlapped area of the target mask and the predicted labeled area. This score can be formulated as:

$$\begin{aligned} \begin{aligned} I o U&=\frac{Area\ of\ Overlap}{Area\ of\ Union} \\&=\frac{Ground\ Truth\ \cap \ Prediction }{Ground\ Truth\ \cup \ Prediction} \end{aligned} \end{aligned}$$
(7)

The IoU score was used as a criterion to evaluate the performance of the trained networks in this study, and selected the final network architecture which provided the highest IoU score.

3.5 Data generation

The primary objective of this research is to develop a network applicable to optimizing a local region of structures exposed to various load and boundary conditions. Therefore, diverse analyses should be considered and implemented to build up a suitable database. A vertically oriented structure was used as a basic model, as shown in Fig. 7.

Fig. 7
figure 7

Examples of FEM models for data generation

The critical part of developing the database is to represent the various distributions of strains on the observation window in the target domain. We designed baseline FE models by applying the load on the top edge and setting the boundary conditions on the bottom edge. The boundary conditions were varied by fixing the local or all nodes on the bottom edge or using pinned boundary conditions on the left and right end nodes. Three different types of load conditions were randomly generated and applied: point load, locally distributed load, and globally distributed load. The distributed load can be a wave-wise or a ramp load using the combinations of sine and cosine functions by randomly selecting the phase, period, and amplitude and using a floor function to generate the ramp value. Also, the nodes at which the applied loads were to be chosen at least 10 % of the total edge length to avoid the stress concentration due to the tip load condition. Especially for this research, we developed an FE model using a linear element with reduced integration, having a single integration point per element to increase the calculation efficiency during the FEA and topology optimization. In addition, a refined mesh with grid spacing varying between 1/100 and 1/50 of the maximum length scale of the structure was used to prevent convergence issues. Each data point is generated by following the flowchart described in Fig. 8.

Fig. 8
figure 8

Data generation and input/output dataloader development process

All the analyses, including the static FEM analysis and topology optimization, were conducted with Abaqus/Standard and Abaqus/Tosca, respectively. The data generation process is automatically completed with Python scripts. In addition, the required mathematical functions and random value selection (Fig. 8) were implemented using the NumPy package within Python scripts.

3.6 Data augmentation

We purposely increased the amount of data without actual calculation employing the data augmentation scheme. Data augmentation is a typical method to increase the number of data in computer vision problems. For example, when developing a segmentation tool to classify a cat or a dog, the required data are a large number of pictures labeled with each animal’s name. In this case, if the resource is insufficient, the augmented data can be generated by flipping the original pictures horizontally and vertically. A human can recognize them as the same pictures, but the computer will consider them as completely new information. The problems in this research are assumed to be linear systems, so we can augment the static analysis and topology optimization results. According to the linear static system, we can simply flip the contour results vertically, laterally, and diagonally and get the augmented results without actual calculations, as shown in Fig. 9.

Fig. 9
figure 9

Data augmentation for physics-based data in mapping resolution 80\(\times\)80, original data and augmented data in red box

Fig. 10 describes the detailed process of developing original and augmented data. We generated actual 12.5 K data and augmented 37.5 K data by flipping three different directions. The total of 50 K data were split into 10 K, 20 K, 30 K, 40 K, and 50 K to determine the effect of the amount of data to achieve the accurate surrogate model.

Fig. 10
figure 10

Data loader development process using the original data (=12.5 K) and augmented data (=37.5 K)

3.7 Network training

The overall process for developing the ML-based surrogate model was conducted with Python scripts. Also, the PyTorch and NumPy packages were used to build up the network. We set the GPU tensor with CUDA (Cook 2012) to process the mapped results from the FEM analysis and topology optimization to boost the training calculations. The computing resource, NVIDIA A100 Tensor core GPU, is provided through Advanced Research Computing (ARC) at Virginia Tech. Adam optimization (Kingma and Ba 2014) is used for the training process. The loss functions for regression and classification were MSE loss (Lehmann and Casella 2006) and Cross-Entropy loss (Zhang and Sabuncu 2018), respectively. The maximum training epoch number is determined by the converged training and testing performances.

The networks converged in between 2,000 and 3,000 epochs in most cases. The details of network architecture are described in Appendix A for the 3 level Unet++ model. The total hyper-parameter for three levels Unet++ is 2.2 M, and this number represents the fact that the trained network is both extremely deep and complex. The actual training time was around 90 hrs to obtain the converged model using the 50 K data points.

4 Numerical results

This section assesses the prediction performance by comparing the methodology suggested in previous studies with the proposed framework here. First, each network was developed with both methods of regression and classification to compare the predictive performance. At the same time, the effect of the number of physical-based data were also evaluated. Then, the performance improvement of Unet++ was determined by comparing the IoU scores using both Unet and Unet++. The trend of accuracy improvement according to the amount of data were analyzed by increasing the amount of data used to train the network. Finally, the usability of the trained network was evaluated by solving example problems using the best network selected in these comparative analyses. The converged loss value is used to determine the minimum epoch number of the training process. Various methods were applied to prevent the overfitting issues, such as batch normalization, dropout, and early stop method. The example training performance plot is shown in Fig. 11. This section also shows the results of examples to illustrate the performance of a trained network employed to predict the optimum material layouts, based on the results of static FEM calculations performed on the initial geometry. Three examples were studied to validate the trained network; 1) single crank, 2) curved crank, and 3) column-wise structure. When the predicted structures were not close to the ground truth (i.e, IoU < 80 %), the estimated layouts were inversely mapped to the original mesh to conduct the structural analysis in Abaqus/CAE to evaluate the structural performance. The prediction accuracy of each network was compared based on the IoU score, and the calculation followed Eq. 6.

Fig. 11
figure 11

Overfitting mitigation using dropout strategy for training the same architecture network (Unet) using the same number of data (= 10 K)

The process to train a network utilized to derive the optimum structure using the pre-processed static results is shown in Fig. 12. We also conducted conventional SIMP-based topology optimization for each testing case to evaluate the prediction accuracy. Fig. 12 describes how to evaluate the prediction quality using IoU score.

Fig. 12
figure 12

Process to calculate the IoU score to evaluate the prediction quality

4.1 Network type selection: regression or classification

In this section, we assess the proposed segmentation approach, which differs from the previous regression model (Zhang et al. 2019; Seo and Kapania 2022), to predict the optimum material layouts. Also, a new input architecture is evaluated by comparing the output performances with results derived by previous research (Seo and Kapania 2022). Since the regression and segmentation outcomes have different output forms, it is difficult to compare the results and evaluate the performances directly. Therefore, an additional process is required for regressed results to be compared with the segmented output. In order to transform the predictions from the regression model into a structure, it is post-processed with the same step as in labeling a binary segmentation to denote the presence or absence of material in each element. These post-processed regression outcomes can be evaluated using the IoU score. Therefore, we can determine the better approaches by comparing the prediction accuracy of both the regression and the segmentation models. The regression and classification surrogate models were developed using the same data and same network architecture and predicted results are shown in Fig. 13.

Fig. 13
figure 13

Prediction accuracy comparison for regression and segmentation approaches

As shown in Fig. 13, the classification model generated predictions of the optimum material layouts closer to the ground truth than the regression model. In turn, the predicted result by segmentation is much closer to the ground truth value than the post-processed regression results. Furthermore, the training time for segmentation model training was reduced compared to the regression case due to more straightforward outputs. To compare the total training time, we trained systems using the same equipment, data, and number of epochs. The training time was 28 and 8 hours for regression and classification, respectively.

Table 2 Regression and classification performance comparison for training the same architecture network (Unet) using the same number of data (= 10 K)

For more applications, Table 2 provides IoU scores for both regression with post-processing and classification results using 1,000 testing cases. In both cases, the mean and standard deviation (STD) show that the predicted optimum density fields obtained from classification provide better agreement with the truth than the regression approach. Due to the complexity of continuous output, the nonlinearity between input and output increases during regression. On the other hand, since the segmented output is divided into ‘0’ and ‘1’ binary, the loss function is updated in a short time, and the relationship between input and output is also relatively simple. As described in Sect. 4.3, we proposed the new input data architecture considering the strain energy density to be used as the objective function of conventional topology optimization. We employed the five channels data, including the original target structure with the initial density and the squared strain terms described in Eq. 5. As shown in Table 2, the network trained with the proposed input data structure provided more accurate predictions. The proposed method has a better performance; IoU score of 86 % versus 80 % for the proposed and Zhang et al. (2019) methods, respectively. This comparative analysis shows that the segmentation approach has an advantage in improving the prediction accuracy and calculation efficiency rather than a regression approach. Furthermore, rather than employing strain and displacement fields, as used in prior studies (Seo and Kapania 2022), it was confirmed that the input data channels considering the actual terms used in the objective function for the topology optimization can increase the prediction accuracy.

4.2 Network and data architecture selection

In this section, the Unet and Unet++ were used to develop the surrogate model. Table 3 presents the performances of each network prediction by varying the architectures, number of data, and mapping resolution. The IoU scores are calculated using 4,000 test cases, which were all not used for the training process. The predicted optimum outcomes from the Unet++ yield more accurate results compared with the outputs derived from Unet for every case varying the amount of data (from 10 K to 50 K) and for different mapping resolutions. Therefore, without increasing the depth of the network, it is possible to increase the accuracy of surrogate models by increasing the density of skip connections, a key advantage of the Unet++ architecture, as convinced by Zhou et al. (2018).

Fig. 14
figure 14

IoU score box plots, a resolution 40\(\times\)40, b resolution 80\(\times\)80

Table 3 IoU scores for different network architecture and data channels

As shown in Fig. 14 and Table 3, the prediction accuracy was improved as the data size was increased. The IoU scores were near 90 % for both mapping resolutions using 50 K data. Also, the STD value decreased with more datasets, which means that the network’s prediction reliability improved. As shown in Fig. 15, when using the network trained with 50K of data and 80\(\times\)80 mapping, the accuracy of most of the analysis results (more than 50 % out of 4,000 tests) is greater than 90 % and the maximum IoU score is 99.8 %, indicating the very high accuracy achieved by the present approach.

Fig. 15
figure 15

Prediction accuracy distribution using Unet++ trained with 80\(\times\)80 resolution 50K data tested with 4,000 samples

Fig. 16 shows the randomly selected results among the cases used for testing. From the IoU score and visualized overlapped area, it is seen that the predicted optimum layouts of the part of a thick region that plays a major role in structural performance show a good agreement with the ground truth. Even though in some cases, minor parts having a relatively thin area shows differences between the truth and predictions. In a few cases, Unet can provide better geometry with a higher IoU score rather than Unet++. Typically, for these cases, as shown in Fig. 16, predictions from both Unet and Unet++ have almost the same configurations with around a 3 % of IoU score difference. So for common applications, Unet++ is recommended for developing the surrogate model, as it still provides more accurate prediction results than Unet, with a few exceptions considering the overall performance, as shown in Fig. 14.

Fig. 16
figure 16

Predicted results from Unet and Unet++ evaluation using IoU scores

Fig. 17
figure 17

Unet++ predictions for a single link optimization design subjected to the two different load conditions

4.3 Network application for local optimization problem

To evaluate the general usability of the surrogate model, the trained network was employed to solve the global-local structural optimization problems. No additional process was conducted to develop the extra data. Instead, we directly used the Unet++ trained with the 50K data and \(80\times 80\) resolution mapping process. The trained network can deliver accurate optimum layouts when the inputs from the observation window are similar to the data generation conditions, as shown in Fig. 7. The recommended target structure size is between 30 to 70 % of observation windows, and roughly 20 to 60 elements are required to transfer the mesh-oriented data to a trainable tensor properly. The observation window can be rotated and used repeatedly in several local optimization processes.

For the first and second examples, the trained network is applied to optimize the local area of two different shapes of cranks, commonly used as the mechanical parts in a rotating system. In the third case, the 2-D bridge structure was applied to test the multiple uses of the trained network. To develop the reference as ground truth, SIMP-based topology optimizations were conducted for the local region with the volume fraction (\(V_f\)) is fixed as 0.5 for each target domain as a constraint.

4.3.1 Case 1: simple crank

The first case optimizes the center of the straight trapezoidal-shaped crank. The pinned boundary condition is applied for the reference point of the left hole using the node-surface coupling function in Abaqus/Standard. Next, the bearing load is applied to another reference node on the right hole with the kinematic coupling between the reference node and the surface on which the bearing load is applied.

The predicted optimum layouts were very close to the truth outputs, and high IoU scores, over 90 %, supported these good agreements for both examples, 1.1 and 1.2 as shown in Fig. 17. For these examples, the trained network estimated rather well even minor structural features.

Fig. 18
figure 18

Unet++ predictions for curved link design subjected to two different load conditions

4.3.2 Case 2: curved 3-holes crank

For the second example, we tested the trained network to optimize two local domains on the curved link as shown in Fig. 18. Two side holes were pinned, and the bearing load was applied at the center hole. Due to the geometric and structural complexity, the input strain fields are becoming complex compared to the previous examples. The predicted layouts have a good geometrical agreement with the ground truth results by comparing the IoU scores (> 80 %) and are visually consistent. But, it can be seen that there are differences in some fine details of structural layouts in example 2.2.

To evaluate the response of predicted structures, the predicted layouts were inversely mapped to the original meshes, as shown in Fig. 19. We can estimate the differences in structural performances from the distinction between the predictions and truth layouts. In addition, as a result of simple static analysis, the stress distribution is similar to the truth model, but the maximum von Misses stress values are different (by 25 %), presenting a lower prediction accuracy.

Fig. 19
figure 19

Structural performance comparison between the ground truth and prediction: Maximum von Mises stress (+25.62%) and volume (+6.27%)

The prediction model seems to have failed to properly distribute the load as the predicted structure on the right side became too simple as compared to the ground truth. As shown in Fig. 19, the stress distribution of the predictive model is concentrated in a narrower area than in the ground truth structure. Also, the rough outline of the estimation model causes stress concentration at the local nodal point as shown in Fig. 19b. This rough outline has occurred due to the inverse mapping process. Therefore, adding the additional smoothing filter step for final optimum layouts will resolve these stress concentration issues. We can also increase the mapping resolution to derive more smoothed outlines.

4.3.3 Case 3: bridge

The final example is the global-local optimization case for deriving the multiple local optimum structures on the repeated columns, as shown in Fig. 20. The lower edge is fixed, and a randomly generated load is applied on the upper edge, similar to the data generation process. As can be seen from configurations, the predicted optimum structure has similar outlines as the ground truth structure. However, the overall IoU scores were relatively lower than the values obtained from the previous examples. The scores of 60-70 % were obtained due to the differences in some of minor structure’s details.

Fig. 20
figure 20

Unet++ applications of multiple columns design

Fig. 21
figure 21

Structural performance comparison between the ground truth (a) and prediction (b): Maximum von Mises stress (− 0.2905%) and volume (+ 0.6813%)

Similarly, we also inversely mapped the predicted optimum layouts to original FE models to evaluate the structural performance, for examples 3.1 and 3.2. The static analysis results under same loading conditions are shown in Figs. 21 and 22. There are no significant differences in the shape of the dominant thick member in the structure. So, the structural performances were similar for prediction and truth results. Especially for example 3.2, the expected optimum structure provides better results, showing a much lower maximum stress value (− 15.07 %). Therefore, what can be predicted through the trained network can give similar or better results than the model derived by the SIMP-based topology optimization even though the predicted outline is slightly different in shape. This is probably expected that a better local minimum will be derived from the network application, which can provide better global performance. Still, this observation should be studied by considering stress as a global constraint in the extended domain since the derived and estimated optimum structural layout is valid for local areas.

Fig. 22
figure 22

Structural performance comparison between the ground truth a and prediction. b Maximum von Mises stress (− 15.07%) and volume (+ 0.4321%)

Table 4 Average calculation time [s] comparison for third case application

Typically, generating data and training the surrogate model takes a significant amount of time. However, once the converged network is developed, users can repeatedly employ the network to predict the optimum material outcomes without any expensive calculations. Table 4 shows the averaged calculation times for conventional SIMP-based topology optimization and the proposed method to derive the multiple optimum structures in examples 3.1 and 3.2. The total calculation time for network application is summed with the static analysis to generate the input data and network application using the CUDA process, not including the post-processing and any other processes to create an input CUDA tensor. Based on the above results, it is possible to reduce the calculation time by up to 98.76 % or more when using the proposed network compared to the existing method. Furthermore, we can confirm that the same or better results can be often expected in using ML, which also are obtained quickly. Therefore, we can apply the proposed framework to solve advanced structural global-local optimization problems considering various observation windows.

5 Conclusions

We have proposed a framework to predict the optimum material layouts on local domains using advanced CNN, Unet and Unet++. We utilized static analysis results with the initial geometry as an input to derive the optimum structures without the conventional topology optimization, which relies on expensive iterative calculations. Further, various steps were added to overcome the limitations identified in previous studies. First, a novel mapping process was proposed to transfer the physics-based data on any mesh shape to a trainable tensor format with multiple channels representing the target domain’s initial conditions. Therefore, unlike previously proposed methods, it is not necessary to use a standardized mesh. Any mesh configuration with different elements, such as triangular, quadrilateral, etc., can be used for target geometry. In addition, the static analysis data were pre-processed to utilize the input more generally, and this pre-processing step also improved the training performance. The proven encoder–decoder architectures, Unet and Unet++, were used to build a high-performance surrogate model which can predict the high-resolution outputs as optimum structural layouts. The Unet++ developed here has 2.2 M hyper-parameters, representing the complexity of the network. A high-performance GPU, NVIDIA A100, was used to train the surrogate models. We used the processed data considering the squared strain terms concerning the strain energy calculation in the objective function of conventional topology optimization. Consequently, the proposed method using the processed strain fields provided more accurate results than those derived by the approaches suggested in previous studies.

Another difference from previous studies is that a classification technique, called segmentation, is applied to improve network accuracy and training efficiency. In contrast to the previous research that commonly used the regression model to extract continuous output, we used binary segmentation to derive the rigorous optimum material layouts; the elemental density is always classified as void (label 0) and structure (label 1). The classified results are similar to the filtered layouts after the conventional topology optimization. Therefore, the proposed method can be considered a more efficient approach that has the advantage that no additional filtering process is required for the fabrication design development. Furthermore, although we trained both regression and segmentation models, the binary segmentation model resulted in more accurate predictions compared to the ground truth for both testing cases and example problem applications. Also, regarding training efficiency, the training time was reduced for classification models even using the same data with the same number of epochs. The predicted material densities on the tensor were inversely mapped to the original mesh to conduct the structural analysis to evaluate the performance using the estimated optimum model. Contrary to previous studies, the proposed framework suggests the optimum outline and provides the FE calculations to assess the actual structural performance. The best-performed network predicted optimal structures with 90 % accuracy, and the derived geometry showed almost the same structural performances compared to the ground truth derived by the SIMP-based topology optimization. Also, the proposed method can reduce the calculation time by 98 % for deriving the multiple local optimum layouts compared to the conventional optimization process.

Future research areas can be classified into two topics. First, the current framework can be improved by considering more complicated structural optimization problems such as increasing the complexity of geometry, boundary, and load conditions. Second, the framework can be utilized to analyze other types of structural analysis. Here, we suggested a methodology to process the fields-wise physics-based data from the computational model using any arbitrary mesh configurations. Similarly, researchers can use mapping and inverse mapping process to prepare the input and output data for structural problems with field results such as composite material design, health monitoring, data-induced FE analysis, etc.