Data Mining and Machine Learning Methods Applied to A Numerical Clinching Model

Numerical mechanical models used for the design of structures and processes are very complex and parametrised in high dimensions. Understanding the model characteristics is of interest for engineering tasks and, subsequently, for an efficient design. Multiple analysis methods are known and available to gain insight into existing models. In this contribution, selected methods from various fields are applied to a real-world mechanical engineering example of a clinching process currently under development. The selected methods comprise techniques of machine learning and data mining whose utilization aims at a decreased numerical effort. The methods of choice are briefly discussed and referenced, and challenges in the context of meta-modelling and sensitivities are shown. An incremental knowledge gain is provided by a step-by-step application of the numerical methods, and the resulting consequences for further applications are highlighted. Furthermore, a visualisation method aiming at an easy design guideline is proposed. These visual decision maps incorporate the uncertainty arising from the reduction of dimensionality and can be applied at an early stage of design.

The ongoing digitalisation brings significant changes to the design of structures and processes. Due to high costs, the design of new products or production processes by 'trial-and-error' investigations on real prototypes no longer seems economically justifiable.
To overcome this, a simulation-based design process has been established in many industries. The main idea is to represent the new product/process by a numerical model. In general, this model is parametrised with n_x input parameters x_i in order to find a suitable set of parameters fulfilling constraints and optimality criteria. The challenging task is to parametrise the model and to identify the optimal design. Many numerical tools are available to support this process and need to be applied specifically. The main goals for these methods are: gaining insight into the numerical simulation model, identifying influences of the parameters on constraints and optimality criteria, and supporting the design process in an efficient manner. In the context of 'computational intelligence', the applied methods/tools are mainly associated with two research areas: data mining and machine learning. These fields are highly innovative and the application possibilities of 'big data' methods are enormous. In contrast to the application field 'big data', engineering tasks are characterized by 'small data', because the generation of data (i.e. multiple simulation model computations) is highly expensive.
Therefore, the common procedure in a simulation-based design process is as follows:
1. development and parametrisation of a simulation model,
2. computing a sample set (Design of Experiments),
3. exploring the sample set by data mining methods,
4. constructing a meta-model with machine learning methods,
5. using the meta-model for: (a) sensitivity analysis, (b) uncertainty analysis, (c) optimisation.
Many modifications or exchanges in this order of steps are possible; e.g., iterative improvements are occasionally necessary. The common guiding theme of this contribution is to discuss and apply several methods to a real design task in the context of a clinching process.
The requirements for the design of mechanical joints have increased steadily in recent years.
In the past, the main requirements that significantly influenced the development of mechanical joining technology were economic efficiency, productivity, process reliability and lightweight construction. A common technology for joining sheet metal is clinching. When using clinching, the sheet metal is locally formed; a ductile material is required, but no rivet or any other additional element is needed, and no heat is induced into the joining zone. Currently, clinching is used only for thin sheet metal, because the determination of the tool geometry parameters is done by empirical knowledge in combination with experimental tests. Unfortunately, the suitability of clinching for connecting substantially larger sheet metal thicknesses has not been sufficiently investigated, and suitable toolkits for larger total sheet metal thicknesses are only available in very small quantities and only in a few sizes. The state of the art for ascertaining tools for larger sheet metal thicknesses and point dimensions is trial-and-error, which leads to high costs caused by producing a large number of tools and performing experiments. This paper demonstrates some analytical approaches to this problem in order to obtain a suitable tool design for clinching thick sheet metal with the help of a finite element analysis in combination with machine learning methods. The clinching model is introduced in Section 2. The applied methods are well established and only briefly reviewed. The exploration of existing data by data mining methods is discussed in Section 3. In order to increase the efficiency, two different types of meta-models are compared in Section 4 and utilized for the sensitivity evaluation in Section 5. Furthermore, a new method for the simplification of the design process is proposed, see Section 6. Therefore, two-dimensional nomographs are developed, which are common visual aids in engineering related design tasks.
On the basis of the acquired information about the simulation model, 'decision maps' are constructed. A list of the most relevant symbols can be found at the end of this paper.

Simulation model
As described, the foundation of simulation-based design is a parametrised simulation model. Examples for simulation models are analytical functions or finite element models.
The parametrisation of the model is done by n_x input parameters assembled in an input vector x (also called feature vector). The related input space X ⊂ R^n_x is usually composed of restricted intervals for each dimension,

X = [x_min,1, x_max,1] × ... × [x_min,n_x, x_max,n_x]. (1)

These restrictions are usually motivated by physical plausibility (minimum/maximum thickness, mass, stiffness, ...). The simulation model maps the input space to the result space Z. Usually, this space is parametrised as well; thus, n_z result parameters (synonyms: output, fitness, response) exist, which leads to a result space Z ⊂ R^n_z. For applications in mechanical or structural engineering, result parameters are e.g. displacements, stresses and strains. The parametrisation of the result is important due to the multitude of result quantities in time dependent finite element analysis. Therefore, significant (e.g. maximum displacement) or cumulative (e.g. mean displacement) meta-parameters are considered. An important characteristic of the simulation model ξ is that only point-to-point relations are possible. Thus, continuous approaches are not applicable and a sampling-based evaluation needs to be utilized.

Data mining methods
Data mining methods are used to explore existing data sets and generate knowledge and information about the underlying (but unknown) relationships of input and output data.
In the field of application described in this contribution, the data has to be generated. Therefore, methods in the context of the Design of Experiments (DoE) are utilized to sample the input space according to Eq. (1) by n_sim simulation points. These simulation points are a set X of input vectors and a set Z of related result vectors. Due to constraints in the result space, the sample set can be split into permissible and non-permissible samples. This means, the total sample is the union of these two subsets,

(X, Z) = (X, Z)_perm ∪ (X, Z)_nperm. (3)

The applied sampling scheme is responsible for the effectiveness, i.e. ensuring a minimum number of samples in the full input space. The simplest method is to evaluate a simulation model on high dimensional grids. More effective are space filling designs, aiming at uniformly distributed samples. Common algorithms for these sampled random numbers are 'Latin Hypercube sampling' [McKay, Conover and Beckmann (1979)] and 'SOBOL' sequences' [Sobol' (1967)].
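As an illustration, a minimal Latin Hypercube sampler can be sketched in a few lines of Python. This is a simplified stand-in for the cited algorithms, and the parameter bounds in the usage example are hypothetical, not the values from Tab. 1:

```python
import random

def latin_hypercube(n_sim, bounds, seed=42):
    """Space-filling design: exactly one sample per stratum per dimension.

    bounds -- list of (lower, upper) intervals, one per input parameter.
    Each axis is divided into n_sim equally wide strata; one point falls
    into every stratum, with the strata paired at random across dimensions.
    """
    rng = random.Random(seed)
    n_x = len(bounds)
    samples = [[0.0] * n_x for _ in range(n_sim)]
    for d, (lo, hi) in enumerate(bounds):
        strata = list(range(n_sim))
        rng.shuffle(strata)
        for i, s in enumerate(strata):
            u = (s + rng.random()) / n_sim       # random position inside stratum s
            samples[i][d] = lo + u * (hi - lo)   # scale to the parameter interval
    return samples

# hypothetical bounds for two geometric tool parameters
X = latin_hypercube(100, [(7.5, 12.5), (0.589, 5.035)])
```

In contrast to a plain random design, every one of the 100 strata of each axis contains exactly one sample, which enforces a minimum coverage of the input space.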
The methods applied for data mining are outlier analysis [Breunig, Kriegel, Ng et al. (2000)], cluster analysis [Halkidi, Batistakis and Vazirgiannis (2001)] and classification. Furthermore, sensitivities and correlations can be computed for the existing data.

Machine learning
In engineering applications, the computation of the mechanical model, e.g. by the finite element method, is highly time consuming, and for various evaluations (e.g. sensitivity analysis, uncertainty analysis) a large number of model computations is necessary. To overcome this disadvantage, the research area of machine learning or 'meta-modelling' provides multiple methods for an approximative model computation [Siddique and Adeli (2013); Simpson, Poplinski, Koch et al. (2001)]. The mechanical model ξ is represented by a meta-model ξ* aiming at a minimal error ε between mechanical model and approximation,

ξ(x) = ξ*(x) + ε. (5)

Commonly, artificial neural networks [Siddique and Adeli (2013)], radial basis function networks [Jin, Chen and Simpson (2001)] or support vector regression [Cherkassky and Ma (2004)] are used. A simpler approach is the polynomial approximation. For these methods, several implementations (e.g. 'libSVM' [Chang and Lin (2011)]) exist and can be utilized as surrogate model. Details about the methods applied in this paper (artificial neural networks and polynomial approximations) can be found in Section 4.1.
Remark: The procedure described in this contribution is based on non-adaptive meta-models. This means, the sample set is pre-computed and the meta-model is trained on the basis of this data. Adaptive meta-models integrate the sampling and training in the usage phase [Dubourg (2011)]. Due to the adaptivity, they perform better for highly non-linear problems but are more complicated in implementation and usage.

Sampling-based sensitivity measures
Sensitivity analysis is the quantification of the influence of each input parameter on the result parameters [Saltelli, Chan and Scott (2008)]. In general, it is possible to distinguish between global and local sensitivity measures. Locally, the gradient of the simulation model ξ gives information about the influence of each parameter [Sobol' and Kucherenko (2009)]. For the numerical design, this method is not applicable, and the variance-based SOBOL' index [Sobol' (2001)] or the frequency-based measure FAST (FOURIER amplitude sensitivity test) [Saltelli, Tarantola and Chan (1999)] can be applied alternatively. The measures are computed on the basis of specific sampling schemes and need a large number of model evaluations. The variation of the input parameters is predefined on the basis of informative or non-informative random numbers. To quantify the robustness of one selected design, informative random numbers (e.g. normal distribution) can be used. For the design process analysed in this contribution, no prior information should be considered and, therefore, a non-informative uniform distribution is applied.
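A pick-freeze estimator for a first-order SOBOL' index might look as follows. This is a sketch using a Saltelli-style estimator with non-informative uniform inputs; the toy model and sample size are assumptions for illustration only:

```python
import random

def sobol_first_order(f, n_x, i, n=20000, seed=0):
    """Pick-freeze estimate of the first-order SOBOL' index S_i.

    Two independent sample matrices A and B are drawn with uniform
    (non-informative) inputs on [0, 1); AB equals A except that
    column i is taken from B.
    """
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(n_x)] for _ in range(n)]
    B = [[rng.random() for _ in range(n_x)] for _ in range(n)]
    AB = [a[:i] + [b[i]] + a[i + 1:] for a, b in zip(A, B)]
    yA, yB, yAB = ([f(x) for x in M] for M in (A, B, AB))
    mean = sum(yA + yB) / (2 * n)
    var = sum((y - mean) ** 2 for y in yA + yB) / (2 * n)
    # Saltelli-style estimator of the partial variance V_i
    v_i = sum(b * (ab - a) for a, b, ab in zip(yA, yB, yAB)) / n
    return v_i / var

# toy model: z depends strongly on x_0 and only weakly on x_1
model = lambda x: x[0] + 0.1 * x[1]
s0 = sobol_first_order(model, 2, 0)
s1 = sobol_first_order(model, 2, 1)
```

The three sample matrices illustrate why such measures need a large number of model evaluations: estimating all n_x first-order indices requires (n_x + 2) · n model runs.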

Clinching process
Clinching is one of the common mechanical joining technologies that can be found in car body production and household appliances (refrigerators, washing machines, etc.). It can be classified as a manufacturing process, in more detail as a joining technology, or specifically as joining by forming, such as hemming or riveting. The main advantage of clinching is that no additional part is necessary to join the blank sheets (in contrast to riveting). Furthermore, a pre-punched hole is not required. It is possible to join blank sheets of different materials, for example aluminium and steel. In comparison to welding, no heat is supplied to the joining zone.
For the clinching process, special tools are required: a punch, a die (several types exist, such as round point dies, flat dies or dies with movable blades) and a blank holder. In the first step, the blanks are positioned and clamped between the blank holder and the top of the die. The blank holder prevents a lift-off of the blanks and supplies the push-out force for removing the punch from the blanks. In some cases, an additional lower blank holder is used in order to push the clinch joint out of the die cavity after the clinching process. The upsetting step (see II. in Fig. 1) is characterized by moving the punch downwards until the die-sided sheet metal is in contact with the die. In the last step, the punch penetrates the material to a predefined position so that an interlock is formed. In order to obtain a strong clinch joint, the clinch point should have a large neck thickness. A large interlock provides high cross tension strength and prevents unbuttoning under shear load.

Numerical model and description of parametrisation
Simulation In order to save simulation time and to obtain practical results, the simulation model is a 2D model with axisymmetric boundary conditions, because of the rotational symmetry of geometry and loads (see Fig. 2). Nevertheless, it is possible to compute 3D models when non-symmetric geometries or loads occur. The used simulation type is Lagrangian incremental with NEWTON-RAPHSON iteration and automatic remeshing. That means, if one rigid part or mesh penetrates another or the used mesh is highly distorted, a remeshing is executed automatically. The material model of the sheets is elasto-plastic in order to capture springback effects after the joining process. The tools are defined as rigid objects, where the die is fixed and the punch moves at constant speed. The blank holder is coupled to the punch by pretensioned springs with linear stiffness.
The mechanical properties of the considered material (construction steel S350GD) are determined in tensile tests and transferred to the simulation software as pairs of variates for true strain and yield stress (yield surface). The friction between the sheet metal and the tools - but also between the sheets themselves - is very complex. In the simulation, the friction conditions are defined by a constant shear factor during the whole process, which is state of the art in metal forming simulation. Further information can be found in Landgrebe et al. [Landgrebe, Kropp, Gehrke et al. (2017); Landgrebe, Mauermann and Kropp (2017)].
Validation To make sure that the simulation results match the experimental data, three configurations with different sheet thickness ratios were joined (t_1 + t_2 ∈ {3 + 2 mm; 2.5 + 2.5 mm; 2 + 3 mm}). This comparison is done by considering the contour geometry of the cross-section and the load-stroke data of the joining process. The very good agreement between simulation and experiment can be seen in Figs. 2 and 3. Geometric tool parameters and output parameters In the present paper, the focus is on the interaction of geometric tool parameters (see Fig. 4). In Tab. 1, the investigated parameters for the die and the punch geometry are shown. Investigated is a fixed total sheet thickness of t.t = 5 mm with a variable ratio of the single sheet thicknesses x_t1. To avoid samples without reference to reality, some parameters are given in a relative manner. For example, it is not practical to use a punch diameter D.S that is larger than the diameter of the die D.M. Hence, the punch diameter is related to the die diameter, D.S = x_D.S_rel · x_D.M.
The response variables interlock z f and neck thickness z t.n (see Fig. 5) are significant for the quality of a clinching joint. The overall objective is to find geometries with maximal interlock and neck thickness constrained by specific thresholds.

Exploration of data
The given initial design space (X, Z)_init following from the data in Tab. 1 is the basis for a DoE. Therefore, 9101 samples of the numerical simulation of the clinching process are carried out on the basis of Latin Hypercube Sampling (LHS). The sample set was reduced by removing non-physical samples. The main goal of the numerical simulations is to derive general properties of the physical process with respect to preferred geometric properties, described by two result quantities. The evaluation of the result values leads to the conclusion that not every design vector x is permissible regarding the desired properties. The existing data set is investigated with data exploration techniques in order to enhance the general understanding of the physical process as well as to validate upcoming results of the surrogate models by comparison to characteristic quantities based on the actual data set.

Tab. 1 (excerpt): input parameters with lower/upper bounds, e.g. x_D.M (die diameter, 7.5 mm to 12.5 mm), x_t.M_rel (die depth in relation to total sheet thickness), x_t.b (bottom thickness, 0.589 mm to 5.035 mm); output parameters: 1. z_f (interlock), 2. z_t.n (neck thickness)

Permissible design space
Initially, the data set is split into permissible and non-permissible data according to Eq. (3), where the constraint is defined by forcing the existence of an interlock. With respect to Eq. (7), n_perm = 2365 permissible data points in (X, Z)_perm and, consequently, n_nperm = 6736 non-permissible data points in (X, Z)_nperm are obtained.
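The splitting amounts to a simple filter over the sample set; a sketch with hypothetical toy data, where the constraint z_f > 0 mirrors the forced existence of an interlock:

```python
def split_by_constraint(X, Z, is_permissible):
    """Split the sample set into permissible and non-permissible subsets
    according to a constraint on the result vector (cf. Eq. (3))."""
    perm, nperm = [], []
    for x, z in zip(X, Z):
        (perm if is_permissible(z) else nperm).append((x, z))
    return perm, nperm

# toy data; the constraint forces the existence of an interlock, z_f > 0
X = [[1.0], [2.0], [3.0], [4.0]]
Z = [[0.5], [0.0], [0.2], [0.0]]
perm, nperm = split_by_constraint(X, Z, lambda z: z[0] > 0)
```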
To gain a brief overview of the interdependencies between pairs of parameters, a scatter plot of the data, as depicted in Fig. 6, can be used. The upper triangular matrix is related to X_perm, the lower to X_nperm.
With the focus on the permissible data, there are two noticeable issues. First of all, a correlation or, more generally, a dependency between certain pairs of input parameters (e.g. x_t.M_rel and x_D.S_rel) is evident. Secondly, the non-permissible data is approximately as equally distributed in X as the permissible data, except for the correlated parameters. Hence, if an input space X_perm is defined by X_perm = [x_min,perm,1, x_max,perm,1] × ... × [x_min,perm,11, x_max,perm,11], the permissible design range for each input parameter is equal to the initial ranges in Tab. 1. Consequently, it is not ensured that x ∈ X_perm is permissible. Particularly with regard to the approximation by a surrogate model, it is expedient to integrate e.g. a classification method. An alternative approach is the computation of design spaces based on permissible points only [Graf, Götz and Kaliske (2018)]. Other names for this method are solution space [Zimmermann and von Hoessle (2013)] or feasible design area [Duddeck and Wehrle (2015)]. If these subsets are independent intervals for each input variable, a permissible hypercuboid can be computed. A method to compute such a hypercuboid is given in Götz et al. [Götz, Liebscher and Graf (2012); Graf, Götz and Kaliske (2018)].
For the analysed data set, the permissible hypercuboid is given in Fig. 7 and Eq. (9); the differences compared to the initial design space are emphasized. The permissibility is defined according to Eq. (7), whereas the objective is a permissible hypercuboid with a maximum number of points inside. The visualisation applied in Fig. 7 is called a parallel coordinate plot [Inselberg (1985)]. Hereby, each parameter is represented by a separate axis. A parameter vector inside the permissible hypercuboid is represented as a line inside the grey marked area. The shown hypercuboid contains 25.6% of the permissible points, which is a reasonable result. The benefit of the computed permissible design space is that, within the design process, the engineer can select any parameter combination inside the permissible design space. A disadvantage is that the usable hypervolume, representing the range of possible designs, is only 10.2% of the initial design space, see Tab. 1.
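The idea of a permissible hypercuboid can be illustrated by a simple greedy shrinking scheme. This is only a sketch, not the algorithm of Götz et al.; the points and bounds below are hypothetical, and `math.nextafter` (Python 3.9+) is used to move a bound just past an offending point:

```python
import math

def permissible_hypercuboid(X_perm, X_nperm, bounds):
    """Greedy sketch: shrink an axis-aligned box until it contains no
    non-permissible point, at each step choosing the cut that keeps
    the largest number of permissible points."""
    lo = [b[0] for b in bounds]
    hi = [b[1] for b in bounds]

    def inside(x, lo, hi):
        return all(l <= v <= h for v, l, h in zip(x, lo, hi))

    while True:
        bad = [x for x in X_nperm if inside(x, lo, hi)]
        if not bad:
            return lo, hi
        x = bad[0]
        best, best_kept = None, -1
        for d in range(len(lo)):
            # two candidate cuts: move the lower or the upper bound just past x[d]
            cut_lo = (lo[:d] + [math.nextafter(x[d], math.inf)] + lo[d + 1:], hi)
            cut_hi = (lo, hi[:d] + [math.nextafter(x[d], -math.inf)] + hi[d + 1:])
            for nlo, nhi in (cut_lo, cut_hi):
                kept = sum(inside(p, nlo, nhi) for p in X_perm)
                if kept > best_kept:
                    best, best_kept = (nlo, nhi), kept
        lo, hi = best

perm_pts = [[0.1, 0.1], [0.2, 0.3], [0.4, 0.2]]
nperm_pts = [[0.9, 0.9]]
lo, hi = permissible_hypercuboid(perm_pts, nperm_pts, [(0.0, 1.0), (0.0, 1.0)])
```

The returned intervals are independent per input parameter, i.e. they describe exactly the kind of hypercuboid shown in Fig. 7.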

Sensitivity analysis
For engineering related design guidelines, the number of 11 independent parameters makes the specification of a suitable design challenging. A sensitivity analysis is carried out in order to identify the parameters with major influence on the result quantities.
Correlations are evaluated with the goal to substitute one parameter by another for the purpose of parameter reduction. One correlation coefficient as well as two sensitivity coefficients will be briefly introduced and applied to the permissible data set (X , Z ) perm .
Spearman correlation coefficient The SPEARMAN rank correlation coefficient is commonly applied to assess the relationship between two parameters described by a monotonic function and is, therefore, a non-linear correlation coefficient [Saltelli, Chan and Scott (2008)]. It is defined as

r_SPEARMAN = Σ_l (rg(x_l) − r̄g_x)(rg(z_l) − r̄g_z) / sqrt( Σ_l (rg(x_l) − r̄g_x)² · Σ_l (rg(z_l) − r̄g_z)² ),

with the rank rg(·) and the mean rank r̄g_x = r̄g_z = (n + 1)/2. A coefficient value |r_SPEARMAN| = 1 is related to an ideally positive or negative monotonic (non-linear) relation between x and z. The resulting correlation coefficients r_SPEARMAN are depicted in Fig. 8 for each parameter combination, where the result parameters are separated by the dashed line.
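The coefficient can be computed directly from the rank definition above; a self-contained sketch with tie handling via average ranks:

```python
def rank(v):
    """Average ranks (1-based), ties receive the mean of their positions."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    ranks = [0.0] * len(v)
    i = 0
    while i < len(v):
        j = i
        while j + 1 < len(v) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, z):
    """SPEARMAN rank correlation with mean rank (n + 1) / 2."""
    n = len(x)
    rx, rz = rank(x), rank(z)
    m = (n + 1) / 2
    num = sum((a - m) * (b - m) for a, b in zip(rx, rz))
    den = (sum((a - m) ** 2 for a in rx) *
           sum((b - m) ** 2 for b in rz)) ** 0.5
    return num / den

# a perfectly monotonic but non-linear relation yields r = 1
xs = [0.1, 0.5, 1.0, 2.0, 3.0]
zs = [v ** 3 for v in xs]
```

The cubic example illustrates why the measure is called non-linear: the PEARSON coefficient of the same data would be below one, while the rank correlation is exactly one.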
The correlation assumed from the visual interpretation of Fig. 6 is confirmed by the computed correlation coefficients. For the further analysis and utilization of the data set, the correlations between input and output parameters are of higher importance. Under the assumption that |r_SPEARMAN(x, z)| can additionally be interpreted as a sensitivity measure, x_t1_rel has the major effect on z_t.n and x_t.b on z_f. Especially the identified correlation between the input parameters x_t.M_rel ∼ x_D.S_rel and, consequently, the reduced permissible design space (see Fig. 6) is decisive for the training, evaluation and utilization of surrogate models.
Correlation relation estimation In addition to the correlation analysis, the sensitivities are estimated by the application of a correlation relation estimation.
According to Siebertz et al. [Siebertz, van Bebber and Hochkirchen (2010)], the correlation relation can be defined as the ratio of the variance of the conditional mean values of the output to the total output variance [Siebertz, van Bebber and Hochkirchen (2010)]. This method itself is not applicable to existing data, but the general concept has been adapted to a variance-based sensitivity measure for existing data, denoted KVest (KV estimation).
Each input parameter x_i is separated into p intervals I_i,k, k ∈ {1, . . . , p}, with the associated data set for each output parameter z_j. Hence, the conditional mean values are utilized to approximate the correlation relation with the variance of the mean values by

S_KVest = Var_k( E[z_j | x_i ∈ I_i,k] ) / Var(z_j).

The presented sensitivity measure is interpretable as a first order SOBOL' index, see Section 5.1. Assuming a large number of data samples as well as subspaces, S_KVest converges to the first order SOBOL' index, whereas a small data set is still applicable to quantify the significance of input parameters even for non-linear dependencies. For each output parameter, the computed sensitivities for the most sensitive input parameters are shown in Tab. 2. Especially the variance of z_t.n seems to be solely depending on x_t1_rel and weakly on x_t.b. According to the KVest measures, the remaining input parameters are insignificant to the output parameter. Compared to the correlation evaluation with the SPEARMAN coefficient (see Eq. (11)), the same two main contributors are identified.
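The binning idea behind KVest can be sketched as follows: the input range is split into p equidistant intervals, and the variance of the conditional means is related to the total output variance. The toy data and the equidistant interval scheme are assumptions for illustration:

```python
import random

def kv_estimate(x, z, p=10):
    """KVest sketch: variance of the conditional means of z over p
    equidistant intervals of x, normalised by the total variance of z
    (approximates a first order SOBOL' index from existing data)."""
    lo, hi = min(x), max(x)
    bins = [[] for _ in range(p)]
    for xi, zi in zip(x, z):
        k = min(int((xi - lo) / (hi - lo) * p), p - 1)
        bins[k].append(zi)
    mean_z = sum(z) / len(z)
    var_z = sum((v - mean_z) ** 2 for v in z) / len(z)
    # weighted variance of the conditional means E[z | x in I_k]
    var_means = sum(len(b) * (sum(b) / len(b) - mean_z) ** 2
                    for b in bins if b) / len(z)
    return var_means / var_z

rng = random.Random(0)
x1 = [rng.random() for _ in range(5000)]
x2 = [rng.random() for _ in range(5000)]
z = [a + 0.1 * b for a, b in zip(x1, x2)]
s_strong = kv_estimate(x1, z)   # z depends strongly on x1
s_weak = kv_estimate(x2, z)     # ... and only weakly on x2
```

In contrast to the SOBOL' sampling schemes above, this estimate works on an existing data set and needs no additional model evaluations.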
EASI For the purpose of comparison and validation of the identified correlation and sensitivity values, another, methodically different, approach is used as sensitivity measure. Originally, the 'effective algorithm for computing global sensitivity indices' (EASI) has been introduced by Plischke [Plischke (2010)] as a sensitivity measure for existing data of non-linear systems. The method is adapted to the general concept of FAST, which analyses the input space by frequency dependent characteristic functions and the map G_ω(s) = (1/π) · arccos(cos(2π · ω · s)).
Due to permutation and resorting of the existing data (X, Z), the EASI approach leads to the approximation of Eq. (16) for the frequency ω = 1. At the beginning, one input parameter x_i and one output parameter z_j are sorted in increasing order to obtain an ordered vector (x_i,l, z_j,l) = (x_i, z_j)_l with x_i,l ≤ x_i,l+1 for all l = 1, 2, . . . , n_perm. In order to approximate a triangular shaped function, all odd indices of (x_i, z_j)_l in increasing order are followed by all even indices in decreasing order, i.e. (x_i, z_j)_1, (x_i, z_j)_3, (x_i, z_j)_5, . . . , (x_i, z_j)_6, (x_i, z_j)_4, (x_i, z_j)_2. This specific shuffling leads to a triangular-shaped vector, which is in analogy to the FAST sampling at ω = 1. Since the permutation of the output parameter is conjoined to the input permutation, the resonances of period ω = 1 and its higher harmonics in the power spectrum are used to determine the sensitivity. Based on the complex coefficients of the discrete FOURIER transformation, the first order sensitivity index is computed from the spectral power at the first M harmonics relative to the total spectral power. According to Plischke [Plischke (2010)], the number of higher harmonics is set to M = 6.
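A compact sketch of the EASI procedure (sorting, triangular reordering, and the spectral ratio over the first M harmonics); a naive DFT is used instead of an FFT for brevity, and the linear toy model is an assumption for illustration:

```python
import cmath
import random

def easi(x, z, M=6):
    """EASI sketch after Plischke (2010): sort by x, reorder into a
    triangular wave of frequency one, and measure how much output
    variance lies on the first M harmonics of that frequency."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    # odd positions (1-based) increasing, then even positions decreasing
    shuffled = order[0::2] + order[1::2][::-1]
    y = [z[i] for i in shuffled]
    mean = sum(y) / n
    y = [v - mean for v in y]
    total = n * sum(v * v for v in y)    # Parseval: sum of |c_k|^2 over all k
    harmonics = sum(
        abs(sum(v * cmath.exp(-2j * cmath.pi * k * l / n)
                for l, v in enumerate(y))) ** 2
        for k in range(1, M + 1))
    return 2 * harmonics / total         # factor 2: mirrored frequencies n - k

rng = random.Random(1)
xs = [rng.random() for _ in range(1000)]
zs = [3.0 * v + 1.0 for v in xs]         # z fully explained by x
s = easi(xs, zs)
```

For the fully explained toy output the index is close to one, while for an output independent of x it drops towards zero.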
The computed sensitivities are given in Tab. 2.
Both introduced sensitivity measures only provide an approximation of the first order sensitivity. The sum of the higher order sensitivities (total sensitivity minus first order contributions) can be estimated by 1 − Σ_i S_i^(EASI, KVest). As can be seen in Tab. 2, the amount of higher order sensitivities is significant and cannot be neglected. For both result values, the determined sensitivities are depicted in Fig. 9. Evidently, the qualitative results of both measures are similar to each other or even equal regarding the descending order of the sensitive input parameters.
Self-organizing map An entirely different approach to correlation analysis is the evaluation of self-organizing maps (SOM). Introduced by KOHONEN, SOM try to mimic the biological brain's ability to map complex high-dimensional data to a low-dimensional (2D) representation of information, thus SOM: R^n_x → R². Topological properties of the data are hereby preserved, which allows the identification of e.g. clusters or correlation behaviour. Possible evaluation methods for cluster identification or spatial distribution of data are described in Liebscher et al. [Liebscher, Witowski and Goel (2009); Ultsch (2003)].
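A minimal SOM training loop illustrates the idea. This is a simplified sketch with a rectangular grid and a Gaussian neighbourhood; the grid size and decay schedules are arbitrary choices, not those of the cited implementations:

```python
import math
import random

def train_som(data, rows=10, cols=10, iters=2000, seed=0):
    """Minimal SOM sketch: rectangular neuron grid, Gaussian
    neighbourhood, linearly decaying learning rate and radius."""
    rng = random.Random(seed)
    dim = len(data[0])
    w = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
         for _ in range(rows)]
    for t in range(iters):
        frac = t / iters
        lr = 0.5 * (1.0 - frac)                       # learning rate decay
        radius = max(rows, cols) / 2.0 * (1.0 - frac) + 1e-6
        x = data[rng.randrange(len(data))]
        # best matching unit: neuron with the closest weight vector
        bmu = min(((r, c) for r in range(rows) for c in range(cols)),
                  key=lambda rc: sum((a - b) ** 2
                                     for a, b in zip(x, w[rc[0]][rc[1]])))
        for r in range(rows):
            for c in range(cols):
                d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * radius ** 2))  # neighbourhood weight
                w[r][c] = [wv + lr * h * (xv - wv)
                           for wv, xv in zip(w[r][c], x)]
    return w

rng = random.Random(7)
data = [[rng.random(), rng.random()] for _ in range(500)]
grid = train_som(data)
```

Plotting one weight component per neuron over the grid yields exactly the component maps discussed in the following.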
In this contribution, component maps, based on an orthogonal neural grid of 30 × 30 neurons, are utilized and the resulting component charts are separately depicted in Fig. 10. Based on the grid representation of the neural network, the parameter values of a neuron are represented at the neuron's position in the grid. Under the assumption of a considerably good fit of the network, correlation analysis can be performed, since at each map position the actual data space is represented as well.
No correlation is visualised by two orthogonal component charts, whereas either positive or negative correlation is represented by equal or inverse gradients of the components. These observations can be made globally over the map and, therefore, over the entire data space, or locally for a partial area of the SOM.
Regarding the data set, multiple interdependencies can be identified. The response values seem to be independent of each other, since their charts are clearly orthogonal, whereas the positive correlation between z_t.n and x_t1_rel can be derived from the equally oriented trend lines. The same applies for the dependency between x_D.S_rel and x_t.M_rel. The distinctive negative correlation between x_t.b and z_f is not clearly visible, but can be confirmed by the SOM. In the component charts, see [Malone, McGarry, Wermter et al. (2005); Mostafa (2009)], low (blue) parameter values of x_t.b comply with the total range of z_f, but increasing parameter values lead to a decrease of the output parameter (see lower left and right corners).

Figure 9: Result of sensitivity analysis with KVest and EASI for z_f (interlock) and z_t.n (neck thickness)
Figure 10: Component map of self-organizing map trained for permissible data (X, Z)_perm (response parameters are highlighted in green)

Summary of data mining
From the previously discussed results, the following conclusions can be drawn:
• independent measures (KVest and EASI) yield the same results,
• most significant parameters for the result interlock z_f: x_t.b (bottom thickness), x_D.S_rel (punch diameter in relation to die diameter) and x_t1_rel (thickness of the upper sheet in relation to total sheet thickness),
• most significant parameters for the result neck thickness z_t.n: x_t1_rel (thickness of the upper sheet in relation to total sheet thickness), x_t.b (bottom thickness) and x_D.M (die diameter),
• a high amount of parameter interaction is expected,
• the permissible design space cannot be identified on the basis of the scatter plot alone,
• dependencies (correlations) in the permissible and non-permissible data sets are identifiable.

Meta-modelling
In the following, two different machine learning methods (artificial neural network and response surface) are discussed and applied to the presented example. Furthermore, the meta-models are evaluated and visualised.

Artificial neural networks and polynomial approximation
Artificial neural networks An artificial neural feed-forward network with one hidden layer (with n_v neurons, see Fig. 11) can be written as a weighted sum of the outputs of the neurons,

ξ*(x) = Σ_{v=1}^{n_v} β_v · G(w_v, θ_v, x). (20)

The vector β contains the weighting factors between hidden layer and output neuron. The function G is the processing of one neuron, see Fig. 12. It is G(w, θ, x) = ϕ(ξ, θ) = ϕ(ξ(w, x), θ). The neuron consists of two separate functions. A common propagation function is the weighted sum of all input signals, ξ(w, x) = Σ_{i=1}^{n_x} w_v,i · x_i. The weights w represent the strength of influence of each input neuron on each hidden neuron separately. The activation function ϕ can be formulated by various functions, e.g. hyperbolic tangent or binary threshold. In this study, a unipolar function is used. This function depends on the output of the propagation function ξ and a bias value θ.
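The forward pass of such a network is a direct transcription of the formulas above; a sketch using a logistic function as the unipolar activation (the concrete activation used in the study is not specified beyond 'unipolar', so this choice is an assumption):

```python
import math

def phi(xi, theta):
    """Unipolar (logistic) activation depending on the propagation
    output xi and the bias value theta."""
    return 1.0 / (1.0 + math.exp(-(xi - theta)))

def ann_forward(x, W, theta, beta):
    """Feed-forward network with one hidden layer: weighted sum
    (weights beta) of the hidden neuron outputs.

    W[v]     -- input weights w of hidden neuron v
    theta[v] -- bias of hidden neuron v
    beta[v]  -- weighting factor between hidden neuron v and the output
    """
    out = 0.0
    for w_v, th, b in zip(W, theta, beta):
        xi = sum(wi * xv for wi, xv in zip(w_v, x))  # propagation function
        out += b * phi(xi, th)
    return out

# two inputs, two hidden neurons, all biases zero
y = ann_forward([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [1.0, 1.0])
```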

Polynomial approximation
The polynomial approximation is often called 'Response Surface Method (RSM)'. The main idea is to represent the data by a polynomial of degree n.
Usually, the degree is n = 1 for a linear approximation or n = 2 for a quadratic approximation,

ξ*(x) = β_0 + Σ_{i=1}^{n_x} β_i x_i + Σ_{i=1}^{n_x} Σ_{j>i} β_ij x_i x_j + Σ_{i=1}^{n_x} β_ii x_i². (22)

The coefficient β_0 is the constant; β_i, β_ij and β_ii are the coefficients for the linear, mixed and quadratic terms, respectively.
The training procedure is the computation of the vector β on the basis of the minimization of the root mean square error, e.g. via the normal equations (HESSIAN matrix), see Myers et al. [Myers, Montgomery and Anderson-Cook (2009)]. Thereby, the most common assumption for the error ε in Eq. (5) is zero mean, yielding the ordinary least-squares estimator of β.
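For a single input parameter, the least-squares training of a quadratic response surface reduces to a 3 × 3 system of normal equations; a self-contained sketch with toy data that is exactly quadratic:

```python
def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_quadratic(x, z):
    """Least-squares fit of z = b0 + b1*x + b2*x^2 via normal equations."""
    Phi = [[1.0, xi, xi * xi] for xi in x]
    AtA = [[sum(p[i] * p[j] for p in Phi) for j in range(3)] for i in range(3)]
    Atz = [sum(p[i] * zi for p, zi in zip(Phi, z)) for i in range(3)]
    return solve(AtA, Atz)

beta = fit_quadratic([0, 1, 2, 3, 4], [1, 2, 5, 10, 17])  # data from z = 1 + x^2
```

Since the toy data follows z = 1 + x² exactly, the fitted coefficients recover β = (1, 0, 1) up to floating point precision.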

Selection and training of meta-models
Meta-models contain various parameters, which need to be determined during a training process. Due to different characteristics (e.g. size, dimensionality, complexity, white noise, ...), data sets cannot be covered by one general meta-model. Thus, it is challenging in meta-modelling to select the type of model, the architecture of the model and the training parameters of the model. A common method is to analyse the performance of different meta-models for the same data set. To allow for validation, the data set is split into a set used for training and another for testing. The approximation error depends on this selection of training and test data. To overcome this, the k-fold cross-validation (CV) [Arlot and Celisse (2010); Geisser (1975)] is commonly used. Thereby, the data set is split into k subsets and the training is repeated k times. Typically, the number of subsets is selected within k ∈ {5, 10}.
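The k-fold scheme can be sketched generically: any surrogate that provides a train and a predict step can be plugged in. The mean predictor in the usage example is a deliberately trivial stand-in for a real meta-model:

```python
def k_fold_cv(X, Z, train, predict, error, k=5):
    """k-fold cross-validation sketch: split the data into k disjoint
    test folds, train on the remainder, and average the test errors."""
    folds = [list(range(len(X)))[i::k] for i in range(k)]
    errors = []
    for fold in folds:
        test = set(fold)
        X_tr = [x for i, x in enumerate(X) if i not in test]
        Z_tr = [z for i, z in enumerate(Z) if i not in test]
        model = train(X_tr, Z_tr)
        z_pred = [predict(model, X[i]) for i in fold]
        z_true = [Z[i] for i in fold]
        errors.append(error(z_pred, z_true))
    return sum(errors) / k

# toy surrogate: always predict the mean of the training outputs; RMS error
rms = lambda p, t: (sum((a - b) ** 2 for a, b in zip(p, t)) / len(p)) ** 0.5
mean_train = lambda X, Z: sum(Z) / len(Z)
mean_pred = lambda m, x: m
cv_err = k_fold_cv(list(range(10)), [2.0] * 10, mean_train, mean_pred, rms)
```

Averaging the k test errors is exactly the mean error used below for the automatic model selection.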
The comparison of different meta-models is based on the mean error over the k folds, ε̄ = (1/k) Σ_{κ=1}^{k} ε_RMS,κ, where the error measure ε_RMS is given in Eq. (26). The k-fold CV is the basis for an automatic selection process.
For the solution analysed in this contribution, the meta-modelling is performed for both result quantities z_f and z_t.n independently. The training is done for a polynomial approximation (linear and quadratic) according to Eq. (22), support vector regression using 'libSVM' [Chang and Lin (2011)] and ANN according to Eq. (20). The search for an optimal ANN architecture is included in the automatic selection process. The observed neuron-layer combinations are based on a number of neurons in the first hidden layer of {1, . . . , 25} and in a second hidden layer of {0, . . . , 25}. Under the condition that the number of hidden neurons in the second hidden layer is smaller than or equal to that in the first hidden layer, 219 different architectures are observed.
The training, i.e. the optimisation of neuron bias values and connecting weights, can be done by back-propagation algorithms. These algorithms are very slow, which is critical due to the high number of observed models. Therefore, the LEVENBERG-MARQUARDT algorithm can be used [Hagan and Menhaj (1994); Levenberg (1944); Marquardt (1963)]. This algorithm is very fast but tends to over-fitting. Due to the repeated training within the framework of the k-fold CV, this disadvantage is not relevant.
The analysis is performed on an 'Intel i7-4790' CPU within 40 hours for each result quantity. Among the observed meta-models, the ANNs perform best. The best architectures (number of input neurons - hidden layer neurons - hidden layer neurons - output neurons) are for z f : (11 − 25 − 16 − 1) and for z t.n : (11 − 22 − 18 − 1).

Quality evaluation and visualisation
The quality of meta-models is evaluated by characteristic measures. The error to be observed is the difference between the original result and the approximation; for a simulation point l it reads ε l = z * l − z l . The following two measures are mainly used. The root mean square error (RMS) is
ε RMS = √( (1/n) · Σ l=1..n (z * l − z l )² ).
A disadvantage is that the value of the RMS depends on the order of magnitude of the observed result. This means, only a relative evaluation of the approximation quality is possible. But different meta-models for the same data set can be evaluated relative to each other. Another important measure is the coefficient of determination R 2 (also called CoD [Most and Will (2010)])
ε R 2 = 1 − Σ l (z l − z * l )² / Σ l (z l − z̄)².
For the analysed data set, the quality measures are given in Tab. 3. In addition to the ANNs, a polynomial meta-model of quadratic polynomial degree is given. The polynomial approximation follows the formulation in Eq. (22) in general. However, only the most significant input parameters (based on correlation measures) are considered in the utilized polynomial approximation. The error measures in the table show the disadvantage of the measure ε R 2 . For the polynomial approximation, a value of z f = 0.92 indicates a reasonable approximation, but Fig. 13(b) shows that the approximation quality is not reasonable. The measure ε RMS indicates the very good approximation quality of the ANNs, which can be validated by Fig. 13(a) and Fig. 13(c). The graphs in Fig. 13 show the predicted results z * i and the original results z i for all sample data.
Figure 13: Quality assessment of analysed meta-models (black: z i , green: z * i )
It is common to use 3D visualisations to depict the meta-model, which is challenging for the multi-dimensional input space (here 11-dimensional). The other (not shown) dimensions need to have fixed values, such that (n x ² − n x )/2 possibilities for visualisations exist. Thus, any visualisation of multi-dimensional data in 3D is limited. In Figs. 14 and 15, two input parameters are selected, separately for each result. In addition to the meta-model result (continuous plot), the original data set is shown.
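The two error measures discussed above can be written down directly. The snippet below is an illustrative sketch with made-up data values; it also demonstrates the scale dependence of the RMS error, i.e. why only a relative comparison is possible with ε RMS.

```python
import numpy as np

def rms_error(z, z_hat):
    """Root mean square error between original results z and predictions z_hat."""
    return float(np.sqrt(np.mean((z_hat - z) ** 2)))

def coefficient_of_determination(z, z_hat):
    """R^2 (CoD): 1 minus residual sum of squares over total sum of squares."""
    ss_res = np.sum((z - z_hat) ** 2)
    ss_tot = np.sum((z - np.mean(z)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# made-up sample results and meta-model predictions
z = np.array([1.0, 2.0, 3.0, 4.0])
z_hat = np.array([1.1, 1.9, 3.2, 3.8])

rms = rms_error(z, z_hat)
r2 = coefficient_of_determination(z, z_hat)
# rescaling the result quantity by 10 scales the RMS by 10 as well,
# while R^2 stays unchanged -- the scale-dependence criticised in the text
rms_scaled = rms_error(10 * z, 10 * z_hat)
```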
Most of the scatter of the input data can be explained by the two selected input parameters. In contrast, Fig. 16 has two selected input parameters where the meta-model cannot explain the scatter. These parameters can be interpreted as locally (due to the fixed other parameters) non-sensitive.

Sensitivities
In this section, the most common sensitivity measures are described, computed and discussed for the clinching process. Namely, these measures are the SOBOL' indices of first order S SOBOL' and the total SOBOL' indices S SOBOL',T . The sensitivities are computed using the developed meta-models. The results are compared to the measure for existing data S KVEST (see Section 3.2) and the measure S COP . The Coefficient of Prognosis (CoP) is used in [OptiSLang (2011)] and computed for the polynomial meta-model. Furthermore, the functional behaviour of the total SOBOL' index, called sectional sensitivity measure, shows the change of influence for each input parameter.

Sensitivity measures
For variance-based sensitivity measures, a distinction can be made between effects of first and higher order. Effects of first order S x i |z j , also called main effects, quantify the influence of one parameter x i on the variance of the result z j . Effects of higher order S x i,...,p |z j capture the influence of two or more input parameters x i,...,p on the variance of the result z j . The overall influence of one input parameter is called total sensitivity. The computation of the SOBOL' indices, i.e. the solution of high-dimensional integrals, can be done by Monte-Carlo simulation or, more efficiently, on the basis of polynomial chaos expansion [Sudret (2008)].

SOBOL' index
The SOBOL' index of first order [Sobol' (2001)] is defined as
S x i |z j = Var x i [E[z j |x i ]] / Var[z j ] (28)
with the result quantity z j and the input quantity x i . The expected value of z j under the condition of occurrence of a specific x i is E[z j |x i ]. The variance over all possible x i is Var x i .
The variance of the result Var[z j ] is designated as total variance. This measure is also called measure of importance.
Total SOBOL' index The cumulative influence of all lower level sensitivities (i.e. first order, second order, ...) can be computed by the total SOBOL' index [Homma and Saltelli (1996)]. In analogy to Eq. (28), it is
S x i |z j ,T = 1 − Var x ∼i [E[z j |x ∼i ]] / Var[z j ].
The subscript ∼i indicates the consideration of all input quantities except x i ; Var x ∼i [E[z j |x ∼i ]] is the partial variance of all input quantities x k without x i .
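Both indices can be estimated by Monte-Carlo simulation, as mentioned above. The sketch below uses the standard Saltelli/Jansen pick-freeze estimators on an additive toy function with known indices (0.2 and 0.8); in the actual study, the trained ANN meta-model would take the place of f. The function names are illustrative, not from the paper.

```python
import numpy as np

def sobol_indices(f, d, n=100_000, seed=0):
    """Pick-freeze Monte-Carlo estimators for first-order and total
    SOBOL' indices of f on the unit hypercube [0, 1]^d."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(size=(n, d))
    B = rng.uniform(size=(n, d))
    fA, fB = f(A), f(B)
    var = np.var(np.concatenate([fA, fB]))
    S, ST = np.empty(d), np.empty(d)
    for i in range(d):
        AB = A.copy()
        AB[:, i] = B[:, i]          # freeze all columns except i
        fAB = f(AB)
        S[i] = np.mean(fB * (fAB - fA)) / var        # Saltelli first-order
        ST[i] = 0.5 * np.mean((fA - fAB) ** 2) / var  # Jansen total index
    return S, ST

# additive toy model: Var = 1/12 + 4/12, so S = (0.2, 0.8) and, since
# there are no interactions, the total indices coincide with S
f = lambda X: X[:, 0] + 2.0 * X[:, 1]
S, ST = sobol_indices(f, d=2)
```

For an interacting model, such as the clinching meta-models, ST exceeds S; the gap is exactly the interaction contribution discussed in the comparison of the indices below.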

Coefficient of Prognosis -CoP The sensitivity measure Coefficient of Prognosis is defined in Most et al. [Most and Will (2011)] as
S CoP = CoP · S SOBOL',T .
The measure is the total SOBOL' index linearly scaled by the Coefficient of Prognosis, which is a quality estimator for the meta-model. The quality estimators CoP of the polynomial approximation for the two results are CoP z f = 0.9169 and CoP z t.n = 0.9866.

Results of sensitivity analysis based on meta-model computations
In addition to the measures described above, the measure KVest, already discussed in Section 3.2, is included in the comparison. The computed sensitivities can be found in Figs. 17(a) and 17(b) for the response interlock z f and neck thickness z t.n , respectively.
KVest versus SOBOL' As already pointed out in Section 3.2, the sensitivity measure KVest is comparable to SOBOL' indices of first order. But in this example there is a substantial difference between them. The measure KVest identifies more sensitive parameters than the SOBOL' indices. Only the most significant parameters are found by both. One reason for the difference is the dependency within the input data, see Figs. 6 and 8 and Eqs. (8) and (11). The sampling-based sensitivity measures cannot consider these dependencies, which causes part of the differences. Another reason is the usage of the meta-model for computing the SOBOL' indices. To capture such influences, sensitivity measures based on existing data and sampling-based measures need to be compared.

SOBOL',T versus CoP
The comparison shows that the selection of the type of meta-model has a significant influence on the sensitivity evaluation. The measure CoP is a linearly scaled total SOBOL' index, such that a comparison of the measures is possible. Even if the three most sensitive parameters (x t1_rel , x D.M , x t.M_rel ) for the result z t.n are identified with both measures, the differences are substantial. The amount of influence and the interaction phenomena are very different.

Sectional sensitivities
Description of methodology In Pannier et al. [Pannier and Graf (2014)], a method for the sectional evaluation of global sensitivity measures is proposed. The aim is to obtain more detailed information about the sensitivity for x i by separating the input space X according to Eq. (1) into p equidistant intervals. The separation is done independently for each input parameter, such that the sensitivity is computed as a function depending on x i . The splitting is formulated as
I i,k = [x i,min + (k − 1) · ∆x i , x i,min + k · ∆x i ] with ∆x i = (x i,max − x i,min )/p.
The same splitting is applied to the data set for the sensitivity measure 'KVest', see Eq. (13). For the input parameter x i , all segments X i,k = I 1 × I 2 × . . . × I i,k × . . . × I n x have to be considered. The sensitivity is computed for each segment of the input space k ∈ {1, . . . , p}, such that
S i,k = S(X i,k )
is the sectional sensitivity for the input parameter x i . As sensitivity measure, all common global sensitivity measures can be applied. For global comparison, these sectional sensitivities have to be scaled by
S̃ i,k = S i,k · S(X) / ( (1/p) · Σ k S i,k ).
Thus, the mean of the sectional sensitivity values is equal to the sensitivity of the full input space X computed with the same global sensitivity measure S.
Sectional sensitivity can be interpreted as mean quantitative functional behaviour for each input parameter x i under consideration of the variance of all others. For the interpretation of the results, it has to be considered, that the sensitivities between different input parameters are not comparable within one segment, because they are computed in different subspaces X i,k .
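Assuming the total SOBOL' index as the global measure S, the sectional evaluation can be sketched as follows. The toy function and all names are illustrative: x 0 is constructed so that its influence grows over its parameter range, which the sectional values should reproduce.

```python
import numpy as np

def total_index(f, d, i, lo, hi, n=40_000, seed=0):
    """Jansen estimator of the total SOBOL' index of x_i, with x_i
    restricted to [lo, hi] and all other inputs sampled on [0, 1]."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(size=(n, d))
    B = rng.uniform(size=(n, d))
    A[:, i] = lo + (hi - lo) * A[:, i]
    B[:, i] = lo + (hi - lo) * B[:, i]
    AB = A.copy()
    AB[:, i] = B[:, i]
    fA, fAB = f(A), f(AB)
    return 0.5 * np.mean((fA - fAB) ** 2) / np.var(fA)

def sectional_sensitivity(f, d, i, p=10):
    """Sectional total indices over p equidistant segments of x_i,
    scaled so that their mean equals the global index."""
    edges = np.linspace(0.0, 1.0, p + 1)
    raw = np.array([total_index(f, d, i, edges[k], edges[k + 1], seed=k)
                    for k in range(p)])
    return raw * total_index(f, d, i, 0.0, 1.0) / np.mean(raw)

# toy model: the local slope of x_0**2 grows with x_0, so the influence of
# x_0 should increase from the first to the last segment
f = lambda X: X[:, 0] ** 2 + 0.1 * X[:, 1]
sec = sectional_sensitivity(f, d=2, i=0, p=10)
```

As cautioned above, values of different parameters must not be compared within one segment, since each is computed in its own subspace X i,k.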

Results
Here p = 10 sections are evaluated, the results are given in Fig. 18. Due to the high number of interaction phenomena, the total SOBOL' index S Sobol',T is considered.
For the result z f , the two parameters x t.b and x D.S_rel are remarkable. All other parameters show uniform behaviour, which means the influence is constant over the complete parameter range. This characteristic has to be evaluated with respect to the production process: low parameter values exist at the end of the process, see Fig. 1. To validate this result, the meta-model visualisation in Fig. 14(a) can be used. For small values of x t.b , a very high gradient is observable. The same evaluation is valid for x D.S_rel .
For result z t.n , the input parameters x t1_rel , x t.M_rel and x D.M are significant. The parameter x t1_rel is very sensitive and shows small changes only. The input parameter x t.M_rel shows decreasing sensitivity. This behaviour cannot be found in Fig. 15(a), presumably because the interaction sensitivity of x t.M_rel is very high, which cannot be visualized in a 3D-plot.

Summary of sensitivity results
The following main facts can be concluded:
• sensitivity of result z f (interlock)
– most relevant parameters: x t.b (bottom thickness), x D.S_rel (punch diameter in relation to die diameter) and x t1_rel (thickness upper sheet in relation to total sheet thickness),
– the same relevant parameters are found as on the basis of existing data, see Tab. 3,
– parameter interaction exists in the meta-model,
– sensitivity of x t.b (bottom thickness) decreases for high parameter values.
• sensitivity of result z t.n (neck thickness)
– most relevant parameters: x t1_rel (thickness upper sheet in relation to total sheet thickness), x t.M_rel (die depth in relation to total sheet thickness), x t.b (bottom thickness) and x D.M (die diameter),
– except for x t.M_rel , the same relevant parameters are found as on the basis of existing data, see Tab. 3,
– a high amount of parameter interaction exists in the meta-model,
– sensitivities of x t1_rel (thickness upper sheet in relation to total sheet thickness) and x t.M_rel (die depth in relation to total sheet thickness) show significant changes within the parameter range.
• the dependency in the input data set does not allow a full comparison of the meta-model-based sensitivity measures and the sensitivities discussed in Section 3.2 (extrapolation of the meta-model),
• the results depend on the selected meta-model.

Decision map-engineering approach for visualisation of high-dimensional parameter spaces
A common engineering design approach is the visualisation of complex functional relationships by two-dimensional nomographs. These nomographs simplify the design process and help to find appropriate parameter ranges for each problem. This approach is adapted for the ANN meta-model developed in Section 4.2. The map ξ map : R 2 → R should represent the 11-dimensional input vector x by two dimensions. Thus, the function ξ * : R 11 → R needs to be represented in a reduced way. In general, two methods can be used. First, by the application of dimension reduction methods, such as principal component analysis or singular value decomposition, the sample space can be reduced, possibly to a two-dimensional space. Crucial is that this reduced space has no physical relationship and, therefore, engineering decisions cannot be made based on those values. The second methodology is proposed in this contribution and is based on the selection of two representative dimensions x i , x j , which are the map dimensions. The question to be answered is, which values should be taken for the other, not shown parameters x ? . This aspect was already discussed for the meta-model visualisations in Figs. 14 and 15. The proposed method is to consider the not shown parameters x ? as intervals and to construct two decision maps. One decision map represents the lowest and the other the highest values. The computation of these extremal values needs to be repeated for each lattice point. In Fig. 19, the approach is shown; especially the consideration of constraints h(x) is highlighted. The difference of the two maps can be interpreted as the influence of the not shown parameters x ? . It has to be remarked that the common method of plotting two significant input parameters and fixing the others x ? at their mean values yields wrong interpretations.
Figure 19: Proposed approach for generating decision maps

Construction of decision maps for interlock and neck thickness
The major requirement for any design is permissibility. For the analysed sample set, no clear relationship of permissibility can be identified, see Fig. 6. To take this characteristic into account, the decision maps are built for the permissible design space, see Section 3.1. The hypercuboid of Eq. (9) represents permissible samples only. Because the hypercuboid is free of interactions with respect to the selection of samples, the maps can be made interaction free. Alternative approaches, such as describing the permissible samples with the help of convex hulls, do not have this advantage.
For the functions ξ * = z f and ξ * = z t.n , the monotonicity requirement cannot be guaranteed and two constraints need to hold. The constraints are due to the dependency of x t.M_rel and x D.S_rel , which can be seen in Fig. 6. The other identifiable dependencies, such as between x t1_rel and x t.b , are not present in the permissible design space. The constraints for permissibility are formulated as h 1 (x) = x D.S_rel − 0.50 · x t.M_rel − 0.50 < 0 and (35) h 2 (x) = x D.S_rel − 0.32 · x t.M_rel − 0.43 > 0.
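A permissibility check for the two constraints of Eqs. (35) and (36) is a one-liner; the helper name below is, of course, not from the paper.

```python
def permissible(x_DS_rel: float, x_tM_rel: float) -> bool:
    """Check the two permissibility constraints of Eqs. (35) and (36):
    a design is permissible only if h1 < 0 and h2 > 0."""
    h1 = x_DS_rel - 0.50 * x_tM_rel - 0.50   # Eq. (35), must be < 0
    h2 = x_DS_rel - 0.32 * x_tM_rel - 0.43   # Eq. (36), must be > 0
    return h1 < 0.0 and h2 > 0.0

# for x_tM_rel = 0.5, the permissible band of x_DS_rel is (0.59, 0.75)
inside = permissible(0.65, 0.5)
above = permissible(0.80, 0.5)   # violates h1
below = permissible(0.50, 0.5)   # violates h2
```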
(36)
For the result z f , the input parameters x t.b , x D.S_rel , x t1_rel and x D.M are most sensitive in descending order. For the result z t.n , the input parameters x t1_rel , x t.M_rel and x D.M are most sensitive. The idea of the map is to visualise the functional characteristics with respect to these input parameters. If only the two most sensitive parameters were used, the resulting map would be too general and not applicable in practice. To overcome this, a grid of maps is computed, in which fixed values of the third and fourth most sensitive parameters are considered. For z f , no specific characteristic can be found in the sectional sensitivities, see Fig. 18(a). But for z t.n , the input parameter x D.M shows a significant change in sensitivity, see Fig. 18(b). Therefore, the fixed values of the input parameter are placed in these regions, which are already indicated in Fig. 18(b). To directly include the constraints Eqs. (35) and (36), a link between z f and z t.n is provided by sharing the parameter x D.S_rel .
As depicted in Fig. 19, the values of the decision map should be represented by the minimum and the maximum z = [z min , z max ]. If the optimisation tasks are solved exactly, the extremal values lie inside extrapolation areas of the meta-model. This yields a drastic overestimation of the uncertainty and is not explainable in practice. To cope with this, the map is approximatively constructed by replacing the extremal values by empirical quantile values, assuming z min ≈ z 0.01 and z max ≈ z 0.99 . The random sampling of the input space is performed by Monte-Carlo simulation using a set of n sim = 50000 samples for each lattice point.
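The quantile-based construction of one lattice point can be sketched as below. The meta-model is a hypothetical stand-in function, and the parameter ordering (fixed map parameters first, free parameters x ? last) is an assumption of this sketch.

```python
import numpy as np

def map_values(meta_model, fixed, free_bounds, n_sim=50_000, seed=0):
    """Approximate [z_min, z_max] at one lattice point by empirical
    1%/99% quantiles of a Monte-Carlo sweep over the free parameters."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(free_bounds, dtype=float).T
    X_free = rng.uniform(lo, hi, size=(n_sim, len(lo)))
    # fixed map coordinates are repeated for every sample
    X = np.hstack([np.full((n_sim, len(fixed)), fixed), X_free])
    z = meta_model(X)
    return np.quantile(z, 0.01), np.quantile(z, 0.99)

# stand-in for the trained ANN meta-model (not part of the paper's code):
# columns 0-1 are the fixed map parameters, columns 2-3 the free ones x_?
model = lambda X: X[:, 0] + np.sin(np.pi * X[:, 2]) * X[:, 3]
z_lo, z_hi = map_values(model, fixed=[0.5, 0.2],
                        free_bounds=[(0.0, 1.0), (0.0, 1.0)])
```

Repeating this call over a grid of (x i, x j) lattice points, while discarding samples that violate the constraints h(x), yields the minimum and maximum decision maps.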
The resulting maps for z f and z t.n are shown in Figs. 21(a) and 21(b) for the minimum and maximum values, respectively.
The minimum maps can be used by the following steps, as indicated in the corresponding figure. These results can be projected to the maximum map, yielding z f ∈ [0.34, 0.73] and z t.n ∈ [0.19, 0.44]. For the further design process (finding the other parameters x ? ), it is ensured that the constraints hold and the result values are within the given intervals.

Evaluation of the decision maps
The evaluation of the decision maps (Figs. 21(a) and 21(b)) shows a significant difference between the minimum and the maximum value map. This shows the relevance of computing the extremal maps and not only a mean value map. A reason for the unexpected difference (unexpected because the most significant parameters are represented) can be the high amount of parameter interactions. This can be seen in Fig. 17 at the difference of first order SOBOL' index and the total SOBOL' index.
The minimum and the maximum maps for z f show the same trend. Furthermore, it can be seen that for the three different x D.M a slight shift of the contour lines is recognisable. The change with respect to x t1_rel depends on the value of x D.M . The differences between the minimum and the maximum values are the same for all nine maps, which is indicated by a shift of the contour lines for each, and can be interpreted as nearly linear behaviour w.r.t. x D.M and x t1_rel . An important piece of information from the sectional sensitivities (see Section 5.3) is the effect of high sensitivity in the first sections and low sensitivity in the last sections for x D.S_rel . This can also be found in the map: the gradient of the contour lines decreases for increasing x D.S_rel . The same characteristic can be found for x t.b .
The maps for z t.n are different compared to the maps for z f . The shapes of the contour lines differ between the minimum and the maximum chart, and the uncertainty is higher. This means, for any selected design, the range of possible values is larger.
Figure 21: Decision maps for z f (interlock) and z t.n (neck thickness), with three testing points (•, ♦, ) and highlighted constraints (see Eqs. (35) and (36))