
Discovering interpretable physical models using symbolic regression and discrete exterior calculus

Simone Manti and Alessandro Lucantonio

Published 16 January 2024 • © 2024 The Author(s). Published by IOP Publishing Ltd

Citation: Simone Manti and Alessandro Lucantonio 2024 Mach. Learn.: Sci. Technol. 5 015005, DOI 10.1088/2632-2153/ad1af2

Abstract

Computational modeling is a key resource for gathering insight into physical systems in modern scientific research and engineering. While access to large amounts of data has fueled the use of machine learning to recover physical models from experiments and to increase the accuracy of physical simulations, purely data-driven models have limited generalization and interpretability. To overcome these limitations, we propose a framework that combines symbolic regression (SR) and discrete exterior calculus (DEC) for the automated discovery of physical models starting from experimental data. Since these models consist of mathematical expressions, they are interpretable and amenable to analysis, and the use of a natural, general-purpose discrete mathematical language for physics favors generalization with limited input data. Importantly, DEC provides building blocks for the discrete analog of field theories, which are beyond the state-of-the-art applications of SR to physical problems. Further, we show that DEC allows us to implement a strongly-typed SR procedure that guarantees the mathematical consistency of the recovered models and reduces the search space of symbolic expressions. Finally, we prove the effectiveness of our methodology by re-discovering three models of continuum physics from synthetic experimental data: the Poisson equation, Euler's elastica and the equations of linear elasticity. Thanks to their general-purpose nature, the methods developed in this paper may be applied to diverse contexts of physical modeling.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Science and engineering routinely use computer simulations to study physical systems and gather insight into their functioning. These simulations are based on established models made of sets of differential and algebraic equations that encode some first principles (i.e. balances of forces, mass, energy$\ldots$), and whose analytical solution cannot generally be found. Given the recent advances in machine learning, methods that leverage experimental data to automatically discover models from observations offer valuable support to scientific research and engineering whenever a system with complex or unknown governing laws must be analyzed.

To address the question of interpretability, symbolic regression (SR) has been employed in several works to discover the mathematical expressions of the governing equations of single-degree-of-freedom or dynamical systems [1–3]. In [4], a combination of neural networks and physics-inspired techniques enhances SR by cleverly reducing the search space, while in [5] SR was used to distill equations from graph neural networks with an architecture suited to problems involving Newtonian and Hamiltonian mechanics. Recent works [6, 7] proposed to combine deep learning with SR via recurrent neural networks or transformer networks. In [8, 9], a sparse regression approach was introduced to recover the governing equations of nonlinear dynamical systems and continuum systems governed by partial differential equations (PDEs). However, this method produces coordinate-based descriptions of the PDEs that obscure their meaning; it is also sensitive to noisy data and relies on a special representation of the equations (a linear combination of terms containing the derivatives of the unknown). Apart from [9], all these papers apply SR to the (re)discovery of algebraic or ordinary differential equations of classical physics or other standard benchmarks, and are hence not suited to field problems, which form a large class of problems in continuum physics (fluid/solid mechanics, heat transfer, mass transport, $\ldots$). Moreover, none of these approaches exploits the geometrical character of physics, so that, for example, physical variables and mathematical operators may be combined arbitrarily without accounting for their spatial localization (i.e. whether they are defined on points, lines, surfaces or volumes). A high-level comparison between our framework and existing ones is reported in table 1.

Table 1. Comparison among SR frameworks for physical model identification. Some instances of ✘ correspond to software and/or conceptual limitations. We denote by ✘* limitations of our framework that we plan to overcome in future work.

|  | Ours | PySINDy [8, 9] | EQL [18] | Eureqa [1] | DSR [6] | AI Feynman [4] | PySR [19] |
| Field problems | $\checkmark$ | $\checkmark$ |  |  |  |  |  |
| — Domain source | $\checkmark$ |  |  |  |  |  |  |
| — Stationary/non-stationary | $\checkmark$/✘* | ✘/$\checkmark$ | ✘/✘ | ✘/✘ | ✘/✘ | ✘/✘ | ✘/✘ |
| — Variational/non-variational | $\checkmark$/✘* | ✘/$\checkmark$ | ✘/✘ | ✘/✘ | ✘/✘ | ✘/✘ | ✘/✘ |
| Dynamical systems |  | $\checkmark$ |  | $\checkmark$ |  |  |  |
| Algebraic equations | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ |

Here, we devise a framework that combines SR and discrete exterior calculus (DEC) [10, 11] for the discovery of physical models starting from experimental data. Following the pioneering ideas of Tonti [12, 13] and others [14–16], we adopt a discrete geometrical description of physics that offers a general-purpose language for physical theories, including the discrete analog of field theories. A similar idea was proposed in [17], although with a less complete mathematical framework than DEC; moreover, a different model search strategy was employed and no application to vector-valued problems was shown. In section 2, we introduce the main concepts of DEC and we extend the theory to deal with problems characterized by vector-valued unknowns, such as the equations of linear elasticity. Then, we describe our approach to SR, which exploits the synergy between DEC and SR for the effective identification of mathematically and physically consistent models. In section 3 we demonstrate that our method can successfully recover the coordinate-independent representations of three models of continuum physics starting from a limited set of noisy, synthetic experimental data.

2. Automated discovery of physical models

As anticipated in the Introduction, we combine SR and DEC to produce interpretable physical models starting from data. In particular, geometry is at the foundation of many physical theories of the macrocosm, e.g. general relativity, electromagnetism, solid and fluid mechanics [12], and DEC captures the geometrical character of these theories better than standard vector calculus does [10, 14–16]. The incorporation of such a mathematical language within SR has several key advantages:

  • since DEC is a discrete theory, it bypasses the differential formulation, so that the governing equations can be readily implemented in computations without the need for discretization schemes;
  • it distinguishes physical variables based on their spatial localization and thus guarantees that the generated symbolic expressions are mathematically and physically consistent;
  • it helps constrain the search space of the SR by suggesting a type system for the mathematical operators and physical variables;
  • it encodes the discrete counterparts of several differential operators (such as divergence, gradient and curl) in a minimal set of operators, thus making the symbolic representations more compact and expressive.

We argue that our model discovery strategy exploits these features of DEC to produce compact and effective models that generalize well starting from small training datasets. Also, our approach aims at identifying the governing equations that generate the observed data, rather than directly fitting the observations through an end-to-end model that relates inputs (sources, parameters) to measured quantities. This is a substantial difference with respect to all the state-of-the-art alternatives reported in table 1, and it is expected to provide better generalization and deeper physical insight into the recovered model.

A schematic of the proposed framework is depicted in figure 1. Model discovery consists of several steps. In the problem setting step, the physical system whose model has to be identified is defined in terms of a discrete structure, i.e. a 'mesh', resembling (a simplified/coarse-grained description of) its geometry. A preliminary dimensional analysis is carried out to define the appropriate scales for the physical quantities and the number of dimensionless groups, including those that will be used as fitting parameters as part of SR. Physical quantities, including those that are measured experimentally, are defined in terms of their spatial localization and nondimensionalized. The set of DEC operators and other mathematical operations to be used for SR is chosen. The definition of boundary conditions and domain sources (such as external forces) is also carried out at this stage. Then, SR is performed based on the following loop:

  • (i)  
    candidate models generation: a population of trees representing mathematical models based on the chosen operators set is generated, initially in a random fashion;
  • (ii)  
    each model is solved, i.e. it is interpreted as the expression of a potential energy function and minimized, and the solution (or a derived physical quantity) is compared with experimental data to evaluate an error metric that contributes to the fitness of the model;
  • (iii)  
    the candidate models are evolved through genetic operations (crossover and mutation) based on their fitness values and produce a new generation.

Figure 1. Schematic of the model discovery approach.

The loop usually repeats until a given number of generations is reached, after which the top-performing model(s) can be extracted.
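For concreteness, the structure of this loop can be sketched in a few lines of Python. The stand-ins below (an 'individual' is a single number fitted to a scalar datum, mutation is a Gaussian perturbation) replace the actual tree generation, energy minimization and genetic operators described in section 2.2; only the generate–evaluate–evolve structure is shown.

```python
# Structural sketch of steps (i)-(iii); all model-specific ingredients are
# replaced by trivial stand-ins (an individual is just a number).
import random

target = 0.42                                                  # stand-in for experimental data

def fitness(ind):                                              # (ii) compare with the data
    return -abs(ind - target)                                  # higher is better

population = [random.uniform(-1.0, 1.0) for _ in range(30)]    # (i) random initial population
for generation in range(100):
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[: len(ranked) // 2]                       # selection based on fitness
    offspring = [p + random.gauss(0.0, 0.05) for p in parents] # (iii) "mutation"
    population = parents + offspring                           # overlapping generations

best = max(population, key=fitness)
print(best, fitness(best))
```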

In the following sections, we first briefly review some fundamental concepts of DEC, and then describe the SR approach based on it, with particular reference to physical systems that can be described by a potential energy (variational formulation).

2.1. Elements of DEC

Exterior calculus (EC) deals with objects called differential forms and their manipulations (differentiation, integration) over smooth manifolds. Compared to classical tensor calculus using index notation, it provides a coordinate-independent description of physical models and unifies several differential operators, which results in more compact and readable equations that easily generalize to manifolds of different dimension [10]. DEC offers an inherently discrete counterpart to EC, designed to preserve some fundamental differential properties. In particular, in the context of DEC, cell complexes and cochains play the role of smooth manifolds and differential forms, respectively. Here, we provide a very brief introduction to DEC, geared toward the applications we will consider in section 3. More formal DEC definitions and notation are reported in appendix A. Without loss of generality, we limit the presentation to simplicial complexes. A more detailed treatment of DEC can be found in [10, 11, 20].

Henceforth, $\mathcal{E}^N$ is the embedding N-dimensional Euclidean point space, $\mathcal{V}$ is the space of translations associated with $\mathcal{E}^N$ and Lin is the space of linear maps from $\mathcal{V}$ to $\mathcal{V}$ (tensors).

From the physical modeling viewpoint, a simplicial complex provides the topological and geometrical structure that supports the physical phenomenon. A generic simplex is indicated with the letter σ and we use a pair of subscripts to identify each of them, such that $\sigma_{p,i}$ is the i-th p-simplex. Cochains are the discrete counterpart of differential forms, hence they are also known as discrete forms. Specifically, a cochain can describe a physical quantity (differential form) integrated on the corresponding simplex and is thus the proxy for a field. The discrete analogue of the exterior derivative in EC is represented by the coboundary d, which plays a key role in expressing balance statements [12] of physical models. A dual complex is needed to represent the behavior of a physical quantity in the neighborhood of a simplex (see section 4.2 in [12]); a role similar to that of the dual complex is played by, e.g. stencils in the finite difference method.

In addition to the topological operators introduced so far, metric-dependent ingredients are needed to complete the formulation of physical theories, especially as concerns constitutive equations. Indeed, these equations relate primal to dual variables [12, 14], so that it is convenient to introduce an isomorphism between primal and dual cochains, called the discrete Hodge star and denoted by $\star$. Finally, in continuum theories the L2 inner product expresses power expenditures by pairing power-conjugated quantities and thus represents potential energy terms in the variational models for physical systems that will be employed in our numerical experiments. In particular, it acts on a pair of primal/dual cochains and in the following is indicated by $\langle , \rangle$. This inner product also allows us to define the discrete codifferential δ as the adjoint of the coboundary with respect to it.

2.2. Symbolic regression

We tackle SR through genetic programming [21] (GP), an evolutionary technique that explores the space of symbolic models starting from an initial population and using genetic operations. The candidate models (individuals) are represented by expression trees, whose nodes can be either primitives (i.e. functions and operators from a prescribed pool) or terminals (i.e. variables and constants). Specifically, for the primitives we use the standard operations on scalars and the DEC operations introduced in section 2.1, see table B.1.

The initial population is generated using the ramped half-and-half method (minimum height = 2, maximum height = 5), which yields trees with a variety of sizes [22]. We constrain each initial individual to have length—the number of nodes of its tree—less than or equal to 100 and to contain the unknowns of the problem.
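A minimal DEAP-based sketch of this initialization step is shown below; the primitive set, the rejection criterion and the population size are illustrative and do not reproduce the actual alpine setup.

```python
# Sketch (not the alpine code): ramped half-and-half trees of height 2-5,
# kept only if they have at most 100 nodes and contain the unknown "u".
import operator
from deap import base, creator, gp, tools

pset = gp.PrimitiveSet("ENERGY", 2)            # two inputs: the unknown u and the source f
pset.renameArguments(ARG0="u", ARG1="f")
pset.addPrimitive(operator.add, 2)
pset.addPrimitive(operator.mul, 2)
pset.addTerminal(0.5)

creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", gp.PrimitiveTree, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register("expr", gp.genHalfAndHalf, pset=pset, min_=2, max_=5)
toolbox.register("individual", tools.initIterate, creator.Individual, toolbox.expr)

def admissible(ind):
    contains_u = any(getattr(node, "value", None) == "u" for node in ind)
    return len(ind) <= 100 and contains_u

population = []
while len(population) < 100:                   # 2000 individuals are used in the paper
    ind = toolbox.individual()
    if admissible(ind):
        population.append(ind)
```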

2.2.1. Model discovery strategy

The goal of the GP evolution is to maximize the following fitness measure

Equation (1)

where α is a scaling factor set equal to $\{1, 10, 1000\}$ for Poisson, Elastica and Linear elasticity, respectively, MSE is the mean-square error of the solution corresponding to the current individual I, R is a regularization term, and η is the associated hyperparameter. The regularization term aims at preventing overfitting, i.e. high fitness on the training dataset and low fitness on the validation dataset. To favor individuals with low complexity among those that accurately fit the training dataset, we define the regularization term as $R(I) := \text{length}(I)$.
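The following hedged sketch shows one plausible form of equation (1); the exact expression is given in the paper, and here we only assume that the fitness to be maximized is the negative of the scaled MSE plus the length penalty, which is consistent with the magnitudes reported later (e.g. a fitness magnitude of 0.9 for an exact Poisson model of length 9 with η = 0.1).

```python
# Assumed form of the fitness (a sketch, not the paper's exact equation (1)):
# fitness = -(alpha * MSE + eta * length(I)), to be maximized.
def fitness(mse: float, length: int, alpha: float = 1.0, eta: float = 0.1) -> float:
    return -(alpha * mse + eta * length)

print(fitness(mse=1e-9, length=9))    # ~ -0.9 with the Poisson settings
```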

Starting from the initial population, the program evolves it through the operations of crossover and mutation, similarly to genetic algorithms. Individuals eligible for crossover and mutation are chosen through a binary stochastic tournament selection, where the competing individuals have given survival probabilities, such that even the weakest individual has a chance of winning the tournament. In all the experiments we used one-point crossover and a mixed mutation operation chosen among uniform mutation, node replacement and shrink (see the DEAP library [28] for details on these operations). The two survival probabilities of the stochastic tournament and the three probabilities associated with the mixed mutation are hyperparameters tuned as part of the model selection.

As anticipated, the discrete mathematical framework introduced in the previous section naturally suggests a type system for the primitives of the GP procedure. For example, we cannot sum cochains of different dimensions. To enforce this typing, we use a variant of GP called Strongly typed GP (STGP), which generates and manipulates trees that are type-consistent. Crucially, in our approach the physical consistency of the generated expressions is guaranteed by the use of dimensionless quantities. To evolve the population, we use a $(\mu + \lambda)$ genetic algorithm with $\mu = \lambda$ and overlapping generations, such that the µ individuals with the highest fitness are selected (truncation selection) from the total population of parents and offspring [23].
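The sketch below illustrates how such a type system can be encoded with DEAP's strongly typed primitive sets; the cochain classes and primitive names are placeholders, not the dctkit/alpine types.

```python
# Sketch of a strongly typed primitive set: the coboundary maps primal
# 0-cochains to primal 1-cochains, and the inner product only pairs cochains
# of the same dimension, so type-inconsistent trees cannot be generated.
import operator
from deap import gp

class CochainP0: ...        # placeholder type: primal 0-cochain
class CochainP1: ...        # placeholder type: primal 1-cochain

pset = gp.PrimitiveSetTyped("ENERGY_DEC", [CochainP0, CochainP0], float)
pset.renameArguments(ARG0="u", ARG1="f")

pset.addPrimitive(lambda c: c, [CochainP0], CochainP1, name="dP0")            # coboundary (stub)
pset.addPrimitive(lambda a, b: 0.0, [CochainP0, CochainP0], float, name="InnP0")
pset.addPrimitive(lambda a, b: 0.0, [CochainP1, CochainP1], float, name="InnP1")
pset.addPrimitive(operator.add, [float, float], float, name="AddF")
pset.addPrimitive(operator.sub, [float, float], float, name="SubF")
pset.addTerminal(0.5, float)

# A tree like InnP1(dP0(u), u) is never produced: the second argument of
# InnP1 must be a 1-cochain, so pairing cochains of different dimensions is
# excluded by construction.
```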

2.2.2. Datasets

For each physical system analyzed in our experiments (section 3), we generate a dataset based either on the exact solution of the discrete model that we would like to recover or on the numerical solution of the corresponding continuous problem, for different values of a chosen load parameter. We split the dataset into three sets (double hold-out): training $(59\%)$, validation $(21\%)$ and test $(20\%)$. In the present context, training consists of model discovery and possibly of the fitting of dimensionless parameters with unknown values. Through a screening phase, we identify the hyperparameters that most affect the learning process and tune their values through a grid search to minimize the MSE on the validation set. More details on the grid search choices and results are reported in appendix C. With the best values resulting from the grid search (table 2), we perform 50 model discovery runs using the combination of the training and validation sets as the dataset. Finally, we assess the generalization capability of the best model found in the model discovery runs by evaluating its MSE on the test set.

Table 2. Values of the hyperparameters used in the numerical experiments of section 3. For mixed mutation, the probabilities refer to uniform mutation, node replacement and shrink, respectively. For stochastic tournament, the probabilities refer to the survival chance of the strongest and the weakest individual, respectively.

| Hyperparameter | Poisson | Elastica | Linear elasticity |
| Number of individuals (µ) | 2000 | 2000 | 2000 |
| Crossover/mutation probabilities | (0.2, 0.8) | (0, 1) | (0.2, 0.8) |
| Mixed mutation probabilities | (0.8, 0.2, 0) | (0.8, 0.2, 0) | (0.7, 0.2, 0.1) |
| Stochastic tournament probabilities | (0.7, 0.3) | (1, 0) | (0.7, 0.3) |
| Regularization factor (η) | 0.1 | 0.01 | 0 |

2.2.3. Variational models of physical systems

We model physical systems adopting a variational approach, such that each individual of the population that is evolved via SR corresponds to a discrete potential energy functional. A typical individual tree is represented in figure 3(B). Boundary conditions are enforced by adding a penalty term to the discrete energy. For the fitness evaluation, we first compute the minimum-energy solution of each model numerically. We assign the arbitrary value $10^5$ to the MSE of constant energies or of energies for which the minimization procedure does not converge. The minimum-energy solutions are then compared with those in the datasets to compute the fitness of the individuals according to equation (1).
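As an illustration of this step, the sketch below minimizes a toy quadratic discrete energy with a penalty on the boundary values, using JAX for the gradients and scipy's L-BFGS-B in place of the pygmo solver of section 2.3; the matrices and data are placeholders, not the dctkit cochain operators.

```python
# Sketch of step (ii): minimize a candidate discrete energy with Dirichlet
# boundary conditions enforced through a quadratic penalty term.
import numpy as np
import jax
import jax.numpy as jnp
from scipy.optimize import minimize

jax.config.update("jax_enable_x64", True)

n = 20
d = jnp.eye(n - 1, n, k=1) - jnp.eye(n - 1, n)          # toy 1D "coboundary" (edge differences)
f = 0.1 * jnp.ones(n)                                    # toy source cochain
boundary = jnp.zeros(n).at[0].set(1.0).at[-1].set(1.0)   # boundary-node mask
u_bc = jnp.zeros(n)                                      # prescribed boundary values
gamma = 1.0e3                                            # penalty weight

def energy(u):
    du = d @ u
    e = 0.5 * jnp.dot(du, du) - jnp.dot(f, u)            # toy candidate energy
    return e + gamma * jnp.sum(boundary * (u - u_bc) ** 2)

grad = jax.grad(energy)
res = minimize(lambda x: float(energy(jnp.asarray(x))),
               x0=np.zeros(n),
               jac=lambda x: np.asarray(grad(jnp.asarray(x))),
               method="L-BFGS-B")

u_data = np.zeros(n)                                     # placeholder for measured data
mse = float(np.mean((res.x - u_data) ** 2))              # error entering the fitness
```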

2.3. Implementation

We implement our framework in two open-source Python libraries: dctkit (Discrete Calculus Toolkit) [24] for the DEC concepts described in section 2.1 and alpine [25] for the SR. Specifically, dctkit leverages the JAX [26] library as a high-performance numerical backend for the manipulation of arrays and matrices encoding cochains and DEC operators. For the minimization of the discrete energies found during the SR process, we use the limited-memory Broyden–Fletcher–Goldfarb–Shanno (LBFGS) algorithm implemented in the pygmo [27] library, with the gradients evaluated via the automatic differentiation feature of JAX. The alpine library is based on the STGP algorithms provided by a custom version of the Distributed Evolutionary Algorithms in Python (DEAP) [28] library and on the distributed computing features of the ray library [29].

3. Results

In this section, we assess the effectiveness of our model discovery method by solving three problems: Poisson [30], Euler's elastica [31] and Linear elasticity [32]. In the Poisson and Linear elasticity problems, the goal is to recover the governing equations of the discrete model that is used to generate the dataset. In particular, by running the SR algorithm multiple times, these benchmarks allow us to evaluate the recovery rate, i.e. the percentage of runs in which the exact equations of the model used to generate the data were found. In Elastica, instead, we emulate a realistic scenario where experimental data are employed for model discovery.

The primitives used for the problems are reported in table B.1. Note that for Elastica we also use $\unicode{x1D7D9}$ as the identity dual zero-cochain and $\unicode{x1D7D9}_{\text{int}}$ as a primal zero-cochain that is identically 1 on the interior of the simplicial complex and 0 on the boundary nodes (see footnote). For all the problems, in addition to the primitives we employ the constant terminals $\{1/2, 2, -1\}$; for Linear elasticity we also add the constants $\{10, 0.1\}$.

3.1. Poisson equation in 2D

In the continuous setting, the Poisson equation can be obtained as the stationarity conditions of the Dirichlet functional [33]. Such a functional contains the L2 norm of the gradient of the unknown field. In the discrete setting, we represent the dimensionless unknown as a primal scalar-valued zero-cochain u, and application of the coboundary operator d leads to a primal scalar-valued one-cochain that is the discrete equivalent of the gradient of the continuous field integrated along the edges. Denoting by f the primal scalar-valued zero-cochain representing the (dimensionless) source, we can write the potential energy function as

Equation (2)

It can be shown [34] that the stationarity conditions of $\mathcal{E}_{\text{p}}$ correspond to the discrete Poisson equation. The benchmark consists in recovering equation (2) defined over a two-simplicial complex. In particular, a Delaunay triangulation of the unit square with 230 nodes was constructed using gmsh [35].

For the SR procedure, we generated 12 arrays representing the coefficients of the zero-cochain u by sampling the following scalar fields at the nodes of the mesh:

Equation (3)

Equation (4)

Equation (5)

The corresponding source cochains f were computed by applying the discrete Laplacian of equation (A.9) to these functions. The full dataset is made of the (u, f) pairs. The values of the functions $u_{i,j}$ evaluated at the boundary of the square domain were used to enforce Dirichlet boundary conditions during the minimization of each candidate energy produced in the SR. As the initial guess for the minimization procedures, we took the null zero-cochain.

As reported in table 2, the set of hyperparameter values includes regularization (η = 0.1). With such a set, we obtained a recovery rate of 66% in the model discovery runs. Notice that the model discovery process is generally effective, even when the exact model is not found, as shown by the decrease of the mean of the absolute value of the fitness of the best individual of each generation as the evolution proceeds (figure 2(A)).

Figure 2. Results for the Poisson problem. (A) Evolution of the mean and standard deviation of the absolute value of the best training fitness (i.e. fitness of the best individual of a generation evaluated on the training set) over the model discovery runs $(n = 50)$ (the first 34 generations have been removed from the plot, since the range of values of the fitness is very large). Comparison between the minimum-energy solutions (B), (D) and (F) corresponding to the best individual at the end of the model discovery stage evaluated for the source cochain associated to the test data $\{u_{2,3}, u_{1,3}, u_{1,0}\}$ of equation (3) (C), (E) and (G).

To assess the effect of regularization on model discovery, we performed 50 additional runs with the best combination of hyperparameters without regularization (η = 0) found in the grid search (see table C.1). In this case, the recovery rate dropped to 60%, while the length of the tree corresponding to the best individual found at the end of the SR procedure was $39.97\pm 15.53$, which is significantly larger than the length (9) of the shortest expression corresponding to the energy in equation (2). To compensate for such a growth in model complexity, we set η = 0.1 while keeping all the other hyperparameters fixed and ran 50 additional evolutions, seeding the initial population with the 50 individuals found at the end of the model discovery runs without regularization. After 20 generations, all runs converged (recovery rate = 100%) to equivalent expressions of the energy in equation (2), with lengths between 9 and 11 (fitness magnitudes equal to 0.9 and 1.1, respectively), see table 3 and figure 3(A).

Figure 3. (A) Evolution of the mean and standard deviation of the absolute value of the best training fitness over the final runs $(n = 50)$ with η = 0.1 and the initial population consisting of the best 50 individuals found in the runs with η = 0. Dashed horizontal lines mark the fitnesses of individuals having the shortest expressions that represent the energy in equation (2). (B) Tree representation of one of the individuals with the lowest absolute value of the training fitness (see table B.1 for the meaning of the primitives appearing in the tree).

Table 3. Discrete energies corresponding to the four best individuals obtained in the model discovery runs $(n = 50)$ with η = 0.1 and the initial population consisting of the best 50 individuals found in the model discovery with η = 0.

| # | Energy | Training fit. | Test fit. | Test MSE |
| 1 | $\langle u, \delta(d(u/2) - f\,) \rangle$ | 0.9 | 0.9 | $9.8 \cdot 10^{-10}$ |
| 2 | $\langle u, \delta(du) - 2f \rangle$ | 0.9 | 0.9 | $9.8 \cdot 10^{-10}$ |
| 3 | $\langle \star(\star u), \delta(du) - 2f \rangle$ | 1.1 | 1.1 | $9.8 \cdot 10^{-10}$ |
| 4 | $\langle du, du \rangle - \langle\, f, 2u \rangle$ | 1.1 | 1.1 | $9.8 \cdot 10^{-10}$ |

3.2. Euler's elastica

Unlike the previous problem, here we emulate a physical experiment involving an elastic cantilever rod subject to an end load by generating synthetic experimental data, instead of starting from an exact solution of a given discrete model. Specifically, we compute numerically the solution of a continuous model for different values of the load, we perturb it with noise and sample it at mesh nodes, to mimic experimental measurements. In the following, $\mathcal{I} = [0,L]\subset \mathbb{R}$ is a one-dimensional (1D) interval that represents the coordinates of the points of the longitudinal axis of the rod in its undeformed, straight configuration, L is the length of the rod, $\tilde{u}(\sigma)$ is the angle formed by the tangent to the axis of the rod and the horizontal direction at the dimensionless arc-length $\sigma\colon = s/L$. The axis of the rod is assumed to be inextensible and its curvature can be readily computed as $\tilde{u}^{\prime}$, where a prime denotes differentiation with respect to σ. For a constant cross-section rod subject to a vertical end-load at the right end and clamped at the left end, the boundary-value problem for the (dimensionless) continuous Euler's elastica equation reads [31]

Equation (6)

and the corresponding variational formulation is

Equation (7)

In equations (6) and (7), $f = PL^2/B$ is the dimensionless load parameter, where P is the vertical component of the load applied at the right end and B is the bending stiffness of the rod.

In the discrete setting, we represent the unknown of the problem as a dual scalar-valued zero-cochain u, where the value at each dual node is the angle formed by the corresponding primal edge and the horizontal direction. The discrete curvature k may be evaluated by first taking the difference between the angles of two consecutive edges ($d^\star u$), then transferring this information to the primal nodes ($\star d^\star u$) and finally considering only the internal nodes ($\unicode{x1D7D9}_{\text{int}} \odot \star {d}^\star u$), where k is meaningful. Hence, the discrete equivalent of (7) reads

Equation (8)

where $k = \unicode{x1D7D9}_{\text{int}} \odot \star {d}^\star u$. The 1D mesh for this problem was generated directly in code. Specifically, we generated a one-simplicial complex in $[0,1]$ consisting of 11 uniformly distributed nodes.
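The curvature reconstruction just described can be sketched in plain numpy for a uniform mesh; dctkit is replaced here by explicit array operations, and the angle cochain is a synthetic example.

```python
# Sketch of k = 1_int ⊙ ⋆ d⋆ u on a uniform one-simplicial complex: dual
# coboundary = differences of consecutive edge angles, Hodge star = division
# by the dual edge length h, then restriction to the interior primal nodes.
import numpy as np

n_nodes, L = 11, 1.0
h = L / (n_nodes - 1)
s_dual = (np.arange(n_nodes - 1) + 0.5) * h       # dual nodes = primal edge midpoints
u = 0.3 * np.sin(np.pi * s_dual)                  # toy angle cochain (dual 0-cochain)

d_star_u = np.diff(u)                             # dual coboundary: angle jumps
k = np.zeros(n_nodes)                             # primal 0-cochain
k[1:-1] = d_star_u / h                            # ⋆ d⋆ u, kept on interior nodes only

bending = 0.5 * np.sum(k[1:-1] ** 2 * h)          # rough discrete analog of ∫ u'^2/2 ds
```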

To generate the datasets, we considered a rod with length L = 1 m and bending stiffness $B \approx 7.854\, \textrm{N}\textrm{m}^2$. We computed numerically the solutions of the continuous problem equation (6) for ten different values of the force $P \in \{-5,-10,\ldots, -45, -50\}$ N and we sampled the (x, y) coordinates of the points of the deformed configurations at the nodes of the one-simplicial complex. Finally, we perturbed them with uniformly distributed noise (amplitude equal to ≈5% of the order of magnitude of the clean data) and we recovered the arrays of the coefficients of u. Linear approximations of these data were used as initial guesses for the minimization of the candidate energies found during the SR. The full dataset consists of the $(u,PL^2)$ pairs.

In a realistic experimental scenario, the load magnitude is prescribed, while the bending stiffness is an unknown material-dependent parameter that must be calibrated to reproduce the measurements. Hence, for each individual of the SR procedure the bending stiffness must be fitted on one solution $\bar{u}$ belonging to the training set. For this purpose, for each candidate energy $\mathcal{E}(u,f)$ we solved the constrained optimization problem

Equation (9)

Then, from the relation $B = PL^2/f$ we recovered the value of B and used it to evaluate the fitness of the individual over the whole training, validation and test sets, as described in section 2.2. In particular, taking the ground-truth model, $\mathcal{E} \equiv \mathcal{E}_{\text{el}}$, and solving the optimization problem we can identify a value $B_{\text{noise}} \approx 7.4062\, \textrm{N}\textrm{m}^2$ for the bending stiffness that accounts for the noise.
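A hedged sketch of this calibration step is shown below; `solve_energy`, `u_bar` and the load values are illustrative placeholders for the actual energy minimizer and training data, not the alpine routines.

```python
# For a candidate energy, fit the dimensionless load f on one training
# solution u_bar, then recover the bending stiffness from B = P L^2 / f.
import numpy as np
from scipy.optimize import minimize_scalar

def solve_energy(f):
    # placeholder: minimize the candidate discrete energy for the load
    # parameter f and return the resulting angle cochain
    return np.linspace(0.0, 0.1 * f, 10)

u_bar = np.linspace(0.0, -0.63, 10)               # one (noisy) training solution
P, L = -45.0, 1.0                                  # prescribed load [N] and rod length [m]

res = minimize_scalar(lambda f: np.sum((solve_energy(f) - u_bar) ** 2),
                      bounds=(-20.0, -1e-3), method="bounded")
f_fit = res.x
B = P * L**2 / f_fit                               # calibrated bending stiffness [N m^2]
```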

Both the average and the standard deviation of the fitness of the best individual of each generation over 50 model discovery runs decrease during the SR evolution (figure 4(A)) and attain the values 0.28 and 0.13, respectively, after 100 generations. Also, the selected models fit the data of the test set well (figures 4(B) and (C)). These observations confirm the effectiveness of the model discovery process along with the accuracy and generalization capabilities of the recovered models.

Figure 4. Results for the elastica problem. (A) Evolution of the mean and standard deviation of the absolute value of the best training fitness over the model discovery runs $(n = 50)$, along with the tree representation of the best individual (int_coch is $\unicode{x1D7D9}_{\text{int}}$). Comparison between the deformed configurations corresponding to the best individual at the end of the model discovery stage (blue curves), evaluated for $P = -45$ N (B) and $P = -10$ N (C), and the test data (red dots).

We notice that the first and the last models among the top four of table 4 are such that $B = B_{\text{noise}}/2h$, where h is the (primal) edge length. This relation exploits the uniform edge length of the simplicial complex representing the axis of the rod. Therefore, these models coincide with that of equation (8) up to addition of and/or multiplication by a constant.

Table 4. Discrete energies corresponding to the four best individuals obtained in the model discovery runs $(n = 50)$ of the Elastica problem using the final values of the hyperparameters reported in table 2.

| # | Energy | Training fit. | Test fit. | Test MSE | B |
| 1 | $\langle \star \unicode{x1D7D9}_{\text{int}}, (d^\star u)^2\rangle - \langle \sin u, f\unicode{x1D7D9}\rangle$ | 0.196 | 0.1939 | 0.0084 | 37.0312 Nm$^2$ |
| 2 | $\langle \arccos(-1) \sin u + \delta^\star(\sin(\sin(d^\star u))), u - f\unicode{x1D7D9} \rangle$ | 0.2123 | 0.2247 | 0.0075 | 7.1779 Nm$^2$ |
| 3 | $\langle u - f\unicode{x1D7D9}, \delta^\star (\sin(\sin(d^\star u))) + 1/\exp(-1)\sin u \rangle$ | 0.214 | 0.2168 | 0.0067 | 6.8262 Nm$^2$ |
| 4 | $\langle \star \unicode{x1D7D9}_{\text{int}}, (d^\star u)^2 \rangle - \langle f\unicode{x1D7D9}, \sin u \rangle + 1/2$ | 0.216 | 0.2139 | 0.0084 | 37.0312 Nm$^2$ |

3.3. Linear elasticity in 2D

Differently from the previous benchmarks, this case deals with a vector-valued unknown and thus proves the capability of our framework to handle another class of problems. We collect the dimensionless coordinates of the nodes of the two-simplicial complex representing the reference configuration of an elastic body in a primal vector-valued zero-cochain. We assume plane strain conditions. Let $\boldsymbol{e}_i, i = 1,2$ be the covariant basis for $\mathcal{V}$ made of two edge vectors of the reference configuration of a given two-simplex and let $\boldsymbol{e}^{\prime}_i$ be the covariant basis for $\mathcal{V}$ made of two edge vectors of the corresponding current configuration of the two-simplex. The contravariant bases are indicated with superscript indices. We represent the two-dimensional (2D) deformation gradient F as a dual (tensor-valued) zero-cochain whose coefficient on a given two-simplex is

Equation (10)

Then, we define the 2D, infinitesimal strain measure as a dual zero-cochain

Equation (11)

and by applying the constitutive equation for linearly elastic isotropic bodies we obtain the corresponding dimensionless (2D) stress as a dual zero-cochain

Equation (12)

where I is the identity dual zero-cochain and $(\mu, \lambda)$ are the Lamé moduli (here, we take $\lambda/\mu = 10$). Notice that the second term in the constitutive equation involves the coefficient-wise multiplication between a scalar-valued and a tensor-valued cochain.
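A plain-numpy sketch of equations (10)–(12) on a single two-simplex is given below; the vertex coordinates, the toy deformation and the nondimensionalization of the stress by μ are assumptions made for illustration.

```python
# Build F from the edge bases of one two-simplex, then the infinitesimal
# strain and the dimensionless stress (lambda/mu = 10 as in the benchmark).
import numpy as np

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])       # reference vertices
x = X + 0.05 * X @ np.diag([1.0, -0.2])                   # deformed vertices (toy)

E = np.column_stack((X[1] - X[0], X[2] - X[0]))           # reference edge basis e_i
e = np.column_stack((x[1] - x[0], x[2] - x[0]))           # current edge basis e'_i

F = e @ np.linalg.inv(E)                                   # deformation gradient, cf. eq. (10)
eps = 0.5 * (F + F.T) - np.eye(2)                          # strain, cf. eq. (11)
lam_over_mu = 10.0
S = 2.0 * eps + lam_over_mu * np.trace(eps) * np.eye(2)    # dimensionless stress, cf. eq. (12)
```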

In the case of zero body forces and assuming Dirichlet boundary conditions (prescribed positions of the nodes), the potential energy function reads

Equation (13)

The goal of this benchmark is to recover equation (13) as a function of F. The deformed configuration is given in terms of the current positions of the nodes that minimize equation (13). In principle, an admissible candidate energy $\mathcal{E}(\boldsymbol{F})$ must be frame-indifferent, i.e. $\mathcal{E}(\boldsymbol{F} + \boldsymbol{W}) = \mathcal{E}(\boldsymbol{F})$ for any skew-symmetric dual zero-cochain W. To enforce such a constraint, we added the penalty term $- (\mathcal{E}(\boldsymbol{F} + \tilde{\boldsymbol{W}}) - \mathcal{E}(\boldsymbol{F}))^2$ to the fitness in equation (1), where

Equation (14)

This addition is clearly necessary but not sufficient. In practice, we observed that all the energies obtained at the end of the model discovery runs are frame-indifferent.

We considered a two-simplicial complex (dimensionless reference configuration) generated as a Delaunay triangulation of the unit square with 142 nodes. The full dataset (training, validation and test sets) consists of 20 samples, each including the positions of the nodes in the deformed configurations associated to the problems of uniaxial tension and pure shear. For these problems, the (uniform) strain and stress cochains are:

  • Uniaxial tension:
    Equation (15)
    where $s = 2\epsilon + (\lambda/\mu)(\epsilon + \ell)$, $\ell = - (\lambda/\mu)\,\epsilon/(2 + (\lambda/\mu))$, and $\epsilon \in \{0.01, 0.02, \ldots, 0.1\}$;
  • Pure shear:
    Equation (16)
    where $\gamma \in \{0.01, 0.02, \ldots, 0.1\}$.

In both cases, the current positions of the nodes can be easily computed from the strain cochain and from equations (10) and (11). As initial guess for the energy minimization procedures associated to the evaluation of the fitnesses of the candidate energies, we took the reference positions of the nodes.
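For instance, the uniaxial-tension samples can be generated as sketched below; the node coordinates are a random stand-in for the actual mesh, and the transverse strain follows the expression given above.

```python
# Deformed node positions from a homogeneous uniaxial strain: x = (I + eps) X.
import numpy as np

lam_over_mu = 10.0

def uniaxial_positions(X_nodes, eps):
    ell = -lam_over_mu * eps / (2.0 + lam_over_mu)         # transverse strain
    strain = np.diag([eps, ell])
    return X_nodes @ (np.eye(2) + strain).T

X_nodes = np.random.default_rng(0).random((142, 2))        # stand-in for the 142 mesh nodes
samples = [uniaxial_positions(X_nodes, e) for e in np.linspace(0.01, 0.1, 10)]
```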

In the model discovery runs performed using the set of hyperparameters of table 2, we obtained a recovery rate of $92\%$. Nevertheless, the SR process is always very effective, even in the runs where the correct solution is not attained (figure 5(A)). Notice that the second best individual of table 5 equals $1/2 \, \mathcal{E}_\mathrm{LE}$. For the remaining energies reported in table 5, we can still prove their equivalence to $\mathcal{E}_\mathrm{LE}$ under specific hypotheses. In the case of homogeneous deformations ($\boldsymbol{F}(\star \sigma_{2,i}) = \tilde{\boldsymbol{F}}$ for any i)

Equation (17)

Equation (18)

Equation (19)

where A is the total area of the two-complex. Moreover, the following identities hold in general:

Equation (20)

Equation (21)

Equation (22)

Additionally, when A = 1 as in our benchmark, equations (17)–(19) provide the identities $\langle \boldsymbol{I}, \boldsymbol{F}^\mathrm{T} \rangle^2 = \langle\text{tr}(\boldsymbol{F})\boldsymbol{I}, \text{sym}(\boldsymbol{F})\rangle$ and $\langle \boldsymbol{I}, \boldsymbol{I} \rangle = 2$. These facts, together with equations (20)–(22), allow us to prove the aforementioned equivalences. Of course, some of these equivalences do not hold in general (i.e. when deformations are not homogeneous), so the recovered models correspond to different expressions of the potential energy, while having the same accuracy (in terms of MSE) with respect to the datasets. To resolve this ambiguity, we compute the associated energy densities by dividing the energies by A, since the deformations are homogeneous. We find that energies #1, #3 and #4 of table 5 are not admissible, because their energy densities depend linearly on the area A due to the term in equation (17). However, these expressions may be fixed by interpreting such a term as that in equation (18), so that they all correspond to the exact energy density that was used to generate the dataset. We remark that this a posteriori analysis is possible because our method produces interpretable models that are amenable to analysis.

Figure 5. Result for the Linear elasticity problem. (A) Evolution of the mean and standard deviation of the absolute value of the best training fitness over the model discovery runs $(n = 50)$. Comparison between the deformed configurations (B), (D) for $\epsilon \in \{0.01, 0.02\}$ (uniaxial tension) and (F), (H) for $\gamma \in \{0.06, 0.08\}$ (pure shear) corresponding to the best model at the end of the model discovery stage and the associated test data (C), (E) and (G), (I), respectively. The magnitude of the displacements has been scaled by a factor of 5 for clarity. Orange meshes represent the reference configurations.

Table 5. Discrete energies corresponding to the four best individuals obtained from the model discovery runs $(n = 50)$ of the Linear elasticity problem using the final values of the hyperparameters reported in table 2.

| # | Energy | Training MSE | Test MSE |
| 1 | $\langle \boldsymbol{I} - \boldsymbol{F}^\mathrm{T}, \boldsymbol{I} \rangle^2 - 0.1(\langle \boldsymbol{I} - \boldsymbol{F}^\mathrm{T}, \text{sym}(\boldsymbol{F}) - \boldsymbol{I} \rangle + \langle \boldsymbol{I} - \text{sym}(\boldsymbol{F}), \text{sym}(\boldsymbol{F}) - \boldsymbol{I} \rangle)$ | 0 | $1.1672 \cdot 10^{-16}$ |
| 2 | $\langle -0.5\boldsymbol{I} + 0.5 \,\text{sym}(\boldsymbol{F}), \text{sym}(\boldsymbol{F}) - \boldsymbol{I} + (\text{tr}(\boldsymbol{F}) - \text{tr}(\boldsymbol{I}))(2\text{tr}(\boldsymbol{I})\boldsymbol{I} + \boldsymbol{I}) \rangle$ | 0 | $2.0785 \cdot 10^{-16}$ |
| 3 | $\langle \boldsymbol{I} - \boldsymbol{F}^\mathrm{T}, \boldsymbol{I} - \text{sym}(\boldsymbol{F}) + 5\langle \boldsymbol{I} - \boldsymbol{F}, \boldsymbol{I} \rangle \boldsymbol{I} \rangle$ | 0 | $3.7309 \cdot 10^{-16}$ |
| 4 | $\langle \text{sym}(\boldsymbol{F}) - \boldsymbol{I}, \text{sym}(\boldsymbol{F}) - \boldsymbol{I} \rangle + 5 \langle \boldsymbol{I} - \boldsymbol{F}, \boldsymbol{I} \rangle^2$ | 0 | $8.7033 \cdot 10^{-16}$ |

We notice that traditional SR can be used when the deformation gradient and the stress are homogeneous. However, its performance is much worse than that of our method, because the former lacks tensor-based operations, which are needed to achieve a compact representation of the energy density. The results of the comparison between the two methods are reported in appendix D, along with an example of a genuine field problem in linear elasticity, which clearly cannot be solved via a traditional SR approach. We believe that these simple, yet effective examples demonstrate the advantage of the proposed approach in situations where conventional SR methods struggle to learn an accurate model.

4. Conclusions

In this work, we have introduced a new method to learn discrete energies of physical systems from experimental data, which exploits DEC as an expressive language for physical theories and the interpretability stemming from the SR approach. We have demonstrated its effectiveness by studying three classical problems in continuum physics. Our approach suggests that the synergy between geometry and SR leads to physically consistent models that generalize well. Further, it paves the way to the application of SR to the modeling of a broad class of physical systems.

4.1. Limitations and future work

We have developed our framework based on a special class of problems (variational and steady-state). This choice was motivated by the simpler implementation of the numerical routines compared to time-dependent, non-variational problems. To overcome this limitation, we will implement time-integration schemes and, regarding non-variational problems, we will use trees to represent the residual of the system of governing equations, instead of the energy functional.

Acknowledgments

Funded by the European Union (European Research Council (ERC), ALPS, 101039481). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the ERC Executive Agency. Neither the European Union nor the granting authority can be held responsible for them. Computational resources have been partially provided by DeiC National HPC (g.a. DeiC-AU-N1-2023030).

Data availability statement

The data cannot be made publicly available upon publication because they are not available in a format that is sufficiently accessible or reusable by other researchers. The data that support the findings of this study are available upon reasonable request from the authors [25].

Appendix A: Mathematical details on DEC

A.1. Simplicial complex

A p-simplex is the convex hull of p + 1 independent points in $\mathcal{E}^N$. A collection of simplices defines a simplicial complex K if it satisfies the following conditions:

  • every lower-order simplex of a given simplex in K is in K;
  • the intersection of any two simplices is either empty or a boundary element of both simplices.

A.2. Chains and cochains

A (scalar-valued) p-chain is a collection of values, one for each of the p-simplices of K. Each p-simplex corresponds to a basic chain $\sigma_{p,i}$ (i.e. a chain assigning 1 to a given simplex and 0 to the others) and we can use the same notation to indicate both the simplex and the corresponding basic chain. The group of p-chains is denoted by $\mathcal{C}_p(K, \mathbb{R})$. A p-chain $c_p$ can be written in a unique way as a finite sum of basic chains:

Equation (A.1)

where $n_p$ is the number of p-simplices of K and $c_p(\sigma_{p,i}) \in \mathbb{R}$. In computations, a chain can be represented by a vector where the ith entry is the value of the chain on the ith simplex.

A p-cochain is a linear functional that maps chains to scalars. Analogously to chains, we can represent a cochain $c^p$ in terms of basic cochains

Equation (A.2)

where $c^p(\sigma^{p,i})$ are coefficients to be chosen in an appropriate space and $\sigma^{p,i}$ is the ith basic p-cochain, i.e. the cochain that assigns 1 to the ith basic chain, which we can again identify with the ith simplex, and 0 to the others. The space of scalar-valued p-cochains on a simplicial complex K is denoted by $\mathcal{C}^p(K, \mathbb{R})$, while the spaces of vector-valued and tensor-valued cochains are denoted by $\mathcal{C}^p(K, \mathcal{V})$ and $\mathcal{C}^p(K, \text{Lin})$, respectively. We indicate any of these spaces with $\mathcal{C}^p(K, \cdot)$. We will use bold symbols for vector-valued and tensor-valued cochains and for their coefficients $\boldsymbol{c}^p(\sigma^p_i) \in \mathcal{V}$ and $\boldsymbol{C}^p(\sigma^p_i) \in \text{Lin}$, respectively. In computations, a cochain can be represented by a multi-dimensional array, where each row contains its value(s) associated to a given simplex.

A p-cochain can be readily evaluated on a p-chain by exploiting the duality between basic chains and basic cochains. We denote such an evaluation by the pairing

Equation (A.3)

with $\langle \langle \sigma^{p,i}, \sigma_{p,j} \rangle \rangle = \delta^i_j$. Notice that $\langle \langle c^p, \sigma_{p,i} \rangle \rangle = c^p(\sigma^{p,i})$ and $\langle \langle \sigma^{p,i}, c_p \rangle \rangle = c_p(\sigma_{p,i})$. This operation is a discrete preliminary to integration when regarding cochains as densities with respect to the measures imparted to cells by real-valued chains [16].

A.3. Boundary and coboundary

The boundary of a p-simplex is the union of its $(p-1)$-faces, that is, $(p-1)$-simplices consisting of subsets of the vertices of the p-simplex. Formally, the boundary operator is a linear map $\partial: \mathcal{C}_p(K, \mathbb{R}) \rightarrow \mathcal{C}_{p-1}(K, \mathbb{R})$ defined by

Equation (A.4)

where $\sigma_p = [\sigma_{0,0}, \ldots, \sigma_{0,p}]$ is a generic p-simplex and $[\sigma_{0,0}, \ldots, \hat{\sigma}_{0,i}, \ldots, \sigma_{0,p}]$ is the $(p-1)$-simplex obtained by omitting the ith vertex. The boundary operator encodes fundamental topological information about the simplicial complex and allows us to define the coboundary operator $d: \mathcal{C}^{p-1}(K, \mathbb{R}) \rightarrow \mathcal{C}^p(K, \mathbb{R})$ as its adjoint with respect to the pairing defined in equation (A.3):

Equation (A.5)

The boundary operator $\partial$ may be represented by the incidence matrix [12, 20], while the coboundary operator corresponds to its transpose. Notice that there is one (co)boundary matrix per simplex dimension.
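The sketch below illustrates these statements on a toy path complex (three nodes, two edges); the incidence matrix and cochain values are arbitrary examples, not tied to any particular mesh of the paper.

```python
# Boundary operator as a node-edge incidence matrix; the coboundary is its
# transpose, and the pairing-adjointness <du, c1> = <u, ∂c1> holds exactly.
import numpy as np

# rows: nodes 0, 1, 2; columns: edges [0,1], [1,2]; -1 at the tail, +1 at the head
boundary_1 = np.array([[-1,  0],
                       [ 1, -1],
                       [ 0,  1]], dtype=float)

u = np.array([3.0, 5.0, 6.0])            # primal 0-cochain (values at the nodes)
du = boundary_1.T @ u                     # coboundary: differences along the edges
print(du)                                 # [2. 1.]

c1 = np.array([1.0, -2.0])                # an arbitrary 1-chain
assert np.isclose(du @ c1, u @ (boundary_1 @ c1))
```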

A.4. Dual complex

Given a simplicial complex K of dimension n, we can define its dual $\star K$ by associating each p-simplex $\sigma_p \in K$ with its dual $(n-p)$-simplex $\star \sigma_{(n-p)}$, following a specific rule (see Remark 2.5.1 in [11]). In this work, we use the circumcentric (or Voronoi) dual. Once the orientation of the primal is chosen, that of the dual is induced, and the dual boundary and coboundary $d^\star$ account for such orientation (see [10, 11, 36] for more details).

A.5. Discrete Hodge star

The diagonal discrete Hodge star $\star: \mathcal{C}^p(K, \cdot) \rightarrow \mathcal{C}^{n-p}(\star K, \cdot)$ is defined as follows [10, 11]:

Equation (A.6)

Here, $\star \sigma^p$ is the basic dual $(n-p)$-cochain corresponding to σp .

A.6. Inner product, codifferential and discrete Laplacian

We define the L2 inner product on cochains as

Equation (A.7)

with $c_1^p, c^p_2 \in \mathcal{C}^p(K, \cdot)$, where the dot denotes the standard multiplication between scalars or the standard Euclidean inner product between vectors or tensors.

Further, with respect to this inner product, we may introduce the adjoint of the coboundary operator, the discrete codifferential operator δ, which maps primal scalar-valued p-cochains into primal scalar-valued $(p-1)$-cochains, as

Equation (A.8)

by requiring that $\langle d\alpha, \beta \rangle = \langle \alpha, \delta \beta \rangle$, for any $\alpha \in \mathcal{C}^{p-1}(K, \mathbb{R})$, $\beta \in \mathcal{C}^{p}(K, \mathbb{R})$. In the previous statement, we write $\star$ for both the Hodge star and its inverse, and $d^\star$ is the dual coboundary operator. The dual inner product and the dual codifferential $\delta^\star$ can be defined analogously.

In order to deal with the Poisson equation (see section 3.1), we introduce the discrete Laplace–de Rham operator:

Equation (A.9)

This is the discrete analogue of the Laplace–de Rham operator on differential forms, which coincides with minus the standard Laplacian of a scalar field [37].
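For a uniform one-dimensional complex, the operator can be assembled explicitly as sketched below (plain numpy in place of dctkit, with the diagonal Hodge stars of appendix A.5 and the usual sign convention); applied to a quadratic field it returns minus the standard Laplacian at the interior nodes, as stated above.

```python
# Discrete Laplace-de Rham operator on 0-cochains, assembled as
# star0^{-1} d^T star1 d for a uniform 1D complex; applied to u = x^2 it
# gives ~ -2 = -u'' at the interior nodes.
import numpy as np

n, L = 6, 1.0
h = L / (n - 1)

d = np.zeros((n - 1, n))                      # coboundary on 0-cochains
for i in range(n - 1):
    d[i, i], d[i, i + 1] = -1.0, 1.0

star1 = np.eye(n - 1) / h                     # |dual node| / |primal edge| = 1/h
dual_lengths = np.full(n, h)
dual_lengths[[0, -1]] = h / 2                 # boundary nodes own half a dual cell
star0 = np.diag(dual_lengths)

laplacian = np.linalg.inv(star0) @ d.T @ star1 @ d

x = np.linspace(0.0, L, n)
print(laplacian @ (x ** 2))                   # interior entries ~ -2
```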

A.7. Cochain functions

Finally, we extend functions operating on scalar fields (such as $\sin, \cos, \exp, \log$) to cochains by component-wise application. In particular, the component-wise multiplication between scalar-valued cochains will be denoted by the symbol $\odot$.

Appendix B: List of primitives for the SR used in the numerical experiments

Table B.1. List of primitives used in the experiments of section 3. Add: +; Sub: −; MulF: · ; Div: $\div$. SinF: sin; ArcsinF: arcsin; CosF: cos; ArccosF: arccos; ExpF: exp; LogF: log; InvF: inversion of a float. SqrtF: square root; SquareF: square. d: coboundary; del: codifferential; Sin: sin of a cochain; Arcsin: arcsin of a cochain; Cos: cos of a cochain; Arccos: arccos of a cochain; Exp: exp of a cochain; Log: log of a cochain. St: Hodge star; InvSt: inverse Hodge star. Sqrt: square root of a cochain; Square: square of a cochain. tran: transpose of a tensor-valued cochain; sym: symmetric part of a tensor-valued cochain. tr: trace of a tensor-valued cochain. Mul: multiplication between a scalar and a cochain. Mulv: multiplication between a scalar-valued cochain and a tensor-valued cochain. InvMul: multiplication between the inverse of a scalar and a cochain. Inn: inner product between cochains of the same type (either primal or dual). AddC: addition of cochains; SubC: subtraction of cochains. CochMul: component-wise cochain multiplication. In all the operations on cochains X = (P,D), where P means 'primal' and D means 'dual'; J = (0, ... ,n); R = (S,T), where S means 'scalar-valued' and T means 'tensor-valued'.

| Primitive names | Input types | Return type | Poisson | Elastica | Linear elasticity |
| Add, Sub, MulF, Div | (float, float) | float | $\checkmark$ | $\checkmark$ | $\checkmark$ |
| SinF, ArcsinF, CosF, ArccosF, ExpF, LogF, InvF | float | float | $\checkmark$ | $\checkmark$ |  |
| SqrtF, SquareF | float | float | $\checkmark$ | $\checkmark$ | $\checkmark$ |
| dXJS, delXJS, SinXJS, ArcsinXJS, CosXJS, ArccosXJS, ExpXJS, LogXJS | CochainXJS | CochainXJS | $\checkmark$ | $\checkmark$ |  |
| StJR, InvStJR | CochainXJR | CochainXJR | $\checkmark$ | $\checkmark$ | $\checkmark$ |
| SqrtXJR, SquareXJR | CochainXJR | CochainXJR | $\checkmark$ | $\checkmark$ |  |
| tranXJT, symXJT | CochainXJT | CochainXJT |  |  | $\checkmark$ |
| trXJT | CochainXJT | CochainXJS |  |  | $\checkmark$ |
| MulXJR | (CochainXJR, float) | CochainXJR | $\checkmark$ | $\checkmark$ | $\checkmark$ |
| MulvXJ | (CochainXJS, CochainXJT) | CochainXJT |  |  | $\checkmark$ |
| InvMulXJS | (CochainXJS, float) | CochainXJS | $\checkmark$ | $\checkmark$ |  |
| InnXJR | (CochainXJR, CochainXJR) | float | $\checkmark$ | $\checkmark$ | $\checkmark$ |
| AddCXJR, SubCXJR | (CochainXJR, CochainXJR) | CochainXJR | $\checkmark$ | $\checkmark$ | $\checkmark$ |
| CochMulXJS | (CochainXJS, CochainXJS) | CochainXJS | $\checkmark$ | $\checkmark$ |  |

Appendix C: Details on the selection of the hyperparameters for the SR

We report in table C.1 the results of the grid search carried out for each problem. For Poisson and Linear elasticity, we used the recovery rate as the performance measure, while for Elastica we took the MSE on the validation set, since the data are noisy. For each combination of hyperparameters we performed five runs. In particular, for Linear elasticity, we selected the set of hyperparameters among those (5) leading to a $100\%$ recovery rate in the grid search, after carrying out 50 final model discovery runs. We report in table 2 the set of hyperparameters corresponding to the best recovery rate.

Table C.1. Grid search results. Here, MM = mixed mutation, ST = stochastic tournament. In bold, the best validation scores for the three problems.

| N. individuals | (Crossover, mutation) prob. | MM prob. | ST prob. | η (Poisson / Elastica / Lin. el.) | Poisson: validation MSE (recovery) | Elastica: validation MSE | Linear elasticity: validation MSE (recovery) |
| 1000 | (0, 1) | (0.7, 0.2, 0.1) | (1, 0) | 0.1 / 0.05 / 0.01 | 5.0714 ± 4.5954 (20%) | 0.0784 ± 0.0999 | 1.2424 $\cdot 10^{-5}$ ± 1.1649 $\cdot 10^{-5}$ (40%) |
| 1000 | (0, 1) | (0.8, 0.2, 0) | (1, 0) | 0.1 / 0.05 / 0.01 | 4.2284 ± 3.8019 (20%) | 0.0172 ± 0.0118 | 8.4039 $\cdot 10^{-6}$ ± 9.7148 $\cdot 10^{-6}$ (20%) |
| 2000 | (0, 1) | (0.7, 0.2, 0.1) | (1, 0) | 0.1 / 0.05 / 0.01 | 3.0376 ± 3.4102 (20%) | 0.0802 ± 0.1385 | 1.3233 $\cdot 10^{-5}$ ± 8.9749 $\cdot 10^{-5}$ (40%) |
| 2000 | (0, 1) | (0.8, 0.2, 0) | (1, 0) | 0.1 / 0.05 / 0.01 | 0.4954 ± 0.4255 (40%) | 0.0197 ± 0.0121 | 5.6222 $\cdot 10^{-6}$ ± 4.9649 $\cdot 10^{-6}$ (0%) |
| 1000 | (0.2, 0.8) | (0.7, 0.2, 0.1) | (1, 0) | 0.1 / 0.05 / 0.01 | 1.7346 ± 3.4561 (60%) | 0.2475 ± 0.2159 | 4.3172 $\cdot 10^{-5}$ ± 7.8832 $\cdot 10^{-5}$ (60%) |
| 1000 | (0.2, 0.8) | (0.8, 0.2, 0) | (1, 0) | 0.1 / 0.05 / 0.01 | 0.0250 ± 0.0501 (80%) | 0.1190 ± 0.1741 | 1.6539 $\cdot 10^{-5}$ ± 1.0686 $\cdot 10^{-5}$ (0%) |
| 2000 | (0.2, 0.8) | (0.7, 0.2, 0.1) | (1, 0) | 0.1 / 0.05 / 0.01 | 1.5245 ± 2.6260 (60%) | 0.1049 ± 0.1182 | 5.3876 $\cdot 10^{-6}$ ± 6.7293 $\cdot 10^{-6}$ (60%) |
| 2000 | (0.2, 0.8) | (0.8, 0.2, 0) | (1, 0) | 0.1 / 0.05 / 0.01 | 0.0947 ± 0.1888 (60%) | 0.0228 ± 0.0208 | 5.4618 $\cdot 10^{-6}$ ± 6.8377 $\cdot 10^{-6}$ (40%) |
| 1000 | (0, 1) | (0.7, 0.2, 0.1) | (1, 0) | 0 / 0.01 / 0 | 2.3072 ± 2.6184 (20%) | 0.1752 ± 0.1992 | 6.8078 $\boldsymbol{\cdot 10^{-12}}$ ± 1.1160 $\boldsymbol{\cdot 10^{-12}}$ (100%) |
| 1000 | (0, 1) | (0.8, 0.2, 0) | (1, 0) | 0 / 0.01 / 0 | 1.1878 ± 0.9544 (20%) | 0.1896 ± 0.2186 | 1.4765 $\cdot 10^{-5}$ ± 2.8894 $\cdot 10^{-5}$ (20%) |
| 2000 | (0, 1) | (0.7, 0.2, 0.1) | (1, 0) | 0 / 0.01 / 0 | 0.5699 ± 0.6354 (20%) | 0.2047 ± 0.2529 | 6.0695 $\boldsymbol{\cdot 10^{-15}}$ ± 1.1909 $\boldsymbol{\cdot 10^{-15}}$ (100%) |
| 2000 | (0, 1) | (0.8, 0.2, 0) | (1, 0) | 0 / 0.01 / 0 | 0.2582 ± 0.3603 (40%) | 0.0583 ± 0.0787 | 4.1890 $\cdot 10^{-5}$ ± 8.3782 $\cdot 10^{-5}$ (80%) |
| 1000 | (0.2, 0.8) | (0.7, 0.2, 0.1) | (1, 0) | 0 / 0.01 / 0 | 1.6456 ± 2.9635 (60%) | 0.0711 ± 0.1150 | 1.7215 $\cdot 10^{-4}$ ± 3.1495 $\cdot 10^{-4}$ (40%) |
| 1000 | (0.2, 0.8) | (0.8, 0.2, 0) | (1, 0) | 0 / 0.01 / 0 | 0.4365 ± 0.4468 (40%) | 0.0262 ± 0.0214 | 2.5231 $\cdot 10^{-8}$ ± 5.0461 $\cdot 10^{-8}$ (80%) |
| 2000 | (0.2, 0.8) | (0.7, 0.2, 0.1) | (1, 0) | 0 / 0.01 / 0 | 1.1467 ± 1.9329 (60%) | 0.0795 ± 0.1396 | 8.0971 $\cdot 10^{-4}$ ± 1.4177 $\cdot 10^{-4}$ (40%) |
| 2000 | (0.2, 0.8) | (0.8, 0.2, 0) | (1, 0) | 0 / 0.01 / 0 | 0.0474 ± 0.0947 (80%) | 0.0157 ± 0.0073 | 3.7003 $\cdot 10^{-8}$ ± 7.3999 $\cdot 10^{-8}$ (80%) |
| 1000 | (0, 1) | (0.7, 0.2, 0.1) | (0.7, 0.3) | 0.1 / 0.05 / 0.01 | 1.4862 ± 2.2846 (60%) | 0.0496 ± 0.0681 | 1.6212 $\cdot 10^{-4}$ ± 3.0510 $\cdot 10^{-4}$ (40%) |
| 1000 | (0, 1) | (0.8, 0.2, 0) | (0.7, 0.3) | 0.1 / 0.05 / 0.01 | 2.0693 ± 2.0785 (20%) | 0.1094 ± 0.1353 | 4.1410 $\cdot 10^{-5}$ ± 7.4200 $\cdot 10^{-5}$ (40%) |
| 2000 | (0, 1) | (0.7, 0.2, 0.1) | (0.7, 0.3) | 0.1 / 0.05 / 0.01 | 1.3861 ± 2.3857 (60%) | 0.1339 ± 0.1521 | 2.5238 $\boldsymbol{\cdot 10^{-14}}$ ± 2.7976 $\boldsymbol{\cdot 10^{-14}}$ (100%) |
| 2000 | (0, 1) | (0.8, 0.2, 0) | (0.7, 0.3) | 0.1 / 0.05 / 0.01 | 0.3846 ± 0.2734 (0%) | 0.2299 ± 0.3228 | 9.0010 $\cdot 10^{-6}$ ± 6.1403 $\cdot 10^{-6}$ (20%) |
| 1000 | (0.2, 0.8) | (0.7, 0.2, 0.1) | (0.7, 0.3) | 0.1 / 0.05 / 0.01 | 3.3625 ± 3.6303 (40%) | 0.0497 ± 0.0399 | 1.4959 $\cdot 10^{-5}$ ± 1.0025 $\cdot 10^{-5}$ (20%) |
| 1000 | (0.2, 0.8) | (0.8, 0.2, 0) | (0.7, 0.3) | 0.1 / 0.05 / 0.01 | 1.8037 ± 3.1416 (40%) | 0.0804 ± 0.0770 | 6.8314 $\cdot 10^{-6}$ ± 5.5778 $\cdot 10^{-6}$ (40%) |
| 2000 | (0.2, 0.8) | (0.7, 0.2, 0.1) | (0.7, 0.3) | 0.1 / 0.05 / 0.01 | 1.9301 ± 2.6795 (60%) | 0.0663 ± 0.0615 | 6.1977 $\cdot 10^{-6}$ ± 7.5906 $\cdot 10^{-6}$ (60%) |
| 2000 | (0.2, 0.8) | (0.8, 0.2, 0) | (0.7, 0.3) | 0.1 / 0.05 / 0.01 | 0.000 ± 0.000 (100%) | 0.0660 ± 0.0593 | 7.7641 $\cdot 10^{-6}$ ± 6.4927 $\cdot 10^{-6}$ (40%) |
| 1000 | (0, 1) | (0.7, 0.2, 0.1) | (0.7, 0.3) | 0 / 0.01 / 0 | 2.1301 ± 2.4017 (20%) | 0.3302 ± 0.3987 | 4.5915 $\cdot 10^{-7}$ ± 9.1827 $\cdot 10^{-7}$ (80%) |
| 1000 | (0, 1) | (0.8, 0.2, 0) | (0.7, 0.3) | 0 / 0.01 / 0 | 2.9042 ± 2.7186 (20%) | 0.0588 ± 0.0917 | 7.7295 $\cdot 10^{-7}$ ± 1.1600 $\cdot 10^{-6}$ (60%) |
| 2000 | (0, 1) | (0.7, 0.2, 0.1) | (0.7, 0.3) | 0 / 0.01 / 0 | 0.0450 ± 0.0901 (80%) | 0.2218 ± 0.3032 | 5.0241 $\cdot 10^{-6}$ ± 9.6092 $\cdot 10^{-6}$ (40%) |
| 2000 | (0, 1) | (0.8, 0.2, 0) | (0.7, 0.3) | 0 / 0.01 / 0 | 0.3978 ± 0.5296 (60%) | 0.1401 ± 0.2211 | 9.0573 $\boldsymbol{\cdot 10^{-16}}$ ± 4.3205 $\boldsymbol{\cdot 10^{-16}}$ (100%) |
| 1000 | (0.2, 0.8) | (0.7, 0.2, 0.1) | (0.7, 0.3) | 0 / 0.01 / 0 | 0.8872 ± 1.3022 (40%) | 0.0772 ± 0.1239 | 2.6749 $\boldsymbol{\cdot 10^{-11}}$ ± 5.3488 $\boldsymbol{\cdot 10^{-11}}$ (100%) |
| 1000 | (0.2, 0.8) | (0.8, 0.2, 0) | (0.7, 0.3) | 0 / 0.01 / 0 | 2.4462 ± 2.9052 (40%) | 0.0276 ± 0.0267 | 4.3438 $\cdot 10^{-7}$ ± 8.6875 $\cdot 10^{-7}$ (80%) |
| 2000 | (0.2, 0.8) | (0.7, 0.2, 0.1) | (0.7, 0.3) | 0 / 0.01 / 0 | 0.1696 ± 0.3391 (80%) | 0.1049 ± 0.1089 | 4.9220 $\boldsymbol{\cdot 10^{-15}}$ ± 8.6064 $\boldsymbol{\cdot 10^{-15}}$ (100%) |
| 2000 | (0.2, 0.8) | (0.8, 0.2, 0) | (0.7, 0.3) | 0 / 0.01 / 0 | 0.6313 ± 1.2626 (80%) | 0.0282 ± 0.0111 | 2.9903 $\cdot 10^{-7}$ ± 5.9782 $\cdot 10^{-7}$ (80%) |

Appendix D: Comparison between our method and traditional GP on the linear elasticity benchmark

As discussed in section 3.3, the dataset used for Linear elasticity only consists of homogeneous deformations. Hence, a traditional genetic programming-based SR offers an alternative strategy to learn equation (13), where the energy is a function of the scalar variables $F_{11}, F_{12}, F_{21}, F_{22}$, which represent the components of the homogeneous deformation gradient. We compared our method with such an approach, where we used addition, subtraction, multiplication, division, square and square root as primitives and did not enforce any strong typing. After choosing the best set of hyperparameters through a grid search, we performed 50 model discovery runs to assess the performance of the traditional approach (figure D.1). Notably, the recovery rate dramatically drops to $4\%$, since in the traditional SR tensorial operations (such as the trace and the symmetric part) must also be learned from component representations.

Figure D.1. Result for the linear elasticity problem using traditional genetic programming. (A) Evolution of the mean and standard deviation of the absolute value of the best training fitness over the model discovery runs $(n = 50)$. Comparison between the deformed configurations (B), (D) for $\epsilon \in \{0.01, 0.02\}$ (uniaxial tension) and (F), (H) for $\gamma \in \{0.06, 0.08\}$ (pure shear) corresponding to the best model at the end of the model discovery stage and the associated test data (C), (E) and (G), (I), respectively. The magnitude of the displacements has been scaled by a factor of 5 for clarity. Orange meshes represent the reference configurations.

Also, the traditional SR approach cannot be applied without introducing additional operators (such as discrete integration/differentiation operations) whenever the deformations are non-homogeneous. Hence, our approach provides a distinct advantage over traditional SR in the case of field problems. To better elucidate this aspect, we studied a genuine field problem, where a body force induces non-homogeneous deformations. In this case, equation (13) becomes

Equation (D.1)

where u and f are the primal vector-valued zero-cochains representing the (dimensionless) displacements of the primal nodes and the body force at each primal node, respectively.

We generated ten data samples where u is quadratic (constant body force) and given by:

Equation (D.2)

where $a \in \{0.01, 0.02, \ldots, 0.1\}$ is a dimensionless constant, $(x_{i,1}, x_{i,2})$ are the (normalized) Cartesian coordinates of the i-th node, and $\hat{\boldsymbol{e}}_i$ are the corresponding orthonormal basis vectors. The associated body forces f are computed by taking the negative of the (discrete) divergence of the stress tensor S, with the deformations and stresses related by equations (10)–(12). The full dataset (training, validation and test) consists of the ten newly generated $(\boldsymbol{X}, \boldsymbol{f})$ pairs, where X is the matrix of the current positions of the nodes, and the ten pure tension samples of section 3.3.

We performed 50 model discovery runs using the best set of hyperparameters found for Linear elasticity; the results are reported in figure D.2. The recovery rate dropped to $62\%$, probably because these hyperparameters were not re-optimized for the new dataset.

Figure D.2. Result for the Linear elasticity problem using a dataset containing non-homogeneous deformations. (A) Evolution of the mean and standard deviation of the absolute value of the best training fitness over the model discovery runs $(n = 50)$. Comparison between the deformed configurations (B), (D) for $\epsilon \in \{0.01, 0.02\}$ (uniaxial tension) and (F), (H) for $a \in \{0.03, 0.04\}$ (non-homogeneous deformations, see equation (D.2)) corresponding to the best model at the end of the model discovery stage and the associated test data (C), (E) and (G), (I), respectively. The magnitude of the displacements has been scaled by a factor of 5 for clarity. Orange meshes represent the reference configurations.

Footnotes

  • This primitive is needed to sample the curvature of the rod axis at the interior nodes of the primal complex, since it is not well-defined at the boundary nodes. Curvature is a fundamental ingredient of the elastic energy of a rod [31].
