Physics-informed Deep Learning for Simultaneous Surrogate Modelling and PDE-constrained Optimization

We model the flow around an airfoil with a physics-informed neural network (PINN) while simultaneously optimizing the airfoil geometry to maximize its lift-to-drag ratio. The parameters of the airfoil shape are provided as inputs to the PINN and the multidimensional search space of shape parameters is populated with collocation points to ensure that the Navier–Stokes equations are approximately satisfied everywhere in the search space. We use the fact that the PINN is automatically differentiable to calculate gradients of the lift-to-drag ratio with respect to the parameter values. This allows us to use the L-BFGS gradient-based optimization algorithm, which is more efficient than non-gradient-based algorithms. We train the PINN with adaptive sampling of collocation points, such that the accuracy of the solution is enhanced along the optimization trajectory. We demonstrate this method on two examples: one that optimizes a single parameter, and another that optimizes eleven parameters. The method is successful and, by comparison with conventional CFD, we find that the velocity and pressure fields have small pointwise errors and that the method converges to optimal parameters. We find that different PINNs converge to slightly different parameters, reflecting the fact that there are many closely-spaced local optima during optimization. The PINN can also rapidly and accurately predict flow fields for any parameter values within our design space and offers a simple, powerful alternative to surrogate models trained on data. This method can be applied relatively easily to other optimization problems and avoids the difficult process of writing adjoint codes. As knowledge about how to train PINNs improves and hardware dedicated to neural networks becomes faster, this method of simultaneous training and optimization with PINNs could become easier and faster than using adjoint codes.


Introduction
A partial differential equation (PDE) solver maps from a set of parameters (e.g. airfoil shape, Reynolds number, Mach number) to a set of field variables (e.g. velocity, pressure, temperature), and optionally to a performance measure (e.g. lift-to-drag ratio). For some applications, knowledge of the field variables is sufficient. Often, however, the user would like to optimize the performance measure by changing the parameters, for example when optimizing an airfoil shape for drag minimization [1], acoustic metamaterial geometry for sound wave control [2], acoustic horn shape [3], or current-carrying multi-cables [4]. Optimization could be achieved with a non-gradient-based optimization method, but only after a large number of direct evaluations of the performance measure. If there are many parameters, the number of direct evaluations becomes prohibitively large. Gradient-based optimization is more efficient. For this, the gradient of the performance measure is calculated with respect to the field variables, whose gradients are in turn calculated with respect to the parameters. In a single calculation, the chain rule of differentiation then gives the gradient of the performance measure with respect to all parameters. This gradient information is then used to converge efficiently to a local optimum.
The calculation of the gradient either requires a new code to solve for adjoint variables or requires that the original PDE solver be differentiable. Most, if not all, PDE solvers need significant adaptation to be differentiable. In many cases, particularly for commonly used PDE solvers, this adaptation is impractical. Neural networks (NNs), on the other hand, are differentiable by design. Usually their performance measure is the discrepancy between a measurement and the NN's prediction, given its current parameters. The gradient of this discrepancy with respect to the parameters is calculated automatically and is then combined with a gradient descent algorithm in order to reduce this discrepancy. Being universal function approximators, NNs can be trained to be surrogate models for a PDE solver [5]. Further, they can be constructed to calculate the performance measure, together with all its gradients with respect to the parameters. In this paper we investigate whether we can train a neural network to be a surrogate model for a PDE solver, while simultaneously using the fact that the NN is automatically differentiable to solve an optimization problem for a performance measure that is constrained by that PDE. We choose shape optimization of an airfoil because this is particularly challenging to achieve with standard mesh-based PDE solvers. The method we investigate is general and could be applied widely to PDE-constrained optimization problems.

PDE-constrained Shape optimization
Many engineering problems require solutions of PDEs on complex geometries, for which there is no analytical solution. Numerical solutions are therefore sought, often using finite difference, finite element, or finite volume methods, which require the computational domain to be meshed. Some meshless methods exist, but they have limitations. For example, spectral methods require simple geometries, smoothed particle hydrodynamics (SPH) suffers from low convergence rates and a lack of turbulence modeling [6], Element Free Galerkin (EFG) methods need a background mesh to evaluate the stiffness matrix and field variables at nodal locations [7], and the Reproducing Kernel Particle Method (RKPM) is not applicable to time-dependent functions because it requires symmetric kernels and is therefore limited to spatial functions and their derivatives [8]. Most PDE solvers therefore use meshes, with the field variables evaluated at points or within regions on the mesh.
Gradient-based optimization methods require the gradients of the field variables and/or performance measure to be calculated with respect to the PDE's parameters. In the continuous adjoint approach (also known as "optimize then discretize"), the gradient of the performance measure with respect to the parameters is calculated by constructing a constrained optimization problem with a Lagrangian functional, and re-arranging it to form the continuous adjoint PDEs [9,10,11]. These are usually solved on the same mesh as the direct PDEs. The continuous adjoint approach works, but requires a new code with new boundary conditions, which tends to have different stability properties to the direct code. In the discrete adjoint approach (also known as "discretize then optimize"), the gradient of the performance measure with respect to the parameters is calculated through successive applications of the chain rule of differentiation, through the intermediate steps in the solution of the direct PDE. One way to calculate the derivatives at the intermediate steps is with symbolic differentiation, using software such as Maple, Maxima, Mathematica, or Theano. This creates analytical derivatives, which give valuable insight but are inefficient at runtime [12], particularly because the derivative expressions are usually much larger than the original expressions. Alternatively, these gradients can be calculated with automatic differentiation, using software such as Tapenade and JAX. This computes numerical values of derivatives using symbolic rules of differentiation, as before. However, it calculates and stores these derivative values only around the solution to the PDE, rather than as general symbolic representations of the derivatives around any solution to the PDE. Automatic differentiation is therefore quicker than symbolic differentiation and ideally suited to gradient-based optimization, for which the only gradient required is that around the solution to the PDE. The discrete adjoint approach works well, but requires a direct code that has been designed to be differentiable.
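For illustration only, the following minimal Python sketch shows the principle behind automatic differentiation using forward-mode dual numbers: derivative values are propagated numerically through the same operations that compute the function, rather than being built up as symbolic expressions. (Reverse mode, used by adjoint codes and neural-network frameworks, propagates derivatives in the opposite direction; this toy example is not part of the paper's toolchain.)

```python
class Dual:
    """Minimal forward-mode automatic differentiation with dual numbers."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot   # function value and derivative value
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule applied to numbers, not to symbolic expressions
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__

def f(x):
    return 3 * x * x + 2 * x + 1   # f'(x) = 6x + 2

x = Dual(2.0, 1.0)   # seed the derivative dx/dx = 1
y = f(x)
print(y.val, y.dot)  # 17.0 14.0
```

Only the derivative at x = 2 is computed and stored, which is exactly what a gradient-based optimizer needs.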
The challenges of writing differentiable codes for mesh-based PDE solvers are exacerbated when the boundaries change shape, because this causes the mesh nodes to move. In mesh-based PDE solvers, the field variables are held at points on the mesh. For accurate derivatives in the discrete adjoint approach, one therefore needs to calculate the gradients of the field variables with respect to the mesh positions, as well as with respect to the parameters. With either approach, one also has to introduce regularization (smoothing) of the boundary in order to avoid oscillations. There are two popular methods to model boundary deformation: free-form deformation [13,14] and the level set method [15,16], both of which require regularization. Further, if the boundary deforms significantly, the mesh quality degrades and frequent re-meshing is required [17]. These challenges can be overcome but, again, significant code development is required. The challenges described above can be avoided by using a mesh-free surrogate model to solve the PDE approximately. One option is to train a surrogate model on the field variables calculated with solutions from a PDE solver at several different parameter values [18]. This surrogate model is then used for optimization because it is cheaper than the PDE solver. This, however, requires a large amount of training data. For instance, Chen et al. trained a deep neural network with high-fidelity datasets to build a surrogate model for 2D laminar flow around airfoils [19]. They then optimized the airfoil geometry to minimize drag. A different option, explored in this paper, is to construct a surrogate model that satisfies the PDE at a cloud of points in the domain, and to optimize simultaneously as the model is trained.

Physics-informed Neural Networks
Deep neural networks are general models that can be trained to solve scientific computing problems [20,21]. A typical deep neural network consists of a multi-layer stack of simple modules. Each module computes a nonlinear input-output mapping, and this transformation increases both the selectivity and the invariance of the representation [22]. With multiple nonlinear layers, a deep learning model can approximate complicated functions [23] and, with proper training, can act as a universal function approximator [24]. The optimal parameters of a deep neural network are identified by minimizing a loss function, which quantifies the difference between the model and the expected outputs. Gradient-based optimization is used, which requires the gradients of the loss function with respect to the neural network's parameters. These are computed automatically with the backpropagation algorithm [25], which is more efficient than alternatives such as the finite-difference method [26,12]. Note that backpropagation is a subset of reverse-mode automatic differentiation.
Neural networks are usually trained on labelled data (i.e., inputs and their expected outputs). Labelled data, whether acquired through experimental measurements or numerical simulations, are costly. On the other hand, a physics-informed neural network (PINN) is trained by minimizing the deviations from the governing equations, initial conditions and boundary conditions at given points, known as collocation or sampling points [27]. This avoids the need for labelled data. A standard PINN has two components: a surrogate/approximator network and a residual block [28]. Typically, the surrogate network is a fully-connected feed-forward neural network. The residual block contains an embedded governing physics equation, whose residues are calculated from the output of the surrogate network. The weights and biases of the surrogate network are identified by minimizing the values of the residual block. Figure 1 shows how the steady-state Navier–Stokes equations are encoded in the residual network.
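The structure of a PINN loss can be sketched on a toy problem. The following Python fragment evaluates the residual of the one-dimensional equation $u'' + \pi^2 u = 0$ at collocation points and combines it with a boundary-condition penalty. Purely for illustration, derivatives are approximated by finite differences and the exact solution stands in for the surrogate network; a real PINN would use automatic differentiation and a trainable network.

```python
import math

# Collocation points in (0, 1); Dirichlet boundary points at x = 0 and x = 1
interior = [i / 20 for i in range(1, 20)]

def u_trial(x):
    # Candidate surrogate; here the exact solution sin(pi x) stands in
    # for a trained neural network
    return math.sin(math.pi * x)

def pde_residual(x, h=1e-4):
    # Residual of u'' + pi^2 u = 0, with u'' from central differences
    # (a stand-in for the automatic differentiation a real PINN would use)
    u_xx = (u_trial(x + h) - 2 * u_trial(x) + u_trial(x - h)) / h**2
    return u_xx + math.pi**2 * u_trial(x)

loss_pde = sum(pde_residual(x)**2 for x in interior) / len(interior)
loss_bc = u_trial(0.0)**2 + u_trial(1.0)**2   # enforce u(0) = u(1) = 0
loss = loss_pde + loss_bc
print(loss)  # ~0 for the exact solution
```

For a correct candidate function the loss is near zero; training a PINN drives the network towards such a function by minimizing this loss over its weights and biases.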
A PINN performs well at inference or prediction [29] because it can successfully approximate complex, nonlinear, and high-dimensional functions. This provides a gridless alternative to traditional mesh-based numerical solvers and avoids the need for mesh generation, which is usually a significant burden in mesh-based CFD calculations. A PINN can be trained to approximately satisfy strong-form PDEs at the collocation points. This PINN can then automatically compute derivatives of the model outputs with respect to the model inputs, which can be used for optimization.
PINNs were introduced by Raissi et al. [30] and have since been extended to solve forward and inverse problems with several different PDEs. As well as solving for the velocity and pressure fields alone [31], PINNs have been used to simulate vortex-induced vibrations [32] and to tackle ill-posed inverse fluid mechanics problems [20]. Furthermore, PINNs have been used to de-noise and obtain super-resolution of 4D flow magnetic resonance imaging [33], as well as to predict near-wall blood flow and quantify wall shear stress using sparse measurement data [34]. More recently, PINNs have been used to simulate turbulence directly with incomplete or noisy boundary conditions [35].

PINNs for PDE-constrained shape optimization
In this paper we propose a physics-informed deep learning framework that models steady incompressible flow around multiple sets of airfoil geometries. This framework is combined with quasi-Newton optimization to identify the optimal airfoil parameters for a given objective function. For flow around an airfoil, an objective function typically depends on the aerodynamic properties and the physical location of the boundary. Instead of a mesh, sample points are generated on the airfoil surface and within the computational domain. The loss function is the sum of the mean squared residuals of the governing partial differential equations calculated at these sample points, combined with the boundary conditions. During the optimization process, the computational domain changes as the shape of the airfoil changes. Usually, physics-informed neural networks obtain solutions only for a single computational domain with a fixed geometry. This makes them unsuitable for rapid exploration of variable airfoil geometries. To overcome this challenge, we propose a new PINN architecture with a heterogeneous input layer, which takes several design parameters along with the collocation points as inputs. This extended input layer equips the PINN with a mechanism to extract geometric features of the input computational domains, and consequently it can output flow fields for various computational domains. After being trained over the set of geometries only once, this PINN can reliably predict the flow fields associated with varying airfoil shapes. We use the trained PINN as a surrogate model in the optimization process.
Although many prior studies have obtained approximate solutions of PDEs using PINNs, they have not used physics-informed deep learning to perform optimization. First, we evaluate the potential of PINNs to approximate the solutions of the Navier–Stokes equations (NSEs). Our goal is to develop a new class of physics-based optimization approach, combining deep learning models with standard optimization algorithms such as limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) optimization [36]. Second, we propose a physics-informed training method. To the best of our knowledge, existing PINNs have collocation points as their only inputs. Here we combine collocation points with a set of design parameters and input this composite data into a PINN. The trained PINN is able to model flows around a geometry defined by the design parameters within, or close to, the training range. The main strength of our PINN is that it represents the flow field for general geometries continuously. This makes the PINN particularly attractive for shape optimization because it avoids the need for separate training for each geometry. In this study, we apply this method to the optimization of airfoils at low Reynolds number. The method is general, however, and could be used for other optimization problems, such as vehicle shape optimization or process optimization for chemical production.
In Section 2, we introduce the governing equations and the PINN architecture. In Section 3, we use the PINN to solve the Navier–Stokes equations and to optimize a single-parameter problem: the angle of attack of an airfoil. In Section 4, we apply the proposed approach to solve a multi-parameter shape optimization problem. In Section 5, we conclude and outline future challenges.

Methodology
We solve a minimization (or maximization) problem for an objective function constrained by a set of partial differential equations:

$$\max_{\Gamma \in D} \; (\min_{\Gamma \in D}) \quad J(u, \Gamma) \qquad \text{subject to} \quad F_i(u, \Gamma) = 0, \quad i = 1, \dots, m,$$

where $J(u, \Gamma) \in \mathbb{R}$ is an objective function, which will be defined in the next section. The PDE constraint $F_i(u, \Gamma) = 0$ enforces the state variable $u$ over the $d$-dimensional space $x \in \mathbb{R}^d$, where $m$ represents the number of constraints. The state variable $u$ is a function of a design variable $\Gamma$ and spatial variables $x$. Design variables $\Gamma \in D$ live in the underlying high-dimensional parameter space $D \subset \mathbb{R}^n$. The spatial domain is labelled $\Omega$. The governing PDEs encoded in the PINN are the steady, incompressible Navier–Stokes equations for a Newtonian fluid:

$$\rho \, (\mathbf{u} \cdot \nabla)\mathbf{u} = -\nabla p + \mu \nabla^2 \mathbf{u}, \qquad \nabla \cdot \mathbf{u} = 0,$$

where $\rho$ is the density, $\mu$ the dynamic viscosity, $\mathbf{u}(x, \Gamma)$ the velocity vector, $p(x, \Gamma)$ the pressure field, and $x \in \Omega$ the spatial coordinate. Boundary conditions (BCs) can be expressed as

$$\mathcal{B}(\mathbf{u}, p) = 0 \quad \text{on } \partial\Omega,$$

where $\mathcal{B}$ is a precisely known operator determining the boundary conditions on the boundary $\partial\Omega$.
In this study, the PINN is employed as a surrogate model to approximate the velocity and pressure fields as functions of space and of the parameters. We construct fully-connected neural networks by stacking layers of artificial neurons, in which each layer $l$ performs an affine transformation on the previous layer, with nonlinearity imposed via an element-wise activation function $\sigma$, typically a nonlinear function such as sigmoid, tanh, or ReLU [23]:

$$z^{(l)} = \sigma\!\left(W^{(l)} z^{(l-1)} + b^{(l)}\right),$$
where $W^{(l)}$, $b^{(l)}$ are the weight matrices and bias vectors for layer $l$. Our neural network is trained to obtain optimal weights and biases with the Adam algorithm, a variant of stochastic gradient descent. We use a loss function that includes the residuals from both the momentum and continuity equations:

$$\mathcal{L}_{\mathrm{PDE}} = \mathcal{L}_{\mathrm{mom}} + \alpha \, \mathcal{L}_{\mathrm{cont}},$$

where $\alpha$ weights the residue arising from the continuity equation relative to that from the momentum equation. Boundary conditions are satisfied by minimizing the mean squared deviation from the prescribed boundary values,

$$\mathcal{L}_{\mathrm{BC}} = \frac{1}{N_b} \sum_{i=1}^{N_b} \left\| \mathcal{B}(\mathbf{u}, p)\big|_{x_i} \right\|^2 .$$

The total loss is the sum of the residue of the governing equations and the residue due to deviations from the given boundary conditions, weighted by a hyperparameter $\lambda$:

$$\mathcal{L} = \mathcal{L}_{\mathrm{PDE}} + \lambda \, \mathcal{L}_{\mathrm{BC}}.$$

The optimal weights and biases for each layer, $W^{(l)*}$, $b^{(l)*}$, are found by minimizing the total loss $\mathcal{L}$. We quantify losses with the $L_2$ norm.
The quality, and thus the predictive power, of a PINN depends on its network structure. Figure 1 shows a diagram of this PINN. A point cloud is established in the domain and on the boundaries. The spatial coordinates of this point cloud are the inputs of the fully-connected neural network, and the outputs are the velocity, $(u, v)$, and pressure, $p$, at those points. The main characteristics of the PINN are: (i) the input layer has thirteen neurons, two of which represent the $x$ and $y$ coordinates of a collocation point, and the rest of which account for the eleven PARSEC design parameters; (ii) the network is fully connected, with 20 hidden layers and 50 neurons per layer; neurons in neighbouring layers are connected by affine linear maps, and nonlinear activation functions act within neurons; (iii) the output layer contains three neurons; (iv) the sigmoid activation function, $\sigma(z) = 1/(1 + e^{-z})$, is used; (v) the Adam optimizer is used; (vi) the loss function consists of the mean squared error of the PDE residual and boundary conditions, regularized with the squared residual evaluated at interior collocation points. The PINN is constructed with the open-source software TensorFlow [37].
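To make the layer structure concrete, the following numpy sketch implements the forward pass of a fully-connected network of this shape (13 inputs, 20 hidden layers of 50 sigmoid neurons, 3 outputs). The initialization scheme and the linear output layer are illustrative assumptions; the paper's TensorFlow implementation may differ in these details.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_layers(sizes, rng):
    # One (W, b) pair per affine map; scaled random initialization (assumed)
    return [(rng.normal(0.0, np.sqrt(1.0 / m), size=(n, m)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    # x: 13 features (2 spatial coordinates + 11 design parameters)
    for i, (W, b) in enumerate(params):
        x = W @ x + b
        if i < len(params) - 1:   # sigmoid on hidden layers, linear output
            x = sigmoid(x)
    return x                      # (u, v, p)

rng = np.random.default_rng(0)
# 13 inputs, 20 hidden layers of 50 neurons, 3 outputs, as in the text
sizes = [13] + [50] * 20 + [3]
params = init_layers(sizes, rng)
out = forward(params, rng.uniform(-1.0, 1.0, 13))
print(out.shape)  # (3,)
```

In a PINN, the PDE residuals are then formed from derivatives of these outputs with respect to the spatial inputs, computed by automatic differentiation.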

Physics-informed Training
Figure 2 shows the key components used to construct a surrogate model for adaptive optimization: an initial guess for the parameters, together with the spatial data points, is used as input to the neural network; the flow field is approximated using the forward pass of the PINN to construct a map between the parameter inputs and the flow outputs; the objective function is computed and its gradient with respect to the design parameters is obtained; and the parameters are updated and the process repeated in order to maximize the objective function. First, a set of design parameters is chosen from the feasible space. No prior knowledge is assumed, apart from the parameters' lower and upper bounds. Sample points are randomly generated, for which the first two dimensions are the spatial coordinates of the collocation points and the remaining dimensions are the design parameters. The rationale behind these high-dimensional points is to save computational cost. Because neural networks interpolate well, we can use PINNs to approximate flow fields over a continuous range of designs. Instead of treating each design separately and sampling spatial points (2D points with x, y coordinates in this study) around each of them, we consider all designs together and train the PINN using data from the entire design space. Training data are therefore generated by associating each spatial point with expanded dimensions accounting for the design dimensions. Theoretically, each of these sample points represents a possible design. Figure 12 shows the high-dimensional training data, composed of spatial coordinates and design parameters, for the multi-parameter optimization example in Section 4. For a given airfoil, we therefore construct a point cloud between its boundary and the boundary of the computational domain. Points on the airfoil surface represent the geometry of the no-slip boundaries. Representing multiple airfoil shapes as point clouds instead of meshes can dramatically lower the memory requirement for calculations.
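The construction of these high-dimensional training points can be sketched as follows; the domain extent and the parameter bounds used here are placeholder assumptions, not the paper's values.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical bounds for the 11 PARSEC parameters (illustrative values only)
lower = np.full(11, 0.0)
upper = np.full(11, 1.0)

n_points = 1000
# Spatial coordinates of collocation points in an assumed [0, 4] x [0, 2] domain
xy = rng.uniform([0.0, 0.0], [4.0, 2.0], size=(n_points, 2))
# Each spatial point is paired with an independently sampled design vector
designs = rng.uniform(lower, upper, size=(n_points, 11))

# 13-dimensional training points: (x, y, Gamma_1, ..., Gamma_11)
training_points = np.hstack([xy, designs])
print(training_points.shape)  # (1000, 13)
```

Each row is simultaneously a spatial sample and a candidate design, so a single training set spans the whole design space.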
Second, these sample points are used as inputs of the PINN. A surrogate model is constructed from the forward pass of the PINN by minimizing the PDE residuals measured at the sample points. Third, the flow variables (u, v, p) output from the surrogate model are used to compute the objective function values. Back-propagation through the PINN is used to calculate the gradient of the objective function (e.g. the lift/drag ratio) with respect to the design parameters. In this algorithm, the gradient calculation proceeds backwards through the deep neural network, and this backward flow of information facilitates efficient computation of the gradient at each layer [23]. Furthermore, reverse-mode automatic differentiation enables us to compute the gradient of a loss function with respect to all the design parameters at once. This method is n times more efficient than a finite-difference method, where n represents the number of parameters. Fourth, the design parameters are updated with a quasi-Newton algorithm. The process is then repeated. Training is performed on an NVIDIA GeForce RTX 3080 graphics card with 16 gigabytes of RAM. The Adam optimization algorithm is used to find the optimal neural network parameters [38]. The training process uses minibatches for loss estimates and gradient computation. Training starts with a high learning rate of 5.0 × 10⁻⁴, which is reduced to 10⁻⁶ at a later stage.

Quasi-Newton optimization
The limited-memory BFGS (L-BFGS) gradient-based optimization algorithm is employed to optimize the design parameters. L-BFGS uses the first derivative of the objective function J with respect to the parameters to estimate the second derivatives (the Hessian; refer to Equation 15 in Table 2), and thereby speed up optimization. The limited-memory form can be implemented particularly efficiently [39]. The steps in this quasi-Newton algorithm are shown in Table 2. A maximum line-search step length of 0.02 is employed to control the magnitude of the change in the design variables and to avoid non-physical deformation of the airfoil shapes. The optimization is terminated when the stopping criteria are satisfied (the step or function change is less than the tolerance of 10⁻⁶), or when the maximum iteration limit of 50 is reached.
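For readers unfamiliar with the algorithm, the following numpy sketch shows the core BFGS update (the full-memory form; L-BFGS stores only a few recent vector pairs instead of the dense inverse-Hessian approximation). The toy quadratic objective, the Armijo line search, and the tolerances are illustrative stand-ins for the paper's setup.

```python
import numpy as np

def f(x):
    # Toy quadratic objective standing in for the (negative) lift-to-drag ratio
    return (x[0] - 1.0)**2 + 10.0 * (x[1] + 2.0)**2

def grad(x):
    return np.array([2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)])

def bfgs(x0, max_iter=50, tol=1e-6):
    x, H = np.asarray(x0, float), np.eye(len(x0))
    g = grad(x)
    for _ in range(max_iter):
        p = -H @ g                       # quasi-Newton search direction
        t = 1.0                          # backtracking (Armijo) line search
        while f(x + t * p) > f(x) + 1e-4 * t * (g @ p):
            t *= 0.5
        s = t * p
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        if np.linalg.norm(s) < tol:      # step-size stopping criterion
            return x_new
        rho = 1.0 / (y @ s)
        I = np.eye(len(x))
        # BFGS update of the inverse-Hessian approximation
        H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
            + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x

x_opt = bfgs([0.0, 0.0])
print(x_opt)  # close to [1, -2]
```

Only gradients are supplied; the curvature information is accumulated from successive gradient differences, which is what makes the method attractive when gradients come cheaply from automatic differentiation.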

Single Parameter Optimization
The method is demonstrated first on laminar (Re = 20) flow over a Joukowski airfoil at different angles of attack. A Joukowski airfoil is generated in the complex plane (z-plane) by applying the Joukowski transform, $z = \zeta + 1/\zeta$, to a circle in the $\zeta$-plane. The horizontal spatial coordinates of the generated airfoil are normalized to [0, 1], running from the leading to the trailing edge. It is placed in a rectangular computational domain. Taking a fluid density of ρ = 1.0 kg/m³ and a dynamic viscosity of μ = 0.01 kg/(m s), the fluid is characterised by the steady Navier–Stokes equations (2). The computational domain and boundary conditions are shown in Figure 3. A parabolic inlet velocity condition is used:

$$u(y) = \frac{4 \, U_\infty \, y \,(H - y)}{H^2}, \qquad v = 0,$$

where H is the height of the rectangular domain, (u, v) are the horizontal and vertical velocity components, and U∞ is the maximum velocity of the inlet flow. No-slip conditions are imposed on the airfoil surface. A free boundary condition is imposed on the outflow, which takes the form

$$\left(\mu \nabla \mathbf{u} - p \mathbf{I}\right) \cdot \mathbf{n} = 0,$$

where I is the 2 × 2 identity matrix, p is the pressure, and n denotes the outward unit normal to the rightmost boundary. For a maximum velocity of U∞ = 0.3, the parabolic profile results in a mean velocity Ū = 0.2. Figure 1 shows the architecture of the fully connected network, which contains 10 hidden layers and 50 neurons per layer. The input layer consists of three neurons, taking realizations from the physical domain (i.e., x, y) and a single optimization variable, the angle of attack (AOA). There are three neurons in the output layer, representing the flow velocity (u, v) and pressure (p). Many conventional approaches require, for every angle of attack, training with new collocation points in order to find the optimum angle of attack. To explore the design space completely, training would be required at many angles of attack, which would be computationally inefficient.
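As an aside, a Joukowski airfoil of this kind can be generated in a few lines; the circle centre used below is an illustrative choice, not the paper's value.

```python
import numpy as np

# Circle in the zeta-plane; the centre offset controls thickness and camber
mu = -0.08 + 0.08j          # circle centre (assumed, illustrative)
R = abs(1 - mu)             # radius chosen so the circle passes through zeta = 1
theta = np.linspace(0.0, 2.0 * np.pi, 200)
zeta = mu + R * np.exp(1j * theta)

# Joukowski transform z = zeta + 1/zeta maps the circle to an airfoil
z = zeta + 1.0 / zeta
x, y = z.real, z.imag

# Normalize the chordwise coordinate to [0, 1]
x = (x - x.min()) / (x.max() - x.min())
print(x.min(), x.max())  # 0.0 1.0
```

Because the circle passes through ζ = 1, the transform produces the characteristic cusped trailing edge.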
Instead, here we combine the collocation points and the angles of attack together into a single point cloud, creating a heterogeneous exploration space in which the first two dimensions are the collocation points and the third dimension is the angle of attack, shown in Figure 4. When training the PINN, each batch contains 20,400 randomly-sampled points from this three-dimensional space, of which 18,000 points are sampled inside the computational domain (including the airfoil surface) and 2,400 are boundary points, generated using Latin hypercube sampling at the other boundaries (wall, inlet and outlet). 5,000 collocation points are sampled near the airfoil surface in order to constrain the flow there more strongly. Building up a surrogate model by considering all angles of attack simultaneously greatly reduces the computation time compared with considering each angle of attack separately. The loss history during training is shown in Figure 5, with residual data plotted every 100 iterations. This contains the total residual, the PDE residual (Equation 5) in the domain, and the PDE residual at the boundaries. Figure 5 shows that the predicted results obey all boundary conditions well. The PINN is trained on collocation points over a wide range of angles of attack (AOA = 0–50°). After training, the PINN can accurately predict the flow fields in this training range, as illustrated in Figure 6 for AOA = 10°, 20°, 30°, and 40°. The velocity (u, v) and pressure (p) fields obtained from PINN predictions agree well with the numerical results from FEniCS, an open-source computing platform for solving partial differential equations [40]. In FEniCS, the computational domain is discretized with a mesh of triangular elements; Taylor–Hood elements are employed to build the test and trial function spaces. This trained PINN can predict flows successfully over airfoils at any angle of attack within the training range, not just the four illustrative cases shown in Figure 6. Further, this figure also shows the discrepancy between the PINN predictions and the CFD simulations. These error contours show the pointwise error distribution within the computational domain. Here we calculate the relative error using

$$\epsilon = \frac{\left| q_{\mathrm{PINN}} - q_{\mathrm{CFD}} \right|}{\left| q_{\mathrm{CFD}} \right| + \delta},$$

where q is the quantity of interest and δ = 10⁻⁴ is a small number to avoid a zero denominator. For the velocity fields, the maximum error is observed at the leading edge of the airfoil for all angles of attack, because the velocity field has its highest gradients there and the point cloud is not sufficiently fine to resolve them well. This has also been found in neural-network-predicted flows around a NACA 0028 airfoil [41]. The PINN used here predicts the velocity fields (maximum error 10⁻²) more accurately than the pressure field (maximum error 10⁻¹). The discrepancy in the pressure field is located in a small region at the trailing edge but, as will be shown next, this has little influence on the lift-to-drag ratio, which is the objective functional for the optimization.
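A pointwise relative error of this form can be computed as below; the exact placement of the small number δ in the denominator is an assumption based on the text.

```python
def relative_error(q_pred, q_ref, delta=1e-4):
    # Pointwise relative error; delta avoids a zero denominator
    # (denominator form assumed from the text)
    return abs(q_pred - q_ref) / (abs(q_ref) + delta)

print(relative_error(1.01, 1.00))  # approximately 0.01
print(relative_error(0.0, 0.0))    # 0.0, thanks to delta
```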
For an airfoil with fixed shape at a given Reynolds number, the angle of attack determines the lift-to-drag ratio. In this example, the objective is to maximize the lift-to-drag ratio by changing the angle of attack. The drag and lift forces are calculated from the pressure and velocity fields taken from the PINN predictions. The lift force $F_L$ and drag force $F_D$ are computed by rotating the axial and normal forces through the angle of attack $\alpha$:

$$F_L = N \cos\alpha - A \sin\alpha, \qquad F_D = N \sin\alpha + A \cos\alpha.$$

Here A and N are the axial and normal forces, parallel and perpendicular to the airfoil chord line; they are obtained by integrating the pressure and the magnitude of the wall shear stress τ along the airfoil surface, where θ is the angle between the airfoil surface and the chord line, taking a positive value in the clockwise direction. The differential section ds of the perimeter runs from the trailing edge to the leading edge over the top surface (ds_u) and runs in the opposite direction over the bottom surface (ds_l). Using these equations, we can compare the lift force $F_L$ and drag force $F_D$ between the PINN predictions and the FEniCS simulations, as shown in Figure 7. As required, there is good agreement between the two around the area of interest, which is the angle of attack at which the lift-to-drag ratio is maximum.
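Once the axial and normal forces have been integrated over the surface, the rotation into lift and drag is straightforward. This hypothetical helper assumes the standard aerodynamic sign conventions:

```python
import math

def lift_drag(N, A, alpha_deg):
    # Rotate axial (A) and normal (N) forces, defined relative to the chord,
    # into the freestream frame to obtain lift and drag
    a = math.radians(alpha_deg)
    F_L = N * math.cos(a) - A * math.sin(a)
    F_D = N * math.sin(a) + A * math.cos(a)
    return F_L, F_D

print(lift_drag(1.0, 0.1, 0.0))   # (1.0, 0.1): at zero AOA lift = N, drag = A
```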
Having confirmed that the PINN can compute the lift-to-drag ratio sufficiently accurately for our purposes, we restart the training and optimization process, using a gradient descent algorithm to find the AOA that maximizes the lift-to-drag ratio. The gradient of the lift-to-drag ratio with respect to the design parameter is obtained with reverse-mode automatic differentiation, which is a natural feature of a PINN and requires no further coding. Figure 9 shows the lift-to-drag ratio and its gradient during this optimization process, which starts from AOA = 5°. The optimization process is successful. The largest lift-to-drag ratio of 0.92 is found at AOA = 30.7°, where the optimization reaches its stopping criterion, i.e., the gradient is less than 10⁻⁴.
To ensure that the optimization results are reliable, the discrepancy between the PINN results and the CFD results is calculated again for the optimal AOA. Figure 10 shows that the flow fields predicted by the PINN agree well with those from the CFD, as evidenced by the small discrepancy. The PINN outputs a flow field immediately for a given AOA and, although approximate, it is sufficiently accurate to be useful for this optimization process.

Multi-parameter Optimization
In this section, the PINN is applied to a multi-parameter optimization problem: finding the set of 11 PARSEC parameters that maximize the lift-to-drag ratio of an airfoil.The PARSEC parameterization translates Cartesian coordinates into 11 parameters (Γ 1 , Γ 2 , . . ., Γ 11 ) to define an airfoil shape [42,43].These parameters represent the physical characteristics of a given airfoil, such as the leading edge radius, curvature, thickness, and maximum thickness.PARSEC parameterization is a physically intuitive method because it enables a designer to link typical airfoil parameters to the real characteristics of a given airfoil, such as leading edge radius, airfoil thickness or trailing edge angle.Figure 11 shows the PARSEC parameterization definition.The physical meanings of the symbols are summarized in Table 1.The airfoil geometry characterized by PARSEC parameterization can be defined using the following polynomial functions where:  is the horizontal coordinate normalized by chord length to become [0, 1];    and   are the vertical coordinates of the upper and lower surface respectively;    and   are the coefficients computed from the eleven PARSEC parameters.Details of the determination of these parameters can be found in [44].As shown in Figure 12, each training point has 13 dimensions: two for the ,  coordinates of the collocation points and 11 for the PARSEC parameters.In the neural network design, we therefore need to use 13 neurons in the input layer to represent the thirteen-dimensional sample points.The upper and lower limits of the 11 PARSEC parameters used for training are shown in Table 5.A wide range of airfoil shapes can be created using this method.The batch size is set to 30,000, including 8000 points on the boundaries.Instead of simply relying on 12,000 training points in the computational domain, we augment the training batch by adding 10,000 refinement points to achieve better accuracy in regions with large errors.This adaptive sampling strategy 
is used in order to improve the training efficiency. Every 100 iterations, we add sample points according to the distribution of the PDE residuals, which measure the errors in the satisfaction of the Navier-Stokes equations. The number of added points is proportional to the residual level, so a higher density of training points is placed in regions with poor predictions. These regions then carry more weight in the overall loss function, which improves the predictions where the residuals were high. This strategy shares some similarities with importance sampling, which has been shown to facilitate training [38]. The new points are generated from Gaussian distributions centred on the existing high-residual training points. To avoid exhausting the GPU memory, the number of adaptively added points is capped at 10,000.
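The refinement step above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the threshold, the jitter width `sigma`, and the fixed number of new points per flagged point (`k`) are all assumed values, and the paper scales the number of added points with the residual level rather than using a fixed `k`.

```python
import numpy as np

def adaptive_refine(points, residuals, threshold=1e-3, k=4,
                    sigma=0.01, cap=10_000, rng=None):
    """Add collocation points around existing points with high PDE residuals.

    points    : (N, 2) array of (x, y) collocation points.
    residuals : (N,) array of Navier-Stokes residual magnitudes.
    k         : new points drawn per flagged point (illustrative choice).
    sigma     : standard deviation of the Gaussian jitter.
    cap       : hard limit on the number of added points.
    """
    rng = np.random.default_rng() if rng is None else rng
    flagged = points[residuals > threshold]      # poorly resolved regions
    if len(flagged) == 0:
        return np.empty((0, 2))                  # solution already accurate
    # Draw Gaussian samples centred on each flagged point, so that the
    # new points cluster where the equations are least well satisfied.
    centres = np.repeat(flagged, k, axis=0)[:cap]
    return centres + rng.normal(scale=sigma, size=centres.shape)
```

Each call adds at most `cap` points, mirroring the 10,000-point limit used to keep GPU memory bounded.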
This adaptive sampling strategy facilitates training and improves the accuracy of the PINN. The progressive improvement of the flow-field predictions can be seen in Figure 13. The first four columns show the flow fields predicted by the PINN as the iterations progress, and the last column shows the numerical results from CFD. In the early stage of training, the PINN gives poor predictions of the flow fields. Figure 14 shows how, as training continues, new sample points (pink and blue dots) are created around the existing training points that have high residuals (red dots). Because the additional training points are concentrated in areas with large residuals, the error is reduced efficiently during the optimization process. Further training leads to better satisfaction of the Navier-Stokes equations and thus to fewer added points (Figure 14(b)). Figure 15 shows the pointwise residuals in the continuity and momentum equations for two different airfoils, with Γ1 = 0.014 and Γ1 = 0.017 respectively. These residuals are ∼10⁻⁴ to 10⁻³, showing that the pressure and velocity fields predicted by the PINN satisfy the governing equations well. The small residuals for two different airfoil shapes demonstrate that incorporating the Navier-Stokes equations into the surrogate model enables the PINN to make physical predictions.
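A residual check of this kind is straightforward to reproduce. The sketch below evaluates the pointwise continuity residual u_x + v_y for a flow predictor. In the paper the derivatives come from automatic differentiation of the network; central finite differences are used here purely to keep the example dependency-free, and the divergence-free test field is invented for illustration.

```python
import numpy as np

def continuity_residual(flow, xy, h=1e-5):
    """Pointwise residual of the 2D continuity equation u_x + v_y = 0.

    flow : callable mapping an (N, 2) array of (x, y) points to an
           (N, 3) array of (u, v, p) predictions, e.g. a trained PINN.
    xy   : (N, 2) array of evaluation points.
    """
    ex, ey = np.array([h, 0.0]), np.array([0.0, h])
    # Central differences stand in for the derivatives that the PINN
    # would supply exactly through automatic differentiation.
    u_x = (flow(xy + ex)[:, 0] - flow(xy - ex)[:, 0]) / (2 * h)
    v_y = (flow(xy + ey)[:, 1] - flow(xy - ey)[:, 1]) / (2 * h)
    return u_x + v_y

# Toy divergence-free field u = x, v = -y, p = 0: residual vanishes.
flow = lambda xy: np.stack([xy[:, 0], -xy[:, 1], 0.0 * xy[:, 0]], axis=1)
r = continuity_residual(flow, np.random.rand(5, 2))
```

Small values of |r|, like the ∼10⁻⁴ to 10⁻³ residuals in Figure 15, indicate that the predicted fields satisfy the governing equation well.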
To illustrate the PINN's suitability for optimizing the lift-to-drag ratio, the flows around airfoils with the range of shapes defined in Table 4 are calculated. Figure 16 compares the flow fields from the PINN with those from CFD modelling of four different airfoils. The relatively good predictions assure us that the PINN can output reliable flow fields for arbitrary airfoil shapes close to those examined. This is essential for calculating the objective function during the optimization process. Here the objective function is the lift-to-drag ratio, and the PINN calculates its gradient with respect to the PARSEC parameters. The lift and drag forces predicted by the PINN for the test airfoils in Table 4 are compared against the FEniCS calculations in Figure 17; they agree to within 1%.
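The two geometric ingredients of this objective, evaluating a PARSEC surface and integrating pressure forces over it, can be sketched as below. The coefficient set is made up for illustration, the force routine keeps only the pressure contribution (viscous shear is omitted), and the panel-summation convention is a standard one, not necessarily the paper's exact force computation.

```python
import numpy as np

def parsec_surface(coeffs, x):
    """Evaluate a PARSEC surface y(x) = sum_{n=1}^{6} c_n * x**(n - 1/2).

    coeffs : six polynomial coefficients (a_n for the upper surface,
             b_n for the lower), computed from the 11 PARSEC parameters.
    x      : chord-normalised coordinates in [0, 1].
    """
    n = np.arange(1, 7)
    return (np.asarray(coeffs) * np.asarray(x)[:, None] ** (n - 0.5)).sum(axis=1)

def pressure_forces(xs, ys, p):
    """Pressure lift and drag on a closed, counter-clockwise surface polygon.

    p is the pressure at each panel midpoint.  The force is
    F = -sum(p * n * ds) with outward normal n * ds = (dy, -dx) for a
    counter-clockwise loop; returns (lift F_y, drag F_x).
    """
    dx, dy = np.diff(xs), np.diff(ys)
    return (p * dx).sum(), -(p * dy).sum()

# Illustrative (not optimized) upper-surface coefficients.
a = np.array([0.17, -0.20, 0.18, -0.21, 0.26, -0.20])
x = np.linspace(0.0, 1.0, 201)
y_upper = parsec_surface(a, x)
```

Dividing the integrated lift by the integrated drag gives the L/D objective that the optimizer maximizes.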
Figure 18 shows the lift-to-drag ratio during the last 11 iterations of the optimization process, during which it reaches its maximum of 0.761. The flow around the optimized airfoil is compared with the CFD results from FEniCS in Figure 19. The optimal airfoil has a greater thickness and a longer suction surface than the initial guess; the optimization tends to increase the airfoil thickness to obtain higher lift. The same trend is observed in the optimization of the NREL S809 airfoil at Re = 2 × 10⁶ using a genetic algorithm [44].
The optimization process starts from an initial guess, updates the PARSEC parameters using the computed gradient information, and gradually moves towards a local optimum. Our surrogate model is computationally inexpensive because it does not explore the entire high-dimensional design space. This strategy is especially suitable for high-dimensional problems, or when the intermediate simulations are computationally expensive. For low-dimensional problems, the method can be further improved by adopting subset-selection methods, which show promising performance because they can select the most important features and identify the corresponding model parameters [18]. Another point to emphasize is that, while our aim is a surrogate model that accurately predicts the optimum, the constructed model still makes reasonably good, if not perfect, predictions of the flow fields for intermediate cases, as illustrated in Figure 16.
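This update loop can be sketched with SciPy's L-BFGS-B implementation. The quadratic stand-in objective, its hand-coded gradient, and the unit-box bounds are all placeholders: in the paper the objective is −L/D evaluated through the PINN, its gradient is supplied by automatic differentiation, and the bounds come from Table 5.

```python
import numpy as np
from scipy.optimize import minimize

# Stand-in for -L/D as a function of the 11 PARSEC parameters; its
# minimum is placed arbitrarily at gamma = 0.5 for illustration.
def neg_lift_to_drag(gamma):
    return float(np.sum((gamma - 0.5) ** 2))

def neg_lift_to_drag_grad(gamma):
    # In the paper this gradient comes from automatic differentiation
    # of the PINN, not from a hand-derived formula.
    return 2.0 * (gamma - 0.5)

gamma0 = np.full(11, 0.1)        # initial guess for the PARSEC parameters
bounds = [(0.0, 1.0)] * 11       # illustrative design-space box
result = minimize(neg_lift_to_drag, gamma0, jac=neg_lift_to_drag_grad,
                  method="L-BFGS-B", bounds=bounds)
```

Because a gradient is supplied, each L-BFGS step reuses curvature information from previous iterations, which is what makes the method more efficient than gradient-free search.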
The results shown above are obtained with gradient-based optimization from a single starting point. We now perform the same analysis from ten different starting points, randomly generated within the design space and listed in Table A.8 in the appendix. The resulting optimal PARSEC parameters and lift-to-drag ratios are shown in Table 6. These ratios are close to the desired value of 0.761 but are not identical, showing that the process has found ten local optima. Nevertheless, most cases yield optimal lift-to-drag ratios between 0.75 and 0.761. This consistency, despite starting from randomly-selected initial PARSEC parameters, indicates that the process has converged to the region of a global optimum and that the small differences in L/D arise from closely-spaced local optima in this region.
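This multi-start behaviour can be mimicked with a toy objective. The rippled quadratic below is entirely invented: it has a single broad basin containing many closely-spaced local minima, so different random starts converge to slightly different nearby optima, analogous to what is observed for the PINN.

```python
import numpy as np
from scipy.optimize import minimize

def objective(g):
    """Broad quadratic basin plus small ripples => many nearby local minima."""
    return float(np.sum((g - 0.5) ** 2) + 1e-3 * np.sum(np.cos(40 * np.pi * g)))

rng = np.random.default_rng(0)
optima = []
for start in rng.uniform(0.0, 1.0, size=(10, 2)):    # ten random starts
    res = minimize(objective, start, method="L-BFGS-B",
                   bounds=[(0.0, 1.0)] * 2)
    optima.append(res.x)
optima = np.array(optima)   # each row: a slightly different local optimum
```

All runs land inside the same basin, but the ripples stop them from agreeing exactly, which is the signature of closely-spaced local optima.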
For comparison, we also perform CFD simulations for 12,000 airfoils with different PARSEC parameters, some sampled around the optimal parameter set obtained from the PINN and the rest sampled randomly within the design space. Figure 20 shows the lift-to-drag ratio for these airfoils as a function of two of the PARSEC parameters (Γ2, Γ3). Most of the candidate airfoils have lift-to-drag ratios larger than 0.6 because the sample points were taken near the optimal PINN design, which gives 0.761. Some airfoils have very poor aerodynamic performance, however, with lift-to-drag ratios below 0.4. The maximum value found is 0.757, which is close to the PINN optimization result of 0.761. Table 7 compares the optimal PARSEC parameters from the PINN with the best found from the 12,000 CFD simulations. The parameters are close to each other, showing that the PINN optimization has performed well; slight differences are to be expected because the solution methods differ.
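The brute-force comparison amounts to sampling the design box and evaluating every candidate. A sketch with a stand-in objective (the real study ran a full CFD simulation per sample, and its peak value and location are not the invented ones used here):

```python
import numpy as np

def lift_to_drag(gamma):
    """Stand-in for a CFD evaluation of L/D; peak placed at gamma = 0.5."""
    return 0.761 - float(np.sum((gamma - 0.5) ** 2))

rng = np.random.default_rng(1)
samples = rng.uniform(0.0, 1.0, size=(12_000, 11))   # one row per airfoil
ratios = np.array([lift_to_drag(g) for g in samples])
best = samples[np.argmax(ratios)]                    # best brute-force design
```

Even 12,000 random samples rarely land exactly on the optimum of an 11-dimensional box, which is why the brute-force maximum (0.757) falls slightly below the gradient-based result (0.761).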

Conclusion
In this paper, we develop and apply a physics-informed neural network (PINN) that simultaneously calculates the flow around an airfoil and optimizes the airfoil's parameters to maximize the lift-to-drag ratio. The framework represents the geometries using mesh-free point clouds in a high-dimensional space, which allows it to handle complex geometries. We accelerate the training with adaptive sampling, in which additional sample points are added to the regions with large errors in order to improve the accuracy of the predictions. We use the L-BFGS algorithm and provide shape sensitivities obtained by automatic differentiation of the PINN. We thereby converge to a local optimum while simultaneously improving the accuracy of the solution. Although the neural network is trained on a limited number of collocation points, most of which are concentrated around the optimization trajectory, the PINN can reliably predict the flow over a wide range of parameters. We demonstrate this method on two examples: one that optimizes a single parameter, and another that optimizes eleven parameters. The method is successful and, by comparison with conventional CFD, we find that the velocity and pressure fields have small pointwise errors and that the method successfully finds the optimal parameters. Nevertheless, we find that different PINNs converge to slightly different parameters, reflecting the fact that there are many closely-spaced local optima during optimization. This approach combines two strengths of deep neural networks: the ability to approximate a high-dimensional function well, and the automatic differentiability of that function with respect to its inputs. The first strength means that the surrogate model can approximate the flow at any point in parameter space. In principle, it could be trained at all points but, because we are interested in simultaneous training and optimization, we generate most of the training points close to the optimal trajectory. If instead we were to generate
collocation points uniformly, we could train a surrogate model that predicts the flow field at any parameter values within our search space. With our approach, the PINN attempts to solve the Navier-Stokes equations simultaneously for all possible design parameters. This technique offers a powerful alternative to traditional surrogate models, which require expensive simulation data for training and may generate unphysical solutions. The second strength avoids the need to write an adjoint code. This saves development effort because adjoint codes, whether continuous or discrete, are challenging to construct. The concept in this paper can therefore be applied relatively easily to other optimization problems and is particularly appealing in applications where labelled data is scarce or expensive. Future work is needed to explore how the method would cope with several dozen parameters and to investigate other ways to optimize the loss function of the PINN. As our knowledge of how to train PINNs efficiently increases, and the hardware to train them becomes faster, this method of simultaneous training and optimization could become easier and faster than using adjoint codes.

Figure 1 :
Figure 1: A schematic of the network architecture used for constructing a PINN-based surrogate model with a set of extra input design parameters (Γ1, Γ2, . . ., Γn). This network approximates the residuals of the steady-state Navier-Stokes equations. The input collocation points (x, y) provide a flexible, mesh-free representation of the domain. After training, the model approximates the flow variables (u, v, p) at any new input points in the computational domain, as well as their derivatives with respect to the design parameters.

Figure 2 :
Figure 2: The procedure of physics-informed training and parameter optimization based on backpropagation. First, an initial guess for the parameters, together with the spatial data points, is used as input to the neural network. Second, the flow field is approximated using the forward pass of the PINN to construct a map between the parameter inputs and the flow outputs. Third, the objective function is computed and its gradient with respect to the design parameters is obtained. Fourth, the parameters are updated and the process is repeated in order to maximize the objective function.

Figure 3 :
Figure 3: Boundary conditions for the flow past an airfoil.

Figure 4 :
Figure 4: Sampled collocation points along airfoil surfaces at angles of attack α = 0–60°. Different colors represent different angles of attack.

Figure 5 :
Figure 5: Different categories of residual as a function of iteration during training: (a) total residuals, (b) PDE, (c) inlet boundary, (d) outlet boundary, (e) top and bottom walls, and (f) airfoil surface. The boundary conditions can be found in Figure 3.

Figure 7 :
Figure 7: (a) Lift-to-drag ratio, (b) lift and (c) drag calculated with the PINN (blue squares) and CFD (red dots). (d) The discrepancy between the lift-to-drag ratio from the PINN and that computed with CFD.

Figure 8 :
Figure 8: A schematic diagram of the process used to find the optimal angle of attack.

Figure 9 :
Figure 9: Lift-to-drag ratio (blue) and its gradient (green) during the simultaneous PINN training and optimization process.

Figure 10 :
Figure 10: Flow fields calculated from the PINN and from CFD simulations at the optimal AOA of 30.7°. The relative error is small.

Figure 12 :
Figure 12: High-dimensional inputs for the PINN. (a) The upper table shows the input parameters for each training point 1 to N, and an illustration of the point cloud in (x, y)-space for two different training points, which correspond to two different airfoils. (b) The lower figure illustrates the airfoil shapes at the different training points.

Figure 13 :
Figure 13: The flow fields predicted by the PINN during the optimization process.

Figure 14 :
Figure 14: Illustration of the adaptive sampling strategy during PINN training at (a) 500 iterations and (b) 2000 iterations. Red points denote existing sample points with total residuals above the preset threshold. Blue points denote points added due to residuals in the momentum equation. Pink points denote points added due to residuals in the continuity equation.

Figure 17 :
Figure 17: (a) Lift-to-drag ratio, (b) lift and (c) drag calculated with the PINN (blue squares) and CFD (red dots). (d) The discrepancy between the lift-to-drag ratio calculated from the PINN and that from CFD.

Figure 18 :
Figure 18: Lift-to-drag ratios during the final eleven iterations of the optimization process.

Figure 19 :
Figure 19: Flow fields calculated with the PINN and with CFD at (a) the starting point of the optimization process and (b) the optimal point.

Table 1 :
Labels and definitions of the PARSEC parameters

Figure 11: Diagram of the PARSEC airfoil parameters in Table 1

Table 4 :
PARSEC parameters for the test airfoils, with Γ1 as the index. These airfoils are shown in Figure 16 and Figure 17.

Table 5 :
Upper and lower limits of PARSEC parameters.

Table 6 :
Optimized PARSEC parameters and lift-to-drag ratios for the ten optimization processes, which started from the ten random initial guesses in Table A.8.
Figure 20: CFD computations of lift-to-drag ratios for airfoils characterized by various PARSEC parameters.

Table 7 :
Comparison of the optimal PARSEC parameters obtained with the neural network and with FEniCS.