Introduction

Powertrains of the future must meet increasingly stringent requirements for emissions, performance, reliability, onboard monitoring, and serviceability. Capable system models for estimating states and adapting to an individual system’s behaviour are critical elements to meet control and health monitoring needs. Leveraging purely data-driven models to meet these requirements provides simplicity in modelling and captures dynamics difficult to formulate analytically. However, large data needs, poor physical interpretability, challenges with systems with long memory effects and sparse sensing, as well as inability to extrapolate beyond the training datasets present onerous burdens to practical implementation. Relying on purely theory-based models allows for directly interpretable results with higher confidence and fewer data for calibration but often causes a tradeoff of modelling relevant dynamics versus model complexity, challenges in systems with high uncertainties, poor modelling where dynamics are not well understood, and slow solution of higher-order models. Modelling solutions that leverage the strengths of theory-guided as well as data-driven models have the potential to reduce data needs, increase robustness, and effectively use theoretical and practical knowledge of the system.

To investigate model architectures, balancing the strengths of both theory-based models and data-driven models, this work explores the application of physics-informed neural networks (PINNs) to a diesel internal combustion engine model for the purposes of simultaneous parameter and state estimation. The physical portion is based on the mean value model of a diesel engine with a variable geometry turbocharger (VGT), and exhaust gas recirculation (EGR) proposed by Wahlström and Eriksson1.

Physics-informed neural networks (PINNs)2 is a new method of training neural networks, which takes into account the physics of a problem while evaluating the parameters of the neural network. The method is suitable for both evaluation of the solution of PDF (forward problem) and the data-driven identification of parameters of PDF (inverse problem). It takes advantage of automatic differentiation3 in formulating a physical loss in the loss function along with data loss. Jagtap et al.4 proposed conservative PINNs (cPINNs) for conservation laws, which employs domain decomposition with a PINN formulation in each domain. Further, Jagtap and Karniadakis5 introduced domain decomposition for general PDEs using the so-called extended PINN (XPINN). hp-VPINNs is a variational formulation of PINN with domain decomposition proposed by Kharazmi et al.6. Meng et al.7 proposed the Parareal PINN (PPINN) approach for long-time integration of time-dependent partial differential equations. The authors of8 proposed “separable” PINN, which can reduce the computational time and increase accuracy for high dimensional PDEs. In PINN, there are multiple loss functions, and the total loss function is given by the weighted sum of individual losses. McClenny and Braga-Neto9 proposed a self-adaptive weight technique, which is capable of tuning the weights automatically. PINN and its variants were also considered in various inverse problems like supersonic flows10, nano-optics and metamaterials11, unsaturated groundwater flow12. Detailed reviews of PINN can be found in13,14,15.

Modelling of diesel engines using neural networks has been considered in the past. Biao et al.16 considered Nonlinear Auto-Regressive Moving Average with eXogenous inputs (NARMAX) method for system identification of locomotive diesel engines. The model has three inputs to the network, i.e. the fuel injected, the load of the main generator, and the feedback rotation speed (from the output); the outputs are rotation speed and diesel power. The authors considered Levenberg–Marquardt (LM) algorithm to train the network. Finesso and Spessa17 developed a three-zone thermodynamic model to predict nitrogen oxide and in-cylinder temperature heat release rate for direct injection diesel engines under steady state and transient conditions. The model is zero-dimensional, and the equations can be solved analytically. Thus, it required a very short computational time. Tosun et al.18 predicted torque, carbon monoxide, and oxides of nitrogen using neural networks (3 independent networks) for diesel engines fueled with biodiesel-alcohol mixtures. The authors considered three fuel properties (density, cetane number, lower heating value) and engine speed as input parameters and the networks are optimized using the Levenberg-Marquardt method. The authors observed that neural network results are better than the least square method. González et al.19 integrated a data-driven model with a physics-based (equation-based) model for the gas exchange process of a diesel engine. The authors modelled the steady-state turbocharger using a neural network. Further, the authors integrated the data-driven model with an equation-based model. Recently, Kumar et al.20 considered DeepONet21 to predict the state variables of the same mean value engine model1 we considered in this study. The authors consider dynamic data to train the model. However, the model predicts the state variables only at the particular (trained) ambient temperature and pressure, as variations of ambient temperature and pressure are not considered in the training of DeepONet. The model also does not predict the parameters of the engine model. While the model was trained using dynamic data, the physics of the problem was not considered while training the network. The model (DeepONet) is capable of predicting dynamic responses.

In the present study, we formulate a PINN model for the data-driven identification of parameters and prediction of dynamics of system variables of a diesel engine. In PINN, the physics of the system is directly included in the form of physics loss along with data loss. While data-driven models require large amount of data over the entire operational range in training, PINN can be trained with a smaller amount of data as it is trained online. The dynamics characteristic of the state variables is automatically incorporated. PINN may be used for the solution of differential equations or for the identification of parameters and prediction of state variables. In the present study, we are specifically interested in estimating unknown parameters and states when we know a few state variables from field data. The dynamics of the state variables of the mean value engine1 are described by first-order differential equations. We will utilize these equations in the formulation of the physics-informed loss function. The unknown parameters are considered trainable and updated in the training process along with the neural network parameters.

The engine model also considers a few empirical formulae in its formulation. These equations are engine-specific, and the coefficients of these equations need to be evaluated from experimental data. These equations are static in nature, and thus may be trained with smaller data compared to dynamic equations. We know that deep neural networks (DNNs) are universal approximators of any continuous function, thus, DNNs may be considered more general approximators of these empirical formulae. One of the advantages of considering DNNs over empirical formulae is that we do not need to assume the type of non-linearity between the input out variables. The neural network learns the non-linearity if trained with sufficient data. We approximate the empirical formulae using DNNs and train them using laboratory test data. Once these networks are trained using laboratory test data, these are considered in the PINNs model in places of the empirical formulae. During the training of the PINNs model, the parameters of these networks are remain constant.

The training data for the inverse problem and laboratory data are generated using the Simulink file22 accompanied in1. The input to Simulink is taken from actual field data. By doing this, we are trying to generate data as realistic as field data. Furthermore, we also consider noise to the field data generated. We observed that the proposed PINNs model can predict the dynamics of the states and unknown parameters. We summarize below a few of the salient features of the present study:

1.:

We formulated PINNs-based parameter identification for real-world dynamical systems, in the present case, a diesel engine. This is significant as it started a new paradigm for future research for onboard systems for the health monitoring of engines.

2.:

We showed how PINNs could be implemented in predicting important unknown parameters of diesel engines from field data. From these predicted parameters, one can infer the health and serviceability requirements of the engine.

3.:

We showed the importance of self-adaptive weights (given the fast transient dynamics) in the accuracy and faster convergence of results for PINNs for the present study.

4.:

The engine model generally considers empirical formulae to evaluate a few of its quantities. These empirical formulae are engine-specific and require lab test data for the evaluation of the coefficients. We have shown how neural networks can be considered to model the empirical formulae. We have shown how we can train these networks from lab-test data. This is important as it may provide a better relationship for the empirical formulae.

5.:

The field data for the inverse problem are generated considering input recorded from actual engine running conditions. Further, we consider appropriate noise in the simulated data, mimicking near real-world field data.

We organize the rest of the article as follows: in “Problem setup”, we discuss the detailed problem statement and different cases considered for simulation studies. In “Methodology”, first, we discuss PINNs for the inverse problems for the diesel engine and the surrogates for the empirical formula. We discuss a detailed flow chart for the inverse problem for the PINN engine model in “Flowchart for PINN model for diesel engine”. In “Data generation”, we discuss the laboratory data required and their generation for the training of surrogates for the empirical formulae. We also discuss the field data generation for the inverse problem. We present the results and discussion in “Results and discussions”. The conclusions of the present study are discussed in “Summary”.

Problem setup

In this section, we first introduce the mean value model for the gas flow dynamics1 in the diesel engine, and then we will formulate the inverse problems that we are interested in.

As shown in Fig. 1, the engine model considered in the present study mainly comprises six parts: the intake and exhaust manifold, the cylinder, the exhaust gas recirculation (EGR) valve system, the compressor and the turbine. More details on each engine part can be seen in “Appendix 2”. We note that the engine considered here is the same as in1.

Figure 1
figure 1

Schematic diagram of the diesel engine: a schematic diagram of the mean value diesel engine with a variable-geometry turbocharger (VGT) and exhaust gas recirculation (EGR)1. The main components of the engine are the intake manifold, the exhaust manifold, the cylinder, the EGR valve system, the compressor, and the turbine. The control input vector is \(\varvec{u} = \{u_\delta , u_{egr}, u_{vgt}\}\), and engine speed is \(n_e\). (Source: Figure is adopted from1).

To describe the gas flow dynamics in the engine illustrated in Fig. 1, e.g., the dynamics in the manifold pressures, turbocharger, EGR and VGT actuators, a mean value model of the diesel engine with variable geometric turbocharger and exhaust gas recirculation was proposed in1. We will also utilize the same model as the governing equations to describe the gas flow dynamics considered in the current study. Specifically, the model proposed in1 has eight states expressed as follows:

$$\begin{aligned} \varvec{x} = \{p_{im},\;\; p_{em},\;\; X_{Oim},\;\; X_{Oem},\;\; \omega _t,\;\; \tilde{u}_{egr1},\;\; \tilde{u}_{egr2},\;\; \tilde{u}_{vgt}\}, \end{aligned}$$
(1)

where \(p_{im}\) and \(p_{em}\) are the intake and exhaust manifold pressure, respectively, \(X_{Oim}\) and \(X_{Oem}\) are the oxygen mass fractions in the intake and exhaust manifold, respectively, \(\omega _t\) is the turbo speed; \(\tilde{u}_{vgt}\) represents the VGT actuator dynamics. A second-order system with an overshoot and a time delay is used to represent the dynamics of the EGR-valve actuator. The model is represented by subtraction of two first-order models, \(\tilde{u}_{egr1}\) and \(\tilde{u}_{egr2}\), with different gains and time constants. Further, the control inputs for the engine are \(\varvec{u} = \{u_\delta ,\;\; u_{egr}, \;\; u_{vgt}\}\) and the engine speed is \(n_e\), in which \(u_\delta \) is the mass of injected fuel, \(u_{egr}\) and \(u_{vgt}\) are the EGR valve position and VGT actuator positions, respectively. Furthermore, the position of the valves, i.e., \(u_{egr}\) and \(u_{vgt}\), may vary from 0 to 100%, which indicates the complete close and opening of the valves, respectively. The mean value engine model is then expressed as

$$\begin{aligned} \dot{\varvec{x}} = f(\varvec{x}, \varvec{u}, n_e). \end{aligned}$$
(2)

In addition, the states describing the oxygen mass fraction of the intake and exhaust manifold, i.e., \(X_{Oim}\) and \(X_{Oem}\), are not considered in the present study as the rest of the states do not depend on these two states. Also, the parameters of the oxygen mass fractions are assumed to be constant and known. The governing equations for the remaining six states are as follows:

$$\begin{aligned}{} & {} \dfrac{d}{dt} p_{im} = \dfrac{R_a T_{im}}{V_{im}}(W_c+W_{egr} - W_{ei}), \end{aligned}$$
(3)
$$\begin{aligned}{} & {} \dfrac{d}{dt} p_{em} = \dfrac{R_e T_{em}}{V_{em}}(W_{eo} - W_t - W_{egr}), \end{aligned}$$
(4)
$$\begin{aligned}{} & {} \dfrac{d}{d t}\omega _t = \dfrac{P_t\eta _m - P_c}{J_t\omega _t}, \end{aligned}$$
(5)
$$\begin{aligned}{} & {} \dfrac{d\tilde{u}_{egr1}}{dt} = \dfrac{1}{\tau _{egr1}}\left[ u_{egr}(t-\tau _{degr}) - \tilde{u}_{egr1}\right] , \end{aligned}$$
(6)
$$\begin{aligned}{} & {} \dfrac{d\tilde{u}_{egr2}}{dt} = \dfrac{1}{\tau _{egr2}}\left[ u_{egr}(t-\tau _{degr}) - \tilde{u}_{egr2}\right] , \end{aligned}$$
(7)
$$\begin{aligned}{} & {} \dfrac{d\tilde{u}_{vgt}}{dt} = \dfrac{1}{\tau _{vgt}}\left[ u_{vgt}(t-\tau _{dvgt}) - \tilde{u}_{vgt}\right] . \end{aligned}$$
(8)

Two additional equations used for the computation of \(T_{em}\) in Eq. (4) read as:

$$\begin{aligned}{} & {} T_1 = x_rT_e + (1-x_r)T_{im}, \end{aligned}$$
(9)
$$\begin{aligned}{} & {} x_r = \dfrac{\Pi _e^{1/\gamma _a}x_p^{-1/\gamma _a}}{r_c x_v}, \end{aligned}$$
(10)

where \(T_1\) is the temperature when the inlet valve closes after the intake stroke and mixing, and \(x_r\) is the residual gas fraction. A brief discussion on the governing equations of the engine model is presented in “Appendix 2”. Interested readers can also refer to1 for more details.

In the present study, we have field measurements on a certain number of variables, i.e., \(p_{im}\), \(p_{em}\), \(\omega _t\), and \(W_{egr}\) as well as the inputs, i.e., \(\varvec{u}\) and \(n_e\), at discrete times. Further, some of the parameters in the system, e.g., \(A_{egrmax}\), \(\eta _{sc}\), \(h_{tot}\) and \(A_{vgtmax}\), which are difficult to measure directly, are unknown. \(A_{egrmax}\) is the maximum effective area of the EGR valve, \(\eta _{sc}\) is the compensation factor for non-ideal cycles, \(h_{tot}\) is the total heat transfer coefficient of the exhaust pipes and \(A_{vgtmax}\) is the maximum area in the turbine that the gas flows through. From the field prediction of these parameters, we can infer the health of the engine; a higher deviation from their design value may indicate a fault in the system. We are interested in (1) predicting the dynamics of all the variables in Eqs. (3)–(10), and (2) identifying the unknown parameters in the system, given field measurements on \(p_{im}\), \(p_{em}\), \(\omega _t\), and \(W_{egr}\) as well as Eqs. (3)–(10). We refer to the above problem as the inverse problem in this study. Specifically, the following cases are considered for a detailed study:

Case 1:

Prediction of dynamics of the system and identification of 3 unknown parameters \(A_{egrmax}\), \(\eta _{sc}\) and \(h_{tot}\) with clean data of \(p_{im}\), \(p_{em}\), \(\omega _t\), and \(W_{egr}\).

Case 2:

Prediction of dynamics of the system and identification of 3 unknown parameters \(A_{egrmax}\), \(\eta _{sc}\) and \(h_{tot}\) with noisy data of \(p_{im}\), \(p_{em}\), \(\omega _t\), and \(W_{egr}\).

Case 3:

Prediction of dynamics of the system and identification of 4 unknown parameters \(A_{egrmax}\), \(\eta _{sc}\), \(h_{tot}\) and \(A_{vgtmax}\) with clean data of \(p_{im}\), \(p_{em}\), \(\omega _t\), and \(W_{egr}\).

Case 4:

Prediction of dynamics of the system and identification of 4 unknown parameters \(A_{egrmax}\), \(\eta _{sc}, h_{tot}\) and \(A_{vgtmax}\) with noisy data of \(p_{im}\), \(p_{em}\), \(\omega _t\), and \(W_{egr}\).

In the present study, we consider self-adaptive weights9 (discussed in “Methodology” and “Appendix 1”) in our loss function. We study the above four cases using self-adaptive weight. In order to understand the effect and importance of self-adaptive weights in the convergence and accuracy of results, we consider one more case without self-adaptive weight,

Case 5:

Prediction of dynamics of the system and identification of 4 unknown parameters \(A_{egrmax}\), \(\eta _{sc}\), \(h_{tot}\) and \(A_{vgtmax}\) with clean data of \(p_{im}\), \(p_{em}\), \(\omega _t\), and \(W_{egr}\) without self-adaptive weights.

The results of Case 1 and Case 2 are presented in “Appendix 7”. First, we study the results of Case 3 and Case 5 to understand the accuracy and convergence of PINN method and the importance of self-adaptive weights. Then, we study the results of Case 4. The results for Case 3, Case 4 and Case 5 are discussed in “Results and discussions”.

Methodology

We consider to employ the deep learning algorithm, particularly, the physics-informed neural networks (PINNs), to solve the inverse problem discussed in “Problem setup”. To begin with, we first briefly review the basic principle of PINNs, and then we discuss how to employ PINNs for the present inverse problem.

PINNs for inverse problems in the diesel engine

We first briefly review the PINNs2,4,5,6,7 for solving inverse problems, and then we introduce how to employ the PINNs to solve the specific problem that we are of interest for the diesel engine.

Figure 2
figure 2

Schematic of physics-informed neural networks (PINNs) for inverse problems: the left part of the figure, enclosed in the red dashed line, shows a DNN whose input is time. The DNN is to approximate the solution (y) to a differential equation. The top left part of the figure enclosed in the black dashed line shows an DNN whose input is the output y (maybe with other input, e.g. ambient condition). The output of this network is a function g(y). This network is pre-trained with laboratory data of y and g(y). The right part of the figure, enclosed in the blue dashed line, denotes the physics loss/residue. The DNN (enclosed in a red dashed line) approximates the solution to any differential equation, and the equation is encoded using automatic differentiation. The total loss \({\mathcal {L}}(\varvec{\theta })\) includes the loss of equation as well as the data. The \(\lambda _1\) and \(\lambda _2\) are two weights to the data loss and physics loss, which may be fixed or adaptive depending upon the problem and solution method. \(\varvec{\theta } = \{\varvec{W}, \varvec{b}, \varvec{\varLambda }\}\) represents the parameters in DNN, \(\varvec{W}\) and \(\varvec{b}\) are the weights and biases of DNN, respectively and \(\varvec{\varLambda }\) are the unknown parameters of the ODE; \(\sigma \) is the activation function, q(t) is the right-hand side (RHS) of the differential equation (source term), h is the function of the predicted variable, and r is the residual for the equation. \(\varvec{\theta }^P = \{\varvec{W}^P, \varvec{b}^P\}\) represents the parameters in pre-trained neural network, \(\varvec{W}^P\) and \(\varvec{b}^P\) are the weights and biases of the pre-trained neural network.

As illustrated in Fig. 2, the PINN is composed of two parts, i.e., a fully-connected neural network which is to approximate the solution to a particular differential equation and the physics-informed part in which the automatic differentiation3 is employed to encode the corresponding differential equation. Further, \(\varvec{\mathcal {\varLambda }}\) represents the unknowns in the equation, which can be either a constant or a field. In particular, \(\varvec{\mathcal {\varLambda }}\) are trainable variables as the unknowns are constant, but they could also be approximated by a DNN if the unknown is a field. The loss function for solving the inverse problems consists of two parts, i.e., the data loss and the equation loss, which reads as:

$$\begin{aligned} {\mathcal {L}}(\varvec{\theta }; \varvec{\mathcal {\varLambda }}) = \underbrace{\frac{1}{M}\sum ^M_{i=1} |\hat{y}(t_i; \varvec{\theta }) - y(t_i)|^2}_{\text{ data } \text{ loss }} + \underbrace{\frac{1}{N}\sum ^N_{i=1} |r(t_i; \varvec{\theta }; \varvec{\mathcal {\varLambda }})|^2}_{\text{ equation } \text{ loss }} \end{aligned}$$
(11)

where \(\varvec{\theta }\) denotes the parameters in the DNN; M and N represent the number of measurements and the residual points, respectively; \(\hat{y}(t_i; \varvec{\theta })\) denotes the prediction of DNN at the time \(t_i\); \(y(t_i)\) is the measurement at \(t_i\), and \(r(t_i; \varvec{\theta }; \varvec{\mathcal {\varLambda }})\) represents the residual of the corresponding differential equation, which should be zero in the entire domain. By minimizing the loss in Eq. (11), we can obtain the optimal parameters, i.e., \(\varvec{\theta }\), of the DNN as well as the unknowns, i.e., \(\varvec{\mathcal {\varLambda }}\), in the system. In the present study, we have a few empirical equations that we approximate using DNNs. These DNNs are trained first using data and considered in place of these empirical formulae. We fixed the parameters of these networks when we minimized the loss function for the PINN model. Furthermore, note that here we employ the system described by one equation as the example to demonstrate how to use PINNs for solving inverse problems. For the system with more than one equation, we can either utilize an DNN with multiple outputs or multiple DNNs as the surrogates for the solutions to differential equations. In addition, a similar idea can also be employed for systems with multiple unknown fields. The loss function can then be rewritten as

$$\begin{aligned} {\mathcal {L}}(\varvec{\theta }; \varvec{\mathcal {\varLambda }}) = \underbrace{ \sum ^K_{k=1}\left[ \frac{1}{M_k}\sum ^{M_k}_{i=1} |\hat{y}_k(t_i; \varvec{\theta }) - y_k(t_i)|^2\right] }_{\text{ data } \text{ loss }} + \underbrace{\sum ^L_{l=1} \left[ \frac{1}{N_l}\sum ^{N_l}_{i=1} |r_l(t_i; \varvec{\theta }; \varvec{\mathcal {\varLambda }})|^2\right] }_{\text{ equation } \text{ loss }} \end{aligned}$$
(12)

where K and L denote the number of variables that can be measured as well as the equations, respectively; \(M_k\) and \(N_l\) are the number of measurements for the \(k{\text {th}}\) variable and the number of residual points for the \(l{\text {th}}\) equation, respectively; and \(\varvec{\mathcal {\varLambda }}\) collects all the unknowns in the system.

Table 1 Neural network surrogates employed PINNs for solving the inverse problems.

For the inverse problem presented in “Problem setup”, we are interested in (1) learning the dynamics of the six states (2) inferring the unknown parameters in the system, given measurements on \(\{p_{im}, p_{em}, \omega _t, W_{egr}\}\) as well as Eqs. (3)–(10), using PINNs. Specifically, we utilize six DNNs as the surrogates for the solutions to different equations, and the corresponding equations are encoded using the automatic differentiation, as illustrated in Table 1. In addition, the loss for training the PINNs for Case 1 to Case 4 is expressed as follows:

$$\begin{aligned} {\mathcal {L}}(\varvec{\theta }, \varvec{\mathcal {\varLambda }}, \varvec{\lambda }_{p_{im}}, \varvec{\lambda }_{p_{em}}, \varvec{\lambda }_{\omega _t}, \varvec{\lambda }_{W_{egr}}, \lambda _{T_1}) =&{\mathcal {L}}_{p_{im}} + {\mathcal {L}}_{p_{em}} + {\mathcal {L}}_{\omega _{t}} + {\mathcal {L}}_{u_{egr1}} \nonumber \\ {}{} & {} + {\mathcal {L}}_{u_{egr2}} + {\mathcal {L}}_{u_{vgt}} + 10\times {\mathcal {L}}_{x_{r}} + \lambda _{T_1}\times {\mathcal {L}}_{T_{1}} \nonumber \\{} & {} + {\mathcal {L}}^{ini}_{p_{im}} + {\mathcal {L}}^{ini}_{p_{em}} + {\mathcal {L}}^{ini}_{\omega _{t}} + {\mathcal {L}}^{ini}_{\tilde{u}_{egr1}} \nonumber \\{} & {} + {\mathcal {L}}^{ini}_{\tilde{u}_{egr2}} + {\mathcal {L}}^{ini}_{\tilde{u}_{vgt}} + {\mathcal {L}}^{ini}_{x_{r}} + 100\times {\mathcal {L}}^{ini}_{T_{1}} \nonumber \\{} & {} + {\mathcal {L}}^{data}_{p_{im}}(\varvec{\lambda }_{p_{im}}) + {\mathcal {L}}^{data}_{p_{em}}(\varvec{\lambda }_{p_{em}}) \nonumber \\{} & {} + {\mathcal {L}}^{data}_{\omega _t}(\varvec{\lambda }_{\omega _t}) + {\mathcal {L}}^{data}_{W_{egr}}(\varvec{\lambda }_{W_{egr}}), \end{aligned}$$
(13)

where \(\varvec{\theta } = (\varvec{\theta }_1,\ldots , \varvec{\theta }_6)\) are the parameters of all NNs in PINNs, \(\varvec{\mathcal {\varLambda }}\) are the unknown parameters, which will be inferred from the given measurements, \({\mathcal {L}}_{\phi }, ~ \phi = (p_{im}, p_{em}, \omega _t, \tilde{u}_{egr1}, \tilde{u}_{egr2}, \tilde{u}_{vgt}, x_r, T_1)\) are the losses for the corresponding equations, and \({\mathcal {L}}^{data}_{\psi }, ~ \psi = (p_{im}, p_{em}, \omega _t, W_{egr})\) are the losses for the corresponding measurements, and \({\mathcal {L}}^{ini}_{\phi }, ~ \phi = (p_{im}, p_{em}, \omega _t, \tilde{u}_{egr1}, \tilde{u}_{egr2}, \tilde{u}_{vgt}, x_r, T_1)\) are the losses for the initial conditions, \({\lambda }_{T_1}\), \(\varvec{\lambda }_{p_{im}}\), \(\varvec{\lambda }_{p_{em}}\), \(\varvec{\lambda }_{\omega _t}\), and \(\varvec{\lambda }_{W_{egr}}\) are the weights for different loss terms which are used to balance each term in the loss function. In particular, the self-adaptive weight technique proposed in9, which is capable of tuning the weights automatically, is utilized here to obtain the optimal, \({\lambda }_{T_1}\) \(\varvec{\lambda }_{p_{im}}\), \(\varvec{\lambda }_{p_{em}}\), \(\varvec{\lambda }_{\omega _t}\), and \(\varvec{\lambda }_{W_{egr}}\). More details for self-adaptive weights in PINN can be found in “Appendix 1”.

In the Case 5, where we have not considered self-adaptive weights, so the loss function is given as

$$\begin{aligned} \begin{aligned} {\mathcal {L}}(\varvec{\theta }, \varvec{\mathcal {\varLambda }}) =&{\mathcal {L}}_{p_{im}} + {\mathcal {L}}_{p_{em}} + {\mathcal {L}}_{\omega _{t}} + {\mathcal {L}}_{u_{egr1}} \\ {}&+ {\mathcal {L}}_{u_{egr2}} + {\mathcal {L}}_{u_{vgt}} + 10\times {\mathcal {L}}_{x_{r}} + 10^3\times {\mathcal {L}}_{T_{1}} \\&+{\mathcal {L}}^{ini}_{p_{im}} + {\mathcal {L}}^{ini}_{p_{em}} + {\mathcal {L}}^{ini}_{\omega _{t}} + {\mathcal {L}}^{ini}_{\tilde{u}_{egr1}} \\&+ {\mathcal {L}}^{ini}_{\tilde{u}_{egr2}} + {\mathcal {L}}^{ini}_{\tilde{u}_{vgt}} + {\mathcal {L}}^{ini}_{x_{r}} + 100\times {\mathcal {L}}^{ini}_{T_{1}} \\&+ 10^3\times {\mathcal {L}}^{data}_{p_{im}} + 10^3\times {\mathcal {L}}^{data}_{p_{em}} \\&+ 10^3\times {\mathcal {L}}^{data}_{\omega _t} + 10^3\times {\mathcal {L}}^{data}_{W_{egr}}, \end{aligned} \end{aligned}$$
(14)

As for training the PINNs in the present study, we first employ the first-order optimizer, i.e., Adam23, to train the parameters in the NNs, unknowns in the systems as well as the self-adaptive weights for a certain number of steps. We then fix the self-adaptive weight and employed Adam to train the parameters in the NNs, and unknowns in the systems for another certain number of steps. We then switch to the second-order accuracy optimizer, i.e., LBFGS-B, to further optimize the parameters in the NNs and the unknowns in the systems. Note that the self-adaptive weights are optimized at the first training stage of Adam only, and they are fixed during the second training stage of Adam and LBFGS-B training with the values at the end of the first stage of Adam optimization.

Neural network surrogates for empirical formulae

In the mean value engine model proposed in1, empirical formulae, e.g., polynomial functions, are employed for the volumetric efficiency (\(\eta _{vol}\)), effective area ratio function for EGR valve (\(f_{egr}\)), turbine mechanical efficiency (\(\eta _{tm}\)), effective area ratio function for VGT (\(f_{vgt}\)), choking function (for VGT) (\(f_{\Pi _t}\)), compressor efficiency (\(\eta _c\)), and volumetric flow coefficient (for the compressor) (\(\Phi _c\)). Note that these empirical formulae are engine-specific and may not be appropriate for the diesel engines considered in the present study. Deep neural networks (DNNs), which are known to be universal approximators of any continuous function, are thus utilized as more general surrogates for the empirical formulae here. Particularly, we employ six DNNs for the aforementioned variables, and the inputs for each DNN are presented in Table 2.

Table 2 Neural network surrogates for empirical formulae \({\varvec{\mathcal {N}}}_i^{(P)}(\varvec{x}; \varvec{\theta }_i^P)\), \(i= 1,\dots , 6\) denotes the surrogate for the ith DNN parameterized by \(\theta _i^P\) with the input \(\varvec{x}\).

We now discuss the training of the DNNs illustrated in Table 2. In laboratory experiments, measurements on all variables are available. We can then train the neural network surrogates in Table 2 using the data collected in the laboratory. The loss function considered for the training of these networks is

$$\begin{aligned} {\mathcal {L}}_i(\varvec{\theta }_i^P) = \dfrac{1}{n_i}\sum _{j=1}^{n_i} \left[ y_i^{(j)} - \hat{y}_i^{(j)}\right] ^2 = \dfrac{1}{n_i}\sum _{j=1}^{n_i}\left[ y_i^{(j)} - {\varvec{\mathcal {N}}}_i^{(P)}(\varvec{x}_i;\varvec{\theta }_i^P)^{(j)}\right] ^2,\;\;\;\;\; i=1,2,\dots ,6 \end{aligned}$$
(15)

where \(i=1,2,\dots ,6\) are the different neural networks for the approximation of the empirical formulae, \(\varvec{x}_i\) are the input corresponds to the ith network, \(\hat{y}_i\) and \(y_i\) are the output of the ith network and the corresponding labelled values respectively, \(n_i\) is the number of labelled dataset corresponds to the ith neural network. The laboratory data required for calculating labelled data for training each of these networks are shown in Table 3 (in “Data generation”). We discuss the calculation of labelled data from the laboratory data in “Appendix 4”. We train these networks using the Adam optimizer. Upon the training of these DNNs, we will plug them in the PINNs to replace the empirical models, which are represented by the pretrained neural network with the output g(y) in Fig. 2.

Flowchart for PINN model for diesel engine

In “Problem setup”, we discussed the problem setup, and in subsequent sections, we discussed the approximation of different variables using neural networks as well as the basics of the PINN method and the implementation of PINN in the present problem. In Fig. 2, we have shown a schematic diagram along with a pre-trained network for a general ordinary differential equation. In this section, we show a complete flowchart for the calculation of physics loss functions for the engine problem. In Fig. 3, we show the flow chart for calculating the physics-informed loss for the present problem. Note that we have not shown the data loss and the self-adaptive weights in the flow chart.

Figure 3
figure 3

Flow chart for the proposed PINN model for the inverse problem for the engine for prediction of dynamics of the system variables and estimation of unknown parameters. The inputs are input control vector \(\{u_\delta , u_{egr}, u_{vgt}\}\) and engine speed \(n_e\). Six neural network \(\varvec{\mathcal {N}}_i(t;\varvec{\theta })\), \(i=1,2,\dots ,6\) indicated in dashed rectangular oval takes time t as input and predict \(p_{im}\), \(p_{em}\), \(x_r\) and \(T_1\) \(\tilde{u}_{egr1}\), \(\tilde{u}_{egr2}\), \(\omega _{t}\) and \(\tilde{u}_{vgt}\) as shown in Table 1. Four unknown parameters indicated in hexagon are \(\eta _{sc}\), \(h_{tot}\), \(A_{egrmax}\) and \(A_{vgtmax}\). Six pre-trained neural networks \(\varvec{\mathcal {N}}_i^{(P)}(.;\varvec{\theta })\), \(i=1,2,\dots ,6\) indicated in dashed-dotted rectangular oval takes appropriate input and predict the empirical formulae as shown in Table 2. The parameters (weights and biases) of these pre-trained DNNs are kept fixed to predict the empirical formulae. There are eight main blocks calculating different variables. The equations for the calculation of each of the quantities are shown in Appendix 6. Cylinder flow: calculates \(W_{ei}\), \(W_f\) and \(W_{eo}\) using Eqs. (24), (26) and (25), respectively. Cylinder temperature: calculates \(x_v\), \(x_p\), \(T_e\) and \(T_{em}\) using Eqs. (33), (32), (28) and (35), respectively. \(h_tot\) and \(\eta _{sc}\) are considered as learnable parameters in the calculation of \(T_{em}\) and \(T_e\) respectively. EGR dynamics: calculated \(\tilde{u}_{egr}\) using Eq. (38). EGR flow: calculates EGR mass flow \(W_{egr}\) using Eq. (39). \(A_{egrmax}\) is considered as learnable parameter. Compressor flow: calculates compressor mass flow \(W_c\) using Eq. (68). Compressor power: calculates compressor power \(P_c\) using Eq. (58). Turbine flow: calculates turbine mass flow \(W_t\) using Eq. (46). \(A_{vgtmax}\) is considered as trainable parameter. Turbine power: calculates effective turbine power \(P_t\eta _m\) using Eq. (51). There are five blocks, which calculate the residual of the equation. The first block calculates the residual for state equations for \(p_{im}\) and \(p_{em}\); the second one calculates the residual for the equations of \(x_r\) and \(T_1\); the third block calculates the residuals for state equations for \(\tilde{u}_{egr1}\) and \(\tilde{u}_{egr2}\); the fourth block calculates the residual for the state equation for \(\tilde{u}_{vgt}\), and the fifth block calculates the residual for the state equation for \(\omega _t\). There are another two blocks, which calculate the physics loss. The first one calculates the physics loss corresponding to state variable \(p_{im}\), \(p_{em}\) and \(\omega _t\). The second block calculates state physics loss corresponding to state variables \(\tilde{u}_{egr1}\), \(\tilde{u}_{egr2}\), \(\tilde{u}_{vgt}\) and physics loss corresponding to \(x_r\) and \(T_1\). The data losses can be calculated from the variables calculated from the appropriate blocks.

Data generation

We now discuss the generation of data for training the NNs utilized in this study. Specifically, we have mainly two different types of data here: (1) the data collected from the laboratory that are used to train the DNN surrogates to replace the empirical formulae used in1; and (2) field data \(p_{im}\), \(p_{em}\), \(\omega _t\) and \(W_{egr}\).

In laboratory experiments, we have measurements on state variables, some of which can be employed for training the neural network surrogates for the empirical formulae. The laboratory data required to calculate the labelled data for each of the surrogates are shown in Table 3. The calculation of labelled data from laboratory data is discussed in “Appendix 4”. After training, we consider these pre-trained surrogates in field experiments in place of the empirical formulae. The parameters of these networks are kept constant in the PINNs model for the field experiment.

In field experiments, we have records for the four inputs, i.e., \(\varvec{u} = \{u_\delta ,\;\; u_{egr}, \;\; u_{vgt}\}\) and \(n_e\). In addition, we only have measurements on four variables in field experiments, i.e., the intake manifold pressure (\(p_{im}\)), exhaust manifold pressure (\(p_{em}\)), turbine speed (\(\omega _t\)) and EGR mass flow (\(W_{egr}\)).

In both the laboratory and field experiments, we have the records for the inputs (i.e., \(\varvec{u} = \{u_\delta ,\;\; u_{egr}, \;\; u_{vgt}\}\) and \(n_e\)) from an actual engine running conditions. Considering that we only have a certain number of records for the variables in the running engine, which cannot be used to verify our PINN model since our objective is to use it to predict the whole gas flow dynamics in the engine. We, therefore, take the records for the real inputs (i.e., \(\varvec{u} = \{u_\delta ,\;\; u_{egr}, \;\; u_{vgt}\}\) and \(n_e\)) and employ them as the inputs for the governing equations Eqs. (3)–(8). We then solve these equations using Simulink22 to obtain the dynamics for all variables. We use the data from Simulink to mimic the real-world measurements, which are employed as the training data for PINNs and the DNN for the pre-trained networks. The remaining data are used as the validation data to test the accuracy of PINN for reconstructing the gas dynamics in a running engine, given partial observations. Given that the real measurements are generally noisy, we add \(3\%\), \(3\%\), \(1\%\) and \(10\%\) Gaussian noise in \(p_{im}\), \(p_{em}\), \(\omega _t\) and \(W_{egr}\), respectively. These different signals have different noise values because they are different measurements with different noise characteristics.

Table 3 List of empirical formulae represented using a pre-trained neural network and lab test data required for their training \(^{\dagger }\).

In the present study, we consider two sets of input data in the training and testing of the surrogate neural networks for the empirical formulae. The first set of data (Set-I) is two (2) h of data collected at a sampling rate of 1 s. This control input vector \(\{u_\delta , u_{egr}, u_{vgt}\}\) and \(n_e\) are considered to generate simulated data with different ambient conditions, which are shown in Table 4. The second set of data (Set-II) is twenty-minute (20 min) data collected at a sampling rate of 0.2 s. This control input vector \(\{u_\delta , u_{egr}, u_{vgt}\}\) and \(n_e\) are considered to generate simulated data with Case-V ambient conditions.

The labelled data for the training of surrogate neural network for \(\eta _{vol}\), \(F_{vgt,\Pi _t}\), \(\eta _c\) and \(\Phi _c\) are generated for Case-I to Case-IV with a \(dt = 0.2\) s. The testing data are generated for Case-V with the same dt. We observed from the engine model that the EGR valve actuator is independent of the other system of the engine and depends only on the EGR control signal (\(u_{egr}\)). Thus, for the training of surrogate neural network for \(f_{egr}\) (\({\varvec{\mathcal {N}}}_2^{(P)}(:,\varvec{\theta })\)), we consider the training data set corresponding to Case-I only and the testing data set corresponding to Case-V. The labelled data for \(\eta _{tm}\) are calculated from Eq. (44) (“Turbocharger”), which is a differential equation, thus requires a finer dt. The simulated data for the calculation of labelled \(\eta _{tm}\) are generated with \(dt = 0.025\) s in all the Cases. We assume that the Set-I data for the input control vector includes a good operating range for training surrogate neural networks for the empirical formulae. The field data (\(p_{im}\), \(p_{em}\), \(\omega _t\) and \(W_{egr}\)) for the inverse problem are considered from Case-V.

Table 4 Ambient conditions for training and testing of neural networks: the different ambient conditions are considered for generating training and testing data.

Results and discussions

In this section, we demonstrate the applicability of proposed PINNs for solving the inverse problems discussed in “Problem setup”. Case 1 and Case 2 have three unknown parameters, while Case 3 to Case 5 have four unknown parameters. The predicted values of the unknowns for all five cases are shown in Table 5. In this section, we will discuss the results of Case 3 to Case 5. The results of Case 1 and Case 2 are presented in “Appendix 7”.

Table 5 Predicted unknowns: predicted unknown parameters for different cases considered.

First, we study the results of Case 3 and Case 5 to understand the applicability of PINN and the importance of self-adaptive weight in accuracy and convergence. Then, we study the results of Case 4, which is similar to Case 3; however, with added noise in the field data considered. We also discuss the results for the surrogate for the empirical formulae in Appendix 12. Note that the results for all variables are presented in a normalized scale from zero to one using the following equation,

$$\begin{aligned} x_{scale} = \dfrac{x - x_{min}}{x_{max} - x_{min}}, \end{aligned}$$
(16)

where x and \(x_{scale}\) are the data before and after scaling, respectively, \(x_min\) is the minimum value of true data of x within the time span considered, \(x_{max}\) is the maximum value of true data of x within the time span considered.

We are considering the input control vector \(\{u_\delta , u_{egr}, u_{vgt}\}\) and engine speed \(n_e\) from an actual field record. It is assumed that these data have inherent noise in their records. Detailed studies are carried out considering a 1-min duration. The number of residual points considered in the physics-informed loss and data loss is 301 at equal \(dt=0.2\) s. The initial conditions considered for \(\{p_{im},\;\; p_{em},\;\; x_r,\;\; T_1,\;\; \tilde{u}_{egr1},\;\; \tilde{u}_{egr2},\;\; \omega _t, \;\;\tilde{u}_{vgt}\}\) are \(\{8.0239\times 10^4,\;\; 8.1220\times 10^4,\;\; 0.0505,\;\; 305.3786,\;\; 18.2518,\;\; 18.1813, \;\; 1.5827\times 10^3,\;\; 90.0317\}\) respectively. The measured field data are also considered for 1 min with equal \(dt = 0.2\) s. Thus, each of the measured field quantities has 301 records.

The details of the neural networks considered for the PINN problem are shown in Table 6. We consider \(\sigma (\cdot )={tanh}(\cdot )\) activation function for hidden layers for all the neural networks. We would also like to emphasize that the scaling of output is one of the important considerations for faster and an accurate convergence of the neural network. Furthermore, output transformation is another important consideration. The outputs are physical quantity and always positive. The governing equations are valid only for positive quantities (e.g. in Eq. 39, a negative \(p_{im}\) will result in negative \(W_{egr}\)). The output transformation will ensure that the predicted quantities are always positive in each epoch. Similarly, as shown in Table 5, the mask for the unknown parameters will ensure a positive value. We also observed that the unknown parameters are of different scales. The scale considered for the unknown parameters will ensure the optimization of these parameters is on the same scale. The parameters of the neural network are optimized first using Adam optimized in Tensorflow-1 with \(200\times 10^3\) epoch and further with LBFGS-B optimized. It is also important to note that we have considered self-adaptive weights in the proposed method; thus, we considered different optimizers for each set of self-adaptive weights. Further, self-adaptive weights are optimized only during the process of Adam optimization up to \(100\times 10^3\) epoch. After \(100\times 10^3\) epoch and during the process of optimization using LBFGS-B, the self-adaptive weights are considered constants with the values at \(100\times 10^3\) epoch of Adam optimization. The sizes of self-adaptive weight are \(301\times 1\) for \(\varvec{\lambda }_{p_{im}}\), \(\varvec{\lambda }_{p_{em}}\), \(\varvec{\lambda }_{\omega _{t}}\) and \(\varvec{\lambda }_{W_{egr}}\). The size of self adaptive weight of \(\varvec{\lambda }_{T_{1}}\) is \(1\times 1\). Softplus masks are considered for all the self-adaptive weights.

Table 6 Details of neural network for PINNs: details of neural networks considered to approximate the state variables and \(T_1\) and \(x_r\).

PINN for the inverse problem with four unknown parameters

Results for Case 3 and Case 5

We first consider Case 3 and Case 5 in which we have four unknown parameters \(A_{egrmax}\), \(\eta _{sc}\), \(h_{tot}\) and \(A_{vgtmax}\). The dynamics of \(p_{im}\), \(p_{em}\), \(\omega _t\) and \(W_{egr}\) can be obtained from the corresponding sensor measurements. We then employ the PINN to predict the dynamics for the variables and infer the four unknowns in the system. The difference between the two cases is that in Case 3, we have considered self-adaptive weights, while in Case 5, we have not considered self-adaptive weights. We consider these two cases to study the applicability of PINN and the importance of self-adaptive weights in the present problem.

The predicted output from the neural networks, i.e., the states and \(T_1\) and \(x_r\) are shown in Fig. 4. The predicted values of the unknown parameters are shown in Table 5. We observe that the predicted states are in good approximation with the true value in both cases. However, the predicted \(T_1\) and \(x_r\) are not in good agreement with the true value. We study the effect of \(T_1\) and \(x_r\) on the other variables by comparing the predicted dynamics of \(T_{e}\) and \(T_{em}\) (ref Eqs. (28) and (35)). We also note that \(T_e\) depends on the unknown \(\eta _{sc}\) and \(T_{em}\) depends on unknowns \(T_e\) and \(h_{tot}\). The predicted dynamics of \(T_e\) and \(T_{em}\) are shown in Fig. 5b,c, respectively. We observe that both \(T_e\) and \(T_{em}\) show somewhat good agreement even \(T_1\) and \(x_r\) do not match with the true value. The accuracy is more in Case 3 compared to Case 5. We believe that the difference in the true value and the predicted value is due to the error in the predicted value of unknown parameters. We also study the dependent variables \(A_{egr}\) and \(W_t\) of unknown \(A_{egrmax}\) and \(A_{vgtmax}\), and are shown in Fig. 5a,d, respectively. We observe that in Case 3, the predicted dynamics for both variables show good agreement with true value. However, in Case 5, the \(A_{egr}\) does not show good agreement with true value. This is because the predicted value of \(A_{egrmax}\) has more error than Case 3.

Figure 4
figure 4

Predicted states and \(T_1\) and \(x_r\) for Case 3 and Case 5: predicted dynamics of the state variables of the engine and \(T_r\) and \(x_1\) for Case 3 (PINN with self-adaptive weights) and Case 5 (standard PINN without self-adaptive weights). The variables are scaled using Eq. (16). It can be observed that the predicted dynamics of the states are in good agreement with the true values. However, \(T_1\) and \(x_r\) do not match with the true value. We study the dependent variables of these two variables, and are shown in Fig. 5.

Figure 5
figure 5

Predicted dynamics of dependent variables for Case 3 and Case 5: predicted dynamics of \(A_{egr}\), \(T_e\), \(T_{em}\) and \(W_t\) for Case 3 and Case 5. These variables depend on the unknown parameters \(A_{egrmax}\), \(\eta _{sc}\), \(h_{tot}\) and \(A_{vgtmax}\) respectively. We also note that \(T_e\) depends on \(T_1\) and \(x_r\).

In order to study the importance of self-adaptive weights, we study the convergence of the unknown parameters for both cases with self-adaptive weight (Case 3) and without self-adaptive weight (Case 5). The convergences of the unknown parameters with epoch for both cases are shown in Fig. 6. In Case 3 (with self-adaptive weights), we can observe that the unknown parameters converge faster and are more accurate. Furthermore, we also study the effect of different initialization of network parameters for PINN and self-adaptive weights. We run the PINN model for Case 3 and Case 5 with different initialization of parameters of PINN (DNN and unknown parameters) and self-adaptive weight keeping other hyperparameters (number of epoch considered, learning rate scheduler etc.) the same. The results for both cases are shown in Fig. 7. It is observed that for unknowns, \(\eta _{sc}\) and \(A_{vgtmax}\) for both cases show similar accuracy. However, for unknowns, \(A_{egrmax}\) and \(h_{tot}\), Case 3, which is with self-adaptive weights, shows better accuracy than Case 5 (without self-adaptive weights) for all the runs. In Fig. 8, we show the self-adaptive weights for \(p_{im}\), \(p_{em}\), \(\omega _{t}\) and \(W_{egr}\) after \(100\times 10^3\) epoch (constant value after \(100\times 10^3\) epoch). Thus, we conclude that self-adaptive weights are important for better accuracy and convergence for the present problem.

Figure 6
figure 6

Convergence of the unknown parameters for Case 3 and Case 5: Convergence of the unknown parameters with epoch for Case 3 (PINN with self-adaptive weights) and Case 5 (standard PINN without self-adaptive weights). It is observed that Case 3 converges faster and also shows better accuracy.

Figure 7
figure 7

Predicted unknown parameters for Case 3 and Case 5: predicted unknown parameters for Case 3 (PINN with self-adaptive weights) and Case 5 (standard PINN) when prediction is made multiple times with different initialisation of the parameters of PINN, self-adaptive weights, and the unknown parameters Black dashed line → true value, Blue dots → Case 3, Red dots → Case 5.

Figure 8
figure 8

Self-adaptive weights for Case 3: self-adaptive wights for \(p_{im}\), \(p_{em}\), \(\omega _{t}\) and \(W_{egr}\) after \(100\times 10^3\) epoch of Adam optimization. The values of the self-adaptive weights after \(100\times 10^3\) Adam optimization and LBFGS-B optimization are constant with the values of self-adaptive weight at \(100\times 10^3\) epoch.

Results for Case 4: four unknowns with noisy measurement data

In the previous section, we have shown the effectiveness of PINN and the importance of self-adaptive weights. In this section, we test the robustness of the proposed PINN formulation for predicting the gas flow dynamics of the diesel engine given noisy data. In particular, we are considering Case 4 (the same Case 3 but noisy measure data), in which we have four unknown parameters \(A_{egrmax}\), \(\eta _{sc}\), \(h_{tot}\) and \(A_{vgtmax}\), with noise measurement of \(p_{im}\), \(p_{em}\), \(\omega _{t}\), \(W_{egr}\).

We contaminate the training data \(p_{im}\), \(p_{em}\), \(\omega _{t}\), \(W_{egr}\) considered in Case 3 with Gaussian noise and consider these as synthetic field measurements. We present the predicted dynamics of the known data in Fig. 9a–d and unknown parameters in Table 5. We observe that the dynamics of the predicted \(p_{im}\), \(p_{em}\), \(\omega _{t}\), \(W_{egr}\) matches with the reference solution. However, in the case of \(W_{egr}\), there is a small discrepancy in the predicted values near 20–25 s, which we can attribute to over-fitting caused by the noisy training data. We study the dynamics of \(T_{em}\), \(W_{ei}\) \(A_{egr}\) and \(W_t\) on which we do not have any measured data, and these are shown in Fig. 9e–h. We observe that \(W_{ei}\) matches with the reference results. Most of the dynamics of \(A_{egr}\) and \(W_t\) match with the reference solution. The mismatch in these two variables may also be attributed to over-fitting caused by noisy data. The profile of \(T_{em}\) matches with the reference solution, however, it is not an exact match with the reference solution. This is because of the error in the predicted value of unknown parameter \(\eta _{sc}\) and \(h_{tot}\). We also note that in the present study, we do not have any temperature measurements of field data. Thus, we expect errors in the predicted temperature measurements and unknown parameters. We also study the convergence of the unknown parameters with epoch and shown in Fig. 10. We note that, in this case, we consider the same hyperparameters in the optimization process. In some cases, we see over-feeting due to noisy data. This may be controlled by changing the hyperparameters, specially the learning rate for the self-adaptive weights.

Figure 9
figure 9

Predicted dynamics for variables for Case 4: predicted dynamics of (ad) variables whose noisy field measurements are known. (eh) dynamics of other important variables, which are also dependent on the unknown parameters. These results are for Case 4 with 4 unknown parameters and noisy field measurements.

Figure 10
figure 10

Convergence of the unknown parameters for Case 4: convergence of the unknown parameters with epoch for Case 4 (PINN with self-adaptive weights and noisy field data).

We also study the prediction of empirical formulae in this case and shown in Fig. 11. We observe that \(\eta _{vol}\) and \(\eta _c\) match with the reference solution. These two quantity gives the volumetric efficiency of the cylinder and the efficiency of the compressor. The other four quantities (\(f_{egr}\), \(f_{vgt}\times f_{\Pi _t}\), \(\eta _{tm}\), \(\Phi _c\)), also match most of its points. The discrepancy can be attributed to the noisy measurement of field data.

Figure 11
figure 11

Empirical formulae for Case 4: the prediction of empirical formulae for Case 4 (PINN with self-adaptive weights and noisy field data).

Summary

In this study, we proposed a PINNs-based method for estimating unknown parameters and predicting the dynamics of variables of a mean value diesel engine with VGT and EGR, given the measurement of a few of its variables. Specifically, we know field data of intake manifold pressure (\(p_{im}\)), exhaust manifold pressure (\(p_{em}\)), turbine speed (\(\omega _{t}\)) and EGR flow (\(W_{egr}\)). We predicted the dynamics of the system variables and unknown parameters (\(A_{egrmax}\), \(\eta _{sc}\), \(h_{tot}\) and \(A_{vgtmax}\)). The input data for the study are considered from actual engine running conditions and show good accuracy in predicted results. We also studied the importance of self-adaptive weight in the accuracy and convergence of results. Furthermore, we also showed how we could approximate empirical formulas for different quantities using neural networks and train them. We believe the proposed method could be considered for an online monitoring system of diesel engines. The field-measured data are collected using individual sensors. Thus, in the event of sensor failure or erroneous data, the method may give erroneous results. The method also does not consider a failure of engine components, e.g. leakage in the EGR valve. We considered the engine model proposed in1. In the present study, we do not have any temperature measurement of field data, and thus we expect errors in predicted temperature measurements and unknown parameters as observed. Future research may include modelling of failure of engine components. Since the proposed PINN consider online training, with change in input data, field measured data or ambient condition, the PINN networks are required to train again. The accuracy of the results also depends on the size of neural networks and the optimization strategy (e.g. optimizer, learning rate scheduler) considered. For example, a large neural network or higher value in learning rate may result in overfitting of predicted results. The activation function also plays an important role in the accuracy and computational cost24. Further study may include a neural architecture search for optimal network sizes considering different operational ranges. Future studies may include a more robust and efficient PINN method for the problem that can be used with edge systems, including proper transfer learning strategies to reduce the computation cost. As there is noise in the measured data, the future study in this regard may also be towards uncertainty quantification of the predicted dynamics and unknown parameters.