Machine learning-based rapid response tools for regional air pollution modelling

A parameterised non-intrusive reduced order model (P-NIROM) based on proper orthogonal decomposition (POD) and machine learning methods has been developed for the first time for model reduction of pollutant transport equations. Our motivation is to provide rapid response urban air pollution predictions and controls. The varying parameters in the P-NIROM are pollutant sources. The training data sets are obtained from the high fidelity modelling solutions (called snapshots) for selected parameters (here, pollutant sources) over the parameter space $\mathbb{R}^P$. From these training data sets, the machine learning method is used to generate the relationship between the reduced solutions and inputs (pollutant sources) over $\mathbb{R}^P$. Furthermore, a set of hyper-surface functions associated with each POD basis function is constructed for representing the fluid dynamics over the reduced space. The accuracy of the P-NIROM is highly dependent on the quality of the training set, here obtained from the high fidelity model. Compared with existing machine learning methods, the P-NIROM algorithm proposed here has the advantages that (1) it is combined with NIROM, thus providing rapid and reasonably accurate solutions; and (2) it is a robust and efficient approach for representation of any parameterised partial differential equations as the model parameters/inputs vary. In this study, we demonstrate how to implement the P-NIROM for the pollutant transport equation (although, owing to its robustness, the approach is not limited to this equation). Its predictive capability is illustrated in a three-dimensional (3-D) simulation of power plant plumes over a large region in China, where the varying parameters are the emission intensities at three locations. Results indicate that, in comparison to the high fidelity model, the CPU cost is reduced by up to five orders of magnitude while reasonable accuracy is retained.


Introduction
Pollution in cities has a strong impact on the health of communities and affects global warming, with dire consequences for humanity. The dynamic and pollutant transport processes involve a wide range of scales. These highly disparate scales pose a formidable challenge for atmospheric and air pollution modelling. In recent years, the spatial resolution of operational air pollution models has been increased significantly, thus improving predictive capability. However, this unavoidably leads to an increase in computational cost (Foley et al., 2014). Our motivation is to develop numerical tools for rapid responses/predictions of pollutants without sacrificing solution accuracy, especially in emergency situations.
Reduced-order models (ROMs) have become important to many fields as they offer the potential to simulate dynamical systems with considerably reduced computational cost in comparison to high fidelity models (Cordier et al., 2013; Haasdonk, 2017; Benner et al., 2015; Hinze and Volkwein, 2005). Recently, reduced order methods have been applied to studies of air pollution (Djouad and Sportisse, 2003; Hammond et al., 2018; Alkuwari et al., 2013; Fang et al., 2014). Existing ROMs can be classified into two categories, intrusive and non-intrusive, according to whether the implementation of the ROM requires knowledge of the details of the original numerical source code (Chen, 2012). The intrusive reduced order methods have been widely used in many fields (Schlegel and Noack, 2015; Osth et al., 2014; Amsallem and Farhat, 2012; Franca and Frey, 1992; Chaturantabut and Sorensen, 2010; Feriedoun and Alireza, 2012; Xiao et al., 2014; Xiao et al., 2013). More recently, the non-intrusive methods have become popular since they are less dependent on complex dynamic systems and are therefore easy to implement even when the numerical source code is not available.
Wang et al. (2018) introduced a deep learning technique to NIROMs and applied it to fluid problems. Deep learning technologies represent the most recent progress in artificial neural networks (LeCun et al., 2015), and have been applied to a number of areas such as speech recognition (Hinton et al., 2012), image recognition (Tompson et al., 2014), medical science (Leung et al., 2014), self-driving cars (Hadsell et al., 2009) and language understanding (Collobert et al., 2011). In this work, we have developed a Parameterised NIROM (P-NIROM) based on machine learning techniques for parameterised pollutant transport problems. The input parameters are the emission intensities of pollutants released at different source locations. The P-NIROM enables rapid simulations and controls of the impact of pollutant sources without excessive computational costs. Given a set of selected pollutant sources $\mathbf{Q}_{tr}$ over the parameterised space $\mathbb{R}^P$, the training data sets (also called solution snapshots) can be obtained by running the high fidelity model. From the snapshot solutions, the corresponding reduced basis functions are calculated using singular value decomposition (SVD)/POD. The reduced basis functions are used for constructing the reduced space. The original high fidelity model can be projected onto the reduced space, whose dimension is several orders of magnitude smaller than that of the high fidelity full model, thus significantly reducing the computational cost. For any unseen emission intensity of pollutant sources $\mathbf{Q} \in \mathbb{R}^P$, the P-NIROM is constructed using the machine learning methods. From the training solution snapshots, a Gaussian process is used for generating the snapshots and POD basis functions for the unseen pollutant sources $\mathbf{Q}$. Furthermore, the relationship (P-NIROM) between the reduced solutions and the inputs (the pollutant emission intensities) can be obtained using the machine learning techniques. Finally, the solutions from the P-NIROM are projected back onto the full space.
The P-NIROM is a robust and efficient numerical tool for rapid prediction of pollutants released from different sources and assessment of their impact on specified cities/locations. In this work, we have successfully applied the P-NIROM to air pollution simulations over a large region in China which covers 55 cities including Beijing. The efficiency and accuracy of the P-NIROM have been evaluated by comparing the results with those from the high fidelity full model.
The remainder of this article is arranged as follows. The pollutant transport equation and its discretisation are described in section 2. In section 3, the details of forming the P-NIROM using POD and machine learning methods are provided. Section 4 presents a numerical experiment simulating the spatial and temporal distribution of pollutants released from 100 power plants in China. Conclusions are drawn in section 5.

Pollutant transport equation and its discretisation
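The dispersion of the tracer concentration ($c$) is modelled by:
$$\frac{\partial c}{\partial t} + \mathbf{u} \cdot \nabla c - \nabla \cdot (\kappa \nabla c) = Q, \tag{1}$$
where $\mathbf{u}$ is the velocity vector, $Q$ is a source term and $\kappa$ the diffusivity.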
In general, the discretised form of (1) at each time level $n$ (where a time interval of $\Delta t$ is set during the simulation period) can be written:
$$\mathbf{M}(\mu)\, \mathbf{c}^n = \mathbf{s}(\mu), \tag{2}$$
where $\mathbf{M}$ is the full numerical operator with varying input parameter $\mu$, $\mathbf{c}^n = (c_1^n, \ldots, c_N^n)^T$ is the vector of tracer concentrations at time level $n$ ($N$ is the number of nodes in the computational domain), and $\mathbf{s}$ includes the source term, boundary conditions and the variable solutions from the previous time level. In this study, the varying input parameters in air pollutant problems are set to be the pollutant sources, $\mu = \mathbf{Q} = (Q_1, \ldots, Q_S)$ (here, $S$ is the number of pollutant sources).
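To make the structure of the discretised system concrete, the sketch below assembles a linear system of the same shape, operator times new solution equals right-hand side, for a one-dimensional analogue of the transport equation. It is a toy under stated assumptions, not the discretisation used by the high fidelity model: the periodic grid, the backward-Euler time stepping, the first-order upwind scheme and the names `assemble_system` and `run` are all illustrative choices. In this toy the source enters only through the right-hand side.

```python
import numpy as np

def assemble_system(c_prev, u, kappa, q, dx, dt):
    """Backward-Euler, first-order upwind discretisation of
    dc/dt + u dc/dx - kappa d2c/dx2 = q on a periodic 1-D grid (u >= 0).
    Returns (M, s) such that M @ c_new = s."""
    n = c_prev.size
    adv = u * dt / dx           # Courant number
    dif = kappa * dt / dx**2    # diffusion number
    M = np.zeros((n, n))
    for i in range(n):
        M[i, i] += 1.0 + adv + 2.0 * dif
        M[i, (i - 1) % n] += -(adv + dif)   # upwind neighbour
        M[i, (i + 1) % n] += -dif           # downwind neighbour
    # Right-hand side: previous solution plus the source contribution.
    s = c_prev + dt * q
    return M, s

def run(q, n_steps, u=1.0, kappa=0.1, dx=1.0, dt=0.5):
    """March the toy model forward from a zero initial concentration."""
    c = np.zeros_like(q, dtype=float)
    for _ in range(n_steps):
        M, s = assemble_system(c, u, kappa, q, dx, dt)
        c = np.linalg.solve(M, s)
    return c

# Hypothetical point source at node 10 of a 50-node grid.
q = np.zeros(50)
q[10] = 2.0
c_final = run(q, n_steps=20)
```

Because the toy problem is linear in the source, doubling `q` doubles the solution, which is the kind of parameter dependence the reduced model later has to capture.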

Parameterised reduced order transport equation
In this work, the POD approach in combination with machine learning techniques is used for model reduction. POD has proven to be a powerful tool for circumventing the intensive computational burden in large complex numerical simulations. POD is capable of representing large complex dynamical systems using a small number of optimal basis functions. In POD reduced order modelling, the tracer concentration in (2) can be expressed as an expansion of the POD basis functions $\Phi_m$:
$$\mathbf{c}^n = \sum_{m=1}^{M} c_m^{r,n}\, \Phi_m = \Phi\, \mathbf{c}^{r,n}, \tag{3}$$
where $\Phi = (\Phi_1, \ldots, \Phi_M)$ and $\mathbf{c}^{r,n} = (c_1^{r,n}, \ldots, c_M^{r,n})^T \in \mathbb{R}^M$ is the reduced state variable vector (the superscript $r$ indicates the variable associated with the reduced order model) to be determined over the reduced space. The POD basis functions are constructed from a collection of snapshots that are taken from the high fidelity model solution (2) for the selected training pollutant sources. Using SVD, a set of orthogonal basis functions $\{\Phi_m\}$ can be obtained in an optimal way. The POD basis functions can represent the dynamics of snapshot solutions. The loss of information due to the truncation of the POD expansion set to $M$ vectors can be quantified by the following ratio,
$$E = \frac{\sum_{m=1}^{M} \lambda_m}{\sum_{m=1}^{I} \lambda_m}, \tag{4}$$
where $\lambda_m$ denotes the eigenvalues, and $I$ is the total number of eigenvectors (here equivalent to the number of solution snapshots used for generating the POD basis functions). The value of $E$ tends to 1 as $M$ is increased to the value $I$, which would imply no loss of information. A small number of leading eigenvectors can represent most of the dynamical energy within the solution snapshots. Projecting (2) from the $N$-dimensional space onto the $M$-dimensional reduced space ($M \ll N$) yields:
$$\Phi^T \mathbf{M}(\mu)\, \Phi\, \mathbf{c}^{r,n} = \Phi^T \mathbf{s}(\mu). \tag{5}$$
The parameterised reduced order model can thus be written as:
$$\mathbf{M}^r(\mu)\, \mathbf{c}^{r,n} = \mathbf{s}^r(\mu), \tag{6}$$
where $\mathbf{M}^r = \Phi^T \mathbf{M} \Phi$ and $\mathbf{s}^r = \Phi^T \mathbf{s}$. Equations (3) and (6) can be used for efficient air pollution operational prediction where the CPU time can be reduced by several orders of magnitude. In this work, the parameter set $\mu$ in (6) consists of the pollutant source inputs. A recently developed NIROM (Wang et al., 2018) is extended to construct the parameterised reduced order model in (6). The P-NIROM based on the machine learning techniques described below is capable of predicting problems with unseen or different parameters (for example, unseen pollutant sources). It is also non-intrusive and independent of the original source code.
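The projection step can be illustrated with a small NumPy sketch. It builds a POD basis from synthetic snapshots by SVD, projects a stand-in full operator and right-hand side onto the reduced space, solves the small reduced system and maps the result back to the full space. The matrix `A`, the synthetic snapshots and the function names `pod_basis` and `project_system` are assumptions made for illustration; in the non-intrusive setting of this paper the projected operator is not formed explicitly, so this only shows what the reduced system represents.

```python
import numpy as np

def pod_basis(snapshots, n_modes):
    """POD basis from an (N x I) snapshot matrix via thin SVD.
    Returns the first n_modes left singular vectors and all singular values."""
    U, sing, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :n_modes], sing

def project_system(Phi, M_full, s_full):
    """Galerkin projection of M_full @ c = s_full onto span(Phi)."""
    return Phi.T @ M_full @ Phi, Phi.T @ s_full

rng = np.random.default_rng(0)
N, I, n_modes = 200, 30, 5

# Synthetic snapshots with exact rank n_modes (illustrative data only).
snapshots = rng.standard_normal((N, n_modes)) @ rng.standard_normal((n_modes, I))
Phi, sing = pod_basis(snapshots, n_modes)

# Stand-in full operator and a right-hand side whose solution lies in span(Phi).
A = np.eye(N) + 0.01 * rng.standard_normal((N, N))
c_true = snapshots[:, 0]
s_full = A @ c_true

A_r, s_r = project_system(Phi, A, s_full)   # n_modes x n_modes system
c_r = np.linalg.solve(A_r, s_r)             # reduced coefficients
c_rec = Phi @ c_r                           # back to the full space
```

Here the reduced system has 5 unknowns instead of 200, and the reconstruction is exact up to round-off only because the synthetic solution lies in the span of the basis; for real snapshots the truncation error is governed by the energy ratio.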

Construction of P-NIROM based on POD and machine learning methods
The parameterised reduced order model (6) is re-written for the variable $c_m^r$ associated with each POD basis function $\Phi_m$ over the reduced space in a general form:
$$c_m^{r,n} = F_m(\mathbf{c}^{r,n-1}, \mu), \quad m \in \{1, \ldots, M\}. \tag{7}$$
In non-intrusive reduced order modelling, one searches for a set of functions $F_m$ to represent the dynamics in (7). In this work, we introduce Gaussian process regression (GPR) (Rasmussen, 2004) and deep learning methods (Wang et al., 2018) to construct the relationship functions $F_m$ to represent the fluid dynamics of system (6) for any unseen input parameter $\mu = \mathbf{Q} \in \mathbb{R}^P$.

Gaussian process regression for calculation of POD coefficient and snapshot solutions for any input over the parameter space
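In GPR, the relationship between the input $\mu = \mathbf{Q}$ (here, the pollutant source) and the corresponding output $c^n$ at each time level $n$ can be expressed as follows (Rasmussen, 2004):
$$c^n = g^n(\mathbf{Q}) + \epsilon^n, \tag{8}$$
where $\epsilon^n \sim \mathcal{G}(0, \sigma_n)$ is the Gaussian distribution with zero mean and variance $\sigma_n$. In GPR, it is assumed that the function $g^n(\mathbf{Q})$ has a Gaussian distribution (with zero mean, here):
$$g^n(\mathbf{Q}) \sim \mathcal{G}\big(0, k^n(\mathbf{Q}, \mathbf{Q}')\big),$$
where the covariance function $k^n$ represents the dependency between the function values at two different input points $\mathbf{Q}$ and $\mathbf{Q}'$, that is,
$$k^n(\mathbf{Q}, \mathbf{Q}') = w_n \exp\left(-\frac{\|\mathbf{Q} - \mathbf{Q}'\|^2}{2 l^2}\right),$$
where $l$ is the length scale and $w_n$ is the variance. The correlation between the functions $g^n(\mathbf{Q})$ is dependent on the distance between the two input points. Given a set of training input-output pairs $(\mathbf{Q}_{tr,i}, c^n_{tr,i})$, $i = 1, \ldots, N_{tr}$ (where $N_{tr}$ is the number of training points), one aims to predict the pollutant concentration $c^n$ in (8) for any new input $\mathbf{Q}$. The joint Gaussian distribution of the training and predicted outputs ($\mathbf{c}^n_{tr}$ and $c^n$) for the training and new inputs ($\mathbf{Q}_{tr}$ and $\mathbf{Q}$) respectively can be written:
$$\begin{pmatrix} \mathbf{c}^n_{tr} \\ c^n \end{pmatrix} \sim \mathcal{G}\left(\mathbf{0}, \begin{pmatrix} K^n_{tr,tr} + \sigma_n I & K^n_{tr,*} \\ K^n_{*,tr} & K^n_{*,*} \end{pmatrix}\right),$$
where $K^n_{tr,tr}$ is the covariance matrix between all training points, with entries $(K^n_{tr,tr})_{ij} = k^n(\mathbf{Q}_{tr,i}, \mathbf{Q}_{tr,j})$, and the matrices $K^n_{*,tr} = (K^n_{tr,*})^T$ and $K^n_{*,*}$ hold the covariances between the new input and the training inputs, and of the new input with itself.
For any given input parameter (pollution source $\mathbf{Q}$), the probability of the prediction of the reduced variable is:
$$p(c^n \mid \mathbf{Q}, \mathbf{Q}_{tr}, \mathbf{c}^n_{tr}) = \mathcal{G}\big(\bar{c}^n, \mathrm{var}(c^n)\big),$$
where $\bar{c}^n$ is the mean of the Gaussian distribution:
$$\bar{c}^n = K^n_{*,tr}\,(K^n_{tr,tr} + \sigma_n I)^{-1}\, \mathbf{c}^n_{tr},$$
and $\mathrm{var}(c^n) = K^n_{*,*} - K^n_{*,tr}\,(K^n_{tr,tr} + \sigma_n I)^{-1} K^n_{tr,*}$ is its variance.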
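The GPR prediction described in this section can be sketched in a few lines of NumPy. The code below is a minimal illustration using a squared-exponential covariance and a Cholesky solve, following the standard textbook formulation; it is not the authors' implementation. The function names, the hyperparameter values, the scaling of emission intensities by 5000 g/s and the synthetic training outputs are all assumptions.

```python
import numpy as np

def sq_exp_kernel(Qa, Qb, length, signal_var):
    """Squared-exponential covariance between two sets of inputs (rows)."""
    d2 = ((Qa[:, None, :] - Qb[None, :, :]) ** 2).sum(axis=-1)
    return signal_var * np.exp(-0.5 * d2 / length**2)

def gpr_predict(Q_tr, c_tr, Q_new, length=1.0, signal_var=1.0, noise_var=1e-6):
    """Posterior mean and covariance of a zero-mean GP at the inputs Q_new."""
    K = sq_exp_kernel(Q_tr, Q_tr, length, signal_var) + noise_var * np.eye(len(Q_tr))
    K_s = sq_exp_kernel(Q_new, Q_tr, length, signal_var)
    K_ss = sq_exp_kernel(Q_new, Q_new, length, signal_var)
    L = np.linalg.cholesky(K)                        # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, c_tr))
    mean = K_s @ alpha                               # K_* (K + noise I)^-1 c_tr
    v = np.linalg.solve(L, K_s.T)
    cov = K_ss - v.T @ v                             # K_** - K_* (K + noise I)^-1 K_*^T
    return mean, cov

# Hypothetical training design: the 8 corners of the scaled source space [0, 1]^3
# (emission intensities divided by 5000 g/s; the scaling is an assumption).
Q_tr = np.array([[i, j, k] for i in (0, 1) for j in (0, 1) for k in (0, 1)], dtype=float)
c_tr = np.sin(Q_tr.sum(axis=1))          # synthetic outputs, illustration only
Q_new = np.array([[0.48, 0.48, 1.0]])    # (2400, 2400, 5000) g/s after scaling
mean, cov = gpr_predict(Q_tr, c_tr, Q_new, length=0.5)
```

In the setting of this paper, one such regression would be fitted per output quantity and time level, with the emission intensities at the three locations as the input vector.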

Deep learning method for construction of P-NIROM and calculation of reduced solutions for any input over the parameter space
In this section, an alternative method for calculation of reduced solutions for any given input is introduced. A Recurrent Neural Network (RNN) using the Long Short Term Memory (LSTM) architecture is used to construct the P-NIROM (7). Compared to traditional RNNs, the LSTM has a special memory block in the hidden layer of the recurrent neural network, allowing information to persist. This type of network has cyclic connections, which makes the network a powerful method to model temporal data since it has an internal memory system to deal with temporal sequence inputs. A memory cell is composed of four main elements: an input gate, a neuron with a self-recurrent connection (a connection to itself), a forget gate and an output gate.
The input gate of each memory block controls the information transmitting from the input activations into the memory cell and the output gate controls the information transmitting from the memory cell activations into other nodes. The forget gate decides what information is to be deleted from the memory cell state (Wang et al., 2018).
The LSTM technique is utilised to construct the set of functions (hyper-surfaces) $F_m$ in (7), which map the reduced solution at the previous time level to the reduced solution at the current time level. These functions can be obtained from the LSTM gate equations (Wang et al., 2018), where $I$, $f$ and $o$ denote the input, forget and output gate vectors respectively, $Ce$ is the cell activation vector, $b$ is the bias vector, $\varrho$ is the activation function, $W$ denotes the weight matrices (e.g. $W_{ic}$ is the weight matrix from the input gate to the input), $\odot$ is the element-wise product of the vectors, $Ce_o$ and $Ce_i$ are the cell output and cell input activation functions respectively and $\zeta$ is the network output activation function.
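As a rough illustration of how the gates described above interact, the following sketch implements one forward step of a standard LSTM cell in NumPy. It assumes the common formulation without peephole connections, with sigmoid gates and tanh cell activations, which may differ in detail from the variant used in the paper; the stacked weight layout and the name `lstm_step` are illustrative choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, cell_prev, W, U, b):
    """One forward step of a standard LSTM cell (no peepholes).

    x: input vector (D,); h_prev, cell_prev: previous hidden and cell states (H,).
    W: (4H, D), U: (4H, H), b: (4H,), stacked in the order
    input gate, forget gate, cell candidate, output gate."""
    H = h_prev.size
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])            # input gate: what enters the memory cell
    f = sigmoid(z[H:2 * H])        # forget gate: what is removed from the cell
    g = np.tanh(z[2 * H:3 * H])    # candidate cell input
    o = sigmoid(z[3 * H:4 * H])    # output gate: what leaves the cell
    cell = f * cell_prev + i * g   # element-wise products
    h = o * np.tanh(cell)
    return h, cell

# Hypothetical sizes: 6 reduced coefficients plus 3 source intensities as input.
rng = np.random.default_rng(1)
D, H = 9, 4
W = 0.1 * rng.standard_normal((4 * H, D))
U = 0.1 * rng.standard_normal((4 * H, H))
b = np.zeros(4 * H)
h, cell = lstm_step(rng.standard_normal(D), np.zeros(H), np.zeros(H), W, U, b)
```

A trained network would add an output layer mapping `h` to the predicted POD coefficients; the weights here are random and serve only to exercise the step.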
After obtaining the function $F_m$, it can then be used to predict the POD coefficients at the current time level $n$. The offline calculation of snapshots at the training stage and the online procedure for constructing and resolving the P-NIROM are summarised in Fig. 1. The details of the offline and online calculations are further given in Algorithms 1 and 2 respectively.
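The online stage amounts to repeatedly applying the learned functions to advance the POD coefficients in time and then mapping them back to the full space. The sketch below shows that loop with a hypothetical surrogate `toy_step` standing in for the trained functions; the linear form of the surrogate, the matrix `B` and the random orthonormal basis are assumptions chosen so that the result can be checked by hand.

```python
import numpy as np

def rollout(step_fn, c_r0, mu, n_steps):
    """Iterate c_r^n = F(c_r^{n-1}, mu); returns an (n_steps + 1, M) array."""
    states = [np.asarray(c_r0, dtype=float)]
    for _ in range(n_steps):
        states.append(step_fn(states[-1], mu))
    return np.stack(states)

def reconstruct(Phi, reduced_states):
    """Map reduced states (rows) back to the full space: c^n = Phi @ c_r^n."""
    return reduced_states @ Phi.T

# Hypothetical surrogate: linear decay plus forcing by the source intensities.
B = np.array([[0.1, 0.0, 0.0],
              [0.0, 0.2, 0.1]])            # 2 POD coefficients, 3 sources

def toy_step(c_r, mu):
    return 0.9 * c_r + B @ mu

mu = np.array([2400.0, 2400.0, 5000.0])    # emission intensities (g/s)
states = rollout(toy_step, np.zeros(2), mu, n_steps=10)

# Random orthonormal stand-in for the POD basis (6 nodes, 2 modes).
Phi, _ = np.linalg.qr(np.random.default_rng(2).standard_normal((6, 2)))
fields = reconstruct(Phi, states)          # shape (11, 6): one field per time level
```

Each step costs a function evaluation in the reduced space plus one small matrix product, which is why the online cost is independent of the cost of the full solver.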

Table 1
The emission intensity (g s$^{-1}$) of SO$_2$ at locations 1 ($x = 540$ km, $y = 752$ km), 2 ($x = 603$ km, $y = 670$ km) and 3 ($x = 753$ km, $y = 679$ km). $A_1$ to $A_{28}$ are the training cases while $T_1$ and $T_2$ are the unseen cases used for evaluating the predictive capability of the new P-NIROM.

Fig. 2. The singular values and logarithmic scale of singular values.

Regional pollutant dispersion in China
To demonstrate the capability of the new P-NIROM based on machine learning techniques, it has been applied to a realistic case in China where the SO$_2$ emissions from power plants disperse through the atmosphere in time. The SO$_2$ emission intensity at the power plant locations was obtained from the Regional Emission inventory in ASia (REAS 2.1) data developed by the National Institute of Environmental Sciences of Japan. The simulated domain covers the whole Shanxi-Hebei-Shandong-Henan region of China with an area of 1090 km × 1060 km, and there are about 100 power plants in this area (Zheng et al., 2015).
Using adaptive mesh techniques, the 2D top adaptive mesh (20 km above sea level) is first constructed to ensure a high resolution of 2.5 km around the power plant points within a radius of 6 km. The 3D unstructured mesh with 61479 nodes is then obtained by extending the 2D top mesh onto the terrain surface, with 11 terrain-following layers, where 7 vertical layers are within 1 km above the terrain. The SO$_2$ pollutant sources around the power plants are released into the atmosphere at a height of 200 m above the terrain.
In the study, the simulation started at 00:00 UTC on 10 January 2013 and ran through to 15 January 2013. A time interval of $\Delta t = 0.5$ h was used. The mixing layer height is assumed to be 600 m and the turbulent horizontal diffusivity 100 m$^2$/s, while the vertical eddy diffusivity is parameterised based on a scheme by Byun and Dennis (1995). The meteorological fields are provided by the mesoscale meteorological model WRF (v3.5) (Skamarock et al., 2008).
In this case, the varying input parameter, $\mu = \mathbf{Q}$, is the emission intensity of pollutant sources at locations 1, 2 and 3 (see Table 1). The emission intensity of pollutant sources ranges from 0 to 5000 g s$^{-1}$. A set of training pollutant sources $\mu_{tr} = \mathbf{Q}_{tr}$ at three locations is listed in Table 1. The solution snapshots $\mathbf{c}_{tr}$ with the training parameters were obtained by running the high fidelity model (Fluidity (AMCG and Imperial, 2015)) and stored at equally spaced time intervals (3 h) during the simulation period (5 days).
To illustrate the capability of the P-NIROM based on machine learning techniques, an unseen test case, the emission intensity of pollutant sources $\mu = \mathbf{Q} = (2400, 2400, 5000)$ g s$^{-1}$, was given at locations 1, 2 and 3 respectively ($T_1$ in Table 1). Following the online procedure shown in Fig. 1, using the GPR, the solution snapshots (the distribution of pollutants every 3 h) for the given unseen pollutant sources were calculated from the training solutions for the selected training parameters (28 training parameter sets in Table 1).
Fig. 2 shows the singular values and a logarithmic scale of singular values. From the calculation in (4), the sharp decrease of singular values suggests that the first 36 leading POD basis functions can capture 99% of the dynamical energy within the solution snapshots. In this study, two cases of 6 and 36 POD basis functions were chosen to construct the P-NIROM. The larger the number of POD basis functions chosen, the higher the accuracy of the P-NIROM. Fig. 4 provides some of the first 36 leading basis functions. It can be seen that the first leading basis function captures a large part of the spatial distribution of pollutant concentration solutions, while the remaining basis functions represent the details of pollutant distributions of different regions.

Fig. 3. The first and second POD coefficients obtained from the standard ROM and machine learning ROM (the black solid line: standard ROM, the red dash line: LSTM-ROM, and the blue dot line: GPR-ROM). (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)
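The choice of the number of retained basis functions from the energy ratio in (4) can be sketched as follows. The helper names `energy_ratio` and `modes_for_energy` and the sample singular values are illustrative assumptions; the eigenvalues are taken here as the squared singular values of the snapshot matrix, which is the usual relation between the two.

```python
import numpy as np

def energy_ratio(eigvals, n_modes):
    """Fraction of the total energy captured by the first n_modes eigenvalues."""
    eigvals = np.asarray(eigvals, dtype=float)
    return eigvals[:n_modes].sum() / eigvals.sum()

def modes_for_energy(eigvals, threshold=0.99):
    """Smallest number of leading modes whose energy ratio reaches the threshold.
    Assumes eigvals is sorted in decreasing order."""
    eigvals = np.asarray(eigvals, dtype=float)
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    n_modes = int(np.searchsorted(cumulative, threshold)) + 1
    return min(n_modes, eigvals.size)

# Illustrative, rapidly decaying singular values (not the values in Fig. 2).
sing = np.array([10.0, 4.0, 2.0, 1.0, 0.5, 0.1])
eigvals = sing**2                 # eigenvalues of the snapshot correlation matrix
n_keep = modes_for_energy(eigvals, threshold=0.99)
captured = energy_ratio(eigvals, n_keep)
```

Applied to the singular values of the actual snapshot matrix, the same helper would return the truncation rank for a chosen energy threshold.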
A comparison of coefficients for the POD basis functions between the standard ROM and machine learning ROM (based on LSTM and GPR) is provided in Fig. 3. It is clearly seen that the POD coefficients are in very close agreement with each other. Compared to the standard ROM, the machine learning ROM has a wider range of application areas, especially where observational data is concerned.
With an increased number of 36 POD basis functions, the P-NIROM has performed well at resolving the flow dynamics and evolution of power plant plumes (see Fig. 5(e) and (d)). This is further highlighted in Fig. 6, which shows the solutions from different angles. Further comparison is provided in Fig. 7, which illustrates the evolution of pollutant concentrations predicted by the high fidelity model and P-NIROM at the location (… 786). We can see that the P-NIROM with 6 and 36 POD …
An error analysis of the P-NIROM has been carried out. Fig. 8 shows the spatial distribution of absolute errors of pollutant solutions between the high fidelity model and P-NIROM. It is visually evident that the accuracy of P-NIROM solutions is improved by increasing the number of retained POD basis functions from 6 to 36. Fig. 9 illustrates the RMSE and correlation coefficients of pollutant solutions between the high fidelity model and P-NIROM with 36 POD basis functions. The correlation coefficients achieve results above 80%-90%. This again demonstrates that the P-NIROM is in good agreement with the high fidelity full model.

Fig. 6. Solutions from the high fidelity full model and P-NIROM with 36 basis functions, viewed from different angles.
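The two error measures used in this comparison, RMSE and correlation coefficient, can be computed per time level as in the sketch below. The array layout (nodes along rows, time levels along columns), the function names and the synthetic data are assumptions for illustration.

```python
import numpy as np

def rmse(c_full, c_rom):
    """Root mean square error over all nodes, one value per time level.
    Both arrays have shape (n_nodes, n_times)."""
    return np.sqrt(np.mean((c_full - c_rom) ** 2, axis=0))

def correlation(c_full, c_rom):
    """Pearson correlation coefficient over all nodes, one value per time level."""
    a = c_full - c_full.mean(axis=0)
    b = c_rom - c_rom.mean(axis=0)
    return (a * b).sum(axis=0) / np.sqrt((a**2).sum(axis=0) * (b**2).sum(axis=0))

# Synthetic stand-ins for the high fidelity and reduced order solutions.
rng = np.random.default_rng(3)
c_full = rng.random((1000, 40))                         # 1000 nodes, 40 time levels
c_rom = c_full + 0.05 * rng.standard_normal(c_full.shape)

err = rmse(c_full, c_rom)
corr = correlation(c_full, c_rom)
```

A correlation close to 1 together with a small RMSE indicates that the reduced model reproduces both the pattern and the magnitude of the full solution at that time level.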
To further investigate the predictive ability of the P-NIROM, another unseen case ($T_2$) was set up, where the emission intensities of pollutants at three source locations ($\mu = \mathbf{Q} = (5500, 6000, 6000)$ g s$^{-1}$, see Table 1) were given beyond the range of the training data, (0, 5000) g s$^{-1}$. The pollutant solutions (at time level $t = 24$ h) from the high fidelity full model and P-NIROM are shown in Fig. 10, and a comparison of results at a particular location is provided in Fig. 10(c). As shown in the figures, the predictive ability of the P-NIROM in case $T_2$ is acceptable although the given test data go beyond the range of the training data.

Computational efficiency
This section provides a comparison of the online CPU cost required by the high fidelity full model and P-NIROM. The simulations were run on a machine with 12 cores at a frequency of 3.33 GHz (Intel® Xeon(R) CPU X5680 @ 3.33 GHz × 12) and 62.9 GB of memory. One core was used for the simulations since the cases were simulated in serial. Table 2 lists the online CPU cost required for running the high fidelity model and P-NIROM. The offline cost (see Fig. 1) at the training stage is not listed in this table. It can be seen that using the P-NIROM, the CPU time is reduced by five orders of magnitude in comparison to the high fidelity model.
… represent the dynamics of pollutant transport over the reduced space. The P-NIROM is then used for calculating the reduced solutions (POD coefficients) for the given $\mu$ (the emission intensity). The unique combination of the P-NIROM and machine learning techniques enables rapid and reasonably accurate simulations. The P-NIROM techniques developed here are robust and can be used in a large number of disciplines, not least pollutant flow based disciplines.

Conclusions
The P-NIROM has been applied to a realistic case in China involving plumes released from over 100 power plants. The varying input parameter is the emission intensity of pollutant sources. A comparison of pollutant solutions between the high fidelity model and P-NIROM has been undertaken. The P-NIROM with 36 POD basis functions exhibits an overall good agreement with the high fidelity model. The online computation cost required by the P-NIROM is reduced by several orders of magnitude in comparison to the high fidelity model.
Compared to existing P-NIROM techniques (for example, based on radial basis functions), the P-NIROM based on machine learning methods provides a wider range of application areas, for example, uncertainty analysis in both data and modelling results, real-time interactive use, data management (real-time data monitoring/analysis), data assimilation and better-informed decision making. In particular, the machine learning techniques with ROM can be used for data selection and data reduction by condensing the information into the desired number of features and recovering the original data from the reduced feature set.

Atmospheric Environment 199 (2019) 463-473

Fig. 1. The online and offline procedures for constructing and resolving the P-NIROM for any given parameter $\mu \in \mathbb{R}^P$.
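In the LSTM network, the input is the reduced solution $\mathbf{c}^{r,n-1}$ at the previous time level $n-1$, while the output is the reduced solution $c_m^{r,n}$ associated with the $m$th POD basis function $\Phi_m$ ($m \in (1, \ldots, M)$). The hyper-surface $F_m$ is the relationship function between the input $\mathbf{c}^{r,n-1}$ and the output $c_m^{r,n}$.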

Fig. 4. Some of the first 36 leading POD basis functions.
The pollutant solutions from the high fidelity full model and P-NIROM for case $T_2$ are shown in Fig. 10(a) and (b) respectively, while the corresponding absolute error is illustrated in Fig. 10(d).

Table 2
Online CPU cost required for running the high fidelity model and P-NIROM during one time step.