Physics-informed neural networks for high-resolution weather reconstruction from sparse weather stations

Background: The accurate provision of weather information is of great significance to many disciplines. One example is the field of air traffic management, in which weather detection relies in part on recordings from sparse weather stations on the ground. The scarcity of data and their lack of precision pose significant challenges to achieving a detailed description of the state of the atmosphere at a given moment in time.
Methods: In this article, we promote the use of physics-informed neural networks (PINNs), a type of machine learning (ML) architecture which embeds mathematically accurate physics models, to generate high-quality weather information subject to the regularization provided by the Navier-Stokes equations.
Results: The application of PINNs is oriented to the reconstruction of dense and precise wind and pressure fields in areas where only a few local measurements provided by weather stations are available. Our model not only discloses and regularizes such data, which are potentially corrupted by noise, but is also able to precisely compute wind and pressure in target areas.
Conclusions: The effect of time and spatial resolution on the capability of the PINN to accurately reconstruct fluid phenomena is thoroughly discussed through a parametric study, concluding that a proper tuning of the neural network's loss function during training is of utmost importance.


Introduction
Extreme weather conditions have a major impact, not only on nature, but also on human-based operations. Weather-related disasters have been reported to incur substantial costs, exceeding USD 92.9 billion in the United States of America in 2023 [1] and EUR 52.3 billion in the European Union in 2022 [2], adding up to USD 2.655 trillion and EUR 650 billion since 1980, respectively. Here, not only the effect of storms is accounted for, but also extreme temperatures, floods and other natural catastrophes [3]. The ability and capacity to properly reconstruct, predict and prevent weather disasters are therefore of crucial importance, since not only can human lives be saved, but the associated costs may also be remarkably decreased. Weather forecasting is an ancient art that spans all of history (the interested reader is referred to documentation on Babylonians using haloes and appearances of clouds for storm prediction [4]; Aristotle's treatise Meteorologica, see e.g. its English translation [5], on the formation of rain, clouds, hail, wind, thunder, lightning, and hurricanes; and more recently the theory of chaos proposed by Lorenz and its application to 'Deterministic nonperiodic flow' [6], which was to revolutionize the statistical treatment of weather data). However, since the beginning of the computer age and with an increasing number of weather stations (WS) around the world, dedicated satellites orbiting the Earth and high-precision weather gauges, weather reconstruction and now-/fore-casting have become a more reliable science thanks to the boost of numerical methods [7].
Nowadays, access to massive amounts of data has leveraged the use of neural networks (NNs), one of the many applications of artificial intelligence, in many fields in which data assimilation becomes a cornerstone. The capacity of NNs to find intrinsic correlations within complex data has widened the possibilities of application in fields where time sensitivity and computational effort are limiting factors. Novel studies have promoted the analysis of weather information using convolutional neural networks (CNNs) and the associated recommendations to undertake when storms are approaching critical areas, such as the vicinity of airports [8,9]. Recently, these architectures have gained substantial precision and adaptability due to the development of autoencoders and Fourier neural operators, which have demonstrated excellent performance when working with scarce noisy information [10,11]. Other architectures, such as recurrent neural networks (RNNs), in particular long short-term memory (LSTM), have been shown to provide precise forecasting of weather phenomena given sequences of measurements at a set of fixed locations [12]. However, even though these NNs have shown excellent performance when working with extensive sets of structured data, they lack a regulatory basis upon which to extract conclusions based on a physically viable behavior.
Since the development of physics-informed neural networks (PINNs), a type of NN which incorporates physics constraints during its training stage [13,14], their versatility has promoted their application in the vast field of fluid mechanics, as the embedding of the well-known Navier-Stokes (NS) equations can be achieved and becomes a source of regularization when data-assimilating [15][16][17][18]. The compliance with this system of equations becomes a relevant asset for the reconstruction of weather events [10], more particularly the wind direction and the pressure of the atmosphere at a certain moment in time. Even though PINNs are not fully capable of solving a complete formulation of the NS equations given their extreme complexity, they can outperform direct numerical simulations (DNSs) when solving Reynolds-averaged NS equations as far as their precision and, most importantly, their computational cost are concerned [19]. PINNs are able to go beyond the limitations of experimental procedures, since they can account for behavior which cannot be measured directly by current technology [14,20] or due to scarcity of resources, e.g. a limited number of probes in a water / wind tunnel experimental setup [21,22]. In addition, reference information need not be structured on a regular grid, which is a major advantage as compared to other architectures. Current trends in the improvement of PINN architectures are heading towards the development of physics-informed neural operators which intrinsically embed the physics constraints without the need of investing in automatic differentiation for the computation of derivatives [23]. Still, PINNs have been shown to be extremely dependent on the time and spatial resolution of the available data. Recent research lines showcase alternative methods to enhance time resolution provided that spatial resolution and a set of time-resolved pointwise measurements are at hand [22]. Nonetheless, PINNs have been demonstrated to perform in an excellent manner with datasets originated in controlled environments, such as numerical simulations or lab experiments. However, there is still room for improvement, and the use of PINNs on real data lacking both time and spatial resolution has not yet been tested to the best of the authors' knowledge. This scenario is very representative of weather reconstruction, in which the full fluid field needs to be inferred from a limited WS set. In this case, information from sparse networks of sensors can be leveraged to provide fast and accurate predictions. This is essential for many fields, among them the organization of air traffic, both in the air and ground stage, especially in areas surrounding or approaching airports' vicinities [24][25][26].
This article deals for the first time with the reconstruction capacity of a standard PINN given the information provided by a sparse WS set, whose recordings may not be exactly accurate and incorporate experimental noise. The tuning of the PINN to best adapt to the reference information while in parallel imposing a regularization based on a physics constraint is presented here, showing that the precision of the final reconstruction is highly dependent on the quality of the reference information and the way in which the different contributions to the loss function are weighted. The article is organized in the following manner: Section 2 introduces the PINN architecture and the necessary pretreatment of the available data to guarantee a proper performance of the NN; Section 3 discusses the accuracy of the weather reconstruction and the different effects that time and spatial resolution have on the final output of the PINN, and proposes additional counteractive strategies to compensate for the lack of both resolutions in the original dataset; finally, Section 4 discloses the conclusions of the article.

Methods
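The proposed architecture is based on a standard PINN as described in the original article [13]: a fully connected NN in which each neuron of each hidden layer is connected to all the neurons of the subsequent one. PINNs are capable of embedding constraints based on physics laws to regularize and supervise the training of the NN in such a way that this process is considered semi-supervised: the NN learns both from labeled and unlabeled data, the labeled data being the information provided by WS and the unlabeled data the physics constraints. The ability to adapt physics laws is achieved via automatic differentiation during backward propagation [27]. Here, the derivatives of the outputs with respect to the inputs can be calculated with precision and allow for the computation of residuals from ordinary or partial differential equations [28]. For the case of fluid flow characteristics, such as wind and pressure profiles, the Navier-Stokes (NS) equations can be assumed to drive the evolution of the weather phenomena. In a simplified manner, NS equations may be written as follows:

$$
\begin{aligned}
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} &= 0 \quad (e_1) \\
\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} + w\frac{\partial u}{\partial z} + \frac{\partial p}{\partial x} - \frac{1}{Re}\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) &= 0 \quad (e_2) \\
\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} + w\frac{\partial v}{\partial z} + \frac{\partial p}{\partial y} - \frac{1}{Re}\left(\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} + \frac{\partial^2 v}{\partial z^2}\right) &= 0 \quad (e_3) \\
\frac{\partial w}{\partial t} + u\frac{\partial w}{\partial x} + v\frac{\partial w}{\partial y} + w\frac{\partial w}{\partial z} + \frac{\partial p}{\partial z} - \frac{1}{Re}\left(\frac{\partial^2 w}{\partial x^2} + \frac{\partial^2 w}{\partial y^2} + \frac{\partial^2 w}{\partial z^2}\right) &= 0 \quad (e_4)
\end{aligned}
\tag{1}
$$

Here, t, x, y, z are dimensionless time and spatial coordinates in a Cartesian system of reference, and u, v, w are dimensionless velocity components in their corresponding spatial direction, whereas p refers to the dimensionless pressure. Re defines the dimensionless Reynolds number. The terms e_i (i = 1,…,4) within parentheses indicate the residuals of the equations, which will be explained later in the section. For applications to weather reconstruction at ground level, winds in the vertical direction, though essential for vertical wind shear, are typically one or two orders of magnitude smaller than horizontal wind speeds [29][30][31]. Therefore, the term w may be safely neglected. In addition, WS seldom provide information in the vertical direction and generically measure in a horizontal plane. Therefore, in case vertical fluctuations are not available, all terms with respect to the z coordinate need to be eliminated. This introduces a loss of accuracy and limits the implementation of the methodology to cases that are 2D on average.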
Consequently, the PINN receives as input the two spatial coordinates x, y and time t, and outputs the in-plane velocity components u, v and pressure p. Our PINN is based on a fully connected NN composed of N_l = 12 hidden layers (among which the first 8 are activated by a hyperbolic tangent function, leaving the rest with a linear activation) with number of neurons N_n = 600. The loss function to be minimized during training combines two main contributions:
• the error with respect to the reported data from WS, ℒ_WS,
• and the residuals obtained from the NS equations, ℒ_NS.
In detail, ℒ_WS = ℒ_u + ℒ_v + ℒ_p, where the subindex represented by the bullet • in ℒ_• refers to the corresponding variable measured by WS. The associated loss contribution of each component is calculated based on the mean squared error (MSE) with respect to the reference data, computed in the following form:

$$
\mathrm{MSE}(\hat{Z}, Z) = \frac{1}{N_t N_{WS}} \sum_{i=1}^{N_t} \sum_{j=1}^{N_{WS}} \left[ \hat{Z}(t_i, \mathbf{x}_j) - Z(t_i, \mathbf{x}_j) \right]^2 \tag{2}
$$

where t_i and x_j indicate discrete time and position instances over N_t number of times for N_WS number of weather stations, and Ẑ and Z correspond to the predicted and real values of a particular variable, respectively. For normalization purposes, the MSE is divided by the squared standard deviation of each variable, so as to account for the overall corresponding fluctuations within the full time and spatial frame during the learning process. Consequently,

$$
\mathcal{L}_\bullet = \frac{\mathrm{MSE}(\hat{Z}_\bullet, Z_\bullet)}{\sigma^2(Z_\bullet)}
$$

where σ indicates the standard deviation. It is important to note that, given the uncertainty of experimental measurements, real values may not be accurate by default, which exposes a limitation of the PINN when working with noisy information.
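The normalized data loss described above can be illustrated with a minimal sketch. It assumes that the samples of one variable over all times and stations are flattened into a single list and that σ is the population standard deviation; the function name `normalized_mse` is hypothetical and not taken from the article.

```python
from statistics import fmean, pvariance


def normalized_mse(predicted, measured):
    """MSE between predictions and WS measurements of one variable,
    divided by the variance (squared standard deviation) of the measurements.

    Both inputs are flat lists covering every (time, station) pair.
    Assumption: population variance is used for sigma^2.
    """
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    mse = fmean([(p - m) ** 2 for p, m in zip(predicted, measured)])
    return mse / pvariance(measured)


# Hypothetical dimensionless wind samples (not from the article's dataset).
u_measured = [1.0, 2.0, 3.0, 4.0]
u_predicted = [2.0, 3.0, 4.0, 5.0]  # constant offset of 1
loss_u = normalized_mse(u_predicted, u_measured)  # MSE = 1, variance = 1.25
```

With this normalization, a value of 1 means that the prediction error is as large as the natural fluctuation of the variable itself.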
On the other hand, for the case of a 2D fluid flow, the physics constraint based on Equation 1 reduces to three equations, i.e. continuity and two momentum equations, resulting in ℒ_NS = MSE(e_1,0) + MSE(e_2,0) + MSE(e_3,0), where 0 would be the reference value for a perfectly compliant fluid domain and e_i (i = 1,2,3) correspond to the residuals already introduced in Equation 1. Such residuals are calculated by substituting the predicted values from the PINN and their corresponding derivatives via automatic differentiation into the defined equations. It must be remarked that, whereas the MSE definition used in ℒ_WS only covers the points in space with a WS location, the corresponding MSE used for ℒ_NS spans the entire field of flow reconstruction, i.e. the term N_WS in Equation 2 needs to be substituted by N_S ≫ N_WS, where N_S refers to the number of spatial points in the field of reconstruction.
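As an illustration of how the physics loss is assembled, the sketch below evaluates the three 2D residuals at a single point from derivative values that are passed in directly. It assumes the standard dimensionless incompressible form of the NS equations; in an actual PINN the derivatives would come from automatic differentiation, and the function names here are hypothetical.

```python
import math


def ns_residuals_2d(u, v, *, u_t=0.0, u_x=0.0, u_y=0.0, v_t=0.0, v_x=0.0,
                    v_y=0.0, p_x=0.0, p_y=0.0, lap_u=0.0, lap_v=0.0,
                    reynolds=math.inf):
    """Residuals (e1, e2, e3) of the 2D dimensionless NS equations at one point.

    lap_u and lap_v are the in-plane Laplacians of u and v.
    reynolds=inf drops the viscous terms (inviscid assumption).
    """
    e1 = u_x + v_y                                            # continuity
    e2 = u_t + u * u_x + v * u_y + p_x - lap_u / reynolds     # x-momentum
    e3 = v_t + u * v_x + v * v_y + p_y - lap_v / reynolds     # y-momentum
    return e1, e2, e3


def ns_loss(residuals):
    """MSE(e1, 0) + MSE(e2, 0) + MSE(e3, 0) over a list of (e1, e2, e3)."""
    n_points = len(residuals)
    return sum(e ** 2 for point in residuals for e in point) / n_points


# A state in which the pressure gradient balances convection exactly.
balanced = ns_residuals_2d(2.0, 3.0, u_x=1.0, v_y=-1.0, p_x=-2.0, p_y=3.0)
# The same kind of evaluation with a viscous contribution at Re = 10.
viscous = ns_residuals_2d(0.0, 0.0, lap_u=5.0, reynolds=10.0)
```

Note that a uniform field (all derivatives equal to zero) returns null residuals, which is the homogenizing tendency that the data term has to counteract.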
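The final loss may then be expressed as a sum of the two independent losses. However, we opt for the adaptive weighted alternative introduced in [22], namely:

$$
\mathcal{L} = \omega_{NS}\mathcal{L}_{NS} + \omega_u\mathcal{L}_u + \omega_v\mathcal{L}_v + \omega_p\mathcal{L}_p, \qquad \omega_\bullet = \frac{\mathcal{L}_\bullet}{\mathcal{L}_{NS} + \mathcal{L}_u + \mathcal{L}_v + \mathcal{L}_p}
$$

where ω_NS is computed analogously to the previous weights. Therefore, the total loss may be rewritten as

$$
\mathcal{L} = \frac{\mathcal{L}_{NS}^2 + \mathcal{L}_u^2 + \mathcal{L}_v^2 + \mathcal{L}_p^2}{\mathcal{L}_{NS} + \mathcal{L}_u + \mathcal{L}_v + \mathcal{L}_p} \tag{3}
$$

This weighted definition of the loss function works as an adaptive compensation between two counteractive events [22]: the homogenizing effect caused by the imposition of the NS equations, which tends to smooth derivatives to achieve a null residual (indeed, homogeneous velocity and pressure fields perfectly comply with the NS equations), and the observance of reported data at the locations where WS are positioned.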
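The adaptive weighting can be sketched as follows, under the assumption that each weight is the share of its contribution in the sum of all contributions (the form stated later in the text for the λ-corrected weights, taken at λ = 1). The function name `adaptive_weighted_loss` and the sample values are hypothetical.

```python
def adaptive_weighted_loss(losses):
    """Combine loss contributions with weights proportional to their size.

    losses: dict mapping a label ("NS", "u", "v", "p") to a non-negative value.
    Returns (total_loss, weights), where total_loss = sum(L_i^2) / sum(L_i).
    """
    total = sum(losses.values())
    if total == 0.0:
        return 0.0, {name: 0.0 for name in losses}
    weights = {name: value / total for name, value in losses.items()}
    combined = sum(weights[name] * losses[name] for name in losses)
    return combined, weights


# Hypothetical values: the largest contribution receives the largest weight.
loss, weights = adaptive_weighted_loss({"NS": 3.0, "u": 1.0, "v": 0.0, "p": 0.0})
```

Because the weights follow the current magnitude of each term, the optimizer is steered towards whichever contribution is lagging behind at that stage of training.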
Finally, by inspecting Equation 1, the only field strictly necessary for regularization would be the wind velocity, since the reference pressure values are used solely for the purpose of recovering the pressure field. Indeed, the PINN supplies the pressure gradient, thus any pressure measurement at any arbitrary point in the domain would suffice for a comprehensive reconstruction. Whereas the continuity equation inherently regularizes the velocity field, the inclusion of the momentum equations further enhances regularization by revealing the pressure gradient. It is important to note that setting a suitable pressure boundary condition in at least one point is crucial. As discussed in [22], the pressure gradient in the loss function acts as a 'residual drain' and incorporates errors from inaccuracies in other terms of the equations or missing terms (e.g. the out-of-plane motion expanding to the vertical direction). The inclusion of numerous pressure reference values improves convergence and enhances flow reconstruction, since they impede the accumulation of residuals in the pressure gradient, thereby augmenting the regularization of the velocity fields.

Data pretreatment prior to NN training
Generically, information provided by WS originates from ASOS (automated surface observing system) or METAR (meteorological aerodrome reports). Typically, each WS is specifically targeted to register a set of variables, and their quality and recording frequency may fluctuate. Particularly for this article, the dataset contains the following information: date, longitude, latitude, altitude, temperature, wind speed, wind direction and pressure. However, data preprocessing is of utmost importance since the training information needs to be free of corrupted or missing records. First of all, information from the date field needs to be transformed into continuous time measurements in seconds. For that purpose, the use of the Python datetime library is recommended. As a result, time information T may be obtained. Secondly, the location of each weather station needs to be transformed into Cartesian coordinates. Spherical projections are applied to longitude and latitude so that information can be retrieved on a 2D plane. Consequently, original values given in degrees for latitude and longitude may be translated to meters in Cartesian horizontal and vertical coordinates X and Y, respectively. For the altitude component Z, no transformation is needed as such information is already expressed in meters over sea level (SL). A similar situation occurs with temperature data C, given by default in Celsius. Regarding the velocity field, wind direction is normally given in degrees with respect to the North, in which the positive angle moves clockwise. Therefore, given the information of wind speed and wind direction, the two velocity components U and V may be easily recovered on the Cartesian coordinates X and Y.
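The three conversions described above can be sketched with the Python standard library. Several choices below are assumptions and not taken from the article: the timestamp format, an equirectangular projection about a reference point, a mean Earth radius of 6371 km, and the meteorological convention that the reported direction is the one the wind blows from. All function names are hypothetical.

```python
import math
from datetime import datetime

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius


def seconds_since(timestamp, reference, fmt="%Y-%m-%d %H:%M"):
    """Continuous time T in seconds between two timestamps (assumed format)."""
    elapsed = datetime.strptime(timestamp, fmt) - datetime.strptime(reference, fmt)
    return elapsed.total_seconds()


def lonlat_to_xy(lon_deg, lat_deg, lon0_deg, lat0_deg):
    """Equirectangular projection (assumed) to meters about (lon0, lat0)."""
    x = (EARTH_RADIUS_M * math.radians(lon_deg - lon0_deg)
         * math.cos(math.radians(lat0_deg)))
    y = EARTH_RADIUS_M * math.radians(lat_deg - lat0_deg)
    return x, y


def wind_to_uv(speed, direction_deg):
    """Cartesian components (U east, V north) from speed and direction.

    Assumption: direction is where the wind comes FROM, measured clockwise
    from North, so a 270 degree wind blows towards the east.
    """
    theta = math.radians(direction_deg)
    return -speed * math.sin(theta), -speed * math.cos(theta)


t = seconds_since("2018-01-01 00:10", "2018-01-01 00:00")  # one 10 min intake
x, y = lonlat_to_xy(4.5, 51.0, 4.5, 50.9)                  # 0.1 deg northwards
u, v = wind_to_uv(10.0, 270.0)                             # westerly wind
```

Under these assumptions a step of 0.1° in latitude maps to roughly 11 km, in line with the grid spacings quoted later in the text.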
Pressure P is generically given in mbar, so a multiplication by 100 transforms those values into Pa. In addition, P may be transformed to its equivalent condition at a specific altitude, e.g. sea level (SL). This conversion may reduce the error committed by the assumption of 2D flow from a 3D domain. Pressure has a strong dependence on altitude and temperature. Following the international standard atmosphere (ISA), the equivalent pressure at SL at standard temperature conditions (15°C), P_SL, can be calculated from the pressure P measured at any other altitude Z and temperature C. Once all relevant variables have been translated to their corresponding 2D simplification in SI units, it is extremely convenient to transform them to dimensionless variables, both to facilitate the interpretation of the order of magnitude of the available information and to simplify the training process of the PINN, since Equation 1 can thus be directly computed. To that end, a reference distance L, velocity W and pressure P_0 are defined from the maximum and minimum values (subindexes max and min), the absolute values (indicated between bars) and the mean values (overhead bar) of the corresponding variables. Location, time, velocity and pressure are then non-dimensionalized with these reference quantities. The resulting dimensionless variables can be directly substituted in Equation 1, in which Re is defined as Re = ρWL/μ, where ρ is the air density and μ its dynamic viscosity (for simplification purposes, both properties are referred to the standard temperature).
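For orientation only, the sketch below shows the unit conversion and one widely used barometric reduction of a station pressure to sea level. This particular expression, with its lapse rate of 0.0065 K/m and exponent 5.257, is an assumption made for illustration and is not necessarily the expression used in the article; the function names are hypothetical.

```python
def mbar_to_pa(pressure_mbar):
    """Convert pressure from mbar to Pa."""
    return 100.0 * pressure_mbar


def sea_level_pressure(pressure_pa, altitude_m, temperature_c):
    """Reduce a station pressure to its sea-level equivalent.

    Assumed formula (common barometric reduction with a constant lapse rate):
        P_SL = P * (1 - 0.0065 Z / (C + 0.0065 Z + 273.15)) ** (-5.257)
    """
    lapse_rate = 0.0065  # K/m, standard tropospheric lapse rate
    exponent = 5.257
    sea_level_temperature_k = temperature_c + lapse_rate * altitude_m + 273.15
    ratio = 1.0 - lapse_rate * altitude_m / sea_level_temperature_k
    return pressure_pa * ratio ** (-exponent)


# Hypothetical station at 100 m reporting 1000 mbar at 15 degrees Celsius.
p_station = mbar_to_pa(1000.0)
p_sea_level = sea_level_pressure(p_station, 100.0, 15.0)
```

For a station at sea level (Z = 0) the correction factor equals one, so the measured value is returned unchanged.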
Finally, it is typical that, within a dataset involving WS measurements, empty records may occur. As PINNs should be trained on complete records of data, either incomplete information (usually indicated as Not a Number, NaN) needs to be removed or data imputation techniques must be used to estimate the missing values. In this article, we opt for the first option and NaN values are removed from the training dataset.
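A minimal sketch of the removal option is given below, assuming that each record is a dictionary and that missing values appear either as NaN or as None. The field names and the helper `drop_incomplete` are hypothetical.

```python
import math


def drop_incomplete(records, fields):
    """Keep only the records in which every listed field holds a valid value."""
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    return [r for r in records if not any(is_missing(r.get(f)) for f in fields)]


# Hypothetical records; the second one lacks a pressure reading.
records = [
    {"station": "A", "wind_speed": 4.2, "pressure": 1012.0},
    {"station": "B", "wind_speed": 3.1, "pressure": float("nan")},
    {"station": "C", "wind_speed": 5.0, "pressure": 1009.5},
]
clean = drop_incomplete(records, ["wind_speed", "pressure"])
```

Dropping whole records is the simplest choice, at the price of reducing an already sparse set of reference points.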

Experiments and results
This section discusses the application of the methodology explained before to a dataset consisting of measurements from 21 WS in the region of Brussels-Zaventem airport for a period of 14 days in the year 2018. In general, data sources indicate wind velocities and pressure values on a scattered 3D distribution, since each WS has a determined location on the horizontal plane given by its latitude and longitude and a different altitude over sea level (SL). However, the standard deviation of the latter for this particular dataset has a value of 180 m, which according to [32] falls within the assumption of low wind profiles, which states that wind velocities could be considered quasi-constant over a distance below 500 m over ground level, ignoring the contribution of the first few meters (the so-called wind boundary layer). On this basis, a 3D domain may be simplified to a 2D domain, exposing a limitation of the PINN to accurately reconstruct 3D information given a very scarce set of reference information. Our weather reconstruction is therefore presented in a 2D format, for which every parameter depending on altitude has been transformed to its sea-level equivalent according to the international standard atmosphere (ISA), as already detailed in the previous Subsection 2.1. A description of the methodology is outlined in Figure 1.
Three experimental settings using the same data are proposed, each of them with a different objective. First, the standard architecture is designed to reconstruct the true field using an output grid of varying spatial resolution. This experiment assesses the impact of the resolution parameter R on the final reconstruction capacity. Second, the effect of the hyperparameter tuning the strength of the physics constraint is measured. Third, a series of validation experiments are performed to determine the generalization capabilities of the PINN on unseen scenarios.

Figure 1. The PINN considers two main contributions during its learning process: the residuals from the NS equations and the error with respect to experimentally accessible data. The tuning of the strength of the physics constraints, thoroughly explained in Subsection 3.2, is of crucial importance when reference information is scarce and potentially subject to noise. The final output of the process is a detailed fluid field disclosing both wind components in horizontal and vertical directions and pressure.

Assessing the effect of temporal and spatial resolution
The reconstruction ability of the PINN highly depends on the spatial and time resolution of the original dataset used during training. For this particular case, the minimum time resolution corresponds to 10 min, which is the time interval between each intake of the WS. Considering that our field of view covers a reference length L ≈ 410000 m with a reference velocity W ≈ 17 m/s, the time interval to cross the full domain is τ = L/W ≈ 24100 s ≈ 400 min. Therefore, the available time resolution should be enough to capture the majority of events occurring in the domain of observation. However, faster-developing phenomena below the time resolution may not be recovered. In addition, the errors committed by the temporal derivatives in Equation 1 tend to homogenize the final reconstruction and again pose a limitation of the methodology [22]. A potential technique to further improve the given resolution may involve the parallel recording with high-frequency probes, as explained in [22]. In addition, viscosity effects in Equation 1 may be neglected, since for a typical kinematic viscosity at standard conditions (temperature of 15°C, pressure of 1 atm) ν = 1.46 × 10^-5 m^2/s, Re = LW/ν = 4.774 × 10^11. This adds an additional layer of simplification, which accelerates the convergence of the PINN to the detriment of recovering potential viscosity-induced events.
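The orders of magnitude quoted above can be checked with a few lines of arithmetic that use only the reference values stated in the text (L, W, ν and the 10 min sampling interval). The variable names are hypothetical.

```python
L_REF_M = 410_000.0        # reference length L, in m
W_REF_MS = 17.0            # reference velocity W, in m/s
NU_M2S = 1.46e-5           # kinematic viscosity of air at 15 degC and 1 atm
SAMPLING_S = 10 * 60.0     # WS intake interval, in s

# Time needed to cross the domain at the reference velocity.
crossing_time_s = L_REF_M / W_REF_MS

# Number of WS intakes recorded during one crossing.
samples_per_crossing = crossing_time_s / SAMPLING_S

# Reynolds number based on the reference quantities.
reynolds = L_REF_M * W_REF_MS / NU_M2S
```

Evaluating L/W directly with these values gives a crossing time of about 2.4 × 10^4 s, i.e. roughly 40 intakes per crossing, and a Reynolds number of about 4.77 × 10^11, so the viscous terms scaled by 1/Re are negligible next to the remaining terms.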
Concerning spatial resolution, PINNs perform significantly better when the refinement of the output grid is high. This originates from the capacity of the PINN to impose a hard constraint based on the governing laws if a finer grid is available while maintaining consistency with the experimentally accessible data. Nonetheless, there are certain limitations. First, WS are fixed points in the domain of resolution, i.e. those points do not change in time and are typically sparsely distributed.
The minimal theoretical spatial resolution can be estimated by the minimum distance between WS. Consequently, if there are events developing and disappearing over smaller distances, WS will not be able to track them. For the particular case analyzed here, the minimum spatial resolution corresponds to approximately 11.5 km. PINNs, however, have been proven to recover partially lost information due to lack of spatial resolution when temporal resolution is at hand [17,22]. Hence, the readiness of the PINN to adapt to a rougher or finer spatial resolution than the original one can be analyzed. We set different spatial resolutions R = (0.05°, 0.1°, 0.2°) referred to latitude and longitude in the output grid, with approximate corresponding values of 5.5, 11 and 22 km, i.e. half of the given spatial resolution, the same order and its double.
[...] recovering the slight fluctuations in the pressure field, which remains quasi-constant through a discrete time intake.
To check the accuracy of the reconstruction, Figure 3 represents the PINN assimilation error (how accurately it learns the accessible reference data) per snapshot, i.e. per time intake, along the full timespan considered in the dataset (14 days). Here, the error is determined via the relative root mean squared error (rRMSE) of each variable, calculated as the square root of MSE in Equation 2 divided by the corresponding standard deviation, i.e. $\mathrm{rRMSE} = \sqrt{\mathrm{MSE}(\hat{Z}, Z)}/\sigma(Z)$. Two different concepts of standard deviation have been imposed for the calculation of rRMSE: the total standard deviation of the complete dataset, i.e. for a period of 14 days, $\sigma(Z)$; and the standard deviation per intake, i.e. every 10 min, $\sigma(Z(t_i))$. It is expected that the PINN assimilates information to a level of precision significantly below the overall standard deviation, whereas its accuracy should deteriorate when compared to the standard deviation for each particular time intake. The two velocity components, however, always maintain a good level of precision below both the discrete and the overall standard deviation (represented by the horizontal dashed line at value rRMSE = 1). In contrast, a major limitation occurs for the pressure assimilation. Indeed, as wind fluctuates more frequently, the standard deviation of the full dataset is on average very similar to that of one specific time snapshot. For pressure, on the contrary, its variation over a single discrete snapshot is minimal when compared to the overall fluctuation over 14 days. In fact, pressure changes in the timespan of 10 min are barely noticeable. Thus, the PINN is very accurate when measuring fluctuations over the full range of variation, but very limited when assimilating slight changes over a single discrete time.
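The difference between the two normalizations can be made concrete with a small sketch. The numbers below are invented purely for illustration (two snapshots, three stations, a pressure-like variable that drifts between snapshots but is nearly uniform within each one), and the population standard deviation is assumed.

```python
import math
from statistics import fmean, pstdev


def rrmse(predicted, measured, sigma):
    """Relative RMSE: sqrt(MSE(predicted, measured)) divided by sigma."""
    mse = fmean([(p - m) ** 2 for p, m in zip(predicted, measured)])
    return math.sqrt(mse) / sigma


# Hypothetical measurements: rows are snapshots, columns are stations.
measured = [[100.0, 100.2, 100.4],
            [110.0, 110.2, 110.4]]
# Hypothetical predictions with a constant error of 0.1 everywhere.
predicted = [[value + 0.1 for value in row] for row in measured]

# Overall standard deviation, computed over the complete dataset.
sigma_total = pstdev([value for row in measured for value in row])
# Standard deviation of the first snapshot only.
sigma_snapshot = pstdev(measured[0])

rrmse_total = rrmse(predicted[0], measured[0], sigma_total)
rrmse_snapshot = rrmse(predicted[0], measured[0], sigma_snapshot)
```

The same absolute error yields a small rRMSE against the overall deviation and a much larger one against the per-snapshot deviation, which mirrors the behavior described for pressure.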
The overall results for the full dataset may be checked in Table 1. Here, the rRMSE considers a period of 14 days and the 21 available WS. The error committed by the NS equations, calculated as $\sqrt{e_1^2 + e_2^2 + e_3^2}$, is also indicated. As can be noted from the aforementioned table, increasing the resolution of the output grid is beneficial since the PINN is able to embed the WS measurements more easily. This is reflected by a decreasing rRMSE in every variable for a finer spatial resolution. However, a finer grid entails a higher uncertainty when calculating spatial derivatives, resulting in a higher residual of the NS equations, which are subject to the inaccuracy of the reference measurements and the precision at which the PINN is able to assimilate them. Nevertheless, we can conclude that, overall, increasing the spatial resolution of the output grid works in favor of a more accurate reconstruction. Indeed, when the selected resolution is of the same order of magnitude as the one of the reference data (or rougher), the PINN starts experiencing difficulties when assimilating WS reference information while enforcing in parallel the compliance with NS equations.

Table 1. Errors with respect to WS measured data and NS residuals over the reconstructed domain. The average value of all the errors for each spatial resolution is included for reference purposes.

Tuning the strength of the physics constraint
The intensity at which the physics constraint is applied during the training of the PINN is a hyperparameter that has been demonstrated to have a significant effect on the precision of the final reconstruction. In fact, recent studies show that this adaptability improves the performance of different NNs when reference data are very noisy or inaccurate [15,18,33]. This hyperparameter, typically named λ, is incorporated into the loss function and indicates the rate of regularization that is desired on the final reconstruction. The corrected contribution to the total loss function (Equation 3) may be written as ℒ*_NS = λℒ_NS, resulting in an updated definition, which now reads

$$
\mathcal{L} = \omega^*_{NS}\,\lambda\mathcal{L}_{NS} + \omega^*_u\mathcal{L}_u + \omega^*_v\mathcal{L}_v + \omega^*_p\mathcal{L}_p
$$

It is important to note that, in this definition, the contribution ℒ_NS is calculated using the same methodology as explained in Section 2 and that the weights ω* need to be updated to incorporate the effect of λ, i.e. ω*_• = ℒ*_•/(λℒ_NS + ℒ_u + ℒ_v + ℒ_p). Consequently, λ = 0 represents a pure data assimilation methodology, i.e. the PINN does not enforce any physics regularization during its training and the final outcome will only be accurate at the WS locations. Therefore, information at any other point in the field of reconstruction would not necessarily be reliable. Increasing values of λ indicate a stronger enforcement of the physics constraints, with λ = 1 corresponding to the standard definition of the loss function expressed in Equation 3. Two main tendencies may be approached with this hyperparameter: λ < 1 indicates that more importance is given to the assimilation of WS measurements and less to the regularization via physics constraints, whereas λ > 1 performs inversely, i.e. the PINN is directed to reconstruct a field that is physically viable rather than one that better matches the reference information. Note that λ shall not be set to negative values by definition. Nonetheless, any configuration of λ except its null value will entail a certain regularization via physics constraints, which influences the way the PINN embeds the reference data since those values are prone to be affected by noise. These inaccuracies in the original dataset may reach up to levels of 40% in some instances [34,35]. The imposition of NS may contribute to the partial correction of those errors, as the physics constraints allow for a regularization of the original data [22,33,36]. This regularization comes at the expense of never achieving a null loss value in Equation 3: a perfect assimilation of the reference data, i.e. ℒ_WS = 0, would never be compatible with ℒ_NS = 0, and vice versa. From laminar to slightly turbulent regimes, PINNs have been shown to adequately correct experimental errors [22,33], achieving in all cases an excellent final reconstruction. Table 2, Table 3 [...] $\sqrt{e_1^2 + e_2^2 + e_3^2}$.
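The role of λ can be illustrated with a short sketch that scales the physics contribution before the adaptive weights are computed, following the weight definition given in the text. The function name `lambda_weighted_loss` and the unit loss values are hypothetical.

```python
def lambda_weighted_loss(loss_ns, loss_u, loss_v, loss_p, lam=1.0):
    """Adaptive weighted loss with the NS contribution scaled by lam.

    Each weight is the scaled contribution divided by
    (lam * loss_ns + loss_u + loss_v + loss_p).
    Returns (total_loss, weights).
    """
    if lam < 0.0:
        raise ValueError("lam must not be negative")
    terms = {"NS": lam * loss_ns, "u": loss_u, "v": loss_v, "p": loss_p}
    total = sum(terms.values())
    if total == 0.0:
        return 0.0, {name: 0.0 for name in terms}
    weights = {name: value / total for name, value in terms.items()}
    return sum(weights[name] * terms[name] for name in terms), weights


# Share of the physics term for equal raw contributions and three settings.
_, w_data_only = lambda_weighted_loss(1.0, 1.0, 1.0, 1.0, lam=0.0)
_, w_standard = lambda_weighted_loss(1.0, 1.0, 1.0, 1.0, lam=1.0)
loss_strong, w_strong = lambda_weighted_loss(1.0, 1.0, 1.0, 1.0, lam=2.0)
```

With equal raw contributions, the weight of the physics term grows from 0 (λ = 0) to 0.25 (λ = 1) and 0.4 (λ = 2), which shows how the hyperparameter shifts the balance between regularization and data assimilation.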
As a common trend for the different spatial resolutions, increasing values of λ reduce the error of the NS equations to the detriment of the rRMSE of the reference variables. However, when looking at the average error output, it is worth noticing that the PINN significantly improves its performance when λ ≈ 2 for all levels of spatial resolution. This is an indicator that when reference data are noisy and scarce, more importance must be given to the regularizing constraint than to the assimilation of reference information. This is reflected in Figure 4, in which the loss contributions during the training of the PINN are disclosed for different levels of λ. At some point during training, the PINN reaches a plateau in which an equilibrium has been found between the two counteracting effects discussed before: the imposition of the physics constraints and the compliance with 'noisy' accessible data. This plateau shows a higher error with respect to the reference information for increasing values of λ. This originates from the fact that more importance is given to the regularization, and therefore, the PINN is more inclined to reduce the error with respect to the NS equations rather than assimilate the given data. In conclusion, if reference information is suspected of being inaccurate, more effort needs to be oriented to the embedding of the physics regularization, which may partially compensate for the original data errors. Nevertheless, it is important to note that this regularization factor should not be too high, since in those cases an excessive deviation with respect to the reference information may occur, impeding a final reconstruction in agreement with experimentally accessible data.

Validation of the weather model via PINN
As discussed in previous subsections, weather information is not readily available in locations where there are no recording stations. As a result, it is extremely difficult to reconstruct the weather flow and to find useful information for comparison and validation purposes, since the available data would be based on simulations, which already incorporate some hypotheses about the reconstructed behavior. Given such a complex scenario, the only plausible method for a strict validation of our methodology is to remove weather stations from the training dataset and use them as test cases, i.e. to estimate wind velocity and pressure at the location of those test stations, where actual measurements are readily available.
The proposed methodology is tested on different cases to check its integrity when operating on weather data that are completely inside or outside the field of measurement. For that purpose, three test scenarios are established based on the removal of WS very close to one another, WS very far from one another, and WS contained within an envelope. Each scenario is represented in Figure 5, with blue dots indicating the WS retained for training and red dots the WS removed and kept for testing.
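As an illustration of how such a hold-out split could be built, the sketch below constructs an 'envelope' scenario by keeping the stations on the outer boundary for training and holding out the interior ones for testing. Taking the envelope to be the convex hull of the station coordinates is an assumption made here for illustration; the station names and coordinates are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def envelope_split(stations):
    """Train on hull stations, test on the stations enclosed by them."""
    hull = set(convex_hull(list(stations.values())))
    train = sorted(name for name, xy in stations.items() if xy in hull)
    test = sorted(name for name, xy in stations.items() if xy not in hull)
    return train, test

# Hypothetical station coordinates (longitude, latitude) in degrees
stations = {
    "WS1": (0.0, 0.0), "WS2": (2.0, 0.0), "WS3": (2.0, 2.0),
    "WS4": (0.0, 2.0), "WS5": (1.0, 1.0), "WS6": (0.5, 1.5),
}
train, test = envelope_split(stations)
print("train:", train)
print("test:", test)
```

The 'close' and 'far' scenarios could be built analogously by selecting test stations according to their distance to the remaining training set.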
The 'close' scenario is particularly interesting because it checks that the regularization is done correctly. Any physics constraint should be able to project a precise behavior onto points close to the ones used as reference. This is the first benchmark that should be addressed to validate the reconstruction capacity of the PINN. In this case, the aim is to demonstrate that the PINN is able to detect changes in the near field. For that purpose, the rRMSE corresponding to the test WS is calculated for each variable. Results are summarized in Table 5 and compared to other approximation methods, such as linear interpolation, nearest and natural neighbor, and cubic and spline interpolation.
Here, PINNs are able to perform similarly to other interpolation methodologies. This is expected since the stations that have been removed are all in the vicinity of the ones used during training, so large changes in the velocity and pressure values are not anticipated. It is very important to notice that rRMSEs are calculated based on the original values measured by the WS, which may be inaccurate. This adds another layer of uncertainty to a proper comparison, since the PINN has been shown to partially compensate for experimental / measuring errors. As a result, the PINN is presumed to perform better than reflected here. A direct interpolation using the alternative methodologies does not compensate for errors, and therefore their predictions are assumed to be as noisy as the reference data. This is one of the main advantages of using PINNs: noise and errors in the original database are partially taken care of 22 , improving the overall performance. In addition, even though all the interpolators used here for comparison show a high predictive capacity, they are not able to incorporate any physics regularization, i.e. the error with respect to the NS equations is orders of magnitude higher than that of the PINN. This is an added value in other scenarios in which the points of interest are not in the vicinity of the reference ones. This is particularly the case of the 'far' scenario, which exposes the performance of the PINN when reconstructing values far away from the set of training WS. The results are summarized in Table 6. In this scenario, the PINN proves to be remarkably more accurate than any other interpolation method (note that spline and cubic interpolators are not designed to be used as extrapolators and, therefore, no error is calculated).

Table 5. Errors on test WS for the 'close' scenario calculated as rRMSE for each flow variable. The NS error is represented by the square root of the summation of squared residuals from the NS equations. The comparison to other methodologies is also indicated.
Both the errors with respect to the test WS and the NS residuals are significantly reduced as compared to those from the alternative interpolators. This validation is of extreme importance because it demonstrates the viability of using PINNs as regularizers even when spatial and time resolution are lacking: whereas standard computational fluid dynamics (CFD) may incur a significant computational cost and time, PINNs show a high precision with a remarkable reduction in the computational effort, being significantly better than any interpolation methodology that does not account for physics constraints.
Finally, the ability to infer information inside a domain enclosed by a WS envelope is assessed. In this last test, only WS located at the rim of the set of available WS are allowed during training, i.e. the selected WS form an envelope over the region of interest. A comparison between the use of the PINN with different grid resolutions and other methodologies is given in Table 7. PINNs show again an excellent performance, indicating that the regularization via NS equations is properly embedded into the system. A high accuracy when predicting flow variables at the test WS is always achieved. Nevertheless, there are other simpler methodologies which provide similar forecasting capabilities. Again, we must note that these alternative interpolators do not incorporate any physics behavior, and therefore the error with respect to the NS equations is at least one order of magnitude higher than that provided by the PINN. Therefore, we demonstrate the higher efficiency of the PINN when compared to other standard interpolation techniques. One may argue that the major improvement from the use of the PINN originates from the noise compensation / cancellation, as already discussed in 22. Figure 6 shows the rRMSE over the reconstructed timespan for the test WS in each validation scenario. A reference horizontal line at rRMSE = 1 serves as a threshold to determine whether the outcome of the PINN may be considered accurate (below threshold) or not (over threshold).
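A minimal sketch of the error metric may help to read this threshold. It assumes that the rRMSE is the root mean squared error divided by the standard deviation of the reference series, in line with the normalization described in the captions of Figures 3 and 6; the function name `rrmse` and the data are hypothetical.

```python
import numpy as np

def rrmse(pred, ref):
    """RMSE normalized by the standard deviation of the reference series (assumed definition)."""
    rmse = np.sqrt(np.mean((pred - ref) ** 2))
    return rmse / np.std(ref)

# Hypothetical reference recordings at one test station
ref = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# A prediction with a constant offset of 0.5
offset_pred = ref + 0.5

# A trivial prediction that always returns the mean of the reference
mean_pred = np.full_like(ref, ref.mean())

print("offset prediction:", rrmse(offset_pred, ref))
print("mean prediction:  ", rrmse(mean_pred, ref))
```

Under this definition, a predictor that always returns the mean of the reference series scores exactly 1, so values below 1 indicate that the reconstruction captures part of the variability of the recordings.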
Note that wind velocity is more challenging to predict correctly than pressure. As already discussed, this is because pressure is usually a very stable variable which barely fluctuates within a short timespan. This allows the PINN to accommodate the reference data more effectively and therefore to estimate well the pressure values at other stations, which would not significantly differ. That stability does not occur with wind, which oscillates more frequently. Once more, note that errors are measured with respect to the real values given by WS, which are intrinsically affected by noise. If perfectly accurate values with no noise were readily at hand, the error curves would presumably improve. Nonetheless, the capacity of PINNs to embed physics constraints remains their main advantage over other methodologies that, even if precise enough for a fast estimation of low-fluctuating phenomena, do not impose any regularization on the original data. PINNs show an exceptional reconstruction accuracy both when interpolating and extrapolating information in the field of measurement thanks to their ability to calculate precise derivatives and consequently impose a physics background.

Conclusions
In this article, we have discussed the potential of using PINNs for the accurate reconstruction of wind and pressure fields from WS recordings. Different parameters have been studied so as to gain a deep insight into the way a PINN learns relevant data and applies a regularization based on several physics constraints.
First of all, the scarcity of accurate information limits the capacity of the PINN to assimilate the experimentally accessible data while at the same time enforcing compliance with physics laws. This impediment originates from the data sources, based on WS that are fixed in space and provide recordings every several minutes. Therefore, the first drawback that is encountered is the limited time and spatial resolution. Assumptions can be made to estimate whether the available time resolution is fine enough to capture relevant phenomena, which is the case for the majority of events discussed in this article. Concerning spatial resolution, the ratio of reference data to the final output grid is an important parameter. For 14 days of recording with a time interval of 10 min and 21 available WS, this ratio corresponds to 1.20, 4.25 and 13.13% for an output resolution of R = 0.05°, 0.1° and 0.2°, respectively. This small percentage is extremely relevant and plays a significant role in the ability of the PINN to properly assimilate the reference values and incorporate them into the learning process when constraints are imposed in the rest of the domain. The strength with which the physics constraint is imposed also plays a remarkably relevant role. This strength is set by tuning the adjustable parameter λ.
Values smaller than unity indicate stronger efforts to reconstruct a field which more accurately resembles the reference information, whereas higher values impose a stronger regularization via physics. For cases in which data are scarce and not necessarily accurate, λ is recommended to be always slightly higher than unity, though not so high that the output field may totally differ from the accessible information. The higher the scarcity or the inaccuracy of the original reference values, the higher λ should be, allowing the PINN to partially compensate for given errors.
Finally, PINNs are able to perform adequately when reconstructing the velocity and pressure domain as compared to other interpolators. Two effects play to our advantage: the reduction of noise and inaccuracies from the original dataset, and the capacity of the PINN to extrapolate information with precision due to the regularization by enforcement of physics constraints. Whereas PINNs achieve relatively similar results to other interpolators when estimating values very close to the location of the WS or within an envelope, they outperform other methodologies when predicting the behavior of the atmosphere far away from the cluster formed by the source WS. The enhancement of data predictability is therefore justified and the regularization provided by the NS equations suffices for a correct weather estimation in other regions not covered by the WS. Other interpolators fail at this task, since their capability of extracting precise information at other locations is highly dependent on the precision of the reference points, failing drastically when extrapolating outside the region of training.
PINNs have also been shown to be a very powerful tool when estimating information at other locations and smoothing fluctuations that may occur in the fluid field. This works to our advantage when calculating temporal and spatial derivatives, essential to comply with the physics constraints imposed by the NS equations. However, a major limitation originates from this smoothing effect: rapid events developing and disappearing below the time resolution of the original dataset and local events occurring in a region below the spatial resolution (and not captured by a WS) would be neither reconstructed nor recovered. PINNs are thus useful neural networks to partially compensate for source errors, to accurately estimate flow properties in other areas in and outside the field of view, and to recover average field behavior. Consequently, PINNs have been proven to be a useful asset for reconstructing with precision a complete physically-based wind and pressure field from scarce WS data. Nevertheless, they are still very dependent on the quality of the reference information. Sparsity and inaccuracy of data are certainly the case in most real situations. This first attempt to fully reconstruct a weather field with the use of a PINN shows promising results to improve meteorological predictions in a rapid manner. Therefore, it opens many doors for future research regarding the application of physics constraints to embed and regularize weather fields.

2. Although some basic interpolation methods are shown, there is no benchmark of how the PINN method compares against a standard data assimilation method. I think this is necessary in order to evaluate whether the PINN is a realistic alternative for other methods.
3. I think it would be good to see more spatial fields (like Figure 2) so that the reader can more easily compare the differences between the PINN and the basic interpolation methods which the authors claim are unphysical. Is this unphysicalness visible in the spatial field?
4. It would be good to test this method in a different location to see if it generalises, although I accept this may be difficult due to access to data.

Minor
1. I am a little bit confused by your emphasis on the accurate provision of weather information being useful for air traffic management, in the abstract. As you rightly point out in the introduction there are far more important reasons for accurate weather prediction such as mitigating impacts of extreme events.
2. The introduction is missing all the developments since 2022 of large scale data-driven weather forecasting models such as Pangu, GraphCast, FourCastNet and AIFS. Furthermore there have been studies on the physicalness of these models, for example Hakim et al. (2024) and Bonavita (2024), which I think need to be mentioned. There has also been work using neural networks with data assimilation (see for example Melinc and Zaplotnik (2024)).
3. I think you could directly write the simplified version of the Navier-Stokes equations, eliminating the vertical components (like you have in Figure 1). For this paper what is relevant is what the PINN is using and you can just add a reference for the Navier-Stokes equations.
4. In Figure 3, what resolution are you using for the results?
5. What are the increased costs associated with increasing resolution? Does it scale linearly?
6. Figure 6: How have you determined that rRMSE=1 is a good accuracy threshold?

If applicable, is the statistical analysis and its interpretation appropriate? Partly
Are all the source data underlying the results available to ensure full reproducibility? Yes

Hao Chen
Beihang University, Beijing, Beijing, China
This manuscript proposed a PINN-based method for weather reconstruction from sparse weather station observations. The reviewer believes that the task studied in the manuscript is very interesting and has strong practical significance. The author has conducted a comprehensive review of the work related to PINN, and the logical flow in the introduction section is relatively clear. However, there are still some issues that need to be addressed. The specific suggestions are as follows:
- As far as the reviewer knows, the basic motion of the atmosphere is described by the set of atmospheric motion equations. However, the differential equation constraints used in the manuscript only include the idealized NS (Navier-Stokes) equations. How can the effectiveness of these differential equation constraints be ensured in real-world scenarios under such conditions?
- Directly obtaining meteorological field data from weather station observations is an interesting task. However, for general analysis fields or reanalysis fields, various observations, including satellite remote sensing and model outputs, are usually assimilated. How can these data be integrated within the current framework?
- In the experimental section, the author compared the model performance under different hyperparameter settings but lacked a comparison with some publicly available meteorological analysis fields, such as ERA5. Please explain the reason for this.
- At such a high resolution on the kilometer scale, terrain has a significant impact on meteorological variables. Have terrain-related factors been considered?
- In addition to work related to PINNs, recent studies that combine meteorology with PINNs and focus on site observations should be mentioned and reviewed in the manuscript.
Reviewer Expertise: weather forecasting, data fusion, PINNs, remote sensing
I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.
validating the model on unseen data. Although minor, I feel the paper would benefit from addressing the following concerns:
1) It is not explicitly mentioned in Section 3.3 what value of lambda was used for training the PINN in the three scenarios.
2) It would be nice if the authors could add another test case where only a certain duration (e.g. 8-10 days) out of 14 days is included in the training set and the rest in the test set.
3) The authors highlight that PINN plays a critical role in "the reduction of noise and inaccuracies from the original dataset" by regularization through physics constraints. To make this more explicit, it would be nice if the authors could create synthetic data following Navier-Stokes equations, add noise, and then train PINN on the noisy data to demonstrate the retrieval of the original information (not affected by noise) in comparison to the baseline methods.
The paper is already in great shape and should be ready to be accepted for indexing after the above concerns are addressed.

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: I specialize in novel applications of Machine Learning techniques on heliophysics problems. My current research problems range from inter-instrument data calibration, space weather forecasting, detection of solar transients, and interpretable generation of solar data.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 1. Sketch of the proposed methodology discussed in Section 3. First, a real dataset containing weather information provided by a fixed number of WS is used to train a PINN after a process of data non-dimensionalization (refer to Subsection 2.1 for a detailed description). The PINN considers two main contributions during its learning process: the residuals from the NS equations and the error with respect to experimentally accessible data. The tuning of the strength of the physics constraints, thoroughly explained in Subsection 3.2, is of crucial importance when reference information is scarce and potentially subject to noise. The final output of the process is a detailed fluid field disclosing both wind components in horizontal and vertical directions and pressure.
Figure 2 represents the different reconstructions for a typical instant given the different grid resolutions, showing great similarities among them while at the same time exposing the difficulties in

Figure 2. Reconstruction of wind velocity and pressure fields from measurements given by a WS set via PINN. The outcomes in each row correspond to a different spatial resolution R. The reference data for this specific time instant are shown in the first row as a guideline. A video entitled PINN_resolution.mp4, disclosing the weather evolution for 14 days of recording, is submitted along with this article.

Figure 3. rRMSE with respect to WS reference data normalized by (blue) the standard deviation of the corresponding variable over the full timespan considered, σ() , and (red) the standard deviation over each specific time intake, σ(()).

Figure 4. Error evolution regarding experimentally accessible data during PINN training for different spatial resolutions, computed as the average of each loss contribution with respect to the three reference variables, i.e. ( u +  v +  p )/3. Independently of the value of λ, all curves reach a plateau when an equilibrium has been achieved between the efforts put by the PINN to reduce the error with respect to the NS while assimilating scarce noisy data.

Figure 5. Description of selected WS for validation purposes for each test case. In blue, the retained WS for training are highlighted, whereas those in red are removed and only used for testing (and therefore are not included in the training dataset).

Figure 6. Evolution over the reconstructed time of the rRMSE of the predicted variables at the test WS for each validation case scenario, namely 'close', 'far' and 'envelope'. Here, the standard deviation over the complete timespan, i.e. σ(), has been used for normalization purposes when calculating rRMSE. The horizontal dotted black line indicates the threshold above which estimated values are considered inaccurate.
Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Data-driven weather forecasting
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Reviewer Report 23 July 2024
https://doi.org/10.21956/openreseurope.18793.r41858
© 2024 Chen H. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

[1] https://arxiv.org/abs/2401.04125
[2] https://www.semanticscholar.org/paper/Deep-Learning-for-Day-Forecasts-from-Sparse-Andrychowicz-Espeholt/207c4b5a58d2b47a1e1b5f8a6877018a99f1bf68

Is the work clearly and accurately presented and does it cite the current literature? Partly
Is the study design appropriate and does the work have academic merit? Partly
Are sufficient details of methods and analysis provided to allow replication by others? Partly
If applicable, is the statistical analysis and its interpretation appropriate? Not applicable
Are all the source data underlying the results available to ensure full reproducibility? No
Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.

Table 2. Errors with respect to WS data and NS over the full domain with spatial resolution R = 0.2° for different levels of λ regularization. In blue, the optimal configuration is highlighted.
and Table 4 collect the errors obtained during the training of several PINNs varying the different degrees of regularization via λ for different spatial resolutions.

Table 4. Errors with respect to WS data and NS over the full domain with spatial resolution R = 0.05° for different levels of λ regularization. In blue, the optimal configuration is highlighted.

Table 7. Errors on test WS for the 'envelope' scenario. Labels follow the same description as in Table 5.

Table 6. Errors on test WS for the 'far' scenario. Labels follow the same description as in Table 5.

Is the work clearly and accurately presented and does it cite the current literature? Partly
Is the study design appropriate and does the work have academic merit? Partly
Are sufficient details of methods and analysis provided to allow replication by others? Partly
If applicable, is the statistical analysis and its interpretation appropriate? Partly
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Partly
simulated result as well in the figures as well as the PDE residual map at the end of training to see where the residual is high.
7. Very minor: the \lambda tables might be easier to read if they were a figure to see the tradeoffs.
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Deep learning for science, HPC
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

1. 14 days is not enough to evaluate a weather forecasting model. Generally models are evaluated against a whole season. Moreover it is not clear to me how you have split the training and test datasets in section 3.1 and 3.2.