Advanced regression models for ionospheric delay prediction using GNSS measurements

Kaselimi M., Doulamis N., Doulamis A., Delikaraoglou D., National Technical University of Athens. Advanced Regression Models for Ionospheric Delay Prediction using GNSS Systems. Honorary Commemorative Volume in Memory of Evangelia Lambrou, Zografou Campus, Athens, 2020.


Introduction
The Precise Point Positioning (PPP) technique (Bakker et al., 2017) has attracted great interest in the geoscience community, mainly because it provides low-cost, high-accuracy position estimates using Global Navigation Satellite System (GNSS) data from stand-alone receivers. One of its main drawbacks is the long convergence time needed to reach positioning accuracy at the level of a few centimeters. Recently, various approaches for accelerating single-frequency PPP convergence have been proposed, based on combining multi-constellation GNSS measurements with global ionospheric map (GIM) data. GIM data (Ou et al., 2012) provide the external total electron content (TEC) information required to handle the error imposed by the ionosphere. As the GNSS signal travels through the ionosphere, the upper part of the atmosphere between about 60 km and 1000 km altitude, it can encounter severe temporal and spatial changes of the electron density, which can significantly disrupt the GNSS radio wave. TEC is a suitable parameter for reflecting ionospheric changes caused by solar extreme ultraviolet radiation, geomagnetic storms, and atmospheric waves that propagate up from the lower atmosphere. Accurate and precise TEC estimates are therefore important for determining the ionospheric delays of GNSS signals.
The magnitude of TEC represents the integral of the location-dependent electron density along the signal path and depends strongly on local time, latitude, longitude, season, solar cycle and activity, and geomagnetic conditions. In practice, as the radio signal passes through the electrons in the ionosphere, its velocity and path change, and because the propagation characteristics of the GNSS signal depend on frequency, these changes have a large impact on the accuracy of GNSS satellite navigation systems (Alizadeh et al., 2013). Because of these factors, the highly variable ionosphere is difficult to model, which is one reason why the traditional ionosphere models frequently used in single-frequency GNSS data processing (e.g., Klobuchar (Klobuchar et al., 1987) and NeQuick (Nava et al., 2008)) fail to capture ionospheric conditions or to provide TEC values with high accuracy and precision. For instance, the ionospheric parameters of the Klobuchar model are broadcast to the user via the GPS satellites' navigation message. The underlying model corrects, on average, only about 50-60% of the actual ionospheric delay at mid-latitude locations during quiet ionospheric conditions, which can be useful to single-frequency users in applications that do not demand high accuracy (Klobuchar et al., 1987). Recently, however, the need for accurate, real-time, single-frequency positioning has grown, which in turn makes it necessary to consider external ionosphere products of high accuracy. In this paper, we propose an advanced regression-based machine learning model to efficiently predict TEC values per satellite, in an attempt to replace GIM-derived data and consequently achieve high accuracy in PPP and/or PPP-RTK positioning.

Related work
Based on empirical approaches, the Klobuchar and NeQuick broadcast models are used to estimate the ionospheric parameters for single-frequency users. In a way similar to the Klobuchar ionospheric model (albeit with its aforementioned drawbacks), NeQuick is the ionospheric model adopted by the European global navigation satellite system Galileo to compute the ionospheric delay corrections for its single-frequency users. Additionally, the International Reference Ionosphere (IRI) model is a global climatological model of the terrestrial ionosphere, recommended for international use by the Committee On Space Research (COSPAR) and the International Union of Radio Science (URSI). Apart from the above physical and empirical models, information about the ionosphere can also be extracted from GNSS observations, using linear combinations of multi-frequency observables. Based on such computational procedures, the International GNSS Service (IGS) provides Global Ionospheric Maps (GIM) in the IONosphere map EXchange (IONEX) format (Montenbruck et al., 2017). The GIMs are constructed using TEC data derived from hundreds of GNSS stations worldwide. The overall effort is coordinated by the IGS Working Group on Ionosphere, which was established in 1998 and includes four Ionosphere Associate Analysis Centers (IAACs): the Centre for Orbit Determination in Europe (CODE), the Jet Propulsion Laboratory (JPL), the European Space Agency (ESA), and the Polytechnic University of Catalonia (UPC). These centers use different approaches and techniques to compute global ionospheric models. It is worth mentioning that satellite altimetry and Very Long Baseline Interferometry (VLBI) measurements are also suitable for obtaining ionospheric parameters.
The above-mentioned models are global representations of TEC values, and their accuracy is limited by various factors: for instance, the adoption of a simplified model in the case of Klobuchar; the fact that NeQuick provides typical median conditions of the ionosphere (and hence cannot successfully handle extreme ionospheric conditions); or the non-uniform accuracy of IGS TEC data across the globe in the case of the GIMs. Most of these models are stationary and difficult to adapt to the irregular and complex distribution of TEC in the ionosphere, and they cannot sufficiently learn the complex abstractions involved in modeling TEC variations.
Significant research effort has been devoted to developing TEC forecasting methods, including Support Vector Machines (SVM) (Zhang et al., 2019), nonlinear radial basis function (RBF) neural networks (Huang et al., 2014), artificial neural networks (Bosco et al., 2009), and LSTM networks (Kaselimi et al., 2019; Sun et al., 2017). In particular, being capable of learning noisy and nonlinear relationships, neural networks are considered an interesting approach to ionospheric TEC prediction. However, the above-mentioned studies focus on TEC prediction based on estimates derived from TEC models, such as GIMs, and do not attempt to predict ionospheric delays separately for each visible GNSS satellite, as in our approach.

Ionosphere variability
The ionosphere is typically defined as the part of the earth's upper atmosphere with a sufficient concentration of free electrons to affect the propagation of electromagnetic waves. Its existence is primarily the result of the absorption of solar ultraviolet radiation in that part of the atmosphere, which in turn produces free electrons and ions. TEC, which varies in space and time, is often used to describe ionospheric variability.
It is widely known that the ionosphere exhibits significant variations with:
• latitude and longitude: the most disturbed region is the auroral zone (between 60° and 70° N geomagnetic latitude), followed by the polar zone (> 70° N), while significant ionospheric irregularities also appear over the equatorial region.
• local time: during daylight hours, the variations in ionospheric conditions are larger than at night.
• solar cycle and geomagnetic activity: linked to the 11-year cycle of the sporadic magnetic storms that coincide with the peak of sunspot activity, and to the recurrent magnetic storms that reach their maximum approximately two years after the solar activity maximum.
As the GNSS signal propagates through the ionosphere, it can encounter severe temporal and spatial changes of the electron density, which can significantly disrupt the GNSS radio wave. As a consequence of the ionosphere's dispersive nature, different carrier frequencies are affected by different delays. This provides one of the greatest advantages of multi-frequency receivers over single-frequency ones: appropriate mathematical combinations of the observables on different frequencies can eliminate the first-order ionospheric delay error. Hence, the use of multiple navigation signals with distinct center frequencies transmitted from the same GNSS satellite allows direct observation and removal of most of the ionospheric delay.
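As a rough illustration of the dispersive behaviour described above, the following sketch evaluates the first-order ionospheric group delay on two carriers, then recovers the slant TEC and an ionosphere-free range from a pair of pseudoranges. It is a minimal sketch rather than the processing used in this paper: the first-order constant 40.3 and the GPS L1/L2 frequencies are standard values, while the function names are hypothetical and the pseudoranges are assumed to be free of noise, multipath, and code biases.

```python
# First-order ionospheric delay and its removal with two frequencies.
# Assumption: ideal pseudoranges (no noise, multipath, or code biases).

K_IONO = 40.3e16   # first-order constant, m * Hz^2 per TECU (1 TECU = 1e16 el/m^2)
F_L1 = 1575.42e6   # GPS L1 carrier frequency, Hz
F_L2 = 1227.60e6   # GPS L2 carrier frequency, Hz


def iono_delay_m(stec_tecu, freq_hz):
    """First-order group delay in metres for a slant TEC given in TECU."""
    return K_IONO * stec_tecu / freq_hz ** 2


def stec_from_dual_freq(p1_m, p2_m):
    """Slant TEC (TECU) from the geometry-free difference P2 - P1."""
    scale = (F_L1 ** 2 * F_L2 ** 2) / (K_IONO * (F_L1 ** 2 - F_L2 ** 2))
    return (p2_m - p1_m) * scale


def iono_free_range(p1_m, p2_m):
    """Ionosphere-free combination that cancels the first-order delay."""
    gamma = F_L1 ** 2 / F_L2 ** 2
    return (gamma * p1_m - p2_m) / (gamma - 1.0)


# Illustrative (made-up) geometry: 22,000 km range and 25 TECU along the path.
true_range_m = 22_000_000.0
true_stec = 25.0
p1 = true_range_m + iono_delay_m(true_stec, F_L1)
p2 = true_range_m + iono_delay_m(true_stec, F_L2)

print("delay on L1 [m]:", iono_delay_m(true_stec, F_L1))
print("delay on L2 [m]:", iono_delay_m(true_stec, F_L2))
print("recovered STEC [TECU]:", stec_from_dual_freq(p1, p2))
print("ionosphere-free range [m]:", iono_free_range(p1, p2))
```

Because the delay scales with the inverse square of the frequency, the lower L2 carrier is delayed more than L1. A single-frequency user has no such difference to exploit, which is why an external TEC source is needed.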
The amount of time that a GNSS signal spends travelling through the ionosphere determines the severity of the ionosphere's effect on that signal. A signal from a satellite near the observer's horizon (low satellite elevation) passes through a larger portion of the ionosphere before reaching the receiver than a signal from a satellite near the observer's zenith (high satellite elevation). Therefore, the longer the signal's path through the ionosphere, the greater the ionosphere's effect on it.

Precise Point Positioning method
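Our goal is to construct robust, regional, and adaptable models of ionospheric variability that can be applied in single-frequency PPP processing as external ionosphere correction information. To this end, we create machine learning-based models that can be used to predict the ionosphere parameters. These models first need to be trained, so ground-truth TEC values are required. The workflow is as follows: a dual-frequency, undifferenced and uncombined PPP model is applied to estimate slant TEC (STEC) values; these values are separated from the satellite and receiver differential code biases (DCBs); the resulting STEC values are converted to vertical TEC (VTEC) values through a mapping function that depends on the satellite's elevation; and the VTEC values are used as ground truth for training the supervised models.
The terms of the pseudorange and carrier-phase observation equations (Equation (1)) are the geometric distance between the receiver and the satellite; the speed of light in vacuum, c; the receiver and satellite clock offsets; the frequency-dependent code biases; and the sum of measurement noise and multipath error for the pseudorange and carrier-phase observations.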
In the case of dual-frequency GPS observations, Equation (1) is specialized to the two carrier frequencies. Because the code biases are commonly generated and distributed as differential code biases (DCBs), and given $\gamma_2 = f_1^2 / f_2^2$, the ionospheric term $I$ is grouped with the DCBs (based on Equation (3)). After the STEC values have been separated from the DCBs, the STEC can be converted into the vertical total electron content (VTEC) under the single-layer model assumption, as follows (Xiang et al., 2019):
$$\mathrm{VTEC} = \mathrm{STEC} \cdot \cos\left(\arcsin\left(\frac{R_E \cos\theta}{R_E + h_{ion}}\right)\right)$$
where $R_E$ is the mean radius of the Earth; $\theta$ is the elevation angle of the satellite; and $h_{ion}$ is the height of the ionospheric layer (typically taken at 350 km).
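The STEC-to-VTEC conversion under the single-layer assumption can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the Earth radius of 6371 km is the usual mean value, and the layer height defaults to the 350 km quoted in the text.

```python
import math

R_E_KM = 6371.0    # mean Earth radius, km (standard value)
H_ION_KM = 350.0   # single-layer height, km (value quoted in the text)


def mapping_factor(elev_deg, h_ion_km=H_ION_KM):
    """Cosine of the zenith angle at the ionospheric pierce point.

    elev_deg is the satellite elevation angle seen from the receiver.
    """
    sin_zenith_ipp = R_E_KM * math.cos(math.radians(elev_deg)) / (R_E_KM + h_ion_km)
    return math.cos(math.asin(sin_zenith_ipp))


def stec_to_vtec(stec_tecu, elev_deg, h_ion_km=H_ION_KM):
    """Project a slant TEC value onto the vertical (same units as the input)."""
    return stec_tecu * mapping_factor(elev_deg, h_ion_km)


# The same 30 TECU slant value (made-up number) maps to a smaller VTEC
# at low elevation, where the slant path through the layer is longer.
for elev in (10.0, 30.0, 60.0, 90.0):
    print(elev, stec_to_vtec(30.0, elev))
```

At zenith the factor equals one, so STEC and VTEC coincide; towards the horizon the factor shrinks, reflecting the longer slant path through the ionosphere.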
Gaussian Process Regression
Gaussian process regression (GPR) is a nonparametric kernel-based probabilistic model (Kopsiaftis et al., 2019). Consider a training set $\{(x_i, y_i)\}_{i=1}^{n}$ of independent, identically distributed (i.i.d.) examples from some unknown distribution, where $x_i \in \mathbb{R}^d$ and $y_i \in \mathbb{R}$. A GPR model addresses the question of predicting the value of a response variable $y_*$, given the new input vector $x_*$ and the training set. A GPR model assumes that a response $y_i$ satisfies the following equation:
$$y_i = f(x_i) + \varepsilon_i \qquad (9)$$
where $\varepsilon_i$ are i.i.d. noise variables, so that $\varepsilon_i \sim \mathcal{N}(0, \sigma^2)$.
The GPR method calculates the posterior predictive distribution for new test inputs. Gaussian processes can be considered an extension of multivariate Gaussians to infinite-sized collections of real-valued variables. The marginal distribution of the training responses $y$ follows $y \sim \mathcal{N}(0, K_y)$, where $K_y$ is the kernel matrix:
$$K_y = K + \sigma^2 I_n, \qquad K_{ij} = k(x_i, x_j)$$
Thus, the predictive distribution $p(y_* \mid y)$ is:
$$p(y_* \mid y) = \mathcal{N}(\mu_*, \sigma_*^2)$$
where
$$\mu_* = k_*^{\top} K_y^{-1} y, \qquad \sigma_*^2 = k(x_*, x_*) + \sigma^2 - k_*^{\top} K_y^{-1} k_*$$
and $k_* = [k(x_1, x_*), \ldots, k(x_n, x_*)]^{\top}$.
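To make the predictive distribution concrete, the sketch below implements the standard zero-mean GPR predictive mean and variance with NumPy. It is a minimal sketch under stated assumptions: the squared-exponential kernel, its hyperparameters, the noise variance, and the synthetic sine-wave data are illustrative choices and are not taken from the paper.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel matrix between a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale ** 2)


def gpr_predict(x_train, y_train, x_test, noise_var=1e-2):
    """Predictive mean and variance of y* for each row of x_test."""
    k_y = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_star = rbf_kernel(x_train, x_test)            # shape (n, m)
    k_star_star = np.diag(rbf_kernel(x_test, x_test))

    # Cholesky factorisation avoids forming the explicit inverse of K_y.
    chol = np.linalg.cholesky(k_y)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_star)

    mean = k_star.T @ alpha
    var = k_star_star + noise_var - np.sum(v ** 2, axis=0)
    return mean, var


# Synthetic stand-in for a smooth VTEC-like series (not real data).
x_train = np.linspace(0.0, 6.0, 25)[:, None]
y_train = np.sin(x_train).ravel()
x_test = np.array([[1.5], [20.0]])   # one point inside the data, one far away

mean, var = gpr_predict(x_train, y_train, x_test)
print("predictive mean:", mean)
print("predictive variance:", var)
```

The predictive variance grows for the test point far from the training inputs, which is the uncertainty information that distinguishes GPR from a purely deterministic regressor.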

Support Vector Regression
Support vector machine (SVM) analysis is a popular machine learning tool for classification problems (Kopsiaftis et al., 2019). Support Vector Regression (SVR) is the generalization of the SVM method to regression problems. The regression problem is a generalization of the classification problem, in which the model returns a continuous-valued output, as opposed to an output from a finite set. In other words, a regression model estimates a continuous-valued multivariate function. Some regression problems cannot be described adequately by a linear model. In such cases, the Lagrange dual formulation allows the SVM technique to be extended to nonlinear functions: a nonlinear SVM regression model is obtained by replacing the dot product $x \cdot x'$ with a nonlinear kernel function $k(x, x') = \langle \varphi(x), \varphi(x') \rangle$, where $\varphi(x)$ is a transformation that maps $x$ to a high-dimensional space.
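The kernel substitution described above can be checked numerically on a small example. The sketch below uses a homogeneous quadratic kernel for two-dimensional inputs, for which the feature map is known in closed form, and verifies that the kernel value equals the dot product of the mapped vectors. The kernel choice and all function names are illustrative assumptions; the paper does not state which kernel was used.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, z):
    """Homogeneous quadratic kernel k(x, z) = (x . z)^2."""
    return dot(x, z) ** 2


def phi(x):
    """Explicit feature map for 2-D inputs that reproduces poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel; its feature space is infinite-dimensional."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


x = (1.0, 2.0)
z = (3.0, -1.0)

# The kernel evaluates the inner product in feature space without
# ever constructing phi(x) explicitly.
print("kernel value:", poly2_kernel(x, z))
print("feature-space dot product:", dot(phi(x), phi(z)))
print("RBF kernel value:", rbf_kernel(x, z))
```

For kernels such as the RBF, the mapped vectors cannot be written down at all, so evaluating the kernel directly is the only practical route.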

Experimental Results
During the data preparation process, we used the GAMP software, an open-source GNSS Analysis software for Multi-constellation and multi-frequency Precise positioning (Zhou et al., 2018). GAMP allows the use of undifferenced and uncombined observations in dual-frequency PPP processing to extract STEC values. Using the uncombined PPP (UPPP) model allows the ionospheric effects to be estimated as unknown parameters, without the need to impose ionospheric-free constraints as is done in the traditional PPP model, which amplifies the measurement noise when dual-frequency observations are combined to remove the ionospheric effects. The necessary RINEX observation and navigation files, along with precise orbit and clock information, IGS ANTEX (igs14.atx) and SINEX files, as well as ocean tide loading coefficients and differential code biases (DCBs) from the available IGS-supplied DCB products, are entered into GAMP for static PPP processing. Wuhan University's satellite orbits and clock offsets were also provided as input to GAMP. DCBs (Wang et al., 2020) are essential in many navigation and non-navigation applications (such as ionospheric analysis). As new signals are provided by the modernized GNSS systems, the need for a comprehensive multi-GNSS DCB product arises. DCB products, as part of the IGS Multi-GNSS Experiment (MGEX), are provided by the Chinese Academy of Sciences (CAS) in Wuhan.
We apply the dual-frequency undifferenced and uncombined PPP model to estimate STEC values. These values are then separated from the satellite and receiver DCBs. The resulting STEC values are converted to the corresponding VTEC values through a mapping function that depends on the satellite's elevation, and the VTEC values are used as ground truth for training the supervised models.
Figure 1 shows the VTEC values predicted by GPR for each of the GPS satellites (while they are visible from station 'bor1'), along with the VTEC values derived from the GAMP software (taken as the ground truth data). As noted, the GPR model can adequately predict the VTEC variation for every satellite separately.

Figure 1: VTEC ground truth values and the respective prediction with the GPR method.

Table 1: Performance metrics for the selected stations.

Table 1 summarizes the average Mean Absolute Error (MAE) and the average Root Mean Squared Error (RMSE) of the TEC predictions for all visible satellites. As noted, the GPR and SVR methods have similar performance, ranging between 0.7 and 1.0 for the MAE and between 0.9 and 1.3 for the RMSE. The GPR and SVR approaches have been implemented alongside linear-in-the-parameters time-series models, such as the simple Autoregressive (AR) and Autoregressive Moving Average (ARMA) models (Wang et al., 2018; Mandrikova et al., 2014; Cheng et al., 2018). As observed, non-linearity helps the GPR and SVR models to improve their performance and to provide better results than the traditional AR and ARMA models in VTEC estimation.
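For reference, the two metrics reported in Table 1 and their averaging over satellites can be computed as in the sketch below. The satellite identifiers and VTEC values are made-up placeholders used only to exercise the functions; they are not results from this study, and the helper names are hypothetical.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def average_over_satellites(series_by_sat, metric):
    """Average a metric over satellites; values are (truth, prediction) pairs."""
    scores = [metric(truth, pred) for truth, pred in series_by_sat.values()]
    return sum(scores) / len(scores)


# Placeholder series per satellite: (ground-truth VTEC, predicted VTEC).
example = {
    "G01": ([10.0, 12.0, 14.0, 16.0], [11.0, 12.0, 13.0, 18.0]),
    "G02": ([8.0, 9.0], [8.0, 10.0]),
}

print("average MAE:", average_over_satellites(example, mae))
print("average RMSE:", average_over_satellites(example, rmse))
```

RMSE penalises large individual errors more heavily than MAE, so it is never smaller than MAE on the same series, consistent with the ranges quoted above.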