1. Introduction
The Chinese BeiDou navigation satellite system, abbreviated as BDS [
1], has finished its regional system BDS-2 consisting of 14 satellites in 2012, which provides positioning, navigation, timing, and short message communication services for Asia-Pacific users. During the past several years, it has developed dramatically for the purpose of constructing a global navigation system as BDS-3. By the end of 2018, a total of 19 BDS-3 satellites had been launched to achieve a primary system for global services. With the help of altogether 33 operational satellites, i.e., five geostationary earth orbits (GEOs), seven inclined geosynchronous orbits (IGSOs), and 21 medium Earth orbits (MEOs) in orbit, the standard positioning accuracy achieves five meters in both horizontal and vertical directions for Asia-Pacific users and 10 meters for global users [
2].
With the development of business requirements and the processes of technological innovation, higher and higher accuracies of positioning and navigation are needed. Generally, the accuracies of satellite orbit and clock products directly decide BDS positioning performance [
3,
4]. With the great efforts of international GNSS service (IGS) and international GNSS monitoring and assessment service (iGMAS), several types of precise orbit and clock products for BDS satellites are provided to satisfy various positioning demands with different accuracies [
5,
6,
7,
8]. Among these products, the ultra-rapid product is generated for possible real-time applications by calculating orbits and clocks over the next 24 h. Although the orbit product with an about five centimeter uncertainty can be neglected, the precision of predicted clocks cannot satisfy the requirement of real-time precise point positioning (RT-PPP) [
6,
9]. Therefore, improving the accuracies of predicted clocks is of considerable interest in order to make RT-PPP workable. In this paper, we will focus on an improved clock prediction method for all BDS-2 satellites and most BDS-3 MEOs, since the MEO satellite is the backbone type of BDS-3, which has better orbit and clock products than GEOs and IGSOs, due to its favorable observation geometry. Another reason is that the qualities of the BDS-3 GEOs and IGSOs products are very poor.
Due to the solution and broadcast delay, clock prediction is indispensable for positioning services with the prediction time varying from several seconds to a few days based on different application scenarios. For the RT-PPP calculation, critical real-time and accuracy of the predicted clock must be met at the same time. Since the intrinsic physical characteristics of satellite clocks are very complex and instable, as well as varying space environment [
10,
11], it is more challenging to achieve the actual prediction process, and a simple linear function describing the clock offset series cannot meet the abovementioned demands. Despite such difficulties, numerous researches are carried out and many models are given for clock prediction, e.g., quadratic polynomial model (QPM), grey model (GM), spectrum analysis model (SAM), Kalman filter (KF), and neural network (NN) [
12,
13,
14].
Usually, the first- and second-order polynomial models, as well as the GM, are widely used for the global positioning system(GPS) satellite clocks prediction. However, the limitations are obvious. It is certificated that the clock errors of QPM are accumulated rapidly with the prediction time increasing [
11]. As for the traditional GM, its exponential coefficients are closely related to the number of elements, and this model requires plenty of sampling data. Although many efforts have been made, the accuracy of clock prediction is still not sufficient for real-time and high-precision applications [
15]. Since the unmodeled dynamics in the residuals will be absorbed by clock parameters, including clock bias, drift, and drift rates, which leads to misestimation, based on the fast Fourier transform (FFT) method, Senior (2008) achieved four main periodic items for the GPS satellite clocks and analyzed the relationship between its amplitude and orbit period [
16]. They show the incorporation of fixed-period harmonics, especially for the GPS satellite clocks, which provides a very accurate predicting model. Even though the SAM is an enhancement of the QPM with periodic terms included, it is pointed out that the periodic functions have to be determined reasonably based on a very long clock time series [
17,
18]. Meanwhile, researches indicate that clock parameters can be estimated by employing the KF method with a small amount of recent data. Epstein (2003) achieves an accuracy of 8–9 ns by predicting GPS clocks over six hours with the KF approach [
19]. Davis (2012) developed an improved model based on the KF method along with a stochastic model, and a sub-nanosecond clock is obtained with a centimeter-level positioning accuracy [
13]. However, the estimation is not reliable if a frequency jump happens. Moreover, artificial NNs are employed by scientists to predict clock offsets [
11,
20,
21,
22], e.g., back-propagation neural network (BPNN) and wavelet neural network (WNN), which have the ability to estimate non-linear time series by sample training. However, the topology structure of the WNN is hard to set up, while the BPNN involves an iterative procedure to determine appropriate weights, which requires considerable time consumption.
While most of the abovementioned achievements focus on GPS clocks, BDS satellites attract more and more attention since their dramatically development. It is reported that ultra-rapid products with an accuracy of about 2–6 ns are one type of the most important products in providing decimeter-level real-time positioning services for BDS. However, evident nonlinear system errors, for example periodic signals, are detected in ultra-rapid clock products [
11,
14]. To satisfy the requirements of high-precision application scenarios, such as landslide early warning and hydrographic surveying, an improved model is required to further enhance its prediction accuracy. Thanks to the dramatic development of artificial intelligence, some new ideas for satellite clock prediction were proposed [
14]. Among these algorithms, least-squares support-vector machines (LS-SVM) is a machine learning method based on strict mathematical principles, which was developed by Suykens et al. [
23]. Since it has high generalization ability, which is also benefit for low- and high-dimensional datasets, as well as fast learning speeds and low computational cost, LS-SVM is widely applied in the areas of pattern recognition, classification, and nonlinear function fitting. In this paper, an improved model based on the SAM and LS-SVM is developed to deal with BDS satellite clock prediction.
In the following sections, the related typical models for satellite clock prediction are summarized in
Section 2, including QPM, SAM, and LS-SVM. Then, an improved model for BDS satellite clock prediction is proposed, and satellite-specific optimal parameters including the sample input length, regularization parameter, and kernel parameter are determined by experiments in
Section 3. After preprocessing of clock offsets, clock accuracy, and static and kinematic precise point positioning (PPP) are validated and analyzed in
Section 4. Finally,
Section 5 concludes with a discussion.
2. The Improved Model
The current BDS constellation consists of GEO, IGSO, and MEO satellites, which are equipped with a rubidium atomic frequency standard (RAFS) for BDS-2, and a new-generation high-quality rubidium clock and passive hydrogen masers (PHMs) for BDS-3 (
Table 1). Since the GEOs and IGSOs of BDS-3 are the test satellites without public ultra-rapid clock products, our discussion in this paper is limited to MEOs of BDS-2 (C01 to C16) and MEOs of BDS-3 (C19 to C34). Additionally, satellites C35–C37 are quite new with few observations, which are also excluded because of the poor data quality.
For a better understanding of the improved model for clock prediction, traditional methods including the QPM and SAM approaches are introduced firstly. Then, the improved model is presented as a combination of the SAM and LS-SVM methods in detail.
2.1. QPM
According to the performance of satellite clocks, the systematic parts of clock offsets can be described by polynomial models with different orders. For BDS satellite clock offsets, normally a second-order polynomial model is adopted as a basic model, which is proved effectively in many contributions [
11,
14]. The basic QPM method is presented as:
where
a0 is the clock bias correction coefficient,
a1 is the clock drift correction coefficient, and
a2 is clock drift rate correction coefficient;
t0 is the reference epoch of clock offset series;
ti and
clk(
i) denote the
i-th epoch and its corresponding clock offset; and
εi is the residuals of the prediction model.
2.2. SAM
For GPS clock offsets, the QPM method can precisely fit most features of the satellite clock in most cases [
11]. However, since the stability of BDS satellite clock is not as high as that of GPS, periodic terms have to be considered during clock offsets fitting and predicting. Taking the QPM method as a basic function, two significant periodic terms are employed to describe BDS clock offsets, called the SAM method. Hence, the function with periodic terms is expressed as [
11]:
where
a0 is the clock bias correction coefficient,
a1 is the clock drift correction coefficient, and
a2 is clock drift rate correction coefficient;
t0 is the reference epoch of clock offset series;
ti and
clk(
ti) denote the
i-th epoch and its corresponding clock offset;
l is the number of preiodic terms;
and
fr are the amplitude and frequency of periodic terms, respectively;
is its corresponding initial phase; and
εi is the residuals of the prediction model.
2.3. LS-SVM Model
Generally, during the LS-SVM processing, nonlinear mapping is applied to map datasets from the original space into a high-dimensional feature space, so that the nonlinear fitting problem in the original space becomes a linear fitting problem in the high-dimensional feature space. Thereby, the basic theory of LS-SVM can be described as follows.
Suppose the sample datasets are
,
. Then, the linear regression function in a high-dimensional eigenspace is shown as follow:
where
is the nonlinear function mapping datasets from the original space into a high-dimensional feature space;
w and
b are the regression parameters to be estimated from the sample datasets.
According to the minimal structural risk principle, the LS-SVM estimation can be transformed into an optimization problem, expressed as follows:
where
, and
is the regularization parameter. Then, the solution of the LS-SVM regressor can be obtained with the aid of constructing a Lagrangian function:
where
are the Lagrange multipliers. The conditions for optimality are shown as follows:
Elimination of
w and
e will yield a linear system instead of a quadratic programming problem [
24], which is described as:
with
,
, and
;
is an
identity matrix, and
is the kernel matrix defined by
, which satisfies the Mercer’s conditions [
25].
In order to realize the LS-SVM method, the kernel function has to be determined firstly. The commonly used kernel functions include linear functions, polynomial functions, and radial basis functions (RBFs). Generally, the kernel function is selected by experience and the RBF is reported as the first choice in kinds of issues. The reason is that the RBF is suitable for both small and large quantities of datasets, especially for the data with no prior information; meanwhile, it has wider applications and possesses higher precision. Comparatively, the linear function has a linearly separable condition, and the polynomial function may lead to more parameters to be estimated when the order of polynomial is high.
The RBF kernel function can be described as follows:
where σ is the kernel parameter, which determines the generalization behavior of the RBF kernel approach.
Once the RBF kernel function is fixed, the potential factors that influence prediction accuracy include input sample length m, regularization parameter γ, and kernel parameter σ. Firstly, data preprocessing is employed, and a training set and a testing set are constructed based on the input sample length m. Secondly, regularization parameter γ and kernel parameter σ are determined based on the grid search algorithm [
26], that is, after setting the search range by experience, an iterative computation in which k-fold cross verification is used as an evaluation criterion is done to find the tuple <γ,σ>, corresponding to the minimum squared error in the regression sample dataset. Usually, more than one tuple of <γ,σ> are obtained with a similar high accuracy level. Since a large γ value may result in overfitting and poor predictions, an optimal parameter pattern with a small γ value is the best choice [
27].
2.4. The Improved Model
Since the stochastic characteristic of BDS satellite clock is sophisticated and non-stationary, which cannot be described well using the abovementioned existing models, an improved model combing the SAM and LS-SVM methods is proposed for clock prediction. It is known that LS-SVM has high non-linear mapping ability and strong robust property, which is better for approaching the remaining clock residuals after the removal of trend and periodic terms. Because the LS-SVM expression is very complex, the part of LS-SVM is depicted as a function of f
LS-SVM. Hence, the improved model is proposed as:
where
clk(ti),
a0,
a1,
a2,
ti,
t0,
l,
,
fr,
and
εi parameters are the same as those in Equation (2).
Further, the process diagram of the improved model is given in
Figure 1. Firstly, preprocessing for raw clock offset series is executed to get rid of outliers, and a preliminary processed clock offset series is obtained. Then, the QPM method is employed to systematically eliminate trends of clock offsets. After that, the SAM method is used to weaken periodic characteristics, and the residuals are considered as an input dataset for LS-SVM. After choosing the RBF kernel function by experience, an input sample length is determined by experiments, and the parameter pattern <γ,σ> are optimized satellite by satellite to consider satellite-specific characteristics for BDS satellite clocks. Finally, the well-trained improved model is used for clock prediction.
Additionally, in an actual system, the machine learning algorithm uses historical data to learn the complex, nonlinear relationships between the past and the future. In the training process, a subset of the historical data (70%) are used for training and the rest of the historical data are used for testing. Then, the well-trained model is running online, and provides real-time clock offset predictions with the help of sliding window strategy. In the online operation process, the filter does not need to train itself with every dataset. When the filter fails in some cases, the system is degraded (for example, a polynomial model would succeed), but still works. Meanwhile, a new model is trained offline.
3. Realization of the Improved Model
3.1. Preprocessing for Satellite Clock Offsets
It was reported that the ultra-rapid clock products of BDS-2/3 are made available by several GNSS analysis centers, for example the German Research Centre for Geosciences (GFZ) and Wuhan University(WHU), and can be used as an input in data processing. However, the raw clock time series may conceal abnormal data, such as gross errors and blunders, which should be detected before executing clock prediction. The epoch-differenced clocks are calculated in which outliers can be shown more apparently and then the outliers are detected through the median absolute deviation (MAD), which is a robust but simple tool [
11,
14]. Here, the epoch-differenced clock series are obtained as follows:
where
fi is the epoch-differenced clock series with index
i,
ci is the raw clock series with index
i, and
τ is the sample interval of the raw clock series. Meanwhile, the MAD can be expressed as follows:
where
m is the median of the epoch-differenced clock series, namely
. Then, the abnormal clock offset will be deleted when
is valid, where
n is an integer empirically set as 3 in this processing. It follows the important principle not to delete outliers more than 5% of total data, which avoids losing a great deal of useful information [
22]. Currently, this preprocessing approach is used for each BDS satellite employing the finally products of GFZ with a 30 s sample interval. The statistic results show that more than 95% of outliers (approximately 0.27%–0.63% of total data) are detected and eliminated for each satellite.
3.2. Determination of Periodic Terms
Significant periodic characteristics are found in BDS satellite clock residuals after quadratic polynomial fitting and periodic terms changes from satellite to satellite due to different system configurations [
11]. In order to identify the potential periods for each BDS satellite clock, the FFT approach is employed to analyze the clock residuals after quadratic polynomial fitting using 60-day data from day of year (DOY) 020 to 079 in 2019. The relationships between amplitude and frequency are identified for each type of satellites. Considering satellite clock stability is inversely proportional to its operation time, the newest launched satellites C02, C13, C14, C33, and C34, belonging to GEOs, IGSOs, and MEOs of BDS-2, as well as BDS-3 MEOs equipped with Rubidium (Rb) and PHM, respectively, are chosen as examples, and results are given in
Figure 2. Additionally, a special periodic term of about 1.3 h has been identified for satellite C11, and the result is also shown in
Figure 2. Further, the two outstanding periodic terms of each BDS satellite clock are given in
Table 2.
For most of GEOs and IGSOs, the two main periodic terms, 24 h and 12 h, are identified, and the periodic terms are caused by the orbital operation and general relativistic effect. Results also show that 12 h is the most significant periodic term for most of MEOs equipped with Rb with different amplitudes. As for BDS-2, a 12 h harmonic with a smaller amplitude than for BDS-3 is noticeable. Comparatively, the clock periods of BDS-3 MEOs equipped with PHM are not apparent. The thermal insulation materials for satellites are not good enough at an early time, which leads to a changing environment for satellite clocks. As the angle of the sun above the orbital plane changes over time depending on the orbital position, the temperature of satellite clock is also varying, related to the orbital position. This is the main exterior reason why the clock offsets exhibit periodic oscillation features. Hence, the most pronounced periods of BDS satellite clocks are related to their corresponding orbit periods. Interestingly, a special periodic term of about 1.3 h has been identified with a noticeable amplitude for satellite C11, and it is different from the normal period of the MEO orbital geometric configuration. This obvious period may be caused by the hardware of the atomic clock itself, which is the main interior reason for this special periodic term and has a large impact on the prediction accuracy. Therefore, the periodic terms of the clock offsets must be considered carefully.
3.3. Input Length Determination
During the clock prediction processing, the sample dataset is usually divided into two parts, with one being a training sample dataset and the other being a testing sample dataset. Meanwhile, the performance of LS-SVM can be measured in terms of its plasticity and generalization ability, that is, the plasticity is a good fitting of the trained model with the training sample dataset, and the generalization ability is checked by the testing sample dataset, which means a good prediction. An LS-SVM with better performance should have better plasticity and generalization ability at the same time.
In order to build a suitable LS-SVM model, a sample dataset, the remaining clock residual after the removal of trend and periodic terms, is divided into a training sample dataset and a testing sample dataset in an appropriate way. Normally, 70% of sample data are employed for training and the rest are used for testing [
28]. In the training processing, the training sample dataset is further divided into a number of successive multiple inputs, single output (MISO) groups, as shown in
Table 3. Herein, the input data length affects LS-SVM performance significantly and should be determined by experiments firstly.
Then, we can judge which input length is the best for each BDS satellite by checking the agreement of true (GFZ precise clock products) and prediction with different lengths of input data. In the LS-SVM construction phase, the length of input data successively increases from 5 to 200 with 5 as an interval to investigate how the length of input data affects LS-SVM performance.
Figure 3 gives predicted root mean square (RMS) errors of each satellite type, and they are on the sub-nanosecond level when the prediction time is less than 3 h, regardless of the length of input data rising from 5 to 200. Regarding different satellite types, the RMS error results of GEOs and IGSOs are larger than those of MEOs. The prediction RMS values for all MEOs are within 0.4 ns for most cases, with the best one at a length of 120 as about 0.1 ns. Although MEOs of both BDS-2 and BDS-3 have a similar performance, the MEOs belonging to BDS-3 achieve a slightly better performance than those of BDS-2, especially for BDS-3 MEOs equipped with PHMs. Considering the prediction time of this experiment is three hours, the following inference is usually reasonable and the prediction accuracy will be more precise if the prediction time is not less than 3 h. Based on many experiments, it can be identified that, when the prediction time is less than 3 h, the recommended length of input data is 120 (1 h) for most satellites as a best compromise between prediction accuracy and computational complexity.
3.4. Optimization of LS-SVM Parameters
The regularization parameter and kernel parameter are the two key parameters in the LS-SVM method. The kernel function used for BDS clock offsets is the RBF function with σ as the kernel parameter and is crucial for a better prediction. When the value of σ is small enough, i.e., σ→0, each sample data is classified into a single group, leading to over-fitting. When the value of σ is large enough, i.e., σ→∞, the distance between any points is zero, which means that all the points are divided into one group, resulting in under-fitting. Therefore, it is worth investigating the kernel parameter selection for a better balance. Meanwhile, when the regularization parameter γ is large, the training sample dataset will be fully fitted, introducing more complexity and inefficiency. When the value of γ is small, the fitting function is as simple as possible, which may not fully describe characteristic properties of the training sample dataset.
It is important to note that the regularization parameter γ and the kernel parameter σ are coupled. Therefore, the grid search algorithm with a cross-validation method is applied to find an optimal tuple <γ,σ> [
29]. Firstly, the coupled simulated annealing (CSA) algorithm determines suitable starting points within the search range [e
–10,e
10], based on experience. Secondly, these starting points are then given to the LS-SVM model to get the highest accuracy of prediction without over-fitting or under-fitting. The search range is gradually narrowed from a coarse mesh to a fine mesh and the threshold value is gradually decreased. Finally, the optimal tuple <γ,σ> are chosen under the principle that the regularization parameter γ is the smallest in order to decrease the computing complexity.
With the aid of optimal input length as 120 (1 h) determined in
Section 3.3, the optimal tuple <γ,σ> and its corresponding prediction RMS errors for each BDS satellite are presented in
Table 4.
5. Conclusions
The predicted satellite clock offsets are crucial to support real-time global satellite precise positioning applications. Although there are already several real-time precise positioning service providers, unfortunately, not all users can use the correction information due to either cost of the service and limitation of their equipment or out of the service coverage. An alternative way is to enhance the accuracy of the predicted satellite clocks for real-time precise positioning. Based on the study of existing prediction models, an improved model combing the SAM method and the LS-SVM model is proposed; by further considering satellite-specific characteristics, the parameters of the LS-SVM method are optimized satellite by satellite, including input length, regularization, and kernel parameters.
The results indicate that GEOs and IGSOs have typical periodic terms corresponding to the orbital periods of 12 h and 24 h. It also show that there exist some special periods, e.g., 1.3 h for C11, beyond our understanding that periodic terms are related to the orbit period, which may be caused by the hardware of the atomic clock itself. Experiments show that 120 (1 h data) is the best input length achieving the highest prediction accuracy for most satellites. Additionally, the optimal tuple <γ,σ> is searched by the cross-validation method based on a prediction precision comparison. Further, the improved model with satellite-specific parameters is validated, and results show that the clock accuracy of the improved model is generally better than those of the QPM and SAM methods. The overall accuracy of predicted clocks is better than ±1.0 ns for most MEOs within 3 h, which is significantly enhanced with respect to the ultra-rapid products, and the accuracy is good enough for several real-time precise applications.
The predicted clock offsets are further evaluated by applying them to PPP in both static and kinematic solutions for 10 IGS MGEX stations, including five stations in the Asia-Pacific region. The positioning accuracies obtained with the QPM and SAM methods, as well as the improved model, are compared and their convergence times are also analyzed. In the static PPP, the improved model is demonstrated to be effective, and helps to achieve more than 15.0% improvements on average for each direction, which enables sub-decimeter positioning, especially in the Asia-Pacific region. In the kinematic PPP, the improved model performs much better than the others in terms of both convergence time and positioning accuracy. The convergence time can be shortened from 1.0 h to below 0.5 h, while the positioning accuracy is enhanced by 16.3%, 10.8%, and 18.9% on average in east, north, and up directions, respectively.