Combining evolutionary computation with machine learning technique for improved short-term prediction of UT1-UTC and length-of-day

Over the years, prediction techniques for the highly variable angular velocity of the Earth represented by Earth’s rotation (UT1-UTC) and length-of-day (LOD) have been continuously improved. This is because many applications like navigation, astronomy, space exploration, climate studies, timekeeping, disaster monitoring, and geodynamic studies, all rely on predictions of these Earth rotation parameters. They provide early warning of changes in the Earth’s rotation, allowing various industries and scientific fields to operate more precisely and efficiently. Thus, in our study, we focused on short-term prediction for UT1-UTC (dUT1) and LOD. Our prediction approach is to combine machine learning (ML) technique with efficient evolutionary computation (EC) algorithms to achieve reliable and improved predictions. Gaussian process regression (GPR) is used as the ML technique with genetic algorithm (GA) as the EC algo-rithm. GA is used for hyperparameter optimization of GPR model as selecting appropriate values for hyperparameter are essential to ensure that the prediction model can accurately capture the underlying patterns in the data. We conducted some experiments with our prediction approach to thoroughly test its capabilities. Moreover, two forecasting strategies were used to assess the performance in both hindcast and operational settings. In most of the experiments, the data used are the multi-technique combinations (C04) generated by International Earth Rotation and Reference Systems Service (IERS). In one of the experiments, we also investigated the performance of our prediction model on dUT1 and LOD from four different products obtained from IERS EOP 20 C04, DTRF20, JTRF20 and USNO. The prediction products are evaluated with real estimates of the EOP product with which the model is trained. The combined excitations of the atmosphere, oceans, hydrology, and sea level (AAM + OAM + HAM + SLAM) are used as predictors because they are highly correlated to the input data. The results depict the highest performance of 0.412 ms in dUT1 and 0.092 ms/day in LOD, on day 10 of predictions. It is worth noting that the later predictions were obtained by incorporating the uncertainty of the input data as weights in the prediction model, which was a novel approach tested in this study.


Graphical Abstract 1 Introduction
The Earth's rotation is extremely complex as it incorporates the Earth's response to a wide range of internal and external perturbations (Dickey 2016).The perturbing sources include torques caused by celestial bodies such as the Moon, Sun and planets; angular momentum transfer between the solid and fluid envelopes of the Earth; and mass redistributions not only in the fluid regions of the Earth but also, in the solid Earth due to earthquakes, post-glacial rebound, mantle convection, and motions of tectonic plates.The Earth's rotation can be estimated from the observations of space geodetic techniques, Very Long Baseline Interferometry (VLBI), Satellite/Lunar Laser Ranging (SLR/LLR), Doppler Orbitography and Radio positioning Integrated by Satellite (DORIS), and Global Navigation Satellite Systems (GNSS).The rotation of the solid Earth is not constant, as measured by groundbased observatories of the space geodetic techniques described in the latter.These variations are geophysically interesting, and their analysis and understanding contributes to meteorology, oceanography, astronomy, seismology, tectonics, and geodynamics.
Mean Solar Time, UT1, is based on the observation of Sun from the rotating Earth around its axis and along its orbit.By definition, it is a linear function of the Earth rotation angle (ERA) (Capitaine et al. 2000).UT1-UTC or dUT1 is an Earth Orientation Parameter (EOP) which the difference between UT1 and Coordinated Universal Time (UTC).dUT1 is the most variable EOP with significant unpredictability (Nothnagel and Schnell 2008).The length-of-day (LOD) denotes the excess day length from 86,400 s (Moritz and Mueller 1987).In other words, the variation of the Earth's angular velocity due to geophysical and gravitational influences is expressed as a variation of the effective time for one full rotation by LOD (Seitz and Schuh 2010).The relationship between LOD and dUT1 is represented by the following equation (Landskron and Böhm 2019; Mikschi et al. 2019): Monitoring the Earth's rotation angle is critical in many domains involving reference systems, navigation, satellite orbit determination, positional astronomy, and geophysical studies on time scales ranging from a few hours to decades (Gambis and Luzum 2011).The International Earth Rotation and Reference Systems Service (IERS) is responsible for monitoring EOP and it provides time series products of EOP estimates.To improve the quality of estimates, EOP are estimated using one or more (1) dUT 1(T ) = T t 0 −LOD(t)dt + dUT 1(t 0 ).independent space geodetic techniques, the results of which vary in terms of availability, accuracy, and stability.Thus, due to the computational latency in combination, and data availability from different space geodetic techniques, the IERS cannot deliver EOP data in real time of observations.As a result, predictions are required to compensate for the latency of EOP.Further, many fields like, astronomy, satellite navigation, geodetic measurements, earthquake monitoring, spacecraft operations, climate monitoring, weather forecasting, time services, etc., rely on EOP predictions.It is crucial to account for the variability in Earth's rotation, ensuring that observations, measurements, and calculations in those fields are as accurate as possible.
The irregular variability or the occurrence of extremes in estimates of dUT1 and LOD makes the forecasts a difficult task.In case of dUT1, the current best estimate of accuracy is 5-10 µs and the accuracy of prediction over 10 days is up to 1 ms, which corresponds to ~ 50 cm at equator on Earth's surface.Further, the accuracy of prediction decreases with the increasing prediction length.As a result, prediction algorithms should be constantly improved because the accuracy of predictions, even for a few days in the future, is much lower than the accuracy of observations.
In the present study, we used the pattern recognition capability of machine learning (ML) to predict dUT1 and LOD.The time series of these quantities are representation of a physics-based parameter that are designed to study the Earth processes.In general, conservation laws are used to construct these physics-based models.In contrast to that, the empirical expressions used in ML model construction are data-driven and based on phenomenon observation (Goldstein and Coco 2015).They are used to optimize the fit of empirical predictors in non-linear multivariate datasets.Evolutionary computation (EC) approaches (Tang and Wu 2011), on the other hand, are inspired by nature and can provide a reliable and effective approach to addressing complex problems in realworld applications.Thus, in our approach, we combine the strengths of inductive and deductive methods used by ML and EC, respectively, in the development of prediction models to improve their accuracy.This combined approach is known as Evolutionary Machine Learning (EML) based predictions.
Several studies have been conducted to improve the predictions of LOD and dUT1.The stochastic prediction techniques are portrayed by auto-covariance (Kosek et al. 1998), autoregressive (Kosek 2012), neural network (Schuh et al. 2002;Kalarus and Kosek 2004;Zhang et al. 2011), kriging predictions (Michalczak and Ligas 2022), and extreme learning machine (Lei et al. 2015) techniques.The common techniques for predicting the deterministic and stochastic parts of time series are combination of least squares (LS) extrapolation with stochastic prediction (Kosek et al. 2005;Lei et al. 2023;Niedzielski and Kosek 2008), singular spectrum analysis (SSA) with copula analysis (Modiri et al. 2020), grey model with neural networks (Lei et al. 2017b), and a hybrid model combining SSA, LS, and support vector machine (SVM) (Yang et al. 2022).There are some methodologies in which predictions are made from effective angular excitation functions (Zheng 1993;Dill and Dobslaw 2019;Dobslaw and Dill 2018).There have been some recent studies that used deep learning techniques for LOD prediction (Gou et al. 2023;Zheng et al. 2017;Guessoum et al. 2022).Finally, the Earth orientation parameters prediction comparison campaigns (EOP PCC) were organized by IERS for the purpose of assessing the accuracy of EOP predictions by different prediction techniques.There have been two such campaigns organized till date, 2005-2008(Kalarus et al. 2010) and 2021-2022(Śliwińska et al. 2023)).
The outcome from the above studies depend on the used prediction algorithm, on the observations used as input for prediction and the time period of predictions.The majority of published studies are for predicting LOD in comparison to dUT1.In this study, the prediction model for dUT1 and LOD was developed using the Gaussian process regression (GPR) technique, with kernels and hyperparameters optimized using the EC algorithm known as the genetic algorithm (GA).This EML-based prediction model is referred to as GPR GA model in this paper.The choice of the kernel and its hyperparameters is called model selection.In the following study, we will use GA for model selection, where the optimal hyperparameters were determined by minimizing the cross-validation error of the model on training-validation data set.Then, experiments were performed to study the effectiveness of our developed prediction approach.
Based on their operational needs and objectives, different applications have different tolerances for predicational accuracy of dUT1 and LOD.The most stringent accuracy requirement is for the satellite navigation and deep space missions, as they require extremely accurate timekeeping.Any discrepancy in timekeeping can lead to significant navigation errors.Based on the document published by the European Space Agency (ESA) in 2016, the accuracy requirement w.r.t the IERS C04 solution for the dUT1 prediction products are 0.06 ms on day 1, 0.215 ms on day 5, and 0.53 ms on day 10.The need for improved predictions and improving the prediction algorithm, the latency in the availability of the final products, as well the increasing the prediction products to prevent the reliability on just one product, all these are the reasons for diving into prediction algorithms.
In the following section, we will shortly describe the algorithms that were used in this study.

Preliminaries
In this section, we will present the description of the building blocks leading to the prediction algorithm that is applied in this study.We use Gaussian process regression (GPR) (Rasmussen and Williams 2006) as the ML method for our prediction.Based on the preliminary prediction results with other algorithms tested with our data, GPR was selected for further development.Furthermore, one of the most important reasons for choosing GPR is that it provides uncertainty estimates for predictions and model parameters.For our application we consider it as very important to get not only predictions, but also a measure of confidence for the predictions.Also, this ML technique works well on relatively small datasets as compared to what is required for others.
The performance and accuracy of GPR modeling strongly depends on the hyperparameters of the covariance function determining the proximity between the training data and the prediction data (Schulz et al. 2018).As a result, in order to obtain the best performing GPR model, the hyperparameter search space is explored using a genetic algorithm (GA) to find appropriate parameter settings.GA solves optimization problems based on a natural selection process that mimics biological evolution.GA is used to improve the performance of GPR prediction model and the quality of their results.

Gaussian process regression (GPR)
GPR is a non-parametric Bayesian approach towards regression problem (Gershman and Blei 2012; Schulz et al. 2018).Let us assume the output y of a function f, at input x, can be written as: with ε : N(0, σ 2 ε ) .The error variance, σ 2 ε , is estimated from the data.We will assume that f (x) is a random vari- able with a specific distribution.The noise term ε reflects the errors of the observations, which is always present no matter how many observations we make.In GPR, we assume the function f (x) behaves according to a Gauss- ian process as follows: A Gaussian process (GP) is defined by a mean, m(x) , and a covariance function, k(x, x ′ ) .x and x ′ are input vari- ables with the latter depicting the input point on prediction epoch.In order to avoid intensive posterior computations and only infer via the covariance function, the a priori (2) mean is often set to zero.This is achieved by subtracting the a priori mean from all observations.The covariance function k(x, x ′ ) is represented by: E reflects the expected function value.The function k is commonly known as the kernel of Gaussian process, which is defined solely by the input variables.The appropriate kernel is selected based on the assumption such as smoothness and patterns of data.
In regression, we would like to make predictions of the function value based on Eq. 2. If we assume a linear regression model, y = θ 1 + θ 2 x , we need to find parameters θ 1 and θ 2 to define the function.Thus, we use the training dataset, D = [(x i , y i )|i = 1, .., n] , with n observed points for mapping x to y through basis function f (x) .After the training process, we assume all information of the data are captured by the parameters θ , thus predictions are independent of the training dataset D. The parameters,θ , discussed in the latter are known as hyperparameters.For most of the cases like ours, the assumption of linear regression model will not be true, and also there will be a set of infinite hyperparameters to choose from, to precisely define the relation between x and y.
We are trying to prepare a prediction model for the training data D from the unknown function f .There may be more than one function that fit the observed data points equally well.A Gaussian processes model describes a probability distribution over functions that could fit a set of points (Wang 2020).We can make predictions using the regression function modeled by a multivariate Gaussian as: where , and K i,j = k(x i , x j ) .X are the observed data points, m represents the mean function and k represents the kernel function.As mentioned ear- lier, the default is to set m(X) = 0 .Now, we make pre- dictions at new inputs X * by drawing f * (X * ) from the posteriori distribution P(f |X) .By definition, previous observations y and function values f * follow a joint mul- tivariate normal distribution.The detailed derivation can be obtained from any of the literature cited in this section for GPR.The joint distribution of the observed values and the function values at new testing points can be written as: (5) where K (X, X) is the covariance matrix between all observed points so far, K (X * , X * ) is the covariance matrix between the newly introduced points, K (X * , X) is the covariance matrix between the new input points and the already observed points, and K (X, X * ) is the covariance matrix between the observed points and the new input points.I is an identity matrix.The conditional distribution P(f * |X, y, X * ) is then a multivariate normal distribu- tion with mean as: And covariance matrix as: To predict f * , we can simply use the mean function in (7) and the covariance kernel in (8).
Along with the choice of kernel function, the process of model selection estimates the hyperparameters, θ , of the kernel function from the training data (Rasmussen and Nickisch 2010;Blum and Riedmiller 2013).The hyperparameters include length-scale, ℓ , noise variance, σ 2 ε , signal variance, σ 2 f , and basis function coefficients.ℓ determines the distance over which the input points are expected to change significantly and σ 2 f captures the variability of the signal predicted by Gaussian process.The model selection allows for flexible customization of the GPR model to the problem at hand.In this study, we introduced EC approach through GA to deal with the optimization of hyperparameters and kernel functions.

Genetic algorithm (GA)
From the last sub-Sect.2.1, it is clear that the reliability of the prediction is dependent on a judicious choice of kernel function and its hyperparameters.Thus, it is significant to incorporate a robust optimization technique.The usual choice to optimize the hyperparameters are by grid-search, random-search, and Bayesian optimization techniques.Generally, GA (Goldberg 1989) is used to solve problems that are not well suited for standard optimization algorithms (Gen and Cheng 2000;Krok 2015).These include discontinuous, non-differential, stochastic or highly non-linear problems, most of which is true for our case presented in this study.As we already know that the search space of GPR consists of kernel function and its hyperparameters.Kernel functions have discrete values, whereas hyperparameter values vary on a continuous scale.Section 3.2 discusses the kernel functions and hyperparameter values used in this study.GA solves continuous or non-continuous optimization problems with any types of constraints.Thus, GA-optimized GPR model are bound to perform better.
GA is a stochastic, population-based algorithm that is based on natural selection process that mimics biological evolution (Whitley 1994;Vose 1999;Mathew 2012).The steps involved in GA are depicted graphically in the "Model Selection" section of Fig. 1.GA encodes a problem in terms of individual(s) to be evolved with the aim of improving the quality of solution.The operation of GA begins with the generation of a population of random strings or chromosomes.This first step is known as initialization.The population is then operated by three main operators; reproduction, crossover and mutation to create a new population of points, which constitute the second step.Reproduction (or selection) is an operator that makes more copies of better strings in a new population.Each string or chromosome is a set of kernels and its hyperparameters that are tested against the fitness function.Fitness function is an equation that is determined from the set of parameters in the string, which we want to maximize or minimize with the optimization process.It is constructed as penalty function with respect to feasibility of solution.Crossover operator recombines two strings to get a better string.Mutation is, in some ways, the process of randomly disrupting genetic information.They operate at the parameter level; when the parameter value is being copied from the current string to the new string, there is a small probability that the parameter may become mutated and added with a slightly different value.This would bring either good solution or bad solution.As a result of this step, the total population will become more than double the size of the initial population.Then, comes the step 3 known as ranking, which ranks the strings based on their value of the fitness function.This helps to select the best strings for further process.In step 4, few random changes are made on parameters of highest-ranking strings to generate even better solution.At the end step of "selection", highest ranking strings are retained for repeating the process.Over successive generations, the population evolves towards an optimal solution.Strings which present a better solution to the target problem are given more chances to reproduce than those with poorer solution.The optimization is robust and the solution does not tend towards local minima as in other similar algorithms.Thus, GA can be viewed as trying to minimize the fitness function, by evaluating several solution vectors.

Methodology
In this section, we will discuss the pipeline and implementation of the prediction algorithm.A schematic representation of the prediction algorithm followed in our study is shown in Fig. 1.

Target data and predictors
This sub-section will outline the data used for the prediction.The target data are the parameter that we are trying to predict, and the predictors are the features or independent variables that are used to predict or explain the variability in target variable.The LOD and dUT1 time series are the target data and are unquestionably the most important aspect of the study.Four different time series products are used: IERS EOP 20 C04 (https:// www.iers.org/ IERS/ EN/ DataP roduc ts/ Earth Orien tatio nData/ eop.html), DTRF2020 (https:// dtrf.dgfi.tum.de/ en/ dtrf2 020/), JTRF2020 (data obtained from R. Gross by personal communication, 2023), and USNO Finals (https:// maia.usno.navy.mil/ produ cts/ daily).The prediction errors resulting from any prediction method, represent the accuracy of that prediction method and the effects of the observational error (Kosek et al. 2005).Different products of LOD and dUT1 time series capture the effects of observational errors differently.Thus, we investigate the prediction obtained for different products of LOD and dUT1.Furthermore, cubic spline interpolation was used to solve the inconsistency of time series data in some of the data products.In this study, we did not consider the data prior to 1990 as they are highly inconsistent with large formal errors relatively.Though GPR can handle a variety of data patterns, the highly discontinuous dataset as latter can pose a challenge to optimal learning by yielding suboptimal estimates of hyperparameters, affecting the overall performance of the prediction model.
The time series of angular momentum excitation functions (http:// rz-vm115.gfz-potsd am.de: 8080/ repos itory) are provided by the Earth System Modelling group of GFZ Potsdam (ESMGFZ).The time series of Atmospheric Angular Momentum (AAM), Oceanic Angular Momentum (OAM), Hydrological Angular Momentum (HAM), and Sea-level Angular Momentum (SLAM) functions are combined to form Effective Angular Momentum (EAM) functions.EAM functions ( χ mass This domain knowledge helps us to choose the axial component of EAM as predictors for our prediction model. To compute the prediction of dUT1, the leap seconds were first removed from the observed values of UT1-UTC, to provide UT1-TAI.The leap seconds were obtained from the IERS Bulletin D. Additionally, known effects of solid Earth zonal tides with periods from 5 days up to 18.6 years were removed from dUT1 and LOD (Sect.8.1 of (Petit and Luzum 2010)).The reduced dUT1 and LOD will be mentioned as dUT1 R and LOD R , respectively, in our study.The dUT1 R and LOD R will be predicted, and subsequently the leap seconds and tidal effects will be added back to provide the predictions of dUT1 and LOD.
However, removing long-period tides from the target data did not present considerable improvement in the results of our EML prediction approach.The reason for this is the chosen predictors in our GPR-based prediction model.As previously stated, the covariance matrix and mean are crucial for computing predictions in GPR.The covariance matrix influences how predictions are made by accounting for the similarities between predictors and target data.The predictors are chosen based on the feature selection which is based on correlation analysis with the target variable.Thus, our selection of EAM ( χ mass 3 , χ motion 3 ) as predictors, based on Eq. 9, will account for most of the geophysical variations associated with the target variables and effectively minimize their prediction errors in short-term.Please note that the leap seconds add significant jumps in dUT1 and hence, they are crucial to remove for achieving better prediction model.
In this study, we do not propose time series decomposition of the target variables to compute their predictions.We have checked our predictions with and without decomposition of target signal by empirical mode decomposition (EMD), and we did not see any significant changes.Moreover, the process of decomposition, prediction, and reconstruction of time series is more expensive in terms of computing time and more prone to errors.Unlike auto-regression techniques, the ML technique presents an advanced methodology that can capture the non-linear relationships, such as trends and seasonality patterns in the data effectively.Also, according to Luzum et al. (2001) and Johnson et al. (2005), short-term predictions would not benefit from the smoothing of past observations as observed by empirical testing.The data period of the target data and predictors used in this study are explicitly stated in the Sect.3.5.

Model selection using GA
The kernel,k , and hyperparameters, θ , are unknown and must be deduced from the training data.The search space for GA for this study is shown in Table 1 (Appendix In GA, the kernels and their hyperparameters form chromosome vectors, and the fitness function F (k, θ) is defined such that the optimization of the kernel is reduced to finding a minimum of F. In our case, we used the loss (mean squared error) obtained by the fivefold cross-validated regression model.The dataset is used as input into fivefold cross validation division.This technique splits the dataset into 5 equally sized folds, where each combination of 4 folds is used as training set (without repetition) and the remaining set is used as validation set.The model is trained on the training set observations to predict the observations within the validation set and then, the loss is computed for the validation set.The sets of hyperparameters and kernels, or chromosomes are then ranked based on the corresponding value of fivefold cross-validation loss or fitness function.In agreement with the process described in Sect.2.2, the cycle is repeated and the process stops if the minimum value of the fitness function for the population does not change in consecutive generations (convergence).Then, the components of the corresponding chromosome will be considered as optimal selection of kernel and its hyperparameters.In this way, the GPR kernel covariance function and its optimal hyperparameters were selected for the given dataset.This is computationally demanding part of the workflow, but this robust procedure helps to select the best performing GPR prediction model for the dataset.For fivefold cross validation in the study, the data set was divided into 80% training set with 9058 data points, and 20% validation set with 2264 points.For all experiments in our study, except one, the training set started in 01- 01-1990 (47,892) and ended in 20-10-2014 (56,950).The validation set started in 21-10-2014 (56,951) and ended in 31-12-2020 (59,214).Modified Julian Date (MJD) corresponding to their respective dates are shown inside parenthesis in the latter statements.

Prediction strategy for optimized GPR model
The prediction model developed based on the training data set, i.e., the ideal covariance function of the GPR Fig. 2 The impact of genetic algorithm optimization of GPR on training data for prediction of dUT1 R kernel and its optimized set of hyperparameters, is evaluated by comparing the test dataset to the computed predictions.The test dataset described here is different from that used for training and validation in the fivefold cross validation (Sect.3.2).It is reserved for evaluating the final performance of the optimized GPR model.After training and optimization of prediction model using the training and validation sets, we assess its generalization ability and predictive accuracy on the test set.This provides an unbiased estimate of how well the model is likely to perform on new and unseen data.The exact time period of training, validation and test data set are provided in Sect.3.5.
Based on an evaluation of various input data lengths for the prediction of ten days into future, we conclude that using 30 previous days as input yielded the best results.To test the optimized prediction model, we made 1000 samples of test data.We arranged the test dataset into input matrix of dimensions n × h × d to study the per- formance of prediction model.The n is the number of iterations or samples of test data used in assessment, i.e., 1000, while h is the input data length, i.e., 30 and d is the number of features.In our case, the features are the χ mass 3 and χ motion 3 functions of EAM, MJD, and LOD R or dUT1 R , depending on the target parameter of prediction.In contrast to the input matrix, there is also a forecasting matrix that is provided to the prediction model.This matrix contains the values of predictors in the forecasting epoch that are used to predict the target variables.The dimensions of the forecasting matrix are g × l , where g is the number of predictors, i.e., 3 ( χ mass 3 and χ motion 3 functions of EAM and MJD) and l is the prediction length, i.e., 10, for our study.
We use multiple output and recursive multi-step forecasting strategy to compute the 10-day predictions.The multiple output strategy entails creating a single model with one input dataset that predicts the entire forecast sequence.The following representation will help to understand this strategy: where t represents the starting day of predictions.The p shows the predictions and obs shows the observations.The input matrix with input length of 30 is used by our optimized GPR model, represented by mdl , is shown on the right side.
The recursive multi-step forecasting strategy is based on individual models for each prediction day, where later steps involve predictions of earlier steps: p(t) = mdl(obs(t − 1), obs(t − 2), ..., obs(t − 30)) p(t + 1) = mdl(p(t), obs(t − 1), ..., obs(t − 29)) Since the predictions are used in place of observations in this strategy, the prediction error accumulates and the p(t),...,p(t+9) = mdl(obs(t − 1), ....., obs(t − 30)), model will degrade for predictions on longer time horizons.Here we are focusing on short-term predictions, and hence this method is suitable for our prediction length.Also, a major advantage in this strategy is the use of the last time step before the predicted day as an input for the prediction model.When the input data are closer to the forecasted day, the performance of prediction might be better.

Performance metrics
The performance of the model in terms of predictions is evaluated by mean absolute error (MAE).The following equation is used to determine MAE of the prediction day, d: The AE denote the absolute error and n denotes the number of epochs predicted, i.e., 1000.d varies from 1 to 10 as we are performing 10-day prediction in our study.
For evaluating the performance, the absolute error is calculated as the difference of predicted and observed values on the same day.This observed value is obtained from one of the EOP products: IERS EOP 20 C04, DTRF2020, JTRF2020, and USNO Finals, based on the specific product being predicted.

Experiment setup
In this study, we conducted five experiments to comprehensively assess the performance of our proposed method.In this section, we will discuss the setup of each experiment one-by-one.The first experiment was designed to evaluate the performance of prediction of different EOP products.These different EOP products are IERS EOP 20 C04, DTRF2020, JTRF2020, and USNO Finals.Experiment 1 is a hindcast experiment, as this is primarily for the comparison of our predictions of different data products.Multiple output forecasting strategy was used in this experiment for the same reason.In Experiment 1, the training and validation data set starts on 01-01-1990 and ends 31-12-2017, and the test data set is from 01-01-2018 to 31-12-2020.Unlike different EOP products used in Experiment 1, the upcoming experiments use only EOP 20 C04 products from IERS for training and testing of predicted models.
Experiment 2 is also a hindcast experiment.Two different forecasting strategies, multiple output and recursive multi-step, were followed to predict LOD R and dUT1 R .
The training and validation data sets are longer than the time series of previous experiment.The reason for this is that the EOP products of DTRF2020 and JTRF2020 used (10) in Experiment 1, end on 02-01-2021.The GPR model was trained and validated using data from 01-01-1990 to 31-12-2020, while testing was performed on data from 01-01-2021 to 31-12-2022.Experiment 3 is used to evaluate the performance of our proposed model with operational settings.The IERS C04 products have a one-month latency and thus cannot be used as an input for prediction operationally in the current setting.Hence, we use daily rapid data products from IERS as input to overcome the latency period in IERS C04.In the rapid products, there is a latency of two days for LOD and a latency of one day for dUT1.In our experiment, we compensate for this latency by cubic spline interpolation, so that the prediction of the data set starts at day one in future.
The dataset that we used for the prediction has associated uncertainty and hence, we incorporated that into our GPR prediction models in the fourth experiment.
The IERS provides formal errors in terms of 1−σ standard deviation of its products and these were considered with our prediction model.This is achieved by modifying the kernel function to incorporate the uncertainty of the input values (Girard 2004).The following representation on ARD exponential kernel function of GPR describes our approach in this experiment.ARD stands for Automatic Relevance Determination: d depicts the dimension of the input matrix with length D. Modified kernel with uncertainty of input values ( σ d ) are represented by: (11 In this modification, w d shows the relevance weight of each input value.In this step, the σ d for each input value was expressed as variance.Then, the weights ( w d = 1/(σ 2 d ) ) were assigned based on their uncer- tainty values.Inputs with higher uncertainty were assigned lower weights, and vice versa.Further, the kernel matrix shown in Eqs.7 and 8 are calculated based on the modified kernel function.Then, we proceed with GPR prediction using the modified mean and covariance kernel matrix.For Experiment 4, the similar setup of training and test dataset as in Experiment 3 is followed.Further, the prediction model is tested for both forecasting strategies with operational settings in Experiment 4.
The axial component of the EAM function has a direct relationship with LOD (Eq.9), and we are using these as predictors for both LOD and dUT1, in our study.Thus, we were motivated to use LOD predictions to derive corresponding dUT1 predictions for the same epoch, using Eq. 1.This was performed in Experiment 5.The same setup for predicting LOD in Experiments 2 and 4 was used in this experiment regarding training, validation and testing datasets, for evaluating hindcast and operational settings, respectively.The predictive performance of the model was checked against the predicted dUT1 from Experiments 2 and 4 with the respective cases obtained in Experiment 5. (12)

Results and discussion
In this section, we discuss the results of the experiments outlined in Sect.3.5.Table 1 provides a summarized description of all the experiments.As iterated before in Sect.3.1, the removed leap seconds and tidal deformations were added to our predictions of dUT1 R and LOD R for obtaining the corresponding dUT1 and LOD predictions.Thus, the outcome in this section discusses the performance of dUT1 and LOD predictions.

Experiment 1
Figure 3 shows the MAE of the predicted dUT1 and LOD in the testing period, spanning from 2018 to 2020, of Experiment 1.The results of the four different series are portrayed by different colors in the plot.The MAE is calculated as the difference of the predicted parameter and the corresponding known parameter from that particular series.In Fig. 3(a), we can see that the MAE of dUT1 from IERS EOP 20 C04, DTRF2020, JTRF2020, and USNO Finals behave similarly.The MAE of dUT1 prediction on day 1 were 0.242 ms, 0.242 ms, 0.243 ms, and 0.267 ms for IERS EOP 20 C04, USNO Finals, JTRF2020 and DTRF2020 dUT1 products, respectively.The 'day' depicts the epoch from which the prediction is initiated.On the day 10, the MAE of predictions were 1.455 ms, 1.456 ms, 1.453 ms, and 1.491 ms for dUT1 obtained from IERS EOP 20 C04, USNO Finals, JTRF2020 and DTRF2020, respectively.The MAE of dUT1 exhibits a linear increase with the predicted days in future.It is evident from Fig. 3 that the MAE of all the products are close to each other, varying with a maximum factor of 1.1.The reason for the difference in MAE of predictions among different dUT1 products might be due to the different strategies involved for providing dUT1 by each group.Thus, the data learned by the prediction model were different for each of the four products and hence, difference in performance can be viewed.Further, as informed earlier, the dUT1 dataset obtained from JTRF2020 and DTRF2020 had gaps and were filled with interpolation.This might also affect the performance of their respective model as added uncertainty will be attached to the training data due to interpolation.It will not be a fair comparison to compare the performance of different products as the training and testing data are different and, also different prediction model is generated for each product.Thus, the results of this experiment display the performance of our prediction approach on four distinct EOP products.The outcome does not represent the quality of these EOP products.
Figure 3(b) shows the MAE on day 1 are 0.103 ms, 0.095 ms, 0.101 ms, and 0.101 ms for LOD obtained from IERS EOP 20 C04, USNO Finals, JTRF2020 and DTRF2020, respectively.While the MAE of day 10 LOD predictions from IERS EOP 20 C04, USNO Finals, JTRF2020 and DTRF2020 were 0.111 ms, 0.112 ms, 0.112 ms, and 0.112 ms, respectively.Unlike MAE of predicted dUT1, the MAE of predicted LOD does not increase linearly from day 1 to day 10 of predictions.This implies that the quality of LOD predictions for four EOP products obtained from our prediction algorithm remain consistent from day 1 to day 10 of predictions.This behavior is exhibited by our LOD predictions in all experiments of this study.Thus, it can be attributed as a peculiar behavior of our GPR GA predictions model with LOD.However, more has to be investigated in this direction.The LOD prediction of USNO performs better than the other series in the starting days of prediction, i.e., day 1 to day 5, by a maximum factor of 1.06.The MAE of JTRF20 LOD were relatively higher for most of the prediction days.This slightly higher MAE can be attributed to the use of interpolation for addressing the gaps in LOD provided by JTRF20 series for both training and testing phase.

Experiment 2
The MAE results of Experiment 2 are displayed by Fig. 4. The EOP parameters were predicted using our prediction model with two different forecasting strategiesmultiple output and recursive multi-step strategy.Thus, two lines each for the predicted parameter are shown in Fig. 4. Figure 4(a) shows the MAE of dUT1 predictions forecasted by multiple output strategy were 0.217 ms on day 1, 0.612 ms on day 5 and 1.008 ms on day 10.While those predicted by recursive multi-output strategy were 0.217 ms on day 1, 0.628 ms on day 5, and 1.02 ms on day 10.The MAE increases gradually from day 1 to day 10 of predictions in both cases.As the prediction epochs are same in the testing dataset, both the forecasting strategies have same MAE on day 1.This confirms high repeatability of the predictions obtained from our model.In Experiment 2, the dUT1 predictions by multiple output strategy show smaller MAE than the recursive multi-step with a maximum of 2.6%.This is evident as the recursive multi-step strategy makes predictions one step at a time and it feeds the previously predicted values back into the model for subsequent predictions.Therefore, in recursive multi-step, the input matrix for the prediction model includes predictions with greater uncertainty along with the test data from IERS C04 with much smaller uncertainty.In contrast to that, no predictions are included in the input matrix of multiple output strategy at any predicted day.This is the reason for relatively high MAE for recursive multi-step.Despite hindcast settings in both Experiments, 1 and 2, the MAE of dUT1 for IERS C04 is higher in the previous experiment, by a maximum of 30%.In order to maintain identical conditions to Experiment 1, this comparison is made with the outcomes of the multiple output strategy in Experiment 2. From Table 1, it is evident that the training data for the prediction model in Experiment 2 is three years more than that of Experiment 1.This portrays that more data for training is beneficial for developing the prediction model.Further, the testing period of Experiment 1 is for 3 years (2018-2020), in comparison to 2 years (2021-2022) testing in Experiment 2. This might also contribute in attaining higher prediction error in Experiment 1, revealing the need for re-training the prediction model.Further elaboration on this subject will be provided in Sect.4.4.
Figure 4(b) portrays the MAE of LOD, which is 0.098 ms and 0.098 ms on day 1, 0.116 ms and 0.127 ms on day 5, and 0.105 ms and 0.117 ms on day 10 for multiple output and recursive multi-step strategy, respectively.Similar to previous experiment, the MAE of LOD exhibit an inconsistent increase from day 1 to day 10 for both tested strategies.The predictions by multiple output show smaller MAE than by recursive multi-step with a maximum of 12%.The MAE of IERS C04 LOD obtained in Experiment 1 is higher than that of Experiment 2 by a maximum of 5%.The reason contributing to this difference is explained in the previous paragraph.

Experiment 3
Figure 5 shows the results of Experiments 3 and 4. Firstly, we will discuss the results of Experiment 3 from the plot.Figure 5(a) shows the MAE of predicted dUT1 with 0.241 ms and 0.241 ms on day 1, 0.676 ms and 0.692 ms on day 5, and 1.113 ms and 1.125 ms on day 10 for multiple output and recursive multi-step strategy, respectively.Despite the same MAE on day 1 for both cases, our prediction model displays slightly better performance with multiple output strategy, by a maximum of 2%.This experiment is set up with operational settings, which means that the input data consist of IERS Rapid EOP.The formal associated with IERS Rapid EOP data is more than that of the IERS C04 EOP.This explains the higher MAE observed for dUT1 in the operational case than that obtained for Experiment 2 with hindcast settings.The dUT1 MAE in Experiment 3 is more than Experiment 2, by a maximum of 9%.
The MAE of predicted LOD in Experiment 3 is displayed in Fig. 5(b).It is 0.106 ms and 0.106 ms on day 1, 0.124 ms and 0.137 ms on day 5, and 0.115 ms and 0.128 ms on day 10 for multiple output and recursive multi-step strategy, respectively.The MAE of multiple output strategy shows maximal improvement of 13%, compared to the outcome of recursive strategy.For similar reasons as iterated before, the LOD MAE in this experiment with operational settings are 6-9% higher than that observed for hindcast settings in Experiment 2.

Experiment 4
Now, the results of Experiment 4 are discussed from Fig. 5.The performance of the predicted results of dUT1 is portrayed in Fig. 5(a).The MAE of dUT1 predicted by multiple output strategy were 0.174 ms on day 1, 0.382 ms on day 5 and 0.481 ms on day 10, while those from the recursive multi-step strategy were 0.174 ms on day 1, 0.358 ms on day 5, and 0.413 ms on day 10.In contrast to the previous experiments (2-3), the recursive multi-step strategy performs better than the multiple output with a maximal improvement of 13%, in Experiment 4. When the latest data from the day before the predicted day were used for the input in the recursive multi-step approach with its uncertainty, the performance of this particular strategy improved in Experiment 4. The inclusion of the latest predictions, together with their associated uncertainty, provided an additional benefit to this prediction methodology.It is important to remember that both Experiments 3 and 4 are configured with operational settings.The only difference in the prediction strategy of this experiment is the use of uncertainty associated with the input EOP data.It is evident from Fig. 5(a) that there is only a very slight difference in the MAE of day 1 among the outcome from these two experiments.This difference increases gradually to reach the maximum at day 10.This indicates that this particular prediction methodology improves the prediction error with the increase of predicted days with respect to Experiment 3. Further, the best performance of Experiment 3 is improved by a maximum of 63% in Experiment 4.
Figure 5(b) depicts the MAE of predicted LOD in Experiment 4. The associated values are 0.086 ms and 0.086 ms on day 1, 0.111 ms and 0.105 ms on day 5, and 0.099 ms and 0.092 ms on day 10 for multiple output and recursive multi-step strategy, respectively.The recursive multi-step improves the performance of our prediction model by a maximum of 9% than that portrayed by multiple output strategy.Further, the outcome of Experiment 4 is an improvement over that of Experiment 3, with a maximum of 20%.Please note from Fig. 5 that there is a difference between the results of Experiments 3 and 4, albeit the input data being the same.This reveals that uncertainty of EOP input data has a substantial impact on enhancing the prediction error of our prediction model.
Figure 6 displays the absolute error of 10-day predictions obtained for 1000 prediction epochs or iterations in testing period 2021-2022.This is shown for recursive multi-step of dUT1 and LOD predictions in Experiment-4.This is solely to examine the performance of our prediction model in each of the 1000 prediction epochs.In Fig. 6(a), for the iterations or prediction epochs in year 2022, the higher AE values above 1 ms for dUT1 predictions are observed when predictions start in between MJD 59741 (11-06-2022) andMJD 59789 (29-07-2022).Similarly, the dUT1 prediction epochs in the year 2021 with higher AE values than others in the same year, start in between MJD 59393 (28-06-2021) and MJD 59402 (07-07-2021).Please remember that one or more prediction epochs might starts at the same day since we perform 1000 iterations in the test period 2021-2022.Thus, consecutive prediction epochs with similar AE are possible in Fig. 6.Unlike AE of dUT1 predictions, the LOD AE does not increase in ascending order from day 1 to day 10 of predictions.This is already clear from the MAE results discussed in the Sect.4. Note that the start MJD of prediction epochs for LOD predictions are same as that of dUT1 predictions.The prediction epochs in which dUT1 predictions experience higher AE match with those of LOD predictions exhibiting relatively higher AE compared to their counterparts.

Experiment 5
Figure 7 shows the MAE for dUT1 predictions only.The dashed lines in the plot represent dUT1 predictions as computed in the previous experiments, while the solid line represent dUT1 computed from the corresponding LOD predictions.The latter is labeled as dUT1 (LOD) to differentiate with those obtained from previous experiments.Figure 7(a) displays the results from hindcast experiment.The MAE of the regular dUT1 predictions in Fig. 7(a) is already discussed in the Sect.4.2.Now, we will only focus on the results of dUT1 (LOD) and its comparison with the directly predicted dUT1 as exhibited from Fig. 7(a).The dUT1 (LOD) is derived from the predicted LOD of Experiment 2 using Eq. 1.The MAE of the predicted dUT1 (LOD) for the multiple output strategy is 0.095 ms on day 1, 0.525 ms on day 5, and 0.92 ms on day 10.The MAE with recursive multi-step is 0.095 ms on day 1, 0.562 ms on day 5, and 1.02 ms on day 10 for dUT1 (LOD) predictions.Similar to Experiment 2, the multiple output performs better than recursive multi-step, by a maximum of 9% in dUT1 (LOD).Figure 7(a) clearly demonstrates that dUT1 (LOD) outperforms dUT1 for hindcast settings.The improvement in initial days of predictions is higher than that achieved for later days in the short-term prediction length.In multiple output strategy, the dUT1 (LOD) improves by a maximum of 56% on day 1 which reduces to 8% on day 10, with respect to dUT1 obtained in Experiment 2. Similarly, the dUT1 (LOD) improves by a maximum of 56% on day 1 to 1% on day 10 for recursive multi-step strategy.This indicates that dUT1 (LOD) prediction methodology brings out substantial improvement in the initial days of predictions, in comparison to the directly predicted dUT1.The MAE of dUT1 (LOD) are 0.093 ms and 0.093 ms on day 1, 0.551 ms and 0.515 ms on day 5, and 0.998 ms and 0.902 ms on day 10 for multiple output and recursive multi-step predictions, respectively.The predicted LOD from recursive multi-step performs better than multiple output in Experiment 4, and hence similar outcome is expected for dUT1 (LOD) in this experiment.The prediction of dUT1 (LOD) from recursive multi-step strategy outperforms its counterpart, by a maximum of 10%.Further, from Fig. 7(b), it is evident that the dUT1 (LOD) performs better for the initial days of prediction, i.e., day 1 to 2. Whereas later, the performance of dUT1 (LOD) predictions scores worse than that of predicted dUT1 in Experiment 4. The performance of dUT1 (LOD) on day 1 is highest with 46% improvement than dUT1 from Experiment 4. This performance of dUT1 (LOD) decreases by a maximum of 47% and 58% on day 10 than that obtained in Experiment 4. This reveals that dUT1 (LOD) prediction methodology will provide smaller prediction errors than dUT1 on prediction days 1-2, for the operational settings in our study.

Comparison with other techniques
In this section, we will compare the best obtained outcome from this study with other prediction models.For comparison, we will solely concentrate on the results obtained with the operational settings, as this is crucial for real-time EOP applications, which is the primary objective of generating short-term EOP predictions.Based on previous sub-sections displaying results, the best outcome is achieved by our prediction methodology followed in Experiment 4. Both our investigated EOP parameters, dUT1 and LOD, agree with the latter assertion.However, for predicted days 1-2, the dUT1 prediction approach of Experiment 5 achieves better performance than Experiment 4, by 46%.Hence, we recommend to use the prediction approach of Experiment 5 for predicting the initial days (day 1 and 2) of dUT1 prediction, and rest of the prediction days to be predicted by the approach followed in Experiment 4. Thus, we used the prediction errors for dUT1 based on the latter declaration for comparison with other prediction techniques, in this section.The prediction errors of LOD in Experiment 4 were considered for further comparison in this section.Therefore, from the testing of 1000 10-day prediction epochs for 24 months (2021-2022) with our prediction model, the best achieved MAE were 0.093 ms on day 1, 0.358 ms on day 5, and 0.412 ms on day 10 for dUT1 predictions.Similarly, for LOD predictions from our prediction model, we attained the best MAE of 0.086 ms on day 1, 0.105 ms on day 5, and 0.092 ms on day 10.The best obtained MAE reported here are from operational settings examined in the above-mentioned experiments.
The first IERS EOP prediction comparison campaign (EOP PCC-I) showed MAE on the 10th day of prediction between 0.60 ms to 1.40 ms for dUT1 and 0.13 to 0.40 ms for LOD (Kalarus et al. 2010).This was after 29 months of operational campaign, started in 1 October 2005.In the second IERS EOP prediction comparison campaign (EOP PCC-II), the MAE on the prediction day 10 was between 0.27 ms to 3.15 ms for dUT1 and 0.072 ms to 0.292 ms for LOD (Table 7 and 9 in Śliwińska-Bronowicz et al. (2024)).This was after 16 months (2021-2022) of operational campaign, started in 1 September 2021.Our dUT1 MAE for day 10 is smaller than the best results obtained in the EOP PCC-I, by a factor of 1.5.While our dUT1 MAE is higher than the best result of EOP PCC-II, by a factor of 1.5.It is important to remember that the latter campaign ran for a duration of 16 months, whereas we conducted testing for 24 months period on our prediction model to determine the MAE.However, in EOP PC-II, the dUT1 MAE obtained by various prediction methodologies using ML on day 10 was 0.43 ms-1.61ms (Table 7 in Śliwińska-Bronowicz et al. (2024)).In comparison to the dUT1 MAE from other techniques listed here, we infer that our attained dUT1 MAE results are better for most cases.The smallest LOD MAE obtained for day 10 in EOP PCC -I, is higher than that achieved by our prediction model, by a factor of 1.4.The LOD MAE of best performing prediction methodology of EOP PC-II is smaller than our MAE by a factor of 1.3.However, the day-10 MAE obtained by various MLbased prediction methodologies were 0.098 to 0.107 ms in the latter campaign (Table 9 in (Śliwińska-Bronowicz et al. 2024)), which is slightly higher than 0.092 ms achieved by our prediction model on the same predicted day.Albeit, the LOD predictions obtained from our prediction model are not as good in the initial days of prediction, this can be improved further by investigating the cause of prediction errors.
Most of the current research primarily focus on shortterm prediction of LOD, with just a limited number available for dUT1.The short-term prediction of dUT1 by combination of the grey model and neural networks by Lei et al. (2017a) generated MAE of 0.039 ms on day 1 to 0.966 ms on day 10.Ye and Yuan (2024) employed improved least-squares and multivariate autoregressive (LS + MAR) hybrid method to achieve dUT1 predictions with MAE of 0.021 ms on day 1 to 0.934 ms on day 10.These predictions were conducted with hindcast settings and hence, their prediction accuracy will be lower in operational settings.Compared to these recent dUT1 prediction methodologies, it is evident that the dUT1 prediction errors obtained from our prediction model are higher in the initial days of short-term prediction, but gets smaller after day 5 of prediction till day 10.The denoised EAM implemented with least-squares and autoregressive (LS + AR) method by Li et al. (2023) presented more accurate predictions than the traditional LS + AR generating MAE of 0.055 ms-0.36 ms for day 1 to day 10 operational dUT1 predictions.The day 10 MAE of the latter is smaller than our corresponding MAE by a factor of 1.1.We can also consider denoising EAM for utilizing with our GPR GA prediction model for better prediction accuracy.In this section, the MAE presented for other techniques were determined using IERS EOP C04 as a reference.This is crucial to ensure a fair comparison with our results in this section.
Different authors introduced a variety of approaches for the prediction of LOD.Some recent LOD prediction methodologies include the best approach by Modiri et al. (2020) based on Copula and singular spectrum analysis (SSA) method which provided MAE in the range of 0.047-0.083ms for the days 1-10, respectively.The extreme machine learning approach used by Lei et al. (2015) generated MAE of 0.025 ms on day 1 up to 0.21 ms on day 10 for LOD predictions.Another ML prediction of LOD using 1D Convolutional Neural Network produced results between 0.027 ms and 0.120 ms for day 1 and day 10, respectively (Guessoum et al. 2022).All these state-of-the-art approaches described till now involved hindcast experiments to evaluate their LOD predictions between different time spans.The recent high quality work published by Gou et al. (2023) using LSTM (Long Short-Term Memory) networks shows MAE of LOD prediction under operational conditions of 0.025 ms on day 1 to 0.08 ms on day 10.It is important to note that the time span for training and testing of data are all different for the approaches mentioned above.Our LOD predictions of both hindcast and operational experiments are comparable to them on day 10.
Further, we have directly compared our prediction outcome with the commonly used LS + AR (Xu and Zhou 2015;Xu et al. 2012), and highly recommended Kalman filter (Freedman et al. 1994;Gross et al. 1998;Gross 2011) prediction method for dUT1 and LOD.In order to compare our predictions with the predictions supplied by IERS in Bulletin-A, we have also demonstrated the MAE calculated from their predictions.The MAE is determined with respect to the IERS C04 estimates.The MAE of these three prediction techniques along with our prediction outcome is displayed in Tables 2, 3.The prediction period, spanning from 1 September 2021 to 31 December 2022, was utilized to ensure consistency in the prediction epoch across all the methodologies compared in this table.We determined dUT1 and LOD predictions for the latter prediction epochs with our recommended methodology for each parameter, as mentioned in this Sect.4.6.For the direct comparison, the predictions are obtained for 70 prediction epochs, once every week, for each prediction methodology portrayed in Tables 2, 3.The predictions of LS + AR, Kalman filter, and Bulletin-A are available in the link provided under ' Availability of data and materials' in this paper.The IERS Bulletin A does not provide LOD predictions, hence their corresponding prediction errors are not shown in Table 3. From Table 2, it is evident that the dUT1 MAE of LS + AR is higher than that achieved by our prediction model, by a maximum factor of 2.8 on day 10.The MAE for dUT1 portrays that the prediction error of our method is higher than that of Kalman filter and Bulletin-A for most of the predicted days.The difference between the dUT1 MAE of our method and that of the Kalman filter and Bulletin-A is highest on day 1 and thereafter reduces with the increase of prediction days.On day 10, our prediction model performs better than Kalman filter by a factor of 1.2.However, the Bulletin-A predictions performs better than our prediction model by a minute factor of 1.1 on day 10 of dUT1 predictions.In LOD predictions, as shown in Table 3, our prediction method performs better than LS + AR by a maximum factor of 3.However, the MAE of LOD predictions from Kalman filter are smaller than that attained by our model.The difference between the MAE of Kalman filter and our prediction model decreases significantly along the prediction length, after day 1 of predictions.Our LOD MAE at day 10 is smaller than that of Kalman filter by a factor of 1.05.

Summary and conclusion
We investigated the use of combining GA with GPR (GPR GA ) for short-term dUT1 and LOD predictions.
In other words, we evaluated the use of evolutionary machine learning algorithm for EOP prediction.Thus, we developed robust GPR-based prediction model with optimal selection of kernel function and its hyperparameters from the complex search space using GA optimization.The search space, including continuous and discontinuous values, was searched for an optimal combination for better prediction accuracy.This provides better GPR prediction model than any other commonly used optimization techniques as shown in Sect.3.2.In addition to that, by learning the underlying relationships with the predicted EOP data, the appropriate features or predictors influencing the EOP were chosen for our model.The axial components of EAM were selected as predictors for our prediction model.Further, the exact input data length with minimum error were used for attaining the desired prediction outcome for short-term prediction length.
Our prediction algorithm was tested on four different EOP products from IERS EOP 20 C04, USNO Finals, JTRF2020 and DTRF2020 during 2018-2020.Then to bring the testing of the algorithm to more recent data period, i.e., 2021-2022, we performed experiments on IERS EOP 20 C04 series in both hindcast and operational settings.In operational settings, due to the one-month latency of IERS C04 EOP, rapid EOP products with relatively high uncertainty were used as input.Further, the predictions of EAM are used in the forecasting matrix, under operational settings.The accuracy of the EAM predictions available are not as good as their estimates.In contrast to that, the hindcast settings predict on the past data with actual estimates in the input and forecasted matrix, with no shortcomings as discussed for operational settings.For each of the experiments, we investigated two different forecasting approaches, multiple output and recursive multi-step, to assess better approach for shortterm prediction.Unlike any other related study, we also incorporated the uncertainty of the input data for getting better prediction model.This was a novel approach used in our study which improved the results significantly.We also determined dUT1 predictions from predicted LOD of the corresponding epoch, as axial components of EAM used as predictors in our prediction approach are directly correlated to LOD.The performance of the dUT1 predictions obtained from the latter approach were compared with those obtained in the previous experiments.Please refer to Table 1 for a summarized version of experimental setup used for testing different prediction approaches with our prediction model.It is also important to note that the performance of our prediction model was evaluated based on the mean prediction error obtained from 1000 predictions for each day of the short-term prediction length.Experiment 1 depicted the performance of our developed prediction model on predicting EOP from four different EOP products.The MAE of dUT1 and LOD predictions for all the four products exhibited minimal variations among each other, with a maximum factor of 1.1.This showed that our developed GPR GA prediction model is robust.It was evident from the results that the performance of dUT1 is more uniform than LOD over the short-term prediction length.This could be due to the fact that the LOD is simply the gradient of UT1 which is more consistent and stable.The hindcast settings of Experiment 2 portrayed that the performance of multiple output forecasting strategy is better than the recursive multi-step by a maximum factor of 1.02 for dUT1 and 1.2 for LOD.The best performing prediction approach of hindcast settings generated MAE of 1 ms for dUT1 and 0.1 ms for LOD on day 10.The testing with hindcast settings indicates the capability of our prediction model when the shortcomings in the input and forecasted data in operational settings will be resolved.As expected, the performance of our prediction model worsens in Experiment 3 with operational settings, by a maximum of 9% for both dUT1 and LOD.In Experiment 4, the inclusion of uncertainty in the operational settings improved the performance of recursive multi-step forecasting approach with respect to multiple output.In addition to that, the best MAE for both dUT1 and LOD is generated with the prediction approach followed in Experiment-4.We achieve MAE of 0.412 ms for dUT1 and 0.092 ms for LOD on day 10.In Experiment 5, the dUT1 predictions determined from LOD predictions of Experiment 4 further improved the performance of our prediction model on the initial prediction days (day 1-2) with respect to dUT1 MAE obtained with operational settings, by a maximum factor of 1.9.
The recommended prediction methodology for shortterm predictions of dUT1 and LOD with our GPR GA prediction model will be recursive multi-step strategy by incorporating uncertainty values of input EOP data as shown in the methodology of Experiment 4.However, for further improving the initial two days of dUT1 prediction, the prediction approach in Experiment 5 is recommended.A significant finding from the outcome of this study is that the uncertainty of input values helps to reduce the shortterm prediction error.Also, our GPR GA prediction methodology performs relatively better for predictions of day 5 day 10 of the short-term prediction length.
The comparison of our prediction methodology with other prediction techniques portrayed that our results are comparable with some of the best available approaches.
Our outcome mostly at day 10 are better than most of the best prediction methodology as investigated in Sect.4.6.Unlike many of the related research, we performed our predictions in operational settings for generating the prediction errors in real-time.In this study, the best dUT1 outcome with MAE of 0.09 ms on day 1, 0.358 ms on day 5, and 0.412 ms on day 10 are close to the ESA requirements as described in the Sect. 1.The MAE on day 1 and day 5 are more by a factor of 1.5 and 1.6, respectively, than the ESA requirements.While our MAE on day 10 are better by a factor of 1.3 than the requirements set by ESA.Nonetheless, our predictions obtained for many prediction epochs in the testing period are well below the ESA required criteria.But for other epochs, we need to further investigate our prediction approach to reduce the prediction errors, especially for the initial days of short-term prediction.
Since, the prediction algorithm using Kalman filtering performed better than our prediction approach, we should also investigate by utilizing Kalman filter along with our GPR GA prediction model for improving the prediction accuracy.During the testing period in 2021 and 2022, the investigation of AE revealed elevated prediction errors in June and July, specifically.The reason for this might be some extreme meteorological event, like ENSO, and has to be investigated further.Also, the benefits of incorporating global climatic factors for short-term EOP predictions should be examined.The relatively higher AE during the prediction epochs of 2022 can also indicate the need for re-training our model.Moreover, it will be beneficial for the prediction model to be re-trained regularly for including the recent geophysical variations.This will further help to minimize the prediction errors.
In this study, we extensively exploited the use of GPR GA prediction approach for short-term dUT1 and LOD predictions.Although, the results obtained were satisfactory, there is still scope for more investigation in this technique for development of improved GPR GA prediction methodology.As mentioned before, the input data and predictors are crucial for our prediction technique.Therefore, it is necessary to evaluate the performance of other geophysical parameters that influence EOP in order to determine their suitability as predictors.Another study can be conducted to compare various inputs, such as VLBI intensives or a combination of VLBI intensives with GNSS, in order to enhance predictions in operational settings using our prediction model.We can utilize the EOP estimated from our own analysis with our proposed prediction process to independently generate EOP predictions.

Appendix
See Table 4 This search space shows that the kernel function, and its hyperparameters (basis functions, kernel scale, sigma, and standardize).As mentioned in the paper, the search space contains continuous and discontinuous vales from which an optimal combination has to be selected with minimal prediction error.For instance, the Kernel Function, Basis function, and standardization parameters contains discreet choices.While the kernel scale and sigma offer continuous choices in between stated minimum to maximum values.(1993) Prediction of the length of day from atmospheric angular momentum with lstar model.In: Mueller II, Kołaczek B (eds) Developments in astrometry and their impact on astrophysics and geodynamics international astronomical union/union astronomique internationale.Springer, Dordrecht Zheng H, Yuan J, Chen L (2017) Short-term load forecasting using EMD-LSTM neural networks with a Xgboost algorithm for feature importance evaluation.Energies.https:// doi.org/ 10. 3390/ en100 81168

Fig. 1
Fig. 1 Schematic flowchart of the prediction process followed in this study.As an example, showing prediction process for 30-07-2021 to 08-08-2021 Earth orientation due to mass redistribution and motion by means of the Liouville equation(Munk and MacDonald 1975;Barnes et al. 1983).The superscript 'mass' signifies the terms induced by mass redistributions, and 'motion' terms denote the motion or transport of masses due to atmospheric winds, ocean currents, etc.The observed LOD in s/day can be quantified by the axial component ( ).The impact of GA optimization on GPR model can be clearly assessed from Fig. 2. The plot shows the values of dUT1 R for specified dates.These dates were chosen at random.The black and blue lines depict the predicted values of the GPR prediction model before and after GA optimization.The red shows the actual values of dUT1 R during the specified period in the x-axis.The figure clearly shows that the data from the predicted model after GA optimization fits well to the actual data.This shows the performance of the model on re-substitution (training data), which provides a preliminary measure of how well the model fits on the training data.But the actual performance of our prediction model (Sect.3.4) is assessed by predictions (Sect.3.3) made on a new dataset that the model has not seen during training.

Fig. 3
Fig. 3 The MAE values showing performance of our prediction model in Experiment 1 for predictions of: (a) dUT1 and (b) LOD

Fig. 4
Fig. 4 The MAE values showing performance of our prediction model in Experiment 2 for predictions of: (a) dUT1 and (b) LOD

Fig. 5
Fig. 5 The MAE values showing performance of our prediction model in Experiment 3 and Experiment 4 for predictions of: (a) dUT1 and (b) LOD

Fig. 6
Fig. 6 The AE values of, (a) dUT1 and (b) LOD, predictions from Experiment 4 with recursive multi-step forecasting strategy

Figure 7
Figure7(b) exhibits the MAE of predicted dUT1 and dUT1 (LOD) with operational settings of Experiment 4. For discussion on the MAE of predicted dUT1, we refer to the results of Experiment 4 in Sect.4.4.The MAE of dUT1 (LOD) are 0.093 ms and 0.093 ms on day 1, 0.551 ms and 0.515 ms on day 5, and 0.998 ms and 0.902 ms on day 10 for multiple output and recursive multi-step predictions, respectively.The predicted LOD from recursive multi-step performs better than multiple output in Experiment 4, and hence similar outcome is expected for dUT1 (LOD) in this experiment.The prediction of dUT1 (LOD) from recursive multi-step strategy outperforms its counterpart, by a maximum of 10%.Further, from Fig.7(b), it is evident that the dUT1 (LOD) performs better for the initial days of prediction, i.e., day 1 to 2. Whereas later, the performance of dUT1 (LOD) predictions scores worse than that of predicted dUT1 in Experiment 4. The performance of dUT1 (LOD) on day 1 is highest with 46% improvement than dUT1 from Experiment 4. This performance of dUT1 (LOD) decreases by a maximum of 47% and 58% on day 10 than that obtained in Experiment 4. This reveals that dUT1 (LOD) prediction methodology will provide smaller prediction errors than dUT1 on prediction days 1-2, for the operational settings in our study.

Fig. 7
Fig. 7 The MAE values showing performance of our prediction model in Experiment 5 for predictions of: (a) dUT1 from Hindcast Experiments (b) dUT1 from Operational Experiments

Table 1
Description of the experiment setup of the study a IERS Rapid products are used as input for operational settings

Table 2
The MAE (ms) achieved for operational dUT1 predictions with different prediction techniques

Table 3
The MAE (ms) achieved for operational LOD predictions with different prediction techniques

Table 4
(Rasmussen and Williams 2006)algorithm consists of following parameters from the Gaussian process regression model(Rasmussen and Williams 2006), max(1e −3 , 10 • std(r))] where std is the standard deviation of target parameters (r)5Standardize True want to standardize the predictors and target variables False do not want to standardize the predictors and target variables Zheng D