Constraints on Cubic and $f(P)$ Gravity from Cosmic Chronometers, BAO and CMB Datasets: Use of Machine Learning Algorithms

In this work, we perform an observational data analysis on Einsteinian cubic gravity and $f(P)$ gravity to constrain the parameter space of the theories. We use the 30-point $z$–$H(z)$ cosmic chronometer data as the observational tool for our analysis, along with the BAO and the CMB peak parameters. The $\chi^2$ statistic is used for the fitting analysis and is minimized to obtain the best-fit values of the free model parameters. We have used the Markov chain Monte Carlo algorithm to obtain bounds for the free parameters; to achieve this we used the publicly available CosmoMC code to put bounds on the parameters and subsequently generate contour plots for them at different confidence intervals. Besides deriving the Hubble parameter $H$ as a function of the redshift $z$ from our gravity models, we have employed correlation coefficients and two machine learning models, namely linear regression (LR) and an artificial neural network (ANN), for the estimation of $H(z)$. For this purpose, we have developed a Python package for exploring the parameter space and performing the subsequent statistical and predictive analysis using machine learning. We compared both our theoretical and estimated values of $H(z)$ with the observations. It is seen that both the theoretical models and the machine learning estimates perform significantly well when compared with the observations.


I. INTRODUCTION
Since the discovery of the late-time cosmic acceleration (Riess et al. 1998; Perlmutter et al. 1999; Spergel et al. 2003), General Relativity (GR) has suffered a major setback due to its inability to explain the phenomenon. Since then, modified gravity theories (Nojiri & Odintsov 2007; Nojiri, Odintsov & Oikonomou 2017) have become a useful tool in describing the acceleration of the universe. The basic aim of a modified gravity theory is to comprehensively describe the history of the universe, starting from the early inflationary phase to the late-time acceleration, always complying with the observations. Modification of Einstein gravity basically involves the modification of the Einstein-Hilbert (EH) action, by generalizing the gravity Lagrangian $R$ (where $R = g^{\mu\nu}R_{\mu\nu}$ is the Ricci scalar). Corrections to the EH action may come from higher-order curvature terms, invariants coming from the matter sector, inclusion of torsion in the theory, etc. This motivates one to search for higher-order corrections to the EH term, resulting in higher-order gravity theories. It should be stated here that the higher-order corrections are motivated by the fact that such terms naturally arise in the effective action of a complete string theory (Gross & Sloan 1987), resulting in a renormalizable and hence quantizable theory of gravitation (Stelle 1977). Some examples include topologically massive gravity (Deser, Jackiw & Templeton 1982), new massive gravity in three dimensions (Bergshoeff, Hohm & Townsend 2009) and critical gravity (Lu & Pope 2011). The whole concept of holography has in fact motivated the construction of higher-order theories like quasitopological gravity (Oliva & Ray 2010; Myers, Paulos & Sinha 2010). One important aspect of higher-order theories is that some of them are equivalent to Einstein gravity at the linearized level in vacuum.
In these cases the only physical mode transmitted by the metric perturbation is a transverse and massless graviton. Examples include quasitopological gravity (Oliva & Ray 2010; Myers, Paulos & Sinha 2010) and certain $f$(Lovelock) theories (Lovelock 1971; Bueno et al. 2016; Karasu, Kenar & Tekin 2016). It is known that the most fundamental generalization of the EH action is achieved by replacing $R$ with an arbitrary function $f(R)$, resulting in $f(R)$ theories (De Felice & Tsujikawa 2010; Sotiriou & Faraoni 2010). The inclusion of an arbitrary function of $R$ in the field equations helps us probe the non-linear effects of the curvature invariant.
In spite of being highly motivated, modified gravity theories have several disadvantages. Since the couplings of the curvature invariants depend on the dimension of the spacetime, these theories eventually boil down to different theories in different dimensions. The higher-order curvature terms in the action may give rise to field equations of order greater than two. This does not always indicate instabilities or pathologies, but it is nevertheless a matter of concern.
A very interesting higher-order theory known as Einsteinian cubic gravity (ECG) (Bueno & Cano 2016a) was developed recently, where the authors used cubic contractions of the Riemann tensor $R_{\mu\nu\alpha\beta}$ to generate the cubic invariant. In the formulation the authors used the linearization technique of general higher-order gravities. Although highly non-linear, the theory contains non-topological terms that ensure its basic health conditions. It was shown that the theory admits spherically symmetric black hole solutions with a second-order differential equation for the metric function (Bueno & Cano 2016b; Hennigar & Mann 2017). The cosmology of ECG was studied and it was found that the theory has a mechanism that triggers an early inflation and a late-time acceleration close to the ΛCDM behaviour (Arciniega, Edelstein & Jaime 2020). Following these developments, the theory was further extended to $f(P)$ gravity ($P$ being the cubic invariant of ECG), where the gravity Lagrangian is given by the Ricci scalar coupled with an arbitrary function of the cubic invariant $P$ (Erices, Papantonopoulos & Saridakis 2019). The theory was formulated in four dimensions and the cosmology of the model was studied. It was seen that $f(P)$ gravity admits early inflation and late acceleration even with a vanishing cosmological constant. A complete dynamical system analysis of $f(P)$ gravity was performed by Marciu (2020). Recently $f(P)$ gravity was further extended to $f(R,P)$ gravity in Marciu (2021), where the author studied the dark-sector effects of the theory.
Inspired by the success of ECG and $f(P)$ theories in the cosmological context, we want to conduct an observational data analysis on both theories to constrain their parameter space. The most important aspect of a theoretical model is its degree of compliance with the observations. By fitting the theory to the observational data one can obtain bounds on the free parameters of the model using various statistical tools. This is a vital exercise for any theory, and with the bounds on the model parameters the theory gains physical viability. In this work we will use the cosmic chronometer data from slowly evolving distant galaxies as our observational tool. We will use various statistical procedures like the $\chi^2$ minimization technique, the Markov chain Monte Carlo method and machine learning algorithms for our analysis. The paper is organized as follows: in section II we discuss the basic equations of ECG and $f(P)$ gravity. In section III we discuss the observational data analysis with the cosmic chronometer data combined with the baryon acoustic oscillation and cosmic microwave background peak parameters, using the $\chi^2$ minimization technique and the Markov chain Monte Carlo algorithm. In section IV we discuss the statistical analysis and the machine learning methodology. In section V we present all the results obtained from sections III and IV. Finally the paper ends with a conclusion in section VI.

II. BASIC EQUATIONS OF CUBIC AND $f(P)$ GRAVITY
The basic way of modifying Einstein's theory of General Relativity (GR) is to introduce new scalar invariants in the gravitational Lagrangian of the Einstein-Hilbert action that replace the orthodox Ricci scalar $R$. There can be many such possibilities from the mathematical point of view: one can play with the Riemann tensor $R_{\mu\nu\rho\sigma}$, the Ricci tensor $R_{\mu\nu}$, the energy-momentum tensor $T_{\mu\nu}$, etc., explore their possible contractions and produce scalars from such exercises. Although their physical significance and importance to cosmology is a totally different question and needs thorough study, their mathematical significance is unquestionable. Nevertheless, while undertaking such an exercise one would like to comply with three rules: (i) the theory possesses a spectrum identical to that of GR; (ii) it should be non-topological in nature and should possess non-trivial terms in four dimensions; (iii) the field equations retrieved from such a theory must be of second order.
In four-dimensional spacetime a general non-topological cubic term would be given by (Erices, Papantonopoulos & Saridakis 2019), where the $\beta_i$ are parameters. If condition (i) stated above is to be satisfied, then we should have, The action for cubic gravity is given by, where $P$ is the invariant defined in eqn. (1) and $\alpha$ is the coupling parameter. $\kappa = 8\pi G$ is the gravitational coupling, with $G$ Newton's constant, and we have considered a vanishing cosmological constant. $\mathcal{L}_m$ is the Lagrangian for the pressure-less matter and $\mathcal{L}_{rad}$ is the Lagrangian for radiation. Varying the above action (4) with respect to the metric, we arrive at the following field equations, where the energy-momentum tensors $T_{\mu\nu}$ and $T^{rad}_{\mu\nu}$ are given by, and where $\nabla_\mu$ is the covariant derivative with respect to the metric $g_{\mu\nu}$. The tensor $K_{\alpha\beta\mu\nu}$ is defined by, Here the tensor $H_{\mu\nu}$ arises from the contribution of the invariant $P$. In order to explore the cosmological implications of cubic gravity, we consider a homogeneous and isotropic spacetime given by the Friedmann-Lemaitre-Robertson-Walker (FLRW) metric, where $a(t)$ is the scale factor and $\delta_{ij}$ is the Kronecker delta. The matter sector will be represented by a perfect fluid with energy-momentum tensor, where $\rho_m$ and $p_m$ are respectively the energy density and pressure of matter, and $u^\mu$ is the four-velocity of the fluid. Equipped with all these considerations, we can now write the FLRW equations from the field equations (5) as, where $H = \dot{a}/a$ is the Hubble parameter and dots represent derivatives with respect to time. Here we have defined the parameter $\tilde{\beta}$ as, Moreover, the condition for second-order field equations is satisfied if we have, It should be noted that under the FLRW geometry, considering equations (2) and (14), the cubic invariant can be put in the form, Here we see that the invariant $P$ only involves first-order derivatives, and hence the FLRW equations will contain derivatives up to second order only. The Friedmann equations (11) and (12) can be rewritten in the standard form as, where $\rho_{cubic} = 6\alpha\tilde{\beta}H^6$ (18). These $\rho_{cubic}$ and $p_{cubic}$ are the contributions of cubic gravity to the energy density and pressure respectively, and these modifications are introduced in the action by the cubic invariant $P$.
The conservation equations of the respective components will be $\dot{\rho}_{cubic} + 3H(\rho_{cubic} + p_{cubic}) = 0$ (20). In the above, we have considered matter to be pressure-less dust along with radiation. Solving eqns. (21) and (22) we get, respectively, $\rho_m = \rho_{m0}(1+z)^3$ and $\rho_{rad} = \rho_{rad0}(1+z)^4$, where $\rho_{m0}$ and $\rho_{rad0}$ represent the present energy densities of matter and radiation respectively. The cosmological redshift is given by $z = \frac{1}{a(t)} - 1$. The effective equation of state will be given by, The effective dark energy equation of state will be given by, Now we define the density parameters as $\Omega_{DE} = \Omega_{cubic} = \frac{\kappa\rho_{cubic}}{3H^2}$, $\Omega_m = \frac{\kappa\rho_m}{3H^2}$ and $\Omega_{rad} = \frac{\kappa\rho_{rad}}{3H^2}$. The deceleration parameter may be defined as, Now we can generalize the action for cubic gravity (4) by including an arbitrary function of the scalar invariant $P$ as given below, where $f(P)$ is the arbitrary function of $P$. Varying the action with respect to the metric we get, where $T_{\mu\nu}$ and $T^{rad}_{\mu\nu}$ are given by eqn. (6), and In the above expression the tensor $\tilde{K}_{\alpha\beta\mu\nu}$ is given in terms of the tensor $K_{\alpha\beta\mu\nu}$ from eqn. (8) as, where the primes denote derivatives with respect to the argument. Considering the FLRW geometry, we get the following two FLRW equations, where Here $\partial_t = \frac{\partial}{\partial t}$ and $\partial^2_t = \frac{\partial^2}{\partial t^2}$. The conservation equation for the modified gravity sector is given by, The conservation equations for the matter and radiation sectors remain the same as those given in eqns. (21) and (22) respectively. In the above equations for $f(P)$ gravity, if we put $f(P) = \alpha P$ we recover the corresponding equations for cubic gravity.
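The quoted solutions $\rho_m \propto (1+z)^3$ and $\rho_{rad} \propto (1+z)^4$ follow directly from the conservation equations; a minimal symbolic check for the pressure-less matter case (a sketch using sympy, with the conservation law rewritten as an ODE in the scale factor $a$):

```python
import sympy as sp

a = sp.symbols('a', positive=True)
rho = sp.Function('rho', positive=True)

# Pressure-less matter: drho/dt + 3*H*rho = 0.  With H = (da/dt)/a this
# is equivalent to a * drho/da + 3*rho = 0, an ODE in the scale factor.
sol = sp.dsolve(sp.Eq(a*rho(a).diff(a) + 3*rho(a), 0), rho(a))

# The solution is rho = C/a^3, i.e. rho_m = rho_m0 (1+z)^3 since a = 1/(1+z).
print(sol)
```

Repeating the exercise with the radiation pressure $p = \rho/3$ gives $\rho \propto a^{-4} = (1+z)^4$, as stated in the text.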

III. OBSERVATIONAL DATA ANALYSIS
Here we perform an observational data analysis on our theoretical models using various observational data sets, adopting different statistical tools and techniques for the data analysis methodology. We concentrate on methods like the $\chi^2$ minimization technique, Markov chain Monte Carlo random sampling, etc. We also verify the validity of our theoretical models using machine learning techniques. From here on we set $\kappa = 1$ everywhere.
Firstly, we build up the theoretical model for cubic gravity. The first FLRW equation of cubic gravity (11) can be put in the form, The present-day dimensionless density parameters can be defined as, Using these definitions in the expression for the Hubble parameter (35) we get eqn. (37). The free parameters appearing in this model are $H_0$, $\Omega_{m0}$, $\Omega_{rad0}$, $\alpha$ and $\tilde{\beta}$. From the recent astronomical data we fix $H_0 = 67.66$ km s$^{-1}$ Mpc$^{-1}$ (Aghanim et al. 2020), $\Omega_{m0} = 0.31$ (Ade et al. 2016) and $\Omega_{rad0} = 8.48 \times 10^{-5}$ (Chavanis 2015). So we are left with two free parameters, and the corresponding parameter space to be constrained is $(\alpha, \tilde{\beta})$. Now we consider the theoretical framework for $f(P)$ gravity. To proceed with the FLRW equations (30) and (31) we need to consider some specific models. Given the complexity of the theory, we consider the simplest power-law model, where $f_0$ and $\gamma$ are constants. Using this in eqn. (30) we get, Now we consider the following relations from cosmography, where $q$ is the deceleration parameter defined in eqn. (25) and $j$ is the jerk parameter. Using these parameters in eqn. (39) we get, We see that it is quite difficult to get an analytical expression for $H(z)$ from this equation because of its highly non-linear nature. Fortunately, for $\gamma = -1/3$ we get an analytic expression for $H(z)$, given below, This will serve as our theoretical tool in this analysis. The free parameters appearing in this model are $H_0$, $\Omega_{m0}$, $\Omega_{rad0}$, $f_0$, $\tilde{\beta}$, $q$ and $j$. Just like cubic gravity, we fix $H_0 = 67.66$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_{m0} = 0.31$ and $\Omega_{rad0} = 8.48 \times 10^{-5}$. In addition to these, we also fix the deceleration parameter $q = -0.503$ (Capozziello, D'Agostino & Luongo 2019; Aghanim et al. 2020). From the combination of three kinematical datasets, namely the gold sample of type Ia supernovae (Riess et al. 2004), the SNIa data from the SNLS project (Astier et al. 2006) and the X-ray galaxy cluster distance measurements (Rapetti et al. 2007), the value of the jerk parameter is estimated as $j = 2.16^{+0.81}_{-0.75}$. So for the jerk parameter we will use the value $j = 2.16$ in this study. In this model, then, we are left with two free parameters and the corresponding parameter space to be constrained is $(f_0, \tilde{\beta})$.
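The cosmographic relations invoked above are not reproduced in the text; we assume they are the standard identities expressing $\dot H$ and $\ddot H$ through the deceleration parameter $q = -a\ddot a/\dot a^2$ and the jerk $j = a^2\dddot a/\dot a^3$. A quick symbolic check of those identities:

```python
import sympy as sp

t = sp.symbols('t')
a = sp.Function('a', positive=True)(t)

H = a.diff(t)/a                                   # Hubble parameter
q = -a*a.diff(t, 2)/a.diff(t)**2                  # deceleration parameter
j = a**2*a.diff(t, 3)/a.diff(t)**3                # jerk parameter

# Standard cosmographic identities: Hdot = -(1+q)*H^2 and
# Hddot = (j + 3q + 2)*H^3, which trade time derivatives of H for (q, j).
assert sp.simplify(H.diff(t) + (1 + q)*H**2) == 0
assert sp.simplify(H.diff(t, 2) - (j + 3*q + 2)*H**3) == 0
print("cosmography identities verified")
```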

A. Analysis with Cosmic Chronometer (CC) Data
In this work we use the $z$–$H(z)$ cosmic chronometer data sets (Jimenez & Loeb 2002; Simon, Verde & Jimenez 2005; Stern et al. 2010; Zhang et al. 2014; Moresco 2015). The complete data table for the CC data can be found in Rudra & Giri (2021) and Ranjit, Rudra & Kundu (2021). The redshift range of the data is $(0, 2)$. Considering that the current redshift is $z \approx 0$ and that the universe evolves from a higher to a lower redshift regime, the span of the data is quite relevant cosmologically. The cosmic chronometers are a very useful set of tools for understanding the gradual evolution of the universe. The data are extracted from observations of passively evolving primordial galaxies, using the technique of differential age evolution. We know that for a universe modelled by the FLRW equations the relation between the Hubble parameter $H$ and the redshift $z$ is $H(z) = -\frac{1}{1+z}\frac{dz}{dt}$. So the time gradient of the redshift $z$ is measured and used in this relation to find the Hubble parameter values. The complete cosmic chronometer data set spans around 10 Gyr of cosmic time.
We would like to perform a data analysis with the 30-point CC data set and constrain the free parameters of the models. To achieve this, we first construct the $\chi^2$ statistic as a sum over standard normal variates, $\chi^2 = \sum_z \frac{\left[H(z) - H_{obs}(z)\right]^2}{\sigma^2(z)}$, where $H(z)$ and $H_{obs}(z)$ are the theoretical and observed values of the Hubble parameter at different redshifts respectively, and $\sigma(z)$ is the standard deviation representing the corresponding measurement error of each data point. Here we consider the present value of the Hubble parameter as $H_0 = 69 \pm 8$ km s$^{-1}$ Mpc$^{-1}$ and we also consider a fixed prior distribution for it. The reduced $\chi^2$ can be given as, where $P(H_0)$ is the prior distribution function for $H_0$.
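A minimal sketch of the $\chi^2$ statistic defined above (the three-point arrays below are hypothetical placeholders, not the actual 30-point CC table):

```python
import numpy as np

def chi2_cc(H_model, H_obs, sigma):
    """chi^2 = sum over data points of (H_model - H_obs)^2 / sigma^2."""
    H_model, H_obs, sigma = map(np.asarray, (H_model, H_obs, sigma))
    return np.sum(((H_model - H_obs) / sigma) ** 2)

# Hypothetical three-point example, not the real CC data:
H_obs   = np.array([69.0, 88.8, 177.0])   # km/s/Mpc
sigma   = np.array([19.6, 36.6, 18.0])
H_model = np.array([70.0, 95.0, 180.0])

print(chi2_cc(H_model, H_obs, sigma))
```

In practice this function is evaluated on a grid (or along an MCMC chain) over $(\alpha, \tilde{\beta})$ or $(f_0, \tilde{\beta})$ and minimized.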

B. Joint analysis with CC+BAO
In the work of Eisenstein et al. (2005) we find an elegant method of combining cosmological data with the baryon acoustic oscillation (BAO) peak parameter to refine the constraints on model parameters. The BAO signal was first detected in the Sloan Digital Sky Survey (SDSS) at a scale of around 100 Mpc. Another success of the SDSS survey was that it confirmed the observations and results obtained from the Wilkinson Microwave Anisotropy Probe (WMAP). During the SDSS survey, spectroscopic samples were retrieved from thousands of red galaxies spanning the sky over a diameter of around five billion light years. The BAO peak parameter is defined as (Thakur, Ghose & Paul 2009; Paul, Thakur & Ghose 2010; Paul, Ghose & Thakur 2011; Ghose, Thakur & Paul 2012): In this expression $E(z) = H(z)/H_0$ is the normalized Hubble parameter. From the SDSS survey it is well known that $z_1 = 0.35$ is the prototypical value of redshift, which we will use in our analysis. For a flat universe the SDSS data estimate the value of the peak parameter to be $A = 0.469 \pm 0.017$ (Eisenstein et al. 2005). The $\chi^2$ function for the BAO measurement can be given as, The cumulative analysis with cosmic chronometers and the BAO peak parameter (CC+BAO) for the $\chi^2$ function can be defined as (Wu & Yu 2007; Thakur, Ghose & Paul 2009; Paul, Thakur & Ghose 2010; Paul, Ghose & Thakur 2011; Ghose, Thakur & Paul 2012) $\chi^2_{CC+BAO} = \chi^2_{CC} + \chi^2_{BAO}$. This cumulative $\chi^2$ will be used in our analysis to refine the results obtained for the CC data with the additional restrictions of the BAO peak parameter.
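The defining expression for the peak parameter is not reproduced above; assuming the standard Eisenstein et al. (2005) form $A = \sqrt{\Omega_{m0}}\, E(z_1)^{-1/3} \left[\frac{1}{z_1}\int_0^{z_1} \frac{dz}{E(z)}\right]^{2/3}$, the BAO term can be sketched as follows (the $E(z)$ used here is an illustrative flat-ΛCDM stand-in, not the model expression (37)):

```python
import numpy as np
from scipy.integrate import quad

Om0, z1 = 0.31, 0.35
A_obs, sigma_A = 0.469, 0.017   # SDSS estimate quoted in the text

def E(z):
    """Illustrative normalized Hubble parameter (flat LCDM stand-in)."""
    return np.sqrt(Om0*(1 + z)**3 + 1 - Om0)

def bao_A(E):
    """BAO peak parameter for a given normalized Hubble function E(z)."""
    integral, _ = quad(lambda z: 1.0/E(z), 0.0, z1)
    return np.sqrt(Om0) * E(z1)**(-1.0/3.0) * (integral/z1)**(2.0/3.0)

def chi2_bao(E):
    return ((bao_A(E) - A_obs)/sigma_A)**2

print(bao_A(E), chi2_bao(E))
```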

C. Joint analysis with CC+BAO+CMB
With the discovery of the late-time cosmic acceleration, general relativity became inconsistent at cosmological scales. To explain the phenomenon the scientific community has resorted to the concept of an exotic fluid with negative pressure, known as dark energy. Cosmic microwave background (CMB) observations are the most interesting probe providing observational support to the theoretical framework of dark energy via the power spectrum. One disadvantage of the CMB parameter is that it is not sensitive to the background perturbations. However, the parameter is perfectly suitable for constraining cosmological models using its peak. The first peak of the CMB power spectrum is characterized by a shift parameter given by, where $z_2$ is the redshift of the last scattering surface. From the Wilkinson Microwave Anisotropy Probe (WMAP) 7-year data, available in the work of Komatsu et al. (2011), the value of the shift parameter has been obtained as $R = 1.726 \pm 0.018$ at the redshift $z_2 = 1091.3$. The $\chi^2$ function for the CMB measurement can be given in terms of the CMB shift parameter by the relation $\chi^2_{CMB} = \frac{(R - 1.726)^2}{(0.018)^2}$. Now, constraining the CC data with both the BAO and CMB peaks together, we can perform the joint data analysis for CC+BAO+CMB. The total $\chi^2$ function for this case can be given as $\chi^2_{total} = \chi^2_{CC} + \chi^2_{BAO} + \chi^2_{CMB}$. Using this we can refine the results obtained for the CC+BAO case and put better constraints on the free parameters of the models. Moreover, since the CMB observations probe dark energy, they serve as excellent tools to constrain modified gravity theories when combined with other data sets.
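Assuming the usual form of the shift parameter, $R = \sqrt{\Omega_{m0}}\int_0^{z_2} dz/E(z)$ with $z_2 = 1091.3$, the CMB term and the joint statistic can be sketched as follows (again with an illustrative $E(z)$, not the model expressions (37) or (42)):

```python
import numpy as np
from scipy.integrate import quad

Om0, Or0, z2 = 0.31, 8.48e-5, 1091.3
R_obs, sigma_R = 1.726, 0.018   # WMAP 7-year value quoted in the text

def E(z):
    """Illustrative stand-in; radiation matters at the high z2 of the CMB."""
    return np.sqrt(Om0*(1+z)**3 + Or0*(1+z)**4 + 1 - Om0 - Or0)

def cmb_shift(E):
    """CMB shift parameter R for a given normalized Hubble function E(z)."""
    integral, _ = quad(lambda z: 1.0/E(z), 0.0, z2, limit=200)
    return np.sqrt(Om0) * integral

def chi2_cmb(E):
    return ((cmb_shift(E) - R_obs)/sigma_R)**2

# Joint statistic: chi2_total = chi2_CC + chi2_BAO + chi2_CMB, which is
# then minimized over (alpha, beta_tilde) or (f0, beta_tilde).
print(cmb_shift(E), chi2_cmb(E))
```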

IV. STATISTICAL ANALYSIS AND MACHINE LEARNING
In this section we briefly describe the statistical methods and the supervised machine learning techniques used in this work. Three different methods will be employed, namely analysis with correlation coefficients, linear regression analysis and an artificial neural network. We discuss the methods below:

A. Correlation Coefficients
Correlation coefficients (hereafter $R$) (Heumann, Schomaker & Shalabh 2016) are used to measure the strength of the linear relationship between two variables, i.e. for a bivariate sample. In this work there are two data sets $z$ and $H$ of length $N$, where $N = 30$: $z = (z_1, z_2, z_3, \ldots, z_N)$ and $H = (H_1, H_2, H_3, \ldots, H_N)$. Then $R$ is given by, The possible range of values for the correlation coefficient is $-1.0$ to $1.0$. If the correlation coefficient is greater than zero, there is a positive relationship; conversely, if the value is less than zero, there is a negative relationship. A value of zero indicates no linear relationship between the two variables. If $R$ is close to $1$ or $-1$ there is a strong positive or negative correlation respectively.
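The coefficient described above is the usual Pearson $R$; a minimal implementation over the paired $(z_i, H_i)$ samples (the arrays are hypothetical, for illustration only):

```python
import numpy as np

def pearson_r(z, H):
    """Pearson correlation coefficient between two samples of equal length."""
    z, H = np.asarray(z, float), np.asarray(H, float)
    dz, dH = z - z.mean(), H - H.mean()
    return np.sum(dz*dH) / np.sqrt(np.sum(dz**2) * np.sum(dH**2))

z = np.array([0.1, 0.4, 0.9, 1.3, 1.75])
H = np.array([69.0, 83.0, 117.0, 160.0, 202.0])   # hypothetical H values
print(pearson_r(z, H))
```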

B. Linear Regression (LR)
Regression analysis is a common but powerful supervised machine learning technique which helps predict trends and future values from existing data. Among regression models, simple linear regression (LR) (Sen & Srivastava 1990) is one of the most basic and common predictive models. LR is used to model the relationship between the independent variable(s) (the input(s)) and the dependent variable (the output), assuming a linear relationship between them. If there is a single input variable, the model is referred to as LR, while with multiple input variables it is termed multiple linear regression (MLR). In both LR and MLR the output is a single variable, and the regression residuals are assumed to be normally distributed. In this work we have only one input parameter, viz. $z$, and one output parameter, $H_{LR}$, so we use LR here. If $\bar{z}$ and $\bar{H}$ denote the sample means, then the estimated value of $H_{LR}$ is obtained from the LR straight line $H_{LR} = a + bz$, where the slope of the straight line is $b = \sum_{i=1}^{N}(z_i - \bar{z})(H_i - \bar{H}) \big/ \sum_{i=1}^{N}(z_i - \bar{z})^2$, and the intercept of the LR line is $a = \bar{H} - b\bar{z}$.
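The least-squares slope and intercept can be computed directly; a minimal sketch (equivalent to what `np.polyfit` or scikit-learn's `LinearRegression` would return, on hypothetical data):

```python
import numpy as np

def fit_lr(z, H):
    """Ordinary least squares for H = a + b*z; returns (intercept a, slope b)."""
    z, H = np.asarray(z, float), np.asarray(H, float)
    dz = z - z.mean()
    b = np.sum(dz*(H - H.mean())) / np.sum(dz**2)   # slope
    a = H.mean() - b*z.mean()                       # intercept
    return a, b

z = np.array([0.1, 0.4, 0.9, 1.3, 1.75])
H = np.array([69.0, 83.0, 117.0, 160.0, 202.0])   # hypothetical H values
a, b = fit_lr(z, H)
H_LR = a + b*z      # LR-estimated values of H at the sample redshifts
```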

C. Artificial Neural Network (ANN)
An artificial neural network is a computational model that mimics the way nerve cells work in the human brain (MacGregor & Lewis 1977). The human brain consists of many neurons, each connected to its neighbours. In the brain, each neuron passes the input signal to the next, as well as the information to be computed for the output. Similarly, an ANN passes the input signal from one artificial neuron to another, creating a network of artificial neurons for computation. A simple ANN structure consists of one or more inputs and a single output. Let the net input to a processing unit be $H_{in}$. Then the net output $H_{out}$ is a function of $H_{in}$; for a simple ANN the net output is taken to be a binary step function, i.e. $H_{out} = 1$ if $H_{in}$ exceeds a threshold and $H_{out} = 0$ otherwise.
All the neurons in an ANN build the layers of the network by interconnecting among themselves. These interconnections may or may not be fully connected. According to this layered architecture, ANNs can be classified into various types, viz. single-layer feed-forward, multi-layer feed-forward, competitive and recurrent networks. All of these networks may have one or more hidden layers in addition to the input and output layers. An output is generated from an output processing unit only when a special function satisfies the required criteria for the given input variables; this special function, which maps the net input value to the output value, is known as the activation function of that output unit. An ANN uses learning algorithms that can independently make adjustments, or learn in a sense, as they receive new input. This makes ANNs a very effective tool for non-linear statistical data modeling.
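A minimal sketch of the two flavours discussed here, assuming scikit-learn is available: a binary step output unit for the simple ANN, and a small multi-layer perceptron regressor of the kind used later for $H(z)$ estimation (the layer sizes and the $H(z)$ curve below are illustrative choices, not the ones used in the paper):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def step_output(H_in, theta=0.0):
    """Binary step activation: the simple ANN's output unit."""
    return 1 if H_in >= theta else 0

# Multi-layer perceptron regressor for H(z); architecture is illustrative.
z = np.linspace(0.07, 1.97, 30).reshape(-1, 1)
H = 67.66*np.sqrt(0.31*(1 + z.ravel())**3 + 0.69)   # stand-in H(z) curve

ann = MLPRegressor(hidden_layer_sizes=(16, 16), solver='lbfgs',
                   max_iter=5000, random_state=1)
ann.fit(z, H)
H_ann = ann.predict(z)
```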

V. RESULTS
In connection with this work, we have developed a complete Python package in which, given any theoretical model for $H(z)$, we can find the best-fit auxiliary parameters and the bounds on the free parameters, plot the parameter space, and perform the statistical analysis and the machine-learning estimation of $H$. Although we developed almost all of our own code, in order to find the bounds on the free parameters at different confidence intervals we took the help of the publicly available CosmoMC code (Lewis, Challinor & Lasenby 2000; Lewis & Bridle 2002). We found various interesting results, which are presented in the following subsections.

A. Auxiliary Parameter Analysis
It is relevant to mention here that these models have background theoretical motivations, and hence the results for these models may be useful for referencing other results. After fixing the known parameters, we are left with only two free parameters, viz. $(\alpha, \tilde{\beta})$ for cubic gravity and $(f_0, \tilde{\beta})$ for $f(P)$ gravity. So computationally this is a relatively convenient scenario. Using our code we have constrained the theoretical models with the data and put bounds on the free parameters. The results are given in tables I and II. Figs. 1 and 2 show the 2D confidence contours and the distributions followed by the free parameters of cubic and $f(P)$ gravity respectively. From the distribution curves we see that for almost all parameters we get a Gaussian-like distribution (not perfectly Gaussian) for every data set. For cubic gravity, in Fig. 1 we see that the distribution curves of $\alpha$ for the CC and CC+BAO datasets have a flat top centred at around $-4$ and almost coincide with each other, but for CC+BAO+CMB the distribution is considerably skewed to the left in comparison with the others. For the distributions of $\tilde{\beta}$ the centre shifts to $6$, and the curve for CC+BAO+CMB is skewed to the right in comparison with the other curves. In Fig. 2 we have the results for $f(P)$ gravity, where it is observed that the distribution curves of the parameters $f_0$ and $\tilde{\beta}$ are almost coincident for all three datasets. The impact of the BAO and CMB peak
FIG. 1: 1D distributions and 2D joint likelihood contours of the free parameters $(\alpha, \tilde{\beta})$ of the cubic gravity model. The deeper shades show the 68% confidence intervals and the lighter shades represent the 95% confidence intervals for the parameters.

B. Correlation Analysis
We now investigate the correlation between the observed values of $H$, viz. $H_{obs}$, and the theoretical values $H_{theo}$ for the CC data with the help of the correlation coefficient $R$ (48). From Table I it is evident that the best-fit values of the auxiliary parameters $\alpha$ and $\tilde{\beta}$ are $-1.3 \times 10^{-6}$ and $2.9 \times 10^{-7}$ respectively for the cubic gravity model, while $f_0$ and $\tilde{\beta}$ are $-3.769 \times 10^{-4}$ and $0.2385$ respectively for $f(P)$ gravity. Next, for both models, with these specific values of $\alpha$, $f_0$ and $\tilde{\beta}$, we calculated the respective values of $H_{theo}(z)$ using eqns. (37) and (42). In table III the calculated values of $H_{theo}$ for the cubic gravity model are given in the third column, while the third column of table IV lists the corresponding values obtained from the $f(P)$ model. In both tables the respective observed values are kept in the second column for a meaningful comparison. Using (48), we calculate $R = 0.9507$ and $0.9498$ for the cubic and $f(P)$ models respectively. Both values of $R$ clearly indicate a strong positive correlation between $H_{obs}$ and our derived $H_{theo}$ for both the cubic and $f(P)$ theories. In Fig. 3, using the $H_{obs}$ and $H_{theo}$ values, we compare our theoretical results for both theories with the observations in the context of $R$. The top panel of Fig. 3 shows the $H$–$z$ variation for the cubic theory, while the bottom panel shows the same for the $f(P)$ theory. In both cases the best-fit parameters obtained from table I are used to show the $z$-variation of both the theoretical and the observed $H$. It is interesting to mention that for the cubic gravity model we made three important observations: (a) $\alpha$ and $\tilde{\beta}$ must be of opposite signs for the best-fit values; (b) both best-fit parameters tend to zero in the context of providing the minimum $\chi^2$; (c) the $H_{obs}$ and $H_{theo}$ data sets are in good agreement with each other.
In contrast, for the $f(P)$ theory we found that (a) $f_0$ and $\tilde{\beta}$ must be of opposite signs for the best-fit values, and (b) the $H_{obs}$ and $H_{theo}$ data sets are in good agreement with each other. From this it is evident that our fitting analysis has a good degree of precision and the constraints on the parameters are excellent.

C. Machine Learning Analysis
There are several supervised machine learning techniques, a few of which are linear regression (LR), artificial neural networks (ANN), support vector machines (SVM), random forests, etc. As mentioned earlier, we have used two such algorithms, viz. LR and ANN. It is relevant to mention here that the standard observational likelihood analysis (SOLA) is used to obtain more robust parameter estimates, but it cannot forecast or predict $H(z)$ from its known values. Hence the purposes of SOLA and ML are quite different, although both are statistical methods. By introducing machine learning techniques we complement our standard observational likelihood analysis with a more refined form of study. It is known that ML methods are designed to produce more accurate predictions than standard statistical procedures. So the motivation behind introducing ML in our study is simply to get better predictions from our theoretical models and the observational data. Moreover, we can also compare the results with those obtained from the standard likelihood analysis and get an idea of the deviation between the two procedures. In this part of our work we have focused on the theoretical values of $H$ (third column of both tables III and IV) to investigate the role of machine learning, i.e. to let the computer learn how the function $H(z)$ depends on the redshift $z$. It is well known that for any supervised learning technique the input data set is segregated into two subsets: in general, the larger subset, containing more than 50% of the data, is termed the training data, while the rest is known as the testing data. Here we have used $z$ as the input data and $H(z)$ (obtained from our models) as the target output.
We used 67% of this data (randomly chosen) for training in both the LR and ANN cases, while the remaining 33% was used for testing. For validation, we compared the machine-learning-predicted $H(z)$ with both the original $H(z)$ (from the 33% test data) and the observed $H(z)$. Now, if we include a new set of $z$ values, we do not need to run our theoretical model again; we can simply use these machine learning techniques to predict the new $H(z)$ corresponding to the new $z$. In the present context, we let the machine randomly select the data from 20 rows of the second and third columns of tables III and IV, and estimate the data in the remaining 10 rows after learning the evolution of $H(z)$. Below we discuss the results obtained from the LR and ANN methods and compare them with the observational data sets and the theoretical models.
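The 67%/33% split described above can be reproduced with scikit-learn's `train_test_split`; on 30 points this yields the 20 training and 10 testing rows mentioned in the text (the $H(z)$ values below are an illustrative stand-in, not the actual table entries):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

z = np.linspace(0.07, 1.97, 30).reshape(-1, 1)
H = 67.66*np.sqrt(0.31*(1 + z.ravel())**3 + 0.69)   # stand-in for H_theo

# Random 67/33 split: 20 rows for training, 10 rows held out for testing.
z_tr, z_te, H_tr, H_te = train_test_split(z, H, train_size=0.67,
                                          random_state=42)

lr = LinearRegression().fit(z_tr, H_tr)
H_pred = lr.predict(z_te)      # ML-estimated H(z) on the held-out rows
```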

LR Analysis
The LR-estimated values of $H$ (denoted $H_{\rm LR}$) are shown in column 4 of Table III (for the cubic gravity model) and Table IV (for the $f(P)$ gravity model). The upper panels of Fig. 4 and Fig. 6 show that the correlation coefficient $R$ between the predicted $H_{\rm LR}$ and our theoretical values $H_{\rm theo}$ is 0.9858 for the cubic model and 0.9838 for the $f(P)$ model. Since both values of $R$ are very close to 1, and also from the plots (upper panels of Figs. 4 and 6), we can conclude that the LR prediction is significantly good for the theoretical values obtained from our derived expressions in Eqns. (37) and (42).
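The LR fit and the correlation coefficient $R$ can be reproduced in a few lines. As a hedged sketch, a toy flat-$\Lambda$CDM-like $H(z)$ stands in for the theoretical values of Eqns. (37) and (42); the data below are illustrative and are not the Table III values:

```python
import numpy as np

# toy stand-in for the (z, H_theo) columns; NOT the paper's tabulated values
z = np.linspace(0.1, 2.0, 20)
H_theo = 67.0 * np.sqrt(0.3 * (1.0 + z)**3 + 0.7)

# ordinary least-squares fit H ~ a*z + b
a, b = np.polyfit(z, H_theo, 1)
H_LR = a * z + b

# correlation coefficient R between the LR-predicted and theoretical values
R = np.corrcoef(H_LR, H_theo)[0, 1]
```

Because $H(z)$ is monotonic and only mildly curved over this range, even a purely linear model yields $R$ close to 1, mirroring the behaviour reported above.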
In order to compare $H_{\rm obs}$ with $H_{\rm theo}$ and $H_{\rm LR}$, we provide the ratios $H_{\rm theo}/H_{\rm obs}$ and $H_{\rm LR}/H_{\rm obs}$ in columns 6 and 7 of Table III (for cubic gravity) and of Table IV (for $f(P)$ gravity). We have also calculated the deviation parameters $\delta_{\rm theo}$ and $\delta_{\rm LR}$ (Kangal, Salti & Aydogdu 2019), and we present these two deviations in columns 9 and 10 of the respective tables for the cubic and $f(P)$ gravity models. In the upper panels of Figs. 5 and 7 we plot the deviations $\delta_{\rm theo}$ and $\delta_{\rm LR}$ against $z$ for the respective models. One can see from these figures that the LR deviations for both models roughly merge with the deviations of the theoretical values. Hence, we can say that the theoretical frameworks of both gravity models are well-built and suitable for making interesting cosmological predictions.
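The deviation parameters can be computed directly from the tabulated columns. The exact definition used by Kangal, Salti & Aydogdu (2019) is not reproduced here, so the sketch below assumes the common percentage-deviation form $\delta = 100\,(H_{\rm model}-H_{\rm obs})/H_{\rm obs}$; the input numbers are illustrative, not values from the tables:

```python
import numpy as np

def deviation(H_model, H_obs):
    """Percentage deviation of model values from observations.
    Assumed form: delta = 100 * (H_model - H_obs) / H_obs; the paper's
    exact definition follows Kangal, Salti & Aydogdu (2019)."""
    H_model = np.asarray(H_model, dtype=float)
    H_obs = np.asarray(H_obs, dtype=float)
    return 100.0 * (H_model - H_obs) / H_obs

# illustrative values only (not taken from Tables III/IV)
delta_theo = deviation([70.0, 110.0], [69.0, 107.0])
mean_abs_dev = np.mean(np.abs(delta_theo))
```

The mean of $|\delta|$ over all rows then gives a single summary figure per model, which is how the deviations of theory, LR, and ANN can be ranked against each other.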

ANN Analysis
We have used a multi-layer perceptron ANN to predict the values of $H$, denoted here as $H_{\rm ANN}$. The estimated values $H_{\rm ANN}$ are given in column 5 of Tables III and IV for the respective theories of gravity. The lower panels of Figs. 4 and 6 show that the correlation coefficients $R$ between the predicted $H_{\rm ANN}$ and our theoretical values $H_{\rm theo}$ are 0.9936 (for the cubic model) and 0.9926 (for the $f(P)$ model), respectively. So the ANN gives an even better estimation of $H$ than the LR in both cases. We also calculate the ratios $H_{\rm ANN}/H_{\rm obs}$ and the corresponding deviation parameter $\delta_{\rm ANN}$; for both of our models, $H_{\rm ANN}/H_{\rm obs}$ and $\delta_{\rm ANN}$ are provided in columns 8 and 11 of Tables III and IV, respectively. In the lower panels of Figs. 5 and 7 we plot the deviations $\delta_{\rm theo}$ and $\delta_{\rm ANN}$ against $z$ for the respective models. It is interesting to see that for both the cubic and the $f(P)$ models, the mean deviations of the ANN from the theory are even smaller than those of the LR. Thus, both the ANN and LR models perform significantly well, with the ANN being the better of the two.
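The ANN step can be sketched in a self-contained way. As an assumption-laden illustration, a one-hidden-layer tanh perceptron trained by plain full-batch gradient descent stands in for the paper's multi-layer perceptron, and a toy flat-$\Lambda$CDM-like $H(z)$ replaces the tabulated theoretical values; all hyperparameters and numbers below are illustrative:

```python
import numpy as np

def train_mlp(z, H, hidden=16, lr=0.05, epochs=4000, seed=0):
    """Train a one-hidden-layer tanh perceptron on standardized (z, H)."""
    rng = np.random.default_rng(seed)
    zm, zs, Hm, Hs = z.mean(), z.std(), H.mean(), H.std()
    X = ((z - zm) / zs).reshape(-1, 1)       # standardized inputs
    Y = ((H - Hm) / Hs).reshape(-1, 1)       # standardized targets
    W1 = rng.normal(0.0, 0.5, (1, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1)); b2 = np.zeros(1)
    n = len(X)
    for _ in range(epochs):
        A = np.tanh(X @ W1 + b1)             # hidden activations
        E = (A @ W2 + b2) - Y                # output residuals
        gW2 = A.T @ E / n; gb2 = E.mean(0)   # mean-squared-error gradients
        dA = (E @ W2.T) * (1.0 - A**2)       # backprop through tanh
        gW1 = X.T @ dA / n; gb1 = dA.mean(0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    def predict(z_new):
        Xn = ((np.asarray(z_new) - zm) / zs).reshape(-1, 1)
        return ((np.tanh(Xn @ W1 + b1) @ W2 + b2).ravel() * Hs) + Hm
    return predict

# toy stand-in for the z and theoretical H(z) columns of Tables III/IV
z = np.linspace(0.1, 2.0, 30)
H = 67.0 * np.sqrt(0.3 * (1.0 + z)**3 + 0.7)

idx = np.random.default_rng(1).permutation(30)
tr, te = idx[:20], idx[20:]                  # ~67% / 33% split as in the text
predict = train_mlp(z[tr], H[tr])
H_ANN = predict(z[te])                       # prediction on the held-out rows
R_ANN = np.corrcoef(H_ANN, H[te])[0, 1]
```

Because the hidden layer introduces non-linear basis functions of $z$, the network can follow the curvature of $H(z)$ that a straight-line LR fit must average over, which is why the ANN's correlation with theory comes out higher.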

VI. CONCLUSION
In this work we have performed an observational data analysis on Einsteinian cubic gravity and $f(P)$ gravity. Both cubic and $f(P)$ gravity theories are higher-order curvature theories that add cubic corrections in the Riemann tensor to the Einstein-Hilbert action. Motivated by their success as consistent theories of cosmology, we undertook the project of constraining their parameter space in this work. From the cosmological equations of the theories, the theoretical framework was built by representing the Hubble parameter $H$ as a function of the redshift $z$. The 30-point $z-H(z)$ cosmic chronometer data was used for the analysis. Combining the data with the additional constraints of the BAO and CMB peak parameters, further refinement was achieved in the probe. The Markov chain Monte Carlo method was used in the fitting analysis. Using our own Python code and the publicly available CosmoMC code, we found the best fit values of the free parameters of the models and also put bounds on them. 1D and 2D likelihood contours were generated for the free parameters, representing their 68% and 95% confidence intervals. It was seen that the posterior distributions followed by the parameters are Gaussian-like, but quite skewed from a perfect normal distribution.
We went on to complement the fitting analysis with further statistical analysis and machine learning methods. We used correlation coefficients to compare the theoretical framework obtained from the fitting analysis with the observational data. For both cubic and $f(P)$ gravity it was found that the correlation coefficient was quite close to +1, showing strong positive correlation between theory and data. Two machine learning models, namely the linear regression and the artificial neural network, were employed for the estimation of $H(z)$. We compared these estimates with the theory and the observations, and it was found that the estimated models performed significantly well. Both of our models and the respective machine learning estimates predicted a behavior of $H(z)$ closely matching the observations. Further, the comparison of the two machine learning models indicates that the ANN performs slightly better than the LR. LR is a method dealing with linear dependencies, whereas an ANN can handle non-linearities; so if the data contain nonlinear dependencies, the ANN should outperform regression. Indeed, the ANN gives better forecasts than LR here, which strongly suggests that the relation between $z$ and $H(z)$ is highly non-linear, as expected. Therefore, the studied models can be used to fill the gaps in observational data sets that exist due to technological challenges and instrumentation constraints. This supports the accuracy of our fitting analysis and the consistency of the theories with the observations. This work is a significant theoretical development of the Einsteinian cubic and $f(P)$ gravity theories as far as their cosmological implications are concerned. It will be really interesting to check our results in the light of results coming from larger data sets like Planck, Pantheon, etc., where the primary data sets are not $z-H(z)$ points but involve measurements of other cosmological parameters.
This may be an interesting future project for cubic and f (P ) gravity as far as observational constraints on the theoretical parameters are concerned.