Distinguishing ΛCDM from Evolving Dark Energy with Om Two-point Statistics: Implications from the Space-borne Gravitational-wave Detector

The $Omh^2(z_i, z_j)$ two-point diagnostic was proposed as a litmus test of the ΛCDM model, and measurements of the cosmic expansion rate H(z) have been extensively used to perform this test. The results obtained so far suggest a tension between observations and the predictions of the ΛCDM model. However, the data set of direct H(z) measurements from cosmic chronometers and baryon acoustic oscillations is quite limited. This motivated us to study the performance of this test on a larger sample obtained in an alternative way. In this paper, we propose that gravitational-wave (GW) standard sirens could provide large samples of H(z) measurements in the redshift range 0 < z < 5, based on measurements of the dipole anisotropy of the luminosity distance arising from the matter inhomogeneities of the large-scale structure and the local motion of the observer. We discuss the effectiveness of our method in the context of the space-borne DECi-hertz Interferometer Gravitational-wave Observatory (DECIGO), based on a comprehensive simulated H(z) data set from binary neutron star merger systems. Our results indicate that in the GW domain, the $Omh^2(z_i, z_j)$ two-point diagnostic could effectively test whether ΛCDM is the best description of our Universe. We also discuss the potential of our methodology for detecting possible evidence of dark energy evolution, focusing on its performance for constant and redshift-dependent dark energy equations of state.


Introduction
One of the most important challenges in modern cosmology is to understand the fundamental nature of the concordance cosmological model, which rests on three pillars: inflation (flat universe), dark matter (neutral, collisionless particles), and dark energy (cosmological constant) (Guth 1981; Turner et al. 1984; Lyth & Riotto 1999; Peebles & Ratra 2003; Bertone et al. 2005; Turner 2018). Since the 1980s, the concordance model has withstood many rigorous tests based on different astronomical observations, such as Type Ia supernovae (SNe Ia; Riess et al. 1998; Perlmutter et al. 1999), the cosmic microwave background (CMB), baryon acoustic oscillations (BAO; Spergel et al. 2003; Eisenstein et al. 2005), and strong gravitational lensing (Cao et al. 2012a, 2012b, 2015). However, this harmony has recently been disrupted by the increasing precision of the available measurements. Currently, the most hotly debated issue is the "$H_0$ tension": the value of the Hubble constant inferred from the CMB under the ΛCDM assumption ($H_0 = 67 \pm 0.5$ km s$^{-1}$ Mpc$^{-1}$) differs by 4σ-6σ from the direct local measurement based on supernovae ($H_0 = 73 \pm 1.7$ km s$^{-1}$ Mpc$^{-1}$) (Riess et al. 2019; Verde et al. 2019). At the same time, another tension has emerged regarding the cosmic curvature $\Omega_k$. The CMB anisotropies analyzed alone tend to support a closed universe, which differs by 3σ from the results obtained jointly with BAO data (Di Valentino et al. 2020; Handley 2021). In addition, there is a discrepancy regarding the matter density parameter $\Omega_m$, which is noticeably smaller in recent cosmic shear surveys than the value obtained from Planck data in the framework of ΛCDM (Asgari et al. 2020). It is also worth noting that a 4σ difference exists between ΛCDM and the high-redshift Hubble diagram of SNe Ia, quasars, and gamma-ray bursts (Lusso et al. 2019; Risaliti & Lusso 2019). Although systematic uncertainties might play an important role in these tensions, they may also indicate that the ΛCDM model is not valid when confronted with more accurate data. As a result, it is natural and necessary to reexamine the performance of ΛCDM, because its failure could lead to new discoveries. This has motivated many researchers to test and discuss the foundations of the standard cosmological model (Yang et al. 2020; Di Valentino et al. 2021; Koksbang 2021; Vagnozzi 2021). Among such tests, the "litmus test" for assessing the validity of the ΛCDM model is particularly prominent (Zunckel & Clarkson 2008).
The two-point diagnostic $Omh^2(z_1, z_2)$, developed by Sahni et al. (2008) and extensively employed by Shafieloo et al. (2012) and Sahni et al. (2014), is an attractive technique for testing the ΛCDM model using measurements of the Hubble parameter H(z) derived from observations. It is more convenient than the original one-point diagnostic Om(z) and alleviates some of its issues; the one-point diagnostic has also been widely implemented with H(z) measurements obtained from cosmic chronometers and baryon acoustic oscillations (Planck Collaboration 2014; Ding et al. 2015; Zheng et al. 2016; Cao et al. 2018; Qi et al. 2018). However, the number of measured H(z) values is still too low to guarantee sufficient statistical power. Moreover, the redshift coverage is limited to z ∼ 2. Reconstruction techniques, such as Gaussian processes, are a common approach to alleviating these drawbacks. It should be noted, however, that the reconstruction may introduce its own systematic errors, which will naturally affect the reliability of the results. Fortunately, the detection of gravitational-wave (GW) signals has opened a new window on the Universe and provides more possibilities (Schutz 1986; LIGO Scientific Collaboration & Virgo Collaboration 2017). Methods of directly measuring the expansion rate H(z) using GWs have also been proposed and discussed (Sasaki 1987; Seto et al. 2001; Bonvin et al. 2006a, 2006b; Nishizawa et al. 2011). This is exciting because it offers the chance to obtain a larger sample of H(z) across a wider redshift range with which to test the consistency of ΛCDM using the $Omh^2(z_1, z_2)$ two-point diagnostic.
In this paper, we focus on this method of testing the consistency of ΛCDM (Sahni et al. 2014), using the Hubble parameter H(z) from a simulated sample of data attainable with the GW detector DECi-hertz Interferometer Gravitational-wave Observatory (DECIGO) up to z ∼ 5. Aiming at the verification of ΛCDM in both the electromagnetic (EM) and GW domains, we also consider the largest current sample of 32 model-independent measurements of the Hubble parameter H(z). Moreover, we investigate the performance of the $Omh^2(z_1, z_2)$ two-point diagnostic under the ωCDM and Chevallier-Polarski-Linder (CPL) models. In Section 2, we briefly describe the $Omh^2(z_1, z_2)$ two-point diagnostic method and the H(z) samples in the GW and EM domains. In Section 3, the results and discussion are presented. The conclusions are summarized in Section 4.

Methodology and Data
Since its inception, the $Omh^2(z_1, z_2)$ diagnostic has developed into a popular method that extends the repository of tools and techniques for model testing and comparison. By definition, the $Omh^2(z_i, z_j)$ two-point diagnostic can be written as

$$Omh^2(z_i, z_j) = \frac{h^2(z_i) - h^2(z_j)}{(1+z_i)^3 - (1+z_j)^3}, \tag{1}$$

where $h(z) \equiv H(z)/100$ km s$^{-1}$ Mpc$^{-1}$ (Sahni et al. 2014). This means that once we have measurements of the Hubble parameter H(z) at two or more redshifts directly from observations, we can obtain the $Omh^2(z_i, z_j)$ values and compare them with the theoretical values derived from concrete cosmological models (Liu et al. 2023).
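As a minimal sketch of how the two-point diagnostic is evaluated in practice, the snippet below computes $Omh^2(z_i, z_j) = [h^2(z_i) - h^2(z_j)]/[(1+z_i)^3 - (1+z_j)^3]$ for every pair of H(z) points. The function names, the mock redshifts, and the fiducial values `H0 = 67.4` and `OM0 = 0.315` are illustrative assumptions, not quantities taken from the analysis.

```python
import itertools
import numpy as np

def omh2_pair(z_i, H_i, z_j, H_j):
    """Two-point diagnostic for a single pair of H(z) values in km/s/Mpc."""
    h_i, h_j = H_i / 100.0, H_j / 100.0
    return (h_i**2 - h_j**2) / ((1.0 + z_i)**3 - (1.0 + z_j)**3)

def omh2_all_pairs(z, H):
    """Evaluate the diagnostic on every unordered pair (i < j).

    Returns the redshift separations |z_i - z_j| and the diagnostic values.
    """
    z = np.asarray(z, dtype=float)
    H = np.asarray(H, dtype=float)
    pairs = list(itertools.combinations(range(len(z)), 2))
    dz = np.array([abs(z[i] - z[j]) for i, j in pairs])
    vals = np.array([omh2_pair(z[i], H[i], z[j], H[j]) for i, j in pairs])
    return dz, vals

# Hypothetical noise-free flat LCDM mock (H0 and OM0 are illustrative assumptions).
H0, OM0 = 67.4, 0.315
z_mock = np.array([0.1, 0.5, 1.0, 2.0, 4.0])
H_mock = H0 * np.sqrt(OM0 * (1.0 + z_mock)**3 + 1.0 - OM0)

dz, vals = omh2_all_pairs(z_mock, H_mock)
# For noise-free LCDM input every pair returns the same number, OM0 * (H0/100)^2.
print(len(vals), vals.min(), vals.max())
```

With noisy data the pairwise values scatter around this constant, which is why a statistical summary over all pairs is needed.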
In the simplest case of the ΛCDM model, the theoretical expression of Equation (1) can be written as

$$Omh^2(z_i, z_j) = \Omega_{m,0} h^2, \tag{2}$$

where $h \equiv H_0/100$ km s$^{-1}$ Mpc$^{-1}$. A significant advantage of the $Omh^2(z_i, z_j)$ two-point diagnostic for ΛCDM is that once some other probe tells us the value of $\Omega_{m,0} h^2$, we know the value of $Omh^2(z_i, z_j)$. As a result, any departure of $Omh^2(z_i, z_j)$ from the above value may indicate that dark energy is not Λ. In other words, we can use this deviation to test the ΛCDM model and distinguish it from other dark energy models. For other cosmological models, the two-point diagnostic is no longer a single number, but its theoretical counterpart can still be formulated. The simplest model different from ΛCDM is the ωCDM model, in which dark energy has a constant equation of state p = ωρ. For a spatially flat universe, the two-point diagnostic in this model can be expressed as

$$Omh^2(z_i, z_j) = \Omega_{m,0} h^2 + (1-\Omega_{m,0})\, h^2\, \frac{(1+z_i)^{3(1+\omega)} - (1+z_j)^{3(1+\omega)}}{(1+z_i)^3 - (1+z_j)^3}. \tag{3}$$

In addition, we also consider the CPL evolving dark energy model (Chevallier & Polarski 2001; Linder 2003), $w(z) = w_0 + w_a z/(1+z)$. In this framework, the theoretical expression of Equation (1) becomes

$$Omh^2(z_i, z_j) = \Omega_{m,0} h^2 + (1-\Omega_{m,0})\, h^2\, \frac{f(z_i) - f(z_j)}{(1+z_i)^3 - (1+z_j)^3}, \qquad f(z) = (1+z)^{3(1+w_0+w_a)} \exp\!\left(-\frac{3 w_a z}{1+z}\right). \tag{4}$$

Combined with Equation (1), these expressions allow us to test the ΛCDM model and to discriminate between cosmological models based on measurements of the Hubble parameter H(z).
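The theoretical counterparts of the diagnostic can be sketched with a single function, since ΛCDM and ωCDM are special cases of the CPL dark energy density evolution in a flat universe. This is an illustration under stated assumptions: the function names are hypothetical, `h = 0.7` is an arbitrary illustrative value, and the matter density quoted in the text for ωCDM is reused for the CPL case purely for demonstration.

```python
import numpy as np

def de_density_ratio(z, w0=-1.0, wa=0.0):
    """rho_DE(z) / rho_DE(0) for w(z) = w0 + wa * z / (1 + z).

    wa = 0 gives a constant equation of state (wCDM);
    w0 = -1 and wa = 0 gives the cosmological constant.
    """
    z = np.asarray(z, dtype=float)
    return (1.0 + z)**(3.0 * (1.0 + w0 + wa)) * np.exp(-3.0 * wa * z / (1.0 + z))

def omh2_theory(z_i, z_j, om0, h, w0=-1.0, wa=0.0):
    """Theoretical two-point diagnostic for a spatially flat universe."""
    num = de_density_ratio(z_i, w0, wa) - de_density_ratio(z_j, w0, wa)
    den = (1.0 + z_i)**3 - (1.0 + z_j)**3
    return om0 * h**2 + (1.0 - om0) * h**2 * num / den

# h = 0.7 is an illustrative assumption; om0 is reused for all three models.
h, om0 = 0.7, 0.299
lcdm = omh2_theory(0.5, 1.5, om0, h)
wcdm = omh2_theory(0.5, 1.5, om0, h, w0=-1.047)
cpl = omh2_theory(0.5, 1.5, om0, h, w0=-1.007, wa=-0.222)
print(lcdm, wcdm, cpl)
```

For ΛCDM the dark energy term cancels exactly, so the result does not depend on the chosen redshift pair; for the other two models it does, which is what the diagnostic exploits.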

Hubble Parameter H(z) from GW Standard Sirens Detectable by DECIGO
DECIGO, as a future space-borne GW antenna, will be sensitive to a lower frequency range, f = 0.1-10 Hz, and will have higher detection sensitivity than ground-based interferometric detectors (Seto et al. 2001; Kawamura et al. 2011, 2019). It will offer a new opportunity to detect signals from GW sources (neutron star binaries and black hole binaries) much earlier in the inspiral phase: these binary systems would enter the sensitivity band of ground-based detectors up to a few years later. In particular, DECIGO will be able to detect GWs from neutron star binaries even at a redshift of z ∼ 5 during 5 yr of its mission. It is well established that coalescing binaries in the inspiral phase are standard sirens and can provide direct measurements of the luminosity distance. This opportunity has been widely discussed (Schutz 1986; Qi et al. 2019; Geng et al. 2020). It also opens up further possibilities, such as probing the spatial geometry of the Universe (Zheng et al. 2021; Cao et al. 2022a; Zhang et al. 2022), dark photon bursts from compact binary systems (Hou et al. 2022), direct tests of the FLRW metric (Cao et al. 2019), the distribution of dark matter in the Universe (Cao et al. 2021, 2022b), and measuring the Hubble parameter based on the dipole anisotropy of the luminosity distance from GW observations (Qi et al. 2021).
Considering that matter in the large-scale structure of the Universe is not strictly uniformly distributed and that our reference frame is not at absolute rest, the observed luminosity distance (neglecting multipoles higher than the dipole) needs to be redefined as (Bonvin et al. 2006a, 2006b)

$$d_L(z, \mathbf{n}) = d_L^{(0)}(z) + d_L^{(1)}(z)\,(\mathbf{n}\cdot\mathbf{e}), \tag{5}$$

where z is the redshift and $\mathbf{n}$ is the direction to the source. Here $d_L^{(0)}(z)$ is the luminosity distance to a source in an unperturbed Friedmann universe; in other words, it is the luminosity distance in the traditional sense. The unit vector $\mathbf{e} = \mathbf{v}_0/|\mathbf{v}_0|$ points in the direction of the dipole, and the dipole amplitude $d_L^{(1)}(z)$ can be written as

$$d_L^{(1)}(z) = \frac{|\mathbf{v}_0|\,(1+z)^2}{H(z)}. \tag{6}$$

In principle, $v_0$ can be directly measured from the CMB dipole (Nishizawa et al. 2011), and its magnitude is $v_0 = 369.1 \pm 0.9$ km s$^{-1}$ based on the COBE satellite measurements. Consequently, we can obtain the Hubble parameter H(z) directly from the dipole of the luminosity distance, with the uncertainty assessed in the following way (Bonvin et al. 2006a, 2006b; Nishizawa et al. 2011):

$$\frac{\Delta H(z)}{H(z)} = \sqrt{3}\left[\frac{d_L^{(1)}(z)}{d_L^{(0)}(z)}\right]^{-1}\left[\sum_i \left(\frac{\Delta d_L^{(0)}(z_i)}{d_L^{(0)}(z_i)}\right)^{-2}\right]^{-1/2}, \tag{7}$$

where the sum runs over the sources in the redshift bin. The luminosity distance error $\Delta d_L^{(0)}$ includes the instrumental uncertainty ($\sigma_{\rm inst}$) of the GW detector DECIGO and other systematic uncertainties caused by gravitational lensing ($\sigma_{\rm lens}$) and peculiar velocity ($\sigma_{\rm pv}$). Namely,

$$\left[\frac{\Delta d_L^{(0)}(z)}{d_L^{(0)}(z)}\right]^2 = \sigma_{\rm inst}^2(z) + \sigma_{\rm lens}^2(z) + \sigma_{\rm pv}^2(z). \tag{8}$$

The instrumental uncertainty $\sigma_{\rm inst}$ can be estimated using the Fisher matrix method; for more details, see Nishizawa et al. (2011). The uncertainty $\sigma_{\rm lens}$, induced by the matter inhomogeneities of the large-scale structure along the line of sight, can be estimated as (Hirata et al. 2010)

$$\sigma_{\rm lens}(z) = 0.066\left[\frac{1-(1+z)^{-0.25}}{0.25}\right]^{1.8}. \tag{9}$$

In addition, the systematic uncertainty caused by the Doppler effect due to the peculiar velocity of the source along the line of sight is given by Gordon et al. (2007):

$$\sigma_{\rm pv}(z) = \left|1 - \frac{c\,(1+z)^2}{H(z)\, d_L^{(0)}(z)}\right| \frac{\sigma_{v,{\rm gal}}}{c}. \tag{10}$$

Usually, a one-dimensional galaxy velocity dispersion of $\sigma_{v,{\rm gal}} = 300$ km s$^{-1}$ can be assumed (Silberman et al. 2001).
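The error budget described above can be sketched numerically. The snippet assumes the dipole amplitude $d_L^{(1)} = |v_0|(1+z)^2/H(z)$, the Hirata et al. (2010) lensing fit with exponents 0.25 and 1.8, the Gordon et al. (2007) peculiar-velocity term, and, for simplicity, N identical sources per redshift bin so that the sum over sources reduces to a factor $1/\sqrt{N}$. The fiducial cosmology, the function names, and the value passed as `sigma_inst` are illustrative assumptions; in the actual analysis the instrumental term comes from a Fisher matrix calculation.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def hubble(z, H0=67.4, om0=0.315):
    """Flat LCDM H(z) in km/s/Mpc (fiducial values are illustrative)."""
    return H0 * np.sqrt(om0 * (1.0 + z)**3 + 1.0 - om0)

def lum_dist(z, n_steps=2000):
    """Unperturbed luminosity distance d_L^(0)(z) in Mpc (trapezoid rule)."""
    zz = np.linspace(0.0, z, n_steps)
    f = 1.0 / hubble(zz)
    integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(zz))
    return C_KMS * (1.0 + z) * integral

def dipole_amplitude(z, v0=369.1):
    """d_L^(1)(z) = |v0| (1+z)^2 / H(z), in Mpc for v0 in km/s."""
    return v0 * (1.0 + z)**2 / hubble(z)

def sigma_lens(z):
    """Fractional distance error from weak lensing (fitting formula)."""
    return 0.066 * ((1.0 - (1.0 + z)**-0.25) / 0.25)**1.8

def sigma_pv(z, sigma_v=300.0):
    """Fractional distance error from source peculiar velocities."""
    doppler = 1.0 - C_KMS * (1.0 + z)**2 / (hubble(z) * lum_dist(z))
    return abs(doppler) * sigma_v / C_KMS

def frac_error_H(z, n_sources, sigma_inst, with_lensing=True):
    """Fractional H(z) error for n_sources identical sources in one bin."""
    var = sigma_inst**2 + sigma_pv(z)**2
    if with_lensing:
        var += sigma_lens(z)**2
    ratio = dipole_amplitude(z) / lum_dist(z)
    return np.sqrt(3.0) * np.sqrt(var / n_sources) / ratio

# sigma_inst = 0.01 is a placeholder, not a DECIGO Fisher-matrix result.
for z in (0.5, 1.0, 2.0):
    print(z, sigma_lens(z), sigma_pv(z), frac_error_H(z, 10_000, 0.01))
```

Because the dipole is only a small fraction of the monopole distance, the fractional error on H(z) is much larger than the fractional distance error of a single source, which is why large source counts per bin are required.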
It is worth noting that we focus on the dipole anisotropy of the luminosity distance from the GWs produced by binary neutron stars (BNSs). Therefore, for the BNS merger rate we employ the fitting form of the redshift distribution of GW sources obtained from the cosmic star formation history (Schneider et al. 2001; Cutler & Holz 2009; Sathyaprakash et al. 2010). Based on the detection rate of $10^4$ neutron star binaries per year (Kawamura et al. 2021), we simulated a total of $10^5$ BNS events as a representative yield of a 10 yr DECIGO mission. Considering all of the uncertainties in Equation (8), we display the simulated Hubble parameter H(z) obtained from DECIGO in Figure 1 (with a redshift bin of Δz = 0.1). The fiducial cosmology adopted in the simulation is the ΛCDM model with the Planck 2018 results (Asgari et al. 2020). In addition, considering that the systematic uncertainty caused by line-of-sight lensing may be eliminated by some feasible techniques (Hirata et al. 2010; Shapiro et al. 2010), we also consider the case without the lensing effect for comparison. In Figure 1, we plot the errors of H(z) with and without lensing errors, for 1 and 10 yr of observations. Our results show that the uncertainty of H(z) for 10 yr of observations is significantly smaller than that for 1 yr, scaling as $1/\sqrt{N(z)}$. Here N(z) is the number of independent BNS systems in the vicinity of the redshift z.
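The $1/\sqrt{N(z)}$ scaling can be checked with a small Monte Carlo experiment. The sketch below is a toy model, not the simulation used in the paper: it assumes that the monopole distance has already been subtracted, that sources are distributed isotropically on the sky, and that every source has the same distance error. All names and numbers are hypothetical.

```python
import numpy as np

def dipole_scatter(n_sources, sigma_d=1.0, d1_true=0.5, n_trials=1000, seed=0):
    """Scatter of the least-squares dipole amplitude over many mock skies."""
    rng = np.random.default_rng(seed)
    # mu = n . e, the cosine between source direction and dipole direction;
    # it is uniform in [-1, 1] for an isotropic sky.
    mu = rng.uniform(-1.0, 1.0, size=(n_trials, n_sources))
    data = d1_true * mu + rng.normal(0.0, sigma_d, size=mu.shape)
    # Linear least-squares estimate of the dipole amplitude in each trial.
    d1_hat = np.sum(data * mu, axis=1) / np.sum(mu**2, axis=1)
    return np.std(d1_hat)

s_1yr = dipole_scatter(200)     # toy source count for 1 yr
s_10yr = dipole_scatter(2000)   # ten times more sources for 10 yr
print(s_1yr, s_10yr, s_10yr / s_1yr)
```

The scatter follows $\sqrt{3}\,\sigma_d/\sqrt{N}$, where the factor $\sqrt{3}$ arises from the sky average of $(\mathbf{n}\cdot\mathbf{e})^2 = 1/3$, so a tenfold increase in sources reduces the error to about 32% of its original value.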

Results and Discussion
Having obtained the Hubble parameter H(z) from the simulated standard sirens and from the cosmic chronometer measurements, we can calculate the $Omh^2(z_i, z_j)$ value for every pair of H(z) data points. We then examine the two-point diagnostic and its uncertainty as a function of the redshift difference $\Delta z = |z_i - z_j|$. Some interesting features emerge regarding the uncertainties. As the observation time increases, the uncertainty decreases substantially, regardless of whether we include the lensing effect; in other words, the accuracy improves dramatically with 10 yr of observations. When the lensing error is included, the accuracy of $Omh^2(z_i, z_j)$ is degraded relative to the case without it. As a test of the ΛCDM model, the two-point diagnostic provides a numerical value for each pair of data points, and all of these values should in principle reproduce a single number. Hence, we need to analyze these individual values from a statistical point of view.
Inverse-variance weighting, the most straightforward and popular way of summarizing multiple measurements, is adopted to give the weighted mean of the individual values. The weighted mean of the $Omh^2(z_i, z_j)$ two-point diagnostic can be written as

$$Omh^2_{\rm w.m.} = \frac{\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} Omh^2(z_i, z_j)/\sigma^2_{Omh^2,ij}}{\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} 1/\sigma^2_{Omh^2,ij}}, \tag{11}$$

and its variance is given by

$$\sigma^2_{\rm w.m.} = \left[\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} \frac{1}{\sigma^2_{Omh^2,ij}}\right]^{-1}, \tag{12}$$

where $\sigma^2_{Omh^2,ij}$ can be expressed as

$$\sigma^2_{Omh^2,ij} = \frac{4\left[h^2(z_i)\,\sigma^2_{h(z_i)} + h^2(z_j)\,\sigma^2_{h(z_j)}\right]}{\left[(1+z_i)^3 - (1+z_j)^3\right]^2}. \tag{13}$$

Figure 1. The fiducial model is ΛCDM and is shown in red. Blue bars with smaller uncertainty represent cases without the lensing effect, while the green ones take the lensing effect into account.

Figure 2 shows the weighted mean of $Omh^2(z_i, z_j)$ calculated from the simulated H(z) data. The blue and green dashed lines mark the weighted mean of $Omh^2(z_i, z_j)$, surrounded by color bands denoting the 68% confidence regions. The darker and wider bands, from left to right, correspond to the simulated H(z) data for 1 and 10 yr of observations with lensing effects, respectively. The lighter and narrower bands at each center display the results without the lensing effects. For comparison, the Planck result $(\Omega_{m,0}h^2)_{\rm Planck} = 0.1428 \pm 0.0011$ is shown by the cyan line and band (Planck Collaboration 2020). Our results demonstrate good consistency between the simulation and our prediction. We summarize the results in Table 1. Although the weighted mean values are similar in all cases (with/without lensing and 1 and 10 yr of observations), what matters more is their uncertainty, which is of order $10^{-3}$. Specifically, for the same observation time, the uncertainty with lensing is about 3 times greater than that without lensing. For the case without the lensing effect, $\Delta_{\rm W.M.,\,10yr+withoutlens} = 0.00114$, which means that the uncertainty for 10 yr of observations decreases to 32% of that for 1 yr. Notably, this uncertainty is comparable to the Planck 2018 result (∼0.0011) if the observation lasts for 10 yr (ignoring the lensing effect).
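The inverse-variance weighted mean over all pairs, together with the first-order error propagation for each pair, can be sketched as follows. The mock data (redshift grid, 2% errors, fiducial `H0` and `OM0`, random seed) are illustrative assumptions and do not reproduce the DECIGO simulation.

```python
import numpy as np

def omh2_pairs_with_var(z, H, sigma_H):
    """Diagnostic value and propagated variance for every pair i < j."""
    z = np.asarray(z, dtype=float)
    h = np.asarray(H, dtype=float) / 100.0
    sh = np.asarray(sigma_H, dtype=float) / 100.0
    i, j = np.triu_indices(len(z), k=1)  # index arrays of all pairs with i < j
    den = (1.0 + z[i])**3 - (1.0 + z[j])**3
    val = (h[i]**2 - h[j]**2) / den
    var = 4.0 * (h[i]**2 * sh[i]**2 + h[j]**2 * sh[j]**2) / den**2
    return val, var

def weighted_mean(val, var):
    """Inverse-variance weighted mean and its formal 1-sigma error."""
    w = 1.0 / var
    return np.sum(w * val) / np.sum(w), np.sqrt(1.0 / np.sum(w))

# Hypothetical mock: flat LCDM with 2% Gaussian errors in bins of width 0.1.
H0, OM0 = 67.4, 0.315
rng = np.random.default_rng(1)
z = np.linspace(0.1, 4.9, 49)
H_true = H0 * np.sqrt(OM0 * (1.0 + z)**3 + 1.0 - OM0)
sig = 0.02 * H_true
H_obs = H_true + rng.normal(0.0, sig)

val, var = omh2_pairs_with_var(z, H_obs, sig)
mean, err = weighted_mean(val, var)
print(len(val), mean, err)
```

One design point worth keeping in mind: pairs that share a data point are correlated, while the variance formula treats all pairs as independent, so the formal error should be read with that caveat.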
Another issue that should be discussed is the comparison between our results based on the GW standard sirens from DECIGO and other models of dark energy. More specifically, we focus on the ωCDM and CPL parameterizations of dark energy. For those parameterizations, the theoretical values of $Omh^2(z_i, z_j)$ should be calculated with the corresponding model parameters. The expressions can be found in Equations (6) and (7) of Zheng et al. (2016). Here, we use $\Omega_{m,0} = 0.299 \pm 0.007$ and $\omega = -1.047 \pm 0.038$ for ωCDM, and $w_0 = -1.007 \pm 0.089$ and $w_a = -0.222 \pm 0.407$ for the CPL parameterization, from the SNe Ia and CMB constraints combined with BAO and local $H_0$ measurements (Scolnic et al. 2018). The residuals, represented as the weighted mean, are depicted in Figure 3, and their corresponding values are summarized in Table 1. In each panel, the left (blue) band denotes the ΛCDM model, the middle (yellow) band denotes the ωCDM model, and the right (violet) band denotes the CPL parameterization. The left two panels correspond to the cases in which the lensing effect is considered, and the right two panels to the cases without lensing. As can be seen from Figure 3, the residuals of the ΛCDM model are closer to zero than those of the ωCDM and CPL models, which means that the ΛCDM model is more strongly supported by H(z) from the DECIGO GW standard sirens under the $Omh^2(z_i, z_j)$ two-point diagnostic. In other words, if we do not consider lensing effects (meaning that the lensing effect has been corrected), we can clearly differentiate the ΛCDM model from the others even for 1 yr of observations. When the lensing uncertainty is taken into account, these three models can be differentiated only after more than 3 yr of DECIGO observations.

Finally, one could wonder about the performance of other models of dark energy in the EM domain with the latest H(z) measurements. Jimenez & Loeb (2002) proposed a cosmological model-independent technique to directly measure H(z), which is known as the differential age or cosmic chronometer (CC) method. In this method, the Hubble parameter H(z) can be obtained as

$$H(z) = -\frac{1}{1+z}\frac{dz}{dt}, \tag{14}$$

where z is the cosmological redshift and dz/dt can be directly obtained by measuring the age difference Δt between two galaxies that are massive, evolve passively on a timescale larger than their age difference, and differ in redshift by Δz. Some features of their spectra, such as the D4000 break, enable us to measure the age difference of such galaxies. Another method to obtain H(z) is the radial BAO method (Blake et al. 2012; Delubac et al. 2015; Blomqvist et al. 2019), based on the value of the sound horizon at the drag epoch $r_d$. It should be noted that this approach assumes a cosmological model a priori, which makes such measurements model-dependent (Li et al. 2016). In this paper, we utilize the newest 33 CC H(z) measurements with redshift up to ∼2, of which 32 measurements have been extensively employed in previous studies (Borghi et al. 2022; Wu et al. 2023), while 1 measurement has been updated (Tomasetti et al. 2023).
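The differential-age relation $H(z) = -(1+z)^{-1}\,dz/dt$ can be illustrated with a finite-difference estimate. The sketch below is not a real chronometer analysis: the two galaxy ages are generated from an assumed flat ΛCDM cosmology purely to check that the estimator recovers the input H(z), and every name and number in it is hypothetical.

```python
import numpy as np

KM_PER_MPC = 3.0856775814913673e19
SEC_PER_GYR = 3.15576e16  # 1e9 Julian years in seconds
INV_GYR_TO_KMS_MPC = KM_PER_MPC / SEC_PER_GYR  # 1 Gyr^-1 in km/s/Mpc (~977.8)

def hubble_from_ages(z1, t1_gyr, z2, t2_gyr):
    """H at the mean redshift from the age difference of two galaxy samples."""
    z_mid = 0.5 * (z1 + z2)
    dz_dt = (z2 - z1) / (t2_gyr - t1_gyr)  # per Gyr; negative, higher z is younger
    return z_mid, -dz_dt / (1.0 + z_mid) * INV_GYR_TO_KMS_MPC

def age_difference_gyr(z1, z2, H0=67.4, om0=0.315, n=2001):
    """Cosmic time elapsed between z2 and z1 in flat LCDM (trapezoid rule)."""
    zz = np.linspace(z1, z2, n)
    f = 1.0 / ((1.0 + zz) * H0 * np.sqrt(om0 * (1.0 + zz)**3 + 1.0 - om0))
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(zz)) * INV_GYR_TO_KMS_MPC

# Mock passively evolving galaxies: same formation epoch, observed at two redshifts.
t1 = 8.0                                   # age at z = 0.45 in Gyr (arbitrary)
t2 = t1 - age_difference_gyr(0.45, 0.55)   # the higher-redshift galaxy is younger
z_mid, H_cc = hubble_from_ages(0.45, t1, 0.55, t2)
print(z_mid, H_cc)
```

Only the age difference enters the estimate, so the absolute age assigned to the first galaxy has no effect on the recovered H(z).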
The results for the 528 pairs of $Omh^2(z_i, z_j)$ are shown in Figure 4, where red points denote the central values and blue bars the corresponding 68% uncertainties. Almost all of the reconstructed $Omh^2(z_i, z_j)$ values are consistent with the theoretical value from ΛCDM at the 1σ confidence level. In order to improve the robustness of the results, we also used the weighted mean statistics; Table 2 lists the results of this statistical method. For all $Omh^2(z_i, z_j)$ pairs, the weighted mean 0.1438 ± 0.0045 is consistent with the Planck 2018 result 0.1428 ± 0.0011. Compared to our previous results using simulated H(z) data in the GW domain, we find that if we ignore the lensing effect, the accuracy of 1 yr of observations is comparable to that of the current Hubble parameter observations. Even taking into account the lensing effect, the current accuracy can be achieved after more than 3 yr of observations.

Figure 3. Residuals displayed as the weighted mean for the ΛCDM, ωCDM, and CPL models at the observed level of the GW standard siren DECIGO, with dashed lines and different color bands denoting the 1σ confidence level. From upper to lower and left to right: 1 and 10 yr results with and without lensing.

The weighted mean approach assumes that the error distribution is Gaussian, so we adopt the methodology of Chen et al. (2003) to test the Gaussianity of the error distribution. This requires computing the number of standard deviations $N_\sigma$ by which each measurement deviates from the central value. For a Gaussian distribution, the percentage of measurements with $|N_\sigma| < 1$ should ideally be 68.3%. The result for the weighted mean is 84.09%, clearly indicating that the errors do not follow a Gaussian distribution. Therefore, we further use the median statistics approach for the analysis. For a total of N measurements, each measurement has a 50% chance of being higher or lower than the true median. Therefore, the probability that n of the N observations are higher than the median follows the binomial distribution

$$P = \frac{2^{-N} N!}{n!\,(N-n)!} \tag{15}$$

(Gott et al. 2001). This distribution allows us to define the 68.3% confidence interval of the median. The median statistics result, $0.161^{+0.0051}_{-0.0057}$, is incompatible with the Planck 2018 result derived from the ΛCDM model. The results derived from the two statistical methods are illustrated in Figure 4 and Table 2. We present the weighted mean of the residuals in Figure 5, and the corresponding values are shown in Table 2. In principle, the residual should be zero. Among the different dark energy parameterizations, the standard ΛCDM model still performs best, which is consistent with our previous results. For the ωCDM and CPL models, the residuals summarized in the median statistics scheme show some deviation from the expected value of zero. One can conclude from Figure 5 that the $Omh^2(z_i, z_j)$ residuals for the ΛCDM model are closer to zero than those of the ωCDM or CPL models for the statistical methods used here (Zheng et al. 2016).
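Both statistical checks described above can be sketched compactly: the fraction of measurements with $|N_\sigma| < 1$, and a median with a confidence interval built from the binomial probabilities $2^{-N} N!/[n!(N-n)!]$. The interval construction below is one simple reading of the median statistics approach (cutting equal probability from each tail); the function names and the mock sample are hypothetical.

```python
from math import comb
import numpy as np

def n_sigma_fraction(values, errors, central):
    """Fraction of measurements within 1 sigma of a central estimate."""
    n_sigma = (np.asarray(values, float) - central) / np.asarray(errors, float)
    return float(np.mean(np.abs(n_sigma) < 1.0))

def median_interval(values, cl=0.683):
    """Median and an approximate confidence interval from binomial statistics.

    pmf[n] is the probability that exactly n of the N measurements lie below
    the true median, i.e. that the true median falls between the n-th and
    (n+1)-th sorted values.
    """
    x = np.sort(np.asarray(values, dtype=float))
    n_tot = len(x)
    pmf = np.array([comb(n_tot, n) / 2**n_tot for n in range(n_tot + 1)])
    cdf = np.cumsum(pmf)
    tail = 0.5 * (1.0 - cl)
    # Largest rank k whose lower-tail probability does not exceed `tail`.
    k = max(int(np.searchsorted(cdf, tail, side="right")), 1)
    return float(np.median(x)), float(x[k - 1]), float(x[n_tot - k])

# Hypothetical mock sample of 528 pairwise values with Gaussian errors.
rng = np.random.default_rng(2)
errs = np.full(528, 0.05)
vals = 0.143 + rng.normal(0.0, errs)
frac = n_sigma_fraction(vals, errs, central=0.143)
med, lo, hi = median_interval(vals)
print(frac, med, lo, hi)
```

Since the median interval depends only on the ranks of the measurements, it does not use the quoted error bars at all, which is what makes it a useful cross-check when the errors are suspected to be non-Gaussian.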

Conclusion
In this work, we used the $Omh^2(z_i, z_j)$ two-point diagnostic as an independent and popular screening test of the spatially flat standard ΛCDM model. One of its advantages is that it depends only on the Hubble parameter H(z) (Sahni et al. 2014). Therefore, the sample size and the redshift range of the H(z) measurements are very important. Fortunately, we are in a new era of GW multimessenger astronomy, which gives us more opportunities to test various physical theories, not just the ΛCDM model. DECIGO, as a future space-based GW detector, is very promising since it would be able to detect a large number of sources (neutron star binaries) distributed out to high redshift. More importantly, neutron star binaries would be clean GW sources, which can provide the luminosity distance with fewer systematics (Nishizawa et al. 2011). Consequently, we would have a chance to obtain the Hubble parameter H(z) directly by using the dipole of the luminosity distance from DECIGO. Our results do not show a significant deviation from the Planck results, which means that the ΛCDM model (assumed in the simulations) is supported by the $Omh^2(z_i, z_j)$ two-point diagnostic with H(z) expansion rates from the second-generation space-based GW detector DECIGO. In particular, in the framework of DECIGO, the $Omh^2(z_i, z_j)$ two-point diagnostic is expected to be constrained with a precision of $10^{-3}$, and the uncertainty (0.00114) is comparable with the Planck 2018 result (0.0011) for 10 yr of observations (ignoring the lensing effect). In addition, we have also discussed the performance of the ωCDM and CPL models. The residuals show a certain deviation in the data from 10 yr of observation. In other words, we can differentiate these models from the standard ΛCDM cosmological model with a 10 yr observation cycle when the uncertainty from lensing is considered, or with 3 yr of observation data when the lensing is mitigated.
For comparison, the newest measurements from 33 cosmic chronometers with redshift up to ∼2 were also considered, which makes it possible to test cosmological models in the EM domain. We found no significant inconsistencies within the 68% confidence level using the $Omh^2(z_i, z_j)$ two-point diagnostic and the weighted mean statistical method, which is consistent with the Planck 2018 result and with our results in the GW domain. If we ignore the lensing effect, the accuracy of 1 yr of GW observations is comparable to that of the current Hubble parameter observations. Even taking into account the lensing effect, the current accuracy can be achieved after more than 3 yr of observations. For the different dark energy models, the residuals of the standard cosmological model are closer to zero than those of its immediate extensions, indicating that the ΛCDM model still performs better under the $Omh^2(z_i, z_j)$ two-point diagnostic with the CC H(z) measurements.
In conclusion, the precision of the Hubble parameter derived from GWs has significant potential for discriminating between ΛCDM and parametric dark energy models. We can differentiate these models from ΛCDM with a 10 yr observation cycle when the uncertainty from lensing is considered, or with 3 yr of observation data when the lensing is mitigated. To assess the accuracy of current Hubble parameter data for distinguishing cosmological models, we applied the $Omh^2(z_i, z_j)$ two-point diagnostic to the most recent CC measurements, which extend to z ∼ 2; the results are consistent with the Planck ΛCDM value and with our findings in the GW domain. We would like to stress that this test is independent and does not rely on any cosmological assumptions. Most importantly, it is a new attempt to test the theory using the Hubble parameter measured directly by GWs.

Figure 2. The weighted mean of the $Omh^2(z_i, z_j)$ two-point diagnostic, displayed by dashed lines and surrounded by color bands denoting the 1σ confidence level. The four different color lines and bands show results with and without the lensing effect for 1 and 10 yr of observations, respectively. The cyan line and band are the Planck 2018 result.

Figure 4. The $Omh^2(z_i, z_j)$ two-point diagnostic calculated from the 33 observational CC H(z) data. The left panel displays all 528 pairs, where the red dots with blue bars represent $Omh^2(z_i, z_j)$ and their 1σ confidence level. The right panel presents the statistical findings using weighted mean and median statistics. The cyan line and bands indicate the value that the two-point diagnostic is expected to take within the ΛCDM model: $\Omega_{m,0} h^2 = 0.1428 \pm 0.0011$.

Table 1
The Weighted Mean (w.m.) of $Omh^2(z_i, z_j)$ Two-point Diagnostics and Residuals Calculated for ΛCDM, ωCDM, and CPL Using the Simulated H(z) Data

Table 2
The Weighted Mean (w.m.) and Median Statistics (m.s.) of $Omh^2(z_i, z_j)$ Two-point Diagnostics Calculated for the 33 Observational CC H(z) Data and the Corresponding Residuals for Comparison with Three Different Cosmological Models
Figure 5. Residuals displayed as the weighted mean (left panel) and median statistics (right panel) for the ΛCDM, ωCDM, and CPL models at the observed level of cosmic chronometer H(z), with dashed lines and different color bands denoting the 1σ confidence level.