Optimal Summary Statistics for X-ray Polarization

We develop two new highly efficient estimators to measure the polarization (Stokes parameters) in experiments that constrain the position angle of individual photons such as scattering and gas-pixel-detector polarimeters, and analyse in detail a previously proposed estimator. All three of these estimators are at least fifty percent more efficient on typical datasets than the standard estimator used in the field. We present analytic estimates of the variance of these estimators and numerical experiments to verify these estimates. Two of the three estimators can be calculated quickly and directly through summations over the measurements of individual photons.


INTRODUCTION
With the advent of X-ray polarimeters such as the Imaging X-Ray Polarimetry Explorer (IXPE Weisskopf et al. 2022) and XL-Calibur (Abarr et al. 2021), techniques to estimate the polarized flux from astrophysical sources using measurements of the scattering or photo-electric emission induced by individual photons are in high demand.Kislat et al. (2015) developed the standard estimator in the field which has several conceptual and technical advantages.The Kislat estimator assigns Stokes parameters to individual photons, so the total Stokes parameters derived from a particular measurement is simply the sum over those of the individual photons.Although this estimator is unbiased, it is far from optimal, especially when considering an instrument whose sensitivity to polarization varies from photon to photon.This is the case for current X-ray polarimeters, which provide a sinusoidal polarization signal whose amplitude is proportional to a modulation factor (µ) that depends strongly on energy (Marshall 2021b).The efficiency of an estimator is inversely proportional to the number of photons required to achieve an expected signal-to-noise ratio; for example, the Kislat estimator would require a fifty percent longer observation (or a fifty percent larger telescope) to obtain the same information as an estimator fifty percent more efficient.
In this paper, we revisit the problem of the optimal summary statistics to compute the Stokes parameters from X-ray polarimetric observations.The paper is organized as follows.In Section 2 we start with the theoretically most efficient estimator, the maximum likelihood estimator (González-Caniulef et al. 2023) or MLE, and derive two linear estimators from it.Although the MLE yields the smallest variance (or highest signal-tonoise ratio), it must be calculated iteratively through an optimization process which involves calculating the like-lihood of each of many hundreds of thousands or millions of photons repeatedly (perhaps hundreds of times), and so can be very time consuming.On the other hand, the two linear estimators can be calculated through sums over the properties of individual photons.While the Kislat estimator requires two summations, one for each of the two Stokes parameters for linear polarization, the efficient estimators that we present here require five or three summations over the photons, so the added computational effort to calculate these linear estimators is negligible compared to a full MLE analysis.In Section 3 we perform numerical experiments to assess the efficiency of each estimator and verify the analytic calculations.Conclusions are presented in Section 4.

DERIVATIONS
We can build the likelihood function focusing only on the polarization signal, which is a function of photon angle, energy and time, or by including the spectrum, which is just a function of energy and time, in the model as well (see also Kislat et al. 2015;Marshall 2021b,a).Let p m (E, t) be polarization degree of the model, µ(E) be the modulation factor of the instrument and ψ m (E, t) be the polarization angle of the model; the first component of the likelihood function consists of the probability density of the photon position angle which can therefore be written as: where ψ is the position angle (or photo-electron scattering angle) of a particular photon and the energy and time variables are suppressed.This expression results from the definition of the modulation factor, the differential scattering cross-section, and the normalization f (ψ)dψ = 1, which is constant with respect to the expected degree of polarization.In principle, the detector could introduce some uncertainty in the measured value of the angle ψ; if the detector is unbiased, the only effect of this is to reduce the value of µ from the theoretical value (Marshall 2021a).We will assume that the value of µ has been measured or calculated including these angular uncertainties.
We can reformulate the probability density for a single photon as as (Marshall 2021b) (2) (Besset et al. 1979) use an analogous probability density for spin-half particles and derive analogous expressions (with 2ψ replaced by ψ).If we define q m = p m cos 2ψ m and u m = p m sin 2ψ m for the model, and Q γ = cos 2ψ and U γ = sin 2ψ for the photon, we have (3) where we have placed the normalized Stokes parameters (q m and u m and similarly for the photon) into a two-component Stokes vector (s m ) for notational convenience.In general we will use lowercase letters to signify normalized Stokes parameters; that is, q = Q/I etc.For IXPE, the total logarithmic likelihood of the data given the model is where N pred is the number of photons predicted by the model (it does not depend on the assumed model polarization).We can maximize the likelihood to determine the most-likely values of q m and u m to account for the data which we will denote as s m,MLE .Although this process is necessarily iterative, it has the advantage of yielding the minimum-variance unbiased estimator for the polarization as the MLE achieves the Cramér-Rao lower bound on the variance (Cramér 1946;Rao 1945).Two useful approximations are the expected values (5) and its variance which is given by (6) to determine how well converged the iterative process is.As the likelihood is calculated as the sum over a large number of terms (one for each photon) and each one is an independent and identically distributed random variable drawn from a distribution with a finite mean and variance, the distribution of the sum approaches the normal distribution by the central limit theorem.Therefore, using the expected value for the likelihood and a odds ratio of 99, we can estimate the minimum detectable polarization at 99% confidence to be where N is the number of events, in agreement with Marshall (2021b).
We would like to find a direct estimator of the polarization that approaches the variance of the MLE and connects with the estimator traditionally used in the field (Kislat et al. 2015).Our first step is to take the gradient of the likelihood with respect to the model polarization S m and linearize it, In linearizing the likelihood we have neglected terms of higher order in µ γ s m , so we have introduced a potential bias for highly polarized sources observed with instruments with high modulation factors (the current generation of instruments typically have µ < 0.5 so this is not an issue at this point).To maximize the likelihood we have and where ⊗ denotes the Kronecker product.This estimator 1 is the same as derived by Marshall (2021b, Eq. 59 and 60) and similar to the approximate estimator (Marshall 2021b, Eq. 61 and 62) and similarly for u m,M .If we replace the term in the brackets with its expectation value, we obtain 2 which is very similar to the Kislat et al. (2015) result, but with a different weighting by modulation factor.The Kislat estimator is implemented in ixpeobssim (Baldini et al. 2022) and xselect (Arnaud 1996).We can understand the relationship between the s m,K and s m,2 estimators by grouping the observations according to the values of µ γ and calculating s m,K for each group.
For each group the variance in the estimator is and similarly for u m,K .If we combine the individual estimators q m,K,µ weighted inversely by the variance, we 1 It requires 5 summations over the events that can number more than a million to compute the Stokes.
2 It requires 3 summations to compute the Stokes. get (16) where we have ignored the second term in the variance.In a similar fashion we can estimate the variance in the two estimators as and The two preceding equations contain the key result of this work.Because the root-mean-square of a group of nonidentical quantities is greater than the harmonic-rootmean-square, the variance of s m,2 is necessarily smaller than that of s m,K if the modulation factors of the observed photons vary.We can also estimate the variance for the MLE technique from the second derivative of the log-likelihood With no loss of generality, let us take µq m = sin α and u m = 0 to calculate the following expectation values If we now take the general situation of q m = 0 and u m = 0 with sin 2 α = µ 2 (q 2 m + u 2 m ) the variances of the two Stokes parameters are The variance of the remaining estimator s m,1 is obtained by linearising the results for the MLE to yield q m,MLE q m,1 q m,2 q m,K 0.50 ± 0.04 0.50 ± 0.04 0.50 ± 0.04 0.50 ± 0.04 0.50 ± 0.12 0.50 ± 0.12 0.50 ± 0.12 0.50 ± 0.14 0.00 ± 0.12 0.00 ± 0.12 0.00 ± 0.12 0.00 ± 0.14 and In general the covariance between Q and U for all of the estimators is given by 3. RESULTS perform a series of numerical experiments with a constant number of photons with either a constant modulation factor or one that varies from photon to photon from 0.2 to 0.5 uniformly to determine the properties of the MLE and the direct estimators: s m,1 , s m,2 and s m,K .For the first experiment we take the values of q = u = 0.5 and µ = 1 and ten thousand random realisations each of one thousand events, yielding error estimates of 0.0382, 0.0387, 0.0418 and 0.0418 for the s m,MLE , s m,1 ,s m,2 and the Kislat estimator s m,K from the formulae above which agree with the results from the numerical simulations shown in Tab. 1.
In general the modulation factors of individual photons differ (as pointed out by Marshall 2021b), so we repeat the experiment with modulation factors varying from 0.2 to 0.5 drawn from a uniform distribution.Although the mean value of µ is 0.35, the mean value of µ 2 is (0.361) 2 , and the mean value of µ −2 is (0.316) −2 , yielding error estimates for the s m,1 , s m,2 and s m,K estimators of 0.122, 0.123 and 0.141 respectively.The minimum variance estimator, s m,MLE , yields 0.122.All are in agreement with the simulated results (Tab.1).In the third experiment we take the model polarization to vanish and the distribution of modulation factors, yielding uncertainties of 0.142 for the Kislat estimator s m,K and 0.124 (see Tab. 1) for the others with the corresponding MDP values of 0.431 and 0.376 (in agreement with the simulation results in Tab. 2).The expected values for the covariance in the first two simulations are −2.5 × 10 −4 and zero in the final simulation.The convergence of the simulations (Tab.3) to the theoretical expectations is poorer than in the case of the standard deviations, but close in magnitude and correct in sign.
Let us now apply these formulae to a real dataset in particular the observations of Centaurus X-3 (Tsygankov et al. 2022) with the Imaging X-Ray Polarimetry Explorer (Weisskopf et al. 2022).The mean value of the modulation factor for the events from the source between two and eight keV is 0.30, the rootmean-square modulation factor is 0.32 and harmonicroot-mean-squared value is 0.26; consequently, to achieve the same constraints on the polarization using the Kislat estimator that is achieved with the estimators presented here requires a fifty-percent longer observation as (0.32/0.26) 2 ≈ 1.5.
4. CONCLUSIONS We have analysed three estimators for measuring the Stokes parameters in photon counting instruments such as X-ray polarimeters.All three are about fifty percent more efficient for typical datasets than the estimators implemented in the standard software packages.One estimator, the s m,MLE , is theoretically the most efficient; however, it must be calculated iteratively.The s m,1 (first derived by Marshall 2021b) is derived by linearizing the MLE and its performance except for very highly polarized sources with highly efficient polarimeters (the product of modulation factor and polarization degree greater than ninety percent) is nearly equivalent to the MLE.In the regime where the MLE is notably more efficient theoretically than the s m,1 estimator, the likelihood function becomes very strongly and sharply peaked making the iterations poorly behaved, so in practice even in this regime the s m,1 estimator is preferable over the MLE.Furthermore, this estimator saturates the Cramér-Rao lower bound on the variance to second order in the polarization degree; therefore, it is the most efficient estimator that can be constructed from the first-and second-order moments of the observed S γ .
We obtained the second direct estimator s m,2 by replacing a portion of the expression of the s m,1 estimator with its expectation value.Again except for very polarized sources, it is nearly as efficient as s m,1 , but it has the advantage of being straightforward to implement in standard pipelines.Although the estimators that we have derived here are significantly more efficient than the Kislat estimator s m,K for calculating summary statistics for binned data, the unbinned estimator derived by González-Caniulef et al. (2023) remains the most efficient method for model testing because no binning is performed.
Although we have not presented calculations of the Marshall (2021b) approximate estimator, its performance with four summations over photon measurement to derive two Stokes parameters, is in general intermediate between s m,1 and s m,2 (indeed almost precisely halfway in between).Additionally an estimator as efficient as s m,2 can be obtained by calculating the mean Stokes parameters over very narrow energy bins (over which the modulation factor is approximately constant) using the standard Kislat estimator, and taking the average over these bins weighted by the inverse variance to obtain estimates of the Stokes parameters.
The key point is that the weighting of the photon measurements with modulation factor matters.The Kislat method yields an uncertainty that is inversely proportional to the harmomic-root-mean-squared modulation factor of the photon measurements, whereas the methods presented here yield uncertainties that are inversely proportional to the root-mean-squared modulation factor.These uncertainties are typically about twenty percent smaller; therefore, these estimators typically yield a fifty percent increase in efficiency, making a given observation as effective as one fifty percent longer or making a given telescope as effective as one with a fifty percent larger collecting area.

Table 1
Results: Mean Values and Standard Deviations from Simulated Distributions for Three Different Experiments (see text for details).All of the estimators are apparently unbiased, and the results for the U Stokes parameter are identical within the statistical uncertainties.

Table 2
Results: MDP 99 .The measured minimum detectable polarization MDP 99 for the estimators discussed in the text as determined through the simulations by determining the 99th percentile over the unpolarized simulations.MDP 99,MLE MDP 99,1 MDP 99,2 MDP 99,K