A Weighted Analysis to Improve the X-Ray Polarization Sensitivity of the Imaging X-ray Polarimetry Explorer

The Imaging X-ray Polarimetry Explorer (IXPE) is a Small Explorer mission that was launched at the end of 2021 to measure the polarization of X-ray emission from tens of astronomical sources. Its focal-plane detectors are based on the Gas Pixel Detector, which measures the polarization by imaging photoelectron tracks in a gas mixture and reconstructing their initial directions. The quality of a single track, and hence the ability to determine correctly the original direction of the photoelectron, depends on many factors, e.g., whether the photoelectron is emitted at low or high inclination with respect to the collection plane, or whether a large Coulomb scattering occurs close to the generation point. The reconstruction algorithm used by IXPE to obtain the photoelectron emission direction also calculates several properties of the track shape that characterize the process. In this paper we compare several such properties and identify the best one to weight each track on the basis of the reconstruction accuracy. We demonstrate that a significant improvement in sensitivity can be achieved with this approach; for this reason it will be the baseline for IXPE data analysis.


Introduction
X-ray astronomy had a substantial breakthrough with the introduction of X-ray optics mated with imaging detectors in the focal plane. Besides the capability to image extended sources, resolve complex fields, and localize point-like sources with high precision, the optics had a dramatic impact on the measurement sensitivity. In experiments with collimators, including the improvements of modulation collimators and coded masks, the photons from the source are compared with the fluctuations of the background counts, measured on the same surface. By means of a telescope, the photons collected on the mirror area, projected on the optical axis, are focused on a focal-plane spot of very small surface where very few background counts are expected. In terms of astrophysics, the revolution of the optics, achieved for the first time with the HEAO-2/Einstein satellite, enabled imaging of galaxy clusters, supernova remnants, and black hole jets, and the detection of a huge number of extragalactic sources, introducing X-ray astronomy as a new subject of cosmology.
Polarimetry was not able to follow this evolution, and the main reason was the instrument technology. Any polarimeter relies on a large modulation, namely a strong dependence of the response on polarization, but an effective exploitation of the optics in terms of both imaging and background also needs a good localization capability. The only viable polarimeters in the early stage of X-ray astronomy were based on Bragg diffraction at 45° or Compton scattering around 90°. Neither technique localizes the interaction point of the impinging photon. The situation evolved with the discovery that, thanks to the new technologies of microelectronics, a finely subdivided gas-filled detector could be used as a polarimeter based upon the photoelectric effect.
In photoelectric polarimeters like the Gas Pixel Detector (GPD), electrons in s atomic orbitals are ejected from the atom with an azimuthal distribution proportional to $\cos^2\phi$, peaked in the direction of the electric vector of the photon. If the azimuthal angles $\phi$ of the initial directions of the photoelectrons were measured perfectly, this polarimeter would have a modulation factor of 100%. In reality, several effects limit the ideal response of the instrument. The photoelectron path in the gas is traced by the ionization charges produced along the way, which have to drift to the collection plane; they are multiplied and eventually read out on a pixelated plane with finite pixel size. The quantization of the linear energy loss, the finite pixel dimensions, and the transverse diffusion result in a highly blurred image of the track. Moreover, photoelectrons may be scattered close to their generation points, making it difficult to reconstruct their original directions. In the IXPE GPD, photoelectron tracks are analyzed by a custom reconstruction algorithm developed in-house (Bellazzini et al. 2003a, 2003b; Fabiani & Muleri 2014; Baldini et al. 2021), which estimates the initial direction of the photoelectron and the point of photon absorption. The algorithm features several steps: (i) a clustering algorithm to distinguish the contiguous physical track from isolated noisy pixels; (ii) simple calculations on the moments of the collected charge distribution to distinguish the initial and final parts of the track; (iii) a refinement of the calculated quantities with a higher weight for the pixels at the beginning of the track. The algorithm also calculates a number of track properties, such as the track size, the total energy, etc. The purpose of this paper is to explore such parameters to determine which correlate best with the response to linear polarization.
These results will be used to assign a higher weight to the tracks for which a more accurate reconstruction is possible, and to verify whether a better sensitivity can be achieved. In the next section the statistical treatment of an event-by-event weighted analysis is presented. In Section 3 a brief explanation of the track reconstruction is given, and different track properties are considered to identify the best one to be used as a weight to improve the IXPE sensitivity. In Sections 4 and 5 the weighted analysis is applied to existing IXPE calibration data and used to estimate the sensitivity reachable in a reference observation.

Weighted Polarization Estimation
A convenient approach to the weighting of data collected by X-ray polarimeters was defined by Kislat et al. (2015). The method was originally developed for weighting the response of instruments with nonuniform acceptance, but it can also be extended to increase the contribution of tracks that are better reconstructed because of their higher quality. Here we summarize the statistical method presented in Kislat et al. (2015). We remark that in this section and in the following, the equations for the Stokes parameters are multiplied by a factor of 2 with respect to the usual definition. This choice is motivated by the fact that, as shown in Kislat et al. (2015), for photoelectric polarimeters the usual definition of the Stokes parameters gives a polarization degree of $2\sqrt{Q^2+U^2}/\mu$, while the adopted definition allows one to obtain Equation (6) for $\mathcal{P}$ as expected.
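In photoelectric polarimeters, the (normalized) modulation Stokes parameters of a single event are calculated from the photoelectron emission angle of the $k$th event, $\phi_k$, as
$$i_k = 1, \qquad q_k = 2\cos 2\phi_k, \qquad u_k = 2\sin 2\phi_k,$$
and they need to be corrected for the detector response at a later stage of the analysis. The emission angles follow the angular distribution (Kislat et al. 2015)
$$f(\phi) = \frac{1}{2\pi}\left[1 + \mathcal{P}\mu\cos\left(2(\phi - \phi_0)\right)\right],$$
where $\mathcal{P}$ is the polarization degree, $\phi_0$ the polarization angle, and μ is the modulation factor; that is, the amplitude of the instrumental response to 100% polarized radiation. Weights can be introduced, as done in Kislat et al. (2015), by applying a multiplicative factor $w_k$ to the single-event Stokes parameters:
$$i_k = w_k, \qquad q_k = 2 w_k\cos 2\phi_k, \qquad u_k = 2 w_k\sin 2\phi_k.$$
The overall Stokes parameters of a measurement of $N$ events are obtained as
$$\mathcal{I} = \sum_{k=1}^{N} i_k, \qquad \mathcal{Q} = \sum_{k=1}^{N} q_k, \qquad \mathcal{U} = \sum_{k=1}^{N} u_k, \qquad Q = \frac{\mathcal{Q}}{\mathcal{I}}, \qquad U = \frac{\mathcal{U}}{\mathcal{I}},$$
$$\mathcal{P} = \frac{1}{\mu}\sqrt{Q^2 + U^2}, \qquad \phi_0 = \frac{1}{2}\arctan\frac{U}{Q},$$
where $Q$ and $U$ are the $\mathcal{Q}$ and $\mathcal{U}$ Stokes parameters normalized by $\mathcal{I}$, and μ is the modulation factor defined above.

Following a similar approach, the expected variances of the Stokes parameters can be determined:
$$\sigma_Q^2 = \frac{W_2}{\mathcal{I}^2}\left(2 - Q^2\right), \qquad \sigma_U^2 = \frac{W_2}{\mathcal{I}^2}\left(2 - U^2\right),$$
where we introduced the quantity
$$W_2 = \sum_{k=1}^{N} w_k^2.$$
From the previous equations, we can determine the reconstructed Stokes parameters, to be compared with results from theory or with other experimental values, as $Q_r = Q/\mu$ and $U_r = U/\mu$. For these values we obtain the following uncertainties in the case of a large data set:
$$\sigma_{Q_r} = \sqrt{\frac{W_2}{\mathcal{I}^2}\left(\frac{2}{\mu^2} - Q_r^2\right)}, \qquad \sigma_{U_r} = \sqrt{\frac{W_2}{\mathcal{I}^2}\left(\frac{2}{\mu^2} - U_r^2\right)}.$$
Following the same approach, the covariance of $Q$ and $U$ can be obtained:
$$\mathrm{cov}(Q, U) = -\frac{W_2}{\mathcal{I}^2}\,Q\,U,$$
from which we obtain
$$\mathrm{cov}(Q_r, U_r) = -\frac{W_2}{\mathcal{I}^2}\,Q_r\,U_r.$$
From these relations one can see that a correlation between $U$ and $Q$ is present for high values of the polarization degree and of μ, while for low polarization degree the correlation goes to zero.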
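As an illustration of the estimators summarized above, the following is a minimal Python sketch of a weighted Stokes analysis. It assumes the factor-2 convention adopted in this section and the large-sample variance $(W_2/\mathcal{I}^2)(2 - Q^2)$, which holds when the weights are independent of the emission angle. The function name `weighted_stokes` and the dictionary keys are hypothetical and are not part of any IXPE software.

```python
import math

def weighted_stokes(phis, weights, mu):
    """Weighted polarization estimate from emission angles (radians).

    Assumes the factor-2 convention of the text:
    q_k = 2 w_k cos(2 phi_k), u_k = 2 w_k sin(2 phi_k).
    `mu` is the modulation factor of the (weighted) sample.
    """
    i_sum = sum(weights)                      # I: sum of the weights
    w2 = sum(w * w for w in weights)          # W_2: sum of squared weights
    q = sum(2.0 * w * math.cos(2.0 * p) for p, w in zip(phis, weights)) / i_sum
    u = sum(2.0 * w * math.sin(2.0 * p) for p, w in zip(phis, weights)) / i_sum
    n_eff = i_sum * i_sum / w2                # effective number of counts
    return {
        "Q": q,
        "U": u,
        "n_eff": n_eff,
        "pol_deg": math.hypot(q, u) / mu,     # P = sqrt(Q^2 + U^2) / mu
        "pol_angle": 0.5 * math.atan2(u, q),  # phi_0 in radians
        # Large-sample uncertainties; max() guards against small-sample
        # fluctuations that would make the argument negative.
        "sigma_Q": math.sqrt(max(0.0, 2.0 - q * q) / n_eff),
        "sigma_U": math.sqrt(max(0.0, 2.0 - u * u) / n_eff),
    }
```

With all weights equal to 1 the function reduces to the unweighted analysis and `n_eff` equals the number of events.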
Kislat et al. (2015) derived a joint posterior distribution for the polarization degree and angle estimators that includes the correlation between the Stokes parameters. In that distribution, $\hat{\mathcal{P}}$ and $\hat{\phi}_0$ denote the measured values of the polarization degree and angle, respectively, and the "effective" number of counts of the measurement is defined as $N_{\rm eff} = \mathcal{I}^2/W_2$, i.e., the square of the sum of the weights divided by the sum of the squared weights. When the correlation term is negligible, this distribution reduces to the Rice distribution, as shown by Vaillancourt (2006). In cases with $\mathcal{P}$ and μ not close to 0, the Gaussian approximation for the marginalized errors can be used, as stated in Kislat et al. (2015):
$$\sigma_{\mathcal{P}} \simeq \sqrt{\frac{2 - \hat{\mathcal{P}}^2\mu^2}{(N_{\rm eff} - 1)\,\mu^2}}, \qquad \sigma_{\phi_0} \simeq \frac{1}{\hat{\mathcal{P}}\,\mu\,\sqrt{2\,(N_{\rm eff} - 1)}}.$$

An older, alternative analysis method is based on event selection, the so-called "standard cuts" (Muleri et al. 2016). It applies a two-step selection that removes 20% of the events. First, an energy cut is applied: the energy spectrum is fitted with a Gaussian, and events more than 3σ from the peak center are removed. The events surviving the energy selection are then ordered by their "eccentricity", and the lowest-eccentricity tracks are removed up to a threshold such that 20% of the initial events are removed in total, including those removed by the energy selection.
This method, like every kind of cut or selection analysis, is a special case of a weighted analysis, in that data selection is a simple form of weighting where good events have $w_k = 1$ and bad ones have $w_k = 0$. In this case, the effective number of events is equal to the number of good ones, while the bad ones are removed from the analysis.
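The equivalence between cuts and binary weights can be checked with a short Python sketch. It is a simplified illustration: the energy selection of the "standard cuts" is omitted, and events are ranked by a single generic quality value. The names `cut_weights` and `n_effective` are hypothetical.

```python
def n_effective(weights):
    """Effective number of counts: (sum of w)^2 / (sum of w^2)."""
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    return s1 * s1 / s2 if s2 > 0 else 0.0

def cut_weights(quality, keep_fraction=0.8):
    """Binary weights for a cut analysis.

    Events are ranked by `quality` (higher is better); the best
    `keep_fraction` of them get w = 1, all the others get w = 0.
    Simplification: no energy cut is applied before the ranking.
    """
    n_keep = int(round(keep_fraction * len(quality)))
    order = sorted(range(len(quality)), key=lambda k: quality[k], reverse=True)
    kept = set(order[:n_keep])
    return [1.0 if k in kept else 0.0 for k in range(len(quality))]
```

For binary weights the sum of the weights and the sum of their squares coincide, so `n_effective` returns exactly the number of retained events.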
"Standard cuts" were developed for the analysis of monochromatic laboratory sources, and they cannot be applied to observations of astrophysical sources with continuum spectra, which will be carried out with IXPE. Since an alternative approach is needed to obtain a suitable sensitivity, we propose this weighted analysis. It is worth noting that, when weights are introduced, $N_{\rm eff} < N$, so that a better sensitivity is achieved only if the decrease in the square root of the effective number of counts is more than compensated by the increase of the modulation factor.

Determination of Event-by-event Best-weight Parameter
Photoelectron tracks collected by the GPD on board IXPE are analyzed by a custom algorithm to extract relevant information. In this algorithm the photoelectron track direction is determined on the basis of a two-step moment analysis that has been refined over the years (Bellazzini et al. 2003a, 2003b; Fabiani & Muleri 2014; Baldini et al. 2021). Here we summarize the algorithm to provide context for the following analysis. In the first step the barycenter of the charge distribution (such as the one in Figure 1) is calculated as
$$x_B = \frac{\sum_i q_i x_i}{\sum_i q_i}, \qquad y_B = \frac{\sum_i q_i y_i}{\sum_i q_i},$$
where $x_i$ and $y_i$ are the coordinates of the $i$th pixel in the image, and $q_i$ is the charge collected in it. Centering the track in $(x_B, y_B)$, the second moment of the charge distribution is
$$M_2(\psi) = \frac{\sum_i q_i\left[(x_i - x_B)\cos\psi + (y_i - y_B)\sin\psi\right]^2}{\sum_i q_i},$$
where ψ is the angle between the track direction and the x axis. This is used to determine the axes along which the charge distribution has maximal or minimal extension: the angle that maximizes or minimizes $M_2$ is obtained by imposing $dM_2/d\psi = 0$. The maximum and minimum angles ($\psi_{\max}$ and $\psi_{\min}$, respectively) are 90° apart, and they allow one to determine the longitudinal and transverse second moments (corresponding to track length (TL) and track width (TW), respectively). At this point $\psi_{\max}$ defines the angle between the track direction and the x axis. The third moment, $M_3$, computed along $\psi_{\max}$ is used to determine on which side of $(x_B, y_B)$ the X-ray impact point lies. The impact point is in the less dense part of the track, because the photoelectron loses more and more energy per unit length as it slows down, forming the so-called Bragg peak at the end of the track. In Figure 1 one can see that the Bragg peak region is denser than the initial part of the track; however, the initial photoelectron emission direction is lost in that part of the track because of the photoelectron scattering within the gas cell. In the second step the distance between each pixel and the $(x_B, y_B)$ coordinates is estimated in $M_2$ units.
Pixels whose distance has a sign different from that of $M_3(\psi_{\max})$ are removed; pixels whose distance lies within a horseshoe region centered at $(x_B, y_B)$, with minimum and maximum radii of 1.5 and 3.5, respectively, are selected and used to determine the impact point coordinates, as explained in the following. This method allows one to exclude pixels due to the Bragg peak or to Auger electrons. To take into account the energy loss in the gas cell, pixel charges are weighted by $w = e^{-D/w_0}$, where $D$ is the distance from the impact point and $w_0 = 0.05$. The weighted charges in each pixel are then used to estimate $M_2$ in the horseshoe region and to obtain the final photoelectron track direction.
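The first step of the moment analysis described above can be sketched in a few lines of Python. This is a minimal illustration and not the IXPE reconstruction code: it assumes a track given as a list of `(x, y, charge)` tuples, uses the closed-form solution of $dM_2/d\psi = 0$, and omits the clustering stage and the second (horseshoe) step. The function name `moment_analysis` is hypothetical.

```python
import math

def moment_analysis(pixels):
    """First-step moment analysis of a track given as (x, y, charge) tuples."""
    q_tot = sum(q for _, _, q in pixels)
    # Charge-weighted barycenter (x_B, y_B).
    x_b = sum(q * x for x, _, q in pixels) / q_tot
    y_b = sum(q * y for _, y, q in pixels) / q_tot
    # Central second-order sums of the charge distribution.
    sxx = sum(q * (x - x_b) ** 2 for x, _, q in pixels) / q_tot
    syy = sum(q * (y - y_b) ** 2 for _, y, q in pixels) / q_tot
    sxy = sum(q * (x - x_b) * (y - y_b) for x, y, q in pixels) / q_tot
    # M2(psi) = sxx cos^2 + 2 sxy cos sin + syy sin^2; setting dM2/dpsi = 0
    # gives tan(2 psi) = 2 sxy / (sxx - syy). atan2 selects the maximum.
    psi_max = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(psi_max), math.sin(psi_max)
    m2_long = sxx * c * c + 2.0 * sxy * c * s + syy * s * s   # along psi_max
    m2_trans = sxx * s * s - 2.0 * sxy * c * s + syy * c * c  # along psi_min
    # Third moment along the major axis: its sign gives the skew direction.
    m3 = sum(q * ((x - x_b) * c + (y - y_b) * s) ** 3
             for x, y, q in pixels) / q_tot
    return {"barycenter": (x_b, y_b), "psi_max": psi_max,
            "m2_long": m2_long, "m2_trans": m2_trans, "m3": m3}
```

In this sketch a negative `m3` means that the sparse tail of the charge distribution extends toward negative coordinates along the major axis, which is the side where the impact point is expected.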
In Figure 2 we report the horseshoe region and the directions determined in the first and second steps, together with the true direction obtained from a Monte Carlo simulation. In cases where the initial part of the track can be clearly distinguished from the Bragg peak region, the initial direction of the photoelectron can be reconstructed with better accuracy, as the tracks are of better quality (Figure 2, right). In cases such as Figure 2 (left), where the Bragg peak is not well identified, the track direction is not well determined.
We compared several quantities calculated by the GPD track-analysis algorithm to obtain an estimator of the track quality, to be used as a weight in the subsequent analysis. To relate these parameters to the response to polarization, we used the IXPESIM Monte Carlo tool, developed by the IXPE team (L. Baldini et al. 2022, in preparation) and based on Geant4, to simulate a large data set. In particular, we generated $\simeq 10^7$ events produced by a 100% polarized source and distributed in energy according to a power-law spectrum with photon index −2. To obtain an approximate but quick evaluation, we assumed that the best quantity is the one that provides the smallest uncertainty on the degree of polarization, as defined in the previous section, which amounts to maximizing the modulation factor.
Marshall (2021) also demonstrated that, in a weighted analysis, the modulation factor itself is the best-weight parameter. This means that in the following we need to find the parameter that minimizes $\sigma_{\mathcal{P}}$ and that is also a good proxy for the modulation factor. We note that we could also have used the uncertainty on the angle (as in Equation (37) of Kislat et al. 2015), but our choice is consistent with the use of $\mathrm{MDP}_{99}$, the minimum detectable polarization at the 99% confidence level (and not of the error on the angle), as the primary figure of merit of a polarimeter. Moreover, both approaches converge, in practice, on the selection of the modulation factor as the driver.
The distributions of the parameters under study are shown in Figure 3. The data have been divided into 20 bins with the same number of events. For each bin, the modulation factor $\mu_j$ is calculated, as shown in Figure 3, and $\sigma^2$ is estimated on the whole data set; the results are reported in Table 1.
From this analysis it is evident that the TE provides the lowest $\sigma_{\mathcal{P}}$ value and is also linearly related to the modulation factor.
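The statement that the modulation factor is the best weight can be illustrated with a binned toy calculation in Python. The sketch below is not the estimator used for Table 1. It assumes equal-count bins, a low polarization degree (so that the uncertainty scales as $1/(\mu\sqrt{N_{\rm eff}})$), and a weighted modulation factor equal to the weighted mean of the per-bin values $\mu_j$. The function name `figure_of_merit` is hypothetical.

```python
def figure_of_merit(mus, weights):
    """Return mu_w^2 * N_eff / N for equal-count bins.

    mu_w    = sum(w_j mu_j) / sum(w_j)           (weighted modulation)
    N_eff/N = (sum w_j)^2 / (n_bins * sum w_j^2)  (effective-count fraction)
    The product simplifies to (sum w_j mu_j)^2 / (n_bins * sum w_j^2);
    a larger value means a smaller uncertainty on the polarization degree.
    """
    num = sum(w * m for w, m in zip(weights, mus)) ** 2
    den = len(mus) * sum(w * w for w in weights)
    return num / den
```

By the Cauchy-Schwarz inequality this quantity is maximized when the weights are proportional to the $\mu_j$, so a track property that tracks the modulation factor closely makes a good weight in this simplified picture.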
The results obtained above with Monte Carlo simulations are confirmed by a representative measurement carried out during the calibration of the IXPE focal-plane detector with a 3.69 keV polarized source, described in Muleri et al. (2022), as shown by the results in Table 2. The numerical values of Tables 1 and 2 refer to different data sets with a different energy spectrum and a different number of events, but they show the same best-weight parameter, namely α.
Both elongation and ellipticity provide good performance, but we choose the latter because its interval is conveniently between 0 and 1 and the modulation factor grows linearly with the ellipticity parameter following the function μ = 0.03 + 0.804α.
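A per-event weight based on the ellipticity can be sketched in Python as follows. The definition of α used here, $(\mathrm{TL} - \mathrm{TW})/(\mathrm{TL} + \mathrm{TW})$, is an assumption of this sketch: it is a normalized difference of the longitudinal and transverse second moments that lies in [0, 1], as the text requires, but the exact definition should be taken from the reconstruction software. Only the linear relation μ = 0.03 + 0.804α comes from the text; all function names are hypothetical.

```python
def ellipticity(track_length, track_width):
    """Assumed ellipticity: (TL - TW) / (TL + TW), in [0, 1] for TL >= TW >= 0."""
    total = track_length + track_width
    if total <= 0.0:
        return 0.0  # degenerate track: no preferred direction
    return (track_length - track_width) / total

def modulation_from_ellipticity(alpha):
    """Linear relation quoted in the text: mu = 0.03 + 0.804 * alpha."""
    return 0.03 + 0.804 * alpha

def track_weight(track_length, track_width, exponent=1.0):
    """Event weight alpha**exponent (exponent = 1 uses alpha itself)."""
    return ellipticity(track_length, track_width) ** exponent
```

A round track (TL equal to TW) gets α = 0 and therefore no weight, while a very elongated track approaches α = 1.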
It is worth noting that α is not a mere proxy of the energy. In Figure 4 the α distributions at 2, 4, and 6 keV have been estimated by Monte Carlo simulations. One can see that, for a monochromatic source, α is distributed over a wide range of values.
Since ellipticity is the best parameter for the weighted analysis, we tried to further improve the sensitivity by using as a weight a power of α, namely $\alpha^{\lambda}$. To choose the best value of λ, we applied it to the simulated data set and calculated the following quantities for different values of λ:
1. Modulation factor as a function of the energy.
2. Reduction of the effective-count fraction, $N_{\rm eff}/N$, as a function of the energy.
3. $\mathrm{MDP}_{99}$, calculated as the mean value of 10 Monte Carlo simulations. From this latter plot we conclude that $\mathrm{MDP}_{99}$ has a minimum for $\lambda \simeq 0.75$.
Note. The data set is obtained by Monte Carlo simulations for a power-law energy spectrum in the energy range 2-8 keV and 500,000 events.
The results are reported in Figures 5 and 6. As the value of λ increases, the weighting of the data becomes more and more important, with the effect that the number of effective counts decreases while the modulation factor increases. When the two are combined to obtain the sensitivity, expressed by $\mathrm{MDP}_{99}$, the best sensitivity (that is, the minimum $\mathrm{MDP}_{99}$) is obtained for λ = 0.75.
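The trade-off between modulation factor and effective counts as λ varies can be explored with a toy scan in Python. The sketch rests on several assumptions that are not in the text: a uniform α distribution as a placeholder for the simulated one, a weighted modulation factor equal to the weighted mean of μ(α) = 0.03 + 0.804α, and the standard background-free expression $\mathrm{MDP}_{99} = 4.29/(\mu\sqrt{N_{\rm eff}})$. Because the α distribution is a placeholder, the location of the minimum is not expected to reproduce the value found with the full simulation.

```python
import math

def mdp99_for_lambda(alphas, lam, a=0.03, b=0.804):
    """Toy MDP99 for per-event weights alpha**lam.

    Assumes mu(alpha) = a + b * alpha per event and the background-free
    expression MDP99 = 4.29 / (mu_w * sqrt(N_eff)).
    """
    weights = [x ** lam for x in alphas]
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    mu_w = sum(w * (a + b * x) for w, x in zip(weights, alphas)) / s1
    n_eff = s1 * s1 / s2
    return 4.29 / (mu_w * math.sqrt(n_eff))

# Placeholder ellipticity sample: 1000 values spread uniformly over (0, 1).
alphas = [(i + 0.5) / 1000 for i in range(1000)]
scan = {lam: mdp99_for_lambda(alphas, lam)
        for lam in (0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5)}
best = min(scan, key=scan.get)  # lambda with the lowest toy MDP99
```

Replacing `alphas` with the ellipticities of simulated or measured events turns the same scan into a quick check of the optimal exponent for a given data set.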

Application of the Weighted Analysis to IXPE Ground Calibration Data
In this section we apply the weighting approach to calibration data obtained with the IXPE detector units (DUs) during ground calibrations (Fabiani et al. 2021). As the results are consistent for all the IXPE DUs, we present those for DU-FM2 as a representative case. Data analysis has been performed for both unpolarized and polarized sources to derive the response to completely polarized and to unpolarized radiation. In particular, in this section we derive the modulation factor and the spurious modulation amplitude (the detector response to unpolarized radiation) by using the following analyses:
1. Unweighted/uncut: analysis without cuts or weights, where all events are included in the analysis.
2. Weighted/uncut: analysis without cuts but with an event-by-event weight equal to $\alpha^{0.75}$.
3. std_cut/unweighted: analysis with "standard cuts" to remove 20% of the data but without weights.
Figure 7(a) shows the spurious modulation as a function of energy as measured (dots). Note that the unweighted and weighted analyses give similar results, smaller than the ones obtained with "standard cuts." This component is removed in the analysis with an algorithm described by Rankin et al. (2022).
The modulation factor as a function of the energy is reported in Figure 7(b). The modulation factor obtained with weights or with "standard cuts" is higher than the value obtained with the unweighted analysis.
The guideline for these analysis approaches is always the maximization of the so-called "Factor of Quality" (Pacciani et al. 2003). This is the product of the modulation factor and the square root of the efficiency, which had already been used effectively for the selection of the design parameters of the detectors. Thus, to give an overall idea of the improvement in sensitivity, taking into account the $N_{\rm eff}$ contribution with experimental data, the quality factor is given by $\mu\sqrt{f_w}$, where $f_w$ is the fraction of events $N_{\rm eff}/N$ after the cut or weighted analysis. In Figure 7(c) the quality factor for each analysis is normalized to the unweighted case, showing that the weighted analysis provides the best improvement in sensitivity over the entire IXPE energy range.
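The normalized quality factor plotted in Figure 7(c) can be computed as in the following minimal Python sketch. The function names and the numerical values used in the example are hypothetical; they are not taken from the calibration data.

```python
import math

def quality_factor(mu, n_eff, n_total):
    """Quality factor mu * sqrt(f_w), with f_w = N_eff / N."""
    return mu * math.sqrt(n_eff / n_total)

def normalized_quality(mu, n_eff, n_total, mu_unweighted):
    """Quality factor relative to the unweighted, uncut case (f_w = 1)."""
    return quality_factor(mu, n_eff, n_total) / mu_unweighted

# Hypothetical example: a weighted analysis that raises mu from 0.30 to 0.40
# while keeping an effective-count fraction of 0.64.
gain = normalized_quality(0.40, 64.0, 100.0, 0.30)
```

A value of `gain` above 1 means that the increase in modulation factor outweighs the loss of effective counts, which is the condition for a better sensitivity.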

Sensitivity Improvement on a Simulated IXPE Observation
Above we evaluated the improvement achieved by applying the weighted analysis to a monochromatic beam, albeit for different energies. However IXPE (and any other polarimetry mission except the Bragg-diffraction polarimeters) integrates photons within a continuous energy band, in order to achieve adequate sensitivity. Depending upon the celestial source modeling or its intensity, it could be the whole energy range of the instrument. We want to know how the improvements detected at single energy values convert into an improved detection capability for a continuum source.
To estimate the improvement of the weighted analysis with respect to the unweighted one, we simulated an IXPE observation of a reference point-like source with a power-law spectrum (photon index −2), a flux of $10^{-11}$ erg cm$^{-2}$ s$^{-1}$ in the energy range 2-8 keV, and 10 days of integration time. In the simulation, we considered the GPD quantum efficiency at launch (following the model described by Baldini et al. 2021 and applying to each IXPE DU the expected dimethyl ether (DME) pressure at launch), the UV filter transparency, and the optics effective area as measured at NASA Marshall Space Flight Center during ground calibrations of the optics. We simulated 200 data sets in IXPESIM, obtaining $\mathrm{MDP}_{99}$ values in the range (5.2-5.7)% for the weighted analysis and (5.9-6.6)% for the unweighted one. Figure 8 shows the distribution of these values, which are compatible with a Gaussian distribution with mean values of 5.5% and 6.2% for the weighted and the unweighted analysis, respectively.
This kind of analysis cannot be performed with the "standard cuts" approach because the energy spectrum of the source is continuous. This result shows that the weighted analysis, using $\alpha^{0.75}$ as the optimal weight, allows a significant improvement in polarization sensitivity. The weighted analysis achieves a sensitivity within the IXPE scientific requirement that $\mathrm{MDP}_{99}$ not exceed 5.5% for the reference source.
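The mean values quoted above can be turned into an equivalent gain in exposure with a short Python calculation. The sketch assumes that $\mathrm{MDP}_{99}$ scales as the inverse square root of the number of collected counts, and hence of the observing time or of the collecting area, which holds for a source-dominated observation. The function names are hypothetical.

```python
def sensitivity_gain(mdp_unweighted, mdp_weighted):
    """Fractional improvement in sensitivity: MDP ratio minus one."""
    return mdp_unweighted / mdp_weighted - 1.0

def equivalent_exposure_ratio(mdp_unweighted, mdp_weighted):
    """Exposure factor the unweighted analysis needs to match the weighted MDP.

    Assumes MDP99 proportional to 1 / sqrt(exposure).
    """
    return (mdp_unweighted / mdp_weighted) ** 2

gain = sensitivity_gain(6.2, 5.5)             # about 0.13
ratio = equivalent_exposure_ratio(6.2, 5.5)   # about 1.27
```

Under this scaling, an unweighted analysis would need roughly 27% more exposure to reach the same $\mathrm{MDP}_{99}$, which is comparable to the factor 4/3 that a fourth identical telescope would add to the existing three.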

Conclusions
In this paper we applied the weighted analysis described in Kislat et al. (2015) to data collected by the GPDs, the focal-plane X-ray polarimeters on board IXPE. Starting from quantities calculated using the dedicated track reconstruction analysis developed for the GPD, we identified a best-weight parameter for both simulated and real data. The weighted analysis improves the IXPE sensitivity.
In this analysis we showed that the modulation factor is affected significantly by weights and that its value depends upon the "strength" of the weighting. However, the gain in sensitivity, expressed as a reduction of $\mathrm{MDP}_{99}$, is partially offset by a reduction in the effective number of events ($N_{\rm eff}$). Among different choices for the weighting, we used the track ellipticity, which provides the best sensitivity and has the advantage that its range is [0, 1].
With this choice, when the analysis is applied to calibration data we conclude the following:
1. Application of weights reduces spurious modulation with respect to values obtained with the "standard cuts" used in IXPE calibration analysis.
2. The modulation factor is larger than the value obtained with the unweighted/uncut analysis.
We evaluated, by means of Monte Carlo simulations, the improvement in sensitivity for a source with a Crab-like spectrum. Our analysis showed that, with respect to an unweighted scheme, the weighted scheme improves the sensitivity by 13%. This translates into a 30% reduction in observing time, which is like adding a fourth telescope to the three already available. The weighted scheme for track analysis is now considered the baseline for IXPE data analysis.
IXPE data are analyzed on the ground, as briefly explained in Soffitta et al. (2021). Binary telemetry files are converted into the FITS standard format, containing all the information provided by the instrument and other relevant information from the spacecraft. In particular, they contain track images that are analyzed to estimate the Stokes parameters and $\alpha^{0.75}$, which is the best weight available, as explained in this paper. IXPE data will be publicly released to the scientific community in a format that includes, for each event, a weight to apply, given by $\alpha^{0.75}$.