Testing Shear Recovery with Field Distortion

Jun Zhang; Fuyu Dong; Hekun Li; Xiangchong Li; Yingke Li; Dezi Liu; Wentao Luo; Liping Fu; Guoliang Li; Zuhui Fan

doi:10.3847/1538-4357/ab1080

1. Introduction

Recent weak-lensing measurements have indicated marginal tensions with results from the cosmic microwave background (Hildebrandt et al. 2017; Köhlinger et al. 2017; Abbott et al. 2018; Troxel et al. 2018; Hikage et al. 2019) and galaxy formation models (Leauthaud et al. 2017). It would be extremely exciting if these discrepancies point to new physics beyond the concordance ΛCDM cosmological model, e.g., dynamical dark energy or modified gravity theories (Amendola et al. 2018). There are, however, also concerns that weak-lensing measurements still contain systematic errors that have not been identified and corrected, regarding either shape measurement or photo-z errors (Efstathiou & Lemos 2018).

To establish weak lensing as a robust cosmological probe, ways of testing the accuracy and consistency of the measurements are indispensable. Simulations are typically used to calibrate the shear recovery accuracy (Mandelbaum et al. 2015), but image processing at different stages could all introduce problems that are not easily identified and therefore not included in the simulations.

There are also ways of testing the consistency of shear recovery based on real data. For example, cross-correlation between the point-spread function (PSF) shape and the galaxy shape is used to find out if there are residual anisotropic PSF effects that are not removed (Kaiser et al. 1995; Fischer et al. 2000), and E/B mode decomposition is used to check if the lensing signal has a gravitational origin (Crittenden et al. 2002; Schneider et al. 2002). Nevertheless, we find that these tests are not very sensitive to the multiplicative biases that are typically required to correct for the sensitivity of their shear estimator to the underlying shear signal. Given that multiplicative biases are directly degenerate with the amplitudes of the lensing signals and the shear–shear correlations, and thereby with several key cosmological parameters such as σ₈ and Ω_m, it is desirable to calibrate the multiplicative biases based on real data as an alternative to the existing methods (Vallinotto et al. 2011; Zhang 2015). The purpose of this work is to propose a solution with the help of field distortion, an effect that exists universally in optical systems.

The effect of field distortion modifies the galaxy image in a way similar to that of lensing. The amplitude and direction of this distortion can be calculated using the parameters derived from astrometric calibration on one hand, and recovered from the galaxy images on the other hand. We therefore have a natural way of testing the accuracy of shear recovery directly using real data, without requiring simulations or external data sets.

In Section 2, we introduce the concept of field distortion and the way of deriving it using astrometric parameters. In Section 3, we give a brief introduction of the Fourier_Quad shear measurement method (Zhang et al. 2015). We use CFHTLenS data (Heymans et al. 2012; Erben et al. 2013) to demonstrate our idea. The details of our image processing of the CFHTLenS data are given in Section 4. It highlights a few key steps in our pipeline. Our main results are shown in Section 5. Finally, in Section 6, we give brief conclusions as well as discussions regarding the current status of our pipeline and future prospects.

2. Field Distortion

In geometric optics, field distortion is a type of optical aberration. In general, it is the result of deviation from global rectilinear projection, causing distortions of image shapes. Here in this paper, we are interested in the distortion at any local position of the CCD, i.e., the vicinity of a galaxy. The local distortion can always be treated as a linear mapping between the source plane and the CCD plane. If one thinks of this effect in terms of the displacements of point sources, it is straightforward to draw an analogy between the field distortion and the weak-lensing effect. In other words, field distortion causes a shear signal on top of the astrophysical one. It is our purpose to utilize this signal to test the accuracy of shear measurement.

In this work, we adopt the TPV world coordinate system (WCS; Calabretta & Greisen 2002), which builds on the standard tangent plane projection with a general polynomial distortion correction. This is the convention used in the CFHTLenS data processing. Our notations below can be easily extended to other WCS conventions. Let us denote the standard coordinate of the source on the tangent plane of the celestial sphere by (ξ, η) and its position on the CCD grid by (x, y). They are related through the following formulae in the TPV convention:

$\begin{eqnarray}\begin{array}{rcl}\xi & = & {f}_{\xi }(X,Y)\\ \eta & = & {f}_{\eta }(X,Y)\\ X(x,y) & = & {\mathrm{CD}}_{1}^{1}(x-\mathrm{CRPIX}1)+{\mathrm{CD}}_{2}^{1}(y-\mathrm{CRPIX}2)\\ Y(x,y) & = & {\mathrm{CD}}_{1}^{2}(x-\mathrm{CRPIX}1)+{\mathrm{CD}}_{2}^{2}(y-\mathrm{CRPIX}2),\end{array}\end{eqnarray} \tag{ 1 }$

in which the coefficients ${{\rm{CD}}}_{j}^{i}$ and CRPIX(1, 2) take care of the basic linear transformation, and the distortion functions f_ξ and f_η are defined in polynomial forms as

$\begin{eqnarray}&&\begin{array}{c}\begin{array}{l}{f}_{\xi }(X,Y)\\ \,=\,{{\rm{P}}{\rm{V}}}_{0}^{1}+{{\rm{P}}{\rm{V}}}_{1}^{1}X+{{\rm{P}}{\rm{V}}}_{2}^{1}Y+{{\rm{P}}{\rm{V}}}_{3}^{1}R\\ \,\,\,+\,{{\rm{P}}{\rm{V}}}_{4}^{1}{X}^{2}+{{\rm{P}}{\rm{V}}}_{5}^{1}XY+{{\rm{P}}{\rm{V}}}_{6}^{1}{Y}^{2}+{{\rm{P}}{\rm{V}}}_{7}^{1}{X}^{3}\\ \,\,\,+\,{{\rm{P}}{\rm{V}}}_{8}^{1}{X}^{2}Y+{{\rm{P}}{\rm{V}}}_{9}^{1}{XY}^{2}+{{\rm{P}}{\rm{V}}}_{10}^{1}{Y}^{3}\\ \,\,\,+\,{{\rm{P}}{\rm{V}}}_{11}^{1}{R}^{3}+...\\ \,{f}_{\eta }(X,Y)\\ ={{\rm{P}}{\rm{V}}}_{0}^{2}+{{\rm{P}}{\rm{V}}}_{1}^{2}Y+{{\rm{P}}{\rm{V}}}_{2}^{2}X+{{\rm{P}}{\rm{V}}}_{3}^{2}R\\ \,\,\,+\,{{\rm{P}}{\rm{V}}}_{4}^{2}{Y}^{2}+{{\rm{P}}{\rm{V}}}_{5}^{2}XY+{{\rm{P}}{\rm{V}}}_{6}^{2}{X}^{2}+{{\rm{P}}{\rm{V}}}_{7}^{2}{Y}^{3}\\ \,\,\,+\,{{\rm{P}}{\rm{V}}}_{8}^{2}{XY}^{2}+{{\rm{P}}{\rm{V}}}_{9}^{2}{X}^{2}Y+{{\rm{P}}{\rm{V}}}_{10}^{2}{X}^{3}+{{\rm{P}}{\rm{V}}}_{11}^{2}{R}^{3}+...\end{array}\end{array}\end{eqnarray} \tag{ 2 }$

with $R=\sqrt{{X}^{2}+{Y}^{2}}$ . Equation (2) can in principle include many more higher order polynomial terms. The CFHTLenS team includes only the coefficients of the form ${\mathrm{PV}}_{i}^{1,2}$ , with i = 0, 1, 2, 4, 5, 6, 7, 8, 9, 10.

One can directly derive the shape distortion matrix from the above formulae. As in weak lensing, the matrix can be written as

$\begin{eqnarray}&&\left(\begin{array}{c}d\xi \\ d\eta \end{array}\right)=\left(\begin{array}{cc}{\partial }_{x}\xi & {\partial }_{y}\xi \\ {\partial }_{x}\eta & {\partial }_{y}\eta \end{array}\right)\left(\begin{array}{c}{dx}\\ {dy}\end{array}\right).\end{eqnarray} \tag{ 3 }$

The distortion matrix can usually be decomposed into four parts: an overall scaling factor M (or magnification), two shear components (g₁, g₂), and a rotation angle θ. We can rewrite the distortion matrix in the following form (g₁, g₂, θ ≪ 1):

$\begin{eqnarray}\left(\begin{array}{cc}{\partial }_{x}\xi & {\partial }_{y}\xi \\ {\partial }_{x}\eta & {\partial }_{y}\eta \end{array}\right)=M\left(\begin{array}{cc}1-{g}_{1} & -{g}_{2}+\theta \\ -{g}_{2}-\theta & 1+{g}_{1}\end{array}\right).\end{eqnarray} \tag{ 4 }$

It is then straightforward to derive the field-distortion-induced shear (called FDS hereafter) as

$\begin{eqnarray}\begin{array}{rcl}{g}_{1}(\mathrm{FD}) & = & ({\partial }_{y}\eta -{\partial }_{x}\xi )/({\partial }_{y}\eta +{\partial }_{x}\xi )\\ {g}_{2}(\mathrm{FD}) & = & -({\partial }_{x}\eta +{\partial }_{y}\xi )/({\partial }_{y}\eta +{\partial }_{x}\xi ).\end{array}\end{eqnarray} \tag{ 5 }$

Figure 1 shows a typical distribution of the FDS on the 36 CCDs of MegaCam at CFHT.¹⁰ For the purpose of illustration, the FDS at each location is magnified by a factor of 30 and represented by an ellipse. The parameters of the field distortion are derived from astrometric calibration done with the THELI software (Erben et al. 2005; Schirmer 2013) using the reference catalog from either 2MASS or SDSS. The amplitude of the FDS here is typically a few times 0.001, similar to the weak-lensing signal. The purpose of this work is to find out if one can recover the FDS signals with the galaxy images as a way of checking the accuracy of the shear measurement method. Note that as the FDS only depends on the location of the galaxy on the CCD, the cosmological weak-lensing signal can thus be regarded as a random noise, similar to the galaxy intrinsic shape variation.

It is worth noting that in the Lensfit pipeline of CFHTLenS (Miller et al. 2007, 2013), the field-distortion signal is added to the assumed model of the galaxy on each exposure before performing the joint fitting for the galaxy parameters/ellipticities. This is for the purpose of increasing the significance of the best-fitting model, thereby reducing the errors on the galaxy ellipticities. However, in doing so, the FDS information is lost. In this work, we use the shear measurement method developed in a series of works (Zhang 2008, 2010, 2011; Zhang & Komatsu 2011; Zhang et al. 2015, 2017). It is named "Fourier_Quad" in the GREAT3 project (Mandelbaum et al. 2015). Our shear measurements are performed on individual exposures; therefore, the FDS information can be completely retained. In the rest of the paper, we introduce the Fourier_Quad method and the test it with FDS using CFHTLenS images.

3. The Fourier_Quad Method

In Fourier_Quad, the shear estimators are defined on the 2D power spectrum of the galaxy image in Fourier space:

$\begin{eqnarray}\begin{array}{rcl}{G}_{1} & = & -\displaystyle \frac{1}{2}\displaystyle \int {d}^{2}{\boldsymbol{k}}({k}_{x}^{2}-{k}_{y}^{2})T({\boldsymbol{k}})M({\boldsymbol{k}})\\ {G}_{2} & = & -\displaystyle \int {d}^{2}{\boldsymbol{k}}{k}_{x}{k}_{y}T({\boldsymbol{k}})M({\boldsymbol{k}})\\ N & = & \displaystyle \int {d}^{2}{\boldsymbol{k}}\left[{k}^{2}-\displaystyle \frac{{\beta }^{2}}{2}{k}^{4}\right]T({\boldsymbol{k}})M({\boldsymbol{k}}),\end{array}\end{eqnarray} \tag{ 6 }$

where ${\boldsymbol{k}}$ is the wave vector. $T({\boldsymbol{k}})$ is given by ${\left|{\widetilde{W}}_{\beta }({\boldsymbol{k}})\right|}^{2}/{\left|{\widetilde{W}}_{\mathrm{PSF}}({\boldsymbol{k}})\right|}^{2}$ , i.e., the ratio between the power spectrum of a 2D isotropic Gaussian function W_β and that of the PSF W_PSF. This factor is used to convert the form of the PSF to the isotropic Gaussian function for the purpose of correcting the PSF effect in a model-independent way. The Gaussian function W_β is defined as

$\begin{eqnarray}&&{W}_{\beta }({\boldsymbol{x}})=\displaystyle \frac{1}{2\pi {\beta }^{2}}\exp \left(-\displaystyle \frac{{\left|{\boldsymbol{x}}\right|}^{2}}{2{\beta }^{2}}\right),\end{eqnarray} \tag{ 7 }$

in which β should be somewhat larger than the scale radius of the original PSF to avoid singularities in the conversion. $M({\boldsymbol{k}})$ is the galaxy power spectrum, but modified to take into account the corrections due to the background and the Poisson noise:

$\begin{eqnarray}\begin{array}{rcl}M({\boldsymbol{k}}) & = & {\left|{\widetilde{f}}^{S}({\boldsymbol{k}})\right|}^{2}-{F}^{S}-{\left|{\widetilde{f}}^{B}({\boldsymbol{k}})\right|}^{2}+{F}^{B}\\ {F}^{S} & = & \displaystyle \frac{{\displaystyle \int }_{| {\boldsymbol{k}}| \gt {k}_{c}}{d}^{2}{\boldsymbol{k}}{\left|{\widetilde{f}}^{S}({\boldsymbol{k}})\right|}^{2}}{{\displaystyle \int }_{| {\boldsymbol{k}}| \gt {k}_{c}}{d}^{2}{\boldsymbol{k}}},\\ {F}^{B} & = & \displaystyle \frac{{\displaystyle \int }_{| {\boldsymbol{k}}| \gt {k}_{c}}{d}^{2}{\boldsymbol{k}}{\left|{\widetilde{f}}^{B}({\boldsymbol{k}})\right|}^{2}}{{\displaystyle \int }_{| {\boldsymbol{k}}| \gt {k}_{c}}{d}^{2}{\boldsymbol{k}}}.\end{array}\end{eqnarray} \tag{ 8 }$

${\widetilde{f}}^{S}({\boldsymbol{k}})$ and ${\widetilde{f}}^{B}({\boldsymbol{k}})$ are the Fourier transformations of the galaxy image and a neighboring image of background noise, respectively. The terms F^S and F^B serve as good estimates of the Poisson noise power spectra on the source and background images, respectively, given that the critical wave number k_c is large enough to avoid the regions dominated by the source power. It has been shown in Zhang et al. (2015) that the ensemble averages of the shear estimators defined above recover the shear values to second order in accuracy (assuming that the intrinsic galaxy images are statistically isotropic), i.e.,

$\begin{eqnarray}&&\displaystyle \frac{\left\langle {G}_{1}\right\rangle }{\left\langle N\right\rangle }={g}_{1}+O({g}_{1,2}^{3}),\,\displaystyle \frac{\left\langle {G}_{2}\right\rangle }{\left\langle N\right\rangle }={g}_{2}+O({g}_{1,2}^{3}).\end{eqnarray} \tag{ 9 }$

Note that the ensemble averages are taken for G₁, G₂, and N separately (Zhang & Komatsu 2011).

A more refined way of deriving the shear signal from an ensemble of shear estimators is given in Zhang et al. (2017). The method is called probability distribution function PDF-SYM, in which the best estimate of the shear signal $({\hat{g}}_{1},{\hat{g}}_{2})$ is determined by symmetrizing the probability distribution function (PDF) of ${G}_{1}-{\hat{g}}_{1}(N+U)$ and ${G}_{2}-{\hat{g}}_{2}(N-U)$ on the positive and negative sides of zero. For this purpose, two more terms should be defined to properly take into account the parity properties of the shear estimators:

$\begin{eqnarray}\begin{array}{rcl}U & = & -\displaystyle \frac{{\beta }^{2}}{2}\displaystyle \int {d}^{2}{\boldsymbol{k}}\left({k}_{x}^{4}-6{k}_{x}^{2}{k}_{y}^{2}+{k}_{y}^{4}\right)T({\boldsymbol{k}})M({\boldsymbol{k}})\\ V & = & -2{\beta }^{2}\displaystyle \int {d}^{2}{\boldsymbol{k}}\left({k}_{x}^{3}{k}_{y}-{k}_{x}{k}_{y}^{3}\right)T({\boldsymbol{k}})M({\boldsymbol{k}}).\end{array}\end{eqnarray} \tag{ 10 }$

The term V is kept to transform U in case of coordinate rotation in shear measurement. It turns out that the PDF-SYM method allows the statistical error of the shear measurement to approach the Cramer–Rao bound, i.e., the lower limit in theory. We adopt the PDF-SYM method for presenting the shear measurement results in this paper.

Note that in astrophysical applications, we need to remove the field-distortion effect from the shear estimators. For this purpose, we only need to replace (G₁, G₂) with ( ${G}_{1}^{{\prime} }$ , ${G}_{2}^{{\prime} }$ ), which are defined as

$\begin{eqnarray}\begin{array}{rcl}{G}_{1}^{{\prime} } & = & {G}_{1}-{g}_{1}(\mathrm{FD})(N+U)-{g}_{2}(\mathrm{FD})V\\ {G}_{2}^{{\prime} } & = & {G}_{2}-{g}_{2}(\mathrm{FD})(N-U)-{g}_{1}(\mathrm{FD})V.\end{array}\end{eqnarray} \tag{ 11 }$

4. Image Processing with the CFHTLenS Data

Our imaging data are from the wide part of CFHTLS observed in 2003 and 2008 with MegaPrime/MegaCam (Erben et al. 2013). MegaCam is an optical multichip instrument with a 9 × 4 CCD array, 0 farcs 187 pixel scale, and ∼1° × 1° field of view. There are four fields (W1, 2, 3, and 4) in the survey, containing 171 pointings in total, covering about 154 deg² sky area. There are imaging data for five filters—u*, g', r', i', z'—and we use the i'-band data for shear measurement in this paper.¹¹ For the i'-band, each pointing typically contains seven exposures, each of which lasts for 615 s. The limiting magnitude in the i'-band reaches around 24.5 (AB mag). The images we use are all preprocessed with the Elixir software,¹² which takes care of the removal of instrumental signatures (Magnier & Cuillandre 2004). These images are available at the Canadian Astronomical Data Center (CADC).¹³

Unlike the Lensfit method used by the CFHTLenS team, our Fourier_Quad method is so far carried out on single exposures individually. Initially in this project, we relied on the THELI software to process the CCD images, mainly regarding background removal, defect detection, and astrometric calibration. We skipped the steps of photometric calibration and image co-addition. As our project evolved, we felt increasingly obliged to develop a self-contained image-processing pipeline, including all the necessary steps to ensure shear recovery accuracy, so that sources of shear biases from individual image-processing steps can potentially be targeted and corrected. It is worth noting that several crucial problems are indeed identified with the field-distortion test proposed in this paper. Our results in the rest of this paper are generated using our newly built image-processing pipeline, which is completely independent from the THELI software or any other astronomical software such as, e.g., Sextractor (Bertin & Arnouts 1996). It works directly on the Elixir-processed images and does background removal, defects identification, source identification, deblending of sources, astrometric calibration, PSF reconstruction, and shear measurement. Our code does not yet include photometric calibration; therefore, it does not measure magnitude or photometric redshift. As a remedy for this, our code can run shear measurements for sources whose positions in terms of (R.A., decl.) are from an external source catalog. Some of our results are made with the official CFHTLenS source catalog, as shown later in the paper. A complete description of our pipeline will be given in a separate paper (H. Li et al. 2019, in preparation). In the following sections, we show several key highlights.

4.1. Background Removal

Accurate modeling of the background is important not only for source identification, but also for shear measurement. When the discrete sources are not overcrowded, which is the case for the individual exposures of CFHTLenS, a common way to find the local background level is by sorting a reasonable number of neighboring pixel values, followed by removing the outliers that are either due to bright sources or bad pixels, and finally equating the background as a linear combination of the median and the mean of the remaining pixels (Bertin & Arnouts 1996). The chip-wide background map is constructed with a bicubic-spline interpolation of a number of local background values, each of which is derived from a region of a reasonable mesh size. We adopt a similar approach in our pipeline for making the background map, but with a polynomial fitting for the chip-wide map. In doing so, we remove the outliers of the background values that are typically associated with extended diffraction patterns of bright stars and repeat the fitting until there are no more outliers. The distribution of the standard deviation σ(x, y) of the background is similarly generated.

The background-subtracted CCD images created with the above procedures generally perform well for source identification. Nevertheless, we find that removal of the chip-wide background map inevitably leaves spatially varying residuals, causing excess anisotropies in the source stamps and shear biases. In order to avoid this problem, we choose to cut out the source stamps directly from the original CCD image. For each source stamp (48 × 48), we locally model its background as a linear 2D function, with parameters evaluated using the pixels on the immediate neighborhood of the source stamp. We exclude the bad pixels and the bright pixels from other sources by sorting the pixel values. It turns out that source stamps created in this way perform very well in shear measurement.

It is useful to note that each of the 36 CCDs of Megacam is a combination of two equal-sized chips lining up along the short axis. Their gains sometimes have visible differences due to the imperfect flat-fielding process by Elixir. For avoiding systematic shear errors caused by this feature of the data, we simply remove sources that span both sides of a CCD.

4.2. Source Identification and Deblending

There are two stages in our source identification: (1) locating the source positions in the CCD and (2) determining the boundaries of the sources. If an external source catalog is provided, the first step is skipped. Otherwise, each source is identified as a number of connected pixels that are above nσ using the FOF (friends-of-friends) method. We also require every source FOF region to contain at least one pixel that is above mσ (m > n), and the number of pixels in each group should be larger than s. In the work described in this paper, we choose n = 2, m = 4, and s = 6. Note that unlike in Sextractor (Bertin & Arnouts 1996), we try not to involve a filter of a certain size (e.g., 3 × 3, 5 × 5) and shape in source finding, for the purpose of maximally avoiding potential selection effects. To avoid possible nonlinear-response regions of the CCD, we exclude sources with pixels brighter than 1/2 of the saturation level.

In the second step, each source is required to be well contained within a 48 × 48 stamp, meaning that the source pixels should stay inside a circular region of radius 24 − a (pixel) centered at the stamp center. a is typically less than 5. For other source regions or masked areas inside the stamp, we replace their pixel values with uncorrelated Gaussian random noises of variance σ²(x, y), where (x, y) refers to the central position of the stamp. If the masked area directly touches the region of the target source, we simply consider it to be a bad image and remove the source from the catalog. In Figure 2, we show the repaired source stamps as an example.

**Figure 2.** Examples showing how our pipeline removes contamination from neighboring sources. The upper panel shows the original appearance of stamps cut out from the CCD image, and the lower panel shows the same stamps after the removal of the neighboring contaminants to the central sources.
Download figure:
Standard image High-resolution image

In this work, blending effectively means that the FOF regions of two sources are overlapped. In Fourier_Quad, deblending is not necessary if the two blended sources are physically close, i.e., their redshifts are similar. This is because the accuracy of the Fourier_Quad method does not rely on the regularity of the source shape. If the blended sources have very different redshifts, we need to carefully interpret the shear results based on the frequency of this case, or simply remove them from the source list. However, it is not yet clear how accurately the photometric redshifts for two or more blended sources can be measured individually in practice. This is by itself a difficult problem and beyond the scope of this work. For the purpose of this paper, we simply treat each FOF group as a single source, as it does not affect the recovery of the FDS.

Finally, according to the method of Fourier_Quad, we need to find a nearby stamp of background noise for each source image. Currently, our choice is to locate several candidate stamps in the source neighborhood and pick the one whose maximal pixel value is the lowest. Once the noise stamp is chosen, any source pixels and bad pixels identified within it are replaced with randomly generated noise pixels, similar to what is done for the source stamp. Note that the undetected sources/blends are effectively treated as part of the background. As long as their contributions to the power spectra of the source stamps and the noise stamps are statistically similar, they should not cause shear biases in Fourier_Quad. A more careful study of this problem will be given in a future work.

4.3. Astrometric Calibration

Accurate astrometric calibration is crucial in weak lensing for at least two important reasons: (1) stacking of multi-exposure images and (2) calculation of the FDS, which should be subtracted from the shear estimator. The first point is not relevant in this paper, as our study here does not require stacking. The accuracy of the FDS distribution is our key concern.

Although the formalism of Section 2 has been a routine in deriving the astrometric solution, we find that the fitting procedures sometimes do not converge well, leading to a failed astrometric solution. This happens in the public code SCAMP (Bertin 2006) and THELI. This problem becomes more serious when the maximal order in the polynomial functions defined in Equation (2) is large (>3). This is unfortunate, as we need to increase the order of the polynomial functions to check the convergence/accuracy of the FDS distribution. For our purpose, we propose modifying the fitting procedure slightly: instead of performing the fitting on the projected plane (ξ, η), we define another set of parameters PU to carry out the fitting on the rescaled CCD plane $(\widetilde{X},\widetilde{Y})$ using the following formulae:

$\begin{eqnarray}\begin{array}{rcl}\widetilde{X}(\xi ,\eta ) & = & \xi +{\mathrm{PU}}_{3}^{1}r+{\mathrm{PU}}_{4}^{1}{\xi }^{2}+{\mathrm{PU}}_{5}^{1}\xi \eta +{\mathrm{PU}}_{6}^{1}{\eta }^{2}\\ & & +\ {\mathrm{PU}}_{7}^{1}{\xi }^{3}+{\mathrm{PU}}_{8}^{1}{\xi }^{2}\eta +{\mathrm{PU}}_{9}^{1}\xi {\eta }^{2}+{\mathrm{PU}}_{10}^{1}{\eta }^{3}\\ & & +\ {\mathrm{PU}}_{11}^{1}{r}^{3}+...\\ \widetilde{Y}(\xi ,\eta ) & = & \eta +{\mathrm{PU}}_{3}^{2}r+{\mathrm{PU}}_{4}^{2}{\eta }^{2}+{\mathrm{PU}}_{5}^{2}\xi \eta +{\mathrm{PU}}_{6}^{2}{\xi }^{2}\\ & & +\ {\mathrm{PU}}_{7}^{2}{\eta }^{3}+{\mathrm{PU}}_{8}^{2}\xi {\eta }^{2}+{\mathrm{PU}}_{9}^{2}{\xi }^{2}\eta +{\mathrm{PU}}_{10}^{2}{\xi }^{3}\\ & & +\ {\mathrm{PU}}_{11}^{2}{r}^{3}+...\end{array}\end{eqnarray} \tag{ 12 }$

where $r=\sqrt{{\xi }^{2}+{\eta }^{2}}$ . $(\widetilde{X},\widetilde{Y})$ can then be directly used to match the CCD position (X, Y) defined in Equation (1) through a simple χ² as

$\begin{eqnarray}\begin{array}{rcl}{\chi }^{2} & = & \displaystyle \sum _{i}{{\left[X({x}_{i},{y}_{i})-\widetilde{X}({\xi }_{i},{\eta }_{i})\right]}^{2}/\sigma }_{X}^{2}(i)\\ & & +\ \displaystyle \sum _{i}{{\left[Y({x}_{i},{y}_{i})-\widetilde{Y}({\xi }_{i},{\eta }_{i})\right]}^{2}/\sigma }_{Y}^{2}(i).\end{array}\end{eqnarray} \tag{ 13 }$

By minimizing this form of χ², the values of ${\mathrm{CD}}_{j}^{i}$ , CRPIX(1,2), and ${\mathrm{PU}}_{j}^{i}$ can be straightforwardly derived. Note that we have intentionally omitted the constant shifts and some linear terms in ξ and η in Equation (12) because they are degenerate with ${\mathrm{CD}}_{j}^{i}$ and CRPIX(1, 2). This degeneracy is quite obvious in our definition of χ². The values of (ξ_i, η_i) can be calculated from the R.A. and decl. of a source in the reference catalog, with the CRVAL values given by the header of the FITS file (Calabretta & Greisen 2002). For simplicity in this work, σ_X(i) and σ_Y(i) are chosen to be the same constant values and therefore omitted.

Another key ingredient in astrometric calibration is to match the sources in the reference catalog with those on the CCD. This is done before the fitting is carried out. We use the initial values of ${\mathrm{CD}}_{j}^{i}$ , CRPIX(1, 2) from the header of the fits file to transfer the source coordinates from the pixel space directly to the plane of (ξ, η) (by setting ${\mathrm{PU}}_{j}^{i}=0$ ) and use the given values of CRVAL(1, 2) to turn the source positions of the reference catalog into (ξ, η) as well. By plotting the differences between the position vectors in these two groups, we can identify a concentrated area, indicating that a number of source pairs share similar difference vectors. These sources are used to refine the values of ${\mathrm{CD}}_{j}^{i}$ and CRPIX(1, 2) (still keeping ${\mathrm{PU}}_{j}^{i}=0$ ). The above procedures are then repeated to improve and finalize the identification of source pairs, which are ultimately used in the polynomial fitting for the astrometric solution.

Figure 3 shows the accuracy of the calibrated source positions for one exposure in the field of w4m0m0. The calibrated positions are compared to those listed in the reference catalog, which is chosen to be Gaia DR2 (Gaia Collaboration et al. 2018) in this study. We achieve the results using polynomial fitting functions up to the third order, without involving terms proportional to r or r³. The rms values of the residuals in the four plots are about 0 farcs 05 and appear to be homogeneous over the whole sample. There are no visible systematic biases in the fitting.

In Figure 4, we show the results of the calibration for the same exposure using polynomial fitting functions up to the ninth order, still without the r²ⁿ⁻¹ terms (n = 1, 2, 3, 4, 5). The overall quality of the fitting does not change much, implying that the astrometric calibration converges well at the third-order polynomial fitting in our pipeline. We also find that including the r²ⁿ⁻¹ terms does not make much difference. These facts ensure the accuracy of the FDS measurement for the purpose of this work.

4.4. Star Selection for PSF Reconstruction

As already demonstrated in Lu et al. (2017), for the CFHTLenS data, a chip-wise pixel-by-pixel spatial interpolation of the PSF power spectra with the first- or second-order polynomial functions is the best way of reconstructing the PSF field for Fourier_Quad. Our discussion here therefore only focuses on how to select out bright stars (typically with signal-to-noise ratio S/N ≳ 100) from bright sources.

Among the source stamps with large S/N, we realize that a significant fraction of them are indeed stars. It is also useful to note that the power spectra of stars have very similar profiles within a chip and are more extended than those of galaxies. Using these facts, we adopt the following procedures to identify the stars:

1. For each source stamp with S/N larger than 100, we take its 2D power spectrum and then apply to it the noise reduction, noise correction, and normalization procedures. The details of the noise reduction part are introduced in Section 4.5. Noise correction refers to the removal of the systematic impacts from the background noise and the Poisson noise on the source power spectrum according to the description in Section 3. The result of these procedures is to make the power spectrum image the least affected by noise, and the power at k = 0 is unity.

2. A model PSF is constructed. Each of its pixel value is determined by sorting the corresponding pixel values from the power of all candidate sources and taking the lower bound of the top 25%.

3. The similarity between the images can be quantified by defining a distance D between the normalized power spectra of two source stamps as

$\begin{eqnarray}&&{D}_{{nm}}^{2}=\displaystyle \sum _{i=1}^{N}{\left({I}_{i}^{n}-{I}_{i}^{m}\right)}^{2}{k}_{i}^{\alpha },\end{eqnarray} \tag{ 14 }$

where ${I}_{i}^{(n,m)}$ refers to the value of the ith pixel in the power image of the nth (or mth) source stamp. N is the total number of pixels used to evaluate D. ${k}_{i}^{\alpha }$ is simply a power-law function of the wave number corresponding to the ith pixel, serving here as a weighting function. Because Fourier_Quad uses the quadrupole moments of the power to estimate shear, we choose α = 4 to enhance the importance of pixels at large wave numbers. On the other hand, because the outskirts of the power image are dominated by noise, we limit the summation in Equation (14) to be within the central 25 × 25 region;

4. Calculate the distance D_iM between the model PSF and the ith source power. We then sort the values of D_iM and take the upper bound of the bottom 25% to be the threshold D_c. Sources with D_iM > 3D_c are removed from the candidate list, as they are simply too different from the model PSF.

5. With the remaining candidate sources, we perform the pixel-by-pixel second-order polynomial fitting over the chip scale to construct the model PSF that varies with position. D_iM is then newly defined as the distance between the ith source and the model PSF at its position. We define σ as $\sqrt{{\sum }_{i}{D}_{{iM}}^{2}/{N}_{T}}$ , i.e., the rms of the distances, with N_T being the number of candidates. Sources with D_iM greater than 2σ are removed from the candidate list

6. Step 5 is repeated until the candidate list no longer changes, or the number of sources in the candidate pool is below a certain threshold. The threshold must at least allow us to perform a valid second-order polynomial fitting. Note that if step 5 is stopped due to the second reason, there are simply no PSF or shear measurements further carried out on that chip.

Figure 5 shows examples of source images (lower panel) and their corresponding power spectra (upper panel) that are selected as star candidates on a chip of the w2p2p1 field. Those without crosses on the lower-left corners are identified as stars. The example shows that our algorithm is quite accurate in picking out the right sources (stars) for PSF reconstruction. It is evident from the figure that nonstellar objects are more extended in real space and more compact in Fourier space. We have checked a large number of star images by eye and have not found significant problems.

4.5. Noise Reduction

There are various ways of reducing the image noise, though not all of them are suitable for accurate shear measurement. For the method of Fourier_Quad, we find that simple polynomial fittings in Fourier space perform reasonably well in reducing the noise for the CFHTLenS data. Its accuracy (especially on the faint objects), though, is still subject to more numerical tests in our ongoing work. Our denoising operations are applied on the power spectrum of the image. Each pixel is re-evaluated by fitting its neighboring 5 × 5 region with a second-order polynomial function. We repeat this fitting for every pixel. For pixels close to a boundary, the fitting region also include pixels from the opposite side, due to periodicity. For better results, we adopt the following three tricks: (1) rather than fitting the pixel values, we fit their logarithms, which generally form a much smoother function; (2) we exclude the four vertices of the fitting region to reduce its anisotropy; and (3) the pixel of k_x = k_y = 0 is not included in any fitting, as its value is strongly affected by the residual background level.

Several examples are shown in Figure 6, in which the upper panel shows the original source images, and the middle and lower panels present their corresponding unsmoothed and smoothed power spectra.

5. Results

Our main results are presented in two different source catalogs: one is generated by our own pipeline, named "FQ" hereafter (representing Fourier_Quad); another is downloaded from the website of the CFHTLenS project, named "LF" (standing for LensFit). Source selection in the current version of our pipeline is based on individual exposures with the procedures described in Section 4.2. In the second case, from the LF catalog, we take the source position (R.A., decl.) to find the corresponding CCD position (X, Y) in each exposure using the astrometric parameters. The validities and boundaries of the sources are then determined with the procedures of Section 4.2 as well. Note that we do not use the galaxy ellipticities and weights in the LF catalog.

It is worth noting that in Fourier_Quad, it is not necessary to perform a comprehensive star–galaxy separation, as the inclusion of point sources in the galaxy sample does not affect the results of either shear stacking or shear–shear correlations, as shown in Zhang et al. (2017). This is a unique and useful feature of Fourier_Quad. It is also supported by the results of this work, as demonstrated in this section.

Finally, we are aware that an inappropriate selection of the galaxy sample may lead to shear biases that are purely due to the selection rule itself. This is caused by the correlation between the selection function and the shear signal, and typically called the "selection effect." Previous tests of the Fourier_Quad method are based on simulated images with intrinsically random morphologies, which are by definition free from selection effect. For real data, selections are inevitable at both the bright/large and faint/small ends. The latter is especially important, as the faint end involves a large fraction of the total galaxy population. Our idea is to use the total flux of the source as a selection function, as it is least affected by lensing/shear. The actual quantity we use to represent flux is P₀ (indeed flux²), i.e., the power of the source image at k = 0. Note that the noise in the source power spectrum is reduced by the procedures defined in Section 4.5. To make P₀ dimensionless, we divide it by the average power ${\bar{P}}_{N}$ at large wave numbers (estimated at the boundary of the Fourier domain), i.e., the power of the Poisson noise. Our selection function can be expressed as a Fourier-domain signal-to-noise ratio ${{\rm{S}}/{\rm{N}}}_{{\rm{F}}}={P}_{0}/{\bar{P}}_{N}$ . A detailed discussion of ${{\rm{S}}/{\rm{N}}}_{{\rm{F}}}={P}_{0}/{\bar{P}}_{N}$ as well as other selection functions will be given in another work (H. Li et al. 2019, in preparation). The relation between S/N_F and the i-band magnitude of the sources are given in Figure 7, with data from the field of w2m0m0. The figure shows that S/N_F has a very good monotonic relation with magnitude at the bright end. The scatter of the relation becomes much more significant at the faint end, implying that a cut in S/N_F can be very different from a cut in magnitude for source selection.

**Figure 7.** Relation between S/N_F and the i-band magnitude using the data from the field of w2m0m0.
Download figure:
Standard image High-resolution image

5.1. The FQ Catalog

Figure 8 shows how well the FDS can be recovered with sources from the FQ catalog. The black lines are simply the "y = x" function, and the data points with 1σ error bars are the recovered shear values from source images of CFHTLenS, best fitted by the blue dashed lines. The best-fitting parameters (m, c) are defined as g_1,2(gal) = g_1,2(FD) ∗ (1 + m_1,2) + c_1,2 and are listed on the lower-right corner of each panel. The sources are binned according to the FDS. Note that the ranges of g₁(FD) and g₂(FD) are both very small, i.e., [−0.005, 0.005] and [−0.0075, 0.0075], respectively, quite suitable for testing the accuracy of shear recovery at the level of cosmic shear. We divide these ranges into about 100 bins for each shear component to take advantage of the large source number available from CFHTLenS. In this plot, we use sources with S/N_F ≥ 4, and the number of bright stars for PSF reconstruction on the chip of the source is required to be larger than 20. There are no additional cuts applied to the source sample, and all sources that pass the validity check of Section 4.2 have valid measurements of the G_1,2, N, U, and V shear parameters defined in Equations (6), and (10). There are 42 million sources in total. We use the PDF-SYM method to derive the shear signals in the figure. Note that images of the same galaxy on different exposures are treated as different sources in this work. The results show that the multiplicative biases are at the level of ≲4% (m₁ = (−4.0 ± 3.5) × 10⁻², m₂ = (3.8 ± 2.5) × 10⁻²). We caution that the data points seem to have slight systematic deviations from the "y = x" relation at large values of g₂(FD) (close to the four corners of an exposure), which drive c₂ to be somewhat larger than we expected (c₂ = (3.2 ± 0.8) × 10⁻⁴, 4σ significance). This problem deserves further investigation in a future work.

We choose S/N_F ≥ 4 as our selection criteria because sources with much smaller S/N_F seem to be strongly affected by the source-locating (or pre-selection) procedures defined in Section 4.2. This can be seen, e.g., in Figure 9 from the field of w2m0m0.

**Figure 9.** Distribution of S/N_F in the field of w2m0m0. The sources are from the FQ catalog.
Download figure:
Standard image High-resolution image

5.2. The LF Catalog

Figure 10 is similar to Figure 8, but with sources defined in the LF catalog. We apply the same cut on the S/N_F value, and the same requirement on the star number of the chip for PSF reconstruction. In total, we get 39 million source images. The results of Figure 10 look reasonably good. It shows insignificant multiplicative biases, but slight additive biases (c₁ = (−4.9 ± 0.8) × 10⁻⁴, c₂ = (4.9 ± 0.8) × 10⁻⁴). It is not yet clear what causes the additive biases. It is possibly due to the differences in the source identification parts of the two pipelines: sources in the LF catalog are identified on coadded images, while Fourier_Quad relies on exposures independently. This issue will be studied more carefully in another work. For Fourier_Quad, source identification based on multiple exposures are still under development.

Figure 11 shows the i-band magnitude distribution of sources with different shear measurement statuses. The red ones have valid measurement from both Lensfit and Fourier_Quad, and the yellow and blue ones only have contributions from Lensfit and Fourier_Quad, respectively. The gray population does not have shear measurements at all. The data are from the field of w2m0m0. It is encouraging to note that our Fourier_Quad pipeline can provide valid shear measurements on a much larger galaxy population than Lensfit.

For each source, Fourier_Quad makes an independent shear measurement on every distinct exposure as long as there are no image defects at the source position. Figure 12 shows the distribution of the number of valid exposures for sources of different magnitudes. In the figure, "case n" stands for "n" valid exposures, represented by different colors in the histogram. It is not surprising that at the faint end, a source can be identified only in very few exposures.

**Figure 12.** i-band magnitude distributions of sources with different shear measurement statuses in Fourier_Quad. Case "n" means there are "n" valid shear measurements/exposures associated with a source. The data are from the field of w2m0m0.
Download figure:
Standard image High-resolution image

6. Conclusion and Discussion

We have presented a way of testing shear recovery accuracy on real images. The target "shear signals" are given by the field-distortion effect, which is naturally involved in any optical system. The distribution of the FDS can be accurately mapped out on the CCD plane with the astrometric parameters. These shear signals are recoverable with galaxy images whenever there is a large number of exposures available. The cosmological shear signals in this case are generally not relevant. By comparing the FDS with the stacked galaxy shear estimators, one can directly quantify the multiplicative and additive shear biases on real images.

As an example, we show the performance of the Fourier_Quad method using the CFHTLenS data. The FDS values on MegaCam are around 0.005 or less, very suitable for shear calibration at the level of cosmic shear. We have described in Section 4 a number of important details in the image-processing pipeline of Fourier_Quad, including background removal, source identification and deblending, astrometric calibration, star selection for PSF reconstruction, and noise reduction. Our pipeline can also read in source locations (R.A. and decl.) from an external source catalog.

Our main results are presented in Figures 8 and 10 using two different source catalogs: the FQ one is by our own pipeline, and the LF catalog is from the official data release of CFHTLenS. The shear signals in both cases are measured by Fourier_Quad. Overall, the multiplicative biases are at the level of ≲4%. The results from the sources in the LF catalog reveal some slight additive biases of about 5 × 10⁻⁴ (6σ significance) for both shear components. This problem is minor for the FQ catalog: only g₂ has an additive bias of 3.2 × 10⁻⁴ (4σ). We believe the difference is due to some subtleties in the source locating/identification part of the pipeline of CFHTLenS, which is very different from ours. The most significant difference is that sources in the LF catalog are identified on coadded images, while our pipeline so far processes every exposure individually. A combinatory multi-exposure version of Fourier_Quad is still under development.

Within the LF catalog provided by CFHTLenS, we find that there is a large portion of sources that have valid magnitudes, but not shear measurements. This problem is partially remedied by our Fourier_Quad pipeline, which provides valid shear measurements for most of the sources in the LF catalog, as shown in Figure 11. In principle, for each source, Fourier_Quad can yield one shear measurement from each exposure. The reality is that some sources in the LF catalog cannot be properly identified in single exposures in Fourier_Quad, especially at the faint end as shown in Figure 12. This is due to their weak significance or image defects.

For valid source images, their shear estimators can all be measured by Fourier_Quad. The only exception is when there are not enough bright stars on the chip for PSF reconstruction. The shear recovery part of our pipeline simply does not apply any selection rules on the source morphology (shape, size, etc.), allowing us to avoid complicated decisions regarding selection effects, due to shear measurement itself. Note that it is not even necessary to exclude stellar objects or point sources from the galaxy samples in Fourier_Quad. We consider this a significant advantage of our pipeline.

Our proposal of shear testing with field distortion should be useful for any shear measurement algorithm, including some recently developed ones (Huff & Mandelbaum 2017; Okura & Futamase 2017; Sheldon & Huff 2017; Li et al. 2018; Pujol et al. 2018; Tewes et al. 2019). It is better to work with individual exposures, in which case the field-distortion signals are well defined. For algorithms like Lensfit, which simultaneously fits the source shape on multiple exposures, the field-distortion signals are already included in the forward-modeling of the galaxy image in each exposure. In this case, the recovered ellipticities can still be plotted against the average field-distortion signal to form a null test, assuming the spatial offsets of the relevant exposures on the sky are much smaller than their sizes. Further discussion of these topics is beyond the scope of this work.

It is also important to note that accurate astrometric calibration is indispensable for a successful shear measurement program. We have introduced a modification to the fitting between the projected sky plane and the CCD plane in Section 4.3, enabling us to easily extend the fitting functions to high-order polynomials, and to check the convergence of the astrometric solution. A more comprehensive comparison between the astrometry part of our pipeline and the standard software, such as SCAMP, will be provided in a separate work. We also caution that polynomial fitting is not likely good enough for recovering distortions caused by instrumental effects at special locations (Bernstein et al. 2017) of the CCD. For the CFHTLenS data, we have not studied such a problem carefully. We leave a detailed discussion of this topic to a future work.

Finally, it is encouraging to note that our Fourier_Quad pipeline is now able to carry out shear measurement from almost raw CCD images (after flat-field correction). The overall image-processing speed is very fast. For example, for the CFHTLenS data, overall, it only takes about 0.02 CPU*seconds for each galaxy image on average. We therefore consider the Fourier_Quad pipeline a very promising tool for probing the cosmic structure in ongoing and planned large-scale galaxy surveys. In the future development of our pipeline, we will try to include the following components: (1) source identification using information from multiple exposures, (2) ways of stacking undersampled images for accurate shear recovery (mostly for space-based missions), and (3) treatments of instrumental effects (Rhodes et al. 2010; Antilogus et al. 2014).

This work is based on observations obtained with MegaPrime/MegaCam, a joint project of CFHT and CEA/DAPNIA, at the Canada–France–Hawaii Telescope (CFHT), which is operated by the National Research Council (NRC) of Canada, the Institut National des Sciences de lÚnivers of the Centre National de la Recherche Scientifique (CNRS) of France, and the University of Hawaii. This research used the facilities of the Canadian Astronomy Data Centre operated by the National Research Council of Canada with the support of the Canadian Space Agency. CFHTLenS data processing was made possible thanks to significant computing support from the NSERC Research Tools and Instruments grant program.

In the early stage of this project, the processing of single-exposure images was conducted using the THELI software, a tool for the automated reduction of astronomical images developed by the CFHTLenS team (Erben et al. 2005; Schirmer 2013).

J.Z. is supported by the National Key Basic Research and Development Program of China (No.2018YFA0404504), the NSFC grants (11673016, 11433001, 11621303), and the National Key Basic Research Program of China (2015CB857001). X.L. is supported by the Global Science Graduate Course (GSGC) program of the University of Tokyo. L.P.F. acknowledges the support from NSFC grants 11673018, 11722326, and 11333001; STCSM grants 16ZR1424800 and 188014066; and SHNU grant DYL201603. D.Z.L. and Z.H.F. are supported in part by NSFC of China under grants 11333001 and 11653001. G.L. is supported by NSFC (Nos. 11673065, 11273061, and 1133008) and the 973 program (No. 2015CB857003).

Testing Shear Recovery with Field Distortion

Article metrics

Author e-mails

Author affiliations

ORCID iDs

Dates

Abstract

1. Introduction

2. Field Distortion

3. The Fourier_Quad Method