Noise2Astro: Astronomical Image Denoising With Self-Supervised Neural Networks

In observational astronomy, noise obscures signals of interest. Large-scale astronomical surveys are growing in size and complexity, producing more data and increasing the data-processing workload. Developing automated denoising tools, such as convolutional neural networks (CNNs), has therefore become a promising area of research. We investigate the feasibility of CNN-based self-supervised learning algorithms (e.g., Noise2Noise) for denoising astronomical images. We experiment with Noise2Noise on simulated noisy astronomical data and evaluate the results by the accuracy of recovering flux and morphology. The algorithm recovers the flux well for Poisson noise ($98.13^{+0.77}_{-0.90}\%$) and for Gaussian noise when the image has a smooth signal profile ($96.45^{+0.80}_{-0.96}\%$).


INTRODUCTION
Deep learning methods for image denoising depend on the ground-truth data provided in the training set, and most efforts have been directed toward refining neural network (NN) architectures (e.g., Vojtekova et al. 2021; Gheller & Vazza 2022). To generate training data, we need either a thorough and robust understanding of our astronomical objects to simulate the ground-truth signal, or real data with high signal-to-noise ratio garnered from long observational exposures. Lehtinen et al. (2018) demonstrate that denoising images with only noisy images, a self-supervised process, is possible. Several works (Batson & Royer 2019; Krull et al. 2018) have explored the feasibility of blind denoising using self-supervision. Astronomical imaging often produces multiple exposures of the same objects, which can be used with Noise2Noise. By exploiting the self-similarity in natural images, the Noise2Noise algorithm introduces a general method for finding an optimal denoiser for a given dataset. For a dataset of noisy image pairs, suppose each pair contains two noisy images $x_1$ and $x_2$ of the same dimensions, such that $x_1 = y + n_1$ and $x_2 = y + n_2$, where $y$ is the ground truth and $n_1$ and $n_2$ are two different additive noise components. Assuming each noise component $n_i$ is independent of the ground truth, one can find an optimal transformation $f$ from $x_1$ to $x_2$ among a class of transformations $f_\theta$, parameterized by $\theta$, by minimizing the self-supervised loss between the two noisy images: $\theta^{*} = \arg\min_{\theta} \mathbb{E}\left[\lVert f_\theta(x_1) - x_2 \rVert^2\right]$. Because the noise components are independent, the transformation $f$ cannot predict the noise component $n_2$ in $x_2$ when given only $x_1$ as input. Therefore, the optimal transformation $f$ predicts only the ground-truth component $y$ and becomes an effective denoiser for images within the distribution of the training data.
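The reasoning above rests on the self-supervised loss decomposing into the supervised loss plus the (irreducible) variance of the target noise, since the cross term vanishes when $n_2$ is independent and zero-mean. A minimal NumPy sketch checks this numerically; the shrinkage "denoiser" `f` is an arbitrary stand-in, not the paper's network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ground truth and two independent zero-mean noise draws,
# mirroring x1 = y + n1, x2 = y + n2 from the text.
n_pix = 100_000
y = rng.uniform(0.0, 1.0, n_pix)
n1 = rng.normal(0.0, 0.3, n_pix)
n2 = rng.normal(0.0, 0.3, n_pix)
x1, x2 = y + n1, y + n2

def f(x):
    """Stand-in 'denoiser': shrink each pixel toward the sample mean.
    Any mapping of x1 alone works for the identity checked below."""
    return 0.7 * x + 0.3 * x.mean()

# Self-supervised loss against the noisy target x2 ...
loss_noisy = np.mean((f(x1) - x2) ** 2)
# ... equals the supervised loss against y plus the variance of n2,
# because n2 is independent of f(x1) and has zero mean.
loss_clean = np.mean((f(x1) - y) ** 2)
print(loss_noisy, loss_clean + np.mean(n2 ** 2))
```

The two printed numbers agree up to sampling error, which is why minimizing against a noisy target still drives $f_\theta(x_1)$ toward $y$.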

METHOD
For Experiment 1, we applied the algorithm to two architectures: U-net (Ronneberger et al. 2015) and a denoising convolutional neural network (DnCNN; Zhang et al. 2017). We simulate training and test data. The signal components of each simulated image pair are generated as Sérsic profiles (Sérsic 1963) using the parametric fitting tool GALFIT (Peng et al. 2010) by inputting the chosen Sérsic index $n$, half-light radius $R_e$, position angle $PA$, semi-minor-axis to semi-major-axis ratio $b/a$, and magnitude $M$. We randomly sample from a range for each parameter except for the magnitude, which we sample from the DES public catalog (Abbott et al. 2018) processed by the Weak-lensing-Deblending software (Sanchez et al. 2021; Kirkby et al. 2020). We simulate the noise components using GALSIM (Rowe et al. 2015) and combine them with the signal components to form the training set.
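For reference, the elliptical Sérsic surface-brightness law the signal components follow can be sketched directly in NumPy. This is an illustrative stand-in for the GALFIT-generated signals (the function name, amplitude normalization, and the common $b_n \approx 2n - 1/3$ approximation are ours, not the paper's pipeline):

```python
import numpy as np

def sersic_profile(shape, n, r_e, pa_deg, ba, amplitude=1.0):
    """Elliptical Sersic profile I(r) = A * exp(-b_n * ((r/r_e)^(1/n) - 1))
    on a pixel grid, with parameters named as in the text:
    Sersic index n, half-light radius r_e, position angle, axis ratio b/a."""
    ny, nx = shape
    cy, cx = (ny - 1) / 2.0, (nx - 1) / 2.0
    yy, xx = np.mgrid[0:ny, 0:nx]
    dy, dx = yy - cy, xx - cx
    # Rotate into the frame set by the position angle, then squash the
    # minor axis by b/a to get the elliptical radius.
    theta = np.deg2rad(pa_deg)
    u = dx * np.cos(theta) + dy * np.sin(theta)
    v = -dx * np.sin(theta) + dy * np.cos(theta)
    r = np.hypot(u, v / ba)
    b_n = 2.0 * n - 1.0 / 3.0  # widely used approximation for b_n
    return amplitude * np.exp(-b_n * ((r / r_e) ** (1.0 / n) - 1.0))

img = sersic_profile((128, 128), n=1.2, r_e=12.0, pa_deg=30.0, ba=0.8)
```

Sampling `n`, `r_e`, `pa_deg`, and `ba` from the ranges quoted in the next paragraph then yields signal components analogous to those in the experiments.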
To train networks with self-supervision, we generate a pair of images with the same signal but different noise as one instance. For Experiment 1, we generate 1000 instances of 128 pix × 128 pix images with Poisson noise, and we choose relatively wide ranges of morphological parameters ($n \in [0.5, 6]$; $b/a \in [0.01, 1]$; $R_e \in [4, 20]$) for the signal. For Experiment 2, we generate 1000 instances of 256 pix × 256 pix images with narrower ranges of morphological parameters ($n \in [1.0, 1.6]$; $b/a \in [0.9, 1]$; $R_e \in [200, 320]$) to simulate low-surface-brightness signal components. We generate two train-test sets for Experiment 2: one has entirely Gaussian noise components, and the other has Poisson noise with the same signal components as the former. The Gaussian noise is simulated with a sigma taken from the mean sky level per pixel in the DES public catalog (Abbott et al. 2018). For all experiments, we use the first 800 instances for training and all 1000 instances for testing.
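The pairing scheme above can be sketched as follows. The paper uses GALSIM for the noise; this NumPy version is only a schematic of one instance, and the `sky_sigma` value is a placeholder rather than an actual DES sky level:

```python
import numpy as np

rng = np.random.default_rng(42)

def make_instance(signal, noise="poisson", sky_sigma=5.0):
    """One self-supervised training instance: two images sharing `signal`
    but carrying independent noise realizations (x1 and x2 in the text)."""
    if noise == "poisson":
        # Poisson noise: each pixel is a count drawn around the signal.
        x1 = rng.poisson(signal).astype(float)
        x2 = rng.poisson(signal).astype(float)
    else:
        # Additive Gaussian sky noise with a fixed per-pixel sigma.
        x1 = signal + rng.normal(0.0, sky_sigma, signal.shape)
        x2 = signal + rng.normal(0.0, sky_sigma, signal.shape)
    return x1, x2

signal = np.full((256, 256), 100.0)  # flat stand-in for a Sersic signal
x1, x2 = make_instance(signal, noise="poisson")
```

Repeating this 1000 times with freshly sampled Sérsic parameters reproduces the structure of each train-test set, with the first 800 instances used for training.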

RESULTS
We evaluate Experiment 1 by measuring the bias image, $I_{\rm bias} = I_{\rm output}/I_{\rm true}$. Of the two network architectures, the U-net produces the least bias. While all instances in Experiment 1 tend to show high bias in image regions where the signal is peaky (relatively high surface brightness), the network output converges to the ground truth in regions where the signal is smooth (relatively low surface brightness).
In Experiment 2, we implement the U-net and examine the recovery of flux and signal morphology. To assess the morphological accuracy, we perform 2D Sérsic function fitting on our outputs with SciPy (Virtanen et al. 2020). Before fitting, we mask out pixels with relatively large bias by applying a circular mask with a radius of 5 pix centered at the peak of the signal. We also constrain the $x$ and $y$ positions of the 2D Sérsic function, while the other parameters are allowed to vary. For instances with Poisson (Gaussian) noise, the network recovers $98.13^{+0.77}_{-0.90}\%$ ($96.45^{+0.80}_{-0.96}\%$) of the flux. We compare outputs from our analysis (network denoising and Sérsic fitting) with the ground truth in Figure 1. In the lower panels, we show the bias ($P_{\rm bias} = P_{\rm fit}/P_{\rm true}$) versus the ground truth for each parameter. For each parameter, a narrower distribution of biases across experiment instances indicates better agreement between the fitted values from the overall routine and the ground truths. The fitted $b/a$ agrees with the ground truths in both the Gaussian and Poisson noise experiments ($b/a$ bias $0.96 \sim 1.04$). For $n$, the fitted values agree better with the ground truths when the noise is Gaussian ($n$ bias $0.86 \sim 1.1$) than when it is Poisson ($n$ bias $0.7 \sim 1.14$). For $R_e$, the fitted values agree poorly with the ground truths in both the Gaussian ($R_e$ bias $0.72 \sim 1.2$) and Poisson ($R_e$ bias $0.6 \sim 1.16$) cases, and few instances lie on the line where $P_{\rm bias} = 1$. In the Poisson case, $R_e$ tends to be significantly underestimated (median $R_e$ bias $\approx 0.84$). Given that the ground-truth $R_e$ is much larger than the size of the images in the test set, such a result is expected.
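The masked fitting step can be sketched with `scipy.optimize.curve_fit`. This is a simplified illustration, not the paper's fitting code: it uses a circular (rather than elliptical) Sérsic model, a noiseless stand-in image, and illustrative parameter values and bounds:

```python
import numpy as np
from scipy.optimize import curve_fit

def sersic2d(coords, amp, n, r_e, x0, y0):
    """Circular 2D Sersic profile evaluated at flattened (x, y) coords."""
    x, y = coords
    r = np.hypot(x - x0, y - y0)
    b_n = 2.0 * n - 1.0 / 3.0  # common approximation for b_n
    return amp * np.exp(-b_n * ((r / r_e) ** (1.0 / n) - 1.0))

# Noiseless stand-in for a denoised network output.
ny = nx = 64
yy, xx = np.mgrid[0:ny, 0:nx].astype(float)
true_params = (50.0, 1.2, 20.0, 32.0, 32.0)
img = sersic2d((xx, yy), *true_params)

# Mask out the peaky central region (5-pix radius around the peak),
# where the network bias is largest, and fit only the remaining pixels.
peak_y, peak_x = np.unravel_index(img.argmax(), img.shape)
keep = np.hypot(xx - peak_x, yy - peak_y) > 5.0

# Bounds keep n and r_e positive so the model stays well defined;
# the (x0, y0) bounds play the role of the positional constraints.
popt, pcov = curve_fit(
    sersic2d, (xx[keep], yy[keep]), img[keep],
    p0=(40.0, 1.0, 15.0, 30.0, 30.0),
    bounds=([1.0, 0.3, 1.0, 0.0, 0.0], [500.0, 8.0, 80.0, 63.0, 63.0]),
)
```

Dividing each entry of `popt` by the corresponding true value gives the per-parameter bias $P_{\rm bias}$ plotted in Figure 1.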
We distinguish between the distributions of training-set and test-set instances in Figure 1, where "test-set" refers to the 200 of the 1000 testing instances that are not exposed to the model during training. In the two experiments, we find no obvious deviation of the test-set instance distribution from that of the training set. We conclude that NNs with self-supervision can achieve photometric accuracy within 5% of the ground truth in limited scenarios: the self-training set must contain images with sufficient resemblance, and the objects should have light profiles with low flux gradients. This algorithm would therefore potentially be applicable to denoising faint, diffuse objects.