DeepZipper II: Searching for Lensed Supernovae in Dark Energy Survey Data with Deep Learning

Gravitationally lensed supernovae (LSNe) are important probes of cosmic expansion, but they remain rare and difficult to find. Current cosmic surveys likely contain 5-10 LSNe in total, while next-generation experiments are expected to contain several hundred to a few thousand of these systems. We search for these systems in observed Dark Energy Survey (DES) 5-year SN fields: ten 3-sq. deg. regions of sky imaged in the $griz$ bands approximately every six nights over five years. To perform the search, we utilize the DeepZipper approach: a multi-branch deep learning architecture trained on image-level simulations of LSNe that simultaneously learns spatial and temporal relationships from time series of images. We find that our method obtains an LSN recall of 61.13% and a false positive rate of 0.02% on the DES SN field data. DeepZipper selected 2,245 candidates from a magnitude-limited ($m_i < 22.5$) catalog of 3,459,186 systems. We employ human visual inspection to review systems selected by the network and find three candidate LSNe in the DES SN fields.


INTRODUCTION
Galaxy-scale gravitational lensing occurs when the gravitational potential of a foreground galaxy (positioned along an observer's line of sight to a background galaxy) is large enough to deflect the photons of the background galaxy on their journey to an observer. This process produces arcs and/or multiple images of the background galaxy (Treu 2010). For the specific case in which the background galaxy contains a supernova (SN), the photons that contribute to each of the multiple images of the lensed supernova (LSN) travel different paths and distances to the observer and encounter different depths of gravitational potential depending on the distribution of the foreground galaxy's mass. Because the speed of light is constant, the distinct paths correspond to distinct arrival times of the photons from each SN image. Combining this time delay with a model of the foreground galaxy's mass distribution enables the direct inference of the rate of expansion of the Universe today, $H_0$, as well as other cosmological parameters (Refsdal 1964).
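For reference, the standard time-delay relation behind this inference can be written as below. The notation is introduced here for illustration and is not taken from the text above: $\phi$ is the Fermat potential, $\psi$ the scaled lensing potential, $\theta_i$ and $\theta_j$ the positions of two SN images, $\beta$ the source position, $z_d$ the lens redshift, and $D_d$, $D_s$, $D_{ds}$ the angular diameter distances to the lens, to the source, and between them.

```latex
\Delta t_{ij} = \frac{D_{\Delta t}}{c}
  \left[ \phi(\theta_i, \beta) - \phi(\theta_j, \beta) \right],
\qquad
\phi(\theta, \beta) = \frac{(\theta - \beta)^2}{2} - \psi(\theta),
\qquad
D_{\Delta t} \equiv (1 + z_d)\,\frac{D_d D_s}{D_{ds}} \propto \frac{1}{H_0}
```

A measured delay $\Delta t_{ij}$ together with a lens model that supplies the Fermat potential difference therefore constrains $D_{\Delta t}$, and with it $H_0$.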
Historically, LSNe have been rare: only a few detections have been made in total (Kelly et al. 2015; Rodney et al. 2021; Goobar et al. 2017; Amanullah et al. 2011; Quimby et al. 2014; Rodney et al. 2015). However, modern optical time-domain survey datasets, such as those collected in the southern hemisphere by the Dark Energy Survey's SN fields (DES; Abbott et al. 2016; Diehl 2020), in the northern hemisphere by the Zwicky Transient Facility (Graham et al. 2019) and the Young Supernova Experiment (Jones et al. 2021), and over the next decade by the Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST; Ivezić et al. 2019), are promising places to search for LSNe. Based on imaging depth, sky area, and duration of observations, the DES SN fields are expected to contain ∼0.5-2 LSNe, and the LSST wide field is expected to contain ∼2,000 LSNe (Oguri 2019). These datasets, which contain hundreds of millions to tens of billions of objects that are not LSNe, pose a significant challenge for searches (Abbott et al. 2021; Marshall et al. 2017). In particular, it is vital to identify an LSN rapidly to enable follow-up observations before the SN fades during the weeks to months after the explosion (Mihalas 1963). To keep pace with the data streams of large surveys and identify candidate LSNe promptly, we require fast and robust algorithms.
In Morgan et al. (2022), hereafter referred to as "DZ1", we designed a deep learning detection architecture ("ZipperNet") for LSNe and demonstrated its performance on four simulated optical survey datasets that mimic DES and LSST. In this work, we use a ZipperNet to search the DES SN fields (Abbott et al. 2021) for LSNe. We also discuss the data collection and data reduction steps necessary to carry out a comprehensive LSN search in an optical survey dataset. We have made all code for data processing and deep learning available (Morgan 2022).
We present this work as follows. In Section 2, we describe the characteristics of the DES SN field data. In Section 3, we describe the training and optimization of our deep learning approach. In Section 4, we quantify the performance of this architecture on the DES SN field data and present candidate LSN systems. In Section 5, we discuss the significance of the results and the outlook for detecting LSNe in Rubin Observatory data. We conclude in Section 6.

2. DATA COLLECTION

2.1. The DES SN Fields
DES SN field data were collected a) to facilitate the Type Ia SN (SN Ia) cosmology analyses in DES that use the single-epoch images and b) to enable galaxy population modeling (near the detection limits of the DES wide-field survey) that uses coadded images. All data were collected with DECam (Flaugher et al. 2015) on the Victor M. Blanco telescope at the Cerro Tololo Inter-American Observatory in Chile between 2012 and 2018. There are ten 3-sq. deg. fields: eight shallow fields (X1, X2, E1, E2, C1, C2, S1, and S2) observed to a single-visit depth of ∼23.5 mag and two deep fields (X3 and C3) observed to a single-visit depth of ∼24.5 mag. Each field was imaged in the griz bands approximately every six nights over five years, subject to Sun, Moon, and weather conditions. The median full-width-at-half-maximum point spread functions ("seeing") for the SN field images used in this analysis (after the downsampling discussed in Section 2.2) were 1.37″, 1.26″, 1.15″, and 1.08″ for the griz bands, respectively.

Candidate System Selection and Data Reduction
We begin our search for candidate LSNe with all cataloged objects of DES Data Release 1 (also referred to as the "Year 3 Gold Catalog") (Abbott et al. 2018). We construct an initial sample by requiring the object to be positioned within one of the SN fields and requiring all griz MAG_AUTO measurements to be brighter than 27.5 mag. Then, within that sample, we require the i-band MAG_AUTO to be brighter than 22.5 mag to restrict the total number of objects in this first search of the DES SN fields. Also within the initial sample, we require a catalog-level size measurement (CM_T) to be greater than 0.05, which excludes non-extended objects (e.g., stars) with approximately 99% galaxy purity and 98% galaxy completeness. To evaluate the purity and completeness, we take as truth a nearest-neighbor machine learning classifier that combines DES photometry with near-infrared photometry and has shown near-perfect performance at i < 22.5 mag (Hartley et al. 2021). These cuts produce a sample of 3,459,186 candidate systems for our analysis.
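The catalog-level cuts above can be sketched as a single predicate. This is a minimal illustration, not the pipeline used in the analysis: the `CatalogObject` container, its field names, and the toy catalog entries are hypothetical, and only the numerical limits (27.5 mag, 22.5 mag, 0.05) come from the text.

```python
from dataclasses import dataclass

BANDS = ("g", "r", "i", "z")


@dataclass
class CatalogObject:
    # Hypothetical container; fields mirror the quantities named in the text.
    object_id: int
    in_sn_field: bool
    mag_auto: dict  # band -> MAG_AUTO value
    cm_t: float     # catalog-level size measurement


def passes_selection(obj, faint_limit=27.5, i_limit=22.5, min_size=0.05):
    """Return True if the object survives all catalog-level cuts."""
    if not obj.in_sn_field:
        return False
    # Initial sample: every griz magnitude brighter (numerically smaller) than the limit.
    if any(obj.mag_auto[band] >= faint_limit for band in BANDS):
        return False
    # Magnitude-limited subsample in the i band.
    if obj.mag_auto["i"] >= i_limit:
        return False
    # Exclude non-extended objects such as stars.
    return obj.cm_t > min_size


# Toy catalog with made-up values.
catalog = [
    CatalogObject(1, True, {"g": 22.8, "r": 22.0, "i": 21.5, "z": 21.2}, 0.40),   # kept
    CatalogObject(2, True, {"g": 24.9, "r": 24.0, "i": 23.4, "z": 23.1}, 0.30),   # too faint in i
    CatalogObject(3, True, {"g": 20.1, "r": 19.8, "i": 19.7, "z": 19.6}, 0.01),   # star-like
    CatalogObject(4, False, {"g": 22.0, "r": 21.5, "i": 21.0, "z": 20.8}, 0.50),  # outside the fields
]
selected_ids = [obj.object_id for obj in catalog if passes_selection(obj)]
```

Keeping the cuts in one function makes it straightforward to relax the i-band limit in a later, deeper search.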
We next introduce a selection on the images that are used in the LSNe search across all five years of DES SN field exposures (Abbott et al. 2021). If a system has two images on the same night in the same band, we choose the image for which the object was observed with the better seeing. For each image, we also require the cataloged object's centroid to be positioned more than 23 pixels from all CCD edges: this permits constructing image cutouts (45 pixels by 45 pixels) without producing partial images. Finally, to enforce cadence uniformity and simplify data processing, we require the same number of observations in each of the griz bands. We determine the band with the fewest useful observations and exclude images from the other bands to match it. In doing so, we exclude images from regions of the time series in descending order of the sampling rate. Thus, for each candidate lens galaxy in the SN fields selected from the DES catalog, we obtain a time series image set with the same number of images in each of the griz bands. A typical length for a time series image set is ∼20-35 epochs. We process each year of DES data independently.
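The three image-level selections can be sketched as follows. This is an illustrative sketch under stated assumptions: exposures are represented as hypothetical dictionaries, the CCD shape of 4096 by 2048 pixels is an assumed value, and "descending order of the sampling rate" is interpreted here as repeatedly dropping the night whose nearest neighbor in time is closest.

```python
def best_seeing_per_night(exposures):
    """Keep one exposure per (night, band): the one with the smallest seeing FWHM."""
    best = {}
    for exp in exposures:
        key = (exp["night"], exp["band"])
        if key not in best or exp["seeing"] < best[key]["seeing"]:
            best[key] = exp
    return list(best.values())


def far_from_edges(exp, ccd_shape, margin=23):
    """True if the centroid lies more than `margin` pixels from every CCD edge."""
    n_rows, n_cols = ccd_shape
    return margin < exp["x"] < n_cols - margin and margin < exp["y"] < n_rows - margin


def thin_to_length(nights, target):
    """Drop nights from the most densely sampled stretch until `target` remain."""
    nights = sorted(nights)
    while len(nights) > target:
        gaps = []
        for k, night in enumerate(nights):
            left = night - nights[k - 1] if k > 0 else float("inf")
            right = nights[k + 1] - night if k < len(nights) - 1 else float("inf")
            gaps.append(min(left, right))
        # The smallest nearest-neighbor gap marks the densest part of the series.
        nights.pop(gaps.index(min(gaps)))
    return nights


def balance_bands(exposures, bands=("g", "r", "i", "z")):
    """Return {band: [nights]} with the same number of nights in every band."""
    by_band = {b: [e["night"] for e in exposures if e["band"] == b] for b in bands}
    target = min(len(nights) for nights in by_band.values())
    return {b: thin_to_length(nights, target) for b, nights in by_band.items()}


# Toy exposure list with made-up values.
exposures = [{"night": 0, "band": "g", "seeing": 1.5, "x": 500, "y": 500},
             {"night": 0, "band": "g", "seeing": 1.2, "x": 500, "y": 500}]
exposures += [{"night": n, "band": "g", "seeing": 1.3, "x": 500, "y": 500} for n in (6, 7, 13)]
for band in ("r", "i", "z"):
    exposures += [{"night": n, "band": band, "seeing": 1.1, "x": 500, "y": 500} for n in (0, 6, 13)]
exposures.append({"night": 19, "band": "z", "seeing": 1.0, "x": 10, "y": 500})  # too close to an edge

kept = [e for e in best_seeing_per_night(exposures) if far_from_edges(e, ccd_shape=(4096, 2048))]
balanced = balance_bands(kept)
```

In this toy case the g band starts with four usable nights and the other bands with three, so one g-band night from the closely spaced pair (nights 6 and 7) is removed.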

Training Set Construction
Our approach for detecting LSNe in the DES deep fields requires samples of LSNe (positives) and non-LSNe (negatives) to train the ZipperNet in a binary classification scheme. To construct the training set, we used ∼2% of the total dataset: 76,203 time series image sets. Due to the lack of real LSN examples, we create the positive class by using gravitational lensing simulation software (deeplenstronomy; Morgan et al. 2021) to add LSNe to DES images in the training set. For the negative class, we use time series image sets selected at random from the dataset. Even in the erroneous case where a real LSN is randomly selected for the negative training class, LSNe are expected to be sufficiently rare in the DES SN fields that this error would be infrequent and would not affect the training. Nevertheless, the two most likely types of false positives will be non-lensed SNe and strongly lensed galaxies without SNe, and unfortunately both of these types of systems are also expected to be rare in our dataset. Therefore, to prepare a training set with boosted representation of systems that we expect to be more challenging to classify, we also use deeplenstronomy to inject lensed source galaxies and non-lensed supernovae into a fraction of the negative-class images.
The process of injecting simulated light sources into real time series image sets has multiple benefits. The training dataset includes all types of astronomical systems that the ZipperNet will classify because it is chosen from the total dataset. Also, the properties of the simulated source galaxies and SNe are drawn from real data, maintaining all inherent physical correlations. We join the DES Year 3 Gold catalog and DES Year 1 morphological catalog (Tarsitano et al. 2018) to obtain a sample of ∼100,000 galaxies from which we draw parameter values for simulations. The simulated source galaxies are modeled with Sersic light profiles that have a color-independent ellipticity, a Sersic profile index, a band-wise half-light radius, a band-wise magnitude, and a photometric redshift, all measured within DES pipelines. As in DZ1, the injected SNe were simulated using public rest-frame SN spectral energy distributions (Kessler et al. 2010) available in deeplenstronomy, which redshifts the distribution and calculates the observed magnitude in each band. The injected SNe reach peak brightness within the interval from 20 days before the first observation to 20 days after the final observation: the dataset contains falling-only (∼15%), rising-only (∼15%), and complete lightcurves (∼70%).
To calculate the lensing effects of the real galaxy on the simulated source light, we use the measured photometric redshift of the lens galaxy, select an Einstein radius at random from the interval [0.4″, 1.8″], and model the mass distribution of the lens as a singular isothermal ellipsoid following similar approaches in the literature (Rojas et al. 2021). For simplicity, the mass distribution shares the measured center position and ellipticity values with the light from the real lens galaxy. This simplification is not expected to greatly affect performance because these parameters are expected to be positively correlated. From the mass profile, we calculate the lensed positions of the source galaxy and LSN, and we account for the time delays of the separate SN images. The outputs of the deeplenstronomy simulation are time series image sets with three kinds of objects added to real DES images: LSNe, lensed source galaxies, and non-lensed SNe.
In total, 25% of the 76,203 time series image sets placed aside for training are injected with an LSN Ia and 25% are injected with a lensed core-collapse SN (LSN CC) to construct the positive class. Also, 16.5% of the training time series image sets are left untouched, 16.5% are injected with a galaxy-galaxy strong lens, 8.25% are injected with a SN Ia, and 8.25% are injected with a SN CC. The positive and negative training classes are equal in total number to maintain a balanced dataset throughout training. We describe the details of the training in Section 3.4, but it is worth noting here that given our choice of loss function, balancing the classes is essential to prevent class representation from biasing the learned feature representation. The remainder of this subsection describes this simulation-injection process in detail. Examples of objects in the training dataset are collected in Figure 1.

Preprocessing
Before we train the ZipperNet and apply it to the observed dataset, we apply a series of standardization steps. We first truncate the time series image sets to 10 "time steps" in each band. A time step refers to a single exposure in the sequence of observations; in the DES SN fields, a time step is approximately 6-7 days. If an image set contains more time steps, we separate it into multiple 10-time-step sequences: time steps 1-10 are a single sequence; time steps 2-11 are a second sequence; etc. Then, for each 10-step image sequence, we extract the total brightness as a function of time using the background-subtracted aperture technique presented in DZ1 with an aperture radius of 15 pixels. Importantly, when extracting the total brightness, we deliberately do not use the zeropoint of the image, in order to maintain independence from all non-image data products. This choice produces noise-dominated extracted brightness lightcurves, such as those in Figure 1, though the remainder of the analysis shows that the ZipperNet can still identify the temporal signatures of LSNe within the noise.
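The windowing and brightness extraction described above can be sketched in a few lines. This is a minimal sketch, not the DZ1 implementation: in particular, estimating the background as the median of the pixels outside the aperture is an assumption made here for illustration, and the function names are hypothetical.

```python
from statistics import median


def sliding_windows(n_steps, window=10):
    """Index ranges for overlapping sequences: steps 1-10, 2-11, ... (0-based here)."""
    return [range(start, start + window) for start in range(n_steps - window + 1)]


def aperture_brightness(image, radius=15):
    """Sum of pixels inside a centered circular aperture after removing a background level.

    The background estimate (median of pixels outside the aperture) is an
    assumption of this sketch, not necessarily the DZ1 procedure.
    """
    size = len(image)
    center = (size - 1) / 2
    inside, outside = [], []
    for row in range(size):
        for col in range(size):
            r2 = (row - center) ** 2 + (col - center) ** 2
            (inside if r2 <= radius ** 2 else outside).append(image[row][col])
    background = median(outside)
    return sum(value - background for value in inside)


# A 12-epoch series yields three overlapping 10-step sequences.
windows = sliding_windows(12)

# Toy 45x45 cutout: flat sky of 10 counts with 100 extra counts in the central pixel.
image = [[10] * 45 for _ in range(45)]
image[22][22] += 100
brightness = aperture_brightness(image)
```

No zeropoint enters the calculation, so the extracted values are in raw image units, consistent with the choice described above.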
Next, we average the images within each band to obtain a single image in each band for the 10-step image

ZipperNet
The two-branch architecture of ZipperNet was first presented and validated in DZ1, and we summarize it here. One branch receives scaled, time-averaged images in each band as inputs to a block that extracts convolutional features. The other branch receives scaled extracted brightness time series as inputs to a block that extracts sequence features. The outputs from the feature-extraction blocks are flattened and concatenated. A series of fully connected layers then weights and condenses the concatenated feature representation to produce an output score indicating whether the input system contains an LSN. The ZipperNet used in this paper is similar to Figure 2 of DZ1, and the exact hyperparameter settings for this analysis are presented in Table 1.
We performed a full hyperparameter optimization of the architecture and learning algorithm using the validation dataset. Small changes to hyperparameter settings from the prototype ZipperNet in DZ1 reflect a specialization for the real DES images used in the training data. We find that the addition of another convolutional layer, the addition of another long short-term memory (LSTM) layer, minor tweaks to convolutional layer kernel and stride settings, and the removal of dropout layers lead to boosted performance. The selected settings for the learning algorithm are presented in Section 3.4.

Training
To train the ZipperNet, we implemented a distributed setup on five computers (2 machines with Intel 3.2 GHz processors and 256 GB RAM, 1 machine with an AMD 2.2 GHz processor and 512 GB RAM, and 2 machines with IntelX 2.6 GHz processors and 768 GB RAM) on the DES cluster at Fermilab. The training dataset was split into five equal chunks, each placed on an independent computer. On each computer, we instantiated a ZipperNet and initialized the weights at the same randomly selected values. We then began passing the chunks of training data through the ZipperNet instances on each of the five computers. At regular intervals (every 1/15 of a chunk), we collect the parameters of each of the five ZipperNet instances and average the values of the parameters. Mathematically, the averaging operation is equivalent to the weights being updated by normal training, provided the learning rate is scaled by the number of network instances. Within this setup, we use a batch size of five examples and use stochastic gradient descent with a Nesterov momentum coefficient of 0.9, a constant learning rate of 0.001, and categorical cross-entropy loss to update the weights at each training step. We refer to the exhaustion of all data in a chunk as a "training iteration" and cycle back to the beginning of the chunk once all of the data have been passed through the network instance. We allow training to continue for five training iterations and reach a final validation set accuracy of 93.0%. This raw accuracy is dependent on the representations of the different types of negative examples in the validation dataset. In Section 4, we assess the performance using physically meaningful metrics.
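The equivalence between parameter averaging and a rescaled learning rate can be checked numerically in the simplest setting. The sketch below assumes plain gradient descent without momentum and a single update from shared starting weights, which is where the relation is exact; with Nesterov momentum and several local steps between averaging points it should be read as an approximation. The weights and gradients are made-up numbers.

```python
import math


def sgd_step(weights, gradient, learning_rate):
    """One plain gradient-descent update (no momentum, for simplicity)."""
    return [w - learning_rate * g for w, g in zip(weights, gradient)]


def average_parameters(instances):
    """Element-wise mean of the parameter vectors held by each network instance."""
    n = len(instances)
    return [sum(values) / n for values in zip(*instances)]


# All instances start from the same weights and see different data (different gradients).
start = [0.5, -1.0, 2.0]
worker_gradients = [
    [0.1, 0.2, -0.3],
    [0.3, -0.2, 0.1],
    [-0.1, 0.4, 0.2],
    [0.2, 0.0, 0.0],
    [0.0, 0.1, 0.5],
]
learning_rate = 0.001

instances = [sgd_step(start, g, learning_rate) for g in worker_gradients]
averaged = average_parameters(instances)

# Single-machine update on the summed gradient with the learning rate divided
# by the number of instances.
summed_gradient = [sum(g) for g in zip(*worker_gradients)]
single = sgd_step(start, summed_gradient, learning_rate / len(worker_gradients))

equivalent = all(math.isclose(a, s, rel_tol=1e-12, abs_tol=1e-12)
                 for a, s in zip(averaged, single))
```

Here `equivalent` evaluates to `True`: averaging five independently updated copies matches one update on the combined gradient with the learning rate scaled by the number of instances.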

Candidate Selection Criteria
The output of the trained ZipperNet on an input (pair of an averaged image and a lightcurve) is a score with a value typically between -100.0 and 50.0. Based on the minimum and maximum values of this range in our validation dataset, we linearly scale the ZipperNet output scores to the range [0.0, 1.0], such that they are similar to probabilities. Next, we select a threshold ZipperNet score above which we include the candidate system in our final sample and below which we exclude the candidate system. We select this threshold by iterating through possible threshold values and analyzing the fraction of LSNe that scored higher than the threshold compared to the fraction of galaxies that scored higher than the threshold. The left panel of Figure 2 shows the attainable values of these quantities for different thresholds. We expect galaxies to be the largest background: the number of galaxies in a given area of sky is orders of magnitude higher than the number of strong lenses (SL) or SNe. Therefore, we select the threshold by reducing the fraction of galaxies scored higher than the threshold to the lowest value before the fraction of LSNe scored higher than the threshold starts to decline rapidly. Based on this analysis, we select an operating threshold for the scaled ZipperNet scores of 0.76. This threshold value is contextualized with the ZipperNet scores for the systems in our validation dataset in the right panel of Figure 2.
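The score scaling and threshold sweep can be illustrated as follows. This is a sketch with made-up scores; the function names are hypothetical, and the scaling limits of -100 and 50 are taken as the approximate range quoted above rather than the exact validation-set extremes.

```python
def scale_scores(raw_scores, lowest, highest):
    """Linearly map raw network outputs onto [0, 1] using validation-set extremes."""
    return [(score - lowest) / (highest - lowest) for score in raw_scores]


def fraction_above(scores, threshold):
    """Fraction of systems scored strictly higher than the threshold."""
    return sum(score > threshold for score in scores) / len(scores)


def sweep(lsn_scores, galaxy_scores, thresholds):
    """For each threshold, report (threshold, LSN fraction kept, galaxy fraction kept)."""
    return [(t, fraction_above(lsn_scores, t), fraction_above(galaxy_scores, t))
            for t in thresholds]


scaled = scale_scores([-100, 50, -25], lowest=-100, highest=50)

# Made-up scaled scores for a handful of validation systems.
lsn_scores = [0.90, 0.80, 0.70, 0.95]
galaxy_scores = [0.10, 0.30, 0.80, 0.20]
results = sweep(lsn_scores, galaxy_scores, thresholds=[0.50, 0.76])
```

Plotting the second and third columns of `results` against each other for a fine grid of thresholds gives a curve of the kind shown in the left panel of Figure 2.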
We develop a final selection criterion to narrow the sample of candidate systems selected by ZipperNet. We leverage the aspect of our data processing from Section 3.2 in which time series image sets with more than 10 epochs are split into 10-epoch subsequences, which are then classified independently by ZipperNet. In analyzing the ZipperNet classifications made on all subsequences of a time series image set, we find that LSNe are more likely than galaxies to have multiple detections. This relationship is illustrated in Figure 3 using our validation dataset, which we use as motivation to develop a criterion on the aggregate detections in a time series image set. Importantly, the total length of the time series image sets in our training and validation data was not required to match the real data as a result of our preprocessing methods, so it would be inaccurate to set a strict requirement on the number of ZipperNet detections (score above the threshold) based on the validation dataset. Rather, to put the validation dataset and the real data on the same footing, we set a requirement on the ratio of the number of detections to the number of subsequences. We select the threshold for this ratio such that the false positive rates are minimized to the point where human inspection of the final sample becomes feasible. We choose to require at least 60% of the subsequences to have a ZipperNet score above 0.76 for the candidate system to be included in our final sample of candidate LSNe. The 60% threshold and the 0.76 ZipperNet score threshold were determined simultaneously by computing the LSN recall and galaxy false positive rate at all possible values.
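The aggregate criterion reduces to a short function. The sketch below uses the two thresholds quoted above; the function name and the example scores are hypothetical.

```python
def is_candidate(subsequence_scores, score_threshold=0.76, min_fraction=0.60):
    """True if enough 10-epoch subsequences score above the ZipperNet threshold."""
    detections = sum(score > score_threshold for score in subsequence_scores)
    return detections / len(subsequence_scores) >= min_fraction


# Made-up scaled scores for the subsequences of three time series image sets.
persistent = is_candidate([0.91, 0.85, 0.80, 0.78, 0.40])   # 4 of 5 detections
sporadic = is_candidate([0.91, 0.30, 0.20, 0.10, 0.40])     # 1 of 5 detections
short_series = is_candidate([0.80, 0.90, 0.50])             # 2 of 3 detections
```

Because the criterion is a ratio, it applies equally to a 12-epoch series with three subsequences and to a 35-epoch series with 26.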

Performance Metrics
To evaluate the performance of the fully trained ZipperNet, we define quantities and metrics of interest and compute them on the validation dataset. We introduce two terms that describe classification score thresholds: "classified as an LSN" means the candidate system had a ZipperNet score greater than the threshold in at least 60% of subsequences; and "classified as background" means the candidate system had a ZipperNet score greater than the threshold in fewer than 60% of subsequences. We define the following terms regarding metrics based on the threshold score:
- a true positive (TP) is an LSN, and it is classified as an LSN;
- a false positive (FP) is not an LSN, and it is classified as an LSN;
- a true negative (TN) is not an LSN, and it is classified as background;
- a false negative (FN) is an LSN, and it is classified as background.
Using these quantities, common metrics like accuracy are straightforward to compute; however, those metrics are misleading due to the boosted representation of rare physical systems in our training and validation datasets. We instead focus on class-specific metrics that carry physical meaning and are robust against the class representation in the validation dataset: the LSN recall is
$$\mathrm{LSN\ Recall} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}; \qquad (1)$$
the LSN-type-specific recall is
$$\mathrm{LSN\ Recall}_{\rm type} = \frac{\mathrm{TP}_{\rm type}}{\mathrm{TP}_{\rm type} + \mathrm{FN}_{\rm type}}, \qquad (2)$$
where type is "Ia" or "CC"; and the false positive rate for each type of negative class is
$$\mathrm{FPR}_{\rm type} = \frac{\mathrm{FP}_{\rm type}}{\mathrm{FP}_{\rm type} + \mathrm{TN}_{\rm type}}, \qquad (3)$$
where type is "Galaxy," "SL," "SN-Ia," or "SN-CC." The values of these metrics are collected in Table 2 for ZipperNet alone and for the combination of ZipperNet with our final sample-selection criterion.
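The class-specific metrics can be computed from a list of labeled classifications, as in the following sketch. The record format, the type labels "LSN-Ia" and "LSN-CC", and the toy counts are assumptions made for illustration; they are not the validation results reported in Table 2.

```python
from collections import Counter

POSITIVE_TYPES = {"LSN-Ia", "LSN-CC"}


def class_metrics(records):
    """records: iterable of (true_type, classified_as_lsn) pairs."""
    flagged, total = Counter(), Counter()
    for true_type, classified_as_lsn in records:
        total[true_type] += 1
        flagged[true_type] += int(classified_as_lsn)

    metrics = {}
    for true_type in total:
        rate = flagged[true_type] / total[true_type]
        # For LSN types the flagged fraction is a recall, TP / (TP + FN);
        # for every other type it is a false positive rate, FP / (FP + TN).
        prefix = "recall" if true_type in POSITIVE_TYPES else "FPR"
        metrics[f"{prefix}[{true_type}]"] = rate

    n_positive = sum(total[t] for t in POSITIVE_TYPES)
    if n_positive:
        metrics["recall[LSN]"] = sum(flagged[t] for t in POSITIVE_TYPES) / n_positive
    return metrics


# Toy classifications with made-up outcomes.
records = (
    [("LSN-Ia", True)] * 3 + [("LSN-Ia", False)]
    + [("LSN-CC", True), ("LSN-CC", False)]
    + [("Galaxy", True)] + [("Galaxy", False)] * 3
)
metrics = class_metrics(records)
```

Each rate depends only on systems of one true type, which is why these metrics are insensitive to how strongly rare classes are boosted in the validation set.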
There are a few key results from these metrics worth highlighting. The ZipperNet LSN recall indicates that approximately 84% of all LSNe in the validation dataset are scored above the operating threshold. The ZipperNet galaxy false positive rate indicates that roughly 1.5% of galaxies will be scored above our operating threshold and erroneously populate our candidate sample. By itself, the ZipperNet is a powerful classifier, but the minimized galaxy false positive rate is still large enough that the resulting candidate sample would be too large for visual inspection. With the addition of the selection criterion on the number of ZipperNet detections for each constituent subsequence, the performance is boosted. Critically, the final galaxy false positive rate is reduced, facilitating visual inspection of the full final candidate sample. This stricter selection has the consequence of reducing the final LSN recall. However, most of the removed LSNe are those that peak before or after the window of observations.

Searching the DES SN Fields
Applying our trained ZipperNet and additional selection criterion to the DES SN field data produces 2,245 candidate LSNe, approximately half of which had ZipperNet detections in multiple years of DES data. We expect the majority of these systems to have resolvable features based on two aspects of the analysis. First, these 2,245 candidate LSNe were identified in the magnitude-limited sample of the DES galaxies, leading to a tendency for low-redshift, nearby galaxies to be more highly represented than high-redshift, distant galaxies. Second, based on the physical selection function of the ZipperNet on this dataset (shown in Appendix A), LSNe in systems with large Einstein radii and better seeing are more likely to be recalled. Therefore, because the majority of the systems in this candidate sample should have resolvable features, human visual inspection becomes a viable approach for identifying the most interesting candidate LSNe.
A team of strong lensing experts within DES inspected 6-year coadded, color-composite images of the 2,245 candidate LSN systems to search for lensing, similar to how precursor strong lensing searches have been carried out. The team assigned all objects a score using the following system:
1: The detection is an image artifact, such as a diffraction spike or contamination from a bright foreground star;
2: There is a single object, such as a galaxy or star;
3: There are multiple objects with no evidence of lensing, such as SNe or clusters of galaxies;
4: There are multiple objects with evidence of lensing.
Using this system and the median score for each object, the team of inspectors identified 522, 802, 871, and 50 objects with scores "1," "2," "3," and "4," respectively. For the 50 systems with evidence of lensing, we extracted aperture lightcurves for each object in each system from the DES single-epoch images. Three of these 50 systems were identified to have SN-like time variability, and we upgraded their overall score to a "5." Candidate systems scored as a "4" or "5" are presented in Figures 4 and 5, respectively, and have their properties collected in Table B.1. The 50 candidate systems scored at or above a "4" found in this analysis show evidence of lensing in their images. There are still non-lenses in this sample: for example, DES-700492744 is a high-proper-motion white dwarf appearing as a red object between two blue point sources; nevertheless, we include all systems labeled as interesting by the labeling team for completeness. Some of these systems also show evidence of point sources within the lensing configurations: there are nearly circular objects positioned within the lensing configuration. Going further, we analyze the time variability of the candidate systems by extracting 5-year background-subtracted lightcurves for each source in the images. The objects scored as a "5" show evidence of SN-like time variability: a short rise followed by a steady decay in brightness over the course of approximately one month, as shown in Figure 5. The objects scored as a "4" do not show this temporal behavior; however, the possibility remains that some of the objects scored as a "4" are strongly lensed systems and potentially house a lensed quasar. Section 4.3 contains a detailed presentation of the three objects scored as a "5".
Lastly, we cross-match the 2,245 ZipperNet-identified systems with the systems identified during the DES 5-year photometric SN Ia cosmology analysis (Möller et al. 2022). In Möller et al. (2022), difference imaging (Kessler et al. 2015) identified 31,636 transients and SALT-II SN Ia lightcurve fitting (Guy et al. 2010) identified 2,381 single-season SNe from that sample of transients. The SNe selected by lightcurve fitting are more likely to be SNe Ia than SNe CC, and most SNe CC in the total sample are also excluded by the fitting. Furthermore, this selection procedure searches for normal SNe Ia and is not adapted for possible changes in the lightcurves from the lensing. In total, there is an overlap (using a 5″ radius) of 104 systems between the ZipperNet sample and the DES SN analysis transient sample. All but four overlapping systems (DES-691702170, DES-699127397, DES-699340227, and DES-700977591) were scored as either a "2" or a "3" by the labeling team, indicating no convincing evidence for lensing. The locations of the detected transients are marked in Figure 4. Only the transient in DES-699127397 passed the SALT-II SN Ia lightcurve fitting. The difference-imaging detections in DES-699340227 and DES-700977591 appear to be spurious detections due to image subtraction errors. Lastly, while the transients detected in DES-691702170 and DES-699127397 are likely SNe, these systems do not appear to be lenses and likely should have received lower grades from the labeling team; DES-691702170 lacks an obvious lensing galaxy and the positions of the galaxies in DES-699127397 are more likely a cluster of galaxies than multiple images of the same background galaxies due to their asymmetric alignment.
Based on the SN FPRs in Table 2, this overlap is consistent with the expected ZipperNet SN background. The three systems scored as a "5" by the visual inspection team, indicating the presence of both lensing and SN-like temporal behavior, were not included in the overlapping sample. We believe the faintness of the SNe or foreground contamination from the lensing galaxy may have contributed to the non-detection from difference imaging, though a full understanding of this discrepancy is beyond the focus of our analysis.
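A positional cross-match of this kind can be sketched with a great-circle separation and a fixed matching radius. The sketch assumes the matching radius is 5 arcseconds and uses a brute-force loop over made-up coordinates with hypothetical identifiers; for catalogs of realistic size a spatial index would be preferable.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation via the haversine formula; inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600


def cross_match(candidates, transients, radius_arcsec=5.0):
    """Return ids of candidates with at least one transient within the radius."""
    matched = []
    for cand_id, ra, dec in candidates:
        if any(angular_separation_arcsec(ra, dec, t_ra, t_dec) <= radius_arcsec
               for t_ra, t_dec in transients):
            matched.append(cand_id)
    return matched


# Made-up positions (degrees): one transient 2 arcsec from cand-a, one 10 arcsec from cand-b.
candidates = [("cand-a", 40.0, -0.5), ("cand-b", 41.0, -1.0)]
transients = [(40.0, -0.5 + 2 / 3600), (41.0, -1.0 + 10 / 3600)]
matched = cross_match(candidates, transients)
separation = angular_separation_arcsec(40.0, -0.5, 40.0, -0.5 + 2 / 3600)
```

Only `cand-a` is matched in this toy example, since the nearest transient to `cand-b` lies outside the 5 arcsecond radius.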

Final LSN Candidates
The three most interesting systems identified by the ZipperNet and subsequent human visual inspection are DES-691022126, DES-701263907, and DES-699919273. We present 5-year color-composite coadded images of these systems and extract lightcurves for each object of interest within them in Figure 5. From the lightcurves, the five observing seasons of the DES SN program are easily distinguishable, and we refer to each observing season as "Y1" through "Y5." We extract the lightcurves from the single-epoch images by summing the pixels in the aperture displayed in the coadded image, subtracting the sky background measured by DES, and converting to a magnitude using the zeropoint measured by DES. Importantly, the magnitudes are the combination of all objects within the aperture, so, for example, an SN lightcurve will contain contamination from its host galaxy. All estimated Einstein radii have been obtained by measuring the angular separations between objects, as opposed to a full modeling of the lensing system. We choose to present only the z-band lightcurves for these visualizations for simplicity, though all four bands were assessed to identify SN-like temporal behavior. The bluer bands such as g and r have larger PSFs in this dataset compared to the redder i and z bands, leading to noisier aperture photometry measurements. Furthermore, LSNe are likely to be at high redshifts, leading to a tendency for LSN temporal signatures to be most visible in the redder bands.
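The conversion from summed aperture pixels to a magnitude follows the usual relation $m = \mathrm{ZP} - 2.5 \log_{10}(\mathrm{flux})$. The sketch below assumes the sky background is supplied as a per-pixel level and uses made-up pixel values and a made-up zeropoint; the function name is hypothetical.

```python
import math


def aperture_magnitude(pixel_values, sky_per_pixel, zeropoint):
    """Magnitude from summed aperture pixels after sky subtraction.

    Returns None when the sky-subtracted flux is not positive, because the
    magnitude is undefined in that case.
    """
    flux = sum(pixel_values) - sky_per_pixel * len(pixel_values)
    if flux <= 0:
        return None
    return zeropoint - 2.5 * math.log10(flux)


# Toy aperture: 100 pixels at 110 counts over a sky level of 10 counts per pixel.
pixels = [110.0] * 100
magnitude = aperture_magnitude(pixels, sky_per_pixel=10.0, zeropoint=30.0)

# An aperture containing only sky has no measurable flux.
undefined = aperture_magnitude([10.0] * 100, sky_per_pixel=10.0, zeropoint=30.0)
```

Because every pixel in the aperture is summed, the resulting magnitude blends all sources inside it, which is the host-galaxy contamination noted above.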
DES-691022126 is a system of four objects labeled in the top panel of Figure 5 as A, B, C, and D. We interpret objects C and D to be galaxies based on their constant brightness over time. Objects A and B are much redder, and display a greater degree of brightness variability when looking at the typical size of the magnitude error bars compared to the 5-year median z-band magnitude for each object. Furthermore, the lightcurves for objects A and B both contain a period of linear decline in magnitude on month timescales: object A in Y5 and object B in Y3. ZipperNet detected the system in Y2 and Y3, but not in Y5. We believe it detected the linear decline of object B in Y3 and that perhaps object C contained light from a SN between Y2 and Y3, the beginning of which ZipperNet detected. The fact that the linear decline of object A's brightness in Y5 was not detected by ZipperNet is likely due to object A being the faintest source in the system and the selection function of ZipperNet (see the bottom right panel of Figure A). The SN-like lightcurve features that are shared between objects A and B, when combined with the evidence for lensing with an Einstein radius of approximately 1.7″, support the claim of the system as an LSN.
By comparison, DES-701263907 is a much more complicated system shown in the middle panel of Figure 5. A large foreground galaxy (SIMBAD source LEDA 135660) at redshift 0.03 dominates the image. Object B (SIMBAD source SDSS J024352.54-003708.4) is cataloged as a galaxy also at redshift 0.03, but may also be a dense, star-forming region based on its blue color. This dense area, combined with the gravitational potential of LEDA 135660 itself, would have a large lensing cross section, increasing the likelihood that background objects would be lensed. Because the lightcurve extraction method used in the lightcurves of Figure 5 does not subtract the effect of the host galaxy, the variability of these objects cannot be assessed without difference imaging techniques outside the scope of this paper. Nonetheless, we identify object A as the most variable source in the system given the foreground contamination. In the image cutout for DES-701263907, we also note the location of an SN detected in September of 2020 (AT2020scq). It is possible that object B acts as a primary lensing galaxy, object A is an LSN identified by ZipperNet in 2018, and AT2020scq is a second appearance of object A delayed by approximately two years. Given the large foreground galaxies at redshift 0.03 and a potential Einstein radius of ≈3.0″, this time delay would be consistent with an LSN.
Lastly, DES-699919273 is another four-object system that we enumerate as A, B, C, and D in the bottom panel of Figure 5. We interpret object C as the lensing galaxy, object D as an image of the source galaxy without an SN, and objects A and B as images of the source galaxy, where an SN was present at some point during DES observations. The Einstein radius for this system is ≈2.1″. In particular, we note a linear decline in z-band magnitude for object A in Y3 and a nearly identical linear decline in z-band magnitude for object B in Y5. ZipperNet detected the linear decline in Y5, but had no such detection in Y3. We interpret this event as another manifestation of the less-than-perfect recall of the classifier. Nonetheless, the SN-like temporal signal appearing in two of the images within a lensing geometry is evidence for the presence of an LSN.

DISCUSSION
The method presented in this analysis contains a few areas where improvements could increase the LSN recall while decreasing the false positive rate. One such change is to add centroiding to account for sub-pixel-level shifts in position prior to stacking and averaging the images. While the offsets are small, misalignment at the ≈ 0.25″ scale can cause image-based features to become less sharp and harder for a convolutional layer to identify. When stacking the images, it may also boost performance to include only the images with high image quality (e.g., seeing above some quality threshold and/or cloudiness below some quality threshold): this would ensure that the ability to resolve features in the resulting composite image is limited only by the instrumentation. These possibilities focus on improving the appearance of spatial features in the data to boost ZipperNet's ability to learn relationships and are motivated by the analysis of the physical selection function of our approach, which is described in Appendix A.
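As a minimal sketch of the image-quality cut suggested above, the following averages only the epochs whose seeing passes a cut. The function name `stack_good_epochs`, the per-epoch `seeing_fwhm` values, and the 1.2″ limit are hypothetical choices for illustration and are not values used in this analysis; centroiding is not included.

```python
import numpy as np

def stack_good_epochs(images, seeing_fwhm, max_fwhm=1.2):
    """Average only the epochs whose seeing FWHM is at or below max_fwhm.

    images      : array of shape (n_epochs, height, width) for one band
    seeing_fwhm : array of shape (n_epochs,), hypothetical per-epoch seeing in arcsec
    max_fwhm    : hypothetical quality cut; not a value from this analysis
    """
    images = np.asarray(images, dtype=float)
    seeing_fwhm = np.asarray(seeing_fwhm, dtype=float)
    keep = seeing_fwhm <= max_fwhm
    if not keep.any():
        # Fall back to the full stack rather than returning an empty image.
        keep = np.ones_like(keep, dtype=bool)
    return images[keep].mean(axis=0), int(keep.sum())

# Toy example: three 2x2 epochs, the last one taken in poor seeing.
epochs = np.array([np.full((2, 2), 1.0), np.full((2, 2), 3.0), np.full((2, 2), 100.0)])
stacked, n_used = stack_good_epochs(epochs, seeing_fwhm=[0.9, 1.1, 2.0])
```

In this toy case the poor-seeing epoch is excluded, so the composite is the mean of the first two epochs only. A cloudiness cut could be added as a second boolean mask combined with `keep`.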
The lightcurve extraction step of the data preprocessing could also be improved by discarding common artifacts such as diffraction spikes and saturated pixels to avoid contaminating the extracted brightness. Similarly, an analysis of the clarity of features in the lightcurves as a function of the aperture radius used in the lightcurve extraction may find that a different aperture radius leads to higher performance. It is possible that scaling the time series image sets to have a standardized mean and a standardized variance of pixel value prior to preprocessing would lead to smoother lightcurves. Finally, the preprocessing steps used in this analysis down-selected images to standardize the cadence, though other approaches have demonstrated success with arbitrary numbers of images in the time series (Kodi Ramanah et al. 2022). Removing the need for a standardized cadence would greatly improve the applicability of this approach to real-time LSNe identification and remove the need for images to be discarded.
It is possible that the machine learning aspect of the analysis could be improved with subtle changes to the training set. For example, when simulating lensed systems, we made the simplifying approximation that the mass profile ellipticity was equivalent in angle and strength to the light profile ellipticity. While there is likely to be a strong correlation between the mass and light profiles, the exclusion of training set examples with different relationships between mass profile ellipticity and light profile ellipticity may bias LSN selection to systems in which these quantities are highly correlated. We also employed a uniform distribution of Einstein radii, and it is possible that an approach such as that of Kodi Ramanah et al. (2022), with a physically motivated distribution, could lead to improved performance.
In consideration of a real-time LSN detection pipeline, a couple of changes to the methodology may improve performance. We envision the 10-epoch time series image sets being constructed as observations are ongoing: after a new image of a system is collected, the first image in the time series is discarded and a new 10-epoch sequence is created. There are two downsides to that approach: (1) there is an implicit requirement of 10 epochs before the trained ZipperNet can be utilized, and (2) our final selection criterion on the fraction of subsequences scored above the ZipperNet threshold requires additional epochs to create and track multiple subsequences. With the improvements to the preprocessing discussed above, it may be possible to boost the ZipperNet performance to the point where the additional selection criterion can be removed. Furthermore, we did not experiment with time series image sets with fewer than 10 epochs, and it is possible that the analysis can be performed with a less strict requirement on the total number of epochs.
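The rolling 10-epoch construction and the subsequence-fraction criterion described above can be sketched as follows. This is an illustrative sketch only: the helper names and the toy scores are hypothetical, and the threshold of 0.760 is the operating threshold quoted in the caption of Figure 2. The minimum fraction required for selection is not restated here, so it is left out.

```python
def sliding_windows(n_epochs, window=10):
    """Return (start, stop) index pairs for every consecutive window of epochs."""
    return [(start, start + window) for start in range(n_epochs - window + 1)]

def fraction_above_threshold(scores, threshold=0.760):
    """Fraction of subsequence scores that exceed the operating threshold."""
    if not scores:
        return 0.0
    return sum(score > threshold for score in scores) / len(scores)

# A system observed for 13 epochs yields 4 overlapping 10-epoch subsequences.
windows = sliding_windows(n_epochs=13)
# Made-up classifier outputs, one per subsequence (not real ZipperNet scores).
toy_scores = [0.91, 0.80, 0.42, 0.77]
fraction = fraction_above_threshold(toy_scores)
```

Both downsides noted above are visible in the sketch: `sliding_windows` returns an empty list until 10 epochs exist, and the fraction only becomes informative once several subsequences have accumulated.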
With the current configuration, we have successfully reduced a catalog of 3,459,186 objects to 2,245 with our deep learning approach, and proceeded to identify 50 systems of interest through human visual inspection, three of which show some evidence of an LSN. While we do not confirm or further characterize these three systems of interest, they all contain lensing features and point sources, as found during the human visual inspection. Full characterization would entail Scene Modeling Photometry (Brout et al. 2019) to obtain lightcurves without host galaxy contamination, photometric classification of that time-series photometry, redshift measurements for all objects in the system, and lens modeling, which are beyond the scope of this search. Because any detected LSNe would have faded by now, follow-up observations to confirm LSNe are unlikely to provide any additional information apart from redshifts. However, several of the systems of interest were detected by ZipperNet in multiple years throughout DES operations. Therefore, these persistent lensed systems with point sources offer interesting candidates for lensed quasar searches. The three most interesting candidates (DES-691022126, DES-701263907, and DES-699919273) are the most likely LSNe found by our ZipperNet in the magnitude-limited 5-year DES dataset utilized in this analysis. Given the approximate time delays and Einstein radii of the systems, spectroscopic redshifts and lens modeling could produce three independent measurements of $H_0$.
The ZipperNet architecture itself provides a new and powerful LSN identification tool going forward. The accompanying code for this analysis (Morgan 2022) also makes the data collection, processing, simulation, training, classification, and candidate selection routines available for future analyses. With first light from the Vera C. Rubin Observatory quickly approaching, setting up a pipeline to detect LSNe is vital for time-delay cosmography measurements. The analysis presented here and the suggested improvements provide a template for one such pipeline that would facilitate real-time detection of LSNe in short time series sequences of images without a dependence on traditional and computationally expensive image processing algorithms.

CONCLUSION
This analysis presents the application of a deep learning LSN detection algorithm to an observed optical survey dataset. The algorithm utilizes a novel neural network architecture called a ZipperNet that simultaneously learns characteristic features from image and temporal data to identify LSNe in DES data. Using a ZipperNet trained on simulated LSNe that are injected into the DES SN field data, along with a selection criterion on the number of detections for each system, our approach performs with an LSN recall of 61.13% and a false positive rate of 0.02%. This technique identified 2,245 candidate LSN systems in the DES SN fields, and a human visual inspection found 50 systems of interest, three of which contained evidence of a time-variable lensed source. Confirmation of these candidates of interest is left for future work, and these systems may facilitate direct measurements of $H_0$ when fully characterized. Looking to the Rubin Observatory era, the approach developed in DZ1 and implemented on the DES SN fields here has the potential to aid in the identification of several hundred LSNe.

The "Coadd Id." is from the DES Y3 GOLD Catalog. The "Years Detected" indicate the years of DES data collection during which the candidate was selected by ZipperNet. The "Redshift" values are either photometric estimates from DES (shown to three significant digits) or spectroscopic measurements from OzDES (Yuan et al. 2015) and refer to the candidate lensing galaxy.

B. CANDIDATE METADATA
This appendix lists properties of systems detected by ZipperNet and scored as a "4" or a "5" by human visual inspection.

Figure 1. Examples of systems from our training dataset. The composite image is an RGB visualization of the averaged gri images and the scaled brightnesses are the values extracted from the g (blue "×"), r (green triangles), i (orange circles), and z (red squares) images at each time step in the time series image set using the aperture method presented in DZ1.

sequence. Finally, we scale the pixel values of the averaged images and the extracted brightness values linearly to the range 0 to 1 on a per-example basis. The resulting input to the ZipperNet is two different kinds of data: (1) a scaled image in each of the griz bands as a 4 × 45 × 45-element array and (2) a scaled 10-step lightcurve in each of the griz bands as a 4 × 10-element array. After processing the training dataset into 10-step sequences and downsampling to maintain equal representation of the positive and negative classes, we have a total of 1,000,012 training examples. We split these examples into 90% training and 10% validation datasets.
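The per-example linear scaling and the two input shapes described above can be illustrated with a short sketch. The function name `minmax_scale` and the random stand-in data are hypothetical. The text does not state whether the bands are scaled jointly or separately, so the sketch assumes each array is scaled as a whole.

```python
import numpy as np

def minmax_scale(example):
    """Linearly rescale one example so that its values span 0 to 1."""
    example = np.asarray(example, dtype=float)
    low, high = example.min(), example.max()
    if high == low:
        # A constant example has no contrast; mapping it to zeros is an assumption.
        return np.zeros_like(example)
    return (example - low) / (high - low)

rng = np.random.default_rng(0)
# Random stand-ins for one example: averaged griz images and a 10-step griz lightcurve.
image = minmax_scale(rng.normal(size=(4, 45, 45)))
lightcurve = minmax_scale(rng.normal(size=(4, 10)))
```

Scaling per example removes the absolute flux level, so the network sees only the relative morphology and the relative shape of the lightcurve.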

Figure 2. Left: A Receiver Operating Characteristic (ROC) curve showing the LSN true positive rate and LSN false positive rate for all possible values of the ZipperNet operating threshold. The operating threshold of 0.760 is chosen to minimize the false positive rate to the point immediately prior to the true positive rate declining rapidly. Right: Histograms of the scaled ZipperNet scores for each class in the validation dataset. The selected operating threshold limits false positives from all systems in the negative class while keeping the majority of the positive class.
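Each point on an ROC curve such as the one in Figure 2 is a pair of rates evaluated at one threshold. The sketch below computes that pair for a single threshold. The function name, the toy scores, and the toy labels are hypothetical and do not reproduce the validation data. The sketch assumes that both classes are present.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (true positive rate, false positive rate) at one threshold.

    labels: 1 for LSN (positive class), 0 for background (negative class).
    Assumes at least one example of each class is present.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s > threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg

# Made-up scaled scores: three positives followed by five negatives.
toy_scores = [0.95, 0.81, 0.60, 0.78, 0.30, 0.10, 0.05, 0.20]
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
tpr, fpr = rates_at_threshold(toy_scores, toy_labels, threshold=0.760)
```

Sweeping `threshold` from 0 to 1 and collecting the resulting pairs traces out the full curve. In this toy set, two of the three positives and one of the five negatives score above 0.760.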

Figure 3. The number of time series image set subsequences scored above the ZipperNet threshold for each type of object in our validation dataset. On average, LSNe time series image sets are scored above the ZipperNet threshold in a higher fraction of their subsequences than all types of negative examples.

Figure 4. Candidate systems detected by ZipperNet that showed evidence of lensing but did not show SNe-like variability in their lightcurves. The properties of these candidates are collected in Table B.1. Difference imaging detections from the DES SN group are shown with white star markers.

Figure 5. The candidate LSNe identified by ZipperNet and human visual inspection. The aperture used to extract the magnitude measurement from each source is shown and annotated on the coadded image. The properties of the candidates are collected in Table B.1.

Table 1. ZipperNet layer specifications. We adopt the following shorthand: kernel size (k), padding (p), stride (s), and hidden units (h). Arrows indicate the change in the size of the data representation as it is passed through the layer. "†" indicates a Rectified Linear Unit (ReLU) activation function. "‡" indicates a LogSoftmax activation function. In total, our model contains 4,148,225 trainable parameters.

Table 2. Metrics for evaluating the performance of ZipperNet and our final sample selection that are robust against the class representations of the validation dataset. All metrics are defined in Section 4.1. "†" indicates the use of an upper limit on the metric value resulting from limited statistical precision.
- a true negative (TN) is a galaxy, galaxy-galaxy lens, or unlensed SN, and it is classified as background; and

R. Morgan thanks the Universities Research Association Fermilab Visiting Scholars Program for funding his work on this project. R. Morgan also thanks the LSSTC Data Science Fellowship Program, which is funded by LSSTC, NSF Cybertraining Grant #1829740, the Brinson Foundation, and the Moore Foundation; his participation in the program has benefited this work. We acknowledge the Deep Skies Lab as a community of multi-domain experts and collaborators who've facilitated an environment of open discussion, idea generation, and collaboration. This community was important for the development of this project. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. 1744555. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Funding for the DES Projects has been provided by the U.S. Department of Energy, the U.S. National Science Foundation, the Ministry of Science and Education of Spain, the Science and Technology Facilities Council of the United Kingdom, the Higher Education Funding Council for England, the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, the Kavli Institute of Cosmological Physics at the University of Chicago, the Center for Cosmology and Astro-Particle Physics at the Ohio State University, the Mitchell Institute for Fundamental Physics and Astronomy at Texas A&M University, Financiadora de Estudos e Projetos, Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro, Conselho Nacional de Desenvolvimento Científico e Tecnológico and the Ministério da Ciência, Tecnologia e Inovação, the Deutsche Forschungsgemeinschaft and the Collaborating Institutions in the Dark Energy Survey.
The Collaborating Institutions are Argonne National Laboratory, the University of California at Santa Cruz, the University of Cambridge, Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas-Madrid, the University of Chicago, University College London, the DES-Brazil Consortium, the University of Edinburgh, the Eidgenössische Technische Hochschule (ETH) Zürich, Fermi National Accelerator Laboratory, the University of Illinois at Urbana-Champaign, the Institut de Ciències de l'Espai (IEEC/CSIC), the Institut de Física d'Altes Energies, Lawrence Berkeley National Laboratory, the Ludwig-Maximilians Universität München and the associated Excellence Cluster Universe, the University of Michigan, NSF's NOIRLab, the University of Nottingham, The Ohio State University, the University of Pennsylvania, the University of Portsmouth, SLAC National Accelerator Laboratory, Stanford University, the University of Sussex, Texas A&M University, and the OzDES Membership Consortium.
Based in part on observations at Cerro Tololo Inter-American Observatory at NSF's NOIRLab (NOIRLab Prop. ID 2012B-0001; PI: J. Frieman), which is managed by the Association of Universities for Research in Astronomy (AURA) under a cooperative agreement with the National Science Foundation.
This manuscript has been authored by Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the U.S. Department of Energy, Office of Science, Office of High Energy Physics.

Software: astropy (Astropy Collaboration et al. 2013), deeplenstronomy (Morgan et al. 2021), lenstronomy (Birrer & Amara 2018; Birrer et al. 2021), matplotlib (Hunter 2007), numpy (Harris et al. 2020), pandas (McKinney et al. 2010), PlotNeuralNet (Iqbal 2018), PyTorch (Paszke et al. 2019), Scikit-Learn (Pedregosa et al. 2011), scipy (Virtanen et al. 2020).

Table B.1. Properties of the systems detected by ZipperNet that received a score of 4 or 5 during the human visual inspection.