The following article is Open access

The Most Interesting Anomalies Discovered in ZTF DR17 from the SNAD-VI Workshop

Published July 2023 © 2023. The Author(s). Published by the American Astronomical Society.
Citation: Alina Volnova et al 2023 Res. Notes AAS 7 155, DOI 10.3847/2515-5172/ace9dd

Abstract

The SNAD team has developed an adaptive learning algorithm, named Pine Forest (PF), to enhance anomaly detection in astronomical data. Recognizing the essential role of human engagement in the discovery process, PF presents outliers to a human expert for review, and filters out trees which disagree with the feedback provided. During the sixth annual SNAD workshop (https://snad.space/2023/), held in 2023 July, we applied PF to the Zwicky Transient Facility's DR17 data. Interesting discoveries include long-duration objects such as supernovae, along with fast transients like red dwarf flares and one microlensing event. As a result, new variable stars were identified and labeled in the SNAD knowledge database.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

The SNAD team is an international group of researchers that has been working on anomaly detection in astronomy since 2018. Over these years, we have explored different anomaly detection algorithms and applied them to various astronomical data sets (Pruzhinskaya et al. 2019; Malanchev et al. 2021; Aleo et al. 2022). Our accumulated experience has led us to the conclusion that human involvement is crucial during the search process (Ishida et al. 2021; Pruzhinskaya et al. 2023). As a result, we developed our own adaptive learning algorithm, named Pine Forest (PF), which is based on a tree filtering approach. It starts as a standard Isolation Forest, but after the first run the object with the highest anomaly score is shown to a human expert. The expert labels the outlier "yes" or "no," indicating whether it is a scientifically interesting anomaly. Given this feedback, PF filters out trees which assign a high anomaly score to objects the expert judges "not interesting."
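The tree-filtering idea described above can be illustrated with a toy, pure-Python sketch. It is not the SNAD implementation of PF: the tree builder, the depth-based score, the `keep` fraction, and every function name here are assumptions made for illustration, and the usual Isolation Forest path-length normalization is omitted for brevity.

```python
import random


def build_tree(points, rng, depth=0, max_depth=8):
    """Recursively split points with random axis-aligned cuts."""
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}  # leaf
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}  # cannot split further along this axis
    cut = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    return {
        "dim": dim,
        "cut": cut,
        "left": build_tree(left, rng, depth + 1, max_depth),
        "right": build_tree(right, rng, depth + 1, max_depth),
    }


def depth_of(tree, x, depth=0):
    """Number of splits before x reaches a leaf; shallow means isolated early."""
    if "dim" not in tree:
        return depth
    branch = "left" if x[tree["dim"]] < tree["cut"] else "right"
    return depth_of(tree[branch], x, depth + 1)


def active_loop(data, ask_expert, n_trees=100, keep=0.8, budget=5, seed=0):
    """Show the top outlier to the expert, then drop trees that disagree.

    `ask_expert(index)` is a hypothetical callback returning True ("yes",
    interesting) or False ("no"). `keep` is an assumed fraction of trees
    retained after each "no" answer.
    """
    rng = random.Random(seed)
    sample_size = min(64, len(data))
    trees = [build_tree(rng.sample(data, sample_size), rng) for _ in range(n_trees)]
    labels = {}
    for _ in range(budget):
        # Anomaly score: negative mean depth, so early isolation scores high.
        scores = [
            (-sum(depth_of(t, x) for t in trees) / len(trees), i)
            for i, x in enumerate(data)
            if i not in labels
        ]
        _, top = max(scores)
        labels[top] = ask_expert(top)
        rejected = [data[i] for i, ok in labels.items() if not ok]
        if rejected:
            # Trees that isolate rejected objects quickly rank last and are cut.
            trees.sort(key=lambda t: sum(depth_of(t, x) for x in rejected), reverse=True)
            trees = trees[: max(1, int(keep * len(trees)))]
    return labels, trees
```

In this sketch only "no" answers trigger filtering, mirroring the description above; how "yes" answers are weighted is left out because the text does not specify it.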

During the SNAD-VI Workshop, we applied the PF algorithm to real data.

2. Data

We used the public Data Release 17 (DR17) of the Zwicky Transient Facility survey (ZTF; Bellm et al. 2019), restricted to the period from 2020 December 1 to 2021 November 6 (59184 ≤ MJD ≤ 59524). This period includes only data from ZTF Phase II and therefore offers a more homogeneous data set: unlike Phase I, the public Phase II survey follows a 2-night cadence of zg-band and zr-band observations of the northern sky. The end of the considered time interval is set by the availability of the disclosed ZTF Private Survey data. Since our primary goal was to search for anomalies among the transients, we analyzed objects located more than 15° above the Galactic plane. Our final sample comprises approximately 67 million light curves with at least 100 photometric points in the zr band (catflags = 0). Each zr-band light curve was characterized by 54 features (e.g., amplitude and periodogram peak), which are suitable for analyzing transients and variable stars. The full description of the features can be found in Malanchev et al. (2021) and Malanchev (2021).
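As a quick check of the quoted interval, and as a minimal sketch of how the selection cuts might be expressed, the snippet below converts the MJD bounds to calendar dates and tests whether a light curve keeps enough clean epochs. The function names are hypothetical, and applying the 100-point threshold inside the MJD window is an assumption about the order of the cuts, not something the text states.

```python
from datetime import date, timedelta

MJD_EPOCH = date(1858, 11, 17)   # MJD 0 corresponds to 1858 November 17
MJD_MIN, MJD_MAX = 59184, 59524  # interval quoted in the text


def mjd_to_date(mjd):
    """Calendar date on which the given MJD (truncated to an integer) begins."""
    return MJD_EPOCH + timedelta(days=int(mjd))


def passes_selection(mjd, catflags, min_points=100):
    """True if at least `min_points` epochs have catflags == 0 inside the window."""
    clean = [
        t for t, flag in zip(mjd, catflags)
        if flag == 0 and MJD_MIN <= t <= MJD_MAX
    ]
    return len(clean) >= min_points


print(mjd_to_date(MJD_MIN), mjd_to_date(MJD_MAX))
```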

3. Results

The features extracted from the entire sample served as input for the PF algorithm. In each PF run, the expert inspected a total budget of 40 objects.

The algorithm showcases its capability to learn from the expert and adapt to the expert's opinion. However, despite targeting specific transients, it retains the ability to present outliers with completely different light curve behaviors. Thus, among the anomalies we found, there were long-duration objects (e.g., the known peculiar SN 2020uem, OID = 414308100011056, a possible thermonuclear explosion within a dense circumstellar medium; Uno et al. 2023) and fast transients (e.g., the red dwarf flare OID = 762202400015565). Among the potentially interesting sources is the radio source NVSS J080730+755017 with an optical counterpart, OID = 858205100001741. This source is characterized by a very long variation period (see Figure 1) and could be an active galactic nucleus. Another source is the binary microlensing event AT 2021uey, OID = 643210400013909, which is classified as a microlensing candidate by the Fink broker using the public ZTF alert stream (ZTF18abktckv), and as other types by the Gaia (Gaia21dnc) and ASAS-SN (ASASSN-21mc) surveys. During the inspection of anomaly candidates, we also identified and labeled new variable stars (e.g., ZTF18acaochk, OID = 1847208300004447) and some artefacts in the SNAD knowledge database.

Figure 1. Top: binary microlensing event AT 2021uey, OID = 643210400013909 (https://ztf.snad.space/view/643210400013909). Bottom: radio source NVSS J080730+755017 with an optical counterpart, OID = 858205100001741 (https://ztf.snad.space/view/858205100001741).

Acknowledgments

We used equipment funded by the Lomonosov Moscow State University Program of Development. The authors acknowledge support from the Interdisciplinary Scientific and Educational School of Moscow University "Fundamental and Applied Space Research." P.D.A. is supported by the Center for Astrophysical Surveys (CAPS) at the National Center for Supercomputing Applications (NCSA) as an Illinois Survey Science Graduate Fellow. V.V.K. is supported by the Ministry of Science and Higher Education of the Russian Federation, topic No. FEUZ-2020-0038.
