Mining archival data from wide-field astronomical surveys in search of near-Earth objects

Increasing our knowledge of the orbits and compositions of Near-Earth Objects (NEOs) is important for a better understanding of the evolution of the Solar System and of life. The detection of serendipitous NEO appearances among the millions of archived exposures from large astronomical imaging surveys can provide a contribution which is complementary to NEO surveys. Using the AstroWISE information system, this work aims to assess the detectability rate, the achieved recovery rate and the quality of astrometry when data mining the ESO archive for the OmegaCAM wide-field imager at the VST. We developed an automatic pipeline that searches for the NEO appearances inside the AstroWISE environment. Throughout the recovery process, the pipeline uses several public web-tools to identify possible images that overlap with the position of NEOs, and acquires information on the NEOs predicted position and other properties (e.g., magnitude, rate and direction of motion) at the time of observations. We have recovered 196 appearances of NEOs from a set of 968 appearances predicted to be recoverable. It includes appearances for three NEOs which were on the impact risk list at that point. These appearances were well before their discovery. The subsequent risk assessment using the extracted astrometry removes these NEOs from the risk list. We estimate a detectability rate of 0.05 per NEO at an SNR>3 for NEOs in the OmegaCAM archive. Our automatic recovery rates are 40% and 20% for NEOs on the risk list and the full list, respectively. The achieved astrometric and photometric accuracy is on average 0.12 arcsec and 0.1 mag. These results show the high potential of the archival imaging data of the ground-based wide-field surveys as useful instruments for the search, (p)recovery and characterization of NEOs. Highly automated approaches, as possible using AstroWISE, make this undertaking feasible.


Introduction
Minor bodies in planetary systems potentially have an intimate relationship with life. Minor bodies might provide complex molecules to planets to facilitate the formation of life (e.g., Oba et al. 2022), while later impacts can play a significant role in its evolution through climate changes and mass extinctions (e.g., Alvarez et al. 1980). Deepening our understanding of this potential relationship is a high priority on both the European science agenda (e.g., Cosmic Voyage 2050, https://www.cosmos.esa.int/web/voyage-2050) and the US science agenda (e.g., "Origins, Worlds, and Life: A Decadal Strategy for Planetary Science and Astrobiology 2023-2032", https://www.nationalacademies.org/our-work/planetary-science-and-astrobiology-decadal-survey-2023-2032#sectionPublications). Near-Earth Objects (NEOs) are a reservoir that surely interacts with Earth's biosphere and represents physically the closest instance of minor bodies at our disposal. NEOs are asteroids or comets whose perihelion occurs at less than 1.3 astronomical units (au), meaning that close approaches with the Earth might occur at some point. The size of these objects ranges from meters to tens of kilometres. Currently, more than 30 000 NEOs are catalogued by the Minor Planet Center (MPC) and the discovery rate has reached the order of thousands per year. While we are certain that the vast majority of the largest NEOs have already been discovered, very little is known about the majority: those in the range of a few meters up to about 150 meters.
With NEOs, the minor body ↔ life relationship can be approached from a scientific perspective: understanding the role of minor bodies in the origin and long-term evolution of life in a planetary system, such as our own Solar System. For example, NEOs might deliver the meteorites that provide the "starter set" for life in the form of prebiotic molecules (Oba et al. 2022). Furthermore, NEO compositions can put constraints on competing formation scenarios for rocky planets like Earth (Burkhardt et al. 2021). Thus the characterization of the orbits and physical composition of Near-Earth Objects is valuable to advance our scientific understanding of the minor body ↔ life relationship.
The minor body ↔ life relationship can also be approached from a societal perspective: the impact hazard poses a significant threat to technological civilizations on a planet (Borovička et al. 2013, Popova et al. 2013, Brown et al. 2013) and calls for planetary defence strategies. The Tunguska and Chelyabinsk impacts are recent reminders. Advanced technological civilizations on a planet can also use their neighbouring minor bodies to mine (rare) chemical elements and compounds (Hein et al. 2020), possibly as a foraging stop in space exploration (Davis et al. 1993).
In conclusion, characterization of the orbits and physical composition of Near-Earth Objects is valuable both to advance our scientific understanding of the formation and evolution of planetary systems and life, and to serve societal goals related to planetary defence and space-based technological infrastructures.
Small NEOs are only observable from the Earth during their brief close approaches with our planet. Every day, observatories such as Pan-STARRS (Vereš et al. 2015), Catalina (Christensen et al. 2019), ATLAS (Tonry et al. 2018) and the Zwicky Transient Facility (ZTF, Bellm et al. 2019) discover new NEO candidates which need observational follow-up for their confirmation. Providing follow-up observations of these objects is not always easy, since their changing observational conditions, usually linked to a growing position uncertainty, make them challenging targets for most telescopes. Particularly challenging are the observations of some subgroups of NEOs such as the Interior Earth Objects (IEOs, Sheppard et al. 2022) or Earth companions such as the Trojans (see for instance Santana-Ros et al. 2022). Once the close approach with the Earth ends, it is impossible to gather any further data on the NEO until the next close approach, which might be several years away. The only way to get additional information about these bodies once their observational period is over is to rely on data mining. Therefore, it is important to systematically mine and monitor new archival data for the appearance of these objects. This can be done through serendipitous detections in archival observations not dedicated to Near-Earth Objects. For example, the Dark Energy Survey (DES, Abbott et al. 2021; Bernardinelli et al. 2022; Banda-Huarca et al. 2019), the Kilo-Degree Survey (KiDS, Mahlke et al. 2021) and the VISTA Hemisphere Survey (VHS, McMahon et al. 2013; Popescu et al. 2016) have been used to search for minor bodies, including NEOs. Similarly, upcoming wide-area surveys such as ESA's Euclid mission (Carry 2018; Pöntinen et al. 2020) and the Legacy Survey of Space and Time (Grav et al. 2016; Jones et al. 2018) and their combination (Guy et al. 2022) will have a role in NEO astrometric and photometric observations. These astronomical surveys provide complementary data, as they are often deeper and cover regions farther from the ecliptic than NEO-dedicated surveys. In particular, NEO precovery (the detection of a known NEO in an observing dataset obtained prior to its discovery) provides information about the NEO's motion at an earlier epoch, likely at a complementary location in its orbit. This matters because the orbit uncertainty depends on the fraction of the orbital arc that is covered during discovery. (P)recovering one or more points far away from the discovery and immediate follow-up observations can significantly improve the accuracy of the orbital parameters. This is especially true when re-calibrating archival observations to astrometric reference catalogues such as Gaia which postdate the observations. In this way, astrophysical missions not only serve a scientific purpose but can also yield a societal spin-off.
This paper reports an exploratory pilot for that societal spin-off. The pilot re-uses not only astrophysical imaging data for precovery but also the investments made in an information system called AstroWISE, making it generically capable of precovery in a wide range of astronomical archives within a single data flow environment and with a single precovery pipeline. This exploratory AstroWISE precovery pilot was driven by the following quantitative and qualitative questions:
1. What fraction of detectable NEOs can we precover? And how automated can we get the precovery?
2. What astrometric and photometric accuracy can be achieved?
3. What level of automation can be achieved in the precovery workflow?
4. What are the major challenges for exploiting various astronomical surveys for NEO space safety purposes?
5. What value do these precoveries have for planetary defence?
6. Do precoveries have benefits for NEO science?
The structure of the paper is as follows. Section 2 describes the data and the lists of NEOs used for precovery. In section 3, the pipeline is described in detail. In section 4, the findings and results of NEO precovery are presented, and in section 5 we summarize the main results of this work and outline the next steps.

Data, software systems & applications
In this work, we use the archival imaging data of OmegaCAM, the wide-field imaging camera of the VLT Survey Telescope (VST) at ESO's Cerro Paranal Observatory. In the past, OmegaCAM has been used for several wide-field surveys including KiDS (Kilo-Degree Survey, Kuijken et al. 2019), the VST-ATLAS survey (Shanks et al. 2015), the Fornax Deep Survey (FDS, Peletier et al. 2020) and VEGAS (Iodice et al. 2021). OmegaCAM has 32 science CCDs with a field of view of about 1° × 1° on the sky. In its first decade of science operation, OmegaCAM has covered a large fraction of the southern hemisphere (Fig. 1). Considering the large volume of the OmegaCAM dataset, we have developed a pipeline to automate the process of (p)recovery. Here we focus our (p)recovery effort on two datasets of NEOs:
1. NEO risk list: this list is provided by the ESA Near-Earth Objects Coordination Centre (ESA-NEOCC) and consists of about 1350 known NEOs with a non-negligible chance of impact in the next hundred years. This list is updated regularly based on the most recent observations. We use the list of 1 February 2022.
2. NEO full list: this list consists of all the known NEOs, about 30 000 sources. We used the version available on 1 June 2022.
These data are processed and analysed using several public web-tools to collect the predicted properties of NEOs, using the AstroWISE system (Begeman et al. 2013; McFarland et al. 2013a) for data management, image processing and calibration, and using dedicated software applications for streak detection, astrometry and photometry. We briefly introduce those software systems and software applications.
AstroWISE stands for Astronomical Wide-field Imaging System for Europe. AstroWISE is an environment consisting of hardware and software which is federated over institutes across Europe. It has been developed to scientifically exploit the ever-increasing avalanche of data produced by astronomical wide-field surveys. AstroWISE is an all-in-one system: it allows a scientist to process raw data, calibrate data, perform post-calibration scientific analysis and archive all the results in one environment. The system architecture links together all these commonly discrete steps in data analysis.

Precovery Pipeline
The precovery pipeline, shown in Fig. 2, consists of four main steps. During these steps, it interacts with several public web-tools to collect the predicted properties of NEOs, and uses the AstroWISE system for data management, image processing and calibration, while using dedicated software applications for streak detection, astrometry and photometry. The four steps are described next.

Preliminary spatiotemporal crossmatch to observational archive (step 1)
As the first step, the possible appearance of NEOs from the provided list in exposures in ESO's OmegaCAM archive was assessed using the Solar System Object Image Search (SSOIS) web service (Gwyn et al. 2012). We used SSOIS as follows:
1. For the NEOs, after a search by object name, we used the orbital elements provided by the Canadian Astronomy Data Centre (CADC) itself, which are a regularly updated copy of the MPC orbital element database.
2. SSOIS itself then generated an internal ephemerides database using the orbfit software package.
3. SSOIS' internal database of (RA, Dec) pointings and observation dates for ESO's OmegaCAM observations was used. This database was up to date with observations until approximately 1 November 2021.
4. The ephemerides database in point 2 was cross-matched in the 3-dimensional space of (observation date, RA, Dec) by SSOIS itself with the internal database of OmegaCAM observations in point 3. The cross-match algorithm is described in Gwyn et al. (2012).
5. Uncertainties in input orbital parameters and image pointing are not taken into account in the cross-match procedure. This means that the cross-match only produces a result when an NEO ephemeris for the exact orbital elements specified in point 1 falls inside the field of view (FoV) of an OmegaCAM exposure for the exact (observation date, RA, Dec) specified in point 3.
6. We did not ask SSOIS to refine the cross-match to the (X, Y) of a specific detector, as that functionality is not available (yet) for OmegaCAM.
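The positional part of such a cross-match can be illustrated with a minimal sketch. This is not the SSOIS algorithm (which is described in Gwyn et al. 2012); it is a flat-sky approximation that only checks whether an ephemeris position at the observation date falls inside a square 1° × 1° FoV centred on the pointing. The function name `inside_fov` and its parameters are hypothetical.

```python
import math

def inside_fov(ra_obj, dec_obj, ra_ptg, dec_ptg, fov_deg=1.0):
    """Return True if the object position lies inside a square FoV.

    All angles are in degrees. Flat-sky approximation (an assumption of
    this sketch): adequate for a ~1 degree field away from the poles.
    """
    # Wrap the RA difference into [-180, 180) so that fields straddling
    # RA = 0 are handled correctly.
    d_ra = (ra_obj - ra_ptg + 180.0) % 360.0 - 180.0
    # Convert the RA difference into an on-sky offset at the pointing declination.
    dx = d_ra * math.cos(math.radians(dec_ptg))
    dy = dec_obj - dec_ptg
    half = fov_deg / 2.0
    return abs(dx) <= half and abs(dy) <= half
```

As in point 5 above, this check ignores the uncertainties on the ephemeris and on the pointing, so an object just outside the nominal FoV is reported as a non-match.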
To automate our usage of SSOIS we implemented a scripted interface to SSOIS. This interface submits a query to SSOIS. In return, SSOIS provides a list of (raw) imaging data (exposures) for OmegaCAM, with the date, time, exposure time and observed filter. SSOIS outputs for OmegaCAM point to the main raw frames and do not specify which extension (chip) overlaps with the predicted position of the NEO. The information returned by SSOIS on NEO names and OmegaCAM exposures was stored in a local SQLite file.
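A minimal sketch of the local bookkeeping step is given below, using Python's standard `sqlite3` module. The table layout, column names and the helper names `open_store` and `store_candidates` are assumptions made for illustration; the schema actually used by the pipeline is not described in the text, and the query to SSOIS itself is deliberately left out.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) the local SQLite store for candidate exposures."""
    conn = sqlite3.connect(path)
    # Hypothetical schema: one row per (NEO, exposure) pair returned by the
    # image-search service.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS candidates (
               neo_name   TEXT NOT NULL,
               exposure   TEXT NOT NULL,
               obs_start  TEXT NOT NULL,   -- ISO 8601 date and time
               exptime_s  REAL NOT NULL,
               band       TEXT NOT NULL,
               PRIMARY KEY (neo_name, exposure)
           )"""
    )
    return conn

def store_candidates(conn, rows):
    """Insert rows, skipping pairs already present; return the row count."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO candidates VALUES (?, ?, ?, ?, ?)", rows
        )
    return conn.execute("SELECT COUNT(*) FROM candidates").fetchone()[0]
```

Keying the table on the (NEO, exposure) pair makes repeated queries idempotent, which is convenient when the same list is re-submitted after the archive has been updated.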

Final Spatiotemporal crossmatch (step 2)
In step 2, the pipeline uses the Near-Earth Objects Dynamic Site (NEODyS) and JPL Horizons web services to obtain more accurate predictions of the NEO positions over the duration of the exposure and to obtain the predicted angular motion and visual magnitude. We supply to NEODyS/Horizons the start and end time of the observation, the name of the NEO and the observatory identifier to specify its location. This query returns the (RA, Dec) position of the NEO at the beginning and end of the exposure, the 1σ uncertainties in position and the V-band magnitude. These values are used to predict the rate and direction of NEO motion relative to the astrometric reference frame.
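As an illustration of how the rate and direction of motion follow from the two returned positions, the sketch below computes the trail length, the mean angular rate and the position angle from the (RA, Dec) at the start and end of an exposure. It assumes a small-angle, flat-sky approximation and measures the position angle from north through east; the function name `motion_from_endpoints` is hypothetical and the pipeline's own implementation may differ.

```python
import math

def motion_from_endpoints(ra0, dec0, ra1, dec1, exptime_s):
    """Trail length (arcsec), rate (arcsec/s) and position angle (deg).

    Inputs are in degrees and seconds. Small-angle approximation: valid
    for trails that are short compared with the distance to the pole.
    """
    # RA difference wrapped into [-180, 180) to survive the RA = 0 crossing.
    d_ra = (ra1 - ra0 + 180.0) % 360.0 - 180.0
    mean_dec = math.radians(0.5 * (dec0 + dec1))
    east = d_ra * math.cos(mean_dec) * 3600.0   # arcsec towards east
    north = (dec1 - dec0) * 3600.0              # arcsec towards north
    length = math.hypot(east, north)
    # Position angle measured from north through east, in [0, 360).
    pa = math.degrees(math.atan2(east, north)) % 360.0
    return length, length / exptime_s, pa
```

The predicted trail length is the quantity later compared with the measured streak length during association.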
Based on the returned apparent magnitude we make a preliminary prediction of the signal-to-noise ratio (SNR) for the given filter and exposure time of the candidate (p)recovery observation by assuming its appearance is a point source (i.e., neglecting proper motion) and assuming a solar spectral energy distribution of the NEO. Candidate (p)recoveries for which NEODyS/Horizons confirm the overlap with the exposure FoV and which also have a predicted SNR > 1 are stored in the local SQLite database. To automate our usage of NEODyS/Horizons we implemented a scripted interface to NEODyS/Horizons, using the same approach as for SSOIS.

Data Processing and Astrometry (step 3)
In step 3, we apply several filters and criteria to the precovery candidates resulting from step 2 to narrow down the search to sources which are likely to be detectable. These filters are:
• an upper limit on the angular separation (RA, Dec) between the SSOIS and NEODyS/Horizons predictions: the imaging data used for precovery are the ones that were identified by SSOIS and based on its predictions. However, once we refine these predictions using NEODyS/Horizons, the images may not overlap with the refined position. Therefore, if the angular separation between these predictions is larger than a certain limit (for example, larger than the FoV), we assume that the object is not within the FoV of the camera.
• an upper limit on the 1σ uncertainty in the position (RA, Dec): for cases where the uncertainties in position are too large, the probability that the NEO is within the FoV is very small. The pipeline skips these frames.
• availability of the raw data: in some cases, the raw frames are not publicly available. The pipeline skips these frames.
• the exact position of NEOs on the camera: given the coordinates of the object, the pipeline determines which CCD overlaps with the corresponding position of the NEO; if there is no overlap, the pipeline excludes the frame.
• a lower limit on the predicted SNR: this limit removes cases with an SNR lower than the given limit to exclude very faint and non-detectable objects.
All the above-mentioned criteria are applied to both NEO samples, i.e. the risk list and the full list. Additionally, for the full list, several other filters are used:
• a lower limit on the 1σ uncertainty in the position: with this filter, we can focus the precovery search on objects without precise positional information.
• a lower limit on the predicted length: considering the observational difficulties in detecting and matching short streaks (step 4, described later in the text), the pipeline skips frames with NEO streaks shorter than this limit.
• objects without calibrated data: in rare cases, the pipeline fails to reduce/calibrate raw frames and therefore no calibrated frame is available for further analysis. The pipeline skips these frames.
The thresholds used for these filters are listed in Table 1.
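The common filters can be sketched as a single predicate over a candidate record. The sketch below is illustrative only: the `Candidate` fields and the function name are hypothetical, and the default thresholds (1° on the angular separation and on the 1σ positional uncertainty, SNR of 3) are taken from the values quoted in this paper rather than from the pipeline code.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    separation_deg: float  # offset between SSOIS and NEODyS/Horizons positions
    sigma_pos_deg: float   # 1-sigma positional uncertainty
    raw_available: bool    # raw frame can be retrieved
    on_detector: bool      # predicted position falls on a CCD
    snr: float             # predicted signal-to-noise ratio

def passes_common_filters(c, max_sep_deg=1.0, max_sigma_deg=1.0, min_snr=3.0):
    """Apply the filters shared by the risk list and the full list."""
    return (
        c.separation_deg <= max_sep_deg
        and c.sigma_pos_deg <= max_sigma_deg
        and c.raw_available
        and c.on_detector
        and c.snr >= min_snr
    )
```

The additional full-list filters (lower limits on the positional uncertainty and on the predicted length) could be added as further keyword arguments in the same way.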
Once a frame passes all the filters and criteria (referred to as a precovery candidate), we use AstroWISE to produce astrometrically and photometrically calibrated pixel images for the candidate precovery images returned by NEODyS/Horizons. If calibrated detector images already existed in the database, they were downloaded from AstroWISE. If not, the raw data were ingested from the ESO archive into the AstroWISE system as needed and the AstroWISE optical image pipeline was used to process those raw data. To obtain optimal astrometry, we manually recalibrated the astrometry of the detector images downloaded from AstroWISE using SCAMP 2.10 (Bertin 2006). This version of SCAMP uses the coordinates of stars in the Gaia EDR3 catalogue as astrometric reference objects. The final astrometry was extracted from the resampled and background-subtracted pixels using SWarp (Bertin et al. 2002). Note that the majority of OmegaCAM data residing in AstroWISE have been astrometrically calibrated using the near-IR 2MASS catalogue (Skrutskie et al. 2006) as the astrometric reference catalogue.
Table 1. Thresholds used for the filters (rows recovered for the risk list and the full list).
Filter | Risk list | Full list
Upper limit on 1σ errors in position | 1° | 1°
Lower limit on 1σ errors in position | - | 1
Limit on SNR | 3 | 3
To improve the astrometric solution of our dataset and to achieve better astrometry of the precovered NEOs, we implemented astrometric calibration using the Gaia EDR3 catalogue (Gaia Collaboration et al. 2021) as the astrometric reference. This allows us to (re-)calibrate OmegaCAM images astrometrically to Gaia. It improves the external astrometric accuracy from a typical RMS of 0.3 arcsec to an RMS of 0.04 arcsec (an improvement by a factor of about 7). This accuracy in astrometric calibration brings the RMS of the external astrometric residuals below the size of most observed offsets between predicted and precovered astrometry.

Streak Detection and Association (step 4)
In step 4, the pipeline deploys StreakDet (Virtanen et al. 2016; Pöntinen et al. 2020) on the photometrically and astrometrically calibrated images to extract candidate streaks. StreakDet is a package that was developed for the purpose of identifying streaks in ground-based and space-based imaging data. StreakDet detects streaks in the images in four steps: segmentation of images, detection of streak-like objects, classification of objects to separate streaks from other astronomical objects and, finally, derivation of their parameters (such as coordinates, total flux and length) by fitting a model. For the precovery pipeline, we have optimized the StreakDet configuration parameters for OmegaCAM. Once streak detection is done, the pipeline searches for the best match between the streaks found within the 3σ uncertainty ellipse of the NEO and its predicted properties. The predicted properties used include the NEO's length, direction of motion and magnitude. This procedure is illustrated in Fig. 3. The matching procedure (alternatively called "association") is done in two stages.
Fig. 4. An upper limit for the centroid estimation accuracy of StreakDet over a range of magnitude, length, SNR and seeing. The astrometric accuracy is within 0.12 arcsec and does not show any dependence on the observed properties of NEOs.
Fig. 5. The offset between the RA and Dec of all the sources extracted (using SExtractor) in frames with and without resampling and background subtraction (using SWarp) in step 3 of the pipeline.
In the first stage, the pipeline filters out (excludes) the detected streaks whose properties differ from the predicted properties of the NEO. The applied filters are:
$$-2\,\mathrm{mag} < m_{\mathrm{det,streak}} - m_{\mathrm{pred}} < 2\,\mathrm{mag},$$
$$0.5 < l_{\mathrm{det,streak}} / l_{\mathrm{pred}} < 2,$$
$$0.5 < \sigma_{\mathrm{det,streak}} / \sigma_{\mathrm{pred}} < 1.5,$$
$$-30^{\circ} < \theta_{\mathrm{det,streak}} - \theta_{\mathrm{pred}} < 30^{\circ} \quad \text{or} \quad 150^{\circ} \ldots$$
in which $m_{\mathrm{det,streak}}$, $l_{\mathrm{det,streak}}$, $\sigma_{\mathrm{det,streak}}$ and $\theta_{\mathrm{det,streak}}$ are the observed apparent magnitude (in the given filter), length, width and position angle of motion of the detected streak, while $m_{\mathrm{pred}}$, $l_{\mathrm{pred}}$, $\sigma_{\mathrm{pred}}$ and $\theta_{\mathrm{pred}}$ are the same properties predicted for the NEO.
Figure caption (fragment): Out of these 49 precoveries, 27 are identified using the pipeline, and 22 by visual inspection after step 3 of the pipeline. Note that due to the close sky proximity of observations at the resolution of the figure, multiple observations can appear to be a single data point.
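The first-stage cuts described above can be sketched as a simple predicate. This is a minimal illustration, not the pipeline code: the dictionary keys and function names are hypothetical, and only the ±30° angular window is implemented because the alternative angular range is truncated in the text.

```python
def angle_difference_deg(a_deg, b_deg):
    """Signed difference a - b wrapped into [-180, 180) degrees."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0

def passes_first_stage(det, pred):
    """Check a detected streak against the predicted NEO properties.

    `det` and `pred` are dicts with keys "mag", "length", "width" and
    "angle" (hypothetical layout); lengths and widths share one unit.
    """
    d_mag = det["mag"] - pred["mag"]
    length_ratio = det["length"] / pred["length"]
    width_ratio = det["width"] / pred["width"]
    d_theta = angle_difference_deg(det["angle"], pred["angle"])
    return (
        -2.0 < d_mag < 2.0
        and 0.5 < length_ratio < 2.0
        and 0.5 < width_ratio < 1.5
        and -30.0 < d_theta < 30.0
    )
```

Wrapping the angle difference avoids rejecting a streak at 350° when the prediction is 10°, which would otherwise look like a 340° mismatch.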
In the second stage, for the detected streaks that satisfy the first stage (if any), the pipeline estimates a difference score, which is the standardized Euclidean distance between the observed properties of a streak and the predicted properties of the NEO. The parameters used for measuring the distance are the angular length and the position angle of motion, normalised by the observed length and the seeing (FWHM) during observations as follows:
$$\theta_{\mathrm{det,streak,norm}} = \theta_{\mathrm{det,streak}} / \arctan(\mathrm{FWHM} / l_{\mathrm{det,streak}}),$$
$$l_{\mathrm{det,streak,norm}} = l_{\mathrm{det,streak}} / \mathrm{FWHM},$$
in which $\theta_{\mathrm{det,streak,norm}}$ and $l_{\mathrm{det,streak,norm}}$ are the normalised position angle of motion and length of the detected streak. The difference score is then estimated from these normalised quantities. In the end, the detected streak with the lowest difference score is picked as the best match (indicated by the green circle in the fourth column of Fig. 3). Finally, those frames with a matched streak are visually inspected to check whether it is a true precovery.
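A possible reading of this second stage is sketched below. The explicit expression for the difference score is not reproduced in the text, so the sketch assumes a plain Euclidean distance between detected and predicted values after both are scaled by the normalisation factors given above (arctan(FWHM / l_det) for the angle, expressed in degrees, and FWHM for the length). The function names and dictionary keys are hypothetical.

```python
import math

def wrapped_angle_difference(a_deg, b_deg):
    """Signed difference a - b wrapped into [-180, 180) degrees."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0

def difference_score(det, pred, fwhm):
    """Assumed form of the score: Euclidean distance in normalised units.

    `det` and `pred` are dicts with keys "angle" (deg) and "length";
    `fwhm` is the seeing in the same unit as the lengths.
    """
    # Angular scale set by the seeing and the detected streak length.
    theta_scale = math.degrees(math.atan(fwhm / det["length"]))
    d_theta = wrapped_angle_difference(det["angle"], pred["angle"]) / theta_scale
    d_length = (det["length"] - pred["length"]) / fwhm
    return math.hypot(d_theta, d_length)

def best_match(streaks, pred, fwhm):
    """Return the streak with the lowest score; `streaks` must be non-empty."""
    return min(streaks, key=lambda s: difference_score(s, pred, fwhm))
```

With this normalisation, a long streak is penalised more strongly for a given angular mismatch than a short one, since its orientation is better constrained by the data.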

Results
This section presents the outcome of NEO precovery and discusses the performance of the precovery pipeline for the detection, astrometry and photometry of NEOs.

Accuracy of astrometry and photometry
The output catalogues of StreakDet provide the coordinates (X, Y) of the centre and the total flux of the detected streaks. However, they do not provide uncertainties for these measurements. To estimate the centroiding accuracy of StreakDet, we ran the pipeline for NEOs with a predicted positional uncertainty of less than 0.02 arcsec (1σ uncertainty ellipse). This is about 10 times smaller than the pixel size of the instrument. The offset between the predicted coordinates of these NEOs (RA, Dec) and the coordinates derived using StreakDet gives an upper limit on the total astrometric accuracy (i.e., the aggregate of calibration and centroiding accuracies). The average value of this offset, as shown in Fig. 4, is about 0.12 arcsec.
A preliminary estimate of the astrometric calibration accuracy is ∼0.05 arcsec. It is reasonable to assume that the centroiding and calibration astrometric errors are independent because (i) the centroiding accuracy of the bright Gaia stars is likely << 0.05 arcsec and (ii) the same or at least a very similar ensemble of Gaia stars is used for the astrometric calibration of the exposures for one apparition of an NEO. Note that the absolute astrometric accuracy of stars in Gaia EDR3 is << 0.05 arcsec. Thus we draw the preliminary conclusion that the total absolute astrometric accuracy of our NEO positions is typically ∼0.12 arcsec. This means that offsets of > 0.2 arcsec (i.e., the angular scale of 1 OmegaCAM pixel) between extracted and predicted positions are typically at 1.5σ significance before prediction uncertainties are taken into account in the significance calculation. In individual cases the accuracy can be worse: see notes on individual recoveries below. Moreover, we also assessed the impact of resampling in step 3 of the pipeline (using SWarp) on the astrometry and found a deviation of less than 0.01 arcsec (Fig. 5).
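If the two error sources are indeed independent, they add in quadrature, and the quoted numbers imply a rough centroiding contribution. The short sketch below performs that arithmetic; it is an illustration under the independence assumption stated above, with a hypothetical function name, and is not a result quoted by the paper.

```python
import math

def centroiding_error(total_arcsec, calibration_arcsec):
    """Centroiding term implied by total^2 = calibration^2 + centroiding^2."""
    return math.sqrt(total_arcsec**2 - calibration_arcsec**2)

# With a total of ~0.12 arcsec and a calibration term of ~0.05 arcsec,
# the centroiding term comes out close to the total itself.
implied = centroiding_error(0.12, 0.05)
```

Because the total is an upper limit, the implied centroiding term is an upper limit as well; it indicates that centroiding, not calibration, dominates the error budget.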
In the absence of any indication of the photometric accuracy of StreakDet, we made an estimate using the output catalogue of SExtractor. The measurement uncertainty in the measured magnitudes of streaks (within their magnitude range) is on average 0.1 mag.

Precovery of NEOs in the ESA risk list
Table 2 summarizes the down-selection of precovery candidates during the deployment of this pipeline on the NEOs in ESA's risk list in combination with ESO's OmegaCAM archive. This archive encompasses about 10 years of OmegaCAM exposures, with a total number of exposures of the order of 400 000.
The query to SSOIS returned 10 345 exposures, i.e., candidate precoveries for about 1 in 40 exposures. Unfortunately, for 96% (9904 exposures) of those the upper limit to the predicted SNR of the appearance was < 1, leaving only 441 candidate precoveries. Subsequently, NEODyS made a refined positional prediction for these 441, which in 33% of the cases was >1° away from the centre of the exposure (i.e., the RA, Dec of the pointing), whose FoV itself is 1° × 1°. For now, we assumed this made it too unlikely that the NEO appearance would be inside the exposure and removed these from the candidate list. This left 295 candidates. For 15% of those the 1σ semi-major axis of the uncertainty ellipse was >1° and thus again we deemed it too unlikely for now that the NEO appearance would be inside the exposure. This left us with 251 precovery candidate exposures. For 15 of those, the data could not be retrieved in AstroWISE: 8 turned out to belong to other OmegaCAM surveys, for which we are in the process of obtaining access, and the remaining 7 have not yet been publicly released by ESO. For 28% of the remaining 236 candidates, the predicted position of the NEO landed outside the pixels of the detectors (e.g., in gaps or just outside the FoV) and, given the typically small size of the error ellipse, we deemed it too unlikely that a precovery could be made. This left 170 precovery candidates. At this step, we refined the SNR estimate by taking into account the proper motion. For 60% of those the predicted SNR was then less than 2. We removed those from the list of candidate precoveries. This left 68 precovery candidates on which precovery was attempted with StreakDet. If StreakDet failed, manual precovery was attempted. This was successful for 57% of the cases, so 49 precoveries (listed in Table 4), with 26 being successful via StreakDet and 23 being successful via subsequent manual analysis. The sky distribution and image thumbnails of these 49 precoveries are shown in Fig. 6 and 7.
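The down-selection can be checked with a few lines of arithmetic. The sketch below lists the candidate counts quoted in the paragraph above (the value 236 is simply 251 minus the 15 exposures whose data could not be retrieved) and computes the percentage removed at each step. The step labels and the function name are illustrative.

```python
# Candidate counts after each step of the risk-list down-selection.
FUNNEL = [
    ("SSOIS matches", 10345),
    ("predicted SNR >= 1", 441),
    ("refined position within 1 deg of pointing", 295),
    ("1-sigma semi-major axis <= 1 deg", 251),
    ("data retrievable", 236),
    ("position on a detector", 170),
    ("refined SNR >= 2", 68),
]

def removed_percentages(funnel):
    """Percentage of candidates removed at each step, rounded to integers."""
    out = []
    for (_, before), (label, after) in zip(funnel, funnel[1:]):
        out.append((label, round(100.0 * (before - after) / before)))
    return out

for label, pct in removed_percentages(FUNNEL):
    print(f"{label}: {pct}% removed")
```

The rounded percentages reproduce the 96%, 33%, 15%, 28% and 60% quoted in the text; the data-availability step removes about 6% of the 251 candidates.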
The top panel of Figure 8 shows the predicted properties of detected and non-detected cases. As seen in this figure, NEOs with a predicted length smaller than 2 arcsec could only be detected by visually examining the several frames available. These objects would be missed completely by the pipeline.

Precovery of NEOs in the full list
In Table 3, we summarize the results from the precovery of the full list of NEOs. Queries to SSOIS identified 186 476 frames. In the subsequent step, the query to Horizons removed more than two thirds of the frames due to the low predicted SNR (SNR<1) of the NEOs, which leaves 55 692 cases. Other filters are then applied, and in each step a fraction of the precovery candidates is removed. The filters applied to the full list of NEOs are modified slightly to avoid extra computation time for cases where the pipeline is unlikely to detect any NEO. For the full list of NEOs, we apply an extra filter on the predicted length of NEOs and do not attempt recovery of objects with predicted lengths smaller than 3 arcsec. These objects, as was seen earlier, are the trickiest to detect (using StreakDet), and they have a similar appearance to other sources (e.g. extragalactic objects and blended objects), which makes the matching process harder. Moreover, a lower limit on the 1σ uncertainties in position is introduced to focus our search on objects with a less certain positional prediction.
After applying all these filters, 968 cases remained, about 0.5% of the initial number of frames (from SSOIS).Out of 968, after visual inspection of the results, the pipeline successfully detects an NEO in 196 cases (listed in Table 5), about one-fifth of the total cases.Out of these 196 NEO appearances, 114 are true precoveries.Fig. 11 shows the histogram of the time intervals for these true precoveries which ranges between 1 day and 3653  days (for 2021 WO), with the majority identified between 1 and 200 days before the discovery date.
The pipeline also produces 157 false positives (about 44% of automatic precoveries), of which 75% have a predicted SNR<5. Additionally, for long NEO streaks, the precovery rate decreases because of limitations in streak detection: StreakDet segments long streaks (at both low and high SNR) into two separate streaks, and the resulting positional information is normally not valid. Therefore, in these cases, while the pipeline produces a precovery that almost matches the NEO, we do not count it as a successful precovery.
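The quoted rates for the full list follow from the counts given above. The sketch below reproduces them, under the assumption (made here, not stated explicitly in the text) that the "automatic precoveries" entering the false-positive fraction are the 196 confirmed detections plus the 157 false positives.

```python
CANDIDATES = 968       # frames remaining after all filters
DETECTED = 196         # confirmed NEO appearances
FALSE_POSITIVES = 157  # pipeline matches rejected on inspection

def percentage(part, whole):
    """Rounded percentage of `part` relative to `whole`."""
    return round(100.0 * part / whole)

recovery_rate = percentage(DETECTED, CANDIDATES)
# Assumption: all pipeline matches = confirmed detections + false positives.
false_positive_rate = percentage(FALSE_POSITIVES, DETECTED + FALSE_POSITIVES)
undetected = CANDIDATES - DETECTED
```

Under that assumption the numbers are mutually consistent: a recovery rate of about 20% and a false-positive fraction of about 44%.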
The bottom panel of Figure 8 shows the predicted SNR and length of the detected and undetected precovery candidates. Out of 772 cases without a precovery, 311 have a predicted SNR<5 (Fig. 9), and 94 have a 1σ positional uncertainty larger than 200 arcsec (Fig. 10). The precovery rate of 20% does not mean that the pipeline cannot identify 80% of the precovery candidates.
Visual inspection shows that for the majority of cases without a recovery, no NEO is visible in the frames.
To examine the possible complementary role of other existing tools for streak detection, we searched for streaks using SExtractor (Bertin & Arnouts 1996) for the cases without a true detection. SExtractor could identify an extra 26 cases. However, when only SExtractor is used, the number of detections drops to only 58 (compared to 196 for StreakDet), which shows that SExtractor cannot substitute for StreakDet. Users must be cautious when using SExtractor and consider its limited ability to properly segment the streaks, which results in an inaccurate estimate of the streak centroids (within 1-2 arcsec). The uncertainties of SExtractor centroiding (ERRX_WORLD and ERRY_WORLD parameters in the output tables) are on average 0.88 arcsec, in the same range as the FWHM of the images. To explore further the accuracy of the predicted properties of NEOs, in Figure 12 we compare the predicted and observed (measured from data) length (angular size) and apparent magnitude of the precovered NEOs. There is general agreement between the predicted and observed properties, in particular for the shorter streaks, while the observed NEOs are on average about 0.5 mag fainter. We explore this magnitude offset further in Fig. 13, where the predicted and observed lengths and magnitudes of the precovered NEOs are shown. While there is no clear pattern in the magnitude offset between filters, the predicted and observed magnitudes in u and g show larger offsets. There are two possible sources for this offset: 1. over-prediction of the magnitudes of NEOs, and 2. the simplistic assumption on the colour of NEOs (solar colour) for the magnitude transformation between the V-band and the observed filters.

Conclusions and Outlook
In the introduction, we outlined our project objectives and, to quantify the results of this exploratory study, framed our main objective as six questions. In this section, we answer these questions and give an outlook on possible next steps.
1. What fraction of detectable NEOs can we precover? And how automated can we get the precovery? The detectability and precovery rates vary with the chosen SNR threshold. The detectability rate is estimated to be ∼0.05 per NEO at an SNR larger than 3, for both the risk list and the full list of NEOs. The precovery rate for SNR>3 is 40% for NEOs on the risk list and 20% for the full list. The precovery rate increases to about 50% for SNR>10. In other words, the majority of NEOs with a predicted 3<SNR<10 currently remain undetected, even after visual inspection. Assessing whether the failed precoveries are consistent with the predicted errors on the predicted locations and brightnesses is an important next step, but it is beyond the scope of this paper. If they are inconsistent with the predictions, this might give new insight into the limitations of those predictions, or into how to improve the detection/precovery process.
2. What astrometric and photometric accuracy can be achieved? The astrometric and photometric accuracies are 0.12 arcsec (15% of the average FWHM of OmegaCAM/VST images of about 0.8 arcsec) and 0.1 mag, respectively. Improvements in astrometric accuracy are expected from propagating the proper motions in the Gaia astrometric reference catalogue to the observation date of the science image. Improvements in photometric accuracy can come from more sophisticated modelling of the SED, the observational configuration and the NEO shape. The Solar System Open Database Network (Berthier et al. 2022) might facilitate this.
3. What level of automation can be achieved in the precovery workflow? The precovery pipeline described here works automatically through all the steps. Thanks to the common data model for calibrated observations in AstroWISE (McFarland et al. 2013b), it can be deployed straightforwardly on calibrated observations for many other instruments available in the AstroWISE archive. However, after the last step (streak detection and matching) and before the recoveries are reported, they must be inspected by an expert who confirms or rejects them. An additional challenge is that precise photometric calibration for a range of instruments is hard to automate fully. This is because the derivation of the solution uses reference stars that are sometimes inside the science images and sometimes in calibration observations. A potential solution would be to construct a photometric reference catalogue that spans the entire sky observable by OmegaCAM with sufficient stellar density. This appears possible by aggregating information from the multiple large-scale surveys of the southern sky.
4. What are the major challenges for exploiting various astronomical surveys for NEO space safety purposes? In addition to the automation already discussed, a main challenge is robust NEO detection and segmentation. This is also a main reason behind the limitations of precovery in this paper.
StreakDet is an effective tool for detecting high-SNR streaks with lengths between 5 and 20 arcsec. However, its performance drops for faint and long streaks. Deep learning might be a solution to improve streak detection and ultimately NEO precovery.
5. What value do these precoveries have for planetary defence?
The precovery of NEOs provides valuable positional information about NEOs before their discovery. In this paper, we precovered 3 NEOs that were on the risk list at the time and, as a result, these 3 NEOs were removed from it.
6. Do recoveries have benefits for NEO science? Precise astrometric information for a sizeable ensemble of recovered NEOs can provide a better understanding of non-gravitational effects on NEO orbits (e.g. the Yarkovsky effect). As a side remark, also performing serendipitous discovery in astronomical surveys, especially at high ecliptic latitudes, might provide interesting constraints on the orbital demography of NEOs, complementary to dedicated NEO surveys. Collecting optical/near-infrared spectral energy distribution (SED) information from the recovery of a sizeable NEO sample can complement existing astrometric and SED information, and hence provide valuable information about their orbits and their composition, shapes and rotational properties.
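The epoch propagation mentioned under question 2 can be illustrated with a minimal sketch. It is not the calibration code used in this work: the function name, the example star and its proper motion are hypothetical, and the first-order formula below ignores parallax and radial velocity and breaks down close to the celestial poles. It assumes the Gaia convention in which the catalogued proper motion in right ascension already includes the cos(Dec) factor.

```python
import math

MAS_PER_DEG = 3.6e6  # milliarcseconds in one degree


def propagate_position(ra_deg, dec_deg, pmra_masyr, pmdec_masyr,
                       ref_epoch, obs_epoch):
    """Shift a reference-star position linearly from ref_epoch to obs_epoch.

    pmra_masyr is assumed to be mu_alpha * cos(dec), in mas/yr.
    Epochs are given in decimal years.
    """
    dt = obs_epoch - ref_epoch  # years; negative for images taken before ref_epoch
    dec_new = dec_deg + pmdec_masyr * dt / MAS_PER_DEG
    # Divide by cos(dec) to convert the on-sky shift into a change in RA.
    ra_new = ra_deg + pmra_masyr * dt / (MAS_PER_DEG * math.cos(math.radians(dec_deg)))
    return ra_new % 360.0, dec_new


# Hypothetical reference star, catalogued at epoch 2016.0 and observed in 2012.0.
ra0, dec0 = 150.0, -30.0
ra1, dec1 = propagate_position(ra0, dec0, pmra_masyr=100.0, pmdec_masyr=-50.0,
                               ref_epoch=2016.0, obs_epoch=2012.0)

# On-sky displacement in arcsec between the two epochs.
d_ra = (ra1 - ra0) * math.cos(math.radians(dec0)) * 3600.0
d_dec = (dec1 - dec0) * 3600.0
print(f"shift: {d_ra:+.2f} arcsec in RA, {d_dec:+.2f} arcsec in Dec")
```

For this invented star the displacement over four years is about 0.45 arcsec, several times the 0.12 arcsec astrometric accuracy quoted above, which is why neglecting proper motions of reference stars can matter. For production use, a rigorous treatment such as `SkyCoord.apply_space_motion` in astropy would be a more appropriate choice than this linear approximation.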

Fig. 1. Top: the sky coverage of the OmegaCAM/VST observations over 10 years of observations of the southern hemisphere. The Astro-WISE archive contains 361977 individual exposures. Bottom: the histogram of exposure times of the OmegaCAM/VST observations in u (54918 exposures), g (79127 exposures), r (96811 exposures) and i (78876 exposures).

Fig. 2. The precovery pipeline. This diagram shows the four steps of the precovery pipeline. The pipeline takes a text file containing NEO names as input and produces a CSV table containing the NEO precoveries as output. The intermediate outputs are stored in an SQLite database, which the pipeline accesses repeatedly throughout the process.

Fig. 3. The results of the streak detection using StreakDet for three examples. The first column (from left to right) shows the calibrated cutout around the predicted position of the NEOs. In the second column, the 3σ uncertainty ellipse and the predicted appearance of the NEO (considering its speed and direction of motion) are shown (red ellipse and red line). The third column shows the streaks detected across frames by StreakDet (grey circles). Frames in the fourth column indicate the streaks that match the predicted properties of the NEOs (green circles).

Fig. 6. The sky coverage of the OmegaCAM/VST observations (green area), the 10 345 frames returned by SSOIS as possibly overlapping with an NEO (blue points), 19 precovery candidates without a detection (red points) and 49 precovered NEOs (black points). Out of these 49 precoveries, 27 are identified using the pipeline, and 22 by visual inspection after step 3 of the pipeline. Note that, because of the close sky proximity of observations at the resolution of the figure, multiple observations can appear as a single data point.

Fig. 7. The 49 risk list NEOs precovered using the precovery pipeline. Of these, 26 precoveries are detected fully automatically and 23 are detected by visual inspection of the frames.

Fig. 8. Left: SNR and length for precovered NEOs of the risk list, found automatically by the pipeline (red points) or manually by visual inspection of frames (yellow points), and not-precovered risk list NEOs in the OmegaCAM data (grey points). Right: SNR and length for precovered NEOs of the full list, found using StreakDet (red points) and SExtractor (blue points), and not-precovered risk list NEOs in the OmegaCAM data (grey points).

Fig. 9. Histogram of the SNR distribution of the 968 precovery candidates after detectability filtering (in step 3) in u, g, r and i.

Fig. 10. Histogram of the 1σ positional uncertainties of the 968 precovery candidates after step 3 (left) and of the successful precoveries by the pipeline (right).

Fig. 11. The time interval between precovery and discovery of the 196 NEO appearances (NEO full list). The upper and lower panels show the NEO appearances after the discovery (recovery) and before the discovery (true precovery) of the NEOs, respectively.

Fig. 12. Predicted versus observed length (left) and magnitude (right) of the NEOs of the full list detected by StreakDet (red points) and SExtractor (blue points).

Table 1. Summary of the filters applied to the NEO candidates in step 3 of the precovery pipeline.