MILAN Sky Survey, a dataset of raw deep sky images captured during one year with a Stellina automated telescope

Modern automated telescopes make it possible to capture astronomical images in a reproducible way. During the MILAN research project (MachIne Learning for AstroNomy), we observed the deep sky with a Stellina observation station for twelve months from the Luxembourg Greater Region. We thus captured raw images of more than 188 deep sky objects visible from the Northern Hemisphere (galaxies, star clusters, nebulae, etc.). We have compiled and published this data as the MILAN Sky Survey dataset, allowing interested researchers, industry practitioners and citizens to reuse it.


© 2023 Published by Elsevier Inc. This is an open access article under the CC BY license ( http://creativecommons.org/licenses/by/4.0/ )

• [...] and representative of celestial targets that can be captured.
• Researchers, industry practitioners, and citizens can use this data to apply and develop image post-processing methods (e.g., frame filtering, demosaicing, alignment, stacking, color adjustment by photometry, histogram stretching, denoising).
• Data may be used by scientists for astrometry and photometry as a complement to existing sky surveys obtained with professional ground-based telescopes.
• To the best of our knowledge, this dataset is the largest publicly available compilation of raw images captured by a portable automated observation station.

Objective
Electronically Assisted Astronomy (EAA) is now widely used by astronomers to observe deep sky objects (nebulae, galaxies, star clusters). By capturing images directly from a digital camera coupled to a telescope and applying lightweight image processing (alignment of the raw images, followed by stacking), this approach generates enhanced views of deep sky targets that can be displayed in near real-time on a screen (laptop, tablet, smartphone). EAA also makes it possible to observe faint deep sky targets in adverse outdoor conditions, for example in geographical areas heavily affected by light pollution or on nights when the Moon is up. Faint celestial objects such as nebulae and galaxies are almost invisible by direct observation in an urban or suburban night sky; with EAA, they become impressive and detailed.
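As a minimal illustration of the stacking step mentioned above, the sketch below averages frames pixel by pixel. It is not the processing applied by the Stellina station, whose internals are not described here: the frames are tiny nested lists assumed to be already aligned, and `stack_mean` is a hypothetical helper introduced for this example.

```python
from statistics import mean


def stack_mean(frames):
    """Average equally sized, already aligned frames pixel by pixel."""
    if not frames:
        raise ValueError("at least one frame is required")
    height, width = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must have the same shape")
    return [
        [mean(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


# Three toy 2x2 frames; the value 500 mimics a transient outlier in one frame.
frames = [
    [[100, 102], [98, 500]],
    [[104, 98], [102, 100]],
    [[96, 100], [100, 102]],
]
stacked = stack_mean(frames)
```

Averaging reduces random noise as more frames are added, but a single outlier still leaks into the result (bottom-right pixel here), which is why a median or a rejection scheme is often used in place of the plain mean.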
During the MILAN research project (MachIne Learning for AstroNomy), we designed and tested innovative image processing techniques for EAA. An important step of the project was therefore to collect images representative of what can be obtained with portable equipment accessible to amateurs and under imperfect capture conditions (especially regarding light pollution [14]), as opposed to what can be obtained with recent professional observatories (large apertures) located in ideal areas (e.g. mountains or deserts).
For twelve months, we captured images of 188 deep sky objects visible from the Northern Hemisphere (galaxies, star clusters, nebulae, etc.) and listed in well-known astronomical catalogs: Messier, New General Catalog (NGC), Index Catalog (IC), Sharpless, Barnard [15].
Observations and captures were carried out with a Stellina station (for more details, see Section 3), and the resulting images were compiled and published as a dataset.

Data Description
The dataset is composed of ZIP archives grouped and stored in several repositories, one per month (Table 1).
Each ZIP archive file contains a sequence of raw images for an observation session of a given celestial target. ZIP files are named with the respective target code and day of observation (format: NAME-YYYYMMDD.zip, for example: NGC1027-20230130.zip [11] ).
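The naming convention above can be decoded programmatically. The following is a minimal sketch, assuming every archive strictly follows the NAME-YYYYMMDD.zip format; `parse_archive_name` and `ARCHIVE_PATTERN` are hypothetical names introduced for this example.

```python
import re
from datetime import date, datetime

# NAME may itself contain hyphens, so the date is anchored at the end of the name.
ARCHIVE_PATTERN = re.compile(r"^(?P<target>.+)-(?P<day>\d{8})\.zip$")


def parse_archive_name(filename):
    """Split 'NAME-YYYYMMDD.zip' into the target code and the observation date."""
    match = ARCHIVE_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected archive name: {filename!r}")
    day = datetime.strptime(match.group("day"), "%Y%m%d").date()
    return match.group("target"), day


target, day = parse_archive_name("NGC1027-20230130.zip")
```

Parsing the date with `strptime` rather than slicing the string also rejects impossible dates such as a thirteenth month.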
As the same deep sky object can carry different identifiers coming from various catalogs, we use a single identifier per object, chosen according to the following priority order: Messier, NGC, IC, Sharpless, Barnard [15]. Categories of deep sky objects are listed in Table 2.
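The priority rule can be expressed in a few lines. This is only a sketch of the selection logic: the dictionary layout, the function name `preferred_identifier`, and the designation strings (e.g. "M42") are assumptions made for illustration, not the format used in the dataset.

```python
# Catalog priority order as stated in the text, highest priority first.
CATALOG_PRIORITY = ["Messier", "NGC", "IC", "Sharpless", "Barnard"]


def preferred_identifier(identifiers):
    """Return the designation from the highest-priority catalog available.

    `identifiers` maps a catalog name to the object's designation in it.
    """
    for catalog in CATALOG_PRIORITY:
        if catalog in identifiers:
            return identifiers[catalog]
    raise ValueError("object is not listed in any supported catalog")


# The Orion Nebula is listed both as Messier 42 and as NGC 1976.
orion = preferred_identifier({"NGC": "NGC1976", "Messier": "M42"})
only_ic = preferred_identifier({"IC": "IC59"})
```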
In the ZIP files, raw images are stored as 16-bit FITS (Flexible Image Transport System) files. This format is widely used in astronomy, and the files can be handled with scientific image software such as SIRIL [16] or with dedicated Python libraries such as astropy [17].

Table 1
List of ZIP archives per month. Each file is identified by the target code from well-known catalogs (Messier, NGC, IC, Sharpless, Barnard) and the observation date.
Columns: Month; Archive name (and count of raw images per archive).

Each FITS data file consists of a single Header Data Unit (HDU) with two elements: an ASCII text header and the binary data, i.e. the raw image as a single-channel matrix of 16-bit integers. An RGB image can be obtained from this single-channel matrix by demosaicing it (with the 'RGGB' pattern).
Raw images have a resolution of 3096 × 2080 pixels and correspond to a field of view of approximately 1° × 0.7° (an example of a raw image is shown in Fig. 1).
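To make the 'RGGB' layout concrete, the sketch below applies superpixel demosaicing, one of the simplest approaches: each 2 × 2 cell (red top-left, two greens, blue bottom-right) is collapsed into a single RGB pixel. It is an illustration only, not the method used by the authors or by the Stellina station; it works on a plain nested list, whereas in practice the matrix would be read from a FITS file, for instance with astropy. The function name `demosaic_superpixel_rggb` is hypothetical.

```python
def demosaic_superpixel_rggb(raw):
    """Collapse each 2x2 RGGB cell of a Bayer mosaic into one (R, G, B) pixel."""
    height, width = len(raw), len(raw[0])
    if height % 2 or width % 2:
        raise ValueError("mosaic dimensions must be even")
    rgb = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            red = raw[y][x]                                  # top-left
            green = (raw[y][x + 1] + raw[y + 1][x]) / 2      # two green sites
            blue = raw[y + 1][x + 1]                         # bottom-right
            row.append((red, green, blue))
        rgb.append(row)
    return rgb


# Toy 2x4 mosaic holding two RGGB cells side by side.
mosaic = [
    [10, 20, 30, 40],
    [22, 5, 44, 7],
]
rgb = demosaic_superpixel_rggb(mosaic)
```

This approach halves the resolution in each direction (a 3096 × 2080 mosaic becomes a 1548 × 1040 color image); interpolating demosaicing methods keep the full resolution at the cost of more computation.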

Table 2
List of deep sky objects per month. Each object is identified by its code in catalogs (Messier, NGC, IC, Sharpless, Barnard), and the type was extracted using the Stellarium software [18] .

Equipment
Raw images were captured with a Stellina observation station, designed and marketed by the VAONIS company (Fig. 2). This instrument is an improved and automated version of the well-known Short-tube 80 refractor, appreciated by astronomers for its versatility [13]. More precisely, Stellina is built around an apochromatic extra-low dispersion (ED) doublet with an aperture of 80 mm and a focal length of 400 mm, i.e. a focal ratio of f/5. It is equipped with a Sony IMX178 CMOS RGB sensor with a resolution of 6.4 million pixels (3096 × 2080 pixels). The Dawes limit of the instrument is 1.45 arcseconds.
A light-pollution filter (CLS type) is placed in front of the camera sensor. The observation station has a fully automated alt-azimuth mount: setup, object tracking and focusing are all automated. Stellina also has an integrated field rotator that adapts to the target and allows the end-user to adjust the framing.
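The optical figures quoted above can be cross-checked with standard formulas. The sketch below is such a check, with one assumption that does not come from this article: a pixel size of 2.4 µm for the IMX178 sensor, taken from the sensor's published specifications. Function names are hypothetical.

```python
APERTURE_MM = 80.0
FOCAL_LENGTH_MM = 400.0
SENSOR_WIDTH_PX, SENSOR_HEIGHT_PX = 3096, 2080
PIXEL_SIZE_UM = 2.4  # assumption: IMX178 pixel pitch, not stated in the article


def focal_ratio(focal_length_mm, aperture_mm):
    """Focal ratio f/N, with N = focal length / aperture."""
    return focal_length_mm / aperture_mm


def dawes_limit_arcsec(aperture_mm):
    """Dawes' empirical resolution limit: 116 / D, with D in millimetres."""
    return 116.0 / aperture_mm


def pixel_scale_arcsec(pixel_size_um, focal_length_mm):
    """Sky angle covered by one pixel: 206.265 * pixel size (um) / focal length (mm)."""
    return 206.265 * pixel_size_um / focal_length_mm


ratio = focal_ratio(FOCAL_LENGTH_MM, APERTURE_MM)
dawes = dawes_limit_arcsec(APERTURE_MM)
scale = pixel_scale_arcsec(PIXEL_SIZE_UM, FOCAL_LENGTH_MM)
fov_width_deg = SENSOR_WIDTH_PX * scale / 3600
fov_height_deg = SENSOR_HEIGHT_PX * scale / 3600
```

Under these assumptions the focal ratio and Dawes limit match the stated f/5 and 1.45 arcseconds, and the computed field of view is close to 1.06° × 0.72°, consistent with the approximate 1° × 0.7° given in the data description.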

Deep sky targets selection
Many tools are available for defining a list of objects that can be captured. One book was particularly useful: in [19], the author details the different catalogs and the aspects to consider when making a relevant selection. For example, a famous catalog like the NGC, which dates back to the 19th century, already contains several thousand objects that were visually accessible at a time when light pollution was much weaker than today. Nevertheless, it is important to choose targets that fit the characteristics of the optics and the sensor used, in our case objects that can be imaged with a Stellina station. We therefore refined our selection based on [13], because it details the characteristics and the potential targets of the 80/400 refractor, an optical configuration very close to that of the Stellina. Moreover, before launching the observations, we simulated the view of the pre-selected objects in the Stellarium software [18] to check their visibility and their position in the sky at the time of the observation.
As a result, we selected various types of objects (Table 2, Fig. 3), including well-known targets such as the Orion Nebula (Messier 42) but also little-known objects with low brightness that remain reachable with an instrument like the Stellina observation station. In the end, the targeted object with the lowest magnitude (i.e. the brightest) is Messier 45 (1.2) and the targeted object with the highest magnitude (i.e. the faintest) is IC59 (13.33). One last point to note: in practice, we captured many more objects than those targeted, because the field of view of the instrument sometimes contained several objects at the same time. For example, the Messier 35 open cluster is located next to another cluster, NGC2158, which was captured at the same time.

Procedure
The following procedure was applied to capture images:
- Before capture sessions, the instrument was installed in a dark environment (no direct light), properly levelled using a bubble level, and kept in the same position and location (longitude: 6.121, latitude: 49.142).
- During capture sessions, the sky was clear and allowed for reasonable quality acquisition. The authors always stayed near the instrument during observation, to guard against weather issues (wind, cloud, fog, rain) or disturbance (animals). The live-generated Stellina views made it possible to check that the capture was going well (Fig. 4). If any issue appeared during a capture, the data acquisition was stopped and the data was not stored.
- The default parameters of the Stellina observation station were applied: an exposure time of 10 s and a gain of 20 dB. These settings are optimized for tracking with an alt-azimuth mount [13].
- The total capture duration was selected according to the magnitude of the target. Approximately 100 raw images were captured for low-magnitude (bright) star clusters, while several hundred raw images were captured for high-magnitude (faint) objects such as nebulae.
- After capture sessions, raw images were copied from the Stellina telescope data storage and then compiled into ZIP files named with the target code and the observation date.
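Since every frame uses the same 10 s exposure, the total integration time of a session follows directly from the number of frames in its archive. The sketch below counts the frames in a ZIP file and derives that duration; the `.fits`/`.fit` file extensions and the helper name `summarize_archive` are assumptions, and the demonstration uses an in-memory archive with empty placeholder files instead of real data.

```python
import io
import zipfile

EXPOSURE_S = 10  # default Stellina exposure time per frame, as stated in the procedure


def summarize_archive(zip_source, exposure_s=EXPOSURE_S):
    """Count the FITS frames in an archive and derive the total integration time."""
    with zipfile.ZipFile(zip_source) as archive:
        frames = [
            name for name in archive.namelist()
            if name.lower().endswith((".fits", ".fit"))  # assumed extensions
        ]
    return len(frames), len(frames) * exposure_s


# Demonstration on an in-memory archive holding 100 empty placeholder frames.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for index in range(100):
        archive.writestr(f"frame_{index:04d}.fits", b"")
buffer.seek(0)
frame_count, integration_s = summarize_archive(buffer)
```

With about 100 frames, as used for bright star clusters, the session amounts to 1000 s of integration, i.e. a little under 17 minutes.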

Challenges
Capturing data for twelve successive months was challenging. The weather at the observation site is changeable, and it was necessary to be available and ready as soon as conditions were favorable. The duration of observation sessions also varies greatly from one season to another: summer nights are short, while winter nights are much longer.
Dealing with these difficulties was the whole point of creating such a dataset, which corresponds to common observation conditions, in contrast to surveys produced from ideal locations.

Ethics Statement
This work does not include any studies with human or animal subjects.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.