A flexible deep learning crater detection scheme using Segment Anything Model (SAM)

Craters are one of the most important morphological features in planetary exploration. Detecting, mapping and counting craters is therefore a mainstream process in planetary science, done primarily manually, which is very laborious, time-consuming and inconsistent. Recently, machine learning (ML) and computer vision have been successfully applied both to detecting craters and to estimating their size. Existing ML models for automated crater detection have been trained on specific types of data, e.g. digital elevation models (DEM), images and associated metadata from orbiters such as the Lunar Reconnaissance Orbiter Camera (LROC) etc. As a result, each of the resulting ML schemes is applicable and reliable only for the type of data used during the training process. Data from different sources, angles and setups can compromise the reliability of these ML schemes. In this paper we present a flexible crater detection scheme that is based on the recently proposed Segment Anything Model (SAM) from META AI. SAM is a promptable segmentation system with zero-shot generalisation to unfamiliar objects and images without the need for additional training. Using SAM, without additional training or fine-tuning, we can successfully identify crater-looking objects in various types of data (e.g. raw satellite images Level-1 and 2 products, DEMs etc.) for different setups (e.g. Lunar, Mars) and different capturing angles. Moreover, using shape indexes, we only keep the segmentation masks of crater-like features. These masks are subsequently fitted with a circle or an ellipse, recovering both the location and the size/geometry of the detected craters.


Introduction
Impact craters are circular-elliptical depressions on planetary surfaces caused by the impact of meteors (Melosh, 1989). The size and the shape of craters depend on numerous factors (impact angle, composition of the target body, size and type of meteor etc. (Melosh, 1989)), giving rise to a plethora of crater types with varying diameters (Salamunićcar et al., 2012). Impact craters are amongst the most important morphological features in planetary exploration (McSween et al., 2019), and they have been extensively used for inferring the composition and structure of celestial bodies (Lemelin et al., 2019). Craters act as natural excavation sites to study stratigraphy, strata and stratification, providing pivotal information for the geology and landscape evolution of the planet (Huang et al., 2018). The distribution of crater sizes has also been widely applied for estimating the age of planetary surfaces, through calculation of crater size-frequency distribution (CSFD) and chronostratigraphy (Hartman and Neukum, 2001). Apart from CSFD, the shape and the erosion of the crater have […] scale real-time crater detection, and reduce human biases (Silburt et al., 2018).
Automatic crater detection is a challenging scientific endeavour due to the wide variety of impact craters, the diversity of input data, and the level of background noise (Emami et al., 2019). Various methodologies have been suggested for crater detection algorithms (CDA) over the years, from convolutional neural networks (CNN) combined with Canny edge detection (Robbins et al., 2014a), to hybrid supervised-unsupervised machine learning (Emami et al., 2019), and Adaboost with support vector machines for detecting craters on Mars (Wetzler et al., 2005). In Salamunićcar et al. (2011), 77 CDA methodologies are outlined, divided into image-based and digital elevation model (DEM)-based approaches (Di et al., 2014). With the recent advancements in deep learning, U-nets (a deep learning framework based on convolutional neural networks, originally suggested as a segmentation algorithm in medical imaging (Ronneberger et al., 2015)) have been successfully applied for detecting and estimating the size of impact craters. U-nets have been trained using Lunar DEM (Silburt et al., 2018) and photo images from the Lunar Reconnaissance Orbiter Camera (LROC) (Downes et al.). In Wetzler et al. (2018), a set of U-nets is trained using labelled photos from the Mars Express mission. In Lee (2019), topography data from Mars are used to train U-nets. In Yang et al. (2020), data from Chang'E-1 and E-2 are used for training ML to detect craters and approximate their age based on their morphology. A segmentation approach capable of detecting and mapping the shape of a crater is also suggested in Mohamad et al. (2019). The method is based on Mask Region Convolutional Neural Networks (MaskRCNN, Kaiming et al., 2017) trained using orthoimages of DEM data from the Moon.
All these methods perform sufficiently well when applied to data similar to the ones that they have been trained for. Although models trained using DEM data from the Moon showed promising results when applied to DEM data from Mercury (Silburt et al., 2018), applying an ML-based CDA trained with a specific type of data to a different type of input is not advisable. As stated in Wetzler et al. (2018) regarding U-nets trained for identifying craters on Mars, small differences between Mars and other celestial bodies are enough to make this model unreliable outside of Mars. To that extent, an algorithm that can identify craters in a flexible manner, without being limited by the celestial body, data type, or measurement setup, would be highly valuable (Silburt et al., 2018).
In the current paper we present a flexible approach for identifying craters using the Segment Anything Model (SAM) (Kirillov et al., 2023). SAM is a foundation model developed by META for computer vision and image segmentation. SAM can segment (cut out) any morphological feature in any given image, identifying which pixels belong to an object. Foundation models like SAM are generalised ML models trained on large datasets, allowing them to be fine-tuned for specific tasks with a relatively small number of training samples. SAM was trained with over 1 billion masks on non-astronomical images to segment images in a promptable way (i.e. using a compressed representation of images), allowing zero-shot transfer to new image distributions. Via this approach, regardless of the type of the data (e.g. photos, DEM, spectra, gravity etc.), the celestial body (e.g. Moon, Mars etc.) and the measurement setup, the data will be segmented into different categories and classes. Subsequently, each mask is further classified into crater and no-crater based on geometric indexes that evaluate how circular or elliptical the investigated mask is. Via numerous examples, we illustrate the effectiveness of this processing pipeline on different sets of data from different planetary bodies and measurement setups. The results highlight the potential of foundation segmentation models for crater identification and pattern recognition in planetary science in general.

Methodology
The processing pipeline is composed of three sequential steps. Initially, the input image undergoes segmentation using SAM (Kirillov et al., 2023). SAM was not fine-tuned with additional images; we use the model as presented and described in Kirillov et al. (2023). There are no restrictions regarding the celestial body, data type, resolution etc.; any type of imagery data can be used as input. Next, each segmentation mask is analysed to determine its shape. Any masks that are not identified as circles or ellipses are filtered out, and the remaining masks are subjected to further processing to extract their boundaries and fit an ellipse to their edges. Finally, a post-processing filter is employed to eliminate any potential duplicates, artefacts, or false positives.

Segment Anything Model (SAM)
Image segmentation is a branch of computer vision and digital image processing aiming at segmenting a given image into several masks (Szeliski, 2011). Numerous algorithms have been suggested for image segmentation throughout the years, from unsupervised clustering methods such as K-means (Dhanachandra et al., 2015) to histogram-based methods (Qin et al., 2011) and data coding and compression (Ma et al., 2007). In recent years, deep learning has been extensively used for image segmentation with impressive results compared to previous approaches (Farabet et al., 2013; Chen et al., 2018; Kim et al., 2021; Kexin and Chenjun, 2020; Yang et al., 2018; Noh et al., 2015). Deep learning has become the standard in remote sensing segmentation in geosciences (Buscombe and Goldstein, 2022; Chen et al., 2020; Collins et al., 2020; Gupta et al., 2021; Zhang et al., 2018), and for real time identification of objects in Martian terrain for safe rover navigation (Liu et al., 2023a; Goh et al., 2022; Liu et al., 2023b).
Foundation deep learning schemes (Sofiiuk et al., 2022; Qin et al., 2022) have been developed for interactive image segmentation, trained with large and diverse image databases (COCO, Lin et al., 2014; LVIS, Gupta et al., 2019 etc.). In April 2023, META released their own model named ''Segment Anything Model'' (SAM) (Kirillov et al., 2023), a deep learning image segmentation model that outperforms previous approaches. SAM has been trained on a high-quality dataset (SA-1B, Kirillov et al., 2023) consisting of 11 million images from a provider that works directly with photographers, and over 1 billion masks, significantly larger than previous databases (Kirillov et al., 2023). SAM consists of a computationally expensive deep image encoder that is based on Masked Autoencoders (He et al., 2022) and a pretrained vision transformer model (Dosovitskiy et al., 2021). The image embedding produced by the image encoder is further enriched with a variety of input prompts such as clicks, selected boxes and text (Kirillov et al., 2023), or can use a dense grid of points to perform segmentation in an automatic manner. The embeddings are subsequently given as inputs to the mask decoder (based on a vision transformer model) that is trained to map the causal relationship between given embeddings and their associated segments/masks. SAM demonstrates excellent performance in a wide range of images from databases significantly different from its original training dataset SA-1B (Kirillov et al., 2023). The generalisation capabilities of SAM make it a potential candidate for CDA, overcoming the limitations of data-specific CDA without the need for additional training and well-labelled data. Regarding SAM's computational requirements, from our experience, two NVIDIA T4 Tensor Core GPUs are sufficient computational resources for providing real-time results. Moreover, recent developments in radiation tolerant GPU-based AI-processing in space (Fredrik et al., 2020) make it possible for the proposed scheme to be used as an onboard real-time CDA in future planetary missions.
Applying SAM to a given image results in the following outputs (Kirillov et al., 2023):
• Segmentation masks
• The areas of the masks in pixels
• The bounding box for each mask
• The quality of the mask (from 0 to 1), an indicator of how reliable a mask is
• The input point (x, y coordinates) that generated each mask. SAM uses a dense grid of points and for each one of them it estimates a mask in its near proximity. The point that corresponds to the highest quality mask is the input point of the mask
• The stability score for each mask, which is an additional quality index (from 0 to 1) that estimates how stable the mask is for different input coordinates
• The crop of the image used to generate each mask
From the above it is evident that SAM in principle can both detect and estimate the size of impact craters, since it provides direct information regarding the area of the mask and its bounding box. SAM can be modified and tuned by changing its hyper-parameters, but it is generally advised to use the default ones (Kirillov et al., 2023). The tunable hyper-parameters of SAM control how densely the grid points are placed, the thresholds for filtering out low quality masks (both quality and stability), and the minimum size of the masks (Kirillov et al., 2023).
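The outputs listed above can be post-filtered with a few lines of code. The following is a minimal sketch, not the authors' implementation: it assumes mask records shaped like those returned by the automatic mask generator of the public `segment_anything` package (keys `area`, `bbox`, `predicted_iou`, `stability_score`), and the helper names, the threshold values and the checkpoint path in the comment are illustrative assumptions.

```python
# Post-filter SAM-style mask records on quality, stability and size.
# The records would normally come from the public `segment_anything` package:
#   from segment_anything import sam_model_registry, SamAutomaticMaskGenerator
#   sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h.pth")  # hypothetical path
#   records = SamAutomaticMaskGenerator(sam).generate(image)  # image: HxWx3 uint8 RGB


def keep_reliable_masks(records, min_quality=0.9, min_stability=0.9, min_area=50):
    """Return the records that pass all three thresholds (illustrative values)."""
    return [
        rec for rec in records
        if rec["predicted_iou"] >= min_quality
        and rec["stability_score"] >= min_stability
        and rec["area"] >= min_area
    ]


def bbox_diameter(rec):
    """Rough size estimate in pixels: mean of the bounding-box width and height."""
    _x, _y, width, height = rec["bbox"]  # boxes are stored as (x, y, width, height)
    return 0.5 * (width + height)


# Synthetic records standing in for real SAM output (segmentation arrays omitted).
demo_records = [
    {"area": 1200, "bbox": [10, 20, 40, 38], "predicted_iou": 0.97, "stability_score": 0.96},
    {"area": 30, "bbox": [5, 5, 6, 6], "predicted_iou": 0.95, "stability_score": 0.97},
    {"area": 800, "bbox": [60, 70, 35, 30], "predicted_iou": 0.70, "stability_score": 0.92},
]

reliable = keep_reliable_masks(demo_records)
print([(rec["area"], bbox_diameter(rec)) for rec in reliable])
```

Only the first synthetic record survives: the second is too small and the third has a low quality score. The bounding-box diameter is only a first size estimate; the fitted circle or ellipse described in the next section refines it.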
Fig. 1 illustrates an example of applying SAM to a natural colour image of Orcus Patera taken from the Mars Express High Resolution Stereo Camera (HRSC). Orcus Patera is an elongated depression on Mars with debatable formation, believed to be created by cratering, volcanic or tectonic causes (van der Kolk et al., 2001; Williams and Friedlander, 2015). Despite SAM not being trained specifically for Mars Express HRSC images, SAM appears to capture all the major features in the image, which is indicative of its effectiveness. Additionally, access to geometrical features of the masks enables further classification based on shape and size. The next processing step involves filtering out non-circular/elliptical masks and fitting circles and ellipses to the remaining segments.

Circular-elliptical indexes
In the previous section SAM was applied to extract the segmentation masks of different morphological features for an input image. SAM was not fine-tuned and re-trained using astronomical images. Consequently, SAM segments any morphological feature with distinct boundaries, and it is not tuned for detecting just craters. Since the majority of impact craters have a circular-elliptical shape (Melosh, 1989), it is a rational choice to filter out all the segmentation masks that are not circular-elliptical. To do that we need to define indexes based on which the circularity-ellipticity of each mask will be assessed.
Regarding circularity, if the mask is a circle then its radius ($r$) can be inferred from the measured area ($A$) (number of pixels) via $r = \sqrt{A/\pi}$. Its circumference can also be calculated using the previously estimated radius via $C = 2\pi\sqrt{A/\pi}$. Subsequently, the perimeter of the mask ($P$) is calculated manually from the image, and if the shape is a circle then the ratio $I_c = C/P$ should be close to 1. This is a mainstream approach for calculating the circularity of an object (Bottema, 2000) with minimum computational requirements. One drawback of this approach is that elliptical shapes with low eccentricity can result in $I_c \approx 1$ and therefore give the false impression that an ellipse is a circle. To overcome this, we first fit an ellipse to the investigated mask, and subsequently we define the index $I_a = a/b$, where $a$ and $b$ are the semi-major and semi-minor axes of the fitted ellipse. If both $I_c$ and $I_a$ equal $1 \pm t$, then the shape is classified as a circle. The threshold $t$ is to be tuned depending on the type of the image, but from our experience a value $t \approx 0.1$-$0.5$ should be considered as a default.
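The circularity test can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the index symbols follow the notation used above, and Ramanujan's perimeter approximation is introduced here only to construct a test ellipse (in practice the perimeter would be measured from the mask).

```python
import math


def circularity_index(area, perimeter):
    """I_c = C / P, with C = 2*pi*sqrt(A/pi) the circumference of the equal-area circle."""
    return 2.0 * math.pi * math.sqrt(area / math.pi) / perimeter


def axis_ratio_index(semi_major, semi_minor):
    """I_a = a / b for the ellipse fitted to the mask."""
    return semi_major / semi_minor


def ellipse_perimeter(a, b):
    """Ramanujan's approximation, used here only to build a test shape."""
    return math.pi * (3.0 * (a + b) - math.sqrt((3.0 * a + b) * (a + 3.0 * b)))


def is_circle(area, perimeter, semi_major, semi_minor, t=0.1):
    """A mask is a circle only if both indexes lie within 1 +/- t."""
    i_c = circularity_index(area, perimeter)
    i_a = axis_ratio_index(semi_major, semi_minor)
    return abs(i_c - 1.0) <= t and abs(i_a - 1.0) <= t


# An ideal circle of radius 10 and a mildly flattened ellipse (a = 1.2 b).
circle = dict(area=math.pi * 10**2, perimeter=2 * math.pi * 10,
              semi_major=10.0, semi_minor=10.0)
a, b = 12.0, 10.0
ellipse = dict(area=math.pi * a * b, perimeter=ellipse_perimeter(a, b),
               semi_major=a, semi_minor=b)

print(round(circularity_index(circle["area"], circle["perimeter"]), 3))
print(round(circularity_index(ellipse["area"], ellipse["perimeter"]), 3))
print(is_circle(**circle), is_circle(**ellipse))
```

The ellipse scores a circularity index within one percent of 1, which illustrates why the perimeter-based index alone is not enough and the axis ratio is checked as well.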
For the ellipticity, first we fit an ellipse to the investigated mask, and subsequently we infer its area from $A_e = \pi a b$, where $a$ and $b$ are the semi-axes of the fitted ellipse. Then we estimate the area of the mask ($A$) manually by counting the pixels of the mask. If the mask is an ellipse then the ratio $I_e = A_e/A$ should be $I_e \approx 1 \pm t$. The threshold $t$ is to be tuned, but a value $t \approx 0.1$-$0.5$ should be considered as a default. Since by definition a circle is also an ellipse with $a = b$, in order to distinguish between circular and elliptical objects we first evaluate whether a mask is a circle (using $I_c$ and $I_a$), and only if it is not do we further check the ellipticity index $I_e$. Lastly, additional constraints on the eccentricity and the ratio $a/b$ can be trivially implemented to filter out elongated elliptical features.
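A minimal sketch of the ellipticity index follows. The text does not specify how the ellipse is fitted to the mask, so this example assumes a moment-based fit (the ellipse whose second-order moments match those of the mask pixels); other fitting methods are equally possible. The function names are hypothetical and the mask is a plain nested list of booleans to keep the example dependency-free.

```python
import math


def fit_ellipse_moments(mask):
    """Return (row_c, col_c, a, b): centre and semi-axes (a >= b) from pixel moments."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, value in enumerate(row) if value]
    n = len(pixels)
    row_c = sum(r for r, _ in pixels) / n
    col_c = sum(c for _, c in pixels) / n
    s_rr = sum((r - row_c) ** 2 for r, _ in pixels) / n
    s_cc = sum((c - col_c) ** 2 for _, c in pixels) / n
    s_rc = sum((r - row_c) * (c - col_c) for r, c in pixels) / n
    # Eigenvalues of the 2x2 covariance matrix [[s_rr, s_rc], [s_rc, s_cc]].
    mean = 0.5 * (s_rr + s_cc)
    spread = math.hypot(0.5 * (s_rr - s_cc), s_rc)
    # A solid ellipse has variance a^2/4 along its major axis and b^2/4 along its minor axis.
    a = 2.0 * math.sqrt(mean + spread)
    b = 2.0 * math.sqrt(max(mean - spread, 0.0))
    return row_c, col_c, a, b


def ellipticity_index(mask):
    """I_e = A_e / A: area of the fitted ellipse over the pixel count of the mask."""
    area = sum(1 for row in mask for value in row if value)
    _, _, a, b = fit_ellipse_moments(mask)
    return math.pi * a * b / area


# Synthetic masks: a rasterised ellipse (semi-axes 20 and 10) and an L-shaped region.
ellipse_mask = [[((r - 30) / 10) ** 2 + ((c - 30) / 20) ** 2 <= 1.0 for c in range(61)]
                for r in range(61)]
l_mask = [[(c < 5) or (r >= 25) for c in range(30)] for r in range(30)]

print(round(ellipticity_index(ellipse_mask), 2), round(ellipticity_index(l_mask), 2))
```

The rasterised ellipse gives an index close to 1, while the L-shaped region falls well outside any threshold in the suggested range. One caveat of the moment-based fit assumed here is that compact convex shapes such as rectangles also score close to 1, so the choice of fitting method affects how selective the index is.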
To summarise, the circularity index $I_c$ is first estimated to assess if an object is a circle or not. If an object is a circle, we further filter out circles with high eccentricity using the axis-ratio index $I_a$. Subsequently, the ellipticity index $I_e$ is calculated for all the remaining masks, and all the masks with low ellipticity indexes are filtered out. Through this approach, using $I_c$, $I_a$ and $I_e$ we filter out all the segmentation masks that are not circular-elliptical. The remaining masks are classified as circles or ellipses and a Canny filter is applied separately to each one of them to derive their edges. Lastly, a circle or an ellipse (depending on the mask classification) is fitted to the edges. The axes $a$, $b$, and the coordinates of the centre are saved for each of the remaining masks. The threshold $t$ is crucial for the final results. Using $t = 0$ will result in keeping only the ideal circles and ellipses and filtering out the rest of the masks, while $t = 1$ will keep all the objects masked by SAM. From our experience a value between $t = 0.1$-$0.5$ is advisable, tuned accordingly for the investigated type of dataset.
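The decision sequence summarised above can be written as a small classifier. This is a sketch under assumptions: the three index values are taken as already computed for a mask, the function name is hypothetical, and the optional `max_axis_ratio` argument is one illustrative way to implement the additional elongation constraint.

```python
def classify_mask(i_c, i_a, i_e, t=0.1, max_axis_ratio=None):
    """Return 'circle', 'ellipse' or None (rejected) for one segmentation mask.

    i_c: circularity index, i_a: axis ratio a/b (>= 1), i_e: ellipticity index.
    """
    # Step 1: a circle needs both the circularity and the axis-ratio index near 1.
    if abs(i_c - 1.0) <= t and abs(i_a - 1.0) <= t:
        return "circle"
    # Step 2: everything else is tested for ellipticity.
    if abs(i_e - 1.0) <= t:
        # Optional constraint: reject very elongated ellipses (e.g. shadows).
        if max_axis_ratio is not None and i_a > max_axis_ratio:
            return None
        return "ellipse"
    return None


print(classify_mask(0.98, 1.03, 1.01))                      # near-ideal circle
print(classify_mask(0.99, 1.40, 0.97))                      # ellipse
print(classify_mask(0.70, 2.50, 1.60))                      # irregular mask
print(classify_mask(0.90, 3.00, 1.00, max_axis_ratio=2.0))  # elongated ellipse
```

Raising `t` relaxes both tests at once, which matches the trade-off described in the text: more non-ideal craters are kept at the cost of more false positives.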
Fig. 2 illustrates the results of applying the proposed CDA to the Mars Express HRSC natural colour image of Orcus Patera (see Fig. 1). The majority of the craters are correctly identified and mapped, with a small number of false positives. Even non-conventional (speculated; van der Kolk et al., 2001) craters like the well-known elongated elliptical depression in the middle are correctly identified and sufficiently mapped. The undetected small craters in Fig. 2 are due to the size biases inherent in SAM and, as shown in Fig. 3, if we zoom into an investigated area the majority of the small craters will be correctly identified and mapped.

Case studies
The SAM-based CDA is not constrained to a specific type of data, measurement setup and celestial body. This is because the core element of the proposed CDA is SAM, a generic foundation model for segmentation (Kirillov et al., 2023), not fine-tuned for a specific type of data. To demonstrate the flexibility of the proposed scheme we examine a set of case studies from different celestial bodies using different types of data and measurement setups.
Fig. 4 shows three case studies implementing the proposed CDA with Lunar data. The dataset comprised a DEM and two Lunar Reconnaissance Orbiter Camera (LROC) images captured from different angles. The outcomes of the experiment revealed that the proposed methodology could successfully identify craters and provide reasonably accurate estimations of their size. Nonetheless, as noted in the previous section, the inability to identify smaller craters can probably be attributed to the size limitations inherent in SAM. This limitation can be mitigated by zooming into the image, as depicted in Fig. 3. Focusing can potentially reduce the resolution of an image; nonetheless, from our experience experimenting with numerous images, resolution does not significantly affect the performance of SAM, as shown in Fig. 3. Interestingly, as visible in the bottom panel of Fig. 4, the algorithm is able to detect craters even in a highly oblique image captured by LROC, thus suggesting the possibility of real-time detection of craters using operational rover and lander cameras.
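One way to automate the zooming described above is to split a large image into overlapping windows, run the detector on each window, and shift the detections back to full-image coordinates. The sketch below is an assumption about how this could be organised, not a procedure stated in the paper; the function names, tile size and overlap are hypothetical.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of overlapping tiles covering [0, length) along one axis."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than the tile size")
    if tile >= length:
        return [0]
    step = tile - overlap
    starts = list(range(0, length - tile, step))
    starts.append(length - tile)  # last tile sits flush with the image border
    return starts


def tile_windows(height, width, tile, overlap):
    """Windows as (row0, col0, row1, col1), end-exclusive, clipped to the image."""
    return [
        (r0, c0, min(r0 + tile, height), min(c0 + tile, width))
        for r0 in tile_starts(height, tile, overlap)
        for c0 in tile_starts(width, tile, overlap)
    ]


def to_full_image(x, y, window):
    """Shift a detection centre (x, y) found inside a window to full-image pixels."""
    row0, col0, _row1, _col1 = window
    return x + col0, y + row0


windows = tile_windows(100, 100, tile=60, overlap=20)
print(windows)
print(to_full_image(5, 7, windows[-1]))
```

Craters that fall in the overlap between windows will be detected more than once, so a duplicate filter is needed after merging the per-window results.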
The second case study involves data from Mars. Fig. 5 showcases three examples using three different types of data: (A) Thermal Emission Imaging System (THEMIS) infrared images, (B) High Resolution Imaging Experiment (HiRISE) orthoimages and (C) a mosaic from the European Space Agency's (ESA) HRSC. Similar to the previous case studies, the proposed CDA manages to both detect and measure the investigated craters with sufficient accuracy. The SAM-based CDA works equally well regardless of the investigated celestial body, indicating its potential to be used as a universal CDA in future space exploration missions.
In the last example we examine Phobos, the larger of the two Martian moons. We use a false colour image taken from the Mars Reconnaissance Orbiter (MRO). Fig. 6 shows the input, the remaining masks after filtering non-circular/elliptical shapes, and finally the fitted circles and ellipses. The results are sufficiently good: the proposed CDA manages to identify most of the craters with minimal false positives and negatives, despite the fact that no specific training was done for false colour MRO images.
The three case studies examined in this section consist of a diverse set of data from different celestial bodies using different instruments and measurement configurations. Overall, 298 craters were identified manually via visual inspection across all the examined case studies in order to estimate the precision and recall of the proposed CDA. The precision is defined as the ratio of the true positives over the sum of true and false positives, while the recall is the ratio of true positives over the sum of true positives and false negatives. Via visual inspection, we derived that the proposed scheme managed to correctly detect 249 of them with 92 false positives. The recall and precision are 0.8356 and 0.7302 respectively, and they stay relatively constant for all the examined case studies. These numbers are comparable to the current state of the art (Lee and Hogan, 2020), although additional testing is needed to overcome any statistical uncertainties due to the small statistical sample used in this study. The main advantage of the suggested scheme is its flexibility to various types of data. While previous approaches are trained on and are only applicable to a specific type of data from a specific celestial body, the proposed CDA can be generalised to a diverse set of datasets from arbitrary celestial bodies while retaining its recall and precision.
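The reported scores follow directly from the counts given above, as the short check below shows. The only derived quantity is the number of false negatives, taken here as the manually identified craters that were not detected (298 - 249 = 49); the function name is illustrative.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


manual_craters = 298      # craters identified by visual inspection
detected_correctly = 249  # true positives
false_positives = 92
false_negatives = manual_craters - detected_correctly  # 49 missed craters

precision, recall = precision_recall(detected_correctly, false_positives, false_negatives)
print(round(precision, 4), round(recall, 4))  # 0.7302 0.8356
```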

Discussion: Limitations and future work
The proposed SAM-based CDA is essentially a shape detector that focuses on circular/elliptical shapes. This is both an advantage and a drawback. This generic detection objective allows the algorithm to work equally well regardless of the dataset and the investigated celestial body. SAM is trained using millions of images to identify segments and masks, and any mask that has a circular/elliptical shape will be identified as a crater. At the same time, this can give rise to false positives, since not all circular/elliptical shapes are craters. One common artefact that we encountered was that the central peaks of some craters were falsely identified as craters due to their circular shape. This can be easily overcome by adding a filter that removes craters with similar centres. However, this can also potentially remove any actual crater whose centre coincides with that of a larger crater, leading to false negatives. Another typical artefact was the misclassification of shadows as craters due to their elliptical shape. These artefacts were filtered out using eccentricity thresholds that do not allow elongated ellipses to be categorised as craters. Nonetheless, with this threshold the algorithm will not be able to detect actual elongated and elliptical craters. Another example of a non-crater circular morphological feature is shown with a white arrow in Fig. 7. The circular elevated topography seen in the image is falsely identified as a crater due to its shape and distinct features. In the same figure there are also some false negatives that are highlighted with yellow arrows. The algorithm failed to detect them despite detecting similar craters in their near proximity. This is an unexpected behaviour, which indicates that more research is needed to properly assess the limitations and instabilities of the proposed scheme. Another drawback of the proposed scheme is the need for tuning the threshold for the geometrical indexes. Depending on the data, the circularity/ellipticity threshold should be tuned accordingly in order to filter out non-crater masks. In noisy, low resolution and cluttered data, the threshold should be relaxed to allow non-ideal circles and ellipses, which will consequently lead to false positives.
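The filter for detections with similar centres, mentioned above as a remedy for central peaks, could look like the following sketch. The rule used here (drop a detection whose centre lies within a fraction `centre_tol` of the radius of a larger, already accepted crater) and the function name are assumptions for illustration; the paper does not specify the exact criterion.

```python
import math


def drop_concentric_detections(craters, centre_tol=0.2):
    """Keep the larger of any near-concentric pair.

    craters: list of (x, y, radius) in pixels. A detection is dropped when its centre
    lies within centre_tol * radius of a larger detection that was already kept.
    """
    kept = []
    for x, y, radius in sorted(craters, key=lambda crater: crater[2], reverse=True):
        concentric = any(
            math.hypot(x - kx, y - ky) <= centre_tol * kr for kx, ky, kr in kept
        )
        if not concentric:
            kept.append((x, y, radius))
    return kept


detections = [
    (100, 100, 50),  # large crater
    (102, 99, 8),    # central peak mistaken for a crater
    (160, 100, 10),  # separate small crater outside the large one
    (300, 300, 20),  # unrelated crater
]
print(drop_concentric_detections(detections))
```

As noted in the text, this rule cannot distinguish a central peak from a genuine small crater that happens to sit at the centre of a larger one, so it trades false positives for possible false negatives.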
SAM is not tuned for astronomical images and this can result in partial classifications, as shown in Fig. 1, fourth crater from the left. In this example a linear feature cuts through the crater, separating it into two segments. Since SAM is not trained for this type of data, it masks one of these segments separately, leading to a partial detection as shown in Fig. 2. Within that context, the proposed CDA is expected to underperform in challenging situations like craters subject to viscous relaxation in icy bodies (Bland et al.) and craters populated by rocks and boulders (Daly et al.). To overcome these issues SAM needs to be re-trained and fine-tuned for this type of data to learn to identify such unique morphological features. SAM is a generic foundation model that can be fine-tuned for planetary surfaces via transfer learning. Transfer learning refers to a variety of techniques aimed at using an existing model pre-trained on one task to perform another related task (Pan and Yang, 2010). Transfer learning has been successfully applied in geophysics, where state of the art foundation models such as YOLO v3 (Redmon and Farhadi, 2018), pre-trained on big datasets (such as ImageNet, Deng et al., 2009), were further trained to detect specific geophysical targets of interest (Dramsch and Lüthje, 2018; Li et al., 2022). Similar approaches, where a pre-trained foundation model is further trained for a specific task, have been widely applied in various scientific fields, utilising the core capabilities of a foundation model combined with domain knowledge from a specialised well-labelled dataset (Wang et al., 2022; Sun et al., 2021; Chiba and Sasaoka, 2021; Minoofam et al., 2021). For future work, the proposed SAM-based CDA could potentially be improved by using SAM as the foundation model and further training it with a diverse well-labelled dataset from various celestial bodies and different types of planetary data.

Conclusions
Through a series of examples using different types of data from various celestial bodies, we demonstrated the potential of SAM as a CDA and as a pattern recognition tool for planetary science in general. The proposed CDA performs equally well for various types of datasets and celestial bodies, without the need for additional labelled data for fine-tuning SAM. The current work lays the foundations for a single flexible CDA for planetary science, and also introduces SAM as an effective way to identify patterns and dominant features in planetary data.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. (A) Mars Express HRSC natural colour image of Orcus Patera. (B) Segmentation of the input image using Segment Anything Model (SAM) (Kirillov et al., 2023). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 2. (A) The remaining segmentation masks from Fig. 1 after filtering out the non-circular/elliptical classes using geometrical indexes. (B) Canny filter is applied to each one of the remaining masks, and the edges are fitted with circles and ellipses. The black box is the focused area examined in Fig. 3. Notice that the ellipse that fits Orcus Patera is slightly shifted due to the irregular shape of this unique crater. Its original mask is correctly identified, but an ideal ellipse is difficult to fit to the elongated and irregular shape of Orcus Patera.

Fig. 3. Applying the proposed CDA to the top left area of the Mars Express HRSC natural colour image of Orcus Patera shown in the black box in Fig. 2. (A) is the input image, (B) are the remaining masks after filtering based on geometrical indexes, and (C) are the final fitted craters. The small undetected craters shown in Fig. 2 are now correctly identified and mapped after focusing on the investigated area. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 4. The proposed CDA using different types of Lunar data. (A) column are the input images, (B) are the remaining masks after using geometrical indexes, and (C) are the final fitted craters. The proposed CDA works reasonably well regardless of data type and measurement configuration.

Fig. 5. The proposed CDA using different types of data from Mars. (A) column are the input images, (B) are the remaining masks after using geometrical indexes, and (C) are the final fitted craters. Similar to Fig. 4, the proposed CDA works reasonably well regardless of data type and measurement configuration.

Fig. 6. The proposed CDA applied to Phobos using a false colour image from Mars Reconnaissance Orbiter (MRO). (A) column is the input image, (B) are the remaining masks after filtering non-circular/elliptical shapes, and (C) are the final fitted craters. Despite the unique nature of the input image, the proposed CDA works reasonably well, indicating the flexibility of SAM-based crater detection and pattern recognition in general. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 7. Two examples using HiRISE orthoimages from Mars. The white arrow marks a false positive: circular elevated topography that is wrongly classified as a crater. Yellow arrows highlight false negatives, i.e. craters similar to the ones that the algorithm manages to correctly detect in the same image. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)