The chances of detecting life on Mars

Missions to Mars progressively reveal the past and present habitability of the red planet. The current priority for Mars science is the recognition of definitive biosignatures related to past or present life. Success of life detection missions requires choices of the best mission design, location on Mars and particular sample to be analyzed. It is essential therefore to incorporate as much information as possible into the mission planning stages to maximize the precious opportunities provided by robotic operation on Mars. Bayesian statistics allow us to accommodate the many unknowns associated with a mission that has yet to take place. We have used Bayesian statistics to reveal that, although in situ missions are less complex, the overall probabilities of a successful mission to detect biosignatures on Mars are higher for sample return. If a mission has been designed with safe landing and operation as a priority, recognizing and avoiding those samples that do not contain the target biosignature is the most important characteristic, while for a mission where the best possible samples have been targeted the probability that the sample contains the target biosignature and that it can be correctly detected is the most dominant issue. Usefully, Bayesian statistics can be used to evaluate the chances of detecting past or present life for missions to different landing sites on Mars. A comparative assessment of Eberswalde Crater and Gale Crater indicates a higher probability of success for the latter and the probabilities of success are consistently higher for the sample return mission variant. Bayesian statistics, therefore, can inform future Mars mission planning steps to help maximize the possibility of success.


Introduction
Detecting evidence of life in samples of Mars is a major scientific preoccupation. Space missions can employ two approaches to the challenge, namely in situ analysis on Mars or the return of samples for analysis on Earth. In situ approaches have been tried but have not yet provided the searched-for evidence of life (Biemann et al., 1976; Leshin et al., 2013; Ming et al., 2014), although controversy still exists over the in situ Viking data (Levin, 2014), while sample return missions are still in the planning stages (McLennan et al., 2012). Each mission to Mars provides incremental data that improves our knowledge of the martian environment. Some of this data is sought after while other data is unexpected and fortuitous. With every increase in background knowledge, subsequent planning is more informed and the probabilities of successful future missions are enhanced. However, owing to the great expense of martian missions and the infrequency of their occurrence, other ways of improving mission planning are desirable.
Statistical approaches are one way in which mission design can be improved (Sims et al., 2002). Bayesian methods (Sivia and Skilling, 2006) in particular are useful because they can accommodate the significant unknowns associated with a mission that has yet to take place. Bayesian statistics produce degrees of belief or "Bayesian probabilities". The Bayesian approach has been used previously to decide the amounts of sample needed to be collected during sample return missions to carbonaceous asteroids (Carter and Sephton, 2013) and to target samples and perform interpretations of inconclusive data on Mars organic matter detection missions (Sephton and Carter, 2014). Benefits of a Bayesian statistical approach include identification of key components to which mission success is most sensitive. While the values of estimated inputs into the statistics may be modified as new data is acquired, the relative importance of individual types of data is unlikely to change. With the parts of missions to which overall success is most sensitive constrained, future mission design can take account of these findings and allocate resources accordingly.
Increasing mission complexity requires progressively more intricate statistical analysis, so for the purposes of this paper we will consider a relatively simple mission that will capture the fundamentals of Bayesian analysis. We will assume the following: (i) only one sample will be collected, (ii) the mission has only one sampling tool and (iii) only one type of target rock is to be sampled. The context in which these mission objectives will be operated will be varied, but these fundamental assumptions will remain. Many choices of values to include in the calculations can be updated as more information is received from Mars and the most accurate values will be perpetually open to debate, yet we hope that the method we establish provides a useful means of comparing mission designs.

Defining mission components for a simple mission
To identify a space mission with the highest probability of success we need to consider four components of the mission whose probability of occurrence will influence the likelihood of mission success. These components are expressed as pairs of propositions, where an overbar denotes negation:
1. $J$ and $\bar{J}$ are the propositions that the journey required is, or is not, completed successfully.
2. $S$ and $\bar{S}$ are the propositions that we can successfully, or unsuccessfully, acquire a single sample at the designated sample site.
3. $L$ and $\bar{L}$ are the propositions that the sample does, or does not, contain the target biosignature.
4. $T$ and $\bar{T}$ are the propositions that we have, or do not have, a positive test result for the target biosignature.

Defining mission outcomes (dependent probabilities)
There are six possible outcomes to the scientific mission and for each outcome we can calculate the probability of it occurring. The mission outcomes can be thought of as dependent probabilities and are listed in Table 1.

Defining mission steps (independent probabilities)
The six dependent probabilities above cover all possibilities and so must sum to one. From this requirement we know that we can have at most five independent pieces of information, and the remaining probability is simply one minus the sum of the other five. Note that changing the values of $P(J \mid I)$ or $P(S \mid J, I)$ will necessarily change all of the other probabilities.
So we now introduce five independent probabilities (Table 2). The dependent and independent probabilities can appear very similar, e.g. $P(T, L, S, J \mid I)$ and $P(T \mid L, S, J, I)$, but are mathematically different. To appreciate the difference one must note the position of the vertical line in the two probabilities. This vertical line divides the things we assume we do not know from those that we assume we do know. In $P(T, L, S, J \mid I)$ we assume that we have some background knowledge ($I$), but that $T$, $L$, $S$ and $J$ are unknown, and we wish to know the probability that $T$, $L$, $S$ and $J$ occur simultaneously. In $P(T \mid L, S, J, I)$, by contrast, we assume $L$, $S$ and $J$ are known and only the probability of $T$ occurring remains to be calculated.
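The two notations are linked by the product rule of probability. As a brief worked identity (standard probability theory, written in the overbar-free case for the preferred outcome, and not a formula quoted from the tables), the joint probability factorises into a chain of conditional probabilities, one per mission step:

```latex
\begin{aligned}
P(T, L, S, J \mid I)
  &= P(T \mid L, S, J, I)\; P(L, S, J \mid I) \\
  &= P(T \mid L, S, J, I)\; P(L \mid S, J, I)\; P(S \mid J, I)\; P(J \mid I)
\end{aligned}
```

Read from right to left, the factors follow the order of the mission: complete the journey, obtain the sample, have the biosignature present in it, and obtain a positive test.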

Probability estimation methodologies
In this section we consider how to estimate the independent probabilities for a case involving a single sample, a single sample tool and one target rock type.

Journey probabilities
The probability that a journey can be completed successfully will depend on where we start, where we want to get to and how we transition between the two. Any journey, e.g. between the points $A$ and $B$, can be broken down into a series of steps. There will be an intermediate point, e.g. $C$, and the first step will be $A \to C$ and the second step will be $C \to B$. This process can be iterated so that any journey can be broken down into many short steps. The probability of completing a journey is the product of the probabilities of completing each step.
The number of steps that a journey is broken into is a matter of convenience. What is important is the ability to assign a meaningful probability to completing the chosen steps. It is possible that a step, e.g. $C \to B$, can be completed in two, or more, ways. The particular way chosen will depend on information that is not currently available. What matters at this stage of the analysis is that we can estimate $P(C \to B)$ using some appropriate methodology. Perhaps the most relevant example of two different journey types is provided by comparing in situ and sample return missions to Mars (Fig. 1). In situ missions rely on analyses on or near the surface of Mars to achieve their objectives. Sample return missions select samples on Mars but rely on extensive analyses in Earth laboratories to meet mission goals. To date, only in situ Mars missions have taken place. Substantial planning is taking place for Mars Sample Return and statistical approaches can form part of ongoing preparation activities.
In situ and sample return missions present different engineering challenges. While some features are common to both mission types, sample return also requires sample storage, departure from Mars, transport to Earth and recovery in a fashion that maintains sample integrity. Mission designs for Mars Sample Return involve the collection and temporary storage (caching) of material on the surface of Mars, before its recovery by a separate mission. If caching is involved, the journey can be complex because a sample must be obtained at one site and then transported to a suitable storage location.

Table 1 Mission outcomes (dependent probabilities) and their definitions.

| # | Dependent probability | Definition |
|---|---|---|
| DP1 | $P(\bar{J} \mid I)$ | This is the probability that the journey is not completed successfully. The outcome is that no sample arrives at the point of measurement. When calculating this probability we are allowed to use whatever background knowledge ($I$) we have. |
| DP2 | $P(\bar{S}, J \mid I)$ | This is the probability that we have a successful journey but do not obtain a sample. Again we can use background knowledge when we calculate this probability. |
| DP3 | $P(T, L, S, J \mid I)$ | This is the probability of our preferred outcome, namely a positive test result on a sample that contains the target biosignature, following a successful journey and sample collection step. |
| DP4 | $P(\bar{T}, L, S, J \mid I)$ | This is the probability of an outcome we would prefer to avoid. We successfully acquire a sample containing the target biosignature, but the test returns a negative result following some sort of failure in the physical test or the analysis. |
| DP5 | $P(T, \bar{L}, S, J \mid I)$ | This is the probability of another outcome that we would wish to avoid. We get a positive test result from a sample that does not contain the target biosignature. |
| DP6 | $P(\bar{T}, \bar{L}, S, J \mid I)$ | This is the probability of a negative test result from a sample that does not contain the target biosignature. This outcome is one we would prefer not to experience but, as a true result, it is better than the outcomes that involve testing errors. |

An example of the steps for a journey to gather samples from the martian surface may contain: 1. Earth surface to Earth orbit; 2. Earth orbit to Mars orbit; 3. Mars orbit to Mars surface landing site; 4. landing site to sample collection site.
For in situ measurements these four steps would constitute the complete journey. For a sample return mission we would have a lengthy and more complex list of stages to the journey, which might be: 1. Earth surface to Earth orbit; 2. Earth orbit to Mars orbit; 3. Mars orbit to Mars surface landing site; 4. landing site to sample collection site; 5. transfer of sample to return vehicle at sample collection site; 6. sample collection site to Mars orbit; 7. Mars orbit to Earth orbit; 8. Earth orbit to Earth landing site; 9. Earth landing site to sample receiving facility.
The probability for the journey element of a sample return mission will be the product of the probabilities of the two separate journeys, outbound and return.
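The step-by-step product described above can be sketched in a few lines of Python. This is a minimal illustration rather than a mission model: the per-step values are hypothetical numbers chosen only to show the arithmetic, the names `journey_probability`, `outbound_steps` and `return_steps` are our own, and each step probability is assumed to be conditional on the preceding steps having succeeded.

```python
from math import prod


def journey_probability(step_probabilities):
    """Return the probability of completing every step of a journey.

    Each value is the probability of completing one step given that all
    earlier steps succeeded, so the journey probability is their product.
    """
    steps = list(step_probabilities)
    for p in steps:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"step probability {p} is outside [0, 1]")
    return prod(steps)


# Hypothetical per-step probabilities (illustrative assumptions only).
outbound_steps = {
    "Earth surface to Earth orbit": 0.98,
    "Earth orbit to Mars orbit": 0.95,
    "Mars orbit to Mars surface landing site": 0.90,
    "landing site to sample collection site": 0.95,
}
return_steps = {
    "transfer of sample to return vehicle": 0.95,
    "sample collection site to Mars orbit": 0.90,
    "Mars orbit to Earth orbit": 0.95,
    "Earth orbit to Earth landing site": 0.95,
    "Earth landing site to sample receiving facility": 0.99,
}

# In situ: the outbound leg is the whole journey.
p_in_situ = journey_probability(outbound_steps.values())
# Sample return: both legs are needed, so their probabilities multiply.
p_sample_return = p_in_situ * journey_probability(return_steps.values())

print(f"in situ journey:       {p_in_situ:.3f}")
print(f"sample return journey: {p_sample_return:.3f}")
```

Because every factor is at most one, adding steps can only lower the journey probability, which is why the longer sample return journey scores lower on this component whatever values are chosen.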
When estimating if we can obtain a sample at a location with a specific tool there are two issues to consider: does the tool operate as designed, and does the target rock exist at the location. Therefore the probability that we can obtain a sample at a specified site is given by

$$P(S \mid J, I) = P(\text{tool works as designed}) \times P(\text{rock type exists at sample location})$$

Different tools will have different probabilities of working as designed due to the complexity of their operation. A scoop will be more likely to operate as designed than a rock abrasion tool or corer, which in turn is more likely to operate successfully than a more complex drill.
The probability that a rock type exists will depend on the ensemble of rock types that might be present. If we assume that the target biosignature is contained in the rock and not in unconsolidated regolith then

$$P(\text{target rock present}) = P(\text{target rock only}) + P(\text{mixture of target rock and regolith}) = 1 - P(\text{regolith only})$$

For example, for a location that is judged to have a probability of 1/3 that it is regolith covered, a probability of 1/3 that it is suitable target rock and a probability of 1/3 that it is a mixture, $P(\text{target rock present}) = 2/3$.
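The sample acquisition calculation can be checked with exact fractions. In this sketch the function names are ours, the tool reliability of 9/10 is a hypothetical value, and we assume (as the 2/3 worked example implies) that a mixed rock-and-regolith surface counts as target rock being present.

```python
from fractions import Fraction


def p_target_rock_present(p_rock_only, p_mixture):
    """Target rock is present if the surface is rock only or a mixture."""
    return p_rock_only + p_mixture


def p_sample(p_tool_works, p_rock_present):
    """P(S | J, I) as the product of tool reliability and rock presence."""
    return p_tool_works * p_rock_present


third = Fraction(1, 3)
# Worked example from the text: regolith 1/3, target rock 1/3, mixture 1/3.
p_rock = p_target_rock_present(p_rock_only=third, p_mixture=third)

# Hypothetical tool reliability, assumed purely for illustration.
p_tool = Fraction(9, 10)
p_s = p_sample(p_tool, p_rock)

print(f"P(target rock present) = {p_rock}")  # 2/3
print(f"P(S | J, I)            = {p_s}")     # 3/5
```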

Probability of representative samples
Assuming the target biosignature is present somewhere in the sampled horizon, whether one will recover the target biosignature will depend on (i) the size and shape of the sample as well as (ii) where within the sampled rock one might expect to find the target biosignature. Each type of rock will have its own structure and can be modeled mathematically to get a good estimate of the probability of the target biosignatures being present within a certain sized subsample. The probability of detecting organic matter in extraterrestrial samples of various sizes has been investigated before. Previous work (Carter and Sephton, 2013) found that for organic matter in an object such as the Murchison meteorite nanometer-sized subsamples (where the axes of a three-dimensional sample are less than a nanometer in length) are never representative of the whole sample but micrometer-sized subsamples can be representative of the whole at low probabilities (90%) if the subsample shape is a cube. The potential mineralogy of Mars samples could be highly variable, so it is appropriate to simply adopt the previously published probabilities (Carter and Sephton, 2013) as a preliminary estimate.

Table 2 Mission steps (independent probabilities) and their definitions.

| # | Independent probability | Definition |
|---|---|---|
| IP1 | $P(J \mid I)$ | This is the probability of successfully completing the necessary journey. |
| IP2 | $P(S \mid J, I)$ | This is the probability of successfully obtaining a sample when we assume that the journey is completed successfully; we can also use our background knowledge for this probability. |
| IP3 | $P(L \mid S, J, I)$ | This is the probability that the sample contains the target biosignature when we assume that the journey is completed successfully and that a sample was successfully obtained. |
| IP4 | $P(T \mid L, S, J, I)$ | This is the probability that we obtain a true positive result; there is also a related probability of getting a false negative result. |
| IP5 | $P(T \mid \bar{L}, S, J, I)$ | This is the probability of getting a false positive result; there is also a related probability of getting a true negative result. |

True positive result probabilities
Having successfully obtained a sample containing the target biosignature it then has to be transferred to the testing device and a measurement made. Our hope is that this will result in a positive test result. However there are a number of things that could prevent a true positive, e.g. there may be too little of the target biosignature present for it to be detected, or there may be some error during the measurement process. For the probability of a true positive, values of one or zero are unlikely to be realistic because in practice absolute certainty is elusive.

False positive result probabilities
Having successfully obtained a sample it is possible that it does not contain the target biosignature but that we still manage to obtain a positive test result. Such false positives can occur for two reasons: either (i) contamination has occurred or (ii) an error in the testing procedure has resulted in a false positive. Once again values of zero or one are unlikely to be realistic.

Defining relationships between probabilities
Each of the independent probabilities can be assigned any value between zero and one, although zero and one themselves are unattractive and unrealistic assignments. It can be shown that the six dependent probabilities (mission outcomes) can be expressed as functions of these five independent probabilities (mission steps). The relationships between probabilities are presented in Table 3.
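As a hedged sketch of how such relationships can be evaluated, the function below derives the six mission outcomes from the five mission steps using only the product rule and the rule that a proposition and its negation sum to one. It is our own derivation from the definitions in Tables 1 and 2, not a transcription of Table 3; the function name, argument names and input values are illustrative assumptions.

```python
def mission_outcomes(p_j, p_s, p_l, p_tp, p_fp):
    """Return the six outcome probabilities DP1..DP6 (Table 1 order).

    p_j  : P(J | I)             journey completed
    p_s  : P(S | J, I)          sample obtained
    p_l  : P(L | S, J, I)       sample contains the biosignature
    p_tp : P(T | L, S, J, I)    true positive
    p_fp : P(T | not L, S, J, I) false positive
    """
    for p in (p_j, p_s, p_l, p_tp, p_fp):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability {p} is outside [0, 1]")
    reach = p_j * p_s  # journey completed and sample obtained
    return {
        "DP1": 1.0 - p_j,                         # journey fails
        "DP2": p_j * (1.0 - p_s),                 # no sample obtained
        "DP3": reach * p_l * p_tp,                # true positive
        "DP4": reach * p_l * (1.0 - p_tp),        # false negative
        "DP5": reach * (1.0 - p_l) * p_fp,        # false positive
        "DP6": reach * (1.0 - p_l) * (1.0 - p_fp),  # true negative
    }


# Hypothetical mission-step values, for illustration only.
outcomes = mission_outcomes(p_j=0.8, p_s=0.7, p_l=0.9, p_tp=0.7, p_fp=0.2)
for name, value in outcomes.items():
    print(f"{name}: {value:.4f}")
print(f"total: {sum(outcomes.values()):.4f}")
```

The six values sum to one for any valid inputs, which is the constraint that leaves only five independent quantities.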

The influence of sampling location
Each of the five independent probabilities (mission steps) could depend on the location from which the sample is taken. The nature of the sample may change with location, e.g. loose sand in one area and solid rock in another. If a scoop is employed then, from an engineering viewpoint, sand will be easier to sample than solid rock. Geological materials from different locations will have been subject to different geological processes and hence can be expected to differ in the probability of the target biosignature being present. For instance, for a mission whose objective is to detect biosignatures of past or present life, Amazonian basaltic rock in one location may have a low probability of containing the target biosignature because basaltic rocks represent poorly habitable conditions and have poor biosignature preservation potentials. By contrast, Noachian clay-rich rocks reflect habitable conditions and are associated with high biosignature preservation potentials. If the rocks from two locations are different then the probability for success and failure may change. So the probabilities for all six of the dependent probabilities (mission outcomes) can change with location. Knowledge of how probabilities increase or decrease with location choice represents an important goal for Mars mission science.
If we have a list of possible sample locations we need a systematic method to compare the relative desirability of each location. Potentially there is an infinite list of locations if we were to consider the complete martian surface. If there is a best location then there must be a single function that would allow us to identify this one location from all of the possibilities. A possible choice for this function would be to produce weightings for the six possible dependent probabilities (mission outcomes) (Table 4).
The best sampling location is then identified by calculating a utility function, which is a weighted sum of the six dependent probabilities (mission outcomes) listed in Table 4, as follows:

$$f = \sum_{i=1}^{6} w_i \, \mathrm{DP}_i$$

where $\mathrm{DP}_i$ is the $i$th dependent probability in Table 1. The various weights are set such that $w_i \in [-1, +1]$ captures their relative importance. One should set the values of the weights prior to examining the outcome probabilities of any possible sample locations.

The influence of landing site selection
The dominant influence on sampling location is the choice of landing site. Landing sites can be crudely subdivided into two types. The first type of landing site is defined by relatively safe conditions achieved by preferring moderate latitudes and low altitudes and by selecting places with gentle topography, few rock exposures and limited dune cover. This type of landing site lends itself to safe landing and was favored for early Mars lander missions such as Viking (Moore and Jakosky, 1989). The second type of landing site has conditions that are more challenging from an engineering viewpoint with more extreme latitudes, higher altitudes, boulder fields, near surface rocks and exposures that could include steep sided cliffs. This type of landing site is more suitable for scientific investigations because the relaxation of engineering restrictions can give access to subsurface materials recently exposed in cliff faces that provide higher probabilities of detecting records of past life. Access to the martian subsurface has some negative associations because, in the most amenable sites, landing (and trafficking if a rover is involved) is more hazardous. As the current Mars exploration program matures, the nature of preferred landing sites is migrating from engineering-focused to science-focused locations.

The engineering-focused landing site
A first possible set of weights assumes that engineering considerations associated with mission safety are paramount and supersede scientific objectives. Hence $w_1 = -1$ because we wish to avoid locations where the journey is difficult to complete successfully, $w_2 = -1$ because we wish to avoid locations where it is difficult to obtain a sample, $w_3 = +1$ because this represents a scientifically successful measurement, $w_4 = -1$ because we wish to avoid false negative results, $w_5 = -1$ because we wish to avoid false positive results, and $w_6 = 0$ because we can only have five independent probabilities so we choose to be indifferent to this one.

Table 3 Relationships between independent and dependent probabilities.

The science-focused landing site
A second set of weights assumes that scientific considerations are the highest priority irrespective of engineering difficulties: $w_1 = 0$ because we are ambivalent about whether the journey is difficult or easy, our primary concern being to obtain a true positive result, $w_2 = 0$ for the same reasons, $w_3 = +1$ as this represents a scientifically successful measurement, $w_4 = -1$ because we wish to avoid false negative results, $w_5 = -1$ because we wish to avoid false positive results, and $w_6 = 0$ because we are indifferent to this result.
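The two weight sets can be written down directly and applied to any set of six outcome probabilities. The sketch below is illustrative only: the names `ENGINEERING_WEIGHTS`, `SCIENCE_WEIGHTS` and `utility` are ours, and the outcome probabilities are hypothetical round numbers chosen so the arithmetic is easy to follow.

```python
# Weights w1..w6 for the outcomes DP1..DP6, as described in the text.
ENGINEERING_WEIGHTS = (-1, -1, +1, -1, -1, 0)
SCIENCE_WEIGHTS = (0, 0, +1, -1, -1, 0)


def utility(weights, outcomes):
    """Weighted sum of the six mission-outcome probabilities."""
    if len(weights) != 6 or len(outcomes) != 6:
        raise ValueError("expected six weights and six outcome probabilities")
    if any(not -1 <= w <= 1 for w in weights):
        raise ValueError("weights must lie in [-1, +1]")
    if abs(sum(outcomes) - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to one")
    return sum(w * p for w, p in zip(weights, outcomes))


# Hypothetical outcome probabilities DP1..DP6 (illustrative assumption).
outcomes = (0.1, 0.1, 0.5, 0.1, 0.1, 0.1)

f1 = utility(ENGINEERING_WEIGHTS, outcomes)  # -0.1 - 0.1 + 0.5 - 0.1 - 0.1
f2 = utility(SCIENCE_WEIGHTS, outcomes)      # 0.5 - 0.1 - 0.1

print(f"f1 (engineering-focused) = {f1:+.2f}")
print(f"f2 (science-focused)     = {f2:+.2f}")
```

For the same outcomes the engineering-focused score can never exceed the science-focused score, because it additionally penalises a failed journey and a failed sample acquisition.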
Table 4 The various weights for the six dependent probabilities (mission outcomes).

| # | Weighted dependent probability | Definition |
|---|---|---|
| W1 | $w_1 \times P(\bar{J} \mid I)$ | The weight of the probability that the journey is not completed successfully. |
| W2 | $w_2 \times P(\bar{S}, J \mid I)$ | The weight of the probability that we have a successful journey but do not obtain a sample. |
| W3 | $w_3 \times P(T, L, S, J \mid I)$ | The weight of the probability that we have a positive test result on a sample that contains the target biosignature. |
| W4 | $w_4 \times P(\bar{T}, L, S, J \mid I)$ | The weight of the probability that we successfully acquire a sample containing the target biosignature, but the test returns a negative result following some sort of failure in the physical test or the analysis. |
| W5 | $w_5 \times P(T, \bar{L}, S, J \mid I)$ | The weight of the probability that we get a positive test result from a sample that does not contain the target biosignature. |
| W6 | $w_6 \times P(\bar{T}, \bar{L}, S, J \mid I)$ | The weight of the probability that we get a negative test result from a sample that does not contain the target biosignature. |

Comparing different mission designs

Comparing engineering (probably safe) and scientific (potentially challenging) landing sites
In Table 5 probabilities are given that relate to two hypothetical mission priorities (engineering with a safety-driven approach and science with its focus on obtaining the target biosignature) that necessitate two different sampling locations with varying levels of difficulty for sample acquisition. The assigned probabilities are open to debate and can be modified as future technologies are developed. For the purpose of this paper, however, we assign values for the two mission priorities based on the following assumptions:
- The probability of successfully completing the journey is higher for an engineering (safety) focussed mission that contains less risk compared to a science-focussed mission which involves more risk.
- The probability of successfully obtaining a sample is higher in safer environments targeted by an engineering (safety) focussed mission rather than the more difficult terrain associated with a science-focussed mission.
- The probability of obtaining a sample which contains the target biosignature is higher when science is the focus of the mission rather than engineering (safety) as a priority.

In both scenarios we have kept the values for true positive and false positive test results the same. The probability of a true positive measurement is high for a well tested instrument that is used to measure the correct sample. The probability of a false positive measurement is low for a well tested instrument that is used to measure the correct sample.

Table 5 Individual probabilities of successful missions associated with an engineering focussed landing site where safety is the highest priority and a science focussed landing site where safety is not prioritized.

Table 6 The utility function for successful missions associated with an engineering focussed landing site where safety is the highest priority and a science focussed landing site where safety is not prioritized.

| | Location A (engineering) | Location B (science) |
|---|---|---|
| Engineering (safest possible mission) focused, $f_1$ | | |
| Science (best possible sample) focused, $f_2$ | −0.0160 | 0.1900 |

Table 7 Individual probabilities of successful missions associated with in situ and sample return missions to Mars.
If we choose the first set of weights that focus on engineering and safety considerations then $f_1$, which measures the success of the mission, is maximized by location A, which is a safe site in which to land and operate (Table 6). If we choose the second set of weights that focus on science then $f_2$ is maximized by location B, which is geologically more varied but contains more science opportunities. During planning the question must therefore be asked: is it better that the mission is seen to succeed by obtaining a sample to test at location A, even if the probability of meeting the primary success criteria is lower, or do you risk complete failure at location B but with a higher chance of meeting the primary success criteria?
It is important to note that the weights that have the greatest effect on the final probability of success are those that have the greatest dependent probabilities (mission outcomes). The sensitivities identified could help to guide investment in mission preparation and operation (Table 5).
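One hedged way to probe such sensitivities is a simple finite-difference calculation: nudge each mission-step probability in turn and record how much the science-focused utility changes. This is our own illustrative technique rather than the procedure used for the tables, the baseline values are hypothetical, and the names `science_utility` and `sensitivities` are assumptions.

```python
def science_utility(p):
    """f2 = DP3 - DP4 - DP5, written in terms of the five mission steps."""
    reach = p["journey"] * p["sample"]
    dp3 = reach * p["contains"] * p["true_pos"]          # true positive
    dp4 = reach * p["contains"] * (1.0 - p["true_pos"])  # false negative
    dp5 = reach * (1.0 - p["contains"]) * p["false_pos"]  # false positive
    return dp3 - dp4 - dp5


def sensitivities(func, params, h=1e-6):
    """Central-difference estimate of d(func)/d(param) for each input."""
    result = {}
    for name in params:
        up = dict(params)
        down = dict(params)
        up[name] += h
        down[name] -= h
        result[name] = (func(up) - func(down)) / (2.0 * h)
    return result


# Hypothetical baseline mission-step probabilities (illustration only).
baseline = {
    "journey": 0.8,
    "sample": 0.7,
    "contains": 0.9,
    "true_pos": 0.7,
    "false_pos": 0.2,
}

slopes = sensitivities(science_utility, baseline)
# Rank inputs by the size of their effect on f2, largest first.
for name, slope in sorted(slopes.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>9}: {slope:+.3f}")
```

With these particular inputs the true positive probability has the steepest slope, but the ranking depends on the baseline chosen and would need to be recomputed for any real set of mission estimates.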

Comparing in-situ and return missions
We can apply the statistical approach outlined above to compare the probabilities of success of in situ and sample return missions to Mars or, more importantly, to identify those features to which the success of both types of mission is most sensitive. We chose a science-focussed landing site for our comparison of in situ and sample return missions (Table 7). As before, assigned probabilities are open to discussion and will change as future technologies are developed and our understanding of Mars increases. For the purpose of this paper, however, we assign values for the two mission designs based on the following assumptions:
- The probability of completing the journey is higher for the in situ mission because such a mission involves fewer steps.
- The probability of obtaining a sample in a form suitable for analysis is higher for a sample return mission because sample preparation on Mars is less necessary.
- The probabilities of whether the sample contains the target biosignature are the same for both cases.
- The probabilities of true positives are higher for a sample return mission because the more exhaustive analyses available in Earth laboratories could constrain and discount any contamination, despite more contamination opportunities occurring.
- The probabilities of false positives are lower for a sample return mission for the same reasons.
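The assumptions above can be turned into a small numerical comparison. The inputs below are our own illustrative choices, not the entries of Table 7 (which are not reproduced in this text): they were picked to respect the listed assumptions and to reproduce the $f_1$ and $f_2$ values quoted in Table 8, and the actual Table 7 values may differ. The function name `utilities` and the dictionary keys are likewise assumptions.

```python
ENGINEERING_WEIGHTS = (-1, -1, +1, -1, -1, 0)
SCIENCE_WEIGHTS = (0, 0, +1, -1, -1, 0)


def utilities(p_j, p_s, p_l, p_tp, p_fp):
    """Return (f1, f2) for one mission design from its five step probabilities."""
    reach = p_j * p_s  # journey completed and sample obtained
    outcomes = (
        1.0 - p_j,                           # DP1: journey fails
        p_j * (1.0 - p_s),                   # DP2: no sample obtained
        reach * p_l * p_tp,                  # DP3: true positive
        reach * p_l * (1.0 - p_tp),          # DP4: false negative
        reach * (1.0 - p_l) * p_fp,          # DP5: false positive
        reach * (1.0 - p_l) * (1.0 - p_fp),  # DP6: true negative
    )
    f1 = sum(w * p for w, p in zip(ENGINEERING_WEIGHTS, outcomes))
    f2 = sum(w * p for w, p in zip(SCIENCE_WEIGHTS, outcomes))
    return f1, f2


# Hypothetical inputs consistent with the stated assumptions:
# easier journey in situ, easier sampling and better testing for sample return.
designs = {
    "in situ": dict(p_j=0.8, p_s=0.7, p_l=0.9, p_tp=0.7, p_fp=0.2),
    "sample return": dict(p_j=0.7, p_s=0.8, p_l=0.9, p_tp=0.9, p_fp=0.1),
}

results = {name: utilities(**inputs) for name, inputs in designs.items()}
for name, (f1, f2) in results.items():
    print(f"{name:>13}: f1 = {f1:+.4f}, f2 = {f2:+.4f}")
```

In this sketch both designs have the same combined probability of reaching the testing stage (0.56), so the advantage of sample return comes entirely from the better true positive and false positive rates.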
In Table 8 we compare the overall probabilities for in situ and sample return missions for samples obtained from the same site.

Table 8 The utility function for successful missions associated with in situ and sample return missions to Mars.

| | In situ on Mars | Mars Sample Return |
|---|---|---|
| $f_1$ | −0.2496 | −0.0424 |
| $f_2$ | 0.1904 | 0.3976 |

To compare two actual landing sites we chose Gale Crater and Eberswalde Crater (Fig. 2). These locations were among the four shortlisted candidates for Mars Science Laboratory (Grant et al., 2011). Although there are some features common to both sites, each has different altitudes, latitudes, geology and dune coverage (Kereszturi, 2012). The Mars Science Laboratory Curiosity Rover is currently operating in Gale Crater. Abridged data for Eberswalde and Gale are presented in Table 9.

Table 9 Abridged data for landing sites on Mars (Grant et al., 2011; Kereszturi, 2012).
For demonstration of the Bayesian approach, certain features for the two craters can be highlighted. Both Gale Crater and Eberswalde Crater have strong scientific reasons for investigation. Gale Crater formed in the Noachian and has a large central mound with a kilometers thick sequence that displays strata containing the Noachian-Hesperian boundary and an associated transition from clay and sulfate to sulfate and oxide mineralogies (Milliken et al., 2010). Eberswalde Crater formed in the Late Noachian to Early Hesperian (Rice et al., 2013) and contains a fan-shaped deposit thought to represent an ancient delta with clay-rich channels deposited from liquid water (Malin and Edgett, 2003). The channels post-date ejecta from the nearby Holden Crater implying the flow of water after the Early-Late Hesperian (Rice et al., 2013). It is reasonable to assume that the presence and abundance of rocks which reflect habitable conditions can be used to imply the likelihood of detecting biosignatures. Hence the probabilities for success for Gale Crater and Eberswalde Crater can be estimated (Table 10).
Some contrasting characteristics of Gale Crater and Eberswalde Crater can be used to assess the influence of different landing sites on the probabilities of mission success. Eberswalde Crater has characteristics that make landing more difficult than at Gale Crater. Eberswalde Crater has an order of magnitude less dune coverage and an order of magnitude more accessible exposures (Kereszturi, 2012). Eberswalde Crater is also situated at a higher latitude than Gale Crater.
Again, assigned probabilities of successful missions can be the subject of extensive deliberation and will change as future technologies are developed and our understanding of locations on Mars increases. For the purpose of this paper, however, we assign values for the two mission designs based on the different features of the two craters outlined above and the following assumptions:
- The probability of completing the journey is higher for Gale Crater because it has a lower latitudinal setting and a lower altitude than Eberswalde Crater.
- The probability of obtaining a sample in a form suitable for analysis is higher for Gale Crater because of the much higher percentage of accessible exposures compared to Eberswalde Crater.
- The probability of whether the sample contains the target biosignature is higher for the phyllosilicate and sulfate-rich layers of Gale Crater which reflect past liquid water and high organic preservation potential, but also redox opportunities for life; Eberswalde Crater offers only the first two of these features.
- As before, the true positive to false positive ratios are higher for the Mars Sample Return variant of each mission.
In Table 11 we compare the utility functions for success for in situ and sample return missions to Eberswalde and Gale Crater on Mars.
For the given probabilities we see that in both locations a sample return mission is preferred under both the engineering (safety) and science focus criteria. If we can only carry out in situ measurement then Gale Crater is the preferred location under both criteria. This is also the case for sample return missions. This result is in accord with the independent probabilities for the two craters, with Gale Crater displaying generally higher values than Eberswalde Crater.

Conclusions
Bayesian statistical analysis reveals that the probability of success and failure of Mars missions can be characterised by a number of outcomes with dependent probabilities, which can be expressed by a smaller number of independent probabilities. The existence of relationships between independent probabilities (mission steps) and dependent probabilities (mission outcomes) guarantees that if values can be assigned to mission steps then probabilities can be calculated for each of the six possible mission outcomes. Bayesian statistical approaches, therefore, allow probabilities of success for different Mars Life Search mission designs to be generated and compared long before they operate on the red planet. Bayesian statistics reveal that the size of the sample will have a significant effect on the probability of successfully detecting the target biosignatures. Irrespective of sample size, successful detection can be made uncertain by contamination or problems with the measurement technique which can lead to false positives or false negatives. For an engineering (safety) focused mission, the probability that samples which do not contain the target biosignature can be recognized as such and then avoided is the most important issue, while for a science focused mission the probability that samples which do contain the target biosignature can be recognized and then collected is the most dominant concern. The overall probabilities of a successful mission to detect biosignatures on Mars are universally higher for a sample return mission than an in situ mission, despite the sample return mission involving more engineering risks. Bayesian statistical analysis can be used to determine the probability of success for missions to different landing sites on Mars. An example comparison of Eberswalde Crater and Gale Crater revealed a higher probability of success for the latter based on both engineering and science considerations.