The moon lander optimal control problem revisited

We revisit the control problem for a spacecraft to land on the moon surface at rest with minimal fuel consumption. We show that a detailed analysis in the related 3D phase space uncovers the existence of infinitely many safe landing curves, contrary to several former 2D descriptions that implicitly claim the existence of just one such curve. Our results lead to a deeper understanding of the dynamics and allow for a precise characterization of the optimal control. Such control is known to be bang-bang and our results give a full characterization of the switch position.


Introduction
How should we safely land a spacecraft (a so-called moon lander) on the moon surface so as to use the least possible amount of fuel? See Figure 1. This problem was originally suggested by Miele [5, Section 4.8] in 1962 and subsequently solved by Meditch [4] in 1964. Our purpose is to give a new simplified solution to this optimal control problem and to clarify a delicate point that may create misunderstandings.
To model this problem, we introduce the variables
    h(t) = height of the spacecraft at time t,
    v(t) = ḣ(t) = velocity of the spacecraft,
    m(t) = total mass of the spacecraft,
    α(t) = thrust at time t.
The function α(t) is the control and we assume that 0 ≤ α(t) ≤ 1, so that the set of control values is A = [0, 1]. If α(t) = 0, then the engine is switched off and the spacecraft is in free fall, while α(t) = 1 means that the maximal thrust is used against gravity. As the fuel is burnt, the mass m(t) of the spacecraft diminishes over time and the rate of change is negatively proportional to α(t). According to Newton's second law, the motion of the spacecraft is governed by (the upward direction is the positive direction)

    m(t) ḧ(t) = −g m(t) + α(t),   ṁ(t) = −k α(t),   (1.1)

where the parameter k > 0 represents the velocity of the exhaust gases with respect to the spacecraft. Physical constraints such as h(t) ≥ 0 (spacecraft above the moon surface) and m(t) ≥ m_s (total mass larger than the empty spacecraft mass) will come up along the way. The target is to minimize the amount of fuel used or, equivalently, to maximize the remaining fuel at the landing instant. Hence, the purpose is to maximize, over the controls α, the payoff P(α) = m(τ), where τ is the first time when h(τ) = v(τ) = 0. In spite of its simple formulation, this is a fairly delicate control problem and a full solution requires great attention to apparently harmless details. The dynamics in (1.1) may be rewritten as the 3 × 3 system

    ḣ(t) = v(t),   v̇(t) = −g + α(t)/m(t),   ṁ(t) = −k α(t),   (1.2)

so that the phase space is R^3 but, in view of the different roles played by {h, v} on the one hand and by m on the other hand, most studies have been performed by projecting the dynamics on the plane {h, v}. From a physical point of view, one expects a safe landing to occur with a positive acceleration, able to slow down the free fall of the spacecraft: by looking at (1.1), this translates into the condition α(t) > gm(t), namely that the upward thrust overcomes the downward weight. This leads us to define the so-called safe landing curves, which steer the spacecraft safely to rest on the moon surface.
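To make the roles of h, v, m and α concrete, the following minimal sketch integrates the dynamics with a forward Euler scheme, assuming the first-order form ḣ = v, v̇ = −g + α/m, ṁ = −kα. The function name `simulate`, the step size and the default values g = k = 1 are illustrative assumptions, not taken from the paper.

```python
def simulate(h0, v0, m0, control, t_end, dt=1e-3, g=1.0, k=1.0):
    """Forward-Euler integration of h' = v, v' = -g + alpha/m, m' = -k*alpha.

    `control` maps a time t to a thrust level; it is clipped to [0, 1].
    Returns the state (h, v, m) at time t_end (up to the Euler error).
    """
    steps = round(t_end / dt)
    h, v, m = h0, v0, m0
    for i in range(steps):
        alpha = min(1.0, max(0.0, control(i * dt)))
        # Simultaneous update: the right-hand sides use the old state.
        h, v, m = h + dt * v, v + dt * (-g + alpha / m), m - dt * k * alpha
    return h, v, m


if __name__ == "__main__":
    # Free fall (engine off) followed by full thrust, in arbitrary units.
    print(simulate(10.0, 0.0, 0.5, lambda t: 0.0, 1.0))
    print(simulate(10.0, 0.0, 0.5, lambda t: 1.0, 0.1))
```

With α ≡ 0 the mass stays constant and the velocity decreases linearly; with α ≡ 1 and gm < 1 the velocity increases, which is the deceleration of the fall discussed above.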
As is quite usual in control problems, one is then led to represent these curves geometrically and, since a 3D representation in the space {h, v, m} may be fairly complicated, the strategy in the former literature was to depict the safe landing dynamics with a 2D representation in the plane {h, v}. Unfortunately, this strategy has a drawback: the description of the effective dynamics is incomplete. Due to the unknown value of m(τ), there exist infinitely many such curves and, to our knowledge, a complete analysis in this direction has not been performed; see, e.g., the monographs [1, p. 58], [2, Figure 4.10, p. 151], [3, Figure II.2, p. 36] and the bibliographies therein. This fact is highlighted in Section 2, where we fully describe the mutual position of these curves depending on the residual mass m(τ). Our 3D analysis emphasizes that they remain ordered even if projected onto the {h, v}-plane. In our opinion, this is remarkable because, even if one expects no intersection in the full space (uniqueness for the related Cauchy problem), in principle it could happen that two of the three components coincide: one could have the same {h, v}-components but a different m. The precise statement is contained in Theorem 1 and a different point of view is highlighted in Corollary 2.
Once the behavior of the safe landing curves is clarified, for the solution of the control problem we partially follow [4], see also [2,3], complementing the arguments therein with more precise information. The full solution is given, step by step, in Section 3 and the resulting statement is given in Theorem 5, complemented with several illustrative pictures.

Characterization of the safe landing curves
The dynamics (1.1) is described by the 3 × 3 system (1.2) with initial conditions

    h(0) = h_0,   v(0) = v_0,   m(0) = m_0,   (2.1)

where m_0 is the initial mass of fuel plus the mass m_s > 0 of the empty spacecraft. As already mentioned, the goal is to land on the moon safely, maximizing the remaining fuel m(τ) − m_s, where τ = τ(α) is the first time when h(τ) = v(τ) = 0. Since α = −ṁ/k, this is equivalent to minimizing the total applied thrust before landing; hence, the payoff is given by the total thrust ∫_0^τ α(t) dt, to be minimized.

As we shall see, a safe landing is possible only if the control α satisfies α(t) ≡ 1 in some left neighborhood of the final time τ. Taking this for granted, we analyze here the interval (t_*, τ) when the thrust is switched on (α = 1), so that the system (1.2) becomes

    ḣ(t) = v(t),   v̇(t) = −g + 1/m(t),   ṁ(t) = −k,   h(τ) = v(τ) = 0.

Note that we have here no condition on m(τ), which then plays the role of a parameter. The solution of this system is

    m(t) = m(τ) + k(τ − t),
    v(t) = g(τ − t) − (1/k) log(m(t)/m(τ)),
    h(t) = −(g/2)(τ − t)^2 − (τ − t)/k + (m(t)/k^2) log(m(t)/m(τ)).   (2.3)

The last two equations in (2.3) are parametric representations in the phase plane {v, h} of the safe landing curves. This is a delicate point since, while (2.3) is a 3-dimensional problem, the analysis in the plane {v, h} is 2-dimensional and, if one restricts to the plane, the dependence on m(t) somehow disappears. This is the point that we now clarify with a full analysis in the plane which is nevertheless able to take the mass into account through its residual amount m(τ).
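The closed form of a safe landing curve can be checked numerically. The sketch below is derived here by integrating ḣ = v, v̇ = −g + 1/m, ṁ = −k backward from h(τ) = v(τ) = 0, using the backward time σ = τ − t; the function name `safe_landing_state` and the values g = k = 1 are illustrative assumptions.

```python
import math


def safe_landing_state(sigma, m_tau, g=1.0, k=1.0):
    """State (h, v, m) at time tau - sigma on the full-thrust landing curve.

    m_tau is the mass at touchdown; sigma >= 0 is the time left before landing.
    """
    m = m_tau + k * sigma                 # fuel is burnt at rate k
    log_ratio = math.log(m / m_tau)
    v = g * sigma - log_ratio / k
    h = -0.5 * g * sigma**2 - sigma / k + m * log_ratio / k**2
    return h, v, m


if __name__ == "__main__":
    # Same touchdown instant, two different residual masses:
    # the (v, h) points differ, so the curve depends on m(tau).
    for m_tau in (0.2, 0.4):
        print(m_tau, safe_landing_state(0.1, m_tau))
```

Differentiating in σ gives dh/dσ = −v and dv/dσ = g − 1/m, which is the full-thrust system written in backward time.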
We assume that

    g m_s < 1.   (2.4)

From a physical point of view, condition (2.4) ensures that the spacecraft is able to decelerate its free fall (ḧ ≥ 0), at least when there is no fuel left. Clearly, this is a limit situation: if no fuel is left then the thrust cannot be switched on! In order to ensure that the spacecraft is able to decelerate the free fall at any time during its flight (that is, when m_s ≤ m(t) ≤ m_0) one needs the more restrictive assumption that

    g m_0 < 1.   (2.5)

Physically, condition (2.5) states that the spacecraft is able to decelerate its free fall from the very beginning, even when the tank is completely full of fuel, namely 0 < −g + 1/m_0 ≤ ḧ(t) for a.e. t. We will show that the (infinitely many) safe landing curves are determined by a single parameter, the mass of fuel remaining at the safe landing instant τ, namely m(τ) − m_s. If m(t_*) = m_0, since when α(t) ≡ 1 the fuel is used at rate k, we have that m(τ) = m_0 − k(τ − t_*).

Proof. Let us label all the related components of the two motions with an index i ∈ {1, 2}, and let m_i = m_i(τ_i) denote the two final masses. We perform the two changes of variable σ = τ_i − t and (2.3) becomes (for i = 1, 2)

    m_i(σ) = m_i + kσ,
    v_i(σ) = gσ − (1/k) log(1 + kσ/m_i),
    h_i(σ) = −(g/2)σ^2 − σ/k + ((m_i + kσ)/k^2) log(1 + kσ/m_i).   (2.8)

In turn, recalling that v̇_1, v̇_2 < 0, this means that the following implication holds Let us now prove the first statement in a neighborhood of σ = 0. Two Taylor expansions give

    v_i(σ) = −(1/m_i − g)σ + o(σ),   h_i(σ) = (1/2)(1/m_i − g)σ^2 + o(σ^2)   as σ → 0.

Hence, we may locally write the curves as functions and obtain

    h_i(v) = (m_i/(2(1 − g m_i))) v^2 + o(v^2)   as v → 0.

Analyzing the two resulting parabolas, we infer that the coefficient m_i/(2(1 − g m_i)) increases with m_i. Since m_1 > m_2, this proves the claimed order of the safe landing curves (2.3) in a neighborhood of the origin (v, h) = (0, 0). We still need to show that the same order remains forever (when α(t) ≡ 1). Assume for contradiction that the two curves intersect somewhere else than at the origin, that is, (v_1(σ_1), h_1(σ_1)) = (v_2(σ_2), h_2(σ_2)) for some σ_1, σ_2 > 0, where we understand that the σ_i's are the minimal instants where this occurs. Precisely because the σ_i's are minimal, this means that, prior to these instants, the mutual position of the two curves is the same as in a neighborhood of σ = 0.
Hence, for the two curves to meet, the tangent vectors must be directed in a given way. More precisely, the tangent vectors are given by and, for the two curves to intersect, it must be which contradicts (2.9). This contradiction shows that the two curves never meet and, hence, that the order is the same as in a neighborhood of σ = 0. This completes the proof of the first statement.
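Concerning the second statement, we notice that the parametric representation (2.8) enables us to determine the extremal points (v_i, h_i) ∈ R_− × R_+, attained at σ = (m_0 − m_i)/k, as follows:

    v_i = g(m_0 − m_i)/k − (1/k) log(m_0/m_i),
    h_i = −g(m_0 − m_i)^2/(2k^2) − (m_0 − m_i)/k^2 + (m_0/k^2) log(m_0/m_i).   (2.12)

By dropping the denominator k^2 and the additive constant −m_0, we analyze the behavior of h_i through the function

    m ↦ m + m_0 log(m_0/m) − (g/2)(m_0 − m)^2.

This completes the proof.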
The result in Theorem 1 is illustrated in Figure 2 where, in the plane {v, h}, we plot the curves given by (2.8)_2-(2.8)_3. The curve below (on the left) describes (v_2, h_2), the curve above (on the right) describes (v_1, h_1). Incidentally, we observe that (2.10) fully characterizes the (horizontal) asymptotic behavior at (0, 0) of the safe landing curves in Figure 2. For later use, we rephrase Theorem 1 in a slightly different form.
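The ordering of the projected curves can be probed numerically. The following sketch compares, at the same velocity v, the heights of two full-thrust landing curves with final masses m_1 > m_2. It assumes g m_0 < 1, so that the velocity is strictly decreasing in the backward time σ and can be inverted by bisection; the closed-form curve is derived from the dynamics, and the names `curve` and `height_at_velocity` as well as the numerical values are illustrative assumptions. A few sample points are of course no substitute for the proof.

```python
import math


def curve(sigma, m_tau, g=1.0, k=1.0):
    """Point (v, h) of the landing curve with final mass m_tau, sigma before landing."""
    log_ratio = math.log(1.0 + k * sigma / m_tau)
    v = g * sigma - log_ratio / k
    h = -0.5 * g * sigma**2 - sigma / k + (m_tau + k * sigma) * log_ratio / k**2
    return v, h


def height_at_velocity(v_target, m_tau, m0, g=1.0, k=1.0):
    """Height of the curve at velocity v_target, found by bisection on sigma."""
    lo, hi = 0.0, (m0 - m_tau) / k        # hi: the mass reaches m0
    if not curve(hi, m_tau, g, k)[0] <= v_target <= 0.0:
        raise ValueError("velocity outside the range of this curve")
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if curve(mid, m_tau, g, k)[0] > v_target:
            lo = mid                      # not negative enough: go further back
        else:
            hi = mid
    return curve(0.5 * (lo + hi), m_tau, g, k)[1]


if __name__ == "__main__":
    m0, m1, m2 = 0.5, 0.4, 0.2            # g*m0 = 0.5 < 1
    for v in (-0.02, -0.06, -0.12):
        print(v, height_at_velocity(v, m1, m0), height_at_velocity(v, m2, m0))
```

In these samples the curve with the larger residual mass lies above the other one, in agreement with the description of Figure 2.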
Corollary 2. Assume that (2.5) holds and let Then h > 0 for any m ∈ [m_s, m_0) and there exist negative and strictly decreasing functions The statement of Corollary 2 is qualitatively described in Figure 3. its inverse map. Subsequently, we define the (strictly decreasing) composite function

In order to complete the description of the safe landing curves, we merely assume (2.4) and not necessarily (2.5), and we study the parametric representation (2.8) by extending the time interval to σ ∈ [0, ∞). Let us start from the origin (σ = 0) and move clockwise, as σ increases. From (2.10) we know that, at the origin, the curve starts with horizontal tangent and behaves like a parabola. We numerically obtained the qualitative plot represented in Figure 4. The point with vertical tangent corresponds to v̇(σ) = 0 which, recalling (1.1) and that α = 1, yields gm(σ) = 1. Under the more stringent (physical) assumption that (2.5) holds, this point is not attained and the curve has the plot as in Figures 2 and 3, as already stated in Corollary 2. But if we merely assume (2.4), then the point with vertical tangent exists and, as mentioned above, it represents the instant at which gm(σ) = 1. Above this point the thrust is not able to balance the weight gm(σ) and the spacecraft proceeds with negative (downwards) acceleration until the fuel consumption reduces the total mass to m = 1/g.
Let us continue moving upwards (clockwise) on the curve in Figure 4. By recalling that ḣ = −v, it is clear that the curve has horizontal tangent at the very same place where it crosses the vertical axis. Finally, less interesting is the part in the half plane v > 0, because there is no need to use the thrust therein and, hence, (2.8) does not represent the dynamics of the spacecraft any more.
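The location of the point with vertical tangent can be made explicit. In the sketch below, which assumes the full-thrust closed form for the velocity in the backward time σ, the mass is m(τ) + kσ, so the condition gm(σ) = 1 gives σ = (1/g − m(τ))/k. The function names and the values g = k = 1, m(τ) = 0.2 are illustrative assumptions.

```python
import math


def velocity(sigma, m_tau, g=1.0, k=1.0):
    """Velocity sigma time units before touchdown under full thrust."""
    return g * sigma - math.log(1.0 + k * sigma / m_tau) / k


def vertical_tangent_sigma(m_tau, g=1.0, k=1.0):
    """Backward time at which the mass m_tau + k*sigma equals 1/g."""
    if g * m_tau >= 1.0:
        raise ValueError("requires g * m_tau < 1")
    return (1.0 / g - m_tau) / k


if __name__ == "__main__":
    s = vertical_tangent_sigma(0.2)
    # The velocity is most negative exactly where the thrust balances the weight.
    print(s, velocity(s, 0.2), velocity(s - 0.01, 0.2), velocity(s + 0.01, 0.2))
```

This point belongs to the physically relevant part of the curve only if the mass 1/g is below m_0, that is, when (2.5) fails.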

Solution of the minimal fuel consumption problem
Let us define what is meant by admissible control and data for (1.2). As already mentioned in the introduction, for physical reasons we are interested in solutions satisfying the state constraints m(t) ≥ m_s and h(t) ≥ 0, for any t ∈ [0, τ). While the former is guaranteed by imposing (3.1), the latter will be deduced a posteriori by our analysis. As outlined in Section 2, a spacecraft that is too heavy cannot land safely, implying the further constraint (2.5). We prove below that, in order to land safely, the control α(t) should be 1 for any t ∈ (t_*, τ] and some t_* ∈ [0, τ). Notice that (2.5) is a sufficient condition to have h(t) > 0 in (t_*, τ), whenever h(τ) = v(τ) = 0. The condition gm(τ) ≥ 1 would imply that gm(t) > 1 in (t_*, τ), hence v̇(t) < 0 and, in case of soft landing, v(t) > 0, yielding h(t) < 0. Since (1.2) and (3.1) ensure that m(τ) ≥ m_s, if gm_s ≥ 1 then the only admissible initial data are the trivial ones, (h_0, v_0) = (0, 0). Therefore, (2.4) is unavoidable if one aims to analyze the nontrivial dynamics of (1.2).
In order to obtain an optimal control for the problem, we use the Pontryagin Minimum Principle. Given an optimal solution (h, v, m, α), there exist and such that p = (p_1, p_2, p_3) solves the adjoint equation and satisfies the transversality condition p(τ) ∈ N_C = R × R × (−∞, 0] (the normal cone at the endpoint). This means that We consider the Hamiltonian which vanishes along the optimal trajectory, that is, moreover, the optimal control (if any!) fulfills the following minimality condition: (3.6) In (3.6) we emphasized the part depending on a, which has to be minimized. For the determination of the optimal control, we proceed in several steps.
• Step 1: if it exists, the optimal control is bang-bang with at most one switch and α(τ) = 1. This step is quite standard, see the original paper by Meditch [4] or [2,Section 4.7]. Here we suggest a slightly simplified form, by reducing the number of cases to be analyzed and focusing mainly on the control α. Consider the function so that the minimality condition (3.6) characterizes the optimal control α through the rule showing that the optimal control is bang-bang with at most one switch, provided that the "ambiguous situation" where ρ(t) = 0 occurs at most in one point.
To conclude, we use again (3.8): since we excluded ρ(t) ≡ 0, we see that ρ has at most one zero at some t_* ∈ (0, τ). By (3.7), this shows that α has at most one switch and, since α(τ) = 1, if this switch exists it is from 0 to 1. This proves the claim of Step 1.
• Step 2: explicit form of the solution of (1.2) if the optimal control exists. From Step 1 we know that, if the optimal control exists, then it has at most one switch on time t_* ∈ (0, τ). In the interval (0, t_*), when the thrust is switched off (α = 0, with the spacecraft in free fall), the system (1.2), with initial conditions (2.1), has the solution

    h(t) = h_0 + v_0 t − (g/2) t^2,   v(t) = v_0 − g t,   m(t) = m_0.   (3.9)

This yields the constraint

    h(t) + v(t)^2/(2g) = h_0 + v_0^2/(2g),   (3.10)

whose representation in the phase plane {v, h} is a concave parabola having its vertex at the point (v, h) = (0, h_0 + v_0^2/(2g)), see the left picture in Figure 5. Therefore, if the optimal control α exists, then the solution (h, v, m) of (1.2)-(2.1) is given by (3.9) and (2.3) in which t_* < τ, possibly t_* = 0 (no switch, thrust always switched on). These lines are represented (separately) in Figure 5.

• Step 3: geometric interpretation of the existence of a switch on time. If (2.5) is satisfied, the components (v, h) of the optimal trajectories behave as in Figure 5: the engine is switched off until the free fall trajectory reaches a safe landing curve, and this happens at a point of the switching curve. In other words, there exist t_* ≥ 0 and m ∈ [m_s, m_0) such that (v(t_*), h(t_*)) = (v, Γ_{m_0}(v)) for some v ∈ [v_{m_0}(m_s), 0), see (2.13). Then the engine is switched on and the optimal trajectory follows the safe landing curve (2.8) corresponding to a final fuel mass equal to m − m_s, that is, m(τ) = m.
In the extreme situation where v = v_{m_0}(m_s), the safe landing occurs with no fuel left. This means that (v(t_*), h(t_*)) = (v_{m_0}(m_s), Γ_{m_0}(v_{m_0}(m_s))) and this is the corner point of the black line in the left picture of Figure 6: when the spacecraft reaches this point the thrust must be switched on and safe landing is possible but with no fuel left after landing. The part of the black line connecting the corner point and the origin is called the extremal curve and is the trajectory in the (v, h)-plane followed by the spacecraft when landing with no residual fuel. The switching curve (where h = Γ_{m_0}(v)) has the same endpoints as the extremal curve: it is depicted with a thinner line, see also the zoomed picture on the right of Figure 6. The gray region outside this line represents the crash region: if (v(t), h(t)) reaches this region (including at the initial instant!), then the thrust is not powerful enough to allow safe landing because the velocity v(t) is too negative compared to the (low) altitude h(t). Indeed, if the spacecraft is, e.g., initially at height 1 m and the initial velocity is −1000 km/h (downwards), there is no way to land safely. This is a very delicate part, where the 2D pictures in Figure 6 may be misleading. The points between the extremal curve and the switching curve are not admissible initial values for (v_0, h_0)! The reason is that they are only projections of the orbits in the 3D space. Finally, if the initial data are within the white region, then the spacecraft cannot land safely because of the lack of fuel: the corresponding parabola intersects no safe landing curve.
Apart from the extremal case, the intermediate situations are described analytically in Theorem 1 and geometrically in the right picture of Figure 6. The switching curve, on the right of the extremal curve (landing with no fuel left), contains all the switch on points for α. Each point on this curve is the intersection between a concave parabola and the corresponding safe landing curve, with some positive amount of fuel left at the landing time τ.
• Step 4: analytic formulation of the geometric pattern. Aiming to make the geometric observations in Step 3 fully rigorous, we recall that the (lower) boundary of the "safe landing region" is the switching curve whose equation is h = Γ_{m_0}(v) for v ∈ [v_{m_0}(m_s), 0), see (2.13) and Figure 6. Its extremal point (v_{m_0}(m_s), Γ_{m_0}(v_{m_0}(m_s))) may also be expressed through the parametric representation of the switching curve so that, if t_* denotes the switching instant, we have

    v(t_*) = g(m_0 − m_s)/k − (1/k) log(m_0/m_s),
    h(t_*) = −g(m_0 − m_s)^2/(2k^2) − (m_0 − m_s)/k^2 + (m_0/k^2) log(m_0/m_s).   (3.11)

It is clear, and fully apparent from (3.11), that the corner point depends on m_0. With this extremal point at hand, we seek the "critical parabola", namely the set of initial data from which the free fall trajectory, following (3.10), reaches the point in (3.11) at some time t_* > 0. This gives that

    h_0 + v_0^2/(2g) = h(t_*) + v(t_*)^2/(2g) = −(m_0 − m_s)/k^2 + (1/(2gk^2)) log^2(m_0/m_s) + (m_s/k^2) log(m_0/m_s)   (3.12)

represents the condition for the parabola to intersect the switching curve at the lack of fuel extremal point (3.11). But only the part of the parabola on the right of the switching curve contains the initial data allowing for a safe landing.
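The corner point and the level of the critical parabola can be cross-checked numerically. The sketch below parametrizes the switching curve by the final mass m: starting full thrust with mass m_0 and landing with mass m takes a burn of duration (m_0 − m)/k. The closed forms are derived here from the full-thrust dynamics; the names `switching_point` and `critical_level` and the values g = k = 1, m_0 = 0.5, m_s = 0.2 are illustrative assumptions.

```python
import math


def switching_point(m, m0, g=1.0, k=1.0):
    """Point (v, h) where full thrust must start, with mass m0, to land with mass m."""
    sigma = (m0 - m) / k                  # duration of the final burn
    log_ratio = math.log(m0 / m)
    v = g * sigma - log_ratio / k
    h = -0.5 * g * sigma**2 - sigma / k + m0 * log_ratio / k**2
    return v, h


def critical_level(m0, m_s, g=1.0, k=1.0):
    """Value of h + v^2/(2g) on the free-fall parabola through the corner point."""
    log_ratio = math.log(m0 / m_s)
    return (-(m0 - m_s) / k**2
            + log_ratio**2 / (2.0 * g * k**2)
            + m_s * log_ratio / k**2)


if __name__ == "__main__":
    v_c, h_c = switching_point(0.2, 0.5)  # corner point: landing with no fuel left
    print(v_c, h_c, h_c + v_c**2 / 2.0, critical_level(0.5, 0.2))
```

The last two printed numbers should agree up to rounding, since the free-fall invariant h + v^2/(2g) evaluated at the corner point is exactly the level of the critical parabola.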
There are three different situations where the optimal control does not exist and the spacecraft cannot land safely: (i) the thrust is too weak, unable to stop the free fall; (ii) there is not enough fuel to stop the spacecraft; (iii) the spacecraft is too close to the moon surface and/or has too large a downward velocity. The situation described by (i) occurs if (2.5) is violated, which means that (2.4) is a necessary condition for the existence of initial data allowing a safe landing, see Step 2. Formally, this leads to our first conclusion:

    if gm_s ≥ 1, then the only admissible initial data are the trivial ones, (h_0, v_0) = (0, 0).

If (2.5) is satisfied, then from (2.13) and (3.12) we derive the second conclusion:

    if (v_0, h_0) lies on the part of the critical parabola on the right of the switching curve, that is, v_0 ≥ v_{m_0}(m_s) and h_0 + v_0^2/(2g) = −(m_0 − m_s)/k^2 + (1/(2gk^2)) log^2(m_0/m_s) + (m_s/k^2) log(m_0/m_s), then safe landing is possible.
In this case, there exists a unique admissible control which is then optimal; these initial data lie either on the switching curve or on the concave parabola bounding the safe landing region, see the left picture of Figure 6.
Let us then see what happens when (2.5) is satisfied but (3.12) does not hold, namely when the first term in (3.12) differs from the third term. If the inequality > holds in (3.12) (either v_0 is "too negative" or h_0 is "too large"), then the spacecraft cannot land safely on the moon, due to a lack of fuel. The related parabola intersects none of the safe landing curves and the spacecraft ends again in the gray (crash) part. This gives the third conclusion:

    if h_0 + v_0^2/(2g) > −(m_0 − m_s)/k^2 + (1/(2gk^2)) log^2(m_0/m_s) + (m_s/k^2) log(m_0/m_s), then safe landing is impossible,

which means, in fact, that there are no admissible controls.
If the inequality < holds in (3.12) (meaning that the initial data are placed below the critical parabola), two subcases may occur. If v_0 ≤ v_{m_0}(m_s), or v_0 > v_{m_0}(m_s) and h_0 < Γ_{m_0}(v_0) (gray region in the left picture in Figure 6), then the spacecraft cannot land safely because it is either too close to the moon or moving too quickly downwards. This gives the fourth conclusion:

    if v_0 ≤ v_{m_0}(m_s), or v_0 > v_{m_0}(m_s) and h_0 < Γ_{m_0}(v_0), then safe landing is impossible,

which means that, also in this case, there are no admissible controls. In the second subcase where < holds in (3.12) we are in the safe landing region (yellow part in the left picture in Figure 6): then the spacecraft may land safely after one switch on time, which gives the last conclusion:

    if < holds in (3.12), v_0 > v_{m_0}(m_s) and h_0 > Γ_{m_0}(v_0), then safe landing is possible and the optimal control has exactly one switch on time.
In this case, there exist infinitely many admissible controls and the optimal one is bang-bang with a unique switch on time. This statement suggests defining the abscissa v > 0 of the intersection between the critical parabola and the v-axis.
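The case distinction above can be summarized in a small classifier. This is only a sketch under several assumptions: it requires g m_0 < 1, it evaluates the switching curve height Γ by bisection on the final mass (the velocity of the switching point increases with that mass), it treats data with v_0 ≥ 0 below the critical parabola as safe, and it ignores the measure-zero boundary cases where equalities hold. The function names, the region labels and the numerical values are illustrative, not taken from the paper.

```python
import math


def switching_point(m, m0, g, k):
    """Point (v, h) of the switching curve for final mass m (closed form, derived)."""
    sigma = (m0 - m) / k
    log_ratio = math.log(m0 / m)
    v = g * sigma - log_ratio / k
    h = -0.5 * g * sigma**2 - sigma / k + m0 * log_ratio / k**2
    return v, h


def gamma(v, m0, m_s, g, k):
    """Height of the switching curve at a velocity v between the corner value and 0."""
    lo, hi = m_s, m0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if switching_point(mid, m0, g, k)[0] < v:
            lo = mid                      # velocity too negative: larger final mass
        else:
            hi = mid
    return switching_point(0.5 * (lo + hi), m0, g, k)[1]


def classify(h0, v0, m0, m_s, g=1.0, k=1.0):
    """Region of the (v, h)-plane containing the initial data (h0 >= 0 assumed)."""
    if g * m0 >= 1.0:
        raise ValueError("this sketch assumes g * m0 < 1")
    v_c, h_c = switching_point(m_s, m0, g, k)       # corner point
    level = h0 + v0 * v0 / (2.0 * g)                # free-fall invariant
    if level > h_c + v_c * v_c / (2.0 * g):
        return "insufficient fuel"
    if v0 <= v_c:
        return "crash"
    if v0 < 0.0 and h0 < gamma(v0, m0, m_s, g, k):
        return "crash"
    return "safe landing"


if __name__ == "__main__":
    for h0, v0 in ((0.1, 0.0), (1.0, 0.0), (0.001, -0.3), (0.01, -0.7)):
        print((h0, v0), classify(h0, v0, 0.5, 0.2))
```

The three labels correspond to the white, gray and yellow regions described for Figure 6.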
In summary, the above steps and conclusions prove the following statement.

The statement of Theorem 5 is summarized in Figure 6, which describes the different regions for a given m_0 > m_s. The switch points belong to the switching curve, namely the part of the boundary connecting the (extremal) corner point with the origin, see the right picture in Figure 6. By increasing m_0, the switching curve is prolonged along the separation between the insufficient fuel and crash regions (white and gray) and the related extremal parabola is shifted upwards.
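As a final consistency check, the bang-bang strategy with one switch can be simulated end to end. The sketch below assumes initial data in the safe landing region and g m_0 < 1. It finds the final mass by bisection, matching the free-fall invariant h + v^2/(2g) with its value on the switching curve (which decreases as the final mass grows), then integrates the free fall and the full-thrust phase with a classical Runge-Kutta scheme. All names and numerical values are illustrative assumptions.

```python
import math


def switching_point(m, m0, g, k):
    """Point (v, h) of the switching curve for final mass m (closed form, derived)."""
    sigma = (m0 - m) / k
    log_ratio = math.log(m0 / m)
    return (g * sigma - log_ratio / k,
            -0.5 * g * sigma**2 - sigma / k + m0 * log_ratio / k**2)


def final_mass(h0, v0, m0, m_s, g, k):
    """Final mass whose switching point lies on the free-fall parabola of (h0, v0)."""
    target = h0 + v0 * v0 / (2.0 * g)
    lo, hi = m_s, m0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        v, h = switching_point(mid, m0, g, k)
        if h + v * v / (2.0 * g) > target:
            lo = mid                      # level too high: a larger final mass is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rk4(state, alpha, duration, steps, g, k):
    """Integrate h' = v, v' = -g + alpha/m, m' = -k*alpha with constant alpha."""
    def f(s):
        _, v, m = s
        return (v, -g + alpha / m, -k * alpha)
    dt = duration / steps
    for _ in range(steps):
        k1 = f(state)
        k2 = f(tuple(x + 0.5 * dt * d for x, d in zip(state, k1)))
        k3 = f(tuple(x + 0.5 * dt * d for x, d in zip(state, k2)))
        k4 = f(tuple(x + dt * d for x, d in zip(state, k3)))
        state = tuple(x + dt * (a + 2 * b + 2 * c + d) / 6.0
                      for x, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state


def fly(h0, v0, m0, m_s, g=1.0, k=1.0, steps=2000):
    """Return the switch time and the state reached at the predicted landing time."""
    m_end = final_mass(h0, v0, m0, m_s, g, k)
    v_switch, _ = switching_point(m_end, m0, g, k)
    t_switch = (v0 - v_switch) / g        # free fall: v(t) = v0 - g*t
    state = rk4((h0, v0, m0), 0.0, t_switch, steps, g, k)
    state = rk4(state, 1.0, (m0 - m_end) / k, steps, g, k)
    return t_switch, state


if __name__ == "__main__":
    print(fly(0.1, 0.0, 0.5, 0.2))
```

For data in the safe landing region, the returned height and velocity should vanish up to the integration error, while the returned mass stays strictly between m_s and m_0.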