Stochastic analysis of Bernoulli processes

These notes survey some aspects of discrete-time chaotic calculus and its applications, based on the chaos representation property for i.i.d. sequences of random variables. The topics covered include the Clark formula and predictable representation, anticipating calculus, covariance identities and functional inequalities (such as deviation and logarithmic Sobolev inequalities), and an application to option hedging in discrete time.


Introduction
Stochastic analysis can be viewed as an infinite-dimensional version of classical analysis, developed in relation to stochastic processes.
In this survey we present a construction of the basic operators of stochastic analysis (gradient and divergence) in discrete time for Bernoulli processes. Our presentation is based on the chaos representation property and discrete multiple stochastic integrals with respect to i.i.d. sequences of random variables. The main applications presented are to functional inequalities (deviation inequalities, logarithmic Sobolev inequalities) in discrete settings, cf. [10,16,23], and to option pricing and hedging in discrete time mathematical finance.
This survey can be roughly divided into a first part (Sections 2 to 11) in which we present the main basic results and analytic tools, and a second part (Sections 12 to 15) which is devoted to applications.
We proceed as follows. In Section 2 we consider a family of discrete-time normal martingales. The next section is devoted to the construction of the stochastic integral of predictable square-integrable processes with respect to such martingales. In Section 4 we construct the associated multiple stochastic integrals of symmetric functions on N n , n ≥ 1. Starting with Section 5 we focus on a particular class of normal martingales satisfying a structure equation. The chaos representation property is studied in Section 6 in the case of discrete time random walks with independent increments. A gradient operator D acting by finite differences is introduced in Section 7 in connection with multiple stochastic integrals, and used in Section 8 to state a Clark predictable representation formula. The divergence operator δ, adjoint of D, is presented in Section 9 as an extension of the discrete-time stochastic integral. It is also used in Section 10 to express the generator of the Ornstein-Uhlenbeck process. Covariance identities are stated in Section 11, both from the Clark representation formula and by use of the Ornstein-Uhlenbeck semigroup.
Functional inequalities on Bernoulli space are presented as an application in Sections 12 and 13. On the one hand, in Section 12 we prove several deviation inequalities for functionals of an infinite number of i.i.d. Bernoulli random variables. Then in Section 13 we state different versions of the logarithmic Sobolev inequality in discrete settings (modified, L 1 , sharp) which allow one to control the entropy of random variables. In particular we recover and extend some results of [5], using the method of [10]. Our approach is based on the intrinsic tools (gradient, divergence, Laplacian) of infinite-dimensional stochastic analysis. We refer to [4,3,17,20], for other versions of logarithmic Sobolev inequalities in discrete settings, and to [7,28] for the Poisson case. Section 14 contains a change of variable formula in discrete time, which is applied with the Clark formula in Section 15 to a derivation of the Black-Scholes formula in discrete time, i.e. in the Cox-Ross-Rubinstein model, see e.g. [19], §15-1 of [27], or [24], for other approaches.

Discrete-time normal martingales
Consider a sequence (Y_k)_{k∈N} of (not necessarily independent) random variables on a probability space (Ω, F, P). Let (F_n)_{n≥−1} denote the filtration generated by (Y_n)_{n∈N}, i.e. F_{−1} = {∅, Ω} and F_n = σ(Y_0, . . . , Y_n), n ≥ 0.
Recall that a random variable F is said to be F_n-measurable if it can be written as a function F = f_n(Y_0, . . . , Y_n) of Y_0, . . . , Y_n, where f_n : R^{n+1} → R.
Assumption 2.1 We make the following assumptions on the sequence (Y_n)_{n∈N}:
a) it is conditionally centered: E[Y_n | F_{n−1}] = 0, n ≥ 0, (2.1)
b) its conditional variance is equal to one: E[Y_n² | F_{n−1}] = 1, n ≥ 0.
Condition (2.1) implies that the process (Y 0 + · · · + Y n ) n≥0 is an F n -martingale. More precisely, the sequence (Y n ) n∈N and the process (Y 0 + · · · + Y n ) n≥0 can be viewed respectively as a (correlated) noise and as a normal martingale in discrete time.
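The martingale property implied by (2.1) can be made concrete by exhaustive enumeration. The following sketch (not from the text) assumes the standard two-point noise Y_k built from i.i.d. {−1, 1}-valued X_k with P(X_k = 1) = p, as used in the later sections:

```python
from itertools import product

# Sketch (assumed construction): i.i.d. X_k in {-1, 1} with P(X_k = 1) = p and
# the normalized noise Y_k = (q 1_{X_k=1} - p 1_{X_k=-1}) / sqrt(p q).
p, q, N = 0.3, 0.7, 3

def Y(x):
    return (q if x == 1 else -p) / (p * q) ** 0.5

def prob(w):  # P(X_0 = w[0], ..., X_{n-1} = w[n-1])
    r = 1.0
    for x in w:
        r *= p if x == 1 else q
    return r

# Conditional centering (2.1): E[Y_n | F_{n-1}] = 0, here by independence.
cond = sum(prob((x,)) * Y(x) for x in (-1, 1))
assert abs(cond) < 1e-12

# Hence M_n = Y_0 + ... + Y_n is a martingale; in particular E[M_{N-1}] = 0.
EM = sum(prob(w) * sum(Y(x) for x in w) for w in product((-1, 1), repeat=N))
assert abs(EM) < 1e-12
```

Here (2.1) holds trivially because the X_k are independent; the enumeration is only meant to make the martingale property of the partial sums concrete.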

Discrete stochastic integrals
In this section we construct the discrete stochastic integral of predictable square-summable processes with respect to a discrete-time normal martingale.
Definition 3.1 Let (u_k)_{k∈N} be a uniformly bounded sequence of random variables with finite support in N, i.e. there exists N ≥ 0 such that u_k = 0 for all k ≥ N. The stochastic integral J(u) of (u_n)_{n∈N} is defined as
J(u) = Σ_{k=0}^∞ u_k Y_k.

N. Privault/Stochastic analysis of Bernoulli processes 438
The next proposition states a version of the Itô isometry in discrete time. A sequence (u_n)_{n∈N} of random variables is said to be F_n-predictable if u_n is F_{n−1}-measurable for all n ∈ N; in particular u_0 is constant in this case.
Proposition 3.2 The stochastic integral operator J extends to square-integrable predictable processes (u_n)_{n∈N} ∈ L²(Ω × N) via the (conditional) isometry formula (3.3).
Proof. This proves the isometry property (3.3) for J. The extension to L²(Ω × N) then follows from a Cauchy sequence argument: consider a sequence of bounded predictable processes with finite support converging to u in L²(Ω × N), for example a sequence (uⁿ)_{n∈N} of truncations of u. Then (J(uⁿ))_{n∈N} is a Cauchy sequence and converges in L²(Ω), hence we may define J(u) := lim_{n→∞} J(uⁿ).
From the isometry property (3.3) applied with n = 0, the limit is clearly independent of the choice of the approximating sequence.
Note that by bilinearity, (3.3) can also be written in polarized form, and for n = 0 it yields the isometry (3.4):
E[J(u)J(v)] = E[ Σ_{k=0}^∞ u_k v_k ]
for all square-integrable predictable processes u = (u_k)_{k∈N} and v = (v_k)_{k∈N}.
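The unconditional isometry can be verified exhaustively. The sketch below assumes, for simplicity, the symmetric case p = q = 1/2 in which Y_k = X_k, and a hypothetical predictable integrand:

```python
from itertools import product

# Sketch (assumed setting): symmetric case p = q = 1/2, so Y_k = X_k, and a
# predictable process u_k depending only on X_0, ..., X_{k-1}.
N = 3
omegas = list(product((-1, 1), repeat=N))
P = 1.0 / len(omegas)  # uniform measure on {-1,1}^N

def u(k, w):  # predictable: uses only w[0..k-1]
    return 1.0 if k == 0 else float(w[k - 1])

def J(w):  # J(u) = sum_k u_k Y_k with Y_k = X_k
    return sum(u(k, w) * w[k] for k in range(N))

lhs = sum(P * J(w) ** 2 for w in omegas)                              # E[J(u)^2]
rhs = sum(P * sum(u(k, w) ** 2 for k in range(N)) for w in omegas)    # E[sum u_k^2]
assert abs(lhs - rhs) < 1e-12
```

The cross terms E[u_j u_k Y_j Y_k], j < k, vanish because Y_k is centered and independent of the F_{k−1}-measurable factor, which is exactly the content of the isometry.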
Proof. It is sufficient to note that E[u_k Y_k | F_n] = 0 for all k > n.
Corollary 3.6 The indefinite stochastic integral (J(u 1_{[0,k]}))_{k∈N} is a discrete-time martingale with respect to (F_n)_{n≥−1}.
Proof. We have

Discrete multiple stochastic integrals
The role of multiple stochastic integrals in the orthogonal expansions of random variables is similar to that of polynomials in the series expansions of functions of a real variable. In some cases, multiple stochastic integrals can be expressed using polynomials, for example Krawtchouk polynomials in the symmetric discrete case with p_n = q_n = 1/2, n ∈ N, see Relation (6.2) below.
Given f_1 ∈ ℓ²(N) we let
J_1(f_1) = Σ_{k=0}^∞ f_1(k) Y_k.
As a convention we identify ℓ²(N⁰) with R and let J_0(f_0) = f_0, f_0 ∈ R. The following proposition gives the definition of multiple stochastic integrals by iterated stochastic integration of predictable processes in the sense of Proposition 3.2.
It satisfies the recurrence relation (4.3) and the isometry formula (4.4).
Proof. When n < m and (i_1, . . . , i_n) ∈ Δ_n and (j_1, . . . , j_m) ∈ Δ_m are two sets of indices with 0 ≤ i_1 < · · · < i_n and 0 ≤ j_1 < · · · < j_m, there necessarily exists k ∈ {1, . . . , m} such that j_k ∉ {i_1, . . . , i_n}, and this implies the orthogonality of J_n(f_n) and J_m(g_m). The recurrence relation (4.3) is a direct consequence of (4.5). The isometry property (4.4) of J_n also follows by induction from (3.3) and the recurrence relation.
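The orthogonality of chaoses of different orders and the isometry can be verified by enumeration. The sketch below assumes the symmetric case Y_k = X_k and the convention J_n(f) = n! Σ_{i_1<···<i_n} f(i_1, . . . , i_n) Y_{i_1} · · · Y_{i_n} (both assumptions of this example):

```python
from itertools import product, combinations
import math

# Sketch (assumed conventions): symmetric case Y_k = X_k, and
# J_n(f) = n! * sum_{i_1 < ... < i_n} f(i_1,...,i_n) Y_{i_1} ... Y_{i_n}.
N = 4
omegas = list(product((-1, 1), repeat=N))
P = 1.0 / len(omegas)

g1 = [0.5, -1.0, 2.0, 0.25]                                    # first-order kernel
f2 = {(i, j): float((i + 1) * (j + 2)) for i, j in combinations(range(N), 2)}

def J1(w):
    return sum(g1[i] * w[i] for i in range(N))

def J2(w):
    return math.factorial(2) * sum(f2[i, j] * w[i] * w[j]
                                   for i, j in combinations(range(N), 2))

E = lambda F: sum(P * F(w) for w in omegas)

orth = E(lambda w: J1(w) * J2(w))  # chaoses of different orders are orthogonal
# isometry: E[J2^2] = (2!)^2 * sum_{i<j} f2(i,j)^2, i.e. the kernel is summed
# over all orderings of its (distinct) indices
iso_err = E(lambda w: J2(w) ** 2) - 4 * sum(v ** 2 for v in f2.values())
assert abs(orth) < 1e-9
assert abs(iso_err) < 1e-9
```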

Discrete structure equations
Assume now that the sequence (Y_n)_{n∈N} satisfies the discrete structure equation
Y_n² = 1 + ϕ_n Y_n, n ∈ N, (5.1)
where (ϕ_n)_{n∈N} is an F_{n−1}-measurable process; hence the hypotheses of the preceding sections are satisfied. Since (5.1) is a second-order equation, Y_n takes at most two values given F_{n−1}, and there exists an F_n-adapted process (X_n)_{n∈N} of Bernoulli {−1, 1}-valued random variables determining Y_n. Consider the conditional probabilities
p_n = P(X_n = 1 | F_{n−1}) and q_n = P(X_n = −1 | F_{n−1}), n ∈ N. (5.3)
From the relation E[Y_n | F_{n−1}] = 0 and (5.1) we get
Y_n = √(q_n/p_n) 1_{{X_n = 1}} − √(p_n/q_n) 1_{{X_n = −1}}, n ∈ N, (5.5)
and ϕ_n = (q_n − p_n)/√(p_n q_n), n ∈ N. In the symmetric case p_k = q_k = 1/2, k ∈ N, the image measure of P under the mapping ω ↦ X(ω) is the Lebesgue measure on [0, 1]; see [26] for the nonsymmetric case.
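Assuming the standard two-point form of Y_n and of ϕ_n (assumptions of this sketch, since the displayed relations are implicit in the text), the structure equation and the normalization of Y_n can be checked directly:

```python
# Sketch (assumed formulas): with X_n in {-1, 1},
#   Y_n = sqrt(q/p) on {X_n = 1},  Y_n = -sqrt(p/q) on {X_n = -1},
#   phi_n = (q - p) / sqrt(p q),
# check pointwise that Y_n^2 = 1 + phi_n Y_n, with E[Y_n] = 0, E[Y_n^2] = 1.
checks = []
for p in (0.1, 0.25, 0.5, 0.9):
    q = 1.0 - p
    Yp = (q / p) ** 0.5          # value of Y_n on {X_n = +1}
    Ym = -((p / q) ** 0.5)       # value of Y_n on {X_n = -1}
    phi = (q - p) / (p * q) ** 0.5
    checks.append(abs(Yp ** 2 - (1 + phi * Yp)) < 1e-12)       # structure equation
    checks.append(abs(Ym ** 2 - (1 + phi * Ym)) < 1e-12)
    checks.append(abs(p * Yp + q * Ym) < 1e-12)                # E[Y_n] = 0
    checks.append(abs(p * Yp ** 2 + q * Ym ** 2 - 1) < 1e-12)  # E[Y_n^2] = 1
assert all(checks)
```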

Chaos representation
From now on we assume that the sequence (p_k)_{k∈N} defined in (5.3) is deterministic, which implies that the random variables (X_n)_{n∈N} are independent. Precisely, X_n will be constructed as the canonical projection X_n : Ω → {−1, 1} on Ω = {−1, 1}^N, under the measure P given on cylinder sets by
P({ǫ_0, . . . , ǫ_n} × {−1, 1}^N) = Π_{k=0}^n p_k^{(1+ǫ_k)/2} q_k^{(1−ǫ_k)/2}, ǫ_0, . . . , ǫ_n ∈ {−1, 1}.
The sequence (Y_k)_{k∈N} can be constructed as a family of independent random variables given by
Y_k = √(q_k/p_k) 1_{{X_k = 1}} − √(p_k/q_k) 1_{{X_k = −1}}, k ∈ N,
where the sequence (ϕ_n)_{n∈N} is deterministic. In this case, all spaces L^r(Ω, F_n), r ≥ 1, have finite dimension 2^{n+1}, and an orthogonal basis of L^r(Ω, F_n) is given by
{ Y_{k_1} · · · Y_{k_l} : 0 ≤ k_1 < · · · < k_l ≤ n, l = 0, . . . , n + 1 }.

Let
S_n = Σ_{k=0}^n (1 + X_k)/2, n ∈ N, (6.1)
denote the random walk associated to (X_k)_{k∈N}. If p_k = p, k ∈ N, then the corresponding multiple stochastic integral of order n coincides with the Krawtchouk polynomial K_n(·; N + 1, p) of order n and parameter (N + 1, p), evaluated at S_N, cf. [23].
Let now H_0 = R and, for n ≥ 1, let H_n denote the subspace of L²(Ω) made of multiple stochastic integrals of order n, called the chaos of order n:
H_n = { J_n(f_n) : f_n ∈ ℓ²(N)^{∘n} }.
The space of F_n-measurable random variables is denoted by L⁰(Ω, F_n).
Alternatively, Lemma 6.3 can be proved by noting that any F ∈ L⁰(Ω, F_N) can be expressed as a linear combination of multiple stochastic integrals.
Definition 6.5 Let S denote the linear space spanned by multiple stochastic integrals.
The completion of S in L²(Ω) is denoted by the direct sum ⊕_{n=0}^∞ H_n. The next result is the chaos representation property for Bernoulli processes, which is analogous to the Walsh decomposition, cf. [22]. This property is obtained under the assumption that the sequence (X_n)_{n∈N} is i.i.d. See [8] for other instances of the chaos representation property without this independence assumption.
Proposition 6.7 We have the identity
L²(Ω) = ⊕_{n=0}^∞ H_n.
Proof. It suffices to show that S is dense in L²(Ω). Let F be a bounded random variable. Relation (6.4) of Lemma 6.3 shows that E[F | F_n] ∈ S. The martingale convergence theorem, cf. e.g. Theorem 27.1 in [18], implies that (E[F | F_n])_{n∈N} converges to F a.s., hence by dominated convergence every bounded F is the L²(Ω)-limit of a sequence in S. If F ∈ L²(Ω) is not bounded, F is the limit in L²(Ω) of the sequence (1_{{|F| ≤ n}} F)_{n∈N} of bounded random variables.
As a consequence of Proposition 6.7, any F ∈ L 2 (Ω, P) has a unique decomposition as a series of multiple stochastic integrals. Note also that the statement of Lemma 6.3 is sufficient for the chaos representation property to hold.
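In this finite-dimensional situation the chaos decomposition is simply an orthonormal expansion: the products Π_{k∈A} Y_k over subsets A form an orthonormal basis, so any functional is recovered from its coefficients. A sketch with a hypothetical functional F and arbitrary values:

```python
from itertools import product, combinations
import random

# Sketch: on Omega = {-1,1}^N the 2^N products prod_{k in A} Y_k, over subsets
# A of {0, ..., N-1}, form an orthonormal basis, so any F is recovered from
# the coefficients c_A = E[F prod_{k in A} Y_k].
p, N = 0.3, 3
q = 1.0 - p
omegas = list(product((-1, 1), repeat=N))

def prob(w):
    r = 1.0
    for x in w:
        r *= p if x == 1 else q
    return r

def Y(x):
    return (q if x == 1 else -p) / (p * q) ** 0.5

def prodY(A, w):
    r = 1.0
    for k in A:
        r *= Y(w[k])
    return r

rng = random.Random(0)
F = {w: rng.uniform(-1, 1) for w in omegas}  # an arbitrary functional

subsets = [A for l in range(N + 1) for A in combinations(range(N), l)]
c = {A: sum(prob(w) * F[w] * prodY(A, w) for w in omegas) for A in subsets}
recon_err = max(abs(F[w] - sum(c[A] * prodY(A, w) for A in subsets))
                for w in omegas)
assert recon_err < 1e-10
```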

Gradient operator
Definition 7.1 We densely define the linear gradient operator D : S → L²(Ω × N) by
D_k J_n(f_n) = n J_{n−1}( f_n(∗, k) 1_{Δ_n}(∗, k) ), k ∈ N, f_n ∈ ℓ²(N)^{∘n},
where "∗" denotes the first n − 1 variables (k_1, . . . , k_{n−1}) of the kernel. On the other hand, D_k is a continuous operator on the chaos H_n, by the bound (7.2). The following result gives the probabilistic interpretation of D_k as a finite difference operator. Given ω = (ω_0, ω_1, . . .) ∈ Ω, let
ω_k^+ = (ω_0, . . . , ω_{k−1}, +1, ω_{k+1}, . . .) and ω_k^− = (ω_0, . . . , ω_{k−1}, −1, ω_{k+1}, . . .).
Proposition 7.3 We have for any F ∈ S:
D_k F(ω) = √(p_k q_k) ( F(ω_k^+) − F(ω_k^−) ), k ∈ N.
Proof. We start by proving the above statement for an F_n-measurable F ∈ S.
using (5.5). First we note that from (6.4) we have an explicit expression for (k_1, . . . , k_n) ∈ Δ_n; on the other hand, if k ∈ {k_1, . . . , k_l} the corresponding term vanishes. In the general case, letting n go to infinity, and since by (7.2) the operator D_k is continuous on each chaos H_n, n ≥ 1, we obtain the result. The next property follows immediately from Proposition 7.3.
If F has the form F = f(X_0, . . . , X_n), we may also write
D_k F = √(p_k q_k) ( f(X_0, . . . , X_{k−1}, +1, X_{k+1}, . . . , X_n) − f(X_0, . . . , X_{k−1}, −1, X_{k+1}, . . . , X_n) ).
The gradient D can also be expressed in terms of path perturbations, where F(S_·) is an informal notation for the random variable F evaluated on a given path of (S_n)_{n∈N} defined in (6.1), and S_· ± 1_{{X_k = ∓1}} 1_{{k ≤ ·}} denotes the path of (S_n)_{n∈N} perturbed by forcing X_k to be equal to ±1.
We will also use the gradient ∇_k, k ∈ N, defined in (7.7), which satisfies p_k q_k (∇_k F)² = (D_k F)². From now on, D_k denotes the finite difference operator, which is extended to any F : Ω → R using Relation (7.4). The L² domain of D is naturally defined as the set of F ∈ L²(Ω) such that DF ∈ L²(Ω × N). The following is the product rule for the operator D.
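The finite-difference action of D_k can be tested against the chaos side, where one expects D_k J_1(f) = f(k). The sketch below assumes the two-point form of Y_k:

```python
from itertools import product

# Sketch: check that the finite difference
#   D_k F(w) = sqrt(p q) (F(w_k^+) - F(w_k^-))
# satisfies D_k J_1(f) = f(k) for a first-order integral J_1(f) = sum f(k) Y_k.
p, N = 0.3, 3
q = 1.0 - p

def Y(x):
    return (q if x == 1 else -p) / (p * q) ** 0.5

f = [1.5, -2.0, 0.75]  # an arbitrary kernel in l2

def J1(w):
    return sum(f[k] * Y(w[k]) for k in range(N))

def D(k, F, w):
    wp = w[:k] + (1,) + w[k + 1:]    # omega_k^+
    wm = w[:k] + (-1,) + w[k + 1:]   # omega_k^-
    return (p * q) ** 0.5 * (F(wp) - F(wm))

err = max(abs(D(k, J1, w) - f[k])
          for k in range(N) for w in product((-1, 1), repeat=N))
assert err < 1e-12
```

The normalization √(p_k q_k) is exactly what makes D_k J_1(f) = f(k), since Y_k(ω_k^+) − Y_k(ω_k^−) = 1/√(p_k q_k).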

Clark formula and predictable representation
In this section we prove a predictable representation formula for the functionals of (S n ) n≥0 defined in (6.1).
Proposition 8.1 For any F ∈ S we have
F = E[F] + Σ_{k=0}^∞ E[D_k F | F_{k−1}] Y_k. (8.2)
Proof. The formula is obviously true for F = J_0(f_0). Given n ≥ 1, the result follows as a consequence of Proposition 4.2 above and Lemma 4.6. Although the operator D is unbounded, we have the following result, which states the boundedness of the operator mapping a random variable to the unique process involved in its predictable representation.

Lemma 8.3 The operator
L²(Ω) ∋ F ↦ (E[D_k F | F_{k−1}])_{k∈N} ∈ L²(Ω × N)
is bounded, with norm equal to one.
Proof. Let F ∈ S. From Relation (8.2) and the isometry formula (3.4) for the stochastic integral operator J we get
E[ Σ_{k=0}^∞ (E[D_k F | F_{k−1}])² ] = Var(F) ≤ E[F²],
with equality in case F = J_1(f_1).
As a consequence of Lemma 8.3 we have the following corollary.
Corollary 8.5 The Clark formula of Proposition 8.1 extends to any F ∈ L 2 (Ω).
Proof. Since the operator of Lemma 8.3 is bounded, the Clark formula extends to F ∈ L²(Ω) by a standard Cauchy sequence argument. For the second identity we use the relation E[F | F_{−1}] = E[F].
Let us give a first elementary application of the above construction to the proof of a Poincaré inequality on Bernoulli space. We have
Var(F) ≤ E[ ‖DF‖²_{ℓ²(N)} ], F ∈ Dom(D).
More generally the Clark formula implies the following.
Corollary 8.6 Let a ∈ N and F ∈ L²(Ω). We have (8.7) and (8.8).
Proof. From Proposition 3.5 and the Clark formula (8.2) of Proposition 8.1 we obtain (8.7); Relation (8.8) is an immediate consequence of (8.7) and the isometry property of J.
As an application of the Clark formula of Corollary 8.6 we obtain the following predictable representation property for discrete-time martingales.
Proof. Let k ≥ 1. From Corollaries 7.6 and 8.6 we obtain the stated representation by a suitable choice of the predictable integrand.
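The Clark formula can be checked exactly by enumeration. The following sketch assumes the finite-difference form of D_k and computes the conditional expectations by averaging over the coordinates of index ≥ k:

```python
from itertools import product
import random

# Sketch: exact check of the Clark formula
#   F = E[F] + sum_k E[D_k F | F_{k-1}] Y_k
# by enumeration of Omega = {-1,1}^N, for an arbitrary functional F.
p, N = 0.3, 3
q = 1.0 - p
omegas = list(product((-1, 1), repeat=N))

def prob(w):
    r = 1.0
    for x in w:
        r *= p if x == 1 else q
    return r

def Y(x):
    return (q if x == 1 else -p) / (p * q) ** 0.5

rng = random.Random(1)
F = {w: rng.uniform(-1, 1) for w in omegas}

def D(k, w):  # finite-difference gradient D_k F
    wp = w[:k] + (1,) + w[k + 1:]
    wm = w[:k] + (-1,) + w[k + 1:]
    return (p * q) ** 0.5 * (F[wp] - F[wm])

def cond_D(k, w):  # E[D_k F | F_{k-1}]: depends only on w[:k]
    tails = list(product((-1, 1), repeat=N - k))
    return sum(prob(t) * D(k, w[:k] + t) for t in tails)

EF = sum(prob(w) * F[w] for w in omegas)
err = max(abs(F[w] - (EF + sum(cond_D(k, w) * Y(w[k]) for k in range(N))))
          for w in omegas)
assert err < 1e-10
```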

Divergence operator
The divergence operator δ is introduced as the adjoint of D. Let U ⊂ L²(Ω × N) be the space of processes spanned by multiple stochastic integrals with one free variable, i.e. by processes of the form (J_n(f_{n+1}(∗, k)))_{k∈N}.
Definition 9.1 Let δ : U → L²(Ω) be the linear mapping defined on U by
δ( J_n(f_{n+1}(∗, ·)) ) = J_{n+1}( f̃_{n+1} ),
where f̃_{n+1} denotes the symmetrization of f_{n+1} in its n + 1 variables.

Proposition 9.2
The operator δ is adjoint to D:
E[ ⟨DF, u⟩_{ℓ²(N)} ] = E[ F δ(u) ], F ∈ S, u ∈ U.
The next proposition shows that δ coincides with the stochastic integral operator J on the square-summable predictable processes.
provided all series converge in L²(Ω), where (ϕ_k)_{k∈N} appears in the structure equation (5.1). We also have, for all u, v ∈ U, the Skorohod isometry (9.5).
Proof. Use the expression (4.5) of u_k = J_n(f_{n+1}(∗, k)). By bilinearity, orthogonality and density it suffices to take u = g J_n(f^{∘n}), f, g ∈ ℓ²(N), and to note that the required identity holds in this case. In particular, (9.4) implies the following divergence formula, provided all series converge in L²(Ω).
In the symmetric case p_k = q_k = 1/2 we have ϕ_k = 0, k ∈ N, and the expression of δ simplifies accordingly. Moreover, (9.5) can be rewritten as a Weitzenböck type identity (9.8). The last two terms on the right-hand side of (9.4) vanish when (u_k)_{k∈N} is predictable, and in this case the Skorohod isometry (9.5) becomes the Itô isometry, as shown in the next proposition.
provided the series converges in L²(Ω). If moreover (u_k)_{k∈N} is predictable and square-summable we have the isometry
E[ δ(u)² ] = E[ Σ_{k=0}^∞ u_k² ], (9.11)
and δ(u) coincides with J(u) on the space of predictable square-summable processes.
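The duality between D and δ can be checked numerically. The closed form of δ used below, δ(u) = Σ_k (Y_k u_k − Y_k² D_k u_k), is an assumption of this sketch; it reduces to J(u) for predictable u, since then D_k u_k = 0:

```python
from itertools import product
import random

# Sketch (assumed closed form): for a possibly non-predictable u, define
#   delta(u) = sum_k ( Y_k u_k - Y_k^2 D_k u_k ),
# and check the duality E[<DF, u>] = E[F delta(u)] by enumeration.
p, N = 0.3, 2
q = 1.0 - p
omegas = list(product((-1, 1), repeat=N))

def prob(w):
    r = 1.0
    for x in w:
        r *= p if x == 1 else q
    return r

def Y(x):
    return (q if x == 1 else -p) / (p * q) ** 0.5

rng = random.Random(2)
F = {w: rng.uniform(-1, 1) for w in omegas}
u = {k: {w: rng.uniform(-1, 1) for w in omegas} for k in range(N)}

def D(k, G, w):  # finite-difference gradient of the functional G
    wp = w[:k] + (1,) + w[k + 1:]
    wm = w[:k] + (-1,) + w[k + 1:]
    return (p * q) ** 0.5 * (G[wp] - G[wm])

def delta(w):
    return sum(Y(w[k]) * u[k][w] - Y(w[k]) ** 2 * D(k, u[k], w)
               for k in range(N))

lhs = sum(prob(w) * sum(D(k, F, w) * u[k][w] for k in range(N)) for w in omegas)
rhs = sum(prob(w) * F[w] * delta(w) for w in omegas)
assert abs(lhs - rhs) < 1e-10
```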

Ornstein-Uhlenbeck semi-group and process
The Ornstein-Uhlenbeck operator L is defined as L = δD, i.e. L satisfies
L J_n(f_n) = n J_n(f_n), f_n ∈ ℓ²(N)^{∘n}.

Proposition 10.1 For any F ∈ S we have
Proof. Note that D k D k F = 0, k ∈ N, and use Relation (9.4) of Proposition 9.3.
Note that L can be expressed in other forms. Let now (P_t)_{t∈R+} = (e^{−tL})_{t∈R+} denote the semi-group associated to L, defined by P_t J_n(f_n) = e^{−nt} J_n(f_n), f_n ∈ ℓ²(N)^{∘n}. The next result shows that (P_t)_{t∈R+} admits an integral representation by a probability kernel. Let q_t^N : Ω × Ω → R_+ be defined by (10.3).
Lemma 10.2 Let the probability kernel Q_t(ω, dω̃) be defined by (10.3).
Proof. Since L²(Ω, F_N) has finite dimension 2^{N+1}, it suffices to consider functionals of the form F = Y_{k_1} · · · Y_{k_n} with 0 ≤ k_1 < · · · < k_n ≤ N, which implies the result by independence of the sequence (X_k)_{k∈N}. Consider the Ω-valued stationary process (X(t))_{t∈R+} = ((X_k(t))_{k∈N})_{t∈R+} with independent components and distribution given by (10.4)-(10.7).
Proposition 10.8 The process (X(t))_{t∈R+} = ((X_k(t))_{k∈N})_{t∈R+} is the Ornstein-Uhlenbeck process associated to (P_t)_{t∈R+}, i.e. we have
P_t F(ω) = E[ F(X(t)) | X(0) = ω ].
Proof. By construction of (X(t))_{t∈R+} in Relations (10.4)-(10.7), and since the components of (X_k(t))_{k∈N} are independent, the law of (X_0(t), . . . , X_n(t)) conditionally on X(0) has the density q_t^n(ω, ·) with respect to P:
dP(X_0(t) = ǫ_0, . . . , X_n(t) = ǫ_n | X(0) = ω) = q_t^n(ω, ω̃) dP(X_0 = ǫ_0, . . . , X_n = ǫ_n), ω̃ = (ǫ_0, . . . , ǫ_n, . . .).
The independent components X_k(t), k ∈ N, can be constructed from the data of X_k(0) and an independent exponential random variable τ_k via the following procedure: if τ_k > t, let X_k(t) = X_k(0); otherwise, if τ_k < t, take X_k(t) to be an independent copy of X_k. This procedure is illustrated in the following equalities:

Consequently we have
The operator from L²(Ω × N) to L²(Ω × N) which maps (u_k)_{k∈N} to (P_t u_k)_{k∈N} is also denoted by P_t. As a consequence of the representation of P_t given in Lemma 10.2 we obtain the following bound:
‖P_t u‖_{L^∞(Ω, ℓ²(N))} ≤ ‖u‖_{L^∞(Ω, ℓ²(N))}, u ∈ L²(Ω × N), t ∈ R_+.
Proof. As a consequence of the representation formula (10.10) we have, P(dω)-a.s.:
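The coordinate-refreshment mechanism can be checked numerically. The sketch below assumes that each coordinate is independently kept with probability e^{−t} and resampled otherwise, so that P_t Y_k = e^{−t} Y_k and P_t(Y_0 Y_1) = e^{−2t} Y_0 Y_1:

```python
from itertools import product
import math

# Sketch (assumed mechanism): P_t refreshes each coordinate independently with
# probability 1 - e^{-t}; since E[Y_k] = 0, P_t(Y_0 Y_1) = e^{-2t} Y_0 Y_1.
p, N, t = 0.3, 2, 0.7
q = 1.0 - p
s = math.exp(-t)  # probability that a given coordinate is kept

def Y(x):
    return (q if x == 1 else -p) / (p * q) ** 0.5

def F(w):
    return Y(w[0]) * Y(w[1])

def Pt(F, w):
    # exact expectation over which coordinates are refreshed, and their values
    total = 0.0
    for keep in product((True, False), repeat=N):
        weight = 1.0
        for k in range(N):
            weight *= s if keep[k] else (1.0 - s)
        refreshed = [k for k in range(N) if not keep[k]]
        val = 0.0
        for repl in product((-1, 1), repeat=len(refreshed)):
            pr = 1.0
            ww = list(w)
            for k, x in zip(refreshed, repl):
                ww[k] = x
                pr *= p if x == 1 else q
            val += pr * F(tuple(ww))
        total += weight * val
    return total

err = max(abs(Pt(F, w) - s ** 2 * F(w)) for w in product((-1, 1), repeat=N))
assert err < 1e-12
```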

Covariance identities
In this section we state the covariance identities which will be used in the proof of deviation inequalities in the next section. The covariance Cov(F, G) of F, G ∈ L²(Ω) is defined as
Cov(F, G) = E[(F − E[F])(G − E[G])],
and we have the covariance identity
Cov(F, G) = E[ Σ_{k=0}^∞ E[D_k F | F_{k−1}] D_k G ]. (11.2)
Proof. This identity is a consequence of the Clark formula (8.2) and the isometry (3.4). A covariance identity can also be obtained using the semi-group (P_t)_{t∈R+}.
Proposition 11.3 For any F, G ∈ L²(Ω) such that the series below converge, we have
Cov(F, G) = Σ_{k=0}^∞ E[ D_k F ∫_0^∞ e^{−t} P_t D_k G dt ]. (11.4)
Proof. Consider F = J_n(f_n) and G = J_m(g_m), and use the relation P_t J_n(f_n) = e^{−nt} J_n(f_n). From (10.11)-(10.14), the covariance identity (11.4) can be represented using a family (ξ_i)_{i∈N} of i.i.d. random variables, uniformly distributed on [0, 1]. Note that the marginals of (X_k, X_k 1_{{ξ_k < α}} + X'_k 1_{{ξ_k > α}}) are identical when X'_k is an independent copy of X_k. Next we prove an iterated version of the covariance identity in discrete time, which is an analog of a result proved in [15] for the Wiener and Poisson processes.
Theorem 11.6 Let n ∈ N and F, G ∈ L²(Ω). We have the iterated covariance identity (11.7).
Proof. Take F = G. For n = 0, (11.7) is a consequence of the Clark formula. Let n ≥ 1. Applying Corollary 8.6 to D_{k_n} · · · D_{k_1} F with a = k_n and b = k_{n+1}, and summing over (k_1, . . . , k_n) ∈ Δ_n, we conclude the proof by induction and bilinearity.
As a consequence of Theorem 11.6, letting F = G we get a variance inequality; see Relation (2.15) in [15] in continuous time. In a similar way, another iterated covariance identity can be obtained from Proposition 11.3.
Corollary 11.8 Let n ∈ N and F, G ∈ L²(Ω, F_N). We have an analogous iterated identity.
The covariance and the variance have the tensorization property: for independent F and G,
Var(FG) = E[F²] Var(G) + (E[G])² Var(F),
hence most of the identities in this section can be obtained by tensorization of a one-dimensional elementary covariance identity.
An elementary consequence of the covariance identities is the following lemma.
Then F and G are non-negatively correlated:
Cov(F, G) ≥ 0.
According to the next definition, a non-decreasing functional F satisfies D_k F ≥ 0 for all k ∈ N.
Definition 11.11 A random variable F : Ω → R is said to be non-decreasing if for all ω₁, ω₂ ∈ Ω we have
ω₁(k) ≤ ω₂(k), k ∈ N, ⟹ F(ω₁) ≤ F(ω₂).
The following result is then immediate from Proposition 7.3 and Lemma 11.10, and shows that the FKG inequality holds on Ω. It can also be obtained from Proposition 11.3.
Proposition 11.12 If F, G ∈ L²(Ω) are non-decreasing then F and G are non-negatively correlated:
Cov(F, G) ≥ 0.
Note however that the assumptions of Lemma 11.10 are actually weaker, as they do not require F and G to be non-decreasing.
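The basic covariance identity of this section can be checked exactly. The form used below, with both gradients projected on F_{k−1}, follows from the Clark formula and the Itô isometry and is equivalent, by the tower property, to projecting only one of them:

```python
from itertools import product
import random

# Sketch: exact check, by enumeration of Omega = {-1,1}^N, of
#   Cov(F, G) = E[ sum_k E[D_k F | F_{k-1}] E[D_k G | F_{k-1}] ].
p, N = 0.3, 2
q = 1.0 - p
omegas = list(product((-1, 1), repeat=N))

def prob(w):
    r = 1.0
    for x in w:
        r *= p if x == 1 else q
    return r

rng = random.Random(3)
F = {w: rng.uniform(-1, 1) for w in omegas}
G = {w: rng.uniform(-1, 1) for w in omegas}

def D(k, H, w):  # finite-difference gradient
    wp = w[:k] + (1,) + w[k + 1:]
    wm = w[:k] + (-1,) + w[k + 1:]
    return (p * q) ** 0.5 * (H[wp] - H[wm])

def cond_D(k, H, w):  # E[D_k H | F_{k-1}], a function of w[:k]
    tails = list(product((-1, 1), repeat=N - k))
    return sum(prob(t) * D(k, H, w[:k] + t) for t in tails)

EF = sum(prob(w) * F[w] for w in omegas)
EG = sum(prob(w) * G[w] for w in omegas)
cov = sum(prob(w) * (F[w] - EF) * (G[w] - EG) for w in omegas)
rhs = sum(prob(w) * sum(cond_D(k, F, w) * cond_D(k, G, w) for k in range(N))
          for w in omegas)
assert abs(cov - rhs) < 1e-10
```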

Deviation inequalities
In this section, which is based on [16], we recover a deviation inequality of [5] in the case of Bernoulli measures, using covariance representations instead of the logarithmic Sobolev inequalities presented in Section 13. The method relies on a bound on the Laplace transform L(t) = E[e^{tF}] obtained via a differential inequality and Chebyshev's inequality.
Proposition 12.1 Let F : Ω → R be such that |F_k^+ − F_k^−| ≤ K, k ∈ N, for some K ≥ 0, and ‖DF‖_{L^∞(Ω, ℓ²(N))} < ∞. Then
P(F − E[F] ≥ x) ≤ exp( −(x/(2K)) log( 1 + xK / ‖DF‖²_{L^∞(Ω,ℓ²(N))} ) ), x ≥ 0. (12.1)
Proof. Although D_k does not satisfy a derivation rule for products, from Proposition 7.8 we have (12.2), and since the function x ↦ (e^x − 1)/x is positive and increasing on R, the corresponding terms can be bounded. We first assume that F is a bounded random variable with E[F] = 0. From Lemma 10.15 applied to DF, we have

In the general case we proceed by truncation. We have a bound for all x ≥ 0 and t ≥ 0; the minimum in t ≥ 0 in the above expression is attained at an explicit value of t, where we used the inequality (1 + u) log(1 + u) − u ≥ (u/2) log(1 + u). If K = 0, the above proof remains valid by replacing all terms with their limits as K → 0. If F is not bounded, the conclusion holds for F_n = max(−n, min(F, n)), n ≥ 1, since (F_n)_{n∈N} and (DF_n)_{n∈N} converge respectively almost surely and in L²(Ω × N) to F and DF, with ‖DF_n‖²_{L^∞(Ω,ℓ²(N))} ≤ ‖DF‖²_{L^∞(Ω,ℓ²(N))}.
In case p_k = p for all k ∈ N, the above conditions simplify, and we recover Relation (13) in [5]; in particular this applies when F is F_N-measurable. Finally we show a Gaussian concentration inequality for functionals of (S_n)_{n∈N}, using the covariance identity (11.2). We refer to [3,4,17,20] for other versions of this inequality.
Proof. Again we assume that F is a bounded random variable with E[F] = 0, and use an elementary exponential inequality.

Logarithmic Sobolev inequalities
The logarithmic Sobolev inequalities on Gaussian space provide an infinite dimensional analog of Sobolev inequalities, cf. e.g. [21]. On Riemannian path space [6] and on Poisson space [1], [28], martingale methods have been successfully applied to the proof of logarithmic Sobolev inequalities. Here, discrete time martingale methods are used along with the Clark predictable representation formula (8.2) as in [10], to provide a proof of logarithmic Sobolev inequalities for Bernoulli measures. Here we are only concerned with modified logarithmic Sobolev inequalities, and we refer to [25], Theorem 2.2.8 and references therein, for the standard version of the logarithmic Sobolev inequality on the hypercube under Bernoulli measures.
The entropy of a random variable F > 0 is defined by
Ent[F] = E[F log F] − E[F] log E[F]
for sufficiently integrable F.
Lemma 13.1 The entropy has the tensorization property: if F, G are sufficiently integrable independent random variables, then
Ent[FG] = E[F] Ent[G] + E[G] Ent[F].
Proof. This follows by a direct computation using independence. In the next proposition we recover the modified logarithmic Sobolev inequality of [5] using the Clark representation formula in discrete time.
Proof. Assume that F is F_N-measurable and let M_n = E[F | F_n], 0 ≤ n ≤ N. Using Corollary 7.6 and the Clark formula (8.2), with f(x) = x log x and the corresponding bounds, we obtain the stated inequality, where we used Jensen's inequality and the convexity of (u, v) ↦ v²/u on (0, ∞) × R, or the Schwarz inequality applied to 1/√F and (D_k F/√F)_{k∈N}, as in the Wiener and Poisson cases [6] and [1]. This inequality is extended by density to F ∈ Dom(D). Theorem 13.3 can also be recovered from the tensorization Lemma 13.1 and a one-variable argument, letting p + q = 1, p, q > 0. Similarly we have an L¹ version which, by tensorization, recovers the following L¹ inequality of [11,7], proved in [28] in the Poisson case. In the next proposition we state and prove this inequality in the multidimensional case, using the Clark representation formula, similarly to Theorem 13.3.
Proof. Let f(x) = x log x. From the relation above we have, using the convexity of Ψ:

The proof of Theorem 13.5 can also be obtained by first using the bound above and then the convexity of (u, v) ↦ v(log(u + v) − log u). The application of Theorem 13.5 to e^F gives an inequality for F_N-measurable F. As already noted in [7], (13.6) and the Poisson limit theorem yield the L¹ inequality of [28]. Let M_n = (n + X_1 + · · · + X_n)/2, F = ϕ(M_n), and p_k = λ/n, k ∈ N, λ > 0. In the limit we obtain an inequality in which U is a Poisson random variable with parameter λ. In one variable we have, still letting df = f(1) − f(−1), an inequality in terms of the gradient operator ∇_k defined in (7.7). This last inequality is not comparable to the optimal constant inequality of [5], since when F_k^+ − F_k^− ≥ 0 the right-hand side of (13.9) grows as F_k^+ e^{2F_k^+}, instead of F_k^+ e^{F_k^+} in (13.8). In fact we can prove the following inequality, which improves (13.4), (13.6) and (13.9).
Theorem 13.10 Let F be F_N-measurable. We have the inequality (13.11).
Clearly, (13.11) is better than (13.9), (13.7) and (13.6). It also improves (13.4), by a bound on the corresponding terms. By the tensorization property (13.2), the proof of (13.11) reduces to the following one-dimensional lemma.
Lemma 13.12 For any 0 ≤ p ≤ 1, t ∈ R, a ∈ R, and q = 1 − p,
p t e^t + q a e^a − (p e^t + q e^a) log(p e^t + q e^a)
≤ pq ( q e^a ((t − a) e^{t−a} − e^{t−a} + 1) + p e^t ((a − t) e^{a−t} − e^{a−t} + 1) ).

Proof. Set
g(t) = pq ( q e^a ((t − a) e^{t−a} − e^{t−a} + 1) + p e^t ((a − t) e^{a−t} − e^{a−t} + 1) ) − p t e^t − q a e^a + (p e^t + q e^a) log(p e^t + q e^a).
Then
g′(t) = pq ( q e^a (t − a) e^{t−a} + p e^t (−e^{a−t} + 1) ) − p t e^t + p e^t log(p e^t + q e^a),
and g″(t) = p e^t h(t), where
h(t) = q²(1 + t − a) + pq − 1 − t + log(p e^t + q e^a) + p e^t/(p e^t + q e^a),
which implies that h′(a) = 0, h′(t) < 0 for any t < a and h′(t) > 0 for any t > a. Hence, for any t ≠ a, h(t) > h(a) = 0, and so g″(t) ≥ 0 for any t ∈ R, with g″(t) = 0 if and only if t = a. Therefore g′ is strictly increasing. Finally, since t = a is the unique root of g′ = 0, we have g(t) ≥ g(a) = 0 for all t ∈ R.
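Lemma 13.12 can be sanity-checked numerically on a grid of parameters; the form of the two sides below matches the statement of the lemma:

```python
import math

# Numeric check of the one-dimensional inequality of Lemma 13.12 on a grid of
# (p, t, a); the left-hand side never exceeds the right-hand side.
def lhs(p, t, a):
    q = 1.0 - p
    m = p * math.exp(t) + q * math.exp(a)
    return p * t * math.exp(t) + q * a * math.exp(a) - m * math.log(m)

def rhs(p, t, a):
    q = 1.0 - p
    return p * q * (q * math.exp(a) * ((t - a) * math.exp(t - a)
                                       - math.exp(t - a) + 1)
                    + p * math.exp(t) * ((a - t) * math.exp(a - t)
                                         - math.exp(a - t) + 1))

worst = max(lhs(p, t, a) - rhs(p, t, a)
            for p in (0.1, 0.3, 0.5, 0.7, 0.9)
            for t in (-2.0, -0.5, 0.0, 0.5, 2.0)
            for a in (-1.0, 0.0, 1.5))
assert worst <= 1e-12
```

Equality holds at t = a, in agreement with the proof, where t = a is the unique minimum of g.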
This inequality improves (13.4), (13.6), and (13.9), as illustrated in one dimension in Figure 1, where the entropy is represented as a function of p ∈ [0, 1] with f(1) = 1 and f(−1) = 3.5. The inequality (13.11) is a discrete analog of the sharp inequality on Poisson space of [28]. In the symmetric case p_k = q_k = 1/2, k ∈ N, we obtain a symmetric version which improves on (13.6).
Letting p_k = λ/n, k ∈ N, and passing to the limit as n goes to infinity, we obtain an inequality in which U is a Poisson random variable with parameter λ. This corresponds to the sharp inequality of [28].

Change of variable formula
In this section we state a discrete-time analog of Itô's change of variable formula which will be useful for the predictable representation of random variables and for option hedging.
The above proposition also provides an explicit version of the Doob decomposition for supermartingales. Naturally, if (f(M_n, n))_{n∈N} is a martingale, the corresponding finite-variation term vanishes; in this case the Clark formula, the martingale representation formula of Proposition 8.9 and the change of variable formula all coincide. If F is an F_N-measurable random variable and f is a function such that f(M_N, N) = F, the representation can be computed explicitly. Such a function f exists if (M_n)_{n∈N} is Markov and F = h(M_N). In this case, consider the semi-group (P_{k,n})_{0≤k<n≤N} associated to (M_n)_{n∈N} and defined by
P_{k,n} h(x) = E[ h(M_n) | M_k = x ].
Letting f(x, n) = [P_{n,N} h](x) we can write F = f(M_N, N).

Option hedging in discrete time
In this section we give a presentation of the Black-Scholes formula in discrete time, or Cox-Ross-Rubinstein model, see e.g. [9,19], §15-1 of [27], or [24], as an application of the Clark formula.
In order to be consistent with the notation of the previous sections we choose to use the time scale N, hence the index 0 is that of the first random value of any stochastic process, while the index −1 corresponds to its deterministic initial value.
Let (A_k)_{k∈N} be a riskless asset with initial value A_{−1}, defined by
A_n = A_{−1} Π_{k=0}^n (1 + r_k), n ∈ N,
where (r_k)_{k∈N} is a sequence of deterministic numbers such that r_k > −1, k ∈ N. Consider a stock price with initial value S_{−1}, given in discrete time as
S_n = (1 + b_n) S_{n−1} if X_n = 1, and S_n = (1 + a_n) S_{n−1} if X_n = −1, n ∈ N,
where (a_k)_{k∈N} and (b_k)_{k∈N} are sequences of deterministic numbers such that −1 < a_k < b_k, k ∈ N. Consider now the discounted stock price given as
S̃_n = S_n Π_{k=0}^n (1 + r_k)^{−1}, n ∈ N.
If −1 < a_k < r_k < b_k, k ∈ N, then (S̃_n)_{n∈N} is a martingale with respect to (F_n)_{n≥−1} under the probability P* given by
P*(X_k = 1) = (r_k − a_k)/(b_k − a_k), P*(X_k = −1) = (b_k − r_k)/(b_k − a_k), k ∈ N.
In other terms, under P* the discounted price process is a martingale, where E* denotes the expectation under P*. Recall that under this probability measure there is absence of arbitrage and the market is complete. From the change of variable formula of Proposition 14.1, or from the Clark formula (8.2), we have the martingale representation of the discounted claim.
Definition 15.1 A portfolio strategy is a pair of predictable processes (η_k)_{k∈N} and (ζ_k)_{k∈N}, where η_k, resp. ζ_k, represents the number of units invested over the time period (k, k + 1] in the asset S_k, resp. A_k, with k ≥ 0. The value at time k ≥ −1 of the portfolio (η_k, ζ_k)_{0≤k≤N} is defined accordingly, and its discounted value is defined as
Ṽ_n = V_n Π_{k=0}^n (1 + r_k)^{−1}, n ≥ −1.
The portfolio is said to be self-financing if
A_n (ζ_{n+1} − ζ_n) + S_n (η_{n+1} − η_n) = 0, n ≥ 0.
Note that the self-financing condition implies V n = ζ n A n + η n S n , n ≥ 0.
Our goal is to hedge an arbitrary claim on Ω: given an F_N-measurable random variable F, we search for a portfolio (η_k, ζ_k)_{0≤k≤N} such that the equality V_N = F holds at time N ∈ N.
Proposition 15.6 Assume that the portfolio (η_k, ζ_k)_{0≤k≤N} is self-financing. Then we have the decomposition (15.7).
Proof. Under the self-financing assumption, the discounted portfolio satisfies a telescoping identity, which successively yields (15.8) and (15.7).
As a consequence of (15.7) and (15.3), we immediately obtain the expression of the discounted portfolio value Ṽ_n, n ≥ −1. Then the portfolio (η_k, ζ_k)_{0≤k≤N} is self-financing and satisfies V_N = F; hence (η_k, ζ_k)_{0≤k≤N} is a hedging strategy leading to F.
In particular we have V_N = F. To conclude the proof we note that, from the relation V_n = ζ_n A_n + η_n S_n, 0 ≤ n ≤ N, the process (ζ_n)_{0≤n≤N} coincides with the process defined by (15.11).
The above proposition shows that there always exists a hedging strategy starting from the initial value Ṽ_{−1} = E*[ F Π_{k=0}^N (1 + r_k)^{−1} ].
The hedging strategy is given by the predictable representation of the discounted claim. Note that η_k is non-negative (i.e. there is no short-selling) when f is an increasing function, e.g. in the case of European call options, where f(x) = (x − K)^+.
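The hedging argument can be made concrete in a toy Cox-Ross-Rubinstein model. The sketch below uses constant coefficients, r = 0, and illustrative parameter values (all assumptions of this example); backward induction gives the prices, and the delta makes the self-financing portfolio replicate the call payoff on every path:

```python
from itertools import product

# Sketch: CRR model with constant coefficients and r = 0. Backward induction
# gives the price V; the delta eta = (V_up - V_down) / (S (b - a)) replicates
# the call payoff along every path of up/down moves.
a, b, r = -0.1, 0.2, 0.0
S0, K, N = 100.0, 100.0, 3
p_star = (r - a) / (b - a)  # risk-neutral probability of an up move

def price(s, n):  # value at time n of the call when the stock price is s
    if n == N:
        return max(s - K, 0.0)
    up = price(s * (1 + b), n + 1)
    down = price(s * (1 + a), n + 1)
    return (p_star * up + (1 - p_star) * down) / (1 + r)

errs = []
for path in product((1, -1), repeat=N):
    s, V = S0, price(S0, 0)
    for n, x in enumerate(path):
        up = price(s * (1 + b), n + 1)
        down = price(s * (1 + a), n + 1)
        eta = (up - down) / (s * (b - a))  # stock units held over (n, n+1]
        cash = V - eta * s                 # position in the riskless asset
        s *= (1 + b) if x == 1 else (1 + a)
        V = cash + eta * s                 # self-financing rebalancing
    errs.append(abs(V - max(s - K, 0.0)))  # terminal value vs. payoff
assert max(errs) < 1e-9
```

Note that the deltas are non-negative here, consistent with the no-short-selling remark for increasing payoff functions.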