
1 Introduction

In this technical paper we analyze in detail the invariance, existence, and uniqueness of solutions to nonlinear valuation equations inclusive of credit risk, collateral margining with possible re-hypothecation, and funding costs. In particular, we study the comprehensive nonlinear valuation equations first introduced in Pallavicini et al. (2011) [11]. After briefly summarizing the cash-flow definitions that allow us to extend valuation to default closeout, collateral margining with possible re-hypothecation, and treasury funding costs, we show how such cash flows, when present-valued in an arbitrage-free setting, lead straightforwardly to semi-linear PDEs or, more generally, to FBSDEs, and we study conditions for existence and uniqueness of their solutions.

We formalize an invariance theorem showing that even though we start from a risk-neutral valuation approach based on a locally risk-free bank account growing at a risk-free rate, our final valuation equations do not depend on the risk-free rate at all. In other words, we do not need to proxy the risk-free rate with any actual market rate, since it acts as an instrumental variable that does not manifest itself in our final valuation equations. Indeed, our final semi-linear PDEs or FBSDEs and their classical solutions depend only on contractual, market or treasury rates and contractual closeout specifications once we use a hedging strategy that is defined as a straightforward generalization of the natural delta hedging in the classical setting.

The derivation of the equations, their numerical solution, and the invariance result have been analyzed numerically and extended to central clearing and multiple discount curves in a number of previous works, including [3, 5, 10, 11, 12] and the monograph [6], which further summarizes earlier credit and debit valuation adjustment (CVA and DVA) results. We refer to such works and references therein for a general introduction to comprehensive nonlinear valuation and to the related issues with valuation adjustments related to credit (CVA), collateral (LVA), and funding costs (FVA). In this paper, given the technical nature of our investigation and the emphasis on nonlinear valuation, we refrain from decomposing the nonlinear value into valuation adjustments or XVAs. Moreover, in practice such separation is possible only under very specific assumptions, while in general all terms depend on all risks due to nonlinearity. Forcing separation may lead to double counting, as initially analyzed through the Nonlinearity Valuation Adjustment (NVA) in [5]. Separation is discussed in the CCP setting in [3].

The paper is structured as follows.

Section 2 introduces the probabilistic setting, the cash flows analysis, and derives a first valuation equation based on conditional expectations. Section 3 derives an FBSDE under the default-free filtration from the initial valuation equation under assumptions of conditional independence of default times and of default-free initial portfolio cash flows. Section 4 specifies the FBSDE obtained earlier to a Markovian setting and studies conditions for existence and uniqueness of solutions for the nonlinear valuation FBSDE and classical solutions to the associated PDE. Finally, we present the invariance theorem: when adopting delta-hedging, the solution does not depend on the risk-free rate.

2 Cash Flows Analysis and First Valuation Equation

We fix a filtered probability space \((\varOmega , \mathscr {A}, \mathbb {Q})\), with a filtration \((\mathscr {G}_u)_{u\ge 0}\) representing the evolution of all the available information on the market. With an abuse of notation, we will refer to \((\mathscr {G}_u)_{u\ge 0}\) by \(\mathscr {G}\). The object of our investigation is a portfolio of contracts, or “contract” for brevity, typically a netting set, with final maturity T, between two financial entities, the investor I and the counterparty C. Both I and C are supposed to be subject to default risk. In particular we model their default times with two \(\mathscr {G}\)-stopping times \(\tau _I,\tau _C\). We assume that the stopping times are generated by Cox processes of positive, stochastic intensities \(\lambda ^I\) and \(\lambda ^C\). Furthermore, we describe the default-free information by means of a filtration \((\mathscr {F}_u)_{u\ge 0}\) generated by the price of the underlying \(S_t\) of our contract. This process has the following dynamics under the measure \(\mathbb {Q}\):

$$ dS_t=r_tS_tdt+\sigma (t,S_t)dW_t $$

where \(r_t\) is an \(\mathscr {F}\)-adapted process, called the risk-free rate. We then suppose the existence of a risk-free account \(B_t\) following the dynamics

$$ dB_t=r_tB_tdt. $$

We denote \(D(s,t,x)=e^{-\int _s^tx_udu}\), the discount factor associated to the rate \(x_u\). In the case of the risk-free rate, we define \(D(s,t):=D(s,t,r)\).
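As a purely illustrative aid, the discount factor \(D(s,t,x)\) can be approximated numerically when the rate path is known. The sketch below assumes a deterministic rate supplied as a hypothetical callable `rate` (not part of the paper) and uses the trapezoidal rule; the two sample rate paths are made-up inputs.

```python
import math

def discount(s, t, rate, n_steps=1000):
    """Approximate D(s,t,x) = exp(-int_s^t x_u du) with the trapezoidal rule.

    `rate` is a hypothetical callable u -> x_u giving a deterministic rate path.
    """
    if t == s:
        return 1.0
    du = (t - s) / n_steps
    # Trapezoidal rule: half weight on the end points, full weight inside.
    integral = 0.5 * (rate(s) + rate(t)) * du
    for k in range(1, n_steps):
        integral += rate(s + k * du) * du
    return math.exp(-integral)

# Illustrative (assumed) rate paths: a flat 3% rate and a linearly increasing rate.
flat = lambda u: 0.03
linear = lambda u: 0.01 + 0.02 * u
```

With these inputs, `discount(0.0, 2.0, flat)` should agree with \(e^{-0.06}\), and discount factors over adjacent intervals should multiply, i.e. \(D(s,t,x)D(t,u,x)=D(s,u,x)\).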

We further assume that for all t we have \(\mathscr {G}_t=\mathscr {F}_t\vee \mathscr {H}^I_t\vee \mathscr {H}_t^C\) where

$$\begin{aligned} \begin{aligned} \mathscr {H}_t^I=\sigma ({1}_{\{ \tau _I\le s\}}, \ s\le t),\\ \mathscr {H}_t^C=\sigma ({1}_{\{ \tau _C\le s\}}, \ s\le t). \end{aligned} \end{aligned}$$

Again we indicate \((\mathscr {F}_u)_{u\ge 0}\) by \(\mathscr {F}\) and we will write \(\mathbb {E}^{\mathscr {G}}_t[\cdot ] := \mathbb {E}[\cdot | \mathscr {G}_t]\) and similarly for \(\mathscr {F}\). As in the classic framework of Duffie and Huang [8], we postulate the default times to be conditionally independent with respect to \(\mathscr {F}\), i.e. for any \(t>0\) and \(t_1,t_2\in [0,t]\), we assume \(\mathbb {Q}\{\tau _I>t_1,\tau _C>t_2 | \mathscr {F}_t\}=\mathbb {Q}\{\tau _I>t_1|\mathscr {F}_t\}\mathbb {Q}\{\tau _C>t_2 | \mathscr {F}_t\}\). Moreover, we indicate \(\tau =\tau _I\wedge \tau _C\) and with these assumptions we have that \(\tau \) has intensity \(\lambda _u=\lambda _u^I+\lambda _u^C\). For convenience of notation we use the symbol \(\bar{\tau }\) to indicate the minimum between \(\tau \) and T.
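The statement that \(\tau =\tau _I\wedge \tau _C\) has intensity \(\lambda ^I+\lambda ^C\) can be checked by simulation in the simplest special case. The sketch below assumes constant intensities (an assumption made only for this illustration), so that each default time is a unit exponential variable divided by its intensity; the function names and the numerical values are hypothetical.

```python
import math
import random

def simulate_first_default(lam_I, lam_C, rng):
    """Draw (tau, investor_first) for constant intensities lam_I, lam_C.

    With constant intensity the cumulated intensity is lam * u, so each
    default time is an independent unit exponential divided by lam.
    """
    tau_I = rng.expovariate(1.0) / lam_I
    tau_C = rng.expovariate(1.0) / lam_C
    return min(tau_I, tau_C), tau_I < tau_C

def survival_estimate(lam_I, lam_C, t, n_paths=100_000, seed=0):
    """Monte Carlo estimate of Q(tau > t) for tau = min(tau_I, tau_C)."""
    rng = random.Random(seed)
    alive = sum(simulate_first_default(lam_I, lam_C, rng)[0] > t
                for _ in range(n_paths))
    return alive / n_paths
```

Under these assumptions the estimate `survival_estimate(0.02, 0.03, 5.0)` should be close to \(e^{-(0.02+0.03)\cdot 5}\), up to Monte Carlo error.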

Remark 1

We suppose that the measure \(\mathbb {Q}\) is the so-called risk-neutral measure, i.e. a measure under which the prices of the traded non-dividend-paying assets discounted at the risk-free rate are martingales or, in equivalent terms, the measure associated with the numeraire \(B_t\).

2.1 The Cash Flows

To price this portfolio we take the conditional expectation of all the cash flows of the portfolio and discount them at the risk-free rate. An alternative to the explicit cash flows approach adopted here is discussed in [4].

To begin with, we consider a collateralized hedged contract, so the cash flows generated by the contract are:

  • The payments due to the contract itself: modeled by an \(\mathscr {F}\)-predictable process \(\pi _t\) and a final cash flow \(\varPhi (S_T)\) paid at maturity, modeled by a Lipschitz function \(\varPhi \). At time t the cumulated discounted flows due to these components amount to

    $$ {1}_{\{ \tau >T\}}D(t,T)\varPhi (S_T)+\int _t^{\bar{\tau }}D(t,u)\pi _udu. $$
  • The payments due to default: in particular we suppose that at time \(\tau \) we have a cash flow due to the default event (if it happened) modeled by a \(\mathscr {G}_\tau \)-measurable random variable \(\theta _\tau \). So the flows due to this component are

    $$ {1}_{\{ t<\tau<T\}}D(t,\tau )\theta _\tau ={1}_{\{ t<\tau <T\}}\int _t^TD(t,u)\theta _u d{1}_{\{ \tau \le u\}}. $$
  • The payments due to the collateral account: more precisely we model this account by an \(\mathscr {F}\)-predictable process \(C_t\). We postulate that \(C_t>0\) if the investor is the collateral taker, and \(C_t<0\) if the investor is the collateral provider. Moreover, we assume that the collateral taker remunerates the account at a certain interest rate (written on the CSA); in particular we may have different rates depending on who the collateral taker is, so we introduce the rate

    $$\begin{aligned} c_t={1}_{\{ C_t>0\}}c_t^++{1}_{\{ C_t\le 0\}}c_t^- \ , \end{aligned}$$
    (1)

    where \(c_t^+,c_t^-\) are two \(\mathscr {F}\)-predictable processes. We also suppose that the collateral can be re-hypothecated, i.e. the collateral taker can use the collateral for funding purposes. Since the collateral taker has to remunerate the account at the rate \(c_t\), the discounted flows due to the collateral can be expressed as a cost of carry and sum up to

    $$ \int _t^{\bar{\tau }}D(t,u)(r_u-c_u)C_udu. $$
  • We suppose that the deal we are considering is to be hedged by a position in cash and risky assets, represented respectively by the \(\mathscr {G}\)-adapted processes \(F_t\) and \(H_t\), with the convention that \(F_t>0\) means that the investor is borrowing money (from the bank’s treasury for example), while \(F_t<0\) means that I is investing money. Also in this case, to take into account different rates for borrowing and lending, we introduce the rate

    $$\begin{aligned} f_t={1}_{\{ V_t-C_t>0\}}f_t^++{1}_{\{ V_t-C_t\le 0\}}f_t^-. \end{aligned}$$
    (2)

    The flows due to the funding part are

    $$ \int _t^{\bar{\tau }}D(t,u)(r_u-f_u)F_udu. $$

    For the flows related to the risky assets account \(H_t\) we assume that we are hedging by means of repo contracts. We have that \(H_t>0\) means that we need some risky asset, so we borrow it, while if \(H<0\) we lend. So, for example, if we need to borrow the risky asset we need cash from the treasury, hence we borrow cash at a rate \(f_t\) and as soon as we have the asset we can repo lend it at a rate \(h_t\). In general \(h_t\) is defined as

    $$\begin{aligned} h_t={1}_{\{ H_t>0\}}h^+_t+{1}_{\{ H_t\le 0\}}h^-_t. \end{aligned}$$
    (3)

    Thus we have that the total discounted cash flows for the risky part of the hedge are equal to

    $$ \int _t^{\bar{\tau }}D(t,u)(h_u-f_u)H_udu. $$

The last expression could also be seen as resulting from \((r-f) - (r-h)\), in line with the previous definitions. If we add all the cash flows mentioned above we obtain that the value of the contract \(V_t\) must satisfy

$$\begin{aligned} \begin{aligned} V_t=&\mathbb {E}^{\mathscr {G}}_t\left[ \int _t^{\bar{\tau }}D(t,u)(\pi _u+(r_u-c_u)C_u+(r_u-f_u)F_u-(f_u-h_u)H_u)du\right] \\&+\mathbb {E}^{\mathscr {G}}_t\bigg [{1}_{\{ \tau >T\}}D(t,T)\varPhi (S_T)+D(t,\tau ){1}_{\{ t<\tau <T\}}\theta _\tau \bigg ]. \end{aligned} \end{aligned}$$
(4)

If we further suppose that we are able to replicate the value of our contract using the funding, the collateral (assuming re-hypothecation, otherwise C is to be omitted from the following equation) and the risky asset accounts, i.e.

$$\begin{aligned} V_u=F_u+H_u+C_u, \end{aligned}$$
(5)

we have, substituting for \(F_u\):

$$\begin{aligned} \begin{aligned} V_t=&\mathbb {E}^{\mathscr {G}}_t\left[ \int _t^{\bar{\tau }}D(t,u)(\pi _u+(f_u-c_u)C_u+(r_u-f_u)V_u-(r_u-h_u)H_u)du\right] \\&+\mathbb {E}^{\mathscr {G}}_t\bigg [{1}_{\{ \tau >T\}}D(t,T)\varPhi (S_T)+D(t,\tau ){1}_{\{ t<\tau <T\}}\theta _\tau \bigg ]. \end{aligned} \end{aligned}$$
(6)
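The passage from (4) to (6) is a pointwise algebraic substitution of (5) into the integrand, which can be checked on sample numbers. In the sketch below all numerical values are hypothetical, and `pick` is an illustrative helper encoding the sign-dependent rates (1), (2), (3).

```python
def pick(positive_branch, nonpositive_branch, x):
    """Sign-dependent rate as in (1)-(3): one value if x > 0, another if x <= 0."""
    return positive_branch if x > 0 else nonpositive_branch

def integrand_eq4(pi, r, c, f, h, C, F, H):
    """Integrand of (4): contract, collateral, funding and hedging flows."""
    return pi + (r - c) * C + (r - f) * F - (f - h) * H

def integrand_eq6(pi, r, c, f, h, C, V, H):
    """Integrand of (6), obtained from (4) by substituting F = V - H - C."""
    return pi + (f - c) * C + (r - f) * V - (r - h) * H

# Hypothetical sample values, chosen only for illustration.
V, C, H = 10.0, 4.0, 7.5
F = V - H - C                    # replication condition (5)
r, pi = 0.02, 0.5
c = pick(0.015, 0.010, C)        # collateral rate (1)
f = pick(0.040, 0.025, V - C)    # funding rate (2)
h = pick(0.030, 0.030, H)        # repo rate (3)

lhs = integrand_eq4(pi, r, c, f, h, C, F, H)
rhs = integrand_eq6(pi, r, c, f, h, C, V, H)
```

The two integrands agree for any choice of inputs satisfying (5), since \((r-c)C+(r-f)(V-H-C)-(f-h)H=(f-c)C+(r-f)V-(r-h)H\).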

Remark 2

In the classic no-arbitrage theory and in a complete market setting, without credit risk, the hedging process H would correspond to a delta hedging strategy account. Here we do not enforce this interpretation yet. However, we will see that a delta-hedging interpretation emerges from the combined effect of working under the default-free filtration \(\mathscr {F}\) (valuation under partial information) and of identifying part of the solution of the resulting BSDE, under reasonable regularity assumptions, as a sensitivity of the value to the underlying asset price S.

2.2 Adjusted Cash Flows Under a Simple Trading Model

We now show how the adjusted cash flows originate assuming we buy a call option on an equity asset \(S_T\) with strike K. We analyze the operations a trader would enact with the treasury and the repo market in order to fund the trade, and we map these operations to the related cash flows. We go through the following steps in each small interval \([t,t+dt]\), seen from the point of view of the trader/investor buying the option. This is written in first person for clarity and is based on conversations with traders working with their bank treasuries.

Time t:

  1. I wish to buy a call option with maturity T whose current price is \(V_t = V(t,S_t)\). I need \(V_t\) cash to do that. So I borrow \(V_t\) cash from my bank treasury and buy the call.

  2. I receive the collateral amount \(C_t\) for the call, that I give to the treasury.

  3. Now I wish to hedge the call option I bought. To do this, I plan to repo-borrow \(\varDelta _t\) stock on the repo-market.

  4. To do this, I borrow \(H_t = \varDelta _t S_t\) cash at time t from the treasury.

  5. I repo-borrow an amount \(\varDelta _t\) of stock, posting cash \(H_t\) as a guarantee.

  6. I sell the stock I just obtained from the repo to the market, getting back the price \(H_t\) in cash.

  7. I give \(H_t\) back to treasury.

  8. My outstanding debt to the treasury is \(V_t-C_t\).

Time \(t+dt\):

  9. I need to close the repo. To do that I need to give back \(\varDelta _t\) stock. I need to buy this stock from the market. To do that I need \(\varDelta _t S_{t+dt}\) cash.

  10. I thus borrow \(\varDelta _t S_{t+dt}\) cash from the bank treasury.

  11. I buy \(\varDelta _t\) stock and I give it back to close the repo and I get back the cash \(H_t\) deposited at time t plus interest \(h_tH_t\, dt\).

  12. I give back to the treasury the cash \(H_t\) I just obtained, so that the net value of the repo operation has been

    $$\begin{aligned} H_t(1+h_t \, dt ) - \varDelta _t S_{t+dt} = - \varDelta _t \, dS_t + h_t H_t \, dt \end{aligned}$$

    Notice that this \(- \varDelta _t dS_t\) is the right amount I needed to hedge V in a classic delta hedging setting.

  13. I close the derivative position, the call option, and get \(V_{t+dt}\) cash.

  14. I have to pay back the collateral plus interest, so I ask the treasury for the amount \(C_t(1+c_t\, dt)\) that I give back to the counterparty.

  15. My outstanding debt plus interest (at rate f) to the treasury is \(V_t-C_t+C_t(1+c_t\, dt)+(V_t-C_t)f_t\, dt=V_t(1+f_t\, dt)+C_t(c_t-f_t)\, dt\).

    I then give to the treasury the cash \(V_{t+dt}\) I just obtained, the net effect being

    $$\begin{aligned} V_{t+dt} - V_t(1+f_t\, dt)-C_t(c_t-f_t)\, dt = dV_t - f_t V_t \, dt-C_t(c_t-f_t) \, dt \end{aligned}$$
  16. I now have that the total amount of flows is:

    $$\begin{aligned} - \varDelta _t \, dS_t + h_t H_t \, dt+dV_t - f_t V_t \, dt-C_t(c_t-f_t) \, dt \end{aligned}$$
  17. Now I present-value the above flows at time t in a risk-neutral setting:

    $$\begin{aligned}&{\mathbb {E}}_t [ - \varDelta _t \, dS_t + h_t H_t \, dt+dV_t - f_t V_t \, dt-C_t(c_t-f_t) \, dt] \\&\quad =- \varDelta _t (r_t - h_t) S_t\, dt + (r_t - f_t) V_t \, dt-C_t(c_t-f_t) \, dt-d \varphi (t) \\&\quad = -H_t(r_t - h_t)\, dt + (r_t - f_t) (H_t+F_t+C_t) \, dt-C_t(c_t-f_t) \, dt-d \varphi (t) \\&\quad = (h_t-f_t)H_t\, dt+(r_t - f_t)F_t\, dt+(r_t-c_t)C_t\, dt -d \varphi (t) \end{aligned}$$

    This derivation holds assuming that \({\mathbb {E}}_t [ dS_t] = r_t S_t \, dt\) and \({\mathbb {E}}_t [ dV_t] = r_t V_t \, dt - d \varphi (t)\), where \(d \varphi \) is a dividend of V in \([t,t+dt)\) expressing the funding costs. Setting the above expression to zero we obtain

    $$\begin{aligned} d \varphi (t) = (h_t-f_t)H_t\, dt+(r_t - f_t)F_t\, dt+(r_t-c_t)C_t\, dt \end{aligned}$$

    which coincides with the definition given earlier in (6).
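The bookkeeping in the steps above can be replayed mechanically. The following sketch tracks the trader's positions over one interval \([t,t+dt]\) and returns the total flow before present-valuing; the function name and all numerical inputs are hypothetical and serve only to check the algebra.

```python
def one_step_flows(V_t, V_next, S_t, S_next, delta, C_t, f, h, c, dt):
    """Net flows over [t, t+dt] for the trader, following the listed steps."""
    H_t = delta * S_t
    # Time t: after borrowing V_t, passing on the collateral C_t, and the
    # round trip of H_t through the repo, the debt to the treasury is V_t - C_t.
    debt = V_t - C_t
    # Time t+dt, repo leg: receive H_t plus repo interest, pay to buy back the stock.
    repo_net = H_t * (1.0 + h * dt) - delta * S_next
    # Time t+dt, option leg: debt with funding interest plus the collateral
    # returned with interest, all settled against the proceeds V_{t+dt}.
    debt_due = debt * (1.0 + f * dt) + C_t * (1.0 + c * dt)
    option_net = V_next - debt_due
    return repo_net + option_net

# Hypothetical one-day example.
dt = 1.0 / 252.0
total = one_step_flows(V_t=10.0, V_next=10.2, S_t=100.0, S_next=101.0,
                       delta=0.6, C_t=4.0, f=0.04, h=0.03, c=0.015, dt=dt)
# Closed-form total: -Delta dS + h H dt + dV - f V dt - C (c - f) dt.
expected = (-0.6 * 1.0 + 0.03 * 60.0 * dt + 0.2
            - 0.04 * 10.0 * dt - 4.0 * (0.015 - 0.04) * dt)
```

The value returned by the bookkeeping should match the closed-form total flow, which is the quantity that is then present-valued and set to zero.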

3 An FBSDE Under \(\mathscr {F}\)

We aim to switch to the default-free filtration \(\mathscr {F}=(\mathscr {F}_t)_{t\ge 0}\), and the following lemma (taken from Bielecki and Rutkowski [1], Sect. 5.1) is key to understanding how the information expressed by \(\mathscr {G}\) relates to that expressed by \(\mathscr {F}\).

Lemma 1

For any \(\mathscr {A}\)-measurable random variable X and any \(t\in \mathbb {R}_+\), we have:

$$\begin{aligned} \mathbb {E}_t^{\mathscr {G}}[{1}_{\{ t<\tau \le s\}}X]={1}_{\{ \tau>t\}} \frac{\mathbb {E}_t^{\mathscr {F}}[{1}_{\{ t<\tau \le s\}}X]}{\mathbb {E}_t^{\mathscr {F}}[{1}_{\{ \tau >t \}}]}. \end{aligned}$$
(7)

In particular we have that for any \(\mathscr {G}_t\)-measurable random variable Y there exists an \(\mathscr {F}_t\)-measurable random variable Z such that

$$ {1}_{\{ \tau>t\}}Y={1}_{\{ \tau >t\}}Z. $$

What follows is an application of the previous lemma exploiting the fact that we have to deal with a stochastic process structure and not only a simple random variable. Similar results are illustrated in [2].

Lemma 2

Suppose that \(\phi _u\) is a \(\mathscr {G}\)-adapted process. We consider a default time \(\tau \) with intensity \(\lambda _u\). If we denote \(\bar{\tau }=\tau \wedge T\) we have:

$$\begin{aligned} \mathbb {E}^{\mathscr {G}}_t\left[ \int _t^{\bar{\tau }}\phi _udu\right] ={1}_{\{ \tau >t\}}\mathbb {E}^{\mathscr {F}}_t\left[ \int _t^TD(t,u,\lambda )\widetilde{\phi _u}du\right] \end{aligned}$$

where \(\widetilde{\phi _u}\) is an \(\mathscr {F}_u\) measurable variable such that \({1}_{\{ \tau>u\}}\widetilde{\phi _u}={1}_{\{ \tau >u\}}\phi _{u}\).

Proof

$$ \mathbb {E}^{\mathscr {G}}_t\left[ \int _t^{\bar{\tau }}\phi _udu\right] =\mathbb {E}^{\mathscr {G}}_t\left[ \int _t^T{1}_{\{ \tau>t\}}{1}_{\{ \tau>u\}}\phi _udu\right] =\int _t^T\mathbb {E}^{\mathscr {G}}_t\left[ {1}_{\{ \tau>t\}}{1}_{\{ \tau >u\}}\phi _u\right] du $$

then by using Lemma 1 we have

$$ =\int _t^T{1}_{\{ \tau>t\}}\frac{\mathbb {E}^{\mathscr {F}}_t\left[ {1}_{\{ \tau>t\}}{1}_{\{ \tau>u\}}\phi _u\right] }{\mathbb {Q}[\tau>t \ | \mathscr {F}_t]}du={1}_{\{ \tau>t\}}\int _t^T\mathbb {E}^{\mathscr {F}}_t\left[ {1}_{\{ \tau >u\}}\phi _u\right] D(0,t,\lambda )^{-1} du $$

Now we choose an \(\mathscr {F}_u\)-measurable variable \(\widetilde{\phi _u}\) such that \({1}_{\{ \tau>u\}}\widetilde{\phi _u}={1}_{\{ \tau >u\}}\phi _{u}\) and obtain

$$\begin{aligned} \begin{aligned}&={1}_{\{ \tau>t\}}\int _t^T\mathbb {E}^{\mathscr {F}}_t\left[ \mathbb {E}^{\mathscr {F}}_u\left[ {1}_{\{ \tau>u\}}\right] \widetilde{\phi _u}\right] D(0,t,\lambda )^{-1}du\\&={1}_{\{ \tau>t\}}\int _t^T\mathbb {E}^{\mathscr {F}}_t\left[ D(0,u,\lambda )\widetilde{\phi _u}\right] D(0,t,\lambda )^{-1}du={1}_{\{ \tau >t\}}\mathbb {E}^{\mathscr {F}}_t\left[ \int _t^TD(t,u,\lambda )\widetilde{\phi }_udu\right] \end{aligned} \end{aligned}$$

where the penultimate equality comes from the fact that the default times are conditionally independent and, if we define \(\varLambda _X(u)=\int _0^u\lambda ^X_sds\) with \(X\in \{I,C\}\), we have that \(\tau _X=\varLambda _X^{-1}(\xi _X)\) with \(\xi _X\) mutually independent exponential random variables independent of \(\lambda ^X\). A similar result will enable us to deal with the default cash flow term. In fact we have the following (Lemma 3.8.1 in [2]):

Lemma 3

Suppose that \(\phi _u\) is an \(\mathscr {F}\)-predictable process. We consider two conditionally independent default times \(\tau _I,\tau _C\) generated by Cox processes with \(\mathscr {F}\)-intensity rates \(\lambda ^I_t,\lambda ^C_t\). If we denote \(\tau =\tau _C\wedge \tau _I\) we have:

$$\begin{aligned} \mathbb {E}^{\mathscr {G}}_t\left[ {1}_{\{ t<\tau<T\}} {1}_{\{ \tau _I<\tau _C\}}\phi _\tau \right] ={1}_{\{ \tau >t\}}\mathbb {E}_t^\mathscr {F}\left[ \int _t^TD(t,u,\lambda ^I+\lambda ^C)\lambda ^I_u\phi _udu\right] . \end{aligned}$$
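As a sanity check of Lemma 3 in the simplest case, one can take \(t=0\), \(\phi \equiv 1\) and constant intensities (assumptions made only for this illustration). The right-hand side then has the closed form \(\frac{\lambda ^I}{\lambda ^I+\lambda ^C}(1-e^{-(\lambda ^I+\lambda ^C)T})\), and the left-hand side is the probability that the investor defaults first and before T. The function names below are hypothetical.

```python
import math
import random

def lemma3_rhs(lam_I, lam_C, T):
    """Closed form of int_0^T exp(-(lam_I+lam_C) u) lam_I du for constant rates."""
    lam = lam_I + lam_C
    return lam_I / lam * (1.0 - math.exp(-lam * T))

def lemma3_lhs_mc(lam_I, lam_C, T, n_paths=200_000, seed=1):
    """Monte Carlo estimate of Q(tau < T, tau_I < tau_C) with independent
    exponential default times (constant intensities, phi identically 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_paths):
        tau_I = rng.expovariate(lam_I)
        tau_C = rng.expovariate(lam_C)
        # On {tau_I < tau_C} the first default time is tau_I.
        if tau_I < tau_C and tau_I < T:
            hits += 1
    return hits / n_paths
```

With, say, `lam_I=0.02`, `lam_C=0.03` and `T=5.0`, the two quantities should agree up to Monte Carlo error.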

Now we postulate a particular form for the default cash flow. More precisely, if we denote by \(\widetilde{V}_t\) the \(\mathscr {F}\)-adapted process such that

$$ {1}_{\{ \tau>t\}}\widetilde{V}_t={1}_{\{ \tau >t\}}V_t $$

then we define

$$ \theta _t=\epsilon _{t}-{1}_{\{ \tau _C<\tau _I\}}LGD_C(\epsilon _{t}-C_t)^++{1}_{\{ \tau _I<\tau _C\}}LGD_I(\epsilon _{t}-C_t)^-. $$

where LGD indicates the loss given default, typically defined as \(1-REC\), with REC the corresponding recovery rate, \((x)^+\) indicates the positive part of x, and \((x)^-=-(-x)^+\). The meaning of these flows is the following. Consider \(\theta _\tau \):

  • at the first-to-default time \(\tau \) we compute the close-out value \(\epsilon _{\tau }\);

  • if the counterparty defaults and we are net debtor, i.e. \(\epsilon _{\tau }-C_\tau \le 0\), then we have to pay the whole close-out value \(\epsilon _\tau \) to the counterparty;

  • if the counterparty defaults and we are net creditor, i.e. \(\epsilon _{\tau }-C_\tau >0\), then we are able to recover just a fraction of our credits, namely \(C_\tau +REC_C(\epsilon _\tau -C_\tau )=REC_C\epsilon _\tau +LGD_CC_\tau =\epsilon _\tau -LGD_C(\epsilon _\tau -C_\tau )\), where \(LGD_C\) indicates the loss given default and is equal to one minus the recovery rate \(REC_C\).

A similar reasoning applies to the case when the Investor defaults.
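The close-out rule can be written as a small function. The sketch below transcribes the definition of \(\theta \) literally, including the convention \((x)^-=-(-x)^+\) stated above; the function names and the sample numbers are hypothetical.

```python
def pos(x):
    """Positive part (x)^+."""
    return max(x, 0.0)

def neg(x):
    """(x)^- with the convention (x)^- = -(-x)^+ used in the text."""
    return -pos(-x)

def closeout_cash_flow(eps, C, lgd_C, lgd_I, counterparty_defaults_first):
    """Default cash flow theta, transcribed term by term from its definition."""
    exposure = eps - C
    if counterparty_defaults_first:
        # Indicator {tau_C < tau_I}: a loss applies only to the uncollateralized credit.
        return eps - lgd_C * pos(exposure)
    # Indicator {tau_I < tau_C}: the term in LGD_I with the sign written in the definition.
    return eps + lgd_I * neg(exposure)

# Counterparty defaults while we are net creditor (hypothetical numbers).
theta_creditor = closeout_cash_flow(eps=10.0, C=4.0, lgd_C=0.6, lgd_I=0.6,
                                    counterparty_defaults_first=True)
# Counterparty defaults while we are net debtor: the full close-out value is paid.
theta_debtor = closeout_cash_flow(eps=-3.0, C=0.0, lgd_C=0.6, lgd_I=0.6,
                                  counterparty_defaults_first=True)
```

In the first case the result should equal \(C_\tau +REC_C(\epsilon _\tau -C_\tau )=4+0.4\cdot 6\); in the second it should equal the close-out value itself.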

If we now change filtration, we obtain the following expression for \(V_t\) (where we omitted the tilde sign over the rates, see Remark 3):

$$\begin{aligned} \begin{aligned} V_t=&{1}_{\{ \tau>t\}}\mathbb {E}^{\mathscr {F}}_t\left[ \int _t^{T}D(t,u,r+\lambda )((f_u-c_u)C_u+(r_u-f_u)\widetilde{V}_u-(r_u-h_u)\widetilde{H}_u)du\right] \\&+{1}_{\{ \tau>t\}}\mathbb {E}^{\mathscr {F}}_t\left[ D(t,T,r+\lambda )\varPhi (S_T)+\int _t^{T}D(t,u,r+\lambda )\pi _udu\right] \\&+{1}_{\{ \tau >t\}}\mathbb {E}^{\mathscr {F}}_t\left[ \int _t^TD(t,u,r+\lambda )\widetilde{\theta }_u du\right] , \end{aligned} \end{aligned}$$
(8)

where, if we suppose \(\epsilon _t\) to be \(\mathscr {F}\)-predictable, we have (using Lemma 3):

$$\begin{aligned} \begin{aligned} \widetilde{\theta }_u&=\epsilon _{u}\lambda _u-LGD_C(\epsilon _{u}-C_u)^+\lambda ^C_u+LGD_I(\epsilon _{u}-C_u)^-\lambda ^I_u. \end{aligned} \end{aligned}$$
(9)

Remark 3

From now on we will omit the tilde sign over the rates \(f_u,h_u\). Moreover, we note that if a rate is of the form

$$\begin{aligned} x_t=x^+{1}_{\{ g(V_t,H_t,C_t)>0\}}+x^-{1}_{\{ g(V_t,H_t,C_t)\le 0\}} \end{aligned}$$

then on the set \(\{\tau >t\}\) it coincides with the rate

$$ \widetilde{x}_t=\widetilde{x}^+{1}_{\{ g(\widetilde{V}_t,\widetilde{H}_t,C_t)>0\}}+\widetilde{x}^-{1}_{\{ g(\widetilde{V}_t,\widetilde{H}_t,C_t)\le 0\}} $$

because \({1}_{\{ \tau>t\}}x^+{1}_{\{ g(V_t,H_t,C_t)>0\}}=\widetilde{x}^+{1}_{\{ \tau>t\}}{1}_{\{ g(V_t,H_t,C_t)>0\}}\), and on \(\{\tau >t\}\) we have \(V_t=\widetilde{V}_t\) and \(H_t=\widetilde{H}_t\), and hence \(g(V_t,H_t,C_t)>0\iff g(\widetilde{V}_t,\widetilde{H}_t,C_t)>0 \).

We note that expression (8) is of the form \(V_t={1}_{\{ \tau >t\}}\Upsilon \), meaning that \(V_t\) is zero on \(\{\tau \le t\}\) and that on the set \(\{\tau >t\}\) it coincides with the \(\mathscr {F}\)-measurable random variable \(\Upsilon \). But we already know a variable that coincides with \(V_t\) on \(\{\tau >t\}\), i.e. \(\widetilde{V}_t\). Hence we can write the following:

$$\begin{aligned} \begin{aligned} \widetilde{V}_t=&\mathbb {E}^{\mathscr {F}}_t\left[ \int _t^{T}D(t,u,r+\lambda )(\pi _u+ (f_u-c_u)C_u+(r_u-f_u)\widetilde{V}_u-(r_u-h_u)\widetilde{H}_u)du\right] \\&+\mathbb {E}^{\mathscr {F}}_t\left[ D(t,T,r+\lambda )\varPhi (S_T)+\int _t^TD(t,u,r+\lambda )\widetilde{\theta }_u du\right] .\\ \end{aligned} \end{aligned}$$
(10)

We now show a way to obtain a BSDE from Eq. (10); another possible approach (without default risk) is shown for example in [9]. We introduce the process

$$\begin{aligned} \begin{aligned} X_t=&\int _0^{t}D(0,u,r+\lambda )\pi _u du+\int _0^tD(0,u,r+\lambda )\widetilde{\theta }_udu\\&+\int _0^{t}D(0,u,r+\lambda )\left[ (f_u-c_u)C_u+(r_u-f_u)\widetilde{V}_u-(r_u-h_u)\widetilde{H}_u\right] du. \end{aligned} \end{aligned}$$
(11)

Now we can construct a martingale summing up \(X_t\) and the discounted value of the deal as in the following:

$$ D(0,t,r+\lambda )\widetilde{V}_t+X_t=\mathbb {E}^{\mathscr {F}}_t[X_T+D(0,T,r+\lambda )\varPhi (S_T)]. $$

So differentiating both sides we obtain:

$$\begin{aligned}&-(r_u+\lambda _u)D(0,u,r+\lambda )\widetilde{V}_udu+D(0,u,r+\lambda )d\widetilde{V}_u+dX_u\\&\quad =d\mathbb {E}^{\mathscr {F}}_u[X_T+D(0,T,r+\lambda )\varPhi (S_T)]. \end{aligned}$$

If we substitute for \(X_t\) we have that the expression:

$$ d\widetilde{V}_u+\left[ \pi _u-(r_u+\lambda _u)\widetilde{V}_u+\widetilde{\theta }_u+(f_u-c_u)C_u+(r_u-f_u)\widetilde{V}_u-(r_u-h_u)\widetilde{H}_u\right] du$$

is equal to

$$\frac{d\mathbb {E}^{\mathscr {F}}_u[X_T+D(0,T,r+\lambda )\varPhi (S_T)]}{D(0,u,r+\lambda )}. $$

The process \((\mathbb {E}^{\mathscr {F}}_t[X_T+D(0,T,r+\lambda )\varPhi (S_T)])_{t\ge 0}\) is clearly a closed \(\mathscr {F}\)-martingale, and hence

$$\begin{aligned} \int _0^tD(0,u,r+\lambda )^{-1}d\mathbb {E}^{\mathscr {F}}_u[X_T+D(0,T,r+\lambda )\varPhi (S_T)] \end{aligned}$$

is a local \(\mathscr {F}\)-martingale. Since it is also adapted to the Brownian-driven filtration \(\mathscr {F}\), by the martingale representation theorem we have

$$\begin{aligned} \int _0^tD(0,u,r+\lambda )^{-1}d\mathbb {E}^{\mathscr {F}}_u[X_T+D(0,T,r+\lambda )\varPhi (S_T)]=\int _0^tZ_udW_u \end{aligned}$$

for some \(\mathscr {F}\)-predictable process \(Z_u\). Hence we can write:

$$\begin{aligned} d\widetilde{V}_u+\left[ \pi _u-(f_u+\lambda _u)\widetilde{V}_u+\widetilde{\theta }_u+(f_u-c_u)C_u-(r_u-h_u)\widetilde{H}_u\right] du=Z_udW_u. \end{aligned}$$
(12)

4 Markovian FBSDE and PDE for \(\widetilde{V}_t\) and the Invariance Theorem

As it is, Eq. (12) is too general, so we make some simplifying assumptions in order to guarantee existence and uniqueness of a solution. First we assume a Markovian setting, and hence we suppose that all the processes appearing in (12) are deterministic functions of \(S_u,\widetilde{V}_u\) or \(Z_u\) and time. More precisely we assume that:

  • the dividend process \(\pi _u\) is a deterministic function \(\pi (u,S_u)\) of u and \(S_u\), Lipschitz continuous in \(S_u\);

  • the rates \(r,f^\pm ,c^\pm ,\lambda ^I,\lambda ^C\) are deterministic bounded functions of time;

  • the rate \(h_t\) is a deterministic function of time, and does not depend on the sign of H, namely \(h^+=h^-\), hence there is only one rate relative to the repo market of assets;

  • the collateral process is a fraction of the process \(\widetilde{V}_{u}\), namely \(C_u=\alpha _u\widetilde{V}_{u}\), where \(0\le \alpha _u\le 1\) is a function of time;

  • the close-out value \(\epsilon _t\) is equal to \(\widetilde{V}_t\) (this adds a source of nonlinearity with respect to choosing a risk-free closeout, see for example [6] and [5]);

  • the diffusion coefficient \(\sigma (t,S_t)\) of the underlying dynamic is Lipschitz continuous, uniformly in time, in \(S_t\);

  • we consider a delta-hedging strategy, and to this end we choose \(\widetilde{H}_t=S_t\frac{Z_t}{\sigma (t,S_t)}\); this choice derives from the fact that if we suppose \(\widetilde{V}_t=V(t,S_t)\) with \(V(\cdot ,\cdot )\in C^{1,2}\), then applying Itô’s formula and comparing it with Eq. (12) we have that \(\sigma (t,S_t)\partial _SV(t,S_t)=Z_t\).

Under our assumptions, Eq. (12) becomes the following FBSDE:

$$\begin{aligned} \begin{aligned} dS_t=&r_tS_tdt+\sigma (t,S_t)dW_t\\ S_0=&s \\ d\widetilde{V}_t=&-\underbrace{\left[ \pi _t+\widetilde{\theta }_t-\lambda _t\widetilde{V}_t+f_t\widetilde{V}_t(\alpha _t-1)-c_t(\alpha _t\widetilde{V}_t)-(r_t-h_t)S_t\frac{Z_t}{\sigma (t,S_t)}\right] }_{B(t,S_t,\widetilde{V}_t,Z_t)}dt+Z_tdW_t\\ \widetilde{V}_T=&\varPhi (S_T) \end{aligned} \end{aligned}$$
(13)

We want to obtain existence and uniqueness of the solution to the above-mentioned FBSDE and a related PDE. A possible choice is the following (see J. Zhang [15] Theorem 2.4.1 on page 41):

Theorem 1

Consider the following FBSDE on [0, T]:

$$\begin{aligned} \begin{aligned} dX^{q,x}_t&=\mu (t,X^{q,x}_t)dt+\sigma (t,X^{q,x}_t)dW_t \quad q<t\le T \\ X^{q,x}_t&=x \quad 0\le t\le q\\ dY^{q,x}_t&=-f(t,X^{q,x}_t,Y^{q,x}_t,Z^{q,x}_t)dt+Z^{q,x}_tdW_t\\ Y^{q,x}_T&=g(X^{q,x}_T) \end{aligned} \end{aligned}$$
(14)

If we assume that there exists a positive constant K such that

  • \(\sigma (t,x)^2\ge \frac{1}{K}\);

  • \(|f(t,x,y,z)-f(t,x',y',z')|+|g(x)-g(x')|\le K(|x-x'|+|y-y'|+|z-z'|)\);

  • \(|f(t,0,0,0)|+|g(0)|\le K\);

and moreover the functions \(\mu (t,x)\) and \(\sigma (t,x)\) are \(C^2\) with bounded derivatives, then Eq. (14) has a unique solution \((X^{q,x}_t,Y^{q,x}_t,Z^{q,x}_t)\) and \(u(t,x)=Y^{t,x}_t\) is the unique classical (i.e. \(C^{1,2}\)) solution to the following semilinear PDE

$$\begin{aligned} \begin{aligned} \displaystyle \partial _tu(t,x)+\frac{1}{2}\sigma (t,x)^2\partial ^2_xu(t,x)+\mu (t,x)\partial _xu(t,x)+f(t,x,u(t,x),\sigma (t,x)\partial _xu(t,x))&=0\\ u(T,x)&=g(x) \end{aligned} \end{aligned}$$
(15)

We cannot directly apply Theorem 1 to our FBSDE because, owing to the hedging term, \(B(t,s,v,z)\) is not Lipschitz continuous in s. But, since the hedging term is linear in \(Z_t\), we can move it from the drift of the backward equation to the drift of the forward one. More precisely consider the following:

$$\begin{aligned} \begin{aligned} dS^{q,s}_t&=h_tS^{q,s}_tdt+\sigma (t,S^{q,s}_t)dW_t \quad q<t\le T \\ S^{q,s}_t&=s \quad 0\le t \le q \\ dV^{q,s}_t&=-\underbrace{\left[ \pi _t+\theta _t-\lambda _tV^{q,s}_t+f_tV^{q,s}_t(\alpha _t-1)-c_t(\alpha _tV^{q,s}_t)\right] }_{B'(t,S^{q,s}_t,V^{q,s}_t)}dt+Z^{q,s}_tdW_t\\ V^{q,s}_T&=\varPhi (S_T^{q,s}). \end{aligned} \end{aligned}$$
(16)

Indeed, one can check that the assumptions of Theorem 1 are satisfied for this equation:

Theorem 2

If the rates \(\lambda _t,\ f_t,\ c_t,\ h_t,\ r_t\) are bounded, then \(|B'(t,s,v)-B'(t,s',v')|\le K(|s-s'|+|v-v'|)\) and \(|B'(t,0,0)|+|\varPhi (0)|\le K\). Hence if \(\sigma (t,s)\) is a positive \(C^2\) function with bounded derivatives, then the assumptions of Theorem 1 are satisfied and so Eq. (16) has a unique solution, and moreover \(V_t^{t,s}=u(t,s)\in C^{1,2}\) and satisfies the following semilinear PDE:

$$\begin{aligned} \begin{aligned} \partial _tu(t,s)+\frac{1}{2}\sigma (t,s)^2\partial ^2_su(t,s)+h_ts\partial _su(t,s)+B'(t,s,u(t,s))&=0\\ u(T,s)&=\varPhi (s) \end{aligned} \end{aligned}$$
(17)

Proof

We start by rewriting the term

$$ B'(t,s,v)=\pi _t(s)+\theta _t(v)+(f_t(\alpha _t-1)-\lambda _t-c_t\alpha _t)v. $$

Since the sum of two Lipschitz functions is itself a Lipschitz function we can restrict ourselves to analyzing the summands that appear in the previous formula. The term \(\pi _t\) is Lipschitz continuous in s by assumption. The \(\theta \) term and the \((f_t(\alpha _t-1)-\lambda _t-c_t\alpha _t)v\) term are continuous and piece-wise linear, hence Lipschitz continuous and this concludes the proof.
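The piecewise-linear structure of \(B'\) in v can be made concrete. The sketch below implements \(B'\) under the assumptions of this section (\(\epsilon =\widetilde{V}\), \(C=\alpha \widetilde{V}\)) with constant parameters, and checks numerically a simple, not necessarily sharp, Lipschitz bound obtained by adding the largest absolute slopes of the summands. All parameter values and names are hypothetical, and \(\pi \) is passed as a plain number because only the dependence on v is examined.

```python
import random

# Hypothetical constant parameters, for illustration only.
PARAMS = dict(lam_I=0.02, lam_C=0.03, lgd_I=0.6, lgd_C=0.6,
              f_plus=0.04, f_minus=0.025, c_plus=0.015, c_minus=0.01,
              alpha=0.5)

def pos(x):
    return max(x, 0.0)

def neg(x):
    return -pos(-x)  # convention (x)^- = -(-x)^+

def driver(v, pi=0.0, p=PARAMS):
    """B'(t,s,v) with epsilon = v and C = alpha * v; pi stands for pi_t(s)."""
    lam = p["lam_I"] + p["lam_C"]
    C = p["alpha"] * v
    theta = (lam * v - p["lgd_C"] * pos(v - C) * p["lam_C"]
             + p["lgd_I"] * neg(v - C) * p["lam_I"])
    f = p["f_plus"] if v - C > 0 else p["f_minus"]   # funding rate (2)
    c = p["c_plus"] if C > 0 else p["c_minus"]       # collateral rate (1)
    return pi + theta - lam * v + f * v * (p["alpha"] - 1.0) - c * C

def lipschitz_bound(p=PARAMS):
    """Sum of the largest absolute slopes in v of the summands of B'."""
    lam = p["lam_I"] + p["lam_C"]
    a = p["alpha"]
    return (2.0 * lam
            + (1.0 - a) * max(p["lgd_C"] * p["lam_C"], p["lgd_I"] * p["lam_I"])
            + (1.0 - a) * max(abs(p["f_plus"]), abs(p["f_minus"]))
            + a * max(abs(p["c_plus"]), abs(p["c_minus"])))
```

Each summand is continuous at \(v=0\) because the rate that switches there multiplies v itself, which is why the kinks do not break Lipschitz continuity.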

Note that the S-dynamics in (16) has the repo rate h as drift. Since in general h will depend on the future values of the deal, this is a source of nonlinearity and is at times represented informally with an expected value \(\mathbb {E}^h\) or a pricing measure \(\mathbb {Q}^h\), see for example [5] and the related discussion on operational implications for the case \(h=f\).

We now show that a solution to Eq. (13) can be obtained by means of the classical solution to the PDE (17). We start by considering the following forward equation, which is known to have a unique solution under our assumptions about \(\sigma (t,s)\):

$$\begin{aligned} dS_t=r_tS_tdt+\sigma (t,S_t)dW_t \quad S_0=s. \end{aligned}$$
(18)

We define \(V_t=u(t,S_t)\) and \(Z_t=\sigma (t,S_t)\partial _su(t,S_t)\). By Theorem 2 we know that \(u(t,s)\in C^{1,2}\) and by applying Itô’s formula and (17) we obtain:

$$\begin{aligned} \begin{aligned} dV_t&=du(t,S_t)\\&=\left( \partial _tu(t,S_t)+r_tS_t\partial _su(t,S_t)+\frac{1}{2}\sigma (t,S_t)^2\partial ^2_su(t,S_t)\right) dt+\sigma (t,S_t)\partial _su(t,S_t)dW_t\\&=\left( (r_t-h_t)S_t\partial _su(t,S_t)-B'(t,S_t,u(t,S_t))\right) dt+\sigma (t,S_t)\partial _su(t,S_t)dW_t\\&=\left( (r_t-h_t)S_t\frac{Z_t}{\sigma (t,S_t)}-\pi _t(S_t)-\theta _t(V_t)-(f_t(\alpha _t-1)-\lambda _t-c_t\alpha _t)V_t\right) dt+Z_tdW_t \end{aligned} \end{aligned}$$

Hence we found the following:

Theorem 3

(Solution to the Valuation Equation) Let \(S_t\) be the solution to Eq. (18) and \(u(t,s)\) the classical solution to Eq. (17). Then the process \((S_t,u(t,S_t),\sigma (t,S_t)\partial _su(t,S_t))\) is the unique solution to Eq. (13).

Proof

From the reasoning above we found that \((S_t,u(t,S_t),\sigma (t,S_t)\partial _su(t,S_t))\) solves Eq. (13). Finally from the seminal result of [14] we know that if there exist \(K>0\) and \(p\ge \frac{1}{2}\) such that:

  • \(|\mu (t,x)-\mu (t,x')|+|\sigma (t,x)-\sigma (t,x')|\le K|x-x'|\)

  • \(|\mu (t,x)|+|\sigma (t,x)|\le K(1+|x|)\)

  • \(|f(t,x,y,z)-f(t,x,y',z')|\le K(|y-y'|+|z-z'|)\)

  • \(|g(x)|+|f(t,x,0,0)|\le K(1+|x|^p)\)

then the FBSDE (14) has a unique solution. Since we have to check the Lipschitz continuity just for y and z we can verify that Eq. (13) satisfies the above-mentioned assumptions and hence has a unique solution.

Remark 4

Since we proved that \(V_t=u(t,S_t)\) with \(u(t,s)\in C^{1,2}\), the reasoning we used when saying that \(\widetilde{H}_t=S_t\frac{Z_t}{\sigma (t,S_t)}\) represents a delta hedge is actually more than a heuristic argument.

Moreover, since (17) does not depend on the risk-free rate \(r_t\), we can state the following:

Theorem 4

(Invariance Theorem) If we are under the assumptions at the beginning of Sect. 4 and we assume that we are backing our deal with a delta hedging strategy, then the price \(V_t\) can be calculated via the semilinear PDE (17) and does not depend on the risk-free rate r(t).
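To illustrate the invariance numerically, the following is a minimal explicit finite-difference sketch for PDE (17), not the numerical method of the cited works. It assumes a lognormal diffusion coefficient \(\sigma (t,s)=\sigma s\) with constant \(\sigma \) (chosen for convenience, even though it is degenerate at \(s=0\) and so falls outside the hypotheses of Theorem 1), works in the log-price variable, and freezes the boundary values at the payoff. All names and parameter values are hypothetical. The solver takes the repo rate h and the driver \(B'\) as inputs, and no risk-free rate appears anywhere in it.

```python
import math

def solve_pde17(payoff, driver, h, sigma, T, s0, half_width=1.5, nx=300, nt=1000):
    """Explicit scheme for u_t + 0.5 sigma^2 s^2 u_ss + h s u_s + B'(t,s,u) = 0,
    written in x = log(s); returns the approximation of u(0, s0)."""
    dx = 2.0 * half_width / nx
    dt = T / nt
    assert sigma ** 2 * dt <= dx ** 2, "time step too large for the explicit scheme"
    s = [math.exp(math.log(s0) - half_width + i * dx) for i in range(nx + 1)]
    u = [payoff(si) for si in s]              # terminal condition u(T, s)
    diff = 0.5 * sigma ** 2 / dx ** 2         # weight of the second difference
    conv = (h - 0.5 * sigma ** 2) / (2.0 * dx)  # weight of the central first difference
    for n in range(nt):
        t = T - n * dt                        # time level of the known values
        new = u[:]                            # boundary nodes stay at the payoff
        for i in range(1, nx):
            uxx = u[i + 1] - 2.0 * u[i] + u[i - 1]
            ux = u[i + 1] - u[i - 1]
            new[i] = u[i] + dt * (diff * uxx + conv * ux + driver(t, s[i], u[i]))
        u = new
    return u[nx // 2]                         # grid node located at s0

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical example: call option, no collateral, no default risk.
F_PLUS, F_MINUS, H_REPO, SIGMA, T, K, S0 = 0.04, 0.02, 0.01, 0.2, 1.0, 100.0, 100.0

def funding_driver(t, s, v):
    f = F_PLUS if v > 0 else F_MINUS          # funding rate (2) with C = 0
    return -f * v

fd_price = solve_pde17(lambda s: max(s - K, 0.0), funding_driver, H_REPO, SIGMA, T, S0)

# The value stays nonnegative here, so the problem is linear with f = F_PLUS and
# has a Black-Scholes-type solution with growth rate h and discount rate f.
d1 = (math.log(S0 / K) + (H_REPO + 0.5 * SIGMA ** 2) * T) / (SIGMA * math.sqrt(T))
d2 = d1 - SIGMA * math.sqrt(T)
reference = math.exp(-F_PLUS * T) * (S0 * math.exp(H_REPO * T) * norm_cdf(d1) - K * norm_cdf(d2))
```

In this special case the finite-difference value can be compared with the closed-form reference, in which only the repo rate h and the funding rate f enter.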

This invariance result shows that even when starting from a risk-neutral valuation theory, the risk-free rate disappears from the nonlinear valuation equations. A discussion on consequences of nonlinearity and invariance on valuation in general, on the operational procedures of a bank, on the legitimacy of fully charging the nonlinear value to a client, and on the related dangers of overlapping valuation adjustments is presented elsewhere, see for example [3, 5] and references therein.