Optimal Portfolio Selection of Mean-Variance Utility with Stochastic Interest Rate

To address how investors in financial markets allocate wealth when the interest rate is stochastic and governed by a nested stochastic differential equation (SDE), this paper employs the Nash equilibrium theory of subgame perfect equilibrium strategies and proposes an extended Hamilton-Jacobi-Bellman (HJB) equation to analyze optimal control of a financial system with a stochastic interest rate and state-dependent risk aversion (SDRA) mean-variance utility. By solving the nonlinear partial differential equations (PDEs) derived from the extended HJB equation, we obtain analytical solutions for the optimal investment strategies under time inconsistency. Finally, numerical examples illustrate the results and show how stochastic (short-term) interest rates and risk aversion affect the optimal control strategies.


Introduction
Portfolio optimization, an important topic in financial markets, has been studied by a vast number of researchers since the first publication by Markowitz [1]. Given a utility function and system parameters, portfolio optimization determines the values of the control parameters that maximize the final utility. Previous results in this field can be cataloged by the utility used, such as mean-variance utility [2], endogenous habit formation [3], hyperbolic discounting [4], and the classic constant relative risk aversion (CRRA) [5]. Among these, optimal asset allocation under mean-variance utility is one of the most interesting and thought-provoking topics in classical financial economics. Under this framework, the multiperiod optimal asset allocation problem is time-inconsistent, which prohibits the application of the classical Bellman optimality principle. More precisely, time inconsistency means that the optimal control depends on the starting point: a control strategy that is optimal on the full time interval $[0, T]$ may not be optimal on a subinterval $[t, T]$. Currently, there are two widely used ways of handling this type of problem.
One common method is the so-called buy-and-hold precommitted strategy. Kydland and Prescott [6] explain this tactic from an economic point of view. Richardson [7] is credited with being the first to study the portfolio optimization problem with a mean-variance utility function in continuous time. He was soon followed by Bajeux-Besnainou and Portait [8]. They used a single stock with a constant risk-free rate in their studies. With the same motivation, Li and Ng [9] recast the classical mean-variance problem as a stochastic linear-quadratic control problem. Further research, such as adding transaction costs and extending and improving the models, has been done by Dai et al. [10], Xue et al. [11], Lim [12], Bielecki et al. [13], Yao et al. [14], and Xia [15].
Recently, researchers have found that volatility is better described by a stochastic model than by a deterministic one (see Christoffersen et al. [16], Heston [17], Li et al. [18], Liu et al. [19]). Commonly used optimal asset allocation strategies are mostly characterized by the Hamilton-Jacobi-Bellman (HJB) equation. A more thorough way of analysing the time-inconsistent problem is to place it in a game-theoretic context, with the aim of making the model more realistic. In brief, the control process over multiple periods is treated as a game in which each period is a player. Each player represents the preferences of the current period, and later players are the incarnations of future tastes. The game is solved when we find the subgame perfect Nash equilibrium point. As mentioned already, Markowitz [1] was the pioneer of this field, and he can also be credited with introducing the game theory approach to the mean-variance portfolio optimization problem. Other researchers who have used this approach are Vieille and Weibull [20], Goldman [21], Peleg and Yaar [22], Pollak [23], Wu and Zhuang [24], and Krusell and Smith [25]. Basak and Chabakauri [26], in particular, applied game theory to the continuous-time portfolio optimization problem. Bjork and Murgoci [27] developed the approach further, in both the modeling and the economic interpretation, by extending the analysis to objective functions other than the mean-variance one. Along with the work of Bjork et al. [4], Bjork and Murgoci [27] study mean-variance portfolio optimization by relaxing the CRRA assumption into a dynamic risk aversion that depends on current wealth, which is more economically reasonable.
In this paper, we follow the work of Bjork et al. [4], with the aim of building a more realistic financial model. We achieve this by relaxing the assumption of a constant interest rate and replacing it with a stochastic (short-term) interest rate that is nested within the SDE of [28]. The parameters of the stochastic processes are determined from classical empirical data. For this problem, we then present a generalized extension of the HJB equation, applying the control theory for time-inconsistent problems developed by Bjork and Murgoci [27]. Finally, we provide a numerical illustration of our results.
The remainder of the article is organized as follows. In Section 2, we describe the financial market setting and the structure of the mean-variance optimal asset portfolio model with a stochastic (short-term) interest rate under state-dependent risk aversion (SDRA). An extension of the HJB equation is also derived for the stochastic (short-term) interest rate. In Section 3, closed-form analytical solutions are given for the optimal investment strategy in several situations. The numerical study is presented in Section 4 with some comparisons, and a conclusion is given in Section 5.

The Basic Framework
2.1. The Model. Let the model be composed of two assets, a stock and a bond, with the interest rate of the bond governed by a stochastic process. The bond $F_t$ follows an SDE driven by the interest rate, and the stock price $S_t$ and the stochastic interest rate $r_t$ are modeled by stochastic processes in which the parameter $\mu$ is the drift rate of $S_t$ and $\sigma$ is the volatility of $S_t$. We assume that the Brownian motions $(B_t^r, B_t^s)$ are correlated, with correlation coefficient $\rho_{sr}$. Stochastic term structure models of the interest rate, both single-factor and multifactor, are nested within the SDE (4). With suitable restrictions on the six parameters $\alpha$, $\beta$, $\varepsilon$, $\theta$, $\eta$, and $\rho_{sr}$, the SDE (4) can describe an array of interest rate processes, including the well-known models of Merton [29], Vasicek [30], and Cox, Ingersoll, and Ross (CIR) [31].
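To make the market setup concrete, the following minimal sketch simulates one pair of correlated paths for the short rate and the stock with an Euler-Maruyama scheme. It is an illustration under stated assumptions, not the paper's nested SDE (4): the short rate is given Vasicek-type dynamics (one of the special cases named above), and the function name `simulate_paths`, the parameters `kappa` and `theta` as used here, and all default values are hypothetical.

```python
import math
import random


def simulate_paths(r0=0.03, s0=1.0, mu=0.08, sigma=0.2,
                   kappa=0.5, theta=0.03, eta=0.01, rho_sr=-0.1,
                   T=1.0, n_steps=250, seed=0):
    """Euler-Maruyama paths for a Vasicek-type short rate and a stock.

    Assumed dynamics (illustrative, not the paper's nested SDE):
        dr_t = kappa * (theta - r_t) dt + eta dB^r_t
        dS_t = mu * S_t dt + sigma * S_t dB^s_t
    with corr(dB^r_t, dB^s_t) = rho_sr.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    r, s = [r0], [s0]
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        db_r = sqrt_dt * z1
        # Mix z1 and z2 so that the two increments have correlation rho_sr.
        db_s = sqrt_dt * (rho_sr * z1 + math.sqrt(1.0 - rho_sr ** 2) * z2)
        r.append(r[-1] + kappa * (theta - r[-1]) * dt + eta * db_r)
        s.append(s[-1] + mu * s[-1] * dt + sigma * s[-1] * db_s)
    return r, s


r_path, s_path = simulate_paths()
```

Setting `eta=0.0` and `sigma=0.0` switches off both noise terms, which gives a quick sanity check: the rate stays at its starting level when `r0 == theta`, and the stock grows deterministically.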
Here, the total wealth is denoted by $W_t$, with initial wealth $W_0$, and, as mentioned above, this wealth is apportioned between the two asset classes. The amount of money invested in the stock $S_t$ is denoted by $u_t$, and the rest is invested in the bond. This gives the stochastic process of the total wealth in (5).

2.2. The Optimal Asset Portfolio Selection Model. Let $U[0, T] = \{u_t : [0, T] \to \mathbb{R}\}$ be the set of all admissible control strategies on the time period $[0, T]$. We define the mean-variance portfolio selection problem as the search for the strategy that maximizes the mean-variance utility at the terminal time $T$. The mean-variance utility has a risk aversion that depends on wealth, denoted by $\gamma(w)$. The objective function $J$ in (6) thus depends on time $t$, interest rate $r$, wealth $w$, and the strategy $u$. The goal is to find the optimal admissible strategy $\hat{u}$ for which the mean-variance utility function reaches its maximum at the terminal time $T$. Simply put, we want to optimize expected returns while accounting for the risk involved. This defines the value function in (7).

Journal of Function Spaces
We can rewrite $J(t, r, w, u)$ in the form (9), in terms of two functions $F$ and $G$.

2.3. The Game Theoretic Framework. There are various approaches to solving time-inconsistent problems. One of the best suited to our problem is the subgame perfect Nash equilibrium theory of Bjork et al. [4]. Briefly, the theory is as follows: suppose that there are several players and that each player has a control $u$. If the player at a later time $t$ chooses the optimal control $u(\cdot, t)$, we assume that the player at time $l$, $l \le t$, also uses the control $u(\cdot, l)$ as the optimal one. Using this theory, we can state the equilibrium control formally: if the equilibrium condition is satisfied for $(t, r, w) \in [0, T] \times \mathbb{R}^n \times \mathbb{R}^n$ and $u \in \mathbb{R}^k$, we define $\hat{u}$ as the equilibrium control, and hence we obtain the equilibrium value function in the form of (7). Because the traditional Bellman optimality principle cannot be used to solve optimal control problems with time inconsistency, extended HJB equations are formulated to solve this problem.

Definition 1. Let $A$ be an infinitesimal generator; then, for any fixed control $u \in U$, the associated infinitesimal generator is denoted by $A^u$.

Following the analysis of Bjork and Murgoci [27], we define the following extended HJB equations for the Nash equilibrium problem with a stochastic interest rate.
Theorem 2. The extended HJB equation (14) holds together with its boundary conditions. The derivation of Theorem 2 is given in Appendix A.
Remark 3. In the above theorem, $f(\cdot)$ denotes $f(t, r, w, y)$ and $f^y(\cdot)$ denotes $f^y(t, r, w)$. The function $f(\cdot)$ is not the same as $f^y(\cdot)$: $f(\cdot)$ is a function of the four variables $t, r, w, y$, while $f^y(\cdot)$ is a function of the three variables $t, r, w$. If the parameter $y$ is fixed, however, $f(t, r, w, y)$ is equal to $f^y(t, r, w)$.
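The rewriting of $J$ in terms of $F$ and $G$ can be checked numerically. The sketch below assumes the standard mean-variance forms used in the framework of Bjork et al. [4], namely $F(w, y) = y - \frac{\gamma(w)}{2} y^2$ and $G(w, m) = \frac{\gamma(w)}{2} m^2$, so that $E[F(w, W_T)] + G(w, E[W_T]) = E[W_T] - \frac{\gamma(w)}{2}\mathrm{Var}[W_T]$. These forms, the function names, and the sampled terminal wealth are assumptions made for illustration; the risk aversion $\gamma(w) = \gamma/w$ is the "natural" form the paper adopts in its numerical study.

```python
import random


def gamma_natural(w, gamma=2.0):
    """'Natural' state-dependent risk aversion gamma(w) = gamma / w."""
    return gamma / w


def mean_variance_direct(samples, w):
    """E[W_T] - gamma(w)/2 * Var[W_T], using the population variance."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean - 0.5 * gamma_natural(w) * var


def mean_variance_decomposed(samples, w):
    """E[F(w, W_T)] + G(w, E[W_T]) with the assumed F and G."""
    n = len(samples)
    g = gamma_natural(w)
    mean = sum(samples) / n
    expected_f = sum(x - 0.5 * g * x * x for x in samples) / n  # E[F(w, W_T)]
    big_g = 0.5 * g * mean * mean                                # G(w, E[W_T])
    return expected_f + big_g


rng = random.Random(1)
w0 = 1.0
# Hypothetical terminal-wealth samples, used only to exercise the identity.
terminal_wealth = [w0 * (1.05 + 0.2 * rng.gauss(0.0, 1.0)) for _ in range(10_000)]
direct = mean_variance_direct(terminal_wealth, w0)
decomposed = mean_variance_decomposed(terminal_wealth, w0)
```

The two numbers agree up to floating-point rounding. The term $G(w, E[W_T])$ is nonlinear in the expectation, which is the source of the time inconsistency discussed above.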

The Main Results
In this paper, we describe risk aversion as the willingness to accept a loss (or profit) when exposed to future uncertainty. A risk-averse investor is more cautious about their wealth when the payoff at a later date is unknown. Such a person will opt to place their wealth in a low-income-generating investment with high certainty, for instance, a checkable bank account, rather than in a high-income-generating stock with high uncertainty. An investor with constant risk aversion has the same investment attitude in all scenarios and at all times. Earlier research assumed that risk aversion is constant. This implies that the equilibrium dollar amount invested in the risky asset is independent of the investor's current wealth. We relax this assumption to be more economically realistic and impose an SDRA term. Further, we treat the interest rate as a stochastic process so that it reflects the real economy, rather than as a constant rate with no relationship to other economic factors.

3.1. Mean-Variance Utility with Stochastic Interest Rate under SDRA. We proceed to give a more realistic formulation in which the SDRA $\gamma(w)$, together with a stochastic interest rate, yields an investment strategy that depends on current wealth. We start from the extended HJB equation (14) in Theorem 2 and simplify the resulting expression (17). For completeness, we recall the dynamics of the processes. From these equations, we have

$$V_{ww} = f_{ww} + 2f_{wy} + f_{yy} + \frac{\gamma_{ww}}{2} g^2 + 2\gamma_w g g_w + \gamma g_w^2 + \gamma g g_{ww},$$
$$V_{rr} = f_{rr} + \gamma g_r^2 + \gamma g g_{rr},$$
$$V_{wr} = f_{wr} + f_{yr} + \gamma_w g g_r + \gamma g_r g_w + \gamma g g_{wr}.$$

Substituting all of the equations above into (19), we obtain the optimization problem (25), which requires $\left.\frac{dZ}{du}\right|_{u=\hat{u}} = 0$. Solving the resulting equation (27), we obtain the optimal control $\hat{u}$ in (28).

Remark 4. If $\rho_{sr} = 0$ and $\eta = 0$, the bond process becomes a risk-free bond, and our result (28) reduces to the result of Zhang et al. [32] when $\mu - r_t = \beta$.
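The second derivatives of $V$ listed in this subsection are consistent with a value function of the form $V(t, r, w) = f(t, r, w, w) + \frac{\gamma(w)}{2} g(t, r, w)^2$. This form is an assumption here, inferred from those identities and from the mean-variance setup of [4]. Under it, the chain rule gives the intermediate first-order step:

```latex
\begin{aligned}
V(t, r, w) &= f(t, r, w, w) + \frac{\gamma(w)}{2}\, g(t, r, w)^2, \\
V_w &= f_w + f_y + \frac{\gamma_w}{2}\, g^2 + \gamma\, g\, g_w, \\
V_r &= f_r + \gamma\, g\, g_r .
\end{aligned}
```

Differentiating $V_w$ once more in $w$ produces the two mixed terms $f_{wy} + f_{yw} = 2f_{wy}$ and the two copies of $\gamma_w g g_w$ that appear in $V_{ww}$. Differentiating $V_r$ in $r$ and $V_w$ in $r$ gives $V_{rr}$ and $V_{wr}$, respectively.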

3.2. Mean-Variance Utility with Stochastic Interest Rate under CRRA. Here, we specialize to the case where the investor has the same risk tolerance at all times, but the interest rate is still stochastic. The wealth dynamics are the same as in (5), with a correspondingly modified expected value function. Using the same method as for (28), we obtain the corresponding optimal control $\hat{u}$.

3.3. Mean-Variance Utility with Constant Interest Rate under CRRA. We consider the simplest situation, in which neither extension is present. That is, the investor has the same risk tolerance at all times, and the interest rate does not change over the life of the investment. The dynamics of the wealth process are the same as in (5), with the expected value function $J(t, w, U) = E_{t,w}[F(W_T^U)] + G(E_{t,w}[W_T^U])$ of (9). The optimal control $\hat{u}$ can again be obtained in closed form.
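For orientation, the case of constant risk aversion and constant interest rate has a well-known closed form in the literature on time-consistent mean-variance control (for example, [26]): with objective $E[W_T] - \frac{\gamma}{2}\mathrm{Var}[W_T]$, the equilibrium dollar amount held in the stock is $\hat{u}(t) = \frac{\mu - r}{\gamma \sigma^2} e^{-r(T - t)}$. The sketch below evaluates this benchmark. That it coincides with the paper's own expression is an assumption, and the function name and parameter values are hypothetical.

```python
import math


def equilibrium_dollar_amount(t, T, mu, r, sigma, gamma):
    """Classical constant-parameter equilibrium amount held in the stock.

    u_hat(t) = (mu - r) / (gamma * sigma**2) * exp(-r * (T - t))
    """
    return (mu - r) / (gamma * sigma ** 2) * math.exp(-r * (T - t))


# Hypothetical parameter values, chosen only for illustration.
u_start = equilibrium_dollar_amount(t=0.0, T=1.0, mu=0.08, r=0.03, sigma=0.2, gamma=2.0)
u_end = equilibrium_dollar_amount(t=1.0, T=1.0, mu=0.08, r=0.03, sigma=0.2, gamma=2.0)
```

In this benchmark the amount does not depend on current wealth, which is the feature that the SDRA formulation of Section 3.1 is designed to remove.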

Numerical Study
Using dimensional analysis and applying the mean-variance utility function to returns instead of wealth itself, we naturally arrive at an explicit form of the state-dependent risk aversion function. This is called the "natural" risk function and is expressed as $\gamma(w) = \gamma/w$. In an economic sense, this says that the wealthier the investor is, the more willing the investor is to invest in risky assets. We can then rewrite the nonlinear PDE for the mean-variance portfolio optimization as (34). To solve (34) numerically by the finite difference method, we first make the ansatz (36). Substituting (36) into the optimal control equation (28), we obtain $\hat{u} = \tilde{u} w$, with the optimal control $u^*$ given by the resulting expression. Thus, the two-dimensional nonlinear PDE reduces to a one-dimensional nonlinear PDE,

$$\dot{a} + \tilde{a}_1 a + b_1 a_r + b_2 a_{rr} + c\,a_r = 0.$$

This problem can be solved numerically using the finite difference method, with the nonlinear term discretized by fixed-point iteration. We use this numerical method in the analysis that follows.

For convenience, we hold some of the parameters constant while still preserving generality. The values of the parameters are provided in Table 1; for example, the correlation coefficient between the two assets is $\rho_{sr} = -0.1$. Figure 1 shows the optimal investment ratio $u^*$ from time $t = 0$ to the end of the period $t = T$ under different CRRA values. During the investment period, the investor gradually reduces the investment in the risky asset over the long run at every level of risk aversion. However, a less risk-averse investor increases their holdings in the risky asset. Conversely, as risk aversion grows, the investment ratio in risky assets decreases, which means that the investor allocates less money to the stock.
The optimal investment ratio versus the interest rate under different risk aversions is depicted in Figure 2. We can clearly see that the amount of money invested in the stock decreases as the interest rate increases. This is consistent with our intuition. We also find that, under a small risk aversion, a change in the interest rate $r$ causes a more significant change in the amount of money invested in the stock, which means that under small risk aversion, the optimal control is more sensitive to the interest rate.
We also plot the total wealth of investors as the interest rate ranges from $r = 0.01$ to $r = 0.05$ for different values of the SDRA $\gamma(w)$. This is shown in Figure 3. The figure is captured at the terminal time $T$ and gives only a qualitative explanation. From the figure, we can see two trends. First, when risk aversion is small, wealth decreases as the interest rate increases: when the interest rate rises, less of the investor's wealth is placed in the risky stock, as there is a chance that the loss on the risky investment exceeds the bond return. Thus, total wealth decreases. Second, when risk aversion is large, investment in risky assets is very limited, so total wealth increases because of the relatively large bond return. The increase is very mild, however, since the return on the bond is not very high, which is reasonable.

Figure 4 shows the evolution of the investor's wealth from time $t = 0$ to the terminal time $t = T$ under different values of the SDRA $\gamma(w)$. In general, all wealth processes have an increasing trend. However, this increase fluctuates because both the interest rate and the stock are stochastic. Over a short time period, wealth increases slowly with frequent fluctuations, and in the long run, the increasing trend is more stable under the optimal control strategy. Moreover, when the CRRA value $\gamma$ is small, the investor's wealth process fluctuates relatively more. When $\gamma$ gets larger, the wealth process is relatively stable. This means that riskier assets sometimes bring more return.
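As a rough illustration of the numerical approach used in this section (finite differences with fixed-point iteration on the nonlinear term), the sketch below solves a toy one-dimensional PDE backward in time. The coefficients of the paper's reduced PDE are not reproduced here, so everything specific is an assumption: the toy equation $a_t + b_1 a_r + b_2 a_{rr} - k a^2 = 0$ with $a(T, r) = 1$, the zero-gradient boundaries, the implicit Euler time stepping, and all names and parameter values.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal linear system."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_toy_pde(b1=0.0, b2=0.01, k=0.5, T=1.0, r_min=0.0, r_max=0.1,
                  n_r=21, n_t=200, tol=1e-12, max_iter=50):
    """Implicit backward-in-time scheme for the toy PDE
        a_t + b1 * a_r + b2 * a_rr - k * a**2 = 0,  a(T, r) = 1,
    with zero-gradient boundaries in r. The nonlinear term k * a**2 is
    linearised as k * a_iter * a and updated by fixed-point iteration.
    """
    h = (r_max - r_min) / (n_r - 1)
    dt = T / n_t
    a = [1.0] * n_r  # terminal condition a(T, r) = 1
    for _step in range(n_t):
        a_old = a
        a_iter = a_old[:]
        for _it in range(max_iter):
            lower = [0.0] * n_r
            diag = [1.0] * n_r
            upper = [0.0] * n_r
            rhs = a_old[:]
            for i in range(1, n_r - 1):
                lower[i] = -dt * (b2 / h ** 2 - b1 / (2 * h))
                upper[i] = -dt * (b2 / h ** 2 + b1 / (2 * h))
                diag[i] = 1.0 + dt * (2 * b2 / h ** 2 + k * a_iter[i])
            # Zero-gradient boundaries: a[0] = a[1] and a[-1] = a[-2].
            upper[0], rhs[0] = -1.0, 0.0
            lower[-1], rhs[-1] = -1.0, 0.0
            a_new = solve_tridiagonal(lower, diag, upper, rhs)
            converged = max(abs(x - y) for x, y in zip(a_new, a_iter)) < tol
            a_iter = a_new
            if converged:
                break
        a = a_iter
    return a


a_at_time_zero = solve_toy_pde()
```

With constant terminal data and zero-gradient boundaries, the toy solution stays flat in $r$ and follows $a(t) = 1/(1 + k(T - t))$, which gives a simple check on the scheme before it is applied to a PDE with realistic coefficients.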

Conclusions
In this paper, we derive an analytical solution for a portfolio optimization problem with a stochastic (short-term) interest rate, governed by the mean-variance utility function with SDRA. We use the Nash equilibrium of a subgame perfect strategy to derive the equilibrium value function and the equilibrium control policy, and we provide analytical expressions for both. As a result, we present the closed-form solution for the associated portfolio selection model and analyse how the stochastic (short-term) interest rate and the investor's risk aversion affect the optimal control policy. This is done in a simplified financial setting in which the investor has a "natural risk aversion." Finally, our numerical results show that, in the presence of a stochastic (short-term) interest rate, the portfolio optimization problem is more economically sensible than the one that assumes a constant interest rate.

Appendix A. Derivation of Theorem 2
From the Hamilton-Jacobi-Bellman equation in [4] and the objective function (6), we obtain (A.1) and consequently (A.2), where the functions $F$ and $G$ have the forms described in (9). When the time satisfies $l > t$, we have (A.3), which can be written more simply as (A.6). Applying expectations to both sides of the equation and combining the result with the definition (A.2), we obtain (A.8). Iterating the condition step by step, we obtain (A.9) and (A.10). Substituting (A.9) and (A.10) back into (A.8), we get (A.12). From our proposed problem (A.1) and the control law as defined in the traditional research, the control $U$ coincides with the equilibrium law $\hat{u}$ on $[l, T]$, so that

$$\begin{aligned}
J(l, W_l, R_l, \hat{u}) &= Q(l, W_l, R_l), \\
f^U(l, W_l, R_l, y) &= f(l, W_l, R_l, y), \\
g^U(l, W_l, R_l) &= g(l, W_l, R_l).
\end{aligned} \tag{A.13}$$

Thus, the optimization problem (A.12) can be written as (A.14). Using the operator notation in (A.15) for simplicity, we finally obtain the following extended HJB equation with a stochastic interest rate, in the light of the work of [4]:

$$\sup_{u \in U} \left\{ A^u V(t, r, w) - A^u f(t, r, w, w) + A^u f^w(t, r, w) - A^u (G \ast g)(t, r, w, w) + H^u g(t, r, w, w) \right\} = 0, \quad 0 \le t \le T. \tag{A.16}$$

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this paper.