An empirical study on discrete optimization models for portfolio selection

In this paper, we investigate four discrete optimization models 
arising from single period portfolio selection: Mean-variance model, 
mean-absolute-deviation model, minimax model and conditional 
Value-at-Risk model. These four models are established by 
considering the minimal transaction unit and the cardinality 
constraint in real-world investment practice. Extensive 
computational results are reported to compare the features of the 
models. We evaluate the performance of the models by analyzing the 
in-sample and out-of-sample numerical results with real data from 
Shanghai Stock Exchange.


1. Introduction. Portfolio selection seeks the best allocation of investment among a basket of assets. In his pioneering work [16], Markowitz proposed the famous mean-variance (MV) model for portfolio selection. In the MV model, the expected return and the risk of investment are measured by the mean and variance of the return, respectively. The MV model is thus a quadratic programming problem. Over the last five decades, the classical mean-variance model has been extended and improved. A major obstacle in the application of the mean-variance model is the computational complexity of estimating the covariance matrix required by the MV model. To overcome this difficulty, Sharpe [21] proposed a single-factor model which significantly simplifies the computation of the parameters in the MV model. The single-factor model was later extended to multi-factor models (see [18]). Konno and Yamazaki [11] proposed an alternative portfolio optimization model, the mean-absolute deviation (MAD) model, in which the absolute deviation from the mean is employed to measure the risk of the investment. This leads to a linear programming problem for portfolio selection. It is shown that the MAD model is equivalent to the MV model under the assumption that the returns are multivariate normally distributed. Empirical results on the application of the MAD model to internationally diversified investment using a stock-bond integrated model were reported in [8]. Young [24] proposed a minimax (MM) model based on the principle of maximizing the minimum return of the portfolio over all observed historical periods. The minimax rule for portfolio selection was also investigated in [4]. Recently, the conditional Value-at-Risk (CVaR), also known as mean excess loss or worst conditional expectation, has been used as a risk measure in portfolio selection (see [19] [20]).
Under certain assumptions, the CVaR portfolio selection model can be reformulated as a linear programming model, which enhances its application in financial practice.
An implicit assumption in most portfolio selection models is the infinite divisibility of assets: the decision variables of the portfolio are proportions of the total amount of investment. In practice, however, investors are often restricted to purchasing securities only in multiples of a minimum transaction unit, i.e., integer numbers of assets taken as the basic unit of transaction. This is the case for many stock markets such as those of China, Hong Kong, Japan and many European countries. Another important discrete feature of portfolio selection models with real features is the cardinality constraint, which imposes a maximum number of securities in the portfolio. These discrete constraints lead to discrete optimization models for portfolio selection.
Mansini and Speranza [14] [15] showed that portfolio selection models with minimum transaction lots are important in some European stock exchange markets. Moreover, for small-scale funds, rounding the optimal portfolio with fractional components may result in a significant distortion of the risk-return structure (see [12]). Heuristic methods for various portfolio selection models with cardinality constraint and minimum transaction unit were investigated in [5] [6] [14]. The MV models with cardinality constraint were discussed in [2] [3]. Branch-and-bound methods for MAD models or mean semi-deviation models with transaction cost and minimum transaction unit have been studied extensively in [9] [10] [12] [15]. Li et al. [13] proposed an exact method for the discrete MV model with cardinality constraint and minimum transaction unit. Recently, Angelelli et al. [1] investigated the MAD and CVaR models with real features including cardinality and minimum transaction unit. Extensive numerical results were reported in [1] to compare the financial characteristics of these two models.
The aim of this paper is to investigate the performance of four discrete optimization models for portfolio selection: the mean-variance (MV) model, the mean absolute deviation (MAD) model, the minimax (MM) model and the conditional Value-at-Risk (CVaR) model. The discrete features of these four models arise from the minimum transaction unit and the cardinality constraint on the number of assets included in a portfolio. We compare the performance of these four discrete optimization models by in-sample and out-of-sample performance analysis with data from Shanghai Stock Exchange. Our empirical results show that the discrete CVaR model tends to produce portfolios with the most stable performance compared with the other three discrete models. We also find that the portfolios generated by the MV model perform well on average, while the MAD model often generates portfolios with poor performance. Finally, our out-of-sample analysis shows that the MM model does not behave as well in the worst scenarios as the minimax principle of the model would suggest.
The paper is organized as follows. In Section 2, we briefly review the continuous versions of the four models. The four discrete optimization models are derived in Section 3. In Section 4, we compare the performance of these four discrete optimization models using the historical data of 150 stocks from SSE180 index in Shanghai Stock Exchange. Finally, we give some concluding remarks in Section 5.
2. Review of the four continuous models. Suppose that there are $n$ securities indexed by $i$, with random rates of return $R_i$. A portfolio is represented by the vector $x = (x_1, \ldots, x_n)^T$, where $x_i$ is the proportion of the budget to be invested in security $i$ for $i = 1, \ldots, n$. The random rate of return of the portfolio is given by
$$R(x) = \sum_{i=1}^{n} R_i x_i.$$
Assume that no short selling is allowed, i.e., $x_i \ge 0$, $i = 1, \ldots, n$. Then, all feasible portfolios are given by
$$X = \Big\{ x \in \mathbb{R}^n : \sum_{i=1}^{n} x_i = 1,\ x_i \ge 0,\ i = 1, \ldots, n \Big\}.$$
Let $E(\alpha)$ denote the expected value of the random variable $\alpha$, and let $\mu_i = E(R_i)$ and $\sigma_{ij} = E[(R_i - \mu_i)(R_j - \mu_j)]$. The expected return and variance of portfolio $x \in X$ are given by
$$\mu(x) = E(R(x)) = \sum_{i=1}^{n} \mu_i x_i, \qquad \sigma^2(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} \sigma_{ij} x_i x_j.$$
The expected returns $\mu_i$ and covariances $\sigma_{ij}$ are usually estimated by prediction and statistical approaches using historical data of stock prices. A novel method was proposed in [23] for short-term forecasting of time series, with numerical results using stock market and foreign exchange data.
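As a minimal sketch of how the expected return and variance of a portfolio can be estimated from historical scenarios, the following pure-Python function computes both quantities from a table of scenario returns. The function name, the data layout `returns[t][i]` and the 1/T normalisation of the covariance are assumptions made for illustration; they are not taken from the paper.

```python
def portfolio_mean_variance(returns, x):
    """Estimate the expected return and variance of portfolio x.

    returns[t][i] is the rate of return of security i in scenario t
    (hypothetical layout); x[i] is the proportion invested in security i.
    """
    T = len(returns)
    n = len(x)
    # Sample mean of each security over the T scenarios.
    mu = [sum(returns[t][i] for t in range(T)) / T for i in range(n)]
    # Covariance matrix with 1/T normalisation (an assumption of this sketch).
    cov = [
        [
            sum((returns[t][i] - mu[i]) * (returns[t][j] - mu[j]) for t in range(T)) / T
            for j in range(n)
        ]
        for i in range(n)
    ]
    mean = sum(mu[i] * x[i] for i in range(n))
    variance = sum(cov[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return mean, variance


if __name__ == "__main__":
    # Two perfectly negatively correlated securities: an equal split removes all variance.
    toy_returns = [[0.02, 0.00], [0.00, 0.02]]
    print(portfolio_mean_variance(toy_returns, [0.5, 0.5]))
    print(portfolio_mean_variance(toy_returns, [1.0, 0.0]))
```

In the toy data the two securities move in opposite directions, so the equally weighted portfolio has the same expected return as either security alone but no variance.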
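The mean-variance (MV) model of Markowitz [16] [17] is to minimize the variance of the portfolio under a certain minimum expected return level. The resulting optimization model is a parametric quadratic programming problem:
$$\text{(MV)} \quad \min\ \sum_{i=1}^{n} \sum_{j=1}^{n} \sigma_{ij} x_i x_j \quad \text{s.t.}\ \sum_{i=1}^{n} \mu_i x_i \ge \rho,\ x \in X,$$
where $\rho$ is a parameter representing the minimum rate of return required by an investor. Konno and Yamazaki [11] introduced the $L_1$ (absolute deviation) risk measure:
$$w(x) = E\Big[\Big|\sum_{i=1}^{n} R_i x_i - E\Big(\sum_{i=1}^{n} R_i x_i\Big)\Big|\Big].$$
Suppose that $R_i$ is a discrete random variable with $T$ scenarios and the realization of $R_i$ at scenario $t$ is $r_{it}$. Then the absolute deviation of portfolio $x$ is given by
$$w(x) = \frac{1}{T} \sum_{t=1}^{T} \Big|\sum_{i=1}^{n} (r_{it} - \mu_i) x_i\Big|, \qquad \mu_i = \frac{1}{T} \sum_{t=1}^{T} r_{it}.$$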

XUETING CUI, XIAOLING SUN AND DAN SHA
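The MAD model is to minimize the $L_1$ risk of the portfolio under a certain expected return level, which can be expressed as the following problem:
$$\text{(MAD)} \quad \min\ \frac{1}{T} \sum_{t=1}^{T} \Big|\sum_{i=1}^{n} (r_{it} - \mu_i) x_i\Big| \quad \text{s.t.}\ \sum_{i=1}^{n} \mu_i x_i \ge \rho,\ x \in X.$$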
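The objective of the MAD model can be evaluated directly from scenario data, without a covariance matrix. The sketch below assumes equally likely scenarios and a hypothetical data layout `returns[t][i]`; the function name is illustrative only.

```python
def mean_absolute_deviation(returns, x):
    """L1 risk of portfolio x over T equally likely scenarios.

    returns[t][i] is the rate of return of security i in scenario t
    (hypothetical layout); x[i] is the proportion invested in security i.
    """
    T = len(returns)
    n = len(x)
    # Mean return of each security over the scenarios.
    mu = [sum(returns[t][i] for t in range(T)) / T for i in range(n)]
    # Average absolute deviation of the portfolio return from its mean.
    return sum(
        abs(sum((returns[t][i] - mu[i]) * x[i] for i in range(n)))
        for t in range(T)
    ) / T


if __name__ == "__main__":
    toy_returns = [[0.04, 0.00], [0.00, 0.02]]
    print(mean_absolute_deviation(toy_returns, [0.5, 0.5]))
```

For the toy data, the portfolio returns are 0.02 and 0.01, so the mean is 0.015 and each scenario deviates from it by 0.005.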
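Note that the absolute deviation is twice the downside mean deviation $\frac{1}{T} \sum_{t=1}^{T} \max\{0, -\sum_{i=1}^{n} (r_{it} - \mu_i) x_i\}$ (see [7]). Introducing an additional variable $y_t$ for each $t$, the MAD model is equivalent to the following linear programming problem:
$$\min\ \frac{2}{T} \sum_{t=1}^{T} y_t \quad \text{s.t.}\ y_t + \sum_{i=1}^{n} (r_{it} - \mu_i) x_i \ge 0,\ y_t \ge 0,\ t = 1, \ldots, T,\ \sum_{i=1}^{n} \mu_i x_i \ge \rho,\ x \in X.$$
Although the MAD model simplifies the estimation of covariance and reduces the quadratic programming problem of the MV model to a linear programming problem, it was shown in [22] that ignoring the covariance matrix results in greater estimation risk that may outweigh the benefits.

Young [24] proposed a minimax (MM) model for portfolio selection. This model is based on the conservative principle that the portfolio should be selected so that it has the maximum return if the worst-case scenario in the past observed periods happens. The MM model can be formulated as a linear programming problem:
$$\text{(MM)} \quad \max\ M \quad \text{s.t.}\ \sum_{i=1}^{n} r_{it} x_i \ge M,\ t = 1, \ldots, T,\ \sum_{i=1}^{n} \mu_i x_i \ge \rho,\ x \in X.$$

Rockafellar and Uryasev [19] introduced conditional Value-at-Risk (CVaR) as a measure of risk, which is the mean return over a specified fraction of the worst realizations. The worst conditional expectation for a tolerance level $0 < \beta \le 1$ is defined as
$$M_\beta(x) = \frac{1}{\beta} \int_0^\beta F_x^{(-1)}(\alpha)\, d\alpha,$$
where $F_x^{(-1)}(p) = \inf\{\eta : F_x(\eta) \ge p\}$ is the left-continuous inverse of the cumulative distribution function of the rate of return, $F_x(\eta) = P(R(x) \le \eta)$. Assume that $R(x)$ is a discrete random variable. Then the worst conditional expectation is LP computable [1] [19]. The CVaR model can be described as follows:
$$\text{(CVaR)} \quad \max\ \eta - \frac{1}{\beta} \sum_{t=1}^{T} p_t d_t \quad \text{s.t.}\ d_t \ge \eta - \sum_{i=1}^{n} r_{it} x_i,\ d_t \ge 0,\ t = 1, \ldots, T,\ \sum_{i=1}^{n} \mu_i x_i \ge \rho,\ x \in X,$$
where $p_t$ is the probability of scenario $t$. In general, we can assume $p_t = 1/T$, i.e., all scenarios are equally likely.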
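To make the two scenario-based measures above concrete, the following sketch evaluates the minimax objective (the worst scenario return) and the worst conditional expectation for a fixed portfolio, assuming equally likely scenarios. It also evaluates the piecewise linear expression that underlies the LP form of CVaR, maximised over the scenario returns as candidate values of η. All function names and the data layout are hypothetical, and no LP solver is involved.

```python
def scenario_returns(returns, x):
    """Portfolio return in each scenario; returns[t][i] is security i in scenario t."""
    return [sum(r_ti * x_i for r_ti, x_i in zip(row, x)) for row in returns]


def worst_return(port_returns):
    """Minimax objective: the smallest portfolio return over all scenarios."""
    return min(port_returns)


def worst_conditional_expectation(port_returns, beta):
    """Mean of the worst beta-fraction of equally likely scenario returns."""
    if not 0 < beta <= 1:
        raise ValueError("beta must satisfy 0 < beta <= 1")
    p = 1.0 / len(port_returns)  # equal scenario probabilities (assumption)
    remaining, total = beta, 0.0
    for r in sorted(port_returns):  # worst scenarios first
        weight = min(p, remaining)
        total += weight * r
        remaining -= weight
        if remaining <= 0:
            break
    return total / beta


def cvar_lp_value(port_returns, beta):
    """max over eta of eta - (1/beta) * sum_t p_t * max(eta - r_t, 0), with p_t = 1/T.

    The expression is concave and piecewise linear in eta, so its maximum
    is attained at one of the scenario returns.
    """
    T = len(port_returns)

    def objective(eta):
        return eta - sum(max(eta - r, 0.0) for r in port_returns) / (beta * T)

    return max(objective(eta) for eta in port_returns)


if __name__ == "__main__":
    rets = scenario_returns(
        [[0.06, 0.00], [-0.04, 0.00], [0.02, 0.00], [0.00, 0.00]], [0.5, 0.5]
    )
    print(worst_return(rets))
    print(worst_conditional_expectation(rets, 0.5))
    print(cvar_lp_value(rets, 0.5))
```

With four equally likely scenarios and β = 0.25, the worst conditional expectation coincides with the worst return, which shows how the minimax criterion arises as a limiting case of CVaR.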
3. Discrete optimization models for portfolio selection. In this section we formulate the discrete optimization models for the four continuous portfolio selection models described in the previous section. Two real features will be incorporated into the portfolio selection models: minimal transaction unit and cardinality constraint.
In financial practice, the number of shares of a stock included in a portfolio must be an integer multiple of the minimal transaction unit. In Shanghai Stock Exchange, the minimal transaction unit for each stock is 100 shares. Thus our decision variables should be integer. The cardinality constraint, which restricts the maximum number of stocks in the portfolio, is also important in portfolio selection. Investors tend to construct portfolios of manageable size while keeping the risk of the portfolio well diversified. Transaction cost is also a real feature that should be taken into account in many situations of portfolio selection. In Shanghai Stock Exchange, however, the transaction cost is a fixed ratio of the investment amount, and thus does not have a great influence on the selection of optimal portfolios. Denote
• $p_i$: the price per lot of asset $i$ at the date of portfolio selection;
• $W$: the total investment budget;
• $K$: the maximum number of stocks to be included in the portfolio;
• $\rho$: the minimum required expected portfolio return;
• $x_i$: the decision variable representing the number of lots of asset $i$ in the portfolio.
As integer variables are introduced, the budget constraint $\sum_{i=1}^{n} p_i x_i = W$ is not necessarily satisfied exactly. The budget constraint can be relaxed as
$$(1 - \delta) W \le \sum_{i=1}^{n} p_i x_i \le W,$$
where the ratio $1 - \delta$ with $0 < \delta < 1$ ensures that at least $(1 - \delta)W$ is invested.
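The discrete requirements described in this section (integer numbers of lots, investing at least (1 − δ)W without exceeding W, and holding at most K stocks) can be checked for a candidate portfolio with a few lines of code. The sketch below is illustrative only: the function name, argument names and toy prices are hypothetical, and the upper limit W on the invested amount is an assumption of the sketch.

```python
def check_lot_portfolio(lots, prices, budget, delta, max_assets):
    """Check the discrete feasibility conditions for a portfolio given in lots.

    lots[i]    : number of lots of asset i (nonnegative integer)
    prices[i]  : price per lot of asset i
    budget     : total investment budget W
    delta      : tolerance, at least (1 - delta) * W must be invested
    max_assets : cardinality limit K
    """
    if any(not isinstance(v, int) or v < 0 for v in lots):
        raise ValueError("lots must be nonnegative integers")
    invested = sum(p * v for p, v in zip(prices, lots))
    n_selected = sum(1 for v in lots if v > 0)
    budget_ok = (1 - delta) * budget <= invested <= budget
    cardinality_ok = n_selected <= max_assets
    return {
        "invested": invested,
        "n_selected": n_selected,
        "feasible": budget_ok and cardinality_ok,
    }


if __name__ == "__main__":
    prices = [1500.0, 820.0, 2300.0]  # hypothetical prices per lot
    print(check_lot_portfolio([40, 48, 0], prices, 100000.0, 0.01, 2))
    print(check_lot_portfolio([40, 40, 0], prices, 100000.0, 0.01, 2))
```

The second portfolio invests only 92,800 of the 100,000 budget, so it violates the relaxed budget requirement even though it respects the cardinality limit.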

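The discrete optimization problem for the MV model with the above discrete features is described as follows:
$$
\begin{aligned}
\min\ & \frac{1}{W^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \sigma_{ij}\, p_i p_j x_i x_j \\
\text{s.t.}\ & (1 - \delta) W \le \sum_{i=1}^{n} p_i x_i \le W, && (1) \\
& \sum_{i=1}^{n} \mu_i p_i x_i \ge \rho W, && (2) \\
& \sum_{i=1}^{n} z_i \le K, && (3) \\
& 0 \le x_i \le u_i z_i,\ x_i \in \mathbb{Z},\ z_i \in \{0, 1\},\ i = 1, \ldots, n, && (4)
\end{aligned}
$$
where $z_i$ is a binary variable indicating whether asset $i$ is selected, $u_i$ is an upper bound on the number of lots of asset $i$, constraint (1) is the budget constraint, constraint (2) represents the minimum expected return level, constraint (3) restricts the number of selected securities in the portfolio, and constraint (4) represents the bound for the number of lots of asset $i$ and the integrality of $x_i$ if asset $i$ is selected. Similarly, the discrete optimization problem for the MAD model can be formulated as follows:
$$\min\ \frac{2}{T} \sum_{t=1}^{T} y_t \quad \text{s.t.}\ y_t + \frac{1}{W} \sum_{i=1}^{n} (r_{it} - \mu_i) p_i x_i \ge 0,\ y_t \ge 0,\ t = 1, \ldots, T,\ \text{and constraints (1)-(4)}.$$
The discrete MM model is formulated as:
$$\max\ M \quad \text{s.t.}\ \frac{1}{W} \sum_{i=1}^{n} r_{it} p_i x_i \ge M,\ t = 1, \ldots, T,\ \text{and constraints (1)-(4)}.$$
The discrete CVaR model is formulated as:
$$\max\ \eta - \frac{1}{\beta T} \sum_{t=1}^{T} d_t \quad \text{s.t.}\ d_t \ge \eta - \frac{1}{W} \sum_{i=1}^{n} r_{it} p_i x_i,\ d_t \ge 0,\ t = 1, \ldots, T,\ \text{and constraints (1)-(4)}.$$

4. Empirical analysis. In this section, we carry out an empirical experiment to compare the in-sample and out-of-sample performance of the four discrete optimization models described in the previous section. All the optimization problems are solved by the commercial optimization software CPLEX 10.2 using the historical data from Shanghai Stock Exchange.

[...] portfolios based on the data from the first 150 weeks, and then analyze the out-of-sample performance on the last 35 weeks, which has a negative market trend (bear market). In our numerical experiment, the total budget is $W = 100{,}000$ RMB (yuan) and the maximum number of stocks is $K = 20$. Also, we set $\delta = 0.01$, which means that at least 99% of the total budget has to be invested. In our analysis, the gross return, the net expected return and the standard deviation of a portfolio are calculated. The gross return is the mean of the portfolio return before subtracting the transaction cost. The net return is calculated from the gross return using $c$, the ratio of fixed transaction cost in Shanghai Stock Exchange, which ranges from 1% to 3%, and $r_f$, the risk-free rate of return. For the CVaR model, we only consider the case $\beta = 0.25$ in the numerical experiment.

4.2. In-sample analysis.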
In the in-sample analysis, we first solve the four discrete optimization models for different levels of expected return to obtain the efficient frontiers, i.e., the return-risk relation of the optimal portfolios generated by the four models. Figures 1, 2 and 3 show the efficient frontiers of the four discrete optimization models using data in the in-sample periods of S 1 , S 2 and S 3 , respectively, where the x-axis represents the standard deviation of the optimal portfolios, and the y-axis represents the net expected return.
The following trends can be observed from Figures 1-3:
• The efficient frontier of the MV model lies above those of the other three models. This is because the MV model employs variance as the risk measure and the optimal portfolios generated by the other three models are also feasible portfolios in the MV model. We also see that the efficient frontiers of the discrete MV and MAD models are always consistent. As the expected rate of return level increases, the similarity of the efficient frontiers of the MV and MAD models becomes more evident.
• The efficient frontier of the MM model always has the greatest deviation from that of the MV model. This is partially because the MM model tends to choose some securities with high mean return and less stability at the same time in order to improve the worst return, resulting in portfolios with higher [...]

In Tables 1-3, we list some criteria of the in-sample performance of the optimal portfolios for the four models with different minimum required return levels using different data. In columns 4-8 of Tables 1-3, "N" is the number of stocks in the portfolio, "Capital" is the actual amount of money invested, "Gross Ret." stands for the gross return, "Net Ret." for the net return and "Std Dev." for the standard deviation. It is evident that the number of securities in the optimal portfolios tends to decrease as the expected return level ρ increases, which leads to less diversification of the portfolio risk.

4.3. Out-of-sample analysis. We now evaluate the portfolio performance in out-of-sample periods with different market trends. The numerical results of out-of-sample performance for the four discrete optimization models are summarized in Tables 4-6. From Tables 4-6, we have the following observations:
• In all three out-of-sample periods with different market trends, the portfolio generated by the CVaR model generally has a better and more stable performance, with a higher Sharpe ratio. Meanwhile, the MAD model usually underperforms the other models in terms of Sharpe ratio in different markets.
• When the market is positive or negative, the MV model does not always give better performance than the other models, while when the market trend is volatile, the MV portfolio performs better in most situations, which illustrates the efficiency of the MV model in controlling fluctuation of return.
• When the model parameter and market trend change, the performance of the MM model changes dramatically. As the required return level ρ increases, the Sharpe ratio of the portfolio generated by the MM model decreases in the positive and negative markets, while the Sharpe ratio tends to increase in the volatile market. This observation demonstrates the instability of the MM portfolio.

Figures 4-6 illustrate the periodic out-of-sample return of the four models with ρ = 1% using the three data sets, respectively. For comparison, we also plot the periodic return of the SSE180 index in Figures 4-6. Comparing the trends of the out-of-sample performance of these four discrete models, no absolute dominance is observed among them. Nevertheless, we observe that the CVaR model tends to perform better than the other three models and the market index in the three different market trends. This conforms with the numerical results in Tables 4-6, in which the portfolio generated by the CVaR model has a higher Sharpe ratio. To our surprise, we see from Figures 4-6 that the portfolio generated by the MM model does not perform so well in controlling the worst return in the out-of-sample periods, which somewhat contrasts with the minimax principle of the MM model, which attempts to optimize the return in the worst-case scenarios.
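The out-of-sample comparison relies on the Sharpe ratio and on the worst periodic return. The paper does not spell out how these are computed, so the following is only a sketch under the common textbook definition: mean excess return over the risk-free rate divided by the sample standard deviation of the periodic returns. The function names and the use of the sample (n − 1) standard deviation are assumptions.

```python
import statistics


def sharpe_ratio(period_returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation.

    period_returns : realised portfolio returns per period (e.g., weekly)
    risk_free      : risk-free rate per period (assumed constant)
    """
    excess = [r - risk_free for r in period_returns]
    sd = statistics.stdev(excess)  # sample standard deviation, n - 1 in the denominator
    if sd == 0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return statistics.mean(excess) / sd


def worst_period_return(period_returns):
    """Smallest realised return, the quantity the minimax model aims to protect."""
    return min(period_returns)


if __name__ == "__main__":
    weekly = [0.01, 0.03, -0.01, 0.01]  # hypothetical out-of-sample returns
    print(sharpe_ratio(weekly))
    print(worst_period_return(weekly))
```

Comparing both numbers across models on the same out-of-sample window is one simple way to see whether a model that optimises the in-sample worst case also protects the worst realised return.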

5. Conclusions. We have analyzed and compared the performance of four discrete optimization models with real features: the MV model, the MAD model, the MM model and the CVaR model. Our in-sample and out-of-sample analysis shows that the CVaR model tends to generate portfolios with the most stable performance compared with the other three discrete models. We also observe that the MAD model generates portfolios with poor performance in most situations. Finally, we find in the out-of-sample analysis that the MM model gives portfolios with unstable performance.