Multi-objective portfolio optimization of mutual funds under downside risk measure using fuzzy theory

Article history: Received 6 April 2012. Received in revised format 26 April 2012. Accepted 24 May 2012. Available online 30 May 2012.

A mutual fund is one of the most popular vehicles through which people invest: a professional fund manager invests their money according to predefined objectives, so evaluating the performance of mutual funds is an important problem. This paper proposes a multi-objective portfolio optimization model for asset allocation. The proposed model clusters mutual funds with two methods based on six characteristics: rate of return, variance, semivariance, turnover rate, Treynor index and Sharpe index. Semivariance is used as a downside risk measure, and the rate of return and semivariance are treated as fuzzy variables. A multi-objective fuzzy mean-semivariance portfolio optimization model is built and a fuzzy programming technique is adopted to solve the resulting problem. Data on mutual funds traded on NASDAQ from 2007 to 2009 are used, and Pareto optimal solutions are obtained for different weights of the objective functions. The asset allocation, rate of return and risk of each cluster are also determined, and the results of the two clustering methods are compared.

© 2012 Growing Science Ltd. All rights reserved.


Introduction
A mutual fund is one of the most popular ways for people to invest: a professional fund manager invests their money according to predefined objectives. There are many types of mutual funds in the world, which makes it difficult to choose an appropriate one. Therefore, an appropriate technique is needed to assess different mutual funds and choose the most efficient ones. Murthi et al. (1997) and Basso & Funari (2001) proposed data envelopment analysis models to measure mutual fund performance, and Deb & Banerjee (2009) proposed a downside risk analysis for equity mutual funds. Chang et al. (2010) suggested an extended TOPSIS method with different distance approaches.
Portfolio optimization is one of the main problems in modern investment theory. Markowitz (1952, 1959) argued that the portfolio optimization problem has two criteria: 1) the return of the portfolio, which should be maximized, and 2) the risk of the portfolio, which should be minimized. In the Markowitz mean-variance model there is a set of solutions called the efficient frontier, from which an investor chooses one depending on his/her risk/reward criteria. The mean-variance model is valid if returns are normally distributed; when they are not, downside risk measures are used instead of variance (Vercher et al., 2007).
In recent studies, researchers have used multi-objective portfolio optimization models. Chang et al. (2009) introduced a portfolio optimization model under different risk measures and solved it using a genetic algorithm. Anagnostopoulos & Mamanis (2010) proposed a multi-objective model with discrete variables. In a fuzzy environment, Ammar & Khalifa (2003) implemented fuzzy variables for the portfolio optimization problem and Terol et al. (2006) used fuzzy compromise programming for portfolio selection. They used Sharpe's single index model and defined future beta as a fuzzy number. Jana et al. (2009) defined mean value and variance as trapezoidal fuzzy numbers and transformed them to possibilistic form. They proposed a three-objective model and solved it using a fuzzy programming technique. Clustering analysis is a data mining approach that helps control the scale of many problems and can be used in optimization problems. Tola et al. (2008) used cluster analysis for portfolio optimization. Chen & Huang (2009) used a two-stage method and clustered observations based on four characteristics: rate of return, standard deviation, turnover rate and Treynor index. They implemented fuzzy variables for return and risk. For the portfolio optimization problem, they used the α-cut method and optimized return and risk separately, in two models.
In this paper, we use a method originally developed by Chen & Huang (2009) with two additional characteristics, the Sharpe index and semivariance. We cluster mutual funds using the Ward method and the k-means method separately and compare their results.

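Definition of indices

Rate of return
The net asset value (NAV) is defined as the current market value of a fund minus its liabilities, divided by the number of outstanding shares (Mobius, 2007). The rate of return based on net asset value is defined as

$$R_{i,t} = \frac{NAV_{i,t} - NAV_{i,t-1}}{NAV_{i,t-1}} \qquad (1)$$

where $i$ indexes the mutual funds, $R_{i,t}$ is the rate of return of fund $i$ at time $t$ and $NAV_{i,t}$ is its net asset value at time $t$.

Variance

$$\sigma_i^2 = \frac{1}{T}\sum_{t=1}^{T}\left(R_{i,t} - \bar{R}_i\right)^2 \qquad (2)$$

where $\bar{R}_i$ is the average rate of return of fund $i$ over $T$ months.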
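Downside risk measure (semivariance)

$$SV_i = \frac{1}{T}\sum_{t=1}^{T}\left[\min\left(R_{i,t} - \bar{R}_i,\ 0\right)\right]^2 \qquad (3)$$

where $R_{i,t}$ is the rate of return of fund $i$ at time $t$ and $\bar{R}_i$ is its average over the $T$ months, so that only returns below the average contribute to the risk.

Turnover rate
Turnover is a measure of a fund's transactions; a high turnover rate shows that the fund is actively managed and pays more in transaction costs. The turnover rate is expressed as a percentage (Eq. (4)).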
Treynor index
Treynor (1965) introduced a ratio for the performance evaluation of portfolios:

$$T_i = \frac{\bar{R}_i - R_f}{\beta_i} \qquad (5)$$

where $T_i$ is the Treynor index, $\bar{R}_i$ is the average return, $R_f$ is the return of the risk-free asset and $\beta_i$ is the measure of systematic risk, calculated as

$$\beta_i = \frac{\sigma_{i,m}}{\sigma_m^2} \qquad (6)$$

where $\sigma_{i,m}$ is the covariance between the return of mutual fund $i$ and the return of the market index $R_m$, and $\sigma_m^2$ is the variance of $R_m$.

Sharpe index
Sharpe (1966) proposed a measure for the performance evaluation of mutual funds. We use this index because his study was based specifically on mutual funds.

$$S_i = \frac{\bar{R}_i - R_f}{\sigma_i} \qquad (7)$$

where $S_i$ is the Sharpe index and $\sigma_i$ is the standard deviation, called total risk.
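As a rough illustration of how these indices can be computed from a NAV series, consider the following Python sketch. The function names, the 1/T divisors, the use of the fund's own mean return as the semivariance target, the risk-free rate and the sample NAVs are all assumptions made for this illustration; they are not taken from the study. Turnover is left out because it depends on transaction data rather than on NAVs.

```python
from statistics import fmean

def returns_from_nav(nav):
    """Rate of return between consecutive NAV observations."""
    return [(nav[t] - nav[t - 1]) / nav[t - 1] for t in range(1, len(nav))]

def variance(r):
    m = fmean(r)
    return sum((x - m) ** 2 for x in r) / len(r)  # assumed 1/T divisor

def semivariance(r):
    m = fmean(r)
    # Only below-mean deviations count (assumed target: the mean return).
    return sum(min(x - m, 0.0) ** 2 for x in r) / len(r)

def beta(r, r_m):
    """Covariance with the market return divided by the market variance."""
    m, mm = fmean(r), fmean(r_m)
    cov = sum((a - m) * (b - mm) for a, b in zip(r, r_m)) / len(r)
    return cov / variance(r_m)

def treynor(r, r_m, r_f):
    return (fmean(r) - r_f) / beta(r, r_m)

def sharpe(r, r_f):
    return (fmean(r) - r_f) / variance(r) ** 0.5

# Hypothetical monthly NAVs of one fund and of a market index.
fund_nav = [10.0, 10.5, 10.2, 10.8, 11.0]
index_nav = [100.0, 103.0, 101.0, 104.0, 106.0]
r = returns_from_nav(fund_nav)
r_m = returns_from_nav(index_nav)
print(variance(r), semivariance(r), treynor(r, r_m, 0.002), sharpe(r, 0.002))
```

Because the semivariance ignores above-mean deviations, it can never exceed the variance computed with the same divisor.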

Clustering
Clustering is the grouping of observations or records into classes of similar objects, and it is widely used in data mining. A cluster is a group of observations that are similar to one another and dissimilar to observations in other clusters. There are two kinds of clustering algorithms: 1) hierarchical and 2) non-hierarchical. Hierarchical clustering has a tree-like cluster structure and creates a dendrogram, a tree diagram that displays the arrangement of the clusters. There are various ways to determine the distance among clusters, which lead to different hierarchical clustering methods (Larose, 2005). In this study, we choose the Ward method among the hierarchical clustering methods, where dissimilarity among clusters is the Euclidean distance among their centroids. This method minimizes the within-cluster sum of squares based on pairwise distances (Decker & Lenz, 2007).
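The merging rule behind the Ward method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ward_clusters` is hypothetical, the sketch stops at a requested number of clusters rather than at a distance level, and it uses the standard expression for the increase in the within-cluster sum of squares when two clusters are merged. The study itself relied on Minitab 14.

```python
def ward_clusters(points, k):
    """Agglomerate points until k clusters remain, merging at each step the
    pair whose union increases the within-cluster sum of squares the least."""
    clusters = [[p] for p in points]

    def centroid(c):
        return [sum(col) / len(c) for col in zip(*c)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares caused by merging a and b.
        ca, cb = centroid(a), centroid(b)
        d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * d2

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical normalized observations with two obvious groups.
data = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0)]
print(ward_clusters(data, 2))
```

Recording the merge cost at each step would give the heights needed to draw a dendrogram.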
There are various non-hierarchical clustering methods; k-means is one of the most common, and it minimizes the sum of distances between each observation and its cluster center (Zio & Bazzo, 2010). In the k-means method, k refers to the number of clusters, which must be determined before the clustering process. The k-means method follows this algorithm:
Step 1) Determine k (the number of clusters).
Step 2) Randomly select k observations as the initial cluster centers.
Step 3) For each observation, determine the nearest cluster center, where nearness is usually measured by Euclidean distance; each cluster center thus collects a subset of the observations and the dataset is segmented into k clusters.
Step 4) For each cluster, find the new cluster center and use it in place of the previous center; the new center is the average of the observations in the cluster. For example, the center of the points (1, 2) and (2, 4) is (1.5, 3).
Steps 3 and 4 are repeated until the cluster assignments no longer change.
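The k-means steps above can be sketched as follows. This is a minimal illustration rather than the procedure used in the study (which relied on Minitab 14); the function name `kmeans`, the `seed` and `max_iter` parameters, and the rule of keeping the old center when a cluster becomes empty are assumptions.

```python
import random

def kmeans(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # Step 2: k random observations as centers
    assignment = None
    for _ in range(max_iter):
        # Step 3: assign each observation to the nearest center (Euclidean).
        new_assignment = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_assignment == assignment:  # assignments are stable: stop
            break
        assignment = new_assignment
        # Step 4: move each center to the mean of its observations.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # assumption: an empty cluster keeps its old center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment

# The two points from the text form one cluster with center (1.5, 3.0).
print(kmeans([(1, 2), (2, 4)], 1))
```

The result depends on the random initial centers, which is one reason to try several values of k and several seeds.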
Before we proceed with the clustering process, we need to normalize our observations. The proposed model of this paper uses min-max normalization as follows:

$$X^* = \frac{X - \min(X)}{\max(X) - \min(X)} \qquad (8)$$

In Eq. (8), $X^*$ is a normalized observation, $X$ is the observation to be normalized, $\min(X)$ is the minimum of all observations of that index and $\max(X)$ is the maximum of all observations of that index. The advantage of this method is that, after normalization, the minimum equals 0, the maximum equals 1 and all observations lie between zero and one (Larose, 2005).
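A small sketch of min-max normalization applied column by column is given below. The function names and the handling of a constant column (mapped to 0 to avoid division by zero) are assumptions for this illustration.

```python
def min_max_normalize(column):
    """Rescale one index (column) so that its values lie in [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # assumption: constant column maps to 0
    return [(x - lo) / (hi - lo) for x in column]

def normalize_table(rows):
    """Normalize every column of a table given as a list of rows."""
    cols = [min_max_normalize(list(col)) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# Hypothetical table: two funds (rows) described by two indices (columns).
print(normalize_table([[1, 10], [3, 30]]))
```

Normalizing each index separately keeps an index with large units, such as semivariance, from dominating the Euclidean distances used in clustering.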

Making fuzzy variables
Note that a crisp number for risk and rate of return obtained from historical data is usually not appropriate for the future, because the future carries many uncertainties and many factors may influence it. Hence, we define risk and rate of return as triangular fuzzy numbers. A triangular fuzzy number can be written as $(a, b, c)$. We define the membership function of the fuzzy rate of return as Eq. (9); Fig. 1 shows this function.

$$\mu(x) = \begin{cases} \dfrac{x - a}{b - a}, & a \le x \le b \\[4pt] \dfrac{c - x}{c - b}, & b \le x \le c \\[4pt] 0, & \text{otherwise} \end{cases} \qquad (9)$$
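To make the triangular fuzzy number concrete, the sketch below evaluates its membership function and its α-cut, that is, the interval of values whose membership is at least α. The function names and the example triple are assumptions for illustration only.

```python
def triangular_membership(x, a, b, c):
    """Membership degree of x in the triangular fuzzy number (a, b, c), a < b < c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def alpha_cut(a, b, c, alpha):
    """Interval of values whose membership is at least alpha (0 <= alpha <= 1)."""
    return (a + alpha * (b - a), c - alpha * (c - b))

# Hypothetical fuzzy number (0, 1, 2).
print(triangular_membership(0.5, 0.0, 1.0, 2.0))  # halfway up the left side
print(alpha_cut(0.0, 1.0, 2.0, 0.5))              # lower and upper bound
```

At α = 1 the interval collapses to the single value b, which corresponds to solving the problem with crisp numbers.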
The fuzzy risk variable is constructed in the same way as the fuzzy rate of return. Markowitz (1952, 1959) proposed the mean-variance model for the portfolio optimization problem. We replace the crisp parameters of his model with fuzzy variables, as in Eq. (11):

$$\begin{aligned} \text{Maximize} \quad & \sum_{i=1}^{N} x_i \tilde{R}_i \\ \text{Minimize} \quad & \sum_{i=1}^{N}\sum_{j=1}^{N} x_i x_j \tilde{\sigma}_{ij} \\ \text{Subject to} \quad & \sum_{i=1}^{N} x_i = 1 \end{aligned} \qquad (11)$$

where
$N$: number of assets,
$x_i$: proportion invested in asset $i$, $i = 1, 2, \ldots, N$,
$\tilde{R}_i$: fuzzy expected return of asset $i$,
$\tilde{\sigma}_{ij}$: fuzzy covariance between asset $i$ and asset $j$.

We assume that short sales are not allowed and that there are no dividends, taxes or transaction costs during the evaluation period. Markowitz's model is a convex quadratic program. The mean-variance model assumes that the rate of return is normally distributed, but this assumption does not hold in many cases. Therefore, we use another risk measure, semivariance, to improve the main model (Chang et al., 2009) and substitute it for variance in the proposed model of this paper. As Chen & Huang (2009) discussed, we can also eliminate the correlation part of the model, because the clustering process maximizes the variance among clusters and so decreases the correlation between clusters (variables). Eq. (12) illustrates our new model:

$$\begin{aligned} \text{Maximize} \quad & \sum_{i=1}^{N} x_i \tilde{R}_i \\ \text{Minimize} \quad & \sum_{i=1}^{N} x_i^2\, \widetilde{SV}_i \\ \text{Subject to} \quad & \sum_{i=1}^{N} x_i = 1, \quad x_i \ge 0, \quad i = 1, \ldots, N \end{aligned} \qquad (12)$$

where $\widetilde{SV}_i$ is the fuzzy semivariance of asset $i$. Eq. (12) is one of the most difficult kinds of optimization models to solve: it is a multi-objective fuzzy non-linear program. As illustrated in Eq. (10), the risk and return variables are fuzzy and have lower and upper bounds. Eq. (13) is written using these bounds.
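$$\begin{aligned} \text{maximize} \quad & \sum_{i=1}^{N} x_i \left[R_i^L, R_i^U\right] \\ \text{minimize} \quad & \sum_{i=1}^{N} x_i^2 \left[SV_i^L, SV_i^U\right] \\ \text{subject to} \quad & \sum_{i=1}^{N} x_i = 1, \quad x_i \ge 0, \quad i = 1, \ldots, N \end{aligned} \qquad (13)$$

Here $x_i$ is the proportion invested in asset $i$, and $[R_i^L, R_i^U]$ and $[SV_i^L, SV_i^U]$ are the lower and upper bounds of its fuzzy return and fuzzy semivariance. The above model can be solved using the Kuhn-Tucker conditions, considering the lower and upper bounds separately. Hence, we split Eq. (13) into two models, both of which are multi-objective.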
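The split into two crisp models can be sketched as follows: one model pairs the lower bounds of return with the upper bounds of risk, and the other pairs the upper bounds of return with the lower bounds of risk. The function names, the quadratic form of the risk objective without covariance terms, and the interval data are assumptions for this illustration.

```python
def split_models(return_bounds, risk_bounds):
    """Turn interval coefficients into two sets of crisp coefficients.

    Each argument is a list of (lower, upper) pairs, one pair per asset.
    """
    worst = ([lo for lo, _ in return_bounds], [hi for _, hi in risk_bounds])
    best = ([hi for _, hi in return_bounds], [lo for lo, _ in risk_bounds])
    return worst, best

def portfolio_return(x, returns):
    return sum(xi * r for xi, r in zip(x, returns))

def portfolio_risk(x, risks):
    # Assumed risk objective: squared proportions times semivariance.
    return sum(xi ** 2 * s for xi, s in zip(x, risks))

# Hypothetical bounds for two assets (for example, two clusters).
worst, best = split_models([(0.1, 0.3), (0.2, 0.6)], [(1.0, 2.0), (3.0, 5.0)])
x = [0.5, 0.5]
print(portfolio_return(x, worst[0]), portfolio_risk(x, worst[1]))
print(portfolio_return(x, best[0]), portfolio_risk(x, best[1]))
```

For any fixed allocation, the two models bracket the return and risk that the interval coefficients allow.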
Definition 1 (Complete optimal solution). Let $X$ be the feasible set and $f_i$, $i = 1, \ldots, r$, the objective functions. $x^*$ is said to be a complete optimal solution of an $r$-objective problem if and only if there exists $x^* \in X$ such that $f_i(x^*)$ is at least as good as $f_i(x)$ for $i = 1, \ldots, r$ and for all $x \in X$.
When we solve this problem a complete optimal solution does not always exist; hence, the Pareto optimal solution is defined as follows.
Definition 2 (Pareto optimal solution). $x^*$ is said to be a Pareto optimal solution of an $r$-objective problem if and only if there does not exist another $x \in X$ such that $f_i(x)$ is at least as good as $f_i(x^*)$ for $i = 1, \ldots, r$ and strictly better for at least one $k \in \{1, 2, \ldots, r\}$.
Eq. (14) considers the lower bound of return and the upper bound of risk and gives the worst solution, while Eq. (15) considers the upper bound of return and the lower bound of risk and gives the best solution. In the next section we discuss the solution procedure for these two multi-objective non-linear programming (MONLP) models using fuzzy theory and obtain Pareto optimal solutions.
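For a finite set of candidate portfolios, Pareto optimality can be checked directly. The sketch below assumes each candidate is summarized by a (return, risk) pair, with return to be maximized and risk to be minimized; the function names and the sample values are hypothetical.

```python
def dominates(p, q):
    """True if p is at least as good as q in both objectives and better in one.

    p and q are (return, risk) pairs: higher return and lower risk are better.
    """
    no_worse = p[0] >= q[0] and p[1] <= q[1]
    strictly_better = p[0] > q[0] or p[1] < q[1]
    return no_worse and strictly_better

def pareto_front(solutions):
    """Keep the solutions that no other solution dominates."""
    return [p for p in solutions if not any(dominates(q, p) for q in solutions)]

# Hypothetical (return, risk) pairs of three candidate portfolios.
candidates = [(0.5, 100.0), (0.6, 120.0), (0.4, 110.0)]
print(pareto_front(candidates))
```

In this example the third candidate is dropped because the first offers a higher return at a lower risk, while the first two do not dominate each other.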

Fuzzy programming technique
There are different methods for solving MONLP problems; we use the fuzzy programming technique developed by Zimmermann (1978). In the first stage of the fuzzy programming technique, each objective function is optimized separately subject to the constraints. Each objective function is minimized to obtain its lower bound (l) and maximized to obtain its upper bound (u). For our problem, we first minimize and maximize the return objective function and then minimize and maximize the risk objective function, subject to the constraint. Then we define the objective functions as two fuzzy functions. Fig. 2 and Fig. 3 illustrate the membership functions of these fuzzy functions, and Eqs. (16)-(17) define them. We then define the satisfaction level of the i-th objective function as the degree to which it approaches its optimized solution, i.e. the minimum value that its membership function is required to attain. Eq. (18) shows this definition.
For the return objective function we have Eq. (19), and for the risk objective function we have Eq. (20). We can now rewrite our model, in which P is the percentage by which the objective functions additionally approach their optimized condition and the two remaining parameters are the weight of the return objective function and the weight of the risk objective function. We can assume different α-levels, and LINGO 11.0 software can be used to solve this model. The second constraint of the model is non-linear, so the software returns a local optimum; because our solution space is convex, this local optimum is also the global optimum.
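The two stages of the fuzzy programming technique can be sketched as follows. This is a simplified max-min variant under stated assumptions: linear membership functions between the bounds l and u, a coarse grid search over the feasible proportions instead of a solver such as LINGO, no objective weights and no additional term P. The function names and the three-asset data are hypothetical.

```python
from itertools import product

def mu_max(f, l, u):
    """Linear membership for an objective to be maximized (assumed form)."""
    return min(1.0, max(0.0, (f - l) / (u - l)))

def mu_min(f, l, u):
    """Linear membership for an objective to be minimized (assumed form)."""
    return min(1.0, max(0.0, (u - f) / (u - l)))

def simplex_grid(n, steps):
    """Vectors of n non-negative proportions summing to 1, on a 1/steps grid."""
    for combo in product(range(steps + 1), repeat=n - 1):
        if sum(combo) <= steps:
            yield [c / steps for c in combo] + [(steps - sum(combo)) / steps]

def max_min_portfolio(returns, risks, steps=50):
    ret = lambda x: sum(xi * r for xi, r in zip(x, returns))
    risk = lambda x: sum(xi ** 2 * s for xi, s in zip(x, risks))
    grid = list(simplex_grid(len(returns), steps))
    # Stage 1: lower and upper bound of each objective over the feasible set.
    l1, u1 = min(map(ret, grid)), max(map(ret, grid))
    l2, u2 = min(map(risk, grid)), max(map(risk, grid))
    # Stage 2: maximize the smaller of the two membership degrees.
    level = lambda x: min(mu_max(ret(x), l1, u1), mu_min(risk(x), l2, u2))
    best = max(grid, key=level)
    return best, level(best)

# Hypothetical crisp returns and semivariances of three clusters.
weights, lam = max_min_portfolio([0.3, 0.5, 0.8], [40.0, 90.0, 200.0])
print(weights, lam)
```

Under this linear form, with return bounds of [0.46, 0.832] a return of 0.646 would have a membership degree of about 0.5, since it lies halfway between the two bounds.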

Numerical example
In order to study the performance of the proposed model of this paper, we selected 92 equity mutual funds traded on NASDAQ from 2007 to 2009 (available online at http://finance.yahoo.com).

Fig. 4. Dendrogram of Ward method
For these funds, NAV and market information were extracted and the characteristics were calculated for each fund: return rate, variance, semivariance, turnover rate, Treynor index and Sharpe index. Finally, all data were normalized, giving 92 rows and 6 columns in which all numbers lie in [0,1]. Minitab 14 software was used for the clustering analysis. Observations were clustered using the Ward method and the k-means method with Euclidean distance. Fig. 4 illustrates the dendrogram obtained from the Ward method.
In clustering analysis, the number of clusters must be determined, and there is no generally accepted method for this task. In the Ward method we can fix a distance or similarity level and obtain the number of clusters; in our example the distance level is fixed at 2.5. In the k-means method, however, the number of clusters (k) must be specified before the clustering process. A trial and error procedure can be useful: we first fix k=2 and then increase it until an appropriate number is found. We observed that 4 clusters are most appropriate; with more or fewer than 4 clusters, a few observations with abnormal characteristics were split off. The results obtained from the Ward and k-means methods are shown in Tables 1 and 2. In the next stage, two characteristics (return rate and semivariance) are retained for modeling. The four eliminated characteristics (variance, turnover rate, Treynor index and Sharpe index) helped us cluster more accurately. After the clustering process, the mean (µ) and standard deviation (σ) of each cluster are computed. To make the fuzzy variables we used the original data and set the three parameters of each triangular fuzzy number as in Eq. (22). Tables 3 and 4 show the resulting fuzzy variables for the Ward and k-means clustering methods.

Surviving entries of Tables 3 and 4: Cluster 2 [187.3, 402.955, 0.634, 1.298]; Cluster 3 [10.848, 86.962, 163.076], [0.14, 0.53, 0.92].

The resulting problem is a multi-objective fuzzy non-linear problem. To solve it we assumed 5 levels for α: [0, 0.25, 0.5, 0.75, 1].
For example, if we assume α=0.5 and split the model into two models as explained before, we obtain the two corresponding problems; Eq. (26) can be written in the same way as Eq. (27). In this example, we assumed 3 pairs of weights for the return and risk objective functions: (0.25, 0.75), (0.5, 0.5) and (0.75, 0.25). Tables 5-14 give the complete results for our numerical example. Except for α=1, at all levels of α the investment proportions, the return rate and the risk are intervals. The results show that the return rate is still a triangular fuzzy number. For example, in the k-means method with weights (0.75, 0.25) the return rate is (0.139, 0.568, 1.298). The membership function of the risk variable, however, is not known because its objective function is quadratic. We therefore make fuzzy variables and solve the problem using the α-cut method so that the results are also fuzzy. In the α-cut method, α=1 means that the problem is solved under certainty. In other words, we solve our problem under fuzzy and certain conditions simultaneously. The results show that in the Ward method the investment proportion of Cluster 3 is higher than the others, and in the k-means method that of Cluster 4 is higher. These variables have more influence in forming the portfolio, and we can say that these clusters are better than the others. Referring to the data before clustering, the funds in Cluster 3 of the Ward method and Cluster 4 of the k-means method are listed in Table 15. As Table 15 shows, more than half of the funds are common to the two methods, but the choice of clustering method still affects our methodology. On the other hand, our methodology suggests 16 or 17 mutual funds out of 92 and also determines the investment proportion of each cluster.

Conclusions
Since mutual funds have developed rapidly in recent years, evaluating their performance has become an important subject. Several multi-criteria decision making methods have been explored, and in this paper multi-objective portfolio optimization is chosen. In the current study, performance indices (rate of return, variance, semivariance, turnover rate, Treynor ratio and Sharpe ratio) are calculated and the funds are then clustered based on these indices. We used the Ward method as a hierarchical method and k-means as a non-hierarchical method, separately. We made fuzzy variables from the return rate and semivariance. Risk measurement is very important in portfolio optimization, and semivariance is used here as a downside risk measure. A multi-objective fuzzy mean-semivariance model is built and solved with the fuzzy programming technique.
As the results show, solutions with a confidence level of less than one are intervals. These intervals help investors decide better: they know the range of return and risk and can choose a value within the investment proportion interval. In this sense our methodology takes investors' preferences into account. The results show that we cannot prefer the Ward method to k-means or vice versa, because the Pareto optimal solutions of the two clustering methods do not dominate each other. Because the clustering method affected the results of our methodology, future studies could apply this methodology without clustering and compare the results; another extension would be to complete the optimization model and solve it using advanced techniques such as metaheuristics.

Fig. 2. Membership function of return objective function

As shown in Table 4, Cluster 2 is ignored because its return rate is lower than that of Cluster 4 and its semivariance is higher. Each fuzzy number has a membership function. For example, Eq. (23) gives the membership function of the return rate of Cluster 1 in the Ward method; the membership functions of the other variables can be written in the same way. The fuzzy return rate and semivariance are the inputs of the portfolio optimization model. In our example, there were two multi-objective problems. For one of them, the lower and upper bounds of the first objective function are [0.46, 0.832] and those of the second are [20.783, 215.586]. With these bounds, the final model can be written and solved with an optimization software package.

Table 5
Pareto optimal solution of Ward method with α =0

Table 6
Pareto optimal solution of Ward method with α =0.25

Table 7
Pareto optimal solution of Ward method with α =0.5

Table 8
Pareto optimal solution of Ward method with α =0.75

Table 9
Pareto optimal solution of Ward method with α =1

Table 10
Pareto optimal solution of k-means method with α =0

Table 12
Pareto optimal solution of k-means method with α =0.5

Table 13
Pareto optimal solution of k-means method with α =0.75

Table 14
Pareto optimal solution of k-means method with α =1