Abstract

Enhanced index tracking (EIT) is an active research area in portfolio management that aims to add reliable value relative to a benchmark index while mimicking its behavior. Many approaches have been proposed to solve the EIT problem. However, efficiently generating a high-quality portfolio remains a critical challenge. In this study, we propose a learning-based approach named IntelliPortfolio for the EIT problem. IntelliPortfolio uses PCA and clustering to select stocks and estimates the investment weight for each constituent stock using a long short-term memory (LSTM) network. The proposed algorithm has two advantages. (1) It considers both the fundamentals and the price information of stocks and can balance the trade-off between the performance and the diversity of the selected stocks. (2) It uses an LSTM model to estimate investment weights, which handles long input sequences better and predicts the future trend of the stock market more robustly. Experimental results on five real-world datasets from international stock markets show that the proposed approach significantly outperforms five state-of-the-art algorithms.

1. Introduction

As one of the most important passive investment strategies, index tracking is the process of investing in a portfolio that attempts to match the performance of a specified benchmark index. Today, indexed portfolios often function as the core holding in an investor’s overall equity allocation. The idea of “enhanced index tracking (EIT)” has recently gained importance because more and more fund managers seek to outperform the index while keeping index investing at the core. The reasons are straightforward. By design, index tracking can only produce a return similar to that of the index, so index funds always underperform the index by the amount of fund costs [1]. An enhanced return can therefore give fund managers a competitive advantage after deducting expenses, helping them to attract and reward new customers. Formally, enhanced index tracking is a dual-objective problem that seeks the best trade-off between maximizing expected performance and minimizing tracking risk.

Existing methods for enhanced index tracking fall into three categories: statistic-based, heuristic, and learning-based approaches, as discussed in Section 2. Statistic-based approaches are the most mature and have been studied for many years. However, they require a significant amount of calculation and become unstable when the covariance matrix of the dataset is ill-conditioned or nonpositive [2]. Heuristic methods have been shown to be effective in finding good portfolios [3], but they are inefficient on high-dimensional EIT problems. More specifically, they are prone to falling into local minima when searching a large solution space and often produce suboptimal portfolios [4]. Learning-based approaches, although still in their infancy, are increasingly popular. However, methods in this category are sensitive to the stability of the stock market, and their performance often fluctuates significantly.

To address the above issues, we propose an Intelligent Portfolio algorithm for the enhanced index tracking problem, referred to as IntelliPortfolio, which automatically selects the constituent stocks of the portfolio from a benchmark index and determines the investment weight for each constituent stock. The key ideas of IntelliPortfolio are to select constituent stocks using principal component analysis (PCA) and the k-means clustering algorithm and to estimate the investment weight for each constituent stock using a long short-term memory (LSTM) network. The motivation is twofold. (1) PCA and clustering allow stock selection to consider both the fundamentals and the price information of stocks and to balance the trade-off between the performance and the diversity of the selected stocks. (2) Compared with other recurrent neural networks, LSTM handles long input sequences better and predicts future stock trends more robustly [5]. Our experiments show that this strategy produces better solutions.

In summary, our work makes the following contributions to the field:
(i) We propose a novel principal component analysis (PCA) and clustering-based stock selection algorithm to select constituent stocks for a portfolio from the benchmark index. Our algorithm considers both the fundamentals and the price information of each stock and can balance the trade-off between the performance and the diversity of the selected stocks. Our extensive experiments also show the effectiveness and the generality of the proposed stock selection algorithm.
(ii) We propose the IntelliPortfolio algorithm to solve the EIT problem, which seamlessly integrates the stock selection algorithm with a long short-term memory (LSTM)-based investment weight estimation algorithm. Given a stock dataset and the number of stocks in a portfolio, IntelliPortfolio can both select the constituent stocks of the portfolio and determine the investment weight for each constituent stock.
(iii) We evaluate the performance of IntelliPortfolio through extensive experiments using five real-world datasets of international stock markets. We show that IntelliPortfolio outperforms five state-of-the-art enhanced index tracking algorithms by 8.78%–665.97% in terms of four well-known performance metrics.

2. Related Work

Enhanced index tracking is an active research area in portfolio management. It builds on index tracking, which aims to replicate the performance of the benchmark. Previous studies on this problem can be classified into three categories: statistic-based, heuristic, and learning-based approaches, as discussed below.

2.1. Statistic-Based Approach

Traditionally, the index tracking problem has been treated as a linear or quadratic programming problem. Wu et al. [6] presented several constrained cluster-based linear mixed-integer optimization models for tracking broad market indices using Lagrangian and semi-Lagrangian relaxation approaches to compute near-optimal solutions. Chen and Kwon [2] developed a 0-1 integer program model considering initial portfolio selection and subsequent investment in assets. Fang and Wang [7] proposed a bi-objective programming model for the index tracking portfolio selection problem. Edirisinghe [8] developed a closed-form cost-free solution to the index tracking portfolio selection problem. Wu and Wu [9] designed a multifactor linear regression model as the basis of the tracking models to enhance the capacity of the decision model and applied a Lagrangian-based algorithm to approximate optimal solutions. Oliver and Baumann [10] proposed a value-based mixed-integer linear programming (MILP) formulation for the index tracking problem that leads to a high similarity in terms of the normalized historical value developments between the tracking portfolio and the index and to low rebalancing costs.

The statistic-based approach is a classic category of solutions to the EIT problem and has been studied for many years. However, it requires high-precision datasets and a significant amount of calculation. Moreover, it suffers from poor stability when the covariance matrix of the dataset is ill-conditioned or nonpositive [2].

2.2. Heuristic Approach

Because the enhanced index tracking problem has been proven to be NP-hard [11–15], many previous studies apply heuristic algorithms to solve it. Mutunge and Haugland [16] proposed a greedy constructive heuristic algorithm that can extend the current portfolio by a single asset in each iteration. Sant’Anna et al. [17] applied a hybrid solution approach combining mathematical programming and genetic algorithm to deal with the volatility problem of stock markets in developing countries. Chen et al. [18] introduced an indexed portfolio optimization model using the mean-variance-skewness framework and proposed a hybrid algorithm combining the firefly algorithm (FA) and the genetic algorithm (GA). Strub and Oliver [19] developed an iterated greedy heuristic for replicating the 1/N portfolio by investing in a subset from a given investment universe. Salehpoor and Molla-Alizadeh-Zavardehi [20] presented a hybrid metaheuristic algorithm to solve the index tracking problem. Beasley et al. [21] presented an evolutionary heuristic algorithm for the index tracking problem. Orito et al. [22] proposed a two-step stock selection algorithm that uses a heuristic method to select stocks and then uses a genetic algorithm to construct a portfolio. Oh et al. [23] used GA to optimize index funds, aiming at tracking the stock index and minimizing the tracking error. Roland and Berg [24] proposed a hybrid algorithm combining GA with quadratic programming to search for the optimal tracking portfolio. Kumar and Mishra [11] proposed a multiobjective optimizer for portfolio optimization based on the covariance-guided Artificial Bee Colony (ABC) algorithm. Chen et al. [3] proposed a grouping genetic algorithm for solving the group trading strategy portfolio (GTSP) problem. Saborido et al. [25] proposed the mean-downside risk skewness (MDRS) model and defined new mutation, crossover, and reparation operators for an evolutionary multiobjective optimization to solve the index tracking problem. Benidis et al. [26] provided a unified framework for a large variety of sparse index tracking formulations and derived a mixed-integer programming (MIP)-based algorithm considering various tracking error functions and constraints. Canakgoz and Beasley [27] presented mixed-integer linear programming formulations for the index tracking problem. García et al. [28] proposed a genetic algorithm and tabu search-based approach for index tracking optimization.

Heuristic methods have been shown to be effective in finding good portfolios [3], but they are often inefficient on EIT problems with a high-dimensional space [29]. They are prone to falling into local minima when searching a large solution space [4] and often produce suboptimal portfolios.

2.3. Learning-Based Approach

Fu et al. [29] presented a stacking stock selection model based on supervised learning, used a genetic algorithm to select stock features, and labelled stocks according to the return-to-volatility ratio ordering. Dose and Cincotti [30] presented a stochastic-optimization technique based on time-series cluster analysis for the index tracking problem. Ouyang et al. [4] used the deep autoencoder to select the portfolio and then used the deep neural network to dynamically determine the portfolio weight. Zhang and Tan [31] proposed a new stock selection model named DeepStockRanker to predict the future stock return ranking based on the historical data without handcrafted features. Chalvatzis and Hristu-Varsakelis [32] proposed a deep long short-term memory (LSTM) model to predict asset prices and a prediction-based trading strategy. Paiva et al. [33] proposed a method combining support vector machine (SVM) and mean-variance (MV) for portfolio selection. Jiang et al. [34] proposed a deep reinforcement learning framework aiming to maximize cumulative returns using deterministic policy gradient (DPG). Liang et al. [35] implemented three new continuous reinforcement learning algorithms, namely, deep deterministic policy gradient (DDPG), proximal policy optimization (PPO), and policy gradient (PG), in portfolio management. Moody et al. [36] trained two kinds of reinforcement learning methods, namely, real-time recurrent learning (RTRL) and Q-learning, to solve the index tracking problem. Park et al. [37] proposed an approach for deriving a multiasset portfolio trading strategy using deep Q-learning. Lu [38] implemented a learning model using LSTM with reinforcement learning or evolution strategies as agents. Zhang and Maringer [39] developed a model that combines GA with recurrent reinforcement learning (RRL) for asset trading. García-Galicia et al. [40] provided a reinforcement learning model in continuous-time discrete-state portfolio management with time penalization for transaction cost. Vo et al. [41] introduced a deep responsible investment portfolio model containing a multivariate bidirectional LSTM neural network to predict stock returns.

Learning-based approaches have received much attention recently. However, many learning-based algorithms construct the portfolio by predicting the future price of stocks, which has been shown to be inaccurate [41]. Reinforcement learning-based algorithms, by contrast, can construct the portfolio adaptively. However, our experiments show that they are very sensitive to the stability of the stock market, and their performance fluctuates significantly.

3. Problem Statement

This study focuses on the enhanced index tracking problem, referred to as EIT, that aims to produce a portfolio that attempts to earn a higher return than the benchmark index (excess return) while minimizing the risk of deviating from the benchmark (tracking risk). In this section, we introduce notations and terminology first and then define the EIT problem formally.

3.1. Notations and Terminology

(i) $t$: time point.
(ii) $T$: decision time point. $[1, T]$ is the in-sample time period used to select the tracking portfolio, and $[T+1, T+L]$ is the out-of-sample time period used to evaluate it.
(iii) $N$: number of stocks in the benchmark index.
(iv) $K$: number of stocks in the portfolio.
(v) $P_{i,t}$: price of stock $i$ at time point $t$.
(vi) $I_t$: index value at time $t$.
(vii) $z_{i,t}$: stock selection indicator for a portfolio. If the $i$th stock is selected to form the portfolio at time point $t$, $z_{i,t} = 1$; otherwise $z_{i,t} = 0$.
(viii) $w_i$: investment weight for the $i$th stock in the tracking portfolio.
(ix) $r_t$: the single-period continuous-time return of the tracking portfolio between time points $t-1$ and $t$:
$$r_t = \ln \frac{\sum_{i=1}^{N} z_{i,t}\, w_i\, P_{i,t}}{\sum_{i=1}^{N} z_{i,t}\, w_i\, P_{i,t-1}} \tag{1}$$
(x) $R_t$: the single-period continuous-time return of the benchmark index between time points $t-1$ and $t$:
$$R_t = \ln \frac{I_t}{I_{t-1}} \tag{2}$$
(xi) $TE$: tracking error of the portfolio, which is defined as the distance between the returns of the tracking portfolio and its benchmark index [1]:
$$TE = \sqrt{\frac{1}{T} \sum_{t=1}^{T} (r_t - R_t)^2} \tag{3}$$
(xii) $ER$: excess return of the portfolio, which is assessed by the average excess return per period achieved by the tracking portfolio [1]:
$$ER = \frac{1}{T} \sum_{t=1}^{T} (r_t - R_t) \tag{4}$$
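The quantities in this list can be illustrated with a small numerical sketch. It assumes log returns for both the portfolio and the index, a root-mean-square form for the tracking error, and a plain average for the excess return; the function names and the toy prices are hypothetical and only serve the illustration.

```python
import math

def log_returns(values):
    """Single-period continuous (log) returns ln(v_t / v_{t-1})."""
    return [math.log(cur / prev) for prev, cur in zip(values, values[1:])]

def portfolio_values(prices, weights):
    """Value of a weighted basket at each time point.

    prices[i][t] is the price of stock i at time t; weights[i] is its weight.
    """
    n_times = len(prices[0])
    return [sum(w * p[t] for w, p in zip(weights, prices)) for t in range(n_times)]

def tracking_error(r, R):
    """Root-mean-square distance between the two return series (assumed form)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(r, R)) / len(r))

def excess_return(r, R):
    """Average per-period return of the portfolio in excess of the index."""
    return sum(a - b for a, b in zip(r, R)) / len(r)

# Toy data: two selected stocks with equal weights and a three-point index series.
prices = [[10.0, 11.0, 12.1], [20.0, 20.0, 20.0]]
weights = [0.5, 0.5]
index = [100.0, 102.0, 104.04]

r = log_returns(portfolio_values(prices, weights))
R = log_returns(index)
te = tracking_error(r, R)
er = excess_return(r, R)
```

With these definitions a portfolio that replicates the index exactly has zero tracking error and zero excess return, which is why EIT has to trade the two quantities off against each other.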

3.2. Problem Formulation

As shown in Figure 1, the EIT problem consists of two phases, named in-sample and out-of-sample, and the decision time point is $T$. Any solution to the EIT problem first needs to construct a tracking portfolio according to the performance of the stock market during the in-sample phase (from time point 1 to $T$ in Figure 1). The ultimate goal is to find an optimal tracking portfolio that obtains the minimum tracking error and the maximum excess return during the out-of-sample phase (from time point $T+1$ to $T+L$ in Figure 1). However, it is impossible to optimize this problem directly because a solution has to finish the construction of a portfolio at the decision time point $T$, and the price of a stock during the out-of-sample period (i.e., $P_{i,t}$ for $t > T$) is unknown at $T$. To deal with this issue, we follow previous studies [1] and simplify the EIT problem by optimizing the portfolio during the in-sample phase instead of the out-of-sample phase, which assumes that the stock prices in $[1, T]$ and $[T+1, T+L]$ are independent and identically distributed (i.i.d.).

In summary, our EIT problem can be defined as a two-objective optimization problem in which $K$ and $T$ are given. Equations (5) and (7) state that the goal of EIT is to optimize the tracking error and the excess return simultaneously.

Equations (9)–(11) state the constraints for the EIT problem. Equation (9) ensures that the number of stocks selected in the tracking portfolio is equal to $K$ at any time point. Equation (10) ensures that no short positions are considered in a tracking portfolio. Equation (11) normalizes the investment weight for each stock in the tracking portfolio.
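As a hedged illustration, the two objectives and three constraints described above can be written compactly as follows. The symbols are assumptions made for this sketch: $z_{i,t} \in \{0, 1\}$ marks whether stock $i$ is selected at time $t$, $w_i$ is its investment weight, $N$ is the number of stocks in the index, and $K$ is the portfolio size. The exact form and numbering of the original equations may differ.

```latex
\begin{aligned}
\min\ & TE \\
\max\ & ER \\
\text{s.t.}\ & \sum_{i=1}^{N} z_{i,t} = K
  && \text{(cardinality: exactly $K$ stocks are held)} \\
& w_i \ge 0, \quad i = 1, \dots, N
  && \text{(no short positions)} \\
& \sum_{i=1}^{N} z_{i,t}\, w_i = 1
  && \text{(weights are normalized)} \\
& z_{i,t} \in \{0, 1\}, \quad i = 1, \dots, N
\end{aligned}
```

The binary selection variables are what make the problem combinatorial, while the weights alone would give a continuous optimization problem.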

3.3. Complexity Analysis

This formulation shows that the goal of the EIT problem is to search for an optimal portfolio consisting of $K$ stocks, each of which is selected from a known benchmark index containing $N$ stocks, to minimize the tracking error and maximize the excess return. According to previous studies [11–15], the enhanced index tracking problem is essentially a classic combinatorial optimization (CO) problem and has been proven to be NP-hard. The proof is established in [15]. Therefore, a naive exhaustive search would not be practical because of the high dimensionality of the decision space and the combinatorial nature of brute-force search.
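To give a feeling for the size of the search space, the sketch below counts the possible stock subsets only, before any weights are chosen. The index size of 28 and the portfolio size of 10 are illustrative values chosen for this example, not settings reported in this study.

```python
import math

def subset_count(n_index, k_portfolio):
    """Number of ways to choose k_portfolio stocks out of n_index candidates."""
    return math.comb(n_index, k_portfolio)

# Illustrative sizes only: a 28-stock index and a 10-stock portfolio.
small_case = subset_count(28, 10)
# A larger hypothetical case grows far beyond what exhaustive search can cover.
large_case = subset_count(200, 30)
```

Each of these subsets would still require a continuous weight optimization, so the cost of brute force is at least the subset count multiplied by the cost of one weight search.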

4. Algorithm Design and Implementation

In this section, we introduce IntelliPortfolio, an efficient clustering and LSTM-based portfolio construction algorithm to solve the EIT problem. Its key idea is twofold: firstly, it applies principal component analysis (PCA) and k-means clustering algorithm to automatically select constituent stocks for the portfolio from the benchmark index; secondly, it uses a long short-term memory (LSTM) network to determine the investment weight for each constituent stock in the portfolio. In the following, we first present the overview of our IntelliPortfolio algorithm. We then present the constituent stock selection algorithm and the weight estimation algorithm. Finally, we discuss the details of the IntelliPortfolio algorithm.

4.1. Overview

Figure 2 illustrates the detailed steps of IntelliPortfolio. IntelliPortfolio contains two phases: constituent stock selection and investment weight estimation. At the constituent stock selection phase, after normalizing the original dataset, we use a PCA-based algorithm to reduce the dimensionality of the original dataset and apply a k-means clustering algorithm to generate clusters from the benchmark index and then add stocks that are closest to the center in each cluster to the portfolio. At the investment weight estimation phase, we adopt a novel windowed-random sampling strategy to generate random samples with fixed-window size and apply the LSTM model to estimate the investment weight for each constituent stock, which finally finishes the construction of the portfolio.

4.2. Constituent Stock Selection

The first step of constructing a tracking portfolio is to select $K$ constituent stocks from the benchmark index consisting of $N$ stocks. Previous studies on this problem include the industry-based method [34], the trading volume-based method [23], and the autoencoder-based method [4]. The industry-based method takes into account the industry, market capitalization, and trading amount of each stock. Such an approach is interpretable and easy to understand but fails to reflect the trend in the stock market [1]. The trading volume-based method, by contrast, can indicate the fluctuation of the stock price effectively but ignores the fundamentals of a stock. The autoencoder-based method trains a deep encoder-decoder network on the stock price dataset and can adaptively select the stocks with the largest and smallest training error. However, it suffers from low performance when the benchmark index contains a large number of stocks.

To deal with these issues, we propose a novel stock selection algorithm integrating PCA and clustering. The key idea of our algorithm is twofold. (1) It considers both the fundamentals and the price information of each stock by receiving all known features from a dataset and using a PCA algorithm to find the major factors. (2) It uses a clustering algorithm to automatically divide the stocks in the benchmark index into $K$ clusters and selects the stocks that are closest to the center of each cluster as the constituent stocks of the portfolio. Our extensive experiments show that this selection strategy can balance the trade-off between the performance and the diversity of the selected stocks.

Specifically, the detailed process of our stock selection algorithm is listed in Algorithm 1.

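Require: $D$: original dataset; $K$: the number of stocks in the portfolio; $m$: the desired number of features after dimensionality reduction; $t_s$: the start time point for selection; $\Delta$: the decision time interval for stock selection.
Ensure: $S$: constituent stocks in the portfolio; $D_S$: the dataset consisting of the information of constituent stocks in the portfolio.
(1) $C \leftarrow \emptyset$;
(2) $S \leftarrow \emptyset$;
(3) Perform mean-variance normalization on every feature of all stocks in $D$;
(4) $D' \leftarrow \mathrm{PCA}(D, m)$;
(5) for $t = t_s$ to $t_s + \Delta$ do
(6) $G_t \leftarrow$ k-means clustering of the stocks in $D'$ at time point $t$ with cluster number $K$;
(7) $C_t \leftarrow$ the stocks closest to the center of each cluster in $G_t$ according to $dist(\cdot)$;
(8) $C \leftarrow C \cup C_t$;
(9) end for
(10) for each stock $i$ in $C$ do
(11) $n_i \leftarrow$ the total number of occurrences of $i$ in $C$;
(12) if $n_i$ is among the $K$ largest numbers of occurrences, $S \leftarrow S \cup \{i\}$;
(13) end for
(14) $D_S \leftarrow$ the information for the stocks in $S$ extracted from $D$; return $(S, D_S)$;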

As shown in Algorithm 1, the stock selection algorithm starts by performing mean-variance normalization on every feature in the original dataset $D$ (line 3). After that, it applies a PCA algorithm [42] to reduce the number of features in $D$ to $m$, where $m$ is a small positive integer given by a human expert (line 4). The idea here is to retain the important features, which carry as much of the variance of the dataset as possible, while removing the insignificant features whose variance is close to zero.
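The normalization and dimensionality reduction steps can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used in the experiments: it assumes one feature vector per stock, computes the principal components by power iteration with deflation, and all function names and the toy feature matrix are hypothetical.

```python
import math

def zscore_columns(rows):
    """Mean-variance normalization of every feature (column)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def covariance(rows):
    """Covariance matrix of rows that are already centered."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) / n for b in range(d)]
            for a in range(d)]

def principal_components(cov, m, iters=500):
    """Top-m eigenvectors of a symmetric matrix via power iteration with deflation."""
    d = len(cov)
    cov = [row[:] for row in cov]
    components = []
    for _component in range(m):
        v = [1.0 / (j + 1) for j in range(d)]
        for _step in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[a] * cov[a][b] * v[b] for a in range(d) for b in range(d))
        components.append(v)
        # Remove the found direction so the next pass finds the next component.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    return components

def project(rows, components):
    """Coordinates of every row in the space spanned by the components."""
    return [[sum(x * c for x, c in zip(row, comp)) for comp in components]
            for row in rows]

# Toy data: 4 stocks, 3 features; the first two features are perfectly correlated.
features = [[1.0, 2.0, 0.5], [2.0, 4.0, 0.1], [3.0, 6.0, 0.9], [4.0, 8.0, 0.3]]
normalized = zscore_columns(features)
components = principal_components(covariance(normalized), m=2)
reduced = project(normalized, components)
```

In the toy data the two correlated features collapse into one direction, so two components retain almost all of the variance of the three original features.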

Given the start time point $t_s$ and the time interval $\Delta$ that are used to select the stocks, we apply a k-means clustering algorithm with cluster number $K$ $\Delta$ times to generate $\Delta$ different sets of clusters, each of which represents the clustering result at one time point of the period $[t_s, t_s + \Delta]$ (line 6). For each time point $t$, we find the stocks that are closest to the center of each cluster as the candidate constituent stocks of the portfolio (line 7), where the function $dist(\cdot)$ measures the distance between a stock and the center of a cluster. Once we have the candidate stocks at each time point, we put them together (line 8), compute the total number of occurrences of each stock during $[t_s, t_s + \Delta]$ (line 11), and then take the $K$ stocks with the largest numbers of occurrences (i.e., $S$) as the constituent stocks of the portfolio. Finally, the algorithm extracts the information for the stocks in $S$ from the dataset $D$ (line 14) and returns the two-tuple $(S, D_S)$. Note that the characteristic of this selection algorithm is that it can consider both the fundamentals and the price information of each stock over a period of time.
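The clustering and voting steps described above can be sketched as follows. This is an illustrative reading of the procedure, assuming that the reduced feature vectors are already available per time point; the small k-means routine, the squared Euclidean distance, and all names are hypothetical choices for this sketch.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd iterations; returns the centers and the member indices per cluster."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[nearest].append(idx)
        for j, members in enumerate(groups):
            if members:
                dim = len(points[0])
                centers[j] = [sum(points[i][d] for i in members) / len(members)
                              for d in range(dim)]
    return centers, groups

def closest_to_centers(points, centers, groups):
    """Index of the stock nearest to the center of every non-empty cluster."""
    return [min(members, key=lambda i: sq_dist(points[i], center))
            for center, members in zip(centers, groups) if members]

def select_stocks(features_by_time, k):
    """features_by_time[t][i] is the reduced feature vector of stock i at time t."""
    votes = Counter()
    for points in features_by_time:
        centers, groups = kmeans(points, k)
        votes.update(closest_to_centers(points, centers, groups))
    # Keep the k stocks that were picked most often over the whole interval.
    return [stock for stock, _ in votes.most_common(k)]

# Toy data: six stocks forming two well-separated groups, observed at three time points.
snapshot = [[0.0, 0.0], [0.1, 0.0], [-0.1, 0.0],
            [10.0, 10.0], [10.2, 10.0], [9.8, 10.0]]
selected = select_stocks([snapshot, snapshot, snapshot], k=2)
```

Picking one representative per cluster favors diversity, while counting occurrences over many time points favors stocks that are consistently central rather than central on a single day.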

4.3. Investment Weight Estimation

After determining constituent stocks for the portfolio, we need to assign the investment weight for each constituent stock. In this study, we propose a long short-term memory (LSTM) network to estimate the investment weight for each constituent stock, as shown in Figure 3.

As shown in Figure 3, our LSTM network is composed of an input layer, a hidden layer, and an output layer. The number of neurons in the input layer is equal to the number of explanatory variables (feature space) retained after the dimensionality reduction in stock selection. The number of neurons in the output layer reflects the output space, i.e., $K$ neurons in our case, indicating the weight of each constituent stock in the portfolio. The main characteristic of LSTM networks is contained in the hidden layer, which consists of so-called memory cells. Each memory cell has three gates that maintain and adjust its cell state $c_t$: a forget gate $f_t$, an input gate $i_t$, and an output gate $o_t$.
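For reference, the gate computations of a memory cell can be written in the standard textbook form below. The notation is an assumption for this sketch: $x_t$ is the input at time point $t$, $h_t$ is the hidden state, $\tilde{c}_t$ is the candidate cell state, $W$, $U$, and $b$ are learned weights and biases, $\sigma$ is the logistic sigmoid, and $\odot$ is the element-wise product.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The forget gate decides how much of the previous cell state is kept, the input gate decides how much of the new candidate is written, and the output gate decides how much of the cell state is exposed as the hidden state.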

When processing an input sequence, its features are presented time point by time point to the LSTM network. Hereby, the input at each time point (in our case, the information of constituent stocks) is processed by the network as denoted in the equations above. Once the last element of the sequence has been processed, the final output for the whole sequence is returned. The detailed process is described as follows.

Require: $D_S$: the dataset consisting of information of constituent stocks; $l$: the length of a data sequence for training; $n$: training iterations.
Ensure: $M$: the LSTM model for weight estimation.
(1);
(2);
(3)fordo
(4);
(5);
(6);
(7)end for return;

The weight estimation algorithm first initializes the LSTM model $M$ with random parameters (line 1) and splits the dataset $D_S$ into the training dataset $D_{train}$ containing continuous data sequences of length $l$ (line 2). During each iteration, it randomly picks a data sequence from $D_{train}$ (line 3), generates a newer LSTM model (line 4), and finally optimizes the model according to the loss function (lines 5 and 6). More specifically, the loss function is defined as the weighted sum of $TE$ and $ER$, and the Adam optimizer [43] is used in the training process.
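The data handling around the LSTM model can be sketched without any deep learning library. The sketch makes three labeled assumptions: fixed-length windows are taken at every possible offset, a softmax turns raw network outputs into non-negative weights that sum to one, and the loss combines the two objectives as `lam * TE - (1 - lam) * ER` so that minimizing it lowers the tracking error and raises the excess return. The sign convention, the helper names, and the weight `lam` are hypothetical.

```python
import math
import random

def make_windows(series, length):
    """All contiguous windows of the given length, in time order."""
    return [series[s:s + length] for s in range(len(series) - length + 1)]

def softmax(scores):
    """Map raw scores to non-negative weights that sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def eit_loss(te, er, lam):
    """Weighted combination of the two objectives (assumed sign convention)."""
    return lam * te - (1.0 - lam) * er

rng = random.Random(0)
days = list(range(10))              # stand-in for ten time points of stock data
train_windows = make_windows(days, 4)
batch = rng.choice(train_windows)   # one randomly picked training sequence
weights = softmax([0.2, -1.0, 0.7])
loss = eit_loss(te=0.02, er=0.01, lam=0.5)
```

In a full implementation the raw scores would come from the output layer of the network, and the loss would be differentiated with respect to the network parameters by the optimizer.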

4.4. IntelliPortfolio Algorithm

We are now ready to describe our IntelliPortfolio algorithm in Algorithm 3. IntelliPortfolio first cleans the original dataset $D$ by removing stocks with missing data (line 1) and selects $K$ stocks to form the initial portfolio (line 2). After that, it obtains an LSTM-based weight estimation model $M$ using the stock information $D_S$ of the portfolio (line 3). To finish the final optimization, IntelliPortfolio constructs the test dataset $D_{test}$ by extracting the data covering the time interval from $T - \tau$ to $T$ in $D_S$ (line 4) and produces the final investment weights $W$ for the constituent stocks according to $M$ and $D_{test}$ (line 5).

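Require: $D$: the original dataset; $K$: the number of stocks in the portfolio; $T$: the decision time point.
Ensure: $W$: the set of weight values of constituent stocks in the portfolio.
(1) $D \leftarrow$ remove the stocks with missing data from $D$;
(2) $(S, D_S) \leftarrow$ select $K$ constituent stocks from $D$ using Algorithm 1;
(3) $M \leftarrow$ train the LSTM-based weight estimation model on $D_S$ using Algorithm 2;
(4) $D_{test} \leftarrow$ the data in $D_S$ from $T - \tau$ to $T$;
(5) $W \leftarrow M(D_{test})$; return $W$;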
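The two data preparation steps of the overall procedure, cleaning and extracting the final window, can be sketched as below. The dictionary layout, the use of `None` for missing values, and the function names are assumptions made for this illustration.

```python
def drop_incomplete(dataset):
    """Keep only the stocks whose series has no missing value.

    dataset maps a ticker to its list of daily values; None marks missing data.
    """
    return {ticker: series for ticker, series in dataset.items()
            if all(value is not None for value in series)}

def final_window(series, decision_time, interval):
    """The last `interval` observations up to the decision time point."""
    return series[decision_time - interval:decision_time]

raw = {"AAA": [1.0, 1.1, 1.2], "BBB": [2.0, None, 2.2], "CCC": [3.0, 3.1, 3.3]}
clean = drop_incomplete(raw)
window = final_window(list(range(10)), decision_time=8, interval=3)
```

Dropping whole stocks rather than imputing values keeps every remaining series aligned on the same trading days, at the cost of a smaller candidate set.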

Note that, according to the definition of the EIT problem, $D$, $K$, and $T$ are given. Our IntelliPortfolio algorithm contains seven hyperparameters, namely, $m$, $t_s$, $\Delta$, $l$, $\lambda$, $n$, and $\tau$, where
(i) $m$: the desired number of features after dimensionality reduction on $D$
(ii) $t_s$: the start time point for stock selection
(iii) $\Delta$: the time period for stock selection
(iv) $l$: the length of a data sequence used for training an LSTM model
(v) $\lambda$: the weight value for the two objectives $TE$ and $ER$
(vi) $n$: the training iterations of the LSTM model
(vii) $\tau$: the time interval for final optimization

We will report the values of these hyperparameters in Section 5 and explain the basic principle for determining the values for some important hyperparameters.

5. Experiments

We have implemented our approach and conducted extensive experiments on five real-world datasets of international stock markets. The source code and the data can be found at https://github.com/anon4review/IntelliPortfolio. In this section, we first describe our experiment setup and then present the experimental results to demonstrate the efficiency and effectiveness of the proposed approach.

5.1. Experimental Settings
5.1.1. Datasets and Running Environment

We choose five well-known stock indices in the international stock markets as our datasets to evaluate our IntelliPortfolio algorithm, namely, SSE constituent index (SSE180) [44], Dow Jones Industrial Average (DJIA) [45], Financial Times Stock Exchange 100 Index (FTSE100) [46], Hang Seng Index (HSI) [47], and Nikkei stock average (Nikkei225) [48]:
(i) SSE180 chooses 180 sample stocks from all 1000 A-share stocks listed in Shanghai, China. It reflects the profile and operation of the Shanghai stock market.
(ii) DJIA consists of stocks of 30 representative large industrial and commercial companies, which can roughly reflect the price level of the entire industrial and commercial stocks in the United States.
(iii) FTSE100 contains 100 representative stocks of influential companies in Europe. It is one of the most important indicators for global investors to observe the trend of European stocks [21].
(iv) HSI is an important indicator of Hong Kong stock market prices and represents 63% of the 12-month average market capitalization rate of all listed companies on the Hong Kong Stock Exchange; it contains 50 representative stocks.
(v) Nikkei225 contains 225 stocks with good continuity and comparability. It is the most common and reliable indicator for examining and analyzing the long-term evolution and dynamics of the Japanese stock market.

The available features of each dataset are shown in Table 1.

For each index, we extract ten-year (2431 trading days) data from 2009 to 2018 and remove stocks with missing data. As a result, we have 111 stocks in SSE180, 28 in DJIA, 58 in FTSE100, 39 in HSI, and 201 in Nikkei225. For each dataset, we choose 2371 trading days as the in-sample period (i.e., training dataset) and use the last 60 trading days as out-of-sample period (i.e., test dataset).

Table 2 lists the number of constituent stocks (i.e., $K$) in the resulting portfolio.

All experiments run on a computer equipped with four processors, 8 GiB of RAM, and a 512 GB disk, running Windows 10. To ensure consistency, we run each algorithm five times and report the average of these five runs.

5.1.2. Performance Metrics

We use four metrics to evaluate the performance of the algorithms, namely, tracking error [4, 16, 23], excess return [21], information ratio [2, 49], and the Sharpe ratio [34, 36, 38], defined as follows:
(i) Tracking error $TE$: the performance difference between the portfolio and the benchmark index, which measures the tracking accuracy of the portfolio. We have defined $TE$ in equation (3) and want to minimize its value.
(ii) Excess return $ER$: the average excess return per period achieved by the tracking portfolio compared to its benchmark index, which is defined in equation (4). We want to maximize its value.
(iii) Information ratio $IR$: a measurement of portfolio returns beyond the returns of a benchmark index, compared to the volatility of those returns. $IR$ during a given time interval can be defined as
$$IR = \frac{ER}{TE},$$
where $ER$ is the excess return and $TE$ is the tracking error. We want to maximize the value of $IR$.
(iv) Sharpe ratio $SR$: a well-known measurement that indicates the performance of an investment (i.e., the portfolio in our case) compared to a risk-free asset, after adjusting for its risk. $SR$ during a given time interval is defined as
$$SR = \frac{E[r_t - r_f]}{\sigma[r_t - r_f]},$$
where $E$ is the expectation, $\sigma$ is the standard deviation, $r_t$ is the return of the portfolio, and $r_f$ is the return of the risk-free asset. We want to maximize the value of $SR$.

The performance improvement of an algorithm over a baseline algorithm in comparison is defined as
$$\text{Improvement} = \frac{P_e - P_b}{|P_b|} \times 100\%,$$
where $P_b$ is the performance of the baseline algorithm and $P_e$ is that of the algorithm being evaluated.
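The ratio metrics and the improvement measure can be computed as in the following sketch. It assumes the usual textbook forms: the information ratio as excess return divided by tracking error, the Sharpe ratio as the mean of the returns in excess of a risk-free rate divided by their population standard deviation, and the improvement as a relative change normalized by the absolute baseline value. The function names and sample numbers are hypothetical.

```python
import statistics

def information_ratio(er, te):
    """Excess return per unit of tracking error."""
    return er / te

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return over the risk-free rate divided by its standard deviation."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.pstdev(excess)

def improvement(p_eval, p_base):
    """Relative improvement in percent for a metric where larger is better."""
    return (p_eval - p_base) / abs(p_base) * 100.0

ir = information_ratio(er=0.01, te=0.02)
sr = sharpe_ratio([0.01, 0.03])
gain = improvement(p_eval=1.5, p_base=1.0)
gain_negative_base = improvement(p_eval=-0.5, p_base=-1.0)
```

Dividing by the absolute baseline keeps the sign of the improvement meaningful when the baseline value is negative, as can happen for excess return. For a metric that should be minimized, such as tracking error, the sign would have to be flipped.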

5.1.3. Baseline Algorithms and Hyperparameters

Because IntelliPortfolio is a two-step optimization approach consisting of stock selection and weight estimation algorithms, we need to evaluate the performance of our stock selection algorithm first. Specifically, we compare it with four baseline methods, namely, random method (Random), industry-based method (Industry) [34], trading volume-based method (Volume) [23], and autoencoder-based method (Autoencoder) [4]. We provide a brief description of each method as follows and report its hyperparameters (including those of IntelliPortfolio) in Table 3.
(i) Random method (Random) randomly selects $K$ constituent stocks from the benchmark index.
(ii) Industry-based method (Industry) selects $K$ constituent stocks from the benchmark index considering the industry, the market capitalization, and the trading amount information of each stock in the market.
(iii) Trading volume-based method (Volume) chooses the top $K$ stocks from the benchmark index by sorting the total trading volume from large to small during a fixed period.
(iv) Autoencoder-based method (Autoencoder) trains an encoder-decoder network using the historical dataset and selects the stocks with the largest and smallest training error.

Finally, to evaluate the overall performance of our IntelliPortfolio method, we compare it with five state-of-the-art algorithms, namely, genetic algorithm and recurrent reinforcement learning (GA-RRL) [39], deep deterministic policy gradient (DDPG) [35], recurrent reinforcement learning (RRL) [38], DPG [34], and heuristic genetic algorithm (HGA) [21]. We provide a brief description of each algorithm as follows and report its hyperparameters (including those of IntelliPortfolio) in Table 4.
(i) GA-RRL uses a genetic algorithm (GA) to improve the trading results of an RRL-type equity trading system. It takes advantage of GA’s capability to select an optimal combination of technical indicators, fundamental indicators, and volatility indicators for improving out-of-sample trading performance.
(ii) DDPG adopts a deep deterministic policy gradient algorithm to implement portfolio management, in which the agent takes the stock data during a fixed time interval and the current stock weights as the observed environment and derives the weights of the portfolio for the next day.
(iii) RRL uses the trading volume-based method to select the constituent stocks of a portfolio and adopts recurrent reinforcement learning and an LSTM model to implement portfolio management.
(iv) DPG is a financial-model-free reinforcement learning framework that provides a deep machine learning solution to the portfolio management problem. It consists of the Ensemble of Identical Independent Evaluators (EIIE) topology, a Portfolio-Vector Memory (PVM), an Online Stochastic Batch Learning (OSBL) scheme, and a fully exploiting and explicit reward function.
(v) HGA selects stocks and determines portfolio weights based on the genetic algorithm (GA) and uses a heuristic method to update the population.

5.2. Experiment Results
5.2.1. Hyperparameter Estimation Results for Stock Selection Algorithm

Two important hyperparameters are involved in our stock selection algorithm: the start time point of the stock selection process and the length of the time interval used for stock selection. Specifically, we divide each dataset into three subsets: a validation set containing 60 days of data, a test set containing 60 days of data, and a training set containing all of the remaining data. In our experiments, we choose time intervals of 60, 90, and 120 days for stock selection, as suggested in [1, 21, 38, 39]. For each time interval, we try the first, the middle, and the last time point as the start time point of the stock selection process.
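The split and the candidate selection windows described above can be sketched as follows. The sketch rests on assumptions that the text does not state: the three subsets are taken in chronological order (training, then validation, then test), the "last" start point is the latest start for which the window still fits inside the training set, and the "middle" start point is half of that value. The function names and the 1000-day dataset size are hypothetical.

```python
def split_dataset(n_days, n_val=60, n_test=60):
    """Return (train, val, test) as half-open index ranges over trading days.

    Assumes chronological order: training days first, then validation, then test.
    """
    n_train = n_days - n_val - n_test
    if n_train <= 0:
        raise ValueError("dataset too short for the requested split")
    return (0, n_train), (n_train, n_train + n_val), (n_train + n_val, n_days)


def candidate_windows(n_train, lengths=(60, 90, 120)):
    """List (start, length) pairs: first, middle, and last start per length."""
    windows = []
    for length in lengths:
        last = n_train - length  # latest start that still fits in the training set
        for start in (0, last // 2, last):
            windows.append((start, length))
    return windows


train, val, test = split_dataset(1000)  # 1000 days is a made-up dataset size
windows = candidate_windows(train[1])
```

Each dataset therefore yields nine (start, length) combinations, one per pairing of an interval length with a start position.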

Based on the above experiment design, we performed 45 = 5 × 3 × 3 groups of experiments and, due to length limitations, report only the results on the SSE180 dataset in Table 5. Detailed results can be found in our online repository.

We can see from Table 5 that the results for the last stage containing 120 days (i.e., a start time point of 2191 and an interval length of 120, denoted as 2191-120) outperform the other eight combinations by obtaining the best values on three of our four performance metrics. On the remaining metric, the values obtained by these combinations are not significantly different, whereas the differences on the other three metrics are, so we can safely treat it as an unimportant metric here. Moreover, in terms of the metric that reflects both the risks and the benefits of a portfolio, the 2191-120 combination outperforms the others significantly.

The above results are reasonable for two reasons. (1) The latest time point reflects the recent trend of the stock market, which is consistent with the findings of previous studies [1, 23, 35]. (2) The longest time interval most fully reflects the trends of the stocks, which are in turn utilized by our LSTM model. In summary, we conclude that the latest 120 days of data should be used for stock selection, and we adopt this setting in the following experiments.

5.2.2. Comparison Results with Different Stock Selection Algorithms

To demonstrate the effectiveness and the robustness of our selection algorithm, we combine five stock selection algorithms (i.e., Random, Industry, Volume, Autoencoder, and IntelliPortfolio), five EIT algorithms that require a stock selection process (i.e., GA-RRL, DDPG, RRL, DPG, and IntelliPortfolio), and five datasets (i.e., SSE180, DJIA, FTSE100, HSI, and Nikkei225) and conduct 125 = 5 × 5 × 5 groups of tests. Due to limited space, Table 6 shows the 25 groups of testing results on the SSE180 dataset and highlights the best values. The complete experimental results are available in our online repository.

We can see from Table 6 that our stock selection algorithm outperforms the other four algorithms across the five EIT algorithms and four performance metrics, obtaining 14 out of 20 best values in all. Note that the Volume algorithm outperforms the other algorithms on one metric; this is because it considers only the trading volume when choosing stocks, and stocks with a large trading volume are influential factors in index pricing. Although the Volume algorithm performs well on that metric, it underperforms on the other three. Another important observation from Table 6 is that our stock selection algorithm achieves more significant improvements over the other algorithms on the metric that reflects two of the other metrics jointly, which means that our algorithm performs better when those two metrics are considered simultaneously. In summary, the above results indicate that our selection algorithm not only performs better than the others but also generalizes well and is suitable for many EIT algorithms.
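The "best values" tally used above can be reproduced mechanically from a results table. The following is a minimal sketch under stated assumptions: the nested-dictionary layout, the function name `count_best`, the placeholder metric names `gain` and `risk`, and all numbers are hypothetical and do not come from Table 6. Each metric must be flagged as higher-is-better or lower-is-better.

```python
from collections import Counter


def count_best(results, higher_is_better):
    """Count, per selection method, how many (EIT algorithm, metric) cells it wins.

    results[eit][selector][metric] -> value
    higher_is_better[metric] -> True if larger values are better
    """
    wins = Counter()
    for eit, by_selector in results.items():
        for metric, maximize in higher_is_better.items():
            pick = max if maximize else min
            # On ties, the first selector in iteration order is credited.
            best = pick(by_selector, key=lambda s: by_selector[s][metric])
            wins[best] += 1
    return wins


# Hypothetical toy table: two EIT algorithms, two selectors, two metrics.
toy = {
    "EIT-A": {
        "Ours": {"gain": 0.12, "risk": 0.03},
        "Volume": {"gain": 0.08, "risk": 0.02},
    },
    "EIT-B": {
        "Ours": {"gain": 0.10, "risk": 0.02},
        "Volume": {"gain": 0.07, "risk": 0.04},
    },
}
wins = count_best(toy, {"gain": True, "risk": False})
```

With five EIT algorithms and four metrics, the same routine would tally 20 cells in total, matching the denominator quoted in the text.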

For a better illustration, we plot the four normalized performance metrics of the five stock selection algorithms on the SSE180 dataset in Figures 4(a)–4(e), respectively. In each figure, one axis lists the five algorithms and the other represents the measurements of the four performance metrics. We conclude from Figure 4 that our stock selection algorithm achieves stable improvements over the other four algorithms and that the improvements are more significant on two of the metrics.
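The text does not state how the metrics are normalized before plotting. One common choice, shown here purely as an assumption, is min-max scaling of each metric across the compared algorithms so that all four metrics share the range [0, 1]. The function name and the toy values are hypothetical.

```python
def min_max_normalize(values):
    """Scale a {algorithm: value} mapping linearly onto [0, 1].

    Min-max scaling is an assumed normalization, not one confirmed by the text.
    """
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # All algorithms tie on this metric; map every value to 0.0.
        return {name: 0.0 for name in values}
    return {name: (v - lo) / (hi - lo) for name, v in values.items()}


# Hypothetical raw values of one metric for three algorithms.
normalized = min_max_normalize({"A": 2.0, "B": 4.0, "C": 6.0})
```

Applying the function separately to each metric keeps metrics with very different scales comparable in a single chart.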

5.2.3. Overall Algorithm Performance Results

Finally, we compare our IntelliPortfolio algorithm with five state-of-the-art EIT algorithms on five international stock markets, and the results are listed in Table 7.

We can see from Table 7 that IntelliPortfolio outperforms the other five algorithms across the five datasets and four performance metrics, obtaining 16 out of 20 best values in all. Specifically, on the first of three metrics, our algorithm achieves an average performance improvement of 178.90% over DPG, 29.3% over RRL, 260.90% over DDPG, 98.01% over GA-RRL, and 89.16% over HGA; on the second, it achieves an average performance improvement of 665.97% over DPG, 364.57% over RRL, 136.47% over DDPG, 69.47% over GA-RRL, and 130.73% over HGA; on the third, it achieves an average performance improvement of 120.20% over DPG, 98.54% over RRL, 8.78% over DDPG, 86.57% over GA-RRL, and 37.71% over HGA. Note that although the DPG and DDPG algorithms outperform the other algorithms on the remaining metric, they still underperform on the other three. These results illustrate the significant and consistent performance superiority of the IntelliPortfolio algorithm in comparison with five state-of-the-art algorithms.
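The average improvement percentages quoted above are not accompanied by a formula. A natural reading, assumed here, is the relative improvement over a baseline on each dataset, averaged over the datasets. The sketch below follows that assumption for a metric where larger values are better; the function name, dataset labels, and numbers are hypothetical.

```python
def average_improvement(ours, baseline):
    """Mean relative improvement (in percent) of `ours` over `baseline`.

    Both arguments map dataset name -> metric value (higher is better).
    The per-dataset formula (ours - base) / |base| * 100 is an assumption.
    """
    gains = []
    for name, base in baseline.items():
        if base == 0:
            raise ValueError(f"baseline value for {name} is zero")
        gains.append((ours[name] - base) / abs(base) * 100.0)
    return sum(gains) / len(gains)


# Hypothetical values on two datasets: +200% on D1 and +50% on D2.
avg = average_improvement({"D1": 3.0, "D2": 1.5}, {"D1": 1.0, "D2": 1.0})
```

For a metric where smaller values are better, the sign of the numerator would be flipped so that a reduction counts as an improvement.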

Figure 5 illustrates the average performance measurements of the different algorithms. We observe from Figure 5 that IntelliPortfolio outperforms all five other algorithms, followed by GA-RRL, DDPG, DPG, RRL, and HGA. The difference between IntelliPortfolio and the other algorithms is significant, whereas the differences among the other algorithms are not. These results indicate that IntelliPortfolio is stable and robust across different datasets in comparison with the other algorithms.

6. Conclusion and Future Work

In this study, we propose a learning-based approach named IntelliPortfolio for the EIT problem. It applies principal component analysis (PCA) and the k-means clustering algorithm to automatically select constituent stocks for the portfolio from the benchmark index and uses a long short-term memory (LSTM) network to determine the investment weight of each constituent stock in the portfolio. We conducted extensive experiments on five real-world datasets from international stock markets. The superiority of IntelliPortfolio was illustrated with four performance metrics in comparison with five state-of-the-art EIT algorithms.

In future work, we plan to make IntelliPortfolio more reactive by learning previous patterns of the stock market and supporting automatic adjustments according to predictions of the future market. We will also explore practical reinforcement learning-based algorithms to make more intelligent and adaptive portfolio decisions.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare no potential conflicts of interest.