Kernel-Based Aggregating Learning System for Online Portfolio Optimization



Introduction
Portfolio optimization is a fundamental issue of computational finance which aims to invest wealth in a set of assets to meet some financial demands in the long run. There are two major schools of principles and theories for this problem: (i) Markowitz [1] introduces the mean-variance theory that illustrates the relationship between portfolio expected return and risk; (ii) Kelly [2] presents the Kelly investment criterion, which focuses on multiperiod portfolio selection and tends to maximize the expected log return. Due to the sequential nature of financial market data, it is suitable to solve online portfolio optimization (OLPO) problems following the latter framework. In recent years, much research effort from the machine-learning and artificial intelligence communities has gone into designing OLPO strategies through diverse prediction models and online learning algorithms (see [3][4][5][6][7][8][9][10][11][12][13] and references therein for more details).
In the finance industry, heuristic principles based on economic phenomena are often adopted. Trend representation is one of the main methods to make future price predictions following this principle. In the survey by Li and Hoi [14], there are three categories for trend representation: pattern-matching, trend-reversing, and trend-following. Pattern-matching tries to find historical patterns that are similar to the current pattern and uses the historical results to predict the asset price. Györfi et al. [15] identify the similarity set by comparing two market windows via Euclidean distance and then conduct nonparametric kernel-based sequential investment strategies. In addition, Györfi et al. [16] further discuss the nonparametric nearest neighbor system to search for historical patterns which are located in the l nearest neighbors of the current pattern. Trend-reversing and trend-following are frequently observed in financial markets, as shown in Figure 1. Most individual investors trade by analyzing these trends, forming large momentums that drive the asset price up or down. Trend-reversing assumes that poorly performing assets will perform well in the subsequent periods and vice versa. For example, Li et al. [8] present the online moving average reversion (OLMAR) strategy that takes the moving average in a recent time window as a prediction of the future asset price. Huang et al. [9] propose the robust median reversion (RMR) strategy which exploits the $L_1$-median of recent asset prices as a robust statistic against noise and outliers. However, studies of behavioral finance in [17][18][19] indicate irrational phenomena in financial markets, which contradict the efficient capital market model presented by Fama [20]. People tend to believe that well/poorly performing assets will keep rising/falling and thus further push the price up/down along the previous direction.
Hence, by following this trend pattern, it is possible to capture potential opportunities for excess returns. Agarwal et al. [21] take a Newton ascent step on the current portfolio to follow the price trend. Lai et al. [11] track the historical peak prices of assets in a recent time window and learn portfolios to catch the potential profit patterns. Besides, Lai et al. [13] propose a short-term sparse portfolio optimization system based on the alternating direction method of multipliers, which concentrates wealth on a small proportion of assets that have good increasing potential and proves that the augmented Lagrangian has a saddle point.
To the best of our knowledge, most OLPO strategies are built according to the trend-reversing principle and can be seen as defensive systems. Few aggressive OLPO systems that could catch up with the state-of-the-art defensive ones have been investigated. As an aggressive system, PPT by Lai et al. [11] has achieved better investing performance than OLMAR and RMR [8,9], but it lacks effective trend representation and does not take into account the downside risk, which may lead to poor prediction performance and large investment losses in some market environments. Besides, in financial practice, technical analysis is one common method to analyze the asset price, which can identify price patterns from historical data by exploiting technical indicators and then suggest future activities. Wang and Zheng [22] investigate the statistical stationarity of well-known technical indicators, including moving average, Bollinger bands, moving average convergence-divergence, and rate of change, and apply them in high-frequency trading. In modern portfolio theory, investors can also rebalance their portfolios by using various indicator information concerning the financial market, which can individuate the nature of the investment opportunity about to be faced. Moreover, prediction with a single model based on a certain selection criterion is often unstable and sensitive to noisy data and outliers, which are seldom considered by existing aggressive systems with effective trend representation. These potential problems will lead to estimation errors and thus nonoptimal portfolios. Huang et al. [23] explicitly estimate the next price relative by combining four types of different forecasting estimators and design an online portfolio selection strategy named combination forecasting reversion. Lin et al. [24] exploit the mean reversion principle from a metalearning perspective and formulate a boosting method for price relative prediction. Yang et al. [25] present an online portfolio strategy, named WAACS, which is proved to be a universal portfolio. It utilizes the available side information in markets and applies the weak aggregating algorithm to aggregate all the expert advice given by all the constant rebalanced portfolio strategies.
In this paper, we present a novel online learning system named kernel-based aggregating learning (KAL) for portfolio optimization to address the above drawbacks in two stages. Firstly, technical indicator information of historical prices, nonstationary nature of time series data, and weighted aggregation of multiple estimators are taken into comprehensive consideration to make the improved price relative prediction. Specifically, indicator information suggests the possible trend pattern of price fluctuations; autoregression integrated moving average model deals with the nonstationary nature of price time series; meanwhile, weighted aggregation improves the robustness of estimation and breaks the limitation of optimal parameters selected in hindsight.
Then, online convex optimization theory, in references [26][27][28], is applied to calculate coefficients of the proposed prediction model. Secondly, an enhanced learning system is built to optimize the portfolio by maximizing future wealth with a kernel-based increasing factor. We then develop a fast algorithm for the KAL objective to make it applicable to large-scale and time-limited situations. Experimental results show that KAL achieves better performance than other state-of-the-art systems in OLPO.
The remainder of this paper is organized as follows. In Section 2, we introduce the problem setting baseline and related works. In Section 3, we illustrate the whole KAL system in detail. In Section 4, extensive experiments on benchmark datasets from real financial markets with diverse assets and in different time spans are conducted. In Section 5, the concluding remarks are presented.

Problem Setting.
The problem setting in this paper is consistent with the standard one that has been used by many previous research studies [8][9][10][11][12][13][14][15][16]. Consider an investment task over a financial market with $d$ assets. On the $t$th period, the asset prices are represented by a close price vector $p_t \in \mathbb{R}^d_+$, $t = 0, 1, 2, \ldots$, where $\mathbb{R}^d_+$ denotes the $d$-dimensional nonnegative number space, and each element $p_t^{(i)}$ represents the close price of asset $i$. Moreover, there is another concept called price relative [29]:
$$x_t = \frac{p_t}{p_{t-1}}, \tag{1}$$
where a division between two vectors denotes an elementwise division in this paper. $x_t^{(i)}$ is the outcome of one unit wealth invested in the $i$th asset during the $t$th trading period. In fact, price relative is the main form of price information that an OLPO system exploits.
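As a small illustration of the price relative defined above, the following sketch computes the elementwise ratio of consecutive close price vectors. The function name `price_relatives` and the sample prices are hypothetical and chosen only for this example.

```python
from typing import List


def price_relatives(close_prices: List[List[float]]) -> List[List[float]]:
    """Return x_t = p_t / p_{t-1} (elementwise) for t = 1, ..., n.

    close_prices[t][i] is the close price of asset i at period t.
    """
    relatives = []
    for previous, current in zip(close_prices, close_prices[1:]):
        relatives.append([c / p for c, p in zip(current, previous)])
    return relatives


# Hypothetical close prices for d = 2 assets over periods t = 0, 1, 2.
closes = [[10.0, 20.0], [11.0, 19.0], [12.1, 19.0]]
x = price_relatives(closes)
```

A price relative above 1 means the asset gained value over the period, and a value below 1 means it lost value.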

Mathematical Problems in Engineering
At the beginning of the $t$th period, we diversify our capital among $d$ assets. To denote the proportion of the total wealth invested in each asset, we introduce a portfolio vector lying on the $d$-dimensional simplex:
$$b_t \in \Delta_d := \Big\{ b \in \mathbb{R}^d_+ : \sum_{i=1}^{d} b^{(i)} = 1 \Big\}. \tag{2}$$
In this paper, we assume that there is no short selling and no borrowing, and that all the wealth from the previous period is reinvested in the current period, which leads to the nonnegativity constraint and the equality constraint.
Since we adopt price relative, the portfolio wealth grows multiplicatively. The cumulative wealth (CW) at the end of the $t$th period is $S_t = S_{t-1} \cdot (b_t^T x_t)$. Without loss of generality, suppose the whole investment lasts $n$ periods with initial wealth $S_0 = 1$; then the evolution of $S_n$ is
$$S_n = S_0 \prod_{t=1}^{n} b_t^T x_t = \prod_{t=1}^{n} b_t^T x_t. \tag{3}$$
The OLPO problem can be formulated as a sequential decision task. The portfolio manager aims to design a strategy $\{b_t\}_{t=1}^{n}$ to maximize the cumulative wealth $S_n$:
$$\max_{\{b_t\}_{t=1}^{n}} S_n. \tag{4}$$
Only the historical information until the current period can be used to select the next portfolio vector $b_{t+1}$, and different portfolio optimizing strategies indicate different principles of how to use the historical information.
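The multiplicative wealth recursion described above can be sketched in a few lines. The helper name `cumulative_wealth` and the sample portfolios and price relatives are hypothetical.

```python
from typing import List, Sequence


def cumulative_wealth(
    portfolios: Sequence[Sequence[float]],
    relatives: Sequence[Sequence[float]],
    initial_wealth: float = 1.0,
) -> List[float]:
    """Return the wealth path S_1, ..., S_n with S_t = S_{t-1} * (b_t . x_t)."""
    wealth = initial_wealth
    path = []
    for b, x in zip(portfolios, relatives):
        wealth *= sum(bi * xi for bi, xi in zip(b, x))
        path.append(wealth)
    return path


# Hypothetical two-asset example with an equally weighted portfolio.
b_seq = [[0.5, 0.5], [0.5, 0.5]]
x_seq = [[1.2, 0.8], [0.9, 1.3]]
wealth_path = cumulative_wealth(b_seq, x_seq)
```

In this example the first period leaves the wealth unchanged (the gain and the loss cancel) and the second period grows it by 10%.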
In addition, we make several general assumptions in the above model as a supplement:
(i) Transaction cost: there are no transaction costs or taxes in this OLPO model.
(ii) Market liquidity: one can buy and sell required quantities at the last closing price of any given trading period.
(iii) Impact cost: market behavior is not affected by an OLPO strategy.
These assumptions are not trivial, and we will empirically analyze the effects of transaction costs in Section 4.

Related Works.
In this section, some related works are introduced specifically and their performance will be compared with our KAL in Section 4.
Uniformly buy-and-hold (UBAH) is a simple and commonly used baseline. The portfolio manager allocates the capital equally among $d$ assets at the beginning and does not rebalance in subsequent periods. It is usually adopted as the Market strategy to produce the market index. Another common benchmark is the Beststock (BS) strategy, a special buy-and-hold strategy that invests all the wealth in the best asset in hindsight.
Borodin et al. [30] present the anticorrelation (Anticor) algorithm that calculates a cross-correlation matrix between two specific market windows and then transfers weights from the previous winning assets to the current losing assets. Li et al. [31] propose the correlation-driven nonparametric learning (CORN) approach that identifies the linear similarity between two market windows via correlation, which also adopts the idea of pattern-matching. Anticor and CORN each exploit correlation information in their own way, and both follow the mean reversion principle.
Online moving average reversion (OLMAR) and robust median reversion (RMR) are two state-of-the-art defensive strategies based on the mean reversion principle. Li et al. [8] assume that the asset price in the next period will revert to its moving average (MA) and take the MA as a reference of the asset price trend. There are two types of MA. One is the so-called simple moving average (SMA), which truncates the historical prices via a time window and calculates their arithmetic average. The other is the exponential moving average (EMA), which adopts all historical prices, each exponentially weighted:
$$\mathrm{SMA}_t(w) = \frac{1}{w} \sum_{k=0}^{w-1} p_{t-k}, \tag{5}$$
$$\mathrm{EMA}_t(\alpha) = \alpha p_t + (1 - \alpha)\,\mathrm{EMA}_{t-1}(\alpha), \tag{6}$$
where $w$ is the window size and $\alpha \in (0, 1)$ is the decaying factor.
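The two moving averages described above can be sketched for a single asset as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, and initializing the EMA with the first observed price is an assumption.

```python
from typing import Sequence


def sma(prices: Sequence[float], w: int) -> float:
    """Arithmetic mean of the last w prices (fewer if the history is shorter)."""
    window = prices[-w:]
    return sum(window) / len(window)


def ema(prices: Sequence[float], alpha: float) -> float:
    """Exponentially weighted average over the whole price history.

    Assumption: the recursion starts from the first price, EMA_1 = p_1.
    """
    value = prices[0]
    for price in prices[1:]:
        value = alpha * price + (1.0 - alpha) * value
    return value


# Hypothetical close prices of one asset.
history = [10.0, 11.0, 12.0, 13.0, 14.0]
sma_value = sma(history, w=3)
ema_value = ema(history, alpha=0.5)
```

The SMA only looks at the last `w` prices, whereas the EMA keeps a decaying memory of every past price.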

Huang et al. [9] exploit the L 1 -median of the recent prices as a robust prediction, which is less sensitive to outliers and noise than OLMAR.
Lai et al. [11] explore an aggressive strategy based on the trend-following principle that could catch up with OLMAR and RMR, named the peak price tracking (PPT) strategy. It extracts the increasing power of the assets by using the peak price in a fixed time window as a prediction to get potential growth opportunities.

Motivation.
Empirical studies show that real-world financial markets are not always efficient. They often overreact to all kinds of information and create potential opportunities for capturing excess profits. More aggressive OLPO systems with promising performance should be further investigated. PPT, which estimates the next price via the peak price, has achieved good results on most datasets, but it may lead to poor prediction performance and large investment losses when the market environment is accompanied by downside risk. Besides, existing OLPO strategies often lack an explicit price trend representation that can effectively recognize trend patterns. Moreover, single-model prediction suffers from noise and outliers in the data and ignores the temporal heterogeneity of historical data, both of which could reduce the accuracy and robustness of estimators. To fill in these gaps, in this paper we propose the KAL system, which makes an improved price relative prediction by exploiting indicator information and an aggregating learning method and optimizes the portfolio by an enhanced tracking system with a kernel-based increasing factor.

Price Relative Prediction with Component Estimator.
In financial markets, technical analysis is a popular way to analyze the asset price. It exploits technical indicators to identify price patterns and guide the investment behavior to make profits. In this paper, we consider particular situations of different assets and adopt the simple moving average of prices to follow the trend pattern of each asset. At first, according to whether the close price exceeds its simple moving average in (5) at the end of the $t$th period, we assume the indicator information has two states as follows:
where $y_{t+1}^{(i)}$ represents the indicator information status of asset $i$ at the beginning of the $(t+1)$th period, $p_t^{(i)}$ is the close price of asset $i$ at the end of the $t$th period, and $\mathrm{SMA}_t^{(i)}(w)$ is its moving average in the recent $w$ periods. From the perspective of technical analysis, the future price of asset $i$ in the short term has a high probability to rise when $y_{t+1}^{(i)} = 1$ and to fall when $y_{t+1}^{(i)} = 2$.
Time series data in the financial field are usually not realizations of a stationary process, and some of them may contain deterministic trends. Autoregression integrated moving average (ARIMA) is one effective linear model for time series prediction; it has good statistical properties and structural flexibility and can deal with the nonstationary characteristics of price sequences well.
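The two-state indicator described above can be illustrated with a short sketch. The function name `indicator_states` is hypothetical, and treating a close price exactly equal to its moving average as state 1 is an assumption made only for this example.

```python
from typing import List, Sequence


def indicator_states(prices_by_asset: Sequence[Sequence[float]], w: int) -> List[int]:
    """Return 1 (upward trend) or 2 (downward trend) for each asset.

    prices_by_asset[i] holds the close prices of asset i up to period t.
    Assumption: a close price equal to its moving average counts as state 1.
    """
    states = []
    for series in prices_by_asset:
        window = series[-w:]
        moving_average = sum(window) / len(window)
        states.append(1 if series[-1] >= moving_average else 2)
    return states


# Hypothetical histories: the first asset is rising, the second is falling.
y_next = indicator_states([[10.0, 11.0, 12.0], [12.0, 11.0, 10.0]], w=3)
```

Here the rising asset closes above its three-period average and is flagged as state 1, while the falling asset closes below it and is flagged as state 2.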
We denote by $\nabla^h p_t$ the $h$th-order difference of $p_t$, and $\epsilon_t$ denotes the zero-mean random noise term at time $t$. The price sequence $p_t$ satisfying the ARIMA($k$, $h$, $q$) model is formulated as follows:
which is parameterized by three terms $k$, $h$, and $q$ and by the weight vectors $\alpha \in \mathbb{R}^k$ of the autoregression (AR) part and $\beta \in \mathbb{R}^q$ of the moving average (MA) part. The original price prediction could be approximated with the AR($k + m$, $d$) model as follows:
where $m \in \mathbb{N}$ is a properly chosen constant and $\gamma$ is the coefficient vector to be solved, which belongs to a set $P$.
At period $t$, we first make a price prediction $\hat{p}_t$, after which the real price $p_t$ is revealed, and then we suffer a loss denoted by $\ell_t(p_t, \hat{p}_t(\gamma))$. Our goal is to minimize the cumulative loss over a predefined number of iterations $T$. The regret after $T$ rounds is defined as follows:
We wish to obtain an efficient algorithm that guarantees that this regret grows sublinearly in $T$, implying that the per-round regret will vanish as $T$ increases. Now we present one specific online convex optimization algorithm by applying the Online Newton Step method in [27] to solve the parameter vector $\gamma \in \mathbb{R}^{k+m}$ in the model above. Algorithm 1 iteratively optimizes the coefficient vector $\gamma$ in an online manner.
It has been proved that this iterative procedure guarantees a proper upper bound of the regret $R_T$ in prediction, as shown in Theorem 1. The proofs can be found in [27], and we omit the details here.
where $D$ is the diameter of $P$, $G$ is the upper bound of $\|\nabla \ell_t(\gamma)\|$ for all $t$, and $\gamma \in P$. The loss functions $\ell_t(\gamma)$ are assumed to be $\alpha$-exp-concave in $\gamma$. Then, the online sequence $\{\gamma_t\}_{t=1}^{T}$ generated by Algorithm 1 guarantees $R_T \le O(\log T)$.
PPT [11] has proposed a future price prediction named peak price, which is the maximum price of the asset over the most recent periods. Following the idea of PPT, we define the nadir price, which is the minimum price of the asset in this time window. Peak prices and nadir prices of different assets are gathered as vectors $\hat{p}_{1,t+1}$ and $\hat{p}_{2,t+1}$:
We propose a novel and improved future price prediction with the indicator information $y$ mentioned before. If $y_{t+1}^{(i)} = 1$, it indicates that asset $i$ would be in an upward trend, and if $y_{t+1}^{(i)} = 2$, it indicates that asset $i$ would be in a downward trend. Irrational phenomena in financial markets show that prices of poorly performing assets will keep going down; in this situation $\hat{p}_{2,t+1}$ can achieve better prediction performance than PPT and OLMAR. Since no short selling is allowed, investors can only make profits when their asset prices increase. The peak price $\hat{p}_{1,t+1}$ can extract the increasing power of different assets. It is essential to consider the price trends as well as the increasing power of different assets; hence, we combine $\hat{p}_{t+1}$, $\hat{p}_{1,t+1}$, and $\hat{p}_{2,t+1}$ to design the resulting price prediction as follows:
Then, we produce the resulting price relative prediction with the component estimator:
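To make the peak and nadir prices described in this subsection concrete, the following sketch extracts both from a recent window for each asset. The function name `peak_and_nadir` and the sample data are hypothetical, and the sketch deliberately stops short of the combined prediction, whose exact form is not reproduced here.

```python
from typing import List, Sequence, Tuple


def peak_and_nadir(
    prices_by_asset: Sequence[Sequence[float]], v: int
) -> Tuple[List[float], List[float]]:
    """Return (peak prices, nadir prices) over the last v periods of each asset."""
    peaks = [max(series[-v:]) for series in prices_by_asset]
    nadirs = [min(series[-v:]) for series in prices_by_asset]
    return peaks, nadirs


# Hypothetical close prices of two assets; only the last v = 5 periods are used.
histories = [
    [10.0, 12.0, 11.0, 9.0, 13.0, 12.0],
    [5.0, 4.0, 6.0, 3.0, 2.0, 4.0],
]
peaks, nadirs = peak_and_nadir(histories, v=5)
```

For the first asset the oldest price (10.0) falls outside the window, so its peak and nadir are taken from the five most recent prices only.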

Price Relative Prediction with Aggregating Estimator.
As we can see, the value of $y_{t+1}$ in (7) depends on the window size of the simple moving average; thus, the price prediction $\hat{p}_{t+1}$ in (12) changes accordingly and sensitively. Meanwhile, the optimal parameter $w$ can only be chosen in hindsight. Now we consider an aggregating approach to combine a set of experts, where each expert estimates the price relative in the next period following the scheme in Section 3.2.1 with a different parameter. The experts are generated by sampling the parameter $w$ uniformly from the range $U(w_{\min}, w_{\max})$, and then we present a weighted aggregation of these experts as the final price relative prediction:
where $e = (w_{\max} - w_{\min} + 1)$ is the total number of experts, $\theta^j$ is the weight of the $j$th expert, $\theta$ belongs to the decision set $K$, and $\hat{x}^j_{t+1}$ is the predictive value of the $j$th expert on the $(t+1)$th period by (13). The remaining issue is how to compute the weights assigned to each expert, and now we present another online convex optimization algorithm by applying the Online Gradient Descent method in Algorithm 2 to calculate $\theta \in K$ in each iteration.
As shown in Theorem 2, the regret of the aggregating estimator can also be bounded (see [27] for detailed proofs).

Theorem 2.
Assume that the loss function $\ell_t(\theta)$ is $H$-strongly convex in $\theta$. Let $e \ge 1$, and set $\eta = 1/(Ht)$, where $H$ is the lower bound of $\nabla^2 \ell_t(\theta)$ for all $t$ and $\theta \in K$. Then, the online sequence $\{\theta_t\}_{t=1}^{T}$ generated by Algorithm 2 guarantees:
The whole indicator information-based price relative prediction scheme with the aggregating estimator can be interpreted as shown in Figure 2.

Kernel-Based Increasing Factor.
After the future price prediction, the second step is to optimize our portfolio according to a certain criterion. Similar to the criteria adopted in [10,11], a tracking system is also constructed in this paper. It invests more wealth in potentially good performing assets and less wealth in potentially bad performing ones. We first establish the following KAL objective:
where $\|\cdot\|$ denotes the Euclidean norm. The maximization of $b^T \hat{x}_{t+1}$ is adopted to track $\hat{x}_{t+1}$ with $b$. The constraints on the right of (16) control the deviation from the last portfolio $b_t$ and ensure the feasibility of $b$.

ALGORITHM 1
Input: Given parameters $h$, $k$, $m$, learning rate $\eta$, initial matrix $A_0$, and initial vector $\gamma_0$.
(1) for $t = 1$ to $T$ do
(2) Calculate price prediction $\hat{p}_t$ by (9);
(3) Receive $p_t$ and incur loss $\ell_t(\gamma_t)$;
Instead of the pure increasing factor $b^T \hat{x}_{t+1}$, we propose a generalized increasing factor as follows:
$$(b - b_t)^T D_{t+1} (\hat{x}_{t+1} - \bar{x}_{t+1}\mathbf{1}), \tag{17}$$
where $D_{t+1}$ is a positive definite symmetric matrix that rescales the relative influence of different assets in the increasing factor and $\bar{x}_{t+1}$ is the average price relative prediction of all assets. $\hat{x}_{t+1} - \bar{x}_{t+1}\mathbf{1}$ can be seen as a normalized price relative prediction; after this normalization, some assets have positive signs while others have negative signs, suggesting an increase or decrease in investing proportion. The generalized increasing factor in (17) can be seen as an inner product; to maximize this inner product, $b - b_t$ should track $D_{t+1}(\hat{x}_{t+1} - \bar{x}_{t+1}\mathbf{1})$.
As for the setting of $D_{t+1}$, there are many principles to achieve different financial targets. In this paper, we first define $K(u, v) \in \mathbb{R}^{d \times d}$ as a kernel matrix for two vectors $u, v \in \mathbb{R}^d$:
It is a positive definite diagonal kernel satisfying Mercer's theorem and measures the similarity between $u$ and $v$. If $u_i$ is close to $v_i$, then $K_{ij} \approx 1$. If $u_i$ is far away from $v_i$, then $K_{ij} \approx 0$. From the perspective of technical analysis, the distance between the asset price and its mean value implies the strength of trend momentums. The larger the distance is, the greater the corresponding strength will be. For example, if at the end of the $t$th period the asset price $p_t^{(i)}$ falls heavily below its moving average, the difference between them is great and the asset price in subsequent periods is more likely to continue to fall. Naturally, we hope specific assets having this great power can produce more optimization influence; thus, the corresponding elements of $D_{t+1}$ would be set larger. Now we present the following form of $D_{t+1}$:
where $\varphi_t^{\mathrm{sma}}$ indicates the ensemble moving average of the price relative. The kernel-based increasing factor cannot be arbitrarily large; thus, a constraint generalized from that of (16) is added to the optimization, leading to the whole KAL portfolio optimization:
where $(y)_+ = \max(0, y)$ denotes the positive part of $y$. The constraint can be seen as a generalized Mahalanobis distance between $b$ and $b_t$ with the square adjustment matrix $K^2_{t+1}$, such that the feasible set of $b$ is an ellipsoid centered at $b_t$. $\epsilon$ can be seen as an expected profiting level. If $b_t^T \hat{x}_{t+1} > \epsilon$, the potential wealth exceeds the expected level, and the portfolio remains unchanged.

ALGORITHM 2
Input: Given the parameter $e$, learning rate $\eta$, and initial vector $\theta_0$.
(1) for $s = 1$ to $t$ do
(2) Calculate the final price prediction $\sum_{j=1}^{e} \theta_s^j \hat{p}_s^j$;
(3) Receive $p_s$ and incur loss $\ell_s(\theta_s)$;
(4) Let the gradient $\nabla_s = \nabla \ell_s(\theta_s)$, update the weight vector $\theta_{s+1} = \Pi_K(\theta_s - (1/\eta)\nabla_s)$.

Figure 2: The whole two-step price relative prediction scheme of KAL. By comparing the close price $p_t$ with its moving average in a fixed time window, the indicator information $y_{t+1}$ is revealed. Then, peak or nadir prices of different assets are picked up. By combining them with online outputs of the ARIMA model, the future price prediction $\hat{p}_{t+1}$ with the component estimator is produced. At last, by aggregating multiple component estimators, the final predicted value $\hat{x}_{t+1}$ is generated.

Algorithm to Solve KAL.
To solve the KAL objective, in this section, we design a fast algorithm based on the gradient projection principle. It consists of simple and explicit matrix calculations, which are applicable to large-scale and time-limited situations.
At first, we relax the simplex constraint $b_t \in \Delta_d$ in (20) and search for and optimize $b_{t+1}$. The gradient of the objective function in (20) is $K^{-1}_{t+1}(\hat{x}_{t+1} - \bar{x}_{t+1}\mathbf{1})$; thus, the gradient ascent step is
Substituting (21) into the constraint in (20) yields:

and then we obtain
If $\|\hat{x}_{t+1} - \bar{x}_{t+1}\mathbf{1}\| = 0$, there is no need to update this portfolio; hence, $\lambda_{t+1} = 0$. Otherwise, $\lambda_{t+1}$ can be chosen in the interval of (23). To exploit the full strength of gradient ascent, $\lambda_{t+1}$ is set as
To ensure that the resulting portfolio is nonnegative, we finally project the above portfolio onto the simplex domain by the algorithm in [32]:
The whole KAL system can be summarized as Algorithm 3 and illustrated by Figure 3.
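The final projection step can be illustrated with a common sort-based Euclidean projection onto the probability simplex. This is a sketch under the assumption that such a projection is what the final step requires; whether it coincides exactly with the algorithm of [32] is not confirmed here, and the function name is hypothetical.

```python
from typing import List, Sequence


def project_to_simplex(v: Sequence[float]) -> List[float]:
    """Euclidean projection of v onto {b : b >= 0, sum(b) = 1}.

    Sort-based method: find the threshold theta such that
    sum(max(v_i - theta, 0)) = 1, then clip at zero.
    """
    sorted_desc = sorted(v, reverse=True)
    running_sum = 0.0
    theta = 0.0
    for j, value in enumerate(sorted_desc, start=1):
        running_sum += value
        candidate = (running_sum - 1.0) / j
        # Keep the threshold from the largest j whose entry stays positive.
        if value - candidate > 0.0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


# Hypothetical unconstrained update that left the simplex.
b_raw = [1.2, 0.3, -0.1]
b_next = project_to_simplex(b_raw)
```

The result is nonnegative and sums to one, so it is a valid portfolio under the no-short-selling and full-reinvestment assumptions of the problem setting.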

Experimental Results
In this section, we use the cumulative wealth and other performance criteria to measure the performance of the proposed KAL system and evaluate its effectiveness by comparing with seven existing strategies on several real-world datasets.
Detailed information on the datasets is shown in Table 1. NYSE(O) and NYSE(N) are two different datasets from the New York Stock Exchange (NYSE) with different stocks and in different time spans. NYSE(O) is the well-known NYSE dataset pioneered by Cover [29], and it contains 5651 daily price relatives of 36 stocks in NYSE for a 22-year period from July 3, 1962, to December 31, 1984. NYSE(N) is the extended version of NYSE(O) and is collected by Li et al. [33]. For consistency, this dataset is from January 1, 1985, to June 3, 2010, which consists of 6431 trading days of 23 stocks and covers the global financial crisis in 2008. DJIA is collected by Borodin et al. [30], which consists of 30 stocks from the Dow Jones Industrial Average containing price relatives of 507 trading days, ranging from January 1, 2001, to January 1, 2003. SP500 and TSE are collected from constituent stocks of the Standard & Poor's 500 and the Toronto Stock Exchange, respectively. Interested readers can check [12](http://OLPO. stevenhoi.org) or the original papers for the first five datasets. The dataset HS300 is collected by Lai et al. [12], which contains 44 stocks of certain CSI300 constituents from China in a recent time span. It supplements the database of this research area since the earlier datasets are mainly from North America. As we can see, the datasets mentioned above cover long trading periods from 1962 to 2017 and diversified markets, which enables us to examine how the proposed KAL system performs under different events and crises such as the dot-com bubble from 1995 to 2001 and the subprime mortgage crisis from 2007 to 2009. We take five representative state-of-the-art portfolio selection systems (CORN, Anticor, OLMAR, RMR, and PPT) and two trivial ones (Market and Beststock) to make comparisons with KAL. Due to the diverse principles introduced in Section 2.2, the five state-of-the-art systems will show advantages in different parts of the experiments.
The parameters for these systems are set by their defaults and according to previous experiments [8,9,11,30,31]. CORN: $w = 5$, $P = 1$, $\rho = 0.1$; Anticor: $w = 5$; OLMAR: $\alpha = 0.5$, $\epsilon = 10$; RMR: $w = 5$, $\epsilon = 5$; and PPT: $w = 5$, $\epsilon = 100$. Following similar methods in the related works [8-12, 16, 23, 24, 33], the parameters of our KAL system are empirically set as follows: $h = 1$, $k + m = 7$, $v = 5$, $w_{\max} = 30$, $w_{\min} = 2$, and $\epsilon = 100$ for all datasets. The parameters of the ARIMA model are chosen as $h = 1$ and $k + m = 7$, which are consistent with previous research studies [23,34]. The time window size of $v = 5$ is commonly used in stock markets, and the price information in such a time window reflects the recent financial environment. To choose the sampling range of experts $U(w_{\min}, w_{\max})$ and the expected profiting level $\epsilon$, we will conduct experiments in Section 4.4 to further evaluate how different choices of these parameters affect the performance metrics.

Performance Metrics.
Performance is evaluated on several common metrics: cumulative wealth (CW), mean excess return (MER), Sharpe ratio (SR), and information ratio (IR). CW is the core metric to evaluate investing performance. MER measures how much better a system is than the market on average. SR and IR are two kinds of risk-adjusted return metrics that trade off between risk and return. Table 2 shows the final cumulative wealth achieved by various systems on the six benchmark datasets without considering transaction costs.

Cumulative Wealth.
As we can see, the proposed KAL outperforms other state-of-the-art systems on five datasets and ranks second on TSE. For instance, KAL achieves much higher CWs (7.35E+18, 3.42E+9, 21.45) than PPT (1.31E+18, 2.89E+9, 11.76) on NYSE(O), NYSE(N), and SP500, respectively. Only KAL (1.41) and RMR (1.35) among the nontrivial systems perform better than the Market (1.34) on HS300, and KAL achieves 30% higher CW than PPT. This indicates that KAL is an effective system following the aggressive principle and accumulates more wealth by considering the indicator information. To see how the KAL system works during the entire investment, we plot the CWs of different systems on NYSE(O) and DJIA in Figure 4. The curves of KAL are above those of other systems in most periods, suggesting that it achieves effective investing performance in the long run.

(1) Calculate $\mathrm{SMA}_t$ by (5) and the indicator information $y_{t+1}$ by (7).
Output: The next portfolio $b_{t+1}$.
ALGORITHM 3: Solving KAL system.

In finance, return is the proportion of wealth that an investor has gained or lost in one period. The daily return of the $t$th period is $r_t = b_t^T x_t - 1$. MER is the long-term average return by which a portfolio selection system exceeds the Market benchmark:
$$\mathrm{MER} = \frac{1}{n} \sum_{t=1}^{n} (r_{s,t} - r_{m,t}),$$
where $r_{s,t}$ and $r_{m,t}$ denote the returns of a portfolio selection system and the Market strategy, respectively. At the same time, we take the $t$ statistic as a reference to see whether the return of a system is significantly higher than the Market benchmark. According to the capital asset pricing model, the expected return can be decomposed into the market component and the inherent excess return. So, the following linear regression model can be established:
where $\alpha_s$ is the $\alpha$-factor representing the active return, $\beta_s$ is the $\beta$-factor representing the volatility from the market, and $e_t$ is the error term. By using the ordinary least squares method, the coefficients $\alpha_s$ and $\beta_s$ can be estimated with $n$ sample pairs of $r_{s,t}$ and $r_{m,t}$. We also conduct a right-tailed t-test to test whether $\alpha_s$ is significantly higher than 0 and show that the excess return is not due to luck. The MERs and the corresponding p values of t-tests for different systems on the six datasets are shown in Table 3. KAL achieves higher MERs than other state-of-the-art systems on four datasets and ranks second on TSE. For example, KAL achieves MER = 0.0025 and 0.0003 on SP500 and HS300, compared with OLMAR (0.0020 and −0.0003), RMR (0.0019 and −4.5E−5), and PPT (0.0022 and −0.0005), respectively. As we can see, KAL has high inherent excess returns, and it is the only state-of-the-art system that achieves a positive MER on HS300. Moreover, PPT obtains significantly better performance than the Market at a confidence level of 99% on five datasets. These results suggest that KAL is an effective and aggressive system that can capture significant excess returns in the long run.
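The MER and the least-squares estimates of the $\alpha$- and $\beta$-factors discussed above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the regression is written as $r_{s,t} = \alpha_s + \beta_s r_{m,t} + e_t$, which assumes a zero risk-free return and may differ from the exact regression form used in the paper. The t-test itself is not implemented here.

```python
from typing import Sequence, Tuple


def mean_excess_return(r_system: Sequence[float], r_market: Sequence[float]) -> float:
    """Average of the per-period return differences r_{s,t} - r_{m,t}."""
    return sum(s - m for s, m in zip(r_system, r_market)) / len(r_system)


def ols_alpha_beta(r_system: Sequence[float], r_market: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of r_s = alpha + beta * r_m + e.

    Assumption: the risk-free return is zero, so raw returns are regressed.
    """
    n = len(r_system)
    mean_s = sum(r_system) / n
    mean_m = sum(r_market) / n
    covariance = sum((m - mean_m) * (s - mean_s) for s, m in zip(r_system, r_market))
    variance = sum((m - mean_m) ** 2 for m in r_market)
    if variance == 0.0:
        raise ValueError("market returns must not be constant")
    beta = covariance / variance
    alpha = mean_s - beta * mean_m
    return alpha, beta


# Hypothetical daily returns: the system doubles the market move plus 0.1% per day.
r_m = [0.01, -0.02, 0.03, 0.0]
r_s = [0.001 + 2.0 * m for m in r_m]
mer = mean_excess_return(r_s, r_m)
alpha_s, beta_s = ols_alpha_beta(r_s, r_m)
```

In this synthetic example the fit recovers an $\alpha$-factor of 0.001 and a $\beta$-factor of 2, which is how the regression separates the active return from the market-driven part.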

Sharpe Ratio and Information Ratio.
A rational investor wants not only an excess return above that of the risk-free asset but also a balance between return and risk. The Sharpe ratio (SR) is a traditional measurement of risk-adjusted return, calculated as $\mathrm{SR} = (E(r_s) - r_f)/\sigma(r_s)$, where $E(r_s)$ and $\sigma(r_s)$ are the expectation and the standard deviation of $r_s$, respectively. They can be estimated from the daily samples $r_{s,t}$ of $n$ periods. $r_f$ is the return of a risk-free asset in the financial market (e.g., bank deposits, bonds, and currency funds). In this paper, all the wealth is invested in risk assets, so $r_f$ is set to 0. The information ratio (IR) is also a risk metric, but it directly measures the risk-adjusted excess return of a system compared with the Market benchmark, calculated as $\mathrm{IR} = E(r_s - r_m)/\sigma(r_s - r_m)$.
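The two ratios defined above can be estimated from daily return samples as in the following sketch. The function names and sample returns are hypothetical, and using the sample standard deviation (with $n - 1$ in the denominator) is an assumption, since the estimator is not specified in the text.

```python
import statistics
from typing import Sequence


def sharpe_ratio(r_system: Sequence[float], r_free: float = 0.0) -> float:
    """(mean return - risk-free return) / standard deviation of returns."""
    return (statistics.mean(r_system) - r_free) / statistics.stdev(r_system)


def information_ratio(r_system: Sequence[float], r_market: Sequence[float]) -> float:
    """Mean of the excess returns over the market divided by their deviation."""
    excess = [s - m for s, m in zip(r_system, r_market)]
    return statistics.mean(excess) / statistics.stdev(excess)


# Hypothetical daily returns of a system and of the Market benchmark.
r_s = [0.02, -0.01, 0.03, 0.0]
r_m = [0.01, 0.0, 0.01, 0.0]
sr = sharpe_ratio(r_s)
ir = information_ratio(r_s, r_m)
```

A higher value of either ratio means more return per unit of risk; the IR measures that risk relative to the benchmark rather than in absolute terms.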
The SRs of different systems are given in Table 4. These results show that KAL has a good ability in balancing between return and risk; hence, it is robust and reliable for investments.

Transaction Costs.
Transaction cost is an important and unavoidable issue in portfolio selection. We conduct experiments on cumulative wealth under the proportional transaction cost model and vary the transaction cost ratio from 0 to 0.5%. The CWs of different strategies on the six benchmark datasets are shown in Figure 5. As we can see, when the transaction costs increase, the CWs achieved by all strategies drop considerably. Nevertheless, KAL still outperforms PPT on all the datasets and outperforms OLMAR on most datasets, with TSE as an exception. Notice that the real transaction cost ratio is usually below 0.5%; therefore, KAL is effective and practically applicable.
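The text does not spell out the proportional transaction cost model. The Python sketch below assumes the form commonly used in the online portfolio selection literature, in which rebalancing from the price-drifted holdings $\hat{\mathbf{b}}_{t-1}$ to the new portfolio $\mathbf{b}_t$ costs a fraction $(\gamma/2)\sum_i |b_{t,i} - \hat{b}_{t-1,i}|$ of the current wealth. The function name, the argument layout, and the choice to start from zero holdings are all assumptions made for illustration.

```python
def cumulative_wealth_with_costs(portfolios, price_relatives, gamma, s0=1.0):
    """Cumulative wealth under an assumed proportional transaction cost model.

    portfolios[t]      : weights b_t for period t (nonnegative, summing to 1)
    price_relatives[t] : price relative vector x_t for period t
    gamma              : transaction cost ratio, e.g. 0.005 for 0.5%
    """
    wealth = s0
    # Assumption: no holdings before the first period.
    held = [0.0] * len(portfolios[0])
    for b, x in zip(portfolios, price_relatives):
        # Total weight that must be traded to move from held to b.
        turnover = sum(abs(bi - hi) for bi, hi in zip(b, held))
        gross = sum(bi * xi for bi, xi in zip(b, x))
        wealth *= gross * (1.0 - 0.5 * gamma * turnover)
        # Weights drift with prices before the next rebalancing.
        held = [bi * xi / gross for bi, xi in zip(b, x)]
    return wealth
```

With `gamma=0` the function reduces to the ordinary cumulative wealth, so sweeping `gamma` from 0 to 0.005 produces the kind of curve summarized in Figure 5 for any fixed sequence of portfolios.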

Parameter Sensitivity.
Notice that our KAL system has several key parameters: the sampling range for the aggregating estimator, $U(w_{\min}, w_{\max})$, and the expected profiting level for portfolio optimization, $\epsilon$. We now conduct experiments on all datasets to evaluate how different choices of these parameters affect the CWs. First, we fix $w_{\min} = 2$ and $w_{\max} = 30$ and let $\epsilon$ vary from 30 to 120. Results are shown in Figure 6. The CWs remain nearly unchanged as $\epsilon$ varies, which indicates that KAL is stable with respect to changes in $\epsilon$. Hence, $\epsilon = 100$ is empirically used for KAL in the experiments. Next, we fix $\epsilon = 100$ and $w_{\min} = 2$ and let $w_{\max}$ vary from 5 to 50. Results are plotted in Figure 7, which shows that the performance of KAL is good on all datasets when $w_{\max}$ is around 30. We empirically set $w_{\max} = 30$ as a conventional value.

Conclusion
In this paper, we consider the online portfolio optimization problem from the perspective of aggregating learning and trend-following. So far, few aggressive systems that follow this trend pattern and can match most existing state-of-the-art defensive systems have been investigated in depth. Most previous works lack an effective trend representation and suffer large investment losses when the market environment carries downside risk. Meanwhile, they are sensitive to noise and outliers in the data and are limited by optimal parameters chosen in hindsight.
KAL addresses these issues in two stages. First, technical indicator information is extracted from historical data, and an aggregating price relative prediction based on this additional information is proposed, which applies the online convex optimization method to calculate the model coefficients. Then, an online learning system is presented to track the increasing power of different assets by maximizing a kernel-based increasing factor. In this way, the better performing assets receive more investment, while the others receive less. A fast algorithm is also developed for KAL, which is applicable to large-scale and time-limited environments.
Extensive experiments on real-world markets show that the KAL system achieves promising performance. It achieves the highest CWs and the highest significant excess returns on most benchmark datasets, outperforming other state-of-the-art systems. It also has robust performance with high SRs and IRs, which are comparable with those of other state-of-the-art systems. In summary, KAL is an effective and efficient OLPO system.

For a further study of the KAL system, it is useful to discuss the overfitting problem. Following the theorem in [35], the Minimum Backtest Length (MinBTL, in years) is the backtest length needed to avoid selecting a strategy with a given in-sample SR among N trials when the expected out-of-sample SR is zero. According to the experimental results in Section 4.2.3 and Section 4.4, we can roughly calculate an approximate upper bound on the MinBTL of KAL on the six benchmark datasets. After comparing this upper bound with the actual backtest length, we find that KAL avoids overfitting from this perspective on most datasets. Because MinBTL is merely a necessary, not sufficient, condition to avoid overfitting, this issue deserves further investigation in our future work.
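The upper bound mentioned above can be illustrated with a short Python sketch. The theorem in [35] is not reproduced in this paper, so the sketch assumes the bound usually quoted from the backtest overfitting literature, $\mathrm{MinBTL} < 2\ln(N)/\mathrm{SR}^2$, where SR is the annualized in-sample Sharpe ratio of the selected configuration and N is the number of trials. The function name and the input values are hypothetical and are not taken from the experiments.

```python
import math


def min_btl_upper_bound(n_trials, annualized_sr):
    """Assumed upper bound on MinBTL (in years): 2 * ln(N) / SR^2.

    n_trials      : number of strategy configurations tried (N >= 1)
    annualized_sr : annualized in-sample Sharpe ratio (> 0)
    """
    if n_trials < 1 or annualized_sr <= 0:
        raise ValueError("need n_trials >= 1 and annualized_sr > 0")
    return 2.0 * math.log(n_trials) / annualized_sr ** 2


# Hypothetical inputs: 45 trials and an annualized SR of 1.
bound_years = min_btl_upper_bound(45, 1.0)
```

If the available backtest is longer than `bound_years`, the necessary condition discussed above is met for that dataset. A shorter backtest signals a risk of having selected an overfitted configuration.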

Data Availability
The MATLAB data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.