Application of integrated data mining techniques in stock market forecasting

Abstract The stock market is often considered too uncertain to be predictable. Many individuals have developed methodologies or models to increase the probability of making a profit on their stock investments. The overall hit rates of these methodologies and models are generally too low to be practical for real-world application. One of the major reasons is the large fluctuation of the market. Therefore, current research in stock forecasting focuses on improving the accuracy of stock trading forecasts. This paper introduces a system that addresses this particular need. The system integrates various data mining techniques and supports decision-making for stock trades. The proposed system embeds the top-down trading theory, artificial neural network theory, technical analysis, dynamic time series theory, and Bayesian probability theory. To experimentally examine the trading return of the presented system, two examples are studied. The first uses the Taiwan Semiconductor Manufacturing Company (TSMC) data-set that covers an investment horizon of 240 trading days from 16 February 2012 to 23 January 2013. Eighty-four transactions were made using the proposed approach, and the investment return of the portfolio was 54% with an 80.4% hit rate during a 12-month period in which the TSMC stock price increased by 25% (from NT$78.5 to NT$101.5). The second example examines the stock data of Evergreen Marine Corporation, an international marine shipping company. Sixty-four transactions were made, and the investment return of the portfolio was 128% in 12 months. Given the remarkable investment returns in trading the example TSMC and Evergreen stocks, the proposed system demonstrates promising potential as a viable tool for stock market forecasting.


AUTHOR BIOGRAPHY
Chin-Yin Huang is a professor in the Department of Industrial Engineering & Enterprise Information at Tunghai University, Taichung, Taiwan. He also serves as the Chairman of the International Foundation for Production Research in the Asia-Pacific region. He holds a PhD from Purdue University, USA. His research interests include agent-based integrated systems, distributed production planning, and big data analysis. His publications appear in International Journal of Production Research, Computers in Industry, Computers and Industrial Engineering, Robotics and Computer-Integrated Manufacturing, Epilepsy Research, International Journal of Production Economics, Engineering Computations, and Production Engineering. He also co-authored chapters for the Handbook of Industrial Engineering, the Handbook of Industrial Robotics, and the Handbook of Automation.

PUBLIC INTEREST STATEMENT
This paper introduces insightful knowledge about using integrated data mining techniques for stock market forecasting. The integrated approach is not only novel but also effective, because its high hit rate in stock forecasting is rarely seen in the literature. Two stocks, Taiwan Semiconductor Manufacturing Company (TSMC) and Evergreen Marine Corporation, were investigated over a 12-month period, and the investment returns of the portfolio were 54 and 128% for TSMC and Evergreen, respectively. Note that over the same period the stock prices changed by +25% and −7.7% for TSMC and Evergreen, respectively. Given the remarkable investment returns in trading the example TSMC and Evergreen stocks, the proposed system demonstrates promising potential as a viable tool for stock market forecasting.

Introduction
Forecasting stock investment return is an important financial issue that has received a lot of attention (Matías & Reboredo, 2012). In the last decade, a number of intelligent systems and hybrid models have been proposed for making trading decisions in an attempt to outperform the main market and be profitable in stock investment (Atsalakis & Valavanis, 2009b). The nature of stock market prediction requires combining several computing techniques synergistically rather than exclusively (Jang, Sun, & Mizutani, 1997). It is essential to clarify that the task here is predicting the "stock market trend." In reality, it is impossible to predict the future absolute value of stocks on a daily basis. However, we adopt the assumption, largely supported by real case studies, that with appropriate training over any (uptrend, downtrend, or flat) horizon one can obtain enough indicators to forecast the trend with significant accuracy. Future trends may be predicted to some extent based on some key indicators and past behaviors.
Forecasting requires knowledge of the dominant market variables that "explain" stock market behavior, which is both dynamic and volatile. Due to system uncertainties and other unknown (random) factors, every stock market model is approximate. Thus, once model uncertainty is acknowledged, soft computing techniques emerge as the best candidates, chosen over standard benchmark linear models, to deal with such problems (Atsalakis, Dimitrakakis, & Zopounidis, 2011). One of the best ways to model the market value is the use of expert systems with artificial neural networks (ANN), which do not rely on standard formulas and can easily adapt to changes in the market. In the literature, many ANN models are evaluated against statistical models for forecasting the market value. It is observed that in most cases ANN models give better results than other methods (Guresen, Kayakutlu, & Daim, 2011). The proposed system in this research is a hybrid intelligent forecast system combined with ANN. It predicts stock price trends with significant accuracy using historical stock market prices from the Taiwan Stock Exchange (TSE) and gives very encouraging results. The trends of the Taiwan Semiconductor Manufacturing Company (TSMC) stock and the Evergreen Marine Corporation stock were predicted with an accuracy of 80.4% or higher. This accuracy corresponds to a ratio of about 4:1 (80.4/19.6) of correct to incorrect forecasts, and it yielded a 54% return in a year-long window in which the global recession was at its height and most trading was non-profitable. All case studies performed on the returns of the TSE stocks result in 80% or higher accuracy. In the sections that follow, we propose a system that integrates various data mining techniques to support stock trading decision-making. The system also incorporates the theory of top-down trading and tandem trading pioneered by Livermore (1940), which was found useful in stock forecasting.
Top-down analysis is vital in stock prediction. First, it establishes the market direction: the investor must know the overall trend of the market before making a trade. This applies to the stock market, the industry group, and individual stocks. The method is to probe whether the market, the industry group, or the stock is headed up, down, or sideways (Leung, Daouk, & Chen, 2000). Second, once the direction is known, the individual stock is investigated by the system that integrates data mining techniques including technical analysis, Bayesian probability theory, dynamic time series theory, and ANN.
In this research, we start by checking the main market. This step establishes which way the overall market is headed: up, down, or sideways. Secondly, we examine the specific industry group to make sure that the group is moving in the same direction, in order to increase the chance of making a profit on the trade. Thirdly, we review the sister stocks to see if the stock is moving in the same direction. In the fourth step, all three factors are examined at the same time; that is, the overall market, the industry group, and the sister stocks are considered simultaneously. It can be clearly seen how the system works when all factors are in unison. Lastly, the system that integrates the data mining techniques is employed to obtain the stock up/down prediction.
The remaining sections of this paper are organized as follows. Section 2 gives the background of the related studies. Section 3 introduces the system of data mining techniques used in this study, and Section 4 provides results of the approach using the daily TSE stock price. The final section gives the conclusion and recommendations for future research. This paper contributes to the study of intelligent forecasting. It would also help to realize profitable stock transactions if properly implemented.

Literature Review and Related Work
Many financial analysts and stock market investors seem convinced that they can make profits by employing one technical analysis approach or another to predict the stock market. Some use time series models expressed by financial theories to forecast a series of stock price data. ANN is also frequently chosen as a stock prediction tool. Yet, these approaches cannot be employed alone, because they are not directly applicable to predicting the market value, which is always subject to external impact. The stock market is affected by system uncertainties and other unknown (random) factors. Prediction therefore requires combining several computing techniques synergistically rather than exclusively (Chavarnakul & Enke, 2009; Zarandi, Hadavandi, & Turksen, 2012), which points to the hybrid use of technical analysis, time series forecasting, and possibly ANN. In the following, recent developments in hybrid approaches for stock market prediction are reviewed.
Technical analysis and ANN were used by Mandziuk and Jaruszewicz (2011). They introduced an experimental evaluation of a neuro-genetic system for the prediction of the short-term stock index. The buy/sell signals generated by technical indicators, namely MACD, Williams, Moving Averages (MA), and the Relative Strength Index (RSI), are considered for stock trading. Their results showed that prediction based on the neuro-genetic model worked well during both uptrends and downtrends.
The approach developed by Tan, Quek, and Yow (2008) involves the use of technical analysis and neuro-fuzzy techniques. Their intelligent stock trading system combines the superior predictive capability of a fuzzy neural network and the widely accepted MA and RSI trading rules. The system was able to identify and predict overbought/oversold trends in the stock counter and alert the investor to buy at the start of an uptrend and to sell off just before the trend reversed and the stock counter went into a decline. Atsalakis et al. (2011) adopted the Elliott Wave Theory and a neuro-fuzzy approach. They presented the Wave Analysis Stock Prediction system, which was based on a neuro-fuzzy architecture that utilized the Elliott Wave Theory. The system tended to achieve hit rates around the 60% mark, which was significantly better than forecasting by flipping a coin.
The approach by Abraham, Nath, and Mahanti (2001) incorporated principal component analysis and ANN. In this hybridized soft computing technique for automated stock market forecasting and trend analysis, principal component analysis preprocesses the input data before they are fed to an ANN for stock forecasting. Zuo and Kita (2012) presented a Bayesian network technique to predict the up/down movement of daily stock indexes, and the results were compared with the psychological line and trend estimation technical analyses. The average correct rate of their algorithm was almost 60%, which is almost equal to or higher than that of the psychological line (50-59%) and the trend estimation (50-52%).
Chen, Su, Cheng, and Chiang (2011) explored pattern recognition and time series forecasting. Theirs was a novel price-pattern detection method that looked for certain price-patterns ("price trend" and "price variation") contained in the time series variables that can be used to forecast the stock market.

The Proposed Methodology: Integrated Data Mining Techniques
This paper presents a system that incorporates the top-down trading theory first introduced by Livermore (1940) and various data mining techniques. Livermore believed that stock trends follow a trend line that can be used to forecast in both the long and the short term. He published this idea in "How to Trade in Stocks" in 1940. Using stock data, he concluded that stock-group behavior, for large and small groups alike, was an important indication of overall market direction, an indication embraced by Wall Street but ignored by most traders. He believed stock groups often provided the key to changes in trends. As the favored groups of the moment became weaker and collapsed, a correction in the overall market was usually on the way. The same thing happened in the dot-com bubble of 2000 and the financial market collapse of 2009. The leaders flipped and fell first, and the others followed. Figure 1 depicts the block diagram of the system. Detailed descriptions of the system are as follows.
Step 1: Examining Current Market Direction. The first step is to survey and establish the current market direction and to investigate whether the current line of least resistance is positive, negative, or neutral (Livermore, 1940). It is essential to make sure the line of least resistance is in the direction of the investor's trade before entering the trade. Figure 2 shows that the TSI began its recovery in November of 2008, when a pivot point was formed and the basic direction changed.
Step 2: Tracking the Industry Group. The second step is to check the specific industry group. Since the trades of TSMC are of interest, the semiconductor industry group is checked to make sure that the group is moving along the line of least resistance, in order to increase the chance of making a profit on the selected trade. Stocks do not move alone. When they move, they move in a group. The semiconductor industry group began its recovery in November of 2008, the same time the TSI began its recovery in Figure 2. In July/August, it gave a clear signal that the line of least resistance was upward. The signals confirmed that the trend was now heading to the upside.
Step 3: Checking Tandem Trading. Tandem trading involves comparing two stocks of the same group, that is, comparing the stock of interest with its sister stocks. To trade in TSMC, the Taiwan MediaTek stock is examined as a sister stock. Both stocks bottomed out in December of 2008 and gave a signal, by a pivotal point, that the line of least resistance was positive. Because the broker/dealers are also often an important bellwether group for what the market may do in the future, this chart action was a precursor of what was to come in the overall market (see Figure 4).
Step 4: Scoring the Three Factors. In the fourth step, the previous three factors, namely the market, the industry group, and the tandem stocks, are examined all together. It can be clearly seen in Figures 2-5 that all factors are in unison. All the signals in the figures show a bottoming out in November and a reversal in trend, clearly indicating that the line of least resistance was now upward in direction. The rules to score the three factors are described as follows.
Rule 1: If the main market and individual stock value are the same (upward, downward, or flat), the score is 1 and −1 otherwise.
Rule 2: If the industry group and individual stock value are the same (upward, downward, or flat), the score is 1 and −1 otherwise.
Rule 3: If the tandem stock and individual stock value are the same (upward, downward, or flat), the score is 1 and −1 otherwise.
Rule 4: Sum up the scores of Rules 1-3. The summed score is considered one of the key factors in the ANN.
Step 5: Integrated Data Mining Techniques for Stock Forecasting. Lastly, after all the trend lines are confirmed and the score is made, the next step is to predict the future stock values. Our approach is to identify and predict the profits or losses over the next one, two, three, and four days in the stock counter (Atsalakis & Valavanis, 2009a). The information is vital for the investor to buy at the start of an uptrend and to sell off just before the trend reverses. Since the stock market behaves dynamically, integrated data mining techniques can provide a suitable approach to identify the behavior patterns (uptrend, downtrend, and flat) of the stock price from the stock data-set (Han & Kamber, 2001; Jang et al., 1997). Since the raw stock data-set does not directly reveal a correlation with stock behavior patterns, technical analysis, Bayesian probability, dynamic time series, and ANN are integrated to extract the patterns, not necessarily correlations only, from the massive and otherwise unstructured data. More details are elaborated in the following subsections.
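The scoring rules of Step 4 can be illustrated with a minimal Python sketch. The function names and the string encoding of the trends ("up", "down", "flat") are assumptions made for illustration only; the paper does not specify how trends are encoded.

```python
def direction_score(reference_trend: str, stock_trend: str) -> int:
    """Rules 1-3: score 1 when the reference trend matches the stock trend, else -1."""
    return 1 if reference_trend == stock_trend else -1


def top_down_score(market: str, industry: str, tandem: str, stock: str) -> int:
    """Rule 4: sum the three agreement scores; the result lies in {-3, -1, 1, 3}."""
    return sum(direction_score(ref, stock) for ref in (market, industry, tandem))


# Hypothetical example: market and industry agree with the stock, the tandem stock does not.
example_score = top_down_score(market="up", industry="up", tandem="flat", stock="up")
print(example_score)  # 1 + 1 - 1 = 1
```

A summed score of 3 corresponds to the case described in the text where all factors are in unison.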

Technical Analysis
Technical analysis and fundamental analysis are two major stock market analysis methods, used to predict short-term and long-term stock trends, respectively. For most investors, it is valuable to accurately predict market trends and daily value movements, because one would want to invest in the stock at the right time, when the market is on an upward trend (Ausloos & Ivanova, 2002; Edwards, Magee, & Bassetti, 2007). Fundamental analysis considers commercial factors, such as financial statements, management ability, business competition, and market conditions, in order to determine the intrinsic value of a given stock. Technical analysis helps recognize price patterns by extrapolating from historical price patterns. In technical analysis, chart patterns and technical indicators are the two major tools. Chart patterns such as head-and-shoulders and flag use stock charts to study the movement of stock prices. Technical indicators such as RSI and moving average are produced by specific equations to examine market signals and help investors make trading decisions. Technical analysts widely use market indicators of many sorts, some of which are mathematical transformations of price, often including up and down volume and advance/decline data. Popular technical indicators are usually classified by two major functions: trend and momentum (Liu & Lee, 1997). All the technical indicators utilized in this study are summarized in Table 1.
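As an illustration of how such indicators are "produced by specific equations," the sketch below computes a simple moving average (a trend indicator) and a basic RSI (a momentum indicator). It uses the plain-average variant of RSI, RSI = 100 − 100 / (1 + RS) with RS the ratio of average gain to average loss; the exact indicator definitions used in Table 1 are not given in the text, so the formulas, window lengths, and function names here are assumptions.

```python
def simple_moving_average(prices, window):
    """Return the moving averages of `prices` over a sliding window."""
    if window <= 0 or window > len(prices):
        raise ValueError("window must be between 1 and len(prices)")
    return [sum(prices[i - window:i]) / window
            for i in range(window, len(prices) + 1)]


def rsi(prices, period=14):
    """Basic RSI over the last `period` price changes (plain-average variant)."""
    if len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    recent = prices[-(period + 1):]
    changes = [b - a for a, b in zip(recent[:-1], recent[1:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if losses == 0:
        return 100.0  # no down moves in the window
    rs = gains / losses  # ratio of sums equals ratio of averages
    return 100.0 - 100.0 / (1.0 + rs)


print(simple_moving_average([1, 2, 3, 4, 5], 3))  # [2.0, 3.0, 4.0]
print(rsi([10, 11, 10, 11, 10], period=4))        # 50.0
```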

Bayesian Probability
Bayesian probability (BP) is a method used to update the probability estimate for a hypothesis once additional evidence is learned. Bayesian inference is closely related to subjective probability, often called "Bayesian probability." There are many useful functions in Bayesian probability. One is probabilistic learning: BP can calculate explicit probabilities for hypotheses, which is among the most practical approaches to certain types of learning problems. Each training example can gradually increase or decrease the probability that a hypothesis is correct. Prior knowledge can be combined with observed data. Even when Bayesian methods are computationally intractable, they can provide a standard of optimal decision-making against which other methods can be measured (Spiegelhalter, Dawid, Lauritzen, & Cowell, 1993; Tsai, Wang, & Zhu, 2010). The formula of BP is expressed as follows.
Given n mutually exclusive and exhaustive events $E_1, E_2, \ldots, E_n$ such that $P(E_i) \neq 0$ for all i, and an event F with $P(F) \neq 0$, we have

$$P(E_i \mid F) = \frac{P(F \mid E_i)\,P(E_i)}{\sum_{j=1}^{n} P(F \mid E_j)\,P(E_j)} \qquad (1)$$

for $1 \le i \le n$, where $P(E_i)$ is the prior probability and $P(E_i \mid F)$ is the posterior probability. Table 2 tabulates several technical indicators calculated by BP. It also gives the resulting prior and posterior probabilities. The value of each technical indicator stands for its accuracy on the individual stock over the most recent 300 trading days. The result provides a standard of optimal decision-making for selecting significant technical indicators. We then ignore the technical indicators with low values and select the significant ones. The values of the selected indicators become the inputs of the neural network in the next step. BP screens out the unnecessary technical indicators to prevent possible losing trades. From Table 2, we select MA, ADX, and Williams as the significant technical indicators for the ANN in this research.
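The posterior computation in Bayes' theorem can be sketched in a few lines of Python. The events and the numerical values below are hypothetical and are not taken from Table 2; they only show how a prior is updated once an indicator signal is observed.

```python
def posterior(priors, likelihoods):
    """Bayes' theorem for mutually exclusive, exhaustive events E_1..E_n.

    priors[i]      = P(E_i)
    likelihoods[i] = P(F | E_i)
    Returns the list of posteriors P(E_i | F).
    """
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1")
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # P(F), by the law of total probability
    if evidence == 0:
        raise ValueError("P(F) is zero; the posterior is undefined")
    return [j / evidence for j in joint]


# Hypothetical numbers: E_1 = "price rises", E_2 = "price does not rise",
# F = "the indicator issues a buy signal".
post = posterior(priors=[0.5, 0.5], likelihoods=[0.8, 0.4])
print(post)  # approximately [0.667, 0.333]
```

In this toy case, observing the signal raises the probability of a rise from the prior 0.5 to about 0.67, which is the kind of value one could use to rank indicators.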

Dynamic Time Series Theory
Exponential smoothing is a technique that can be applied to time series data either to produce smoothed data or to make forecasts. Time series data are a sequence of observations. The exponential smoothing model for forecasting does not eliminate any past information but adjusts the weights given to the past data so that older data get increasingly less weight. Each new forecast is based on an average that is adjusted each time there is a new forecast error. The proportion of the error that is incorporated into the forecast is called the exponential smoothing factor and is denoted α. The raw data sequence is often represented by $X_t$, and the output of the exponential smoothing algorithm is commonly written as $S_t$, which may be regarded as the best estimate of what the next value of X will be. The simplest form of exponential smoothing is given by Equation 2, where α is the smoothing factor and 0 < α < 1:

$$S_t = \alpha X_{t-1} + (1 - \alpha)\,S_{t-1} \qquad (2)$$

In other words, the smoothed statistic $S_t$ is a simple weighted average of the previous observation $X_{t-1}$ and the previous smoothed statistic $S_{t-1}$. Values of α close to one have less of a smoothing effect and give greater weight to recent changes in the data, while values of α close to zero have a greater smoothing effect and are less responsive to recent changes (Billah, King, Snyder, & Koehler, 2006).
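A minimal Python sketch of simple exponential smoothing follows, using the recursion in which each smoothed value is a weighted average of the previous observation and the previous smoothed value. The initialisation S_0 = X_0 is an assumption (a common convention); the text does not state how the recursion is started.

```python
def exponential_smoothing(xs, alpha):
    """Return [S_0, S_1, ..., S_n] for observations xs = [X_0, ..., X_{n-1}].

    S_t = alpha * X_{t-1} + (1 - alpha) * S_{t-1}; the last element S_n is
    the forecast for the next, not yet observed, value.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not xs:
        raise ValueError("need at least one observation")
    s = [xs[0]]  # assumed initialisation: S_0 = X_0
    for t in range(1, len(xs) + 1):
        s.append(alpha * xs[t - 1] + (1 - alpha) * s[t - 1])
    return s


print(exponential_smoothing([10, 12, 11], alpha=0.5))  # [10, 10.0, 11.0, 11.0]
```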
Adaptive exponential smoothing methods allow the smoothing parameter to change over time in order to adapt to changes in the characteristics of the time series. However, these methods tend to produce unstable forecasts and have performed poorly in empirical studies (Entorf, Gross, & Steiner, 2012; Taylor, 2004). We present a new adaptive method, which models the smoothing parameter as a linear combination of the trading volume, trend, and momentum. Figure 6 illustrates the closed-loop structure of the adaptive exponential smoothing method, where V(i) is the volume indicator of the ith day, T(i) is the trend indicator of the ith day, M(i) is the momentum indicator of the ith day, and α(i) is the smoothing factor of the ith day. D(i) is the actual stock value of the ith day, F(i) is the forecast stock value at time i, and Z(i) is the deviation of the forecast value at time i. Note that only the smoothing factor of each day at its first processing step is controlled using the deviation between the predicted stock value at the final stage and the actual value for the final stage (Ohama, Fukumura, & Uno, 2005).
The simplest form of adaptive exponential smoothing replaces the fixed α with a daily smoothing factor α(i) that is updated as

$$\alpha(i) = \alpha(i-1) + \beta\,\bigl(V(i) + T(i) + M(i)\bigr)$$

where β is a small coefficient less than 0.05 that is used to fine-tune α(i) according to the following setting steps:
Step 1: V(i) is the volume indicator of the ith day.
• If the stock transaction volume of today is more than twice yesterday's volume, then V(i) = 1
• If the stock transaction volume of today is less than half of yesterday's volume, then V(i) = −1
Step 2
The adaptive α is evaluated against exponential smoothing with a fixed α. The investment horizon is 60 days, from September 2012 to November 2012. Figure 7 compares the performance of the adaptive α with that of three fixed α values of 0.3, 0.4, and 0.5. It shows that the adaptive α follows the actual stock up/down value much more closely than the three fixed-α lines in Figure 7.
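The update of the smoothing factor can be sketched as below, reading the rule as α(i) = α(i−1) + β(V(i) + T(i) + M(i)). Several points are assumptions rather than statements from the paper: V(i) = 0 when neither volume condition holds, the trend and momentum indicators T(i) and M(i) are supplied by the caller as values in {−1, 0, 1}, β = 0.02 is merely one value below 0.05, and α(i) is clamped so that it stays strictly between 0 and 1.

```python
def volume_indicator(today_volume, yesterday_volume):
    """V(i): 1 if volume more than doubled, -1 if it more than halved."""
    if today_volume > 2 * yesterday_volume:
        return 1
    if today_volume < 0.5 * yesterday_volume:
        return -1
    return 0  # assumption: neutral when neither condition holds


def update_alpha(prev_alpha, v, t, m, beta=0.02, lo=0.01, hi=0.99):
    """alpha(i) = alpha(i-1) + beta * (V(i) + T(i) + M(i)), clamped to [lo, hi].

    The clamping bounds are an assumption that keeps 0 < alpha < 1.
    """
    alpha = prev_alpha + beta * (v + t + m)
    return min(hi, max(lo, alpha))


# Hypothetical day: volume more than doubled, trend indicator up, momentum neutral.
v = volume_indicator(today_volume=250, yesterday_volume=100)
new_alpha = update_alpha(prev_alpha=0.30, v=v, t=1, m=0)
print(new_alpha)  # close to 0.34
```

With all three indicators pointing the same way, α moves by at most 3β per day, so a small β keeps the day-to-day adjustment gradual.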

ANN Training
Many ANN models have been evaluated against statistical models for market forecasting. It is observed that in most cases ANN models give better results than other methods (Chen, Leung, & Daouk, 2003; Lee, 2004). The neural network technique most commonly used in pattern recognition for classification problems is the Multilayer Perceptron (MLP). The MLP architecture trained with the back-propagation algorithm has been applied to stock price prediction. Two important characteristics of the MLP are its non-linear processing elements (PEs, which apply the sigmoid function in this research) and their massive interconnectivity. Sigmoid functions all share a similar S shape that is essentially linear in the center and non-linear toward the bounds, which are approached asymptotically. The back-propagation algorithm finds the neural weights by training the network to minimize a cost function such as the Mean Square Error (MSE). This requires the sigmoid function to be easily differentiable, so that the weight increments can be evaluated via the chain rule for partial derivatives (Yonaba, Anctil, & Fortin, 2010). The back-propagation rule propagates the errors through the network and allows adaptation of the hidden PEs. The MLP is trained with error-correction learning, which means that the desired response of the system must be known. Learning typically occurs by example through training, where the training algorithm iteratively adjusts the connection weights. When the network is adequately trained, it is able to generalize and produce relevant output for a set of input data. Training automatically stops when generalization stops improving, as indicated by an increase in the MSE of the validation samples. MSE is the average squared difference between outputs and targets.
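The two ingredients named above, a differentiable sigmoid and the MSE cost, can be written down in a short Python sketch. The logistic form of the sigmoid is an assumption; the text says only that a sigmoid function is used.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: S-shaped, bounded between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Closed-form derivative s(x) * (1 - s(x)), which back-propagation reuses."""
    s = sigmoid(x)
    return s * (1.0 - s)


def mse(outputs, targets):
    """Average squared difference between outputs and targets."""
    if len(outputs) != len(targets) or not outputs:
        raise ValueError("outputs and targets must be non-empty and of equal length")
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)


print(sigmoid(0.0), sigmoid_derivative(0.0))  # 0.5 0.25
print(mse([1.0, 0.0], [0.0, 0.0]))            # 0.5
```

Because the derivative is expressed through the sigmoid's own output, the chain rule can be applied layer by layer without evaluating any additional function.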
Since the forecasting problem has been converted to a classification problem (Hajizadeh, Ardakani, & Shahrabi, 2010), we develop new target setting rules.
Rule 1: If the stock value increases and the increase exceeds the stock transaction tax, the target value is identified as "Buy," labeled as "1."
Rule 2: If the stock value decreases and the decrease exceeds the stock transaction tax, the target value is identified as "Sell," labeled as "−1."
Rule 3: In all other cases, the target value is identified as "Hold," labeled as "0."
The desired ANN response is the target value set to reflect the actual stock performance (Wang & Chan, 2007). In this study, the ANN input data include the top-down scores, the selected key technical indicators, and the forecasting value. The three output units are "Buy," "Sell," and "Hold," respectively. The number of hidden neurons is 20. We set aside some samples for validation and testing: 70% of the data are used for training, 15% for validation, and 15% for testing. The MSE threshold is set to 3 × 10^−2. Figure 8 depicts the MSE decreasing after 57 epochs for TSMC.
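The target setting rules can be expressed as a small labeling function. In this sketch the transaction tax is modeled as a fraction of the current price, and the default of 0.6% is borrowed from the transaction cost quoted in the experimental setup; both choices are assumptions, since the text does not state how the tax enters the rule.

```python
def target_label(price_today, price_next, cost_rate=0.006):
    """Return 1 (Buy), -1 (Sell), or 0 (Hold) for one training sample.

    cost_rate is a hypothetical parameter: the transaction tax expressed
    as a fraction of today's price.
    """
    change = price_next - price_today
    cost = cost_rate * price_today
    if change > cost:       # Rule 1: rise larger than the tax
        return 1
    if -change > cost:      # Rule 2: fall larger than the tax
        return -1
    return 0                # Rule 3: everything else


labels = [target_label(100, 101), target_label(100, 99), target_label(100, 100.5)]
print(labels)  # [1, -1, 0]
```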
The ANN system is trained to distinguish among "Buy," "Sell," and "Hold." A confusion matrix summarizes the results of testing the algorithm for further inspection (Simon & Simon, 2010). Figure 9 shows the classification results for the whole testing period. It covers a sample set of 174 stock up/down values: 68 Buys, 66 Sells, and 40 Holds. Each column of the matrix represents the instances in a predicted class, while each row represents the instances in an actual class. All correct predictions are located on the diagonal of the table, so it is easy to visually inspect the table for errors, as they are represented by any non-zero values outside the diagonal. Figure 9 shows the true positive rates of "Buy," "Sell," and "Hold" as 98.5, 97, and 97.5%, respectively. Overall, the true positive rate is 97.7%.
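The per-class and overall true positive rates follow directly from the diagonal of the confusion matrix. In the sketch below the class totals (68, 66, 40) are those quoted in the text, while the diagonal counts 67, 64, and 39 are inferred from the reported rates and are not stated in the source.

```python
# Actual class sizes quoted in the text.
class_totals = {"Buy": 68, "Sell": 66, "Hold": 40}
# Correct predictions per class: inferred from the reported rates (assumption).
correct = {"Buy": 67, "Sell": 64, "Hold": 39}

# True positive rate per class, as a percentage rounded to one decimal.
rates = {name: round(100 * correct[name] / class_totals[name], 1)
         for name in class_totals}

# Overall rate: all diagonal entries over all samples.
overall = round(100 * sum(correct.values()) / sum(class_totals.values()), 1)

print(rates)    # {'Buy': 98.5, 'Sell': 97.0, 'Hold': 97.5}
print(overall)  # 97.7
```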

Experimentation Setup and Test Results
For training and evaluating the performance of the presented approach, stock data covering 240 trading days were considered. The system was retrained daily. A paper portfolio of NT$1,000,000 was the initial investment. Stocks were bought whenever the forecast was positive, and the position was closed when the forecast became negative. Transaction costs were taken into consideration and amounted to 0.6% of the individual stock trading price.
TSMC stock was tested first. Experiments were carried out on a personal computer. The system was coded in Microsoft VBA and the neural network analysis was run in MATLAB. Note that this period also includes the Great Recession, the European debt crisis, and the fiscal cliff of the United States in 2012.
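The trading rule described above can be sketched as a simple paper-trading loop. This is not the authors' VBA/MATLAB implementation; it assumes that the whole portfolio is invested on each buy, that fractional shares are allowed, that the 0.6% cost is charged on both the buy and the sell, and that an open position is valued at the last price.

```python
def simulate(prices, signals, cash=1_000_000.0, cost_rate=0.006):
    """Paper-trade one stock and return the final portfolio value.

    signals[i] > 0 means a positive forecast on day i (open a position if flat),
    signals[i] < 0 means a negative forecast (close the position if held).
    """
    shares = 0.0
    for price, signal in zip(prices, signals):
        if signal > 0 and shares == 0:
            shares = cash / (price * (1 + cost_rate))  # buy, paying the cost
            cash = 0.0
        elif signal < 0 and shares > 0:
            cash = shares * price * (1 - cost_rate)    # sell, paying the cost
            shares = 0.0
    return cash + shares * prices[-1]  # mark any open position to market


# Hypothetical two-day example: buy at 100, sell at 110.
final_value = simulate(prices=[100, 110], signals=[1, -1])
print(round(final_value))  # about 1,086,879
```

Charging the cost on both sides means a round trip needs a price move of more than roughly 1.2% to break even under these assumptions.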

One Year Period
The rate of accuracy of the proposed approach was 81%. Figure 10 illustrates the moving hit rate, that is, the hit rate accumulated since the first day of this period. The hit rate describes the success rate of an effort: it compares the number of successful attempts with the total number of attempts. The moving hit rate of TSMC converges toward 0.8. The TSMC stock price began at NT$78.5 on 16 February 2012 and reached NT$101 on 23 January 2013. The stock price increased by 23% over this period, while the investment return was 53.6% (see also Figure 11). To compare the performance over different time spans, this period is broken into three sub-periods of one month, one quarter, and six months, respectively. In Section 4.6, the system is again applied to the Evergreen stock to compare the investment performance.
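The moving hit rate, the number of successes divided by the number of attempts since the first day, can be computed with a running count. The function name and the boolean encoding of daily outcomes are assumptions for illustration.

```python
def moving_hit_rate(hits):
    """Return the cumulative hit rate after each day.

    hits[i] is True when the forecast on day i was correct.
    """
    rates = []
    n_hits = 0
    for day, hit in enumerate(hits, start=1):
        if hit:
            n_hits += 1
        rates.append(n_hits / day)  # successes so far / attempts so far
    return rates


# Hypothetical four-day outcome sequence.
print(moving_hit_rate([True, False, True, True]))
```

Early values swing widely because each day carries a large weight; the curve settles as the number of trading days grows, which is why the text speaks of the rate converging.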

First Period: 12/24/2012-01/23/2013
For each period, the result includes the portfolio return compared with the initial investment of NT$1 million. The moving hit rate diagram shows the hit rate since the first day of the period. The return on investment is 3.6%. The accuracy rate of this period is 70%. Figure 12 shows that the moving hit rate converges toward 0.8. The TSMC stock price rose from NT$95 to NT$101.5; the stock price is up 5% over the period, as shown in Figure 13.

Second Period: 10/30/2012-01/23/2013
During this second period of 60 trading days, the results are even better. Again, the results include the portfolio return compared with the initial investment of NT$1 million. The return on investment reaches 8.6%. The rate of accuracy of this period is 75%. Figure 14 shows that the moving hit rate converges toward the 81% region. The TSMC stock price increased from NT$88 to NT$101.5; the stock price is up 12% over the period, as seen in Figure 15.

Third Period: 08/06/2012-01/23/2013
The portfolio return compared with the initial investment is again considered. The return on investment reaches 18.7%. The rate of accuracy of this period is 77.5%. Figure 16 shows that the moving hit rate converges toward 0.8. The stock price rose from NT$81 to NT$101.5; the stock price is up 19% over the period, as indicated in Figure 17.

Summary of TSMC Stock Performance
The performances of different periods of TSMC are summarized in Table 3. The proposed system made 82 transactions in the stock market during this period of 240 trading days. This gave a rough average of one transaction every three days. While the stock value increased by 23%, the return of the portfolio during the whole period was 53.6% with an 81% accuracy rate. The total trading period was also divided into three sub-periods that cover one month, one quarter, and six months, respectively. The result of each period is summarized as follows:
• The accuracy rates achieved were 70, 75, and 77.5%, respectively.
• The changes in the stock price were 5% up, 12% up, and 19% up, respectively.

Application to the Evergreen Stock
The approach is applied to Evergreen in the same way as to TSMC. The Evergreen stock was tested with an initial paper portfolio of NT$1,000,000. Evergreen stock data covering 240 trading days were considered for training and evaluating the performance of the system, which was retrained daily. Transaction costs were taken into consideration and amounted to 0.6% of the individual stock trading price.
Figure 16. The moving hit rate in the third period of 120 trading days.
Figure 17. The returns of investment and the variation of stock price in the third period.
The proposed system made 64 Evergreen stock transactions in the market during this period of 240 trading days. This gave a rough average of one transaction every four days. Although the stock value dropped by 7.7% in this period, the portfolio still made a 128.4% profit over the whole period with an 88% accuracy rate. To study the performance of different periods, we divide the periods into one month, one quarter, six months, and one year. The result of each period is summarized as follows (Table 4):
• The changes in the stock price were 6.6% down, 5% up, and 14.6% up, respectively.

Conclusions
The proposed approach, which integrates various data mining techniques, has achieved remarkable results. The investment returns of the TSMC and Evergreen stocks were 53.6 and 128.4% for the trading days considered. The system was retrained daily. As all sub-periods of the TSMC and Evergreen trading generated profits, the results indicate that the proposed system is highly effective for stock forecasting. Instead of giving a fixed tool, this research proposes a methodological system to handle stock forecasting. Every stock may have different structures in the top-down theory, the dynamic time series, and the ANN, and may call for different choices in the technical analysis and the Bayesian probability. Hence, applications of the methodological system are not limited to the TSMC and Evergreen stocks. In our future work, we will apply the proposed system to the popular Nasdaq-100 index as well as to some of the companies listed in it. Additionally, justifying the decisions of the proposed system by applying a linguistic fuzzy-set approach to include experts' opinions is also part of our future research.