Application of Cloud Model in Qualitative Forecasting for Stock Market Trends

Forecasting stock prices plays an important role in setting a trading strategy or determining the appropriate timing for buying or selling a stock. Technical analysis has been successfully employed for financial forecasting by many researchers. Existing qualitative methods based on fuzzy reasoning techniques cannot describe the data comprehensively, which has greatly limited the objectivity of fuzzy time series in forecasting uncertain data. Extended fuzzy sets (e.g., the fuzzy probabilistic set) study the fuzziness of the membership grade of a concept. The cloud model, based on a probability measure space, automatically produces random membership grades of a concept through a cloud generator. In this paper, a cloud model-based approach is proposed to confirm accurate stock trends based on Japanese candlesticks. By incorporating probability statistics and fuzzy set theory, the cloud model can aid the required transformation between qualitative concepts and quantitative data. The degree of certainty associated with candlestick patterns can be calculated through repeated assessments by employing the normal cloud model. A hybrid weighting method comprising the fuzzy time series and the Heikin-Ashi candlestick was employed to determine the weights of the indicators in the multi-criteria decision-making process. Fuzzy membership functions are constructed by the cloud model to deal effectively with the uncertainty and vagueness of historical stock data, with the aim of predicting the next open, high, low, and close prices of the stock. The experimental results demonstrate the feasibility and high forecasting accuracy of the proposed model.


Introduction
Forecasting stock prices is an attractive pursuit for investors and researchers who want to beat the stock market. The benefits of a good estimate of stock market behavior are well known: minimizing the risk of investment and maximizing profits. Recently, the stock market has become an easily accessible investment tool, not only for strategic investors, but also for ordinary people. Over the years, investors and researchers have been interested in developing and testing models of stock price behavior. However, analyzing stock market movements and price behaviors is extremely challenging because of the market's dynamic, nonlinear, non-stationary, nonparametric, noisy, and chaotic nature [1]. Stock markets are affected by many highly interrelated uncertain factors that include economic, political, psychological, and company-specific variables. These uncertain factors are undesirable for the stock investor and make stock price prediction very difficult, but at the same time, they are also unavoidable whenever stock trading is preferred as an investment ability to capture uncertainties with fuzzy and random nature. However, the membership functions are difficult to obtain for existing fuzzy approaches to measuring uncertainty. To overcome this disadvantage, the cloud model was used to calculate the measurement uncertainty. A cloud is a new, easily visualized concept for uncertainty with well-defined semantics, mediating between the concept of a fuzzy set and that of a probability distribution [11][12][13][14][15][16]. A cloud model is an effective tool for transforming between qualitative concepts and their quantitative expressions. The digital characteristics of the cloud, namely the expected value (Ex), entropy (En), and hyper-entropy (He), integrate the fuzziness and randomness of linguistic concepts in a unified way. A cloud is composed of many cloud drops, and the shape of the cloud reflects the important characteristics of the quantity concept [17].
The essential difference between the cloud model and the fuzzy probability concept lies in the method used to calculate a random membership degree. Basically, with the three numerical characteristics, the cloud model can randomly generate a degree of membership of an element and implement the uncertain transformation between linguistic concepts and their quantitative instantiations.
Candlestick patterns provide a way to understand which buyer and seller groups currently control the price action. This information is visually represented in the form of different colors on these charts. Recently, several traders and investors have used the traditional Japanese candlestick chart pattern and analyzed the pattern visually for both quantitative and qualitative forecasting [6][7][8][9][10]. Heikin-Ashi candlesticks are an offshoot from Japanese candlesticks. Heikin-Ashi candlesticks use the open-close data from the prior period and the open-high-low-close data from the current period to create a combo candlestick. The resulting candlestick filters out some noise in an effort to better capture the trend.

Problem Statement
The price variation of the stock market is a non-linear dynamic system that deals with non-stationary and volatile data, which is why modeling it is not a simple task. In fact, it is regarded as one of the most challenging modeling problems because prices are stochastic. Hence, the best way to predict the stock price is to reduce the level of uncertainty by analyzing the movement of the stock price. The main motivation of our work is that successful prediction of a stock's future value can yield enormous capital profits and avoid potential market risk. Several classical approaches have evolved based on linear time series models, but the patterns of the stock market are not linear. These approaches lead to inaccurate results, which may be susceptible to highly dynamic factors such as macroeconomic conditions and political events. Moreover, existing qualitative methods based on fuzzy reasoning techniques cannot describe the data comprehensively, which has greatly limited the objectivity of fuzzy time series in forecasting uncertain data. The most important disadvantage of the fuzzy time series approach is that it needs subjective decisions, especially in the fuzzification stage.

Contribution and Novelty
The objective of the work presented in this paper is to construct an accurate stock trend prediction model by combining the cloud model, Heikin-Ashi candlesticks, and fuzzy time series (FTS) in a unified model. The purpose of the cloud model is to add randomness and uncertainty to the fuzzy linguistic definition of Heikin-Ashi candlesticks. FTS is utilized to abstract linguistic values, instead of numerical ones, from historical data in order to find internal relationship rules. Heikin-Ashi candlesticks were employed to make the candle's features easier to read by reducing noise, eliminating the gaps between candles, and smoothing the movement of the market.
As far as the authors know, this is the first time that the cloud model has been used in forecasting stock market trends. Current methods adopt a fuzzy probability approach to forecasting, which requires an expert to define the extra parameters of the probabilistic fuzzy system, such as the output probability vector in probabilistic fuzzy rules and the variance factor. These selected statistical parameters specify the degree of randomness. The cloud model not only studies the distribution of samples in the universe, but also generalizes the point-based membership to a random variable on the interval [0, 1], which offers a new way to study the relationship between the randomness of samples and the uncertainty of the membership degree. More practically speaking, the membership degree is obtained with the aid of three numeric characteristics, which makes the transformation between linguistic concepts and numeric values possible.
The outline of the remainder of this paper is as follows. Section 2 presents the background and a summary of the state-of-the-art approaches. Section 3 describes the proposed model. The test results and a discussion of their meaning are presented in Section 4. The conclusion of this work is given in Section 5.

Preliminaries and Literature Review
In this section, we summarize material that we need later that includes the cloud model, fuzzy time series, and Heikin-Ashi candlesticks. Finally, some state-of-the-art related works are discussed.

Cloud Model
The cloud model (CM) proposed by Li et al. [17] relies on probability statistics and traditional fuzzy theory [18,19]. The membership cloud model, as shown in Figure 1, combines fuzziness and randomness to objectively describe the uncertainty of a complex system. This model makes it possible to obtain the range and distribution of quantitative data from qualitative information described by a linguistic value, and to convert precise data into an appropriate qualitative linguistic value. The digital characteristics of the cloud are expressed by the expected value (Ex), entropy (En), and hyper-entropy (He). Ex represents the qualitative concept and is usually the value of x corresponding to the cloud center. En represents the uncertainty measure of the qualitative concept; it measures the ambiguity of the quantitative numerical range. He denotes the uncertainty measure of the entropy, namely the entropy of the entropy; it reflects the dispersion degree of the cloud, which appears as the thickness of the cloud [17][18][19][20][21].

The theoretical foundation of CM is the probability measure (i.e., the measure function in the sense of probability). On the basis of the normal distribution and the Gaussian membership function, CMs describe the vagueness of the membership degree of an element by a random variable defined in the universe. As an uncertain transition between a qualitative concept described by linguistic terms and its numerical representation, the cloud depicts the abundant uncertainties in linguistic terms, such as randomness, fuzziness, and the relationship between them. CM can acquire the range and distribution law of quantitative data from qualitative information expressed in linguistic terms. CM has been successfully applied and gives better performance results in several fields such as intelligence control [11], data mining [19], and others. Figure 2 illustrates the types of cloud model (see [11,17] for more details).
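To make the role of the three digital characteristics concrete, the following is a minimal sketch of a forward normal cloud generator as it is commonly described in the cloud model literature. It is an illustration under stated assumptions, not the implementation used in this paper: the function name `normal_cloud_drops`, its parameters, and the sample values of Ex, En, and He are hypothetical.

```python
import math
import random


def normal_cloud_drops(ex, en, he, n, seed=None):
    """Generate n cloud drops (x, membership) for a normal cloud (Ex, En, He).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    drops = []
    for _ in range(n):
        # He controls how much the entropy itself fluctuates from drop to drop.
        en_prime = rng.gauss(en, he)
        if en_prime == 0.0:
            # Degenerate case: the drop collapses onto the cloud center.
            drops.append((ex, 1.0))
            continue
        # Draw the quantitative value around Ex with the perturbed entropy.
        x = rng.gauss(ex, abs(en_prime))
        # Gaussian membership degree; it is random because en_prime is random.
        mu = math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))
        drops.append((x, mu))
    return drops


# Example with assumed values: a concept centered at 50 with entropy 5.
drops = normal_cloud_drops(ex=50.0, en=5.0, he=0.5, n=5000, seed=0)
mean_x = sum(x for x, _ in drops) / len(drops)
```

With He = 0 every drop falls exactly on the Gaussian membership curve; a larger He scatters the drops around that curve, which is the "thickness" of the cloud mentioned above.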

The Fuzzy Time Series Model
Fuzzy time series is another concept to solve forecasting problems in which the historical data are linguistic values. The fuzzy time series has recently received increasing attention because of its capability to deal with vague and incomplete data. There have been a variety of models developed to either improve forecasting accuracy or reduce computation overhead [22]. The fuzzy time series model uses a four-step framework to make forecasts, as shown in Figure 3: (1) define the universe of discourse and partition it into intervals; (2) determine the fuzzy sets on the universe of discourse and fuzzify the time series; (3) build the model of the existing fuzzy logic relationships in the fuzzified time series; and (4) make forecast and defuzzify the forecast values [23][24][25].
Nevertheless, the forecasting performance can be significantly affected by the partition of the universe of discourse. Another issue is the consistency of the forecasting accuracy with the interval length. In general cases, better accuracy can be achieved with a shorter interval length. However, an effective forecasting model should adhere to the consistency principle. In accounting, consistency requires that a company's financial statements follow the same accounting principles, methods, practices, and procedures from one accounting period to the next. In general, the effect of some parameters in fuzzy time series such as population size, number of intervals, and order of fuzzy time series must be tested and analyzed [26,27].
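The four-step framework described in this section can be sketched as follows. This is a deliberately simplified first-order variant in the spirit of arithmetic-based fuzzy time series methods, assuming equal-length intervals and assigning each observation to the single interval in which it falls; the function name `fts_forecast` and its parameters are hypothetical, and the sketch is not the forecasting procedure of the proposed model.

```python
from collections import defaultdict


def fts_forecast(series, n_intervals=7, margin=0.0):
    """One-step-ahead forecast with a simplified first-order fuzzy time series."""
    # Step 1: define the universe of discourse and partition it into intervals.
    lo = min(series) - margin
    hi = max(series) + margin
    if hi == lo:
        raise ValueError("series must span a non-empty range")
    width = (hi - lo) / n_intervals
    midpoints = [lo + (i + 0.5) * width for i in range(n_intervals)]

    # Step 2: fuzzify each observation to the index of its interval (fuzzy set A_i).
    def fuzzify(value):
        return min(int((value - lo) / width), n_intervals - 1)

    states = [fuzzify(v) for v in series]

    # Step 3: build the fuzzy logical relationship groups A_i -> {A_j, ...}.
    groups = defaultdict(set)
    for current, following in zip(states, states[1:]):
        groups[current].add(following)

    # Step 4: forecast from the last state and defuzzify with interval midpoints.
    last = states[-1]
    successors = groups.get(last)
    if not successors:
        return midpoints[last]  # no rule observed: fall back to the own midpoint
    return sum(midpoints[j] for j in successors) / len(successors)


# Example: an alternating series is forecast to move back to the upper interval.
forecast = fts_forecast([10, 20, 10, 20, 10], n_intervals=2)
```

Changing `n_intervals` changes the partition of the universe of discourse, so the same series can yield different forecasts; this is the sensitivity to interval length discussed above.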

Heikin-Ashi Candlestick Pattern
The current forecasting models do not contain the qualitative information that would help in predicting the future. Japanese candlesticks are a technical analysis tool that traders use to chart and analyze the price movement of securities. Japanese candlesticks provide more detailed and accurate information about price movements compared to bar charts. They provide a graphical representation of the supply and demand behind each time period's price action.
Each candlestick includes a central portion, referred to as the body, that shows the distance between the open and the close of the security being traded. The upper shadow is the price distance between the top of the body and the high for the trading period. The lower shadow is the price distance between the bottom of the body and the low for the trading period. The closing price of the security being traded determines whether the candlestick is bullish or bearish. The real body is usually white if the candlestick closes at a higher price than it opened. In such a case, the closing price is located at the top of the real body and the opening price is located at the bottom. If the security being traded closed at a lower price than it opened for the time period, the body is usually filled or black in color. The closing price is then located at the bottom of the body and the opening price at the top. Modern candlesticks replace the white and black colors of the body with other colors such as red, green, and blue. Traders can choose among the colors when using electronic trading platforms (see Figure 4) [6,7].
There are a few differences to note between traditional candlestick charts and Heikin-Ashi charts, as demonstrated by the charts above. Heikin-Ashi has a smoother look as it essentially takes an average of the movement. There is a tendency with Heikin-Ashi for the candles to stay red during a downtrend and green during an uptrend, whereas normal candlesticks alternate colors, even if the price is moving dominantly in one direction. Since Heikin-Ashi takes an average, the current price on the candle may not match the price the market is actually trading at. For this reason, many charting platforms show two prices on the y-axis: one for the calculation of the Heikin-Ashi and another for the current price of the asset [7-10].
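The body and shadow definitions given in this section translate directly into a few lines of code. The sketch below is illustrative only: the class name `Candle`, its field names, and the "neutral" label for a candle whose open equals its close are assumptions that do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candle:
    """One candlestick line built from the four prices of a trading period."""
    open_: float
    high: float
    low: float
    close: float

    @property
    def body(self):
        # Distance between the open and the close.
        return abs(self.close - self.open_)

    @property
    def upper_shadow(self):
        # Distance between the top of the body and the period high.
        return self.high - max(self.open_, self.close)

    @property
    def lower_shadow(self):
        # Distance between the bottom of the body and the period low.
        return min(self.open_, self.close) - self.low

    @property
    def color(self):
        if self.close > self.open_:
            return "white"   # bullish: closed above the open
        if self.close < self.open_:
            return "black"   # bearish: closed below the open
        return "neutral"     # assumed label when open equals close


# Example: a bullish candle that opened at 10 and closed at 13.
candle = Candle(open_=10.0, high=15.0, low=8.0, close=13.0)
# candle.body == 3.0, candle.upper_shadow == 2.0, candle.lower_shadow == 2.0
```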

Related Work
Researchers who believe that financial time series contain patterns that make them predictable have centered their work mainly on two different approaches: statistical and artificial intelligence (AI). The statistical techniques most used in financial time series modeling are the autoregressive integrated moving average (ARIMA) and the smooth transition autoregressive (STAR) models [2]. On the other hand, artificial intelligence provides sophisticated techniques to model time series and search for behavior patterns; examples include genetic algorithms, fuzzy models, the adaptive neuro-fuzzy inference system (ANFIS), artificial neural networks (ANN), support vector machines (SVM), hidden Markov models, and expert systems. Unlike statistical techniques, they are capable of obtaining adequate models for nonlinear and unstructured data. There exists a huge amount of literature that uses AI approaches for time series forecasting [2,4,8]. However, most of them are inaccurate: the computer programs are more effective in syntax analysis than in semantic analysis. Furthermore, most of them fall into the quantitative forecasting category, whereas qualitative forecasting is useful when the data are ambiguous or inadequate. Most of the current studies were conducted on single time scale features of the stock market index, but it is also meaningful to study multiple time scale features [8]. With the development of deep learning, many deep learning-based methods have been used for stock forecasting and have drawn some essential conclusions [3].
In the literature, many studies have used an integrated neuro-fuzzy model to estimate the dynamics of the stock market using technical indicators [3]. This approach integrates the advantages in both the neural and fuzzy models to facilitate reliable intelligent stock value forecasting. However, most of these works did not consider the fractional deviation within a day. Another group of research work utilized hidden Markov models (HMMs) to predict the stock price based on the daily fractional change in the stock share value of intra-day high and low. To benefit from the correlation between the technical indicators and reduce the large dimensionality space, the principal component analysis (PCA) concept was deployed to select the most effective technical indicators among a large number of highly correlated variables. PCA linearly transforms the original large set of input variables into a smaller set of uncorrelated variables to reduce the large dimensionality space.
In addition, some researchers are currently using soft computing techniques (e.g., genetic algorithm) for selecting the most optimal subset of features among a large number of input features, and then selected features are given as input to the machine learning module (e.g., SVM Light software package). Technical analysis is carried out based on technical indicators from the stock to be predicted and also from other stocks that are highly correlated with it. However, the decision is carried out only based on the input feature variables of technical indicators. This leads to prediction errors due to the lack of precise domain knowledge and no consideration of various political and economic factors that affect the stock market other than the technical indicators [3,8].
Song and Chissom [13] suggested a forecasting model using fuzzy time series, which provided a theoretical framework to model a special dynamic process whose observations were linguistic values. The main difference between the traditional time series and fuzzy time series was that the observed values of the former were real numbers while the observed values of the latter were fuzzy sets or linguistic values. Chen et al. [16] presented a new method for forecasting university enrolment using fuzzy time series. Their method is more efficient than the method suggested by Song and Chissom because it uses simplified arithmetic operations rather than the complicated max-min composition operation. Hwang [22] suggested a new method based on fuzzification to revise Song and Chissom's method. He used a different triangular fuzzification method to fuzzify crisp values. His method involved determining an interval of extension on both sides of the crisp value in the triangular membership function to get a variant degree of membership. The results showed a better average forecasting error. In addition, the influences of factors and variables in a fuzzy time series model, such as the definition area, the number and length of intervals, and the interval of extension in the triangular membership function, were discussed in detail. More techniques that used fuzzy time series for forecasting can be found in [23][24][25][26][27].
Nison [5] introduced the Japanese candlestick concepts to the Western world. Japanese candlestick patterns are believed to show both quantitative information, such as price and trend, and qualitative information, such as the psychology of the market. They consider not only the close values: the information in the body of the candlestick can also offer an informative summary of the trading sessions [28], and some of its components are predictable [29]. Some researchers have combined technical patterns and candlestick information [30]. In the last decades, several researchers have used Japanese candlesticks in creative forecasting methods [31][32][33][34][35][36]. Lee et al. [31] suggested an expert system with IF-THEN rules to detect candlestick patterns and flag sell and buy orders, with good hit ratios in the Korean market. The authors in [32] represented Japanese candlestick patterns using fuzzy linguistic variables and a knowledge base by fuzzifying both the candle lines and the relationships between candle lines. In [33], a prediction model was suggested for the financial decision system based on fuzzy candlestick patterns. Lee [34] extended this work by creating and using personal candlestick pattern ontologies to allow different users to have their own explanation of a candlestick pattern. Kamo et al. [8,35,36] suggested a model that combined neural networks, committee machines, and fuzzy logic to identify candlestick patterns and generate a market strength weight using fuzzy rules in [35], the type-1 fuzzy logic system in [36], and finally, the type-2 fuzzy logic system in [6].
Naranjo et al. [37] presented a model that used the K-nearest neighbors (KNN) algorithm to forecast the candlestick one day ahead using the fuzzy candlestick representation. Naranjo et al. [38] fuzzified the gap between candles and added it as an extended element in candlestick patterns. However, Japanese candlesticks can carry contradictory information due to the market's noise [38]. More recently, the Heikin-Ashi technique has modified the traditional candlestick chart to reduce the noise, eliminate the gaps between candles, and smooth the movement of the market, letting traders focus on the main trend. The Heikin-Ashi graph is not only more readable than traditional candles, but is also a real trading system [10].
In general, most existing fuzzy time series forecasting models follow fuzzy rules according to the relationships between neighboring states without considering the inconsistency of fluctuations for a related period [38][39][40]. This paper proposes a new perspective to study the problem of prediction, in which inconsistency is quantified and regarded as a key characteristic of prediction rules by utilizing a combination of the cloud model, Heikin-Ashi candlesticks, and fuzzy time series (FTS) in a unified model that can represent both fluctuation trend and fluctuation consistency information.

Proposed Model
The purpose of the study is to predict and confirm future stock trends accurately, because existing approaches offer insufficient levels of accuracy and certainty. There are many problems in previous studies. The main problems in the data are the uncertainty, noise, non-linearity, non-stationarity, and dynamic nature of stock prices in time series. Many models are used for prediction. Statistical methods such as the ARMA family are fitted through trial-and-error iterations. Traders also face problems that include predicting the stock price every day, finding the reversal patterns of the stock price, the difficulty of tuning model parameters, and finally, the gap that exists between prediction results and investment decisions. Additionally, traditional candlestick patterns have problems such as the ambiguous definition of the patterns themselves and the large number of patterns.
In order to deal with the above problems, the suggested prediction model uses both the cloud model and Heikin-Ashi (HA) candlestick patterns. Figure 5 illustrates the main steps of the suggested model, which include preparing historical data, HA candlestick processing, representing the HA candlestick using the cloud model, forecasting the next day price (open, high, low, close) using cloud-based time series prediction, formalizing the next day HA candlestick features, and finally, forecasting the trend and its strong patterns. The following subsections discuss each step in detail [9].

Step 1: Preparing the Historical Data
The publicly available stock market datasets, which contain historical data on the four price time series for several companies, were collected from Yahoo (http://finance.yahoo.com). The dataset specifies the "opening price, lowest price, closing price, highest price, adjusted closing price, and volume" against each date. The data were divided into two parts: the training part and the testing part. The training

Step 2: Candlestick Data
The first stage in stock market forecasting is the selection of input variables. The two most common types of features used for predicting the stock market are fundamental indicators and technical indicators. The suggested model used technical indicators derived from candlestick patterns, namely the open, close, low, and high prices, to try to find future stock prices [5,6]. A standard candlestick pattern is composed of one or more candlestick lines, whereas an extended candlestick (Heikin-Ashi) pattern has one candlestick line. The HA candlestick uses modified OHLC values as candlesticks, calculated as in [5].
Herein, each candlestick line has the following parameters: length of the upper shadow, length of the lower shadow, length of the body, color, open style, and close style. The open style and close style are formed by the relationship between a candlestick line and its previous candlestick line. The crisp values of the length of the upper shadow, length of the lower shadow, length of the body, and color play an important role in identifying a candlestick pattern and determining its efficiency. The candlestick parameters are calculated directly as in [9,10], where HaL denotes the length of the body, upper shadow, or lower shadow of the HA candlestick, and the HaCOLOR parameter represents the mean body color of the HA candlestick.
Heikin-Ashi candlesticks are similar to conventional ones, but rather than using opens, closes, highs, and lows, they use average values for these four price metrics. In stock market prediction, data quality is the main factor because the accuracy and reliability of the prediction model depend on it. Any unwanted anomalies in the dataset are known as noise, and outliers are observations that do not obey the general behavior of the dataset. The presence of noise and outliers may result in poor prediction accuracy. The data must be prepared so that it covers the range of inputs for which the network is going to be used. Data pre-processing techniques attempt to reduce errors and remove outliers, hence improving the accuracy of prediction models. The purpose of HA charts is to filter noise and provide a clearer visual representation of the trend. Heikin-Ashi has a smoother look, as it essentially takes an average of the movement [9,10].
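As an illustration of the HA processing step, the following minimal Python sketch applies the commonly used Heikin-Ashi formulas (HaClose as the mean of the four prices, HaOpen as the midpoint of the previous HA body, HaHigh and HaLow as the extremes). The seeding of the first candle, the normalization of lengths by the open price, the WHITE/BLACK naming convention, and the names `Candle`, `heikin_ashi`, and `candle_features` are assumptions made for this example and are not taken from the paper.

```python
from typing import List, NamedTuple


class Candle(NamedTuple):
    open: float
    high: float
    low: float
    close: float


def heikin_ashi(candles: List[Candle]) -> List[Candle]:
    """Convert OHLC candles to Heikin-Ashi candles (commonly used formulas)."""
    ha: List[Candle] = []
    for i, c in enumerate(candles):
        ha_close = (c.open + c.high + c.low + c.close) / 4.0
        if i == 0:
            # Seeding convention (an assumption): midpoint of the first real body.
            ha_open = (c.open + c.close) / 2.0
        else:
            # Midpoint of the previous Heikin-Ashi body.
            ha_open = (ha[-1].open + ha[-1].close) / 2.0
        ha_high = max(c.high, ha_open, ha_close)
        ha_low = min(c.low, ha_open, ha_close)
        ha.append(Candle(ha_open, ha_high, ha_low, ha_close))
    return ha


def candle_features(c: Candle) -> dict:
    """Body and shadow lengths as a percentage of the open price (assumed normalization)."""
    top, bottom = max(c.open, c.close), min(c.open, c.close)
    # Assumed convention: WHITE for a rising body, BLACK for a falling one.
    color = "WHITE" if c.close > c.open else "BLACK" if c.close < c.open else "DOJI"
    return {
        "body": (top - bottom) / c.open * 100.0,
        "upper_shadow": (c.high - top) / c.open * 100.0,
        "lower_shadow": (bottom - c.low) / c.open * 100.0,
        "color": color,
    }
```

Calling `heikin_ashi` on a list of daily candles and then `candle_features` on each result yields the crisp body, shadow, and color values that the next step converts into linguistic variables.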

Step 3: Cloud Model-Based Candlestick Representation
There is no crisp value to define the length of the body and shadows in the HA candlestick; these variables are usually described in imprecise and vague terms. Herein, the cloud model was used to transform the crisp candlestick parameters (HA quantitative values) into linguistic variables that define the candlestick (qualitative values). To achieve this goal, a fuzzy HA candlestick pattern ontology was built that contains [4,8]:
-Candlestick Lines: Four fuzzy linguistic variables, equal, short, middle, and long, were defined to indicate the cloud model of the shadows and the body length. Figure 6 shows the membership functions of the linguistic variables based on the cloud model; the linguistic variable with the maximum μ(x) is the one assigned to a given length. The ranges of body and shadow length were set to (0, p) to represent the percentage of the fluctuation of the stock price. The parameter values of each fuzzy linguistic variable were set as stated in [8]; see [8] for more details regarding the rationale for using these values.
The body color is also an important feature of a candlestick line. It is defined by three terms: Black, White, and Doji. The Doji term describes the situation where the open price equals the close price. In this case, the height of the body is 0, and the shape is represented by a horizontal bar. Body color is defined by Equation (7) [10]. Additionally, the parameter values of each fuzzy linguistic variable were set as stated in [8]. Figure 7 shows the membership function of this linguistic variable based on the cloud model.
In our case, the membership cloud function (forward normal cloud generator) converts the statistical results to fuzzy numbers and constructs the one-to-many mapping model.
The input of the forward normal cloud generator is the three numerical characteristics of a linguistic term, (Ex, En, He), and the number of cloud drops to be generated, N, while the output is the quantitative positions of the N cloud drops in the data space and the certainty degree with which each cloud drop represents the linguistic term. The algorithm in detail is:
-Produce a normally distributed random number En' with mean En and standard deviation He;
-Produce a normally distributed random number x with mean Ex and standard deviation En';
-Calculate the certainty degree $y = \exp\left(-\frac{(x - Ex)^2}{2 (En')^2}\right)$;
-Drop (x, y) is a cloud drop in the universe of discourse; and
-Repeat steps 1-4 until N cloud drops are generated.
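The generator steps can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' MATLAB implementation; it uses the standard certainty-degree formula of the normal cloud model, and the function name `forward_normal_cloud`, the `seed` parameter, and the handling of a zero En' draw are assumptions of this example.

```python
import math
import random
from typing import List, Tuple


def forward_normal_cloud(ex: float, en: float, he: float, n: int,
                         seed: int = 0) -> List[Tuple[float, float]]:
    """Generate n cloud drops (x, y) for the concept described by (Ex, En, He)."""
    rng = random.Random(seed)  # seeded only to make the illustration repeatable
    drops: List[Tuple[float, float]] = []
    while len(drops) < n:
        # Draw En' ~ N(En, He^2).
        en_prime = rng.gauss(en, he)
        if en_prime == 0.0:
            continue  # degenerate draw; skip it to avoid division by zero
        # Draw x ~ N(Ex, En'^2).
        x = rng.gauss(ex, abs(en_prime))
        # Certainty degree of x with respect to the concept.
        y = math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))
        # (x, y) is one cloud drop.
        drops.append((x, y))
    return drops
```

For example, `forward_normal_cloud(ex=0.0, en=1.0, he=0.1, n=500)` returns 500 drops whose x values scatter around Ex and whose certainty degrees lie in [0, 1]; a larger He makes the cloud visibly thicker.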
The expectation value (Ex), located at the center of gravity of the cloud drops, is the central value of the distribution. Entropy (En) is the fuzziness measure of the qualitative concept and describes its uncertainty and randomness. The larger the entropy, the larger the acceptable interval of the qualitative concept, that is, the fuzzier the concept. Hyper entropy (He) is the uncertainty measure of the qualitative concept and describes the dispersion. The larger the hyper entropy, the thicker the shape of the cloud, which shows that the concept is more dispersed [20,21].
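The maximum-membership rule used to pick a linguistic variable can be illustrated with the expected curve of a normal cloud, which depends only on Ex and En. The sketch below is hedged: the (Ex, En) pairs in `CONCEPTS` are hypothetical values chosen only for illustration and are not the parameters of [8] or of Table 1, and treating every concept as a full (rather than half) cloud is a simplification.

```python
import math
from typing import Dict, Tuple

# Hypothetical (Ex, En) pairs, for illustration only; NOT the paper's values.
CONCEPTS: Dict[str, Tuple[float, float]] = {
    "EQUAL": (0.0, 0.2),
    "SHORT": (1.25, 0.5),
    "MIDDLE": (3.0, 0.6),
    "LONG": (8.0, 2.0),
}


def expected_membership(x: float, ex: float, en: float) -> float:
    """Expected curve of a normal cloud: the membership when He is ignored."""
    return math.exp(-((x - ex) ** 2) / (2.0 * en ** 2))


def label(x: float) -> str:
    """Return the linguistic variable whose membership at x is largest."""
    return max(CONCEPTS, key=lambda name: expected_membership(x, *CONCEPTS[name]))
```

With these illustrative parameters, a body length of 1.3 percent is labeled SHORT and a length of 9 percent is labeled LONG.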
-Forecast the next day price (open, high, low, close): In the fuzzy candlestick pattern approach, the measured values are the open, close, high, and low prices of trading targets in a specific time period. The features of the trading target price fluctuation are represented by the fuzzy candlestick pattern. The classification rules of fuzzy candlestick patterns can be determined by the investors or the computer system. In general, using a candlestick pattern approach for financial time series prediction consists of the following steps [21]:
-Partitioning the universe of discourse into intervals: After preparing the historical data and defining the range of the universe of discourse (UoD), the open, high, low, and close prices are each established as a separate price dataset. Then, for each price dataset, the variation percentage between the prices at time t and time t + n is calculated as $((Close_{t+n} - Close_t)/Close_t) \times 100$ to partition the universe of discourse into intervals. Based on the variations, the minimum variation $D_{min}$ and the maximum variation $D_{max}$ are determined, which define $U = [D_{min} - D_1, D_{max} + D_2]$, where $D_1$ and $D_2$ are suitable positive numbers.
-Classifying the historical data to its cloud: The next step determines the linguistic variables represented by clouds (see Figure 8) that describe the degree of variation between the data at time t and time t + n, and defines them as a set of linguistic terms. Table 1 shows the digital characteristics of the cloud membership function (Ex, En, He) for each linguistic term.
-Building the predictive logical relationships (PLR): The model builds the PLRs to carry out the soft inference $A_{t-1} \rightarrow A_t$, where $A_{t-1}$ and $A_t$ are clouds representing linguistic concepts, by searching all clouds in the time series with the pattern ($A_{t-1} \rightarrow A_t$).
-Building the predictive linguistic relationship groups (PLRG): In the training dataset, all PLRs with the same "current state" are grouped into the same PLRG. If $A_1$ is the "current state" of one PLR in the training dataset and there are r PLRs in the training dataset of the form $A_1 \rightarrow A_1$; $A_1 \rightarrow A_2$; ...; $A_1 \rightarrow A_m$, the r PLRs can be grouped into the same PLRG, written as $A_1 \rightarrow A_1, A_2, \ldots, A_m$. Then, the weight elements are assigned for each PLRG. Assume $A_i$ has $n_1$ relationships with $A_1$, $n_2$ relationships with $A_2$, and so on. The weight values (w) can be assigned as $w_i = (\text{number of recurrences of } A_i)/(\text{total number of PLRs})$.
-Calculating the predicted value via defuzzification: The model then forecasts the next-day (open, high, low, close) prices through defuzzification and calculates the predicted value P(t) at time t by the following rules:
Rule 1: If there is only one PLR in the PLRG ($A_1 \rightarrow A_i$), then
\[ P(t) = Ex_i + S(t-1) \tag{8} \]
Rule 2: If there is more than one PLR in the PLRG ($A_1 \rightarrow A_1, A_2, \ldots, A_p$), then
\[ P(t) = \frac{(n_1 \times Ex_1) + (n_2 \times Ex_2) + \cdots + (n_p \times Ex_p)}{n_1 + n_2 + \cdots + n_p} + S(t-1) \tag{9} \]
Rule 3: If there is no PLR in the PLRG ($A_1 \rightarrow \#$), where the symbol "#" denotes an unknown value, then apply Equation (8).
Here, $Ex_i$ is the expectation of the Gaussian cloud $C_i$ corresponding to $A_i$, $n_i$ is the number of times $A_i$ appears in the PLRG, 1 ≤ i ≤ r, and S(t − 1) denotes the observed value at time t − 1.
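The forecasting steps listed above (variation percentage, universe of discourse, PLR grouping, weighting, and defuzzification) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the weights are normalized by the number of PLRs within each group, Rule 1 is treated as the single-successor case of the weighted mean, and Rule 3 is read as falling back to the expectation of the current state.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def variation_percent(prices: List[float], n: int = 1) -> List[float]:
    """Percentage change between time t and t + n for every valid t."""
    return [(prices[t + n] - prices[t]) / prices[t] * 100.0
            for t in range(len(prices) - n)]


def universe_of_discourse(variations: List[float], d1: float,
                          d2: float) -> Tuple[float, float]:
    """U = [D_min - D_1, D_max + D_2]."""
    return (min(variations) - d1, max(variations) + d2)


def build_plrg(labels: List[str]) -> Dict[str, Counter]:
    """Group every PLR (previous label -> next label) by its current state."""
    groups: Dict[str, Counter] = defaultdict(Counter)
    for current, following in zip(labels, labels[1:]):
        groups[current][following] += 1
    return dict(groups)


def plrg_weights(successors: Counter) -> Dict[str, float]:
    """Recurrence count of each successor over the PLRs in the group (assumed reading)."""
    total = sum(successors.values())
    return {state: count / total for state, count in successors.items()}


def predict_next(s_prev: float, current: str,
                 plrg: Dict[str, Counter], ex: Dict[str, float]) -> float:
    """Defuzzify: count-weighted mean of successor expectations plus S(t - 1)."""
    successors = plrg.get(current)
    if not successors:
        # Rule 3 (assumed reading): fall back to the current state's own Ex.
        return ex[current] + s_prev
    total = sum(successors.values())
    # Rules 1 and 2: with a single successor this reduces to Ex_i + S(t - 1).
    return sum(n * ex[state] for state, n in successors.items()) / total + s_prev
```

In use, each variation would first be labeled with its cloud (linguistic term), the label sequence passed to `build_plrg`, and `predict_next` called with the last observed value and its label; `ex` maps each linguistic term to the Ex of its cloud.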

Experimental Results
In order to test the efficiency and validity of the proposed model, the model was implemented in MATLAB. The prototype was built in a modular fashion and was implemented and tested on a Dell™ Inspiron™ N5110 laptop (Dell Computer Corporation, Texas) with the following features: Intel(R) Core(TM) i5-2410M CPU @ 2.30 GHz, 4.00 GB of RAM, and 64-bit Windows 7. A dataset composed of real-time stock series of the NYSE (New York Stock Exchange) was used in the experimentation. The dataset had 13 time series of NYSE companies, each one with the four prices (open, high, low, and close). The time series were downloaded from the Yahoo finance website (http://finance.yahoo.com). Table 2 shows the companies' names, symbols, and starting and ending dates for the selected dataset. The dataset was divided into 2/3 for training and the other 1/3 for testing.
In the proposed forecasting model, the parameters were set as follows. The ranges of body (p) and shadow length were set to (0, 14) to represent the percentage of the fluctuation of the stock price because, for example, the varying percentages of stock prices are limited to 14 percent in the Taiwanese stock market. It should be noted that although we limited the fluctuation of body and shadow length to 14 percent, in other applications the designer can change the range of the fluctuation length to any number [4]. The four parameters (a-d) of the function describing the linguistic variables SHORT and MIDDLE were (0, 0.5, 1.5, 2.5) and (1.5, 2.5, 3.5, 5), respectively. The parameters (a, b) used to model the EQUAL fuzzy set were (0, 0.5). Regarding the two parameters $D_1$ and $D_2$, which are used to determine the UoD, we can set $D_1 = 0.17$ and $D_2 = 0.34$, so the UoD can be represented as [6,8]. Finally, the number of drops in the cloud model used to build the membership function is usually equal to the number of samples in the dataset, in order to describe the data efficiently.
The mean squared error (MSE) and mean absolute percentage error (MAPE), which are used by academicians and practitioners [4,21], were used to evaluate the accuracy of the proposed method. Tables 3-6 show the output of applying each model step for the Yahoo dataset. The suggested model was verified with respect to the RMS on both the training and testing data. The predicted prices of the model were found to be close to the actual prices. There was a clear difference between the MSE values for the training and testing data, showing that the model was overfitting the training data as the error on the training dataset was minimized. The reason for this is that the model was not sufficiently generalized and was specialized to the structure in the training dataset. Cross validation is one possible way to handle overfitting, and multiple runs of cross validation are better still. The model RMS is summarized in Table 7.
Table 8 compares two versions of the suggested model with two standard models. The first version uses the open, high, low, and close prices as the initial prices in the cloud FTS model (Cloud FTS), and the second uses the HaOpen, HaHigh, HaLow, and HaClose prices as the initial prices in the cloud FTS model (HA Cloud FTS). The two standard models are the Song fuzzy time series (FTS) [13,14] and the Yu weighted fuzzy time series (WFTS) [23]. In Song's studies, the fuzzy relationships were treated as if they were equally important, which might not have properly reflected the importance of each individual fuzzy relationship in forecasting. In Yu's study, it is recommended that different weights be assigned to the various fuzzy relationships. From Table 8, the MSE of the forecasting results of the proposed model was smaller than that of the other methods for all datasets. That is, the proposed model obtained a higher forecasting accuracy for stock prices than the Song FTS and Yu WFTS models.
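For reference, the two error measures named in this section follow their usual definitions. The sketch below is a plain Python illustration with hypothetical function names; it assumes that `actual` and `predicted` are equal-length sequences and that no actual value is zero.

```python
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error: the average of the squared forecast errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual) * 100.0
```

Because MSE is expressed in squared price units, its magnitude depends on the price level of each series, which is one reason values differ widely between datasets; MAPE is scale-free and easier to compare across stocks.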
In general, the MSE values changed according to the nature of each dataset. It can be noted from the table that the Wells Fargo dataset yielded the best results in terms of RMS for both the training and testing data. The Wells Fargo dataset is a small dataset (2,313 rows and 12 columns) that is probably linearly separable, so it produced high accuracy. This is more difficult to accomplish with larger data, for which the algorithm produced lower accuracy. One possible explanation of these results is that, compared with standard models that use FTS only, combining FTS with the cloud model helps to automatically produce random membership grades of a concept through a cloud generator. In this way, the membership functions are built from the characteristics of the data, unlike traditional fuzzy-based forecasting methods that depend on the expert. As for the importance of using HA candlesticks with the cloud model for forecasting, the HA candlesticks showed significant features that could identify market turning points and the direction of the trend, which helps improve prediction accuracy.
The last set of experiments was conducted to validate the efficiency of the suggested model compared with the state-of-the-art models listed in Figure 9 using the Taiwan Capitalization Weighted Stock Index (TAIEX). The data used for comparison were obtained from the website https://www.twse.com.tw/ that provided the stock prices prevailing at the NASDAQ stock quotes. As shown in Figure 9, the proposed model can perform effective prediction, with the predicted stock price closely resembling the actual price in the stock market. The MSE of the suggested model was 665.40 compared with 1254.90, 4530.45, and 4698.78 for the other methods, respectively. Clearly, the suggested model had a smaller MSE than the previous methods. One of the reasons for this result is the merging of the cloud model and HA candlesticks, which makes it possible to account for the vagueness and uncertainty of the pattern features based on data characteristics.

Conclusions
In recent years, mathematical and computational models from artificial intelligence have been used for forecasting. Knowledge of future values and the stock market trend has attracted a lot of attention from researchers, investors, financial experts, and brokers. This work analyzed stock trading because its data are highly non-linear, uncertain, and dynamic over time. This paper presented a Japanese candlestick-based cloud model for stock price prediction that aims to minimize investor risk when investing money in the stock market. The proposed work presented an enhanced fuzzy time series forecasting model based on the cloud model and the Heikin-Ashi Japanese candlestick to predict and confirm stock trends. The objective of this model was to handle qualitative forecasting and not only quantitative forecasting. The experimental results showed that HA Cloud FTS and Cloud FTS had a lower average MSE than the other methods used in the literature, which indicates the high accuracy of the proposed model. HA Cloud FTS provided an MSE of 0.779 for the training data and 0.176 for the test data, and Cloud FTS gave an MSE of 0.939 for the training data and 0.240 for the test data. These results mean that the HA Cloud FTS method, which uses the HaOpen, HaHigh, HaLow, and HaClose prices as the initial prices, offers a significant improvement in stock market trend prediction. Future work includes embedding Neutrosophic logic to enhance qualitative forecasting.