A Data Analytical Study of the Japanese Equity Market over a Lengthy Period

The primary goal is to examine the time series of the Japanese equity market over a lengthy period in order to determine how the series moves and which factors may determine its movements. We study the index of the Tokyo Stock Exchange over nearly four decades using data analytical methods, also known as exploratory data analysis, together with some principal methods of statistical testing. The aim is to put to rest some arguments concerning the applicability of long-range memory modeling and of methods based on serial correlation to these nearly four decades of monthly observations. Earlier studies indicated that Japanese time series of equity prices contain only small percentages of permanent and temporary components. Japanese stock prices are also often characterized by "bubbles" in their movements. The data analytics attempt to validate this phenomenon over the length of this study.

Business and Economics Journal, ISSN: 2151-6219. Citation: Jarrett JE, Li Y (2016) A Data Analytical Study of the Japanese Equity Market over a Lengthy Period. Bus Eco J 7: 241. doi: 10.4172/21516219.1000241. Volume 7, Issue 3, 1000241. BEJ, an open access journal.
Table 1 contains a set of panels denoted by letters A, B, ..., E, each presenting part of the data analysis of the Tokyo Stock Index time series for the nearly forty-year period studied. Note in Panel A that we possess 456 observations with a mean of about 1167 and a standard deviation [...]

Previous studies of Japanese equity markets include Hamao, Mei and Xu [5], who documented changes in the Japanese equity markets as being related to the environment of the Japanese banking system. Regulations in that environment often resulted in "bubbles" within a long-term stagnation of the Japanese equity markets. The particular methods that Japan employed to stimulate and regulate its banking system and general economy resulted in this stagnation and were related to the so-called bubbles in its economy. For example, Bayonne and Collyns [6] documented the era when Japan went from great growth in its asset prices to virtual stagnation, producing the worst crisis in Japan since the end of the Second World War. Furthermore, Ray, Jarrett and Chen [7] produced evidence of both temporary and permanent components in the time series of a sample of listed Japanese equities. That study, using ARFIMA time series methods, identified these components but also indicated some of the great difficulties in predicting prices of Japanese securities. The authors found the temporary component in a sample of 15 individual listed Japanese firms. In addition, listed equities of the Tokyo Exchange contain at most 5 to 15% of permanent components and, thus, there may be only a small amount of predictability in listed equity prices. Nagayasu [8], using the ARFIMA-FGARCH model, studied the efficiency of the Japanese equity market by examining the statistical properties of the return and volatility of the Nikkei 225. He also found some evidence of long-range dependence. This finding is at odds with the efficient market hypothesis (EMH) and is valid for the sample studied and the period of the data. It suggests that Japan's market reform of the early 2000s resulted in no significant efficiency improvements.

Goals of this study
The motivation for this study is the inability of financial forecasters to predict accurately both the direction and the size of change in the principal Japanese equity market. We examine past data on the Japanese equity markets to see what may cause predictions to be inaccurate. We examine the evidence concerning the lack of long memory or serial correlation in a roughly four-decade period of time series data on returns to the Japanese equity market. As in a number of previous studies, we collected data over almost four decades from the PACAP Databases on Japanese equity markets kept at the University of Rhode Island/CBA. The data include the closing prices of the Tokyo Stock Exchange from 1975 up to and including 2012. They are divided into four periods, denoted decade (group) 1, 2, 3 and 4.
The purpose of this study is to determine which factors in the time series of Japanese equity prices may cause this great difficulty in prediction. We propose to study the value-weighted and equal-weighted monthly returns over a lengthy period to explain the inability to predict Japanese equity prices accurately.
*Corresponding author: Jarrett JE, Professor, Management Science and Finance, University of Rhode Island, Kingston, RI, USA, Tel: 4018744169; E-mail: jejarrett133@gmail.com

Data analytics and exploratory data analysis
Graphical exploration of the data reveals a great deal of information about the time series of closing prices. Figure 1 is a time series plot of the closing prices of the Tokyo Stock Exchange and shows the variation in the data by month over the nearly 40 years. The advantage of using monthly data is that it reduces the effect of daily or hourly fluctuations in prices, which would tend to diminish what one learns from the data.
The figure indicates what the earlier studies concluded: Japanese equities grew rapidly until about month 180 and thereafter dropped significantly until the latest periods studied. Naturally, there were significant ups and downs in the closing prices, often referred to as "bubbles" by the previous authors noted before. Stagnation is the term used to describe the variation observed in the data, which reduced the expectations of greater growth in the Japanese economy. If we apply time series decomposition, we observe the results in Figure 2. [Time series decomposition is discussed in Jarrett, 1987 and 1991, and in other sources.]
As one can observe, the multiplicative decomposition model indicates a solid trend in the data over time, but the month-to-month movements vary greatly from that trend. There are obviously cycles in the data and some seasonality as well. Movements do not tend to be long term in the sense of characterizing the data over the entire period studied. Changes occurred and did seem to ignite the stagnation felt in the equity market, with the "bubbles" related to the ups and downs of the market.
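A classical multiplicative decomposition can be sketched as follows. This is a minimal illustration and not necessarily the procedure behind Figure 2: it assumes a 12-month period, a centered moving average for the trend, and the model Y = T × S × I. The function name `multiplicative_decompose` and the synthetic series are hypothetical.

```python
def multiplicative_decompose(series, period=12):
    """Classical ratio-to-moving-average decomposition: Y = T * S * I.

    Assumes an even period (12 for monthly data) and at least two full
    years of observations. Trend is None where the window does not fit.
    """
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        # Centered moving average: the two end points get half weight.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period

    # Average the ratios Y / T separately for each calendar month.
    ratios = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            ratios[t % period].append(series[t] / trend[t])
    raw = [sum(r) / len(r) for r in ratios]

    # Normalize so the seasonal indices average to 1.
    scale = period / sum(raw)
    seasonal = [s * scale for s in raw]

    # Irregular component: what is left after removing trend and season.
    irregular = [
        series[t] / (trend[t] * seasonal[t % period]) if trend[t] is not None else None
        for t in range(n)
    ]
    return trend, seasonal, irregular


# Hypothetical monthly series: flat level of 100 with a repeating seasonal pattern.
pattern = [0.9, 1.1, 1.0, 0.95, 1.05, 1.0, 0.8, 1.2, 1.0, 0.97, 1.03, 1.0]
demo = [100.0 * pattern[t % 12] for t in range(36)]
trend, seasonal, irregular = multiplicative_decompose(demo)
```

On the synthetic series the recovered seasonal indices reproduce `pattern` and the irregular component stays at 1; on real index data the irregular component is where large month-to-month departures from the trend would appear.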
To determine and compare the changes in the Tokyo stock market over a lengthy period, the data collected over the almost four-decade era are analyzed in turn to summarize the information therein. The following set of tables provides the information observed for the Tokyo Stock Index during this period (Tables 1A-1E).
The above Table 1 [...] Kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution. Stated differently, data with high kurtosis tend to have heavy tails, or outliers, while data with low kurtosis tend to have light tails, or a lack of outliers; a uniform distribution would be the extreme case. A normal distribution has a kurtosis of 3, or equivalently an excess kurtosis of 0, which is the scale many statistical packages report. Hence, the data do not appear to be normally distributed. Further, the coefficient of variation is 48.519, indicating that the mean is slightly more than twice the size of the standard deviation. This corroborates the picture one observes after calculating the measures of skewness and kurtosis.
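The shape measures discussed here can be computed with a few lines of code. The sketch below is an illustration under stated assumptions, not the software used for the tables: it uses simple population-moment formulas for skewness and excess kurtosis (statistical packages often apply bias-adjusted versions that differ slightly), and the function name `describe_shape` is hypothetical.

```python
import math


def describe_shape(values):
    """Return (skewness, excess_kurtosis, cv_percent).

    Skewness and kurtosis use population central moments; excess kurtosis
    is 0 for a normal distribution. The coefficient of variation is the
    sample standard deviation expressed as a percentage of the mean.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    sample_sd = math.sqrt(m2 * n / (n - 1))
    cv_percent = 100.0 * sample_sd / mean
    return skewness, excess_kurtosis, cv_percent
```

A negative excess kurtosis indicates lighter tails than the normal distribution, and a coefficient of variation near 50 means the mean is roughly twice the standard deviation.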
Panel B contains the descriptive measures estimated from the data of the Tokyo Stock Index. Note that the mean and median are relatively close in value, i.e., 1167 versus 1134. The mean is larger, indicating that the tail of the distribution lies in the positive direction. The standard deviation and variance are the same values as before. The range is large (2592), and the interquartile range (the third quartile minus the first quartile) is 831.605; it spans the middle fifty percent of the data distribution and will be used later in observing and interpreting the boxplots.
In Panel C, we carry out tests for location to determine whether the mean of the distribution equals zero, i.e., µ = 0 versus µ ≠ 0. We reject the null hypothesis at a p-value of less than 0.0001 using the Student t-statistic, the sign statistic (M) and the signed rank statistic. The latter two tests are done because the assumption of normality underlying the Student t-statistic is probably not valid. Again, this indicates that the mean of the data is not zero and that movement over the nearly four-decade period exists.
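Two of the location tests named above can be sketched in a few lines. This is a minimal illustration, not the package output reported in the tables: the function name `location_tests` is hypothetical, the sign statistic is assumed to follow the convention M = (n+ − n−)/2, and the signed rank statistic is omitted for brevity.

```python
import math
from statistics import mean, stdev


def location_tests(values, mu0=0.0):
    """Return (t_statistic, sign_m, sign_p_value) for H0: location == mu0."""
    n = len(values)
    # Student t statistic: (sample mean - mu0) / standard error of the mean.
    t_stat = (mean(values) - mu0) / (stdev(values) / math.sqrt(n))

    # Sign test: count observations above and below mu0, dropping ties.
    n_pos = sum(1 for x in values if x > mu0)
    n_neg = sum(1 for x in values if x < mu0)
    n_used = n_pos + n_neg
    # Assumed convention for the reported statistic: M = (n+ - n-) / 2.
    sign_m = (n_pos - n_neg) / 2

    # Exact two-sided p-value from the Binomial(n_used, 0.5) distribution.
    k = min(n_pos, n_neg)
    tail = sum(math.comb(n_used, i) for i in range(k + 1)) / 2 ** n_used
    sign_p = min(1.0, 2.0 * tail)
    return t_stat, sign_m, sign_p
```

Because the sign test uses only the direction of each deviation, it does not rely on the normality assumption behind the t-statistic.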
Observe Panel D, where the distribution of the data is displayed in a table from which one can conclude that the data are neither normal nor uniformly distributed around the median. The median is not equidistant from the maximum and minimum, nor is it equidistant from the first and third quartiles. Finally, Panel E produces data on the extremes of the distribution, which corroborate the information observed in earlier panels (Table 2).
Table 2 also contains a set of panels denoted by letters A, B, ..., E, each presenting part of the data analysis of the Tokyo Stock Index time series for the first decade studied. Note in Panel A that we possess 120 observations with a mean of about 498 and a standard deviation of about 150. The skewness coefficient is about 0.4737, indicating a movement over time in the upward direction and a lack of normality in the distribution of the data. The kurtosis coefficient is about -0.2242. Kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution. Stated differently, data with high kurtosis tend to have heavy tails, or outliers, while data with low kurtosis tend to have light tails, or a lack of outliers; a uniform distribution would be the extreme case. A normal distribution has a kurtosis of 3, or equivalently an excess kurtosis of 0, which is the scale many statistical packages report. Hence, the data do not appear to be normally distributed. Further, the coefficient of variation is reported as 48.519, which would indicate that the mean is slightly more than twice the size of the standard deviation, although the rounded mean and standard deviation above imply a value closer to 30. This corroborates the picture one observes after calculating the measures of skewness and kurtosis.
Panel B contains the descriptive measures estimated from the data of the Tokyo Stock Index. Note that the mean and median are relatively close in value, i.e., 498 versus 464. The mean is larger, indicating that the tail of the distribution lies in the positive direction. The standard deviation and variance are the same values as before. The range is large (624.49), and the interquartile range (the third quartile minus the first quartile) is 196.95; it spans the middle fifty percent of the data distribution and will be used later in observing and interpreting the boxplots.
In Panel C, we carry out tests for location to determine whether the mean of the distribution equals zero, i.e., µ = 0 versus µ ≠ 0. We reject the null hypothesis at a p-value of less than 0.0001 using the Student t-statistic, the sign statistic (M) and the signed rank statistic. The latter two tests are done because the assumption of normality underlying the Student t-statistic is probably not reliable. Again, this indicates that the mean of the data is not zero and that movement over the first decade exists.
Observe Panel D, where the distribution of the data is displayed in a table from which one can conclude that the data are neither normal nor uniformly distributed around the median. The median is not [...] (Table 3).
Similar to the above tables, Table 3 contains the data analysis for the second decade of the study. The table contains five panels denoted by letters A, B, ..., E, with every panel giving evidence of the distribution of the Tokyo Stock Index time series over the second decade studied. Note in Panel A that we possess 120 observations with a mean of about 1755 and a standard deviation of about 480. The mean and standard deviation are both larger than in the previous decade. The skewness coefficient is about 0.3004, indicating a movement over time in the upward direction and again a lack of normality in the distribution of the data. The kurtosis coefficient is about -0.6030. As stated earlier, kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution. Stated differently, data with high kurtosis tend to have heavy tails, or outliers, while data with low kurtosis tend to have light tails, or a lack of outliers; a uniform distribution would be the extreme case. A normal distribution has a kurtosis of 3, or equivalently an excess kurtosis of 0, which is the scale many statistical packages report. Hence, the data do not appear to be normally distributed. Further, the coefficient of variation is about 27.3304, indicating that the mean is slightly less than four times the size of the standard deviation. This corroborates the picture one observes after calculating the measures of skewness and kurtosis.
Panel B contains the descriptive measures estimated from the data of the Tokyo Stock Index. Note that the mean and median are relatively close in value, i.e., 1755 versus 1677. These values are both larger than the mean and median found in Table 2. Again, the mean is larger, indicating that the tail of the distribution lies in the positive direction. The standard deviation and variance are the same values as in Panel A, but the range is larger than that found in Table 2 (1950 versus 624.49). The interquartile range (the third quartile minus the first quartile), which spans the middle fifty percent of the data distribution, is almost 736 and is much greater than the corresponding statistic in Table 2. These values will be reflected in the figure on boxplots to be analyzed later.
In Panel C, we carry out tests for location to determine, as before, whether the mean of the distribution equals zero, i.e., µ = 0 versus µ ≠ 0. We reject the null hypothesis at a p-value of less than 0.0001 using the Student t-statistic, the sign statistic (M) and the signed rank statistic. The latter two tests are done because the assumption of normality underlying the Student t-statistic is probably not valid. Again, this indicates that the mean of the data is not zero and that movement over the second decade exists.
Observe Panel D, where the distribution of the data is displayed in a table from which one can conclude that the data are neither normal nor uniformly distributed around the median. The median is not equidistant from the maximum and minimum, nor is it equidistant from the first and third quartiles. Finally, Panel E produces data on the extremes of the distribution, which corroborate the information observed in earlier panels that the distribution is not symmetrical. The lowest extremes are observation numbers 121 to 127, and the largest extremes lie in observations 177 to 181 (Table 4).
To reduce redundancy, we note only the large changes in Decade 3 from the previous decades. Note in Panel A that the mean and standard deviation became more disparate than in the earlier decades, and the coefficient of variation is now only approximately 19.250. This indicates that the mean is now about five times larger than the standard deviation. Any forecasting model would probably have been wrong during this decade, and we can suspect that the distance between the third and first quartiles in the boxplot to be examined later will be much larger. Panel B gives evidence of some symmetry in the distribution of the data, since the mean and median are relatively close in value (1270.400 and 1267.365, respectively). The interquartile range and the range are also not small, but they are not the largest values. In Panel C, the null hypothesis of the mean equaling zero is again rejected at p-values of less than 0.0001. Panel D indicates another phenomenon that differs from the other decades studied. The median is roughly equidistant from the maximum and minimum. The same appears to be roughly true for the distance between the median and the first quartile and the distance between the median and the third quartile. This is amplified by examining the extreme observations in Panel E. The lowest extremes are observations 337 through 341. This appears to corroborate the "bubble" observed by earlier studies. The largest extremes include two observations, 256 and 258, and a trio at 300 through 302. Analyzing data in this way certainly reinforces the conclusions of some earlier researchers who made these observations but did not present an analysis of this type (Table 5).
The fourth period studied includes 96 observations based on the latest data available at the start and finish of this data analysis. We note only the important changes in Period 4, which contains eight years of time series data. Note in Panel A that the mean and standard deviation became more like those of Decades 1 and 2 than those of Decade 3. In Period 4, the coefficient of variation is approximately 31.472. This indicates that the mean is now slightly more than three times larger than the standard deviation. Although the period is not like Decade 3, forecasting models would probably have had difficulty during it, and we can suspect that the boxplot to be examined later will differ from that of Decade 3. Panel C indicates the same results as in all other periods (decades). Panel D shows that the median is no longer equidistant from the first and third quartiles, and the distances from the median to the maximum and minimum are not equal at all. Finally, Panel E indicates that the low extremes are in observations 443 to 552 and the extremes in the highest direction are in observations 375 to 390. These extreme values indicate a wide spread in the data and therefore may also give evidence of "bubbles" in stock prices in the Tokyo market.
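The reading of the coefficient of variation used in these comparisons follows directly from its definition. A short worked equation, assuming the reported values are percentages computed from the standard deviation s and the mean x̄:

```latex
\[
  \mathrm{CV} = 100 \cdot \frac{s}{\bar{x}}
  \quad\Longrightarrow\quad
  \frac{\bar{x}}{s} = \frac{100}{\mathrm{CV}}
\]
\[
  \frac{100}{48.519} \approx 2.06, \qquad
  \frac{100}{27.3304} \approx 3.66, \qquad
  \frac{100}{19.250} \approx 5.19, \qquad
  \frac{100}{31.472} \approx 3.18
\]
```

A smaller coefficient of variation therefore means the mean is a larger multiple of the standard deviation, which is why the value of 19.250 in Decade 3 corresponds to a mean about five times the standard deviation.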
In the next section, boxplots by decade provide additional evidence of how difficult it is to forecast the series, or to rely on data analytics alone for accurate and fully informative diagnostics of the lengthy period studied.

Exploratory data analysis with boxplots
A boxplot [20] is an underrated method of conveying location, variation and skewness for the information contained in a data set. It is often used to detect and illustrate changes in location and variation between different groups in a data set. Figure 3 shows the boxplots of the variable Original (Tokyo Stock Exchange Index) for the four groups (decades or periods), which summarize the data analytics performed in the previous section. Group 1 has the narrowest middle interval and Group 2 the widest, while Group 3 has a middle interval whose first and third quartile limits appear equidistant from the median, with a mean virtually equal to the median. Group 4 has a mean that is much greater than the median, with the third quartile limit being very much greater than the median. There is no doubt that the groups are not like each other, and long-memory models may not be very useful in explaining or forecasting the variation in the data across time (Figure 3).
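The quantities a boxplot displays for each group can be computed directly. The sketch below is illustrative only: the function names are hypothetical, the group sizes (120, 120, 120 and 96 months) are inferred from the observation counts reported in the text, and the quartile interpolation rule ("inclusive") is an assumption that may differ from the one used for Figure 3.

```python
from statistics import quantiles


def split_into_groups(series, sizes=(120, 120, 120, 96)):
    """Cut a series into consecutive groups; the default sizes follow the
    study's three 120-month decades plus a final 96-month period."""
    groups, start = [], 0
    for size in sizes:
        groups.append(series[start : start + size])
        start += size
    return groups


def box_stats(values):
    """Quantities a boxplot draws: quartiles, IQR, mean and 1.5*IQR fences."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "min": min(values),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(values),
        "mean": sum(values) / len(values),
        "iqr": iqr,
        # Points beyond the fences are conventionally drawn as outliers.
        "lower_fence": q1 - 1.5 * iqr,
        "upper_fence": q3 + 1.5 * iqr,
    }
```

Comparing `iqr` across groups corresponds to comparing the widths of the middle intervals, and comparing `mean` with `median` within a group indicates the direction of skew.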

Some additional comments about normality and autocorrelation
Now, we plot the entire data set against a theoretical normal distribution in such a way that the points should form an approximately straight line if the data are normal. When the plot shows departures from this straight line, the observations depart from normality. Again, [20] points out that the normal probability plot is used to answer the question, "Are the data normally distributed?" From Figure 4, the normal probability plot suggests that there are serious departures from normality in the data, especially in the early and latter parts of the time series (Figures 4 and 5).
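The coordinates of a normal probability plot can be generated as in the sketch below. It is a minimal illustration rather than the procedure used for Figure 4: the function names are hypothetical, and the plotting positions (i − 0.375)/(n + 0.25) are one common choice assumed here. The correlation between the two coordinates gives a rough numerical summary of how straight the plot is.

```python
import math
from statistics import NormalDist, mean


def normal_plot_points(values):
    """Return (theoretical_quantiles, ordered_values) for a normal plot."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    # Assumed plotting positions: (i - 0.375) / (n + 0.25).
    theoretical = [
        std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)
    ]
    return theoretical, ordered


def plot_correlation(theoretical, ordered):
    """Pearson correlation of the plotted points; near 1 for normal data."""
    mx, my = mean(theoretical), mean(ordered)
    sxy = sum((x - mx) * (y - my) for x, y in zip(theoretical, ordered))
    sxx = sum((x - mx) ** 2 for x in theoretical)
    syy = sum((y - my) ** 2 for y in ordered)
    return sxy / math.sqrt(sxx * syy)
```

Points that bend away from a straight line at either end of the plot correspond to tails that are heavier or lighter than those of a normal distribution.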
One can observe an additional point in Figure 5 concerning the relationship between the residuals and fitted values. Autocorrelation, also known as serial correlation, is the correlation of a time series with itself at different points in time. From the plot of residuals versus the Originals, one easily observes a pattern indicating that autocorrelation is present. If autocorrelation were not present, the plot would show no pattern, and the Durbin-Watson statistic and other tests would confirm its absence. Although such tests are not shown here, the results are obvious from Figures 4 and 5 [21].
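For reference, the Durbin-Watson statistic mentioned above and the lag-1 autocorrelation can be computed from a residual series as in this short sketch. The function names are hypothetical, and no residuals from the study are reproduced here.

```python
def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2).

    Values near 2 suggest no first-order autocorrelation, values toward 0
    suggest positive autocorrelation, values toward 4 negative.
    """
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def lag1_autocorrelation(series):
    """Sample correlation of a series with itself shifted by one period."""
    n = len(series)
    m = sum(series) / n
    numerator = sum((series[t] - m) * (series[t - 1] - m) for t in range(1, n))
    denominator = sum((x - m) ** 2 for x in series)
    return numerator / denominator
```

A residual series that stays above or below zero for long stretches, which is the kind of pattern described for Figure 5, yields a Durbin-Watson value well below 2.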

Conclusion
The study included a thorough data analysis, using modern analytics and exploratory data analysis, to ascertain aspects of the Tokyo (Japan) Stock Index and to respond to the claims of researchers in previous years. Like those of any stock exchange, Tokyo's stock prices change from period to period and do not fluctuate in the same manner from decade to decade. The ability to predict future prices of Tokyo securities is related to other Asia-Pacific economies as well as to Japan's trading partners in North America, Europe and other parts of the world. The analytics included an exhaustive analysis of the data to determine whether long-memory modeling is recommended for predicting future events and the economic factors that genuinely influence prices on the Japanese equity market. Earlier studies did suggest reasons for the difficulty in predicting Japanese stock prices, including an exhaustive study measuring the permanent and temporary components in the stock prices of individual Japanese firms. The "bubbles" in Japanese security prices reported by others earlier are given greater support by this study. Other Asia-Pacific economies may be subject to activity similar to Japan's, so these lessons may apply when studying other Pacific Basin economies.