Distribution of Labour Productivity in Japan Over the Period 1996-2006

The distribution of labour productivity is investigated by analyzing a longitudinal micro-level data set that contains the detailed financial conditions of a large number of Japanese companies over the period 1996-2006. The authors show that the distribution of labour productivity in both the high and low productivity ranges follows a power law distribution. The generalized beta function of the second kind (GB2), which asymptotically reproduces a power law function, is applied to explain the distribution of labour productivity. By comparing the power law exponents that characterize the high and low productivity ranges, the authors show that for manufacturing industries, inequality in the low productivity range is larger than that in the high productivity range. For the manufacturing industries, the authors also clarify that the change of inequality in the low productivity range has a strong correlation with GDP. In addition, by comparing the power law exponents of the high productivity range in the manufacturing and non-manufacturing industries, the authors show that the inequality of the non-manufacturing industry is higher than that of the manufacturing industry.

in equilibrium must satisfy. In Section 4, we investigate the distribution of labour productivity and apply GB2 to explain it. By using parameters contained in GB2, we consider marginal productivity and the dispersion of both high and low productivity regions. Finally, Section 5 presents conclusions and discussions.

Data Sets
In this article, we use data constructed from two data sets. One is the Nikkei Economic Electronic Database System (NEEDS) sold by Nikkei Media Marketing Inc. (2007), and the other is the Credit Risk Database (CRD) compiled by the CRD Association (2007). NEEDS contains financial and non-financial data on companies listed on stock markets. The number of financial items filed in NEEDS for each company exceeds 800. CRD is a collection of financial and non-financial data on small and medium enterprises, including default information. The total number of companies recorded by CRD is shown by the dashed lines in Fig. 1. The Ministry of Internal Affairs and Communications in Japan reported that 1,529,619 Japanese companies existed in 2004, while CRD contained 985,561 companies for that year. Hence, approximately 65% of Japanese companies were covered by CRD in 2004. This coverage ratio is almost the same over the period 1996-2006. CRD contains 93 financial items for each company, but just under 60% of the companies filled out all of these items. We investigated these companies in this article.
We identified companies included in both NEEDS and CRD, and merged these two data sets so that no company is counted twice. The number of companies, $N_I$, included in the merged data is shown in Fig. 1. Here $I$ identifies each business sector. Figures 1(a) and (b) show the change in the number of companies for the manufacturing and the non-manufacturing industries, respectively. The manufacturing industry comprises food, textile, pulp & paper, chemical & pharmaceuticals, petroleum, rubber, ceramics, steel, nonferrous metals, metal products, machinery, electrical machinery, transportation equipment, precision apparatus, and unclassified manufacturing. The non-manufacturing industry comprises agriculture & forestry & fisheries, mining, construction, energy, transport, information & telecommunications, wholesale, retailing, finance & insurance, real estate, service, and unclassified non-manufacturing.

Marginal Labour Productivity in Equilibrium
Although many definitions of productivity have been proposed by researchers and organizations, we consider labour productivity in this article. We define labour productivity $x$ as value added per employee:
$$x = \frac{Y}{L}, \tag{1}$$
where $Y$ is the value added and $L$ is the number of employees. Although there are many definitions of value added, we use the definition proposed by the Bank of Japan:
$$Y = O + C + I + R + T + D, \tag{2}$$
where $O$, $C$, $I$, $R$, $T$, and $D$ are ordinary gain, labour and welfare expenses, interest expense and discount premium, rent, taxes and public charges, and depreciation expense during the year, respectively. Operating profit $\Pi$ is defined by
$$\Pi = pY - rK - wL, \tag{3}$$
where $p$, $r$, $K$, and $w$ are price, interest rate, capital, and wage rate, respectively. Each company maximizes its profit $\Pi$ by adjusting $L$:
$$\frac{\partial \Pi}{\partial L} = p\,\frac{\partial Y}{\partial L} - w = 0. \tag{4}$$
Thus, marginal labour productivity $\partial Y/\partial L$ satisfies
$$\frac{\partial Y}{\partial L} = \frac{w}{p}. \tag{5}$$
In equilibrium, since there is no arbitrage opportunity for wage rates, the real wage rate $w/p$ is equal for each company. Hence, the marginal labour productivity of the $i$-th company and that of the $j$-th company are equal, i.e.:
$$\frac{\partial Y_i}{\partial L_i} = \frac{\partial Y_j}{\partial L_j}. \tag{6}$$
If we assume the Cobb-Douglas production function given by
$$Y_i = A_i K_i^{\alpha_i} L_i^{\beta_i}, \tag{7}$$
then $\partial Y_i/\partial L_i = \beta_i Y_i/L_i = \beta_i x_i$, and Eq. (6) can be rewritten as
$$\beta_i x_i = \beta_j x_j. \tag{8}$$
If we have sufficiently long time-series data for $Y_{i,t}$, $K_{i,t}$, and $L_{i,t}$, we can estimate $\beta_i$ for each company. Here the subscript $t$ denotes time. In addition, if we have sufficiently long time-series data for $x_{i,t}$, we may regard the averaged value or the modal value of $x_{i,t}$ as $x_i$ in Eq. (8). In that case, we could investigate the validity of Eq. (8) directly. However, estimating $\beta_i$ and $x_i$ in this way is difficult, because we do not have such long time-series data. Hence, we abandon the investigation of the validity of Eq. (8) at the company level and consider its validity at the business sector level. Thus we modify Eq. (8) to
$$\beta_I x_I = \beta_J x_J, \tag{9}$$
where $I$ and $J$ identify the business sectors.
An example of the estimation of $\beta_I$ is shown in Figs. 2(a) and (b), which show the precision apparatus sector in 2003. This sector consists of $N_I = 2{,}872$ companies in 2003. In these figures, each dot corresponds to a company, and the surface is given by
$$\log_{10} Y = \log_{10} A_I + \alpha_I \log_{10} K + \beta_I \log_{10} L, \tag{10}$$
where $A_I = 10^{3.09}$, $\alpha_I = 0.159$, and $\beta_I = 0.890$. Figure 2(a) shows the distribution over the surface and the positive correlation between $\log_{10} K$ and $\log_{10} L$. Figure 2(b), which views the data from a direction parallel to the surface described by Eq. (10), shows that each dot lies near the surface and suggests that the Cobb-Douglas production function is applicable.
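The sector-level fit of Eq. (10) can be illustrated with an ordinary least-squares regression in log space. The following is a minimal sketch, not the authors' procedure: the article does not state which estimator was used, so least squares is an assumption, and the names `fit_cobb_douglas`, `solve_linear_system`, and `synthetic_sector` are hypothetical helpers introduced only for this illustration.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_cobb_douglas(K, L, Y):
    """Least-squares fit of log10 Y = log10 A + alpha*log10 K + beta*log10 L.

    Returns (log10_A, alpha, beta). Assumption: plain OLS on the logs.
    """
    rows = [(1.0, math.log10(k), math.log10(l)) for k, l in zip(K, L)]
    targets = [math.log10(y) for y in Y]
    # Normal equations: (X^T X) theta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    return tuple(solve_linear_system(xtx, xty))


def synthetic_sector(n, log10_A, alpha, beta, noise_sd, seed=0):
    """Generate hypothetical (K, L, Y) data lying near a Cobb-Douglas surface."""
    rng = random.Random(seed)
    K, L, Y = [], [], []
    for _ in range(n):
        log_l = rng.uniform(0.5, 3.5)
        log_k = 1.2 * log_l + rng.gauss(2.0, 0.5)  # K and L positively correlated
        log_y = log10_A + alpha * log_k + beta * log_l + rng.gauss(0.0, noise_sd)
        K.append(10 ** log_k)
        L.append(10 ** log_l)
        Y.append(10 ** log_y)
    return K, L, Y
```

Calling `fit_cobb_douglas` on the output of `synthetic_sector(500, 3.09, 0.159, 0.890, 0.1)` returns estimates close to the generating parameters. The synthetic data only mimic the qualitative picture of Fig. 2 and are not the data analysed in the article.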
To investigate the validity of Eq. (8), we must define $x_I$. Below, we use the modal value of $x_i$ as $x_I$. In the next section, we explain the method to obtain $x_I$.

Figure 2: Distribution of $\log_{10} K$, $\log_{10} L$, and $\log_{10} Y$ of companies belonging to the precision apparatus sector in 2003.

Distribution of Labour Productivity
To obtain the value of $x_I$, we must study the distribution of $x_i$. In addition, by studying the distribution of $x_i$, we clarify the characteristics of labour productivity. An example of the distribution of $x_i$ is shown in Figs. 3(a)-(d), which show the precision apparatus sector in 2003, as in Fig. 2. Figure 3(a) displays a histogram of $x$ in the range $0 < x \le 40000$ and shows that the distribution has a peak around $x \approx 5000$ and a fat tail in the high productivity region. This fat tail behavior is further clarified by considering the distribution of $\log_{10} x$. Figure 3(b) illustrates the histogram of $\log_{10} x$ and shows that fat tails appear on both sides of the distribution.
The behavior of the tails is further clarified by studying the cumulative distribution. Figure 3(c) is a log-log plot of the cumulative distribution $P_<(x)$, defined by
$$P_<(x) = \int_0^x p(y)\,dy \tag{11}$$
in the continuous representation. Here $p(y)$ is a probability density function. This figure shows that $\log_{10} P_<(x)$ in the low productivity region is represented by a linear function of $\log_{10} x$:
$$\log_{10} P_<(x) = \nu \log_{10} x + \log_{10} A, \tag{12}$$
where $\nu$ is the value of the slope. Therefore, the cumulative distribution in the low productivity region is represented by the power law function given by
$$P_<(x) = A x^{\nu}, \tag{13}$$
where $\log_{10} A$ is the intercept.
A complementary cumulative distribution is also defined by
$$P_>(x) = \int_x^{\infty} p(y)\,dy \tag{14}$$
in the continuous representation. Figure 3(d) is a log-log plot of the complementary cumulative distribution, which shows that the complementary cumulative distribution in the high productivity region is also represented by a power law function, given by
$$P_>(x) = A' x^{-\mu}, \tag{15}$$
where $\mu$ is the power law exponent of the high productivity region and $\log_{10} A'$ is the intercept.
Now, we consider the functional form of the distribution of labour productivity. Oulton (1998) pointed out that the distributions of labour productivity in the UK are approximately lognormal. The fit by a lognormal distribution is shown as the dashed line in Fig. 3(a). This figure shows that the distribution is not explained by the lognormal distribution. Hence, we fit the distribution of labour productivity by GB2.
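The log-log plots of the cumulative and complementary cumulative distributions described above can be reproduced from a sample of productivities with a few lines of code. The sketch below is an illustration under stated assumptions rather than the authors' method: it builds the empirical distributions from ranks and estimates the two tail slopes by a straight-line fit on the lowest and highest 5% of observations. The 5% cut-off, the function names, and the deterministic test sample `synthetic_two_tail_sample` are all hypothetical choices made for this example.

```python
import math


def empirical_distributions(xs):
    """Return sorted values with empirical cumulative and complementary
    cumulative probabilities (assumes no ties in xs)."""
    s = sorted(xs)
    n = len(s)
    p_less = [(k + 1) / n for k in range(n)]     # fraction of observations <= s[k]
    p_greater = [(n - k) / n for k in range(n)]  # fraction of observations >= s[k]
    return s, p_less, p_greater


def loglog_slope(xs, ps):
    """Least-squares slope of log10(p) against log10(x)."""
    lx = [math.log10(x) for x in xs]
    lp = [math.log10(p) for p in ps]
    mean_x = sum(lx) / len(lx)
    mean_p = sum(lp) / len(lp)
    sxy = sum((a - mean_x) * (b - mean_p) for a, b in zip(lx, lp))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    return sxy / sxx


def tail_exponents(xs, tail_fraction=0.05):
    """Estimate (mu, nu): the slope of the upper tail of the complementary
    cumulative distribution is -mu, the slope of the lower tail of the
    cumulative distribution is nu."""
    s, p_less, p_greater = empirical_distributions(xs)
    m = max(2, int(len(s) * tail_fraction))
    nu = loglog_slope(s[:m], p_less[:m])
    mu = -loglog_slope(s[-m:], p_greater[-m:])
    return mu, nu


def synthetic_two_tail_sample(n, mu, nu):
    """Hypothetical sample whose lower tail behaves like x**nu and whose
    upper tail behaves like x**(-mu), built from evenly spaced quantiles."""
    sample = []
    for k in range(1, n + 1):
        u = (k - 0.5) / n
        sample.append(u ** (1.0 / nu) * (1.0 - u) ** (-1.0 / mu))
    return sample
```

A straight-line fit to a rank plot is a simple diagnostic and is slightly biased for small tails; it is meant to mirror the visual reading of Figs. 3(c) and (d), not to replace a likelihood-based fit.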

GB2
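GB2 is given by
$$p(x; \mu, \nu, q, x_0) = N_B \left(\frac{x}{x_0}\right)^{\nu-1} \left[1 + \left(\frac{x}{x_0}\right)^{q}\right]^{-(\mu+\nu)/q}, \tag{16}$$
where $\mu$, $\nu$, and $q$ determine the shape of the distribution and $x_0$ determines the scale (Kleiber and Kotz, 2003). Here $N_B$ is the normalization factor given by
$$N_B = \frac{q}{x_0\,B(\mu/q, \nu/q)}, \tag{17}$$
where the beta function $B(r, s) = B(1, r, s)$ is the value at $z = 1$ of the incomplete beta function defined by
$$B(z, r, s) = \int_0^z t^{r-1}(1-t)^{s-1}\,dt. \tag{18}$$
The cumulative distribution and the complementary cumulative distribution of the probability density function $p(x; \mu, \nu, q, x_0)$ are given by
$$P_<(x) = \frac{B(z, \nu/q, \mu/q)}{B(\nu/q, \mu/q)}, \qquad z = \frac{(x/x_0)^q}{1 + (x/x_0)^q}, \tag{19}$$
$$P_>(x) = \frac{B(1 - z, \mu/q, \nu/q)}{B(\mu/q, \nu/q)}. \tag{20}$$
In the limits $x \to 0$ and $x \to \infty$, Eq. (16) behaves as follows:
$$p(x; \mu, \nu, q, x_0) \simeq N_B \left(\frac{x}{x_0}\right)^{\nu-1} \quad (x \to 0), \tag{21}$$
$$p(x; \mu, \nu, q, x_0) \simeq N_B \left(\frac{x}{x_0}\right)^{-\mu-1} \quad (x \to \infty). \tag{22}$$
For $\nu > 1$, Equation (16) is maximum at
$$x_{\max} = x_0 \left(\frac{\nu - 1}{\mu + 1}\right)^{1/q}. \tag{23}$$
To fit the distribution of $x$ by GB2, we calculate the parameter set $(\mu, \nu, q, x_0)$ that maximizes the log-likelihood given by
$$\ln \mathcal{L} = \sum_i \ln p(x_i; \mu, \nu, q, x_0). \tag{24}$$
For the distribution shown in Fig. 3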
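The quantity maximized in the fit, the GB2 log-likelihood, can be evaluated with the standard library alone. The sketch below assumes the GB2 density in the form $p(x) = N_B (x/x_0)^{\nu-1}[1 + (x/x_0)^q]^{-(\mu+\nu)/q}$ with $N_B = q / (x_0 B(\mu/q, \nu/q))$, which is the parametrization whose lower and upper tail exponents are $\nu$ and $\mu$; this form is an assumption of the example, and the function names are hypothetical. The maximization itself is not shown and would be handed to a numerical optimizer of the reader's choice.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b), via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def gb2_logpdf(x, mu, nu, q, x0):
    """Log of the assumed GB2 density at x > 0."""
    log_norm = math.log(q) - math.log(x0) - log_beta(mu / q, nu / q)
    t = math.log(x / x0)
    qt = q * t
    # Numerically stable log(1 + (x/x0)**q) = log(1 + exp(qt))
    log_one_plus = max(qt, 0.0) + math.log1p(math.exp(-abs(qt)))
    return log_norm + (nu - 1.0) * t - (mu + nu) / q * log_one_plus


def gb2_log_likelihood(xs, mu, nu, q, x0):
    """Sum of log densities over the sample xs."""
    return sum(gb2_logpdf(x, mu, nu, q, x0) for x in xs)
```

Working with the log density avoids overflow when $q$ is large, as it is for the fitted values quoted in this article (for example $q = 17.13$ for the machinery sector).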

Applicability of GB2
GB2 is applicable to the distributions of $x$ for many sectors belonging to the manufacturing industry, except the machinery sector. For example, the distribution of $x$ of the machinery sector in 2003 is shown in Figs. 4(a) and (b). The solid lines are GB2 with $(\mu, \nu, q, x_0) = (3.51, 1.29, 17.13, 8299)$. Figure 4(a) shows that the distribution has two peaks, one in the low productivity region and one at $x \approx 8000$. We call these peaks the small peak and the large peak, respectively. From this figure, we recognize that GB2 deviates from the distribution on the left side of the large peak, i.e., in the small $x$ region. However, the distribution on the right side of the large peak, i.e., in the large $x$ region, is well fitted by GB2. This is clarified by the complementary cumulative distribution shown in Fig. 4(b), which is a log-log plot of the distribution; therefore, the value of $\mu$ is reliable even though GB2 cannot explain the distribution in the small $x$ region for the machinery sector.
As in the case of the machinery sector, GB2 is not applicable to the low productivity region in non-manufacturing industries, except for the real estate sector. An example of the distribution of labour productivity in the non-manufacturing industries is shown in Figs. 4(c) and (d), which show the distribution of labour productivity in the service sector in 2003. Figure 4(c) illustrates the histogram of $x$ in the range $0 < x \le 40000$ and shows that there are two peaks in the distribution. This distribution pattern is the same as that of the machinery sector shown in Fig. 4(a). Here, the solid line is GB2 with $(\mu, \nu, q, x_0) = (1.96, 0.83, 14.35, 6737)$. The distribution on the right side of the large peak is well fitted by GB2. However, the distribution on the left side of the large peak is not fitted by GB2. Figure 4(d) displays the complementary cumulative distribution and shows that the high productivity region is well explained by GB2.

Marginal Labour Productivity
As mentioned in the previous section, we use the modal value of $x_i$ as $x_I$. Since it is reasonable to regard the value given by Eq. (23) as the modal value of $x_i$, we can obtain the marginal labour productivity by multiplying $\beta_I$ by the $x_I$ calculated from Eq. (23). The change of $\beta_I x_I$ is shown in Fig. 5. The symbols in this figure are identical to those in Fig. 1(a). Here, we ignore the textile, the petroleum, the machinery, and the unclassified sectors, and the non-manufacturing industry. We obtained $\nu < 1$ for the textile sector in some years; in that case the fitted density diverges as $x \to 0$ and has no interior maximum, so Eq. (23) is not applicable. This is why we ignored the textile sector. The petroleum sector contains a small number of companies, as shown in Fig. 1(a); therefore we excluded this sector from our investigation. As mentioned above, since GB2 is not applicable to the low productivity region for the machinery sector and the non-manufacturing industry, the value of $\nu$ is not reliable. Thus we ignore the machinery sector and the non-manufacturing industry in this study. The unclassified sector is a mixture of many kinds of businesses, so it is meaningless to analyze it as one industry. Figure 5 shows that the marginal labour productivity $\beta_I x_I$ is in statistical equilibrium, and depends on the business sector. In line with Oulton (1998) and the references therein, we believe that labour productivity depends on the business sector. The value of marginal labour productivity is distributed in the range $4 \times 10^3 < \beta_I x_I < 8 \times 10^3$ for many sectors, except for the food sector.
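The computation of $\beta_I x_I$ described above can be sketched in a few lines. The example assumes that the modal value of the GB2 density $p(x) \propto (x/x_0)^{\nu-1}[1 + (x/x_0)^q]^{-(\mu+\nu)/q}$ is $x_0\,[(\nu-1)/(\mu+1)]^{1/q}$, which follows from setting the derivative of the log density to zero and holds only for $\nu > 1$. The function names are hypothetical, and this is an illustration rather than the authors' code.

```python
def gb2_mode(mu, nu, q, x0):
    """Modal value of the assumed GB2 density; defined only for nu > 1."""
    if nu <= 1.0:
        # For nu <= 1 the density has no interior maximum.
        raise ValueError("the GB2 mode requires nu > 1")
    return x0 * ((nu - 1.0) / (mu + 1.0)) ** (1.0 / q)


def marginal_labour_productivity(beta, mu, nu, q, x0):
    """beta * x_mode, the sector-level marginal labour productivity."""
    return beta * gb2_mode(mu, nu, q, x0)
```

The `ValueError` branch corresponds to the textile-sector case mentioned above, where $\nu < 1$ in some years and the mode cannot be used.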

Inequality in the High Productivity Range
The inequality in the high productivity region is quantified by the value of µ. A small value of µ means unequal distribution of labour productivity. The change of µ in the manufacturing industry is shown in Fig. 6(a). The symbols here are the same as those in Fig. 1(a). We ignored the petroleum and unclassified sectors for the same reason as in the case of Fig. 5. Figure 6(a) shows that µ is distributed in the range 2 < µ < 4 and that the food sector has the smallest value over all periods.
The distributions of labour productivity for each sector and each year in the non-manufacturing industry are almost the same as those in Figs. 4(c) and (d). Hence, the distribution in the high productivity region is well fitted by GB2, and we can investigate the value of $\mu$. The change of $\mu$ is shown in Fig. 6(b). The symbols here are the same as those in Fig. 1(b). We ignored the finance & insurance and unclassified sectors, because the finance & insurance sector contains too few companies, as shown in Fig. 1(b), and the unclassified sector is a mixture of many kinds of businesses. Figure 6(b) shows that $\mu$ is distributed in the range $1 < \mu < 3$.
By comparing Figs. 6(a) and (b), we find that the values of $\mu$ in the non-manufacturing industry ($1 < \mu < 3$) are smaller than those in the manufacturing industry ($2 < \mu < 4$). Hence, in the high productivity region, the inequality of labour productivity in the non-manufacturing industry is higher than that in the manufacturing industry. Clarifying why is future work.

Inequality in the Low Productivity Range
The inequality in the low productivity region is quantified by the value of ν. As in the case of µ, a small value of ν denotes unequal distribution of labour productivity. As mentioned above, since GB2 is not applicable to the low productivity region in the non-manufacturing industry, the obtained value of ν is not reliable. However, GB2 is applicable to the low productivity region in the manufacturing industry. The change of ν in the manufacturing industry is shown in Fig. 7(a), where the symbols are the same as those in Fig. 1(a). Here, we ignored the petroleum and unclassified sectors for the same reason as in the case of Fig. 5. This figure shows that ν is distributed in the range 1 < ν < 2. This distribution range of ν is smaller than that of µ. This means that the distribution of the labour productivity in the low productivity region is more unequal than that in the high productivity region.
A characteristic behavior is observed in this figure: $\nu$ increases around the year 2000 and decreases around 2003. This behavior of $\nu$ in Fig. 7(a) resembles the change of GDP over the period 1996-2006, which is shown in Fig. 7(b). By comparing Figs. 7(a) and (b), we expect a strong and positive correlation between $\nu$ and GDP. The correlations between $\mu$ and GDP, and between $\nu$ and GDP, are summarized in Table 1. This result shows that the correlation between $\mu$ and GDP is weak or negative. On the other hand, the correlation between $\nu$ and GDP is strong and positive. Since a small value of $\nu$ denotes an unequal distribution, this means that the inequality of labour productivity in the low productivity region of the manufacturing industry is strongly correlated with GDP and tends to be lower when GDP is higher.
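The correlation between a sector's yearly exponent and GDP can be computed as follows. This is a minimal sketch that assumes the Pearson correlation coefficient; the article does not state which coefficient underlies Table 1, and the function name is a hypothetical choice for this example.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

For one sector, the two arguments would be the eleven yearly values of $\nu$ (or $\mu$) for 1996-2006 and the GDP values for the same years. With only eleven points per series, such coefficients should be read as indicative.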

Conclusion
In this article, the distributions of labour productivity in Japan over the period 1996-2006 are investigated by analyzing micro-level data sets. We applied GB2 to explain the distributions and clarified that the distributions of labour productivity in the manufacturing industry, except the machinery sector, are well explained by GB2. We also defined marginal labour productivity as $\beta_I x_I$ and obtained it for each sector and each year. By comparing the values of marginal labour productivity, we clarified that it depends on the sector. In addition, we obtained the power law exponents $\mu$ and $\nu$ for each sector and each year. Here, $\mu$ represents the inequality in the high productivity region, and $\nu$ represents that in the low productivity region. By comparing $\mu$ and $\nu$, we clarified that the inequality in the low productivity region is larger than that in the high productivity region in the manufacturing industry. We also showed that $\nu$ has a positive and strong correlation with GDP in the manufacturing industry.
The generalized beta function of the second kind (GB2) is also applied to the distribution of labour productivity in the non-manufacturing industry. However, except for the real estate sector, it is not applicable to the full distribution of any sector in the non-manufacturing industry. The existence of two peaks in the distribution, especially the peak in the low productivity region, prevents the application of GB2 to the distribution. However, GB2 is applicable to the high productivity region of all sectors in the non-manufacturing industry. The distribution in the high productivity region is characterized by the power law exponent $\mu$. By comparing the $\mu$ of the manufacturing industry with that of the non-manufacturing industry, we clarified that the $\mu$ of the non-manufacturing industry is smaller than that of the manufacturing industry. This means that the distribution of labour productivity in the non-manufacturing industry is more unequal than that of the manufacturing industry in the high productivity region. Clarifying why is future work.
In this article, we abandoned the application of GB2 to the distribution in the low productivity region of the machinery sector and the non-manufacturing industry. However, as shown in Fig. 8, the distributions in the low productivity region follow a power law. The two panels of this figure are log-log plots of the cumulative distribution $P_<(x)$ for the machinery sector in 2003 and for the service sector in 2003, respectively.
One simple way to replicate this asymptotic behavior is the superposition of two GB2 distributions:
$$p(x; \mu, \nu, q, x_0) = c_1\,p(x; \mu_1, \nu_1, q_1, x_1) + c_2\,p(x; \mu_2, \nu_2, q_2, x_2), \tag{27}$$
where $c_1$ and $c_2$ are superposition constants. Verifying the validity of Eq. (27) is also future work.
In this article, we did not consider why two peaks emerge in the distribution of labour productivity for the machinery sector and almost all of the sectors in the non-manufacturing industry. However, it is natural to expect that the existence of a peak in the low productivity region is related to the size of companies. It is also natural to expect that the peak in the low productivity region is related to the entry barrier of the sector, because many new companies enter the sector if the entry barrier is low. Verifying these assumptions is also a future problem.
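To illustrate how a superposition of two GB2 components can produce two peaks, the sketch below evaluates such a mixture density. It makes several assumptions that are not in the article: the weights are taken as $c_1$ and $c_2 = 1 - c_1$ with $0 \le c_1 \le 1$ so that the result remains a normalized density, each component uses the form $N_B (x/x_0)^{\nu-1}[1 + (x/x_0)^q]^{-(\mu+\nu)/q}$, and the function names and any parameter values passed in are purely hypothetical.

```python
import math


def gb2_pdf(x, mu, nu, q, x0):
    """Assumed GB2 density at x > 0, evaluated via its logarithm."""
    log_beta = math.lgamma(mu / q) + math.lgamma(nu / q) - math.lgamma((mu + nu) / q)
    log_norm = math.log(q) - math.log(x0) - log_beta
    t = math.log(x / x0)
    qt = q * t
    # Stable evaluation of log(1 + (x/x0)**q)
    log_one_plus = max(qt, 0.0) + math.log1p(math.exp(-abs(qt)))
    return math.exp(log_norm + (nu - 1.0) * t - (mu + nu) / q * log_one_plus)


def gb2_mixture_pdf(x, c1, params1, params2):
    """Two-component GB2 mixture with weights c1 and 1 - c1.

    params1 and params2 are (mu, nu, q, x0) tuples for each component.
    """
    if not 0.0 <= c1 <= 1.0:
        raise ValueError("c1 must lie in [0, 1]")
    return c1 * gb2_pdf(x, *params1) + (1.0 - c1) * gb2_pdf(x, *params2)
```

Choosing one component with a small scale and another with a scale near the large peak yields a density with a small peak and a large peak, qualitatively like Figs. 4(a) and (c). Whether such a mixture actually fits the data is, as stated above, left for future work.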
The distribution of labour productivity must be explained by a microscopic model that describes the behavior of each company. It is natural to expect that the model is described by a stochastic process with entries and exits of companies. Therefore, estimating the entry and exit ratio by an empirical study is important. In addition, estimating the growth rate of labour productivity is also important to construct realistic models.

www.economics-ejournal.org