Time evolution of companies towards a stable scaling curve obtained from flow diagrams in three-dimensional phase space

Although scaling relations of power law forms have been revealed in a wide variety of complex systems, the origin of scaling relations in real systems remains mostly unsolved. Based on a long tradition in physics, we here explore the phase space dynamics of business companies, whose different measures of size have been found to exhibit multiple scaling relations. Using a large scale dataset of Japanese companies covering over two million companies for over two decades (1994–2015), we compile the data of three measures of size, namely, annual sales, number of employee and number of trading partners. Tracking the historical time evolution of companies, we first show that there exists a stable region that attracts most of the data points in the long time scale, although the company dynamics in the three-dimensional phase space looks random in a shorter time scale. We then elaborate on ‘evolving flow diagrams’ of the averaged motion in the 3D space. Remarkably, the flow diagrams indicate that a 3D curve which represents the scaling relations between the three size measures can be regarded as an approximate attractor of the averaged flows. The results could serve for better modeling and understanding of companies dynamics and our method can be applied to other dynamics such as social and biological phenomena.


Introduction
Scaling relations and scaling laws with non-trivial exponents have been central issues in physics of complex phenomena. Since the non-trivial scaling relation around a critical point was first discovered over a century ago [1], the scaling concept had guided the exploration into critical phenomena [2][3][4][5][6][7]. It was subsequently applied to soft matter physics and led to the discovery of universal features of polymers [8,9] and colloidal aggregates [10]. Meanwhile, scaling relations had also been found for complex systems such as biodiversity in ecological communities [11,12], animal bodies and cells [13][14][15][16], human cities [17][18][19] and business companies [20][21][22][23][24][25]. Recent physical and mathematical studies have attempted to explain scaling relations in such systems and provided plausible theories for animal bodies or cells [26,27] and for human cities [28,29]. All these studies have employed statistical mechanical approach, where macroscopic laws and patterns are explained by assuming approximate laws and dynamics at more fundamental levels.
However, there remains a significant challenge in statistical physics approach to complex phenomena, because the dynamics at fundamental levels has not been sufficiently established for biological, social or economic systems. We assume that this lack of established laws at microscopic levels has led to a plethora of models for such systems. On business companies, for example, models have highlighted several mechanisms or factors as diverse as hierarchical organization [30], stochasticity in competition [31,32], financial [33] or hiring/ firing [34] behaviors, preferential attachment of company units [35], social networks [36] and multiple independent components in companies [35,[37][38][39][40]. Although they often agree with the data at the level of distributions calculated from a snapshot of the systems, these models lack validation by direct observation at microscopic levels, or fail when their microscopic dynamics is explored [41]. The situation is similar in biodiversity research: two classes of models with radically different assumptions on dynamics at microscopic levels both can explain the rank-size distribution of abundance [42,43] and spatial distributions [44,45] of natural species.
During the development of statistical mechanics, there emerged an alternative approach, in which one explores laws and dynamics only in terms of the 'macroscopic' variables that allow direct observation. More specifically, the evolution of complex systems in the phase space of 'macroscopic' variables is studied. This approach dates back at least to Lorenz [46,47], who is also famous for the pioneering work [48] in chaos theory. He proposed the analogue method: trajectories of systems with unknown dynamics could be predicted by collecting the data from the past with configurations similar to the present. By applying this method to the world trade, Cristelli et al [49] recently demonstrated that countries and territories were in either turbulent or laminar zones in the phase space of their GDP and 'fitness' measure in the trading network. This suggests that the methodology in this line could be vital to studies of complex phenomena. We consider this approach to be complementary to the statistical mechanical approach. Although observed scaling properties in complex systems may be explained by the statistical physics models enumerated above, explanation of such properties by the dynamics at macroscopic levels would be valuable: the characterized dynamics could be used to justify or falsify the possible theories on dynamics at microscopic levels. Nevertheless, such an approach to the scaling relations has not been tested yet to the best of our knowledge.
Here we empirically observe how companies evolve into the previously known scaling relations in a threedimensional phase space. Instead of the statistical mechanical approach, we introduce the macro-variables approach. To this aim, we utilize a comprehensive dataset of approximately 2 million Japanese companies accumulated over more than 20 years since 1994. We compile the data of three measures of size from the dataset: (i) monetary size estimated typically by the annual sales, (ii) labor size measured by the number of employees, and (iii) the size of transactional activity which can be characterized by the number of companies with which the focal company has a direct trading relationship. These measures of size were previously studied in the context of scaling relations with scale-invariant fluctuations [24]. Based on this huge dataset, several pieces of evidence are presented which indicate that companies tend to evolve into a single three-dimensional 'scaling curve', on which companies are 'stable' in the sense that their average velocity is close to zero.
The rest of the paper is organized as follows. First, we briefly discuss our compilation of the data. Second, we show that the previously reported multiple two-dimensional scaling relations can be represented by a curve in the three-dimensional phase space. We then inspect the time evolution of companies testing and disproving the hypothesis that it is an anisotropic random walk with constant diffusion coefficients. We further derive evolving flow diagrams from the data to demonstrate that companies move towards the scaling curve on average. Lastly, we discuss the implications of our results and future research directions.

Data compilation
We compile the data utilized here from an exhaustive dataset by a major credit reporter that summarizes the description of Japanese companies in the period of 1994-2015 (COSMOS 2 by Teikoku Databank, Ltd.), available to us in January, 2017. The dataset contains total of 2.415×10 6 companies (1.263×10 6 in yearly average). We filtered out a small fraction of financial companies and governmental organizations whose sales are defined very differently from other ordinary companies. We also removed a very small number of sales data which were recorded more than 8 years after the publication of the financial statement. Additionally, we excluded the sales data where the end of fiscal year changes, because some of them are not likely to be annual sales. Therefore, our final dataset, which consists of totally 2.395×10 6 companies (1.247×10 6 in yearly average), primarily concerns manufacturing, construction or wholesale companies. Here, we have the data of annual sales, s, and the number of employees, ℓ.
We also construct a trading network for every year from the list of trading relationships between companies, to obtain the degree k (i.e. the number of direct trading partners) of every company. The network includes 3.051×10 6 trading links per year on average. This enables us to consider the companies' size in terms of transactional activity in the network of trading partnership [24]. We note that data of the trading network is almost monotonically growing with years, from 1.826×10 6 trading links in 1994 to 3.873×10 6 in 2015.

Three-dimensional scaling curve
We first present a unified picture for the previously reported triple scaling relations between pairs of variables in the three-dimensional phase space of (k, ℓ, s). Watanabe et al [24] demonstrated the triple scaling relations between the number of trading partners, k, the number of employees, ℓ, and the annual sales, s, for middle-and large-sized companies: k s k , µ ℓ , means that the median s for companies with ℓ employees increases as ℓ 1.3 when ℓ is increased. Furthermore, fluctuations around the conditional medians were found to be scale-invariant: the probability of having, for example, double or half the conditional median sales is independent of k and ℓ. In other words, the distributions of ℓ/k 1.0 , s/k 1.3 and s/ℓ 1.3 are independent of the size of companies, and these variables are, in this sense, intensive. Such scaling relations fit our dataset well for all 22 years with slightly different scaling exponents of 1.2 for k-s and ℓ-s scaling (figure 1). This means that when the three-dimensional distribution of companies is projected to a two-dimensional plane by neglecting one of the variables, companies are densely located on a line defined by the scaling relations. This highly suggests that the three-dimensional distribution of companies can be represented by a single curve in the three-dimensional phase space (red line in figure 1), near which the data points are densely located. We hereafter refer to this hypothetical curve as the 'scaling curve'. While the scaling curve is linear in logarithmic scales for middle-and large-sized companies, this does not hold for small-sized companies. Indeed, in the two-dimensional plots of conditional quantiles (figure 1), the scaling exponents are evidently different at the zone of small-sized companies and that of middle-or large-sized ones. Therefore, we hereafter refer to these two different regimes of scaling as the small-company and largecompany regimes, considering them separately.

Alignment of companies towards a curve
Here we first visually inspect the data to obtain an intuitive view of the three-dimensional dynamics of companies. As in the previous section, we regard a company as a point in the three-dimensional space of the logtransformed measures of size (K, L, S) ≡ (log k, log ℓ and log s). Historical changes of size measures of companies, initially near a point in the space, are demonstrated in figure 2. Historical changes of total of 6792 companies near (k, ℓ, s)=(10, 10, 500) are tracked to show the representative evolution in the phase space (see figure 2). It is evident from the figure that companies are not evolving isotropically in this 3D representation. Rather, there seems to exist a curve along which companies are most likely to move.
Applying principal component analysis (PCA) on the data can further reveal whether the evolving dynamics of companies can be modeled as an anisotropic random walk with constant diffusion coefficients. If we assume the dynamics as direction-dependent yet time-or location-independent diffusion from a specific point, two consequences follow. First, the characteristic direction along which the points are most widely distributed does not change with time. Second, the three-dimensional shape of point distributions at different times are similar.
Both conclusions can be conveniently tested with PCA. Indeed, the first principal component (PC1) resulting from PCA specifies the characteristic direction along which the data are most widely distributed, whereas the second principal component (PC2) indicates the second characteristic direction of the distribution after it is projected into the plane perpendicular to PC1. The procedure is repeated until all the variances are accounted. Note that the principal components (PCs) are perpendicular to each other. Stability of characteristic directions can thus be evaluated by the stability of PCs, while similarity of distributions can also be tested by the stability of ratios between the variances along the PCs.
To further characterize the apparent anisotropic moves, we apply PCA on two different groups of companies starting in 1994 from different points in the three-dimensional space, x k s , , 10, 10, 500 ( ) denote the unit PCi vectors in the year t for companies starting from x 1 and x 2 , respectively. Two results emerge from the analysis. First, the PC unit vectors turn to be remarkably stable across years (see figure 3). We measure the differences between the principal component vectors in different years by the angles between them, u u u u , arccos Overall, the differences from the final year, u u , ,2015 q +D ( ) , are very small (θ =π/2) and become almost negligible 5 years after the initial year, as seen in figures 3(a) and (b). Therefore, we conclude that the characteristic directions do not change over time. Second, however, there is a clear sign of deviation from a random walk. The variance along PC2 and PC3 almost reaches a plateau 10 years after the initial year, while variance along PC1 increases linearly (figures 3(c) and (d)), which is enough to reject the random walk hypothesis. This indicates that fluctuations along PC1 of the distribution may be modeled as a random walk, but fluctuations perpendicular to PC1 reach a steady state. It thus suggests that there exists a Langevin-like 'field' that drives back the randomly deviating companies into a line parallel to PC1, as seen in figure 2(f) with the red dashed central line.

Evolving flow diagrams
Considering the results in the previous section, it is natural to hypothesize that companies that are distant from the scaling curve have the tendency of flow towards this curve. To support this hypothesis, we elaborate on what we refer to as the 'evolving flow diagrams' by plotting the estimated vector field of the annual growth in the three . The angle θ is relatively small compared to π/2 in the last 15 years. In panels (c) and (d), the mean squared variation (variance) along the respective principal component is plotted against the year. PC1 variance is increasing linearly with time, while the second and third ones (PC2 and PC3) almost reach a plateau 10 years after the initial year. This suggests that companies do diffuse in the PC1 direction whereas the diffusion in the PC2 and PC3 directions is constrained. Panels represent the results for 6792 companies of the initial size (k, ℓ, s)≈(10, 10, 500) (panels (a) and (c)) and 26509 companies of the initial size (k, ℓ, s)≈(3, 3, 100) (panels (b) and (d)).
dimensional space of k, ℓ and s. This is inspired by previous work on the prediction of countries' economic growth [49], where the authors advocate the applicability of analogue methods [46,47] to economic systems.
We first estimate the average growth rates of size measures, k, ℓ and s, at a specific point (k 0 , ℓ 0 , s 0 ) in the three-dimensional phase space. We do this by collecting the data sufficiently near the point and computing the arithmetic mean of log-transformed growth rates. Let us define d k s , , )as the Euclidean distance of company c from a fixed point (k 0 , ℓ 0 , s 0 ) in the previous year t−1 in the logarithmically scaled space: We sample the companies within the Euclidean radius d log <log[10 1/8 ] and eventually get the Nadaraya-Watson estimate with the kernel function of rectangular pulse [50]. Nevertheless, when the resulting sample size is less than 200, the threshold of d log is enlarged until the sample number reaches or exceeds 200 to suppress the variability of estimates, therefore employing the 200-nearest neighbor method. Thus, the final estimation of and #(S) represents the number of elements in a set S. Taking the logarithm of growth rates, we can limit their possible ranges of several orders of magnitude within those expected from exponential distributions. This makes the arithmetic mean a more robust estimator of the typical value. Note that the arithmetic mean of logtransformed growth rates is equal to the logarithm of geometric mean of growth rates. We then render two-dimensional diagrams that represent slices of the three-dimensional space, exploiting the method developed in the field of computer graphics [51]. In short, the method follows two steps: (i) placing some points randomly on the plane and drawing streamlines that pass through them, and (ii) repeatedly comparing the original image with a randomly modified one and selecting the one in which the streamlines are placed more homogeneously. Random modifications include inserting, deleting, lengthening, shortening and parallel moving of a streamline.
We show some slices of the vector space in figure 4. Two slices of constant sales (s=10 2.0 and s=10 3.5 million yen, respectively) are shown in the figure. The streamlines with arrows represent the average movement of companies parallel to the slice, while background colors indicate the average flow orthogonal to the slicing plane. For example, one can see the average time evolution of a company of the size (k, ℓ, s)=(10, 100, 100) from figure 4. The point (k, ℓ)=(10, 100) can be specified on the slice of s=100 (see the yellow dot in figure 4(a)). The red background around the dot indicates that the company grew in sales by a factor of over 1.15 on average in a year, while the employees was declining rapidly by a factor of 1.6 and almost no change is expected for the number of trading partners. Note that the data in all years of 1994-2015 are joined together to have the maximal sample size.
Clearest in figure 4 is the mean flows in ℓ (the number of employees) and k (the number of trading partners) towards a point of stability. It can be seen in figure 4 that the point of stability is located near (k, ℓ)= (2,3) in the slice of s=10 2.0 , and near (k, ℓ)=(100, 15) in the slice of s=10 3.5 , respectively (see the red dots in both panels). Deviations from the point are compensated by the moving parallel to the slice, i.e.their simultaneous changes in the trading (k) and employment (ℓ). Therefore, this point marks the steady state of the companies having a constant sale.
It is also seen from comparison of figures 4(a) and (b) that the point of stability shifts in the direction of a large number of trading partners (k) and employees (ℓ) as the fixed value of sale (s) is set to be higher. This means that the points of stability for slices of constant sales constitute a curve in the three-dimensional space, which changes from low values of k, ℓ and s to higher values. We hereafter refer to this curve defined from the mean flows as the stability curve.
We further notice in figure 4 that those points on the stability curve can be characterized by relatively very slow absolute changes in sales: average flows around the points of stability are close to zero in comparison to other zones of the space. Therefore, the stability curve can be regarded as an attractor of the mean flow of companies in the phase space, since the average flows on the curve is almost zero in any direction.
Although visually impressing, the results should be interpreted with some care. First, the estimates are most accurate in zones of small sizes (bottom-left in figure 4) and less accurate in other zones, because of poor statistics due to less numbers of sample companies. In particular, the number of samples is fewest at the edges of bottom-right and top-left, where very few companies exist. When the growth rates are estimated for such points without any company around, these estimates are based on data points on the nearest edge of the distribution. Furthermore, the estimates are biased when there is a gradient of data density: the center of the distribution then has more weight than peripheral regions, so that the effects of moving outward from the center of the Note that our method interpolates the average rates so that one can have estimation for points around which few companies actually exist. A stable point can be seen for both slices, while the stable point is shifted in the high-k and high-ℓ direction with increasing sales. distribution on growth rate are always underestimated. Finally, the continuous increase of trading link data (almost two-fold in the two decades) is likely to add positive bias in the estimate of k flow (the change in number of trading partners), turning the direction of average flows rightward in figure 4. This might be the cause of flows in top-right of figure 4(b) that never reach the stability points but continuously go in the right. However, our main conclusion remains unaffected, as the flows towards the steady state cannot be explained merely by these biases and errors.
Considering the results presented in figures 3 and 4 together, it is natural to hypothesize that the stability curve is almost identical to the scaling curve as shown in figure 2(f). We support this by plotting the evolving flow diagrams for sloped slices parallel to the PC vectors ( figure 5(a)). PCA is here newly applied to all the data joined together from all years, yielding the PC vectors, where u u , K L and u S are the unit vectors in the direction of log k, log ℓ and log s axes, respectively. The differences between these PC vectors and those calculated above for the two groups of companies are small ( u u , 0 .  ) ]. Note that the PC1 value is extensive in the sense that it scales with the company size while the PC2 and PC3 values are intensive just as ℓ/k 1.0 , s/k 1.3 and s/ℓ 1.3 discussed in section 3.
As seen in the diagram, it supports our hypothesis that companies are on average flowing towards the PC1 line that pass through the mean of the distribution (marked by the red dot in figures 5(a) and (b)). On the other hand, companies are almost 'neutrally' stable on the PC1 line: after a small perturbation, a company on the line tends back towards the line, but a displacement parallel to the PC1 line would not be recovered, due to the fact that the flow along the PC1 line is close to zero near the line (see figure 5(b)).
Some non-ideal dynamics of the system is evident in figure 5. First, one would notice the small flow on the PC1 line in the declining direction. This is consistent with the previous studies that have reported negative sales growth of companies on average [52], considering that a large proportion of the companies are located near the PC1 line by definition. Second, the blue-colored area in figure 5(a) is far larger than the red-colored area, meaning that PC2 is generally highly likely to decrease. Since the PC2 value (i.e. −0.88 ln k+0.48 ln ℓ−0.03 ln s) is determined mainly by the number of trading partners (k), this suggests that the number of trading partners (k) of a company tends to increase constantly. This is particularly well shown in figure 4(b): high-k and high-ℓ companies seem to tend towards higher-k states rather than the stable state. We attribute the increase of k to the constantly growing amount of data itself, since adding new companies to the database necessarily leads to increasing the trading relations previously not recognized in the data. Nevertheless, the PC1 line remains an approximate attractor in the dynamical system of the averaged motion, in the sense that companies flow primarily towards the line and the speed of flow is considerably lower on the line. We conclude that the PC1 line, which we consider to be a part of the scaling curve, approximately agrees well with the stability curve in the small-company regime. Note that we obtain similar results for the large-company regime as presented in appendix.

Dissipation rates
Apart from the average flow that we described above, we can also regard the difference in the disappearance or dissipation rate as the origin of a stable distribution of companies in the phase space. Recall that a random walk process has no steady state. In order for a stochastic process to have a steady state, diffusion has to be compensated by some countereffects. In our case, we suppose that if companies disappear more rapidly at the zones far from the scaling curve compared to the zones near it, this difference in the dissipation rate could keep the distribution near the scaling curve against the diffusive effects that lead the system to non-stationary dynamics.
In order to explore our hypothesis, we employ a method slightly modified from the one in the previous section. First, a company is determined to be nonexistent in a year t if there is no record for the company's existence either in the year t or in the following year t+1. We then compute the estimated dissipation rate at each point, k s , , m  ℓ ( ), from the data of all years as follows: where δ c,t is 1 (if the company c does not exist in year t) or 0 (otherwise) and d thr is here set to be log[10 1/4 ]. We plot two slices of the scalar field of dissipation rates, k s , , m  ℓ ( ), into respective heatmaps ( figure 6). The slices are set in the same way as in figure 5. Figure 6(a) clearly shows that m  is higher at zones far from the PC1 line compared to zones near the line. It is also evident that the rate of dissipation is lower at the zones of larger PC1, which is in line with an earlier report that such a rate decreases with company size [53]. The difference in dissipation rate in the PC3 direction also conforms to the expectation in figure 6(b), although m  is almost monotonically increasing with the PC2 value against our hypothesis. Note that a low value of PC2 indicates that the company has more trading partners compared to the average one of those with the same number of employees, therefore suggesting superiority, rather than inferiority, of the company. Note that in similar plots for the large-company regime, however, the tendency of higher dissipation rate at zones far from the scaling curve is more evident (appendix). In conclusion, the difference in the dissipation also plays a role in stabilizing the distribution of companies by keeping them near the scaling curve against random diffusion.

Discussion and conclusion
We have analyzed the dynamics of companies, highlighting the general tendency of companies towards a steady state. As we have seen in figure 5, there exist tendencies in the time evolution leading companies to a 'stability curve' in the three-dimensional space. For small-size companies, the curve is parallel to the PC1 vector of the company distribution ( figure 5(a)). Note that the PC1 vector does not agree with the previously reported scaling exponents, especially with regard to the number of trading partners, k. As the companies are primarily scattered along the first principal component (PC1) direction by definition, equation (1a) indicates that ln ℓ increases by 0.61 on average when ln k increases by 0.30, thus suggesting the square relationship between k and ℓ (ℓ ∝ k 2 ). On the other hand, the relation between k and ℓ has been found to be almost linear (ℓ ∝k) in the previous study [24]. This is probably because of the different data that has been analyzed. In our analysis, the vast majority of companies are the small companies, while the small-size companies have been excluded from the analysis in the previous study. Indeed, when companies of k 10 ⪅ that belong to the small-company regime are excluded (see appendix), the PCA recovers the previously reported scaling relations (see figure A1) and the stability curve is again parallel to the PC1 vector of the truncated company distribution (see figure A2). Therefore, we conclude that the stability curve can be approximated by a single scaling curve (composed of two functions for different regimes) over the whole range.
We are inclined to interpret our results of flows towards the scaling curve as an emergent property. If the scaling relations and the bounded dynamics were the result of a rigid regulation, there could be a clear-cut threshold over which companies are not allowed to exist. On the contrary, the distribution of companies in the three-dimensional space is rather smooth, as revealed in the previous study [24]. Moreover, the dissipation rate is higher at zones far from the scaling curve compared to zones near the curve. This suggests that there are difficulties in staying at the zones distant from the scaling curve, and that the moves towards the scaling curve are adaptive and spontaneous.
It remains an open question what are the origins of the two different regimes of scaling. One possible factor is the discretized values of k and ℓ that might strongly affect an analysis particularly when they are less than 10. Nevertheless, different scaling exponents might also be relevant to differences in actual dynamics of companies. For example, the number of companies having 5-10 trading partners are more than expected from the powerlaw tail of the distribution in our dataset (as seen in [54]). This might suggest that there is a 'barrier' that prevents those in the small-company regime from entering the large-company regime. Existence of such a barrier would be suggestive of different mechanisms actually at work in different regimes.
While all kinds of companies are collectively used in our current analysis, finer-grained analyses are possible in the framework we have presented here. For example, the dynamical similarity between companies of different industrial sectors can be evaluated. Since survival rates of companies are known to be surprisingly similar across different sectors [55], it seems reasonable to hypothesize that the dynamics in the phase space is also similar. Figure 6. Estimated dissipation or disappearance rates for companies on sloped slices defined by principal components (PC). Blue, gray and red background colors represent a low, medium or high dissipation rate, respectively. Each panel represents the same slices as in figure 5: (a) the PC1-PC3 plane and (b) the PC2-PC3 plane. Companies are more likely to disappear when their PC3 value is far from the central red point (i.e. the mean of the distribution) in the PC1-PC2 plane. This is also seen in the PC2-PC3 plane. All the variables (k, ℓ and s) are greater than a unit inside the zones marked by white dashed lines.
Another possible direction is to study the dynamics for companies of different ages. Despite much consideration devoted to the topic, empirical research on whether companies behave differently depending on their age is still lacking.
It is important to note that there are always temporal fluctuations in companies' size measures [20,30,35], which are presumably a dominant factor of company dynamics (diffusion) especially around the stability curve. We speculate that the diffusion effects of the stochastic growth rates are in equilibrium with the average flow discussed above. This might lead to the universal fat-tailed distribution of companies around the scaling relations [24] through a process similar to a random multiplicative process [56]. However, the connection between the common scaling function and the stochastic dynamics is yet to be established.
Our results might support the claim [49] that the analogue method is beneficial in addressing the problem of dynamics which cannot be solved by the standard regression analysis, or more generally, by the statistical analysis of a snapshot of a system. As shown in figure 5(b), low-PC3 companies subsequently tend to increase their PC1 value, while high-PC3 (and high-PC2) ones are likely to decrease it slightly. Therefore, we could expect that the 'natural' size of companies would be larger for low PC3 compared to high PC3, even if their PC1 values are the same. This point cannot be revealed only by the PCA.
Finding the novel stylized fact of agreement between the scaling and stability curves, we believe that our results could be also beneficial to future modeling and theoretical considerations. Although researchers have formulated numerous models [30][31][32][33][34][35][36][37][38][39][40][41], we are not aware of any work that could be used to explain the agreement between the scaling and stability curves demonstrated here. Thus, we suggest that future theoretical studies should incorporate this phenomenological multi-variate evolution of companies.
We expect that similar results would emerge when applying our method to a different set of companies in other areas or countries. Interestingly, different scaling exponents for the s ℓ relation have been found in datasets from different countries and areas. While sales (s) increase superlinearly with employees (ℓ) in the Japanese data, the relation is almost linear in the dataset from the United States of America and sublinear in the one from the People's Republic of China [25]. It would be a promising research direction to carry out comparative studies across different countries and areas. Furthermore, analogous results might emerge when one examine diverse natural, social or economic systems such as city growth [57] or the species-area relationship in ecological systems [12].
Appendix. Scaling and stability curve in large-company regime A.1. PCA with unbiased thresholding Here we discuss the method to obtain the threshold that discriminates between the large-and small-company regimes, so that we can filter out the data points in the small-company regime. We first define a property of data filtering (i.e. self-consistency) in relation to the principal component vectors for the filtered data. We then present an algorithm to search for filtering criteria that have the property. This algorithm gives a series of possible threshold planes as a function of a scalar parameter of company size. In ideal cases, one can find a single 'true' threshold of company size, above which the candidate threshold planes are all placed in parallel to each other.
We define a 'self-consistency' for a data filtering, a x 1, We hereafter refer to the threshold plane (A.1) of a self-consistent criterion as a self-consistent threshold plane. The definition is inspired by the notion of self-consistency for a principal curve along which the data are aligned [58,59]. In our data, the principal curve presumably amounts to the scaling curve (red bold curve in figure 1). Self-consistent threshold planes are all parallel to each other in certain ideal cases. Suppose that the original dataset can be modeled as where x i denotes the vector representing the ith data, v 0 a vector representing the origin, i r i.i.d.random scalar variables, u 0 a vector representing the characteristic direction and e i i.i.d.random vectors that are perpendicular to u 0 . Note that multivariate Gaussian distributions with a non-identity covariance matrix are included in this class of multivariate distributions. Given that the variance of e i along any direction is smaller than the variance of i r , the PC1 vector should be almost parallel to u 0 [58]. In this case, removing the data below a threshold plane perpendicular to the original PC1 vector (i.e. accepting the data with i r larger than a threshold) does not affect the location of the PC1 line, as long as the variance of i r in the new dataset is larger than those of e i . As a result, self-consistent threshold planes are always almost perpendicular to the PC1 vector for the original dataset, and thus almost parallel to each other.
We can determine if the data conform to the model (A.2) based on whether the self-consistent threshold planes are placed in parallel. Assuming that the multivariate company distribution in the large-company regime can be described by the model (A.2) (i.e. 'straightly' distributed as in figure 1), the self-consistent threshold planes should be all parallel to each other above a 'true' threshold.
A series of self-consistent threshold planes for our data is determined by minimizing the difference between the initial threshold plane and a plane that is perpendicular to the PC1 line for the filtered data. In more detail, our procedure is as follows. First, we set an arbitrary threshold plane and define a filtering criterion, a k a a s log log log 1.
Next, we apply PCA to the filtered data and obtain a PC1 line, which passes through the mean of the filtered data distribution. Then, we can determine the plane, a k a a s log log log 1, that is perpendicular to the PC1 line and intersects the PC1 line at a number of trading partners, k targ . We hereafter refer to this parameter, k targ , as the target size of a self-consistent threshold plane. Then, the sum of squared errors, a a i i i 1 , is minimized until it almost reaches zero. The method thus determines a self-consistent threshold plane given the single parameter of company size. Data for all years (1994-2015) are used as the input to the procedure. The method is tried for values of k targ that are evenly spaced in logarithmic scale. We use the Nelder-Mead method [60] as implemented in R (ver.3.5.1) for minimization. The final sum of squared errors after the minimization is less than 10 −10 for all the values of k targ plotted in figure A1.
The principal component vectors for the self-consistently filtered datasets are plotted against the target size, k targ , in figure A1(a). The vector components largely fluctuates at k targ <10, but are remarkably stable at k targ 10. In this zone, the scaling exponents for pairs of variables also approximately agree with the results from the standard regression analysis, when the data of the explanatory variable less than a threshold (set to be The values of components are stable after k targ reaches 10, above which the filtered data are considered to belong to the large-company regime. (b) Scaling exponents calculated as ratios between the vector components are plotted against k targ . When k targ 10, the ratios agree with the exponents from regression analysis applied to the data in largecompany regime (horizontal dashed lines). 100 here) are excluded from the regression. The regression analysis is used here to quantify the previous examination of scaling exponents [24]. The horizontal dashed lines in figure A1(b) indicate the scaling exponents from the standard regression analysis of the pairs of variables, (log ℓ, log k), (log s, log k) and (log s, log ℓ), with the former being explained and the latter being explanatory. These results highly suggest that the self-consistent threshold plane for k targ =10 best discriminates between the small-and large-company regimes.

A.2. Evolving flow and dissipation rates in large-company regime
We here discuss the evolving flow diagrams and the heatmaps of dissipation rates for the large-company regime. To obtain the figures, we apply the same methods as in the main text, except using the PCA results after the selfconsistent data filtering of k targ =10.
As seen in each panel of figure A2, the tendency towards a stable state is also evident in the large-company regime. The stability curve is parallel to the scaling curve (the PC1 line marked by a red line in figures A2(a) and (c)), although the location of the stability curve, especially regarding the PC2 value, differs from the scaling curve. Indeed, as shown in figure A2(b), companies are stable at a PC2 value considerably lower than the average (central red dot).
In figure A2(c), the scaling PC1 line (red solid line) is located differently from the stability line (red dashed line). We consider this to be the consequence of continuously growing amount of the trading network data. Due to this increase, a company without any real change in k typically experience an increase of the trading partners in the dataset. As the normalized PC2 value (i.e. k 0.80 ln 0.55 ln -+ ℓ + s 0.22 ln ln 0.110 + ) is highly affected by the number of trading partners (k), increasing k would decrease the PC2 value without any real change, pushing the stability curve towards a low-PC2 location. We expect that the position of the scaling and the stability curves would be identical if the amount of data were stable.  (panel (a)). Although PC2 is stable a magnitude of order below the mean (panels (b) and (c)), the stability curve is parallel to the PC1 line (panel (c)).
Likewise, the estimated dissipation rates (m ) of companies in the large-company regime are consistent with our conclusion on the smaller-company regime. The rate, m , is typically higher in the zones far from the scaling line compared to those near the scaling line, as illustrated in figure A3. In particular, the dissipation rate gets higher as the PC1 value gets higher (figures A3(a) and (b)) or the PC2 value get more distant from the average (marked by red dot in the center of figure A3(b)). This might help to constrain the multivariate company distribution around the PC1 line. Although the minimum of m  is not located in the exactly same place as the scaling line, the contours of constant m  are almost parallel to the scaling line (the horizontal red line in figure A3(a)). This indicates that m  is almost independent of the PC1 value, which is consistent with the previous observation that the survival rate of publicly traded North American companies are almost independent of their age and business sector [55]. Note that companies with the lowest dissipation rate are located in the zone of a high average sales growth, as shown in figures A2(a) and (b). We consider that this high growth prevents companies from disappearing.