The gradual evolution of buyer–seller networks and their role in aggregate fluctuations

Buyer–seller relationships among firms can be regarded as a longitudinal network in which the connectivity pattern evolves as each firm receives productivity shocks. Based on a data set describing the evolution of buyer–seller links among 55,608 firms over a decade and structural equation modeling, we find some evidence that interfirm networks evolve reflecting a firm’s local decisions to mitigate adverse effects from neighbor firms through interfirm linkage, while enjoying positive effects from them. As a result, link renewal tends to have a positive impact on the growth rates of firms. We also investigate the role of networks in aggregate fluctuations.


Introduction
The interfirm buyer-seller network is important from both the macroeconomic and the microeconomic perspectives. From the macroeconomic perspective, this network represents a form of interconnectedness in an economy that allows firm-level idiosyncratic shocks to be propagated to other firms.¹ Previous studies have suggested that this propagation mechanism interferes with the averaging-out process of shocks, and possibly has an impact on macroeconomic variables such as aggregate fluctuations (Acemoglu et al. 2011; Acemoglu et al. 2012; Carvalho 2014; 2007; Shea 2002; Foerster et al. 2011; Malysheva and Sarte 2011). From the microeconomic perspective, a network at a particular point in time is the result of each firm's link renewal decisions, made in order to avoid negative shocks from, or share positive shocks with, its neighboring firms. These two views of a network are related in that both concern the propagation of shocks. The former stresses that idiosyncratic shocks propagate through a static network, while the latter provides a more dynamic view in which firms can choose to renew their link structure in order to share or avoid shocks. What is not clear is how the latter view affects the former. Does link renewal increase aggregate fluctuations because firms form new links that convey positive shocks, does it decrease aggregate fluctuations because firms sever links that convey negative shocks, or does it have a different effect?
It is important to stress that previous research in macroeconomics, as listed above, has implicitly assumed a static link structure in which link renewal does not take place. However, anecdotal evidence suggests that firms may renew their link structure in order to avoid negative shocks and share positive shocks with their neighboring firms. For instance, during the financial crisis of 2008, many banks were reported to have severed their links with poorly performing firms while forming new links with better-performing firms. If such decisions were widespread, shocks would not propagate in the way the previous papers have suggested.
To investigate the trade-off between the propagation of shocks and link renewal, we conduct an empirical analysis of the effect of link renewal on the overall growth rate of an economy. Our analysis is novel in the sense that we take the link renewal aspect of the network explicitly into account. This is done by employing firm-level data instead of sectoral-level data. Due to data availability, we use a firm-level dataset from Japan that contains both network data and the log growth rate of each firm over a decade. We hope that similar results hold for other countries as well.
Using this unique dataset, we use structural equation modeling to estimate the effect of link renewal on the overall growth rate of a network. Our model can be seen as a firm-level variant of the multi-sector model of Long and Plosser (1983), which is canonical in the business-cycle literature. After estimating the structural parameters, where we also discuss the results and identification issues, we estimate the effect of link renewal by performing a counterfactual analysis of the propagation of shocks. Specifically, we first estimate the individual shocks using the estimated structural model, then propagate the shocks back using networks from different years, and compare the consequences. From this exercise, our first result shows that the current network is often the best network configuration, which optimizes both the propagation of positive shocks and the avoidance of negative shocks compared with previous networks. Furthermore, we show that for positive shocks, the future network is often better than the current network in the sense that it propagates positive shocks better than the current network. This is explained by the asymmetry in cost between severing a link and forming one. It is easier to sever an existing link when one's neighbor faces negative shocks than to form a new link, or a new path to distant targeted nodes, in the opposite case. We then provide some evidence that link renewal has the positive effect of increasing the average growth rate of firms, thereby answering the main question of the paper. Finally, by comparing the average log growth rate for each year and the average individual shocks estimated from our model, we show that at least 37% of the aggregate fluctuations can be explained by the network effect.
The rest of the paper is organized as follows. In "Data and notation" section, we summarize the basic notation used throughout the paper. We also offer a brief description of the dataset used in the paper and provide a basic descriptive analysis. "Model" section presents the structural model. "Estimation" section illustrates our inference procedure and presents the estimation results. We also discuss identification issues. In "Counterfactual analysis of propagation of shocks" section, we use the model to perform counterfactual analysis of the propagation of shocks and address the gradual evolution of the network. "Network effect on aggregate fluctuations" section addresses the impact of the interfirm buyer-seller network on aggregate fluctuations. "Conclusion" section concludes.

Data and notation
The network and financial data used in this paper are from the Teikoku Data Bank.² These data are based on questionnaires completed by more than 100,000 firms in Japan for the accounting years 2003 to 2012. We use the subset of these data for which we have both network and financial information throughout the 10-year period (i.e., 55,608 firms). In the questionnaires, firms are asked to name several (up to five) upstream and downstream firms with which they trade. This scheme is akin to the fixed rank nomination scheme used in social network analysis (Hoff et al. 2013).
We define two types of adjacency matrix: downstream and upstream. We denote by $G$ the adjacency matrix describing the downstream network, where the downstream firms are listed in each row. Thus, $G_{ij} = 1$ if and only if firm $i$ reports that firm $j$ buys from firm $i$. $H$ is defined similarly for the upstream adjacency matrix. When necessary, we use subscripts to indicate time points, so the buyer network for accounting year 2012 is denoted by $G_{2012}$. We could combine these two adjacency matrices and create matrices such that $H = G^{T}$ holds using interpolation of links. However, because the data do not include weights (i.e., transaction volumes), spurious links might be formed by this interpolation. To elaborate on this point, suppose that a stationery store sells a considerable number of pencils to firm A, which manufactures cars. From the stationery store's point of view, firm A is a major buyer that determines its sales revenue. However, from firm A's point of view, the stationery store is far less important than the upstream firm from which it purchases automobile parts for use in production. Because in this paper we focus on links that represent strong relationships, we use the raw form without performing any interpolation of relations. It is worth noting that $G$ therefore does not, in general, equal the transpose of $H$. Table 1 summarizes some basic descriptive statistics concerning the log growth rate of firms during the period 2003-2012. The log growth rate is measured by $\log\left(S(t)/S(t-1)\right)$, where $S(t)$ denotes the sales reported in each firm's financial statement. It can be seen that the average log growth rate of firms fluctuates around 0, showing a moderate cycle. As stated previously, because we are using a subset of the data, 55,608 firms were used to calculate the average log growth rate each year. Table 2 summarizes the number of nonzero elements in the two adjacency matrices, as well as their evolution.
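As a minimal sketch of the notation above, the following Python snippet builds small downstream and upstream adjacency matrices from nomination lists and computes a log growth rate. The firm indices, nominations, and sales figures are hypothetical and chosen only to mirror the stationery-store example; they are not taken from the Teikoku Data Bank data.

```python
import math

def adjacency(nominations, n):
    """Return an n x n 0/1 matrix with A[i][j] = 1 iff firm i names firm j."""
    A = [[0] * n for _ in range(n)]
    for i, named in nominations.items():
        for j in named:
            A[i][j] = 1
    return A

def transpose(A):
    """Return the transpose of a square list-of-lists matrix."""
    return [list(col) for col in zip(*A)]

def log_growth(sales_now, sales_prev):
    """Log growth rate log(S(t) / S(t-1))."""
    return math.log(sales_now / sales_prev)

# Hypothetical firms: 0 = stationery store, 1 = car maker, 2 = parts supplier.
downstream = {0: [1], 2: [1]}   # firm i reports that firm j buys from it
upstream = {1: [2]}             # firm 1 names only firm 2 as a supplier

G = adjacency(downstream, 3)
H = adjacency(upstream, 3)

# The store names the car maker as a buyer, but the car maker does not
# name the store as a supplier, so G differs from the transpose of H.
asymmetric = G != transpose(H)
growth = log_growth(110.0, 100.0)
```

With real data of this size, a sparse matrix representation would be the natural choice; dense lists are used here only to keep the example self-contained.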
It can be seen that, except for 2008, the numbers of links formed and severed have shown a steady evolution. It can also be seen that the overall number of links appears to be stable over time. In Fig. 1, we present a contour plot showing the log growth rate of the following year (contour) as a function of the current log growth rate (x-axis) and current size (y-axis) for each firm, where the contour was estimated using two-dimensional splines. It can be seen that above 8.1 billion yen (i.e., exp(9)), there is a clear persistent pattern whereby a positive growth rate tends to be repeated, and vice versa.
One reason that could explain the irregular pattern among the small and medium-sized firms (i.e., the middle left and middle right areas) is the presence of subsidiary firms, which are affected by decisions made by their parent company (e.g., participating in an absorption-type merger or corporate group restructuring). However, even ignoring this part of the data, it can be seen that overall, there seems to be a persistent pattern in the log growth rate of firms.
In Table 3, we show the proportions of positive and negative log growth rates of firms around newly formed and severed links. First-order, second-order, and third-order nodes are defined by the number of steps needed to reach the node from the newly formed or severed link. For the sake of clarity, a schematic diagram showing the first-order, second-order, and third-order nodes is provided in Fig. 2. Bold font in Table 3 indicates the cases where (i) the proportion of positive log growth rate of nodes is higher for newly formed links than severed links or (ii) the proportion of negative log growth rate of nodes is higher for severed links than newly formed links in a given year. It can be seen that for all years, the network tends to form links between nodes experiencing a positive firm-specific idiosyncratic shock (and vice versa). This provides our first insight into the connection between the log growth rate of firms and the link renewal process of the network.
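The kind of tabulation described above can be sketched as follows. This is an illustrative simplification, not the procedure used to build Table 3: it looks only at the two endpoints of each newly formed or severed link (ignoring higher-order neighbors), and the matrices and growth rates are hypothetical.

```python
def link_set(A):
    """Set of (i, j) pairs with A[i][j] == 1."""
    return {(i, j) for i, row in enumerate(A) for j, v in enumerate(row) if v}

def link_changes(A_prev, A_curr):
    """Return (formed, severed) link sets between two consecutive years."""
    prev, curr = link_set(A_prev), link_set(A_curr)
    return curr - prev, prev - curr

def share_positive(links, growth):
    """Share of link endpoints whose log growth rate is positive."""
    nodes = {k for link in links for k in link}
    if not nodes:
        return float("nan")
    return sum(growth[k] > 0 for k in nodes) / len(nodes)

# Hypothetical three-firm example.
G_prev = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]   # link 0 -> 1 exists
G_curr = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]   # link 0 -> 2 replaces it
growth = [0.02, -0.05, 0.04]

formed, severed = link_changes(G_prev, G_curr)
pos_formed = share_positive(formed, growth)     # endpoints {0, 2}
pos_severed = share_positive(severed, growth)   # endpoints {0, 1}
```

In this toy case the share of positively growing endpoints is higher around the formed link than around the severed one, which is the pattern that the bold entries in Table 3 flag.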

Model
The model that we use in this paper is

$$y_t = \beta_G G_t y_t + \beta_H H_t y_t + \beta_{LG} G_{t-1} y_{t-1} + \beta_{LH} H_{t-1} y_{t-1} + \gamma y_{t-1} + \epsilon_t, \qquad (1)$$

where $y_t$ denotes the growth rate of sales of each firm³ and $\epsilon_t$ denotes the normally distributed firm-specific idiosyncratic shock characterized by $\mu$ and $\sigma$. The intuition behind the model is that the log growth rate of a firm can be broken down into three parts: economy-wide plus firm-level idiosyncratic shocks, the lagged effect from the previous year, and the propagation effect from the interfirm buyer-seller network (both simultaneous and lagged). There are seven unknown parameters in total.
Our model can be seen as a firm-level variant of the multi-sector model of Long and Plosser (1983), which is canonical in the business-cycle literature. In Long and Plosser (1983), each sector (i.e., firm) is explicitly assumed to use materials produced by other sectors (i.e., firms), and these sectoral linkages represent interconnectedness in the economy, propagating idiosyncratic sector-specific shocks to other sectors. Previous works have used the multi-sector business cycle model to break aggregate fluctuations down into aggregate economy-wide common shocks and sectoral shocks (Abe 2004; Foerster et al. 2011). These models have been used to shed light on aspects of sectoral growth and business cycles. The goal of this paper is to bring this model to the firm level in order to study the propagation of firm-level idiosyncratic shocks. The difference between the sectoral-level linkages of Long and Plosser (1983) and firm-level linkages lies in the link renewal process among firms. In a sectoral-level setting, if the total demand for goods from other sectors is kept the same, then the strength of the links with other sectors does not change. However, even in this case, the interfirm network structure might differ due to link renewal behavior at the firm level. Our main goal in this paper is to take this link renewal behavior explicitly into account.
The general consensus in macroeconomics has been that sector-specific shocks should average out over the entire economy, based on Lucas's "diversification argument" (Lucas 1977). However, this view has recently been challenged from the network perspective by several authors (Shea 2002; Acemoglu et al. 2012; Acemoglu et al. 2011; Carvalho 2007), who suggest that in the presence of certain sectoral network structures, this argument may not apply. In particular, Acemoglu et al. (2012) have shown that the rate of decay in aggregate fluctuations depends on the network structure governing interdependency among sectors. Our model is closely related to that of Acemoglu et al. (2012), but closer still to that of Shea (2002) in that we model effects from both upstream and downstream linkages. Our work is also related in spirit to Foerster et al. (2011) and Malysheva and Sarte (2011) in providing a systematic econometric analysis of the propagation of shocks and its relationship to aggregate fluctuations. The difference is that while Foerster et al. (2011) and Malysheva and Sarte (2011) focus on sectoral linkages, we focus on micro connections in interfirm networks.

Parameter estimation
Inference of parameters is most easily performed using Bayesian inference (Westveld and Hoff 2011; Goldsmith-Pinkham and Imbens 2013). In our case, this is also due to the heavy computation involved in handling large amounts of network data. Using Eq. (1) and placing conjugate normal priors on $\beta_G$, $\beta_H$, $\beta_{LG}$, $\beta_{LH}$, $\gamma$, and $\mu_0$, and a scaled inverse gamma prior on $\sigma_0$, $y_t$ obeys a multivariate normal distribution whose mean and covariance matrix follow from Eq. (1). To perform maximum likelihood in this setting, it is necessary to calculate the determinant of this covariance matrix, which has size $55{,}608^2$ even when focusing our attention on just one year. The time complexity of calculating this determinant is cubic, making it impractical to evaluate when optimizing the likelihood.⁴ The other term that involves heavy computation is the inverse matrix. We approximated the inverse matrix using the first 30 terms of the Neumann series (or power series) as in Bramoulle et al. (2009).
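The Neumann series approximation mentioned above replaces $(I - M)^{-1} v$ by the truncated sum $v + Mv + M^2 v + \dots$, which converges when the spectral radius of $M$ is below 1. The following Python sketch illustrates the idea on a tiny dense matrix; the matrix, vector, and function names are hypothetical, and a real implementation at the scale of 55,608 firms would rely on sparse matrix-vector products.

```python
def matvec(M, v):
    """Multiply a list-of-lists matrix by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def neumann_solve(M, v, terms=30):
    """Approximate (I - M)^{-1} v by v + M v + ... + M^(terms-1) v."""
    total = list(v)
    power = list(v)
    for _ in range(terms - 1):
        power = matvec(M, power)                       # next term M^k v
        total = [s + p for s, p in zip(total, power)]  # running sum
    return total

# Hypothetical 2 x 2 example with small coefficients, so the series converges.
M = [[0.0, 0.06], [0.05, 0.0]]
v = [1.0, 2.0]
x = neumann_solve(M, v)

# Check: (I - M) x should reproduce v up to the truncation error.
Mx = matvec(M, x)
residual = [xi - mi - vi for xi, mi, vi in zip(x, Mx, v)]
```

Because only matrix-vector products are needed, the full inverse is never formed, which is what makes the approach feasible for large networks.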
The unknown parameters in our model are $\beta_G$, $\beta_H$, $\beta_{LG}$, $\beta_{LH}$, $\gamma$, $\mu_0$, and $\sigma_0$. Bayesian inference was performed with diffuse priors (i.e., normal(0, 100) for the $\beta$s, $\gamma$, and $\mu_0$, and scaled-inverse-gamma(1, 1) for $\sigma_0$), using Gibbs sampling on 10 years of data, which converged quite rapidly. A Markov chain of 10,500 iterations was generated, the first 500 of which were dropped as burn-in steps. We provide a trace plot of $\beta_G$ in Fig. 3. Other parameters converged similarly. Thinning was performed every 10 steps, resulting in 1000 samples, which we used to approximate the joint posterior. Table 4 reports the posterior means of the parameters along with 99% posterior intervals. In general, all the parameters related to network effects are significantly different from 0, suggesting that the network effect is present as both a lagged and a contemporaneous effect. The parameter $\gamma$ being significantly positive implies that there is persistence in firms' log growth rates, as was expected from Fig. 1. The parameter $\mu_0$ being slightly negative corresponds to the fact that the Japanese economy as a whole was shrinking during the period of analysis.
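The post-processing of the chain described above (500 burn-in draws, thinning every 10 steps, 99% interval) can be sketched as follows. The chain here is a stand-in of independent normal draws, not output from the Gibbs sampler used in the paper, and the function name and the simple order-statistic interval are assumptions made for illustration.

```python
import random

def summarize_chain(chain, burn_in=500, thin=10, level=0.99):
    """Drop burn-in draws, thin the rest, and return (n, mean, lower, upper)."""
    kept = sorted(chain[burn_in::thin])
    n = len(kept)
    mean = sum(kept) / n
    tail = (1.0 - level) / 2.0
    lower = kept[int(tail * (n - 1))]                  # lower order statistic
    upper = kept[int(round((1.0 - tail) * (n - 1)))]   # upper order statistic
    return n, mean, lower, upper

# Stand-in chain of 10,500 draws (hypothetical values, for illustration only).
random.seed(0)
chain = [random.gauss(0.06, 0.01) for _ in range(10_500)]
n_kept, post_mean, lo, hi = summarize_chain(chain)
```

With 10,500 iterations, dropping the first 500 and keeping every tenth draw leaves exactly 1000 samples, matching the count reported in the text.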

Identification issues resulting from measurement errors
Although we use the log growth rate in analyzing network effects because of stationarity concerns, log differencing makes each variable noisier. Moreover, sloppy reporting by small and medium-sized firms contaminates the variable with additional measurement errors. Estimation of true regression parameters when all measurements have additional noise was studied by Frisch in the 1930s under the rubric of statistical confluence analysis (Frisch 1934; Hendry and Morgan 1989). Similar to its modern descendant, partial identification (Manski 2009; Tamer 2010), our results show that estimation of the structural parameters ignoring measurement error provides lower bounds on estimates of the true structural parameters. While this argument may seem trivial at first, it is important when we estimate the effects of the interfirm buyer-seller network on aggregate fluctuations in "Network effect on aggregate fluctuations" section. As noted in the Introduction, since our interest is in aggregate fluctuations, we are interested not in each firm's log growth rate, but in the average log growth rate of all firms in an economy in a particular year. Additional zero-mean measurement errors for each firm disappear when we take the average of these growth rates, and thus have no impact on the overall dynamics of the average log growth rate. However, we are trying to estimate the underlying parameters from log growth rates that include additional measurement errors. In this case, our estimated parameters (e.g., the parameter estimates reported in Table 4) would be different from the true structural parameters responsible for generating the aggregate fluctuations in the average log growth rate of firms.
Taking measurement errors into account, our observed log growth rate of firms is generated from a two-equation system: the first equation models the network effect on the true log growth rate as in Eq. (1), and the second adds a measurement error $\eta_t$ to this true log growth rate to give the observed one. Assuming that $\eta$ has mean 0 and a finite first moment, the law of large numbers guarantees that this additional measurement error cancels out in the aggregate. Assuming that both $\epsilon_t$ and $\eta_t$ are normally distributed random variables, there is a simple relationship between the parameter estimates ignoring this additional structure and the true parameters: the two differ by a scaling factor $r$. Hence, our parameter estimates ignoring measurement errors, as in Table 4, give a scaled estimate of the true parameters. This effect is confirmed by the following experiments. We first generate the underlying true log growth rates of firms using the actual network data with $\beta_G = 0.06$, $\beta_H = 0.06$, $\beta_{LG} = 0.04$, $\beta_{LH} = 0.04$, $\gamma = -0.3$, $\mu = 0$, and $\sigma = 0.3$. Then, for each firm, we add additional noise $\eta \sim \mathrm{normal}(0, 0.15)$. Table 5 reports the posterior means of parameter estimates with and without this additional noise. We see that the parameters are scaled as predicted by Eq. (7).
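The scaling effect can be illustrated with a deliberately simplified stand-in for the experiment above: a scalar autoregression with no network terms, to which measurement noise is added. This is the classical attenuation result for a noisy regressor and is not the paper's Eq. (7); the function name, the sample size, and the reading of 0.15 as a standard deviation are assumptions made for this sketch.

```python
import random

def simulate_attenuation(gamma=-0.3, sigma=0.3, noise_sd=0.15,
                         n=200_000, seed=1):
    """Estimate gamma by least squares from noisy observations of an AR(1)."""
    rng = random.Random(seed)
    y_true = 0.0
    prev_obs = None
    sxy = sxx = 0.0
    for _ in range(n):
        y_true = gamma * y_true + rng.gauss(0.0, sigma)   # latent process
        obs = y_true + rng.gauss(0.0, noise_sd)           # observed with noise
        if prev_obs is not None:
            sxy += prev_obs * obs
            sxx += prev_obs * prev_obs
        prev_obs = obs
    return sxy / sxx   # slope without intercept (the series has mean zero)

gamma_hat = simulate_attenuation()

# Classical attenuation factor: signal variance over signal-plus-noise variance.
var_true = 0.3 ** 2 / (1 - 0.3 ** 2)
r = var_true / (var_true + 0.15 ** 2)
predicted = -0.3 * r
```

In this simplified setting the estimate is pulled toward zero by the factor `r`, so the magnitude of the estimated coefficient is smaller than that of the true one, which is the sense in which estimates that ignore measurement error act as a lower bound.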
In summary, the analysis performed in this section has clarified that the estimated structural parameters only provide a lower bound on the true parameters. This is a result of identification issues concerning measurement errors. Hence, our evaluation of the propagation of shocks, performed in the next sections using the estimated parameters, can only be seen as a lower bound on the true level of propagation in an economy.

Counterfactual analysis of propagation of shocks
To assess the nature of the evolving network, we perform a counterfactual analysis of the propagation of shocks, using the following procedure. Using a structural model describing the interfirm buyer-seller network, we estimate the structural firm-specific shocks for year $t$ as

$$e_t = (I - \beta_G G_t - \beta_H H_t)\, y_t, \qquad (8)$$

where $\beta_G$ and $\beta_H$ are parameters, $e_t$ and $y_t$ are vectors, and the rest are matrices. Using these estimates for all firms, we compute a firm's growth in a counterfactual world, assuming that the structure of the network is that of year $t'$ instead of year $t$, by

$$y_{t'|t} = (I - \beta_G G_{t'} - \beta_H H_{t'})^{-1} e_t. \qquad (9)$$

Note that $y_{t|t}$ (i.e., propagating shocks using the network from the same year as the log growth rate) is the same as $y_t$. Comparing $y_{t'|t}$ for different years enables us to ascertain what the log growth rate of firms might have been if the network structure had been that of year $t'$. Moreover, motivated by Table 3, we perform this analysis of evolving networks by separating the estimated $e_t$s into positive shocks (i.e., $e^{\mathrm{pos}}_t$) and negative shocks (i.e., $e^{\mathrm{neg}}_t$), where we set all the values that are not positive in the former case, or not negative in the latter case, to 0. We propagate each of these shocks in the network. Thus, $y_{t'|t}$ is now replaced by $y^{\mathrm{pos}}_{t'|t} = (I - \beta_G G_{t'} - \beta_H H_{t'})^{-1} e^{\mathrm{pos}}_t$ for positive shocks and $y^{\mathrm{neg}}_{t'|t} = (I - \beta_G G_{t'} - \beta_H H_{t'})^{-1} e^{\mathrm{neg}}_t$ for negative shocks. We assume that the structural parameters are fixed and set them as $\beta_G = 0.06$ and $\beta_H = 0.05$. As before, we approximated the inverse matrix using the first 30 terms of the Neumann series (or power series) as in Bramoulle et al. (2009) to speed up calculations. Comparing $y^{\mathrm{pos}}_{t'|t}$ and $y^{\mathrm{neg}}_{t'|t}$ for different years enables us to compare the propagation (avoidance) performance of each network in the face of positive and negative shocks that arrived in year $t$. Figures 4 and 5 show the results of comparing the standard deviation of $y^{\mathrm{pos}}_{t'|t}$ and $y^{\mathrm{neg}}_{t'|t}$ for all years.
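The counterfactual procedure described above can be sketched in a few lines of Python. The sketch assumes that shocks are recovered by applying $I - \beta_G G_t - \beta_H H_t$ to the year-$t$ growth rates and propagated back through the inverse of the same operator built from another year's network, approximated by a 30-term Neumann series. The three-firm networks, growth rates, and function names are hypothetical.

```python
def matvec(M, v):
    """Multiply a list-of-lists matrix by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def weighted_sum(G, H, beta_g, beta_h):
    """Return beta_g * G + beta_h * H for equally sized square matrices."""
    return [[beta_g * g + beta_h * h for g, h in zip(rg, rh)]
            for rg, rh in zip(G, H)]

def estimate_shocks(G, H, y, beta_g=0.06, beta_h=0.05):
    """Shocks e = (I - beta_g G - beta_h H) y for one year."""
    My = matvec(weighted_sum(G, H, beta_g, beta_h), y)
    return [yi - mi for yi, mi in zip(y, My)]

def propagate(G, H, e, beta_g=0.06, beta_h=0.05, terms=30):
    """Approximate (I - beta_g G - beta_h H)^{-1} e by a truncated series."""
    M = weighted_sum(G, H, beta_g, beta_h)
    total, power = list(e), list(e)
    for _ in range(terms - 1):
        power = matvec(M, power)
        total = [s + p for s, p in zip(total, power)]
    return total

# Hypothetical networks for year t and a comparison year t'.
G_t = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
H_t = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
G_tp = [[0, 0, 1], [0, 0, 1], [0, 0, 0]]
H_tp = [[0, 0, 0], [0, 0, 0], [1, 1, 0]]
y_t = [0.03, -0.02, 0.01]

e_t = estimate_shocks(G_t, H_t, y_t)
e_pos = [max(v, 0.0) for v in e_t]   # keep positive shocks, zero the rest
e_neg = [min(v, 0.0) for v in e_t]   # keep negative shocks, zero the rest

y_same = propagate(G_t, H_t, e_t)        # reproduces y_t up to truncation
y_cf = propagate(G_tp, H_tp, e_t)        # growth under the year-t' network
y_cf_pos = propagate(G_tp, H_tp, e_pos)
y_cf_neg = propagate(G_tp, H_tp, e_neg)
```

Because the propagation step is linear in the shocks, the positive and negative counterfactuals add up to the full counterfactual, which gives a convenient sanity check on any implementation.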
It can be seen that the current network is often the best network configuration, which optimizes both the propagation of positive shocks and the avoidance of negative shocks compared with past networks. Furthermore, we see that for positive shocks, the future network is often better than the current network in the sense that it propagates positive shocks better than the current network. We also note that the improvement caused by rewiring the network just after the shock has arrived is higher for negative shocks than for positive shocks. This is quite an interesting result, and is worth elaborating. The main reason is the asymmetry between forming and severing links. Severing a link, and often switching to better (but not necessarily the best) nodes, is easier than forming a link targeting good (if not the best) nodes facing positive shocks. This is because the latter requires additional search costs and negotiation time for the two firms to reach agreement. Further, because of the existence of layers (or a hierarchical structure) in the network, creating a path to distant nodes with which one is unable to form a direct link is a complex task that requires decisions by one's neighbors. For example, if a firm wants to buy automobile parts that use a certain high-quality metal, it has to find an automobile parts manufacturer that uses the metal in their own production or wait until some automobile parts manufacturer starts using the metal in their own production. Given this basic limitation governing the microeconomic link renewal process of firms, link formation can only evolve gradually in response to newly arrived shocks. The view of local rewiring of links is also shared with works in social networks such as (Mele 2010; Krivitsky and Handcock 2014).
If there were a hypothetical social planner that could rewire all the network structures in an economy to an optimal state, the behavior summarized in this section would not take place. However, in reality, microscopic connectivity patterns are determined by each agent's decisions to avoid negative shocks and share positive shocks. These decisions are made based on local information that each firm gathers without having access to the full picture of the global state of the network. Moreover, apart from the fact that firms only have access to local information, there is an asymmetry in cost between forming and severing links, which also contributes to the gradual process of link renewal. The analysis performed in this section provides some insights into the gradual evolution process, suggesting how the decentralized myopic decisions of individual firms gradually lead to an improvement in the overall state of the network.

Network effect on aggregate fluctuations
Using the parameters reported in the previous section, we estimate the role of networks in aggregate fluctuations by comparing the average log growth rate of firms (i.e., $y_t$) and the average shocks for individual firms (i.e., $e_t$). For each year, we calculate the $e_t$s from the structural model, and the average $e_t$ is used as the average shock for individual firms. We also simulate each firm's log growth rate assuming that there was no link renewal during the whole period of study. This is performed by using Eq. (9), setting $t'$ as 2003. The average value of $y_{t'|t}$ is used as the average log growth rate in the counterfactual world in which no link renewal took place during the whole period of study.⁵ Figure 6 shows the results. By comparing the case with link renewal (black circles) and the case without link renewal (blue squares), we see that the average log growth rate shifts downwards when there is no link renewal. This was expected because, as was seen in the previous sections, link renewal has two effects: one mitigates the propagation of negative shocks, and the other shares positive shocks with neighboring firms. In recession periods, link renewal is motivated more by the former process, making the black circles higher than the blue squares (because through link renewal the network succeeded in mitigating negative shocks). In boom periods, link renewal is motivated more by the latter process, also making the black circles higher than the blue squares (because through link renewal the network succeeded in sharing positive shocks). Figure 7 shows the cumulative average log growth rate for each of the cases depicted in Fig. 6. Comparing the case in which link renewal takes place (black circles) with the case in which firms are connected but there is no link renewal (blue squares) in Fig. 7, we see that on average the firm growth rate is 0.0027 higher when there is link renewal.⁶ Hence we conclude that link renewal has the positive effect of increasing the average log growth rate of an economy by effectively mitigating negative shocks and sharing positive shocks among firms.
We next investigate aggregate fluctuations. Comparing the two cases in which firms are not connected (red triangles) and connected (black circles) in Fig. 6, we see that the average log growth rate tends to fluctuate more when firms are connected. It is worth emphasizing that we only have nine data points in the calculation. Nevertheless, the estimated standard deviation of the fluctuation in the unconnected case is 0.023, while that of the original average log growth rate of firms is 0.037. Thus, the network effect on aggregate fluctuations can be calculated as $1 - 0.023/0.037$, which is around 37%. Note that, as discussed in "Estimation" section, the estimated structural parameters provide a lower bound as a result of identification issues concerning measurement errors. Therefore, we conclude that at least 37% of the aggregate fluctuations can be explained by the network effect.⁷ It is also worth noting that this figure is similar to that of Foerster et al. (2011), who studied variability in log growth of the IP index in the United States and showed that, after the great moderation, 50% of the variability in log growth of the IP index could indeed be explained by sectoral linkages.

Conclusion
In order to answer the question concerning the trade-off between the propagation of shocks and link renewal in the interfirm buyer-seller network, we provided an empirical analysis of the effect of link renewal on the overall growth rate of an economy. To this end, we used a firm-level dataset from Japan that contains both network data and the log growth rate of firms over a decade. Using this unique dataset, we used structural equation modeling to estimate the effect of link renewal. By means of counterfactual analysis, we first showed that the current network is often the best network configuration, which optimizes both the propagation of positive shocks and the avoidance of negative shocks compared with previous networks, perhaps reflecting each firm's motivation to avoid others' negative shocks and share others' positive shocks. We then showed that for positive shocks, the future network is often better than the current network in the sense that it propagates positive shocks better than the current network. This asymmetric behavior was explained by the asymmetry in cost between severing and forming links. We then provided some evidence that link renewal has the positive effect of increasing the average growth rate of firms at the macroeconomic level, answering the main question of the paper. Last but not least, as a bonus of our structural equation modeling, we also showed that at least 37% of the aggregate fluctuations can be explained by the network effect. This is in line with previous research that focused on sectoral linkages, such as Foerster et al. (2011).

Endnotes
1 Examples of firm-level idiosyncratic shocks include productivity shocks stemming from successful innovations, the discovery of new export destinations, changes in capacity utilization (including strikes), and supply shocks such as sudden changes in raw material prices. These should not be confused with economy-wide shocks such as inflation, wars, and policy shocks.
2 http://www.tdb.co.jp/index.html.
3 Defined as the difference in the logarithm of sales between two consecutive years.
4 It took about 5-8 h to calculate this term on a modern desktop computer using fully optimized software (Danny et al. 2010).
5 To be more precise, we are assuming that the network stayed as that of year 2003 during the whole period of study, 2004-2012.
6 This is calculated by taking the mean of $y_t - y_{t'|t}$.
7 As might be suspected from Fig. 6, the number changes only slightly when the case with no link renewal is compared to the case in which firms are not connected at all.