SIZE DISTRIBUTION OF PORTUGUESE FIRMS BETWEEN 2006 AND 2012

This study describes the size distribution of Portuguese firms, as measured by annual sales and total assets, between 2006 and 2012, and gives an economic interpretation of how the distribution evolved over time. Three distributions are fitted to the data: the lognormal, the Pareto (with Zipf as a particular case) and the Simplified Canonical Law (SCL). We present the main arguments found in the literature to justify the use of these distributions, emphasizing the interpretation of the SCL coefficients and its analogy with thermodynamics. Estimation methods include Maximum Likelihood, modified Ordinary Least Squares in log-log scale and Nonlinear Least Squares with the Levenberg-Marquardt algorithm. We apply these approaches to Portuguese firm data. In the sales case, the evolution of the estimated parameters in both the lognormal and the SCL reflects a recession that became more pronounced after 2008.


Introduction
Firm size distributions have been studied for a long time. Work in this area has concentrated on several distributions: the lognormal, the power law (in particular Zipf) and the Simplified Canonical Law (SCL), building on classical studies by, for instance, Pareto, Zipf, Gibrat and Mandelbrot. The lognormal and power distributions can both be justified from Gibrat's law; in the power case a constraint must be added, for instance on the minimum size of firms. As for the SCL, first presented by Mandelbrot [1] in the context of linguistics, the argument used to deduce the distribution of word frequencies in texts is adapted to the distribution of firm sizes. This is done by defining production units, which can be aggregated to constitute firms, so that the resulting network can be used flexibly to satisfy the most diverse needs of the economy at the lowest cost, while reducing as much as possible the delay in doing so. For this distribution, the different interpretations of the estimated parameters found in the literature are compared and discussed in the light of the data. This paper also presents a correction to the expression of the normalization parameter. Using this interpretation, we attempt to characterize the rhythm of Portuguese economic activity, the diversity of activity sectors and competition. Diversity is captured by an entropy measure at different levels of aggregation, and not just at the highest level as is usually done in the literature. The interrelations between these economic characteristics and others, such as the level of concentration of the firm size distribution, are collected from the literature and compared. We depart from the traditional literature by adding a statement on the possible reduction of firm size concentration in times of recession. Analyzing entropy at different levels of aggregation makes it possible to verify how the evolution of the activity rhythm can have a distinct impact on diversity at those levels.
The rest of this paper is organized as follows. Section 2 reviews existing approaches to the formulation of firm size models. Section 3 discusses the interpretation of the coefficients of the Simplified Canonical Law. Section 4 presents and discusses an empirical application to the distribution of Portuguese firms over the period 2006 to 2012. Section 5 presents the main conclusions and topics for future research.

Firm sizes model formulation and existing approaches
Gibrat [2] argued, in a first model formulation for firm sizes, that it is reasonable to assume the Law of Proportional Effect, that is, firm growth is at each moment given by a random value, with finite variance, which is independent of the firm size at that moment. Hereafter we denote the firm size variable by $S$. As a consequence of this assumption, he concludes that firm size should follow a lognormal distribution. In fact, denoting by $S_t$ the firm size at moment $t$ and by $\eta_t$ the corresponding rate of growth, we have $S_t = \eta_t S_{t-1}$; then, taking logarithms, $\ln S_t = \ln \eta_t + \ln \eta_{t-1} + \cdots + \ln \eta_1 + \ln S_0$.
Since the terms $\ln \eta_i$, $i = 1, \ldots, t$, are independent and identically distributed, the Central Limit Theorem implies that $\ln S_t$ follows approximately a Gaussian distribution and $S_t$ a lognormal one (see Sutton [3] and Steindl [4]).
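The mechanism described above can be illustrated with a small simulation. The following is a minimal sketch, not the estimation code used in this study: the function names, the choice of uniform growth factors and the sample dimensions are all hypothetical and serve only to show that multiplicative growth with non-lognormal shocks still yields approximately Gaussian log sizes, whose parameters can then be estimated by maximum likelihood.

```python
import math
import random
import statistics


def simulate_gibrat(n_firms, n_periods, low=0.9, high=1.12, seed=0):
    """Simulate S_t = eta_t * S_{t-1} with S_0 = 1 for every firm.

    The growth factors eta_t are i.i.d. Uniform(low, high), an arbitrary
    (hypothetical) choice that is deliberately not lognormal.
    """
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_firms):
        size = 1.0
        for _ in range(n_periods):
            size *= rng.uniform(low, high)  # drawn independently of size
        sizes.append(size)
    return sizes


def lognormal_mle(sizes):
    """Maximum likelihood estimates (mu, sigma) for a lognormal sample."""
    logs = [math.log(s) for s in sizes]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs, mu)  # divides by n, as the MLE does
    return mu, sigma


def skewness(values):
    """Sample skewness; close to 0 for approximately Gaussian data."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values, m)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


if __name__ == "__main__":
    sizes = simulate_gibrat(n_firms=2000, n_periods=100)
    mu, sigma = lognormal_mle(sizes)
    print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
    print(f"skewness of ln S = {skewness([math.log(s) for s in sizes]):.3f}")
```

A skewness of the log sizes near zero is only a rough indication of normality; a formal goodness-of-fit test would be needed to go further.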
Another class of distributions used to model firm sizes, in particular in the upper tail, is that of power laws: $P(S > s) \propto s^{-k}$ (1), where $k \in \mathbb{R}$. These were introduced by Pareto [5] to describe the personal income distribution (in fact, Pareto also considered a distribution function more general than both the power law and the SCL, although without any justification for it besides good fit). This relation may be expressed in an alternative form, namely the rank-frequency plot: $S(r) \propto r^{-A}$ (2), where $r$ is the rank of the firm when firms are ordered by size, starting with the largest, and $A \in \mathbb{R}$. This is easily verified by noting that the rank $r$ is, up to ties, the number of firms with size greater than or equal to that of the firm ranked $r$, and so it is proportional to the tail probability in (1). This implies that $A = 1/k$, $k \neq 0$ (see Zhang et al. [6] and Newman [7]).
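The link between the two exponents can be written out step by step. The derivation below is a sketch under the assumption that the tail law (1) has the form $P(S > s) \propto s^{-k}$ and the rank law (2) the form $S(r) \propto r^{-A}$; the symbol $n$ for the number of firms is introduced here only for illustration.

```latex
\begin{align*}
\frac{r}{n} &\approx P\bigl(S \ge S(r)\bigr) \propto S(r)^{-k}
  && \text{(the rank counts the firms at least as large as the $r$-th)} \\
S(r) &\propto \left(\frac{r}{n}\right)^{-1/k} \propto r^{-1/k}
  && \text{(solving for $S(r)$, with $k \neq 0$)} \\
A &= \frac{1}{k}
  && \text{(comparing with $S(r) \propto r^{-A}$)}
\end{align*}
```

For example, a tail exponent $k = 2$ corresponds to a rank exponent $A = 0.5$, while $A = 1$ corresponds to $k = 1$.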
In fact, this empirical relation between size and the corresponding rank was first pointed out by Zipf [8]. He reported that USA firm sizes, as measured by assets, follow approximately the law $S(r) \propto r^{-1}$, the so-called Zipf's law, which is the particular case of Pareto's law with $A = 1$. This was one of several studies by Zipf concerning social and economic examples. The original subject studied by Zipf [8] was the frequency distribution of words in language, which obeys the law of least effort. More recent literature on the application of Zipf's law to firm size includes Stanley et al. [9] and Axtell [10] for USA firms, Gaffeo et al. [11] for G7 countries, Hernández-Pérez et al. [12] for developing countries, Okuyama et al. [13] for Japanese firms and Zhang et al. [6] for Chinese firms. Testing Zipf's law amounts to testing the null hypothesis that $A$ equals one.
Several authors proved that the power law can be theoretically justified by combining Gibrat's law with some additional constraint. Simon [14] and Simon and Bonini [15] obtained the Yule distribution, whose expression involves the Beta function $B$ and a normalizing constant $\alpha$, by assuming that new firms are born at a constant rate. This distribution function behaves asymptotically as a power function (as $s \to +\infty$). Champernowne [16] proved that Pareto's law for the income distribution is obtained if we assume that the rate of variation cannot increase above a certain level (which can be different for upward and downward variation). If this assumption is relaxed, Pareto's law can still be observed asymptotically. Gabaix [17], in the context of the study of city size, considered a continuous time model combining Gibrat's law with the assumption that there is a lower level below which city size cannot decrease. As a result, city size dynamics are represented by a reflected geometric Brownian motion and the corresponding distribution is a Zipf's law. For other specifications see, for instance, Rutherford [18] and Eeckhout [19].
We consider a generalization of Zipf's law (and, as a matter of fact, of Pareto's law), the so-called Simplified Canonical Law (SCL), also known as the Zipf-Mandelbrot law. It was introduced by Mandelbrot [1] and is stated as $p(r) = P (r + m)^{-A}$, where $p(r)$ is the relative frequency corresponding to the rank $r$, $m \in \mathbb{R}$, $N$ is the maximum number of different entities (words, firm sizes, ...) and $P$ is a normalizing constant, so that $P = \left[ \sum_{r=1}^{N} (r + m)^{-A} \right]^{-1}$. Mandelbrot found this law to be adequate for the study, in linguistics, of the frequency of the words used in a text or language when they are ranked starting from the most frequent. He obtained it by considering an optimization criterion based on minimizing the cost difference between coding a message word by word and coding it in blocks of words of any size (see Mandelbrot [1]). Word-by-word coding has, in fact, the drawback of a greater cost than coding by blocks of words specially suited to the message to be conveyed, but it has the advantage of a smaller delay. Ramsden and Kiss-Haypál [20] adopted the SCL to model firm sizes by writing $S(r) = P (r + \rho)^{-1/\theta}$ (5), where $S(r)$ is the size of the $r$-th firm when firms are ranked by size in decreasing order.
Remark 1. This is a reparametrization obtained by setting $\theta = 1/A$ and $\rho = m$.
Departing from the existing literature, we claim that since $P$ is a normalizing constant and the $S(r)$ are not proportions, $P$ must be given by $P = \sum_{r=1}^{N} S(r) \big/ \sum_{r=1}^{N} (r + \rho)^{-1/\theta}$.
Before presenting, in the next section, the interpretation of the coefficients given by Ramsden and Kiss-Haypál [20] and by other authors (e.g. Hernández-Pérez and Pérez-Mesa [12,21]), we adapt the arguments given by Mandelbrot in linguistics to a corresponding explanation in the firm size context. By defining economic units that perform specialized productive functions, a firm may be seen as formed by a group of such unitary sections, just as letters in linguistics (or, more generally, unit signs such as phonemes) can be aggregated to build words. In turn, some firms may be assembled to work together when the purpose is to build a productive project to provide a set of goods or services (just as words are put together to code a message). Generally, it is more time efficient to have a flexible structure which can be combined in different ways to build different projects than to build each productive project from scratch (just as coding the whole message as a block). On the other hand, building the whole project at once could achieve cost reduction (economic efficiency). Ramsden and Kiss-Haypál [20] identify as advantages of bigger sizes the economies of scale, the capacity for lobbying, and diversification as a way to decrease idiosyncratic risk. The firm size distribution is obtained so as to minimize the cost gap between building each project from scratch and using the structure of firms to build the projects by combining the efforts of existing firms. Thus, firm sizes are defined in a static stable structure adequate to satisfy the needs of the economy.
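The role of the normalizing constant can be made concrete with a short computation. The sketch below assumes that the SCL for firm sizes has the form $S(r) = P (r + \rho)^{-1/\theta}$ and that $P$ is fixed by requiring the fitted sizes to add up to the total observed size; the function names and the numerical values are hypothetical and are not taken from the data.

```python
def scl_shape(rank, rho, theta):
    """Unnormalized SCL term (r + rho) ** (-1 / theta)."""
    return (rank + rho) ** (-1.0 / theta)


def scl_normalizer(total_size, n_firms, rho, theta):
    """Constant P such that the fitted sizes sum to total_size.

    Assumption: P = total_size / sum_r (r + rho) ** (-1 / theta).
    With total_size = 1 this reduces to the normalization for proportions.
    """
    shape_sum = sum(scl_shape(r, rho, theta) for r in range(1, n_firms + 1))
    return total_size / shape_sum


def scl_sizes(total_size, n_firms, rho, theta):
    """Sizes S(r) = P * (r + rho) ** (-1 / theta) for r = 1, ..., n_firms."""
    p = scl_normalizer(total_size, n_firms, rho, theta)
    return [p * scl_shape(r, rho, theta) for r in range(1, n_firms + 1)]


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the functions.
    sizes = scl_sizes(total_size=1_000_000.0, n_firms=5, rho=2.0, theta=0.8)
    print([round(s) for s in sizes])
```

Setting `rho=0` and `theta=1` recovers Zipf's law, in which the largest firm is twice the size of the second largest.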

Interpretation of Simplified Canonical Law coefficients
Ramsden and Kiss-Haypál [20] proposed an interpretation for the parameters of the SCL, $\theta$ and $\rho$. Based on the interpretation of Mandelbrot [1], they identify $\theta$ with the concept of temperature in thermodynamics which, in the context of a country, may be seen as a degree of "economic activity", as made explicit by Pérez-Mesa and Galdeano-Gómez [21]. This interpretation is supported by the empirical analysis of Hernández-Pérez, Angulo-Brown and Tun [12], who identify a tendency of $\theta$ to be positively related to the degree of development of countries.
Ramsden and Kiss-Haypál [20] remark on the analogy between firm size analysis and the study of the behavior of protein molecules facing changes in temperature (see Frauenfelder [22]). In fact, when temperature rises, the average transition rate between protein conformational states increases too. This is expressed in a version of the SCL, equation (6), in which $t_0$ and $n$ are real constants, $t$ denotes time (the initial moment being $t_0$) and $N(t)$ is the proportion of protein molecules that did not change. Temperature influences the parameters $t_0$ and $n$, so that $N(t)$ declines faster as temperature rises. Similarly, in an active economy a state transition occurs more easily as a result, for instance, of some policy. This may explain why economic policies are more effective in more developed countries. Note that $n$ is positively correlated with temperature in (6), just as $\theta$ is positively correlated with the degree of "economic activity" in (5). Comparing (5) and (6), we see that in the first the (absolute) frequency is given by the independent variable $r$, while in (6) the (relative) frequency is given by the dependent variable $N(t)$. In (5), as $\theta$ increases the decline of $S(r)$ is less pronounced, which suggests that $\theta$ is negatively correlated with concentration. This is empirically confirmed by Pérez-Mesa and Galdeano-Gómez [21], who resort to simulation to find a negative relation between $\theta$ and the Gini index. This is in apparent contradiction with the observation of Ramsden and Kiss-Haypál [20] that $\theta$ is negatively correlated with the Shannon entropy $H = -\sum_i p_i \log p_i$, where $p_i$ is the proportion of output in the $i$-th category at the highest level of aggregation. At this level, $H$ measures the sectorial unevenness of economic activity. The observation is explained by the fact that developed economies specialize and thus concentrate in capital intensive activities, which yield higher returns.
To better understand the interrelation between firm size concentration and sector diversity, we analyse the patterns of unevenness at lower levels of aggregation (subsectors), that is, we study the evolution of Shannon entropy at different aggregation levels. As Pérez-Mesa and Galdeano-Gómez [21] pointed out, sectorial concentration can lead to a spillover effect, that is, it can improve the situation of firms of similar size, stimulating competition and innovation. As for $\rho$, it is the parameter that defines the departure of the SCL (5) from the power law (2), with a more significant deviation at small $r$. As pointed out by Mandelbrot [1] and recalled by Ramsden and Kiss-Haypál [20], this departure is similar to the deviation of the Fermi-Dirac law from the Maxwell-Boltzmann law. What separates them, in terms of their application to physics, is that in the first, in opposition to the second, the particles interact so that they cannot coexist in the same microstate. That is, they are fermions (see Fermi and Dirac [23,24]). For instance, electrons at the same energy level exclude each other. Similarly, in the context of biology, for ecosystems constituted by ecological niches, Gause [25] states that, as a result of competition, one species can exclude another when both occupy the same ecological niche.
Ramsden and Kiss-Haypál [20] argue that the same thing happens with firms belonging to the same sector of the economy for high values of $\rho$. The system becomes non-degenerate as $\rho \to 0$ and competition becomes tolerated inside the sectors. As a result, for a greater $\rho$ there will be a higher diversity of activity sectors, and consequently a positive correlation between $\rho$ and $H$ is found. Thus, $\theta$ is expected to be negatively correlated with $\rho$, so that in more active economies competition is more tolerated. Pérez-Mesa and Galdeano-Gómez [21] emphasize that an increase in $\rho$ brings a contraction in bigger companies, while for a small $\rho$ (in more active economies) the difference between small and big firms is amplified, corresponding to the fact that very big firms dominate some sectors and prevent the existence of smaller ones. On the other hand, they also show that for smaller $\theta$ the gap between smaller and bigger firms increases (concentration gets bigger, as pointed out before). This effect prevails and we obtain the negative correlation between $\theta$ and concentration referred to above. When considering time evolution, we remark that big firms are especially sensitive to economic contractions and tend to suffer significant losses, which reduces their distance from smaller firms.
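The direction of these effects on concentration can be checked numerically. The sketch below is not the simulation of [21]; it simply assumes sizes of the form $(r + \rho)^{-1/\theta}$ (the normalizing constant does not affect the Gini index) and computes the Gini index for a few hypothetical parameter values. The function names are illustrative.

```python
def scl_profile(n_firms, rho, theta):
    """Unnormalized SCL sizes (r + rho) ** (-1 / theta), r = 1..n_firms."""
    return [(r + rho) ** (-1.0 / theta) for r in range(1, n_firms + 1)]


def gini(sizes):
    """Gini index of non-negative sizes (0 = equal, near 1 = concentrated)."""
    xs = sorted(sizes)  # ascending order, as the formula below requires
    n = len(xs)
    total = sum(xs)
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    # Hypothetical parameter grid, not estimates from the data.
    for theta in (0.5, 1.0, 2.0):
        g = gini(scl_profile(1000, rho=5.0, theta=theta))
        print(f"theta = {theta:3.1f}, rho =  5.0 -> Gini = {g:.3f}")
    for rho in (0.0, 50.0):
        g = gini(scl_profile(1000, rho=rho, theta=1.0))
        print(f"theta = 1.0, rho = {rho:4.1f} -> Gini = {g:.3f}")
```

Under these assumptions the Gini index falls as either $\theta$ or $\rho$ grows, which is consistent with the negative relation between $\theta$ and concentration and with the contraction of bigger firms for larger $\rho$ described above.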

Results and Discussion
The study is carried out on a set of Portuguese firms over the period from 2006 to 2012. The main source of information was the IBAS (Iberian Balances Analysis System), which contains accounting and financial information on Portuguese firms. Among the firms listed in the IBAS database: i) those without accounting information during the review period were eliminated, ii) only those in activity were taken into account (total assets and operating results both positive), iii) only those that did not undergo significant changes in this period, such as mergers or total shut-down, were considered, and iv) all firms considered are private equity. Given these criteria, a final sample of 8,480 firms was reached. Table 1 provides the number of firms by main economic activity.
We used the Portuguese classification of economic activities provided by INE (Portuguese National Bureau of Statistics). The main activities are subdivided into sectors and subsectors with a disaggregation in codes. In a code, an additional digit corresponds to an additional level of disaggregation. The maximum number of digits is five.
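Because each additional digit of a code refines the classification, sector sizes at a given level can be obtained by truncating the codes and summing firm sizes. The sketch below assumes that a level corresponds to a prefix of the five-digit code and uses the natural logarithm in the Shannon entropy; the codes, sizes and function names are hypothetical.

```python
import math
from collections import defaultdict


def shannon_entropy(sizes):
    """H = -sum_i p_i ln p_i, with p_i the share of each positive size."""
    values = [s for s in sizes if s > 0]
    total = sum(values)
    return -sum((s / total) * math.log(s / total) for s in values)


def sector_sizes(firms, n_digits):
    """Sum firm sizes over sectors defined by the first n_digits of the code."""
    totals = defaultdict(float)
    for code, size in firms:
        totals[code[:n_digits]] += size
    return dict(totals)


def entropy_by_level(firms, max_digits=5):
    """Entropy of sector sizes for 1..max_digits digits, plus firm level."""
    levels = {
        d: shannon_entropy(sector_sizes(firms, d).values())
        for d in range(1, max_digits + 1)
    }
    levels["firm"] = shannon_entropy(size for _, size in firms)
    return levels


if __name__ == "__main__":
    # Hypothetical (code, size) pairs, not taken from the sample.
    firms = [("10110", 4.0), ("10120", 4.0), ("20110", 8.0)]
    for level, h in entropy_by_level(firms).items():
        print(level, round(h, 4))
```

In this toy example the two one-digit sectors have equal size, so the entropy at the highest level of aggregation is $\ln 2$, while at the most detailed level the three shares 1/4, 1/4 and 1/2 give $1.5 \ln 2$.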
Next we present the estimation methods. The maximum likelihood estimator is applied to the lognormal fitting: the Matlab (Release R2011b) function lognfit was used to estimate the parameters of the lognormal distribution fitted to the data. To solve the nonlinear least squares curve-fitting problem we used the Matlab function lsqnonlin, which can apply the Levenberg-Marquardt method, a combination of the gradient descent and Gauss-Newton methods. Estimation of the power law involves a modified OLS fit in log-log scale, $\log S(r) = a - A \log(r - 0.5)$ (see [26]).
Figure 1. Lognormal distribution fitting for firm asset mean value.
We present the results of fitting the lognormal law, the SCL and power laws (in particular Zipf's law) to both assets and sales data. We also present the value of the Shannon entropy computed from both the distribution of firm sizes and the distribution of "sector sizes" at different levels of aggregation, where a sector size is the sum of the sizes of the firms included in that sector. At firm size level, entropy reflects the level of firm size concentration. At the highest level of aggregation, entropy should be able to capture the association made in the literature between activity rhythm and sector diversity. Intermediate levels show how the influences of concentration and sector diversity blend as the aggregation level varies.
In terms of production, the time interval includes a period of recession after 2008, although the contraction in consumption came later. This is clearly reflected in the results presented below when firm size is given by sales, but not so straightforwardly when firm size is given by assets. We first consider lognormal fitting.
In the case of assets, the lognormal fits reveal a clear positive trend in the fitted mean value of assets, except for 2009 and 2012 (see table 2 and figure 1), and a less smooth evolution of the fitted dispersion coefficient, which fell abruptly in 2009 but shows no significant change between 2006 and 2012 (see table 3 and figure 2). A possible explanation for this pattern is that reinforcing asset values is a mechanism to ensure survival in times of instability, the exception being the shock in 2009. If so, asset value may not be a good proxy for firm size if it is meant to express the effect of the economic context on the health of firms. After 2008 the dispersion is much lower than in that year, which seems to contribute strongly to the reduction of entropy at firm level (see table 6 and figure 5), indicating a lower degree of diversity.
SCL fit results are now presented. Concerning assets, there is a clear drop in $\rho$ after 2008 (see table 7 and figure 6), approaching values close to zero, which would be interpreted as an evolution towards a more competitive economy according to the explanation in [20]. This is also compatible with the downward tendency in entropy at all levels except the last one (see table 6 and figure 5). The positive evolution of $\theta$ is also in tune with those evolutions, according to the literature (see table 8 and figure 7). The combination of these signals points to a countercyclical behavior of the level of assets, suggesting a reaction by firms to an adverse evolution.
In the sales case, the evolution of entropy (see table 11 and figure 10) is in accordance with the sales results: at firm level, the pronounced decrease suggests a growing discrepancy between firm sizes; this path of increasing inequality seems to fade as the aggregation level grows and is reversed at the last two levels, where increased sector diversity appears to prevail.
Considering power law fitting for the years 2006, 2008 and 2012 we obtain the values for the parameter A presented in table 12 (in brackets we present the corresponding standard deviation). The estimated exponent is statistically greater than one. The evolution for parameter A, after 2008, sustains the interpretation given in the previous paragraph.
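The modified OLS fit in log-log scale behind these estimates can be sketched in a few lines. The code below is an illustration only, assuming the regression $\log S(r) = a - A \log(r - 0.5)$ with natural logarithms; the function name and the synthetic sizes are hypothetical, and no standard error is computed.

```python
import math


def fit_power_law_rank(sizes):
    """Modified OLS fit of ln S(r) = a - A * ln(r - 0.5).

    Returns (a, A). Sizes must be positive; they are ranked in
    decreasing order, so the largest firm has rank 1.
    """
    ordered = sorted(sizes, reverse=True)
    n = len(ordered)
    x = [math.log(r - 0.5) for r in range(1, n + 1)]
    y = [math.log(s) for s in ordered]
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, -slope  # A is minus the slope


if __name__ == "__main__":
    # Synthetic sizes following an exact power law with A = 1.2.
    sizes = [1000.0 * (r - 0.5) ** -1.2 for r in range(1, 201)]
    a, exponent = fit_power_law_rank(sizes)
    print(f"a = {a:.4f}, A = {exponent:.4f}")
```

On real data the fit is usually restricted to the upper tail, so the choice of the cut-off rank is an additional assumption that this sketch does not address.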

Conclusions
The firm size distributions used in this paper were the lognormal, the Pareto (with Zipf as a particular case) and the SCL. They were fitted to a sample of Portuguese firms in the period from 2006 to 2012 through standard estimation methods used in the literature. We stress the following points made in the presentation of the laws. First, in the generating mechanism of the SCL, we highlighted the role of delay, and not only of cost, in the decisions leading to the construction of messages or economic projects. On this point we also addressed the analogy between the criteria that lead to the distribution of word frequencies and the distribution of firms. Second, when reparameterizing the SCL for the case of the distribution of firms, we emphasized that the normalization parameter $P$ has to be defined taking into account that firm sizes are not proportions. Third, we made explicit the analogy between the behavior of biomolecules and of firm sizes, showing how the parameters $\theta$ and $n$ in (5) and (6) are related to temperature.
The results identify sales as a good proxy for firm size as far as its relation to the evolution of the activity rhythm is concerned. That is, the estimated parameters of both the lognormal law and the SCL follow a path consistent with the known fact that there was a pronounced recession after 2008. As for total assets, our results show an evolution that is countercyclical with respect to the economic tendency, which may be explained by the reaction of firms to an adverse environment. Shannon entropy was determined at different levels of aggregation. In the sales case, the decreasing path at firm level reveals an increasing concentration, while the inversion of this path at higher levels of aggregation shows an increase in sector diversity. It is significant that this inversion is only achieved at the last two levels.
We point out a time series effect not mentioned before in the literature: in recession periods big firms tend to suffer greater losses, and firm size concentration is mitigated.
As topics for future research we consider the following. First, the study of the interaction between firm distribution and the economic environment using nonparametric methods to estimate that distribution. Second, the generalization of the concept of entropy to analyze diversity simultaneously with respect to more than one attribute, such as geographical localization both in the same economy and between different economies.