Dynamics in the Fitness-Income plane: Brazilian states vs World countries

In this paper we introduce a novel algorithm, called Exogenous Fitness, to calculate the Fitness of subnational entities and we apply it to the states of Brazil. In the last decade, several indices were introduced to measure the competitiveness of countries by looking at the complexity of their export basket. Tacchella et al (2012) developed a non-monetary metric called Fitness. In this paper, after an overview about Brazil as a whole and the comparison with the other BRIC countries, we introduce a new methodology based on the Fitness algorithm, called Exogenous Fitness. Combining the results with the Gross Domestic Product per capita (GDPp), we look at the dynamics of the Brazilian states in the Fitness-Income plane. Two regimes are distinguishable: one with high predictability and the other with low predictability, showing a deep analogy with the heterogeneous dynamics of the World countries. Furthermore, we compare the ranking of the Brazilian states according to the Exogenous Fitness with the ranking obtained through two other techniques, namely Endogenous Fitness and Economic Complexity Index.


Introduction
Large countries are often characterized by strong internal heterogeneity between richer and poorer regions. Consider the difference between the GDP per capita (GDPp) of the states of New York and Mississippi in the US [1], between the states of Kerala and Bihar in India [2], or between the unexplored Amazon forest and the modern state of São Paulo in Brazil [3]. While the recent literature on Economic Complexity has focused on countries [4][5][6][7], we believe that there are two very strong reasons to extend the scope of the analysis to the subnational level.
The first reason is purely academic. Sharp differences in economic outcomes within a uniform institutional area (with a common cultural background and free movement of workers) are both a theoretical puzzle for traditional economics and an empirical opportunity for the Economic Complexity field. Indeed, the analysis of subnational entities competing on an even playing field is the perfect experimental setup to identify the role of organizational and technical capabilities with respect to more traditional economic factors of analysis. In this paper we analyze the case of Brazil, to see whether the capabilities-driven dynamics of a country is replicated at a smaller scale.
The second reason is to improve economic forecasting. Economic Complexity has proved to be very effective in forecasting the economic performance of countries [8]. An understanding of subnational entities could, however, give more accuracy and more detail. It is clear, for example, that the future GDPp growth of Brazil will depend not only on further growth of the Southern industrial core, but also on the convergence of the other regions. This is crucial both to correctly forecast the aggregate GDPp growth of Brazil and to address the vast internal inequality of the country.
Brazil or, officially, the Federative Republic of Brazil, is the ninth largest World economy in the GDP ranking of the year 2015 [9]. Its population is equivalent to 2.81% of the total World population [10], and its large area (8,515,767.049 km²), divided among its twenty-seven Federative Units [11], makes it the fifth largest country in the World [12]. The political and administrative organization of Brazil is hierarchically organized in a sequence of geopolitical structures: Union, states, Federal District and Counties. Each one is autonomous and organized according to the division of powers: legislative, executive, and judiciary. Because of its deep inequalities, but also because of its good prospects for growth, Brazil and the other Latin American countries were often a focus of economic development analysis during the last century [13][14][15][16][17].
Economists usually focus on monetary indices, such as the GDP, to analyze economies. However, GDP alone, as shown by different studies [4][5][6][7][18], does not provide deep information about the prospects for growth and development of World countries. Several studies tried to gain information on the unobservable characteristics of countries by looking at stock indices to exploit the "wisdom of the crowd" [19,20]. In order to gain a direct measure of country capabilities, the last decade has been marked by a line of research into new indices inspired by the science of complex systems, able to better describe and explain the large-scale World economy [4][5][6][7][21][22][23] and to estimate global and regional inequalities [24,25].
In this respect, different authors recently introduced two indices: the Economic Complexity Index (ECI) [4] and Fitness [5]. Furthermore, Cristelli et al [7], through a novel method called Selective Predictability Scheme (SPS), showed that the comparison between GDPp and Fitness provides a highly performing forecasting tool for several countries.
In this paper, we first present an overview of Brazil as a whole from the point of view of the Economic Complexity approach. In this context we compare its export basket and its Fitness with the ones of the BRIC group of countries (Brazil, Russia, India, and China) [26].
Then, we focus on the comparative study of the economies of the individual Brazilian states. Based on the "classical" Fitness algorithm, we introduce a new methodology, called Exogenous Fitness, able to measure the Fitness of subnational entities, and we apply it to the states of Brazil. In analogy with what was proposed in [7], we analyze the coevolution of GDPp and Fitness, studying in this way the predictability of the economic growth of the Brazilian states.
Furthermore, we compare the Exogenous Fitness with: (i) the (Endogenous) Fitness, i.e. the natural application of the "classical" Fitness algorithm to the subnational entities of a country; (ii) the results published by the Dataviva platform (an application of the ECI algorithm) [27].
The paper is structured as follows: first we introduce the methods and provide an overview of Brazil. Then, we show the results of the Exogenous Fitness applied to the Brazilian states and the comparisons with the other techniques. Finally, we conclude with a general discussion of the implications of the results from the points of view of both the scientific community and policy makers. In Appendix A, we describe the database in detail.

Methods
In this section we describe the algorithms and methods involved in the calculation of the Fitness of states and countries, coupled to the Complexity of the exported products.

Revealed Comparative Advantage (RCA)
The Revealed Comparative Advantage (RCA) [28] is a quantitative criterion to assess the relative advantage of a country or, in this case, of a Brazilian state, in the export of certain products compared to the average export of those products. Defining $q_{sp}$ as the export flow (in US dollars) of product $p$ by state $s$ (see the section Database for the data origin), the RCA is defined as:
$$\mathrm{RCA}_{sp} = \frac{q_{sp} \big/ \sum_{p'} q_{sp'}}{\sum_{s'} q_{s'p} \big/ \sum_{s'p'} q_{s'p'}} \qquad (1)$$
Therefore, it is the ratio between the share of product $p$ in the total export of state $s$ and the share of product $p$ in the total Brazilian export. From the calculation of the RCA for each state-product pair, we build the binary state-product matrix $M_{sp}$. We consider the state $s$ an exporter of a product $p$ if $\mathrm{RCA}_{sp} \geq 1$ and, consequently, we set $M_{sp} = 1$. On the contrary, if $\mathrm{RCA}_{sp} < 1$, we set $M_{sp} = 0$.
An analogous criterion is used to define the World countries-products matrix $M_{cp}$ (see the section Database). This binary matrix shows which country has a comparative advantage in a certain product with respect to the World average [29,30].
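The construction of the binary matrix from export flows can be illustrated with a short sketch. It is only a minimal illustration under stated assumptions: the function names `rca_matrix` and `binarize` and the toy export values are hypothetical and do not come from the database used in the paper.

```python
def rca_matrix(q):
    """Return RCA[s][p] for export flows q[s][p] (rows: states, columns: products)."""
    n_states, n_products = len(q), len(q[0])
    state_total = [sum(row) for row in q]
    product_total = [sum(q[s][p] for s in range(n_states)) for p in range(n_products)]
    grand_total = sum(state_total)
    rca = [[0.0] * n_products for _ in range(n_states)]
    for s in range(n_states):
        for p in range(n_products):
            # Skip empty rows/columns to avoid dividing by zero.
            if state_total[s] > 0 and product_total[p] > 0:
                share_in_state = q[s][p] / state_total[s]
                share_in_country = product_total[p] / grand_total
                rca[s][p] = share_in_state / share_in_country
    return rca


def binarize(rca, threshold=1.0):
    """M[s][p] = 1 if RCA >= threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in rca]


# Hypothetical toy data: 2 states, 3 products (export values in arbitrary units).
q_toy = [[80.0, 20.0, 0.0],
         [10.0, 10.0, 80.0]]
rca_toy = rca_matrix(q_toy)
M_toy = binarize(rca_toy)
```

In this toy case the first state is counted as an exporter of the first two products and the second state only of the third one.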

(Endogenous) Fitness
Recently, different studies have shown the economic relevance of the diversification of the export basket for the competitiveness of a country [4,6]. The matrix M shows a substantial nested structure highlighted by a strong triangularity, which can be interpreted in the following way: each country approximately exports all the possible products it has the capabilities to produce [5].
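Here, considering the geographic size of Brazil and its federal structure, we assume that the same concept is also valid for understanding the development and growth of its states. In this framework, to compare Brazilian states we apply the Fitness algorithm [5], a statistical approach based on non-linear maps coupling the Fitness of states and the Complexity of products, to the states-products matrix of elements $M_{sp}$ defined above. The (Endogenous) Fitness algorithm is defined by the following iterative equations [6]:
$$\begin{cases} \tilde F_s^{(n)} = \sum_p M_{sp}\, Q_p^{(n-1)} \\ \tilde Q_p^{(n)} = \dfrac{1}{\sum_s M_{sp}\, \dfrac{1}{F_s^{(n-1)}}} \end{cases} \quad\longrightarrow\quad \begin{cases} F_s^{(n)} = \dfrac{\tilde F_s^{(n)}}{\langle \tilde F_s^{(n)} \rangle_s} \\ Q_p^{(n)} = \dfrac{\tilde Q_p^{(n)}}{\langle \tilde Q_p^{(n)} \rangle_p} \end{cases} \qquad (2)$$
The elements $M_{sp}$ are the elements of the previously discussed binary states-products matrix. The initial conditions satisfy the relations:
$$F_s^{(0)} = C \;\; \forall s, \qquad Q_p^{(0)} = C \;\; \forall p$$
where we assume $C = 1$ for each state $s$ and for each product $p$ [6].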
At each iteration of the algorithm, the Fitness of each state is proportional to the sum of its exported products weighted by their Complexity, stressing the importance of having at the same time both a diversified export basket and the most complex possible products in it. The formula for the Complexity of a product is motivated by the following argument: the more exporters a product has and the smaller their Fitness, the lower its expected Complexity. In this manner, a state with low Fitness strongly influences the Complexity of all the products it exports [6]. Therefore, a highly complex product is made only by a few countries/states with high Fitness, while a product of low Complexity can be made by all countries/states, with both high and low Fitness. The stability and robustness of the algorithm have been studied in [6,31], and the Fitness ranking of the states and the Complexity ranking of the products are unambiguously defined after a large enough number of iterations. Fig 1 shows the matrix $M_{sp}$ for the year 2015, with the states ordered according to Fitness (Fitness increasing toward the top) and the products ordered according to Complexity (Complexity increasing toward the right).
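The iteration described above can be sketched in a few lines. This is a minimal illustration, assuming the standard Fitness-Complexity map of [5,6] (Fitness as the Complexity-weighted sum of exported products, Complexity as the inverse of the sum of inverse Fitness of the exporters, both normalized by their mean at each step). The function name `fitness_complexity`, the number of iterations and the toy nested matrix are hypothetical choices, not the authors' implementation.

```python
def fitness_complexity(M, n_iter=50):
    """Iterate the Fitness-Complexity map on a binary matrix M[s][p]."""
    n_rows, n_cols = len(M), len(M[0])
    F = [1.0] * n_rows  # initial Fitness, C = 1
    Q = [1.0] * n_cols  # initial Complexity, C = 1
    for _ in range(n_iter):
        # Fitness: sum of the Complexities of the exported products.
        F_tilde = [sum(M[s][p] * Q[p] for p in range(n_cols)) for s in range(n_rows)]
        # Complexity: inverse of the sum of 1/Fitness over the exporters.
        Q_tilde = []
        for p in range(n_cols):
            inv_sum = 0.0
            for s in range(n_rows):
                if M[s][p]:
                    inv_sum += 1.0 / F[s] if F[s] > 0 else float("inf")
            Q_tilde.append(1.0 / inv_sum if inv_sum > 0 else 0.0)
        # Normalize both vectors by their mean.
        mean_F = sum(F_tilde) / n_rows
        mean_Q = sum(Q_tilde) / n_cols
        F = [f / mean_F for f in F_tilde]
        Q = [q / mean_Q for q in Q_tilde]
    return F, Q


# Hypothetical toy matrix with a perfectly nested (triangular) structure.
M_nested = [[1, 1, 1],
            [1, 1, 0],
            [1, 0, 0]]
F_toy, Q_toy = fitness_complexity(M_nested)
```

In this toy example the most diversified row obtains the highest Fitness, and the product exported only by that row obtains the highest Complexity, in line with the argument above.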
In Fig 2 we show the products spectroscopy [32] for the years 2005 (dashed lines) and 2015 (filled colors) for a few Brazilian states: São Paulo, Paraná, Ceará, and Roraima. The spectroscopy is a graphic representation of the export volume (in US dollars) of a state for each product with $M_{sp} = 1$, with products ordered by increasing Complexity from left to right [32]. We then group the products (10 per bin) and sum the export volumes of the products inside each bin. The spectroscopy makes it possible to compare the diversification and the Complexity of the exports of the states. The figure shows the spectroscopy of high Fitness states such as São Paulo (diversified all along the Complexity spectrum) and Paraná (with a clear peak on medium-high Complexity products), of a middle-rank state such as Ceará, and of a low Fitness state such as Roraima (with few low Complexity exports). From the figure, it emerges that a highly developed state such as São Paulo has a high flow of exports for a very diversified set of products, with a bias towards the high Complexity ones. Paraná has high peaks in several complex products, while Roraima has only one peak, among the less complex products. Ceará lies in between the two.
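The binning behind the spectroscopy can be sketched as follows. This is only an illustrative reading of the procedure: it assumes that all products are ordered by Complexity, that volumes of products with $M_{sp} = 0$ count as zero, and that bins are shared across states so that the spectra are comparable. The function name `spectroscopy` and the toy numbers are hypothetical.

```python
def spectroscopy(volumes, complexity, m_row, bin_size=10):
    """Sum export volumes in bins of products ordered by increasing Complexity.

    volumes[p]    export volume of product p for one state
    complexity[p] Complexity of product p
    m_row[p]      1 if the state is an exporter of p (M_sp = 1), else 0
    """
    # Order product indices from the least to the most complex.
    order = sorted(range(len(complexity)), key=lambda p: complexity[p])
    # Keep only the volumes of products with M_sp = 1 (assumption: others count as 0).
    masked = [volumes[p] if m_row[p] == 1 else 0.0 for p in order]
    # Sum the volumes inside consecutive bins of `bin_size` products.
    return [sum(masked[i:i + bin_size]) for i in range(0, len(masked), bin_size)]


# Hypothetical toy data: 4 products, bins of 2 products each.
volumes_toy = [5.0, 1.0, 3.0, 2.0]
complexity_toy = [0.9, 0.1, 0.5, 0.3]
m_row_toy = [1, 1, 0, 1]
spectrum_toy = spectroscopy(volumes_toy, complexity_toy, m_row_toy, bin_size=2)
```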

Exogenous Fitness
Here, we define the new Exogenous Fitness algorithm, a method to calculate the Fitness of subnational entities of a country grounded on the measure of product Complexity from the World-wide trade network. Exogenous Fitness is a coherent extension of the "classical" Fitness algorithm [6], based on a simple assumption: products have an intrinsic Complexity, reflected by the trade of all countries on the global scale, while the trade from the regions of a single country may not represent this intrinsic Complexity well, since it can be affected by local biases. In particular, if we consider only Brazil to define the Complexity of the exported products, we can introduce into its measure local economic biases related to the peculiar features of the Brazilian economy. Indeed, as shown in Fig 1, there is a wide range of products made by only a few states, which makes the measure of Complexity very inaccurate. From this observation, it is natural to use as the best measure of product Complexity the values $Q^W_p$ extracted from the Fitness algorithm applied to the trade of goods of all World countries, i.e. we take:
$$Q^B_p = Q^W_p \equiv Q_p \qquad (3)$$
Indeed, the Complexities of the products obtained by applying the Endogenous Fitness to the World countries ($Q^W_p$) can be considered the same as the Complexities of the products inside Brazil ($Q^B_p$) and, therefore, we simply denote them as $Q_p$. The algorithm therefore consists of two steps:
1. We apply the (Endogenous) Fitness (Eq 2) to the World countries, as previously done in [5][6][7]. The criterion adopted to determine whether a country $c$ is a "good" exporter of a given product $p$ is again based on the RCA extended to all World countries: we set $M_{cp} = 1$ if $\mathrm{RCA}_{cp} \geq 1$ and $M_{cp} = 0$ otherwise (see the section Database for the source of the data). Applying the (Endogenous) Fitness algorithm to the matrix $M_{cp}$, after a sufficiently large number of iterations the algorithm converges to the fixed point, and we obtain the Fitness $F_c$ of each country and the Complexity $Q^W_p$ of each product.
2. From the assumption of Eq 3, we use as Complexity $Q_p$ of the products exported by Brazilian states the values obtained by the Fitness algorithm applied to the export of all World countries. Therefore, we use the information in the matrix $M_{sp}$ and the product Complexity $Q_p$ to calculate the Fitness of the Brazilian states through the following formula:
$$F_s = \sum_p M_{sp}\, Q_p \qquad (4)$$
The relevance of developing the Exogenous Fitness measure is twofold. First, by using World-wide data we extract all the available information to compute the Complexity of products, and thus better compute the Fitness of states. Since the algorithm works by exploiting differences of capabilities, World-wide data give us additional information related to the export baskets of countries with a wider range of Fitness and capabilities. Of course, we still expect the two measures to be highly correlated in rank, in particular for a country like Brazil that contains such a vast array of development levels. As we will see in the section Comparison with other techniques, this is indeed the case. The second reason is that the Exogenous Fitness yields Fitness values for states that are comparable with those of countries. Indeed, while the rankings given by Exogenous and Endogenous Fitness are highly correlated, their actual values and distributions are vastly different. As explained in detail in [33], while the ranking for the Fitness measure is always well defined, the shape of the matrix directly affects the convergence properties of the algorithm towards a polarized distribution. With the Exogenous Fitness method we obtain smoothly changing values, which allow for the forecasting exercises of the section Results.
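The second step of the procedure can be illustrated with a very short sketch. It assumes that the product Complexities have already been obtained from the World countries-products matrix, and that the Fitness of a state is the sum of these World-level Complexities over the products for which the state has $M_{sp} = 1$. The function name `exogenous_fitness` and the numerical values are hypothetical and only serve as an example.

```python
def exogenous_fitness(M_sp, Q_world):
    """Fitness of each state from its binary export row and World-level Complexities."""
    return [sum(m * q for m, q in zip(row, Q_world)) for row in M_sp]


# Hypothetical World-level Complexities for 3 products (step 1 is assumed done).
Q_world_toy = [0.2, 1.0, 1.8]
# Hypothetical binary states-products matrix: a diversified and a poorly diversified state.
M_states_toy = [[1, 1, 1],
                [1, 0, 0]]
F_states_toy = exogenous_fitness(M_states_toy, Q_world_toy)
```

Because the Complexities are fixed from outside, no iteration over the states-products matrix is needed, which is why the resulting values do not suffer from the polarization of the Endogenous case.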

Overview of Brazil
First, we analyze Brazil as a whole by applying the (Endogenous) Fitness to World countries in the time interval from 1995 to 2015. In Fig 3 we show the matrix $M_{cp}$ of the World countries for the year 2015, obtained by ordering the countries according to Fitness and the products according to Complexity. In that year, Brazil is ranked 44th out of 147 (equivalent to row 103 in Fig 3).
In Fig 4, we show the dynamics of the World countries in the Fitness-Income plane, emphasizing the BRIC countries (Brazil in green, Russia in blue, India in orange, and China in red). The figure shows that in 1995 India and China have lower values of GDPp than Brazil and Russia, but higher values of Fitness. According to [7], this difference explains the dynamics of the four countries in the plane in the following years. Indeed, India and China continued their economic growth during the following years, while Russia and Brazil entered a period of recession [34].
In order to zoom in on the differences among the dynamics of the BRIC countries, we analyze the variation of the Fitness of these countries in the interval from 2003 to 2013. The variation of the Fitness can have two different causes: (i) changes in the export basket, (ii) changes in the product Complexities. Following [35], we decompose the variation of Fitness $\Delta F_c$ into the sum of two terms, where $\Delta X = X(t_1) - X(t_0)$ for a generic quantity $X$. The first term is the contribution to $\Delta F_c$ due to the variation in the export basket, while the second is the term due to the variation of the product Complexities. In Table 1, we show the percentage variations due to the two terms. The results show a sharp decrease of both terms for Russia, and that the loss of competitiveness of Brazil is mostly due to the drop of products that were previously exported, rather than to the change in Complexity of those products. In contrast, China has increased both its export basket and the Complexity of the exported products. India, instead, exports more complex products in 2013, but has decreased the diversification of its exports.
The export spectroscopy of the BRIC countries, together with Table 1, shows that China and India both have a diversified export basket and export complex products. Such factors determine a high Fitness and consequently a growth of the GDPp in the subsequent years. On the contrary, Brazil and Russia export simple products, with a consequently low Fitness, so that these countries entered a recession period [34].
In the next section we show the results of a deeper analysis of the internal economy of Brazil through the application of the Exogenous Fitness to the Brazilian states.
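A split of the Fitness variation into an export-basket term and a Complexity term can be sketched as follows. This is only one algebraically exact way to write such a decomposition, and it rests on two assumptions that are ours: the Fitness is taken as the unnormalized sum of $M_{cp} Q_p$ over products, and each variation is weighted by the midpoint of the other quantity. The exact form used in [35] may differ. The function name `decompose_delta_fitness` and the toy values are hypothetical.

```python
def decompose_delta_fitness(M0, Q0, M1, Q1):
    """Split Delta F = sum_p (M1*Q1 - M0*Q0) into basket and Complexity terms.

    Midpoint split (an assumption, exact by construction):
      basket term     = sum_p (M1 - M0) * (Q0 + Q1) / 2
      complexity term = sum_p (M0 + M1) / 2 * (Q1 - Q0)
    """
    basket = sum((m1 - m0) * (q0 + q1) / 2
                 for m0, m1, q0, q1 in zip(M0, M1, Q0, Q1))
    complexity = sum((m0 + m1) / 2 * (q1 - q0)
                     for m0, m1, q0, q1 in zip(M0, M1, Q0, Q1))
    return basket, complexity


# Hypothetical toy country: it drops product 1 and gains product 2 between t0 and t1.
M_t0, Q_t0 = [1, 1, 0], [0.5, 1.0, 1.5]
M_t1, Q_t1 = [1, 0, 1], [0.4, 1.0, 2.0]
basket_term, complexity_term = decompose_delta_fitness(M_t0, Q_t0, M_t1, Q_t1)
delta_F = sum(m * q for m, q in zip(M_t1, Q_t1)) - sum(m * q for m, q in zip(M_t0, Q_t0))
```

By construction the two terms add up to the total variation, so their ratio to the total can be read as percentage contributions.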

Results
We applied the Exogenous Fitness algorithm to the Brazilian states in the time interval from 2000 to 2015 obtaining for each year both well-defined values of Fitness for each Brazilian state, and the ranking of states in terms of their Fitness (shown in Fig 6).
We show in Fig 7 a map of Brazil where each state is colored according to its Fitness. From the figure, it emerges that Southern states have larger Fitness, and therefore a better economic development, than Northern states. This result is in agreement with other monetary and non-monetary indices such as the Human Development Index (HDI) and the GDP [27].
Furthermore, from Table 2 we observe that São Paulo has a diversified export basket with high peaks in complex products and, at the same time, increases both its export basket and the Complexity of the exported products in the considered time period. Paraná and Ceará, in contrast with the aggregate behavior of Brazil, grew in diversification in the same period, becoming more competitive even in the face of a minor decline in the Complexity of their exported products. Roraima, on the contrary, shows a sharp decrease in diversification.
As mentioned in the previous section, Fig 4 presents the dynamics of World countries in the Fitness-GDPp plane. It shows a high degree of heterogeneity in the dynamics of countries. Indeed, the plane can roughly be divided into two regions: one with an unpredictable "chaotic" regime of the evolution of countries, and the other with a predictable "laminar" regime. In order to overcome the limitations of linear regressions, Cristelli et al [7] proposed an innovative data-driven non-parametric prediction scheme called the Selective Predictability Scheme (SPS). It is inspired by the so-called method of analogues [36,37], and through a measure of concentration it delimits predictability regions inside the Fitness-Income plane. The measure of concentration consists in dividing the plane into a grid and analyzing the time evolution of the distribution of countries inside each box containing at least five countries.
In analogy with what has just been explained for World countries, Fig 8a shows the Brazilian states in the Fitness-Income plane. In order to validate the predictability of the dynamics of the states in this plane, here we develop a novel intuitive method, the measure of direction. First, we fix the time window $[t_1, t_2]$ in which we want to study the evolution of each state in the plane log(Fitness) − log(GDPp). The time lag $\Delta = t_2 - t_1$ has to be large enough to obtain a sufficient noise reduction in the dynamics. We choose $t_1 = 2003$ and $t_2 = 2013$. Second, we divide the plane into a fine grid of 100 × 100 cells and define two bandwidths, one for the x-axis and the other for the y-axis. For each cell, we define around its centroid a threshold area with sides given by the two bandwidths. Then, for each cell $k$ with at least three states inside its threshold area at time $t_1$, we compute the average dot product $\bar D_k$ among the versors $\hat v_i$ of these states, where
$$\hat v_i = \frac{\tilde v_i}{|\tilde v_i|}, \qquad \tilde v_i = \left[\log F_i(t_2) - \log F_i(t_1)\right]\hat\imath + \left[\log \mathrm{GDPp}_i(t_2) - \log \mathrm{GDPp}_i(t_1)\right]\hat\jmath$$
and $\hat\imath$ and $\hat\jmath$ are respectively the versors in the Fitness and GDPp directions. $N$ is the number of states with starting point inside the threshold area of cell $k$. The coefficient $\bar D_k$ gives the average cosine among the versors of all states initially inside the threshold area of cell $k$ and takes values in $(-1, 1]$. It measures the dispersion of the directions of evolution in the plane, in the time window $[t_1, t_2]$, of all states initially in the threshold area of cell $k$: when it is close to 1, all these states evolve in a coherent, parallel way. The smaller $\bar D_k$ is, the larger the dispersion of these trajectories. A color map of the coefficient $\bar D$ in the different cells is shown in Fig 8b. From the figure it emerges that there is a region where the directions of evolution of the states tend to be parallel (shown in green) and a region where the directions of motion tend to be unevenly directed (shown in red).
Increasing or decreasing the bandwidths and, therefore, the threshold area only changes the resolution of the image, while the two regions remain well-defined. In Fig 8b we used an x-axis bandwidth of 0.86 and a y-axis bandwidth of 0.38, which provide an almost continuous variation of the color map.
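The measure of direction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the average dot product is computed over all distinct pairs of versors, the threshold area is a rectangle centered on the cell centroid with half-sides equal to half the bandwidths, and the function names `unit_displacement`, `in_threshold_area` and `average_cosine` are hypothetical.

```python
import math
from itertools import combinations


def unit_displacement(fit_t1, gdp_t1, fit_t2, gdp_t2):
    """Versor of the displacement of one state in the log(Fitness)-log(GDPp) plane."""
    dx = math.log(fit_t2) - math.log(fit_t1)
    dy = math.log(gdp_t2) - math.log(gdp_t1)
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("the state did not move in the plane")
    return (dx / norm, dy / norm)


def in_threshold_area(x, y, cx, cy, bw_x, bw_y):
    """True if the point (x, y) lies in the rectangle of sides (bw_x, bw_y) centered on (cx, cy)."""
    return abs(x - cx) <= bw_x / 2 and abs(y - cy) <= bw_y / 2


def average_cosine(versors, min_states=3):
    """Average dot product over all distinct pairs of versors; None if too few states."""
    if len(versors) < min_states:
        return None
    pairs = list(combinations(versors, 2))
    return sum(a[0] * b[0] + a[1] * b[1] for a, b in pairs) / len(pairs)


# Hypothetical examples: three parallel trajectories and three spread-out ones.
aligned = [(1.0, 0.0)] * 3
spread = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
d_aligned = average_cosine(aligned)
d_spread = average_cosine(spread)
```

With this pairwise definition, parallel trajectories give a value of 1, while trajectories pointing in different directions give a smaller, possibly negative value.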
In order to investigate the main direction of the versors in the green region and the various directions in the red region, we divide the plane into a coarser grid (10 × 10). For each cell we sum all the vectors inside it and then calculate the versor of the sum vector. We show the result in Fig 8c. On the one hand, from the figure we can observe a region where the states tend to evolve in the same direction (shown in green). Therefore, in this region, the future evolution of the states is predictable with good confidence. On the other hand, another region (shown in red) can be detected where the versors tend to be unevenly directed. The dynamics of the states in this region is basically unpredictable. Furthermore, between the two there is a region of transition, shown in the figure by the overlapping of the two colors.
Lastly, in Fig 8d we show the dynamics of the states in the Fitness-Income plane, highlighting in green the states with high predictability of the motion and in red those with low predictability. From the figure it emerges that states such as Ceará, Pernambuco, and Bahia, despite having low values of GDP, are in a region of high predictability and, therefore, will probably continue to grow in the same direction. For states such as Acre, Tocantins, or Alagoas, instead, the dynamics is more chaotic and predictions are less reliable.
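The coarse-grid step can be sketched as follows. The sketch assumes that each vector is assigned to a cell according to its starting point and that cells are equally spaced between the bounds of the plotted region. The helper names `cell_index` and `resultant_versor` are hypothetical.

```python
import math


def cell_index(value, lower, upper, n_cells=10):
    """Index of the grid cell (0 .. n_cells - 1) containing `value` along one axis."""
    idx = int((value - lower) / (upper - lower) * n_cells)
    # Clamp so that points on the upper edge fall in the last cell.
    return min(max(idx, 0), n_cells - 1)


def resultant_versor(vectors):
    """Versor of the sum of the vectors in one cell; None if the sum vanishes."""
    sx = sum(v[0] for v in vectors)
    sy = sum(v[1] for v in vectors)
    norm = math.hypot(sx, sy)
    if norm == 0.0:
        return None
    return (sx / norm, sy / norm)


# Hypothetical examples: two orthogonal displacements and two opposite ones.
diagonal = resultant_versor([(1.0, 0.0), (0.0, 1.0)])
cancelled = resultant_versor([(1.0, 0.0), (-1.0, 0.0)])
```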

Comparison with other techniques
In this section we compare the results obtained implementing the Exogenous Fitness with the results of the Endogenous Fitness and the ones published by Dataviva [27] obtained by applying the Economic Complexity Index (ECI).

Exogenous Fitness and Endogenous Fitness
We apply the (Endogenous) Fitness algorithm to the Brazilian states in the time interval from 2000 to 2015, obtaining the time evolution of the ranking of the states according to this kind of Fitness (shown in Fig 9). Calculating the Spearman correlation coefficient between the rankings obtained through the Exogenous and the Endogenous Fitness for each year in the analyzed time interval, we obtain an average value $r_{ExEn} = 0.97$. This result shows a strong correlation between the rankings obtained through the two different Fitness algorithms.
The Endogenous Fitness algorithm provides a well-defined annual ranking of the Brazilian states, but not well-defined quantitative values of Fitness and product Complexity. In fact, all Fitness values except one tend to zero. After a fairly high number of iterations, however, the ranking of the states stabilizes, and there are no further changes of ranking among the states. This circumstance has already been studied [33] and is due to the shape of the matrices $M_{sp}$. Indeed, the external area (where $M_{sp} = 0$) is greater than the internal area (where almost all elements $M_{sp} = 1$) for each analyzed year.
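The comparison of two rankings year by year can be sketched as follows. The sketch assumes that each ranking is a permutation without ties, so that the classical formula based on squared rank differences applies. The function names and the toy rankings are hypothetical.

```python
def spearman_from_ranks(rank_a, rank_b):
    """Spearman coefficient of two tie-free rankings of the same items."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def mean_yearly_spearman(ranks_a_by_year, ranks_b_by_year):
    """Average of the yearly Spearman coefficients over the years present in both inputs."""
    years = sorted(set(ranks_a_by_year) & set(ranks_b_by_year))
    values = [spearman_from_ranks(ranks_a_by_year[y], ranks_b_by_year[y]) for y in years]
    return sum(values) / len(values)


# Hypothetical rankings of 4 states in two years (position of each state in the ranking).
ranks_exo = {2014: [1, 2, 3, 4], 2015: [1, 2, 3, 4]}
ranks_endo = {2014: [1, 2, 3, 4], 2015: [2, 1, 3, 4]}
r_mean_toy = mean_yearly_spearman(ranks_exo, ranks_endo)
```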

Exogenous Fitness and ECI
In Fig 10 we show the time evolution (from 2002 to 2015) of the ranking of the Brazilian states according to ECI, directly downloaded from the Dataviva platform [27]. In order to compare the rankings obtained through the Exogenous Fitness algorithm and the ECI algorithm, we calculate the annual Spearman correlation coefficient between the two rankings in the period 2002-2015, obtaining an average value $r_{ExECI} = -0.14$. This result shows an almost total absence of correlation between the two rankings, i.e. between the two algorithms.
Indeed, already from a qualitative point of view, the ECI ranking seems to be unrealistic. For example, it places states that are rich in GDP and also have a high HDI [27], such as Santa Catarina or Paraná, in the last positions (respectively the 26th and 24th position in 2015). Moreover, the state of Alagoas (last in the HDI ranking of 2014 [27]) is unrealistically ranked in 4th position in 2015.
In Fig 11, we show the map of Brazil where each state is colored according to its ECI. From the figure, it emerges that there is no geographic coherence among the ECI values of the different states. For instance, the figure shows that the state of Santa Catarina has a high ECI, but lies between the states of Rio Grande do Sul and Paraná, which have a low ECI. Moreover, in contrast with the Exogenous Fitness (Fig 8), using the ECI the dynamics of the states is unpredictable. Indeed, all the states except São Paulo and the Distrito Federal are concentrated in a small region of the plane and are, therefore, totally indistinguishable.

Discussion
In this paper we first compared the dynamics of Brazil in the Fitness-Income plane with that of the other BRIC countries. In Fig 4, we observed that the IC countries (India and China), both with a high Fitness compared to the BR countries (Brazil and Russia), grow in GDPp for the entire analyzed time interval. Table 1 shows that the IC countries improve the Complexity of their export baskets in the analyzed time interval, and China even shows an improvement in diversification. The BR countries, instead, did not invest in diversification and in the Complexity of the exported products (as shown in Table 1). These results strengthen a hypothesis previously formulated in [7]: Fitness is the driving force behind growth.
In the second part of the paper, we introduced a new algorithm called "Exogenous Fitness" to calculate the Fitness of subnational entities, and we applied it to the states of Brazil. The comparison between the Fitness and the GDPp showed a heterogeneous dynamics of the Brazilian states in the Fitness-Income plane. Indeed, two regions are distinguishable in the plane: one with high predictability and the other with low predictability. Here, we have shown that economic forecasting is possible for the states in the high predictability region, while it is not for those in the low predictability region. As a consequence of this analysis, Fitness seems to be the driving force behind growth. Indeed, the dynamics in the high predictability region is characterized by high values of Fitness, while a high value of GDPp is not a good signature of growth. The heterogeneous dynamics observed for the Brazilian states shows a strict analogy with the heterogeneous dynamics observed for the World countries [7]. Furthermore, by comparing the export "spectroscopy" of the BRIC countries with that of the Brazilian states of São Paulo, Paraná, Ceará, and Roraima, and by comparing the variations of the Fitness, we observe that countries/states with diversified export baskets produce highly complex products and grew in GDPp in the considered period. This observation can be important for the evaluation of the prospects of economic growth for Brazilian states and, more generally, for developing countries.
The time evolution of the ranking obtained through the Exogenous Fitness algorithm shows that developed states in the top part of the ranking change their positions little, with a smooth slow motion. On the contrary, states in the lower part of the ranking change their positions drastically during the analyzed time interval. These facts are probably due to the stability of the developed states, which are in the high predictability region of the Fitness-GDPp plane, and to the instability of the states in the low predictability region.
Finally, we showed the lack of correlation ($r_{ExECI} = -0.14$) between the ranking obtained through the Exogenous Fitness algorithm and the results of the ECI published by Dataviva [27]. Analyzing qualitatively the ranking of the states according to ECI, we argued that this ranking appears quite unrealistic. Therefore, we propose here the Exogenous Fitness algorithm as a valid substitute. Instead, comparing the Exogenous and (Endogenous) Fitness we obtained a strong correlation ($r_{ExEn} = 0.97$) as far as the ranking of states is concerned. This result shows that the two algorithmic tools are nearly equivalent in identifying the ranking of the states, but