An evolutionary theory for interpreting urban scaling laws

We try here to illustrate the relevance of an evolutionary theory of urban systems for explaining their hierarchical properties. The largest cities became larger because they were successful in adopting many successive innovations. Larger cities capture innovations continuously (through adaptation, imitation, anticipation), and they concentrate a larger share of anything "new" at any time. Their functions display a higher level of complexity or sophistication in urban activity and society. The most advanced technologies concentrate in the largest cities, commonplace activities are ubiquitous, and old ones remain in small towns only (or, in economic terms, there are increasing, constant or decreasing returns to urban scale). Such regularities can be expressed in the form of scaling laws, which have been recognised as revealing specific constraints on the structure and evolution of complex systems in physics and biology. Approaching urban activities through scaling laws provides a link between the concepts of urban function, city size, and innovation cycles. Over time, there is a substitution among the activities of the largest cities, where the oldest technologies and professions are replaced by new ones, while the old ones become relatively concentrated in the smallest towns. Such an evolution is observed within urban systems where cities are fully connected and interdependent. Processes of cooperation, through exchanges of information and learning, enable an incremental diffusion of innovation and a continuous adaptation of urban activities and society in all parts of the system. Meanwhile, processes of competition between cities enhance their capacity to innovate and to reap the benefits of innovation.
A distributed process of urban growth including a slight but continuing advantage for the largest cities is the result of that competitive process, while the successive adaptation of the urban functions follows the economic and social innovation cycles.


Introduction
Many initiatives nowadays aim at building a theory of complex systems. A number of bodies have recently been created for the purpose of developing a field of knowledge that is thought to be transversal to many sciences. This ambition is mentioned on the website of the earliest of these institutions, the Santa Fe Institute for Complex Systems (created in 1984): "transcending the usual boundaries of science to explore the frontiers of knowledge", and a similar declaration appears on the website of the European Complex Systems Society (created in 2004). Among the transversal tools provided in the framework of this new paradigm, scaling laws are attracting researchers' interest. Scaling laws are power-law relationships of the form Y = cX^β, where Y represents a variable which varies in a systematic way with the size X of subsystems, and c and β are parameters. They have two powerful advantages: they summarize structural features of systems in a very efficient way, and they reveal the effect of universal constraints acting on the structure and development of these systems over several orders of magnitude (West et al., 1997, Brown and West, 2000, Sutton, 2002).
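The power-law form can be restated in two equivalent ways that are convenient for reading the exponent. The block below is only a rewriting of the definition given in the text, using the same symbols Y, X, c and β: taking logarithms turns the law into a straight line of slope β, and dividing by X gives the quantity per unit of size.

```latex
Y = c\,X^{\beta}
\quad\Longleftrightarrow\quad
\log Y = \log c + \beta \log X ,
\qquad
\frac{Y}{X} = c\,X^{\beta - 1}.
```

Read this way, β > 1 means that Y per unit of size grows with X, β = 1 means that it is constant, and β < 1 means that it declines.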
Scaling laws in urban systems have been explored in various directions, such as the fractal organization of urban morphologies (Frankhauser, 1993, Batty, Longley, 1994) or of urban networks and traffic flows (Genre-Grandpierre, 2000), as well as the hierarchical differentiation of urban sizes, traditionally observed in urban geography through the prism of the rank-size rule and central place theory. The purpose of this paper is to show how scaling laws could be used in geography for unifying previously independent theories of urban structure and development, by connecting the concepts of urban function, innovation cycles, urban size and urban growth in an evolutionary theory of urban hierarchy (Pumain, 2004 and 2006). We choose here to limit our investigation to the possible expression of scaling laws as mathematical relationships between the population size of large numbers of towns and cities and a variety of their activities. These scaling relationships will then not merely concern the distribution of a single variable, population size, as in our investigations of Zipf's law (Pumain, Moriconi, 1997), but will rather resemble allometric relationships, describing how a variable changes with population size.
There is no question of a simple analogy which would "naturalize" urban systems; on the contrary, we want to develop a relevant framework for transferring these concepts properly to urban theory. Transferring to social systems the concepts for which scaling laws were analyzed in physics or biology is neither direct nor simple. G. West and his collaborators demonstrated that the allometric relationships observed in biological systems, which exhibit power laws with a 3/4 exponent between metabolic rate and body mass across a full range of biological species, could be explained by the constraint of optimizing the transportation of energy to their smallest elementary units. This constraint was solved by a fractal geometry of the corresponding branching networks (with exponents that are multiples of 1/4 rather than the multiples of 1/3 that purely geometrical constraints would produce). If we think of the constraints which can both structure a system of cities and affect the development of its parts, we have to consider that energy cannot be taken as the major universal limiting constraint. The limiting factors to the development of social systems are not only physical; they include a very broad variety of resources of different kinds: energy and natural resources, of course, but also human labor and social, cultural and technological resources. Towns and cities are precisely the type of places where the social division of labor has progressively increased to very high levels, and where inventions become fixed once they have been successfully adopted as innovations (Schumpeter, 1926). The main resources enabling urban development are the technical and cultural innovations which increase the productivity, the diversity and the cohesion of human activities; the availability of these resources relies on the production and exchange of information. In the end, if we were able to convert each of these different resources into its energetic equivalent, it is possible that similarities with natural systems would appear (the largest cities exhibiting economies of scale in all energetic expenses), but that remains uncertain. Such a conversion was once suggested in the UNESCO program named "MAB" (Man and Biosphere), but the approach was not successful because of the difficulty of measuring all activities and converting them into comparable units.
The first investigations we have made confirm the intuition of a major qualitative difference in scaling laws between the biological and social worlds. The exponent of 3/4 which was found when expressing the metabolic rate as a function of the size of living species reveals that the energetic expenses per living unit (cell) were reduced during biological evolution, enabling larger bodies to survive. Conversely, the relationship which appears in the case of urban systems seems to demonstrate that the equivalents of energetic inputs and outputs per capita tend to increase with city size. An example of such a scaling relationship is that between patenting (as a proxy measure for invention or innovation) and population size. Recent work demonstrates that the statistically robust relationship between metropolitan patenting in the United States and population size is, mathematically speaking, super-linear (or, to use the language of economics, exhibits increasing returns to urban scale). This is also true for GDP per capita statistics, and a few indicators of urban costs (Bettencourt et al. 2004, Strumsky et al., 2005).
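The contrast between the two worlds can be made concrete with a small calculation. The sketch below assumes that a power law Y = c X^β holds exactly and computes how the quantity per unit of size changes when size is multiplied by a given ratio. The function name `per_capita_factor` and the exponent 1.2 used for the urban case are illustrative assumptions, not values taken from the studies cited above.

```python
def per_capita_factor(beta: float, size_ratio: float) -> float:
    """Factor by which Y/X changes when size X is multiplied by size_ratio,
    assuming the power law Y = c * X**beta holds exactly."""
    if size_ratio <= 0:
        raise ValueError("size_ratio must be positive")
    # Y/X = c * X**(beta - 1), so the constant c cancels in the ratio.
    return size_ratio ** (beta - 1.0)


if __name__ == "__main__":
    # Biological case from the text: metabolic rate vs body mass, exponent 3/4.
    # The result is below 1: expense per unit falls as bodies get larger.
    print(per_capita_factor(0.75, 2.0))
    # Hypothetical superlinear urban quantity (exponent chosen for illustration).
    # The result is above 1: the quantity per capita rises with city size.
    print(per_capita_factor(1.2, 2.0))
```

With β = 3/4, doubling size reduces the per-unit quantity; with any β above 1, doubling size raises it, which is the qualitative difference described in the paragraph above.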
A systematic comparative study was undertaken to measure scaling laws in urban activities among 331 metropolitan areas of the United States (Lobo et al. 2005), 354 "aires urbaines" in France (Paulus, 2004), and 87 South African urban agglomerations (Vacchiani-Marcuzzo, 2005). These countries were chosen because they are very different in size, level of development and settlement style. Together they represent three major ideal types of systems of cities generated by the history of urbanization: long-standing urbanization without major external shock (France), and late urbanization developed in waves, at first in connection with an external metropolis, in a developed country (USA) and in a less developed one (South Africa). One can thus expect that the results presented here provide insights into general properties of systems of cities. The main empirical evidence is that the scaling exponents take variable values according to the sectors under consideration: some activities are simply proportional to population size (exponent close to 1), others scale superlinearly (exponent larger than 1), whereas others scale sublinearly (exponent less than 1). However, we were able to recognize a coherent pattern in these different values, which are rather clearly related to the stage of the activities in the innovation cycles. We shall now develop this part of our evolutionary theory for explaining scaling laws in urban systems.

Theoretical interpretation
We propose to connect, in a global interpretative framework, the empirical facts that are available about scaling laws on the one hand and urban growth processes on the other. The theory states that in a system of cities, which are connected by a large variety of physical and social networks enabling the exchange of information, towns and cities are engaged in a general competition for growth, or attempt to reduce the uncertainties which limit their development. This occurs mainly through the circulation of innovations, which reach cities in a differentiated way. The diffusion of innovation throughout the system conveys impulses for urban growth (gains from adaptation), with variable intensity according to the stage in the cycle of innovation (Pred, 1977). The size of cities and the diversity of their functional specialization are the result of this cumulative process, which is self-maintained because interurban competition is also an incentive to innovate, as a feedback effect from the system on individual towns and cities.

Innovation cycles
Knowledge and information, reflexivity and the capacity to learn and invent drive urban development. The crucial role that cities have played in the generation of innovations (intellectual and material, cultural and political, institutional and organizational) is well documented (e.g. Bairoch 1988, Braudel 1992, Hall 1998, Landes 1999). The role of cities as centers for the integration of human capital and as incubators of invention was rediscovered by the "new" economic growth theory, which posits that knowledge spillovers among individuals and firms are the necessary underpinnings of growth (Lucas 1988, Romer 1986). As Glaeser (1994) points out, the idea that growth hinges on the flow and exchange of ideas naturally leads to recognition of the social and economic role of urban centers in furthering intellectual cross-fertilization. Moreover, the creation and storage of knowledge in cities increases their attractive pull for educated, highly skilled, entrepreneurial and creative individuals who, by locating in urban centers, contribute in turn to the generation of further knowledge spillovers (Feldman and Florida 1994, Florida 2002, Bouinot, 2004, Glaeser and Saiz 2003). This seemingly spontaneous process, whereby knowledge produces growth and growth attracts knowledge, is the engine by which urban centers sustain their development through unfolding innovation. The essential role of knowledge generation, recombination and circulation within and across urban areas must be at the core of any proposed explanation for urban scaling.
If we now consider how change in economic activity proceeds over time, we do not see a continuous flow of novelty but more or less intense periods of creativity: the major innovation cycles called "revolutions" in the economic and historical literature. For the last three centuries we recognize at least five of these major cycles, all closely connected to urbanisation and to the diversification of urban activities: the development of planetary maritime trade and the banking sector in the 17th and 18th centuries, the first industrial revolution of steam machines and railways at the beginning of the 19th century, electricity and the automobile at the turn of the 20th century, the electronic revolution in the middle of the 20th century, and the revolution now under way in converging technologies (NBIC: nano, bio, information and cognition) for the 21st century.
It is clear that the time span of each cycle has shortened over historical time (from hundreds of years before the industrial revolution to a few decades nowadays), but in each of them three main stages are always recognizable:
- Emergence stage (leading technologies, highly skilled jobs)
- Diffusion stage (mature technologies, skilled jobs)
- Stage of decay (banalisation and/or substitution by new products, old techniques, unskilled jobs)

Innovation sustains urban growth
When a city adopts an innovation (or is selected as a place to produce the corresponding goods and services), there is generally a return, including the profits generated by the new activities as well as induced benefits. This process was formulated long ago as economic base theory. Urban growth is highest in the emergence stage (because of the initial advantage associated with a new production), so that in each time period advanced cities keep pace with innovation, draw returns from it and grow, whereas cities that do not adapt grow less (or not at all). Urban growth may translate, in variable proportions, into an increase in population or general wealth, but it can also include more qualitative aspects such as change in human capital, acquisition of knowledge and diversification of local resources.
Empirical studies have observed that urban growth rates are linked with the innovation cycles (Berry, 1991), including a changing relation with city size: at the beginning of a cycle, larger cities tend to grow faster, then growth rates tend to equalize, then small towns tend to grow faster (Robson, 1973). This last stage was even interpreted as a "counterurbanization" trend (predicting a decline or even the "end" of the largest metropolises) during the 1970s and 1980s, whereas it simply marked the end of the post-Second World War innovation cycle (Cattan et al., 1994).

Spread of innovation and diversification of urban functions
Each innovation wave propagates through a system of cities according to two main processes. The first was identified long ago by T. Hägerstrand as a hierarchical process of diffusion of innovation: the largest cities capture the benefit of the innovation first, and later let it filter down the urban hierarchy. The largest cities are the ones which benefited from their adaptation to many innovation cycles, which explains their large size. As a consequence, they have also developed a broader diversity of activities and attained higher levels of social and organisational complexity. These characteristics explain why they have a higher probability of adopting any further innovation at an early stage. The many contemporary studies of so-called "metropolisation" rediscover a process which has long been constitutive of the dynamics of systems of cities (Pumain, 1982), at a time when globalisation trends and the general conversion to the "information society" are shaping a new broad cycle of innovations.
A second process in the diffusion of innovation is just as old in the evolution of urban systems, and leads to the formation of new urban functional specialisations. A few cities can either create a new type of activity or be selected for developing it, because their specific amenities correspond to the location factors of the new activity. Usually these specialisations, closely connected with a "product cycle", did not sustain the urban development they had created, and they gave rise to "generations" of cities (such as the cities of the textile industry in 18th century Europe, or of heavy mining industry in the 19th century). Moreover, multivariate analyses of urban activities have demonstrated that the main functional differentiation among cities can be described in terms of the broad innovation cycles that have driven socio-economic change (Pumain, Saint-Julien, 1978, Paulus, 2004). The major "industrial revolutions", for instance, still appear after many decades as "factors" differentiating the portfolios of activities, according to the intensity with which each city participated in that innovation cycle. Indeed, bundles of activities have been growing or decaying simultaneously within cities, and these were interpreted in terms of functional specialisation.
These two processes have consequences for the dynamics of systems of cities. The activities which can diffuse widely in the system tend to reinforce the relative weight of the large cities, because of the growth advantage linked to the earlier stages of innovation. By contrast, the activities which selected a few specialised towns because of specific location factors first boost the development of these towns, sometimes with spectacular growth rates, and then tend to hamper their further development by weakening their capacity to adapt.

Scaling in urban systems
The largest cities became larger because they were successful in adopting many successive innovations. Many of these innovations later become part of the activity of all towns and cities, since they meet needs that have become commonplace (think for instance of primary and secondary education and health services in cities of the developed world today). But the costs of functioning in these large urban areas are also much higher, and many activities are forced to migrate out to smaller settlements where they can sustain their economy. So in each time period, the activities belonging to a new cycle of innovation remain blocked for a while at the upper level of the urban hierarchy, then diffuse among other cities, then shrink, leaving the largest cities first, and in the end remain concentrated in a few small towns only. At any given moment, it can therefore be expected that the most advanced technologies concentrate in the largest cities, current ones are ubiquitous, and old ones remain in small towns. The corresponding activities can then exhibit three different scaling parameters:
- Leading technologies (top of current innovation cycle): β > 1
- Commonplace (or banal) technologies (diffusion stage): β = 1
- Mature technologies (decay or substitution stage): β < 1
This expectation was not contradicted by our empirical results. Of course, the official nomenclatures which we use are not always in exact correspondence with the historical innovation cycles: they were not designed for this purpose, even if they are revised from time to time to provide a better adapted description of current economic activities (Desrosières, 1993). Given the somewhat arbitrary aggregation of activity sectors they provide, it would be hazardous to interpret the value of scaling parameters in absolute terms. Moreover, it is obvious that the activities we classify as "mature" can be just as up to date, in terms of technological and managerial processes at the level of the firm, as the diffusing or even leading ones. What we express here is an aggregated spatio-temporal view, at the level of a whole system of cities and for very long periods of time.
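The three expected regimes can be written as a simple decision rule on an estimated exponent. The sketch below is illustrative only: the function name `scaling_regime` and the tolerance band around 1 are assumptions introduced here, since the text warns against reading exponent values in absolute terms and gives no threshold for "close to 1".

```python
def scaling_regime(beta: float, tolerance: float = 0.05) -> str:
    """Map a scaling exponent to the stage suggested by the theory.

    The tolerance band around 1 is a hypothetical choice made for this
    sketch; the source does not specify how close to 1 counts as linear.
    """
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    if beta > 1.0 + tolerance:
        return "leading (superlinear)"
    if beta < 1.0 - tolerance:
        return "mature (sublinear)"
    return "commonplace (linear)"


if __name__ == "__main__":
    # Exponents chosen for illustration only.
    for value in (1.4, 1.0, 0.8):
        print(value, scaling_regime(value))
```

In practice the band should reflect the uncertainty of each estimate (for instance a confidence interval on the fitted slope) rather than a fixed number.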

Scaling and diversity of urban functions
Our first test of the theory relies on the distribution of the labor force in 354 French urban areas ("aires urbaines", which are roughly equivalent to the American Standard Metropolitan Areas, SMAs).
According to the theory above, the activity profile of the largest cities is expected to be more diversified than that of the smallest ones: if large cities successfully adopt many innovation cycles, they keep footprints of past cycles, so their functional profile should be more diverse and more complex. To verify this, the number of employees in 96 economic sectors was collected in order to calculate a coefficient of specialization (Paulus, 2004). Isard's coefficient corresponds to the Euclidean distance between the economic profile of the town and the mean profile. A value close to 0 means that the city's economic profile is diverse. Conversely, a city with a coefficient close to 1 has most of its employment concentrated in a single economic sector. We computed a diversification index D based on Isard's coefficient I, namely D = 1 - I.
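The diversification index can be sketched in a few lines, following the description above literally: a Euclidean distance between a city's sectoral profile and the mean profile, subtracted from 1. Several choices here are assumptions not stated in the text: profiles are taken as employment shares, the mean profile is taken as the share of each sector in the whole urban system, and no further normalization is applied. The names `shares` and `diversification_index` are hypothetical.

```python
import math
from typing import List, Sequence


def shares(counts: Sequence[float]) -> List[float]:
    """Convert employment counts by sector into shares summing to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("total employment must be positive")
    return [c / total for c in counts]


def diversification_index(city_counts: Sequence[float],
                          system_counts: Sequence[float]) -> float:
    """D = 1 - I, with I the Euclidean distance between the city profile
    and the mean profile (assumed here to be system-wide sector shares)."""
    if len(city_counts) != len(system_counts):
        raise ValueError("both profiles must cover the same sectors")
    city = shares(city_counts)
    mean = shares(system_counts)
    distance = math.sqrt(sum((p - q) ** 2 for p, q in zip(city, mean)))
    return 1.0 - distance


if __name__ == "__main__":
    system = [100, 200, 300]            # made-up employment in three sectors
    print(diversification_index([10, 20, 30], system))   # same profile as the system
    print(diversification_index([60, 0, 0], system))     # single-sector town
```

A city whose profile matches the system gets D equal to 1, and a town concentrated in one sector gets a markedly lower value, which is the ordering the index is meant to capture.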
The relationship between city size and economic diversity is evident in Figure 1.
The correlation is strong, with a coefficient of determination of 50%, even if variations remain. All "aires urbaines" larger than 200 000 inhabitants belong to the most diversified group of cities. The less diversified cities are all small ones.

Scaling parameters and stage of activity sectors in the innovation cycle
Not only this global indicator but also more detailed investigations of scaling parameters, using data on employment by economic sector, agree with our interpretation. We plot cities according to their size (logarithm of the number of inhabitants) on the X-axis and the logarithm of the number of employees in a given economic sector on the Y-axis. To calculate the scaling exponent (β), we estimate by least squares the slope of the line that fits the set of points. The data set is provided by the last French census, in 1999.
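The estimation procedure just described can be sketched as an ordinary least-squares fit on log-transformed data. The function name `fit_scaling_exponent` is hypothetical, and one choice in it is an assumption rather than the authors' documented practice: cities with zero employment in the sector are dropped, because their logarithm is undefined.

```python
import math
from typing import Sequence, Tuple


def fit_scaling_exponent(population: Sequence[float],
                         employment: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of log10(employment) on log10(population).

    Returns (beta, log10_c, r_squared). Cities with non-positive values
    are skipped (an assumption of this sketch).
    """
    points = [(math.log10(p), math.log10(e))
              for p, e in zip(population, employment) if p > 0 and e > 0]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two cities with positive values")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all cities have the same population")
    beta = sxy / sxx                      # slope of the fitted line
    log10_c = mean_y - beta * mean_x      # intercept, i.e. log10 of c
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return beta, log10_c, r_squared


if __name__ == "__main__":
    # Synthetic cities following an exact power law, for illustration only.
    pops = [1e4, 1e5, 1e6, 1e7]
    jobs = [2.0 * p ** 1.5 for p in pops]
    print(fit_scaling_exponent(pops, jobs))
```

Dropping zero-employment cities can bias the slope when many small towns have no jobs in a sector, so the treatment of zeros deserves attention when interpreting exponents for rare activities.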
Table 1 reports the scaling exponents of some economic sectors, classified according to their approximate stage in the innovation cycle. Research and Development (R&D) is a good proxy for the current innovation cycle (Figure 2 and Table 1). The β exponent is very high, at 1.67. It confirms that this economic sector, which is emblematic of the "knowledge society", is much more developed in the largest cities and remains absent or tiny in the smallest ones. While the β exponent is high, we should note the medium quality of fit, with a coefficient of determination of 60%. Indeed, small towns can adopt the current innovation cycle just as the largest ones do, and can even be more advanced when they have anticipated the development of the R&D sector. Nevertheless, all "aires urbaines" larger than 500 000 inhabitants have a larger share of employment in this sector than the average for the whole urban system. The same can be observed for consultancy and assistance activities, which are, among all business services, the ones that developed most during the last two decades (Table 1). The employment level in hotels and restaurants can be interpreted as a proxy for measuring the impact of the innovation of tourism. Tourism emerged at the end of the 19th century, as long-distance travel became faster with railway networks. This activity spread widely during the 1960s and can now be considered a diffusing activity. The β exponent is exactly equal to 1 and the quality of fit is very good. Just a few small towns have many more employees in hotels and restaurants than the average in the urban system. These cities are specialized, and we can now raise the question of the durability of their dynamism.
Manufacture of food products, as a mature industry, scales sublinearly with city size (Table 1). This activity was an innovation long ago, when it replaced domestic production. Nowadays it remains important in small towns only and tends to have lower proportions in Paris and other large cities, from which manufacturing activities have moved away since the 1960s.
(Of course there can be other reasons for locating food production in the countryside, such as proximity to places of production; a much more detailed analysis of the sector should be made, and we recognize here the inadequacy of the existing nomenclature for the purpose of our study.) We have given some illustrations of the scaling parameters which link the urban system hierarchy and the stages in the technological development of urban activities. In order to establish stronger evidence of the validity of these parameters, we present in Table 2 scaling exponents that have been calculated for the United States urban system. When urban units are defined as SMAs, there are about the same number of cities as in the case of France. Employment data by sector come from the Census 2000, using the NAICS nomenclature. There is no exact matching between this economic nomenclature and the one used by the French statistical institute, but reasonable comparisons are possible. Overall, the values of the scaling exponents of similar activities are close. The economic sectors which belong to the most recent innovation cycle, including all modern business services such as finance and insurance, real estate or scientific services (generally summarized as APS and FIRE), scale superlinearly with city size in both countries. Banal activities such as utilities, accommodation and food services or retail trade scale almost linearly with size. Sublinear scaling would probably characterize some subdivisions of the manufacturing sector, as is the case in the French nomenclature of activities. Manufacturing as a whole scales with an exponent of 0.89 in France and 0.97 in the US, suggesting that these activities are more mature in France than in the US.

Stages in technological development
Similar scaling exponents are observed most of the time for the US and French urban systems (for instance, retail trade: 0.95 in both cases). In the US, some economic sectors scale more superlinearly with city size than in France. This is the case for finance and insurance industries, transportation and wholesale trade. Nevertheless, the values of the β exponents in France for these economic sectors are also above 1. The largest deviation between these results concerns educational services, which scale with a 1.23 exponent in the US, whereas in France the value is below 1. This can be a consequence of differences in the nomenclatures, and is perhaps linked to different state regulations, since social welfare policies in France aim at improving the proximity of the population to educational infrastructure (schools as well as universities). The exponent values are more contrasted among economic sectors in US SMAs than in French aires urbaines. We recall here that there is a higher contrast in city sizes within the US urban system, as expressed by a higher value of Zipf's coefficient than in France. We may wonder about the explanatory meaning of this observation: are economic sectors creating more inequalities among US cities because they shift more quickly from one location to the next according to the expected returns linked with city sizes? Or do they simply adapt to a previous state of the structure of the urban system, whose stronger hierarchical differentiation has other causes (mainly the historical period of the development of the system, including high-speed means of transportation which enabled larger urban concentrations and wider spacing between them, see Moriconi, 1993 and Bretagnolle, 1999, Bretagnolle et al., 2002)?

Scaling parameters and hierarchy among occupational groups
We also applied the same method to occupational groups, as they are described by the French census. Following this nomenclature, it is rather easy to classify the labor force according to average skill levels. We found that highly skilled jobs scale superlinearly with city size, whereas unskilled jobs (mainly workers) scale sublinearly, and jobs with an average level of skill (such as teachers) are simply proportional to city size (Table 3). We can display a synthetic view of the strong relationship between professions and city size by using factor analysis. The input table is rather simple, with 8 rows for urban size classes and 5 columns for occupational groups. We consider here the least detailed partition of professional groups. On this plot (Figure 3), the distribution of occupations underlines the transition from small towns to Paris, the largest city. The society of small towns concentrates many workers, whom we can consider as unskilled according to the current technological stage. Medium-sized towns concentrate relatively more skilled employees, such as technicians, clerks and salesmen. The largest cities, and especially Paris, have a larger proportion of highly skilled people. This cross-sectional view corresponds to the historical process by which more and more skilled activities emerge at first in the largest cities. Unlike the economic nomenclature, whose types of products change over time, reflecting more or less the progress of innovation cycles, the nomenclature of professions evolves at a slower pace. It represents in an approximate way the hierarchy of social status, which in modern societies is linked to the level of skill, even if only loosely. The concrete content of skill may change rapidly over time, while the identification of the corresponding social categories remains the same. That is why, unlike economic sectors, the aggregate categories corresponding to the highest skill (executives, intellectual professions), which contribute intensely to innovations, are not likely to diffuse to all cities over time, but remain blocked in the largest. But if we considered professions in a more detailed way, we could find that, for instance, the highly skilled mechanic who constructed automobiles one by one in the very center of Paris at the beginning of the 20th century is nowadays a worker in a decentralized plant in a much smaller town. Of course the equivalence is not easy to establish and the evolution is not so simple.

A further test: evolution of scaling exponents through time
Another test of the theory consists in observing how the scaling parameters evolve over time.
One can expect that the currently leading technologies can still increase their parameter value, while activities of older cycles should have decreasing parameters. Using our historical database on economic employment in French urban areas from 1962 to 1999 (Paulus, 2004), we have explored the evolution of the scaling parameters (Figure 4). The upper graph of Figure 4 shows the sectors whose β exponent increases from 1960 to 2000. They are the three economic sectors which are involved in the current innovation cycle. A good example is R&D, whose β exponent was about 0.7 in 1962, rose to 1 in 1968, and reached 1.67 in 1999. It may seem surprising that the β exponent is less than 1 at the beginning, at a time when this sector was starting to acquire a crucial role in the productive system. This can be understood if we keep in mind that, at that time, the total number of employees in this sector was very low and 90% of cities had no employees at all in it. In this context, some small towns hosting a research centre could count as many researchers as larger cities, Paris being an exception. But as research activity became prominent in the economy, with a high rate of employment growth, the largest cities rapidly took a leading position.
In the same context of the growing importance of the communication society, we also notice that post and telecommunications scaled with an exponent of 1 in 1968 and of 1.2 in 1999. Consultancy and assistance activities are more stable, with a β exponent always above 1. On the same graph, education and retail trade exhibit β exponents which are very close to 1 at each date. This can be explained by their rather stabilized function at our stage of development (although the content of the function was renewed many times), even if one could be tempted to interpret their spatially ubiquitous distribution, in a static functional perspective, as expressing the satisfaction of basic, universal and elementary needs of resident populations. Such activities could probably scale non-linearly in countries with a lower level of development or in earlier historical periods.
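The way such exponents can be estimated and compared across census years may be sketched as follows. This is a minimal illustration, not the procedure actually used on the French data: it assumes the usual power-law form, employment = a · size^β, fitted by ordinary least squares on logarithms. The function names, the synthetic city sizes and the prefactor are hypothetical; only the two target exponents (0.7 and 1.67) echo the R&D values quoted above.

```python
import numpy as np

def fit_scaling_exponent(city_sizes, sector_employment):
    """Return (beta, log_a) for employment = a * size**beta, fitted by
    ordinary least squares on the logarithms of both variables."""
    sizes = np.asarray(city_sizes, dtype=float)
    jobs = np.asarray(sector_employment, dtype=float)
    # Assumption of this sketch: cities with no employees in the sector
    # are dropped, since log(0) is undefined.
    keep = (sizes > 0) & (jobs > 0)
    beta, log_a = np.polyfit(np.log(sizes[keep]), np.log(jobs[keep]), deg=1)
    return float(beta), float(log_a)

# Synthetic data only (hypothetical, NOT the French census figures).
rng = np.random.default_rng(0)
sizes = np.logspace(4, 7, 200)  # hypothetical city populations

def synthetic_sector(true_beta):
    """Generate noisy employment counts following a known exponent."""
    noise = rng.normal(0.0, 0.1, sizes.size)
    return 1e-3 * sizes ** true_beta * np.exp(noise)

# Exponents assumed for the two synthetic "census years".
assumed_betas = {1962: 0.7, 1999: 1.67}
beta_by_year = {
    year: fit_scaling_exponent(sizes, synthetic_sector(true_beta))[0]
    for year, true_beta in assumed_betas.items()
}
```

Dropping the cities with zero employment is a simplification that matters precisely in the situation described above, where most cities have no employees in an emerging sector: the fitted exponent then rests on a small and possibly unrepresentative subset of cities.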

Decreasing β exponents
The lower graph of Figure 4 shows the decreasing values of the β exponents for some economic sectors over time. Most of them are manufacturing industries. This can be understood as a process of hierarchical diffusion of the technological development of these industries, leading towards a higher relative concentration in the lower part of the urban hierarchy. Nevertheless, we should note that all these activities keep β exponents above 1. Manufacturing industries are not all mature. Diversity inside those manufacturing industries is large, with some at the leading edge of technology and others belonging to an older stage. For example, while employees in the textile industry are fewer and fewer, the value of the β exponent is decreasing but remains above 1 (it was equal to 1.4 in 1962 and to 1.1 at the end of the century).
We see here the difficulty in considering that economic activities or urban functions directly reflect different stages in innovation cycles. The economic data provided by statistical offices are not well suited to isolating innovations. Nomenclatures of economic activity sectors are constructed to identify products, and are periodically revised to categorize new, innovative productions. But that process of categorization is not systematic: some new sectors are identified, while old sectors may remain under the same name in the nomenclature with a completely new content. A good example is the automobile industry: at first the small innovative workshops where automobiles were invented did not appear in the nomenclature under any other name than "mechanics". Later a category "automobile industry" was created. Today the category still exists, but the content of the activity has changed, involving robots and all kinds of technological improvements. Nowadays, its content reflects both innovation in the production process and a rather old invention in terms of the technical means of transportation. One can thus understand the poor quality of fit and the value of the β exponent remaining close to 1 after having decreased.
This last experiment helps to decide between two alternative interpretations of the urban scaling laws. The first one is longitudinal: it considers that the model represents the relationship between the size of a typical town or city and parts of its activities at different stages of its development over time. The second interpretation of the model is transversal (cross-sectional): it represents the distribution of different activities among cities of different sizes at a given period. The first interpretation does not consider the diversity in functional specialization among cities, but interprets the differences in the scaling parameters as reflecting the capacity of different activity sectors to adopt a spatial organization which optimizes the trade-off between the advantages and the costs of locations in cities of a given size. The second interpretation admits the functional diversity of the system of cities and the progressive substitution of activities among the different levels of the urban hierarchy over time. Our last result, showing the evolution of the scaling parameters of some economic sectors over time, seems to support this second interpretative framework. Another reason is that we observe these relationships inside systems of cities that have had a long enough coherent development (i.e., systems of cities defined as national urban systems, even if we know that, at least for the largest cities, the major determinants of their evolution can come from their competition with cities belonging to other systems).
At first sight, our theory may appear counterintuitive, since urban activities which scale sublinearly with city size, as is the case among biological species for metabolic rates and the size of organisms, should represent better efficiency or adaptation: they have found a form of organization which provides economies of scale. This may be true for some specific urban services such as water or energy supply. But to maintain such a static interpretation, one would have to claim that innovative activities waste resources, since they are more abundant in the largest places. We therefore prefer our evolutionary view, even if it seems more complicated, to the more general economic explanation, which cannot account for all the empirical evidence we have found. Of course, more empirical testing has to be done to consolidate our hypothesis.

Another view of innovation: foreign investment in South African cities
Systems of cities are by no means closed systems and, especially with globalisation processes, one could consider foreign investment in cities as part of an "innovation" which makes the economy of cities more international. There is a huge literature on the new international trade and division of labour which we will not detail here. We do, however, want to report an experiment which we made while measuring the effects of foreign investment on a developing country.
Our hypothesis was that such an "innovation" would adapt to the pre-existing structure of the system of cities and diffuse downwards in the urban hierarchy of city sizes. Céline Vacchiani-Marcuzzo (2005) made a survey of foreign investments in South African urban agglomerations. The location of head offices and branches of foreign firms in cities is interpreted here as an "innovation" linked to the context of the globalisation of the economy.
We use the number of these foreign firms as an indicator of the participation of cities in the current cycle of reorganisation of firms at a world scale. Therefore, our expectation is that it should scale superlinearly with city size.
In the case of South Africa, this interpretation must be checked against a deeper analysis of the linkages between the national economy and foreign countries: the location of foreign firms is by no means a new process, since it was a part of the colonial system. But there was a real boom in foreign investment in the country after the end of apartheid, and from 1994 onwards the growth curves of the number of foreign firms identify this process as a real "innovation" of the last decade. This new process clearly selects the upper level of the urban hierarchy, while smaller towns can still be chosen as locations for very specific economic sectors such as the mining industries, which in the past contributed in a very significant way to the creation of the northern part of the South African urban system. As a result, there is a clear and significant superlinear relationship between the number of headquarters of foreign firms and city size among the urban agglomerations of over 100,000 inhabitants. The relation also holds for their secondary branches, although it is less significant and the coefficient is slightly lower. Secondary branches are more widely diffused in medium-sized cities.
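One possible way of judging whether such a relationship is "significantly" superlinear is to restrict the fit to agglomerations above the size threshold and to check whether an approximate confidence interval for the exponent lies entirely above 1. The sketch below is only an illustration under stated assumptions: the function names, the normal approximation (z = 1.96) and the synthetic data are hypothetical, and it does not reproduce the CIS-CVM database or the test actually applied to it.

```python
import numpy as np

def fit_with_interval(populations, firm_counts, min_population=100_000, z=1.96):
    """Log-log OLS fit on cities above a size threshold.

    Returns the exponent beta and an approximate confidence interval based
    on the usual OLS standard error of the slope (normal approximation).
    """
    pop = np.asarray(populations, dtype=float)
    firms = np.asarray(firm_counts, dtype=float)
    # Keep agglomerations over the threshold that host at least one firm.
    keep = (pop > min_population) & (firms > 0)
    x, y = np.log(pop[keep]), np.log(firms[keep])
    dx = x - x.mean()
    beta = float(dx @ (y - y.mean()) / (dx @ dx))
    intercept = y.mean() - beta * x.mean()
    residuals = y - (intercept + beta * x)
    # Standard error of the slope: sqrt(residual variance / sum of squares of x).
    std_err = float(np.sqrt(residuals @ residuals / (x.size - 2) / (dx @ dx)))
    return beta, (beta - z * std_err, beta + z * std_err)

def regime(interval):
    """Classify an exponent interval relative to the linear case beta = 1."""
    low, high = interval
    if low > 1:
        return "superlinear"
    if high < 1:
        return "sublinear"
    return "not distinguishable from linear"

# Synthetic illustration (hypothetical values, not the South African data).
rng = np.random.default_rng(1)
populations = np.logspace(4, 6.5, 60)
firm_counts = 1e-6 * populations ** 1.3 * np.exp(rng.normal(0.0, 0.2, 60))
beta, interval = fit_with_interval(populations, firm_counts)
label = regime(interval)
```

With a wider interval, as one would expect for the secondary branches, the same classification could return "not distinguishable from linear" even when the point estimate is above 1.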

Conclusion
We have suggested a possible connection between scaling laws and the process of change occurring in systems of cities. The diversity of scaling exponents according to the urban activities, and their variations over time, could be explained by the hierarchical diffusion process of innovations in systems of cities. Innovations are likely to be adopted first in the largest cities. They diffuse everywhere when the technology becomes commonplace, and shrink towards the smallest towns in the latest stages of the product cycles. Of course, this schematic view holds only for activities which are likely to become ubiquitous; a few types of activities remain concentrated in a few locations, sometimes in connection with a functional specialisation of some towns. Our theoretical framework of interpretation has not been contradicted by the empirical results.
Moreover, we can expect more from this approach, in terms of explanatory power, by linking scaling laws and specific processes of urban growth. In the mathematical theory of scaling, and according to Geoffrey West and Luis Bettencourt, the results could apply at the level of one city and characterize its growth trajectory over time. We think that our reasoning is better suited to another interpretation, linking scaling laws and the distribution of growth rates within a system of cities at given moments in time. But of course we have to provide a further test of the theory. This could be done by confronting the mathematical model of growth which is deduced from our scaling parameters with our empirical observations about urban growth in systems of cities. Previously, we had used the Gibrat model as a proxy for this description (Bretagnolle et al., 2002). This model supposes a simple exponential model for the growth of individual cities, which would correspond to an exponent of 1 in scaling laws. As we have seen, at a given period we have a mix of urban activities, some of them scaling superlinearly, others just proportional to city size, and others sublinearly. Can we accept that "on average" (but this has to be demonstrated!) this makes a global exponent of 1 and hence exponential growth? We also recall from our detailed observations that there is a slight trend for larger cities to grow a little more rapidly, and small towns a little less rapidly, than the rest of the urban system in the long term (Pumain, Moriconi, 1997). This deserves further research for a more rigorous formalisation of these intuitions.
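The question of the "average" exponent can be made slightly more concrete with a small calculation. If one assumes, purely for illustration, that total employment is the sum of sectors each following its own power law, E(N) = Σ a_i · N^β_i, then the local exponent of the total, d ln E / d ln N, equals the mean of the sectoral exponents weighted by each sector's share of employment at size N. The sketch below computes this quantity; the function name, prefactors and exponents are hypothetical and are not estimates from our data.

```python
import numpy as np

def aggregate_exponent(city_size, prefactors, exponents):
    """Local scaling exponent of total employment at a given city size.

    Assumes total employment E(N) = sum_i a_i * N**beta_i, so that
    d ln E / d ln N = sum_i share_i(N) * beta_i, where share_i(N) is the
    share of sector i in total employment at size N.
    """
    a = np.asarray(prefactors, dtype=float)
    beta = np.asarray(exponents, dtype=float)
    sector_jobs = a * float(city_size) ** beta
    shares = sector_jobs / sector_jobs.sum()
    return float(shares @ beta)

# Hypothetical mix: one superlinear, one linear and one sublinear sector.
exponents = [1.2, 1.0, 0.8]
prefactors = [1.0, 1.0, 1.0]

# With equal shares (here at N = 1) the exponents average out to 1.
balanced = aggregate_exponent(1.0, prefactors, exponents)

# At a much larger size the superlinear sector dominates the shares,
# so the aggregate exponent of this hypothetical mix drifts above 1.
large_city = aggregate_exponent(1e6, prefactors, exponents)
```

Under these assumptions, a global exponent of exactly 1 holds only where the employment shares balance the exponents, which is one way of stating why the "on average" argument has to be demonstrated rather than taken for granted.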

Figure 1 : Economic diversity and urban size
Source: INSEE, Recensement de la population, 1990

Figure 5 : Relationship between number of multinationals and city size in South Africa
Source: Data base CIS-CVM, 2003

Table 1 : Scaling parameters and stage of economic sectors in the innovation cycle (France)
Source: INSEE, Recensement de la population, 1999, 350 aires urbaines

Table 2 : Scaling parameters and employment by sector in United States and French urban systems
Sources: United States Census 2000; INSEE, Recensement de la population, 1999
* codes of the categories of the NAICS nomenclature
** some sectors were aggregated for matching the content of the US categories

Table 4 : Scaling parameters for the number of foreign firms in South African cities
Source: Data base CIS-CVM / Census of Population 2001 and Data base CVM