Playing With Power-Law Curves: A New Way to Analyze Market Structures and Sectors

This paper outlines a universal empirical method that illustrates how power-law curves can be used to reveal changes in economic structures and relationships. The method takes power law analysis from a static snapshot to something more akin to a motion picture. The approach leads to an alternative to the classic Herfindahl-Hirschman sector concentration approach. It also provides a relatively stable characteristic numerical description for each sector and market segment and eliminates the biases caused by price inflation. Collection of such segment data can lead to a characteristic number, a signature, for a region's economy or a company's overall competitive position. The method and the valuable insights it can provide are applicable in an unlimited number of fields. Examples include everything from mergers and acquisitions and portfolio management strategies to areas involving studies in antitrust, banking, insurance, sports, and medical science.


INTRODUCTION
Power laws have been specified and explored for more than 100 years and are evident in virtually all parts of the economy and human endeavors. It all began with the discovery by Italian sociologist and economist Vilfredo Pareto, whose studies found that in general 80% of a nation's wealth was held by 20% of the population. This 80/20 split roughly applies to a diverse set of circumstances, e.g., 20% of posts to websites generate approximately 80% of the traffic. Similarly, 20% of a company's products might generate 80% of sales and/or profits, even though in some industry sectors such as music, 97% of profits might be generated by 3% of artists; the percentages need not always add to 100%.1 A popular application of this was the discovery (by George K. Zipf, 1902-1950) that word frequencies in languages are inversely proportional to rank in frequency tables. For example, in English the word "the" occurs most frequently and by itself accounts for nearly 7%, and "of" for around 3.5%, of all word occurrences.2 Pareto's Law is an example of the relationship "between the input and output of a tail distribution", whereas Zipf's Law represents "a power relation between the input and output of a rank distribution."4 This is the feature explored in this paper.
An idealized version of a power law with reference to movie box office rankings appears in Fig. 1 and would include thousands of film titles. This idealized version indeed substantially resembles the empirically derived presentation in which Axtell (2001) ranked U.S. cities by size of populations and found an almost perfectly straight downward-sloping (-1.03 versus a perfect -1.0) power law distribution (also shown in Gabaix 2016).
Fig. 1. Idealized power law applied to movie box office data.5
Gabaix (2008, 2016) provides a most comprehensive and important work on power laws and shows how the underlying mathematics and empirical data are related. He illustrates with the cumulative distribution of daily stock market returns for different capitalization sizes of stocks. Differing growth patterns that emanate from initial circumstantial conditions involving availability of capital, accrued expertise and knowledge, forecasting ability, availability of skilled labor, and other economic factors lead naturally to the development of power law characteristics. Analysis of these characteristics provides interesting and significant economic and socionomic insights.6
Archives of Business Research (ABR), Services for Science and Education - United Kingdom
Power law distributions have been studied and applied in many different fields, for example, Covid-19 (Jang, 2021 and Neipel, 2020), Internet topology (Faloutsos et al., 1999), earthquakes (Meng et al., 2019), trade (Eaton, Kortum, and Kramarz, 2011), metabolic rates (West, Brown, and Enquist, 2000), and wealth (Levy and Solomon, 1997). Clauset et al. (2009) and Eliazar (2020) provide great depth in covering the statistical and empirical nature of such distributions. Eliazar indeed notes that power laws are prevalent in the physical sciences (e.g., Newton's law of gravitation, Coulomb's law of electrostatics, Kepler's laws of planetary motion). It appears, however, that none of these have used the approach that is presented here. This paper aims to extend the existing framework by adding a dynamic aspect that provides a new and universally applicable way to understand sequential changes in a wide variety of sectors and in product and service market share studies. As such it can also serve as an adjunct to the classic Herfindahl-Hirschman Index that is frequently used to analyze market concentration aspects and trends.
The study begins by describing how the classic power law relationship can be extended to provide a more dynamic picture of what happens to a sector's market share over time. To make an analogy, the classic power law presents a snapshot; the method shown here is a bit more like a movie.
The second section provides the basics of power law applications. The third describes how the "snapshot" can be turned into a more dynamic representation. It conveys the underlying thinking and methodology with a relatively long data series (41 years) that also includes the pandemic year of 2020. The fourth section extends this methodology to a wider variety of sectors and services and discusses comparative results. The fifth and last section summarizes.

POWER LAW BASICS
Power laws, sometimes called scaling laws, are generally described by the expression $Y = \alpha X^{-B}$, where the variables are X and Y, B is the power law exponent, and α is a constant. The sign preceding the B is negative, which produces the declining slope moving from left to right along the horizontal axis. Data on the axes can be shown in terms of logs of rank and frequency (Fig. 1) or in ordinary numerical notation. Also, the rank and frequency axes can be swapped and the characteristic downward slope will still appear.
Classic power law studies show ranks and frequencies as a snapshot, even though the periods covered by the data can be extensive. Rankings, however, tend to change, and such changes can be at least partially captured by relatively simple comparisons related to different periods of time. For instance, Fig. 2 shows changes in professional sports team values for the years 2017 and 2020. The entire valuation structure rose over the intervening years and shifted the curve upward. This does not, however, preclude the possibility that there might be periods when technological, economic, or demographic factors send the more recent curve below the earlier one and/or when the slope of the curve, i.e., its power law coefficient, becomes noticeably more or less steep. A period-by-period estimation of these slopes is a relatively simple way to visualize such underlying changes. But it does not provide a deeper and more sophisticated way of seeing what is happening, of "looking under the hood," if you will, to see underlying features.
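A minimal sketch of the expression above, assuming hypothetical values for α and B (they are not estimates from this paper). Taking logs of both sides gives log Y = log α - B log X, so an ordinary least squares fit on log-log axes should recover a slope of -B. The helper names `power_law` and `ols_slope_intercept` are illustrative only.

```python
import math


def power_law(x, alpha, b):
    """Return Y = alpha * X**(-B) for a single value of X."""
    return alpha * x ** (-b)


def ols_slope_intercept(xs, ys):
    """Ordinary least squares fit of ys on xs; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical parameters, chosen only for illustration.
ALPHA, B = 1000.0, 1.0
ranks = list(range(1, 21))
values = [power_law(r, ALPHA, B) for r in ranks]

# On log-log axes the curve is a straight line with slope -B
# and intercept log(alpha).
slope, intercept = ols_slope_intercept(
    [math.log(r) for r in ranks],
    [math.log(v) for v in values],
)
print(round(slope, 3), round(math.exp(intercept), 1))
```

With noisy empirical data the fitted slope would only approximate -B; the exact recovery here is a property of the idealized input.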
The method developed here is aimed at providing this more robust analysis in a way that can be universally applied to comparative studies of changes in sector market concentrations or of changes in market shares of products and services.

ANALYZING THE DATA SETS

Means of Means and Characteristic Numbers
The motion picture business is a convenient place to start, because a large data set, from 1980 to 2020, is available from industry trade sources that include boxofficemojo.com, IMDb, and Variety magazine archives.7 These data can be readily used to create a dynamic statistical profile that indicates how the ranking concentrations have changed over time. In the film industry, for example, there was in the 1980s a much wider dispersion of box office revenue outcomes than in the 2010s, a period that saw Disney and Universal (owned by Comcast) gain major shares of the overall box office, both domestic and foreign. Classic power law presentations generally do not show such changes.
In this case, an annual ranking of position from top performer on down to lowest for domestic (i.e., U.S. and Canada, by industry convention) box office gross revenues by film title makes it possible to track such changes. As a practical matter, however, it is not necessary to rank below the top 20 positions, as anything further down does not contribute much useful information.
Such a ranking can be done with data using monetary metrics (e.g., domestic box office gross, dbo) or units (e.g., number of tickets sold at theaters), but the monetary metric provides the most convenient data series. However, over a long stretch of time, distortions and misleading results will be introduced by overall price inflation trends in the monetary metric or by population growth and demographic changes in the use of unit ticket counts. In other words, box office results of 20 years ago ought not be directly compared to those of today.
Other factors such as the pandemic of 2020, which shut down cinemas everywhere, and the housing crisis/Great Recession of 2008-9 will make raw data long term comparisons ever less reliable.
One way to circumvent these problems is to index each year's results. This removes the inflation bias from the rankings and also allows for inclusion of pandemic year 2020 data, when total dbo fell by 70% (but was still measurable as the numbers were above zero).
Table 1 shows the ranking of titles in 2018, in which the top rank is 100 and every subsequent item is indexed in descending order as a percentage of 100. The indexing can be computed as long as data for any product or service segment, section, or other variable factors are available. The need is generally only for data on the top 20 items because anything below that rank always has much diminished if not negligible significance.
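The indexing step described above can be sketched as follows. The gross figures are hypothetical (they are not the Table 1 data), and the function name `index_to_top` is an illustrative assumption. The second call suggests why indexing removes a uniform inflation bias: scaling every value by the same factor leaves the index unchanged.

```python
def index_to_top(values, top_n=20):
    """Sort values from highest to lowest, keep the top_n, and express each
    as a percentage of the top-ranked value (so rank 1 is always 100.0)."""
    ranked = sorted(values, reverse=True)[:top_n]
    if not ranked or ranked[0] <= 0:
        raise ValueError("the top-ranked value must be above zero")
    top = ranked[0]
    return [100.0 * v / top for v in ranked]


# Hypothetical grosses in $ millions, for illustration only.
grosses = [320.0, 700.0, 210.0, 610.0, 420.0]
indexed = index_to_top(grosses)

# A uniform price-level change (here +50%) does not alter the index.
inflated = index_to_top([g * 1.5 for g in grosses])

print([round(v, 1) for v in indexed])
```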
A plot of the power law curves for films, 2017-2020 appears in Fig. 3 and represents an indexed rank matrix with 20 rows and 4 columns (20 x 4).
The lines for several years will not necessarily all overlap or intersect each other but often do. And much additional information can now be extracted by playing with data in this matrix. The plot of these points displays the trends toward more or less market power/concentration over time and indicates that the lines can cross each other. Summary statistics can now be readily derived for each year and then categorized as in Table 2. First, the mean value of each yearly indexed column is calculated, which reveals that for 2017 to 2020 the sequence was 51.3, 40.5, 34.8, and 28.0. From this it may be inferred that the concentration or market share dominance held by the top ranks is weakening.
The average of these averages, or mean of annual column means (MoM), is the same as the result obtained when the averages of each indexed rank (row) across the years are themselves averaged. Higher MoMs suggest more market dominance and positional rigidity over time. The average of the row-rank items for this sample is 38.1. MoMs are thus a slow-to-change and characterizing statistical signature of any industry segment or sector. In this example the characteristic number is thus 38.1.
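As a rough illustration of the MoM calculation, the sketch below uses a small hypothetical indexed rank matrix (five ranks by three years rather than the 20 x 4 matrix of the text; the numbers are invented). It computes the mean of the annual column means and checks that the mean of the row means gives the same figure, which holds whenever the matrix is complete, i.e., every year has the same number of ranks.

```python
from statistics import mean

# Hypothetical indexed rank matrix: rows are ranks, columns are years.
matrix = [
    [100.0, 100.0, 100.0],
    [80.0, 70.0, 90.0],
    [60.0, 50.0, 40.0],
    [40.0, 30.0, 20.0],
    [20.0, 10.0, 10.0],
]

column_means = [mean(col) for col in zip(*matrix)]  # one mean per year
row_means = [mean(row) for row in matrix]           # one mean per rank

mom_from_columns = mean(column_means)  # mean of annual column means (MoM)
mom_from_rows = mean(row_means)        # same quantity via the rows

print(column_means, round(mom_from_columns, 2), round(mom_from_rows, 2))
```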
This approach further enables market share strategists to see where they stand in relation to competitors as the calculation reveals the average index value that over time is required to be in a second or third position within an industry or marketing segment.Many companies, for example, General Electric, long held to the notion that if their division (in this case, aviation, health, or power) couldn't sustainably be at least among the top three competitors, it ought to be divested.
Separate regressions can of course also be run for each year, and the coefficients can then be plotted over time. There are many ways to estimate the related slope coefficients for each year's indexed rankings, the most obvious being ordinary least squares (OLS) regressions (in which the index determines the ranking position). The OLS equation to be estimated, with constant c, slope coefficient α, and error term ε, is:
$$\text{Rank} = c + \alpha \cdot \text{Index} + \varepsilon$$
For example, as shown in Table 2, for films in 2018 the coefficient α was -0.19 (with constant 18.3), and for films in 2017 the coefficient α was -0.25 (constant 23.1).9 A plausible alternative estimation, modified from Clauset et al. (2009), is also shown in Table 2, but for current purposes it does not lead to inferences that are much different from those derived from linear estimates.
Perfectionists might, of course, argue that the top rank, which always has an index value of 100.0, ought to be excluded from the OLS regressions. But doing so only slightly changes the results and the interpretations not at all. Fig. 4 expands the movie data set back to 1980 and thus reveals the underlying long-term trend. By and large, and unsurprisingly, the trend for such relatively ephemeral pop culture market presence items is sideways.
9 In all such regressions, including for those data sets later mentioned, the coefficient p-values were 0.00 or just slightly above 0.00.
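A minimal sketch of a single year's regression, assuming the linear form described in the text (rank position regressed on the index value) and using hypothetical index values rather than the Table 2 data. It also refits with the top rank excluded, to show how that comparison might be made. The helper name `ols_fit` is an illustrative assumption.

```python
from statistics import mean


def ols_fit(xs, ys):
    """Ordinary least squares fit of ys on xs; returns (slope, constant)."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical indexed values for ranks 1 through 10 in one year.
index_values = [100.0, 82.0, 71.0, 60.0, 52.0, 45.0, 39.0, 33.0, 28.0, 24.0]
ranks = list(range(1, len(index_values) + 1))

# Full sample: rank = c + alpha * index + error.
alpha, c = ols_fit(index_values, ranks)

# Same fit with the top rank (always indexed at 100.0) left out.
alpha_no_top, c_no_top = ols_fit(index_values[1:], ranks[1:])

print(round(alpha, 3), round(c, 2), round(alpha_no_top, 3), round(c_no_top, 2))
```

Repeating the fit for each year's column and plotting the resulting slopes over time would give a series of the kind the text describes.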

Concentration Metrics
If the slope trends steeper over time (i.e., the coefficients become increasingly negative), it is an indication that a sector or product is becoming more concentrated and that the top ranks are becoming increasingly dominant while the lower-ranking items have a diminishing presence. As such, this presents another way to measure product or industry domination and concentration trends, and it develops a measure that can be applied alongside the classic Herfindahl-Hirschman index (HHI) that is widely used in antitrust corporate merger analysis.10 The HHI is calculated by squaring the market share of each firm competing in the market and then summing the resulting numbers so that:
$$HHI = \sum_{i=1}^{N} s_i^2$$
where $s_i$ represents the market share of firm i and N is the number of firms.
For example, four firms with shares of 30, 30, 20, and 20 percent would generate an HHI of 2,600. An HHI below 1,500 is considered to indicate a highly competitive market, between 1,500 and 2,500 indicates a moderate degree of concentration, and anything above 2,500 suggests high concentration.

HHI Score          Classification
<1,500             Competitive
1,500 to 2,500     Moderate concentration
>2,500             Concentrated
10 For instance, see Bhargava et al. (2017). See also Nocke and Whinston (2022) on horizontal merger thresholds. Their theoretical and empirical "results suggest that screens should likely focus much more on the merger-induced change in the HHI than on its post-merger level."
The HHI approach is well-established and might also be calculated year-by-year to provide a more dynamic picture of how concentration has changed over time. But the alternative power-law slope changes estimated over time provide a similar but more flexible and direct way to visualize changes in market dominance for any number of products, services, and sectors for which data are available. As compared to the HHI, the power-law methodology is much more adaptable to fast-changing ranks in product and service sector categories. Such playing with power law curves is thus somewhat like moving from a snapshot to a film strip. The methods are, however, not mutually exclusive and would in many instances likely be used in conjunction with each other.
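The HHI arithmetic and the thresholds given above can be sketched as follows, using the four-firm example from the text. The function names `hhi` and `classify_hhi` are illustrative, and the handling of scores exactly at 1,500 or 2,500 is an assumption, since the text does not say which band the boundaries fall into.

```python
def hhi(shares_percent):
    """Sum of squared market shares, with shares expressed in percent."""
    return sum(s ** 2 for s in shares_percent)


def classify_hhi(score):
    """Map an HHI score to the bands used in the text.
    Boundary values are assigned to the moderate band by assumption."""
    if score < 1500:
        return "Competitive"
    if score <= 2500:
        return "Moderate concentration"
    return "Concentrated"


example = hhi([30, 30, 20, 20])  # 900 + 900 + 400 + 400
print(example, classify_hhi(example))
```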

New Metrics
The power-law approach to concentration can be further refined by calculating, as has already been shown, the MoMs and then (somewhat arbitrarily) postulating that a value 1 standard deviation above the MoM is an indication of high concentration. This then provides a more statistically definitive and directly applicable metric than does the relatively static HHI. For the 2017-2020 box office ranking sample shown in Table 2, the MoM + 1 standard deviation is 48.8. Yet more insight is provided by taking the arithmetic year-to-year differences of the index values in each rank-row. Only the top row, which is always 100.0, will have differences of zero; for the other rank-rows the differences can be either positive or negative. For instance, in this box office example of Table 2 there are 4 years and thus 3 differences for each row's (i.e., rank) index value. The average of these rank-value differences, 20 for each year, is -8.87.11 If the aggregate difference mean is relatively large, it suggests greater volatility and less rigidity of segment structure ranks over time.
Panel data analysis provides another approach to measuring changes on a unit level. In many panel data studies, things like household incomes are tracked over time; the equivalent here is annual rank position. To do this, the original box office data were transposed from columns to rows and then entered into an EViews panel data file (balanced and stacked). The resulting slope coefficient involving 20 rankings over 41 years (1980-2020) is -19.6.
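The two refinements above can be sketched on a small hypothetical matrix (invented numbers, five ranks by three years). The text does not state which standard deviation is added to the MoM, so this sketch assumes the sample standard deviation of the annual column means; other choices (for example, the standard deviation of all matrix entries) would give a different threshold.

```python
from statistics import mean, stdev

# Hypothetical indexed rank matrix: rows are ranks, columns are years.
matrix = [
    [100.0, 100.0, 100.0],
    [80.0, 70.0, 90.0],
    [60.0, 50.0, 40.0],
    [40.0, 30.0, 20.0],
    [20.0, 10.0, 10.0],
]

column_means = [mean(col) for col in zip(*matrix)]
mom = mean(column_means)

# Assumed definition: MoM plus one sample standard deviation of column means.
high_concentration_threshold = mom + stdev(column_means)

# Year-to-year differences within each rank-row (later year minus earlier).
row_diffs = [
    [row[j + 1] - row[j] for j in range(len(row) - 1)]
    for row in matrix
]
mean_diff = mean(d for diffs in row_diffs for d in diffs)

print(round(high_concentration_threshold, 2), row_diffs[0], mean_diff)
```

A large absolute mean difference would point to more movement in index values from year to year, in the sense described in the text.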

ADDITIONAL DATA SETS
The methodology further allows for statistical comparisons against other sectors. Although none of the data sets are as large as that for domestic box office, they are all of a size that is sufficient to yield interesting and useful information. Five of the largest readily available data sets that also happen to be a part of the North American entertainment industry cover Broadway show attendance, TV network (broadcast and cable) viewership, theme park attendance, popular music concert grosses, and video games. All have somewhat similar economic sensitivities and characteristic features.
But to gain broader perspective and contrast, the methodology was further applied in the same manner to several additional selected data sets from dissimilar sectors.The annual power law curves for these sectors ought to thus present a much different power law profile than in entertainment and media.
The chosen additional sectors include rankings of international passenger traffic at major airports, international visitor arrivals by country, temperature and precipitation in the U.S., and major-bank holding company assets. Given that airports are in fixed locations and involve extremely large long-term capital investments, the rankings for the top 20 are not likely to change rapidly and are thus unlike those in films, shows, TV viewership, and concert grosses, in which the investments are relatively much smaller and the perishability of rank position index values over time is much greater.12 Meteorology/climate data provide another interesting application. Average annual precipitation (rain, snow, sleet) and average annual temperatures in Fahrenheit were ranked for the years 2005-2020 for 49 states (ex-Hawaii, which is in the middle of the Pacific and has its own and much different weather conditions).
Although the Arctic and some other areas in the world are apparently warming, the analysis shows that for the 49 states average precipitation and temperatures since 2005 (Fig. 5a) have not trended notably higher and neither have the annual slope coefficients (Fig. 5b). Of the two, precipitation is clearly more variable from year to year (Table 3). For both precipitation and temperature, the geography obviously doesn't change, but atmospheric conditions and ocean currents (e.g., La Niña and El Niño) shift frequently and will importantly influence the annual meteorological outcome.

12 As of 2021, capital investments in films generally range from $50 million to $400 million if advertising and marketing are included. For shows and concerts the investment might be as high as $75 million. But with few rare exceptions (e.g., Phantom of the Opera) the bulk of the returns are generated within two or three years.
Stock market features are also of potentially great interest. Here there are myriad opportunities to discover anomalies and performance features that have to date gone unnoticed. Some comparisons might, for instance, include rankings of growth versus value stocks, first-hour trading gains or losses versus those in the last hour, different features of the S&P 500 versus the NASDAQ or the DJIA, and gains and losses in different classes of bonds, real estate, and commodities. This method would also allow portfolio managers to more easily visualize concentration features among the assets held.
One simple illustration is to annually index the twenty largest ranked percentage gains and losses of the S&P 500 (Fig. 6) since 1926. Gains have a power law coefficient of -0.39 and a MoM of 59.4, whereas losses have a coefficient of -0.30 and a MoM of 46.6. Losses as compared to gains are thus more extremely concentrated in the very top rankings, whereas gains are more diffused over the lower ranks.13
13 Crash features are discussed in Gabaix (2016) and in Vogel (2021).
Another way to visualize differences between sectors is to compute the standard deviation across all estimated coefficients. A comparison of such calculations that includes all the preceding eleven sectors appears in Table 3. The largest standard deviations, and therefore the most volatile sectors, are for precipitation, airport passengers, and temperatures, with park attendance and tourism arrivals the least volatile. This suggests that park admissions and tourism arrivals have over time been relatively more stable and predictable (at least pre-pandemic) than weather conditions and airport passenger traffic.
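The comparison described above can be sketched as a ranking of sectors by the standard deviation of their annual slope coefficients. The sector names and coefficient values below are hypothetical placeholders, not the Table 3 estimates, and the use of the sample standard deviation is an assumption.

```python
from statistics import stdev

# Hypothetical annual slope coefficients by sector, for illustration only.
coefficients = {
    "sector_a": [-0.20, -0.22, -0.19, -0.21],
    "sector_b": [-0.30, -0.18, -0.35, -0.15],
    "sector_c": [-0.25, -0.26, -0.25, -0.24],
}

# Sample standard deviation of each sector's annual coefficients.
volatility = {name: stdev(vals) for name, vals in coefficients.items()}

# Rank sectors from most to least volatile.
ranking = sorted(volatility, key=volatility.get, reverse=True)

for name in ranking:
    print(name, round(volatility[name], 4))
```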
Lastly, panel data methods can provide further economic insights. In this application, the rank positions can be seen as equivalent to households being tracked over time. The stacked balanced panel data structure allows for additional variables to be included in the estimation. For instance, North American theme park admissions appear to be correlated with real disposable income per capita.14 A summary of each sector's power-law characteristics appears in Table 4.
The ultimate payoff for all these calculations, however, is what can be called a characteristic value, the MoM, that each data set's matrix generates. Every sector or market segment will have a different characteristic value that changes slowly over time. The higher the characteristic value, the more concentration is evident. If enough sectors are included, a useful statistical profile is generated and can be illustrated in a scatter diagram as shown in Fig. 7.
A further extension for an entire economy would then be to use the MoMs for a selected and standardized set of sectors by calculating the mean of all such sector MoMs (for short, 3Ms). This would thereby provide a new metric by which to gauge the relative concentration positions of national economies (and conglomerate companies too).
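The 3Ms idea can be sketched in a few lines. The sector names and MoM values are hypothetical, and the equal weighting of sectors is an assumption; the text does not specify whether sectors would be weighted, for example by size.

```python
from statistics import mean

# Hypothetical sector MoMs for a standardized set of sectors.
sector_moms = {
    "sector_a": 38.0,
    "sector_b": 55.0,
    "sector_c": 47.0,
}

# Equal-weighted mean of the sector MoMs (the "3Ms" figure).
three_ms = mean(list(sector_moms.values()))

print(round(three_ms, 2))
```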

Fig. 7. Selected segment comparisons of slope coefficients versus MoMs
Here each industry or segment has a distinct numerical identifying feature (MoM), with dots in the lower right quadrant possessing more annual volatility (Table 3) than those in the upper left quadrant. [...] shorter time spans.15 Tourism destinations are also similar in that they are faddish, and their popularity will depend on a diverse set of factors that have in the past included political upheavals, volcanic eruptions, public health considerations, and relative price changes due to currency fluctuations and local monetary policies.
In banking, the top three bank holding companies in the U.S. remain dominant in total assets over long periods. There is not much annual change in rankings as their total assets mostly grow in tandem. In banking there is a great "stickiness" in asset retention, which differs notably from the short-term faddishness and flightiness that is typical of audience preferences in entertainment and media-related items. Also in banking, the average slope coefficient is relatively flat for the remaining seventeen rankings because there is a solid and thick middle tier of banks with approximately similar but much smaller asset totals. This substantially lowers the MoM. The banking sector in particular generates much data that can be ranked in this way.

SUMMARY
Power law curves are ubiquitous features of industry segments, sectors, and market shares. The relatively simple statistical comparisons that are derived by playing with such curves can provide useful economic insights that might otherwise have been overlooked. Among these are changes in sector or market share concentration metrics that are adjunctive to the standard HHI concentration measurement approaches.
The methods illustrated here also provide a relatively slow-to-change metric (MoM) that directly characterizes each sector's degree of fixed-rank positioning. Generally, but not always (e.g., bank assets), the higher the number, the less movement in ranking over time. It is thus not surprising that film, television viewing, and concert gross ranks are much more fluid and flexible than those tied to the very large long-term capital expenditure operations required to support airport passenger traffic.
This methodology can in future research be extended, refined, and more broadly applied in many different fields (e.g., health services, medical sciences, insurance, education, sports leagues and relative player performance rankings, airline marketing and scheduling decisions, crime concentrations, energy categories, portfolio concentration studies, and mergers and acquisitions). Potential applications are limitless.

Fig. 2. Professional sports team value rankings, 2017 and 2020, top seven by name, in $ billions based on Forbes team valuation data

Fig. 5. Annual weather trends in 49 U.S. states, 2005-2020; a) precipitation in inches and temperatures in Fahrenheit degrees; and b) slope coefficients

[Fig. 2 labels: series 2020 and 2017; teams named include Cowboys, Yankees, Man U, Barcelona, Real Madrid, Patriots, Knicks, Lakers, and Warriors; values in $ billions]
8 As an example, Gone With the Wind, released in 1939, is still the largest-grossing inflation-adjusted (in 2020 dollars) title, with a lifetime gross of $1.895 billion and an estimated 202.3 million tickets sold. The title in the second inflation-adjusted position is 1977's Star Wars Episode IV: A New Hope, which took in $1.669 billion from 178.1 million ticket sales. On this basis, meanwhile, a much more recent box office winner, The Avengers: Infinity War, released in 2018, generated $678.6 million on ticket sales of 72.4 million and ranked 36 on the all-time adjusted gross list.

Table 2. Estimation statistics for dbo, 2017-2020
+1 SD = 50.7
Vogel, H. L. (2022). Playing With Power-Law Curves: A New Way to Analyze Market Structures and Sectors. Archives of Business Research, 10(8), 158-174.

Table 3. Standard deviation ranking of annual regression coefficients for selected sector sample periods
Table 3) than those in the upper left quadrant. Passenger transit through the largest airports does not, for example, usually change much (in non-pandemic eras) from year to year, as global routes, landing slots, very large capital investments, and fixed geography are decisive factors.