Followers do not dictate the virality of news outlets on social media

Abstract
Initially conceived for entertainment, social media platforms have profoundly transformed the dissemination of information and consequently reshaped the dynamics of agenda-setting. In this scenario, understanding the factors that capture audience attention and drive viral content is crucial. Employing Gibrat’s Law, which posits that an entity’s growth rate is unrelated to its size, we examine the engagement growth dynamics of news outlets on social media. Our analysis includes the Facebook historical data of over a thousand news outlets, encompassing approximately 57 million posts in four European languages from 2008 to the end of 2022. We discover universal growth dynamics according to which news virality is independent of the traditional size of the outlet. Moreover, our analysis reveals a significant long-term impact of news source reliability on engagement growth, with engagement induced by unreliable sources decreasing over time. We conclude the article by presenting a statistical model replicating the observed growth dynamics.


Introduction
Originally designed for entertainment, social media platforms have evolved into significant channels for information dissemination [1][2][3][4][5], altering traditional agenda-setting dynamics [6][7][8][9]. In this competitive landscape marked by many information sources, we aim to uncover the determinants of audience attention and the factors contributing to content virality [10], that is, the propensity of content to achieve rapid diffusion and high engagement levels on social media platforms [11][12][13]. Indeed, social media often dictate which topics become prominent while others are overlooked [8,14]. As online users tend to favor information aligning with their existing beliefs, commonly ignoring opposing viewpoints [15][16][17], this behavior can create and reinforce online 'echo chambers' [18], digital clusters of homogeneous thought where narratives are collectively shaped and solidified [19][20][21]. The magnitude of the echo chamber phenomenon and its consequent effects on polarization may vary among social media platforms [22]. Furthermore, many platforms implement algorithms designed to prioritize user engagement that might alter information spreading [23][24][25], thereby exacerbating ideological divisions [26][27][28].
The rise of the attention economy is at the heart of digital discourse transformation [29][30][31][32]. In this economy, a broad spectrum of content creators, ranging from news outlets to individual influencers, vie for users' limited attention [33][34][35][36]. As in traditional markets, digital stakeholders chase user engagement, converting this captured attention into tangible revenues through advertising, service offerings, and subscription models [37,38]. With revenues closely linked to audience reach and engagement, understanding the growth mechanisms of digital content creators is crucial. Our research aims to unravel the dynamics of the digital ecosystem, focusing on the evolution of content consumption and audience reach.
We anchor our analysis in Gibrat's Law [39], originally formulated to explain traditional business growth, extending its application to the digital domain. The foundational premise of this law, positing that a firm's growth rate is independent of its initial size, has found relevance in various realms beyond business, like the growth patterns of city sizes [40][41][42]. While various studies have explored Gibrat's Law across different contexts, yielding mixed methodologies and results [43][44][45][46], its implications for digital domains remain unexamined. Focusing on the supply and demand of news in the attention economy of social media platforms, we aim to determine whether the principles of proportionate growth hold in social media news dissemination. We systematically study the growth patterns of news outlets on Facebook, comparing their growth to audience sizes over different periods.
For a deeper understanding of news engagement on social media, we obtain a list of news outlets from NewsGuard [47], an entity recognized for tackling misinformation by assessing the credibility and reliability of news sources. After selecting all the news outlets with a Facebook account listed on NewsGuard, we use their Facebook URLs to gather their data from CrowdTangle [48], a Facebook-owned tool that monitors interactions on public content from Facebook pages, groups, and verified profiles. This effort provides a comprehensive dataset: the Facebook historical data, from 2008 to the end of 2022, of over 1000 news outlets across four languages (English, French, German, and Italian). Thanks to the post-level granularity of our dataset, we can measure the growth of pages' metrics on various timescales by aggregating data according to a broader or narrower time window (daily, weekly, monthly, and quarterly), providing robust insights into online news outlets' growth dynamics.
The paper is structured as follows. First, we define our analysis framework, investigate the growth regime, and assess its dynamics. Next, we introduce a stochastic model to replicate the observed growth patterns, illustrating the consistency between its results and empirical evidence. Finally, we compare the growth of news outlets based on their information quality. We find that the ability to create viral content and capture widespread attention is untied to the size of the information provider. Engagement follows a universal growth pattern in short-term intervals. Contrary to common belief, we observe that Followers count is not a reliable measure of a page's peaks of influence; the impact on engagement becomes apparent only over extended periods. Additionally, we discover that the unreliability of a news source negatively affects engagement growth in the long term.

Results
We start by defining our framework of analysis. The simplest growth model, proposed by Gibrat [39], states that a given company's proportional growth rate is independent of its absolute initial size. His assumptions can be formalized by the following random multiplicative process for the size S:
$$S_{t+\Delta t} = S_t\,(1 + \epsilon_t) \tag{1}$$
where t ≥ 0 is time, ∆t > 0, $S_{t+\Delta t}$ and $S_t$ are the sizes at time t + ∆t and t, respectively, and $\epsilon_t$ is a random variable coming from an i.i.d. stochastic process uncorrelated to $S_s$ (0 ≤ s ≤ t) having mean µ and standard deviation σ. Due to the generic formulation of the original model, we adapt its interpretation to achieve a meaningful application in the context of social media. In terms of information spreading, virality refers to the rapid and widespread dissemination of information or content. In terms of the extent and impact of the diffusion, virality refers to content engagement exceeding typical expectations, reaching a large number of users and interactions. In our analysis, we focus on the latter facet, characterizing the growth of content performance with respect to the size of its source. The analysis is performed on two key metrics: Followers and Engagement. We first define how to assess page size and performance on social media, and our timescales of analysis. Then we evaluate the growth regime of both metrics concerning size for each timescale.
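A minimal sketch of such a random multiplicative process, assuming the update S_{t+∆t} = S_t (1 + ϵ_t), may help fix ideas. The normal law chosen for ϵ_t, the parameter values, and the function names `simulate_gibrat` and `growth_rates` are assumptions made only for this illustration; the model itself only requires i.i.d. shocks with mean µ and standard deviation σ.

```python
import random
import statistics


def simulate_gibrat(s0, steps, mu=0.0, sigma=0.05, seed=0):
    """Iterate S_{t+dt} = S_t * (1 + eps_t) with i.i.d. shocks eps_t.

    The normal distribution for eps_t is an illustrative assumption.
    """
    rng = random.Random(seed)
    sizes = [float(s0)]
    for _ in range(steps):
        eps = rng.gauss(mu, sigma)
        sizes.append(sizes[-1] * (1.0 + eps))
    return sizes


def growth_rates(sizes):
    """Proportional growth rates S_{t+dt} / S_t - 1 along a trajectory."""
    return [later / earlier - 1.0 for earlier, later in zip(sizes, sizes[1:])]


# Two hypothetical entities that differ only in their starting size.
small = simulate_gibrat(1_000, steps=500, seed=1)
large = simulate_gibrat(1_000_000, steps=500, seed=1)

# Fed with the same shocks, both show the same proportional growth,
# although their absolute sizes differ by three orders of magnitude.
rates_small = growth_rates(small)
rates_large = growth_rates(large)
print(statistics.mean(rates_small), statistics.mean(rates_large))
```

Under proportionate growth, the distribution of these rates carries no information about the starting size, which is the property tested empirically in the following sections.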

Metrics and Methodology for Social Media Page Analysis
In evaluating whether the size of a page affects its growth, we first need to establish how to measure the size and its performance. In the study of social media platforms, notably Facebook, we primarily rely on two metrics: 1) Page Followers, the number of users subscribed to a given page at the time of posting, representing a metric of reach, and 2) Engagement, encompassing the total number of users' interactions with the page's posts (that is, the sum of Likes, Comments, and Shares). The size of a page is typically inferred from its Followers count, a standard measure on such platforms.
The alternative would be using Engagement, but such a choice would introduce undesired issues. Indeed, the engagement definition is inherently ambiguous: it can be a cumulative sum of interactions over the entire lifespan or a count over a specific duration, such as a week. With period-specific measures, pages' size could fluctuate too widely (spanning even across orders of magnitude), thus leading to interpretational challenges. Conversely, using a cumulative engagement count to quantify size may over-represent past performances. Consequently, we opt to use Followers as a more stable representation of page size. On the other hand, Engagement represents the page performance in terms of users' attention.
Our analysis leverages varied timescales to observe growth patterns. Specifically, we consider four time granularities: daily (D), weekly (W), monthly (M), and quarterly (Q). Therefore, for each page, we consider both metrics, Followers and Engagement, according to different time windows. To measure engagement, we consider aggregated data depending on the timescale of the analysis. Focusing on the total attention received by the news outlets, we interpret higher total engagement as higher user attention, regardless of the number of posts. Such interpretation relies on the fact that publishing more posts does not lead to more engagement if users are not interested in a topic. Likewise, getting engagement with several posts implies high attention, and using the mean value would underestimate the latter. We further validate this choice by showing that both measures yield equivalent results. For Followers, since they already are a cumulative value, we take only a representative data point in the time window, depending on the chosen timescale (see Materials and Methods for further details). Transitioning between these scales offers diverse and new perspectives on growth dynamics. In the case of Followers' growth, daily measurements are deemed unsuitable due to limited variability, whereas all four scales are relevant for Engagement analysis, since news outlets are very active accounts and usually have multiple posts per day.
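The aggregation of post-level interactions into time windows can be sketched as follows. The toy records, the use of ISO calendar weeks, and the function names are assumptions made for this illustration and are not taken from the original pipeline; windows without posts are simply skipped here.

```python
from collections import defaultdict
from datetime import date

# Hypothetical post-level records: (posting date, total interactions).
posts = [
    (date(2022, 1, 3), 120),
    (date(2022, 1, 5), 80),
    (date(2022, 1, 11), 300),
    (date(2022, 1, 14), 100),
    (date(2022, 1, 19), 200),
]


def window_key(day, scale):
    """Map a date to its daily, weekly, monthly, or quarterly window."""
    if scale == "D":
        return (day.year, day.month, day.day)
    if scale == "W":
        iso_year, iso_week, _ = day.isocalendar()
        return (iso_year, iso_week)
    if scale == "M":
        return (day.year, day.month)
    if scale == "Q":
        return (day.year, (day.month - 1) // 3 + 1)
    raise ValueError(f"unknown timescale: {scale}")


def aggregate_engagement(records, scale):
    """Sum the interactions of all posts falling in the same window."""
    totals = defaultdict(int)
    for day, interactions in records:
        totals[window_key(day, scale)] += interactions
    return dict(sorted(totals.items()))


def growth_rates(series):
    """Ratio between consecutive observed windows, E_{t+dt} / E_t."""
    values = list(series.values())
    return [later / earlier for earlier, later in zip(values, values[1:])]


weekly = aggregate_engagement(posts, "W")
print(weekly)
print(growth_rates(weekly))
```

Summing rather than averaging follows the choice described above of treating total engagement as total attention; replacing the sum with a mean per window gives the alternative measure used as a robustness check.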

Assessing the growth regime
Based on the definition of Followers and Engagement, and according to (1), we refer to Followers and Engagement growth, respectively, as
$$F_{t+\Delta t} = F_t\,\bigl(1 + \epsilon^{(F)}_t\bigr) \tag{2}$$
$$E_{t+\Delta t} = E_t\,\bigl(1 + \epsilon^{(E)}_t\bigr) \tag{3}$$
where the superscripts (F) and (E) point to an intuitive notation for the process ϵ with mean $\mu_F$ and $\mu_E$ and standard deviation $\sigma_F$ and $\sigma_E$ for Followers and Engagement, respectively. Time is measured on the different timescales: D, W, M, and Q. In (2) and (3), $F_t$ is the number of Followers at time t while $E_t$ represents the number of interactions generated at time t. In the same way, $(1 + \epsilon^{(F)}_t)$ and $(1 + \epsilon^{(E)}_t)$ are the growth rates of $F_t$ and $E_t$, respectively. As Gibrat's Law was originally intended to explain the emergence of a log-normal distribution of sizes, we first verify that Followers and Engagement distributions comply with this assumption. Since Followers' records on CrowdTangle start from 01/01/2018 and stop on 31/12/2022, whenever we consider the metrics of Followers and Engagement jointly we restrict our analysis to this five-year period. In the analysis in which we do not account for Followers' value, we consider the entire 15-year timespan, ranging from 01/01/2008 to 31/12/2022. See Processing Methods Section and Fig. S1 in SI Appendix for further details. Fig. S2 in SI Appendix shows distributions of both metrics at the start and end of the considered period.
To assess whether growth rate distributions vary based on page size, we define four classes of pages based on their Followers, so as to have comparable populations between them over the entire period. The considered four classes of Followers are: 10K-50K, 50K-150K, 150K-500K, and 500K-5M. The bin boundaries were defined by jointly considering two aspects: a comparable number of pages between classes and actual values of Followers for which it was reasonable to account for a page as small, medium, large, or very large. As a robustness check, we report a clustering in Fig. S3 in SI Appendix, which shows that our classes and the clusters predominantly overlap. In evaluating growth regimes, we posit that the absence of size effects should, as an initial assumption, result in comparable growth rate distributions across different classes. To compare growth rate distributions among different classes of size, we apply, to each pair of bins, a Mann-Whitney U test on both metrics and for different timescales. Results are reported in Fig. 1, while Fig. S4 in SI Appendix reports p-values of the two-tailed tests.
We first inspect the case of Followers, reported in Fig. 1A. Statistical tests across observed timescales consistently demonstrate that smaller pages experience greater Followers' growth than their larger counterparts. This evidence counters the notion of proportionate effect growth as described by Gibrat. In contrast, the growth dynamics of Engagement provide intriguing insights. Specifically, in Fig. 1B, the growth regime is influenced by the duration of the observed timescale. In short-term observations (daily and weekly scales), engagement variation is consistent irrespective of page size. Thus, from a micro-level perspective, engagement adheres to a universal growth regime, independently of page size. However, as we transition to a monthly scale, a deviation in the regime emerges, with smaller-sized classes outperforming their larger counterparts. This deviation becomes definite on a quarterly timescale, underscoring the influence of size on long-term engagement growth.
Fig. 1 (A-C) p-values of Mann-Whitney U tests between classes of size for Followers and Engagement growth rate distributions. Panel titles indicate the metric being tested and the metric according to which we determine the size. Row and column headers represent the class size. Bold numbers represent p-values for which we reject the hypothesis that the growth distributions do not differ, with the alternative hypothesis that the smaller class grows at a higher rate. For readability, 0 represents p-values smaller than 0.0001.
For a comprehensive perspective, we recalibrated our Engagement analysis, reported in Fig. 1C, categorizing size based on the Engagement metric. Bins are delineated by the quartiles of the Engagement distribution across all pages within a specific timescale, after trimming between the 5th and the 95th percentiles. Our analysis indicates that, in this case, the system predominantly diverges from Gibrat's law of proportionate effect. Exceptions are noted for middle-sized pages (those within the 2nd and 3rd quartiles) on a quarterly timescale. Thus, in short-term observations, Engagement consistently depends on its recent performance, irrespective of page size, while the influence of Followers becomes evident with the increase of the observed timescale. We note that our results show correspondences with evidence from prior studies in different domains, such as the distribution of growth rates displaying a 'universal' form that does not depend on the size [49], and the system experiencing a growth regime transition as the timescale widens [50]. These findings bear significant implications. Notably, in the short term, size does not dictate the probability of engagement growth. Extrapolating this to individual posts suggests an egalitarian landscape where every news item, irrespective of its source or the number of its Followers, has an equal propensity to go viral, that is, to suddenly gain disproportionate engagement. Consequently, the mere count of Followers proves inadequate in gauging the page's potential influence.
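The pairwise comparison between size classes can be illustrated with a small, self-contained version of the one-sided Mann-Whitney U test. This is only a sketch: it uses the normal approximation without tie or continuity corrections, the growth-rate values are invented, and the function names are hypothetical. An actual analysis would more likely rely on a statistical library routine such as `scipy.stats.mannwhitneyu`.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_greater(x, y):
    """One-sided test of the alternative 'x tends to be larger than y'.

    Returns (U statistic of x, p-value from the normal approximation).
    No tie or continuity correction is applied, for brevity.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mean_u) / sd_u
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z >= z)
    return u1, p_value


# Invented growth rates for a smaller and a larger class of pages.
small_class = [1.10, 1.25, 1.08, 1.30, 1.15, 1.22, 1.18, 1.27]
large_class = [1.01, 0.99, 1.03, 1.02, 0.98, 1.05, 1.00, 1.04]

u_stat, p_value = mann_whitney_greater(small_class, large_class)
print(u_stat, p_value)
```

A small p-value rejects the hypothesis that the two growth distributions do not differ, in favor of the smaller class growing at a higher rate, which is the reading applied to the bold entries of Fig. 1.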

Analyzing Engagement and Followers Dynamics
Empirical evidence suggests that the distribution of the logarithm of growth rates often takes an exponential form. Consistently with prior studies [51,52], our analysis reveals that the logarithm of Engagement growth rates adheres to a particular exponential distribution known as the Laplace distribution. In contrast, the growth rates for Followers display an asymmetry, exhibiting a right-skewed distribution. Our analysis suggests a fit with a heavy-tailed distribution, specifically the Burr distribution [53]. Since the Burr is exclusively defined for positive values, we here employ the absolute growth rates, rather than their logarithms. A visual comparison between the observed and fitted distributions is reported in Fig. 2 (see Materials and Methods section for details of the fitting procedure). The matching of the empirical distributions of Engagement growth with the Laplace has significant implications. As pointed out by previous works [54,55], growth phenomena could display a non-trivial relation between the positive and negative sides of their rate distribution. The detailed balance property, or time-reversal symmetry, states that the empirical probability of changing size from one value to another is statistically the same as that for its reverse process.
Statistical tests provided evidence of how the Engagement's short-term fluctuations adhere to a universal distribution, independently of the page size, with $\mu_E \to 0$ when passing from timescale Q to timescale D. Fig. S7 in SI Appendix shows how the parameters vary with the timescale for the considered size classes. Furthermore, the symmetry property of the Laplace distribution, with µ ≈ 0, directly implies the validity of detailed balance. We can draw two significant implications from this outcome. From a viewpoint of interpretation, assessing these statistical properties of short-term engagement provides a deeper understanding of news consumption dynamics, which can influence how news providers act to handle users' fluctuating attention. From a technical standpoint, ascertaining the universality of this dynamic and the probability distribution that describes it enables us to exploit it as a proxy for defining and detecting virality, namely gaining disproportionate engagement.

Modelling growth
Our empirical findings show that the impact of size on growth becomes evident only if observed through larger timescales and that the growth pattern is universal at the micro-level. Knowing the distributions that define the evolution of our metrics allows us to evaluate the variation of their parameters according to size and timescale. For each timescale, we can model parameters of both growth rate distributions based on Followers and Engagement values. The regression coefficients are reported in Tab. 1.
We can thereby simulate growth on the chosen timescale, given two starting values of Followers and Engagement, $F_0$ and $E_0$, by iteratively sampling growth rate values from the distributions modeled using the parameters specified in Case 1 and Case 2.

Case 1: Engagement growth
We consider the dynamics described in (3) with $\epsilon^{(E)}$ following the Laplace distribution in (8) (see Materials and Methods), with parameters modeled by regression on Followers and Engagement values (coefficients in Tab. 1).

Case 2: Followers growth
We consider (2), with $\epsilon^{(F)}$ being the logarithm of the growth rate behaving according to the Burr distribution in (10) (see Materials and Methods), with parameters modeled in the same way (coefficients in Tab. 1).
Results of simulations are shown in Fig. 3, representing the evolution of both metrics for different starting sizes (Followers) on three timescales used for the analysis (W, M, and Q). We selected three starting sizes representing pages with low, medium, and high number of Followers, i.e., 25K, 250K, and 1M, respectively. As results show, by observing the system on a weekly timescale, the engagement shows a basically steady evolution for all three sizes. As we extend the observed timescale, by passing from weekly to quarterly, the engagement growth of small pages begins to exhibit convex behavior, while the growth curve of big pages shifts toward concavity, providing evidence of how Followers impact the engagement evolution only over long-term intervals. On the other hand, Followers of smaller pages always grow faster than bigger ones in each timescale. Our simulations consistently match empirical evidence. Despite the model's engagement growth probability being based on Followers, the universal characteristic of the process at the temporal micro-scale level is evident. These results highlight the limitation of using Followers as the sole metric to gauge overall page influence. Short-term outcomes seem to derive from a uniform stochastic process, possibly elucidating the influence of algorithms on user news consumption behaviors. While this suggests an environment where all content providers might be on an equal footing regarding visibility, it also necessitates continuous monitoring to mitigate the spread of harmful content, such as misinformation. Basing influence assessments solely on the number of Followers can lead to oversight. The potential presence of 'one-time' or 'hidden influencers', entities with a disproportionate influence relative to their Follower count, needs attention. The absence of a clear engagement effect on Follower growth, reflected in the lack of significance of β2 in the variation of the Burr parameters, further emphasizes this, indicating that heightened interactions do not necessarily translate to a corresponding increase in Followers or sustained reach.
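The iterative sampling behind such simulations can be sketched for the Engagement case. This is an illustration under stated assumptions, not the model as fitted: the Laplace draw is treated as a logarithmic growth rate, and µ and b are kept constant with arbitrary values, whereas in the model they vary with Followers and Engagement through the regressions of Tab. 1. The function names are hypothetical.

```python
import math
import random


def sample_laplace(rng, mu, b):
    """Draw from Laplace(mu, b) as a scaled difference of two Exp(1) draws."""
    return mu + b * (rng.expovariate(1.0) - rng.expovariate(1.0))


def simulate_engagement(e0, steps, mu, b, seed=0):
    """Iterate E_{t+dt} = E_t * exp(r_t), with r_t ~ Laplace(mu, b).

    Treating r_t as a log growth rate and keeping mu and b fixed are
    simplifying assumptions of this sketch.
    """
    rng = random.Random(seed)
    path = [float(e0)]
    for _ in range(steps):
        path.append(path[-1] * math.exp(sample_laplace(rng, mu, b)))
    return path


# One hypothetical year of weekly engagement for a single page.
path = simulate_engagement(e0=10_000, steps=52, mu=0.0, b=0.3, seed=42)
print(len(path), min(path) > 0)
```

With µ = 0 the sampled log growth rates are symmetric around zero, which mirrors the detailed balance property discussed for short timescales; making µ and b depend on the current size is what introduces size effects at longer timescales.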

Growth and Information Quality
The effects of external factors on engagement growth manifest in the long term. Potential explanations for differences in page growth could be plenty, though one of the most relevant for society is the propensity of news outlets to produce unverified news and misinformation. For this reason, we conclude our analysis by comparing two subsamples of pages representing reliable and questionable ones. Since we do not account for Followers value, this analysis encompasses the entire pages' lifespan. The classification is performed based on reliability scores provided by NewsGuard. Since our dataset comprises 898 reliable sources and 131 non-reliable ones, we performed a sampling of 131 reliable sources to obtain two comparable samples. See Materials and Methods and Fig. S8 in SI Appendix for further details about the reliability ratings and the sampling procedure. Fig. 4 shows growth rate distributions of engagement and their evolution across the various timescales. Tab. S1 in SI Appendix shows the p-values of Mann-Whitney U tests between the two sub-samples, as in our previous analyses. Both graphic representations and tests display how the trustworthiness of the news source plays a crucial role, as the engagement of unreliable pages progressively decreases as the timescale widens. Again, the short-term fluctuations follow a universal dynamic, and reliability does not determine growth differences either. The long-term divergence, which may result from both users' behavior and platform moderation policies, along with the inherent randomness of short-term fluctuations, highlights the importance of continuous efforts to monitor the consumption of sensitive content, such as misinformation.

Conclusions
In historical media landscapes, prominent news outlets predominantly influenced agenda-setting, their reach determining the flow and focus of public discourse. However, the emergence of social media platforms, designed more for entertainment than information spreading, has reshaped this dynamic. While many assume larger outlets and their inherent reach would dominate social media discourse, our research challenges this perspective. Analyzing engagement metrics across diverse news outlets on Facebook, we find that news virality, namely a disproportionate growth of engagement, is not strictly tied to the traditional size of the outlet. Instead, a myriad of factors may drive online discourse: rapid user engagement [56], the reinforcing nature of echo chambers [22], the amplifying power of influencers [12,57], the emotional resonance of content [18], and even artificial amplification via bots [58]. This complex web of drivers, some of which exhibit random behaviors, defies conventional models of media influence. Indeed, understanding the dynamics of the attention economy is pivotal for charting the trajectory of content creators on platforms like Facebook.
In this work, we analyze a massive dataset composed of 57 million posts comprising the entire Facebook history, spanning 15 years, of over 1000 news outlets. In particular, this study examined these dynamics in depth, evaluating the applicability of Gibrat's Law, a principle traditionally applied to business growth, in social media content creation. Empirical results provided a nuanced understanding of growth patterns. We observe that the likelihood of generating viral content and capturing widespread attention is independent of the information provider's size. Indeed, engagement adheres to a universal growth pattern in short-term intervals. This pattern shifts as the analysis extends to longer timescales like monthly and quarterly intervals, where size effects begin to manifest. We validated this dynamic by comparing news outlets' growth based on their information quality, providing evidence on how, though the unreliability of the news source negatively impacts engagement growth, its effect only manifests in the long term. Another significant observation challenges conventional wisdom: Followers count is not a sufficient indicator of a page's potential influence, and its actual impact only emerges over extended periods. Our examination of growth dynamics further elucidated these insights. After detecting their probability distributions, we evaluated their behavior according to size and timescale. We developed a stochastic model validating our empirical findings, emphasizing that Followers do not always depict actual influence or engagement potential in the short term.
This brings broader implications in the context of agenda-setting dynamics in the social media era. Our study shows that, contrary to traditional media, influence is not strictly tied to size or following in the digital realm. This stochastic nature of short-term engagement suggests an environment where all content, irrespective of its source, stands a roughly equal chance of capturing attention, possibly elucidating the influence of algorithms on users' news consumption. This democratization of potential attention influences how narratives and agendas are set, with even smaller entities having the power to shape discourse. However, it also emphasizes the importance of vigilant monitoring mechanisms, given the risk of rapid misinformation or harmful content spread. Our research highlights the intricate dynamics of growth in the digital attention economy, revealing how traditional metrics may not align with real-world influence. It also offers key insights into how the modern agenda-setting dynamics are being reshaped in the era of social media. These findings are valuable for content creators, platform designers, and policymakers as they navigate the complexities of the digital age.

Selecting Followers' value
In determining the Followers' value for each time window, we selected the closest observation to a given time point of the window, which we referred to as a 'representative point in time' since it varies depending on the timescale. For the weekly scale, we selected the value on the earliest observed date of the week. For the monthly scale, we selected the value of the closest observed date to the central point of the month (the 15th day). For the quarterly scale, we selected the value of the farthest observed date.
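The selection rule can be written as a short function over the (date, Followers) observations of a single window. The data and the function name are hypothetical, and reading the quarterly rule as the latest observed date in the quarter is an assumption of this sketch.

```python
from datetime import date


def representative_followers(observations, scale):
    """Pick one Followers value from the (date, followers) pairs of a window.

    W: earliest observed date; M: observed date closest to the 15th;
    Q: latest observed date (an assumed reading of the quarterly rule).
    """
    if not observations:
        raise ValueError("window has no observations")
    if scale == "W":
        chosen = min(observations, key=lambda obs: obs[0])
    elif scale == "M":
        chosen = min(observations, key=lambda obs: abs(obs[0].day - 15))
    elif scale == "Q":
        chosen = max(observations, key=lambda obs: obs[0])
    else:
        raise ValueError(f"unsupported timescale: {scale}")
    return chosen[1]


# Hypothetical observations of one page within March 2021.
march = [
    (date(2021, 3, 2), 1000),
    (date(2021, 3, 13), 1040),
    (date(2021, 3, 28), 1100),
]
print(representative_followers(march, "M"))
```

Because Followers is already a cumulative quantity, a single observation per window is enough; the daily scale is left out, as it is not used for Followers' growth.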

Labeling of Media Sources
The reliability labeling of news outlets is based on the trust ratings provided by NewsGuard [47]. Each site is rated using nine basic, apolitical criteria of journalistic practice, related to credibility and transparency. Based on the nine criteria, each site gets a trust score of 0-100 points. NewsGuard labels the source as Trustable if the resulting score equals or exceeds 60. The total number of news outlets for which we have a trust rating is 1029.

Parameters Estimation
Here we provide details about the fitting procedure of the distributions reported in Fig. 2 and used in sections Analyzing Engagement and Followers Dynamics and Modelling growth.

Laplace Distribution
The probability density function of the Engagement growth rate is the Laplace distribution, expressed as:
$$f(x \mid \mu, b) = \frac{1}{2b}\exp\left(-\frac{|x-\mu|}{b}\right) \tag{8}$$
where x ∈ R and µ and b are parameters to be calibrated. In this respect, the parameters of the Laplace distribution can be derived analytically from the mean $\mu_X$ and standard deviation $\sigma_X$ of the empirical distribution X, since
$$\mu_X = \mu, \qquad \sigma_X = \sqrt{2}\,b \tag{9}$$
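Since a Laplace distribution with location µ and scale b has mean µ and variance 2b², its parameters follow directly from the sample mean and standard deviation. A minimal sketch of this calibration is given below; the sample values and function names are invented for the illustration.

```python
import math
import statistics


def fit_laplace(sample):
    """Moment estimates: mu = sample mean, b = sample std / sqrt(2)."""
    mu = statistics.fmean(sample)
    b = statistics.pstdev(sample) / math.sqrt(2.0)
    return mu, b


def laplace_pdf(x, mu, b):
    """Laplace density exp(-|x - mu| / b) / (2 b)."""
    return math.exp(-abs(x - mu) / b) / (2.0 * b)


# Invented logarithmic growth rates of Engagement.
log_growth = [-0.4, -0.1, 0.0, 0.05, 0.1, 0.35]
mu_hat, b_hat = fit_laplace(log_growth)
print(mu_hat, b_hat, laplace_pdf(mu_hat, mu_hat, b_hat))
```

The population standard deviation is used here; with samples as large as those in the analysis, the difference from the sample standard deviation is negligible.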

Burr Distribution
The probability density function for Followers' growth is described by the Burr distribution, whose density is:
$$f(x \mid c, k) = \frac{c\,k\,x^{c-1}}{\left(1 + x^{c}\right)^{k+1}} \tag{10}$$
where x > 0 and c and k are scalars to be calibrated. In particular, such parameters are evaluated by fitting the Burr cumulative distribution function to the empirical cumulative distribution function of the observed growth rates.
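Fitting a cumulative distribution function to its empirical counterpart can be sketched as a least-squares problem. The example below assumes the Burr Type XII form, with distribution function 1 - (1 + x^c)^(-k), and replaces a proper numerical optimizer with a coarse grid search; both choices, as well as the synthetic data and the function names, are assumptions of this illustration and not the procedure actually used.

```python
import random


def burr_cdf(x, c, k):
    """Burr Type XII distribution function, 1 - (1 + x**c)**(-k), x > 0."""
    return 1.0 - (1.0 + x ** c) ** (-k)


def burr_sample(rng, c, k):
    """Inverse-transform draw: x = ((1 - u)**(-1/k) - 1)**(1/c)."""
    u = rng.random()
    return ((1.0 - u) ** (-1.0 / k) - 1.0) ** (1.0 / c)


def fit_burr_cdf(sample, c_grid, k_grid):
    """Grid search minimizing the squared distance to the empirical CDF."""
    xs = sorted(sample)
    n = len(xs)
    ecdf = [(i + 0.5) / n for i in range(n)]
    best = None
    for c in c_grid:
        for k in k_grid:
            loss = sum((burr_cdf(x, c, k) - p) ** 2 for x, p in zip(xs, ecdf))
            if best is None or loss < best[0]:
                best = (loss, c, k)
    return best[1], best[2]


# Synthetic growth rates drawn from a Burr law with known parameters.
rng = random.Random(7)
data = [burr_sample(rng, 3.0, 2.0) for _ in range(2000)]
grid = [1.0 + 0.5 * i for i in range(9)]  # candidate values 1.0, 1.5, ..., 5.0
c_hat, k_hat = fit_burr_cdf(data, grid, grid)
print(c_hat, k_hat)
```

On synthetic data the recovered pair should fall close to the generating parameters; the resolution of the estimate is limited by the grid spacing, which a continuous optimizer would avoid.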

Regression of distribution parameters
To model parameters variation according to Followers and Engagement values, we first applied the fitting procedure described above to the growth distributions of the sub-samples obtained by binning based on Followers and Engagement, after trimming within the 5th and 95th percentiles of our observed distributions, in each timescale.
After obtaining the parameters of the Laplace and Burr distributions of each subsample, we performed the parameter regression as described by equations (4), (5), (6), and (7) of section Modelling growth.

Sampling of reliable news outlets
According to NewsGuard ratings, our dataset comprises 898 reliable sources and 131 non-reliable ones. We performed a sampling of 131 reliable sources to have two comparable samples. To obtain similar structural characteristics that are not being tested, namely Followers and page lifespan, we compute distances using the maximum number of Followers and the page's creation date as distance variables, since most pages' last observations coincide with the end of the analyzed period. The resulting sample is obtained by computing the Euclidean distances of all possible pairs of Questionable and Reliable pages in a two-dimensional space, using Followers and Lifespan as space variables. Then we selected the subset of 131 Reliable pages for which the sum of their Euclidean distances from the 131 Questionable pages was minimized.
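One way to read this selection is sketched below: each Reliable page receives the sum of its distances to all Questionable pages, and the pages with the smallest totals are kept. This reading, the standardization of the two variables before computing distances, the toy data, and the function names are all assumptions of the illustration.

```python
import math
import statistics


def standardize(points):
    """Rescale each coordinate to zero mean and unit standard deviation."""
    columns = list(zip(*points))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        tuple((v - m) / s for v, m, s in zip(point, means, stds))
        for point in points
    ]


def select_matched(reliable, questionable, n_select):
    """Indices of the n_select reliable pages with the smallest summed
    Euclidean distance to the questionable pages."""
    scaled = standardize(list(reliable) + list(questionable))
    rel = scaled[:len(reliable)]
    que = scaled[len(reliable):]
    totals = [sum(math.dist(r, q) for q in que) for r in rel]
    ranked = sorted(range(len(rel)), key=lambda i: totals[i])
    return sorted(ranked[:n_select])


# Hypothetical pages described by (maximum Followers, lifespan in days).
reliable_pages = [(50_000, 3000), (2_000_000, 4500), (60_000, 3200), (900_000, 1000)]
questionable_pages = [(55_000, 3100), (70_000, 2900)]
print(select_matched(reliable_pages, questionable_pages, n_select=2))
```

Without rescaling, Followers counts in the hundreds of thousands would dominate lifespans of a few thousand days, so the distance would effectively ignore the second variable.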

Author contributions statement
E.S. and M.C. designed the paper; E.S. performed data collection and analysis; M.C., R.C. and W.Q. supervised the project; all authors wrote the paper.

Data availability
The data collection and analysis processes are compliant with the terms and conditions imposed by CrowdTangle [48]. Therefore, the results described in this paper cannot be exploited to infer the identity of the accounts involved. CrowdTangle does not include paid ads unless those ads began as organic, non-paid posts that were subsequently "boosted" using Facebook's advertising tools. It also does not include activity on private accounts, or posts made visible only to specific groups of followers.

Supporting Information

Data Collection
We download our data from CrowdTangle, a Facebook-owned tool that monitors interactions on public content from Facebook pages, groups, and verified profiles. CrowdTangle is accessible to researchers upon request at this link: https://www.crowdtangle.com/request. We obtain the list of news outlets employed in the analysis via NewsGuard, which provides, for each outlet, more than 30 distinct categories of descriptive metadata, including a breakdown of their assessment, an indication of its political slant, descriptions of the topics (or types of misinformation) it covers, and more. After selecting all the news outlets with a Facebook account listed on NewsGuard, we use their Facebook URLs to gather their data on CrowdTangle. Using the tool 'Historical Data' provided by CrowdTangle, we download the entire history of each page from its creation date as a table containing information regarding each posted item in chronological order.

Processing Methods
With post-level granularity, the table's columns include several relevant fields, such as the post type (link, image, video), its text, and the number of reactions to the post. Among these columns, we select two: Total Interactions and Followers at Posting. The Total Interactions column contains the total number of reactions per post (that is, the sum of Likes, Comments, and Shares), which we aggregate (sum) depending on the time scale of the analysis. Total Interactions is thus our metric for Engagement. These data are available for the whole page history, that is, for the longest-running pages, from the beginning of 2008 to the end of 2022, when we downloaded the data. Since pages' creation dates are spread over time, our analysis of growth rates is independent of the creation date once we account for a sufficient life span and activity of the page. The cumulative time series of pages becoming active on Facebook is reported in panel A of Fig. 5. The number of pages included in the dataset is 1082, and 94% of them were created before 01/01/2018. The second variable we use for our analysis is Followers at Posting, which represents the number of users subscribed to a given page at the time of posting. This information, however, is only available for posts made since 01/01/2018: for earlier posts, CrowdTangle either did not collect it or does not currently share it with end users. For this reason, when we consider the Followers and Engagement metrics jointly, we restrict our analysis period to 01/01/2018 to 31/12/2022. In the analyses that do not rely on Followers, we consider the entire 15-year timespan. In panel B of Fig. 5, we report the evolution over time of the number of posts we consider in the analysis, which is around 57 million, and of the Total Interactions, which is around 21 billion. We note that 63% of posts and 56% of Total Interactions were produced from 1/1/2018 onward.

Fig. 6 Distributions of both possible size indicators, Followers and Cumulative Engagement, of the entire sample at the start and end of our analysis period. Both distributions are heavy-tailed and are displayed here on a logarithmic scale.

Fig. 9 We recalculated the Engagement using its mean value and repeated the tests reported in Fig. and 1C. (A-C) p-values of two-sided Mann-Whitney U tests between classes of size for Followers and Engagement growth rate distributions. Panel titles indicate the metric being tested and the metric according to which we determine the size. Row and column headers represent the class size. Bold numbers represent p-values for which we reject the hypothesis that the growth distributions do not differ, with the alternative hypothesis that the smaller class grows at a higher rate. For readability, 0 represents p-values smaller than 0.0001.
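As an illustration of the aggregation step described in Processing Methods, the following minimal Python sketch sums Total Interactions over daily, weekly, monthly, or quarterly periods. The function names, the `(date, interactions)` input layout, and the logarithmic growth rate between consecutive periods are assumptions made for this example; they do not describe the CrowdTangle export format or the actual analysis pipeline.

```python
from collections import defaultdict
from datetime import date
import math


def period_key(day: date, timescale: str) -> tuple:
    """Map a posting date to its period: D(aily), W(eekly), M(onthly), Q(uarterly)."""
    if timescale == "D":
        return (day.year, day.month, day.day)
    if timescale == "W":
        iso = day.isocalendar()
        return (iso[0], iso[1])  # ISO year and ISO week number
    if timescale == "M":
        return (day.year, day.month)
    if timescale == "Q":
        return (day.year, (day.month - 1) // 3 + 1)
    raise ValueError(f"unknown timescale: {timescale}")


def aggregate_engagement(posts, timescale):
    """Sum Total Interactions per period.

    posts: iterable of (posting_date, total_interactions) for one page
    (hypothetical layout, chosen for the example).
    """
    totals = defaultdict(int)
    for day, interactions in posts:
        totals[period_key(day, timescale)] += interactions
    return dict(sorted(totals.items()))


def log_growth_rates(series):
    """Log ratio between consecutive observed periods.

    Assumption: growth is measured as log(S_t+1 / S_t); periods without
    posts are simply absent, and non-positive totals are skipped.
    """
    values = list(series.values())
    return [math.log(b / a) for a, b in zip(values, values[1:]) if a > 0 and b > 0]
```

For example, `log_growth_rates(aggregate_engagement(posts, "M"))` would give the monthly Engagement growth rates of a single page under these assumptions.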

Fig. 12 Scatterplot of the Questionable and Reliable pages sub-sample (legend: Rating, Questionable or Reliable). According to NewsGuard ratings, our dataset comprises 898 reliable sources and 131 non-reliable ones. We sampled 131 reliable sources to obtain two comparable samples. The resulting sample is obtained by selecting the reliable pages for which the overall Euclidean distance from the non-reliable sample is minimized. We aim to match the structural characteristics that are not being tested, namely Followers and Lifespan (here shown in days). Therefore, we compute the distance using the number of Followers and the page's creation date as distance variables, since most pages' last observations coincide with the end of the analyzed period.
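The matching procedure in the caption above can be read in more than one way. The sketch below shows one possible reading: each reliable page is scored by the sum of its Euclidean distances to all non-reliable pages, and the k lowest-scoring pages are kept. The z-score standardization of the two variables, the tuple layout, and the function names are assumptions introduced for the example, not details given in the text.

```python
import math
from statistics import mean, pstdev


def standardize(values):
    """Z-score a list of numbers (assumption: both variables are put on a common scale)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]


def select_matched_sample(reliable, questionable, k):
    """Return the ids of the k reliable pages closest to the questionable sample.

    reliable, questionable: lists of (page_id, followers, lifespan_days)
    with unique page ids (hypothetical layout).
    """
    pages = reliable + questionable
    z_followers = standardize([p[1] for p in pages])
    z_lifespan = standardize([p[2] for p in pages])
    coords = {p[0]: (f, l) for p, f, l in zip(pages, z_followers, z_lifespan)}

    def total_distance(page_id):
        # Sum of Euclidean distances to every questionable page.
        x, y = coords[page_id]
        return sum(
            math.hypot(x - coords[q[0]][0], y - coords[q[0]][1])
            for q in questionable
        )

    ranked = sorted(reliable, key=lambda p: total_distance(p[0]))
    return [p[0] for p in ranked[:k]]
```

A nearest-neighbor pairing (one reliable page per non-reliable page) would be an equally plausible reading and could select a different sub-sample.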
As a robustness check, in Figs. S5 and S6 of the SI Appendix, we report two variants of the tests performed in Fig. 1B, showing that the results still hold when using the mean engagement value or changing the bin boundaries.

Fig. 3 Results of growth simulation with different starting sizes. Sub-plot headers indicate the starting Followers value and the related timescale. Solid lines represent the mean cumulative distribution function value of the iteration time; shades represent the corresponding standard error.

Fig. 4 (A) Comparison of Engagement growth rate distributions of Questionable and Reliable pages for different timescales. (B) Mean growth of Questionable and Reliable pages across increasing timescales.

Fig. 5 (A) Pages' creation across time. (B) Evolution of the number of posts and Total Interactions over time.

Fig. 7 (A-B) Partitions of the PAM clustering of news outlets, based on Followers and Cumulative Engagement values at the start (1/1/2018) and the end (31/12/2022) of our analysis period, respectively. Solid vertical lines represent the range limits of our selected classes of Followers. (C) Frequencies of the size classes across the analysis period.
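To make the clustering step more concrete, the following is a minimal k-medoids sketch in the spirit of PAM (Partitioning Around Medoids). It keeps only the swap phase, with a naive initialization instead of PAM's greedy BUILD step, and accepts the first improving swap it finds. The input layout (one tuple of numeric features per outlet, for example transformed Followers and Cumulative Engagement) and all names are assumptions made for the example.

```python
import math
from itertools import product


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k):
    """Simplified PAM: swap medoids with non-medoids while the cost decreases.

    points: list of equal-length numeric tuples (hypothetical layout).
    Returns (medoid indices, cluster label of each point).
    """
    # Assumption: start from the first k points; PAM proper uses a BUILD step.
    medoids = list(range(k))
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i, candidate in product(range(k), range(len(points))):
            if candidate in medoids:
                continue
            trial = medoids[:i] + [candidate] + medoids[i + 1:]
            cost = total_cost(points, trial)
            if cost < best - 1e-12:  # accept strictly improving swaps only
                medoids, best, improved = trial, cost, True
    labels = [
        min(range(k), key=lambda j: math.dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels
```

Because medoids are actual observations, each cluster is summarized by a real outlet rather than by an average, which is convenient when the size variables are heavy-tailed.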

Fig. 10 As the growth dynamic emerges clearly across the different timescales, changing the bin boundaries should not change the results. We repeated the tests reported in Fig. and 1C, dividing each original bin into two sub-classes with an equal number of observations by cutting the bin at its median Followers value. The independence of growth from size in the short term still holds, with none of the 28 tests showing differences in growth in either the Daily or the Weekly measurements. On the monthly scale, the three smaller classes grow faster than the three bigger ones, while on the quarterly scale, most tests show statistically significant p-values, except for the pairs of adjacent classes. (A-C) p-values of two-sided Mann-Whitney U tests between classes of size for Followers and Engagement growth rate distributions. Panel titles indicate the metric being tested and the metric according to which we determine the size. Row and column headers represent the class size. Bold numbers represent p-values for which we reject the hypothesis that the growth distributions do not differ, with the alternative hypothesis that the smaller class grows at a higher rate. For readability, 0 represents p-values smaller than 0.0001.
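The two operations mentioned in this caption, splitting a size class at its median Followers value and comparing growth rate distributions with a Mann-Whitney U test, can be sketched as follows. The test below uses the normal approximation with tie correction and the directional alternative that the first sample tends to be larger; a statistics library would normally be used instead. Function names and the `(followers, growth_rate)` layout are assumptions made for the example.

```python
import math
from collections import Counter
from statistics import median


def split_at_median(pages):
    """Split a size class into two sub-classes at its median Followers value.

    pages: list of (followers, growth_rate) pairs (hypothetical layout).
    Ties at the median go to the lower sub-class.
    """
    cut = median([f for f, _ in pages])
    lower = [p for p in pages if p[0] <= cut]
    upper = [p for p in pages if p[0] > cut]
    return lower, upper


def rank_with_ties(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = average
        i = j + 1
    return ranks


def mann_whitney_greater(x, y):
    """One-sided Mann-Whitney U test (alternative: x tends to exceed y).

    Returns (U statistic of x, p-value) using the normal approximation
    with tie correction and no continuity correction.
    """
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = rank_with_ties(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    return u1, 0.5 * math.erfc(z / math.sqrt(2))
```

Calling `mann_whitney_greater(smaller_class_growth, bigger_class_growth)` and comparing the p-value with a chosen significance level would correspond to one cell of the p-value matrices, under the assumptions stated above.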
Fig. 11 Comparison of observed and theoretical growth rate distributions for Engagement and Followers. Red lines denote theoretical densities obtained by fitting empirical ones, with labels D, W, M, and Q indicating Daily, Weekly, Monthly, and Quarterly timescales.

Table 1
Regression coefficients and p-values of Laplace and Burr distribution parameters estimation for different timescales.
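As a hedged illustration of what estimating Laplace parameters involves, the sketch below uses the standard maximum-likelihood estimators: the location is the sample median and the scale is the mean absolute deviation from it. This is a generic textbook estimator, not necessarily the fitting procedure behind Table 1, and it does not cover the Burr distribution, which usually requires numerical optimization. Function names are chosen for the example.

```python
import math
from statistics import median


def fit_laplace(sample):
    """Maximum-likelihood Laplace fit: (location, scale)."""
    loc = median(sample)
    scale = sum(abs(v - loc) for v in sample) / len(sample)
    return loc, scale


def laplace_pdf(x, loc, scale):
    """Laplace density with the given location and scale (scale > 0)."""
    return math.exp(-abs(x - loc) / scale) / (2 * scale)
```

Repeating the fit on the growth rates computed at each timescale would yield one `(location, scale)` pair per timescale, which could then be compared across Daily, Weekly, Monthly, and Quarterly aggregations.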