What is the best time to tweet a journal article? Quasi-randomized controlled trial

Introduction: Social media users are often advised to time their posts to increase readership and engagement. Our objective was to find out the best time of day to tweet a journal article. Methods: From January 2020 to October 2021, 112 articles from a medical journal were posted on Twitter three times each, once in each language: Portuguese, Spanish and English. Up to two articles were posted each week, with each of the week's tweets being posted at a different hour of the day: 06:00, 09:00, 12:00, 15:00, 18:00 or 21:00. Tweet impressions and URL clicks were the two outcomes of the Bayesian multivariate multilevel negative binomial regression models. Results: No pair of times of the day achieved a 95% posterior probability of including the best time to tweet a journal article, for either impressions or URL clicks. The expected outcomes, the ratio between standard deviations, and the explained variability (R²) all corroborated that the time of the day is of little consequence when tweeting journal articles. Conclusions: Contrary to popular advice and pre-algorithm research, journal staff need not be concerned with optimizing the time of the day when they disseminate their content on Twitter.


INTRODUCTION
Since their inception, scholarly journals have increased their focus on research articles, adopted external peer review (Baldwin, 2018), migrated to the Internet and become increasingly open access (Khanna, Ball, Alperin, & Willinsky, 2023; Piwowar et al., 2018, 2019), to name some of the major changes. These changes have a profound impact on how scholars find the articles they read. Searching online is by far the most common strategy, while following links from tables of contents is decreasing and following links from social media is increasing (Gardner & Inger, 2021). The relevance of social media can also be inferred from the higher level of readership among scholars who are active on Twitter and other social media (Tenopir, King, Christian, & Volentine, 2015). Recognizing the relevance of Twitter and other social media, scholarly journals started using such platforms to announce their articles and promote their brands (Haustein, 2019; Nishikawa-Pacher, 2023).
Disseminating journal articles on social media seems to increase their readership and perhaps even citation rates, but the size of the effect is an open question. Dissemination in social media increased visits and full-text downloads by a factor of three and four, respectively, in Allen, Stanton, Di Pietro, and Moseley (2013), and four and seven in Widmer et al. (2019). On the other hand, disseminating journal articles on social media increased total visits by only about 50% in Maggio, Leroux, and Artino (2019), with a dubious effect on full-text downloads. Furthermore, the effect on visits was negligible in Fox et al. (2015) and did not improve much after intensification, as described in Fox et al. (2016). There is also wide variability in the effect on citation rates, with Luc et al. (2021) finding an increase by a factor of four and Ladeiras-Lopes, Clarke, Vidal-Perez, Alexander, and Lüscher (2020) finding an increase of 43%.

One aspect of dissemination in social media is the time of day at which the content is posted. Intuitively, content should be posted just before followers are online, especially when they are prone to engage with the posted material (liking, forwarding, following links, etc.) and there is relatively less material being posted by other accounts they follow. This timing might be of extreme importance, since the average tweet has "18 minutes of fame", as measured by the time it takes to receive half the retweets it will ever receive (Bray, 2012, November 12). Studying general users of social media, Spasojevic, Li, Rao, and Bhattacharyya (2015) described how an aggregate index of engagement varied with the time of posting depending on the social media platform (Twitter versus Facebook), topic, and even city. In the same vein, Kumar, Ande, Kumar, and Singh (2018) studied engagement with posts on Facebook and found the daily profile to vary between topics. Unfortunately, neither study is particularly useful for scholarly journals, because of the variability between topics, and because neither of the reported topics was relevant to scholarly journals in general. There are also computer algorithms to optimize the time of posting on social media (Karimi, Tavakoli, Farajtabar, Song, & Gomez Rodriguez, 2016; Zarezade, Upadhyay, Rabiee, & Gomez-Rodriguez, 2017), but they depend on gathering large-scale data.
Any effort to time posts on social media should consider that dissemination is mediated by the platforms' algorithms. Twitter's algorithm, for instance, considers not only how recent the tweet is, but also the characteristics of the tweet itself, the author of the tweet, the reader, and the connection between them, as described in Koumchatzky and Andryeyev (2017, May 9). Because in 2016 Twitter (Huszár et al., 2022) and Instagram (Titcomb, 2016, March 16) started sorting their users' timelines with computer algorithms, any research based on earlier data may no longer apply. Therefore, we conducted a quasi-randomized controlled trial to find out what (if any) is the best time of the day for a scholarly journal to tweet about its articles.

Context
This study was carried out in the Revista Brasileira de Medicina de Família e Comunidade (electronic international standard serial number [ISSN] 2179-7994, linking ISSN 1809-5909), RBMFC for short, the scholarly journal of Brazil's national society on the specialty (Fontenelle & Sarti, 2019). It is the most productive journal in terms of publications by family and community physicians from Brazil, even though they publish in a wide array of journals (Fontenelle, de Oliveira, Rossi, Brandão, & Sarti, 2021). RBMFC is a diamond open access journal, with about 60 articles per year in continuous publication. The journal is currently indexed in LILACS (Latin American and Caribbean Health Sciences Literature), DOAJ (Directory of Open Access Journals) and COCI (OpenCitations Index of Crossref Open DOI-to-DOI Citations), among others. The country's Federal Agency for Support and Evaluation of Graduate Education (CAPES, from the name in Portuguese) categorizes RBMFC in collective health rather than medicine, because that is the knowledge area of the postgraduate programs most of the journal's authors are affiliated with (see Fontenelle, Rossi, de Oliveira, Brandão, and Sarti (2020) and Fontenelle and Sarti (2022)). The journal's Twitter account was created in 2013 and revitalized in 2018. Before the intervention, we used Botometer (Yang et al., 2019), formerly known as BotOrNot, to block followers deemed likely to be malicious (partially) automated accounts (i.e., bots), which decreased our count from 970 to 751 followers. According to Twitter's "audience insights" (a tool which has since been retired), 71% of followers were male and 29% female; 66% spoke Portuguese, 36% English, and 22% Spanish; and 68% of them were from Brazil. Furthermore, the organic audience was 66% male and 34% female; 75% spoke English, 24% Portuguese, and 20% Spanish; and 21% were from Brazil, 18% from the United States, and 13% from the United Kingdom. Tweets about journal articles probably had a less international organic audience than this average, because they were less likely to mention or retweet content from foreign accounts. According to Followerwonk (https://followerwonk.com/), accounts that followed RBMFC on Twitter had peak activity at 15:00 and lowest activity at 03:00, Brasília Time (UTC minus 03:00). Throughout the study period we continued to use Botometer (Sayyadiharikandeh, Varol, Yang, Flammini, & Menczer, 2020; Yang et al., 2019) to block accounts suspected of being malicious bots, and by December 2021 the account had 1,185 followers. According to Followerwonk, the peak and lowest activity hours of the follower accounts remained the same.

Overall design and randomization procedure
This was a crossover, quasi-randomized controlled trial. Starting on January 13, 2020, each article from RBMFC was tweeted three times, once in each language (Portuguese, Spanish, English). The three tweets were posted in a random combination of hour of the day (06:00, 09:00, 12:00, 15:00, 18:00 or 21:00, Brasília Time) and day of the week (Monday, Wednesday or Friday) within the same week. Each week saw up to two articles being tweeted, so each of the six times of the day was used at most once per week, and each of the three days of the week was used at most twice per week, but by different articles and at different hours. All articles from the study period were included in the study, even reviewer acknowledgments and errata.
The randomization table was prepared in advance and included in the preregistration (https://osf.io/avyb9). As journal articles were published (in continuous flow), they were assigned to the first available spot in the first available week, which is why the allocation is more accurately described as quasi-randomized rather than fully randomized. When multiple articles were published on the same date, we followed the order communicated by the journal's executive secretary, who did not participate in this study and did not know the next available spots for scheduling tweets. The only changes in the study methods since registration were minor adjustments in the statistical analysis code, as described in the "Statistical analyses" section.
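The weekly allocation can be sketched as follows. This is an illustrative Python reconstruction under the constraints stated above, not the preregistered code (which is at https://osf.io/avyb9); all function and variable names are our own.

```python
import random

HOURS = [6, 9, 12, 15, 18, 21]           # Brasília Time
DAYS = ["Monday", "Wednesday", "Friday"]
LANGUAGES = ["pt", "es", "en"]

def week_schedule(rng):
    """Randomize one intervention week with two article slots.

    Each slot gets three tweets (one per language), one on each of
    Monday, Wednesday and Friday, and each of the six hours of the
    day is used exactly once across the week."""
    hours = HOURS[:]
    rng.shuffle(hours)                   # split the six hours between the two slots
    slots = []
    for half in (hours[:3], hours[3:]):
        langs = LANGUAGES[:]
        rng.shuffle(langs)               # which language goes on which day
        slots.append(list(zip(DAYS, half, langs)))
    return slots

rng = random.Random(2020)                # fixed seed for reproducibility
slot_a, slot_b = week_schedule(rng)      # e.g. [("Monday", 15, "es"), ...]
```

Articles were then assigned to the first available slot as they were published, which is what makes the design quasi- rather than fully randomized.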

Intervention
Tweets were scheduled using TweetDeck or Twitter Web. Each tweet contained the article's title, the DOI (Digital Object Identifier) and a PNG (Portable Network Graphics) conversion of the first page, with the alternative text "first page" in the tweet's language. When the article was published in more than one language, we made the PNG from the full text in the same language as the tweet whenever possible, and used the English version by default when the article was not published in the same language as the tweet. Some tweets also mentioned the Twitter accounts of the article's authors and up to three hashtags indicating the subject. Sometimes the hashtags varied from language to language. For example, in English we used #MedEd instead of #medicaleducation (which would be a more direct translation of the hashtags in Portuguese and Spanish), and only in Portuguese did we use #SUS (for Brazil's Unified Health System).

Measures
The study had two primary outcomes: impressions (how often the tweet was viewed) and URL clicks (how often the article link was clicked). Both outcomes were obtained from Twitter Analytics (https://analytics.twitter.com/) at least one month after the corresponding tweet was posted. For example, the statistics for October 2021 were downloaded in December 2021. However, the statistics for January through August 2020 were downloaded only after the study was registered on August 23, 2020. When tweets started being randomized, the study had not yet been registered because, although we had already decided on the study design, we had not settled on its outcomes and stopping rule.
Besides the manipulated variables, which were filled in advance during the (quasi-)randomization procedure, some explanatory variables were recorded observationally. These variables included the journal section hosting the article ("research articles", "clinical reviews", "case reports", etc.), the language of the article, the inferred audience of the article, the number of hashtags, the number of mentions in the tweet (categorized as 0, 1, 2, 3+), and the article id and tweet id. The article language of a tweet was the language of the full text from which the PNG of the article was converted. The article audience categories ("researchers", "clinicians", "managers", etc.) were based on a well-known classification of primary care research (Beasley et al., 2004, 2007), with an additional category, "readers", for non-citable journal sections such as editorials and letters to the editor, which otherwise would often be hard to place in a single category. Articles were categorized by this study's first author and verified by the last one, neither of whom had seen the outcome data for the corresponding articles yet. The article id was obtained from the article's URL and used to verify whether all of the corresponding tweets were scheduled as planned. The tweet id was obtained from the tweet's URL on Twitter Web and used to merge the dataset of explanatory variables with the outcome data. The resulting dataset is openly available at https://osf.io/pg6na.
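The merge on tweet id amounts to a left join between the two datasets. A minimal sketch in plain Python follows; the field names and values are hypothetical (the real dataset is at https://osf.io/pg6na).

```python
# Hypothetical excerpts of the two datasets; field names are ours.
explanatory = [
    {"tweet_id": "1111", "article_id": "a001", "hour": 6, "language": "pt"},
    {"tweet_id": "2222", "article_id": "a001", "hour": 12, "language": "es"},
]
outcomes = [
    {"tweet_id": "1111", "impressions": 500, "url_clicks": 12},
    {"tweet_id": "2222", "impressions": 430, "url_clicks": 9},
]

# Index the outcome rows by tweet id, then attach them to the
# explanatory rows: a left join on tweet_id.
by_tweet = {row["tweet_id"]: row for row in outcomes}
merged = [{**row, **by_tweet[row["tweet_id"]]} for row in explanatory]
```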

Sample size, interim analyses and stopping rules
The sample size had a lower limit and an upper limit. The lower limit was 37 tweeted articles, based on statistical power. The upper limit was 60 intervention weeks (with at least one tweeted article in each), which meant 60 to 120 tweeted articles. Because the journal publishes around 60 articles per year, we felt any effect not evident after 60 intervention weeks would not be large enough to justify preferring one hour of the day over another.
The statistical power was calculated based on a sequential analysis (the first two tweeted articles, then the first four, then the first six, and so on) of multiple data sets simulated from parameters estimated from previous, observational data. We calculated that at least 6, 25, and 97 articles would need to be tweeted to achieve 50%, 80% and 95% statistical power to reach the inference criterion for tweet impressions. For URL clicks, the corresponding figures were 37, 119, and an uncalculated number higher than 120. For each outcome (impressions and URL clicks) and each combination of two times of the day (06:00, 09:00, 12:00, 15:00, 18:00 and 21:00), the inference criterion was whether the combination had a posterior probability greater than 95% of including the time of the day that maximizes the outcome.
After the lower limit of the sample size (37 articles) was reached, we started running the main statistical analyses on a monthly basis. When the analyses for both outcomes had achieved the inference criterion, we would stop randomization, collect the outcomes of the last already-tweeted articles, and run the final analyses. If, however, the upper limit was reached, we would stop the randomization regardless of the results.
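The inference criterion can be made concrete with a small sketch: given posterior draws of the expected outcome at each of the six hours, count how often each hour is the best and check whether some pair of hours accumulates more than 95% posterior probability of containing the best one. This is an illustrative Python reconstruction on synthetic draws, not the study's actual R code.

```python
import numpy as np

def best_pair_probability(draws):
    """draws: (n_draws, 6) posterior draws of the expected outcome
    at each of the six times of the day.

    Returns the pair of hours (as column indices) most often best,
    and the posterior probability that this pair contains the hour
    that maximizes the outcome."""
    best = draws.argmax(axis=1)                          # best hour in each draw
    probs = np.bincount(best, minlength=draws.shape[1]) / len(draws)
    top_pair = probs.argsort()[-2:]                      # two most likely hours
    return top_pair, probs[top_pair].sum()

rng = np.random.default_rng(42)
# Synthetic example in which the first hour is clearly the best:
draws = rng.normal(loc=[110, 100, 100, 100, 100, 100], scale=2.0, size=(4000, 6))
pair, prob = best_pair_probability(draws)   # criterion met when prob > 0.95
```

In the study itself, no pair ever reached the 95% threshold for either outcome, so randomization only stopped when the upper limit of 60 weeks was hit.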

Statistical analyses
The relevance of the time of the day for each outcome was expressed by three estimands. The first one was the expected value of the outcome for each of the six times of the day. The second estimand was the ratio between (1) the standard deviation among the random intercepts for the time of the day and (2) the standard deviation among the random intercepts for the article identifiers. A ratio larger than 1.0 would mean that the time of the day when an article is tweeted is even more important than which article is being tweeted, together with unmeasured variables such as whether the article was tweeted during the vacations. The third and final estimand was the Bayesian R² ("explained variability") of the regression model (Gelman, Goodrich, Gabry, & Vehtari, 2019), in comparison with another model identical except for lacking the time of the day. An R² near 100% would mean the time of the day explains all the variation left unexplained by a model without it. All estimates were reported with their median and 95% credible intervals (CI).
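For illustration, the second and third estimands can be computed from posterior draws as follows. This is a simplified Python sketch on synthetic draws, assuming the residual-based form of the Bayesian R² of Gelman et al. (2019); the study itself used R and brms.

```python
import numpy as np

def sd_ratio(sd_time, sd_article):
    """Per-draw ratio of the random-intercept standard deviations:
    time of the day over article identifier."""
    return sd_time / sd_article

def bayes_r2(fitted, observed):
    """Bayesian R-squared per posterior draw: variance of the fitted
    values divided by the variance of the fitted values plus the
    residual variance (residuals taken as observed minus fitted)."""
    var_fit = fitted.var(axis=1)
    var_res = (observed - fitted).var(axis=1)
    return var_fit / (var_fit + var_res)

def summarize(draws):
    """Median and 95% credible interval, as reported in the paper."""
    return np.percentile(draws, [50, 2.5, 97.5])

rng = np.random.default_rng(0)
signal = rng.normal(0.0, 1.0, 200)                    # "true" linear predictor
observed = signal + rng.normal(0.0, 1.0, 200)         # noisy outcome
fitted = signal + rng.normal(0.0, 0.05, (1000, 200))  # posterior draws of the fit
r2 = bayes_r2(fitted, observed)   # roughly 0.5 by construction (equal variances)
ratio = sd_ratio(rng.gamma(2.0, 0.1, 1000), rng.gamma(2.0, 0.5, 1000))
```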
The main analysis consisted of a multivariate negative binomial model with tweet impressions and URL clicks as the outcomes and article identifier (the variable from the randomization table, not the one recorded when the tweet was scheduled) and time of the day as random intercepts. The parameters had weakly informative prior distributions which were centered around estimates from the previous year's observational data and were compatible with a wide range of values (more than 100 times smaller or larger, in the case of the intercepts). The regression model, prior distributions and random seeds were fully specified in the preregistration.
In the subgroup analyses, we added an interaction between the time of the day and each of these variables: day of the week, tweet language, and article audience. As an additional ("per protocol") analysis, we repeated the main analysis restricting the data to tweets actually posted as randomized.
The regression models were fitted with R 4.2.1 (R Core Team, 2022) and its packages brms 2.17.0 (Bürkner, 2017) and CmdStanR 0.5.3, both of which leveraged Stan 2.30.1 (Stan Development Team, 2022a, 2022b). Compared to the registered plan (https://osf.io/avyb9), the final analysis code (https://osf.io/uqwbs) was reorganized, some parts were completed (data description and repeating the code for the per-protocol analysis), and the statistical software was updated. Any ambiguity in the analysis plan was commented on in the final analysis code.

RESULTS
The randomization ran from January 13, 2020, to October 10, 2021, when the study reached the upper limit of 60 weeks of intervention, amounting to 112 tweeted articles and three times as many tweets (Table 1). Tweets were evenly allocated between the days of the week, times of the day and tweet languages, with 13 tweets deviating from the protocol (2 in the wrong week, 7 on the wrong day of the week, 10 at the wrong hour, and 1 in the wrong language). The median tweet had 2 hashtags (maximum 5) and mentioned 0 usernames of authors on Twitter (maximum 7). Most tweets were about articles in the "research articles" section (186), aimed at clinicians (114), and published in Portuguese (291). In the subgroup analyses, there was very little variation in tweet impressions and URL clicks between the hours of the day within days of the week (Table 3), tweet languages (Table 4) or article audiences (Table 5). Moreover, for each hour of the day there was little variation between the days of the week, between tweet languages or between article audiences. Compared with the standard deviation among the articles, the standard deviation among times of the day was only 0.11 as large for (the logarithm of the expected number of) tweet impressions, with 95% CI from 0.01 to 0.49. This ratio was 0.38 for URL clicks, with a much wider 95% CI, from 0.02 to 2.98. Again, the ratios were very similar in the per-protocol analysis (see statistical report).

Adding the time of the day to the regression model (in comparison to only the articles' identity) increased the explained variability only from 47.0% (95% CI, 33.0 to 60.3) to 47.4% (33.8 to 60.7) for tweet impressions, and from 7.9% (0.1 to 23.1) to 9.2% (0.5 to 24.5) for URL clicks. The other regression models had similar success in explaining the variability (see statistical report).

DISCUSSION
This study found that there is no best time to tweet a journal article, regarding both tweet impressions and URL clicks. Even after 21 months, there was not enough evidence to point to any pair of times of the day as including the single best one. Moreover, any effect that the time of the day might have is likely to be of no practical relevance, as indicated by the expected number of tweet impressions and URL clicks, among other estimands.
The trustworthiness of the study findings is underlined by its experimental design and the agreement between the multiple statistical analyses. Our main analysis was guided by the "intention to treat" principle, that is, the tweets were analyzed as randomized. This way of analyzing the data was chosen because of its pragmatic interpretation: it informs the effects of deciding to tweet a journal article at specific times of the day. As an additional analysis, we restricted the data to tweets posted as planned. This per-protocol analysis yields stronger effects because there is no "classification error" in the explanatory variables. However, because most tweets were posted as randomized, the results were virtually identical. Likewise, the results were virtually identical within different days of the week, languages in which the tweets were composed, and the inferred audiences of the corresponding journal articles, implying these aspects do not influence the results. These findings stand in contrast with those of Spasojevic et al. (2015) and Bray (2012, November 12), as well as the findings of Kumar et al. (2018) and the recommendations of social media consultants and management platforms at large. As advanced in our Introduction, Spasojevic et al. (2015) and Bray (2012, November 12) analyzed data from before Twitter began sorting its timeline algorithmically. Because the timeline was strictly chronological before, any algorithm Twitter might adopt would necessarily lessen the effect of the hour when a tweet is posted. In other words, the fact that there is a peak time when users are online on social media need not translate anymore into there being an optimal time for posting. This disconnection between the tweet time and its impressions and URL clicks might be greater for scholarly journals than for the average Twitter content. Studying shortened URLs pointing to journal articles, Fang, Costas, Tian, Wang, and Wouters (2021) found the timing of clicks on Twitter to follow a pattern similar to the total number of clicks. Almost two thirds of the clicks took place after the first two days, and about one in seven took place after the first month. This is a much longer time scale than the (pre-algorithm) "18 minutes of fame" of Bray (2012, November 12), perhaps because much of what is posted on Twitter is of little consequence (Haustein, 2019).

CONCLUSION
Our "null findings" may seem disappointing at first sight, but they are actually liberating. Journal staff need not worry about the hour of the day when planning their posts on Twitter, nor when considering whether to subscribe to social media management platforms. Although this study randomized tweets from a single journal, its findings should generalize to other scholarly journals, since all of them publish content that stays relevant for many years, and because the findings were consistent across article audiences and tweet languages.

Table 1.
Characteristics of the tweets

According to the regression model, no time of the day achieved a 95% posterior probability of containing the single best time to tweet concerning either outcome (see statistical report), and all six times of the day had similar expected impressions and URL clicks (Table 2). The results were not materially different when restricting the analysis to tweets posted as randomized ("per protocol").

Table 2.
Expected number of tweet impressions and URL clicks according to the hour of the day when the article is tweeted, in the main and per-protocol models

Table 3.
Expected number of tweet impressions and URL clicks according to the hour of the day and day of the week when the article is tweeted

DOI: 10.5380/atoz.v13.89296. AtoZ: novas práticas em informação e conhecimento, 13, 1-12, 2024

Table 4.
Expected number of tweet impressions and URL clicks according to the hour of the day and language of the tweet

Table 5.
Expected number of tweet impressions and URL clicks according to the hour of the day when the article is tweeted, and the audience of the article