Causal overstatements reduced in press releases following academic study of health news

Background: Exaggerations in health news were previously found to associate strongly with similar exaggerations in press releases. Moreover, such exaggerations did not appear to attract more news coverage. Here we assess whether press release practice changed after these findings were reported; simply drawing attention to the issue may be insufficient for practical change, given the challenges of media environments. Methods: We assessed whether rates of causal over-statement in press releases based on correlational data were lower following a widely publicised paper on the topic, compared to an equivalent baseline period in the preceding year. Results: We found over-statement rates in press releases of 28% (95% confidence interval = 16% to 45%) in 2014 and 13% (95% confidence interval = 6% to 25%) in 2015. A corresponding numerical reduction in exaggerations in news was not significant. The association between over-statements in news and press releases remained strong. Conclusions: Press release over-statements were less frequent following publication of Sumner et al. (2014). However, this evidence is correlational and the reduction may be due to other factors or natural fluctuation.

However, it is not clear whether such research has much influence on the practice of academics and press officers in preparing press releases. Given the need to write short, compelling statements about complex research, it is all too easy to allow inadvertent over-statements. We believe that the majority of exaggerations are not purposeful, but arise from the desire to be impactful, clear and accessible; this may be very difficult to change. The finding that many news exaggerations are already present in press release text (Sumner et al., 2014) certainly attracted interest and controversy (BMJ Altmetric, 2019). It was received positively by many press officers who are motivated to communicate science carefully, and helped catalyse some initiatives (The Academy of Medical Sciences, 2017). It was discussed at science communication conferences, in blogs and on Twitter, and directly with press officer teams during the development of a collaborative trial (Adams et al., 2019).
However, we do not know whether raising awareness in this way has any effect on practice. Here we simply assess whether the rate of over-stated claims was lower in the 6 months following publication of that article (January to June 2015) compared with the equivalent 6 months in the previous year (January to June 2014).
Our main interest was press release claims, since these were the source identified in Sumner et al. (2014), but we also assessed the associated news stories. Clearly our data can only establish whether a detectable difference occurred; they will not establish its cause. We can additionally compare the data to a third time point in 2011 (with some limitations).
We focus here on causal claims based on correlational evidence, a common and potentially impactful form of over-statement in academia and science reporting, which we will refer to as 'causal over-statement'. We test whether rates of causal over-statement were lower in a six-month period in 2015 compared to the same months in 2014.

Amendments from Version 1
In response to reviewer 1's comments we have added more information about the data from 2011 to enable a better comparison. We have noted in the abstract and discussion that regression to the mean / natural fluctuation is a possible explanation, and removed the phrase 'press release practice is malleable'. We have also adopted the phrase 'causal over-statement', as recommended. We have included Haber et al. and Shaffer et al. where recommended. We have corrected the typos.
In response to reviewer 2's comments we have now defined 'aligned' in the methods and the legend of Figure 2. We have rephrased and expanded the sentence 'the effect on news is diluted by other factors and so here we may have had sufficient power only to detect the effect on press releases'. As the reviewer notes, it is very difficult to avoid causal language even when we are primed to do so by the very topic of the paper. We now note more explicitly that the causal effect is hypothetical, and that the data are merely consistent with this. We have also adopted the phrase 'causal over-statement', as recommended, and added more information about the reception of the original study in the introduction.

Method
Collection of press releases, journal articles and news
Press releases from 2014 and 2015 were collected from the same sample of 20 universities as used in Sumner et al. (2014), as well as from the BMJ, which published that paper. Press releases were sampled from January to June of each year; we chose a 6-month period to aim for a sufficient sample, and compared equivalent months in case press release output has seasonal changes (e.g. associated with the academic year). Online repositories (institutional websites and EurekAlert.org) were searched for any press releases from the included institutions, resulting in a corpus of 4706 press releases. The sample was then restricted to those relevant to human health, using the same criteria as Sumner et al. (2014) (which included all biomedical, lifestyle, public health and psychological topics), and to those reporting on a single, published, peer-reviewed research article. This left 1033 relevant press releases. To ensure similar sample numbers across institutions and to reduce the sample to one we had the resources to code, we implemented a cap of 10 press releases for each time period for each institution, through random selection where necessary. This resulted in a sample of 368 press releases, for which the associated peer-reviewed journal articles were retrieved. For each press release, relevant news articles (i.e. those which make reference to the source research) were collected via keyword searches using Google Search and the Nexis database (LexisNexis, New York, NY), up to 28 days after publication of the press release and up to one week before (to allow for rare embargo breaches). The sample was then limited to cases where the study design was observational cross-sectional, observational longitudinal, or an observational meta-analysis (N = 168 press releases). For analysing over-statements in press releases and news, we only used cases where the journal article was not already over-stated (N = 98 press releases; 322 news articles).
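To make the capping step concrete, a minimal sketch of such a procedure follows; this is an assumed reconstruction in Python for illustration, not the authors' actual selection script, and the function and field names are hypothetical.

```python
# A sketch of the capping step described above (assumed reconstruction,
# not the authors' actual script): randomly retain at most 10 press
# releases per institution per 6-month period.
import random
from collections import defaultdict

def cap_sample(press_releases, cap=10, seed=2019):
    """Each press release is a dict with 'institution' and 'period' keys."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for pr in press_releases:
        groups[(pr["institution"], pr["period"])].append(pr)
    sampled = []
    for items in groups.values():
        sampled.extend(items if len(items) <= cap else rng.sample(items, cap))
    return sampled
```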

Article coding
Prior to coding, the corpus of articles underwent a redaction process using Automator software (5.0, Apple Inc.) to remove any references to the years 2014 or 2015. This ensured that the coders, who were aware of the aim of the study, could not tell which condition each article belonged to. The articles were coded using the standardised coding sheet used by Adams et al. (2017) (see the raw data folder 'before_after_data' in Chambers et al., 2019). For this analysis, only the information regarding statements of causal or correlational relationships was used. Two researchers (LB and AC or RCA) independently coded each article and any disagreements were discussed, with a third coder (AC or RCA) where necessary. This created a database with 100% agreement in coding.
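The redaction itself was done with Automator; purely for illustration, an equivalent step could be scripted as in the following Python sketch (the function name and placeholder are hypothetical, not part of the study's workflow).

```python
# The authors used Automator (Apple); an equivalent redaction step in
# Python might look like this sketch: blank out the study years so
# coders cannot infer which condition an article belongs to.
import re

def redact_years(text: str) -> str:
    """Replace any mention of the study years with a neutral placeholder."""
    return re.sub(r"\b(2014|2015)\b", "[YEAR]", text)

print(redact_years("The survey, published in March 2015, found..."))
# -> "The survey, published in March [YEAR], found..."
```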
Coding of causal and correlational claims. We used the scale developed by Adams et al. (2017), in which 'directly causal' and 'can cause' statements are classed as over-statements for correlational evidence. A claim was not classed as an over-statement if it contained 'might', 'may', 'could', 'linked to', 'predicts', 'associated with' or other associative or conditional phrases. We refer to such phrases as 'aligned' with correlational evidence. Although Sumner et al. (2014) originally distinguished between some of these phrases, readers were found not to consistently rank any of them as stronger than the others (Adams et al., 2017). In contrast, readers consistently ranked 'can cause' and directly causal statements as stronger.
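Note that coding in this study was performed by trained human coders, not software. Purely to illustrate the category boundary described above, a naive keyword rule might look like the following sketch (hypothetical, not a method used in the paper; real claims require human judgment of context).

```python
# Illustration only: a naive keyword rule approximating the boundary
# between over-statements and 'aligned' phrasing on the Adams et al.
# (2017) scale. The actual study used independent human coders.
OVERSTATEMENT_CUES = ("causes", "can cause")            # over-statements
ALIGNED_CUES = ("might", "may", "could", "linked to",
                "predicts", "associated with")          # aligned phrasing

def naive_claim_strength(claim: str) -> str:
    lowered = claim.lower()
    if any(cue in lowered for cue in OVERSTATEMENT_CUES):
        return "over-statement"
    if any(cue in lowered for cue in ALIGNED_CUES):
        return "aligned"
    return "unclassified"

print(naive_claim_strength("Red wine causes longer life"))       # over-statement
print(naive_claim_strength("Red wine is linked to longer life")) # aligned
```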
The strongest claims relating two variables in the study (e.g. a food and a disease) were recorded from the abstracts and discussion sections of journal articles. For press releases and news articles, the strongest statement was coded from the first two sentences of main text (where these were directly relevant to the research; general context was excluded).
We defined over-statements as causal or 'can cause' claims based on correlational evidence. We only analysed cases where the journal article did not already make such claims, since our focus was not on claims taken straight from the publication, where publication lags mean that such causal claims may have been originally penned at any time in 2014 or early 2015.

Statistical analysis
Consistent with our previous approach (Sumner et al., 2014), generalised estimating equations (GEE) were used (in SPSS version 24) to provide estimates and confidence intervals adjusting for the clustering of multiple articles to one source (multiple news articles from one press release, or multiple press releases from the same institution). The GEE is an extension of the quasi-likelihood approach and is used with clustered data to estimate how much each data point should contribute statistically. The key part of the process is to estimate the correlation of data within clusters. At one extreme, all data within clusters might be fully correlated, in which case there are really only as many samples as there are clusters; separating the data points within clusters adds no additional information. At the other extreme, data within clusters may be entirely uncorrelated; in this case the clustering does not matter and all data points can be treated as independent. In reality, data within clusters tend to be somewhat correlated, and the GEE estimates this and applies a weighting factor to the data points depending on the degree of correlation. The approach is accessibly explained by Hanley et al. (2003), so we do not replicate the equations here. We used a logit link function because the data are binary, and an exchangeable working correlation, which is a common approach for clustered data and makes the parsimonious assumption that correlations are similar between all data within clusters.
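For readers who wish to fit this kind of model outside SPSS, a minimal sketch using Python's statsmodels library follows; the variable names and toy data are hypothetical, and this illustrates the model family (binomial GEE, logit link, exchangeable working correlation), not the authors' analysis.

```python
# A minimal sketch of the analysis type described above, using Python's
# statsmodels rather than SPSS (which the authors used); toy data only.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical data: one row per press release; 'overstated' is binary.
df = pd.DataFrame({
    "overstated":  [1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0],
    "year_2015":   [0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1],
    "institution": ["A", "A", "A", "A", "B", "B", "B", "B",
                    "C", "C", "C", "C"],
})

# Logit link for binary outcomes; exchangeable working correlation
# assumes similar correlation between all press releases of an institution.
model = smf.gee(
    "overstated ~ year_2015",
    groups="institution",
    data=df,
    family=sm.families.Binomial(),
    cov_struct=sm.cov_struct.Exchangeable(),
)
result = model.fit()
print(result.summary())  # the year_2015 coefficient is a log odds ratio
```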

Results
Press release over-statements
In the sample from 2014, 28% (95% confidence interval = 16% to 45%) of press releases made a causal over-statement: a causal claim or 'can cause' claim when the data were correlational and the journal article had not made a similar claim. In the sample from 2015, this rate was significantly lower, at 13% (95% confidence interval = 6% to 25%; see Figure 1). Thus, the odds of such over-statement were higher in 2014 (odds ratio = 2.7, 95% confidence interval = 1.03 to 6.97).
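As a rough check (ours, not a calculation from the paper), the odds ratio implied by the raw percentages can be computed directly; the GEE-adjusted estimate additionally accounts for clustering by institution.

```python
# Back-of-envelope check of the reported odds ratio from the raw rates.
p_2014, p_2015 = 0.28, 0.13
odds_2014 = p_2014 / (1 - p_2014)       # ~0.39
odds_2015 = p_2015 / (1 - p_2015)       # ~0.15
print(round(odds_2014 / odds_2015, 1))  # ~2.6, close to the reported 2.7
```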
These numbers can also be compared with those for news in 2011, analysed by Adams et al. (2017): the proportion of news stories with a causal over-statement was 32% (95% confidence interval = 24% to 41%).

News statements as a function of press release statements
To assess whether the drop in press release over-statements meant a weakening of the previously found association between news claims and press release claims, we assessed this association following the same methods as previously described (Sumner et al., 2014). Across 2014 and 2015 combined, the odds of over-stated news claims were 12 times higher (95% confidence interval = 4.5 to 32) for over-stated press releases (69% of news over-stated, 95% confidence interval = 49% to 84%) than for aligned press releases (16% of news over-stated, 95% confidence interval = 10% to 24%). This association between news and press releases (Figure 2) did not differ between the years (odds ratio = 1.1, 95% confidence interval = 0.2 to 6.2) and is consistent with the association between news and press releases seen previously for exaggerations and other content.

Discussion
We set out to assess whether there was evidence of changes in press release practice after academic publications about health news and press releases. We found an approximate halving (28% to 13%) in the rate of causal over-statements in press releases based on correlational evidence in the 6 months following a widely shared publication (Sumner et al., 2014) compared with an equivalent 6 months in the preceding year (Figure 1). These rates can additionally be compared to the rate of 19% in a dataset from 2011 (albeit with some differences in methodology).
This evidence is correlational itself, and may not mean that the publication caused the change, since other factors may also have changed between 2014 and 2015. There has been scrutiny of health news and press releases from multiple quarters, and press officer staff turnover may spontaneously change the balance of language in causal claims. At one extreme, it is possible that the changes were fully random: that 2014 was an unusually high year, and the drop to 13% was merely natural fluctuation or regression to the mean. At the other extreme is a fully causal explanation: that press releases were on a trajectory of rising causal over-statement, and awareness-raising reversed that trend. The truth normally lies somewhere between such extreme interpretations, and all the above factors may have played a role. Moreover, whatever the causal chain, the drop or fluctuation shows that a high rate of causal language is not inevitable in press releases, despite the need to be concise and appealing.
Beyond the main focus on press releases, we also saw a numerical reduction in over-statements in news, but this was not significant (Figure 1). However, we found a strong association between news and press release language (Figure 2). There was no reason for this association to change while the time pressures on journalists remain intense, and importantly it did not weaken with the reduction of over-statement in press releases. This strong association raises the question of why the significant reduction in press release over-statement was not mirrored by a significant reduction in news over-statement (Figure 1), if a causal chain were operating such that press release claims influence news claims.
In fact the data are consistent with such a causal effect, because it is expected to be diluted by other factors. Numerically, if news carries over-statements for around 70% of over-stated press releases and 15% of non-over-stated press releases (e.g. Figure 2), and if this difference is causal, we can calculate the expected change in news over-statement resulting from the change in press release over-statement seen in Figure 1 (28% to 13%). We would therefore expect over-statements in around 30% (0.7 × 28 + 0.15 × 72) of news in 2014 and 22% (0.7 × 13 + 0.15 × 87) of news in 2015, which is close to what we saw, and clearly a diluted effect compared with the press release reduction of 28% to 13% (this outline calculation, sketched in code below, assumes similar news uptake for press releases regardless of over-statement, consistent with previous results). Therefore our results are consistent with the non-significant effect in news being due to dilution and insufficient power, and should not be taken as evidence for no change in the news despite a difference in press releases.

We based our analysis on press releases and news for journal articles that did not already make causal claims. Of additional note, a GEE analysis of the journal articles themselves showed that there were already causal claims in an estimated 40% of the 168 peer-reviewed journal articles based on correlational evidence (and meeting our other inclusion criteria). This tendency to use causal language, even in peer-reviewed research conclusions, has been noted previously (Wang et al., 2015). It would be worth following such rates to find out whether they too might show signs of decreasing as awareness is raised.
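The outline dilution calculation above can be reproduced in a few lines; the conditional rates are taken from the text and Figure 2, and this sketch is illustrative rather than the authors' analysis script.

```python
# Arithmetic reproducing the dilution calculation (rates from the text).
p_news_if_overstated = 0.70  # news over-stated given an over-stated press release
p_news_if_aligned = 0.15     # news over-stated given an aligned press release

for year, p_pr in [("2014", 0.28), ("2015", 0.13)]:
    expected = p_news_if_overstated * p_pr + p_news_if_aligned * (1 - p_pr)
    print(year, f"expected news over-statement: {expected:.0%}")
# Prints roughly 30% for 2014 and 22% for 2015: a diluted change
# relative to the 28% -> 13% drop in press releases.
```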
Figure 2. Causal over-statements in news articles as a function of press release over-statement and year of publication. 'Aligned' press releases or news are those that do not make causal claims stronger than 'may cause'. The association between news and press releases is present in both years and not statistically different between years. Error bars represent 95% confidence intervals.

We analysed only one form of over-statement: causal claims, which are a cornerstone of scientific inference. There are many other forms of potential over-statement that we did not analyse, including the two originally assessed by Sumner et al. (2014): advice to readers and human inference based on non-human research. There are different reasons why we did not use these here as a test of professional practice change. For advice, we cannot compare it objectively to an aspect of study methods, and we have recently reported that the association between advice in news and press releases did not replicate (using a subset of the 2014/15 sample used here; Bratton et al., 2019). We believe this may show that advice exaggeration is difficult to define: audiences change between journal articles, press releases and news, and thus the appropriate phrasing of advice may legitimately change. For human inference based on non-human research, a similar analysis was not possible because of the confounding influence of the Concordat on Openness on Animal Research in 2014.

Conclusions
Previous converging evidence suggests that press releases have a strong influence on health and science news, which in turn influence public health decisions and health practitioners (see Introduction). Here, we found a reduction in press release causal over-statement associated with the publication of a study examining news and press release exaggerations. Although correlational, this evidence may suggest that press release practice can change in response to research, given that causal over-statements do not seem to have been decreasing prior to this. However, natural fluctuation or other factors that changed between 2014 and 2015 could also explain the data.

This paper follows up on studies of what Alexandra L.J. Freeman in her review calls "causal over-statement" or "unwarranted/unsupported causal claims". Previous studies by the same team and others have found a correlation between unwarranted causal claims in press releases from universities and in related news stories (unwarranted, because they were based on statistical correlations and not on actual cause-effect relations). The team's first study came out in December 2014, reporting on press releases from 20 leading British universities.
The present study looks at press releases and media reports from the first half of 2014 (before the first study made news) and from the first half of 2015 (just after). There was a higher rate of unwarranted causal claims in the 2014 press releases (28%) than in the 2015 ones (13%). The difference is statistically significant. However, there is no significant difference in the rate of unwarranted causal claims between the 2014 media reports and the 2015 ones.
At the same time, the study finds a "strong association" between "aligned" press releases and "aligned" news stories, and between "overstated" press releases and "overstated" news stories. ("Aligned" is defined neither in the present study nor in the original one, but probably means something like making no causal claims based on correlational evidence.) This association, the authors note, raises the question of why the observed drop in unwarranted causal claims in the press releases is not matched by a similar drop in the news stories. They speculate that "the effect on news is diluted by other factors and so here we may have had sufficient power only to detect the effect on press releases".
It is not entirely clear what this sentence implies in terms of cause and effect, but it probably refers to an effect on news caused by the original study and mediated by the study's effect on press releases. It is unfortunate that the authors discuss their findings in terms of cause and effect when what they really have is correlational evidence, particularly in a paper that deals with unwarranted causal claims.
Nevertheless, the findings are interesting and generally clearly reported. I support Freeman's suggestion to replace the term "causal claims about correlation" with another term. As Freeman also points out in her review, the results obviously need to be taken with some reservation due to the limited number of data points (just two). In my view, the study's most important implication is that it opens up the possibility of analyzing the potential impact of academic research on health communication via press releases and news stories.
In order for the reader to understand the reception of the original study, a little more background in the introduction would have been helpful. The paper says that the original study "attracted interest and controversy", with a reference to BMJ Altmetric. I encourage the authors to expand on the context in which this study was carried out and to provide a full overview of the research activities in which it is embedded.

Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate? I cannot comment. A qualified statistician is required.
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Science communication.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Author Response 27 Apr 2020
Petroc Sumner, Cardiff University, Cardiff, UK

Thank you for your very helpful comments. We have now defined 'aligned' in the methods and the legend of Figure 2. We have rephrased and expanded the sentence 'the effect on news is diluted by other factors and so here we may have had sufficient power only to detect the effect on press releases'. As the reviewer notes, it is very difficult to avoid causal language even when we are primed to do so by the very topic of the paper. It is unfortunate that our statistical terms do not differentiate between an 'effect' in the data that is merely a correlation and a causal 'effect'. The effect we refer to here is the hypothetical effect press release text has on news text, consistent with the strong correlation between news and press release statements seen in several studies (and again in Figure 2 of the current paper). We now note more explicitly that the causal effect is hypothetical, and that the data are merely consistent with this. We have also adopted the phrase 'causal over-statement', as recommended, and added more information about the reception of the original study in the introduction.
Competing Interests: No competing interests were disclosed.
Conclusions such as 'press release practice is malleable' seem a little strong.
The team have only two data points, 2014 and 2015, and there are a lot of potential confounders. They themselves point out that they were unable to do a similar analysis of claims that findings based on animal research are relevant to humans, because of the confounding influence of the Concordat on Openness on Animal Research in 2014. There may well be other confounding influences on the claims of unwarranted causality which they report here. The authors do acknowledge this in the Discussion, but discount other factors as being unlikely to cause such a large decrease.

The team also mention that 'there appears to have been a rise between 2011 and 2014'. The apparent fall in 2015 could therefore have included a degree of regression to the mean after a particularly heinous year for exaggeration in 2014.

In summary, then, I think that these findings are interesting and seem methodologically sound, but I would not draw too strong a conclusion from a single pair of data points (especially when a third is hinted at that might undermine the proposed pattern). Stronger evidence would have been a longer time series, preferably mixed with qualitative research with press officers about their awareness of the Sumner (2014) publication, and perhaps a study comparing the practices of institutions who had and had not been exposed to the findings of Sumner (2014). However, I recognise that this article is the product of additional analysis of a dataset collected for other purposes and that it does not stand alone in the literature. I would therefore advise that the conclusions are not couched too strongly, and that they are framed as based on the totality of the evidence rather than only on these results.
A few minor points: in the title and throughout the manuscript, the phrase 'causal claims about correlations' is a little hard to read. Using the shorter 'causal over-statement' or 'unwarranted/unsupported causal claims' in places might make the text easier to follow. This is a particular problem because the manuscript itself reports a correlation, so the terminology can get complex.