Correlation vs causation: The case of competitive funding and research quality

On September 27, the German Research Foundation (DFG) announced its decision to award the status of a research cluster of excellence (Exzellenzcluster) to 57 cluster proposals from all disciplines. This was the first step of its so-called excellence strategy (Exzellenzstrategie/ExStra, formerly known as the Exzellenzinitiative/ExIni). Each cluster receives about 7 to 12 million euros per year, which is a huge financial boost in the context of the German academic system.

Since the beginning of the debate about ExStra/ExIni, there have been supportive and critical voices, with the debate continuing and reemerging whenever an important decision is made, such as the one last week. This time, some of the critical voices referred to a recent study purportedly showing that an increase in the share of competitive research funding does not increase research quality, but rather decreases it. I wholeheartedly endorse the goal of evaluating science policy and academia, but I firmly believe that this study does not allow one to draw any causal inferences about the relationship between any element of the political and academic system and research quality (there is no point in calling out anyone who made this claim on Twitter, so I am not including such tweets here). The published study is available here; an ungated summary with the main findings is available here.

First, the study uses as the outcome a field-normalized measure of highly cited publications (top 10%). This is not research quality. Neither the rate of output nor the number of citations is an indicator of quality. For judging the quality of articles, one would have to read them or, at a minimum, derive some quality indicators to code them, such as having a clearly formulated research question, interpreting the results correctly, pointing out the limitations of the study, etc. One might counter that what many researchers and, possibly, funders want is a great many citations, so the use of this measure is justified. This might be so, but what matters is that the authors and the critical voices on ExStra interpret citations as a measure of quality, and that is fallacious.

Second, the data are aggregated to the national level. This is understandable because collecting data about the share of competitive funding at the departmental or individual level would be very demanding. However, a lot of insightful variation can get lost when data are aggregated to the national level, and the problem of ecological inference comes into play. Within each country, more competitive funding might lead to more publications of top-cited articles, while the trend across countries can still be negative at the aggregate level.
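To make the aggregation problem concrete, here is a minimal simulation sketch with purely invented numbers (not the study's data): within every hypothetical country the association between the share of competitive funding and the top-10% publication share is positive, yet the slope across the country-level averages is negative.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical countries: those with lower baseline citation performance
# happen to rely more heavily on competitive funding (invented numbers).
baselines = np.array([12.0, 10.0, 8.0, 6.0])       # top-10% share with no competitive funding
funding_centers = np.array([0.2, 0.4, 0.6, 0.8])   # typical share of competitive funding

within_slopes, agg_funding, agg_outcome = [], [], []

for base, center in zip(baselines, funding_centers):
    funding = rng.uniform(center - 0.1, center + 0.1, size=50)     # yearly observations
    outcome = base + 3.0 * funding + rng.normal(0, 0.3, size=50)   # positive effect within country
    within_slopes.append(np.polyfit(funding, outcome, 1)[0])
    agg_funding.append(funding.mean())
    agg_outcome.append(outcome.mean())

print("within-country slopes:", np.round(within_slopes, 2))                          # all clearly positive
print("cross-country slope:", round(np.polyfit(agg_funding, agg_outcome, 1)[0], 2))  # negative
```

This is exactly the reversal the ecological-inference literature warns about: relationships among country-level averages need not reflect the relationships within countries.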

Third, the analysis is based on bivariate correlations. This is fine for exploratory purposes, but, in this case, not for causal inference. The theoretical model presented in the study suggests that the total effect of competitive funding is identified, but I highly doubt it (see the model in the ungated report). One can think of common causes such as 'competitive pressure imposed by science policy makers' that point into 'autonomy', 'competitive project funding', and 'national evaluation research system', or just into 'competitive project funding' and the outcome (belief in the benefits of competition leads to more competitive funding, but the increased internal pressure to be productive makes researchers submit papers too early, lowering their quality and leading to fewer citations). At the least, one would need to argue in much more detail that this causal model is complete (if we take it as a causal model), because it does not look complete as it stands.
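As a purely hypothetical illustration of such confounding (again, nothing here is estimated from the study), the sketch below simulates a setting in which competitive funding has no causal effect on the outcome at all, yet the bivariate correlation is clearly negative because an unobserved common cause, 'competitive pressure', raises the share of competitive funding and lowers the top-10% share.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200  # hypothetical country-year observations

# Unobserved common cause: competitive pressure imposed by science policy makers.
pressure = rng.normal(0, 1, n)

# Pressure raises the share of competitive project funding ...
competitive_funding = 0.5 + 0.1 * pressure + rng.normal(0, 0.05, n)

# ... and lowers the top-10% publication share (e.g., via prematurely submitted papers),
# while the causal effect of competitive funding itself is set to exactly zero here.
top10_share = 10.0 + 0.0 * competitive_funding - 1.5 * pressure + rng.normal(0, 0.5, n)

print("bivariate correlation:",
      round(np.corrcoef(competitive_funding, top10_share)[0, 1], 2))  # clearly negative
```

If such a common cause exists and is not controlled for, the bivariate correlation mixes a possibly zero (or even positive) effect of funding with the effect of the confounder, which is why the completeness of the causal model matters.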

The proper interpretation of this study is much more limited: a higher share of competitive funding is weakly negatively correlated with the publication of highly-cited articles. This is an interesting insight, but not a causal one. I am not saying that ExStra is terribly good at improving research quality because I do not know how ExStra/ExIni influenced the German academic system. The point is that none of us know the causal effect of competitive funding, even after having read this study.

About Ingo Rohlfing

I am a political scientist. My teaching and research cover social science methods with an emphasis on case studies, multi-method research, causation, and causal inference. I have also become interested in matters of research transparency and credibility. ORCID: 0000-0001-8715-4771

3 Responses to Correlation vs causation: The case of competitive funding and research quality

  1. brembs says:

    I think most of your points are well taken and accurate. However, the paper did not look at “quality”, nor just at highly cited papers. Instead, it looked at efficiency, i.e., the number of highly cited papers (field-normalized at that) that the research euro gets you. I would tend to say that across many fields, using a normalized measure, citations do show you how much attention these publications have received. Granted, scientists cite for various reasons, but absent any better measure in our current, antiquated infrastructure (we should have had better measures since at least the 2000s!), this is about as good as one can currently do if one wants to assess the cost-effectiveness of research. (Again, it is embarrassing enough that this is still the best we can do, but with antiquated data collection, you can only do antiquated analysis.)

    Second, of course Twitter is too coarse for a thorough distinction between correlation and causation, but the conjecture of the study remains: if competition were to significantly increase effectiveness (the null hypothesis), one would expect a positive, not a negative, correlation. The negative correlation doesn’t automatically falsify the null hypothesis (that competition increases effectiveness), for a number of reasons, but it does cast doubt at least on the magnitude of the assumed effect. Phrased differently: if that had been the outcome of one of my experiments, I would have seriously reconsidered my null hypothesis, or at least the assumptions under which it was made. For instance, if competition really increased effectiveness, but I don’t see this in the data, are there more important factors at play that I should pay attention to rather than competition?

    Thus, for these two reasons, I think it is perfectly acceptable and scholarly to refer to these data when critiquing the ExStra and some of the assumptions under which it was devised.

    That being said, efficiency was never one of the main reasons for the ExStra. The ExIni was devised as a political compromise to circumvent the “Kooperationsverbot”, not as a way to increase efficiency in Germany. Without the “Kooperationsverbot”, there would likely never have been a competition between universities – in that case, federal funding would have been allocated to universities in a much less bureaucratic manner. This is public knowledge. I would speculate that not extending the ExIni would have been too much of an admission of failure and would have generated too much other bad PR, such that there was simply little choice other than to keep it going, especially in the face of the glowing Imboden review.

    • ingorohlfing says:

      Thanks a lot for commenting. The authors are not consistent in describing what the outcome is, but they refer to quality in their article (pp. 370, 380), using top-10% citations as a proxy (this is how I read it on page 370). In their summary of the study, there can be no doubt they mean research quality because the title reads: “Making academics compete for funding does not lead to better science”. The one-sentence summary also refers to “quality”, making me believe they mean to measure research quality in the article. If not, and it is only about efficiency, then the measure might be defensible (although I am not sure why the top 10% should be a better measure of “return on investment” than the top 50% or just all citations).

      I would not speak of hypotheses and testing in relation to this study. It is neither design-based inference nor model-based inference; it is just a descriptive analysis. This is not to downplay the value of descriptive research. My concerns relate to the interpretation of the study as presenting causally interpretable evidence, which is also reflected in the title of the authors’ summary. The wording in the article is a little more careful, but, from my point of view, not careful enough (see the last paragraph of the conclusion).

      In any case, I think we agree that a thorough evaluation of the Exzellenzinitiative is in order. The Imboden report that preceded the Exzellenzstrategie does not seem to rely on any quantitative measures: https://www.gwk-bonn.de/fileadmin/Redaktion/Dokumente/Papers/Imboden-Bericht-2016.pdf

  2. brembs says:

    I would agree that equating the top 10% cited papers with “quality” is difficult to justify, even when one assumes that scientists by and large only cite work they find “good”. 🙂

    More importantly, though, IIRC, the Y-axis of most, if not all, figures is indeed efficiency, i.e., how many of these 10% most cited papers a funding € buys the country. The number of papers alone is not a desirable measure, because it would value quantity over content. Citations alone are not a desirable measure, because they depend on the number of papers. Citation rate suffers from heavily skewed distributions and from dependence on publication date, especially when one looks at short time scales, and is likely not trivial to calculate from such aggregate figures. From that perspective, it seems to me that looking at a certain percentage of highly cited publications comes about as close as one can get these days to one of the purported goals of current research funding, i.e., international visibility, without prohibitive manual labor in the calculations. It may not be perfect, but it looks good enough to me, i.e., using anything better with our lousy infrastructure would mean disproportionately more manual labor. The main premise the paper sets out to test is indeed whether competition increases effectiveness, along with other neoliberal mantras. Indeed, the very first sentence of the paper is: “What are the characteristics of research systems that influence efficiency?”.

    So in my reading (maybe wishful reading? :-), what they were testing are statements that certain governance structures have a positive effect on efficiency, and they did not find the evidence to corroborate said statements. In my book, that casts doubt on the validity of these statements, even if there remains room for the statements to still escape falsification.

    Indeed, the Imboden report did not look at quantitative measures. Likely for good reason. We looked at employment data and it didn’t look good:

    http://www.faz.net/aktuell/feuilleton/forschung-und-lehre/verbesserung-der-arbeitsbedingungen-an-unis-13354907.html
    Nature looked at publication data and it also didn’t look very impressive:

    https://www.nature.com/news/germany-claims-success-for-elite-universities-drive-1.18312

    From the data I know, the ExIni and ExStra were and are a huge waste of money – at best.
