Going Back in Time: Understanding Patterns of International Scientific Collaboration

We study the interpretation of the observed/expected ratio as an indicator of international scientific collaboration. This indicator cannot be considered as representing the affinity for collaboration, in the sense of ‘the more the better’, between two countries.


INTRODUCTION
In 1992 Luukkonen, Persson and Sivertsen published a well-known article entitled "Understanding patterns of international scientific collaboration" in the journal Science, Technology, and Human Values. [1] As of January 2021 it has been cited 326 times in the Web of Science (WoS). This places the article in the top position among articles published in this journal in 1992, and in 11th place among all articles published in this journal since 1980.
Luukkonen et al. [1] noted that international scientific collaboration has increased both in volume and importance, a point that holds even more strongly today. The authors ask how one might explain country-to-country differences in the rates of international co-authorship and patterns of collaboration within the global network of collaborating countries. They draw the readers' attention to cognitive, social, historical, geopolitical, and economic factors as potential determinants of the observed patterns. Indeed, there is no single main factor; many factors influence the propensity of countries to collaborate internationally. Luukkonen et al. [1] offer an important tool for monitoring the effects of government science policy aiming to enhance international collaboration.
In their article, to study the importance of specific bilateral relations relative to all such relations within the whole network, the authors define the observed/expected ratio of co-authorship (OER) for a pair of countries (X,Y) in the set of publications under study as:

$$\mathrm{OER}(X,Y) = \frac{C_{X,Y} \cdot T}{C_X \cdot C_Y} \tag{1}$$

where $C_{X,Y}$ denotes the number of collaborations between countries X and Y (whole counting); $C_X$ the total number of collaborations country X has with all other countries; $C_Y$ the total number of collaborations country Y has with all other countries; and $T$ the total number of pairwise country collaborations in the set of publications under study. Hence $T$ is the number of links in the weighted collaboration network, with countries as its nodes. It is assumed that $C_X$ and $C_Y$ are different from zero and that, for all countries X, $C_{X,X}$ is set equal to zero. Note that in this counting scheme a publication with 5 authors, 2 from country X, 2 from country Y and 1 from country Z, yields 3 country collaborations, namely X-Y, X-Z, and Y-Z. Clearly, OER is a relative indicator.
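The whole-counting scheme and the resulting ratio can be sketched in a few lines of Python. This is only an illustration: the function names `count_links` and `oer`, the toy publication list, and the form OER = C_{X,Y} · T / (C_X · C_Y) (the usual observed/expected ratio, taken here as what formula (1) expresses) are assumptions of this sketch.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def count_links(publications):
    """Whole counting: each publication contributes one link per unordered
    pair of distinct countries among its authors."""
    pair_counts = Counter()
    for author_countries in publications:
        countries = sorted(set(author_countries))
        for x, y in combinations(countries, 2):
            pair_counts[(x, y)] += 1
    return pair_counts


def oer(pair_counts, x, y):
    """Observed/expected ratio, assuming OER = C_xy * T / (C_x * C_y)."""
    total = sum(pair_counts.values())  # T: all pairwise country links
    c_x = sum(n for pair, n in pair_counts.items() if x in pair)
    c_y = sum(n for pair, n in pair_counts.items() if y in pair)
    c_xy = pair_counts[tuple(sorted((x, y)))]
    return Fraction(c_xy * total, c_x * c_y)


# Hypothetical data: the five-author publication from the text (2 authors
# from X, 2 from Y, 1 from Z) plus one extra X-Y publication.
publications = [["X", "X", "Y", "Y", "Z"], ["X", "Y"]]
links = count_links(publications)
print(dict(links))
print(oer(links, "X", "Y"))
```

The first publication yields the three links X-Y, X-Z and Y-Z regardless of how many authors each country contributes, which is what whole counting means here. Exact fractions are used so that ratios can be compared without rounding.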
One may observe that formula (1) is one of the many variations of the activity index (AI) or Balassa index, [2][3][4][5] which we recall for completeness' sake as formula (2):

$$\mathrm{AI}(C,D) = \frac{O_{CD}/O_D}{O_C/O_W} \tag{2}$$

where $O_{CD}$ denotes the number of publications by country C in domain D during a given publication window; $O_D$ denotes the total number of publications in the world in domain D during the same publication window; $O_C$ denotes the number of publications (in all domains) by country C during the same publication window; and $O_W$ denotes the total number of publications in the world and in all domains during this publication window.
However, to study the relative importance of country Y for country X while taking the other relations into account, the authors of [1] used an asymmetric form of OER. We denote it as AOER. It is defined as:

$$\mathrm{AOER}(X,Y) = \frac{C_{X,Y}/C_X}{C_Y/(T - C_X)} \tag{3}$$

Purely mathematically, we have to point out that if country X is the only country with which other countries collaborate ($T = C_X$), then AOER must be set equal to zero. The AOER(X,Y) indicator reflects the relative importance of country Y to country X, calculated as the share of country Y within all collaborations of country X divided by the share of country Y within all collaborations in the network (minus the collaborations of X). So, the size of the X-Y relation within X's network is compared to the size of Y within the whole network, excluding X.
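A minimal Python sketch of the asymmetric ratio follows. It assumes that AOER(X,Y) has the form (C_{X,Y}/C_X) / (C_Y/(T - C_X)), as the verbal description suggests. The function name `aoer` and the counts in the usage lines are hypothetical; they only illustrate that exchanging the roles of X and Y changes the value.

```python
from fractions import Fraction


def aoer(c_xy, c_x, c_y, total):
    """Asymmetric observed/expected ratio of Y for X (assumed form):
    (C_xy / C_x) / (C_y / (T - C_x)); set to 0 when T == C_x."""
    if total == c_x:
        return Fraction(0)
    share_in_x = Fraction(c_xy, c_x)             # Y's share of X's links
    share_in_rest = Fraction(c_y, total - c_x)   # Y's share of the network minus X's links
    return share_in_x / share_in_rest


# Hypothetical counts: C_xy = 2, C_x = 4, C_y = 10, T = 20.
print(aoer(2, 4, 10, 20))   # relative importance of Y for X
print(aoer(2, 10, 4, 20))   # relative importance of X for Y
```

With these counts the two calls give 4/5 and 1/2, so the indicator is not symmetric in its arguments, in contrast to OER.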

A PROBLEM WITH THE INTERPRETATION OF AOER
It was shown in [4,6] that the activity index (AI) is not a good indicator for research policy applications. Its main problem is that, in the context of the research production of a given country in a given field, it depends on the activities of other countries in other fields ($O_W$).
AOER does not suffer from the same problems as the general AI, as the aspect of "other fields" does not play a role. Yet, it seems that this indicator has other problems of interpretation.
Consider a collaboration matrix, dealing with four countries, for which $C_{X,Y} = 2$, $C_X = 4$, $C_Y = 4$ and $T = 20$. Then we first have that AOER(X,Y) = (2/4)/(4/16) = 2; and when adding two collaborations between X and Y, AOER(X,Y) becomes (4/6)/(6/16) = 16/9 < 2. So, two extra collaborations between X and Y decrease the relative importance of country Y for country X, which seems contradictory. In the next section, we study this "anomaly" in more detail.
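The arithmetic behind this example can be written out step by step. The sketch below assumes that AOER has the form $(C_{X,Y}/C_X)/(C_Y/(T - C_X))$, and that each new X-Y collaboration adds one to each of $C_{X,Y}$, $C_X$, $C_Y$ and $T$; under that assumption $T - C_X$ stays equal to 16.

```latex
\begin{align*}
\text{before: } & C_{X,Y}=2,\; C_X=4,\; C_Y=4,\; T=20,\; T-C_X=16
  &&\Rightarrow\; \mathrm{AOER}(X,Y)=\frac{2/4}{4/16}=2,\\
\text{after: }  & C_{X,Y}=4,\; C_X=6,\; C_Y=6,\; T=22,\; T-C_X=16
  &&\Rightarrow\; \mathrm{AOER}(X,Y)=\frac{4/6}{6/16}=\frac{16}{9}\approx 1.78 .
\end{align*}
```

The share of Y among X's links rises from 1/2 to 2/3, but the share of Y in the network outside X rises faster, from 1/4 to 3/8, so the ratio falls.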

AOER IS NOT A MEASURE FOR THE ABSOLUTE NUMBER OF COLLABORATION LINKS OF COUNTRY Y WITH COUNTRY X
Intuitively, one might think, or maybe want, that a measure of collaborativeness between two countries should increase if these two countries increase their collaboration, as in the above example. This may or may not be a good property, but we prove that the AOER and the OER do not meet this requirement. Indeed, we shall show that from a certain natural number n on, where n denotes the number of new collaborations between countries X and Y, AOER(X,Y) decreases, with limit zero.
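Before the formal argument, the claim can be explored numerically. The Python sketch below is an illustration under stated assumptions: AOER is taken as (C_{X,Y}/C_X)/(C_Y/(T - C_X)) and OER as C_{X,Y} · T/(C_X · C_Y), each extra X-Y collaboration adds one to C_{X,Y}, C_X, C_Y and T, and the function names `aoer_after` and `oer_after` are hypothetical. The counts a = 1, b = 3, c = 4, T = 100 are those of one of the examples in this paper.

```python
from fractions import Fraction


def aoer_after(a, b, c, total, n):
    """AOER(X,Y) after n extra X-Y links, assuming the form
    ((a+n)/(b+n)) / ((c+n)/(T-b)); T - b is unchanged because T and b
    both grow by n."""
    return Fraction(a + n, b + n) / Fraction(c + n, total - b)


def oer_after(a, b, c, total, n):
    """OER(X,Y) after n extra X-Y links, assuming OER = a*T/(b*c)."""
    return Fraction((a + n) * (total + n), (b + n) * (c + n))


a, b, c, total = 1, 3, 4, 100  # a = C_xy, b = C_x, c = C_y, total = T
start = aoer_after(a, b, c, total, 0)

# Smallest number of extra links for which AOER drops below its start value.
first_below = next(n for n in range(1, 1000)
                   if aoer_after(a, b, c, total, n) < start)
print(first_below)

for n in (0, 1, 5, 6, 100, 10_000):
    print(n,
          float(aoer_after(a, b, c, total, n)),
          float(oer_after(a, b, c, total, n)))
```

For these counts the first drop below the starting value occurs at n = 6, and for large n the printed AOER values shrink towards zero while the OER values approach 1.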
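We compare

$$\mathrm{AOER}(X,Y) = \frac{C_{X,Y}/C_X}{C_Y/(T - C_X)} \quad\text{and}\quad \mathrm{AOER}_n(X,Y) = \frac{(C_{X,Y}+n)/(C_X+n)}{(C_Y+n)/(T - C_X)},$$

where countries X and Y have n more collaborations while all other data stay the same; the term $T - C_X$ is unchanged because $T$ and $C_X$ both increase by n. For which natural numbers n do the extra collaborations between X and Y lead to a decrease of AOER(X,Y)?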
For simplicity, we write $C_{X,Y}$ as a > 0, $C_X$ as b, and $C_Y$ as c. Then we have to solve:

$$\frac{(a+n)/(b+n)}{(c+n)/(T-b)} < \frac{a/b}{c/(T-b)}$$

This becomes $a(b+n)(c+n) > bc(a+n)$, or equivalently $n > bc/a - (b+c)$. If $bc < a(b+c)$, this already happens for n = 1. Moreover, since the left-hand side of the inequality to be solved behaves like $(T-b)/n$ for large n, $\mathrm{AOER}_n(X,Y)$ tends to zero.
We consider two examples. Let a = 2, b = 3, c = 4 and T = 100. Then $bc = 12 < a(b+c) = 14$, and AOER(X,Y) = (2/3)/(4/97) ≈ 16.17 while $\mathrm{AOER}_1(X,Y)$ = (3/4)/(5/97) = 14.55. More generally, let a = 1, b = 3, c = 4 and T any number larger than or equal to 4 (we take T = 100). Then $bc/a - (b+c) = 5$, so, after an initial increase, a decrease in AOER (with respect to the initial value) happens for n > 5. Table 1 shows the corresponding values with n = 0 as the starting value. We see that $\mathrm{AOER}_5(X,Y)$ is still equal to AOER(X,Y), but from n = 6 on, $\mathrm{AOER}_n(X,Y)$ < AOER(X,Y).
We observe that if the collaboration between countries X and Y increases more and more, their OER value, which after n extra collaborations equals $(a+n)(T+n)/((b+n)(c+n))$, tends to 1. Depending on the original value of OER, this implies either an increase or a decrease.

OTHER PROPERTIES OF AOER(X,Y)
In this section, we present some properties of AOER(X,Y) but warn the reader that this indicator is not a measure for the affinity (here understood as absolute increase in the number of jointly authored articles) between two countries. Recall that a ≤ b ≤ T and a ≤ c ≤ T. To avoid uninteresting cases we moreover assume that T > max(b,c).
a) If b increases (by an amount of p > 0) at the cost of collaborations between others (hence T stays the same), and a also stays the same, then AOER(X,Y) decreases.
Proof. We have to check if

$$\frac{a}{b}\cdot\frac{T-b}{c} > \frac{a}{b+p}\cdot\frac{T-b-p}{c}$$

This inequality is equivalent to $(T-b)(b+p) > (T-b-p)\,b$, or $Tb + Tp - b^2 - bp > Tb - b^2 - bp$, or $Tp > 0$, which is clearly correct.
b) If b increases (by an amount of p > 0) and all other collaborations stay the same (hence T becomes T+p), and a also stays the same, then AOER(X,Y) decreases.
Proof. We have to check if

$$\frac{a}{b}\cdot\frac{T-b}{c} > \frac{a}{b+p}\cdot\frac{(T+p)-(b+p)}{c} = \frac{a}{b+p}\cdot\frac{T-b}{c}$$

which holds because b + p > b.
c) Assume that only T increases (by an amount of p > 0), i.e. some extra links are introduced not involving X or Y. Then AOER(X,Y) increases.
Proof. We have to check if

$$\frac{a}{b}\cdot\frac{T+p-b}{c} > \frac{a}{b}\cdot\frac{T-b}{c}$$

which holds because p > 0.
Yet we warn that the following property d does not necessarily hold.
d) Assume that country Y increases its links (by an amount of p > 0) with other (not X) countries, at the cost of links with country X (so c and T stay the same). Hence a becomes a-p (of course a ≥ p), and b becomes b-p. Does this lead to a decrease in the value of AOER(X,Y), because a has decreased?
The answer is that this is not necessarily the case.
We have to compare

$$\frac{a}{b}\cdot\frac{T-b}{c} \quad\text{and}\quad \frac{a-p}{b-p}\cdot\frac{T-b+p}{c}$$

For a = 100, b = 120, p = 30, T = 420 (and c = 300) the left-hand side is 0.833, while the right-hand side is 0.856, which is larger.

DISCUSSION AND CONCLUSION
1) Can one use or invent other measures instead of (1) and (3) for which an increased collaboration between two countries leads to an increase? That would, from a bilateral point of view, be more satisfactory. Could the F-measure, introduced in [6], be used? Probably not. Or should one prefer Salton's or Jaccard's measure, as suggested in Luukkonen et al.'s follow-up paper? [7]
2) Could a similar study be done using fractional counting, instead of whole counting, for country collaborations? Could [8] be useful for this?
3) In [7] the authors point out that in international collaboration studies bilateral as well as multilateral aspects are of interest. Moreover, one should certainly consider absolute and relative values.
We admit that we performed an ad hoc study. There is no required set of axioms, agreed upon by most colleagues, for an acceptable measure of relative research collaboration. We think that this is a general issue for informetric indicators. One should know the properties of indicators, leading to advantages and disadvantages for a given application. Vice versa, one should agree on a set of axioms for acceptable indicators of a certain type (for a certain purpose) and then find out which indicators are acceptable or invent new ones that are. In this context, we recall the case of the h-index. This index has been characterized axiomatically, e.g., by Woeginger, [9] and a set of axioms for consistent rankings of departments, which excludes the h-index, has been proposed. [10] Yet, even then, users of such indicators may decide that an indicator is "good enough" for practical use, say in a heuristic way (for a precise description of the term "heuristic" we refer the reader to [11]). Few indicators meet all possible requirements, so practitioners in our field often apply indicators that are PAC (probably approximately correct). [12]
Concurring with the authors who constructed the AOER in 1992, we conclude that the observed/expected ratio as used in [1] reflects the relative importance of country Y to country X, calculated as the share of country Y within all collaborations of country X divided by the share of country Y within all collaborations in the network (minus the collaborations of X). So, the size of the X-Y relation within X's network is compared to the size of Y within the whole network minus X. This indicator should not be considered as a measure of absolute affinity between countries X and Y.