The evolution of networks of innovators within and across borders: Evidence from patent data

Recent studies on the geography of knowledge networks have documented a negative impact of physical distance and institutional borders upon research and development (R&D) collaborations. Though it is widely recognized that geographic constraints and national borders impede the diffusion of knowledge, less attention has been devoted to the temporal evolution of these constraints. In this study we use data on patents filed with the European Patent Office (EPO) for OECD countries to analyze the impact of physical distance and country borders on inter-regional links in four different networks over the period 1988-2009: (1) co-inventorship, (2) patent citations, (3) inventor mobility and (4) the location of R&D laboratories. We find that the constraint imposed by country borders and distance decreased until the mid-1990s and then started to grow, particularly for distance. We further investigate the role of large innovation "hubs" as attractors of new collaboration opportunities and the impact of region size and locality on the evolution of cross-border patenting activities. The intensity of European cross-country inventor collaborations increased at a higher pace than their non-European counterparts until 2004, with no significant relative progress thereafter. Moreover, when analyzing networks of geographical mobility, multinational R&D activities and patent citations we cannot detect any substantial progress in European research integration above and beyond the common global trend.


Introduction
Rapid progress in information, communication, and transportation technologies and the overall trend of globalization have led to the assertion that "distance is dead" (Castells, 1996; Cairncross, 1997).
A natural tension exists, however, between this view and knowledge "stickiness": human activities and social interactions are known to geographically cluster to take advantage of knowledge spillovers, social capital and other agglomeration economies (Feldman, 1994). While the literature on innovation systems has focused on the interplay between clusters and networks of innovators (Breschi and Malerba, 2005), the "death of distance" conjecture has been thoroughly investigated in the literature on international trade and globalization studies. The most significant recent advances in that vein have been made by means of panel gravity regressions and indicate distance, borders and free trade areas still play a key role in trade networks (the so-called "tyranny of distance").
Since the seminal contribution of Freeman (1991), networks of innovators have attracted a great deal of interest as a tool for representing and analyzing division of innovative labor. Many types of network have been investigated, ranging from the informal scientific connections in invisible colleges and communities of practice, to the formal collaborative agreements between firms and other research organizations. With increasing frequency, growing data on scientific collaborations, collaborative R&D projects, and patents have been widely exploited to gain insight into the structure and evolution of networks in different industries, countries and time frames (see Powell and Grodal, 2005 and Ozman, 2009 for reviews).
Despite significant efforts in a growing body of literature analyzing networks of innovators, there is still a lack of large-scale quantitative understanding of the evolution of networks of innovators in space and time. The complexity of this problem arises from the variety of competing forces that underlie the economics and sociology of R&D collaboration. Prevailing wisdom states the spread of tacit knowledge and the formation of informal ties are uninhibited over short distances, but barriers increase with distance. However, for the transmission of codified knowledge and formal contractual collaborations, distance plays less of a role at large scales, even if national borders between different institutional settings can still reduce the effectiveness of contractual solutions. On the one hand, because technological advancements have increased the capacity to codify and share knowledge across large distances, it follows that the barriers induced by distance should be decreasing, and possibly vanishing, in R&D networks. On the other hand, innovators are increasingly attracted by large innovation "hubs" which combine local agglomeration economies with centrality advantages in knowledge and social networks, also known as preferential attachment or "rich get richer" effect. The effect of globalization forces, which reduce the cost to collaborate at distance, compounds with the attractiveness of central regions in innovation networks, which tend to limit the geographical span of collaboration of smaller regions in their gravitation fields. Innovation networks can be more or less constrained by geographical distance depending on the interplay of these two forces. Also, the dynamic role of physical distance and institutional borders may differ significantly across different R&D networks depending upon the type of knowledge that is exchanged (tacit vs. codified) and the nature of the links: arm's length market transactions, hierarchical relations or network forms of coordination (Whittington et al., 2009).
Cross-network interdependencies should also be taken into account. For example, international mobility should have a positive impact on regional citation flows as inventor movement is thought to be an important driver of knowledge spillovers (Agrawal et al., 2006). International mobility may, in turn, have a positive impact on large distance collaborations as mobile inventors act as bridges across teams of inventors working for different organizations (Breschi and Lissoni, 2009). Conversely, one could argue that, the more individual inventors and research teams can freely move, the less R&D organizations will feel the need to locate R&D labs abroad or to sign collaborative agreements with foreign partners. In this sense, understanding the extent to which globalization reduces constraints on geographical mobility is important for assessing side-effects in other dimensions of R&D networks.
Here we employ a gravity approach to quantify simultaneously the strength of borders and distance on multiple innovation networks constructed from about 2.4 million patents recorded by the European Patent Office (EPO). We analyze a large sample of developed nations over many years to investigate the dichotomy arising from localizing constraints of R&D spillovers and agglomeration economies in R&D clusters vis-à-vis the tendency to expand R&D networks via long-range collaborations between inventors located in different countries and institutional settings.
Our study moves beyond previous efforts to understand the geography of research collaboration in many respects. First, we study a large set of developed countries at a low level of spatial aggregation. Second, we analyze a set of interrelated patent networks using the same analytic approach: (1) the network of individual collaborations among inventors (patent co-inventorship); (2) the internationalization of R&D activities by multinational firms (the patent applicant-inventor network); (3) knowledge spillovers as proxied by patent citations; and (4) inventor mobility. Third, in our analysis we investigate, jointly, the physical distance effect and the country-border effect and control for other types of distance. Few previous studies have investigated the dynamics of the geographical distance and border effects simultaneously, and those that have done so focused either on Europe (Hoekman et al., 2010) or on the United States (Singh and Marx, 2013). Fourth, we focus on the evolution of barriers to the internationalization of knowledge networks over two decades, whereas most previous studies in this field have had to take a static viewpoint. Our more comprehensive analysis allows us to examine and ultimately quantify the effect of European integration efforts in the context of global network evolution trends. 1
Beyond the scientific relevance of developing methods to quantify R&D network evolution, a better understanding of how distance and borders influence the structure and evolution of R&D networks is important to orient the policy debate. In particular, the European Research Area (ERA) vision of an "open space for knowledge and growth" stands as the most recent in a long line of integration efforts within the European Union (EU). The establishment of the ERA has been highlighted as a key component of the competitiveness of the EU's Europe 2020 growth strategy.
This is an attempt to reduce, perhaps even eliminate, the effect of national borders on scientific and R&D networks to create an area in which ideas and high skill human capital are free to flow and capitalize on transnational synergies and complementarities. Furthermore, a better understanding of the role distance and borders play in the structure and evolution of networks of innovators is key, not only for crafting effective policy, but even more simply, for assessing the true effectiveness of past, present, and future policy measures.
This paper proceeds in the following manner. Section 2 presents a review of the relevant literature. In section 3 we describe the data and methodology used. Section 4 presents the results of our analysis. Finally, in section 5 we discuss our results and natural extensions of this research direction, deriving some policy implications for the European Research Area.

The role of geography in networks of innovators
Regional networks of knowledge portray the knowledge exchange between entities located in different regions. A knowledge exchange is put into effect whenever the benefits that individuals or organizations receive from accessing new pieces of knowledge are greater than the associated costs. In regional networks, the costs of accessing remote knowledge are related to different forms of proximity that can be summarized in five dimensions: physical, institutional, cognitive, social and organizational (Boschma, 2005). The likelihood of a knowledge exchange is positively affected by these different forms of proximity. Most previous analyses of the globalization of knowledge production have focused on two specific spatial biases: first, the degree to which travel and communication costs make physical distance an impediment to collaboration; second, the extent to which institutional friction arising from country-to-country differences creates challenges for collaboration across national systems of innovation (Freeman, 1995; Lundvall, 1992; Nelson, 1993; Gertler, 1995). This second aspect has been investigated with particular reference to the European Union, where specific actions have been taken to favor scientific and technological collaboration between Member States. Table 1 summarizes previous findings on spatial biases in innovation networks.
Geographical networks of knowledge can be modeled according to a number of empirical strategies (Broekel et al., 2013). Applications of gravity models to scientific and technological collaboration have provided strong evidence for a negative effect of physical distance and country borders on the likelihood of collaboration (Ponds et al., 2007; Maggioni and Uberti, 2007; Scherngell and Barber, 2009; Frenken et al., 2009b; Hoekman et al., 2009, 2010; Scherngell and Barber, 2011; Scherngell and Hu, 2011; Scherngell and Lata, 2012; Pan et al., 2012; Hoekman et al., 2013). This body of evidence is robust across the kind of data (scientific publications, patents), the type of network (collaborations between individuals/institutions, citations, labor mobility) and the geographic unit of analysis (country, regional, sub-regional).
When focusing on the evolution of spatial biases, the prevailing wisdom is that globalization and advances in information, transportation, and communication technologies should reduce the role of distance in socio-economic interactions (Castells, 1996; Cairncross, 1997). This issue has been thoroughly explored in the literature on trade through the lens of gravity models (Coe, 2002; Brun et al., 2005).
The conclusion that spatial biases are attenuating is the general interpretation invoked when observing an increase in the cross-border shares of collaborations and an increase in the average distance of collaborations. However, using different methodologies, some studies provide contrary evidence that the constraint of distance is becoming more binding over time (Hoekman et al., 2010; Singh and Marx, 2013; Ponds et al., 2007; Boerner et al., 2006) and that the country-border effect is not decreasing (Singh and Marx, 2013; Ponds, 2009; Frenken, 2002).

Among all studies examining the dynamics of spatial biases, two in particular are worth mentioning in the context of our analysis. Hoekman et al. (2010) estimate gravity models using data on co-publications between NUTS2 regions in 33 European countries for the period 2000-2007. They find that the negative effect of distance on inter-regional collaborations increases over the focus period and that the country-border effect decreases, though not statistically significantly. 2 Singh and Marx (2013) examine patent citations: in a framework where citation is modeled with a weighted logistic regression, they observe an increase over time in the citations received from US patents relative to citations received from non-US patents. 3 That study also finds that the rate of decay in the probability of citation as a function of distance has slightly increased over time. That is to say, the effect of distance is increasing.
Here we aim to bring coherence to the issue of the dynamics of distance and borders by considering a broad range of countries (EU and non-EU) and by applying our methodology to four different R&D networks, each with its own dependence upon tacit versus codified knowledge, thus providing an overall consistency check for our results. More specifically, we study 50 OECD and OECD-partner countries at the NUTS3 level of spatial aggregation. 4 Most previous studies used NUTS2, and the few that used NUTS3 focused on a single country (Ponds et al., 2007; Frenken et al., 2009b) or a few countries (an exception being Hoekman et al., 2009, who analyze EU27 countries plus Norway and Switzerland). We apply the same analytic approach to four patent networks: (I) the network of patent co-inventorship, (II) the location of R&D labs (the applicant-inventor network), (III) patent citations and (IV) inventor mobility. This is in contrast to previous studies that generally focused only on one network at a time (see Table 1). In our analysis we investigate, jointly, the distance effect (Ponds et al., 2007) and the country-border effect (Ponds, 2009) in Europe and other OECD countries. Few previous studies have investigated the dynamics of the distance and border effects simultaneously (Hoekman et al., 2010; Singh and Marx, 2013).
2 However they find a decreasing border effect with respect to regional borders.
3 They also find that the state-border effect decreases over time, consistent with Hoekman et al. (2010).
4 The Nomenclature of Units for Territorial Statistics (NUTS) is a geo-code standard for referencing the subdivisions of countries for statistical purposes. The nomenclature has been introduced by the EU for its member states. The OECD provides an extended version of NUTS, called Territorial Levels (TLs), for its non-EU member and partner states. For European countries, TL2 and TL3 are largely consistent with the Eurostat classifications NUTS2 and NUTS3 (Maraut et al., 2008).
Our analysis spans two decades from 1988 to 2009. Only a handful of very recent contributions adopted a long-term dynamic perspective to investigate the evolution of knowledge networks (see Table 1). Moreover, our methodology provides the opportunity to evaluate the overall effectiveness of EU integration policies toward cross-border cooperation. By using a regression approach that is capable of determining the evolution of the country-border effect for European versus non-European countries, our analysis provides an important and timely insight into the rate of European R&D integration relative to the rest of the world. Specifically we seek to test three basic research questions:
i) is the effect of physical distance decreasing in magnitude?
ii) is the effect of country borders decreasing in magnitude?
iii) is the country border effect decreasing in magnitude faster within Europe than among the rest of the developed world?
To do that, we construct four networks (I-IV) using readily available information extracted from patents, which serve as representations of knowledge geography and provide quantitative structures for measuring knowledge diffusion.
(I) The patent citation network. Since the pioneering work of Jaffe et al. (1993), patent citations have been utilized extensively to measure the diffusion of knowledge across a variety of dimensions: geographic space, time, technological fields, organizational boundaries, alliance partnerships, and social networks (Almeida and Kogut, 1999;Jaffe and Trajtenberg, 2002;Peri, 2005;Gomes-Casseres et al., 2006). A principal assumption underlying this approach is that citations trace out knowledge flows and technological learning as knowledge embedded in the cited patent is transmitted to inventors of the citing patent. Given that access to codified knowledge typically does not require interaction between individuals, it is recognized that distance and institutional borders should be relatively less important in this network. Such studies focus on citations as means to transfer codified knowledge but acknowledge that citations are less effective means of spreading tacit knowledge than personal, face-to-face contacts.
(II) The co-inventor collaboration network. Though many empirical studies have analyzed the role of patent citations as measures of knowledge flows, it has also been stressed that economic agents can access knowledge from many sources other than just codified knowledge. In particular, the literature distinguishes two means of spreading tacit knowledge: informal social interactions on the one hand and, on the other, contract-based channels such as arm's-length market-based relationships, inter-organizational alliances, or hierarchical solutions within R&D organizations. Examples of the first case are social ties with current and former colleagues and those developed in social events (conferences, membership in professional associations etc.). Geography is relevant here as proximity facilitates the development of social relationships and raises incentives to invest in social capital (Agrawal et al., 2006). In the second case, the transmission of knowledge is regulated by a contract, such as a labor contract, licensing or formal collaborations, which explicitly sets a compensation for the exchange of knowledge (Breschi and Lissoni, 2009). Geography matters either because labor mobility among different institutions or laboratories can be constrained in space, or because formal agreements require frequent interactions and monitoring that are more easily conducted locally. The network of co-inventions stands somewhere in between these two categories, as the collaboration can either be ruled by a formal agreement or arise when inventors decide to collaborate informally with colleagues located in different areas. The co-inventor network is affected by geography because spatial proximity and co-location may facilitate the transfer of complex knowledge, for which frequent face-to-face interactions may be required. Though easing of communication and travel constraints is expected to reduce the importance of spatial proximity in this network (Giroud, 2013), the result can depend on the degree of complementarity between remote and face-to-face interactions.
The popularity of patent citations and collaborations as representations of knowledge flows is likely due to the pursuit in economics of pure externalities (spillovers), i.e. a transfer of knowledge which is not mediated by the market (Breschi and Lissoni, 2009). On the other hand, the next two networks we present operate through market-based channels. Specifically, these networks capture the relationships between organizations and affiliated inventors, and the mobility of inventors moving across organizations or across regional laboratories within the same institution. 5
(III) The location of R&D labs. The geographical links between applicants and affiliated inventors are relevant to the analysis of the geographic distribution and globalization of the innovative activities of firms (see Keller, 2004 and Narula and Zanfei, 2005 for surveys). Multinational firms are well known to be drivers of the internationalization of innovation activities (see Wolfmayr et al., 2013) as international location of a firm's subsidiaries facilitates knowledge transfer across borders. The literature on the internationalization of business suggests a number of different reasons for undertaking technological activities outside the home country (Dunning and Lundan, 2009; Florida, 1997; von Zedtwitz and Gassmann, 2002). Among these, knowledge-seeking motives such as proximity to universities and innovative firms as a means to benefit from spillovers and agglomeration advantages, and access to high quality scientific and technical talent, have come to be considered extremely relevant since the late 1990s (Florida, 1997; von Zedtwitz and Gassmann, 2002; Patel and Vega, 1999; Granstrand, 1999).
For example, von Zedtwitz and Gassmann (2002) show that these knowledge related factors are by far the most important motives for performing "research" (rather than "development") activities at foreign locations. Indeed, localized foreign knowledge that is tacit can be accessed or imported by firms that move closer to the source. This goal can be achieved by setting up subsidiaries abroad (Phene and Almeida, 2003) and by hiring scientists (learning-by-hiring), or by sending the firm's scientists abroad to the subsidiaries (Kim et al., 2009). Evidence from the international business literature suggests that knowledge outflows from the multinational corporation's home base are outweighed by inflows from its foreign-based subsidiaries (Singh, 2007; Kogut and Zander, 1993; Dunning, 1992), and that both knowledge flows appear to follow from personnel flows (Singh, 2007). Focusing on the location of patent inventors is not a novel way to map the geographical distribution of a firm's innovation activities (Cantwell, 1989) but has attracted less attention than it deserves due to data limitations (Harhoff and Thoma, 2010). Patents are extensively used as an indicator of the location of firms' R&D activities given that systematic firm-level data on R&D expenditures by location are either not collected or not available for analysis. Prior studies have shown the existence of a home country bias in R&D, thus proving the existence of strong institutional barriers to the internationalization of R&D activities (Belderbos et al., 2013).
5 Beyond mobile workers in the strict sense, i.e. workers switching employer or the establishment they work in, mobile inventors can also be consultants or academic scientists that offer their services to different companies (Breschi and Lissoni, 2009).
(IV) The inventor mobility network. Inventor mobility data can be used to measure the geographical distribution of knowledge spillovers (Breschi and Lissoni, 2009; Almeida and Kogut, 1999; Kim et al., 2009; Agrawal et al., 2006; Miguèlez and Moreno, 2013). Mobile individuals carry a stock of knowledge and play a key role in its diffusion by acting as vehicles for knowledge spillovers across organizations and locations through person-to-person interaction. The role of individuals as active agents in the creation and spatial diffusion of knowledge is often emphasized in the literature (Almeida and Kogut, 1999; Howells, 2012), particularly because person-to-person contact involving a transfer or exchange of personnel is regarded as an efficient means of transmitting tacit knowledge across organizational boundaries (Kim et al., 2009). For example, Breschi and Lissoni (2009) argue that the most fundamental reason why geography matters in constraining the diffusion of knowledge is that mobile researchers are not likely to relocate in space, a fact that accounts to a large extent for the localization of co-inventions or citations (Miguèlez and Moreno, 2013).
Summarizing, patent citations offer evidence of the flow of codified knowledge. Mobile inventors, on the other hand, bring with them tacit and experience-based knowledge. Similarly, collaboration networks, either within or across organizational boundaries, imply sophisticated forms of coordination and knowledge transfer grounded in team and firm based interactions.
Based on the relative complexity of the knowledge being transferred in each of the innovation networks, the first hypothesis investigated in this paper is:
Hypothesis 1. R&D collaboration networks are the most constrained by distance, followed by inventor mobility and patent citations.
Globalization forces will have an immediate effect on patent citations, followed by co-authorships. But it takes longer for inventors to move freely and firms to overcome home country bias through foreign R&D investment. Therefore, when the effect of globalization saturates, preferential attachment to leading regions will prevail, leading us to a second general hypothesis:
Hypothesis 2. The effect of geographical and institutional barriers is increasing over time in importance, particularly at national and regional levels and for small and mid-sized regions. The increase in strength of geographical and institutional barriers will be stronger and start earlier for patent citations followed by co-authorships, foreign R&D investments and eventually inventor mobility.
These effects have some compelling policy implications for the future of the European Research Area, and indicate that priority should be given to unleashing the mobility of the skilled labor force within Europe and boosting the global attractiveness of European research hubs.

Data
The data analyzed in this study are drawn from the OECD REGPAT database (Maraut et al., 2008; Webb et al., 2005), which compiles all patent applications filed with the European Patent Office (EPO) from the 1960s to present. In this database the geographical location of each inventor and applicant has been matched to one of 5,552 NUTS3 regions in the 50 OECD or OECD-partner countries. 6 This allows us to construct four geographical networks: (I) co-inventors, (II) applicant-inventor, (III) patent citations and (IV) inventor mobility. For each network we define $y_{m,n}$ as the number of links between NUTS3 regions $m$ and $n$. In (I) $y_{m,n}$ is equal to the number of patents jointly invented by the two regions. We use a full-counting approach so that a patent with $I$ ($> 1$) inventors accounts for $\sum_{i=1}^{I-1} (I - i) = I(I-1)/2$ regional links (hence, patents with only one inventor do not appear in this network by construction). Unlike (I), networks (II-IV) are directed networks in which we distinguish the pair $(m, n)$ from the pair $(n, m)$. In (II) the region of the applicant is linked to the regions of the affiliated inventors. The inventor's region usually indicates where the invention was made (often a laboratory or a research establishment, or the place of residence of the inventor) while the applicant's region indicates where the holder (usually a company, university or other type of entity) has its headquarters. In the database there is no direct information on affiliations, but it can be trivially retrieved for patents associated with a single applicant, which is the case for approximately 94% of the whole set of patents. In (III) for each pair $(m, n)$ of NUTS3 regions we count the number of times that (a patent of an inventor in) region $m$ cites (a patent of an inventor in) region $n$ ($y_{m,n}$), and the number of citations that $m$ receives from $n$ ($y_{n,m}$). In (IV) a link indicates one inventor moving from one region to another one.
Inventors' regional migration can be tracked by observing patent activity in at least two different years. In the case that an inventor has no patents for one or more years, we can track her region only at the beginning and at the end of the gap. In that case the flow is assigned to the first year in which the inventor is observed again. 7 Names of inventors have been cleaned and ambiguity over first names and initials has been dealt with, but names have not been fully disambiguated.
More precisely, our approach tracks the flow of names between regions, which is a gross proxy of pure inventor flows. Looking simply at the flows of names we may erroneously count as an inventor move two authors sharing the same name and residing in two regions in two subsequent years. To minimize this source of error and get rather close to the true individual flows, we drop flows that most likely correspond to this case. Namely, we drop regional moves of names whenever the name has already been observed in the incoming region in earlier years. This rule fails to identify pure moves only in the unlikely situation in which two authors with the same name move simultaneously. Our goal is to count the number of moves between regions, not to track the careers of individual inventors over time. This results in observing 13% of inventors who have been active in more than one NUTS3 region in the period 1981-2010. 8 For the econometric analysis we create a balanced panel of data by network for the period 1986-2009. In the estimation of the distance and border effects, the sample is restricted to all pairs of regions with at least twenty patents in every year. 9 For the third exercise we retain in the sample all those pairs for which at least one link is registered in the time period.
6 This is the full list of countries in the data set: Austria, Belgium, Germany, Denmark, Spain, Finland, France, Greece, Ireland, Italy, Luxembourg, Netherlands, Portugal, Sweden, United Kingdom, Australia, Bulgaria, Brazil, Canada, Switzerland, Chile, China, Cyprus, Czech Republic, Estonia, Hong Kong, Republic of Croatia, Hungary, Israel, India, Iceland, Japan, South Korea, Liechtenstein, Lithuania, Latvia, Macedonia, Malta, Mexico, Norway, New Zealand, Poland, Romania, Russian Federation, Slovenia, Slovakia, Turkey, Taiwan, United States, and South Africa.
7 We stress that, while there might be some overlap between the mobility and applicant-inventor networks, these capture very different R&D relations between regions. The applicant-inventor network captures the way in which applicant institutions organize the geographical structure of their laboratories. Any inventor move is associated with two applicant-inventor links, one referring to the outgoing region and the other to the destination region. A data point for inventor moves can correspond to a data point in the applicant-inventor network only for the move destination region and in the particular case that the outgoing region is the same as the region of the new applicant. This happens when an applicant relocates the inventor far from the applicant region.
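The two counting rules described in this section can be illustrated with a minimal Python sketch. It is not the authors' code: the region codes, inventor names and the function names `coinventor_links` and `name_moves` are hypothetical, within-region inventor pairs are assumed to count as links, and each name is assumed to be observed in a single region per year.

```python
from collections import Counter, defaultdict
from itertools import combinations


def coinventor_links(inventor_regions):
    """Full counting: each pair of inventors on a patent yields one regional link.

    A patent with I inventors therefore contributes I * (I - 1) / 2 links.
    Assumption: pairs located in the same region are kept as (m, m) links.
    """
    links = Counter()
    for a, b in combinations(inventor_regions, 2):
        # The co-inventor network is undirected, so store pairs in sorted order.
        links[tuple(sorted((a, b)))] += 1
    return links


def name_moves(records):
    """Count moves of names between regions.

    records: iterable of (name, year, region) tuples.
    A move is dated at the year in which the name is observed again, and it is
    dropped when the name was already observed in the incoming region earlier.
    """
    by_name = defaultdict(list)
    for name, year, region in records:
        by_name[name].append((year, region))

    moves = Counter()
    for observations in by_name.values():
        observations.sort()
        seen = set()
        previous = None
        for year, region in observations:
            changed = previous is not None and region != previous
            if changed and region not in seen:
                moves[(previous, region, year)] += 1
            seen.add(region)
            previous = region
    return moves


# Hypothetical patent with three inventors in two regions: 3 * 2 / 2 = 3 links.
links = coinventor_links(["DE212", "DE212", "FR101"])

# Hypothetical name histories: the return to region "A" is dropped, and the
# move after a four-year gap is dated at the year of reappearance.
moves = name_moves([
    ("smith j", 1990, "A"), ("smith j", 1991, "B"), ("smith j", 1992, "A"),
    ("rossi m", 1990, "A"), ("rossi m", 1994, "C"),
])
```

The filter in `name_moves` is deliberately conservative: it also discards genuine return moves, which is consistent with the stated goal of counting moves between regions rather than tracking individual careers.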

Econometric methodology
The root of our econometric approach is the gravity model (Anderson and Van Wincoop, 2003; McCallum, 1995), a standard tool in the econometrics of trade which has been recently applied to the analysis of R&D networks (see Table 1). In its most elementary form, the gravity approach models the intensity of the interaction between two nodes as a (negative) function of the geographical distance between the nodes and a (positive) function of their respective sizes. 10 In applications to regional networks of knowledge it is typically assumed that knowledge transfers are hindered not only by physical distance, but also by other separation measures that can account for institutional, social, organizational and cognitive effects (Boschma, 2005).
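As a rough illustration of the elementary gravity form, the sketch below uses the textbook power-law specification, in which intensity rises with the sizes of the two nodes and falls with distance. The function name, the scale constant and the exponents `alpha`, `beta` and `gamma` are illustrative assumptions, not estimates from this study.

```python
def gravity_intensity(size_m, size_n, distance_km,
                      scale=1.0, alpha=1.0, beta=1.0, gamma=1.0):
    """Textbook gravity form: scale * size_m^alpha * size_n^beta / distance^gamma.

    All parameter values are placeholders chosen for illustration only.
    """
    if distance_km <= 0:
        raise ValueError("distance_km must be positive")
    return scale * size_m ** alpha * size_n ** beta / distance_km ** gamma


# With unit exponents, doubling the distance halves the expected intensity.
near = gravity_intensity(10, 20, 100)
far = gravity_intensity(10, 20, 200)
```

Taking logarithms of this form gives a specification that is linear in log size and log distance, which is why gravity models are usually estimated with log-linear or count-data regressions.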
We describe our econometric approach by focusing on three key ingredients that are relevant to test our research hypotheses. First, the probability density function of the dependent variable $y_i$, which is relevant for modeling the impact of each independent variable. Second, the selection of independent variables. Third, the econometric specification and the measures we use to estimate the temporal evolution of the spatial effects of interest.

Probability model for the intensity of R&D collaboration between regions
The dependent variable is the number of links ($y_i \equiv y_{(m,n)}$) between NUTS3 regions $m$ and $n$, and we model its probability distribution with a count density. A number of models for count data can be found in the literature, including the Poisson model, Negative Binomial variants, and zero-inflated models, as listed in Table 1. Since a large portion of NUTS3 region pairs have zero links, we opted for a Zero-Inflated Negative Binomial (ZINB) density, consistent with Hoekman Frenken et al. (2009b) and Chessa et al. (2013). Zero-inflated models are suitable when the data exhibit "excess zeros", as they explicitly account for a large number of zero entries (Cameron and Trivedi, 1998). The expected number of links is thus a non-linear function of the independent variables, modeled as in Eq. 1, where, due to non-linearity, the impact of each regressor in $X_i$ on the dependent variable is a function of the whole set of regressors in $X_i$ and, due to the zero-inflated component, the impact is mediated by two distinct parameters in the vectors $\beta_0$ and $\beta_1$ (see Appendix A for technical details on the ZINB model).
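To make the role of the two parameter vectors concrete, the following minimal sketch computes the expected link count of a generic zero-inflated count model. It assumes a logit link for the zero-generating process and a log link for the Negative Binomial mean, which is a common textbook parameterization and not necessarily the one detailed in Appendix A; the function name `zinb_expected_links` is hypothetical. The Negative Binomial dispersion parameter does not enter the mean and is therefore omitted.

```python
import math


def zinb_expected_links(x, beta0, beta1):
    """E[y | x] = (1 - p_zero) * mu for a zero-inflated count model.

    Assumptions (illustrative only): p_zero = logistic(x . beta0) is the
    probability of a structural zero, and mu = exp(x . beta1) is the mean
    of the Negative Binomial count process.
    """
    xb0 = sum(a * b for a, b in zip(x, beta0))
    xb1 = sum(a * b for a, b in zip(x, beta1))
    p_zero = 1.0 / (1.0 + math.exp(-xb0))
    mu = math.exp(xb1)
    return (1.0 - p_zero) * mu


# With all coefficients at zero: p_zero = 0.5 and mu = 1.
print(zinb_expected_links([1.0, 2.0], [0.0, 0.0], [0.0, 0.0]))
```

Because both indices enter non-linearly, a change in a single regressor shifts the expected count through `p_zero` and `mu` at once, which is why marginal effects have to be evaluated at specific regressor values.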

Independent variables
The selection of independent variables is relevant to the specification of the two linear indices $X\beta_j$ ($j = 0, 1$). First, we include the standard "gravity" variables distance, $size_m$ and $size_n$. The continuous variable distance measures the distance, in kilometers, between the centroids of the NUTS3 regions. $size_m$ and $size_n$ denote the size of each of the two regions, which is represented by the total number of links attached to the region in a given year. Consistent with the literature, we add to the baseline gravity specification a further set of variables controlling for separation effects. In particular, we add three spatial measures that account for different dimensions of distance: border, neighbour (Peri, 2005; Maggioni and Uberti, 2007; Scherngell and Barber, 2009; Paci and Usai, 2009; Scherngell and Barber, 2011; Scherngell and Hu, 2011; Scherngell and Lata, 2012; Miguèlez and Moreno, 2013), and area (Peri, 2005; Frenken et al., 2009b; Chessa et al., 2013). The dummy variable border flags pairs of NUTS3 regions belonging to different countries (border = 1 if $m$ and $n$ are in different countries, border = 0 otherwise).
This variable is almost always present in gravity equations and takes into account that a common institutional framework eases coordination and interactions among individuals and groups (Gertler, 1995; Edquist and Johnson, 1997; Boschma, 2005). Indeed, the relevant institutional elements, either formal, such as intellectual property rights, funding schemes and labor markets, or informal, such as cultural background, social norms and language, have a strong national component (Hoekman et al., 2009). For this reason, knowledge transfer is expected to be larger between regions that belong to the same country. However, the borders of common institutional frameworks do not necessarily correspond to national borders, as other settings can overlap at both higher and lower divisions. At a lower level, geographical contiguity of regions often entails cultural, social and language similarities that can facilitate knowledge transfers independently of national borders and physical proximity. For example, contiguous regions separated by a national border may be as proximate in terms of informal institutions as contiguous regions within the same country. Also, given two pairs of regions with the same physical distance, say $(m, n)$ and $(m, k)$, region $m$ could be more inclined to interact with $k$ if $m$ and $k$ are contiguous while $m$ and $n$ are not. We thus include the dummy variable neighbour that flags pairs of adjacent NUTS3 regions (neighbour = 1 if adjacent, neighbour = 0 otherwise).
To control for institutional proximity at a higher macro-political division, we split the networks into three kinds of links ($S = 3$) according to the geographical area: links within the EU area, links within the non-EU area, and flows between the two areas. These macro-area effects are captured by the categorical variable area. European countries are assigned to the EU area if they have formally been within the EU for most of our sample period. Thus we consistently use EU15 as the definition of the EU in the empirical analysis, and countries that joined the EU between 2004 and 2009 are assigned to the non-EU area. 11

To identify the impact on knowledge flows that can be genuinely attributed to the spatial measures of interest, it is important to control for non-spatial measures that can be related to spatial measures and affect the diffusion of knowledge as well. We therefore include as a further separation variable the continuous variable techdist, which measures the technological distance between regions in a given year (Peri, 2005; Maggioni and Uberti, 2007; Paci and Usai, 2009; Hoekman et al., 2010; Scherngell and Barber, 2009, 2011; Scherngell and Hu, 2011; Scherngell and Lata, 2012; Miguèlez and Moreno, 2013). This variable proxies for cognitive proximity between regions, since it reflects the extent to which they share a common, related, or complementary technological knowledge base. Regions located close to each other may have comparable technological backgrounds, so the effect of physical distance may be overestimated if we fail to control for this (Hoekman et al., 2010). techdist is constructed using patent classes according to the International Patent Classification (IPC). In particular, for each region $m$ we compute the vector $t(m)$ that measures the share of patenting in each of the technological subclasses for a given year. Technological subclasses correspond to the third-digit level of the IPC system.
We define the technological distance between regions $m$ and $n$ as $techdist_{m,n} = 1 - r^2$, where $r^2 = \mathrm{corr}[t(m), t(n)]^2$ is the squared Pearson correlation coefficient between the technological vectors $t(m)$ and $t(n)$ (see Moreno et al., 2005 and Scherngell and Barber, 2009). The possibility that distance and techdist are complements or substitutes is accounted for by also including the interaction term distance * techdist (Agrawal et al., 2008). 12
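The construction of techdist can be sketched in a few lines. The helper names `tech_shares`, `pearson` and `techdist`, and the toy patent counts, are illustrative assumptions; only the definition $1 - r^2$ on vectors of patenting shares is taken from the text.

```python
import math


def tech_shares(patent_counts):
    """Share of a region's patents in each technological subclass."""
    total = float(sum(patent_counts))
    return [c / total for c in patent_counts]


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def techdist(counts_m, counts_n):
    """Technological distance 1 - r^2 between two regions."""
    r = pearson(tech_shares(counts_m), tech_shares(counts_n))
    return 1.0 - r ** 2


# Hypothetical counts over four subclasses for two regions.
print(techdist([5, 0, 0, 5], [5, 5, 0, 0]))
```

Because the correlation is squared, two regions with perfectly opposite technological profiles ($r = -1$) receive the same distance of zero as two regions with identical profiles, while uncorrelated profiles receive the maximum distance of one.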

Strategy to test the three research questions
Given our set of independent variables, the linear indices $X\beta_0$ for the zero-generating process and $X\beta_1$ for the Negative Binomial process are modeled in parallel as in Eq. 2, where $j = 0, 1$ and $year_t$ are dummy variables that capture exogenous yearly shifts. We make use of the general model highlighted in Eqs. 1 and 2 to perform three sets of estimates according to our research questions (i-iii). The linear indices modeled in Eq. 2 are adjusted in each of the three exercises to allow for interactions between the relevant spatial measure and the year dummies. This allows the impact of the spatial measures of interest to vary year by year. Maximum likelihood estimates of the parameters in the linear indices are then used in Eq. 1 to compute yearly marginal changes in the expected value of $y_i$. Marginal changes of $E(y_i|X)$ are essentially absolute or relative differences between $Y_0$ and $Y$, which represent respectively the expected number of links in the base status ($Y_0$) and in the status reached as a result of the relevant spatial effect ($Y$). Given the non-linear dependence of $y_i$ on $X_i$, the computation requires setting specific values for each regressor. The sample used in estimation is always a balanced panel of regional pairs. In the case of unbalanced panels, some regions may appear or disappear over the sample period, causing attrition bias.

11 Our analysis of European integration thus refers to the extent of cross-border collaboration between EU15 countries. We also tried in a separate regression to control for the impact of New Member States, which turned out to be negligible.

12 We thank an anonymous referee for pointing out the role of interactions between proximity measures. We estimated the elasticity for distance over different levels of techdist. Our results are in line with Agrawal et al. (2008), as they show for the citation network that physical proximity and technological proximity are substitutes, i.e. the marginal benefit of physical proximity is larger the lower the technological proximity. We do not report this analysis here to avoid overloading the analytical content, but figures can be provided upon request.
For cases (i) and (ii) we run estimates on a balanced panel of data by network for the period 1988-2009. The sample used for estimation is constructed from regions with at least 20 patents in every year. We chose to set a threshold on patents for two reasons. First, there is a large concentration of NUTS3 region pairs with no links, since many regions have very few patents. Second, our measure of technological distance requires a reasonable number of patents to be statistically reliable. 13 Given this rule, the estimation sample is identical in cases (i) and (ii) and is constant across networks.
To test our research question (i) concerning the strength of distance over time, we use maximum likelihood estimates of the parameters in Eq. 2 and compute the elasticity of $y_i$ with respect to distance year by year. 14 Specifically, for each year we estimate the quantity

$$\varepsilon_t = \frac{Y - Y_0}{Y_0}, \qquad (3)$$

where $Y_0$ and $Y$ denote the expected number of links in year $t$ before and after a 1% increase in distance. Estimates of $Y_0$ and $Y$ are obtained by replacing in Eq. 1 the parameter estimates from Eq. 2 and setting regressor values at sample means (i.e. $X = \bar{X}$), except for $year_t$ and distance. 15

To test our second question (ii) we estimate the evolution of the country-border effect. To do this we modify Eq. 2, adding interactions of border with year dummies, which results in Eq. 4. Given maximum likelihood estimates of the parameters in the augmented Eq. (4), we compute the marginal effects of the border variable over the years. We report percentage changes, as in case (i), to allow comparisons among networks. In particular, for each year we compute the semi-elasticity defined as

$$\pi_t = \frac{Y - Y_0}{Y_0}, \qquad (5)$$

where $Y_0$ and $Y$ denote the expected number of links in year $t$ with border = 0 and border = 1 respectively. Estimates of $Y_0$ and $Y$ are obtained by replacing in Eq. 1 the parameter estimates from Eq. 4 and setting regressor values at sample means (i.e. $X = \bar{X}$), except for $year_t$ and border.

13 Robustness checks were performed using different thresholds, both lower and higher than 20. The results are very similar to those reported in this article and are made available by the authors.

14 The elasticity is formally defined as $\frac{\partial E(y_i)}{\partial distance} \cdot \frac{distance}{E(y_i)}$. The derivative can be computed from Eq. 1.

15 See Winkelmann (2008) for the computation of marginal effects for the ZINB model.
To test our third question (iii) we change the specification of Eq. (2) and separate the data into EU and non-EU sets, removing flows between the two. Linear indices are modeled following Chessa et al. (2013), who apply a Difference-in-Difference (DiD) strategy to isolate the country border effect within the EU. 16 Global forces drive a general increase in the propensity to collaborate across national borders, which inflates the apparent impact of EU-specific integration policies. Since EU and non-EU developed countries are similarly exposed to global trends that affect the integration of their R&D systems, we use the non-EU OECD members as a control group to identify the integration effect that can be attributed to EU-specific factors. In the jargon of impact evaluation, this group acts as a counterfactual, in the sense that it is used to proxy the EU trends that would have emerged in the absence of efforts devoted to boosting integration in the EU. Recent EU members, which have not been in either the EU or the non-EU area during most of our sample period, are removed from the group of non-EU OECD members.

18 Here we do not include in the regressions the variable techdist, which requires setting a threshold on the number of patents. For the comparison of integration trends between the two areas (EU and non-EU) it is important to take into account in estimation possible differences in the integration behavior of small regions.

The linear indices are now modeled as in Eq. 6, where the trinomial variable area collapses into the binary variable eu, as links between the EU area and the non-EU area are removed for identification purposes. In particular, eu flags pairs of NUTS3 regions that are both within the EU (eu = 1) and pairs of NUTS3 regions neither of which is in the EU (eu = 0). The dummy variable border still flags pairs of NUTS3 regions located in different countries, but now both regions of a link always belong to the same area (EU or non-EU), whether the link is cross-border or within-border. For example, Italy-France can be a valid cross-border link for the EU and USA-Japan can be a valid cross-border link for the non-EU area. However, Italy-USA, Italy-Japan, France-USA and France-Japan are excluded. In terms of the standard DiD formalism (Angrist and Krueger, 1999; Heckman et al., 1999; Athey and Imbens, 2006; Blundell and Costa Dias, 2009), denoting the actual and counterfactual outcomes of our dependent variable as $Y$ and $Y_0$ respectively, and taking into account our DiDiD extension, we define the yearly treatment effect $\tau_t$ as in Eq. 7. Estimates of $Y$ and $Y_0$ are obtained by replacing in Eq. 1 the parameter estimates from Eq. 6 and setting specific values for the regressors. We refer all regressor values to a generic pair of cross-border EU regions in the baseline year (i.e. we average over the sub-sample of these pairs, obtaining $X = \bar{X}_{border,eu,t^*}$), except for border, eu, and $year_t$. Relative to the baseline year $t^*$ (we use the arbitrarily chosen year 2004), $\tau_t$ reflects the impact of changes in institutional factors specific to the EU which have taken place in a given year $t$ with respect to $t^*$.

Contextual decomposition of the coinventor data into size, distance, and time subsets
In order to provide further insight into the contextual evolution of the distance effect, we analyzed and compared the changes in the co-inventor network for specific non-overlapping subsets of the aggregate data.
Specifically, we first separated the network data into three 8-year groups: $T_1$ = 1986-1993, $T_2$ = 1994-2001, and $T_3$ = 2002-2009. Secondly, we separated the coinventor pairs into 3 distance regimes: "small" ($d_{m,n} \le 100$ km, corresponding to $g_d = 0$), "medium" ($100 < d_{m,n} \le d_{EU}$ km, corresponding to $g_d = 1$), and "long" ($d_{m,n} > d_{EU}$ km, corresponding to $g_d = 2$). Our choice of the thresholds separating the three distance groups is motivated by the empirical distribution of $d_{m,n}$ values shown in Fig. 1(A), which appears to be a mixture of at least three underlying distributions: one representing the inter-regional scale, one representing an intracontinental/national scale, and one representing a global scale. Hence, we choose a $d_{EU}$ value to represent the characteristic distance of EU-EU links, which we estimate by calculating the average $d_{m,n}$ value for distinct region-region collaborations (not weighted by copatent link intensity $y_{m,n}$) over each period $T_i$ and also for $T_{all}$: $d^{EU}_{m,n}(T_1) = 376$, $d^{EU}_{m,n}(T_2) = 478$, $d^{EU}_{m,n}(T_3) = 531$, $d^{EU}_{m,n}(T_{all}) = 562$. We choose to use the largest of these four values, $d_{EU} \equiv d^{EU}_{m,n}(T_{all}) = 562$ km. Finally, we separated regions according to patent productivity as a proxy for size $S_n$: "S = Small", "M = Medium", or "L = Large". Figure 1(B) shows the probability density function $P(S)$ calculated by aggregating $S_n$ values into the three time periods defined above; the overall functional form of $P(S)$ appears to be stable over time, but with a slight shift in the range reflecting increasing patent activity over the entire period. The partitions in $S_n$ were chosen such that regions with $S_n \ge 419$ patent counts (corresponding to the top 100 regions in the three-year time period 1986-1988) were defined as "Large". These large "hubs" account for $f_L = 48\%$ of the total patents in period $T_1$.
The threshold between "Small" and "Medium" was defined such that regions in each of these two groups equally share the remaining 100 − 48 = 52% of the net patents, hence $f_M = f_S = 0.26$. Using this definition, the cutoff between small and medium sized regions is $S_c = 136$ patent counts (in terms of region rank, this corresponds to the rank threshold $r_c = 321$, meaning that regions ranked 101-321 fall into the "medium" size group). Using these definitions, each collaboration count $y_{m,n}$ can be assigned to one of six size-pairing groups $g(S_m, S_n)$ (SS, MS, MM, LS, LM, LL). In summary, we disaggregated the data across 3 consecutive 8-year periods, 3 distance scales, and 6 region-region size pairings, for a total of 54 data subsets. Then, for each subset we measured the distance effect ρ by estimating the parameters of the basic gravity model in Eq. 8. Hence, each data subset provides a comparable estimate of $\rho(T_i, g(S_m, S_n), g_d)$, which depends on the time period group $T_i$, the size pairing group $g(S_m, S_n)$, and the distance group $g_d$.
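As an illustration of the partition just described, the sketch below assigns a region pair to its distance group and size-pairing group and enumerates the 54 subsets. The thresholds (100 km, 562 km, 419 and 136 patent counts) are those reported above; the function and constant names are hypothetical, and treating a region with exactly $S_c = 136$ patents as "Medium" is an assumption, since the text does not specify on which side the boundary value falls.

```python
from itertools import combinations_with_replacement

D_SHORT_KM = 100.0  # upper bound of the "small" distance regime
D_EU_KM = 562.0     # characteristic EU-EU distance d_EU
S_LARGE = 419       # patent-count threshold for "Large" regions
S_CUTOFF = 136      # Small/Medium cutoff S_c


def distance_group(d_km):
    """Return g_d: 0 (small), 1 (medium) or 2 (long)."""
    if d_km <= D_SHORT_KM:
        return 0
    if d_km <= D_EU_KM:
        return 1
    return 2


def size_class(patents):
    """Return 'L', 'M' or 'S' for a region's patent count."""
    if patents >= S_LARGE:
        return "L"
    if patents >= S_CUTOFF:  # assumption: boundary value counted as Medium
        return "M"
    return "S"


def size_pair(patents_m, patents_n):
    """Unordered size pairing, larger class first (e.g. 'LS')."""
    classes = (size_class(patents_m), size_class(patents_n))
    return "".join(sorted(classes, key="LMS".index))


PERIODS = ("T1", "T2", "T3")
SIZE_PAIRS = ["".join(p) for p in combinations_with_replacement("LMS", 2)]
SUBSETS = [(t, g, gd) for t in PERIODS for g in SIZE_PAIRS for gd in (0, 1, 2)]

print(SIZE_PAIRS)    # ['LL', 'LM', 'LS', 'MM', 'MS', 'SS']
print(len(SUBSETS))  # 54
```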
The results of this contextual analysis are shown in Fig. 1(C). Each datapoint is colored according to the size partition, and the size of each datapoint is proportional to the average $\log d_{ij}$ of the data entering into the calculation of a given ρ value. A decrease in the magnitude of ρ over time is consistent with distance becoming less of a collaboration barrier. To provide a measure of the change in ρ over time, we calculated the percent change between consecutive periods, which is shown for each subset in Figures 2 and 3. There are six types of cross-border links: EU-EU, EU-World, EU-USA, World-World, World-USA, and USA-USA, where World represents all regions not in either the EU or USA groups. Each subpanel is a pie chart showing the total number of link counts $y_{m,n}$ for each division in period $T_i$. The fraction $f_{EU-EU}$ of the total links that are EU-EU links, and the percent change in this quantity between consecutive periods, are colored red, and the largest negative values (meaning that the ρ value decreased in magnitude, indicating a decrease in the collaboration barrier due to distance) are colored light blue. Figure 2 shows the values for $T_1$ and $T_2$ and Figure 3 shows the values for $T_2$ and $T_3$.
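For readers who wish to see how a distance-effect parameter of this kind can be obtained for one data subset, the following is a minimal sketch. It assumes a log-linear gravity form, $\log y_{m,n} = c + \alpha_1 \log S_m + \alpha_2 \log S_n + \rho \log d_{m,n}$, estimated by ordinary least squares on pairs with at least one link. This functional form, the OLS estimator, and the names `solve_linear` and `fit_gravity` are illustrative assumptions and may differ from the specification actually used in Eq. 8.

```python
import math


def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_gravity(links):
    """OLS fit of a log-linear gravity model (illustrative form).

    links: iterable of (y, size_m, size_n, distance_km), all strictly positive.
    Returns [const, alpha_m, alpha_n, rho].
    """
    links = list(links)
    rows = [[1.0, math.log(sm), math.log(sn), math.log(d)]
            for _, sm, sn, d in links]
    target = [math.log(y) for y, _, _, _ in links]
    k = 4
    # Normal equations (X'X) b = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    return solve_linear(xtx, xty)


# Synthetic, noise-free data generated with rho = -1.1 (hypothetical values).
data = [(2.0 * sm ** 0.8 * sn ** 0.7 * d ** -1.1, sm, sn, d)
        for sm in (10, 50, 200) for sn in (20, 80, 300) for d in (30, 150, 900)]
const, alpha_m, alpha_n, rho = fit_gravity(data)
print(round(rho, 3))
```

Note that a log-linear fit of this kind discards region pairs with zero links, which is one reason the main regressions rely on a zero-inflated count model instead.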

Results
In this section we present the results of our three regression models which estimate the evolution of (i) the distance effect in global patent activity, (ii) the border effect in global patent activity, and (iii) the cross-border effect in EU countries relative to non-EU countries. Finally, we discuss the main conclusions of our work about the two main research hypotheses we stated at the end of Section 2.

Evolution of the distance effect
Distance is a crucial aspect of knowledge transfer when the flow of knowledge is based on human interactions (i.e. co-inventor, applicant-inventor) or mobility. While these networks involve costs of communication or costs of moving, citations benefit from the availability of online repositories of bibliometric records, which only require an access cost. Figure 4 shows the evolution of the average distance of R&D collaborations between NUTS3 regions. The mobility network shows the largest change over the entire period: the average distance for inventor relocation was 1,267 kilometers in 1986 and had grown by roughly 60% by 2009, reaching 2,051 kilometers. Note that we do not use intra-regional links ($d_{nn} = 0$) in the calculation of the average and standard deviation of distance for the mobility network. As documented in previous literature for various collaboration measures, we also find that the average distance of R&D collaboration has increased in all four cases.
However, we present evidence of saturation in recent years, mainly due to geographic upper bounds on the largest collaboration distances. Hence, the increase in the average distance may be largely due to a "tail effect" in the extreme right tail of the distribution, as the distribution is relatively stationary in the range $50 < d_{m,n} < 1000$ km.
Interestingly, in double logarithmic scale, it is evident that each $P_t(d_{m,n})$ can be characterized by at least three underlying distributions. Aggregated, these distributions result in an overall triple-humped distribution shape. For example, approximately 8% of the probability density belongs to the group $g_d = 2$ of long-distance collaborations spanning more than $d_{EU} = 562$ km. Notably, there is a sharp cutoff feature at around 6,000 km, representing the trans-Pacific/Atlantic length scale. About 82% of the coinventor link counts belong to the small distance $g_d = 0$ group, with $d_{m,n} \le 100$ km. The remaining 10% of link counts fall into the medium distance regime, chosen to represent the characteristic distance scale for Europe.
Having demonstrated gradual shifts in the frequencies of $d_{m,n}$ over time, we now shift our focus to the impact that distance has on the likelihood of collaboration. Figure 5 shows network-by-network estimates of $\varepsilon_t$, the elasticity of $y_{m,n}$ with respect to $d_{m,n}$, as defined in Eq. (3). For each year we report the point elasticity evaluated at sample means and the 95% confidence interval. For example, for an "average" pair of NUTS3 regions in 2008, a 1% increase in distance implies a 1.24% decrease in the expected number of links for the applicant-inventor network, a 1.13% decrease for the co-inventor network, a 0.94% decrease for the mobility network, and a 0.27% decrease for the citation network.
As an example of how these quantities are computed, for the coinventor network in 2008, $\varepsilon_{2008} = -1.13\%$ is obtained from $(Y - Y_0)/Y_0 = (0.11061 - 0.11187)/0.11187 = -0.0113$. These estimates demonstrate that distance is still a significant constraint on inter-regional connectivity for each network. The magnitude of the citation network elasticity is much smaller, indicating that distance impedes the flow of codified knowledge much less than the flow of tacit knowledge and people, in line with our first research hypothesis.
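The arithmetic of this worked example can be checked with a short sketch. The function name `relative_change` is hypothetical; the two expected link counts are the values quoted above for the coinventor network in 2008.

```python
def relative_change(y_base, y_shifted):
    """Relative change in expected links between the base and shifted status."""
    return (y_shifted - y_base) / y_base


# Expected links before and after a 1% increase in distance (coinventor, 2008).
eps_2008 = relative_change(0.11187, 0.11061)
print(f"{eps_2008:.4f}")
```

The same relative-change computation applies to the border semi-elasticity, with the shifted status obtained by switching the border dummy from 0 to 1 rather than by increasing distance by 1%.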
Regarding the time evolution of the distance effect in the co-inventor, applicant-inventor, and citation networks, we observe an overall positive trend corresponding to an increase over time in the magnitude of the distance effect (see Figure 5). For the first two networks, the positive trend emerges around the mid-1990s, while for citations it starts earlier and is roughly stable from 2002 onwards. 19 For mobility, there is a significant decrease from 1988 to 1997, followed by stabilization.
Figure 5 demonstrates overall that for three of the four networks, the magnitude of the distance effect is increasing over time, especially since the mid-1990s, which is concurrent with the observed saturation in the average distance of links shown in Figure 4. 20 Furthermore, Figure 6 demonstrates that the positive trend in the magnitude of $\varepsilon_t$ persists even when we remove regressors one by one until what remains is a basic gravity-like model, which includes only physical distance, its interactions with year dummies, year dummies and the size of nodes.
However, it is crucial to note that if the size of nodes (regions) is omitted from the regression, resulting in a dependent variable that is regressed against only distance, its interaction with year dummies and year dummies, then the positive trend becomes a negative trend, consistent with the observed positive trend in the average distance of links. The change in the trends suggests that the increase in the average distance of collaboration can be explained by the attraction premium of large regions ("hubs"), which gain an excess number of collaborations from afar. Following from general proportional growth models, as the central nodes continue to grow, the (new) peripheral nodes are more likely to connect to hubs, resulting in an increase of the average distance since there is a relatively small proportion of "hub" regions distributed throughout the globe. By accounting for this "attractive force" between two nodes by incorporating region size in the gravity approach, the net result is an overall increase in the magnitude of the distance effect with time. 21

Do large patent hubs attract collaborations differently than medium and small regions? In order to gain insight into the size dependency of the distance effect, and in particular the role that large "hub" regions play in the evolution of cross-border collaboration, we partitioned the co-inventor network data into 54 different datasets depending on the sizes of the regions collaborating (6 size-size groups), the distance over which the co-inventors collaborated (3 distance groups), and the time period (3 consecutive 8-year periods). The region size proxy $S_n$ is defined as the number of patents from region $n$ in each specified time period $T_i$.
For each subset we calculated the relative change in $y_{m,n}$ with respect to a relative change in distance, using the benchmark gravity model defined in Eq. (8) to estimate the distance effect parameter ρ. Figure 1(C) shows the 54 regression estimates for $\rho(d_{ij}, t)$, which are separated into 3 panels depending on the $d_{ij}$ group. Each datapoint is colored according to the size partition, and the size of each datapoint is proportional to the average $\log d_{ij}$ of the data entering into the calculation of that given ρ value. For a specific data subset, a decrease in the magnitude of ρ over time is consistent with distance becoming less of a collaboration barrier for a given type of inventor collaboration. Next we list our main observations by distance group.
Short-distance collaborations ($g_d = 0$): For short-distance collaborations, the magnitude of ρ is increasing for all the groups (LS, LM, and LL) that involve large hub regions. Hence, at short distance, the relative effect of distance is becoming a stronger impediment to these intra-regional collaborations involving large hubs. The relatively large magnitude increase in ρ for the LL group from period $T_1$ to $T_2$ may be due to the competitive aspects of hub "attractor" regions, which attract a significant portion of their collaboration activities locally, and in doing so may shield small regions from potential collaborations with other hubs.

20 The increase in the magnitude of the distance effect is a general pattern among the developed countries we analyzed, and is not being driven by specific countries or groups of similar countries. We estimated several models using different sub-samples and the results demonstrate that our general conclusions are in fact robust with respect to diverse subsets of countries. These results are made available by the authors.

21 When the other controls are reintroduced into our regression model, the overall trend in the magnitude of the distance effect remains the same: there is an overall increase in the magnitude of the distance effect. Figure 6 refers to the applicant-inventor network. A similar plot for co-inventorships is available upon request.
Within the subset of LL links representing collaboration between nearby hubs, an increasing share is represented by EU-EU links (the percent change $\%\Delta f_{EU-EU}$ = 56.5% and 28.6% for the $T_1$ to $T_2$ and $T_2$ to $T_3$ periods, respectively). However, this pattern of growth within the LL link subset was also present for USA-USA links, which dominate the EU fraction by roughly a factor of two, indicating that the EU is significantly behind the US in this category.
Medium-distance collaborations ($g_d = 1$): For medium-distance collaborations, corresponding to the characteristic scale of Europe, the magnitude of ρ decreased for the LL and LM groups from $T_1$ to $T_2$, but then stagnated from $T_2$ to $T_3$. Meanwhile, for the SS and MS groups, the ρ value increased from $T_1$ to $T_2$ and also from $T_2$ to $T_3$, indicating that there is a decreasing likelihood of these types of collaborations, possibly due to crowding out by large regions. Interestingly, from $T_2$ to $T_3$, the magnitude of ρ increased for all groups except for LL.
Concerning the relative number of EU-EU versus USA-USA links, the trends vary across time periods. For MM, the EU increased its share two-fold relative to the USA. For LS, the USA increased its share slightly with respect to the EU. For LM, both the EU and the USA increased their share, but the ratio remained at $f_{EU-EU}/f_{USA-USA} = 2.3$. For LL, again both regions increased their share, but the ratio decreased from $f_{EU-EU}/f_{USA-USA} = 1.3$ in $T_2$ to 0.9 in $T_3$. This last feature demonstrates a trend in the US for large hubs to increasingly collaborate, showing a 1132% growth from period $T_1$ to $T_2$ and sustaining a 157% growth from period $T_2$ to $T_3$, reflecting the competitive cross-border complementarity that the 2013 European Research Area Progress Report highlights as a key criterion for transnational development of the ERA (see European Commission (2013a,b)). It is possible that certain EU funding criteria that require cross-border collaborations may come at the expense of decreasing the complementarity of the match, and hence the competitiveness of the collective. These types of policies may be orthogonal to the market-based collaboration forces existing in the rest of the world, and may further delay or send the EU off course in its ambition to create an effective, efficient, and geographically distributed ERA.
Long-distance collaborations ($g_d = 2$): For long-distance collaborations, the magnitude of ρ significantly decreased for all groups from $T_1$ to $T_2$, but much less so from $T_2$ to $T_3$, suggesting a stagnation in the role of distance at the global scale, commensurate with the saturation demonstrated in Figures 1(A) and 4.
For both the LM and LM subsets, there is significant growth in $f_{EU-EU}$ from $T_1$ to $T_2$ and also from $T_2$ to $T_3$. There are also positive signs of global integration, as the LL link counts indicate a significant increase in the EU-USA share (27.8% growth) and also the EU-World share (21% growth) from $T_2$ to $T_3$. There is also an increase in the share of EU-EU LM and MM links from $T_2$ to $T_3$.
Nevertheless, the USA-USA links dominate the network at the global scale, commensurate with the findings of the 2013 Science Europe and Elsevier SCIVAL Analytics report on inter-country/state coauthorship (see Science Europe (2013)). Concerning the rest of the world, the fraction $f_{W-USA} > f_{EU-W}$ if either of the two regions is L, and $f_{W-USA} < f_{EU-W}$ otherwise, indicating that the USA has a higher outcome rate from teaming scenarios between large hubs and smaller regions. Such teaming schemes, also known as the "research buddy" model, are at the core of EU Horizon2020 policies aimed at raising the quality and competitive level of low-performing member state regions by partnering them with established European hubs (see Institutional Consortium (2013)).
Summarizing this subsection on the role of distance, the overall trend is that most of the changes in ρ occur from $T_1$ to $T_2$, but there is stagnation in the final period $T_2$ to $T_3$. Furthermore, by analyzing the trends in the specific subsets of link types, a question emerges concerning EU cross-border collaboration policies: are EU policy efforts consistent with external global trends? If not, then EU policies aimed at spurring ERA development may possibly be orthogonal to, and/or significantly lagging, the economic, knowledge, and labor market forces that influence cross-border collaboration patterns globally. We will address this issue in the results subsection 4.3.

Evolution of the border effect
In Figure 7 we report the evolution of the border effect. These plots report the yearly semi-elasticity $\pi_t$ as defined in Eq. (5), i.e. the percentage change in the expected link count $y_{(m,n)}$ when the dummy border changes from 0 to 1. For example in 2008, taking an "average" pair of NUTS3 regions, the country border effect reduces the expected $y_{(m,n)}$ by 88.7% for the co-inventor network, 88.4% for the applicant-inventor network, 85.2% for the mobility network and 48.1% for the citation network. 22 Thus, the effect of country borders is clearly quite strong. Similar to the distance effect and in line with our research hypothesis 1, it is far less important for the citation network and more important in the co-inventor and applicant-inventor networks than for the mobility network.
Unlike the distance effect, we find some indication that the border effect is decreasing in magnitude for all four networks. For the co-inventor, applicant-inventor and citation networks, the trend in the semi-elasticity is overall negative when considering the entire period of analysis. However, the trend is not continuous, as there appear to be sub-periods with a positive trend.
Notably, we observe an increase in the magnitude of the border effect starting around the late 1990s or early 2000s for the co-inventor, applicant-inventor, and citation networks, and persisting until recent years. For the citation network the overall negative trend turns out not to be statistically significant. Nevertheless, the levels still remain lower than in 1988, the first year of analysis. For the mobility network, the border effect resembles the pattern observed for the distance effect: there is a significant decline until 1995, followed by a relatively stable period until 2009.
To provide a better understanding of the kind of collaborations that happen more often across countries, we split regional links according to the size of the regions involved. We identify the top 100 regions as large regions and all remaining regions as small regions. 23 In Figure 8 we report the temporal evolution of the percentage of co-inventor links, subdivided among the three resulting groups (Top100-Top100, Top100-NonTop100, NonTop100-NonTop100), that are also cross-border. An overall increase in the total share of cross-border links is apparent, with roughly half of the total owed to the cross-border activity between Top100-NonTop100. Thus, the cross-border share is drastically larger for links between small and large regions, indicating that when small regions collaborate across borders they are significantly more likely to collaborate with large regions. The Top100-NonTop100 share also shows the most significant increase of the three subgroups, suggesting that, in light of our regression results, an easing of the cross-border effect can be partially explained by the increased likelihood of small regions pairing with large regions. However, links between top regions (Top100-Top100) rarely occur across borders, with only a mild increase over the period we examine. 24 This suggests that the border effect depends on region size: borders shield large competitive hubs from each other, but are less of a barrier for small regions seeking to pair with a hub.
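The decomposition behind this kind of plot can be sketched as follows. This is a minimal illustration on made-up link counts, assuming each group's contribution is measured as its cross-border links divided by all links in that year; the record layout and function names are hypothetical, not the authors' code.

```python
from collections import Counter

def size_group(m_is_top: bool, n_is_top: bool) -> str:
    """Label a regional pair by how many of its two regions are in the top 100."""
    n_top = int(m_is_top) + int(n_is_top)
    return ("NonTop100-NonTop100", "Top100-NonTop100", "Top100-Top100")[n_top]

def cross_border_shares(links):
    """links: iterable of (m_is_top, n_is_top, is_cross_border, link_count).
    Returns each size group's cross-border links as a % of ALL links."""
    total = 0
    cross = Counter()
    for m_top, n_top, is_cross, count in links:
        total += count
        if is_cross:
            cross[size_group(m_top, n_top)] += count
    return {group: 100.0 * c / total for group, c in cross.items()}

# Toy data for a single year (hypothetical counts, 100 links in total).
toy_links = [
    (True, True, False, 40),
    (True, True, True, 2),
    (True, False, False, 12),
    (True, False, True, 6),
    (False, False, False, 36),
    (False, False, True, 4),
]
shares = cross_border_shares(toy_links)
total_cross_share = sum(shares.values())
```

In this toy year the three contributions add up to the overall cross-border share, and the Top100-NonTop100 group accounts for half of it, mirroring the qualitative pattern described in the text.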

Evolution of the European research integration
Efforts to stimulate and reinforce R&D integration in the EU have been ongoing for decades, with the aim of developing an innovation system that can benefit from cross-border knowledge spillovers.
By way of example, consider the COST (European Cooperation in Science and Technology) programs that were initiated in the 1970s to spur intergovernmental European cooperation, and the Cooperation Programme that was embedded within recent Framework Programmes (FPs) to make explicit the emphasis on cross-border activities. Moreover, cross-border R&D activities are not only important to the EU, but reflect overall globalization trends. As the prevalence of multi-institutional and multinational teams increases across science (Wuchty et al. (2007); Jones et al. (2008)), the evaluation of the propensities and incentives to collaborate internationally will become an important and active area of research (European Commission (2013b,a); Science Europe (2013)).
In the previous sections we aggregated all OECD countries, focusing on the evolution of the border effect on a global scale. However, the 21-year time span of our networks provides the added opportunity to observe, analyze, and interpret long-term trends in the context of European research integration policies as well as other global factors. Hence, in this section we measure the role of borders in EU vis-à-vis non-EU collaboration networks by comparing two types of links: (a) links that are completely contained in the EU, (b) links that are completely external to the EU. Flows between EU and non-EU are removed for identification purposes. This analysis points out the impact of EU-specific factors (a systemic "treatment") aimed at increasing cross-border connectivity, by measuring changes in the link counts relative to non-EU countries which did not receive the treatment. Figure 9 shows estimates of τ_t as defined in Eq. (7), indicating that there are some positive signs of integration in European patent activity. In the case of the co-inventor network, we find an increasing overall trend of cross-border collaboration between inventors in Europe vis-à-vis other OECD countries. This effect was remarkable from the mid-1990s, but has stalled since 2004. As a computation example, for the co-inventor network in 2008, this results in τ_2008 = 0.2988 − 0.3210 = −0.0222, i.e. no difference in the impact of EU-specific factors upon integration with respect to 2004, the baseline year. However, analogous trends in the other networks are difficult to identify. 25 Apart from some positive effects for the applicant-inventor network in 1986-1997, and the citation network in 1996-2000 and 2005-2009, no significant trends can be detected for the entire sample period. 26 We replicated the analysis including also new member states in the group of European NUTS3 regions. In the augmented group of 27 countries the role of new members in the R&D networks is in any case very small, accounting for a tiny percentage of all links. This is reflected in the estimates, as the evolution of the treatment effects is very similar to what we report. 27
23 The percentages of all links pertaining to the three groups are 44.1% (Top100-Top100), 18.2% (Top100-NonTop100), and 37.7% (NonTop100-NonTop100), respectively.
24 A similar reaching-out effect has already been noticed for the US Life Sciences patent network by Owen-Smith et al. (2002).

Conclusions
The main results of our econometric analysis are summarized in Table 2, which illustrates for each network trends in the dynamics of the three effects. The overall trends we find are consistent for all networks, except inventor mobility, and in line with previous results in the literature (see Table 1). However, by taking a closer look into the dynamics of innovator networks across geographical and institutional borders, we notice that there have been some relevant changes in the global tendency over time. First, results in Figures 1 and 4 show that the increase of the average distance saturated across all networks. The outcome of our contextual analysis, as depicted in Figures 1, 2 and 3, reveals that the slowdown of the globalization trend combines with the emergence of innovation hubs, which attract connections on a global scale. The second effect has induced an increase in the effect of geographical and institutional barriers since the mid-1990s, as shown in Figures 5, 6 and 7. Consistent with our second research hypothesis, this effect appears to be stronger for small and mid-sized regions on national and regional geographical scales. As predicted, Table 2 shows that the increase in geographical and institutional barriers is stronger and starts earlier for codified knowledge (patent citations), whereas it is weaker and more recent for tacit knowledge (inventor mobility), with the other R&D networks as intermediate cases.
Looking jointly at the trends in Figure 7 and Figure 9 for the country-border effect in global patent activity, we can draw interesting insights about the impact that EU-specific integration has on the global evolution of the border effect. Positive trends in Figure 7 can be driven by: stronger integration in type (a) links (EU-EU), stronger integration in type (b) links (nonEU-nonEU), or a larger weight of type (c) links (EU-nonEU). Though the individual contribution of these links to the global patterns cannot be quantified from Figure 9, we can grasp whether the relative contribution of (a) versus (b) links is reflected in the sign of global trends. In particular, we note that when the trend in Figure 9 becomes positive, i.e. integration in (a) is faster than in (b), this is typically reflected in a movement of the opposite sign in Figure 7. In the co-inventor network, faster EU relative integration in 1995-2004 is consistent with a decline in the global trend between 1993 and 2004. In the applicant-inventor network, a relative reduction in the border effect in the EU in 1986-1997 pairs with a global reduction in 1988-1997. In the citation network, a significant relative EU integration in 1996-2000 can be associated with a significant decline in the global effect in the same period. However, we can also identify significant declines in the global country-border effect that are not associated with faster integration in the EU, suggesting that the impact of EU-specific institutional factors can provide only a partial explanation of the lessening on a global scale. This leaves some scope for the role played by the erosion of formal and informal institutional barriers at a global level. This interpretation seems to apply specifically to the mobility network for the early years of the sample. In particular, evidence of easing in institutional barriers to international mobility in 1988-2005 can be explained by the globalization of labor markets rather than by EU policies. All in all, the sudden stop of the process of European integration since 2004 seems to be caused by an inversion of the global trends. This has some important implications for the Horizon 2020 agenda that we will discuss in the concluding Section.
25 In the case of inventor mobility, the number of non-zero link counts was too low to be modeled using ZINB, thus estimation was carried out aggregating the network at NUTS2 level.
26 These trends are very similar to the findings of Chessa et al. (2013), who use the same methodological approach but follow a different sample selection rule and use the co-applicant network in place of the applicant-inventor network. In particular, they select an unbalanced panel of NUTS3 regional pairs removing all regions below a cutoff on the number of patents. We interpret the finding that our conclusions are qualitatively very similar as a sign of robustness of the methodological approach.
27 Figures are made available by the authors.
[Table 2: Trends in the dynamics of the three effects for each network: (I) Co-inventor, (II) Applicant-Inventor, (III) Citations, (IV) Mobility.]

Final Discussion
We analyzed the temporal evolution of spatial biases in the strength of inter-regional connectivity within a set of regional patent networks. Focusing on a set of fifty developed nations over the period 1988-2009, using inter-regional links at the NUTS3 region level, we have contributed to the body of literature on the geography of knowledge by analyzing four different R&D networks: co-inventor, applicant-inventor, citations, inventor mobility. Making use of a gravity-like econometric approach and controlling for a number of separation effects, we estimated year-by-year effects of physical distance and country-borders, and the trend of integration in the European Union as compared to the other developed countries.
Contrary to the widespread notion that the importance of distance has been decreasing over time due to globalization and technological advancement, our results show that the constraint imposed by geographical distance on R&D inter-regional links seems to have actually increased in three of the networks analyzed: co-inventor, applicant-inventor, and citations (see Table 2). On average, inter-regional links take place at a larger distance, although this trend has saturated in the last decade for each of the networks we analyzed. The greatest increase in the frequency of observed inter-regional links has been in the extreme tail of the inter-regional distance distribution. We observe roughly a factor of 10 increase in the likelihood of collaborations for distances above 2,000 km when comparing 1986 to 2008. Although these long-distance collaborations are far less likely than the average, a factor of 10 increase can induce a "tail effect" which can account for the increase in the average distance of collaborations reported in the literature and confirmed here. Hence, in light of these insights, it is possible to say that distance is not necessarily "dead", but that distance has at the least "saturated". Nevertheless, distance is still an important factor that distinguishes the size groups of regions we analyzed, and provides insight into the role of large "hubs", which can readily attract over long distances but also shield each other competitively over short distances. In light of our decomposition of the co-inventor data into size, distance, and time subsets, we urge policy makers to consider the subtle but important roles of context when evaluating progress in cross-border collaboration and mobility. Within large datasets, there can be niche patterns that are overlooked when data are pooled without controls and/or without being decomposed into conditional subsets.
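The "tail effect" can be illustrated with a toy calculation. The sketch below uses a hypothetical three-bin distance distribution (the distances and shares are invented for illustration, not estimated from the patent data) and shows how a tenfold increase in a very small long-distance share can raise the average distance even though long-distance links remain rare.

```python
def mean_distance(probs, dists_km):
    """Average link distance for a discrete distance distribution."""
    return sum(p * d for p, d in zip(probs, dists_km))

def inflate_tail(probs, factor):
    """Multiply the last (longest-distance) bin by `factor` and rescale
    the remaining bins so that the probabilities still sum to one."""
    tail = probs[-1] * factor
    if not 0.0 <= tail < 1.0:
        raise ValueError("inflated tail probability must stay below 1")
    scale = (1.0 - tail) / sum(probs[:-1])
    return [p * scale for p in probs[:-1]] + [tail]

# Hypothetical bins: representative distances (km) and link shares.
dists_km = [50.0, 500.0, 5000.0]
before = [0.700, 0.297, 0.003]
after = inflate_tail(before, factor=10.0)

mean_before = mean_distance(before, dists_km)
mean_after = mean_distance(after, dists_km)
print(f"tail share: {before[-1]:.3f} -> {after[-1]:.3f}")
print(f"mean distance: {mean_before:.1f} km -> {mean_after:.1f} km")
```

The point of the sketch is only qualitative: the mean is sensitive to the tail, so a rising average distance is compatible with a distance penalty that is unchanged or even stronger for typical links.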
Our decomposition method sheds light on the size-dependent mechanisms of preferential growth and attachment in dynamic innovation networks.
To this end, by controlling for numerous variables and analyzing conditional subsets of the network data we were able to gain new insight into the mechanisms underlying the evolution of inter-regional connectivity. The overall increase in the average distance can intuitively be understood as the result of large "hub" regions attracting new peripheral regions as they enter the system and grow over time.
However, ceteris paribus, for a pair of regions of a given size the strength of their connectivity gets more sensitive to physical distance with time. This means the cost of inter-regional collaboration at a given distance is still large, even increasing, but whenever a small region becomes connected to a hub the relevance of this cost is counterbalanced by the benefit of linking to a core region. Indeed, large and diversified regions tend to extend their basin of attraction across national borders, prevalently toward small regions. A possible explanation for the preferential attachment of big regions has to do with their capability to combine local proximity to attract socially distant groups with the capability of attracting socially close but geographically distant communities of inventors (Agrawal et al., 2008). It is worthwhile to further explore the sources of the preferential attachment mechanism across multiple R&D networks in future research.
As the EU seeks to implement a "Teaming Scheme" (aka "Research Buddy" plan) whereby top research centers will pair with "low-performing" member states to set up research centers, there should be a better understanding of the role of size in collaboration pairing (see Institutional Consortium (2013)). Indeed, we find that the distance effect is becoming larger in magnitude for medium-distance collaborations between small/medium regions pairing with large regions (see Figure 3, where all %Δρ values are positive in the g_d = 1 column, with the exception of the LL group).
Smart specialization policies should take into account the size-dependency of cross-border collaboration. We find evidence at the short distance scale that large regions may competitively shield surrounding smaller regions, meaning that a beneficial interaction between a small/medium sized region and a large region may not develop because of the competition between large hubs over short scales. Hence, investing in the development of large attractive hubs may lead to inefficient dilemmas within regions with a high density of hubs, an issue that should be considered as the EU encourages smart specialization policies in order to effectively, efficiently, and homogeneously develop the ERA.
The higher likelihood of small regions pairing with large regions across national borders contributes to an easing of the cross-border effect. Our estimates of the evolution of the cross-border effect indicate that national borders can be crossed more easily now than in the late 1980s, particularly due to a significant decrease up to the late 1990s or early 2000s depending on the network (see Table 2). Hence, when we observe collaborations happening more frequently across borders, this is largely driven both by an erosion of institutional frictions that impede international connectivity (Hoekman et al., 2010) and by the "reaching-out" of international hubs, rather than by a decrease in the costs associated with collaborating over distance. Part of this story can be explained in terms of the role played by the European Union in promoting international connectivity within the area, though signs of integration are weak and overall significant only for collaboration between inventors (see Table 2). Moreover, while the contribution of EU-specific integration forces to the reduction of the border effect on a global scale can be identified for some time frames, global trends often evolve independently, emphasizing the role played by the erosion of formal and informal institutional barriers that are not EU-specific. An outstanding example is the mobility network, which seems to have key relevance in driving cross-network interdependencies. In fact, we note that the time window in which we observe a significant decrease in the border effect for mobility is somewhat related to the period of decreasing border effect in the other R&D networks. This also applies to the distance effect, at least for the co-inventor network and the location of R&D activities. These results reinforce the view that individual mobility is the driving force of knowledge diffusion (Breschi and Lissoni, 2009).
However, given no evidence of impact on international mobility arising from EU institutions, this potential seems largely unexploited in the EU area (European Commission, 2013b).
In estimating the evolution of the distance effect we note that the mobility network stands out as the only network with a negative trend, though with no significant change after 1997. This suggests that the globalization of skilled-labor job markets, which enabled a reduction in mobility costs, has had a larger impact on the geography of knowledge than advances that favor a reduction in the cost of communication. In particular, the result that the distance effect is steadily increasing in the network of citations, despite well-known advances in technologies easing the codification of knowledge, corroborates the notion that tacit and embodied knowledge still play a major role in diffusion. Indeed, patents are pieces of codified knowledge building upon a stock of tacit knowledge that hinders its fruition (Breschi and Lissoni, 2009). Overall, the increase in the distance effect supports the view that improvements in communication technologies, while facilitating the substitution of face-to-face interactions with arm's-length communication, also create a greater need for close interactions to exchange complex knowledge, which leads research activities to agglomerate rather than to disperse (Gaspar and Glaeser, 1998).
For the geographical dispersion of the network of R&D activities (applicant-inventor network) we observe a significant decrease of the border effect over the period 1988-1999 and a significant decline of the distance effect over a similar time window (1988-1996). Also, both effects have increased almost hand-in-hand since the late 1990s. An increase in these effects means that, once we account for the effect of size and other variables, inventors are more likely to be located nearby and in the same country as their institution (patent applicant). Excessive geographical dispersion of learning centers can lead to difficulties in controlling the generation and exploitation of knowledge, especially given its predominant content of tacitness. This argument has been invoked to explain a substantial change in international location decisions observed immediately after an opposite trend between 1985 and 1995.
In fact, the strong movement to establish a transnational configuration of R&D observed between 1985 and 1995 has been blamed for resulting in overly complex and unmanageable organizational architectures (Gerybadze and Reger, 1999). In light of the role played by knowledge-seeking motives in the internationalization of innovative activities, our results are coherent with these trends and point out that the limits encountered by R&D internationalization strategies in controlling the accumulation of knowledge across geographical and institutional borders have not been reduced by globalization forces.
As concluding remarks concerning policy, we stress the importance of R&D clusters. Our evidence suggests that integration in research is being driven by the top regions reaching out to more peripheral regions and across borders. This trend in the evolution of R&D networks supports policies oriented to the exploitation of agglomeration economies in research clusters rather than targeting promotion of cross-border collaboration (Hoekman et al., 2010). This trend is in line with smart specialization strategies, as they can be a valuable asset to speed up the creation and consolidation of a European Research Area. However, the importance of investment in programs that incentivize mobility of researchers throughout Europe seems to be reaffirmed, even if we do not have explicit evidence of tangible benefits in the European Union as opposed to the rest of the developed world.
However, it is also important to stress that policies embedded within EU funding programs aimed at transnational cooperation (see European Commission (2013b)) may run counter to global trends.
By way of example, while the overall intensity of cross-border activity is increasing in Europe, it may be evolving in a way that is orthogonal to the US/World. Specifically, for the LS and LM size pairings over medium distances (g_d = 1), the EU has a significantly larger fraction of EU-EU links than the US has US-US links (roughly twice as many, see Figs 2 and 3). Conversely, for the large distance (g_d = 2) group, the EU-W link counts are significantly outnumbered by the USA-W link counts for pairings involving large regions. Hence, the policy implication is that while encouraging intra-EU collaboration is good for developing and sustaining the ERA, it may come at the cost of missing out on competitive global market forces which match collaboration partners according to "best-with-best" principles, independent of region (Boyle, 2013). To this end, EU policy makers might consider the pros and cons of a competitive funding system wherein the most competitive EU grants do not have any EU-collaboration criteria.
As a final remark, we point out some limitations in our analysis, which could be addressed in future research. We do not consider scientific publications or R&D projects and collaborative agreements in our analysis. Further investigation is needed to assess whether similar trends are present for basic research and other networks of innovators. Another extension may be to explicitly test for the dynamic interplay between different R&D networks. Finally, the increasing availability of large data sets of bibliometric information should encourage the application of new quantitative methods to assess the efficacy of the European R&D policies for smart specialization and integration.

Figure 1
Notes: Three distance regimes were defined: "small distance" (d_(m,n) ≤ 100 km, corresponding to g_d = 0), "medium distance" (100 km < d_(m,n) ≤ d_EU, corresponding to g_d = 1), and "large distance" (d_(m,n) > d_EU, corresponding to g_d = 2). The value d_EU is chosen to represent the characteristic distance of EU-EU links, using d^EU_(m,n)(T_all) = 562 km.
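The three distance regimes in the Figure 1 notes amount to a simple threshold rule, which can be sketched as below. The function and constant names are illustrative assumptions; only the 100 km threshold and the 562 km value come from the notes.

```python
D_EU_KM = 562.0  # characteristic distance of EU-EU links, as given in the notes

def distance_regime(d_km: float, d_eu_km: float = D_EU_KM) -> int:
    """Return g_d for a regional pair separated by d_km kilometres:
    0 = small distance (d <= 100 km),
    1 = medium distance (100 km < d <= d_EU),
    2 = large distance (d > d_EU)."""
    if d_km < 0:
        raise ValueError("distance must be non-negative")
    if d_km <= 100.0:
        return 0
    if d_km <= d_eu_km:
        return 1
    return 2

# Example: classify a few hypothetical inter-regional distances (km).
examples = [35.0, 100.0, 250.0, 562.0, 2000.0]
regimes = [distance_regime(d) for d in examples]
```

Both thresholds are inclusive on the lower regime, matching the "≤" signs in the notes, so a pair at exactly 100 km is "small" and a pair at exactly 562 km is "medium".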

Figure 3
Notes: % changes calculated over period T2 to T3. Shown are link distributions for T3.

Figure 4
Notes: All links between NUTS3 regions are used to compute the average distance. For inventor mobility, self-loops are removed as not meaningful.

Figure 8
Notes: These plots report the year-by-year % of cross-country regional links for three sub-samples. Regional links are grouped in links of (1) both top 100 regions, (2) ...

Appendix A. The ZINB model for the count of links
Zero-inflated models allow zeros to be generated by two distinct processes and are generally used when data exhibit "excess zeros" (Cameron and Trivedi, 1998), that is, a larger number of zero observations than expected under the Poisson distribution. The ZINB model supplements a count density, P, with a binary zero-generating process ψ. This allows a zero count to be produced in two ways, either as an outcome of the zero-generating process with probability ψ, or as an outcome of the count process P provided the zero-generating process did not produce a zero (ψ_i = 1).
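The two routes to a zero count can be made concrete with a short sketch. For simplicity it uses a Poisson density as a stand-in for the count process (so it is a zero-inflated Poisson rather than the ZINB used in the paper), and it assumes ψ denotes the probability that the zero-generating process produces a zero; the parameter values are arbitrary.

```python
import math

def poisson_pmf(y: int, mu: float) -> float:
    """Poisson probability of observing count y with mean mu."""
    return math.exp(-mu) * mu**y / math.factorial(y)

def zero_inflated_pmf(y: int, psi: float, count_pmf) -> float:
    """Mixture density: with probability psi the outcome is a structural zero,
    otherwise the count is drawn from count_pmf (which can also yield zero)."""
    from_count_process = (1.0 - psi) * count_pmf(y)
    return psi + from_count_process if y == 0 else from_count_process

mu, psi = 2.0, 0.3  # arbitrary illustrative values

def count(y: int) -> float:
    return poisson_pmf(y, mu)

p0_plain = poisson_pmf(0, mu)                # zeros under the plain count model
p0_inflated = zero_inflated_pmf(0, psi, count)  # zeros under the mixture
total_mass = sum(zero_inflated_pmf(y, psi, count) for y in range(60))
mean_inflated = sum(y * zero_inflated_pmf(y, psi, count) for y in range(60))
```

The mixture assigns more mass to zero than the plain count model with the same µ, which is the "excess zeros" feature, and its mean shrinks to (1 − ψ)µ.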
The density distribution for the pair count y_i is then given by Eq. (A.1), where the zero-generating process ψ_i is parameterized as a logistic function of the regressors in Z_i, with parameter vector β_0 (Eq. (A.2)). The count process P(y_i) is modeled as Negative Binomial of the second kind (NB2, Eq. (A.3)), where the conditional mean µ_i is parameterized as an exponential function of the linear index X_i β_1 (µ_i = exp(X_i β_1)), and α ≥ 0 is the overdispersion parameter.
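For reference, the three expressions described above have well-known textbook forms. The block below writes them out in conventional notation, assuming that ψ_i denotes the probability of a zero from the zero-generating process and that the NB2 density is parameterized by its mean µ_i and overdispersion α > 0; the paper's own typesetting of Eqs. A.1 to A.3 may differ in detail.

```latex
\begin{align}
\Pr(y_i) &=
\begin{cases}
\psi_i + (1-\psi_i)\,P(0) & \text{if } y_i = 0,\\[2pt]
(1-\psi_i)\,P(y_i) & \text{if } y_i > 0,
\end{cases} \tag{A.1}\\[4pt]
\psi_i &= \frac{\exp(Z_i\beta_0)}{1+\exp(Z_i\beta_0)}, \tag{A.2}\\[4pt]
P(y_i) &= \frac{\Gamma(y_i+\alpha^{-1})}{\Gamma(y_i+1)\,\Gamma(\alpha^{-1})}
\left(\frac{\alpha^{-1}}{\alpha^{-1}+\mu_i}\right)^{\alpha^{-1}}
\left(\frac{\mu_i}{\alpha^{-1}+\mu_i}\right)^{y_i},
\qquad \mu_i = \exp(X_i\beta_1). \tag{A.3}
\end{align}
```

Under these forms the expected count is E[y_i] = (1 − ψ_i) µ_i and the variance of the NB2 part is µ_i (1 + α µ_i), so that the Poisson case is recovered in the limit α → 0.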
In our estimation procedure we assume X_i = Z_i because there is no reason to expect that some variables would be relevant in only one of the two processes. However, individual regressors can impact the y_i estimator differently through the two distinct processes and their separate parameter vectors, β_0 and β_1. Thus, drawing together Eqs. A.1, A.2, and A.3, and setting X_i = Z_i, our model for the expected count is what we report in Eq. 1.

Appendix B. Tables of estimates
Notes: * significant at 5%; ** significant at 1%. Robust standard errors in brackets. The table reports estimates of four separate ZINB-gravity models for the count of links between NUTS3 regions (from these estimates we compute the yearly elasticities of the distance effect reported in Figure 5). For each model we report estimates of the parameters in the two ZINB parts, namely β_1 (pure count part vector) and β_0 (zero-inflation part vector) as defined in Eq. 2. Parameters in β_1 capture the regressor effect on the number of links, provided the zero-generating process did not produce a zero. Parameters in β_0 capture the regressor effect on the probability of observing a zero. Cross-sections of region pairs are pooled over years and estimation is carried out on the whole sample, clustering standard errors at region pairs. size_m refers to the smaller of the two regions for the co-inventor network, while it refers, respectively, to the citing, the applicant's and the exit region for the citations, applicant-inventor and inventor mobility networks. Vuong test statistics support the choice of the ZINB over a pure NB2 version (ψ_i = 0, ∀i) (Vuong, 1989; Long and Freese, 2006) and likelihood ratio tests support the choice of the ZINB versus the ZIP (Long and Freese, 2006).

Notes: * significant at 5%; ** significant at 1%. Robust standard errors in brackets. The table reports estimates of four separate ZINB-gravity models for the count of links between NUTS3 regions (from these estimates we compute the yearly semi-elasticities of the border effect reported in Figure 7). For each model we report estimates of the parameters in the two ZINB parts, namely β_1 (pure count part vector) and β_0 (zero-inflation part vector) as defined in Eq. 4. Parameters in β_1 capture the regressor effect on the number of links, provided the zero-generating process did not produce a zero. Parameters in β_0 capture the regressor effect on the probability of observing a zero. Cross-sections of region pairs are pooled over years and estimation is carried out on the whole sample, clustering standard errors at region pairs. size_m refers to the smaller of the two regions for the co-inventor network, while it refers, respectively, to the citing, the applicant's and the exit region for the citations, applicant-inventor and inventor mobility networks. Vuong test statistics support the choice of the ZINB over a pure NB2 version (ψ_i = 0, ∀i) (Vuong, 1989; Long and Freese, 2006) and likelihood ratio tests support the choice of the ZINB versus the ZIP (Long and Freese, 2006).