Knowledge Hubs of Russia: Bibliometric Mapping of Research Activity

The intensified globalization of science and the internationalization of research collaboration call into question the conventional center-periphery model of the national knowledge space. The traditional perception of metropolises acting as hubs for the absorption of global knowledge and its subsequent diffusion to provincial towns requires revision and refinement. The present study examines the agglomeration effect on the excellence centers formed around major Russian cities with a population of over 1 million people. Spatial scientometrics techniques are applied to capture and evaluate the territorial configuration and specifics of scientific centers within 50 and 100 km zones around millionaire cities. The Scopus database is used to source metadata on the key bibliometric indicators that reflect the magnitude of publication activity, the prominence and competitiveness of scholarly output, integration into national and international research networks and the scientific profiles of cities. The results suggest that cities with perceptible research activity located within 50 km of a millionaire city are more integrated at the national level and more competitive than average. However, the level of demand for their scientific results is often lower. Cities located within the 50-100 km zone experience the positive influence of agglomeration to a lesser extent, being quite isolated and following a self-sufficient development trajectory.


INTRODUCTION
In the late 1960s and 1970s, a lively discussion took place over the application of scientometrics to national and regional knowledge management. While introducing the concept of 'scientific space', Krauze and McGinnis [1] stressed the benefits of attaching scholarly output to territories, although pointing to "a rigorous treatment [required that] is not as yet available" (p. 442). Bibliometric cartography, the creation of 'maps' of the scientific literature, long remained a purely aspatial exercise dealing with clusters of highly interactive documents interlinked by co-citation, bibliographic coupling or direct citation techniques. [2,3] Meanwhile, scholars have widely used geographical vocabulary as metaphors to describe the scientific landscape: area of specialization, fields of research, scientific frontier. [4,5] Early papers featuring a geographical projection of scientometric data were written at the scale of countries.
A prominent Indian scholar, Subbiah Arunachalam, made a significant contribution to this area by examining science in a wide variety of countries: Australia, Canada, India, Indonesia, Israel, Malaysia, Philippines, Singapore and Thailand. [6][7][8][9] Together with Blickenstaff and Moravcsik, [10] Lewison, [11] Yuthavong [12] and other scholars, he bridged information science with regional studies and human geography. Apart from setting a geographically defined research scope, these studies touched upon national specifics, such as economic peripherisation and gross domestic product, national research expenditure, local infrastructure, population size, language barriers, literacy rate and others. The first observations on spatial patterns began to appear, such as peripheral territories being "derivative and imitative of science done in the centre, rather than 'original' or 'path-breaking'" [13] (p. 393). Lewison, [11] on the other hand, argued that the scientific production of less favoured regions is by no means negligible, as they demonstrate rapid growth within a limited number of scientific fields. May [14] introduced the notion of the scientific wealth of nations, which indicates a correlation between research profile and productivity. Grupp et al. [15] also examined the academic contributions to the technological change of a region.
Mass digitalization of the publishing business, globalization of the content indexed by abstract and citation databases and advances in information technologies have improved the coverage, accuracy and functionality of the metadata attached to a particular publication. These are data about authors, their affiliations with the address of location, subject categories, funding agencies, etc. All of these factors have enabled recent studies to assess spatial particularities in greater detail. A focus on regions and cities has revealed new spatial traits in knowledge networking. Numerous studies register the effects of distance between research organizations, noting that it causes decay in mutual citations and cooperation, thus stressing the importance of the territorial domain. [16][17][18] The location factor defines the asymmetry not only within the 'core-periphery' delimitation scheme, but also between the nodes. Burger and Meijers, [19] Li and Phelps, [20] and Mikhaylova and Mikhaylov [21] draw attention to the functional heterogeneity of knowledge hubs, while highlighting the notion of 'knowledge polycentricity': the mosaic of competence clusters that excel in a specific niche (e.g. place-dependent, subject-specific, industry-focused, etc.). The Hungarian researcher György Csomós, one of the most productive current scholars of spatial scientometrics, claims that the conventional consideration of knowledge agglomerations blurs the significant divergence that occurs in scholarly output at the local level. [22] One of his findings suggests that decomposition of the highly productive Greater Boston Area reveals that over half of the output comes from suburban settlements (e.g. Cambridge, 27.3%). He further suggests that, apart from the challenges related to the bibliometric and scientometric tools applied, researchers confront questions of an interdisciplinary nature, for instance, those covered by human geography. [23]

OBJECTIVES
The present investigation aims to test the hypothesis that the major cities of the country are 'knowledge hubs' boosting the research activity of smaller cities clustered in close proximity. An increase in the distance from the nucleus (a city with over a million inhabitants) is expected to negatively affect research productivity in both quantitative and qualitative terms. Numerous bibliometric indicators based on publication and citation data at the city level are retrieved from the Scopus database for the publication period of 2013-17. In particular, the study evaluates overall publication output distributed by cities, research productivity measured per 1000 people, publications in the top-10% percentiles by CiteScore, the share of papers resulting from national and international collaboration, and citations analyzed per paper and as field-weighted citation impact (FWCI).
The study is designed to answer the following questions:
• What are the differences in scientific productivity between cities of different sizes?
• What are the spatial and functional patterns in the distribution of knowledge-generating cities of various types around millionaire cities?
• What are the key knowledge-generating hubs of Russia?
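The ratio-based indicators named above (output per 1000 people, citations per paper, collaboration shares) can be illustrated with a minimal Python sketch. The `CityRecord` structure, its field names and the sample figures are hypothetical and serve only as an illustration; FWCI is not computed here because it depends on field-level citation baselines supplied by Scopus.

```python
from dataclasses import dataclass


@dataclass
class CityRecord:
    """Hypothetical container for city-level counts (not the authors' data model)."""
    name: str
    population: int            # inhabitants
    papers: int                # publications in the period
    citations: int             # citations received by those papers
    national_collab: int       # papers with domestic intercity co-authorship
    international_collab: int  # papers with foreign co-authorship


def share_pct(part: int, total: int) -> float:
    """Percentage share, returning 0.0 when the denominator is zero."""
    return 100 * part / total if total else 0.0


def indicators(city: CityRecord) -> dict:
    """Compute the simple ratio indicators for one city."""
    return {
        "productivity_per_1000": 1000 * city.papers / city.population,
        "citations_per_paper": city.citations / city.papers if city.papers else 0.0,
        "national_share_pct": share_pct(city.national_collab, city.papers),
        "international_share_pct": share_pct(city.international_collab, city.papers),
    }


# Illustrative numbers only, not taken from the study.
example = CityRecord("Example city", 50_000, 200, 600, 80, 40)
result = indicators(example)
```

A per-capita measure of this kind is what allows small cities to be compared with millionaire cities on an equal footing.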

METHODOLOGY
For this study, spatial scientometrics is applied to assess the scientific space of Russia at the city level. The research design combines bibliometric and statistical data. Therefore, both the name of a city and the names of particular organizations located in it are taken into account. It should be noted that papers published in collaboration by scholars from different cities are counted in full for each location indicated.
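The counting rule described above, where a co-authored paper is credited in full to every city listed, can be sketched as follows. This is a minimal illustration that assumes each paper is represented as a list of affiliation city names; the function name and the sample data are hypothetical.

```python
from collections import Counter


def full_count(papers):
    """Credit each paper once to every distinct city among its affiliations."""
    counts = Counter()
    for cities in papers:
        # set() ensures a city with several co-authors is counted only once per paper
        for city in set(cities):
            counts[city] += 1
    return counts


# Three illustrative papers; the first has two authors from the same city.
sample = [
    ["Moscow", "Dubna", "Moscow"],
    ["Dubna"],
    ["Kazan", "Moscow"],
]
counts = full_count(sample)
```

Under this rule the city totals sum to more than the number of papers (here 5 credits for 3 papers), so city-level counts should not be added up to obtain a national total.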
The knowledge production structure of Russia was formed in the Soviet period. Research excellence centers were established all across the country to support industries and national infrastructure projects, the exploitation of natural resources and the development of "economic districts". Institutes of the Russian Academy of Sciences (RAS) traditionally act as nuclei for scientific research, including the Siberian, Ural and Far-Eastern branches. Umbrella institutions, the regional scientific centers, unite most regional institutes of RAS. Over the past two decades, regional universities have started to play a significant role in knowledge generation, and in technology transfer in particular. The Russian Academic Excellence project "5-100", launched in May 2013, has given a strong development impetus to 21 universities in different regions of the country by providing extensive funding (https://www.5top100.ru/en). Thus, given their widespread distribution and the multiple roles they play, both the RAS institutes and numerous higher education institutions (HEIs) are included in the assessment.
The set of bibliometric indicators used to characterize the research profile of cities includes the number of publications, researchers and organizations per city, the main research excellence area according to the 27 categories of the All Science Journal Classification (ASJC), the volume of citations and the level of FWCI, the share of papers written in national and international collaboration and the share of publications in the top-10% percentiles by CiteScore. The Rosstat population figures are used for measuring research productivity and for testing the hypothesis on the agglomeration effect. For this purpose, each city is assigned to one of the following six types by population size: I, millionaire cities with over 1 million inhabitants; II, major cities with a population of over half a million; III, large cities of 250-500 thousand people; IV, big cities of 100-250 thousand people; V, medium cities of 50-100 thousand people; and VI, small cities of up to 50 thousand people. The focus is on analyzing the concentration of cities of various sizes that act as scientific centers around the fifteen millionaire cities (type I), which are treated as dominant attractors of research activity in accordance with the hypothesis. Two boundaries defining the territorial remoteness from the core of the territorial scientific system are taken into account: a) 0-50 km, the immediate proximity zone, and b) 50-100 km, the outer-belt. According to studies on innovation diffusion and knowledge spillovers, [24][25][26][27][28] a location within a 50 km limit or a one-hour drive provides a higher potential for the emergence of research ties due to frequent communication and even daily commuting (incl. labor mobility).
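The six-type classification by population size can be expressed as a short Python sketch. The text does not state how cities falling exactly on a boundary (e.g. 250 thousand inhabitants) are treated, so the inclusive lower bounds used below are an assumption, and the function name is hypothetical.

```python
def city_type(population: int) -> str:
    """Assign a city to one of the six size types (I-VI) by population.

    Boundary handling is an assumption: the thresholds for types III-V
    are treated as inclusive lower bounds.
    """
    if population > 1_000_000:
        return "I"    # millionaire cities
    if population > 500_000:
        return "II"   # major cities
    if population >= 250_000:
        return "III"  # large cities
    if population >= 100_000:
        return "IV"   # big cities
    if population >= 50_000:
        return "V"    # medium cities
    return "VI"       # small cities


# Illustrative populations, not taken from Rosstat.
types = [city_type(p) for p in (1_200_000, 600_000, 300_000, 150_000, 70_000, 20_000)]
```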
The inbuilt algorithms of QGIS 3.10, a professional application for representing geographic data, are used to define the precise coordinates of each city under study and to plot it. Symbols are introduced to provide a clear visual hierarchy, thus supporting the readability of the maps. A single scale is applied to all maps apart from that of Moscow, which features an extensive density of adjacent cities in the 50 and 100 km zones.
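Once city coordinates are known, assignment to the 0-50 km and 50-100 km zones can be sketched in Python. The sketch below assumes straight-line (great-circle) distance computed with the haversine formula on a spherical Earth; the study itself relies on QGIS and does not specify the distance measure, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def zone(distance_km):
    """Map a distance from the millionaire city to one of the study's zones."""
    if distance_km <= 50:
        return "0-50 km"    # immediate proximity zone
    if distance_km <= 100:
        return "50-100 km"  # outer-belt
    return "remote"


# One degree of latitude along a meridian is roughly 111 km.
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)
```

A one-hour drive, the alternative criterion mentioned in the methodology, would require road-network data rather than straight-line distance.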

ANALYSIS
Over the five-year period of 2013-17, Russian researchers published 349,696 papers indexed in Scopus, which puts the country in 14th place in the global ranking by volume of scholarly output (15th in 2013 and 12th in 2017). The dynamics of annual research output suggest that Russian science is becoming increasingly internationalized: the share of papers indexed in Scopus as compared to the Russian citation index grew to 2.8% in 2017 from 2.3% in 2013. Despite the positive trends, the national scientific space remains highly divergent. Of the 1118 cities analyzed in total, only 440 had at least one publication indexed in Scopus. The performance differences across the range of scientometric indicators are also evident when considering the type of city by population size (Table 1).
The results clearly indicate a relationship between the size of a city and its scientometric profile. Larger cities have more balanced development within their group for each of the six indicators considered. The heterogeneity of indicator values increases when considering smaller cities. This pattern is confirmed by the data on the gap between minimum (except for zero) and maximum indicator values for cities within the same size group (  Tagil). Moreover, in the group of smaller cities, the proportion of those with zero value is higher than for larger ones.
For the indicator "I.6 - Research Productivity", the share of cities with one publication is taken as the minimum threshold value. The patterns found for the distribution of cities with zero values for indicators I.1-I.5 also hold for indicator I.6. All cities with a population of over 250 thousand people had more than one scientific publication. The share of cities with one publication increases from big to medium and further to small cities (3.4, 9.8 and 38.2%). Thus, we can assume that the scientific profile of the millionaire cities (type I), the major cities (type II) and the large cities (type III) is characterized by greater stability and self-sufficiency.
A detailed analysis of the spatial distribution of cities within each type by indicator revealed a number of patterns in the spatial configuration of territorial scientific systems of different levels. Firstly, there are significant differences between city types in terms of research productivity. While in absolute terms the cities of types I and II are the core of the national scientific system, accounting for the highest volume of research output, in relative terms (per thousand inhabitants) the leaders are small cities, which occupy the top positions by this indicator. This perspective redefines small cities from being scientifically backward, peripheral and depressed to being efficient, high-performance research centers capable of generating knowledge. It shows how different the roles and functions of smaller cities in the national scientific system are, and how important it is to apply adaptive policies for their development.
5. http://www.ras.ru/sciencestructure/regionalcenters.aspx?rcen=
6. https://elibrary.r
Journal of Scientometric Research, Vol 9, Issue 1, Jan-Apr, 2020
Secondly, regularities are traced between the city type and the level of its integration into international and national networking, which is an important mechanism for knowledge flow. In cities with a population of over 250 thousand people, the maximum values for indicators I.3 and I.4 do not reach the threshold of 50%, indicating the relative self-sufficiency of their research teams and the considerable volume of articles generated. At the same time, research cooperation itself is important for these cities, as indicated by the average share of scientific articles written in international and domestic co-authorship (Table 1). Note that in most cases national collaboration prevails over international collaboration among city types I-III (by the number of articles). We assume that these cities perform important functions as agglomeration centers, playing the role of drivers, or locomotives, for smaller cities.
[Figure panels: A) National collaboration; B) International collaboration]
Thirdly, the distribution of city types by demand (I.1 and I.2) and competitiveness (I.5) of scientific products indicates the existence of two different models of scientific systems, with corresponding strategies, that are related to city size (Table 1). The first model is typical for cities with a population of over 250 thousand people. It implies the parallel development of several research areas, both hard (natural sciences) and soft (social sciences and humanities), which is reflected in the dynamics of the average and field-weighted citation patterns, a large number of papers published and a combination of global and region-specific research topics. The second model is typical for cities with a population of less than 250 thousand people (primarily for type VI) and suggests a pronounced knowledge specialization (incl. higher FWCI

DISCUSSION
The results obtained in the course of the study indicate that the development of urban knowledge-generating systems in cities with a population of under 250 thousand people is unstable and highly dependent on external factors influencing their research activity. In our opinion, one such factor is agglomeration, which is manifested in the proximity to a large scientific center and its positive effect on smaller cities located nearby. To test this hypothesis, we estimated the concentration of smaller scientific centers around the 15 millionaire cities (type I cities) and assessed their scientific profiles (Figures 1-3). It should be noted that 22% of the cities studied fall within the potential zone of influence of millionaire cities, with 52 cities located in the immediate proximity zone of 50 km and 43 in the outer-belt, at a distance of up to 100 km. To assess the agglomeration effect, we analyzed the distribution of cities around millionaire cities by the share of co-authored publications, which reflects the availability of sustainable scientific ties (Figure 1).
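The 22% figure can be checked with simple arithmetic. The sketch below assumes that the share is taken relative to the 440 cities with at least one Scopus-indexed publication reported in the analysis; the text does not state the base explicitly, and the function name is hypothetical.

```python
def zone_share_pct(inner_zone: int, outer_belt: int, base: int) -> float:
    """Share of cities lying within 100 km of a millionaire city, in percent."""
    return 100 * (inner_zone + outer_belt) / base


# 52 cities in the 0-50 km zone, 43 in the 50-100 km zone, 440 cities with output (assumed base).
share = zone_share_pct(52, 43, 440)
print(f"{share:.1f}%")  # prints 21.6%
```

Under this assumption the result rounds to the 22% reported in the text.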
Cities located within 50 km of a millionaire city are more involved in collaboration within the national network of scientific cooperation than other cities on average, which indicates the presence of additional links between the center and its satellites. This is confirmed by comparing the distributions of the three types of cities (by their location relative to the millionaire cities) by the level of involvement in national intercity cooperation networks (Figure 2A).
More than half of the cities in the 50 km zone have a high or medium level of national collaboration, which is significantly higher in comparison with the other two types. At the same time, cities located 50 to 100 km from the millionaire cities are, on the contrary, characterized by increased alienation from interregional cooperation, both in comparison with the average for other remote cities and with the cities of the 50 km zone. A similar evaluation for international collaboration shows that cities in the 50 km zone almost completely repeat the distribution of remote cities, which indicates the absence of specific patterns for them on this indicator. Interestingly, the cities of the 100 km zone demonstrate the worst values for international collaboration. Thus, the cities of the 100 km zone are the least integrated into the various scientific networks of cooperation, and the territory from 50 to 100 km from a millionaire city can be considered a kind of 'knowledge exclusion zone'.
In order to further verify our hypothesis, we analyzed the distribution of cities around millionaire cities by the percentage of publications in the Top-10% highly cited journals and FWCI values, reflecting the global competitiveness of research centers (Figure 3).
Evaluation of the distribution of cities by the percentage of publications in the top-10% Scopus-indexed journal percentiles by CiteScore shows that the cities of the 50 km zone hold better positions than the cities of the 100 km zone and the remote ones. For instance, the proportion of cities without publications in highly cited journals is almost 10% lower among knowledge centers close to millionaire cities (Figure 4A). This result can be interpreted as follows: the scientific teams of small cities located around the largest scientific centers accumulate, to a greater extent than other types of cities, competencies for generating high-quality intellectual products as a result of scientific cooperation with leading scientists from millionaire cities. For cities of the outer-belt, on the contrary, a different pattern is typical: they are the least likely to demonstrate their competitiveness through publications in top-quality journals, and 67% of them do not have this type of publication at all. This confirms the thesis of a 'knowledge exclusion zone' between 50 and 100 km from a millionaire city, where the positive synergetic effect of proximity to a large center weakens, while the attraction to the core and the outflow of resources intensify. Such cities lack the internal resources to consolidate their position as independent centers against the background of significant external impact.
In addition to research quality, the competitiveness of the intellectual products created is defined by the level of demand, measured by the FWCI indicator. The resulting distribution of city types by this indicator differs from that of the three previous indicators (Figure 4B). The significant proportion of 50 km zone cities among those with below-average FWCI values may be due to a number of reasons. Firstly, the small size of the cities and the low volume of publications generated restrict "friendly" cross-citation practices. Secondly, the little-known and weakly branded research facilities and/or scientific organizations located in these cities do not foster trust in the research results. Thus, articles are cited less even if they are written to high scientific standards and published in first-quartile journals. Slightly higher values are found for cities beyond the 50 km zone, in the outer belt, which is probably due to a narrow specialization and a specific subject of research. This is particularly true for industry research centers and institutes of the Russian Academy of Sciences. Figure 5 shows the distribution of cities by subject area and the volume of publication activity.
A highly advanced agglomeration has developed around Moscow, which is reflected in the complementary specializations of many knowledge-generating cities around the capital and a significant volume of scholarly output. This knowledge agglomeration is distinguished by a variety of small scientific centers: 18 different research areas are represented here (most excellence centers specialize in Physics and Astronomy, 9; Engineering and Materials Science, 5; Agricultural and Biological Sciences, 4). Moreover, most of these cities with related specializations are located in high density within the 50 km zone (6 in Physics and Astronomy; 4 in Engineering; 3 in Agricultural and Biological Sciences; 3 in Chemistry; 2 in Medicine and Computer Science; as well as Pharmacology, Toxicology and Pharmaceutics; Chemical Engineering). Individual, predominantly disconnected cities are located in the 100 km zone (with the exception of Materials Science); a third of them specialize in Arts and Humanities, Social Sciences, or Economics, Econometrics and Finance.
In the next largest agglomerations, those of Yekaterinburg and Rostov, 5 specializations are represented, most of which can be considered complementary. In the Rostov agglomeration, these are Chemistry, Engineering and Computer Science, and in Yekaterinburg, Physics and Astronomy, Chemistry, Materials Science and Engineering. A significant difference between the two agglomerations is that Yekaterinburg is more polarized (the research centers around Yekaterinburg are small in the number of publications and there is a greater gap between the center and the satellite cities). In the Rostov agglomeration, the distribution is smoother and the gap between Rostov-on-Don and a number of satellite research centers is smaller. Other knowledge agglomerations are significantly smaller; most of them are highly polarized, and their complementary small scientific centers are located within 50 km of a millionaire city.

CONCLUSION
Despite globalization and cross-continental innovation diffusion affecting the perception of distance in modern geography, the agglomeration factor continues to play a significant role in research and learning. Unlike technologies, which can easily be moved from the place where they were originally created, knowledge tends to be sticky and rooted in its initial territorial social systems, hence displaying localization patterns. For a long time, the same approaches and models were used to study knowledge as were used for innovations, the generation of which is directly associated with the emergence of new knowledge. However, with the active development of scientometric methodologies for studying knowledge dynamics, a tool has appeared for assessing the spatial specificity of knowledge generation and diffusion. Earlier scientometric studies have shown that scientific knowledge is itself an object of study and is subject to certain patterns, including spatial ones. Our study focused on the scientific agglomerations of Russian cities forming around the millionaire cities. We were interested in checking whether the agglomeration factor affects the scientific knowledge domain.
To facilitate understanding of the processes that occur around the millionaire cities, we have identified two zones at a 50 km and 100 km distance from the center. The distances are determined by a time factor: a distance of no more than 50 km potentially allows for more frequent, even daily, contact between knowledge stakeholders, and 100 km is the limit for maintaining connectivity on a systematic basis.
The study showed a number of curious patterns in the manifestation of the agglomeration effect. Firstly, the 50 km zone around a millionaire city is the zone of the greatest positive influence of a large scientific center on nearby cities, which generally have a population of under 250 thousand people. The proximity of a large integrator is manifested in a higher level of integration of small research centers into the national knowledge system, reflected in the national co-authorship indicator, and in an above-average quality of articles, as measured by the share of publications in the top-10% Scopus-indexed journals. However, location within 50 km of a millionaire city, often regarded as a knowledge hub connecting the national and international scientific space, does not affect the networking of small knowledge-generating cities with foreign researchers. Proximity to a large center also has no tangible positive effect on the demand for the intellectual products created in small research centers. Sometimes demand is even lower than in remote cities, which we associate with the lack of branding of a small city, an unformed pool of international trust and weak intra-city citation.
Secondly, not all millionaire cities in Russia were able to form scientific agglomerations around themselves. For example, Omsk does not have satellite research centers. The closest to it is located at a distance of 220 km. Systems of cities with disjoint research specializations (for example, around Voronezh) cannot be classified as scientific agglomerations for having low complementarity. The actual size of the millionaire city does not affect the number of research centers in the 50 km zone (excluding Moscow, where the capital status is the decisive factor).
Thirdly, the study revealed that the zone between 50 km and 100 km from a millionaire city is not a further ring in the wave of the agglomeration effect. Knowledge-generating cities (i.e. scientific centers) located here are, to a greater extent, isolated from the scientific space and poorly adopt global standards for academic writing and research. Most of these cities have a different research area than the millionaire city, which restricts the formation of close cooperative ties. At the same time, a significant part of these cities have a well-established scientific specialization and produce intellectual products that are in demand. Interestingly, looking beyond the 100 km limit, we find potential partners for most millionaire cities among cities specialized in the same or a similar field of knowledge. This aspect requires separate examination.
Overall, the study suggests that the agglomeration factor has a significant impact on the generation of knowledge. However, the wave of progressive knowledge dissemination from the center to the periphery can be called into question. The spillover effect seems to be exhausted beyond the 50 km limit and to give way to the reverse effect, the sourcing of resources by the center. This is reflected in the lagging research performance in the 50-100 km zone across the range of indicators analyzed. The results indicate that knowledge-generating institutions located more than 50 km from the agglomeration center cannot be considered full-fledged beneficiaries of its research infrastructure and facilities (e.g. centers for collective use, science and technology parks, etc.). Thus, public expenditure on science and technology should incorporate the needs of cities of the outer-belt, while avoiding duplication with the center.
Further research, in our opinion, should focus on wider zones around millionaire cities, as well as on other cities with a population of over 250 thousand people, in order to establish a detailed picture of the spatial patterns of various types of knowledge systems.