Health Research Networks Based on National CV Platforms in Brazil and Uruguay

Collaborations are an essential part of the research process. In the health area, these involve a great diversity of actors in various scientific and health subsystems. The study of collaborations has been developed mostly from the analysis of co-authorship in articles indexed in international platforms. However, these sources present some limitations to capture the production of knowledge in Latin American countries. This paper seeks to diversify the sources of information and the units of analysis for the study of collaborations in health research, by exploring data from two original and little-used sources of information in national and public Curriculum Vitae (CV) platforms: LattesCv (Brazil) and CVUy (Uruguay). Based on a wide sample of research projects extracted from CVs, networks for knowledge production are analyzed at micro (researchers) and meso (institutions) levels in each country. This preliminary analysis allows us not only to generate evidence on the nature and evolution of health research networks, but also to evaluate the advantages and limitations of CVs as a new source for the study of collaborative networks.


INTRODUCTION
In this study we understand health research from a broad perspective, where the production of knowledge is one of the functions that drives the Health Innovation System (HIS). [1] Collaborations are a constitutive and central part of the systemic vision of innovation and have been studied in multiple ways. In general, this vision confronts the linear approach to innovation and rests on the idea that innovation is an interactive process that takes place among actors and institutions. [2] In other words, it acknowledges that innovation can hardly exist without diversity in connections. [3] Latin American countries have very different and unequal HIS, [4] leading to different capacities to mobilize collaborations. These capacities are particularly important when it comes to promoting collaborations with the health subsystem for solving local problems. [5] The study of collaboration has been based mostly on bibliometric data from articles indexed in international databases. However, these data present a series of geographic and language biases that limit the analysis for Latin American countries. This paper seeks to explore new sources of information and units of analysis for the study of collaborations in health research. To that end, we carry out an exploratory analysis of collaborations in research projects declared in the Curriculum Vitae (CV) of researchers, for two Latin American countries that have public curriculum platforms: Brazil (LattesCv platform) and Uruguay (CVUy platform).
Social Network Analysis (SNA) was applied to the data contained in 1,448 CVs in the case of Uruguay and 12,252 in the case of Brazil. We explore to what extent the data on collaborations built from CVs allow us to understand the evolution of collaboration networks between researchers (micro level) and between HIS institutions (meso level). The analysis focuses on the evolution of collaboration networks between 2000 and 2015. This period allows us to place the collaboration networks in a particular context for the promotion of and investment in STI in Latin America, which moved from a scenario of economic crisis and little STI investment to one of recovery and larger investment.
We are particularly interested in answering four exploratory questions in each country: (i) How does the size of collaboration networks in health research projects evolve over the period? (ii) How do the cohesion and connectivity between network actors evolve? (iii) Who are the main actors in the network and what role do they play? (iv) How do networks between actors in different HIS subsystems evolve? Of particular interest is the nature of the interinstitutional collaborations that bring together actors focused on the production of knowledge with external actors from other subsystems, especially health service providers. By answering these questions, we hope not only to generate evidence on the functioning of health research networks in each country, but also to evaluate the advantages and limitations of CVs as a new source for the study of collaborative networks.

Background
Health research and health innovation systems
Despite growing recognition of research as a tool to improve healthcare and healthcare systems, there is great difficulty in defining what health research is and how it could effectively provide solutions. [6] In this study, we understand that health research involves a broad range of areas (biomedicine, social sciences, management and administration, public policy, engineering and medical technologies, among others) and that, when it is focused on problem solving, it requires a collaborative process between different actors. This collaborative process occurs within cross-sector systems which involve a diversity of agents in different domains of action, or subsystems. [1] Combined, these subsystems constitute the HIS. Albuquerque and Cassiolato [6] identify at least four types of HIS agents: (i) universities and research institutes, which bring together the scientific community and are the center of the knowledge and technologies that make up the system; (ii) healthcare services (hospitals, clinics, medical centers, etc.), which are mainly focused on their healthcare function but at the same time show a strong interaction with universities and the industrial sector; (iii) health policy and regulatory institutions, including government agencies but also professional associations, which play a decisive role in selecting agendas and financing; and (iv) the industrial sector (service and supply companies, laboratories, etc.).
According to Consoli and Mina, [1] innovation in HIS is nourished by knowledge transfer, especially from scientific research and clinical practice. This knowledge is highly dispersed within the system, and among disciplines, technology fields, geographical locations and institutions. Therefore, collaboration is the "gateway" for knowledge generation aimed at solving health problems.
Social Network Analysis (SNA) is complementary to the innovation system approach, since it makes it possible to identify the system's main actors, their strategic positions, the information flows and their evolution over time. Several studies have used this tool to identify the distinctive features of the HIS at national, regional, international or sector level. For example, Hicks and Katz (1996) [7] pointed out the strategic role of hospitals within the HIS in the United Kingdom. Consoli and Mina [1] also highlight the role of hospitals, specifically university and research hospitals, due to their double function and their potential role as "lead users". [8] In the case of Brazil, previous research also shows the growing participation of university hospitals in the production of scientific knowledge, combined with their other functions. [9] The application of the innovation system approach in Latin American countries has shown that this region, in addition to its local and contextual potentialities, faces common barriers related to the development of STI. Some of the main barriers highlighted by Arocena and Sutz [3] are: (i) low levels of investment in infrastructure, (ii) lack of knowledge demand, (iii) poor promotion of endogenous knowledge and of opportunities to apply it, (iv) difficulties in reconciling knowledge supply and demand, and (v) lack of stable linkages between STI institutions and productive actors at the national level. As pointed out by several authors, these barriers also apply to the case of HIS in the Latin American region. [10][11][12][13]
Investment in STI in Latin America has been historically low, with no country other than Brazil exceeding 1% of GDP in Research and Development (R&D) investment (Ricyt, 2020). Brazil has been an exception, experiencing a progressive increase in R&D investment between 2000 and 2013 up to approximately 2% of GDP, which since 2014 has dropped to 1%. [14] Uruguay shows a much lower level of R&D investment. Despite an increase in 2006, it has never exceeded 0.5% of GDP. [15]
Even though a country's research capacity depends on multiple factors, human resources are crucial, particularly scientists with PhD training. Brazil has 6.9 PhD holders per 10,000 inhabitants [16] and, given the implementation of successive national plans and the main role played by the scholarship system, it stands out in the region for its highly mature postgraduate system. Postgraduate programs in the health sciences are remarkably diverse, and most doctoral students graduate in this field. [17] Uruguay's postgraduate offer, on the other hand, is more recent, and its scholarship system has seen a significant boost since 2008. The total number of Uruguayan PhD holders is 4.5 per 10,000 inhabitants. [18]
Despite their diversity, Latin American countries share the common challenge of applying national capacities towards promoting interactive learning focused on solving problems. [19] To achieve this, collaboration and exchange between diverse areas of knowledge (formal and tacit) are key. In the health sector, this is particularly important, not only because knowledge is highly distributed, but also because progress in improving health (for example, the development of new diagnoses and treatments) requires close collaboration between basic medical research, clinical research, product development and healthcare. [1]

Collaborations for knowledge production
Scientific collaboration studies have a long tradition in social sciences and scientometrics. Laudel (2002) [20] defines collaboration as "a system of research activities by several actors related in a functional way and coordinated to attain a research goal corresponding with these actors' research goals or interests" (p. 5).
In recent decades, several authors have highlighted the emergence of new forms of knowledge production that leave behind isolated paths and increasingly replace them with collaborations among different kinds of actors and disciplines. [21] Some of the main reasons for collaboration include the need to jointly afford the increasingly high costs of frontier research, the increase in interconnectivity between countries, the development of communication technologies, and the need to address complex issues by integrating knowledge. [22] Particularly in the health sector, from the late 1990s, a strong impulse for collaboration began at a global level, based on the belief that it could help reduce disparities between countries. This impulse was mainly boosted by: i) scientific developments such as genomics and statistical techniques, ii) the technological requirements for analyzing large data sets, and iii) major global funding initiatives. [23] Alongside its advantages, collaboration also has disadvantages, for example in terms of time and resources, [22,24] and may involve different costs. Boschma (2005) [25] points out that the level of proximity or distance between collaborating actors can speed up or block innovative knowledge.
Historically, the study of scientific collaborations has been operationalized based on the co-authorship of articles. In Brazil, co-authorship networks have been widely used to analyze collaboration on research in specific diseases, [26,27] or with a focus on strategic health research institutions [28] among others. No records of this type of studies were found in Uruguay.
Co-authorship analyses have advantages in terms of reliability and replicability; however, collaboration and co-authorship are not synonymous. [20,22] Not all contributors appear as authors and not all authors actually participate in the collaboration. In addition, the bibliographic datasets used for this kind of bibliometric analysis show a series of geographic and language biases. [29] In recent years, several studies have sought to diversify the sources of information used to evaluate STI activities and so avoid these biases, for example by using webometrics. [30] This research proposes the use of a new source of information to complement the analyses of scientific collaborations at the national level. For this purpose, we focus on data extracted from CV platforms, and specifically on research projects. We define a research project as a process for the production of basic, applied or experimental knowledge that is guided by objectives, limited in time and carried out with specific resources. Projects are developed within institutional frameworks with their own missions, involve disciplinary, multidisciplinary, formal and tacit knowledge, and constitute workspaces for information exchange between several actors, who are involved with different levels of commitment, collaboration and conflict. By proposing this unit of analysis, we focus on a stage of knowledge production different from the one usually analyzed with data from published articles. If we think of knowledge production as a process, then data from research projects and data from articles can complement each other. The former capture the beginnings of knowledge production, whose results, depending on their success, use different dissemination channels or are not disseminated at all, while the latter capture the portion of knowledge production that is disseminated in a format primarily of interest to the academic community.

Methods and sources
CVs as sources of information to study STI activities
Different studies have used CVs as a source of data to evaluate national STI activities and capacities, for example, researchers' academic careers, [31,32] international mobility, [33] the dynamics of academic collaboration in general [34] and of co-authorship networks in particular, [35] and as an assessment tool for STI activities, [36] among others. There are multiple advantages to using CVs as a source: their use is nearly universal, the information has a relatively standardized format, in most cases they are easily accessible, and they provide longitudinal data on the performance of individuals in different areas of their professional life. [36] But the source is not without disadvantages. [37] The information contained in a CV is of variable quality and completeness, and emphasis may be put on different aspects depending on the individual's specialization and the goal for which the CV was created. The truthfulness of the information is also variable, because it is self-declared and may change over time. In addition, a CV may not be updated with the latest information (data truncation). Finally, standardizing the information requires a great effort. The existence of standardized CV platforms managed by public STI agencies is a key factor in gaining access to quality information.
The availability of platforms for public access to CVs is the main justification for analyzing the cases of Uruguay and Brazil. The CV formats in both countries were created in a similar way; in fact, the CV in Uruguay is based on the Brazilian LattesCv.
In both cases, CVs are used to evaluate applicants to programs from various STI agencies. The CVUy platform in Uruguay is managed by the National Research and Innovation Agency (ANII). It is a standardized and automated system launched in 2008 and currently has more than 12,200 records. In the case of Brazil, there are two large databases: one at the level of research groups and their members, the Directory of Research Groups (DGP), and one at the individual level, the aforementioned LattesCv. The latter was launched by the National Council for Scientific and Technological Development (CNPq) in 1999 and, in 2007, it exceeded one million registered researchers. [38] Data retrieval in both platforms followed a common logic. Both extractions were based on obtaining information on collaboration from the composition of the research project teams declared in CVs. The variables applied to both datasets are, for projects: title, summary, starting year, source of financing and number of members; and for individuals or project members: name, ID in CV, name of main institution, type of institution, and knowledge area and discipline.
However, each platform allows different definitions of the population under analysis. In the case of Brazil, the interface between the DGP and LattesCv data makes it possible to identify the principal investigators, or group leaders, using a broad definition of health based on the application sector. In this case, the population under analysis was 12,252 leaders of research groups with application in the human health sector, who registered 154,879 research projects in their CVs over the period 2000-2015. In the case of Uruguay, the CVUy does not make it possible to distinguish the leaders of research groups, nor to define health research according to the sector of application. The population under analysis was therefore defined from the main area of performance declared by the researcher and the main area of the research projects: Medical Sciences and Health (MSH). In this case, the population under study was 1,448 researchers, who registered 3,183 research projects in their CVs over the period.
The database building is divided into four common stages:

1. Retrieval of CVs: i) in Brazil, it was carried out with the scriptLattes [39] program, based on the collection and summarization of all LattesCv records in HTML format; ii) in Uruguay, upon a formal request for the information, all CVUy records were obtained in HTML format and then organized in a structured way.
2. Consolidation and cleanup in two phases: i) integration of project data and individual researchers' data in a database; ii) removal of duplicate information and definition of the time frame (projects started between 2000 and 2015).
3. Disambiguation of textual data (names of institutions) and coding in categories. For disambiguation, VantagePoint® software was used for LattesCv, and OpenRefine for CVUy.
4. Consistency check based on the analysis of descriptive statistics.
The criteria applied made it possible to obtain two large databases comprising researchers from a wide variety of institutions and knowledge areas. Although the differences in the unit of analysis do not allow a direct comparison between countries, we are able to evaluate the specificities of the collaboration networks in health in each country.
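The consolidation and cleanup stage described above can be illustrated with a minimal Python sketch. It is not the procedure actually used on the LattesCv or CVUy data: the `ProjectRecord` fields, the `normalize` helper and the rule that a duplicate is the same title and start year declared twice within one CV are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectRecord:
    """Hypothetical record layout; field names are illustrative only."""
    cv_id: str        # ID of the CV in which the project is declared
    title: str
    start_year: int
    members: tuple    # team member names as declared in the CV


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical titles compare equal."""
    return " ".join(text.lower().split())


def consolidate(records, first_year=2000, last_year=2015):
    """Keep projects started within the time frame and drop repeated
    declarations of the same project inside the same CV (assumed rule)."""
    seen = set()
    kept = []
    for rec in records:
        if not (first_year <= rec.start_year <= last_year):
            continue  # outside the 2000-2015 time frame
        key = (rec.cv_id, normalize(rec.title), rec.start_year)
        if key in seen:
            continue  # duplicate declaration within one CV
        seen.add(key)
        kept.append(rec)
    return kept
```

The same project declared in two different CVs is deliberately kept in this sketch, since shared declarations are what later link researchers in the network.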

Research collaboration networks construction
Collaboration networks are analyzed at micro and meso levels. Figure 1 summarizes the network creation procedure.
In the first kind of network, nodes are individuals who participate in research projects and have registered their CVs in the system. In the second kind of network, nodes are the institutions to which researchers belong. In both networks, links are established when a researcher's name appears in a project's team and/or when two researchers register the same project in their CVs. These are undirected networks in which we assume reciprocity in collaborations.
For each country, networks are analyzed by comparing the evolution of metrics at node and structure level across four periods (2000-2003; 2004-2007; 2008-2011; 2012-2015). This division into periods allows us to observe how the evolution of the networks fits into a particular economic and STI investment context in the region. Although each country has its peculiarities, this is a period characterized by a strong economic crisis and low STI investment at the beginning of the 2000s, followed by a period of recovery and improvements in STI investment and a subsequent period of stagnation from approximately 2012.
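As a rough illustration of the construction just described, the following Python sketch derives undirected micro (researcher) and meso (institution) edge sets per period from project teams. The input layout (`start_year`, `team`), the `institution_of` mapping and the choice to ignore links between two researchers of the same institution at the meso level are assumptions for illustration, not details taken from the study.

```python
from collections import defaultdict
from itertools import combinations

# The four analysis periods named in the text.
PERIODS = [(2000, 2003), (2004, 2007), (2008, 2011), (2012, 2015)]


def period_of(year):
    """Return the (start, end) period containing `year`, or None if outside 2000-2015."""
    for start, end in PERIODS:
        if start <= year <= end:
            return (start, end)
    return None


def build_networks(projects, institution_of):
    """projects: iterable of (start_year, team), team being researcher IDs.
    institution_of: dict mapping researcher ID -> main institution.
    Returns two dicts {period: set of undirected edges as sorted tuples}."""
    micro = defaultdict(set)
    meso = defaultdict(set)
    for start_year, team in projects:
        period = period_of(start_year)
        if period is None:
            continue
        # Every pair of team members is assumed to collaborate reciprocally.
        for a, b in combinations(sorted(set(team)), 2):
            micro[period].add((a, b))
            inst_a, inst_b = institution_of[a], institution_of[b]
            if inst_a != inst_b:  # assumption: no self-loops at the meso level
                meso[period].add(tuple(sorted((inst_a, inst_b))))
    return dict(micro), dict(meso)
```

Researchers whose projects have no other registered member do not appear in these edge sets; they would need to be tracked separately as isolated nodes when computing size and density.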
The main indicators and metrics used to answer the research questions are:
1. Network size evolution: using indicators of the size of the network, we can identify periods of increasing or decreasing collaboration among researchers and institutions.
2. Network cohesion and connectivity: by analyzing how Density (D) and Average Degree Centrality (ADC) evolve, we explore network cohesion. In addition, since in research practice it is usual to work in relatively small groups, it is interesting to observe the structure of communities based on the Clustering Coefficient (CCo). A network connectivity analysis is conducted based on the evolution of Degree Centrality (DC), the size of the Largest Component (LC) and the Average Path Length (APL) of the LC.
[...] (ii) Healthcare services, mainly researchers in medical centers, hospitals, clinics and healthcare centers. Following the recommendations of Hicks and Katz (1996), [7] university hospitals are not coded under the university category but as a different kind of institution. (iii) Companies, including researchers and professionals in companies that develop or provide technology or services for the health sector; and finally, (iv) S&T and health support institutions, governmental agencies and several non-profit organizations.
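The cohesion and connectivity indicators listed above (D, ADC, CCo, LC and APL) can be sketched in plain Python for an undirected network. The definitions used here are the standard textbook ones and are assumed, not confirmed, to match those used in the study; in particular, the clustering coefficient is taken as the average local clustering, with nodes of degree below two counted as zero.

```python
from collections import deque
from itertools import combinations


def adjacency(nodes, edges):
    """Build an undirected adjacency map; isolated nodes are kept."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def density(adj):
    """D = 2m / (n (n - 1)): share of possible links that actually exist."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) / 2
    return 0.0 if n < 2 else 2 * m / (n * (n - 1))


def average_degree(adj):
    """ADC: mean number of neighbours per node."""
    return sum(len(nb) for nb in adj.values()) / len(adj) if adj else 0.0


def clustering_coefficient(adj):
    """CCo as average local clustering (assumed definition)."""
    total = 0.0
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


def bfs_distances(adj, source):
    """Shortest-path lengths from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def largest_component(adj):
    """LC: the largest set of mutually reachable nodes."""
    seen, best = set(), set()
    for node in adj:
        if node in seen:
            continue
        comp = set(bfs_distances(adj, node))
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best


def average_path_length(adj, component):
    """APL: mean shortest-path length over all node pairs in `component`."""
    total = pairs = 0
    for node in component:
        for other, d in bfs_distances(adj, node).items():
            if other != node:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0
```

For networks of the size analyzed here, a dedicated graph library would be the more practical choice; the sketch only makes the definitions explicit.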

Collaboration networks in health research in Uruguay
During the period of analysis, Uruguay's research system undergoes at least three relevant changes: (i) strengthening of research programs and creation of a new institutional framework; (ii) strengthening of postgraduate programs and expansion of scholarship funding; (iii) increase in investment in STI activities, though with fluctuations. Although between 2005 and 2010 the public component assigned to STI activities at the national level increased by a factor of five, a drop was observed in the following years. [40] In 2008, the budget allocated to university research also increased. [41] Over this period, capacity building in the health sector shows a positive development. According to the analysis of the extracted CVs, there is a sustained increase in the number of PhD graduates in 2000-2015, as well as an increase in the proportion of those who graduate in the country relative to those who do so abroad.² The analysis of the CV data shows that collaboration networks in research projects seem to accompany the maturation of Uruguay's research promotion system. They grow throughout the period and show an increase in connectivity among researchers (Figure 2). At the beginning of the series, there are 275 participating researchers in the network, and by the end of the period the number of researchers has increased by a factor of four. Connections between researchers also increase, from more than 300 to more than 2,000 (Table 1).
2. For an analysis of Uruguay's health PhD base built on CVUy data, see the supplementary material.
Density measures indicate low global network cohesion; that is, very few of all possible relationships between network actors actually take place. However, the analyzed networks show an increasing level of connectivity over time. The ADC, that is, the average number of nodes adjacent to each researcher in the network, ranges from 3 to 6 depending on the period. The number of researchers without collaborations also decreases, while in every period the number of researchers with more than two direct collaborations increases.³ In 2008-2011, more than 30% of researchers have high DC, collaborating with eight or more other network members. Higher network connectivity is reinforced by a growing LC, which reaches its highest proportion in the third period of the series, when it comprises 82% of the network. The APL of the LC decreases towards the present. Considering the entire period, any two nodes in the LC are, on average, only 5 steps apart. A high CCo, almost 0.8, is also observed. This result was to be expected considering the dynamics of scientific collaboration, that is, small groups with intense internal relations and sporadic collaborations with the outside.
The networks analyzed from project data seem to show characteristics of "small world" networks. This kind of structure has been widely observed in scientific co-authorship, where it improves the information flow between group members. [42] However, this kind of network does not help bring all actors of a system together and can become redundant in terms of the circulation of ideas and information, which draws attention to the importance of weaker ties to external actors. [43] This is key for networks focused on health knowledge production. The existence of a significant number of researchers who maintain direct or indirect contacts implies potential communication channels for the exchange of ideas within the global network.
3. The evolution of the DC and the LC is detailed in the supplementary material.
The existence of nodes that act as a bridge between groups, for example researchers with high betweenness centrality (BC), is key to understanding the growth of this network's LC. In Figure 2, nodes with high BC are marked in different shades of green. They are mostly established researchers in areas such as biotechnology, basic medicine, chemical sciences and clinical medicine. Although networks are mostly composed of collaborations between researchers in STI institutions (88%), especially national public institutions, collaborations with academic and non-academic experts who perform different functions in the health system (12%) are also observed. For a more detailed analysis of institutional collaborations, we regrouped the network using as nodes the institutions to which researchers belong.
As can be seen in Figure 3, [...] Except for the MSP, which is the main national health policy authority, the rest are public STI institutions: three schools (FMED, FCIEN and FQ), two research institutes (IIBCE and IP_MV) and one training program (PEDECIBA). All show incremental growth in their centrality measures (DC, BC, CC) throughout the period.⁵ These institutions play a major role in information flows and exchanges related to the production of health knowledge at the national level. FMED shows the highest indicators of centrality. It maintains direct links with most institutions in the network (DC 134) and acts as mediator in collaborations between many of the network's node pairs (BC 0.46). In addition, its strategic position allows it to easily receive information released by other nodes (CC 0.69). The rest of the institutions mentioned above also have high DC and CC, but lower degrees of intermediation. In terms of geographic distribution, collaborations are highly centralized in the capital of the country. However, the establishment of the Regional University Center in the north of the country (CENUR_Lit. Norte) seems to be a great step towards changing this. This institution has shown high measures of centrality since its creation and seems to be acting as a bridge between leading health institutions in the capital and the north of the country.
At the network's periphery, there are several institutions that play different roles in the production of knowledge in health. Interinstitutional subnetworks show that the LC of collaboration between STI institutions and companies comprises 46 nodes and only 9% of the links of the global network (Figure 4). Centrality measures of the companies are low and their links with STI institutions are not constant over time, except for some of them, such as the Uruguayan Center for Molecular Imaging (CUDIM) (a non-state public corporation focused on diagnosis, research, training and applications in the health sector), ATGen (a private laboratory created as the first spin-off incubated in FCIEN-UDELAR) and Celsius (a private laboratory acquired by the Dermur Pharma group). Although these data need further research, the positive evolution shown by collaborations between UDELAR, the Pasteur Institute of Montevideo and the private company ATGen is an essential precedent to explain the joint development of the first national test to detect SARS-CoV-2, which made possible the country's current success at expanding its diagnostic capacity to cope with the pandemic.
4. Networks contain a residual category "UDELAR" that includes several schools of the public university, but no further information can be disaggregated in the data extracted from some CVUy records.
5. The evolution of the centrality measures for the main institutions is detailed in the supplementary material.
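For reference, the three node-level centrality measures reported for institutions in this section (DC, BC, CC) can be written out as follows. The text does not spell out the definitions used, so these are the standard Freeman formulations, given here as an assumption: BC and CC are read as betweenness and closeness centrality, both normalized to the interval [0, 1], which is consistent with reported values such as BC 0.46 and CC 0.69, while DC is taken as a raw count of neighbours, consistent with DC 134. The symbols n, N(v), d(v,u) and sigma are introduced here for illustration.

```latex
% n: number of nodes; N(v): set of neighbours of v;
% d(v,u): shortest-path length between v and u;
% \sigma_{st}: number of shortest paths between s and t;
% \sigma_{st}(v): number of those paths passing through v.
\begin{align}
  \mathrm{DC}(v) &= |N(v)| \\
  \mathrm{BC}(v) &= \frac{2}{(n-1)(n-2)}
      \sum_{\substack{s < t \\ s \neq v,\ t \neq v}}
      \frac{\sigma_{st}(v)}{\sigma_{st}} \\
  \mathrm{CC}(v) &= \frac{n-1}{\sum_{u \neq v} d(v,u)}
\end{align}
```

Under these definitions, a high BC combined with a high CC, as reported for FMED, describes a node that both lies on many shortest paths between other institutions and is itself only a few steps away from most of them.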

Collaboration networks in health research in Brazil
Brazil stands out in the region for the development of an institutional framework for the promotion of STI in health, as well as for its strong scientific community and its ability to develop a research system that can effectively contribute to improving the quality of life of its population. [44,45] During the analyzed period, the STI promotion system in Brazil undergoes considerable changes regarding its institutional framework, infrastructure and budget. In particular, several efforts were made to decentralize and expand the higher education and postgraduate offer, including the establishment of new federal universities. In the health research area, a rise in budget is observed, especially from 2003 to 2006, followed by a drop and subsequent stagnation. [44] There were approximately 5,609 research groups focused on health sciences in 2014, that is, 16% of all groups nationwide. [16] On the other hand, groups with direct application in human health gather a great diversity of disciplines and number more than 10,800 up to 2015. [46] According to data on the academic training of these group leaders, the number of PhD graduates grows continuously until 2008, followed by a subsequent drop. In addition, those who graduate in the country are considerably more numerous than those who graduate abroad.
The collaboration network analyzed for the case of Brazil comprises a total of 6,459 research group leaders with application in the health sector conducting research projects during the analyzed period (Table 2). The network grows in size up to the third period (Figure 5), that is, it includes a growing number of leaders and of collaborations between them. As in Uruguay's collaboration networks, the highest network growth is observed in 2008-2011. The significant decrease in the network's size in the last period stems from CV data not yet having been updated, given how close that period is to the extraction date.
In this case, the network's structure shows low cohesion and high fragmentation. Density and connectivity measures are very low; the networks' average degree does not exceed two connections in any of the periods. Nonetheless, an improvement in connectivity towards the present is observed. The proportion of leaders with degree 0 decreases as the proportion of leaders with degree 1 or higher increases. In other words, towards the present, more and more research group leaders establish collaborations in projects with other leaders. The improvement in network connectivity is confirmed by the growth of the LC, which by the third period reaches 10% of the total network. The network shows a high clustering coefficient in all periods. On the other hand, the distance between pairs of nodes varies substantially across periods depending on the size of the LC (Table 2). If we consider the entire analysis period, the average distance between pairs of nodes in the network is high, at 11 steps. This characteristic could be explained by the fact that the network only captures part of the collaboration between the most established researchers, and should be complemented in the future with information on other researchers.
Most leaders in the network are researchers at STI institutions (93%). Nevertheless, the network also includes a group of leaders (4%) who belong to health system institutions, such as hospitals, clinics, nonprofit organizations and government agencies.
Institutional collaboration networks in research projects increase in size throughout the periods and comprise a total of 1,074 institutions (Figure 6). As can be seen in Figure 6, [...] based in Bahia with the rest of the network and accounting for high BC indicators. Geographic location is a key factor in understanding Brazil's dynamics in health research, particularly due to its high concentration in the southern and southeastern regions. Nonetheless, some slight improvements in capacity decentralization are observed over time in the north and northeast regions. [47,48] Interinstitutional subnetworks gathering leaders from institutions with different functions in the health research system comprise only 188 nodes and 202 edges, that is, a small portion (18%) of the global network (Figure 7). However, these subnetworks grow from the first periods of the series to the present. Group leaders in hospitals, such as the Clinic Hospital of POA, the Barretos Cancer Hospital or the Clinic Hospital of USP, play a leading role in these networks. University hospitals in particular are key to collaborations that have grown over time. Collaboration between leaders from different government agencies in the health sector is also observed, especially in the Secretariat of Health Surveillance (SVS), MS and the Secretariat of Health of São Paulo (SES_SP).

CONCLUSION
The collaboration networks among health scientists were analyzed using data on research projects from their CVs over a 16-year period (2000-2015). We found that data from both CV platforms can be used for network analysis and provide significant evidence for understanding the dynamics of collaboration at the national level. In both countries, the networks analyzed at the micro level grow in size and connectivity from the middle of the period onwards. In general terms, the evolution of the networks seems to converge with the institutional and budgetary strengthening of STI systems observed in both countries.
The project data in both platforms also allowed us to analyze the interinstitutional collaborations of the different HIS actors, based on coding the types of institutions. This analysis confirms the leading role played by public research institutions in these two Latin American countries, especially public universities and health research institutes. Furthermore, the data show that both countries have a diversified base of collaborations between STI institutions, hospitals and government agencies. The presence of diverse actors that represent the supply of and demand for knowledge could help guide research towards meeting the demands of the health system. The expert users located in hospitals and government agencies are not just sources of information; they are also well aware of local problems and their possible solutions. [8] Although the data show that these interinstitutional subnetworks grow over time, they remain a small portion of the global network in both countries, and the participation of several health subsystem actors is not constant over time. In that sense, the lack of connection between HIS subsystems, health policies and the population's health needs continues to be a problem. [12] On the other hand, the analyzed data capture only a minor role for the business and industrial sector in health research projects, which points to a limitation of the CV data collected. In particular, data on group leaders' institutions are not a good source of information for capturing interactions with the industrial or business sector in Brazil. However, other dimensions contained in the DGP platforms have shown greater potential, illustrating typical dynamics of university-private sector interaction and showing the relevance of collaborations between public health STI institutions, national laboratories and multinational companies. [49] In the future, it will be necessary to explore other variables within the Lattes-CV in order to obtain longitudinal information on company participation in research projects, for example their participation in research project funding.
Although this preliminary analysis provides new sources of information as well as substantial evidence for understanding the dynamics of collaboration in knowledge production, it is also limited by the design of the CVs, especially because the agencies that manage them pay little attention to recording the co-participation of non-academic actors. If the focus of scientific evaluation is mainly on rewarding bibliographic productivity based on published articles, then information on collaborations with non-academic actors will receive little consideration in the design of CVs. Nevertheless, to strengthen the HIS, progress should be made in expanding the role of both the health and STI subsystems, increasing the participation of actors involved in guaranteeing public health. This requires reducing the costs of collaboration and rewarding these activities. This is a problem that concerns scientific evaluation systems in our region and in the world. As Kickbusch [50] points out, the challenge is to find performance indicators that generate mutual benefits for the actors involved, so that collaborations can be sustained over time and thus improve the quality of health research and services.
In the future, it seems necessary to reinforce at least three lines of analysis to expand the use of these sources: (i) explore the potential of CV data to analyze causal explanations and determining factors for collaboration networks and their structure; (ii) diversify the empirical evidence and units of analysis to explore types of collaborations, since CVs are a rich source of information that can be used to integrate collaborations in technical production, bibliographic production and teaching materials, among others, providing a more comprehensive overview of the various forms of knowledge and technology production in developing countries; and (iii) explore the qualitative data contained in CVs, for example by applying text mining techniques to project summaries or research lines, with the aim of better understanding which research topics and problems make up the agendas in the different knowledge areas, institutions and country regions.
The drop observed towards the present is explained by the date of data extraction in 2015. The vast majority of group leaders are consolidated researchers with completed doctorates, so it is more difficult to find recent PhD graduates in the sample.

Number of nodes
Total number of nodes in the network: $n$.

Number of ties
Total number of links between nodes: $t$.

Density
Measures how close the network is to being complete. A complete network means that all possible connections take place between all nodes, in which case the density is 1.
Number of effective ties in the network, expressed as a proportion of the number possible. In a network with undirected links, the density is $\frac{2t}{n(n-1)}$.

Average Degree Centrality
Average number of ties that each node in the network has.
$ADC = \frac{t}{n}$
where $t$ is the total number of links and $n$ is the total number of nodes.

Clustering Coefficient
Average of the individual clustering coefficients.
$\frac{1}{n} \sum_{i=1}^{n} Cco_i$
where $Cco_i$ is the individual clustering coefficient (i.e., the density of ties among nodes connected to a given node) and $n$ is the number of nodes in the network.

Largest Component
The largest group of nodes that are all connected to each other, directly or indirectly.

Degree Centrality
Number of links that a node has, or number of adjacent nodes.

Betweenness Centrality
A measure of how often a given node falls along the shortest path between two other nodes.
$BC_j = \sum_{i<k} \frac{g_{ijk}}{g_{ik}}$
where $g_{ijk}$ is the number of geodesic paths connecting $i$ and $k$ through $j$, and $g_{ik}$ is the total number of geodesic paths connecting $i$ and $k$.

Closeness Centrality
A measure of how close a node is to all the others in the network; higher CC values mean greater closeness to all other nodes.
Total distance (in the graph) of a given node from all other nodes.

PhD training in health in Uruguay and Brazil
On the understanding that PhD training constitutes a critical stage in building and developing R&D capacities, we examined the evolution and geographical location of health researchers' PhD training in both countries. The information contained in the CVs was analyzed for this purpose and to complement the network analysis.

The case of Uruguay
Among researchers participating in health research networks, there is a sustained increase in the number of PhD graduates in 1984-2015, as well as an increase in the proportion of those who graduate in the country to the detriment of those who do so abroad (Figure S1).
Until 1992, the proportion of graduates abroad was significantly higher, which is associated with the emigration, during the dictatorial regime (1975-1985), of a significant contingent of scientists and young people who then continued their training abroad. As of 1986, the proportion of those who graduate in the country increases, becoming a majority from the beginning of the 2000s until the end of the period. This relocation of PhD studies among health researchers can be associated with the establishment of several new institutions that: i) open work and academic opportunities, ii) offer the possibility of obtaining a PhD degree in the country, and iii) support PhD training in general and in the health area in particular. The top five countries chosen by those who graduate abroad in 2000-2015 are Spain, France, Argentina, Brazil and the United States. In 2012-2015, half of the health PhDs graduate in the region, which may be associated with the growing number of postgraduate programs also offered at the regional level.

The case of Brazil
According to the analyzed information, PhD qualifications of group leaders in the health area grew continuously until 2008 and dropped afterwards (Figure S2). A significant majority of those who graduate in 1984-2015 obtain their degree in the country and not abroad. The number of those graduating in the country increases throughout the period, except from 1993 to 1995. This may be related to Brazil's major economic crisis in the 1980s, which led to larger emigration flows in general during that period, when those graduating in that triennium began their PhD studies.
The increasing number of graduates and their growing nationalization are associated with a highly mature national postgraduate system, which has been expanding continuously since the 1970s. Most leaders of health research groups who obtain their PhD degrees abroad choose the United States and Canada as their study destinations, as well as the following European countries: Spain, France, Portugal, Great Britain and Germany. Unlike what is observed in Uruguay, Brazilians do not choose countries in the region for their PhD studies. On the contrary, a varied, high-quality offer of PhD programs makes Brazil a leading host country for students in Latin America.
The analyzed information shows a growing nationalization of PhD training in both Brazil and Uruguay. A plausible hypothesis is that the institutionalization of PhD programs in both countries, together with the search and demand for greater specialization, explains the shift in migration for PhD training.

Evolution of connectivity of individual collaboration networks in health research projects
The graphs show the evolution of the DC and LC analyzed for each country in the article (Figures S3, S4, S5 and S6).
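The network indicators defined earlier in this supplementary material can be computed on a small example as follows. This is a minimal sketch that assumes the networkx library, which the article does not name as its tool, and a hypothetical four-node network. Note that `nx.closeness_centrality` returns the reciprocal of the average distance to the other reachable nodes, so higher values mean greater closeness.

```python
import networkx as nx

# Hypothetical undirected network: a triangle (a, b, c) plus a pendant node d.
toy = nx.Graph()
toy.add_edges_from([("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")])

n = toy.number_of_nodes()  # number of nodes
t = toy.number_of_edges()  # number of ties

# Density: effective ties as a proportion of the possible ties.
density = nx.density(toy)

# Clustering coefficient: average of the individual coefficients.
avg_clustering = nx.average_clustering(toy)

# Degree centrality as the raw number of adjacent nodes.
degree = dict(toy.degree())

# Betweenness as the unnormalised sum over pairs i < k of g_ijk / g_ik.
betweenness = nx.betweenness_centrality(toy, normalized=False)

# Closeness: reciprocal of the average distance to the other nodes.
closeness = nx.closeness_centrality(toy)

# Largest component: biggest set of mutually reachable nodes.
largest_component = max(nx.connected_components(toy), key=len)

print(n, t, round(density, 3), round(avg_clustering, 3))
print(degree, betweenness, closeness, len(largest_component))
```

Node c connects the pendant node d to the triangle, so it is the only node with non-zero betweenness in this example.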