1 Introduction

Nanotechnology is an important emerging technology with many potential applications. Examples include nanoparticles (Davis 1997), nanosensors (Qu et al. 2013), nanocomposites (Paul and Robeson 2008), carbon nanotubes and nanomaterials (Salata 2004), which can be used for biomedical, environmental, textile, food, or construction purposes. Healthcare is one field being revolutionized by nano-scale manipulation (Gabellieri and Frima 2011; Pautler and Brenner 2010), since nano-devices and nano-medicines can deliver more accurate diagnoses, cheaper and faster biomedical facilities, less invasive procedures and more targeted drugs (Bjørn Larsen 2011). All these applications can lead to great economic and societal benefits (Miyazaki and Islam 2007; Roco 2013). Hence, nanotechnology is continuously being incorporated as an essential part of industrial and governmental R&D agendas (Miyazaki and Islam 2007; Roco 2013). Nanotechnology R&D funding has grown worldwide from US$1.2 billion in 2000 to over US$18 billion in 2010, with an average annual growth rate of 0.27 (Roco et al. 2011; Roco 2013). Nanotechnology has also been selected as one of the six Key Enabling Technologies (KETs) by the European Commission, and the current Horizon 2020 (H2020) Framework Programme prioritizes the industrial implementation and cross-fertilization of nanotechnologies in industry (Højgaard et al. 2012; Kalisz and Aluchna 2012).

For an emerging technology like nanotechnology, creating sufficient technological diversity among its alternatives is important for its long-term success (Negro et al. 2008; Van den Bergh 2008; Van Rijnsoever et al. 2015). Innovation is an evolutionary process of variation and selection (Edquist 1997; Hekkert et al. 2007). Technological diversity helps to prevent an early lock-in, facilitates recombinant innovation, increases resilience of the technology in case of unexpected circumstances, and allows market growth (Dosi 1982; Faber and Frenken 2009; Van Rijnsoever et al. 2015).

However, technological diversity could also increase production costs, hamper economies of scale, and impede standardization and the learning of routines. It can also lead to co-ordination difficulties among actors.

The diversity of a technology changes as new technological alternatives are created (Murmann and Frenken 2006; Saviotti and Metcalfe 1984; Van Rijnsoever et al. 2015). If a new technological alternative represents a common technological design, diversity decreases. Technological alternatives that have a novel or less common design increase diversity (Abernathy 1979; Frenken et al. 1999).

The creation of new technological alternatives often takes place in innovation projects in which different actors such as firms, universities, and research institutes collaborate (Cooke et al. 1997; Edquist and Hommen 1999; Niosi 2011). For emerging technologies, these innovation projects are often publicly supported, for example, through EU-funding. Hence, funding instruments can be used as a tool for policy makers to influence the level of technological diversity (Pandza et al. 2011; Van Rijnsoever et al. 2015), and thus to secure the long-term viability of the technology.

Simulations (Jonard and Yildizoglu 1998) and conceptual works (Edquist and Hommen 1999) indicate that the creation and persistence of technological diversity depend on actors learning from their neighbourhood and on network externalities. Yet, there is little empirical evidence about the characteristics of innovation projects that influence diversity. Van Rijnsoever et al. (2015) demonstrated that diversity created by an innovation project is related to the network position and actor composition of a project. Adding to insights from innovation systems (Hekkert et al. 2007), Van Rijnsoever et al. (2015) argue that it is also important to consider the structure of the network to make a technology successful in the long term. In European-funded nanotechnology projects, Pandza et al. (2011) found a significant degree of collaborative diversity in terms of international and institutional affiliation in the research network. This should be beneficial to technological diversity creation, but they did not test this implication empirically. In this paper we extend these current approaches by studying the influence of characteristics of EU-funded nanotechnology projects on the creation of technological diversity. In addition to actor diversity and the network of the project, we also include novel variables that have a plausible influence on diversity creation. The degree of multi-disciplinarity of the project and the knowledge base of the actors in the project can increase the chances that unique novel combinations are made, increasing technological diversity.

Further, to understand technological diversity we need to study the content of the documents. Scholars use pre-existing categories like patent classes or Web of Science categories to measure diversity (Leydesdorff et al. 2014; Rafols and Meyer 2010). Another approach to determine diversity is to look at the network of citations of the documents (Rafols and Meyer 2010). Yet, these approaches are mainly applicable to patent or publication data, and not to EU-projects. Hence, to study diversity, we apply topic modelling (Blei and Lafferty 2009) as a novel approach to categorize technological designs that are described in 69 EU-projects from 2014 to 2015. This method allows us to calculate diversity creation in an efficient manner.

We relate the change in technological diversity caused by a project to the independent variables mentioned above and show that the largest contribution to diversity comes from the multi-disciplinary nature of a project and the nanotechnology knowledge base of the actors in a project. Moreover, our results largely confirm the results by Van Rijnsoever et al. (2015). Policy makers can draw on these results when using subsidies as a tool to influence the level of diversity in a technological field.

2 Theory

In this section we describe the concept of creation of technological diversity by innovation projects as our dependent variable. Next, we formulate hypotheses about how this technological diversity is related to our independent variables: the multidisciplinary diversity of projects, the knowledge base of actors in a project, the number and diversity of actors in a project, the degree of clustering, and the geographical distance among them.

2.1 Technological diversity

Technological diversity refers to evenness in distribution of technological alternatives (Foray and Grübler 1990). These alternatives can be designs (Carlsson and Jacobsson 1997; Murmann and Frenken 2006; Van Rijnsoever et al. 2015), technical characteristics (Murmann and Frenken 2006; Saviotti and Metcalfe 1984; Van Rijnsoever et al. 2015) or numbers of different technological lineages represented in a technology (Gjesfjeld et al. 2016).

Technological diversity is a macro-level concept, as it applies to a set of technologies. The concept is related to the micro-level concepts of radical and incremental innovation, which are commonly used to assess specific innovations or the performance of firms. Radical innovations are new technologies and are often based on the combination of different technologies (Fleming 2001). As radical innovations are new, they increase technological diversity by definition. In contrast, incremental innovations can be achieved without novel information or the integration of different technologies (Wuyts et al. 2004), and can either decrease or increase technological diversity, depending on how abundant the incrementally improved technological design is among existing alternatives. Incremental innovations on rare technological designs increase diversity, while incremental innovations on common technological designs decrease diversity.

The evolutionary economics literature states that technological diversity contributes to more rapid technological change (Cohen and Klepper 1992). For this they give three reasons. First, diversity mitigates the possibility of an undesirable lock-in, reducing the likelihood that superior alternatives remain undiscovered or underdeveloped (Abernathy 1979; Cowan and Foray 1998; Frenken and Nuvolari 2004). Second, diversity increases the chances of making recombinant innovations, and hence of further developing the technology. Third, technological diversity means that there are more alternatives, which provides flexibility (Hannan and Freeman 1989; Stirling 2007). As a consequence, diversity increases the resilience of a technology against unexpected environmental changes, which are particularly common in emerging stages (Negro et al. 2008). These reasons influence the long-term success of an emerging technology and are important to consider when analysing the functionality of innovation systems (Van Rijnsoever et al. 2015). Technological diversity can thus be used to help assess the long-term viability of a technology.

However, having too much diversity also has drawbacks (Bassett-Jones 2005; Lettl et al. 2009). For instance, from a neoclassical economic perspective, the generation of diversity in products and production processes hampers the creation of economies of scale and the development of standards (Cohendet and Llerena 1997; Van den Bergh 2008). Moreover, too much diversity could have a negative influence on learning processes (routines) to exploit the technology (Foray 1997). In this context, more diversity requires time, effort, and co-ordination among actors (Leten et al. 2007) to resolve possible differences of perspective (Sirmon and Lane 2004). In addition, in a system with more diversity, more information needs to be codified, which also leads to increasing costs (Cohendet and Llerena 1997).

These advantages and disadvantages imply that there is an optimal level of diversity (Van den Bergh 2008). Despite there being no established parameters to obtain this optimal level, it is possible to analyse factors that influence the creation or otherwise of diversity (Van Rijnsoever et al. 2015). Innovation subsidies can be such a factor, but this is not enough to stimulate diversity creation. We argue that it is also necessary to consider how these subsidies are distributed.

2.2 Networks of innovation projects

In line with innovation system thinking, Van Rijnsoever et al. (2015) argue that the concept of technological diversity creation needs to be studied at a project level, rather than at the level of the individual actor because new technological alternatives are the output of innovation projects and not of the actors themselves. Actors collaborate in projects on new innovations. These collaborative projects can be seen as planned tasks that actors execute over a settled period of time to reach a desirable outcome. Actors contribute knowledge, resources and skills required for successful innovation to these projects, and share the risks of failure (Atkinson et al. 2006). Each project has specific characteristics that we study here. We discuss how a project’s degree of multi-disciplinarity, and the composition of the project in terms of the prior knowledge base of actors, the number of actors, the diversity of actor types, geographical distance between partners and network position influence technological diversity.

2.2.1 Degree of multi-disciplinarity of a project

The concept of discipline has been subject to much debate. For instance, it has been used with “inter”, “trans” and “cross” prefixes. Schummer (2004) makes a distinction between multi-disciplinary and interdisciplinary: multi-disciplinary refers to the involvement of many disciplines, whereas interdisciplinary refers to the interaction between disciplines. In the context of this paper, we follow Rafols and Meyer (2010) and define multi-disciplinarity as the spanning of a diversity of knowledge areas, which could be disciplines, technological fields or industrial sectors.

Many scholars have analysed multi-disciplinary projects from the perspective of collaboration between team members (Chin et al. 2002; Cummings 2005; Teasley and Wolinsky 2001; Van Rijnsoever and Hessels 2011), or on the skills required to manage these types of projects (Dewulf et al. 2007; König et al. 2013). In this research, multi-disciplinarity of a project relates to the different types of disciplines, technological fields or industrial sectors that are involved in the project. We make an explicit distinction between the multi-disciplinarity within projects, which is the diversity of disciplines, technological fields or industrial sectors in a project, and the diversity among projects.

To the best of our knowledge, no research has focused on how the degree of multi-disciplinarity within projects contributes to the technological diversity creation among projects. Yet, there are good reasons to suspect such a relation. On the one hand, a multi-disciplinary environment favours a greater diversity of idea generation and promotes creativity (Alves et al. 2007). Due to the juxtaposition of ideas, tools, and people from different domains (Cummings 2005), multi-disciplinarity within projects enhances recombinant innovation (Baber et al. 1995; Fernández-Ribas and Shapira 2009; Rhoten 2004; Schmickl and Kieser 2008). Hence, the chances that novel technological alternatives emerge increase.

On the other hand, one can argue that a high degree of multi-disciplinarity creates difficulties with the assimilation and integration of different ideas (Nooteboom 1999). Too much distance between disciplines can further lead to communication problems (Jeong and Lee 2015). However, it is to be expected that these potential difficulties are mitigated in the formation process of the consortium, and that partners have sufficient proximity to collaborate successfully (Boschma 2005). We thus expect that the degree of multi-disciplinarity of a project has a positive effect on the creation of technological diversity. This leads to our first hypothesis:

Hypothesis 1

The degree of multi-disciplinarity within a project is positively associated with the creation of technological diversity among projects.

2.2.2 Knowledge base

Technological diversity is associated with prior technological knowledge of inventors (Lazear 2004; Lettl et al. 2009). This prior knowledge can be measured as patents and publications since they are two quantitative proxies of knowledge production (Lee et al. 2013; Zucker et al. 2007). In particular patents are used to measure knowledge diversity (Carnabuci and Operti 2013).

Innovative outcomes are the result of the combination of existing knowledge and ideas (Dubiansky 2006). Hence, prior knowledge is likely to be positively associated with the creation of technological diversity. There are several reasons to support this idea. For instance, previous studies showed that R&D intensity and patents increase with the degree of technological diversification of firms (Garcia-Vega 2006).

Prior knowledge also strengthens the absorptive capacity of actors by increasing “the prospect that incoming information will relate to what is already known” (Cohen and Levinthal 1990, p. 131). Hence, a large knowledge base enhances the ability of an actor to make novel combinations. Moreover, a larger prior knowledge base demonstrates that actors have the experience and routines needed to combine knowledge (Kogut and Zander 1992). This effect is even stronger if the joint knowledge base of all project partners is larger, as it further increases the chances of making novel combinations.

Hypothesis 2

The size of the joint knowledge base of actors within a project is positively associated with the creation of technological diversity.

2.2.3 Number of actors

Number of actors refers to “the size of the project consortium in terms of distinct actors” (Van Rijnsoever et al. 2015, p. 1097). There are two positions in the literature regarding the number of actors in a project. The first and most common position is that when a large number of parties is involved, the processes of communication, agreement and problem solving require a complex integration of knowledge and synchronization (Gilsing et al. 2008; Hippel 2005; Jeong and Lee 2015; Lorenzoni and Lipparini 1999). The co-ordination required in this scenario demands conformity to standards or rules and thus leads to less novelty (Tatikonda and Rosenthal 2000) or diversity creation. Similar arguments can be found in the team size literature from social psychology (Curral et al. 2001; Kozlowski and Bell 2003).

The second position argues that larger project teams foster a more dynamic collaboration resulting in faster outcomes, shorter product life cycles and competitive advantages (Edmondson and Nembhard 2009). Larger project teams also provide a larger chance of recombining different types of knowledge, expertise and ideas, and thus innovation (Powell et al. 1996; Ruef 2002).

Yet, few studies explicitly investigate the influence of the number of organisations involved on the creation of technological diversity. In this context, existing evidence suggests that there is a negative association between the number of project partners and the creation of technological diversity (Van Rijnsoever et al. 2015). The argument is that intense collaborations could result in conformity of norms and conventions producing less novelty (Tatikonda and Rosenthal 2000). Keeping this in mind we propose the following hypothesis:

Hypothesis 3

The number of actors in a project has a negative association with the creation of technological diversity.

2.2.4 Diversity of actors

Innovation projects commonly involve different actor types that come from different institutional spheres (Hsu et al. 2011). In this paper we distinguish the following actor types: private for-profit entities (PRC), research centres (REC), higher or secondary education establishments (HES), public bodies (PUB) and others (OTH). According to the European Commission, PRC are small or medium-sized enterprises, excluding for-profit educational establishments; HES are legal entities recognised as such by their national education system; REC are non-profit actors whose main objective is carrying out research or technological development; PUB are legal entities established as public institutions and governed by public law; and OTH are any entities not falling into the previous four categories (European Commission 2015a).

Previous studies on nanotechnology innovation networks demonstrated that networks in this field are indeed characterized by a high degree of international and institutional diversity. Pandza et al. (2011) demonstrated that inter-institutional collaboration usually takes place between private industry and public research actors. Juanola et al. (2012) also showed that the development of nano-enabled biomedical devices requires the interaction between multiple actors such as universities, public research institutions, industries, and hospitals or health care institutions.

Arguments have been made for both a positive and a negative relation regarding the diversity of actors. A positive aspect of involving partners from different institutional spheres is that each actor type brings to the project unique knowledge and skills which can be recombined to form novel concepts and designs (Mo 2016), creating more technological diversity (Van Rijnsoever et al. 2015). Conversely, a project with less diversity in actor types could hamper diversity creation because collaborating with the same type of partners could lead to redundant information and collaborative inertia (Pandza et al. 2011; Rothaermel 2005).

However, having too much diversity among actor types requires the capacity to manage collaborative research and to take advantage of the knowledge from the network to achieve the goals of the project. Not all the actor types have these managerial capabilities (Pandza et al. 2011). Additionally, researchers from diverse types of organisations need to understand different points of view, people from different institutional backgrounds, cultures or even diverse technical language (Páez-Avilés et al. 2015). Again, this is something that can be taken into account when forming the project consortium. Hence, we expect a positive relation between diversity of actor types and technological diversity creation:

Hypothesis 4

The diversity of actor types in a project has a positive association with the creation of technological diversity.

2.2.5 Degree of clustering

As actors can participate in multiple projects, a network emerges in which projects are nodes and actors are ties between the nodes. Clustering is a property of a local network structure which refers to the likelihood that two actors that are connected to a third actor are also connected to one another (Eslami et al. 2013; Kaiser 2008). The more they are connected, the higher the degree of local clustering (Wasserman and Faust 1994).
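To make the definition of local clustering concrete, the following minimal Python sketch builds a small project network and computes a local clustering coefficient per project. It assumes that two projects are linked when they share at least one actor, and that clustering is the share of pairs of a project's neighbours that are themselves linked. The project and actor names are invented for illustration, and the exact operationalization used in the empirical analysis may differ.

```python
from itertools import combinations

def project_network(members):
    """Link two projects when they share at least one actor."""
    links = {p: set() for p in members}
    for a, b in combinations(members, 2):
        if members[a] & members[b]:
            links[a].add(b)
            links[b].add(a)
    return links

def local_clustering(links, node):
    """Share of pairs of neighbours of `node` that are linked to each other."""
    neighbours = links[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbours; 0 by convention
    closed = sum(1 for a, b in combinations(neighbours, 2) if b in links[a])
    return closed / (k * (k - 1) / 2)

# Hypothetical memberships: project -> set of participating actors.
members = {
    "P1": {"UnivA", "FirmB"},
    "P2": {"UnivA", "InstC"},
    "P3": {"FirmB", "InstC"},
    "P4": {"UnivA", "FirmD"},
    "P5": {"FirmD"},
}
links = project_network(members)
clustering = {p: local_clustering(links, p) for p in members}
```

In this toy network, P3 sits in a fully closed neighbourhood (clustering of 1), whereas P4 bridges towards P5, which is otherwise unconnected to P1 and P2, and therefore has a lower value (1/3).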

There is a debate about the effect of clustering on innovation. On the one hand, clustered networks are argued to be dense local neighbourhoods where actors trust each other, shared norms emerge, information is verified or diffused (Ahuja 2000; Powell et al. 1996; Schilling and Phelps 2007) and novel combinations are being made (Uzzi and Spiro 2005). However, too much clustering can have negative effects on innovation. Many of the ties are redundant, yet costly to maintain (Burt 2004). Sharing the same information sources also means that knowledge becomes more homogeneous. Moreover, the shared norms can hamper creativity. The opposite of clustering is the presence of “structural holes” in a network (Burt 2004). Structural holes occur when two actors that are connected to a focal partner are not connected to each other (Burt 2001, 2004). This means that the focal partner has access to two different sources of information, which allows for making novel combinations (Burt 2004) that add more to technological diversity (Van Rijnsoever et al. 2015). Hence, we hypothesize:

Hypothesis 5

The degree of clustering around a project is negatively associated with the creation of technological diversity.

2.2.6 Geographical distance

Geographical distance between actors in a project is another network dimension that influences knowledge diffusion (Marrocu et al. 2013). Based on the theory of regional innovation systems (Cooke 2001), it has been shown that higher concentration of “talents” in a region helps to connect and exchange knowledge resulting in enhanced innovations (Boschma 2005; Kakko and Inkinen 2009). Geographical proximity also enables knowledge spill-overs among neighbouring actors in related industries (Cooke 2008).

However, knowledge is bound to a geographical location, and the content of knowledge bases varies geographically (Boschma et al. 2014; Frenken and Hoekman 2014). Therefore the further the distance between actors, the more likely it is that their knowledge bases differ. This increases the possibility of making novel combinations and thus the creation of technological diversity.

In contrast, having international teams can also hamper diversity creation. Cultural differences lead to difficulty in transference or decoding of certain types of messages (Lundvall 1992). Hence, the costs of international teams can exceed the gains of diversity (Faber et al. 2016; Sirmon and Lane 2004; Williams and O’Reilly 1998), since resources can be diverted into smoothing cultural differences in the team, which comes at the expense of innovation and diversity creation.

In addition, Van Rijnsoever et al. (2015) tested this relationship and found that geographical distance does not contribute to the creation of technological diversity. A possible explanation for this is that their study only included Dutch innovation projects. There might have been too little geographical distance between partners for the knowledge bases to differ.

This inconclusive evidence strengthens the need for testing the relation of this variable with the creation of technological diversity. Hence, we hypothesize:

Hypothesis 6

The geographical distance of actors within projects is positively associated with the creation of technological diversity.

2.3 Methods

2.3.1 Sample selection and data collection

We tested our hypotheses on the case of nanotechnology as an important emerging technology and a promising KET. We focussed our research on the healthcare domain due to the great applicability and growth of nanotechnology in medicine (Gabellieri and Frima 2011), which has been highly prioritized over the past European Framework Programmes (European Commission 2010; Galsworthy et al. 2012).

For this purpose, we selected health-related projects from the Work Programme LEIT 2014–2015 of H2020 called “Nanotechnologies, Advanced Materials, Biotechnology and Advanced Manufacturing and Processing”, which fosters the technological cross-fertilization of nanotechnologies, biotechnologies and advanced manufacturing systems. Technological cross-fertilization, as coined by the European Commission, is the process of combining different KETs resulting in cross-cutting products or services with enhanced technological performance (Butter et al. 2014). Therefore, since the combination of different technologies is being highly prioritized, projects selected under this initiative are suitable for studying technological diversity. Based on these criteria, 69 projects were obtained from the Community Research and Development Information Centre (http://cordis.europa.eu). The projects belong to four types of calls (European Commission 2015b):

1. Nanotechnology and advanced materials for more effective healthcare: focusses on the potential of advanced materials and nanotechnologies to enable effective therapies and diagnosis. The major innovation challenge in this call is to achieve clinical applications from pre-clinical laboratory-scale proof-of-concepts.

2. Exploiting the cross-sector potential of nanotechnologies and advanced materials to drive competitiveness and sustainability: this call focusses on the break-through potential of nanotechnology and advanced materials on several applications and economic sectors by boosting European industry.

3. Bridging the gap between nanotechnology research and markets: this call addresses three key nano-enabled industrial value chains (lightweight multifunctional materials and sustainable composites, structured surfaces, and functional fluids) by taking them from the laboratory to the industrial scale.

4. Biotechnology-based industrial processes driving competitiveness and sustainability: this call focusses on delivering novel products that cannot be produced in the current industry on the basis of efficient biotechnological methods with less environmental impact.

Within these projects we identified 222 unique actors as co-ordinators and participants. These are the actors that we use for our actor and network based variables.

2.4 Variable measurements

2.4.1 Technological diversity

Diversity is a multidimensional concept. Stirling (1998) recognized variety, balance and disparity as its three dimensions. Variety represents the number of distinct elements or categories in the system. Balance refers to the evenness of the distribution of these elements. As Stirling argued, this dimension amounts to asking the following question: “how much of each type of thing do we have?” (Stirling 2007, p. 709). Disparity is related to the degree to which these elements are distinct from each other.

In this study we used the first two dimensions to calculate diversity, as there is no consensus on an appropriate measure for disparity (Rafols and Meyer 2010; Stirling 2007; Van Rijnsoever et al. 2015; Zhang et al. 2016).

To analyse the creation of technological diversity, the first step was to find all the technological alternatives present in the system of projects. In the case of publications and patents this is often done by looking at citation patterns or pre-existing categories (Boschma et al. 2014; Rafols and Meyer 2010; Yegros-Yegros et al. 2015). Yet these measures are not applicable to our project data, as we only have access to the abstracts. Hence, we used topic modelling techniques. Topic models are a family of probabilistic latent variable models used to evaluate the semantic structure of documents based on a hierarchical Bayesian method (Blei and Lafferty 2007, 2009), which can be used to identify topics among documents. The different technological alternatives are based on semantic clusters, which are usually identified as “topics”. A topic is thus a set of words that represents a theme. For example, the words “nano-capsule”, “delivery” and “enzyme” can be classified in one topic because these words are related to each other. The distribution of topics is the relation that links words in a vocabulary and their occurrence in documents (mixture of topics). In this study, documents are the abstracts of each project.

To obtain the distribution of topics, we used Latent Dirichlet Allocation (LDA), which is a common type of topic model that uses discrete probabilistic techniques for information retrieval, and text and data mining (Blei et al. 2003). LDA assumes that K number of topics have an association with a collection of documents, and estimates for each document the probability that it belongs to a topic (Crossno et al. 2011; Grün and Hornik 2011; Zhang et al. 2016).
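As a rough illustration of what an LDA estimation produces, the following self-contained Python sketch implements a small collapsed Gibbs sampler, one common way of estimating LDA, and returns a topic probability vector for each document. It is not the implementation used in the analysis, which relied on R packages. The hyperparameters `alpha` and `beta`, the number of iterations, the random seed and the four toy documents are all assumptions chosen for illustration.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA; returns per-document topic probabilities."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    V = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]   # tokens of doc d assigned to topic k
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized probability of each topic given all other tokens.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta) / (topic_total[t] + V * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed estimate of each document's topic mixture.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]

# Toy "abstracts", already tokenized (invented for illustration).
docs = [
    "nanocapsule delivery enzyme drug delivery nanocapsule".split(),
    "drug delivery enzyme nanocapsule drug".split(),
    "scaffold tissue regeneration scaffold tissue".split(),
    "tissue regeneration scaffold regeneration".split(),
]
theta = lda_gibbs(docs, n_topics=2)
dominant = [max(range(2), key=row.__getitem__) for row in theta]
```

Each row of `theta` sums to one and can be read as the estimated probability that the document belongs to each topic; `dominant` picks the most probable topic per document.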

For the LDA analysis we used the lda package (Chang 2015; Ponweiser 2012) in R. The first step was to pre-process the documents in order to avoid possible “noise”. This was done by cleaning the text corpus (e.g. removing punctuation, stop words, numbers, etc.) and stemming or merging words equivalent in meaning. For that purpose we used the tm package (Feinerer 2015). Second, an appropriate number of topics needed to be selected for the LDA analysis. Choosing too many topics will result in the “over-clustering” of a corpus into many small, highly-similar topics, while selecting too few can produce overly broad results (Zhao et al. 2015). For the estimation of the optimal number of topics, we used the ldatuning package (Nikita 2015). This package estimates the optimal number of topics based on a Bayesian selection model which computes the likelihood of a possible parameter setting by assigning all words of the corpus w over a number of topics T, expressed as P(w|T) (Griffiths and Steyvers 2004; Steyvers et al. 2006). The selected number of topics is therefore the one whose model leads to the highest posterior probability. Figure 1 plots the posterior probability against the number of topics. The graph suggests that the data are best described by a model with 33 topics.

Fig. 1 Estimation of the optimal number of topics. Maximum likelihood distribution of all words over a number of topics
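The pre-processing step that precedes the LDA estimation can be sketched in a few lines of Python. This is only an illustration of the kind of cleaning described (lower-casing, removing punctuation and numbers, dropping stop words, stemming), not a reproduction of the tm workflow. The stop-word list is deliberately tiny and `crude_stem` is a hypothetical stand-in for a real stemmer.

```python
import re

# Tiny illustrative stop-word list; real lists are far longer.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "on", "the", "to", "with"}

def crude_stem(word):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lower-case, drop punctuation and numbers, remove stop words, then stem."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [crude_stem(w) for w in text.split() if w not in STOP_WORDS]

tokens = preprocess("Nano-capsules for the targeted delivery of 3 enzymes.")
# tokens == ["nano", "capsul", "targeted", "delivery", "enzym"]
```

Stemming maps plural and singular forms to the same token, so that, for example, "enzyme" and "enzymes" are not treated as different words.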

To visualize the distribution of topics per project, we developed a level plot by using the lattice package in R (Sarkar 2016). Figure 2 shows the LDA graph, where the x axis shows the projects and the y axis the 33 topics found in the whole system of projects. The distribution of each topic in each project is defined by the intensity of colours: more intense blue colours show few topics distributed in a project (so the colour is concentrated only in one point), while light red colours show a distribution of more than one topic in a project. To confirm the validity of the result, the lead author, who is an expert on nanotechnology, verified that the topics assigned to the documents made sense. As can be seen, most projects were clearly on just one topic. The most common topics were related to scaffolds (see Footnote 1), nano-biosensors, tissue regeneration, wound dressing, and drug delivery, to give just a few examples.

Fig. 2 Topic distribution per project

After estimating the most suitable number of topics and the distribution of each topic in each project, we calculated how much a project i influences technological diversity in the population of N projects (Van Rijnsoever et al. 2015). For that purpose we used Shannon’s entropy statistic measure (Shannon 1948). This variable measures the randomness of a distribution or the uncertainty associated with a random variable, and takes into account variety and balance. Entropy is calculated as follows:

$$ H = - \sum_{s} p_{s} \log_{2} p_{s} $$
(1)

where \( H \) is the entropy, and \( p_{s} \) represents the proportion of projects with a specific design or topic \( s \). The diversity that a project \( i \) creates in the system is obtained through the difference between the entropy of the population of projects (\( H_{0} \)) and that of a hypothetical population where the specific project does not exist (\( H_{-1} \)):

$$ dH_{i} = H_{0} - H_{-1} $$
(2)

\( H_{0} \) was obtained through Eq. (1) and \( H_{-1} \) was calculated using the following formula:

$$ H_{-1} = - \left( p_{si} \log_{2} p_{si} + \sum_{sj} p_{sj} \log_{2} p_{sj} \right) $$
(3)

where \( p_{si} \) represents the proportion of projects with the same design as project \( i \) and \( p_{sj} \) is the proportion of projects with any other design. Both proportions are calculated for the hypothetical population in which the focal project does not exist. This population contains \( N - 1 \) projects, and there is one project fewer with the focal design. With \( n_{si} \) and \( n_{sj} \) denoting the number of projects in the original population with the focal design and with any other design, respectively, this is represented by:

$$ p_{si} = \frac{{n_{si} - 1}}{N - 1} $$
(4)
$$ p_{sj} = \frac{{n_{sj} }}{N - 1} $$
(5)

A positive value of dH indicates that diversity is created. A negative value indicates reduction of diversity in the system of projects. These calculations revealed that there were four different levels of diversity creation.
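As an illustration of Eqs. (1) to (5), the following Python sketch computes the contribution of a single project to the entropy of a small population. It assumes that every project has been assigned exactly one design label (for example its dominant topic); the labels and function names are hypothetical and serve only as an example.

```python
from collections import Counter
from math import log2


def entropy(counts):
    """Shannon entropy (Eq. 1) of a population described by design counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def diversity_contribution(designs, i):
    """dH_i (Eq. 2): entropy with project i minus entropy without it."""
    h0 = entropy(Counter(designs).values())
    # Hypothetical population of N - 1 projects, with the focal project removed
    # (Eqs. 3 to 5).
    without_i = designs[:i] + designs[i + 1:]
    h_minus_1 = entropy(Counter(without_i).values())
    return h0 - h_minus_1


# Hypothetical population: three projects share a design, one is unique.
designs = ["scaffold", "scaffold", "scaffold", "biosensor"]
print(diversity_contribution(designs, 3))  # unique design: positive dH
print(diversity_contribution(designs, 0))  # common design: negative dH
```

In this toy population, removing the only project with a unique design leaves a fully homogeneous population, so that project has a positive contribution, whereas each project with the common design has a negative one.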

2.4.2 Degree of multi-disciplinarity

In line with suggestions by Rafols and Meyer (2010) and Yegros-Yegros et al. (2015), we measured the degree of multi-disciplinarity by the diversity of topics. Instead of looking at how often a combination of topics occurs at the system level, we calculated the diversity of topics within a project, using the probabilities from the LDA and Eq. (1).
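A minimal sketch of this within-project measure, assuming that the LDA output for a project is a vector of topic probabilities that sums to one (the two vectors below are hypothetical):

```python
from math import log2


def topic_entropy(topic_probs):
    """Eq. (1) applied to the topic probabilities of a single project."""
    return -sum(p * log2(p) for p in topic_probs if p > 0)


focused = [1.0, 0.0, 0.0, 0.0]    # project concentrated on one topic
mixed = [0.25, 0.25, 0.25, 0.25]  # project spread evenly over four topics

print(topic_entropy(focused))
print(topic_entropy(mixed))
```

A project concentrated on one topic has zero entropy, while a project spread evenly over four topics reaches the maximum for four topics, which is 2 bits.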

2.4.3 Knowledge base

We used the number of patents in a project as an indicator of the size of the knowledge base. Patents are a very homogeneous measure of technological novelty (Breschi et al. 2000). They reflect creativity (Juanola-Feliu 2009) and the ability to transfer scientific results into technological applications (Hullmann 2006).

Since we are analysing nano-related projects, we used nano-related patents as the indicator of the size of the prior knowledge base. Nano-related patents of each actor were retrieved from the European Patent Office (EPO) Espacenet website for the period from 1980 to 2015. This period was selected because 1980 was the starting year of the “boom” of nanotechnology. However, the first patent retrieved in our database was from 1994.

We used the B82 code for nanotechnology, standardized by the International Patent Classification (IPC). This code is widely used to retrieve nano-related patents (Baglieri et al. 2014; Dang et al. 2010; Kumar and Desai 2014; Leitch et al. 2012; Ozcan and Islam 2014; Porter and Youtie 2008; Scheu et al. 2006). Even though this is a new classification for nanotechnology-related patents, nano-related inventions granted in the 1980s that were not classified as such were re-classified by patent authorities (Ozcan and Islam 2014). Moreover, the code Y01N was replaced by B82Y in 2011 in order to have a uniform nanotechnology-related patent classification (European Patent Office 2013).

We selected only European patents, because this increases the chance that the knowledge captured by the patent is present in the project. It is less likely that individuals in a project will be familiar with knowledge captured by a patent that is registered only in the USA. In order to select the normalized name of each assignee, the AcclaimIP Patent Search and Analysis Software was used in parallel, checking the standardized names of the actors in both sources for thoroughness. As the number of patents has a skewed distribution, we used its natural logarithm. This also makes the realistic assumption that each extra level of the variable results in a decrease in marginal returns for diversity creation.

2.4.4 Number of actors

This variable was obtained by simply counting the number of actors per project. This variable had a skewed distribution; therefore we used its natural logarithm. The transformation also makes the realistic assumption that each extra level of the variable results in a decrease in marginal returns for diversity creation.

2.4.5 Diversity of actors

Based on the standard classification of actors from H2020 (see Sect. 2.2.4), we calculated the diversity in actor types per project, using the Shannon entropy mentioned in Eq. (1).

2.4.6 Degree of clustering

The degree of clustering was obtained by calculating the local clustering coefficient (CC) of a project (Wasserman and Faust 1994). The CC is a quantitative way to study the structure of a network (Ravasz et al. 2002). It represents the probability that two random neighbours of an actor from a project are connected. It measures the extent of interconnectivity between the neighbours (Moreira et al. 2006) and is represented as:

$$ CC_{i} = \frac{{2L_{i} }}{{D_{i} \left( {D_{i} - 1} \right)}} $$
(6)

where \( i \) is the focal project or node, \( D_{i} \) is the number of neighbour projects that have an actor in common with \( i \), and \( L_{i} \) is the number of links that connect the \( D_{i} \) neighbour projects to each other.
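Equation (6) can be illustrated with a short Python sketch. The network below is hypothetical: each project maps to the set of projects with which it shares at least one actor. Projects with fewer than two neighbours are given a value of 0, which is an assumption of this sketch.

```python
from itertools import combinations


def local_clustering(neighbours, i):
    """Local clustering coefficient CC_i (Eq. 6) of project i.

    neighbours maps each project to the set of projects that share
    at least one actor with it.
    """
    nbrs = neighbours[i]
    d = len(nbrs)
    if d < 2:
        # Isolates and projects with a single neighbour have no neighbour pairs.
        return 0.0
    # L_i: number of links among the neighbours of i.
    links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbours[a])
    return 2 * links / (d * (d - 1))


# Hypothetical project network.
network = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2"},
    "P4": {"P1"},
    "P5": set(),  # isolate
}
for project in network:
    print(project, local_clustering(network, project))
```

Here P1 has three neighbours of which only one pair is connected, giving a coefficient of 1/3, while the isolate P5 and the single-neighbour project P4 both receive 0.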

Van Rijnsoever et al. (2015) indicate the need to distinguish projects that are not connected to other projects (isolates) from projects that are connected, but whose neighbours are unconnected, since both receive a value of 0. Hence, we created an extra dummy variable for isolates. The number of actors is also correlated by definition with the clustering coefficient, because clustering is conditional on having at least two ties. To separate out the effects of isolates and number of ties, we regressed the clustering coefficient on both. The residuals of this regression form an unconfounded measure for clustering, and this was used as an independent variable in our models.

2.4.7 Geographical distance

The geographical distance variable was obtained by calculating the average distance in kilometres between the coordinates (latitude/longitude) of the actors in a project and a calculated geographical centre. The geographical centre was retrieved using the geosphere package (Hijmans et al. 2015), and the geographical distances were calculated using the fossil package (Vavrek 2011), both for R. This variable had a skewed distribution, and we used its natural logarithm. The transformation also makes the realistic assumption that each extra level of the variable results in a decrease in marginal returns for diversity creation.
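The construction of this variable can be sketched in Python. This is not the geosphere or fossil implementation: the sketch assumes that the geographical centre is the direction of the mean of the actors' unit vectors on a sphere, that distances are great-circle (haversine) distances, and that the Earth radius is 6371 km. The coordinates are hypothetical.

```python
from math import asin, atan2, cos, degrees, log, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    p1, p2 = radians(lat1), radians(lat2)
    a = sin((p2 - p1) / 2) ** 2 + cos(p1) * cos(p2) * sin(radians(lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def geographic_centre(coords):
    """Centre of (lat, lon) points via the mean of their 3D unit vectors."""
    x = y = z = 0.0
    for lat, lon in coords:
        la, lo = radians(lat), radians(lon)
        x += cos(la) * cos(lo)
        y += cos(la) * sin(lo)
        z += sin(la)
    return degrees(atan2(z, sqrt(x * x + y * y))), degrees(atan2(y, x))


def mean_distance_to_centre(coords):
    """Average distance in kilometres between the actors and their centre."""
    clat, clon = geographic_centre(coords)
    return sum(haversine_km(lat, lon, clat, clon) for lat, lon in coords) / len(coords)


# Hypothetical project with two actors on the equator, 20 degrees apart.
distance = mean_distance_to_centre([(0.0, -10.0), (0.0, 10.0)])
print(distance, log(distance))  # raw value and its natural logarithm
```

The natural logarithm in the last line corresponds to the transformation applied to the skewed variable.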

2.5 Analysis

As there were only four levels of diversity creation, it would be inappropriate to fit a linear regression model, as this assumes that the dependent variable is continuous. Four values are insufficient to meet this assumption. Hence, we tested our hypotheses using a cumulative (ordinal) logit regression. This model is more robust against non-normal distributions or outliers than ordinary least squares regression.
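To illustrate how a cumulative logit model maps predictors to four ordered levels, the sketch below computes the category probabilities for a given linear predictor. It assumes the common parameterization logit P(Y ≤ k) = θ_k − η, where θ_k are ordered cut-points and η is the linear predictor. The cut-points and η values are hypothetical and are not estimates from this study.

```python
from math import exp


def cumulative_logit_probs(thresholds, eta):
    """Category probabilities under logit P(Y <= k) = theta_k - eta.

    thresholds: ordered cut-points theta_1 < ... < theta_{K-1}
    eta: linear predictor (sum of coefficients times predictors)
    """
    def logistic(v):
        return 1.0 / (1.0 + exp(-v))

    # Cumulative probabilities P(Y <= k); the last category closes at 1.
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


# Three hypothetical cut-points give four ordered levels.
cut_points = [-1.0, 0.0, 1.0]
print(cumulative_logit_probs(cut_points, eta=0.0))
print(cumulative_logit_probs(cut_points, eta=1.0))
```

Under this parameterization, a larger linear predictor shifts probability mass towards the higher levels of the dependent variable, which is how a positive coefficient is read.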

The change in entropy caused by a project was our dependent variable. We added the independent variables as predictors. Moreover, we added the type of call as a categorical control variable with four levels. This might be seen as a limitation of this study, since the regression model only tested the independent variables and the names of the calls, and no other information (such as size of the funding, length of the funding, etc.) was controlled for. However, these facts are already captured by the name of the programs as specified in the call type.

Two projects were outliers with regard to the dependent variable and the degree of multi-disciplinarity. As this violates the assumption that there are no outliers, we removed these two projects from the final model we present below. However, we note that the models with and without the projects gave very similar results.

3 Results

Table 1 displays the descriptive statistics and the correlation matrix. As can be seen, the variable number of actors is strongly correlated with the geographical distance, with a correlation of 0.72. This correlation makes sense since more actors increase the probability of establishing large geographical distances between them.

Table 1 Descriptive statistics and correlation matrix

Table 2 shows the results of the cumulative logit model. The McFadden \( R^{2} \) of the model is 0.11, which is an acceptable fit. The variance inflation factors are all below 10, except for the number of partners, which was at 13. We decided to leave this variable in, as it controls for other variables that are dependent on project size. Yet, we need to interpret the estimator of this variable with caution.

Table 2 Results of the cumulative logit model

Table 2 shows that the degree of multi-disciplinarity has a strong and significant positive association with the creation of diversity. This supports the idea that a multidisciplinary environment generates greater diversity and supports Hypothesis 1. Regarding the knowledge base variable, we observe that the number of nano-related patents also has a significant positive association with the creation of technological diversity, which supports Hypothesis 2. In this context, the effects of knowledge creation and diffusion measured by patents contribute to explaining technological diversity creation. Moreover, it demonstrates that knowledge in nanotechnology is important for the creation of new alternatives in the system and this ratifies the transversal nature of nanotechnologies.

In contrast, there is a negative association between the number of actors and the creation of technological diversity, but this is only significant at the 5% level. Moreover, the variance inflation factor of this variable is rather high. Yet, it ratifies previous literature that argues that when more people are involved, a project is more difficult to manage and more conflicts between them can emerge (Tatikonda and Rosenthal 2000; Van Rijnsoever et al. 2015). Overall, we interpret this finding as partial support for Hypothesis 3.

The diversity of actors is not significantly related to our dependent variable, which does not support Hypothesis 4, and casts doubt on the claims made by Van Rijnsoever et al. (2015). A possible explanation is that in the context of nanotechnology, the content of technological knowledge is independent of the type of actors. In other words, the content of the theories and knowledge about nanotechnology is the same for each actor type, regardless of its institutional background. This result also means that the different skills or points of view that emerge from the different nature of the actors do not contribute to the creation of technological diversity.

The variable degree of clustering is significantly and negatively associated with the creation of technological diversity, which is in line with the structural holes arguments and supports Hypothesis 5. The structural holes argument is related to the degree of clustering. A lower degree of clustering means that there are more structural holes (since the degree of clustering is the probability that two nodes that are connected to a third one, are also connected to each other). Hence, if a project is less embedded, it adds more to diversity.

Finally, our model shows that there is a small positive effect of geographical distance within projects that is significant at the 10% level. This supports Hypothesis 6, corroborates the results obtained by Van Rijnsoever et al. (2015), and is in line with the argument that the knowledge base is geographically bound.

4 Discussion and conclusions

4.1 Discussion

This research suffers from a number of limitations. In the first place, the sample of projects was relatively small. We only took European nanotechnology healthcare projects into account as this makes projects more comparable. However, it also limits the generalizability of our results. It also resulted in limited levels of variation in the dependent variable, which required us to resort to a more conservative cumulative logit model. Future research could focus not only on the healthcare domain, but also on other industrial fields where nanotechnology is applied, such as environmental, energy, textile, cosmetics, construction, or communication applications, or on other technologies that are not related to nano. Although the number of topics covered was quite broad, the European focus of the projects also implies that we possibly missed regional initiatives or priorities that can result in different national foci for application areas. This could explain regional differences in knowledge bases.

A second limitation is related to the patent data. It is important to consider that not all innovations are patented, especially in basic science research (Garcia-Vega 2006) and neither patents nor publications databases always provide complete information about the names or affiliation of researchers (Bengisu and Nekhili 2006). A possible solution for future research is to take into account previous participation in funded programmes to further validate the robustness of the prior knowledge base of actors.

4.2 Conclusion

In this paper we explained the creation of technological diversity using the characteristics of innovation projects. We tested our hypotheses on data from EU-funded nanotechnology projects belonging to H2020 calls that prioritize the cross-fertilization of emerging technologies, and applied LDA as a novel method to study the contents of the innovation projects.

Our main addition to the literature is that the degree of multi-disciplinarity of a project and the size of the joint knowledge base of project partners are strongly predictive of diversity creation. In this context we find support for the hypothesis that different disciplines and a larger and broader knowledge base increase the chances of recombinant innovations (Baber et al. 1995; Cohen and Levinthal 1990; Fernández-Ribas and Shapira 2009; Rhoten 2004; Schmickl and Kieser 2008).

Second, the results mostly support earlier findings by Van Rijnsoever et al. (2015), and theoretical expectations with regard to the number of actors in the project (Tatikonda and Rosenthal 2000; Van Rijnsoever et al. 2015), the clustering coefficient (Burt 2004), and the effect of geographical distance (Boschma et al. 2014; Frenken and Hoekman 2014). However, we did not find support for the claim that actor diversity adds to technological diversity creation. This negative finding could be the result of contextual differences between nanotechnology projects and bio-gasification projects. Innovation system research argues that building networks is important for the success of an emerging technology (Hekkert et al. 2007). Our results verify the claim that it is also important to consider what the network should look like.

Finally, we also make a methodological contribution. The LDA method (Blei et al. 2003) allowed us to understand the topics of the projects in an efficient and reliable manner. It allowed us to calculate diversity and the degree of multi-disciplinarity, and can also aid future researchers with understanding the topics of innovation projects, in addition to publications or patents (Du et al. 2012).

These contributions allow us to further develop a theory on the creation of technological diversity, and hence to improve the possibilities of preventing technological lock-in, increasing the chances of recombinant innovation, and increasing the resilience of the technology (Van den Bergh 2008).

Our results can serve as guidelines to policy makers, especially at the EU-level, for fostering the success of emerging technologies on the basis of their cross-fertilization and technology diversity creation. In order to encourage creation of technological diversity, emphasis should be placed on subsidizing: (1) projects involving or developing multiple disciplines, (2) projects with actors that show a strong background in nano knowledge, (3) projects with partners from different geographical regions, and (4) projects with a limited number of partners that are not too closely connected with each other. The first three are already explicit or implicit criteria in Horizon 2020. Yet these projects often involve large consortia. Our results suggest that it is better for diversity if these consortia are smaller. Moreover, in some instances, partners are involved in multiple projects. Our results show that these cases should be handled with care, as this can decrease technological diversity.