Challenges for scientometric indicators: data demining, knowledge-flow measurements and diversity issues

Scientometrics and bibliometrics are being forced to respond to a strong increase in demand (e.g. research assessment practices, economics of science & technology, and innovation) and new forms of supply (e.g. availability of publication sources and statistics, Internet developments and online tools). This situation results in contrasting perspectives: on the one hand, it can favour spectacular 'hit-parades' and some veneration of numbers; on the other hand, it paves the way for more cautious and sophisticated evaluation systems, rooted in a better understanding of the dynamics of science. This paper describes some of the challenges for bibliometric indicators (data 'demining', knowledge-flow measurements and diversity issues) underlying, among other applications, reliable evaluation procedures. Responding to these challenges is necessary to promote a better use of scientometrics, although there are no guarantees against misuse in decision-making contexts. A few open issues are outlined on the dynamics of science, challenges of the web age, and interactions between scientometrics and scientific communities.


INTRODUCTION
The aim of scientometrics is to provide quantitative characterizations of scientific activity. Because of the particular importance of publication in scientific communities, it largely overlaps with bibliometrics, which is the quantitative analysis of media in any written form. In addition to disciplines of measurement (informetrics/data-mining, statistics and mathematical modelling), scientometrics has strong connections with economics and sociology of science as well as science policy. The 1970s saw the development of scientometrics as an operational activity, a response to the pressing demand for the 'measuring of science', especially in Russia and the USA. Amongst the founding fathers of the discipline were de Solla Price (1963), Garfield (1955) and Narin (1976) in the US, Nalimov & Mulczenko (1969) in Russia and Braun & Bujdoso (1975) in Hungary. Applying bibliometric methods to their own field, scientometricians confirm that their own domain, standing at an intersection of disciplines, evolved as a heterogeneous field, both in topics and practices (Schoepflin & Glänzel 2001) and intellectual repertoire (Peritz & Bar-Ilan 2002).
A certain tension has always existed between academic/cognitive scientometrics and political/practical scientometrics, the latter of which has been described as 'a hybrid of social science and bureaucratic expertise' (Wouters 2006, p. 21). Often, these aspects can hardly be disentangled. Scientometrics has to correctly represent the multiple facets of scientific activity in models of use to science policy makers, using quantitative tools with sound properties.
As with all decision-support disciplines, scientometrics must resist the temptation of 'l'art pour l'art'. A superfluous sophistication is most likely to generate artefacts and black-box effects, but the fascination of 'magic numbers', chimerical syntheses in a unique index of a complex and multidimensional reality, can also be misleading. Even if precautions are taken and methodologies are made explicit, end-users (e.g. managers) tend to apply their own rules to bibliometric indicators, as within any decision-making process, and scientometricians have no control over this.

CONTEXT
The bibliometric component of scientometrics is a mirror of science: it uses the published works of scientists to answer the questions of policy makers, stakeholders, scientists themselves, and social scientists taking research and science as a research object. Scientific publication is central to the activity of scientific communities and is moreover made available on a large scale by modern databases (Garfield's Science Citation Index, now Web of Science [WoS], in the first place) and the Internet.
Each publication is both the result and the imprint of scientific networks, primarily social networks among scientists or institutions. As different as they may be in their theoretical positions, Merton (1942), Bourdieu (1975) and Latour & Woolgar (1979) stressed the interactive character of scientific activity. Coauthorship, citation, and hyperlinks are quite explicit elements of networks appearing in bibliographic sources. Others are implicit and can be revealed by analysis of textual elements, from title to full text, or geographical information. All of these networks, with further information on dates and journals, offer a wealth of material for many types of analysis of scientific activity and knowledge circulation.
Data on external resources, such as human resources and funding flows, are necessary to complete the landscape. Thus, counting outputs (papers, cites) is the very first stage of scientometrics, but assessment of scientific productivity is extremely difficult. We will describe this issue in more detail in the next section.
Webs of all kinds, explicitly or implicitly created by scientists, are observed by quantitative methods borrowed from informetrics, statistics, data analysis and data-mining, and network theory. The scientometric mirror is sometimes distorting: mathematical resources are both powerful and rich in artefacts. This arsenal of methods may be applied to any of the abovementioned networks. For example, if we wish to map scientific themes by grouping articles, we can design (1) a topical map, where proximity of articles is measured by the use of the same words, (2) a paradigmatic map, where the proximity is measured by the use of the same cited references, expressing the intellectual base of the article, or (3) an authorship-based map, where proximity is measured by the presence of the same author(s). We will return to mapping and delineation of scientific fields in the subsection 'Delineation and mapping of scientific areas'.
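The three kinds of map differ only in the elements that two articles are required to share. As a minimal sketch, assuming that proximity is measured by a Jaccard index (one possible choice among many, not prescribed here) and using hypothetical toy records whose field names are purely illustrative:

```python
def jaccard(a, b):
    """Share of common elements: |a & b| / |a | b| (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical toy records; the field names are illustrative only.
article_a = {
    "words": {"citation", "network", "impact"},
    "refs": {"Garfield 1955", "de Solla Price 1963"},
    "authors": {"Smith"},
}
article_b = {
    "words": {"citation", "impact", "journal"},
    "refs": {"Garfield 1955", "Narin 1976"},
    "authors": {"Smith", "Jones"},
}

def proximities(x, y):
    """Proximity of two articles on each of the three networks."""
    return {
        "topical": jaccard(x["words"], y["words"]),       # shared words
        "paradigmatic": jaccard(x["refs"], y["refs"]),    # shared cited references
        "authorship": jaccard(x["authors"], y["authors"]),  # shared authors
    }

result = proximities(article_a, article_b)
print(result)
```

The same pair of articles can thus be close on one network and distant on another, which is why the three maps of a given area need not coincide.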
There are 2 incentives for the investigation of networks. The first is of an academic/cognitive nature. Born from social and information scientists' interest in understanding communities' activity and information circulation, the quantitative tools used by scientometricians may in turn feed these disciplines. Scientometrics allows varied ways to describe knowledge circulation networks and also the operationalization of hypotheses about scientific communities, generated by neighbouring intellectual universes (sociology, economics or even physics). One example is the rising interest of knowledge economics in scientometric networks, linking academia to economic and social actors and to political institutions. Data such as coauthorship, citations, hyperlinks, and migrations are assumed to describe circulation of information, knowledge and academic staff.
The second incentive is the demand for evaluation. The quest of the scientometrician is 2-fold: promoting the most robust, reliable and acceptable methods and, whenever possible, translating the strategic questions of policy makers, stakeholders, and scientists into meaningful measures.
A lasting issue in evaluation is the apparent competition of bibliometrics with peer-review, a topic covered by a huge literature with examples in this Theme Section (TS). As our focus here is on other aspects, we will not labour this particular point, and will only stress that scientometrics is not a deus ex machina, coming from heaven or hell. It reflects the peer-review process, albeit collectively and implicitly, that leads to the writing of a paper, its acceptance by a journal, and its further citation. As shown by Wouters (1997), peers are present at every stage of the research and publication cycle: getting funds, attracting co-workers and co-working, discussing and submitting manuscripts, and getting them read and cited. Peers are the hub of this wheel and bibliometrics is a mirror of peer-review. Looking at this the other way around: peer-review of articles, or of dossiers of scientists applying for funding or tenure, can hardly ignore bibliometric elements, such as records of publication, impact factor of journals (quite partial and dangerous when used alone; Seglen 1997), and real impact, especially when these elements are easily accessible through Thomson databases, Scopus or Google Scholar.
Although some convergence is expected (Rinia et al. 1998, Aksnes & Taxt 2004, Harnad 2008), bibliometrics and peer-review have their own strengths and weaknesses. Scientometrics dilutes many biases present in ad hoc 'peer-review', such as individual specialization, personal interests and various pitfalls of group behaviour that can jeopardize a jury's efficiency.
But scientometrics is not completely free of such biases: specialization and interests also shape collaboration and citation links, for example. Ad hoc peer-review may perform better at the individual scientist level, putting forward more extensive and more synthetic appraisals. It takes into account such dimensions as activity in non-academic outputs or managerial aspects. Scientometrics, in the spirit of its founder (D. J. de Solla Price, physicist and historian), was a statistical approach not meant for individual evaluation. The individual detour is helpful for institutional evaluation, but assessment of scientists solely based on conventional quantitative tools may be misleading, for example when publication behavior is atypical.
The scientometrician is sometimes compared to a knight, needing armour in evaluation exercises! Inappropriate usage of mirrors and webs can easily turn the knight into a wizard. A few challenges are encountered on the scientometrician's quest: poor quality of data, necessitating 'data-demining'; assessing the position of actors in labyrinths of knowledge flows; and avoiding the mirage of universality that leads to indicators uncorrected for peculiarities of disciplines or research practices. All of these elements matter when evaluation indicators are at stake. In conclusion, some perspectives are outlined on the dynamics of science, challenges of the web age, and interactions between scientometrics and scientific communities.

DATA MINING AND DATA 'DEMINING'
Bibliometrics mainly works with empirical models and data and should not overlook where data come from. To put it bluntly, in data-mines, there are mines in the data. This daily burden of the bibliometrician is important to keep in mind. Data illusion takes on many forms.

'In databases we trust'
Good proxies should represent the activity being described. To restrict ourselves to the creation of scientific knowledge, collected data have to represent publication modes and research topics of the observed actors. Yet the border of the standard database used for international benchmarking (i.e. WoS), in contrast with its core of highly visible journals, is largely arbitrary: the tail of low-cited and low-internationalized journals forms a sub-population marked by national biases, for example for Russia. Adding many 'national-oriented' and low-cited journals from emerging countries, a current trend, may produce odd effects: production figures of corresponding countries will be significantly increased, but as added journals are hardly cited within the database, impact figures (citation per paper) will decline. It is up to the scientometricians to select subsets for international benchmarking, for example by removing journals of 'low-impact' and/or 'national orientation' (Zitt et al. 2003).
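The opposite movement of production and impact figures follows from simple arithmetic. The sketch below uses invented figures for a hypothetical country; only the direction of the effect matters, not the magnitudes.

```python
def impact(papers, citations):
    """Impact in the sense used here: citations per paper."""
    return citations / papers

# Invented figures for a hypothetical country (assumption, for illustration only).
core_papers, core_citations = 1000, 5000   # papers in journals already covered
added_papers, added_citations = 200, 100   # papers in newly added, hardly cited journals

impact_before = impact(core_papers, core_citations)
impact_after = impact(core_papers + added_papers, core_citations + added_citations)
production_growth = added_papers / core_papers

print(f"production: +{production_growth:.0%}")
print(f"impact: {impact_before:.2f} -> {impact_after:.2f}")
```

With these figures, production rises by a fifth while citations per paper fall from 5.00 to 4.25, although nothing has changed in the underlying research activity.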
Inter-disciplinary balances are not guaranteed either. A lasting issue is the limited coverage of particular fields by reference databases, especially when the dominant mode of production is not the journal article and the national traditions are strong, e.g. favoring non-English languages and/or national publishing. The coverage issues in the quite heterogeneous domain of social sciences and humanities have been studied for example by Hicks (2004) and Nederhof (2006). A similar issue is found in 'fast' disciplines such as computer science, where peer-reviewed conferences are a major form of communication. Other types of documents are found in the disciplines of law, for example. The fact is that non-article items are less covered by standard databases, making the bibliometric approach more difficult. The development of new sources (Scopus on a classical model, Google Scholar on web sources) may change the situation and also the national balances: the difficult issue of the 'US bias' in classical databases remains controversial (see Luwel 1999, Narvaez-Berthelemot & Russell 2001).

'In names we trust'
Correct actor positioning needs correct actor identification. The lack of standardisation of individual data for authors, and especially for institutional affiliations, is often due to the lack of rigor of authors themselves in the way they write their address, but other problems (e.g. transliteration of names; see Cheung 2008, this TS), variable across databases, are frequent. Bibliographic databases were not primarily meant for bibliometrics, but to speed dissemination of scientific knowledge among scientists interested mostly in getting access to relevant publications. For a long time, nobody but bibliometricians cared much for standardisation. The situation is clearly different now, but unification/disambiguation remains a challenging issue under the auspices of Confucius (Analects, Book 13, Verse 3): 'There must be a correction of terminology'. Several of the most popular studies currently available, for example the first versions of the Shanghai rankings (Liu & Cheng 2005), exhibited low standards of unification/disambiguation, with high magnitude errors for particular actors. In some countries, this imperfection eventually resulted in a virtuous process of correction by actors themselves, when they realized that their international image was at stake.
In quite complex academic systems such as the French one, with many overlapping structures, self-identification by institutions still remains necessary to yield credible evaluations, but at a relatively high cost. The lack of unification of items (authors' names, institutions, cited items, text fields in natural language) hinders analyses of many types, especially those where the individual level is needed (e.g. detection of authors-inventors, calculation of institutional performances on the basis of individual distributions, gender studies and mobility studies). Producers of some reference databases are currently taking steps to alleviate this severe quality problem, but the practices of the scientific community are part of the issue.
The problem of data is worse on the input side. Indicators of human resources and funding suffer from availability and comparability problems despite international norms (OECD). For example, the definition of researchers and the estimation of full-time equivalents in higher education systems vary across national contexts. This gives somewhat paradoxically an aura of relative reliability to output measures, where biases are perhaps easier to detect. The magnitude of errors on the input side, in field or country comparisons, often jeopardizes productivity measures (Barré 2001).

LABYRINTHS OF KNOWLEDGE CIRCULATION
From data, scientific networks are disclosed. Whatever the context (evaluation, science policy or 'academic' science studies), these networks are the background of scientometrics. They help us to understand how communities produce, exchange, combine and disseminate knowledge. This is the ground on which scientometric indicators are built, from classical publication and citation counts to sophisticated positioning measures, addressed in the last section. The main networks of knowledge circulation are outlined in the following sections.

Scientific collaboration networks
Science is a collective adventure and involves many forms of collaboration, some of them with a bibliometric trace: co-signature of articles, co-participation in programmes such as the EU Framework Programme, sharing of large instruments. Co-publication behavior and networks have inspired a huge literature, coming from scientists observing their own 'ecosystem', or from scientometricians. For example, determining factors of co-publication at the macro-level of exchanges across nations have been identified, and sometimes ranked. Modelling the collaboration process at the micro-level is perhaps trickier. Collaboration has benefits (e.g. division of work, exploitation of complementarity, enhanced visibility through citations) but also costs (e.g. transportation, time devoted to communication between partners, increased administration, opportunity costs; see Katz & Martin 1997).

Citation networks
The actual dissemination of publications is not easy to track, but knowledge transfers are made visible by their citation counterpart, following Merton's (1942) hypothesis that citations recognize an intellectual debt. The general graph of citations within the Science Citation Index, for example, provides access to the combinatory construction of knowledge, in the spirit of de Solla Price and Garfield. The social act of citing is much more complex than Merton's scheme, as scientometricians are aware (Small 2004), and has been studied by various schools of sociology of science (see Cronin 2004). Citation analyses are not limited to counting citations for evaluation purposes, which is already a difficult task (technical issues, finding adequate references for normalizing figures, and interpretation, remembering that citations measure visibility or audience rather than quality). The position of actors (from countries to individual scientists) in the citation network is extremely rich in information, allowing us to assess knowledge dependencies, to estimate multidisciplinarity through transactions between fields, and to map themes and research fronts in an efficient way (see 'Delineation and mapping of scientific areas').

Linguistic networks
Linguistic networks are based on the contents of titles, abstracts, full texts, or various types of controlled terms (keywords). To assess thematic proximity, the methods range from purely lexical to semantic. The former are usually quicker, more automatic and less field-dependent, the latter more precise but also more difficult. Term associations have been extensively studied by sociologists of science to characterize schools of thoughts and by informetricians to build information retrieval tools.

Mobility networks
Mobility of scientists across institutions or nations is an essential vector of knowledge circulation. Mobility embodies both the transfer of existing knowledge and expectations of future flows. This question has many facets, and bibliometrics is only one of them. When addressing this problem on a large scale, studies of mobility face tricky 'data-demining' issues mentioned above, i.e. unification/disambiguation of names, both for individual scientists and their institutions.

Online networks
The World Wide Web has become a fantastic platform for knowledge circulation, both through classic media online communication and new ways, formal or informal, of dissemination and interaction. This is beyond the scope of the present paper, but will be discussed briefly in the conclusion.
On many occasions, a question regarding knowledge circulation may be addressed by competing or complementary measures from the above-mentioned networks, for establishing proximity between scientific articles on the basis of e.g. citations, words, and/or authorship, which allows mapping of scientific universes. This holds for proximity or transfers, from science to science; or from science to/from technology. The latter is a central subject both at the theoretical and practical level, when one deals with the 'third mission' of universities, having to enhance the economic and social impact of their first 2 missions, research and teaching. In bibliometrics, proximity of science and technology is classically addressed through 3 connections: citations from patents to articles, lexical proximity and co-activity of 'scientists-inventors'.
Tools for network studies have now been used for a long time in bibliometric applications, for instance the actor-networks theory introduced by the relativist school (Callon 1986). The renewal of Milgram's 'small world theory' by Watts & Strogatz (1998) has boosted the research on social networks, formalizing the effects of weak ties (Granovetter 1973). The small world structure, in which a few long-distance connections link widely-separated areas with many local connections, appears quite frequently, especially in scientific networks: scientists belong to small communities but remote/weak ties boost multidisciplinarity and creativity. Analyses of scientific networks can now be found in papers by mathematicians ('Erdös Number project', studying the network of scientists directly and indirectly linked to Erdös by co-publication linkages; www.oakland.edu/enp/index.html), computer scientists, physicists (general dynamic model proposed by Barabasi et al. 2002) and recently economists. The latter are particularly concerned with designing incentives and costs associated with scientific production (Stephan 1996) and address the creation and persistence of social links (e.g. coauthorship, Jackson & Wolinsky 1996; network of inventors, Cowan et al. 2006), sometimes in the formalism of game theory.
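The notion of scientists 'directly and indirectly linked' by co-publication can be made concrete as a shortest-path distance in the co-authorship graph. The following is a minimal sketch using a breadth-first search on invented author lists; it illustrates the idea of a collaboration distance and is not the procedure of the Erdös Number project itself.

```python
from collections import deque
from itertools import combinations

def coauthor_graph(papers):
    """Build an undirected graph linking every pair of authors of a same paper."""
    graph = {}
    for authors in papers:
        for author in authors:
            graph.setdefault(author, set())
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def collaboration_distances(graph, source):
    """Breadth-first search: number of co-authorship steps from source to each reachable author."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    return dist

# Invented author lists (assumption, for illustration only).
papers = [["Erdos", "A"], ["A", "B", "C"], ["C", "D"], ["E", "F"]]
graph = coauthor_graph(papers)
distances = collaboration_distances(graph, "Erdos")
print(distances)
```

Authors E and F never appear in the result: they form a separate component with no co-publication path to the source, so no distance is defined for them.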
The diversity depicted by structural and dynamic analysis of scientometric networks warns against mirages of universality. Mere numbers of publications or citations are hardly interpretable out of their particular context.

Diversity of communities and its consequences on scientometric indicators
Within common principles of publication and communication norms, scientific communities exhibit quite different behaviour depending on their type of research, their degree of application and the nature of their field. Even when sharing the same communication system, different disciplines do not publish with the same frequency, do not exhibit the same propensity to collaborate and co-author papers, nor have the same citing practices both in volume (the length of references list in the articles) and immediacy (the age of references they cite). These discrepancies were evidenced in early literature on publication and citation practices (de Solla Price 1970).
In bibliometric terms, we could state that within the framework of general laws of distribution (mentioned above), a large variation of parameters takes place amongst local sub-networks of science, expressing the specificity of behaviour, organisation and diversity within each field and type of research. A principle in evaluation-oriented scientometrics is not to mix apples and oranges: scientometrics has to cope with the consequences of this diversity, namely the heterogeneity of areas and practices, and to find appropriate reference sets and time windows for meaningful comparisons.
Limit cases should be investigated. Many mathematicians, for example, are reluctant to validate quantitative analysis of publications and citations in their field, and rather favour a direct assessment of articles and institutionalized peer-review as the basis for e.g. scientific awards. An argument in favour of this is the strong diversity of publication practices among prominent mathematicians (e.g. Fields Medalists). Another issue is the poor interpretability of the citation performance, along with the low speed of citations in some sub-domains making it difficult to appreciate the audience of a paper in a reasonable time frame.
In their quest for sound comparisons, bibliometricians proposed various typologies and solutions for field-normalization (e.g. Pinski & Narin 1976, Murugesan & Moravcsik 1978, Schubert et al. 1988). As far as 'impact' (citations per publication) is concerned, a common method, although not rigorous, is the normalization by the mean of the field, giving the 'relative impact'. This assumes that a 'field' or a 'topic' is a well-defined notion, which is far from granted (see next subsection). Moreover, shall we define the field as small or large scale, small topics or large academic disciplines? The relative impact proves to be pretty unstable when successive embedded sets of growing sizes (e.g. research front, speciality, subfield, field) are used as references. We observed for example that the content of the top-cited 'excellence' class (say the 1% or 5% most cited) is quite dependent on the level of observation/normalization used: one expects for example that in a given specialty, defined as a group of journals, articles from the top percentile in each journal will not all belong to the top percentile of the whole specialty, and this remains true for various levels of grouping (Zitt et al. 2005). Besides, other methods of normalization are being investigated by scientometricians.
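The dependence of the relative impact on the chosen reference set can be shown on a toy case. The sketch below assumes the simplest reading of the method (the actor's mean citations per paper divided by the mean of the reference set) and uses invented citation counts; the specialty is embedded in the field, and the actor's papers belong to both.

```python
def mean(values):
    return sum(values) / len(values)

def relative_impact(actor_citations, reference_citations):
    """Actor's citations per paper divided by the mean of the reference set."""
    return mean(actor_citations) / mean(reference_citations)

# Invented citation counts per paper (assumption, for illustration only).
actor = [4, 6]                              # the actor's two papers
specialty = [4, 6, 10, 12]                  # highly cited specialty, includes the actor
field = specialty + [0, 1, 1, 2, 2, 2]      # wider field embedding the specialty

ri_specialty = relative_impact(actor, specialty)
ri_field = relative_impact(actor, field)
print(ri_specialty, ri_field)
```

The same two papers appear below average (0.625) against their specialty and above average (1.25) against the wider field, which is the instability across embedded levels described above.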
Diversity of science is mirrored by the variety of actors' involvements. The spectrum of actors' activity amongst fields has been one of the first concerns of bibliometricians. Specialization/diversity are measured by various indices, such as concentration indices, the Herfindahl index, or deviations of the Balassa index. They were borrowed from economics or ecology (recent review by Stirling 2007) to describe the strategy of actors (e.g. Adams & Smith 2003). Variety of research actors' portfolios belongs to this family of indicators, in which 'ranking' as such makes little sense and can only be interpreted within a wider frame of evaluation.
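Two of the indices named above are easily computed from publication counts by field. The sketch below assumes the usual definitions (Herfindahl index as the sum of squared shares; Balassa index as the share of a field in the actor's output divided by the share of that field in world output) and invented counts for a hypothetical actor.

```python
def herfindahl(counts):
    """Sum of squared shares: 1/n for an even spread over n fields, 1.0 for full concentration."""
    total = sum(counts.values())
    return sum((n / total) ** 2 for n in counts.values())

def balassa(actor, world, field):
    """Share of the field in the actor's output relative to its share in world output."""
    actor_share = actor[field] / sum(actor.values())
    world_share = world[field] / sum(world.values())
    return actor_share / world_share

# Invented publication counts by field (assumption, for illustration only).
actor = {"physics": 50, "chemistry": 30, "biology": 20}
world = {"physics": 250, "chemistry": 250, "biology": 500}

concentration = herfindahl(actor)
specialization = {f: balassa(actor, world, f) for f in actor}
print(concentration, specialization)
```

A Balassa value above 1 indicates that the actor is relatively specialized in the field (here physics, at 2.0), a value below 1 the opposite (biology, at 0.4). As stated above, such profiles describe a strategy and do not by themselves support a ranking.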
To conclude on this point, one possibility is to take diversity as a control variable, for example to design satisfactory sets of references for normalization of indicators. Another possibility is to characterize diversity as a key aspect of a scientist's or institution's behaviour, and at a larger scale a feature of self-organized scientific systems, continuously creating new areas from combinations of knowledge.

Delineation and mapping of scientific areas
A particular question, concerning reference sets for scientometric analysis as well as the dynamics of science, is the breakdown of science into disciplines and specialties, or the delimitation of strategic areas. Strong political stakes are associated with nomenclatures. In the absence of real standards in classification of science, most macro-level nomenclatures are based either on institutional definitions of academic disciplines, which may differ across nations, or database classifications, e.g. Thomson 'subject categories' grouping journals. Particular classification schemes from thematic databases or multidisciplinary ones (Pascal database) can also be used. Ad hoc macro-reference sets for a particular actor can be easily designed, e.g. a list of journals where that actor's work is most frequently published (van Raan 1996).
When a fine-grain (document-level) analysis is required, scientometrics usually mobilizes several networks. Delineating complex fields (e.g. nanosciences, genomics, or information and communication) typically involves a combination of approaches: the starting point may be a nomenclature (e.g. a collection of specialized journals, a list of researchers, or institutional sites with their field of research), a lexical query based on the field terminology, (occasionally) a complementary analysis of citation flows, and possibly expert advice.
Borders in human organizations, although sometimes complex, are usually more marked than borders of scientific fields or thematic clusters, the latter of which are fundamentally fuzzy and greatly overlapping. Locally and for some particular subjects, clear borders are encountered, but generally speaking, borders are blurred and one has to build practical frontiers based on some information-retrieval trade-off. If for example we wish to delineate nanosciences by automatic means, we cannot have both a complete recall (i.e. capturing all articles relevant to nanosciences) and a perfect precision (i.e. capturing only relevant articles). Optimization of this compromise is usually costly. The delineation issue may be addressed by sequences of lexical and citation modules to enhance recall, complemented by clustering stages to identify noise or border domains ('hybrid delineation'; Zitt & Bassecoulard 2006, Bassecoulard et al. 2007).
Mapping and delineation of scientific fields are closely related issues. In typical studies, the delineation stage comes first (but it could derive from prior all-science mapping) and may involve more strenuous methods, while mapping and clustering into subfields is typically based on a single or a couple of networks and is more automatic. Classical methods of thematic mapping are co-word or lexical coupling on the one hand (Callon et al. 1983), or co-citation and bibliographic coupling on the other hand (Small & Griffith 1974, White & Griffith 1981). The various methods of mapping applied to a scientific area will provide many vantage points on the reality (see e.g. Boyack 2004 for a review on literature mapping techniques and their uses). Noyons (2004) stressed the specific requirements in a science policy context, where maps have to be used in interaction with representatives of actors under study, from research groups to universities or research organisations.
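Co-citation and bibliographic coupling read the same citation data in two directions: the former links cited items that appear together in reference lists, the latter links citing articles that share references. A minimal sketch on invented reference lists, with raw counts as link strength (the normalizations used in actual mapping studies are not shown):

```python
from collections import Counter
from itertools import combinations

# Invented reference lists: citing article -> set of cited items (assumption).
references = {
    "P1": {"R1", "R2"},
    "P2": {"R1", "R2", "R3"},
    "P3": {"R3"},
}

def bibliographic_coupling(references):
    """For each pair of citing articles, the number of cited items they share."""
    return {
        (a, b): len(references[a] & references[b])
        for a, b in combinations(sorted(references), 2)
    }

def cocitation(references):
    """For each pair of cited items, the number of articles citing both."""
    counts = Counter()
    for cited in references.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts

coupling = bibliographic_coupling(references)
cocited = cocitation(references)
print(coupling)
print(dict(cocited))
```

Bibliographic coupling is fixed once an article is published, since its reference list no longer changes, whereas co-citation counts evolve as new citing articles appear.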
As for delineation, hybrid approaches such as terms-citations are promising for thematic mapping (van den Besselaar & Heimeriks 2006, Janssens et al. 2007). Dynamics of clusters can be studied by combining growth indexes and age of citations (Zitt & Bassecoulard 1994).
Delineation and mapping are quite sensitive to methodological choices and information-retrieval trade-offs. Challenges are the efficiency of algorithms, the capability to reflect embedding and overlaps/fuzziness of areas at various scales, and the adaptation of advanced natural language processing to the properties of the scientific jargon. A lot of progress has been made in these directions, in particular with combined and hybrid approaches that are quite promising to achieve robust representations.
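The information-retrieval trade-off in delineation can be illustrated with the standard definitions of precision and recall. In the sketch below, the set of relevant articles and the two queries are invented; in practice the relevant set is unknown and has to be estimated, for instance by expert sampling.

```python
def precision_recall(retrieved, relevant):
    """Precision: share of retrieved articles that are relevant.
    Recall: share of relevant articles that are retrieved."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Invented article identifiers (assumption, for illustration only).
relevant = set(range(1, 11))                          # 10 articles truly in the field
narrow_query = {1, 2, 3, 4}                           # specific terms only
broad_query = set(range(1, 9)) | set(range(11, 19))   # generic terms added

narrow = precision_recall(narrow_query, relevant)
broad = precision_recall(broad_query, relevant)
print(narrow, broad)
```

The narrow query captures only relevant articles but misses most of the field (precision 1.0, recall 0.4); the broad query recovers most of the field at the price of as much noise as signal (precision 0.5, recall 0.8).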

Measuring multi-disciplinarity
Measurement of multi-disciplinarity as relations between fields first depends on how scientific fields are delineated, as seen above. Building on this delimitation, a scientometric network is chosen to assess the relations amongst these fields: for example, clustering methods used for mapping science on words or citations yield clusters and relationships/overlaps amongst clusters. Several modalities and levels of exchanges (multi-disciplinarity, inter-disciplinarity, trans-disciplinarity) are often distinguished, although the vocabulary is not fixed. Multi-disciplinarity can also be seen in a static or dynamic perspective.
The literature on the subject is quite abundant and based on various points of view (e.g. Morillo et al. 2001). Many authors have used the citation-flows approach. The structure of references is a usual way to measure the various degrees of inter-disciplinary integration in natural and engineering sciences (e.g. Rinia et al. 2001), in social and human sciences (review in Hicks 2004) and at their interface.
Multi-disciplinarity can also be studied without predefined categories, for example by social network analysis and/or clustering (e.g. Rafols & Meyer 2007). Multi-disciplinarity is then observed through weak ties/'betweenness' amongst dense local neighbourhoods in a 'small world' network, or in dynamic terms through the direction and speed of the rearrangements. Again, several networks may be addressed or combined to measure multidisciplinarity: authors' mobility and co-authors' affiliations; citation spectrum of knowledge exports (degree of generality) or imports (complementarity); vocabulary overlaps; or journal linkages especially in a dynamic perspective (Leydesdorff 2007).

THE CHALLENGE OF EVALUATION
Ideal evaluation measures would be built upon the various aspects of knowledge creation and circulation. A typical set of standard indicators combines (1) output measures (volume and market shares of publication, concentration of activity and specialization spectrum), (2) visibility measures through citations, with a variety of indicators (e.g. volume and market share of citations, impact, impact factor), and (3) partnership indicators. In each category, we distinguish between power, performance and positioning indicators.

Power indicators
Many bibliometric studies deal with collective actors such as institutions or countries. Comparisons amongst actors may involve measures such as the total publications and total citations of the actor, and the corresponding market shares, e.g. percentage of world publications or citations. To the extent that these measures convey an idea of market power in science, they can be termed power indicators. As they rely on an aggregate of the productions and related citations of individuals affiliated to these institutions, they are strongly dependent on the size of these institutions. The famous 'Shanghai ranking' (Liu & Cheng 2005) mainly relies on power indicators.
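As a minimal sketch of the market-share measures described above, the following Python fragment computes percentage shares of publications and citations. The actor names and counts are hypothetical, and the 'world' total is simply assumed to be the sum over the three listed actors; `market_shares` is an illustrative helper, not a function from any bibliometric package.

```python
# Hypothetical publication and citation counts per actor (illustrative only).
counts = {
    "Institution A": {"publications": 1200, "citations": 9000},
    "Institution B": {"publications": 300, "citations": 3000},
    "Institution C": {"publications": 500, "citations": 3000},
}


def market_shares(counts, key):
    """Return each actor's percentage share of the total for `key`.

    Assumption: the reference total ('world') is the sum over all
    actors present in `counts`.
    """
    total = sum(actor[key] for actor in counts.values())
    return {name: 100.0 * actor[key] / total for name, actor in counts.items()}


publication_shares = market_shares(counts, "publications")
citation_shares = market_shares(counts, "citations")

for name in counts:
    print(f"{name}: {publication_shares[name]:.1f}% of publications, "
          f"{citation_shares[name]:.1f}% of citations")
```

With these invented figures the largest actor dominates both shares, which illustrates the size dependence of power indicators.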

Performance indicators
Performance indicators, in contrast to power indicators, are meant to reflect some average capability, primarily independent from actor size. Two types of performance indicators are commonly dealt with in scientometrics: bibliometric 'impact' measures (citations per publication, with various computing conventions), available or computable from citation databases; and productivity measures with respect to some input data, with various definitions and methods. The 2 types of measure have quite different meanings, and are sometimes combined (e.g. impact and funding; Lewison & Dawson 1998). Productivity issues are addressed in a variety of ways, including studies on funding-policy efficiency. Sophisticated methods like data envelopment analysis allow relative positioning on a collection of inputs and outputs (e.g. Daraio & Simar 2005), but their pitfalls (see Dyson et al. 2001) and their sensitivity to input data (which are often poor) must not be overlooked. A central question of science policy is whether performance correlates with power or size, and we have some indications that the questions of increasing returns and critical masses are scale-dependent, with different answers at the national systems level (Katz 1999) and at the laboratory level.
Even though criteria of evaluation are external to scientometrics, a reasonable assumption is that measures of power and/or performance are directly interpretable in evaluation schemes: the higher the figure (e.g. of publication, of citation), the better the rating: 'more is better'. Ranking makes sense, even though great precautions should be taken in the interpretation. Positioning indicators need a more elaborate interpretative framework.
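The contrast between power and performance can be illustrated with a short sketch. It uses the simplest convention for an 'impact' measure, total citations divided by total publications; the laboratories and their counts are hypothetical, and real computations involve further conventions (citation windows, field normalization) that are not modelled here.

```python
def citations_per_publication(publications, citations):
    """Simplest 'impact' convention: mean number of citations per publication.

    Returns None when the actor has no publications, since the ratio
    is undefined in that case.
    """
    if publications == 0:
        return None
    return citations / publications


# Hypothetical actors: (publications, citations).
actors = {"Large lab": (400, 2000), "Small lab": (40, 320)}

impact = {name: citations_per_publication(p, c) for name, (p, c) in actors.items()}

# Ranking by volume (a power indicator) and by impact (a performance
# indicator) need not agree.
by_volume = sorted(actors, key=lambda name: actors[name][1], reverse=True)
by_impact = sorted(actors, key=lambda name: impact[name], reverse=True)

print(by_volume)  # ranking by total citations
print(by_impact)  # ranking by citations per publication
```

In this invented case the large laboratory leads on total citations while the small one leads on citations per publication, so the two rankings are reversed.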

Positioning indicators
In recent decades, positioning indicators have challenged the traditional input-output measures, which prioritise volumes, market shares and productivity. Positioning indicators typically describe the position of the actor in a particular network, and are amenable to measurement, but the interpretation of the values or ranks is not in terms of performance: 'more is not (always) better'. Rankings on indicators such as the gross rate of co-publication, spectrum of partnership or diversity indexes do not make sense outside of actors' strategies and context (e.g. Glänzel et al. 2003); they can be used for positioning, strategic analysis, specialization/complementarity assessment, but not for direct ranking unless some additional rules are introduced. Let us look at the ratio of international co-authorship to all co-authorship: a high value is often held as a favourable sign of openness, but the highest values are usually reached in cases of peripheral countries exhibiting a strong scientific dependence. Some policies will value co-publication with scientifically advanced partners or the support of developing countries for political reasons. Many scholars have highlighted the sensitivity of collaboration patterns to the geopolitical and cultural background (Zitt et al. 2000, Schubert & Glänzel 2006).
Specialization ratios and profiles are another example, with relation to diversity issues. Neither a high specialization nor a high diversity has particular virtues as such; their value depends on the level of observation, the context, the missions and the general policy. The same is true for specialization in particular domains, unless again some external authority establishes priorities amongst disciplines or research areas.
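To make the notions of specialization ratio and diversity index concrete, here is a minimal sketch under stated assumptions. The specialization ratio is taken in one commonly used form (the actor's share of output in a field divided by the world's share in that field), and diversity is measured with Shannon entropy, which is only one of several possible indexes. The field names and counts are hypothetical.

```python
import math


def activity_index(actor_field, actor_total, world_field, world_total):
    """Share of the actor's output in a field, relative to the world's share.

    Values above 1 indicate relative specialization in the field,
    values below 1 relative under-specialization.
    """
    return (actor_field / actor_total) / (world_field / world_total)


def shannon_diversity(counts):
    """Shannon entropy (natural log) of a profile of counts across fields."""
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    return -sum(s * math.log(s) for s in shares)


# Hypothetical publication counts by field.
actor = {"physics": 60, "chemistry": 30, "biology": 10}
world = {"physics": 2000, "chemistry": 3000, "biology": 5000}

actor_total = sum(actor.values())
world_total = sum(world.values())

profile = {
    field: activity_index(actor[field], actor_total, world[field], world_total)
    for field in actor
}
diversity = shannon_diversity(list(actor.values()))

print(profile)
print(round(diversity, 3))
```

Both figures describe a position rather than a performance: whether a ratio well above 1 in one field, or a low diversity value, is desirable depends on the actor's missions and context.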

Significance of indicators
Many scientometric units from the US (Thomson-ISI, CHI Research, Indiana University) or Europe (ISSRU in Budapest, SPRU-University of Sussex, ISI-Fraunhofer in Karlsruhe, CWTS in Leiden, University of Leuven among many others) have stressed the questions of significance and sensitivity to methodology of classic indicators, which are typically based on skewed distributions (Rousseau 1990, Egghe 1991). The high tail, kingdom of star scientists, introduces a strong dependence of aggregate values on extreme individual values: a prominent scientist may gather as many citations as his/her whole laboratory. The handling of Pareto or Zipf distributions, also found in many other areas (physics and economics for example), is quite different from the standard Gaussian model. An example of long-tailed distribution is expressed in the 80-20 rule of thumb stating that 80% of the effect comes from 20% of the sources, a popular expression of Bradford's law about concentration of sources (see Egghe 1991 for an overview). Scientometric indicators are often calculated on the best sources, ranked in a Bradfordian fashion by their decreasing contribution e.g. to articles or citations. Thomson or Scopus databases are not samples of scientific literature, but fairly strong selections. Other forms of distribution are met, e.g. for journal internationalization where the problem is the long tail as mentioned above. In some cases, sampling schemes can be used. The theoretical and practical significance of bibliometric indicators is not a straightforward issue, and depends on the particular type of indicator and methodological framework used.
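The effect of skewness on aggregate values can be sketched in a few lines. The citation counts below are hypothetical and deliberately constructed so that the top 20% of items hold 80% of the total; `top_share` is an illustrative helper, and the rounding rule used to pick the top items is an assumption.

```python
import statistics


def top_share(values, fraction=0.2):
    """Share of the total held by the top `fraction` of items.

    Assumption: the number of top items is the rounded product of
    `fraction` and the number of items, with a minimum of one.
    """
    ranked = sorted(values, reverse=True)
    k = max(1, round(fraction * len(ranked)))
    return sum(ranked[:k]) / sum(ranked)


# Hypothetical citation counts for the 10 members of a laboratory.
citations = [120, 40, 10, 8, 6, 5, 4, 3, 2, 2]

concentration = top_share(citations)
mean_citations = statistics.mean(citations)
median_citations = statistics.median(citations)

print(concentration)     # share of citations held by the top 2 members
print(mean_citations)    # pulled upwards by the extreme values
print(median_citations)  # much closer to the typical member
```

Here the mean is several times the median, which is why averages computed on such distributions are sensitive to a few extreme individuals.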
An obvious challenge for scientometrics is the reliability and robustness of measures (e.g. Moed 2002, among many others). There may be in this respect some difference with data-mining approaches, oriented towards the detection of 'nuggets' in data. Scientometrics and data-mining share many instruments and interests, such as eliciting emerging areas, mapping themes or detecting key-institutions in scientific networks. But scientometrics is always concerned with reference points especially in evaluation contexts.

Capturing creativity and innovation
An obvious risk for scientometrics is to focus on strong signals and extrapolation of trends, as these are more efficient in describing the past than making suggestions for the future. Missing promising signals is easy, because in informetric distributions they are drowned in an overwhelming flow of other 'weak signals'. The mirrors of bibliometrics are rather blurred in this respect. However, attempts at 'early warning' are possible, such as combining structural and dynamic clues of network reconfiguration. Emerging topics are likely to show a strong growth rate and refer to recent literature. In this respect, peer reviews do not always do a better job. A low-risk strategy, in both approaches, is usually to bet on the consistency of institutional or individual trajectories over time (star labs will keep their momentum), concentrating the uncertainty on newcomers' assessment. Beyond the specific difficulties of evaluation at the individual level, this suggests extreme caution in applying bibliometric assessment to newcomers, young teams or emerging topics. If the detection of promising topics or researchers, a typical challenge of data-mining applications, is more uncertain, scientometrics offers remarkable tools to describe the landscape of science in a relatively robust fashion.
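The two clues mentioned above for emerging topics, strong growth and reliance on recent literature, can be sketched as follows. This is an illustration only, not a validated detection method: the compound growth formula, the 5-year window and all figures are assumptions chosen for the example.

```python
def annual_growth_rate(first_count, last_count, years):
    """Compound annual growth rate between two yearly publication counts."""
    return (last_count / first_count) ** (1 / years) - 1


def recent_reference_share(publication_year, reference_years, window=5):
    """Fraction of cited references that are at most `window` years old."""
    recent = [y for y in reference_years if publication_year - y <= window]
    return len(recent) / len(reference_years)


# Hypothetical topic: 100 publications in the first year, 400 two years later.
growth = annual_growth_rate(100, 400, years=2)

# Hypothetical article published in 2008 with 4 cited references.
recency = recent_reference_share(2008, [2007, 2005, 2000, 1990])

print(growth)   # yearly growth of the topic
print(recency)  # share of references from the last 5 years
```

Both values are computed on small counts in practice, so they inherit the weak-signal problem discussed in the text and should be read with the same caution.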

CONCLUSION
We will conclude by mentioning a few topical challenges among other perspectives discussed in this TS.

Ranking, positioning, benchmarking
Quantitative analysis of research institutions is not only a matter of performance ranking, such as in the Shanghai exercise or in input/output approaches whatever the sophistication. As repeatedly stressed by scientometricians (van Leeuwen et al. 2003), even classical evaluations cannot involve a single point of view. Whether they are compared to global references, or against benchmarks exhibiting similar profiles, or else with respect to their own scope of missions and targets, higher education and research institutions can be studied in multi-criteria positioning, where performance criteria and ranks are present but where other measures and qualitative indicators not amenable to ranking are also documented. The breakdown of an author's activity into 'mainstream' and 'transfer', or more generally among the branches of the research compass (or multiple helix of research), is sometimes neglected because the purely academic outputs are easier to measure.

Growth-diversity models
Amongst the founding fathers, de Solla Price (1963) established scientometrics on models of growth. With the appearance of new methodologies of network analysis, a new perspective can perhaps be found in understanding how growth regimes in science are shaped by the creation of local variety on the one hand, and the fabric of weak ties and multi-disciplinary connections on the other hand. New models of citation competition (Van Raan 2001), reformulation of de Solla Price's questions about the relations of growth, diversity and convergence (components of 'scientific regimes'; Bonaccorsi 2002), and modelling the behaviour of scientists in their choice of areas and problems (Debackere & Rappa 1994, Carayol & Dalle 2007) may pave the way for a better understanding of science dynamics and actors' behaviour in interaction, the real reservoir of renewed indicators. What is challenging (see e.g. Leydesdorff 2001), beyond the variety of scientometric networks, is a unified perspective on knowledge circulation.

Scientometrics and the web age
The classical bibliometric approach relies on a model where articles have a clear status and where journals are key-nodes for 3 basic functions: standardisation, certification and archiving. The web age can enrich this model in many respects, or destroy it, either with a redistribution or a radical questioning of these functions. Scientometricians try to anticipate these evolutions. The quantitative analysis of the various web networks is not straightforward. Although they initially proposed web 'impact factors' as an extension of visibility measures to web pages, Bjorneborn & Ingwersen (2004) later warned against taking the analogy between citation analyses and link analyses too far. Recently, Aguillo et al. (2006) tested cybermetric indicators for ranking universities as shown in their Web sites (see also Thelwall et al. 2005 on Webometrics). Google-type ranks, at first indebted to scientometrics (Pinski & Narin 1976), in turn inspire a new generation of impact measures. Butler (2008, this TS) reports concerns over webometric measures in assessment exercises. 'Canonical' scientometrics and web analysis will keep cross-fertilizing, though with conspicuous risks of misusing analogies.

Science in context
Although functioning on particular norms or habits in their academic activity, laboratories produce not only science but also relations with many partners, as successful metaphors show: the 'research compass card' displays the various dimensions (and clients) of the laboratory outputs (Larédo et al. 1992), the 'triple helix' (Etzkowitz & Leydesdorff 1997) describes the complex interconnections of government, industry and science. The role of scientometrics is all the easier in that these transfers involve codified knowledge, but much work is still needed to measure science-technology relations. Perhaps less easy to handle, but also promising, are some attempts to track percolation from scientific material to trade journals (Nederhof & Meijer 1995), prescriptive literature (such as medical guidelines) and public communication (Lewison et al. 2004). Scientometric tools are helpful in addressing the social and political dimension of scientific communities, for example in gender studies, analyses of particular forms of mobility in science such as diasporas and reverse diasporas or studies of social stratification.

Scientometrics and science: feedback, backlashes, paradoxes
As shown by sociologists, scientometrics has become part of the system. Scientific communities are quite reactive to changes in evaluation systems, especially when funding is at stake, but the consequences of this adaptive behaviour on knowledge production have not been much investigated (Gläser et al. 2002). The star of bibliometric tools was undoubtedly Garfield's impact factor, which durably shaped the behaviour of competitive science. Butler (2003) showed the perverse effects of naive bibliometric-based funding formulas in Australia. We mentioned the recurrent issue of actors' name unification in the section 'Data mining and data demining', especially in academic rankings. Starting from low unification standards, the Shanghai ranking (Liu & Cheng 2005) elicited strong correcting actions by actors themselves in subsequent rounds, and additionally contributed to strategic actions in science policy because of the international visibility of this particular ranking. This virtuous circle was probably unexpected by most scientometrics teams. Mastering the feedback loops is far from easy: Weingart (2005) warned of unintended (and possibly destructive) consequences and rightly called for a professional code of ethics in the application of bibliometric indicators.
In the dialog of rulers and counsellors (scientometrics as decision-support techniques), the danger is double-edged. On the one hand, scientometricians may well be fascinated by their own art, and succumb to the frenzy of numbers, rankings and models. On the other hand, decision-makers and other users of scientometric results are sometimes driven by the parsimony principle: 'one indicator is better than two'. As a result, the most famous recent indicators, academic rankings or the h-index (Hirsch 2005), joining the good old impact factor in the family of one-number indicators, may encourage the worship of figures, if not properly normalized and put into context. Scientometric indicators should pay attention to the diversity of situations and missions, the comparativeness issues and the 'requisite variety' of vantage points. Misuses of indicators can be minimized in 2 ways: within scientometrics, by striving to respond to the internal challenges of robustness and quality of measures; and by enrolling users, whenever possible, in a cautious handling of indicators.