The virtual atomic and molecular data centre (VAMDC) consortium

The Virtual Atomic and Molecular Data Centre (VAMDC) Consortium is a worldwide consortium which federates atomic and molecular databases through an e-science infrastructure and an organisation to support this activity. About 90% of the inter-connected databases handle data that are used for the interpretation of astronomical spectra and for modelling in many fields of astrophysics. Recently the VAMDC Consortium has connected databases from the radiation damage and the plasma communities, as well as promoting the publication of data from Indian institutes. This paper describes how the VAMDC Consortium is organised for the optimal distribution of atomic and molecular data for scientific research. It is noted that the VAMDC Consortium strongly advocates that authors of research papers using data cite the original experimental and theoretical papers as well as the relevant databases.


Introduction
The Virtual Atomic and Molecular Data Centre (VAMDC) Consortium originates from two European funded projects: the VAMDC (http://www.vamdc-project.vamdc. (1) An e-science infrastructure that interconnects about 30 databases (http://www.vamdc.eu/activities/research); (2) An overarching organisation, 'the VAMDC Consortium', that was launched on 1 November 2014 and currently involves 35 groups, 19 of them having signed a memorandum of understanding (http://www.vamdc.org/structure/how-to-join-us/). VAMDC Consortium activities cover four domains: research, which is the most developed domain; education and industry, which are currently at an early stage of development; and outreach.
To facilitate its implementation VAMDC has developed protocols, data structures and query languages that allow disparate but complementary databases to be interrogated simultaneously and used in a synchronised manner. This article describes the current status of the VAMDC, including a description of the databases participating in this project. The data centre provides particularly comprehensive coverage of the laboratory atomic and molecular physics data required for astrophysics studies. VAMDC therefore provides a unique access point for astronomers seeking the best atomic and molecular data for their studies.

Research services
VAMDC offers a common entry point to all incorporated databases through the VAMDC portal (http://portal.vamdc.eu). This portal is flexibly designed, offering the possibility of including new data and new databases within the VAMDC e-infrastructure, as well as providing software libraries and modules that can be included in the user's own programs, and providing stand-alone user-oriented software that retrieves and handles atomic and molecular data. Figure 1 depicts the structure of the VAMDC e-infrastructure.

Standards
The VAMDC e-infrastructure has evolved over the years through successive releases (Rixon et al 2011, Dubernet et al 2015). The current version of the standards and software is release v12.07. The VAMDC Standards include the definition of data models for atomic and molecular data, the definition of 'keywords', the definition of query protocols and data access, the definition of registries, of units and of versioning processes, and the establishment of a uniform protocol for web applications that process XML schema for atoms, molecules and solids (XSAMS) files. All Standards Documents can be found at http://standards.vamdc.eu/. Below we provide a brief explanation of some of the key features of these standards.
2.1.1. Data model and XSAMS files. The exchange of heterogeneous atomic and molecular data requires that the relevant parts of physics and chemistry be translated into a computing framework. XSAMS is an XML implementation of the VAMDC data model which is used to build the computer framework. The processes being represented by XSAMS are essentially quantum-mechanical in nature.
XSAMS provides a standard approach to the description of atomic and molecular (A+M) physics in terms of physical processes connecting different quantum-states of atoms and molecules. The identification of the states is standardised by XSAMS to promote matching of corresponding states between databases. Several levels of detail are supported by XSAMS for the description of states and a range of coupling schemes may be specified. States of molecules are identified by tuples of the most-relevant quantum numbers, different tuples being specified in XSAMS for different classes of molecules. XSAMS describes both radiative and collisional transitions between states. The description of collisions may include chemical changes and interactions with elementary particles. The environment in which the transitions occur may be specified, with the main emphasis being on events in the gas phase, but improved support is planned for solid and liquid environments. Cross-sections, both differential and total, for the reactions may be included, and these may be expressed in either tabular or parametric form.
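The idea of processes connecting identified quantum states can be illustrated with a minimal Python sketch. It is not the XSAMS schema: the class names, field names and numerical values are hypothetical, chosen only to show how tuples of quantum numbers let corresponding states from two databases be matched, and how a process links reactant and product states.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    """A quantum state identified by species and a tuple of quantum numbers."""
    species: str
    quantum_numbers: Tuple[Tuple[str, float], ...]  # e.g. (("J", 1), ("v", 0))
    energy_cm1: Optional[float] = None  # database-specific value, not part of the identity

    def key(self):
        # Identity used for cross-database matching: species plus the
        # quantum numbers, sorted by label so that ordering does not matter.
        return (self.species, tuple(sorted(self.quantum_numbers)))


@dataclass(frozen=True)
class Process:
    """A radiative or collisional process connecting states."""
    kind: str  # "radiative" or "collisional"
    reactants: Tuple[State, ...]
    products: Tuple[State, ...]
    source_refs: Tuple[str, ...]  # identifiers of the bibliographic sources

    def __post_init__(self):
        if self.kind not in ("radiative", "collisional"):
            raise ValueError("unknown process kind: " + self.kind)
        if not self.source_refs:
            # Mirrors the traceability idea: no data without a source.
            raise ValueError("a process must reference at least one source")


# The same CO level as two hypothetical databases might report it.
co_a = State("CO", (("v", 0), ("J", 1)), energy_cm1=3.845)
co_b = State("CO", (("J", 1), ("v", 0)), energy_cm1=3.84503)
co_ground = State("CO", (("J", 0), ("v", 0)), energy_cm1=0.0)

emission = Process("radiative", (co_a,), (co_ground,), ("hypothetical-source-1",))
```

Here `co_a.key() == co_b.key()` holds although the two records differ in stored energy, which is the kind of matching that standardised state identification is meant to support.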
Finally, since the origin and history of the data are necessary for data assessment, XSAMS imposes strict requirements on the traceability of the data, with mandatory inclusion of information on data sources and the methods used to generate specific sets of data. Similarly, XSAMS enforces the rigorous use of standardised unit systems and facilitates the provision of uncertainties with the data provided. XSAMS was initially developed (Ralchenko et al 2009) through the joint efforts of researchers from the International Atomic Energy Agency (IAEA), National Institute of Standards and Technology (USA), Oak Ridge National Laboratory (USA), Université Pierre et Marie Curie and Observatoire Paris-Meudon (France), with contributions from the All-Russian Institute of Technical Physics (Russia). The VAMDC Consortium has further developed XSAMS, which then became VAMDC-XSAMS, and has implemented its logic in all the software used within the VAMDC e-infrastructure. An agreement between the VAMDC Consortium and the IAEA makes the IAEA the curator of the stable developed versions of VAMDC-XSAMS (Braams et al 2016), currently VAMDC-XSAMS version 1.0, released in July 2012 (http://standards.vamdc.org/#data-model).
2.1.2. Registry of services. The VAMDC registry is a database of metadata describing VAMDC nodes and web applications. It allows an application to find the address of a given node, to select nodes by the kinds of data they offer and to find out which query terms are supported at a node. Members of the VAMDC Consortium have a password allowing them to register a new database or a web application, once they have followed the quality procedures concerning their new resource.
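The three registry functions described above (finding a node's address, selecting nodes by the kind of data offered, and discovering supported query terms) can be sketched in Python as follows. This is an illustration under stated assumptions, not the registry's actual interface: the record layout, the node names, the `example.org` addresses and the keyword strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class NodeRecord:
    """Hypothetical registry entry for one data-node."""
    name: str
    address: str                    # where queries would be sent
    data_kinds: FrozenSet[str]      # kinds of data the node offers
    query_terms: FrozenSet[str]     # query terms the node understands


# Invented entries; real registry contents are not reproduced here.
REGISTRY = [
    NodeRecord("atomic-lines", "https://atomic.example.org/tap",
               frozenset({"radiative"}),
               frozenset({"AtomSymbol", "RadTransWavelength"})),
    NodeRecord("molecular-lines", "https://molecular.example.org/tap",
               frozenset({"radiative"}),
               frozenset({"InchiKey", "RadTransWavelength"})),
    NodeRecord("collisions", "https://collisions.example.org/tap",
               frozenset({"collisional"}),
               frozenset({"InchiKey"})),
]


def address_of(registry: Iterable[NodeRecord], name: str) -> str:
    """Return the address of the node called `name`."""
    for node in registry:
        if node.name == name:
            return node.address
    raise KeyError(name)


def select_nodes(registry: Iterable[NodeRecord], data_kind: str,
                 required_terms: Iterable[str]) -> List[NodeRecord]:
    """Nodes that offer `data_kind` and support every required query term."""
    required = frozenset(required_terms)
    return [n for n in registry
            if data_kind in n.data_kinds and required <= n.query_terms]


wavelength_nodes = select_nodes(REGISTRY, "radiative", ["RadTransWavelength"])
```

With these invented entries, `wavelength_nodes` contains the two line-list nodes, so a client could send the same wavelength-restricted query to both and skip the collisional node.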

VAMDC connected databases
Each database included in VAMDC may be interrogated individually by software via its own web-service, but these services conform strictly to the VAMDC standards and all use the same query protocol and language; thus, it is easy to send the same query to multiple databases. The query protocol is called VAMDC-TAP, derived from the Table Access Protocol of the International Virtual Observatory Alliance, and the query language is VAMDC SQL Sub-set 2 (VSS2), a limited form of structured query language tailored to suit the VAMDC data-model inherent in XSAMS. The combination of a service and a database is called a 'VAMDC data-node' and is achieved by building a small program, specific to the database and running inside a web server, to represent the database and to convert between internal and external representations of the data. Such a program is called 'node software' and almost all the code for it is provided by VAMDC as a standard library; only minor customisation is needed for each database. Support is provided by VAMDC for building new data-nodes.
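Because every node speaks the same protocol and language, sending one query to several nodes amounts to building the same request against different base addresses. The Python sketch below only constructs the request URLs and performs no network access. The parameter names (`REQUEST`, `LANG`, `FORMAT`, `QUERY`), the `sync` endpoint and the example VSS2 text reflect our reading of the VAMDC-TAP documentation and should be checked against the current standards; the node addresses are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical node base addresses; real ones come from the registry.
NODE_BASES = {
    "node-a": "https://node-a.example.org/tap/",
    "node-b": "https://node-b.example.org/tap",
}

# Example query text; keyword names are assumptions to be checked
# against the VAMDC dictionary.
VSS2_QUERY = ("SELECT ALL WHERE AtomSymbol = 'Fe' "
              "AND RadTransWavelength > 5000 AND RadTransWavelength < 5010")


def build_tap_url(base_url: str, vss2_query: str) -> str:
    """Build a synchronous query URL for one node (no request is sent)."""
    params = {
        "REQUEST": "doQuery",
        "LANG": "VSS2",
        "FORMAT": "XSAMS",
        "QUERY": vss2_query,
    }
    # urlencode percent-encodes the query text so it is safe in a URL.
    return base_url.rstrip("/") + "/sync?" + urlencode(params)


# The same query, addressed to every node.
urls = {name: build_tap_url(base, VSS2_QUERY) for name, base in NODE_BASES.items()}
```

Each URL in `urls` could then be fetched with any HTTP client, and the responses, all in the same XSAMS format, combined by the caller.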
The VAMDC nodes are distributed such that they are located at the members' and partners' sites. At present most of the databases included in the VAMDC e-infrastructure are key databases used in astrophysics. Table 1 provides a summary of the databases currently accessible via VAMDC, which are each discussed in turn below.

Database of species
Many queries select by atomic and molecular species, and the names for species vary between scientific communities. VAMDC maintains a database of names, called the 'species database', from which community-specific terms may be refined into standard identifiers understood by all VAMDC nodes. VAMDC identifiers are based on InChIKeys (Heller et al 2015). In most cases, the VAMDC identifier is just the standard InChIKey, but in special cases a suffix is added to distinguish molecular conformers. One copy of the species database is encapsulated in the web portal. Another copy is available for query by applications as a VAMDC data-node. We have designed software which updates the species database when a new species is introduced in any of the VAMDC-connected databases. Copies of the species database are distributed to user client software to allow the same query capability as the portal. Table 2 gives two sample entries from the species database. This species database is a key component of the interoperability within the VAMDC e-infrastructure.
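The refinement of community-specific names into a common identifier can be sketched as a simple lookup. The two InChIKeys below are the standard keys for water and carbon dioxide; everything else (the table layout, the alias lists, the function names and in particular the suffix format used for conformers) is hypothetical and does not reproduce the actual species database.

```python
from typing import Optional

# Miniature stand-in for the species database: identifier -> known names.
SPECIES = {
    "XLYOFNOQVPJJNP-UHFFFAOYSA-N": {"formula": "H2O",
                                    "names": {"water", "h2o", "oxidane"}},
    "CURLTUGMZLYLDI-UHFFFAOYSA-N": {"formula": "CO2",
                                    "names": {"carbon dioxide", "co2"}},
}


def resolve(term: str) -> Optional[str]:
    """Map a community-specific name or formula to the standard identifier."""
    wanted = term.strip().lower()
    for inchikey, record in SPECIES.items():
        if wanted in record["names"]:
            return inchikey
    return None  # unknown species


def vamdc_identifier(inchikey: str, conformer: Optional[str] = None) -> str:
    """Identifier sent to nodes: the InChIKey, plus a suffix for conformers.

    The "-<suffix>" layout is an assumption made for this sketch only.
    """
    return inchikey if conformer is None else inchikey + "-" + conformer


water_id = resolve("Water")
```

A client would call `resolve` on whatever name the user typed and place the returned identifier in the query, so that every node interprets the species in the same way.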

VALD
VALD (Piskunov et al 1995) is a collection of critically evaluated laboratory parameters for individual atomic transitions, complemented by theoretical calculations. VALD is actively used by astronomers for stellar spectroscopic studies such as model atmosphere calculations, atmospheric parameter determinations and abundance analysis. The first two VALD releases (Piskunov et al 1995) contained parameters for atomic transitions only. In a major upgrade, VALD3, publicly available since the spring of 2014, the atomic data were extended to cover elements from hydrogen to uranium and complemented with parameters for molecular transitions (Ryabchikova et al 2015). An evaluation procedure that involves a comparison with a selection of high-precision stellar spectra is described by Ryabchikova et al (2008). At present the Moscow VALD-VAMDC node serves high-quality atomic data only (about 2 million transitions): it is limited to transitions between experimentally measured energy levels (hence precise wavelengths) and is tuned for high-resolution spectroscopic analysis. The Uppsala node contains all collected transitions (about 177 million) and is therefore suitable for applications such as opacity calculations. From 2016 the Moscow and Uppsala nodes will also mirror each other for enhanced reliability.

CHIANTI
Together with its spectroscopic diagnostic programs, CHIANTI is used in the analysis of optically thin collisionally ionised plasmas. The CHIANTI database contains atomic structure data (experimental and calculated wavelengths and radiative data), and rates for electron and proton collisions, as well as for ionisation and recombination. The data are mostly obtained from published literature, and are regularly assessed and updated.
Version 8 of CHIANTI (Del Zanna et al 2015), the latest, contains a large amount of new atomic data for several isoelectronic sequences, calculated within the UK atomic processes in astrophysical plasmas (APAP) network (www.apap-network.org). APAP/CHIANTI data are used worldwide by almost all atomic databases and modelling codes for astrophysical plasmas. Examples include modelling codes for photoionised plasmas such as Cloudy (Ferland et al 2013) and MOCASSIN (Ercolano et al 2008); x-ray modelling codes such as PINTofALE (Kashyap and Drake 2000) and ISIS (Houck and Denicola 2000); hydrodynamic codes such as HYDRAD (Bradshaw and Mason 2003) and RADYN (Allred et al 2015); spectral modelling codes for supernovae such as TARDIS (Kerzendorf and Sim 2014); and atomic databases such as XSTAR (Bautista and Kallman 2001), Stout (Lykins et al 2015) and ATOMDB (Smith et al 2001). The basic CHIANTI data are available via VAMDC, and version 8 is being added.

NIST atomic spectra database (NIST ASD)
The NIST ASD athttp://www.nist.gov/pml/data/asd.cfm contains critically evaluated standard reference data on energy levels, spectral lines, and radiative transitions in atoms and ions of 110 elements from H to Ds. As of November 2015, ASD offers detailed information on more than 109 000 levels and 250 000 spectral lines, in particular, level identifications (configurations, terms, total angular momentum, etc) and radiative decay rates for almost 100 000 transitions. All data in ASD are carefully analysed and evaluated and proper uncertainties for numerical values are derived. ASD also offers several graphical data manipulation tools, e.g., dynamic Grotrian diagrams and Saha-local thermodynamic equilibrium spectra, as well as spectroscopic diagnostic information for ions of several astrophysically abundant elements.

Spectr-W3
The Spectr-W3 project is a collaboration between the Russian Federal Nuclear Centre All-Russian Institute of Technical Physics (RFNC VNIITF) and the Joint Institute for High Temperatures of the Russian Academy of Sciences (JIHT RAS). The Spectr-W3 atomic database (Faenov et al 2002) contains over 450 000 records and includes experimental and theoretical data on ionisation potentials, energy levels, wavelengths, radiation transition probabilities, oscillator strengths, and (optionally) parameters providing analytical approximations for electron-collisional cross-sections and rates for atoms and ions. These data were extracted from the scientific literature or provided directly by authors. The information is supplied with references to the original sources and, where necessary and available, comments elucidating the details of the experimental measurements or calculations. At the present time, Spectr-W3 is the largest available database providing information on the spectral properties of multicharged ions. A new development of the Spectr-W3 atomic database, started in 2014, will create a new section containing information on x-ray emission spectrograms recorded at various plasma sources. Software for this section is currently being created and tested. The spectrograms, mostly obtained in laser-produced-plasma experiments, are also in preparation. Spectr-W3 is hosted at http://spectr-w3.snz.ru/.

TIPbase-TOPbase
Atomic data calculated within the framework of the international opacity project (OP) and iron project (IP) are available in TOPbase (the OP database for radiative processes) (Seaton 1987, Mendoza 1992, Cunto et al 1993) and TIPbase (the IP database, for collisional and radiative processes) (Hummer et al 1993). Included are energy levels/terms, radiative transition probabilities, photoionisation cross sections and collision strengths for a large selection of ions in the range H to Ni. These theoretical data have been calculated using state-of-the-art computer programs such as various versions of the R-matrix suite of codes (Berrington et al 1995, Berrington and Ballance 2002), SUPERSTRUCTURE (Eissner et al 1974) and AUTOSTRUCTURE (Badnell 2011) or CIV3 (Hibbert 1975, Hibbert et al 1991). Within their range of validity, the OP and IP data remain a widely used reference set for both producers and users of atomic data. They are relevant for the analysis of experiments, for theoretical comparisons and for various astrophysical or laboratory applications. The OP radiative data are used to calculate monochromatic and mean opacities (Badnell et al 2005) required notably in stellar codes or for the analysis of experiments. TOPbase and TIPbase are hosted at http://cdsweb.u-strasbg.fr/OP.htx.

Stark-B
The Stark-B (Sahal-Bréchot et al 2015) database is a collaborative project between the Laboratoire d'Etude du Rayonnement et de la Matière en Astrophysique and the Astronomical Observatory of Belgrade. The database contains calculated widths and shifts of isolated lines of atoms and ions due to electron and ion collisions (Stark broadening parameters), calculated in a series of papers by Dimitrijević, Sahal-Bréchot and their co-workers, using the semiclassical-perturbation approach developed by Sahal-Bréchot (1969a, 1969b, 1974), supplemented by Fleurier et al (1977), and updated by Dimitrijević and Sahal-Bréchot (1984) and in subsequent papers (see Dimitrijević and Sahal-Bréchot 2014). The database's principal purpose is to provide the Stark broadening data needed for the modelling and spectroscopic diagnostics of stellar atmospheres and envelopes, but it is also useful for laboratory plasmas, as well as for laser-produced, fusion and technological plasmas. Various astrophysical and other applications of data in Stark-B are discussed by Dimitrijević and Sahal-Bréchot (2014). Data are provided for a wide range of temperatures and electron or ion densities; the range covered by the tables depends on the ionisation degree of the ion considered. Stark-B is hosted at http://stark-b.obspm.fr/.

Cologne database for molecular spectroscopy (CDMS)
The CDMS (Mueller et al 2005) provides transition frequencies of atoms and molecules, mostly for radio astronomical observations. The included species have been or may be observed in the interstellar medium (ISM), in the circumstellar envelopes (CSEs) of very young and very old stars, or in planetary atmospheres. As of October 2015, 789 species are available; these cover not only the main isotopic species and the ground vibrational state, but also minor isotopic species and excited vibrational states insofar as these may be of interest to radio astronomers. The catalogue is similar to that of the Jet Propulsion Laboratory (Pickett et al 1998), and while duplication is avoided, some overlap exists. CDMS welcomes outside contributions after critical evaluation. As a partner of the VAMDC, the CDMS is accessible from the general VAMDC portal, from the SPECTCOL tool, from the new CDMS interface (http://cdms.ph1.uni-koeln.de/cdms/portal/) and from its traditional user interface (http://www.astro.uni-koeln.de/cdms). CDMS is hosted at http://www.cdms.de.

JPL
The submillimetre, millimetre, and microwave spectral line catalogue (Pickett et al 1998) has provided the astrophysics community with atomic and molecular absorption/emission spectral data for over 35 years. The catalogue listings include line-by-line tabulations for each spectral feature, as well as source documentation. As for CDMS, entries are prepared with the quantum mechanical software SPFIT/SPCAT (Pickett 1991), which provides a well-founded and thoroughly tested common basis for the molecular physical data.

HITRAN
The HITRAN database (Rothman et al 2013) is a longstanding compilation of molecular spectroscopic parameters that are required for radiative-transfer codes. The latest edition of HITRAN has a traditional line-by-line high-resolution section that contains essential spectral parameters for 47 molecules, along with their significant isotopologues, appropriate for terrestrial atmospheric and diverse room-temperature applications. There is a section that contains high-resolution experimental cross-sections for so-called heavy molecules, including, for example, the chlorofluorocarbons and halocarbons, that are not amenable to a full quantum-mechanical description as in the traditional part of HITRAN. There is also a section of the compilation that provides collision-induced absorption cross-sections. Recently the traditional line-by-line part of HITRAN has been restructured in order to expand its capabilities to address planetary atmospheres as well as more sophisticated line shapes. The HITRAN database contains explicit and convenient citations to all source data, and in fact in many cases has raised the impact factor of contributors (Gordon et al 2016).

Spectroscopy and molecular properties of ozone (SMPO)
SMPO is an information system (Babikov et al 2014) devoted to the high resolution spectroscopy of the ozone molecule, related properties and data sources. SMPO contains information on original spectroscopic data recovered from comprehensive analyses and modelling of experimental spectra, as well as associated software for data representation written in PHP, JavaScript, C++ and FORTRAN.

CDSD
The line parameters (including the air-broadening coefficient, the self pressure-induced broadening coefficient and the exponent of temperature dependence of the air-broadening coefficient) have calculated values. The CDSD-296 databank covers the 5.9-12784.1 cm⁻¹ spectral range and contains more than 419 600 lines. This line list was generated using an intensity cutoff of 10⁻³⁰ cm/molecule at a reference temperature of 296 K. Recently the CDSD-296 databank (Tashkun et al 2015) has been extended to include all 12 stable carbon dioxide isotopologues, and the full database will be available through the VAMDC portal in 2016. The CDSD-1000 databank (Tashkun et al 2003) covers the spectral range 257-9648 cm⁻¹ and contains more than 3 950 500 lines. The line list has been generated using an intensity cutoff of 10⁻²⁷ cm/molecule at a reference temperature of 1000 K. The databanks are hosted in Tomsk at ftp://ftp.iao.ru/pub/CDSD-296 and ftp://ftp.iao.ru/pub/CDSD-1000.

ExoCross
The ExoMol project (Tennyson and Yurchenko 2012) provides computed line lists for molecules likely to be of importance in hot atmospheres such as those of extrasolar planets and cold stars. Because of the elevated temperatures in these environments, a very large number of transitions must be included in spectral simulations: up to 10¹¹ lines in some cases. To ease the computational burden involved in the calculation of opacities (which in many practical cases are not required at high resolution), ExoMol provides pre-calculated absorption cross sections at a range of temperatures through a service on the project website (http://www.exomol.com/xsecs/). The ExoMol database has just undergone a comprehensive upgrade to its data structures (Tennyson et al 2016). ExoCross is a VAMDC database node which provides access to these cross sections through the VSS2 query language. Data are returned in the XSAMS format for interoperability with other data returned by VAMDC services. ExoCross is hosted at www.exomol.com, where a considerable amount of other spectroscopic data can be found for the molecules concerned.

Spectroscopy of atoms and molecules (SESAM)
SESAM is devoted to the spectroscopic analysis of UV electronic spectra of diatomic molecules. Data for H₂, HD, D₂ and CO are presently available as a whole dataset or can be searched within a defined wavelength window. Other selection criteria are available, such as the energy of the lower (upper) level of the transition and the oscillator strength value. The data are obtained from theoretical calculations which include rotational and radial perturbations between electronic potential curves (Abgrall et al 1993, 1994). They have been tested carefully against experimental values when available (Abgrall et al 1994, Liu et al 2007, Gabriel et al 2009). They are useful in various contexts, ranging from plasma laboratories (Gabriel et al 2009) to planetary (Barthélemy et al 2014), interstellar (France et al 2011, 2012, 2014) or extragalactic environments (Salumbides et al 2015). SESAM is hosted at http://sesam.obspm.fr/.

W@DIS
The W@DIS information system (Polovtseva et al 2012) is designed to provide access to data, information, and ontologies relating to quantitative spectroscopy required for solving fundamental and applied problems pertaining to a number of subject domains: atmospheric optics, astronomy, etc. It is a prototype of the next generation of information systems on molecular spectroscopy based on Semantic Web technologies (Berners-Lee et al 2001, Fazliev et al 2013). The system focuses on spectral data representation for the end user, with the possibility of employing information characterised by different levels of detail both for the structure of the data and for the knowledge (semantic annotations) associated with these data. It is the end user who makes the decision about the level of detail required. W@DIS divides molecular spectroscopic data into three parts: parameters for energy levels, transitions, and line profiles. Most of the molecular data are accessible via the VAMDC portal. W@DIS is hosted at http://wadis.saga.iao.ru.

KIDA
KIDA (Wakelam et al 2012) is a compilation of kinetic data (chemical reactions and associated rate coefficients) used to model the chemistry in astrophysical environments (the interstellar medium, protoplanetary disks, planetary atmospheres etc). In addition to detailed information on each reaction (temperature range of validity of the rate coefficients, reference, uncertainty etc), particular attention is given to the quality of the data, which is evaluated by a group of experts in the field. The 2014 update has just been released (Wakelam et al 2015). KIDA is hosted at http://kida.obs.u-bordeaux1.fr.

UDfA
The UMIST database for astrochemistry (McElroy et al 2013) contains reaction rate coefficients and related material for the study of chemical kinetic modelling of astronomical sources, including molecular clouds, CSEs and shocked gas. The main part of the database, Rate12, which is the fifth public release of the data, is a set of 6173 gas-phase reactions and temperature-dependent rate coefficients among 467 species and involving 13 elements (McElroy et al 2013). In this release particular emphasis has been placed on identifying the source of the data; DOI codes and web links to the original papers are provided where possible. In parallel, a comprehensive, searchable web site has been developed which includes other relevant data: deuterium exchange reactions and state-selective ortho-para reactions among hydrogen and deuterium, both of which are essential for calculations of deuterium fractionation in interstellar clouds; a set of binding energies for atoms and molecules on interstellar ices; and a set of three-body reactions and rate coefficients that are important at very high densities. The web site also provides fully commented codes and documentation for calculations of abundances in molecular clouds and CSEs, together with related Perl scripts and instructions for generating the ODEs and plotting routines. Software that turns the abundance outputs into line intensities and profiles is being developed. UDfA is hosted at www.udfa.net.
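As an illustration of what a temperature-dependent rate coefficient looks like in practice, the Python sketch below evaluates the modified Arrhenius form k(T) = alpha (T/300)^beta exp(-gamma/T), which to our understanding is the parameterisation commonly used for two-body gas-phase reactions in astrochemical networks of this kind. The text above does not state the functional form, so it should be read as an assumption, and the coefficient values are invented for the example rather than taken from Rate12.

```python
import math


def rate_coefficient(alpha: float, beta: float, gamma: float,
                     temperature_k: float) -> float:
    """Modified Arrhenius rate coefficient, assumed form:

        k(T) = alpha * (T / 300)**beta * exp(-gamma / T)

    alpha carries the units of k (cm^3 s^-1 for a two-body reaction),
    beta is dimensionless and gamma is in kelvin.
    """
    if temperature_k <= 0.0:
        raise ValueError("temperature must be positive")
    return alpha * (temperature_k / 300.0) ** beta * math.exp(-gamma / temperature_k)


# Invented coefficients for a hypothetical reaction with a small barrier.
ALPHA, BETA, GAMMA = 1.0e-10, 0.5, 100.0

# Rate coefficient at a cold-cloud and a warm-gas temperature.
k_cold = rate_coefficient(ALPHA, BETA, GAMMA, 10.0)
k_warm = rate_coefficient(ALPHA, BETA, GAMMA, 300.0)
```

With a positive gamma the coefficient drops steeply at low temperature, which is why the temperature range of validity attached to each reaction matters when such fits are extrapolated.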

BASECOL
The BASECOL2012 database (Dubernet et al 2013) is a repository of collisional data. It contains rate coefficients for the collisional excitation of rotational, ro-vibrational, vibrational, fine, and hyperfine levels of molecules by atoms, molecules, and electrons, as well as fine-structure excitation of some atoms, that are relevant to interstellar and circumstellar astrophysical applications. In addition, BASECOL2012 provides spectroscopic data queried dynamically from various spectroscopic databases using the VAMDC technology. These spectroscopic data are conveniently matched to the in-house collisional excitation rate coefficients using the SPECTCOL software package, and the combined sets of data can be downloaded from the BASECOL2012 website. As a partner of the VAMDC, BASECOL2012 is accessible both from the general VAMDC portal and from user tools such as SPECTCOL. Submissions of newly published sets of collisional rate coefficients are critically evaluated before inclusion in the database, which is hosted at http://basecol.obspm.fr. The BASECOL database contains explicit citations to all source data; the authors' publications can be easily retrieved from the BASECOL website and from both the SPECTCOL tool and the VAMDC portal through the 'bibtex' visualisation software.

Laboratorio di Astrofisica Sperimentale (LASp)
The database is maintained by the 'Laboratorio di Astrofisica Sperimentale' (Catania-LASp for short), Catania, Italy. LASp spectra are taken using in situ techniques and equipment especially developed to analyse the effects of irradiation (ions and/or UV photons) and thermal cycling (down to 10 K) by infrared, Raman and UV-VIS-NIR spectroscopy. Analysed materials include frozen gases, solid samples and meteorites.
The main application field up until now has been in astrophysics, and over the years many hundreds of ice mixtures of various compositions and of solids have been studied (

Grenoble astrophysics and planetology solid spectroscopy and thermodynamics (GhoSST)
The GhoSST database provides experimental data and products on: (1) transmission spectra of ices, sulfur compounds, minerals, organic and meteoritic material in the wavelength range from the visible to the far-IR; (2) absorption coefficients and optical constants of ices; (3) bidirectional diffuse reflection spectra of granular surfaces (hydrated minerals, adsorption on minerals, ices, natural and synthetic organic materials, meteorites, ...) from the visible to the near-IR; (4) bidirectional reflection spectra of granular surfaces simulated with a radiative transfer model (Spectrimag) using laboratory optical constant data; (5) infrared micro-spectroscopy of minerals, natural organics, and meteorites; (6) Raman and fluorescence micro-spectroscopy of cosmomaterials (meteorites, IDPs, Stardust, ...) and planetary analogs; (7) lists of vibration bands of molecular solids and molecules adsorbed on solids, from the visible to the far-IR. Only the line lists are currently accessible from VAMDC. The GhoSST database is hosted at http://ghosst.osug.fr/.

DESIRE-DREAM
The main purpose of DESIRE (database on sixth row elements) is to provide the scientific community with updated spectroscopic information concerning the sixth row elements (Z = 72-86) of the periodic table in their lowest ionisation stages. In the tables, the spectra are classified by element and, for a given element, by degree of ionisation. For each spectrum, the tables show the wavelengths (in Å) derived from the experimental levels, the lower and upper levels (in cm⁻¹) of the transitions, the calculated weighted oscillator strengths (log gf) and transition probabilities (gA in s⁻¹), and the cancellation factors. The radiative rates listed in this database have been obtained by the Atomic Physics and Astrophysics group of Mons University, Belgium, by means of a systematic and extensive use of the pseudo-relativistic Hartree-Fock (HFR) method (Cowan 1981) modified for taking core-polarisation effects into account (HFR+CPOL) (Quinet et al 1999, 2002

ALADDIN2
The International Atomic Energy Agency (IAEA) ALADDIN (originally 'a labelled atomic data interface') database contains numerical data on atomic and molecular collision processes that are relevant to fusion energy and other plasma applications, as well as data on plasma-material interaction processes. The name goes back to a data exchange format developed in the fusion atomic and molecular data community in the 1980s (Hulse 1990), but the present database at IAEA is based on SQL, and XSAMS is used for data exchange. ALADDIN2 is a subset of ALADDIN accessible through the VAMDC interface and containing at this time only the gas-phase collisional differential and integrated cross sections and rate coefficients. Data in ALADDIN and ALADDIN2 have largely been produced in coordinated research projects or other activities sponsored by the IAEA, or they have been evaluated under the auspices of the IAEA. The ALADDIN database is hosted at http://www-amdis.iaea.org/ALADDIN.

RADAM databases
Some databases aimed at the radiation damage community have lately adopted the VAMDC standards and are therefore currently accessible in VAMDC: RADAM-ION, IDEADB and BeamDB. Those databases are part of a collaboration that has created the RADAM portal at http://radamdb.mbnresearch.com/.
3.26.1. IDEADB. The Innsbruck dissociative electron attachment (DEA) database node holds relative cross sections for dissociative electron attachment processes of the form AB + e⁻ → A⁻ + B, where AB is a molecule. It supports querying by various identifiers for molecules and atoms, such as chemical names, stoichiometric formulae, InChIs, InChIKeys and CAS registry numbers. These identifiers are searched in both the products and the reactants of the processes. The node then returns XSAMS files describing the processes found, including numeric values for their relative cross sections. DEA processes are important in several areas, among which are technical applications, biological effects in radiation damage and extra-terrestrial ion chemistry, which makes this node interesting for the astrophysical and astrochemical community. The node is hosted at the University of Innsbruck, Austria, and maintained by the group of Paul Scheier. New datasets will be added in the future and contributions are highly welcome. IDEADB is hosted at http://ideadb.uibk.ac.at/.
3.26.2. RADAM-ION. The RADAM-ION database concerns ion interactions with biomolecular systems. Currently the RADAM-ION database compiles data on cross sections for elastic and inelastic collisions, for ionisation and charge transfer characterising the interaction of singly and multiply charged ions with biomolecules and biomolecular complexes. These data correspond to total and differential cross sections as well as fragmentation mass spectra (including tables of ion yield) of the biomolecular system in interaction with ions. Additional information on secondary particle production (electrons, radicals) will be included in a second step. It is hosted at http://radam.unicaen.fr/.
3.26.3. BeamDB. BeamDB (Marinković et al 2015) contains measured collisional data for electron interactions with atoms and molecules in the form of differential and integrated cross sections as well as energy-loss spectra. BeamDB is hosted at the Belgrade Astronomical Observatory, http://servo.aob.rs.

LXCAT
LXCAT (Pancheshnyi et al 2012) is an open-access website for collecting, displaying, and downloading electron and ion scattering cross sections, swarm parameters (mobility, diffusion coefficient, etc), reaction rates, energy distribution functions and other data required for modelling low temperature plasmas. The available databases have been contributed by members of the community and are indicated by the contributor's chosen title. The LXCAT website is hosted at http://www.lxcat.net/. The following contributors to LXCAT have chosen to open their data within the VAMDC e-infrastructure and LXCAT maintains this VAMDC access.
3.27.2. Zatsarinny and Bartschat data. Zatsarinny and Bartschat have made their data, currently stored in LXCat, also available via VAMDC. These are generally angle-integrated elastic, momentum-transfer, excitation, and ionisation cross sections from the ground state of the noble-gas atoms, for low and intermediate energies from threshold to about 100 eV. Some angle-differential data, as well as angle-integrated results for transitions between and ionisation from excited states, are available from the authors upon request. The calculations were performed with a parallelised version of the B-spline R-matrix code described by Zatsarinny (2006). Details of the method and references to many recent calculations can be found in a recent Topical Review (Zatsarinny and Bartschat 2013).
3.27.3. UNAM database. The UNAM database (de Urquijo et al 2011, 2013, 2014) contains measured swarm data for the interactions of low energy electrons and ions with rare gases (He, Ne, Ar, Xe), atmospheric gases (O₂, N₂, H₂O, N₂O, CO₂), hydrocarbons (CH₄), fluorocarbons (CF₄, C₂F₆, CHF₃, CF₃I and C₂H₂F₄), and SF₆. The data are in the form of swarm coefficients (electron drift velocities, ionisation, attachment, electron detachment and diffusion coefficients, and ion-molecule reaction rates, all as a function of the density-normalised electric field intensity, E/N). Overall, the interaction energy range is 0.025-100 eV. An update with binary mixtures of H₂O and other gases is in preparation.
3.27.4. ETHZ database. The ETHZ database contains the effective ionisation rate and the electron drift velocity in various gases, such as N₂, CO₂, Ar and several fluorocarbon gases. These data are obtained experimentally using the pulsed Townsend method (Dahl et al 2012) in the high voltage laboratory of ETH Zurich.

Indian atomic and molecular database (IAMDB)
The Indian atomic and molecular research community has been actively involved in atomic and molecular research for decades, especially in the areas of collision physics, spectroscopy, astrochemistry and chemical kinetics. There is a wide spectrum of activity in both theory and experiment covering collisions of electrons, ions and photons and atomic and molecular spectroscopy. Krishnakumar and his collaborators have initiated a programme aimed at the creation of the IAMDB. The aim was twofold: (1) create 'IAMDB' as a repository for the A+M data in India; (2) link it to the VAMDC portal for wider visibility. IAMDB (www.iamdb.co.in) is currently being developed to incorporate data from all the Indian atomic and molecular research groups. Each node in the database is developed independently and the collision node will be ready soon. Later all the nodes will be assembled together to form the complete IAMDB database which will consist of: (1) a collisional data node; (2) an atomic spectroscopy data node; (3) a molecular spectroscopy data node.
3.28.1. The collisional data node. The atomic and molecular physics group at Indian School of Mines, Dhanbad, in collaboration with Tata Institute of Fundamental Research and Sardar Patel University, is setting up a database on collision cross sections for atoms and molecules (iamdb.co.in). This is a general version of the previous database (electron collision cross section, e-atmol.in) with data from ion collisions, photon collisions and spectroscopy. The iamdb.co.in database contains data on electron scattering cross sections for many atoms and molecules in their gaseous form, published in peer-reviewed papers (Goswami et al 2013, Gupta and Antony 2014, Nagma and Antony 2014, Kaur et al 2015).
Preparations are under way for the inclusion of ion and photon collision data. It should be noted that unpublished data may also be included later after due evaluation by experts in the field. This first node of the IAMDB will be ready to go online by the end of 2015, and within VAMDC by 2016.

Web portal
The portal(http://portal.vamdc.eu) is the main single entry point to all VAMDC connected databases. It provides immediate visibility for the available data in the VAMDC connected databases, and in many cases can be the first step towards discovering data since those databases will respond if they contain the requested data. The portal contains several sections: the 'VAMDC Databases' section provides a general description of the VAMDC connected databases and allows display of the species that are available in each database; the 'Guided Query' section is new and provides a query interface for beginners where users are guided step-by-step; the 'Advanced Query' section allows more flexibility in interrogating the databases; The 'info' section provides a link to tutorials which provide a full understanding of the portal's capabilities. Once a query is performed, the user is invited to choose one of the possible tools to visualise the data. These tools are homogeneous between the databases and the description of the retrieved data, in particular of quantum numbers, is also the same across all databases. Such uniformity means that the portal cannot offer the same services as a traditional graphical user interface of the individual databases. This is the price paid for accessing a wide range of data stored in a heterogeneous fashion. Nevertheless, where possible, specific visualisation software is associate with each database. Currently this visualisation software is accessible from '-Choose display-' of the result page (see figure 3), the tools labelled with stars being those that are recommended. The result page is under continuous development in order to meet evolving user needs.
It should be emphasised that the portal has been tuned towards astrophysical users for spectroscopic applications. Indeed the two visualisation tools called 'molecular spectroscopy XSAMS to HTML' and 'Atomicxsams2HTML', which display an HTML page where columns and lines can be selected for molecular and atomic spectroscopic data respectively, allow the selected data to be exported into the VOTable format, a virtual observatory (VO) standard (http://www.ivoa.net) used to exchange data among VO tools. The VOTable can be sent to any VO tool launched on the user's desktop using the 'send via SAMP' functionality of the processor. SAMP is another VO standard (http://www.ivoa.net/documents/SAMP/) that allows several applications to exchange data directly, without the need to save and load a file locally. As a typical example, a selected data file can be sent to an application called TOPCAT (http://www.star.bris.ac.uk/~mbt/topcat/) which is dedicated to the visualisation of tabular data. All functionalities of TOPCAT can be used on the selected VAMDC data.
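To make the export step concrete, the sketch below writes a small selection of lines as a minimal VOTable using only the Python standard library. It is an illustration of the file format, not the code of the portal's processors: the function name, the two column names and the sample rows are assumptions chosen for the example, and a real export would carry many more fields and metadata.

```python
import xml.etree.ElementTree as ET

VOTABLE_NS = "http://www.ivoa.net/xml/VOTable/v1.3"


def lines_to_votable(rows):
    """Serialise (species, wavelength in nm) pairs as a minimal VOTable string."""
    root = ET.Element("VOTABLE", {"version": "1.3", "xmlns": VOTABLE_NS})
    resource = ET.SubElement(root, "RESOURCE")
    table = ET.SubElement(resource, "TABLE", {"name": "lines"})
    # One FIELD element per column, declared before the data block.
    ET.SubElement(table, "FIELD",
                  {"name": "species", "datatype": "char", "arraysize": "*"})
    ET.SubElement(table, "FIELD",
                  {"name": "wavelength", "datatype": "double", "unit": "nm"})
    tabledata = ET.SubElement(ET.SubElement(table, "DATA"), "TABLEDATA")
    for species, wavelength in rows:
        tr = ET.SubElement(tabledata, "TR")
        ET.SubElement(tr, "TD").text = species
        ET.SubElement(tr, "TD").text = repr(wavelength)
    return ET.tostring(root, encoding="unicode")


# Illustrative rows only; not retrieved from any VAMDC database.
sample = [("H", 656.28), ("Na", 589.0)]
print(lines_to_votable(sample))
```

A file written this way can be opened directly in a VO tool such as TOPCAT; sending it over SAMP instead of saving it would additionally require a SAMP client library, which is outside the scope of this sketch.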

Services to include new databases
The current VAMDC e-infrastructure includes databases related to atomic and molecular spectroscopy and to heavy particle collisional processes, and is appropriate to the type of currently accessible data. Any producer of data can join the VAMDC infrastructure through one of the following methods: (1) they can include their data in existing atomic and molecular databases that are partners of VAMDC; (2) they can create a new database hosted by a partner of VAMDC; (3) they can create a new node in the VAMDC e-infrastructure. VAMDC aims to provide atomic and molecular data providers and compilers with a large dissemination platform for their data. Currently all products related to VAMDC, portal and tools, explicitly encourage users to cite both the original papers where the data have been published and the relevant databases.

Libraries and software
The libraries, software modules and software can be found on the VAMDC website 46 . The integration of the libraries is documented, supported via tutorials and illustrated by scientific case studies (Dubernet et al 2014). Until now case studies have concentrated on astrophysical applications since the traditional VAMDC collaborators come from the astrophysics communities. As an example of this, the SPECTCOL tool matches spectroscopic data from CDMS (Mueller et al 2005) and JPL (Pickett et al 1998) with collisional data from BASECOL (Dubernet et al 2013) for interstellar applications. The client tool interrogates the registry to find spectroscopic and collisional information about a molecule. It retrieves different possible sets of data from different databases. The user can associate sets of their own choice to create a customised combination of spectroscopic and collisional data. This automatically resolves the common difficulty met by researchers wishing to combine collisional data from a database such as BASECOL with spectroscopic data coming from other native databases such as CDMS, JPL or any other spectroscopic databases. Combining data involves matching molecular states that are very often described differently in JPL, CDMS and BASECOL. This problem is now resolved easily using XSAMS. A recent application of the SPECTCOL tool has been to the study of protoplanetary disks (Öberg et al 2015).
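The state-matching step can be illustrated with a short sketch. It assumes that, once both sources are expressed in a common description, each state reduces to a dictionary of quantum numbers, so that states can be paired by comparing those dictionaries. The function names, state identifiers and sample levels are hypothetical, and this is not a description of how SPECTCOL itself is implemented.

```python
def state_key(quantum_numbers):
    """Canonical, order-independent key built from a state's quantum numbers."""
    return tuple(sorted(quantum_numbers.items()))


def match_states(spectroscopic, collisional):
    """Pair state identifiers from two sources that share the same quantum numbers."""
    index = {state_key(qn): sid for sid, qn in collisional.items()}
    pairs = {}
    for sid, qn in spectroscopic.items():
        partner = index.get(state_key(qn))
        if partner is not None:
            pairs[sid] = partner
    return pairs


# Hypothetical rotational levels of a linear molecule, labelled differently per source.
spectro = {"S0": {"J": 0}, "S1": {"J": 1}, "S2": {"J": 2}}
collis = {"lvl-a": {"J": 1}, "lvl-b": {"J": 0}}
print(match_states(spectro, collis))
```

States present in only one source (here `S2`) are simply left unmatched, which mirrors the situation where collisional data cover fewer levels than the spectroscopic line list.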
Another example is access to VAMDC connected databases through the BASS2000 web portal (http://bass2000.obspm.fr) that provides access to a wide range of the solar spectrum, from 67 to 5400 nm. This overall spectrum is obtained from Curdt et al (2001) for the UV part, using SOHO/SUMER data; from Delbouille et al (1972) for the visible part, using Kitt Peak National Observatory observations; and from Delbouille et al (1981) for the infrared part, also based on Kitt Peak observations. The visible part of the spectrum is widely used both for preparing and processing observations. Spectro-polarimeters take full advantage of the detailed knowledge of atomic data concerning the observed lines in order to calculate the local magnetic field. Therefore, the solar spectral data have been connected to VAMDC in order to provide directly useful information to the user. This helps in data processing, and also in preparing observations, for instance by tagging atmospheric molecular lines, such as those of water, that may serve for wavelength calibration.
Two astrophysical software packages have implemented access to the VAMDC e-infrastructure, to provide the capability of analysing interstellar medium spectra. One package is the standalone software called CASSIS (http://cassis.irap.omp.eu), the other is a toolbox (http://www.astro.uni-koeln.de/projects/schilke/myXCLASSInterface) for the common astronomy software applications package that includes the MyXClass program (Moeller et al 2015).
Finally SPECVIEW (http://www.stsci.edu/institute/software_hardware/specview/) is a tool from STScI for 1D spectral visualisation and analysis of astronomical spectrograms that can overplot spectral line identifications taken from a variety of line lists, including user-supplied lists. SPECVIEW has been modified to include the VAMDC query module, called QueryBuilder 47 , thus providing the full capability to query the VAMDC databases. As an example SPECVIEW is launched automatically to visualise synthetic spectral libraries for the GAIA mission (http://gaia.esa.int/spectralib/) and now the user can make detailed selections of observed species and properties of linelists in order to overplot spectral line identifications.
Users might want to create new libraries and software; VAMDC provides support for those activities. The VAMDC Consortium can provide the following services to users: (1) services and tools can be adapted to meet specific user requirements; (2) VAMDC capabilities and facilities can be imported into tools developed by institutes external to the consortium; (3) innovative tools for easily handling and processing results can be provided.

The communication services
The VAMDC Consortium has a large communication platform that is made available to producers and users of data. The 'VAMDC Consortium' communication activities occur through its main website, which is the entry point for all customers from research, education, business and outreach, via its News, Events and Blogs sections. VAMDC is present on social networks (Facebook, Twitter, ResearchGate, LinkedIn), uses the natural channel of dissemination at conferences and workshops, and organises tutorials for different categories of users, either on its own or by joining other tutorials linked to atomic and molecular data, to e-infrastructure or to the application fields. We offer a Forum platform that can be used by any group of data users and data producers. The above communication channels and tools are available for general use.

The education services
Education activities cover different target groups and different methodologies. The main target groups are students in secondary school education, higher education and continuous education. The methodologies include the use of VAMDC in face-to-face education sessions or via on-line teaching. The education activities are linked to national curricula and must be displayed in the national language, at least for all levels below university degrees. Nevertheless the coupling of educational activities in science and the use of English is often seen as attractive.
Our objectives for education are the following: to give easy access to atomic and molecular data and information related to these data; to provide innovative pedagogical resources in agreement with national curricula in order to illustrate lectures at all levels of education; to reinforce the link between research and education; to create national networks, and to interconnect them at the international level; to be partners of public institutions; to support teachers and lecturers, and to share with them our scientific expertise linked to e-science; to offer training on the developed education tools.

Future developments
The VAMDC Consortium is currently involved in two updates of its current standards: the serialisation of the data model underlying the XSAMS schema and issues concerning data citation. Indeed the data model underlying XSAMS, being based on physics and chemistry, can be serialised in other languages and technologies should the XML technology become obsolete. In particular, the XML serialisation is very verbose and probably needs to be reconsidered for extracting, transporting and analysing large sets of data such as linelists containing many billions of lines (Tennyson 2014, Sousa-Silva et al 2015).
Since the VAMDC e-infrastructure is designed for scientific usage, we pay special attention to problems linked to data citation and reproducibility of the data-extraction process: in this context we are collaborating with the Research Data Alliance (RDA) and its data citation working group 48 . The aim of this working group is to create an identification mechanism that allows VAMDC services to identify and to cite selected views of data, ranging from a single record to an entire database; the identification includes a timestamp. The VAMDC is implementing the RDA recommendations 49 : we are refining our standards (mainly the XSAMS format) and databases in order to support both data versioning, so that an earlier state of the data can be retrieved at a later time, and timestamping, so that modifications and additions of data are marked with a timestamp. These evolutions are necessary for the scientific usage of atomic and molecular data retrieved through the e-infrastructure, as they will ensure the traceability of data in user application codes.
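One way to picture such an identification mechanism is the sketch below, which derives a short identifier from the node name, a whitespace-normalised query text and the UTC execution time. This only illustrates the idea of citing a timestamped query; the function names, the record layout and the choice of SHA-256 are assumptions made for the example and do not describe the scheme adopted by VAMDC or recommended by the RDA.

```python
import hashlib
from datetime import datetime, timezone


def normalise_query(query):
    """Collapse runs of whitespace so trivially different query texts compare equal."""
    # Note: this also collapses whitespace inside quoted values, a simplification.
    return " ".join(query.split())


def query_citation(query, node, executed_at):
    """Build a citable record for a query; executed_at must be timezone-aware."""
    normalised = normalise_query(query)
    stamp = executed_at.astimezone(timezone.utc).isoformat()
    payload = f"{node}|{normalised}|{stamp}".encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    return {"node": node, "query": normalised, "timestamp": stamp, "id": digest[:16]}


# Hypothetical node name and query, for illustration only.
when = datetime(2016, 1, 1, tzinfo=timezone.utc)
record = query_citation("SELECT *  WHERE InchiKey = 'XLYOFNOQVPJJNP-UHFFFAOYSA-N'",
                        "example-node", when)
print(record)
```

Because the timestamp is part of the hashed payload, re-running the same query at a later time yields a different identifier, which is what allows a versioned database to return the state of the data as it was when the cited query was executed.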

Conclusion
The VAMDC is not a database; rather it is a facility which allows data to be federated from many different databases with overlapping but different remits. Implementation of the VAMDC portal has required the development of a novel data language, XSAMS, to allow the participating databases to be automatically queried, and implementation of common molecule and quantum number identification schemes, allowing data from heterogeneous databases to be both compared and merged. Software products, such as SPECTCOL, are being developed which exploit the functionality offered by combining data from several sources. It is anticipated that as the scope and use of VAMDC increases, a number of similar products will emerge.
The VAMDC Consortium is open to new members and to the inclusion of new types of data. In particular, the scope of the VAMDC Consortium can be extended if: (1) a new community of data providers is interested in benefiting from our experience and from parts of our software; (2) one of our user communities needs different types of data to be combined with the sets of data already available via the VAMDC e-infrastructure. The inclusion of new types of data could impact some of the VAMDC Consortium activities, so the method of integration within the 'VAMDC e-infrastructure' needs careful discussion.
Overall the VAMDC offers access to disparate data sources within the atomic and molecular physics area. The interoperability developed offers a roadmap for other communities wishing to confederate their data sources in a similar fashion.
The Spectr-W³ project activities are currently supported in part by the Russian Foundation for Basic Research, RFBR grant Nr. 14-07-00863. LSR and IEG wish to thank the support from NASA Planetary Atmospheres Grant NNX13AI59G. ExoMol is supported by ERC Advanced Investigator Project 267219. ZB and KB acknowledge support from the United States National Science Foundation under grants PHY-1403245 and PHY-1520970, and by the XSEDE supercomputer allocation PHY-090031. AD and SM thank P Rousseau and B A Huber for fruitful discussions, Quentin Marie for preparing the RADAM-ION database website structure and Université de Caen Normandie for hosting the website. BA is supported by the Dept. of Science and Technology (DST), Govt. of India. Part of this research was supported by the Austrian Science Fund (FWF): P26635. Support from Nano-IBCT COST Action MP1002 (Nanoscale Insights into Ion Beam Cancer Therapy) is also acknowledged. YLB and VGT acknowledge support from LIA SAMIA and from Tomsk State University D Mendeleev funding program. Belgrade node activities and the corresponding research are supported by projects III44002, 176002 and 171020 of the Ministry of Education, Science and Technological Development of Republic of Serbia.
48 https://rd-alliance.org/groups/data-citation-wg.html
49 https://rd-alliance.org/system/files/documents/RDA-DC-Recommendations_150924.pdf