Deploying and administrating the ATLAS Metadata Interface (AMI) 2.0 ecosystem

ATLAS Metadata Interface (AMI) is a generic software ecosystem for metadata aggregation, transformation and cataloguing. Benefiting from about 20 years of feedback in the LHC context, the second major version was released in 2018. This paper describes how to install and administrate AMI version 2. A particular focus is given to the registration of existing databases in AMI, the addition of extra metadata and, finally, the generation of high-level HTML 5 search interfaces using a dedicated wizard.


Introduction
Originally developed for the ATLAS experiment [1] at the CERN Large Hadron Collider (LHC), ATLAS Metadata Interface (AMI) is a generic ecosystem for metadata aggregation, transformation and database storage. Benefiting from about 20 years of feedback [2,3], it provides a wide array of tools (command line tools, lightweight clients) and Web interfaces for searching data by metadata criteria.
The second version of AMI, released in 2018, was designed to guarantee scalability, evolvability and maintainability. It is intended to fit the needs of scientific experiments in big data contexts. The design principles and main features were described in [4]. A key feature is the AMI Metadata Query Language (MQL) [5], a Domain-Specific Language (DSL) for querying databases that makes it possible to write queries without knowing the precise relations between tables. In other words, MQL only deals with metadata names, while SQL uses a catalog / table / field paradigm. This paper describes how to install and administrate the AMI ecosystem. The following sections present the hardware and software requirements, then the deployment and administration of both the Java backend and the JavaScript frontend of AMI. Finally, a particular focus is given to the registration of existing databases in AMI and to the graphical design of rich HTML 5 search interfaces with a dedicated wizard.

Installing AMI
Installing the AMI ecosystem is straightforward and does not require specialist skills (system, database, development, …). Most operations are automated and a set of HTML 5 interfaces simplifies the administration of the ecosystem.

Hardware and software requirements
Any machine (x86, x86-64, ARM, …) with at least 4 gigabytes of RAM is able to run the AMI ecosystem. It can be deployed on Linux, macOS and Microsoft Windows. Table 1 shows the minimum software requirements for each component of the ecosystem. The AMI ecosystem stores its configuration in a dedicated database: the configuration database (also called the "router" database for historical reasons). Table 2 lists the supported Database Management Systems (DBMS).

Deploying the AMI Web Framework
There are two ways of deploying the AMI Web Framework (AWF): i) on an Apache/Nginx server; ii) directly on Apache Tomcat. This paper describes the second option.

Finalizing the installation
Before using the ecosystem, the administrator password has to be defined and the AMI configuration database has to be created and populated. A dedicated Web interface is provided for this purpose (URL: https://<domain>:8443/AMI/Setup, see Figure 1). From this point on, AMI is accessible at the URL: https://<domain>:8443/.

The administration dashboard
The ecosystem is centrally administrated via a dedicated dashboard. It has five sections: 1) configuration (JDBC connection pool, command cache, logs, authentication modes, user data protection, …); 2) roles; 3) commands; 4) users; 5) catalogs. It also gives access to three external tools (Schema Viewer, Search Modeler and Monitoring). The AMI administration dashboard is available at this URL: https://<domain>:8443/?subapp=AdminDashboard (see Figure 2). The "catalogs" section is used to register new databases (aka catalogs). It consists of an editable table specifying, for each catalog: the catalog name in AMI, the JDBC URL, the credentials, … When a new catalog is added, clicking on "Flush Server caches (full)" triggers the extraction of all the catalog, table and field metadata (for instance: field name, type, …) and then builds a relation graph between tables. This information is stored in a persistent cache and is used by the AMI Metadata Query Language (MQL) which, as mentioned previously, automatically builds SQL joins.
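The metadata extraction and relation graph described above can be illustrated with a minimal Python sketch. It is not the AMI implementation (the AMI backend is written in Java and accesses catalogs through JDBC); it uses an in-memory SQLite database as a stand-in, and the function name extract_schema, the tables PROJECT and DATASET and the field projectFK are assumptions chosen for illustration only.

```python
import sqlite3
from collections import defaultdict


def extract_schema(conn):
    """Collect field metadata and a table relation graph from a SQLite database.

    Illustrative stand-in only: AMI itself reads this information via JDBC.
    Returns (fields, relations) where
      fields    = {table: [(field name, declared type), ...]}
      relations = {table: [(local field, other table, other field), ...]}
    """
    fields = {}
    relations = defaultdict(list)
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        fields[table] = [(row[1], row[2])
                         for row in conn.execute(f'PRAGMA table_info("{table}")')]
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        for row in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            local_field, other_table, other_field = row[3], row[2], row[4]
            # Store the relation in both directions so that the graph
            # can be traversed from either table.
            relations[table].append((local_field, other_table, other_field))
            relations[other_table].append((other_field, table, local_field))
    return fields, dict(relations)


# Hypothetical two-table catalog used only for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PROJECT (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE DATASET (id INTEGER PRIMARY KEY, name TEXT,
                          projectFK INTEGER REFERENCES PROJECT(id));
""")
fields, relations = extract_schema(conn)
```

In this sketch the result would have to be recomputed whenever the schema changes, which is presumably why AMI keeps the extracted information in a persistent cache and refreshes it on request.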

The Schema Viewer
The AMI Schema Viewer is a tool designed for browsing database schemas and for graphically editing additional catalog, table and field metadata. For example, a field can be tagged as "hidden", "admin only", "crypted", "primary", "groupable", … and a description can be attached to it. Figure 3 shows two screenshots of the Schema Viewer tool.

The Search Modeler
The AMI Search Modeler is a tool for designing rich, cross-catalog search interfaces, see Figure 4. For each interface definition, a set of searchable fields is specified so that the end user is able to select data by criteria on these fields, see Figure 5. For each selection, an MQL query (without any join) is generated and the AMI backend automatically generates a more complex SQL query (with joins). As an example, consider an interface (see Figure 3) for searching "Datasets" by "Project", "Dataset name", "Dataset type", "File name" and "File type". If "Project"='AMI' and "Dataset name"='dataset_2' are selected, the resulting MQL query will be: SELECT * WHERE [`test`.`PROJECT`.`name` = 'AMI'] and (`test`.`DATASET`.`name` = 'dataset_2')
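To give an idea of what turning join-free criteria into SQL with joins involves, here is a minimal Python sketch. It is a simplified illustration, not the algorithm used by the AMI backend: it resolves joins with a breadth-first search over a hand-written relation graph. The graph, the FILE table, the fields projectFK and datasetFK, and the functions join_path and build_sql are all assumptions made for this example.

```python
import sqlite3
from collections import deque

# Hypothetical relation graph: table -> [(local field, other table, other field)]
RELATIONS = {
    "PROJECT": [("id", "DATASET", "projectFK")],
    "DATASET": [("projectFK", "PROJECT", "id"), ("id", "FILE", "datasetFK")],
    "FILE": [("datasetFK", "DATASET", "id")],
}


def join_path(start, goal):
    """Breadth-first search for the shortest chain of relations from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        for local, other, remote in RELATIONS.get(table, []):
            if other not in seen:
                seen.add(other)
                queue.append((other, path + [(table, local, other, remote)]))
    raise ValueError(f"no relation path from {start} to {goal}")


def build_sql(entity, criteria):
    """Build a parameterised SQL query from join-free (table, field, value) criteria.

    Table and field names are assumed to come from the trusted relation graph;
    only the values are passed as bound parameters.
    """
    joins, joined = [], {entity}
    for table, _field, _value in criteria:
        for left, lfield, right, rfield in join_path(entity, table):
            if right not in joined:
                joined.add(right)
                joins.append(f"JOIN {right} ON {left}.{lfield} = {right}.{rfield}")
    where = " AND ".join(f"{t}.{f} = ?" for t, f, _ in criteria)
    parts = [f"SELECT DISTINCT {entity}.* FROM {entity}", *joins]
    if where:
        parts.append(f"WHERE {where}")
    return " ".join(parts), [v for _, _, v in criteria]


# Small in-memory database to run the generated query against.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PROJECT (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE DATASET (id INTEGER PRIMARY KEY, name TEXT,
                          projectFK INTEGER REFERENCES PROJECT(id));
    INSERT INTO PROJECT VALUES (1, 'AMI'), (2, 'OTHER');
    INSERT INTO DATASET VALUES (1, 'dataset_1', 1), (2, 'dataset_2', 1),
                               (3, 'dataset_2', 2);
""")
sql, params = build_sql("DATASET", [("PROJECT", "name", "AMI"),
                                    ("DATASET", "name", "dataset_2")])
rows = conn.execute(sql, params).fetchall()
```

The criteria passed to build_sql mirror the selection of the example above: the user only names fields and values, and the join between DATASET and PROJECT is derived from the relation graph.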

Conclusion
AMI is a well-established and mature metadata ecosystem, offering services and Web interfaces to more than 2000 active users. The second version of the ecosystem, released in 2018, is easy to deploy and does not require specialist skills. This paper showed how to install AMI from the latest binary packages. A set of HTML 5 interfaces makes the administration of AMI fully graphical (administration dashboard), and the Schema Viewer and Search Modeler tools make it possible to create rich interfaces for searching data by advanced metadata criteria.
The AMI Team also provides a ready-to-use Docker [10] image, which is planned to be published on the Docker Hub platform. Additionally, the team is considering deploying the AMI ecosystem with Ansible [11], the application-deployment tool provided by Red Hat.