A Wikidata-based tool for building and visualising narratives

In this paper we present a semi-automatic tool for constructing and visualising narratives, intended as networks of events related to each other by semantic relations. The tool is based on an ontology for narratives that we developed. It retrieves and assigns internationalised resource identifiers to the instances of the classes of the ontology using Wikidata as an external knowledge base, and it facilitates the construction and contextualisation of events and their linking to form narratives. The knowledge collected by the tool is automatically saved as a Web Ontology Language graph. The tool also allows the visualisation of the knowledge included in the graph in simple formats like tables, network graphs and timelines. We have carried out an initial qualitative evaluation of the tool. As a case study, a historian from the University of Pisa used the tool to build the narrative of Dante Alighieri's life. The evaluation concerned the effectiveness of the tool and the satisfaction of the users' requirements.


Introduction
Digital libraries (DLs) are information systems that offer information services over large sets of digital objects [1]. DLs were officially born about two decades ago in the Digital Library Initiative. 1 In two successive rounds, this initiative funded several projects and brought DLs into the focus of research. Much progress has been achieved since then, mainly due to major breakthroughs in relevant technological fields, such as communications, information retrieval, ontologies and multimedia. Thanks to these technological advances, today's DLs have a wide penetration and cover a wide spectrum of digital object types, ranging from hypermedia to 3D models. However, the basic information service of a contemporary DL is essentially the same as that of a traditional library: to support users in discovering the digital objects that satisfy an information need, typically expressed as a query consisting of a short list of terms. This kind of discovery service works reasonably well on the Web, which may be viewed as a very large DL whose objects are pages rich in textual content and interlinked with each other.
In contrast, the traditional discovery functionalities of DLs respond to a Web-like query with a ranked list of digital objects based only on semantically poor metadata descriptors. For example, consider a young student wishing to know more about Dante Alighieri, the major Italian poet of the late Middle Ages. She may type "Dante Alighieri" into her favourite Web search engine and would most likely get a list of ranked documents with the Wikipedia 2 page about Dante within the top 5 results. Not willing to spend time reading (which she could do in her favourite traditional reference library), the user tries other Web sites, where she hopes to find something quicker and more exciting to consume than the typical textbook narrative. At some point, she lands on the search pages of some DLs, where she tries her query again. The result is a long list of disparate objects, each offering only a glimpse of Dante's life and works, instead of a complete portrait of who Dante was and what he did.
This generalised behaviour is a consequence of seeing a DL as a traditional library endowed with digital resources managed by software. However, this behaviour is sharply at odds with the very idea at the core of a digital library.
The final report of an NSF sponsored Santa Fe planning workshop on distributed knowledge work environments 3 held in March 1997, defined a digital library as follows: "…the concept of a digital library is not merely equivalent to a digitised collection with information management tools. It is rather an environment to bring together collections, services and people in support of the full life cycle of creation, dissemination, use and preservation of data, information, and knowledge". This vision has been implemented in different ways (e.g. thematic collections, exhibitions) in several projects such as Europeana 4 and the British Library, which have conducted highly successful crowdsourcing campaigns in order to improve their collections [2,3]. 5 Another example is the Internet Archive, 6 a digital library of Web sites and other digital artefacts that collects both user-contributed and curated content [4].
Our research is based on the vision shared by these previous projects and on the idea that DLs should offer something more than a ranked list of objects when a user makes a query on the search form of the DL portal. In particular, we believe that DLs should be able to provide narratives as a result of the users' queries.
Narrative is a well-researched concept in several fields, ranging from literary studies to cognitive science [5]. Giving a full account of the concept is beyond the scope of this paper. As a matter of fact, "narrative can be viewed under several profiles - as a cognitive structure or way of making sense of experience, as a type of text, and as a resource for communicative interaction" [6]. For the purposes of this research, a narrative is intended as a network of "temporally indexed representations of events" [7], i.e. events associated with time structures and related to one another and to the DL resources through semantic links. Indeed, the final goal of our research is to promote narratives as first-class objects in DLs.
In order to introduce this new search functionality, we developed: (i) a formal ontology for representing events and narratives, which reuses existing ontologies in order to maximise interoperability; (ii) a tool based on the ontology, allowing the construction of narratives and their visualisation in an intuitive and useful way.
This paper presents the tool to construct and visualise narratives, which we call "Narrative Building and Visualisation Tool" (NBVT for short). We designed the tool for two kinds of narrators: 7 (i) scholars who want to create a narrative starting from an already existing textual narration written by them, or (ii) general narrators (e.g. school teachers, students) who want to create a narrative based on a textual narration written by someone else, or a narrative existing only in the narrator's mind.
In order to provide the narrator with entities to populate the narrative, NBVT imports knowledge from Wikidata, 8 an open collaborative knowledge base created by the Wikimedia Foundation. The knowledge base accepts editing by any user, following the model of Wikipedia. Wikidata currently contains more than 17 million items. It has reached full compatibility with Semantic Web technologies [8], and it provides a SPARQL endpoint to query the knowledge base. In particular, NBVT uses Wikidata to import a series of entities related to the subject of the narrative, each endowed with its own Internationalised Resource Identifier (IRI). These entities can then be connected to the events that compose the narrative through semantic relations defined in our ontology.
We implemented a visualisation functionality to display the data to users in appropriate and intuitive ways, such as tables, timelines, and network graphs. The functionality obtains the narrative data to be displayed from the underlying triple store, via SPARQL queries.
The paper is structured as follows: Sect. 2 reports related works describing several tools for constructing and visualising narratives; Sect. 3 presents the ontology for narratives we developed and its population using Wikidata; Sect. 4 reports an analysis of the user requirements and describes the tool in detail. Section 5 presents the results of an initial qualitative evaluation of the tool. Finally, Sect. 6 contains our conclusive remarks.

Related works
The long-term aim of our study is to introduce narratives as a new first-class data type in DLs, in order to enrich the content and the search functionality of DLs. As a necessary and preliminary step towards this goal, we have studied the main tools for construction and visualisation of narratives in a user-friendly form. This section overviews the results of this study.
Since the 2000s, narratives have been actively investigated as effective structures able to enhance the information contents and functionalities of DLs, with special emphasis on information discovery and exploration. In this context, several tools have been developed to organise and visualise collections of digital objects using semantic models. For example, the CultureSampo project [9] developed an application to explore Finnish cultural heritage contents on the Web, based on Semantic Web technologies. This system uses an event-based model and creates links among events and digital objects. However, it does not allow the visualisation of the semantic network connecting events and the related digital objects.
The WarSampo system [10] is an extension of CultureSampo that allows the publication of collections of heterogeneous data about the Second World War in a Semantic Web format. The knowledge base underlying WarSampo is built by harmonising different datasets using an event-based model as a pivot ontology.
Bletchley Park Text [11] is a semantic application allowing users to enhance their visits in the museum when they get home. Visitors express their interests using SMS messages containing keywords from a suggested controlled vocabulary. The application uses the semantic description of the resources in the collection to create a personalised Web site that the users can explore after visiting the museum.
The PATHS project [12] created a system that acts as an interactive personalised tour guide through existing digital library collections. A path is a device for ordering, connecting and annotating a series of items of interest that have been collected in a Cultural Heritage digital library. Similarly to the approach of the PATHS project, the CULTURA project [13] developed a tool to enrich cultural heritage collections with guided paths in the form of short lessons called narratives.
The Storyspace system [14] allows the description of stories based on events that span museum objects. The system is focused on the creation of curatorial narratives from an exhibition. Each digital object has a linked creation event in its associated heritage object story.
The CIPHER project [15] developed a set of tools to facilitate the development of a meaningful story or narrative structure from existing or new contents. The aim of the project is to allow authors to establish semantic relations between different contents and to select and put them together. The ontology was based on existent taxonomies and thesauri related to Irish archaeology.
DECHO is a framework for the acquisition, ontological representation and visualisation of knowledge about archaeological objects [16]. The ontological component is based on the CIDOC CRM reference ontology [17]. The visualisation component has the ability to display narratives by linking together images or 3D representations of archaeological objects via semantic hotspots [18].
Another example is the CADMOS suite of applications [19]. CADMOS adopts a computer-supported semantic annotation of narrative media objects (video, text, audio, etc.) and integrates with a large commonsense ontology (YAGO-SUMO). CADMOS also features a visualisation tool, which gives a graphical representation of the basic aspects of the narrative.
The Labyrinth project is an ontology-based system for the visualisation of narratives [20]. The Labyrinth system allows the exploration of digital cultural heritage archives by providing narrative relations among knowledge resources. Labyrinth is not tied to a specific collection of objects, but is an open system that allows the emergence of semantic connections among heterogeneous resources. In 2015, the Labyrinth system was extended with a three-dimensional interface [21]. A similar project is Invisibilia, which is focused on the domain of contemporary public art [22]. Invisibilia takes as input an ontological representation, constructed using a CRM-based ontology for intangible art [23], and outputs a 3D layout featuring the artworks.
Several tools exist that allow the visualisation of data on a particular topic contained in existing knowledge bases (e.g. Wikidata, Freebase) in the form of narratives. For example, Thinkbase and Thinkpedia [24] are two applications which produce visualisations of the semantic knowledge contained in Freebase and Wikipedia, respectively, allowing the user to explore the semantic graphs of the two knowledge bases in an accessible and interactive way. Histropedia 9 allows users to create or view timelines on topics of their choice by importing statements from Wikidata. Links to related Wikipedia articles and Wikimedia Commons images are automatically added, resulting in rich spatio-temporal visualisations. The scope of the project includes research, education, tourism and proprietary applications [25]. Chronas 10 is a chronological and cartographical history application, with a special focus on visualising maps of the world across human history. The knowledge visualised in the application is automatically imported from Wikipedia, but manual additions and modifications by the users are allowed. Histography 11 is a Web application that visualises events imported from Wikipedia, spanning the entire history of the universe. The user can focus on a particular period of history or even a specific event. Storify 12 was a free storytelling environment (no longer available) allowing users to aggregate social media posts, Web pages, photographs and videos, to curate narratives about news events. The tool was also applied for education purposes [26].
In this context, we developed a tool for building and visualising narratives (NBVT). The tool is based on a formal ontology of narratives, grounded in narratology theory [27] and encoded in the languages of the Semantic Web. Our ontology-based approach is motivated by considerations of generality, interoperability and reuse.
In comparison with the tools for organising and visualising knowledge reported above, our tool uses an external and open knowledge base to populate the ontology model. Furthermore, we aim at making the ontology underlying our tool general enough to be understood and possibly reused by the widest possible audience. The use of a general-purpose knowledge base and of a general ontology allows the construction of different types of narratives. Indeed, unlike the tools reviewed above, which focus on a particular domain of knowledge, NBVT was designed from the start to be domain-independent. For example, using our tool, narrators have built narratives about different subjects, e.g. the life of the painter Gustav Klimt, the history of the giant squid and the evolution of climate change.

An ontology for narratives
In order to enhance the search functionality of digital libraries, we developed a formal ontology 13 for representing narratives [28]. To maximise its interoperability, we developed the ontology as an extension of the CIDOC CRM 14 standard ontology. The ontology allows:

- Representing the events that compose a narrative, linked to each other using three types of relations we introduced:
  - Temporal occurrence relation, which associates each event with a time interval during which the event occurs; an event occurs before (or during, or after) another event just in case the period of occurrence of the former event is before (or during, or after) the period of occurrence of the latter event.
  - Causality relation, which links events that in normal discourse are predicated to have a cause-effect relation, e.g. the eruption of the Vesuvius caused the destruction of Pompeii.
  - Mereological relation, which links events to other events that include them as parts, e.g. the birth of Dante Alighieri is part of the life of Dante.
- Linking the events with the related digital objects included in external digital libraries.
- Representing the inferential process of a narrator who reconstructs a narrative starting from the primary sources. 15

13 The interested reader can find a detailed description of the classes and properties of the ontology at: https://dlnarratives.eu/ontology. 14 http://www.cidoc-crm.org. 15 A primary source is a document, a manuscript, an artefact or any other source of information that was created at the time under study.
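The temporal occurrence relation defined above can be illustrated with a minimal sketch. The class and function names below are our own illustration, not the ontology's actual vocabulary, and dates are simplified to years for brevity:

```python
from dataclasses import dataclass

@dataclass
class Event:
    title: str
    start: int  # year; the actual ontology uses full time intervals
    end: int

def occurs_before(a: Event, b: Event) -> bool:
    # a occurs before b iff a's period of occurrence
    # ends before b's period of occurrence begins.
    return a.end < b.start

def occurs_during(a: Event, b: Event) -> bool:
    # a occurs during b iff a's period is contained in b's period.
    return b.start <= a.start and a.end <= b.end

birth = Event("Birth of Dante Alighieri", 1265, 1265)
life = Event("Life of Dante Alighieri", 1265, 1321)
exile = Event("Exile of Dante", 1302, 1321)

assert occurs_before(birth, exile)
assert occurs_during(exile, life)
```

The mereological relation of the ontology would hold, for instance, between `birth` and `life` above, since the former is part of the latter.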
We strived to maintain simplicity and ease of use at the core of the user interface by hiding from view all the technicalities of the ontology in favour of a clear and streamlined narrative creation process. The ontology is encoded in the OWL 2 DL language [29].

Using Wikidata for populating the ontology
As already mentioned, our tool imports knowledge from an existing knowledge base (KB), providing the narrator with a large and detailed set of entities and events with which to build the narrative. Other approaches rely on linguistic methods for extracting knowledge from textual sources in an automatic way. However, these methods are heavily error-prone and as such not suitable for applications such as ours, where accuracy is paramount, as pointed out in Sect. 3. Instead, we opted to rely on the narrator's judgement, while at the same time supporting her as much as possible by proposing accurate information.
For this purpose, we initially considered two KBs: DBpedia and Wikidata. DBpedia [30] is a free KB developed by the Free University of Berlin, the University of Leipzig and OpenLink Software through automatic extraction of structured knowledge from multiple language versions of the Wikipedia collaborative encyclopaedia. It currently contains 4.5 million entities and allows exporting of the data through the DBpedia Lookup Service and a SPARQL endpoint. Wikidata [31] is a free collaborative KB developed and hosted by the Wikimedia Foundation, built by extracting structured knowledge from several open collaborative projects, including Wikipedia, Wikisource, 16 and Wikimedia Commons. 17 It currently contains more than 50 million entities and allows exporting of the data through the Wikidata API and a SPARQL endpoint. Wikidata encourages collaborative addition and editing of the data through manual and automated means.
We identified the following differences between Wikidata and DBpedia:

- Wikidata contains more than ten times as many entities as DBpedia, and it features automatic importing of entities not just from Wikipedia but also from other projects managed by the Wikimedia Foundation.
- DBpedia has many international versions but these are sometimes disconnected from each other, while Wikidata acts as a central hub for the almost 300 language versions of Wikipedia. Wikidata is also more fully multilingual than DBpedia, with more than 39% of its entities having labels in multiple languages.
- DBpedia has a stable ontology that is carefully managed. Wikidata is more open to collaborative user input, allowing the free addition, editing and correction of knowledge by users, and also allowing them to refine and improve the class hierarchy.

However, Wikidata also presents some potential downsides. Chief among these is its collaborative nature, which can sometimes be a double-edged sword. Just like its sister project Wikipedia, Wikidata's openness makes it prone to vandalism from malicious users. Furthermore, it makes the structure of its ontology less stable than that of DBpedia [32].
Another potential downside of Wikidata is that it was not initially conceived as a Semantic Web project. Its internal structure is a document-based database, and compatibility with Semantic Web technologies was added later. In some cases, the structure of Wikidata statements requires reification to obtain a full conversion to RDF [33]. Despite these challenges, in 2014 the Wikidata model was fully translated to RDF [8].
At the end of our analysis, we decided that the advantages of Wikidata outweighed its shortcomings and selected it as the reference KB for the tool.
NBVT leverages the knowledge stored in Wikidata for the following purposes:

- Providing the narrator with a list of entities to use in the narrative, each endowed with its own IRI;
- Importing basic events, such as the birth and death of a person;
- Referencing primary sources and their authors.
In order to extract RDF knowledge to import in our tool, we made use of the Wikidata Query Service 20 (WQS), a SPARQL endpoint that provides full access to the knowledge stored in Wikidata and is constantly updated following each change in the KB.
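The kind of SPARQL query sent to the WQS can be sketched as follows. The endpoint URL and the `wd:`/`wdt:` prefixes are the real WQS conventions, but the query shape and the helper function are our own simplification, not NBVT's actual code:

```python
# Public Wikidata Query Service endpoint (standard wd:/wdt: prefixes
# are predefined there, so the query below omits PREFIX declarations).
WQS_ENDPOINT = "https://query.wikidata.org/sparql"

def entity_info_query(item_ids, lang="en"):
    """Build a SPARQL query fetching label, description and class
    (P31, instance of) for a set of Wikidata item IDs."""
    values = " ".join(f"wd:{qid}" for qid in item_ids)
    return f"""
SELECT ?item ?itemLabel ?itemDescription ?class WHERE {{
  VALUES ?item {{ {values} }}
  ?item wdt:P31 ?class .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{lang}". }}
}}"""

query = entity_info_query(["Q1067"])  # Q1067 = Dante Alighieri
# The query could then be sent with an HTTP GET, e.g.:
# requests.get(WQS_ENDPOINT, params={"query": query, "format": "json"})
```

Batching several item IDs into one `VALUES` clause, as above, keeps the number of round trips to the endpoint low when many entities have to be resolved.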

The tool
In this section we describe NBVT, by first presenting its architecture and then giving the details of its components. The tool is available online 21 and its source code will be released under the GPLv3 licence. 22 Figure 1 graphically presents all actors involved in the operation of NBVT, along with their interrelationships:

- The narrator uses the tool to create, modify and visualise a narrative, possibly representing knowledge that has been derived by reading some texts;
- The narrator operates through the Graphical User Interface (GUI) of the tool, by manually inserting the narrative data based on the narrative ontology and possibly importing resources from Wikidata;
- The narrative is stored as an intermediate JSON representation;
- Once the narrative is complete, the corresponding JSON representation is given as input to the Triplifier, which transforms it into a Web Ontology Language (OWL) ontology encoded as a Resource Description Framework (RDF) graph. A semantic reasoner is used to infer new knowledge. The Triplifier also stores the resulting graph into a triple store;
- The final users access the visualisation components to explore the completed narrative.
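The Triplifier step can be illustrated with a minimal sketch that turns an intermediate JSON representation into RDF triples, serialised here as N-Triples. The JSON shape, namespace and property names below are hypothetical illustrations, not NBVT's actual ontology vocabulary:

```python
import json

# Hypothetical namespace for illustration only.
NARR = "https://dlnarratives.eu/resource/"

def triplify(narrative_json: str) -> list:
    """Convert a toy JSON narrative into N-Triples lines."""
    data = json.loads(narrative_json)
    triples = []
    for ev in data["events"]:
        s = f"<{NARR}{ev['id']}>"
        # One triple for the event title...
        triples.append(f'{s} <{NARR}title> "{ev["title"]}" .')
        # ...and one per entity participating in the event.
        for entity_iri in ev.get("entities", []):
            triples.append(f"{s} <{NARR}hasParticipant> <{entity_iri}> .")
    return triples

doc = ('{"events": [{"id": "e1", "title": "Baptism of Dante", '
       '"entities": ["http://www.wikidata.org/entity/Q1067"]}]}')
for t in triplify(doc):
    print(t)
```

In the actual tool, the resulting graph would additionally pass through a semantic reasoner and be loaded into the triple store, as described above.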

Requirements
An information system serves the needs of a community of users; therefore, analysing users' requirements is an integral part of information systems design, and it is important for the success of interactive systems. As specified in the ISO 9241-210 standard, 23 understanding the needs and requirements of the users is the first step in developing successful systems and products. Indeed, the results of this analysis can bring a project increased productivity, better quality of work, lower costs for support and training, and improved user satisfaction [34]. The first step in users' requirements analysis is to collect background information about the users and the processes that currently take place, through structured interviews. In our project, we used this approach to identify a set of requirements on the easy and user-friendly creation and visualisation of narratives and on the knowledge required to carry out such tasks. The primary users targeted by this research are scholars who want to create and access narratives about the life and the works of the authors they study. These scholars naturally refer to DLs for their work: they use the search service of the DL to discover and access the relevant resources; they use the tools made available by the DL to carry out their work, including communicating with other scholars; finally, they rely on the DL to disseminate and preserve the results of their work.
We interviewed two scholars, both experts in Italian literature at the University of Pisa, deriving the user requirements reported below. For their experience and authoritativeness, these scholars can be considered as representatives of a large community of users. The first requirement that both users emphasised was the accuracy of the information stored in the system. Different scholars may have different opinions as to the events that really happened in, for example, the life of a writer, and the system is expected to account for such differences by documenting them, as explained below. However, this plurality of views should not be confused with the accuracy of the information provided in each view.
In order to create narratives, the tool should allow creating the events that compose narratives in an assisted way, so as to minimise the cognitive and technical burden on the narrator in the selection and identification of the involved resources. This implies support in:

- Defining the factual components that characterise the events, linking each event to persons, places, times, physical and conceptual objects through the appropriate semantic relations.
- Identifying the roles that persons played in the event.
- Defining the type of each event, choosing from a list of pre-defined options.
- Linking one or more digital objects to an event (e.g. objects collected in a DL).
In addition to the creation of the narrative, the tool should support its management as a digital object. These requirements were elicited in three different phases. In between these phases, we developed prototypes illustrating the user requirements satisfied so far. We used feedback from users of the prototypes to validate the solution and refine the requirements.

The NBVT GUI
The NBVT GUI is Web-based and written in HTML5, CSS3 and JavaScript with jQuery. 24 It makes use of the jQuery UI, 25 Bootstrap, 26 and Typeahead.js 27 libraries for interface elements, and of the TimelineJS 28 library for visualisation. It interfaces with a CouchDB database 29 to store data and retrieve it on subsequent loadings. The communication with the database is handled by the PouchDB 30 library, which allows the tool to store the inserted data locally and optionally synchronise it with a remote server. The features of the GUI will be illustrated, along with the services it supports, in Sect. 4.4.
We based the development of the GUI on the user-centred design approach. User-centred design tries to improve the product around how users can, want, or need to use it, rather than forcing the users to change their behaviour to use the product.

Narrative building
The main functionalities of NBVT are the following:

- A functionality that supports the narrator in selecting the subject of the narrative;
- A functionality dedicated to the main narrative creation;
- A functionality for creating causal and mereological relations between the events of the narrative.

Subject selection functionality
When the narrator loads NBVT for the first time, she is provided with a view that allows her to select the subject of the narrative. She can either select the subject from a few default examples or insert the subject name in a text field. Figure 2 shows the GUI of the subject selection functionality. The subject can be any entity that is present in Wikipedia. In the figure we show a few default entities, including notable people from various historical periods (Dante Alighieri, Gustav Klimt, Michelle Obama), a species (Giant Squid) and an organisation (NASA). If the narrator selects one of these default entities, the narrative creation phase begins. If the narrator prefers to search for another subject, she can type some characters in order to be provided with a visual list of possible subjects whose names include the inserted text, created by querying the Wikipedia API 31 and visualised dynamically using the jQuery library. Clicking on the name of a subject starts the narrative creation phase.
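The suggestion lookup described above can be sketched as a call to the Wikipedia API (the URL is the one given in the footnote). The `opensearch` action shown here is a real MediaWiki API action for prefix search; whether NBVT uses this exact action and these parameters is an assumption on our part:

```python
from urllib.parse import urlencode

WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php"

def suggestion_url(prefix: str, limit: int = 10) -> str:
    """Build a Wikipedia API URL returning page titles whose names
    match the text typed so far by the narrator."""
    params = {"action": "opensearch", "search": prefix,
              "limit": limit, "format": "json"}
    return f"{WIKIPEDIA_API}?{urlencode(params)}"

url = suggestion_url("Dante Ali")
# The JSON response would then feed the dynamic list of candidate
# subjects rendered in the GUI.
```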

Narrative creation
This functionality is dedicated to the narrative creation phase. The GUI that supports it divides the screen into three main areas:

1. The left-hand side of the screen contains a list of all entities that can be chosen as event components, and a series of buttons to filter the displayed entities, search them by name, or add new ones.

31 https://en.wikipedia.org/w/api.php.

The narrative creation view is shown in Fig. 3.

Entity Population
During the first load of the narrative creation screen, the tool performs several queries to the Wikipedia API and the Wikidata Query Service (WQS). First of all, a SPARQL query is performed to extract all the relevant information about the subject of the narrative. Then, the tool queries the Wikipedia API to extract all Wikipedia pages that are linked from the one about the subject of the narrative. For instance, for Dante Alighieri it would extract pages about the members of his family, his works, the places where he lived, the people he interacted with, etc. Finally, the tool queries the WQS for the names, descriptions, and classes of each of the Wikidata items corresponding to these Wikipedia pages, using several SPARQL queries. Once all Wikidata queries have completed, the tool classifies the retrieved entities into seven main categories: person, organisation, place, object, concept, work and other (e.g. animals, fictional characters). These correspond to the top classes of the ontology for narratives. Table 1 shows their corresponding classes in the Wikidata ontology. NBVT saves the list of entities to the CouchDB database and displays them in a container on the left-hand side of the screen. Each entity is colour-coded according to its category, in order to allow the narrator to instantly recognise its nature. The narrator can filter entities by clicking on the button corresponding to each category, and each of these buttons is also colour-coded. Another way to filter entities is via the search function, which allows the narrator to find entities by name. A view of the entities container is shown in Fig. 4.
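The classification step can be sketched as a lookup from the entity's Wikidata class (its P31 value) to one of the tool's categories, with a fallback to "other". The mapping below is an illustrative fragment using real Wikidata class IDs; the actual correspondences are those listed in Table 1:

```python
# Illustrative fragment of a class-to-category mapping.
# The Q-IDs are real Wikidata classes; the selection is ours.
CATEGORY_BY_CLASS = {
    "Q5": "person",           # human
    "Q43229": "organisation", # organization
    "Q618123": "place",       # geographical feature
    "Q223557": "object",      # physical object
    "Q151885": "concept",     # concept
    "Q386724": "work",        # work
}

def categorise(wikidata_class: str) -> str:
    """Map a Wikidata class ID to one of the tool's categories;
    anything unrecognised falls into the catch-all 'other'."""
    return CATEGORY_BY_CLASS.get(wikidata_class, "other")

assert categorise("Q5") == "person"
assert categorise("Q729") == "other"  # e.g. animal -> "other"
```

In practice the lookup would also have to walk the Wikidata subclass hierarchy (`P279`), since most items are instances of subclasses rather than of these top classes directly.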
By selecting an entity from the list, the narrator expresses the intention to access the entity to view or to update it, as shown in Fig. 5. 32 The popover is an interface element provided by the Bootstrap library. 33 https://commons.wikimedia.org.

Fig. 4 The entity container
The narrator can also create new entities, using one of three functionalities:

1. Search for a specified entity, querying Wikidata for the name of the entity;
2. Specify a Wikipedia URL, Wikidata URL, Wikidata entity IRI, or Wikidata ID to automatically load the entity from Wikidata;

Event Creation
The right-hand side of the screen contains the form for creating an event. Here the narrator can insert the basic data of each event, including:

- The event title (a string of text);
- The start date of the event (a string of text, automatically parsed);
- The end date of the event (a string of text, automatically parsed);
- The event type (a string of text, selected from a list or inserted manually);
- One or more Wikidata or narrator-defined entities that the narrator has identified as components of the event (added through drag-and-drop from the entity list);
- An optional event description (a string of text);
- An optional textual note (a string of text);
- One or more optional links to digital objects (URLs, automatically parsed).
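The data carried by the event creation form could be modelled as a simple record. This is a minimal sketch with hypothetical field names; the actual internal representation of NBVT is not documented here:

```python
from dataclasses import dataclass, field

@dataclass
class NarrativeEvent:
    # Field names are illustrative; dates are kept as strings because
    # the tool parses free-text dates entered by the narrator.
    title: str
    start_date: str
    end_date: str
    event_type: str
    entities: list = field(default_factory=list)        # Wikidata or custom IRIs
    description: str = ""
    note: str = ""
    digital_objects: list = field(default_factory=list) # URLs

ev = NarrativeEvent(
    title="Baptism of Dante",
    start_date="1266", end_date="1266",
    event_type="baptism",
    entities=["http://www.wikidata.org/entity/Q1067"],
)
```

A record like this maps naturally onto the intermediate JSON representation that the Triplifier later converts into RDF.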

Fig. 6 The event creation form
A detailed view of the event creation form is shown in Fig. 6. When the narrator begins creating an event, she types the title of the event in the "Event Title" field of the form. She then types the start and end dates of the event and inserts an event type. The event type auto-complete menu contains several event types imported from our ontology and the Wikidata ontology, but the narrator can define new ones if needed. The next step is linking the event to its related resources, such as people, places and objects. This is done by selecting an entity from the left-hand list and dragging it onto the Entities field at the centre of the event creation form. When the selected entity is dropped, a popover appears (Fig. 7) that lets the narrator enter information about the sources the narrative is based on. A narrator can specify three types of information: primary source(s), secondary source(s) and notes. In the primary sources section, the narrator can include one or more works that act as primary sources for the event being created, giving their author, title, a reference fragment and a textual fragment. In the secondary sources section, the narrator can insert citations to one or more works that act as secondary sources for the event being created, thereby stating that each such work describes the event. For instance, if the narrator is a biographer of Dante Alighieri, she may cite the biography she has written. The secondary source form has the following fields: author, title, reference (to indicate a section of the work, such as "pp. 12-13") and textual fragment (the text which describes the event). To facilitate the addition of sources and to ensure that the same work is not added twice by the same narrator, the title and author fields for both primary and secondary sources are auto-complete fields built with the Typeahead.js library.

Fig. 7 The sourcing popover for "Florence Baptistery" in the event "Baptism of Dante"
When the narrator inserts the title of a work, she is provided with a list of all works in Wikidata with that title. If she selects a particular work, the author's name is filled in automatically. If instead she starts by inserting a name in the author field, she is provided with a list of all authors found in Wikidata under that name. Upon selecting a particular author from the list, the title field automatically changes to show only the works by that author. In both cases, the source is stored internally with its title and the corresponding Wikidata IRIs. If the source or the author is not present in Wikidata, they are assigned custom IRIs. The narrator can also attach a textual note to the entity, e.g. to give some more information about the participation of the entity in the event. If the entity being linked to the event is a person, the narrator can additionally specify the role that the person played in the event (for instance, in a travelling event, the person who performed the travelling will have the role "traveller"). The candidate roles are suggested by the tool, based on the type of event and on the type of entity. For instance, a singer who also plays guitar would have "singer" and "guitarist" as suggested roles. These roles are part of a vocabulary that is incrementally built in this way, based on the content of Wikidata and possibly enriched with narrator-defined roles. Additional suggestions are based on the event type, e.g. if the event is of type "murder" the role menu will contain "killer", "victim" and "witness". However, the narrator can also bypass the suggestions and insert a role of her own creation. When the role is undefined, it defaults to "participant". After adding all related entities to the event, the narrator can optionally add a textual note about the event and a link to one or more digital objects.
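The entity lookup behind the auto-complete fields can be sketched as follows. The sketch uses Wikidata's public wbsearchentities API; the function names, the return shape and the sample payload are our own illustrative choices, while the network call itself is left out so the parsing logic can be shown in isolation.

```python
import json
import urllib.parse

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def build_search_url(text, language="en", limit=5):
    """URL for Wikidata's wbsearchentities API, as used by an auto-complete field."""
    params = urllib.parse.urlencode({
        "action": "wbsearchentities",
        "search": text,
        "language": language,
        "limit": limit,
        "format": "json",
    })
    return f"{WIKIDATA_API}?{params}"

def parse_search_response(payload):
    """Extract (label, IRI) pairs from a wbsearchentities JSON response."""
    return [(r["label"], r["concepturi"]) for r in payload.get("search", [])]

# Offline example with the shape of a real response:
sample = {"search": [{"label": "Dante Alighieri",
                      "concepturi": "http://www.wikidata.org/entity/Q1067"}]}
candidates = parse_search_response(sample)
```

In the real tool the suggestion list is refreshed on every keystroke, so each candidate already carries the IRI that is later stored with the source.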

Inferred Events
In the case of biographical narratives, several of the events that the narrator would have to create can be inferred from the knowledge already stored in Wikidata. For instance, when describing the life of Dante Alighieri, the narrator will necessarily have to insert an event titled "Death of Dante Alighieri" with the date 14 September 1321. This information is already present in Wikidata, but since the Wikidata ontology is not event-based, the event is not represented explicitly in the knowledge base. Instead, Wikidata uses Property P570 (date of death) to directly link the entity that represents Dante Alighieri to the date value "14 September 1321". In order to facilitate the narrator's work, we developed a feature that extracts basic events from the knowledge contained in Wikidata and uses them to populate the timeline with several "inferred events", before the narrator creates any event of the narrative. The narrator is then able to edit the inferred events, or delete them if they are not needed in her narrative. Currently, the tool handles basic events such as birth, death and marriage for humans, foundation for organisations and creation for physical objects. The list is easily extensible to other kinds of events, thanks to the flexibility of Semantic Web ontologies.
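The inference step can be sketched as a mapping from Wikidata properties to event templates. P569 (date of birth) and P570 (date of death) are the actual Wikidata properties; the mapping table, function name and hand-written claims dict are illustrative, since in the tool the claim values are fetched from Wikidata.

```python
# Hypothetical mapping from Wikidata date properties to event-title templates.
EVENT_PROPERTIES = {
    "P569": "Birth of {label}",   # date of birth
    "P570": "Death of {label}",   # date of death
}

def infer_events(label, claims):
    """Turn flat property/date claims into explicit point events for the timeline."""
    events = []
    for prop, title in EVENT_PROPERTIES.items():
        if prop in claims:
            events.append({
                "title": title.format(label=label),
                "start": claims[prop],
                "end": claims[prop],  # a point event starts and ends on the same date
            })
    return events

# Illustrative claims (the death date is the one stated in the text):
dante = {"P569": "1265", "P570": "1321-09-14"}
events = infer_events("Dante Alighieri", dante)
```

Extending the tool to a new event kind then amounts to adding one entry to the mapping.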

Event Timeline
The bottom side of the screen is a timeline giving a simple view of the events created by the narrator in chronological order. One such timeline is shown in Fig. 8. Whenever an event is saved or loaded as an inferred event, it is automatically added to the timeline. Each event in the timeline shows an abridged representation of the data entered by the narrator, i.e. the title, the dates and the entities that are related to the event. By clicking on an event in the timeline, the narrator can reload the event for subsequent additions or corrections. The narrator can also delete events from the timeline.

Adding event relations
When the narrator has inserted two or more events, she can switch to the Event Relations functionality. In this view, shown in Fig. 9, the narrator can link an event to the events on which it causally depends, or to the events that contain it mereologically. To ease this operation, the bottom timeline slides up, revealing boxes into which events can be freely dragged. The events are taken from a duplicate timeline, which is displayed at the bottom of the screen. In this way, the narrator can freely connect events to each other without losing the chronological overview of the narrative.

Narrative visualisation
In the following, we report a complete description of the visualisation functionalities that satisfy the requirements listed in Sect. 4.1. We adopted a specific visualisation for each query, in order to display the retrieved data in an appropriate way.

Timeline visualisation
First of all, in order to give a complete overview of the narrative, the events are placed on an interactive timeline, as shown in Fig. 10. We adopted the TimelineJS library for the implementation, allowing the final user to visually navigate the semantic network of events.
In particular, the tool extracts the following information from the knowledge base: (i) the title of the event, (ii) the date on which the event starts, (iii) the date on which it ends, (iv) the textual fragment of the secondary source, (v) the title of the primary source, (vi) the textual fragment of the primary source, (vii) the reference of the primary source as reported in the secondary source, e.g. Inferno V, 1-9, and (viii) the IRI of the digital object that represents the event.
In order to load the image of an entity related to each event, a SPARQL query is run on the Wikidata Query Service to extract all available images of Wikidata entities that are present in the narrative. These images are stored in the Wikimedia Commons repository. If the image is present, it is loaded from Wikimedia Commons.
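The image lookup can be sketched as the construction of a single SPARQL query. wdt:P18 is Wikidata's "image" property and its prefix is predefined on the Wikidata Query Service; the function name and the VALUES-based formulation are our own illustrative choices, not necessarily the exact query the tool runs.

```python
def build_image_query(entity_iris):
    """SPARQL query retrieving the Wikimedia Commons images (wdt:P18)
    of all entities that appear in the narrative."""
    values = " ".join(f"<{iri}>" for iri in entity_iris)
    return f"""
SELECT ?entity ?image WHERE {{
  VALUES ?entity {{ {values} }}
  ?entity wdt:P18 ?image .
}}"""

# One query covers every entity of the narrative in a single round trip:
q = build_image_query(["http://www.wikidata.org/entity/Q1067"])
```

The query is then posted to https://query.wikidata.org/sparql, and each returned `?image` IRI points into the Wikimedia Commons repository.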
Finally, for each event, the visualisation output shows the first image retrieved from the results of the query. The right-hand side of the visualisation gives the textual content that describes the event, along with links to the corresponding digital objects, references to the primary and secondary sources, and links to the related entities. The left-hand side displays an image representing the event, extracted from Wikimedia Commons, as shown in Fig. 11.

Event-centred and entity-centred graph
Another requirement for the tool is the visualisation of the entities that compose each event. To this aim, we implemented a SPARQL query to extract this information from the knowledge base. This query retrieves, for each event title, the names and IRIs of the corresponding entities. The visualisation of an event along with its related entities is shown in Fig. 12. We adopted the vis.js 34 JavaScript library to implement the visualisation. The network graph is a star, with the event at the centre and its related entities around it. In order to extract all the events that are related to a specific entity, we implemented a further SPARQL query, whose result is also presented as a network graph. For example, this query can extract all the events (IRIs and titles) that occurred in the city of Florence. The visualisation of an entity with its related events is reported in Fig. 13. In the centre of the graph, the entity "Florence" is shown, and the arcs connect this entity to all the events that took place in Florence.
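Conceptually, the entity-centred view is the inverse of the event-centred one: the event-to-entity links stored in the knowledge base are indexed by entity. The following offline sketch shows this inversion on plain Python data; in the tool the same information comes from a SPARQL query, and the sample narrative below is purely illustrative.

```python
from collections import defaultdict

def events_by_entity(event_entities):
    """Invert event -> entities links into entity -> events links,
    i.e. the star graph centred on each entity."""
    index = defaultdict(list)
    for event, entities in event_entities.items():
        for entity in entities:
            index[entity].append(event)
    return dict(index)

# Illustrative fragment of a narrative:
narrative = {
    "Baptism of Dante": ["Dante Alighieri", "Florence Baptistery"],
    "Exile of Dante": ["Dante Alighieri", "Florence"],
}
stars = events_by_entity(narrative)
```

Each key of `stars` corresponds to the centre of one network graph, and its list of events to the surrounding nodes.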

Primary sources table
One of the most important requirements for a scholar who studies historical events is the knowledge of their primary sources. For this reason, using a SPARQL query the tool retrieves, for each event, its title together with the data of its primary sources (author, title, reference fragment and textual fragment), and presents the results in a table. If an event has more than one primary source, the table shows a row for each source, e.g. the event "Baptism of Dante" has two primary sources: Dante's Inferno XIX 17 and Paradiso XXV 8-9. The reference fragment may be absent, for example when the primary source has no internal subsections, as in Boccaccio's work "Trattatello in laude di Dante".

Table of events that occurred in a specific time range
The final user also has the possibility to visualise all the events that occurred in a specified period of time. The user can freely insert the start and end dates of the period, using a widget to select either a full date or the year only. To make sure that the query always returns a result, the start and end dates of the narrative are suggested as defaults. The results of the query are shown in the form of a table, where the dates of each event are displayed. An example is reported in Fig. 15.
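The underlying filter is an interval-overlap test on the event dates. The sketch below shows it on in-memory data; in the tool the same condition is expressed as a SPARQL FILTER, and the sample events and their dates are illustrative.

```python
from datetime import date

def events_in_range(events, start, end):
    """Events whose time span overlaps the [start, end] interval."""
    return [e for e in events
            if e["start"] <= end and e["end"] >= start]

# Illustrative events (dates are placeholders, not asserted facts):
events = [
    {"title": "Baptism of Dante",
     "start": date(1266, 3, 26), "end": date(1266, 3, 26)},
    {"title": "Death of Dante Alighieri",
     "start": date(1321, 9, 14), "end": date(1321, 9, 14)},
]
hits = events_in_range(events, date(1260, 1, 1), date(1300, 12, 31))
```

Using overlap rather than strict containment means that events spanning the boundary of the chosen period are still reported.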

Relations visualisation
In the ontology, we defined three types of relations between events: mereological, temporal and causal. We developed a visualisation component that shows these relations using a network graph. For instance, the event "Education of Dante" can be divided into three sub-events: elementary school, middle school and further education received under his preceptor Brunetto Latini. The mereological relations between the event "Education of Dante" and its sub-events are shown in Fig. 16. The figure also shows the temporal relations that occur between the sub-events. Also in this case, we used vis.js to implement the graph.

Triplification
Once created, the narrative is encoded as an OWL knowledge base and stored in a triple store. The overall process, which we call "triplification", is depicted in Fig. 17. The narrative is first exported by the GUI as a JSON object; this JSON object is then processed by a Java program, the Triplifier, which transforms it into an OWL graph encoded in the RDF/XML 35 and Turtle 36 formats. The Triplifier carries out its task by relying on the OWL API 37 library. Then, the open-source Openllet OWL DL reasoner 38 is applied to check the graph for consistency and to expand it with inferred knowledge. The graph is finally stored in a Blazegraph 39 triple store.
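To give an idea of the transformation, here is a minimal sketch of the JSON-to-Turtle step in Python rather than in the Java/OWL API pipeline described above. The nar: namespace IRI and the predicate names are hypothetical stand-ins for our narrative ontology, and the string-built Turtle stands in for the proper serialisation the Triplifier produces.

```python
def triplify(event):
    """Serialise one JSON event as Turtle triples (hypothetical nar: vocabulary)."""
    subject = f"<{event['iri']}>"
    lines = [
        "@prefix nar: <https://dlnarratives.eu/ontology#> .",   # hypothetical namespace
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
        f"{subject} a nar:Event ;",
        f'    nar:title "{event["title"]}" ;',
        f'    nar:start "{event["start"]}"^^xsd:date ;',
        f'    nar:end "{event["end"]}"^^xsd:date .',
    ]
    return "\n".join(lines)

ttl = triplify({
    "iri": "https://dlnarratives.eu/event/death-of-dante",  # illustrative IRI
    "title": "Death of Dante Alighieri",
    "start": "1321-09-14",
    "end": "1321-09-14",
})
```

In the real pipeline the resulting graph is additionally checked and expanded by the Openllet reasoner before being loaded into Blazegraph.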

An initial qualitative evaluation
An initial qualitative evaluation has been carried out, in order to measure the effectiveness of the narrative building tool and the satisfaction of the narrator's requirements. The results of the evaluation are presented in this section.

Effectiveness of the narrative building tool
Measuring the effectiveness of NBVT means estimating how well the tool allows the narrator to build a formal narrative in a short time and with little effort. The experiment that we set up to perform this estimation consisted in using the tool to construct a narrative based on a short biography of the Italian poet Dante Alighieri. The choice of a biography is based on the consideration that biographies are a very common kind of narrative in digital libraries: they provide an ideal context for the interpretation of the works included in the digital library, by relating these works to the events of the life of the person who created them. We chose Dante Alighieri because his life is the subject of many studies, and because we had the possibility of working in close cooperation with an authoritative historian from the University of Pisa, who is writing a revised version of Dante's biography based on the most recent discoveries in this field.
The narrative of Dante's life, 40 built by the historian using the tool, is composed of 83 events. Before the creation of the narrative, the historian took part in a training session with us, in which we explained the functionalities of the tool to him and he was able to try them out. After the training session, the historian built the narrative on his own in about five hours of work. This can be considered a very positive outcome, since the historian constructed the narrative starting only from the text. The historian praised the fact that it is possible to formalise a significant narrative from the text in a matter of a few hours. This performance was possible because the tool is provided with a semi-automatic knowledge extraction method that identifies candidate relevant named entities within the narration and automatically imports them from Wikidata. Indeed, 69% of the entities used in the narration of the life of Dante are defined in Wikidata. Figure 18 reports a comparison between the number of entities reused from Wikidata and those created by the narrator: about two thirds of the entities are imported from Wikidata, and only one third were inserted manually.

40 The timeline visualisation of the narrative is available online at the following address: https://dlnarratives.eu/timeline/dante.html. The text used to describe the events is extracted from Wikipedia, since the original text written by the historian is protected by copyright.
This result is significant because: (i) following the Linked Data paradigm [35], we reuse existing entities that already have an identifier (an IRI) and a description; (ii) importing entities from Wikidata is much faster for the narrator than adding them manually; (iii) entities imported from Wikidata allow the narrator to load into the tool unstructured content from projects that are linked to Wikidata, such as text from Wikipedia and images from Wikimedia Commons. Figure 19 shows a comparison of the usage of the Wikidata entities versus the narrator-defined entities, by number of times each entity is used in the narration of Dante's life. The Wikidata entities were used in 84% of the events that compose the narrative.
Due to the positive results of this first evaluation experiment, we plan to perform a larger scale evaluation with a wider community of scholars.

Satisfaction of requirements
In order to prove the satisfaction of each requirement defined in Sect. 4.1, we implemented a corresponding SPARQL query returning exactly what the requirement asked for. In particular, the requirements to consider at this stage of our research are: (i) those relating to the knowledge to obtain from the underlying knowledge base, and (ii) those relating to the visualisation of the narratives. For the reader's convenience, the list of the main requirements is reported below. The knowledge to be extracted from the knowledge base includes:
- Events along with their primary sources.
- Events that happened in a specific range of time.
- Entities related to a specific event.
- Events related to a specific entity (e.g. place, person).
- Events linked by a particular relation (e.g. causal or mereological).

In order to visualise narratives, the tool should allow:
- Visualising a narrative on a timeline.
- Visualising events (all or only some defined by the final user) along with their primary sources in form of a table, exportable in CSV format.
- Visualising events that happened in a specific range of time (defined by the final user) in form of a table, exportable in CSV format.
- Using network graphs to visualise an event and its related entities.
- Using network graphs to visualise a particular entity and its related events.
- Using network graphs to visualise the different types of relations that connect events.
It can be easily verified that the narrative visualisation functionalities presented in Sect. 4.5 adequately address these requirements, providing a further validation of our tool with respect to: (i) the underlying knowledge base, which is rich enough to support the visualisation, and (ii) the visualisation component, which is sophisticated enough to make the best usage of the knowledge base.

Conclusions
In this paper we have presented a semi-automatic tool, called NBVT, for constructing and visualising narratives, intended as semantic networks of events related to each other through semantic relations. The tool facilitates the construction and contextualisation of events and their linking to form narratives. It obeys an ontology for narratives that we developed, and it retrieves and assigns IRIs to the instances of the classes of the ontology using Wikidata as reference knowledge base. In particular, NBVT manages the knowledge stored in Wikidata for the following aims: (i) providing the narrator with a list of entities, and their corresponding IRIs, to use in the narrative; (ii) importing basic events, e.g. the birth and death of a person; (iii) referencing primary sources and their authors. We have carried out an initial qualitative evaluation of the tool: as a case study, a historian from the University of Pisa has used the tool to build the narrative of Dante Alighieri's life. The evaluation regarded: (i) the effectiveness of the narrative building tool; (ii) the satisfaction of the requirements defined at the beginning of the study. To verify the satisfaction of the requirements, we have demonstrated that a SPARQL query can always be built to retrieve the required information from the knowledge base storing the narrative, and for each of these queries we have built a specific visualisation to display the data in a user-friendly way. The evaluation by the historian has confirmed the satisfaction of the requirements. It has also highlighted some key features of the proposed solution, i.e. the possibility to extend the knowledge on each event using related resources (e.g. Wikipedia pages), digital objects described in external digital libraries (e.g. Europeana) and related images extracted from Wikimedia Commons.
Furthermore, the evaluation has highlighted that, starting from a text, the proposed semi-automatic approach allows the formalisation of a significant narrative in a relatively short time. Our tool has also been used by a researcher in Computational Biology at the Italian National Research Council (CNR) to narrate the discoveries related to the giant squid 41 and the effects of climate change, 42 and by a Digital Humanities researcher at the same institution to create the narrative of the life and works of the Austrian painter Gustav Klimt. 43 The development of the ontology for narratives and of NBVT constitutes the first step towards the introduction of narratives in DLs. Indeed, narratives can improve the discovery functionality of DLs by connecting the events that compose them to the digital objects contained in the DLs. As future work, we plan to integrate our tool with the Europeana digital library, in order to allow digital curators to create narratives on the digital objects included in their DLs, and to allow DL users to visualise the created narratives in simple formats such as timelines and tables. We also plan to perform a usability evaluation of the tool through a questionnaire that will be filled in by its users.