From keywords to semantic queries—Incremental query construction on the semantic web
Introduction
With the advance of the semantic web, increasing amounts of data are available in a structured and machine-understandable form. This opens opportunities for users to employ semantic queries instead of simple keyword-based ones to accurately express their information need. However, constructing semantic queries is a demanding task for human users [10]. To compose a valid semantic query, a user has to (1) master a query language (e.g., SPARQL) and (2) acquire sufficient knowledge about the ontology or the schema of the data source. While there are systems which support this task with visual tools [20], [25] or natural language interfaces [4], [12], [13], [17], the process of query construction can still be complex and time-consuming. According to [23], users prefer keyword search and struggle with the construction of semantic queries even when supported by a natural language interface.
Several keyword search approaches have already been proposed to ease information seeking on semantic data [15], [31], [33] or databases [1], [30]. However, keyword queries lack the expressivity to precisely describe the user’s intent. As a result, ranking can at best put query intentions of the majority on top, making it impossible to take the intentions of all users into consideration.
In this paper, we introduce QUICK, a novel system for querying semantic data. QUICK internally works on pre-defined domain-specific ontologies. A user starts by entering a keyword query; QUICK then guides the user through an incremental construction process, which quickly leads to the desired semantic query. Users are assumed to have basic domain knowledge, but do not need specific details of the ontology or proficiency in a query language. In that way, QUICK combines the convenience of keyword search with the expressivity of semantic queries.
The paper presents a detailed realization of the QUICK system, including the following contributions: (1) we defined a framework for incrementally constructing semantic queries from keywords; (2) we devised algorithms to generate near-optimal query construction guides, which enable users to quickly construct semantic queries; (3) to support the QUICK system, we designed a scheme for optimizing the execution of full-text queries on RDF data; (4) we conducted experiments to evaluate the effectiveness of QUICK and the efficiency of the proposed algorithms.
The rest of the paper is organized as follows. Section 2 provides an overview of how to use QUICK. In Section 3 we present our framework for incremental query construction. Section 4 presents algorithms for generating near-optimal query guides. Section 5 introduces optimization techniques to improve query execution performance. In Section 6, we present the results of our experimental evaluation. Section 7 reviews the related work. We close with conclusions in Section 8.
Section snippets
QUICK overview
As illustrated in Fig. 1, the interface of QUICK consists of three parts, a search field (on the top), the construction pane showing query construction options (on the left), and the query pane showing semantic queries (on the right).
Suppose a user looks for a movie set in London and directed by Edgar Wright. The user starts by entering a keyword query, for instance ‘wright london’. Of course, these
Query construction framework
In this section, we introduce the query construction framework of QUICK. We describe our model for transforming keyword queries to semantic queries using an incremental refinement process.
Query guide generation
For a given keyword query, multiple possible query guides exist. While every guide allows the user to obtain the wanted semantic query, they differ significantly in effectiveness as pointed out in Section 3.3. It is thus essential to find a guide that imposes as little effort on the user as possible, i.e., a minimum query guide. Query construction graphs have several helpful properties for constructing query guides:
Lemma 10 (Query construction graph properties). Given a node in a query construction graph, the complete sub-graph with this
Query evaluation
When the user finally selects a query that reflects the actual intention, it is converted to a SPARQL query and evaluated against an RDF store to retrieve the results. The conversion process is straightforward: for each concept node in the query and each edge between nodes, a SPARQL triple pattern is generated. For a concept node, the pattern specifies the node type; for an edge, it specifies the relation between the two nodes. Finally, for each search term, a filter expression is added.
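The conversion described above can be illustrated with a minimal sketch in Python. It is not the QUICK implementation (which is written in Java): the QueryGraph structure, the to_sparql function, the example.org IRIs, and the choice of a case-insensitive regex filter are all assumptions made for illustration, and search terms are assumed to be plain words attached to a variable that one of the edges binds to a literal.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class QueryGraph:
    """Hypothetical in-memory form of a constructed semantic query."""
    # Concept nodes: variable name -> class IRI.
    concepts: Dict[str, str]
    # Edges: (subject variable, property IRI, object variable).
    edges: List[Tuple[str, str, str]] = field(default_factory=list)
    # Search terms: (variable holding a literal, keyword).
    terms: List[Tuple[str, str]] = field(default_factory=list)


def to_sparql(query: QueryGraph) -> str:
    """Emit one triple pattern per concept node and per edge,
    plus one FILTER per search term."""
    patterns = []
    for var, cls in query.concepts.items():
        # Concept node: the pattern fixes the node type.
        patterns.append(f"?{var} a <{cls}> .")
    for subj, prop, obj in query.edges:
        # Edge: the pattern fixes the relation between two nodes.
        patterns.append(f"?{subj} <{prop}> ?{obj} .")
    for var, term in query.terms:
        # Escape backslashes and quotes for the SPARQL string literal.
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        # Assumed filter form; the source does not specify which one is used.
        patterns.append(f'FILTER regex(str(?{var}), "{escaped}", "i")')
    select = " ".join(f"?{var}" for var in query.concepts)
    body = "\n  ".join(patterns)
    return f"SELECT {select} WHERE {{\n  {body}\n}}"


# Example: a movie directed by someone whose name matches "wright".
example = QueryGraph(
    concepts={
        "movie": "http://example.org/Movie",
        "director": "http://example.org/Director",
    },
    edges=[
        ("movie", "http://example.org/directedBy", "director"),
        ("director", "http://example.org/name", "dname"),
    ],
    terms=[("dname", "wright")],
)
print(to_sparql(example))
```

A regex filter is used here only because it is part of standard SPARQL; a system backed by a full-text index would more plausibly delegate term matching to that index.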
Experimental evaluation
We implemented the QUICK system in Java. The implementation uses Sesame2 [5] as the RDF store and the inverted index provided by LuceneSail [18] to facilitate semantic query generation. Parts of the described query optimization approaches have been integrated into Sesame2 version 2.2. We used this implementation to conduct a set of experiments evaluating the effectiveness and efficiency of the QUICK system, and we present the results in this section.
Related work
In recent years, a number of user interfaces have been proposed to facilitate construction of semantic queries. These interfaces can be mainly classified into visual graphic interfaces [20], [25] and natural language interfaces [13], [4], [17]. Natural language interfaces are potentially more convenient for end users [12], as they require little prior knowledge of the data schema or a query language. However, the state-of-the-art natural language interfaces still require users to use a
Conclusion
In this paper, we introduced QUICK, a system for guiding users in constructing semantic queries from keywords. QUICK allows users to query semantic data without any prior knowledge of its ontology. A user starts with an arbitrary keyword query and incrementally transforms it into the intended semantic query. In this way, QUICK integrates the ease of use of keyword search with the expressiveness of semantic queries.
The presented algorithms optimize this process such that the user can construct
References (33)
- et al. Aqualog: an ontology-driven question answering system for organizational semantic intranets. Journal of Web Semantics (2007)
- et al. DBXplorer: a system for keyword-based search over relational databases
- et al. Effective keyword search in relational databases
- et al. The CompleteSearch engine: interactive, efficient, and towards IR & DB integration
- et al. Gino—a guided input natural language ontology editor
- et al. Sesame: a generic architecture for storing and querying RDF and RDF Schema
- et al. Classification schemes revisited: applications to web indexing and searching. Journal of Internet Cataloging (2000)
- QuiKey—the smart semantic commandline (a concept)
- et al. Optimized index structures for querying RDF from the Web
- A. Harth, J. Umbrich, A. Hogan, S. Decker, YARS2: a federated repository for querying graph structured data from the...
- Reducibility among combinatorial problems