A Semantic Web Solution for E-Government Educational Services

This paper presents a solution for public education services. The problem is that, although sufficient data is publicly available, it is not published in an open standard format and it frequently comes from heterogeneous data sources. Our main contribution consists in integrating this data and making it available to citizens so that they can answer their own queries. We motivate the need for such a solution by referring to current problems such as information overload, the benefits for citizens and the Open Data initiative. We then present our Semantic Web solution. We conclude by presenting the benefits for E-Government in terms of content creation, content generation and user satisfaction.


Introduction
This study presents a Semantic Web solution for E-Government based on integrating web resources related to education. It proposes a framework for integrating conceptual query answering with semantically marked web resources. The need for it derives from the challenges of E-Government. As many authors have noted, the Semantic Web challenges for E-Government consist in: 1) solving the problem of exchanging information meaningfully between different agency systems, 2) guaranteeing the semantic accuracy of information, and 3) dynamically configuring governmental services based on the specification of citizen/business needs [1]. The challenge is to implement e-government systems that allow fluid communication with the general public, thereby achieving a greater degree of participation, which is the key to success for e-government. The desired progression from a purely innovative service into a democratic process that provides efficient, citizen-friendly support and communication depends on information, interaction, transaction and integration [2].
We took the example of educational services because we consider that finding the best-fitting educational offer is a subject of the utmost importance. The aim of this study is to present a method to improve information gathering about educational services. The motivation is the lack of interoperability and of semantic consistency between different data sources. The specific contributions of the paper are: 1) the presentation of a Linked Data application deployed for E-Government educational services, 2) the description of how the data is assembled into a larger data model, and 3) the description of the methods used to generate recommendations from our data model.
The paper is organized as follows: Section 2 discusses the problem statement. Section 3 presents the related work. Section 4 presents our solution. Section 5 is devoted to discussions. Section 6 presents the main conclusions.

Problem Statement
At present, the solution in this area consists of a web site and, possibly, a Facebook page or YouTube channel for every educational unit. The user is responsible for integrating the disparate information. On the one hand, this is a problem of data integration, but more often it is a problem of knowledge gathering and semantic integration. Very often the user does not know the specific name of a high school or which high school is the best rated in his/her area. The user knows only some basic properties that would suit his/her demands, and very often does not know what the most important characteristic of a specialization or high school is. This process of gathering the necessary knowledge may take some time and requires trust, which in an online environment can be provided through online reviews. If the reviewer has included some personal information, and if the web page is integrated with a Facebook Like button or offers the possibility to add a comment by using a Facebook account, a Semantic Web application could offer the necessary integration.
There are a number of important functionalities, relevant to the process of making sense of educational services data, which are currently not supported. The semantic relations between the educational offers, which help to structure the data space, are still a subject of debate. The problem that this paper addresses resides in the question: how can the discovery of relevant data be enabled within the multitude of available data sets, and what technologies are required in order to reach the user? For example, a citizen who wants to find out which specialization to choose for a child who likes mathematics, physics, biology and English needs to carry out some very fine-grained analyses across different data. Finding the best solution requires either 1) marking the relations between those subjects or 2) discovering the relations automatically. This sense-making process requires exploring information about different high schools, teachers, specializations and opportunities, as well as understanding the relations that exist between them.
Since the educational process is very often established at the national level, the appropriate method is to build the necessary model. The model must allow fine-grained analyses over the data and, at the same time, allow the data to be queried according to different questions with alternative meanings. We did not concentrate on proposing the best model; we established what the main components of this model are. Section 4 presents our solution.
DOI: 10.12948/issn14531305/19.4.2015.04

Related Work
The current status of semantic interoperability
At present, most information systems store data in relational databases and make its exchange possible according to well-defined structures, usually using XML schemas. Sharing data according to some sort of schema has been the technological paradigm of the last decades because it enables computer programs to process data efficiently. However, when these schemas evolve, information systems need to be adapted accordingly. Over time, maintaining these schemas requires significant effort and can be quite inflexible, especially when the pace of change is high. This is a key reason for the emergence of a new paradigm for data exchange centered on the Resource Description Framework (RDF). According to its publisher, W3C, "RDF has features that facilitate data merging even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed" [3]. In RDF, data is organized in graphs around subject-predicate-object statements and can be queried using SPARQL. These and other related standards are the foundations of Linked Data.
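As a rough illustration of the subject-predicate-object model and of the data merging that the W3C quotation refers to, the sketch below represents two small graphs as Python sets of triples and queries their union with a wildcard pattern, loosely mimicking a SPARQL basic graph pattern. It uses plain Python rather than an RDF library; the `ex:` identifiers, property names and values are hypothetical and are not taken from the paper's data.

```python
# Minimal sketch: RDF-style statements as (subject, predicate, object) tuples.
# All identifiers below are hypothetical examples.

def match(graph, s=None, p=None, o=None):
    """Return triples matching the pattern; None is a wildcard, like a SPARQL variable."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# Two sources describing the same school with different properties.
ministry_graph = {
    ("ex:school1", "rdf:type", "ex:HighSchool"),
    ("ex:school1", "ex:gradRate", "0.93"),
}
website_graph = {
    ("ex:school1", "ex:name", "Example High School"),
    ("ex:school1", "ex:facebookLikes", "1200"),
}

# Merging is a set union: neither source has to change its own "schema".
merged = ministry_graph | website_graph

everything_about_school1 = match(merged, s="ex:school1")
for triple in everything_about_school1:
    print(triple)
```

Because every statement carries its own predicate, a new source can contribute properties the other source never anticipated, which is the flexibility that fixed XML schemas lack.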
Figure 1 presents the Semantic Web layer cake. Semantic interoperability is therefore concerned not just with the packaging of data (syntax), but with the simultaneous transmission of the meaning with the data (semantics). This is accomplished by adding data about the data (metadata), linking each data element to a controlled, shared vocabulary. The meaning of the data is transmitted with the data itself, in one self-describing "information package" that is independent of any information system. It is this shared vocabulary, and its associated links to an ontology, which provides the foundation and capability of machine interpretation, inferencing, and logic. To make communication effective, data formats need to be harmonized, thus improving syntactic interoperability [4]. Only in the late 1990s did standards like XML become widely adopted. However, XML and related technologies (XSD, Web Services, SOAP, WSDL) left one problem open: the need to share a common naming scheme "off-line" and to agree upfront on the meaning and on strict data typing. Mutual understanding was impossible otherwise. In the 2000s, thanks to semantic standards like RDF and OWL, a new step was achieved: semantic interoperability [4], which entails reaching consensus on the meaning of data elements and the relationships between them. Semantic interoperability requires developing common vocabularies to describe data entities, and ensures that these are understood in the same way by communicating parties [5]. Linked Data is an enabler of semantic interoperability.
The situation of educational services
Lambert (2013) [6] observed that seeking information about educational services is not as common as seeking information about entertainment or leisure, but it represents a category of public interest, manifested by queries addressed to governmental institutions. Information quality plays a critical but indirect role in influencing a person's use of a community municipal portal. In addition, perceived ease of use and compatibility also affect usage [7]. Most local governments are using Web 2.0 and social media tools to enhance transparency but, in general, the concept of corporate dialog and the use of Web 2.0 to promote e-participation are still in their infancy at the local level [8]. Even though recent technological solutions like social media tools, blogs or YouTube channels have proved their utility for spreading information and making communication easier, a list of unsolved issues still exists. Picazo-Vela et al. (2012) [9] noted the main perceived risks associated with social media in the public sector: 1) repetition of content across several platforms, 2) reliability of information published by governments, 3) dispersion of the message across the different channels, 4) lack of information or of updated information, 5) access to data, 6) integrity and validity of the information, 7) the fact that some information is sensitive and cannot be distributed, and 8) the problem that information opens the door to more questions.

Semantic Web
The term Semantic Web first appeared in 2001 in Scientific American, in an article by Tim Berners-Lee, James Hendler and Ora Lassila, and has been defined as a new form of web content that, being understandable by computers, will generate a revolution of new possibilities. The research comes from artificial intelligence (James Hendler is a known author in the field of Artificial Intelligence). After several years devoted to defining standards for data representation on the Web (Resource Description Framework, RDF), ontologies (Web Ontology Language, OWL), vocabularies of terms (RDF Schema, RDFS), queries (SPARQL Protocol and RDF Query Language, SPARQL) and rules (Semantic Web Rule Language, SWRL), the Linked Data initiative (2009) appeared, proposing the interconnection of data in the Internet environment by using the defined representation standards. The aim of the Semantic Web is to enable global and unambiguous access, at any time, to the data present in the Internet environment.
Data is structured using ontologies. Each term/concept is attached to a unique identifier. Information resources can be text, multimedia, transactions (processed by backend applications), links, services and user profiles. All of these are sources of data that are useful for integration and are subject to semantic annotation. The Semantic Web is built on W3C standards: the Resource Description Framework (RDF) (W3C 2012) data model, the SPARQL Protocol and RDF Query Language (SPARQL) (W3C 2012), RDF Schema (W3C 2012) and the Web Ontology Language (OWL) (W3C 2012) for specifying vocabularies and ontologies, and the Semantic Web Rule Language (SWRL) for writing rules over an ontology's concepts. The Semantic Web, as envisioned by the W3C, is in effect a Giant Global Graph constructed by joining many small graphs of data distributed across the Web. A community for Linking Open Data has emerged, developing best practices around the publication of distributed semantic data [10].
Heath et al. (2009) [11] present some valuable information about Linked Data. The four principles of Linked Data are [11]:
1. Use URIs to identify things.
2. Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.
3. Provide useful information about the "thing" when its URI is dereferenced, using standard formats such as RDF/XML.
4. Include links to other, related URIs in the exposed data to improve the discovery of other related information on the Web.
There are some standard concepts available to work with: 1) the ontology, which formally defines a common set of terms that are used to describe and represent a domain (W3C 2012); 2) RDF Schema and OWL, which the ontology uses to define the terms of a vocabulary; and 3) SPARQL endpoints, which developers usually make available so that users can query the data held in a data store. A Semantic Web application must provide, in RDF, links to the web resources that it intends to describe and to any other information related to these resources. The process of semantic marking must conform to the Linked Data principles.
The process of integrating data can be easily maintained by making use of the Semantic Web, a new and promising technology. Semantic Web technology makes it possible to integrate and query data in a much more meaningful manner because of 1) its capability to semantically integrate data coming from heterogeneous sources, 2) its capability to maintain the data store by modifying the ontology, without it being necessary to modify the data provided by every source, and 3) its capability to enrich the given data by querying it and obtaining answers such as: what is the total number, which value is bigger than the average value and which web source does that data characterize, and which web source is positively reviewed by users. Currently there are a number of ontologies that can be used successfully for integrating data describing educational resources, such as:
The Bowlogna ontology, which offers a standard schema for European universities; the Academic Institution Internal Structure Ontology (AIISO), which provides a schema to describe the internal organizational structure of an academic institution; Discourse Relationships, which provides a basic ontology to model (academic) discourse; the ReSIST Courseware Ontology; CITO, FaBio, DoCo, BiRO, C4O, PSO, PRO, PWO and BBO (all of them schemas available for describing bibliographic data about educational resources); and the mEducator schema for educational resources. Hyosook et al. (2012) propose a generic context model based on ontologies. There are some existing applications. For example, data.gov.uk contains several Linked Data applications but, more importantly for the purpose of our article, it offers a Linked Data Education API which lists schools from the Edubase database. The user can query the data by address, religion, gender and type of establishment.
In Europe, access to government data, and the possibility to freely use it, is seen as an enabler for Open Government and a goldmine of unrealized economic potential. Open Data usually refers to public records (e.g. on transport, infrastructure, education, and environment) that can be freely used and redistributed by anyone, either for free or at marginal cost [12]. But opening up data, e.g. in Open Data portals, often happens in an ad-hoc manner, and in many cases thousands of datasets are published without adhering to commonly agreed data and metadata standards and without reusing common identifiers. Hence, a fragmented data-scape is created, where finding, reusing, integrating and making sense of data from different sources is a real challenge. Linked Data can respond to these challenges and can be an enabler of eGovernment transformation, leading to smarter and more efficient government services and applications, and fostering creativity and innovation in the digital economy.
The data flow depicting the ontology-driven mapping of structured and unstructured data into RDF format, and the subsequent use of that data by a Semantic Web application via a SPARQL endpoint, is presented in Figure 2.
Fig. 2. Data flow of a semantic web application [13]
The Semantic Web aims to turn today's existing Web of linked documents into tomorrow's Web of Linked Data. For example, it can help us create search engines that are far superior to today's search engines. Most search engines today only search keywords based on the number of times they occur (and then combine this with various proprietary ranking algorithms, such as inferring authority from the number of backlinks). They do not search based on a true understanding of conceptual information. And by "understanding", we do not have to believe that advanced artificial intelligence is required, although these technologies are part and parcel of the "age of cognitive computing" that IBM now promotes in the context of the Watson system, which beat humans on the show Jeopardy.
It can also help us integrate systems. The RDF data model makes systems inherently easier to integrate than they would be using only traditional data models like relational data in an RDBMS. Systems integration is one of the highest costs in IT today, which Linked Data and Semantic Web technologies can help to alleviate. The architecture of a semantic web application is presented in Figure 3.
Fig. 3. The architecture of a semantic web application [12]
Knowledge-oriented systems can benefit significantly, and systems can be made more knowledgeable, because reasoning engines can be used to reason over the assertions that have been made in order to infer new meaning. And because data from within your organization may be linked to Linked Open Data on the web, you can find relationships and meaning far beyond the scope of the data you manage yourself. This can help make meaningful connections between knowledge bases (medical science institutions with hospitals and pharmaceutical research firms, for example, or, for e-commerce purposes, one corporate enterprise with others). This brings to mind a possible "dark web" where computers are churning through the internet, making connections, reasoning, and processing information on our behalf. The use of semantic web standards is presented in Figure 4.
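The idea that a reasoning engine can infer new assertions from existing ones can be sketched with a single RDFS-style rule: if a resource has type C and C is a subclass of D, then the resource also has type D. The code below applies that rule repeatedly until nothing new is derived. It is a toy forward-chaining loop in plain Python, not a real reasoner, and the `ex:` class and resource names are hypothetical.

```python
def infer_types(triples):
    """Apply (x type C) and (C subClassOf D) => (x type D) until a fixed point is reached."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the current facts as lists so the set can grow inside the loops.
        subclass_pairs = [(s, o) for s, p, o in inferred if p == "rdfs:subClassOf"]
        typed = [(s, o) for s, p, o in inferred if p == "rdf:type"]
        for resource, cls in typed:
            for sub, sup in subclass_pairs:
                if cls == sub and (resource, "rdf:type", sup) not in inferred:
                    inferred.add((resource, "rdf:type", sup))
                    changed = True
    return inferred

# Hypothetical asserted facts: a small class hierarchy and one typed resource.
asserted = {
    ("ex:TheoreticalHighSchool", "rdfs:subClassOf", "ex:HighSchool"),
    ("ex:HighSchool", "rdfs:subClassOf", "ex:EducationalInstitution"),
    ("ex:school1", "rdf:type", "ex:TheoreticalHighSchool"),
}

closure = infer_types(asserted)
new_facts = sorted(closure - asserted)
for fact in new_facts:
    print(fact)
```

A query for all resources of type `ex:HighSchool` would now return `ex:school1` even though no source ever stated that fact directly.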

Fig. 4. Semantic web standards and schema [14]
That is where the business is going to be. Big Data as a concept and technology has existed in the past, but under different names. Now that almost every transaction is stored and storage is no longer a problem, people have become interested. Large volumes of unstructured data are now available and at our disposal. The result of so much data is data-based products: companies have collected essential data from their customers and are using it to target them better.
Here are the four main components of Big Data. Let us examine them:
1. NoSQL/Hadoop. From a technology perspective, it is interesting. But in the end, it is just distributed file storage, with MapReduce to enable distributed data processing. The driver for this is simply that businesses want to store as much data as possible so that it might be useful in the future. This need drives the adoption of NoSQL, and then of Hadoop for processing the data. The assumption is that "more data now = better decision tomorrow", which is wrong on many levels. It has been hot for the last couple of years.
2. Machine Learning. ML technologies have existed for many years. What is new now? Well, NoSQL/Hadoop is new, and being able to apply ML to a large data set has become hot. One way to view this is that NoSQL/Hadoop has not provided the promised return on investment, so ML has been revived in the hope that it could be the answer. ML has potential; however, to realize its value we need better data and a class of business problems related to correlation. There are two gaps here. For one, NoSQL can only give more data, not better data; this could be a garbage in, garbage out problem or, even worse, good data gets lost in an ocean of garbage. The other gap is that businesses tend to look for causality solutions, which is not what ML can provide. Many business problems are correlation discovery problems, and there ML can bring great value, if the data also happens to be better.
3. Text analytics. This is interesting. But our NLP technologies are quite lacking and applications so far are very limited. There are many places where we could use the technology right now, but the technology is not quite there. Once the technology is ready, it will be quite hot.
4. Graph. This is also interesting. We could argue that the technology, or the theory, is there already (obviously it could be more scalable). Other than a few places where a graph solution is naturally desired, such as Google's PageRank, most businesses are not mature enough to have problems for it to solve. This could become quite hot, but only after businesses start to frame their problems as graphs.
If you think of the Semantic Web as a web of information interpretable, and so exploitable, by machines, then you need to assume that there is a way to specify formally the meaning of this information. In this case ontologies are needed, as they represent logical models of the concepts that are involved in modeling and sharing data. Now, this view has been criticized a bit for being too high level and impractical (which I think is wrong, but that is another question). But even if you adhere to a view of the Semantic Web which is less oriented towards formal semantics, ontologies are still essential. They form a structured, reusable specification of the concepts and relations that are involved in sharing data. In other terms, talking about the same kind of things on the Semantic Web mostly means using the same (or a compatible) ontology or ontologies. To give an example, you can expose information about products on the Web, but it is only with the use of a shared ontology (such as GoodRelations) that this information becomes exploitable automatically, at web scale.

Solution
First, we identified the main concepts that are necessary to semantically mark the web resources and, second, we identified the points of relation between all the resources of the educational system service. The RDF model has a number of components related to 1) the hierarchy of specializations offered by high schools, 2) the number of students, the graduation rate of each specialization and the last admitted grade in the admission process, and 3) the lists of resources associated with every specialization. We propose an ontology titled High-School that has the following Uniform Resource Identifier: http://www.knowledgedescisionmaking.ro/highschool#.
The RDF model has classes and properties, as shown in Fig. 5. Our model describes the following relations:
- The ranking of high schools, specializations and subjects, expressed by Facebook likes, although any other review system could be used.
- Relations between Subjects and Specializations.
- Specialization evolution. This makes it possible to visualize migration patterns across areas, thus allowing users to understand where other students are enrolled and which specializations have gained a greater number of students over the years.
Figure 6 presents the RDF document describing the characteristics of the high school that the software agent searched for.
A semantic software agent scrapes the Web and gets the necessary information by parsing the content or, in recent years, by consuming the web services provided by web publishers.
For the moment, it is not very common for high schools or other educational service providers to make web services available.
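To make the "parsing the content" path concrete, the following sketch shows how such an agent might pull property/value pairs out of a page's `<meta>` tags using only Python's standard `html.parser` module. The page snippet, the `hs:` property names and the `MetaPropertyParser` class are hypothetical; the paper does not specify how its agent parses pages, and fetching pages over HTTP is left out.

```python
from html.parser import HTMLParser

class MetaPropertyParser(HTMLParser):
    """Collect <meta property="..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Only <meta> elements carrying both attributes are of interest here.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("property")
        value = attr_map.get("content")
        if name is not None and value is not None:
            self.properties[name] = value

# Hypothetical page published by a high school.
SAMPLE_PAGE = """
<html><head>
  <title>Example High School</title>
  <meta property="hs:gradRate" content="0.93">
  <meta property="hs:facebookLikes" content="1200">
</head><body><p>Welcome</p></body></html>
"""

parser = MetaPropertyParser()
parser.feed(SAMPLE_PAGE)
extracted = parser.properties
print(extracted)
```

Scraping of this kind depends on each publisher's page layout, which is one reason the text prefers descriptions published directly in RDF.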
In our example we illustrate the use of the #Review and #General classes. We used the following properties: #gradRate, #likes, #facebookLikes.
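A description that uses these classes and properties could look roughly like the statements generated below. The sketch builds full URIs from the ontology namespace given earlier and prints them in an N-Triples-like form. Which class each property is attached to, the resource URIs under `example.org` and all literal values are assumptions made for illustration; the actual RDF document is the one referred to as Figure 6.

```python
HS = "http://www.knowledgedescisionmaking.ro/highschool#"  # namespace from the paper
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def uri(value):
    """Wrap a URI in angle brackets."""
    return f"<{value}>"

def literal(value):
    """Render a value as a plain quoted literal."""
    return f'"{value}"'

def describe(resource, cls, properties):
    """Return N-Triples-like lines typing `resource` as `cls` and listing its properties."""
    lines = [f"{uri(resource)} {uri(RDF_TYPE)} {uri(HS + cls)} ."]
    for name, value in properties.items():
        lines.append(f"{uri(resource)} {uri(HS + name)} {literal(value)} .")
    return lines

# Hypothetical resources and values; the property-to-class assignment is assumed.
statements = (
    describe("http://example.org/school1/general", "General", {"gradRate": 1.05})
    + describe("http://example.org/school1/review", "Review",
               {"likes": 1.10, "facebookLikes": 1200})
)

for line in statements:
    print(line)
```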
We denote by s the rate of enrolled students, and we require this rate to be greater than 1, meaning that the number of students enrolled in the current year must be greater than the number of students enrolled in the previous year. L represents the rate of Facebook likes, and g represents the graduation rate of students. We imposed the same condition on each indicator (greater than 1). In order to establish which high school is the best rated, we present the SWRL rule in Figure 7.
The scalability offered by a Semantic Web application is higher than that offered by a traditional web application because the ontology can be easily modified when the domain model changes. An ontology for educational services allows web publishers to describe their resources. These descriptions represent metadata that can be searched, collected or integrated by any software agent. The web publisher reviews the educational service both in an HTML page and in an RDF document which uses the educational services ontology. As soon as the software agent has collected all the RDF files, the data can be queried by using SPARQL in order to obtain answers to specific questions.
The benefits for citizens include a single point of access for multiple lists, and the integration of these lists with web sites and with the high school educational system. Our findings are consistent with other research in the field. There is common agreement that governmental institutions must provide efficient administration through cooperation with citizens and that public services are to the benefit of all citizens. Regarding the use of Facebook, YouTube or Twitter, at the moment many local government institutions make use of these channels. We showed earlier that by using these channels information is dispersed and that, although these channels enhance cooperation, this is not sufficient. We came up with an original approach of using Facebook likes as a review system within a larger application developed with Semantic Web technologies. We used Facebook likes as an indicator of users' preferences for a certain specialization, high school or subject. Consistent with our domain of study, data.gov.uk offered an early example of how to use the Semantic Web in integrating information about educational services. We extended that research and came up with a new approach by taking into consideration users' preferences and by applying some rules related to these preferences, so that the application can provide conceptual answers. One big advantage of consuming Linked Data consists in reusing ontologies. Our study demonstrates that there is a relation between the possibility to search for a certain high school and gaining the user's trust if the web site provides sorting options based on high school popularity, number of Facebook likes and number of views.

Highschool(?h)^has s(?h, s) ^has s(
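Since the rule is only partly legible above, the sketch below restates in plain Python the conditions given in the text rather than the SWRL rule itself: a high school qualifies when its enrolment rate s, its Facebook-likes rate L and its graduation rate g are each greater than 1. It assumes that every rate is the current-year value divided by the previous-year value, which the text states explicitly only for s; the school names and figures are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class YearlyFigures:
    enrolled: int
    facebook_likes: int
    graduates: int

def rate(current, previous):
    """Ratio of this year's value to last year's value (assumed definition)."""
    return current / previous

def is_best_rated(current, previous):
    """True when s, L and g are all greater than 1, as required in the text."""
    s = rate(current.enrolled, previous.enrolled)
    L = rate(current.facebook_likes, previous.facebook_likes)
    g = rate(current.graduates, previous.graduates)
    return s > 1 and L > 1 and g > 1

# Invented data: (previous year, current year) per school.
schools = {
    "School A": (YearlyFigures(200, 900, 180), YearlyFigures(220, 1200, 190)),
    "School B": (YearlyFigures(300, 500, 290), YearlyFigures(280, 800, 285)),
}

best_rated = [name for name, (prev, cur) in schools.items() if is_best_rated(cur, prev)]
print(best_rated)
```

In this invented data School B is excluded because its enrolment fell, even though its Facebook likes grew; all three conditions must hold together.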
The main difference between the current study and previously published studies is, first, that we chose an approach that characterizes high schools by relating the Semantic Web to social media. We did not develop a general model; instead, we deduced which high school is better rated based on users' trust and on data integration with an ontology. Second, we identified and discussed the types of available and missing support, and we proposed potential solutions from the Semantic Web area.

Conclusions
This paper presented the notion of Semantic Web educational services and proposed a conceptual representation of the data and relations required by the process of seeking information about educational institutions. The aim was to integrate these educational services, for semantic purposes, into the large application area of Smart Cities. Social media and E-Government are two areas of application with different speeds of development. Integrating the information that is useful for citizens is a demanding task which can benefit from the Semantic Web. There are some important aspects to note:
- The Semantic Web is not very common at the moment, but it represents a set of technologies which can offer great advantages if web publishers describe their web sites by using ontologies and RDF/OWL.
- As time passes and the technology comes into use, accepted ontologies will become available and, in this way, the whole process of integrating data will be much easier to carry out.
The main benefits for E-Government that our paper introduces consist in designing web applications with integrated Semantic Web functionalities. Being better informed means acquiring intelligence and, therefore, smartness. The second set of advantages comes from the area of social services: by integrating and making available information that is useful for all citizens, many third-sector services might be improved. In summary, Open Data and E-Government need automated support for searching, integrating and visualizing data. This study presented an approach for educational services and helps readers understand the strengths and opportunities of the field.

Fig. 7. SWRL rule that establishes which is the best rated high school