The Construction of Inference Engine for Meaningful Context and Prediction Based on USN Environment

As interest in living closer to nature grows, the number of artificial wetlands is increasing. These changes affect rivers and streams, so the need for wetland management systems is growing as well. Measuring environmental conditions in a wetland has traditionally required people to visit the site regularly with measurement tools. With such tools alone, however, it is difficult to know the exact state of the wetland. We therefore attached various sensors to the wetland to build a sensor network environment and used the sensed data to infer the wetland's condition. This paper proposes a design based on context inference over a USN (Ubiquitous Sensor Network) and presents production rules, written in JESS, for the context inference engine of a wetland management system. The rules use actual eutrophication criteria as the water quality standard. From facts collected by the sensor network, they decide the eutrophication grade of the wetland environment and then predict the status of the wetland. The sensors measure DO, BOD, SS, and PH. Production rules first assign a level to each fact, and the final rules then decide the eutrophication grade, which serves as the water quality grade.


Introduction
A wetland is wet ground surrounded by rivers, ponds, and swamps, and a reserve of natural resources where water remains for long periods under natural conditions [1,2]. Ongoing maintenance is therefore required to manage important wetlands well, and the state of a wetland needs to be checked constantly. With manual measurement tools alone, however, it is difficult to know exactly how a wetland changes from moment to moment, and the time available to respond may be insufficient.
Accordingly, this study addresses the problem by applying the context inference techniques of a USN (Ubiquitous Sensor Network) and by designing production rules for a wetland control system. The rules produced in this paper decide the eutrophication grade of the wetland environment and then predict the status of the wetland. The paper is organized as follows. Section 2 reviews the background on situation reasoning and the rule-based languages needed to design reasoning rules, and proposes a reasoning engine that predicts the state of water quality in the wetland environment. Section 3 applies the created rules to wetland situation data and evaluates their performance. Section 4 assesses the results, suggests future research, and concludes.

Related Work
Context inference is the inference of new facts from already known facts and knowledge. Context inference methods are broadly divided into two kinds: rule-based reasoning and case-based reasoning [3]. Rule-based reasoning operates equipment whenever a sensor condition is met, regardless of place and user. In contrast, case-based reasoning infers on the basis of previous cases. Compared to rule-based reasoning, it is easy to learn and convenient to adapt to exceptional rules. Generally, rule-based inference is applied in expert systems. Rules make it possible to infer another fact from the data collected by a sensor (Algorithm 1).

(name-of-this-production
    LHS /* one or more conditions */
    ->
    RHS /* one or more actions */
)
Algorithm 1: A representation method of rules.
The production memory is a set of productions (rules). A production is specified as a set of conditions, collectively called the left-hand side (LHS), and a set of actions, collectively called the right-hand side (RHS). One or more conditions appear on the LHS, and the actions on the RHS are carried out when those conditions are met. Such rules are set to take a specific action when data collected by a sensor meet a certain condition or a rule that a user has defined, or to predict that a certain situation may occur. This is called rule-based context inference [4]. For example, a specified action is performed when a factor in the water quality data collected from a sensor takes a specific value.
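The LHS/RHS structure described above can be sketched in a few lines of Python. Python is used here instead of JESS only so that the example runs on its own. The class name `Production`, the rule name, and the DO cut-off of 5.0 are illustrative assumptions and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Facts = Dict[str, float]


@dataclass
class Production:
    """A named rule: fire every RHS action when every LHS condition holds."""
    name: str
    lhs: List[Callable[[Facts], bool]]   # one or more conditions
    rhs: List[Callable[[Facts], None]]   # one or more actions

    def matches(self, facts: Facts) -> bool:
        # all() stops at the first failing condition.
        return all(condition(facts) for condition in self.lhs)

    def fire(self, facts: Facts) -> None:
        for action in self.rhs:
            action(facts)


messages: List[str] = []

# Hypothetical rule; the 5.0 cut-off is a placeholder, not a value from the paper.
low_do = Production(
    name="low-dissolved-oxygen",
    lhs=[lambda f: "DO" in f, lambda f: f["DO"] < 5.0],
    rhs=[lambda f: messages.append(f"DO is low: {f['DO']}")],
)

reading = {"DO": 4.2}
if low_do.matches(reading):
    low_do.fire(reading)
```

Keeping conditions and actions as separate lists mirrors the "one or more conditions, one or more actions" form of Algorithm 1.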
USN refers to an advanced intelligent infrastructure that stores, processes, and integrates information about things and the environment from tags and sensor nodes attached anywhere, so that anyone can freely use personalized services anytime and anywhere through context-awareness information and knowledge content production [5,6].
A sensor network system comprises three parts: various sensors that gather information about a specific environment; middleware that derives meaningful information from the collected data and takes the lead in running appropriate services for users; and service applications that obtain information from the middleware and interact directly with users. The middleware (a) efficiently integrates and controls heterogeneous computing resources distributed across the ubiquitous environment, (b) supports integrity, and (c) infers the meaning of the various information collected, in addition to fundamental roles such as information protection and security functions [7].
The Rete match algorithm is a method for comparing a set of patterns to a set of objects in order to determine all the possible matches. It is described in detail in [8], which reports that enough evidence has accumulated to make it clear that it is an efficient algorithm with many possible applications.
JESS is an abbreviation of Java Expert System Shell; it is a rule-based expert system shell built on the Java language. Its language follows a rule-based system concept similar to LISP, but rules and code are easy to define, and through its integration with Java it offers a powerful Java API environment for networking, graphics, and database connectivity [9, 10].

Eutrophication.
Eutrophication is the addition of artificial or natural substances, such as nitrates and phosphates, through fertilizers or sewage, to an aquatic system [11]. Eutrophication can be human-caused or natural. Untreated sewage effluent and agricultural runoff carrying fertilizers are examples of human-caused eutrophication. However, it also occurs naturally in situations where nutrients accumulate (e.g., depositional environments), or where they flow into systems on an ephemeral basis [12]. Many ecological effects can arise from stimulating primary production, but there are three particularly troubling ecological impacts: decreased biodiversity, changes in species composition and dominance, and toxicity effects [13].
Evaluating the occurrence and progression of eutrophication in the lake is an essential tool for the management of the lake's water quality. The eutrophication evaluation data is the most basic material for the long-term and short-term water management plans. In order to make an accurate assessment of eutrophication in the lake, it is important to collect various measurement data of water quality parameters, but the analysis and interpretation of the collected water quality data is also a very important factor.
The criteria for determining eutrophication are divided into (a) evaluation by a single item and (b) evaluation by multiple items. To determine the exact status, it is desirable to use multiple items together. With this method, we can estimate the general correlation that exists between water quality items, taking physical and chemical characteristics into account. In addition, using two or more items gives a more accurate assessment [14]. Table 1 presents the criteria for determining the degree of eutrophication in a lake. Eutrophication is divided into three major stages: a lake is called oligotrophic when eutrophication is mild rather than severe; mesotrophic when eutrophication has progressed to some extent; and eutrophic when eutrophication has progressed considerably and needs to be managed.
The evaluation items fall into three major groups; among them, the item with the biggest impact on the assessment is T-P (total phosphorus). Chl-a is an abbreviation of chlorophyll-a and indicates the concentration of phytoplankton. SD (Secchi depth) is the simplest indicator for evaluating eutrophication and is inversely proportional to the concentration of phytoplankton. Its disadvantage is that it becomes less accurate when large amounts of suspended matter other than phytoplankton are present.

Configuration of Context Inference System
The main purpose is to design rules for a context inference engine that determines the current status of water quality from data collected by sensors, using rule-based inference written in the JESS language. Figure 1 shows the system model of the context-aware inference engine. The system model uses a standard interface to the sensor network and consists of incomplete data processing, stream data processing, situation awareness, and reasoning support modules.

System Model.
(1) Interface Manager. This component provides a standardized interface for USN sensor networks. Through this common sensor network interface, each sensor network complies with common message specifications for sensor data acquisition, monitoring, and control.
(2) IDPM (Incomplete Data Processing Mechanism) Module. This module analyzes the patterns of incomplete data that appear when power is insufficient or when the communication distance increases, determines the relationships between sensor attributes, and detects and handles imperfections.
(3) Metadata Manager. It uses metadata to combine packets transmitted from a sensor network into complete information, and it reduces frequent database access by applying a specific technique, the Metadata set.
(4) Rule Manager. It receives the sensor network's environmental information processed by the Stream Data Processor and matches it against the inference rules stored in the database. If necessary, the information is reloaded into WM (working memory) for further matching, and a particular action is performed depending on the reasoning results.

(5) Application Interface. It provides applications with an appropriate interface for controlling and monitoring the sensor network.
Figure 2 shows how sensor data flows inside the system. Sensors in the sensor network collect environmental data and send it to the gateway. The stream data passes through the gateway to the USN middleware. As the figure indicates, the stream data preprocessor receives data packets from the gateway and extracts the raw data from the stream. The raw data goes to the metadata manager, which turns it into complete information for inferring the situation in the environment. The rule manager takes charge of the inference tasks needed to provide users with appropriate services. After matching, the results go to the action planner, which communicates with the application interface to deliver the inference results to users.
Table 2 lists the types and features of the sensors needed to determine the water quality status of the wetlands [15]. PH stays within 6.5∼8.5 at every water quality level, so a value outside this range is used only to flag an abnormal condition. The other indicators needed to determine eutrophication are T-N (total nitrogen), T-P (total phosphorus), BOD, SS, DO, and MPN. MPN, however, measures the number of coliform bacteria in the water and cannot be obtained from a sensor, so it is not considered in the rule design. T-P and T-N are also difficult to measure with a sensor, so they serve as the standard for determining eutrophication through their correlations with measurable factors. Four facts (BOD, DO, SS, PH) are therefore used to design the situation reasoning rules.

Correlations between Water Quality and Eutrophication.
To predict the status of water quality in the wetland using factors that can be measured in a sensor, it is necessary to establish the correlations between such factors and the criteria to determine eutrophication.
To set up such standards, the eutrophication assessment criteria, the relations commonly used in evaluating water quality, and data measured in an actual wetland were synthesized to establish the correlation. Once the correlation between water quality items is clear and the degree of eutrophication is determined from it, one item can be estimated from another. Table 3 presents water quality data measured on a regular basis in various areas of Lake Chungju. The factors that a sensor network can measure are DO, BOD, SS, and PH. Because eutrophication has to be determined from values that can be obtained from a sensor, the correlation between these facts was established. Table 4 shows this correlation among the factors.

Table 4: Correlation among the factors.
(T-N, T-P) increase → DO decreases → BOD increases
Chl-a decreases → SS decreases
SD decreases → SS increases

DO can be used as one fact of the eutrophication criteria, from which eutrophication can be determined independently. According to the actual data, the DO level shows signs of increasing as the lake moves from oligotrophic to eutrophic. BOD is typically used as an indicator of water quality and is generally inversely proportional to DO. Its level increases as the lake moves from oligotrophic to eutrophic.
SS measures the amount of suspended solids in the lake. It is closely related, and inversely proportional, to SD (transparency), which is one of the criteria for determining eutrophication. Transparency in turn is closely related to the concentrations of N and P and to the concentration of chlorophyll-a, which changes with the growth of algae. In a lake that contains a lot of suspended matter other than phytoplankton, however, it is not such an important indicator for determining eutrophication.
PH remains nearly constant regardless of water quality, but it is used as a criterion for flagging an abnormal condition when the level is too low or too high.
Figure 3 shows the operating procedure of the inference module. First, an event occurs in the network and data values from a sensor come in (input data). These raw values are stored in the working memory (WM). The user's predefined set of rules is stored in the database. By comparing the sensed values in WM with these rules, the facts-level values are determined. The facts-level values are then loaded into WM and matched against the rules that finally determine water quality. This processing is carried out by the pattern matcher of the inference engine. The pattern matcher compares the fact values stored in WM with the rules stored in the database and places the matches in the conflict set. The conflict set calculates reliability according to the priority order of the matches and sends the match considered most accurate to the execution engine, and the service module then performs the corresponding action.
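The match, conflict-resolution, and execution cycle described above can be sketched as a naive forward-chaining loop in Python. This is not the Rete algorithm and not JESS itself: it rechecks every rule on each pass, which is enough to show the control flow. The names `Rule` and `run_engine`, the priorities, and the DO cut-off of 9.0 are assumptions made for illustration; only the PH bounds (above 9 or below 6) are taken from the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

WorkingMemory = Dict[str, float]


@dataclass
class Rule:
    name: str
    priority: int
    condition: Callable[[WorkingMemory], bool]
    action: Callable[[WorkingMemory], None]


def run_engine(rules: List[Rule], wm: WorkingMemory) -> List[str]:
    """Repeat match -> resolve -> act until no unfired rule matches."""
    fired: List[str] = []
    while True:
        # Pattern matching: every not-yet-fired rule whose condition holds.
        conflict_set = [r for r in rules
                        if r.name not in fired and r.condition(wm)]
        if not conflict_set:
            return fired
        # Conflict resolution: the highest-priority match is executed.
        chosen = max(conflict_set, key=lambda r: r.priority)
        chosen.action(wm)
        fired.append(chosen.name)


def set_do_level(wm: WorkingMemory) -> None:
    # Placeholder cut-off, not a value from the paper.
    wm["do-level"] = 2 if wm["DO"] < 9.0 else 1


def flag_ph(wm: WorkingMemory) -> None:
    wm["ph-abnormal"] = 1.0


def mark_graded(wm: WorkingMemory) -> None:
    wm["graded"] = 1.0


RULES = [
    # Second stage: needs a level fact produced by an earlier rule.
    Rule("grade", 0, lambda wm: "do-level" in wm, mark_graded),
    Rule("ph-warning", 5,
         lambda wm: "PH" in wm and (wm["PH"] > 9 or wm["PH"] < 6), flag_ph),
    Rule("do-level", 10, lambda wm: "DO" in wm, set_do_level),
]

wm = {"DO": 7.8, "PH": 9.5}
order = run_engine(RULES, wm)
```

In this sketch the level rule fires first because of its higher priority, and the grading rule becomes eligible only after the level fact has been written back into working memory, which is the two-stage behaviour the text describes.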

The Design of Production Rules.
Rules are designed using JESS, a rule-based language and rule engine for the Java platform.
In JESS, logic is specified in the form of rules written in the JESS rule language, and data are supplied for the rules to operate on. One of the strengths of JESS is that rules can create new data or do anything that the Java programming language can do. Another advantage is that the JESS library can be embedded in Java code and manipulated through its own Java API.
Definition of facts. Deftemplate is the construct used to define unordered facts in JESS. Facts defined with deftemplate are independent of order because each value is accessed by its slot name. As the facts used here do not depend on order, they are defined as shown in Algorithm 2.
Algorithm 2 shows the facts that can be collected from the wetland environmental sensor network. The indicators are the four kinds of water quality management standards. Facts named "facts-level" are needed to design the rules that determine the final water quality. Based on the water quality management standard, DO (see Algorithm 3) and BOD (see Algorithm 4) each have 3 standard points and are divided into 3 stages, and SS (see Algorithm 5) is divided into 2 stages. This division in the water quality standard rules follows from the relationship with the eutrophication criteria.
As DO plays a role in determining eutrophication in the wetland, it is given high priority. The DO rules define 3 levels. Each rule is divided, at the symbol =>, into the LHS (the if part on the left) and the RHS (the then part on the right). When DO is on level 3, water quality is bad and eutrophication is likely. The importance of this case was therefore taken into account in the rule design, and the priority of the corresponding rule was set higher. Like DO, BOD consists of 3 levels, and the rules set the level according to the range of the BOD value. As seen from the correlation above, BOD is inversely proportional to DO. The closer to BOD-level 1, the higher the BOD value; from this we can see that the DO level is low and that water quality is not good. Conversely, closer to level 3 the BOD value decreases, and we can presume that water quality is better.
Algorithm 5 shows the water quality assessment rules based on SS values. SS is closely related to SD (transparency), one of the eutrophication criteria, and is used as a substitute for transparency.
The closer to SS-level 2, the greater the amount of suspended solids, which indicates lower transparency and therefore poorer water quality. However, if the lake contains a lot of suspended matter other than plankton, transparency becomes less accurate. Its importance is therefore considered relatively low in the rule design, and the priority of these rules is set low.
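The first-stage rules, which turn each raw sensor value into a level fact, can be sketched as a threshold lookup in Python. The cut-off values below are placeholders, not the paper's standard points. They were picked only so that the reading quoted in the Experiment section (DO 7.8, SS 1.3, BOD 1, PH 7) maps to the levels reported there. The assumption that a larger value always yields a higher level number is also a simplification of this sketch.

```python
from bisect import bisect_right
from typing import Dict, List

# Placeholder cut-offs, NOT the paper's standard points.
# Two cut-offs give 3 levels (DO, BOD); one cut-off gives 2 levels (SS).
CUTOFFS: Dict[str, List[float]] = {
    "DO": [5.0, 9.0],
    "BOD": [2.0, 4.0],
    "SS": [1.0],
}


def level(factor: str, value: float) -> int:
    """Return 1 + the number of cut-offs that are <= value."""
    return bisect_right(CUTOFFS[factor], value) + 1


def facts_levels(reading: Dict[str, float]) -> Dict[str, int]:
    """Derive '<factor>-level' facts; factors without cut-offs (PH) are skipped."""
    return {
        f"{factor.lower()}-level": level(factor, value)
        for factor, value in reading.items()
        if factor in CUTOFFS
    }


levels = facts_levels({"DO": 7.8, "SS": 1.3, "BOD": 1.0, "PH": 7.0})
```

PH is left out on purpose, since the text uses it only for an abnormality warning and not for a level.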
PH (see Algorithm 6) does not affect the determination of eutrophication, but a PH that is too high or too low indicates a potentially dangerous situation, so a warning message is printed when PH is more than 9 or less than 6.
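The PH check can be written as a single condition. The following Python sketch uses the two bounds stated above (more than 9, less than 6); the function name and the message wording are assumptions, and Algorithm 6 itself is a JESS rule.

```python
from typing import Optional


def ph_warning(ph: float) -> Optional[str]:
    """Return a warning message when PH is more than 9 or less than 6."""
    if ph > 9 or ph < 6:
        return f"WARNING: abnormal PH value {ph}"
    return None  # PH within bounds: no message, no effect on the grade
```

For example, `ph_warning(7.0)` returns `None`, while `ph_warning(9.5)` returns a warning string.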
Algorithm 7 shows part of the final rule, which was made by synthesizing the DO, BOD, and SS level facts. If DO-level is 2, BOD-level is less than 2, and SS-level is 2 or less, the result is Grade D (warning), and a warning message is printed out.
The risk of eutrophication is divided into six grades: stable (A), to be observed (B), cautious (C), warning (D), dangerous (E), and urgent (F). The criteria for determining the risk are set according to the importance of the rules. DO is an important factor in determining the status of eutrophication, so in setting the grade its priority is set much higher than that of the other factors, BOD and SS.
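The one final rule spelled out in the text can be sketched in Python as follows. Only the Grade D condition is given in the paper's prose, so the function returns `None` for every other combination instead of guessing at the remaining rules. The names `GRADES` and `final_grade` are illustrative.

```python
from typing import Optional

# Grade labels as listed in the text.
GRADES = {
    "A": "stable",
    "B": "to be observed",
    "C": "cautious",
    "D": "warning",
    "E": "dangerous",
    "F": "urgent",
}


def final_grade(do_level: int, bod_level: int, ss_level: int) -> Optional[str]:
    """Apply the single final rule described in the text (Grade D)."""
    if do_level == 2 and bod_level < 2 and ss_level <= 2:
        return "D"
    # The other final rules are not reproduced in the text, so no grade is guessed.
    return None


grade = final_grade(do_level=2, bod_level=1, ss_level=2)
label = GRADES[grade] if grade is not None else "undetermined"
```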
Experiment.
To design and test the rules that determine the status of eutrophication in the wetland, data measured in various areas of Lake Chungju were used. Table 5 lists 10 data records selected at random from the Lake Chungju water quality data. An experiment was conducted to check whether the designed rules operate correctly when these 10 data facts are entered. To check whether the degree of eutrophication is determined correctly, the degree of eutrophication was also determined from the test data using the eutrophication criteria (T-P, T-N, SD, chl-a) and then compared with the rule output. Figure 4 shows the result obtained after entering one test record into JESS. This record has the values DO 7.8, SS 1.3, BOD 1, and PH 7. The results obtained from the rules are BOD-level 1, SS-level 2, and DO-level 2. From this, we can see that grade D (warning) is set and the warning message is printed out. The degree of eutrophication obtained through this reasoning process is then compared with the one determined from the actual data. Table 6 presents the comparison between the test results and the criteria for the actual degree of eutrophication. The inference result agrees with the degree of eutrophication in the lake in 80% of cases. This shows that the result of reasoning from facts measured by the sensor network agrees closely with the result measured in the actual field.
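The agreement figure can be read as the share of test records for which the inferred result equals the reference result. The Python sketch below shows that calculation. The two label lists are synthetic placeholders constructed so that 8 of 10 entries match; they are not the contents of Table 5 or Table 6.

```python
from typing import List


def agreement_rate(inferred: List[str], reference: List[str]) -> float:
    """Fraction of positions where the inferred label equals the reference."""
    if not inferred or len(inferred) != len(reference):
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(1 for a, b in zip(inferred, reference) if a == b)
    return matches / len(inferred)


# Synthetic placeholder labels, NOT the paper's data.
reference = ["oligotrophic"] * 3 + ["mesotrophic"] * 5 + ["eutrophic"] * 2
inferred = list(reference)
inferred[0] = "mesotrophic"   # first deliberate mismatch
inferred[9] = "mesotrophic"   # second deliberate mismatch

rate = agreement_rate(inferred, reference)
```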

Conclusions
In the past, determining the water quality rating of a wetland, such as a stream or river, required a person to go out to the field on a regular basis to measure, compare, and decide. This paper instead applies the ubiquitous computing concept to the wetland environment. Based on facts that can be collected from a sensor network, it establishes the correlation between those facts and the eutrophication criteria determined in the field, and it designs situation reasoning rules that determine the eutrophication rating. With these rules, the status of water quality in the wetland can be predicted much faster and without manual effort. The paper applies the designed situation reasoning rules to data collected from an actual wetland environment and evaluates their accuracy.
More research is needed to improve accuracy by designing rules for the various situations that may occur in a wetland and by developing detailed rules that can be applied to more water quality data. These rules should be reflected in the design of the situation reasoning engine, and the engine should be developed continuously so that users can obtain more accurate reasoning results.