Decision Analysis via Granulation Based on General Binary Relation

Decision theory considers how best to make decisions in the light of uncertainty about data. There are several methodologies that may be used to determine the best decision. In rough set theory, the classification of objects according to approximation operators can be fitted into the Bayesian decision-theoretic model, with respect to three regions (positive, negative, and boundary). Granulation using equivalence classes is a restriction that limits decision makers. In this paper, we introduce a generalization and modification of the decision-theoretic rough set model by using granular computing on general binary relations. We obtain two new types of approximation that enable us to classify objects into five regions instead of three. Classifying the decision region into five areas enlarges the range of choice for decision makers.


Introduction
Making decisions is a fundamental task in data analysis. Several methods for making decisions have been proposed. Yao and Wong [1] proposed and studied a more general type of probabilistic rough set approximations via Bayesian decision theory. In Section 2, we give a brief overview of granulation structures on the universe. One is defined by an equivalence relation due to Pawlak [2] and the other by a general relation proposed by Rady et al. [3]. Approximation structures are discussed for each type of granulation. Section 3 discusses a decision-theoretic model of rough sets under equivalence relations given by Yao and Wong [1]. Our main contribution is to introduce a general decision-theoretic model of rough sets using a general relation. The resulting granulation induces approximations different from those due to Pawlak. This enables us to construct two new approximations, namely, semilower and semiupper approximations, which are useful for partitioning the boundary region in particular, and the universe in general, with respect to any subset of the universe.

Granulation of universe and rough set approximations
In rough set theory, indiscernibility is modeled by an equivalence relation. A granulated view of the universe can be obtained from equivalence classes. By generalizing equivalence relations to binary relations, one may obtain a different granulation of the universe. For any kind of relation, a pair of rough set approximation operators, known as lower and upper approximation operators, can be defined in many ways (Pawlak [2], Rady et al. [3]).
Let E be an equivalence relation on U. The set $[x]_E = \{y \in U : xEy\}$ consists of all elements equivalent to x, and is also the equivalence class containing x.
In an approximation space apr = (U, E), Pawlak [2] defined a pair of lower and upper approximations of a subset A ⊆ U, written as $\underline{apr}(A)$ and $\overline{apr}(A)$, or simply $\underline{A}$ and $\overline{A}$, as follows:
\[ \underline{apr}(A) = \{x \in U : [x]_E \subseteq A\}, \qquad \overline{apr}(A) = \{x \in U : [x]_E \cap A \neq \emptyset\}. \tag{2.2} \]
The lower and upper approximations have the following properties.
For every A, B ⊆ U and every approximation space apr = (U, E),
(8) $\underline{apr}(-A) = -\overline{apr}(A)$,
(9) $\overline{apr}(-A) = -\underline{apr}(A)$,
(10) $\underline{apr}(\underline{apr}(A)) = \overline{apr}(\underline{apr}(A)) = \underline{apr}(A)$,
(11) $\overline{apr}(\overline{apr}(A)) = \underline{apr}(\overline{apr}(A)) = \overline{apr}(A)$,
(12) if A ⊆ B, then $\underline{apr}(A) \subseteq \underline{apr}(B)$ and $\overline{apr}(A) \subseteq \overline{apr}(B)$.
Moreover, for a subset A ⊆ U, a rough membership function is defined by Pawlak and Skowron [4]:
\[ \mu_A(x) = \frac{|[x]_E \cap A|}{|[x]_E|}, \]
where | · | denotes the cardinality of a set. The rough membership value $\mu_A(x)$ may be interpreted as the conditional probability that an arbitrary element belongs to A given that the element belongs to $[x]_E$.
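The Pawlak approximations and the rough membership function can be illustrated with a minimal Python sketch. It assumes the equivalence relation is supplied as a partition of the universe into blocks; the function names and the toy data are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def equivalence_class(x, partition):
    """Return the block [x]_E of the partition that contains x."""
    for block in partition:
        if x in block:
            return block
    raise ValueError(f"{x!r} is not covered by the partition")


def lower_approximation(A, partition):
    """Union of all blocks that are entirely contained in A."""
    return {x for block in partition if block <= A for x in block}


def upper_approximation(A, partition):
    """Union of all blocks that have at least one element in A."""
    return {x for block in partition if block & A for x in block}


def rough_membership(x, A, partition):
    """|[x]_E ∩ A| / |[x]_E|, kept exact with Fraction."""
    block = equivalence_class(x, partition)
    return Fraction(len(block & A), len(block))


# Toy universe {1,...,6} split into three equivalence classes (hypothetical data).
partition = [frozenset({1, 2}), frozenset({3}), frozenset({4, 5, 6})]
universe = set().union(*partition)
A = {1, 2, 4}

lower = lower_approximation(A, partition)
upper = upper_approximation(A, partition)
```

Here the block {1, 2} lies inside A while {4, 5, 6} only overlaps it, so the elements 4, 5, and 6 fall between the two approximations, and the duality between the lower approximation of the complement and the upper approximation of A can be checked directly on this data.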

Granulation by general relation.
Let U be a finite universe and E any binary relation defined on U, and let S be the family of sets xE, each consisting of the elements that are in relation with a given x ∈ U. In symbols, $S = \{xE : x \in U\}$, where $xE = \{y \in U : xEy\}$. (2.4) Define β as the general knowledge base (GKB) formed from all possible intersections of the members of S. Any member that equals a union of some other members of β must be omitted. That is, if S contains n sets, 2, ..., n; $S_i \subset S$ and $\beta_i = \cup S_i$ for some i. (2.5) The pair $apr_\beta = (U, E)$ will be called the general approximation space based on the general knowledge base β.
Rady et al. [3] extended the classical definition of the lower and upper approximations of any subset A of U to take these general forms, where $\beta_x = \{B \in \beta : x \in B\}$. These general approximations satisfy all the properties introduced in Section 2.1 except for properties (8), (9), (10), and (11). This is the main deviation that will help to construct our new approach.
For granulation by any binary relation, Lashin et al. [5] defined a rough membership function as follows:
\[ \mu_A(x) = \frac{|(\cap \beta_x) \cap A|}{|\cap \beta_x|}, \quad x \in U. \tag{2.7} \]
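A small Python sketch may help to fix the objects involved. It assumes that the rough membership of x in A is the fraction of ∩β_x lying in A, which is the reading consistent with the probabilities P(A/∩β_x) used in Section 3. The relation E is hypothetical, and for this toy relation the family S of after-sets is used directly as β, because intersecting its members produces no new sets; a full construction of the GKB is not attempted here.

```python
from fractions import Fraction
from functools import reduce


def after_set(x, relation):
    """xE = {y : x E y} for a relation given as a set of ordered pairs."""
    return frozenset(y for (u, y) in relation if u == x)


def description(x, beta):
    """Intersection of all members of beta that contain x, i.e. the set ∩β_x."""
    beta_x = [B for B in beta if x in B]
    if not beta_x:
        raise ValueError(f"no member of beta contains {x!r}")
    return reduce(frozenset.intersection, beta_x)


def general_rough_membership(x, A, beta):
    """Assumed form: |(∩β_x) ∩ A| / |∩β_x|."""
    granule = description(x, beta)
    return Fraction(len(granule & A), len(granule))


# Hypothetical relation on U = {1, 2, 3}; it is reflexive but not symmetric.
U = {1, 2, 3}
E = {(1, 1), (1, 2), (2, 2), (3, 2), (3, 3)}

S = {after_set(x, E) for x in U}   # {1,2}, {2}, {2,3}
beta = list(S)                     # assumption: beta coincides with S for this relation
A = {1, 2}
```

The after-sets {1, 2}, {2}, and {2, 3} overlap, so they do not partition U, yet every element still receives a well-defined description ∩β_x and hence a membership value.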

Granulation by general relation in multivalued information system.
For a generalized approximation space, Abd El-Monsef [6] defined a multivalued information system. This system is an ordinary information system whose elements are sets. Each object has a number of attributes with attribute subsets related to it. The attributes are the same for all objects, but the set-valued attribute values may differ. A multivalued information system (IS) is an ordered pair (U, Ψ), where U is a nonempty finite set of objects (the universe), and Ψ is a nonempty finite set of elements called attributes. Every attribute q ∈ Ψ has a multivalued function $\Gamma_q$, which maps U into the power set of $V_q$, where $V_q$ is the set of allowed values for the attribute. That is, $\Gamma_q : U \to \mathcal{P}(V_q)$. The multivalued information system may also be written as (2.8) With a set P ⊆ Ψ we may associate an indiscernibility relation on U, denoted by β(P) and defined by
\[ (x, y) \in \beta(P) \text{ if and only if } \Gamma_q(x) \subseteq \Gamma_q(y) \quad \forall q \in P. \tag{2.9} \]
Clearly, this indiscernibility relation does not in general induce a partition of U.
Example 2.1. In Table 2.1, we have ten persons (objects) with attributes reflecting each person's situation in life. Consider three condition attributes, namely, spoken languages, computer programs, and skills. Each person was asked about his adaptation by choosing among {English, German, French} for the first attribute, {Word, Excel, Access, PowerPoint} for the second attribute, and {Typing, Translation} for the third attribute. Let $a_i$ be the ith value of the first attribute, let $b_j$ be the jth value of the second attribute, and let $c_k$ be the kth value of the third attribute. The indiscernibility relation for $C = \{T_1, T_2, T_3\}$ will be
\[ \beta(C) = \{\ldots, (x_4, x_9), (x_5, x_5), (x_6, x_6), (x_7, x_7), (x_7, x_8), (x_8, x_8), (x_9, x_9), (x_{10}, x_3), (x_{10}, x_{10})\}. \tag{2.10} \]
It is easy to see that β(C) does not in general induce a partition of U. This can be seen via
\[ U/\beta(C) = \{\ldots, \{x_6\}, \{x_7, x_8\}, \{x_8\}, \{x_9\}, \{x_{10}, x_3\}\}. \tag{2.11} \]
Obviously, U/β(C) is the set S defined in the general approach in Section 2.2.
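The inclusion-based indiscernibility relation can be sketched in a few lines of Python. The three objects below are hypothetical and are not the entries of Table 2.1; they only show how set-valued attributes lead to overlapping granules.

```python
def indiscernibility(objects, attributes):
    """All pairs (x, y) with Gamma_q(x) ⊆ Gamma_q(y) for every attribute q."""
    return {
        (x, y)
        for x in objects
        for y in objects
        if all(objects[x][q] <= objects[y][q] for q in attributes)
    }


def after_sets(objects, relation):
    """Map each object x to its granule {y : (x, y) in relation}."""
    return {x: frozenset(y for (u, y) in relation if u == x) for x in objects}


def is_partition(granules):
    """True if any two granules are either equal or disjoint."""
    sets = list(granules.values())
    return all(a == b or not (a & b) for a in sets for b in sets)


# Hypothetical multivalued information system (illustrative values only).
objects = {
    "x1": {"languages": {"English"}, "programs": {"Word"}},
    "x2": {"languages": {"English", "German"}, "programs": {"Word", "Excel"}},
    "x3": {"languages": {"French"}, "programs": {"Word"}},
}
attributes = ["languages", "programs"]

relation = indiscernibility(objects, attributes)
granules = after_sets(objects, relation)
```

Because every value set of x1 is contained in the corresponding value set of x2, the granule of x1 is {x1, x2} while the granule of x2 is {x2}, so the granules overlap without coinciding.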

Bayesian decision-theoretic framework for rough sets
In this section, the basic notion of the Bayesian decision procedure is briefly reviewed (Duda and Hart [7]). We present a review of results that are relevant to decision-theoretic modeling of rough sets induced by an equivalence relation. A generalization and modification of decision-theoretic modeling induced by a general relation is then applied on the universe.

Bayesian decision procedure.
Let $\Omega = \{\omega_1, \ldots, \omega_s\}$ be a finite set of s states of nature, and let $\mathcal{A} = \{a_1, \ldots, a_m\}$ be a finite set of m possible actions. Let $P(\omega_j/X)$ be the conditional probability of an object x being in state $\omega_j$ given that the object is described by X.
Let $\lambda(a_i/\omega_j)$ denote the loss for taking action $a_i$ when the state is $\omega_j$. For an object with description X, suppose that an action $a_i$ is taken. Since $P(\omega_j/X)$ is the conditional probability that the true state is $\omega_j$ given X, the expected loss associated with taking action $a_i$ is given by
\[ R(a_i/X) = \sum_{j=1}^{s} \lambda(a_i/\omega_j) P(\omega_j/X), \]
and is also called the conditional risk. Given a description X, a decision rule is a function τ(X) that specifies which action to take. That is, for every X, τ(X) assumes one of the actions $a_1, \ldots, a_m$. The overall risk R is the expected loss associated with a given decision rule. Since R(τ(X)/X) is the conditional risk associated with the action τ(X), the overall risk is defined by
\[ R = \sum_{X} R(\tau(X)/X) P(X), \]
where the summation is over the set of all possible descriptions of objects, that is, the entire knowledge representation space. If τ(X) is chosen so that R(τ(X)/X) is as small as possible for every X, the overall risk R is minimized. Thus, the Bayesian decision procedure can be formally stated as follows. For every X, compute the conditional risk $R(a_i/X)$ for i = 1, ..., m, and then select the action for which the conditional risk is minimum.
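The procedure just described amounts to a weighted sum followed by a minimum. The sketch below assumes the losses are given as a table with one row per action and one column per state, and that the conditional probabilities of the states given X are known; all numbers and names are hypothetical.

```python
def conditional_risk(loss_row, posterior):
    """Expected loss of one action: sum over states of loss times probability."""
    return sum(loss * prob for loss, prob in zip(loss_row, posterior))


def bayes_action(loss, posterior):
    """Return the index of a minimum-risk action together with all risks."""
    risks = [conditional_risk(row, posterior) for row in loss]
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks


# Two actions, two states (hypothetical values).
# loss[i][j] is the loss of taking action i when the true state is j.
loss = [
    [0.0, 2.0],  # action 0
    [1.0, 0.0],  # action 1
]
posterior = [0.7, 0.3]  # conditional probabilities of the two states given X

best, risks = bayes_action(loss, posterior)
```

With these values the first action has the smaller conditional risk, so it is the one selected; ties are resolved in favour of the action with the lowest index, which is a design choice of this sketch rather than part of the procedure.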

Decision-theoretic approach of rough sets (under equivalence relations).
Let apr = (U, E) be an approximation space, where E is an equivalence relation on U. With respect to a subset A ⊆ U, one can divide the universe U into three disjoint regions, the positive region POS(A), the negative region NEG(A), and the boundary region BND(A) (see Figure 3.1):
\[ POS(A) = \underline{apr}(A), \qquad NEG(A) = U - \overline{apr}(A), \qquad BND(A) = \overline{apr}(A) - \underline{apr}(A). \]
In an approximation space apr = (U, E), the equivalence class containing x, $[x]_E$, is considered to be the description of x. The classification of objects according to approximation operators can be easily fitted into the Bayesian decision-theoretic framework (Yao and Wong [1]). The set of states is given by Ω = {A, −A}, indicating that an element is in A and not in A, respectively. With respect to the three regions, the set of actions is given by $\mathcal{A} = \{a_1, a_2, a_3\}$, where $a_1$, $a_2$, and $a_3$ represent the three actions in classifying an object: deciding POS(A), deciding NEG(A), and deciding BND(A), respectively.
Let $\lambda(a_i/A)$ denote the loss incurred for taking action $a_i$ when an object in fact belongs to A, and let $\lambda(a_i/-A)$ denote the loss incurred for taking the same action when the object does not belong to A. The rough membership values $\mu_A(x) = P(A/[x]_E)$ and $\mu_{A^c}(x) = 1 - P(A/[x]_E)$ are in fact the probabilities that an object in the equivalence class $[x]_E$ belongs to A and −A, respectively. The expected loss $R(a_i/[x]_E)$ associated with taking the individual actions can be expressed as
\[ R(a_i/[x]_E) = \lambda_{i1} P(A/[x]_E) + \lambda_{i2} P(-A/[x]_E), \]
where $\lambda_{i1} = \lambda(a_i/A)$, $\lambda_{i2} = \lambda(a_i/-A)$, and i = 1, 2, 3.
The Bayesian decision procedure leads to the following minimum-risk decision rules:
(P) if $R(a_1/[x]_E) \le R(a_2/[x]_E)$ and $R(a_1/[x]_E) \le R(a_3/[x]_E)$, decide POS(A);
(N) if $R(a_2/[x]_E) \le R(a_1/[x]_E)$ and $R(a_2/[x]_E) \le R(a_3/[x]_E)$, decide NEG(A);
(B) if $R(a_3/[x]_E) \le R(a_1/[x]_E)$ and $R(a_3/[x]_E) \le R(a_2/[x]_E)$, decide BND(A).

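Based on $P(A/[x]_E) + P(-A/[x]_E) = 1$, the decision rules can be simplified by using only the probabilities $P(A/[x]_E)$.
Consider a special kind of loss with $\lambda_{11} \le \lambda_{31} < \lambda_{21}$ and $\lambda_{22} \le \lambda_{32} < \lambda_{12}$. That is, the loss for classifying an object x belonging to A into the positive region is less than or equal to the loss of classifying x into the boundary region, and both of these losses are strictly less than the loss of classifying x into the negative region. For this type of loss function, the minimum-risk decision rules (P)-(B) can be written as
(P′) if $P(A/[x]_E) \ge \gamma$ and $P(A/[x]_E) \ge \alpha$, decide POS(A);
(N′) if $P(A/[x]_E) \le \beta$ and $P(A/[x]_E) \le \gamma$, decide NEG(A);
(B′) if $\beta \le P(A/[x]_E) \le \alpha$, decide BND(A),
where
\[ \alpha = \frac{\lambda_{12} - \lambda_{32}}{(\lambda_{31} - \lambda_{32}) - (\lambda_{11} - \lambda_{12})}, \qquad \gamma = \frac{\lambda_{12} - \lambda_{22}}{(\lambda_{21} - \lambda_{22}) - (\lambda_{11} - \lambda_{12})}, \qquad \beta = \frac{\lambda_{32} - \lambda_{22}}{(\lambda_{21} - \lambda_{22}) - (\lambda_{31} - \lambda_{32})}. \]
From the assumptions $\lambda_{11} \le \lambda_{31} < \lambda_{21}$ and $\lambda_{22} \le \lambda_{32} < \lambda_{12}$, it follows that α ∈ (0, 1], γ ∈ (0, 1), and β ∈ [0, 1). Note that the parameters $\lambda_{ij}$ should satisfy the condition α ≥ β. This ensures that the results are consistent with rough set approximations. That is, the boundary region may be nonempty.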
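The thresholds can be checked numerically. The sketch below assumes that the expected loss of action a_i is λ_i1 p + λ_i2 (1 − p) with p = P(A/[x]_E), and obtains α, γ, and β by comparing these losses pairwise (a_1 against a_3, a_1 against a_2, and a_2 against a_3). The loss values are the ones quoted in the three-region example later in the paper (λ11 = λ22 = 0, λ31 = λ32 = 0.5, λ21 = λ12 = 3); the function names are hypothetical.

```python
from fractions import Fraction as F


def thresholds(l11, l12, l21, l22, l31, l32):
    """Return (alpha, beta, gamma) for actions a1 = POS, a2 = NEG, a3 = BND."""
    alpha = (l12 - l32) / ((l31 - l32) - (l11 - l12))  # a1 versus a3
    gamma = (l12 - l22) / ((l21 - l22) - (l11 - l12))  # a1 versus a2
    beta = (l32 - l22) / ((l21 - l22) - (l31 - l32))   # a2 versus a3
    return alpha, beta, gamma


def classify(p, alpha, beta, gamma):
    """Apply the simplified rules; ties are resolved in the order POS, NEG, BND."""
    if p >= gamma and p >= alpha:
        return "POS"
    if p <= beta and p <= gamma:
        return "NEG"
    return "BND"


alpha, beta, gamma = thresholds(
    l11=F(0), l12=F(3),
    l21=F(3), l22=F(0),
    l31=F(1, 2), l32=F(1, 2),
)
```

For these losses the sketch gives α = 5/6, γ = 1/2, and β = 1/6, so α ≥ γ ≥ β and only the comparisons with α and β matter in practice: an object is placed in the boundary region whenever its probability lies strictly between 1/6 and 5/6.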

Generalized decision-theoretic approach of rough sets (under general relation).
In fact, the original granulation of rough set theory based on a partition is a special type of topological space. Lower and upper approximations in this model are exactly the interior and closure in topology. In general spaces, semiclosure and semi-interior (Crossley and Hildebrand [8]) are two types of approximation based on semiopen and semiclosed sets, which are well defined (Levine [9]). This fact, together with the concepts of semiclosure and semi-interior, led us to introduce two new approximations. For any general binary relation, the general approximations do not satisfy the properties (10, 11) in Section 2.1. Therefore, we can define two new approximations, namely, semilower and semiupper approximations.
Definition 3.1. Let apr = (U, E), where E is any binary relation defined on U. Then we can define two new approximations, namely, semilower and semiupper approximations, as follows: (3.6)
The lower and upper approximations have the following properties.
For every A, B ⊆ U and every approximation space apr = (U, E), where
This definition enables us to divide the universe U into five disjoint regions as follows (see Figure 3.): (3.8)
In this case, the set of states remains Ω = {A, −A}, but the set of actions becomes $\mathcal{A} = \{a_1, a_2, a_3, a_4, a_5\}$, where $a_1$, $a_2$, $a_3$, $a_4$, and $a_5$ represent the five actions in classifying an object: deciding POS(A), deciding $\mathrm{SemiL}(A) - A_\beta$, deciding SemiBND(A), deciding $A^\beta - \mathrm{SemiU}(A)$, and deciding NEG(A), respectively.
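In an approximation space apr = (U, E), where E is a binary relation, an element x is viewed through $\beta_x$ (the subfamily of the GKB whose members contain x). Since β does not in general induce a partition of U, we take $\cap\beta_x$ to be the description of x. The rough membership values $\mu_A(x) = P(A/\cap\beta_x)$ and $\mu_{A^c}(x) = 1 - P(A/\cap\beta_x)$ are in fact the probabilities that an object in $\cap\beta_x$ belongs to A and −A, respectively. The expected loss $R(a_i/\cap\beta_x)$ associated with taking the individual actions can be expressed as
\[ R(a_i/\cap\beta_x) = \lambda_{i1} P(A/\cap\beta_x) + \lambda_{i2} P(-A/\cap\beta_x), \quad i = 1, \ldots, 5. \]
The Bayesian decision procedure leads to the following minimum-risk decision rules.
(1) If $R(a_1/\cap\beta_x) \le R(a_j/\cap\beta_x)$ for j = 2, 3, 4, 5, decide POS(A).
(2) If $R(a_2/\cap\beta_x) \le R(a_j/\cap\beta_x)$ for j = 1, 3, 4, 5, decide $\mathrm{SemiL}(A) - A_\beta$.
(3) If $R(a_3/\cap\beta_x) \le R(a_j/\cap\beta_x)$ for j = 1, 2, 4, 5, decide SemiBND(A).
(4) If $R(a_4/\cap\beta_x) \le R(a_j/\cap\beta_x)$ for j = 1, 2, 3, 5, decide $A^\beta - \mathrm{SemiU}(A)$.
(5) If $R(a_5/\cap\beta_x) \le R(a_j/\cap\beta_x)$ for j = 1, 2, 3, 4, decide NEG(A).
Since $P(A/\cap\beta_x) + P(-A/\cap\beta_x) = 1$, the above decision rules can be simplified such that only the probabilities $P(A/\cap\beta_x)$ are involved.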

Consider a special kind of loss function with For this type of loss function, the minimum-risk decision rules (1)-(5) can be written as follows.
(3.11) A loss function should be chosen in such a way as to satisfy the conditions: (3.12) These conditions imply that $(\mathrm{SemiL}(A) - A_\beta) \cup \mathrm{SemiBND}(A) \cup (A^\beta - \mathrm{SemiU}(A))$ is not empty, that is, the boundary region is not empty. (3.13)
Now, we can decide the region for each object by using the generalized decision-theoretic approach proposed in Section 3.3. This approach can be applied to a multivalued information system and gives us the ability to divide the universe U into five regions, which helps to increase decision efficiency. The result given by the general rough set model can be viewed as a special case of our generalized approach.
In our example, the set of states is given by Ω = {A, −A}, indicating that an element is in A and not in A, respectively. With respect to the five regions, the set of actions is given by $\mathcal{A} = \{a_1, a_2, a_3, a_4, a_5\}$.
To apply our proposed technique, consider the following loss function:
\[ \lambda_{11} = \lambda_{52} = 0, \quad \lambda_{12} = \lambda_{51} = 2, \quad \lambda_{21} = \lambda_{42} = 0.25, \quad \lambda_{31} = \lambda_{32} = 0.5, \quad \lambda_{41} = \lambda_{22} = 1. \]
There is no cost for a correct classification, 2 units of cost for an incorrect classification, 0.25 unit of cost when an object belonging to A is classified into $\mathrm{SemiL}(A) - A_\beta$ or an object not belonging to A is classified into $A^\beta - \mathrm{SemiU}(A)$, 0.5 unit of cost for classifying an object into the boundary region, and 1 unit of cost when an object belonging to A is classified into $A^\beta - \mathrm{SemiU}(A)$ or an object not belonging to A is classified into $\mathrm{SemiL}(A) - A_\beta$ (note that the loss function is supplied by the user or an expert). According to these losses, we have (3.15) By using the decision rules (1′)-(5′), we get the results shown in Table 3.1. Thus, we have (3.16)
Now we apply the decision-theoretic technique proposed by Yao and Wong [1] to classify the decision region into three areas. The set of actions is given by $\mathcal{A} = \{a_1, a_2, a_3\}$, where $a_1$, $a_2$, and $a_3$ represent the three actions in classifying an object: deciding POS(A), deciding NEG(A), and deciding BND(A), respectively. To do this, consider that there is 0.25 unit cost for a correct classification, 3 units of cost for an incorrect classification, and 0.5 unit cost for classifying an object into the boundary region, that is, $\lambda_{11} = \lambda_{22} = 0$, $\lambda_{31} = \lambda_{32} = 0.5$, $\lambda_{21} = \lambda_{12} = 3$.
(3.18)
By using the decision rules (P′)-(B′) and replacing $P(A/[x]_E)$ by $P(A/\beta_x)$, we get the results shown in Table 3.2. This means that
\[ POS(A) = \{x_1, x_3, x_6\}, \qquad BND(A) = \{x_2, x_4, x_{10}\}, \qquad NEG(A) = \{x_5, x_7, x_8, x_9\}. \tag{3.19} \]
Comparing the two approaches, we note that our approach (classification of the decision region into five areas) gives us the ability to divide $BND(A) = \{x_2, x_4, x_{10}\}$ into $\mathrm{SemiL}(A) - A_\beta = \{x_2\}$, which is closer to the positive region, $A^\beta - \mathrm{SemiU}(A) = \{x_4\}$, which is closer to the negative region, and $\mathrm{SemiBND}(A) = \{x_{10}\}$.
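The five-action rule of the example can be reproduced as a short Python sketch. The loss table is read off the verbal description of the loss function given above, with actions ordered from POS(A) to NEG(A); this reading is an assumption. The probabilities passed to `decide` are hypothetical test values, not the entries of Table 3.1.

```python
from fractions import Fraction as F

REGIONS = [
    "POS(A)",
    "SemiL(A) - A_beta",
    "SemiBND(A)",
    "A^beta - SemiU(A)",
    "NEG(A)",
]

# (loss if the object is in A, loss if the object is not in A) for each action.
LOSS = [
    (F(0), F(2)),        # a1: decide POS(A)
    (F(1, 4), F(1)),     # a2: decide SemiL(A) - A_beta
    (F(1, 2), F(1, 2)),  # a3: decide SemiBND(A)
    (F(1), F(1, 4)),     # a4: decide A^beta - SemiU(A)
    (F(2), F(0)),        # a5: decide NEG(A)
]


def expected_losses(p):
    """Expected loss of each action when the object is in A with probability p."""
    return [in_a * p + not_in_a * (1 - p) for in_a, not_in_a in LOSS]


def decide(p):
    """Return the region of a minimum-risk action (first one wins on ties)."""
    risks = expected_losses(p)
    return REGIONS[risks.index(min(risks))]
```

Under this reading of the losses, the minimum-risk action changes at p = 4/5, 2/3, 1/3, and 1/5, so an object with p = 7/10 is placed in SemiL(A) − A_β and one with p = 1/4 in A^β − SemiU(A), which is how the boundary region of the three-region model gets refined.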

Conclusion
Decision-theoretic rough set theory is a probabilistic generalization of standard rough set theory and extends the application domain of rough sets. The decision model can be interpreted in terms of a more familiar and interpretable concept known as loss or cost. One can easily interpret or measure loss or cost according to the real application.
In this paper, we have proposed a generalized decision-theoretic approach, which is applied under a granulated view of the universe induced by any general binary relation. This approach enables us to classify the decision region into five areas. This classification enlarges the range of choice for the decision maker and helps to increase decision efficiency.