1 Introduction

We live in a world with more information than ever before, and the volume keeps growing, yet the decision-relevant information we encounter is often imperfect. Although large information systems appear everywhere, big data does not necessarily mean good data, and, as an increasing number of experts point out, big data does not automatically yield good analytics. Data that is imperfect, out of context or otherwise contaminated can lead to decisions that undermine the competitiveness of an enterprise or harm the lives of individuals. Knowledge representation therefore plays an important role in many aspects of problem solving, including the handling of imperfect knowledge. More precisely, we study inconsistent decision tables as a means of analyzing imperfect knowledge. One of the most important concepts an intelligent system must handle is knowledge itself, which may or may not be perfect; one also wants to know what knowledge is needed to achieve particular goals and how it can be obtained. A central problem along this line is to find an appropriate approach for analyzing imperfect knowledge, and the problem of imperfect knowledge, or of inconsistent decision tables, has been investigated by many researchers in different areas. Our idea is to apply a granular computing approach (Pedrycz and Chen 2014, 2015; Skowron et al. 2016), namely rough set theory, to cope with imperfect data analysis. Rough set theory, proposed by Pawlak (1982), is based on elementary set theory and has been applied extensively in both theoretical and applied research.

Several lines of research concern the inclusion degree. Polkowski and Skowron (1996) proposed rough mereology as a foundation for approximate reasoning about complex objects. Yao et al. (2015) adopted a Bayesian decision-theoretic analysis to provide a systematic method for determining the precision parameters from notions of costs and risks. Zhang and Leung (1996) proposed a generalized notion of inclusion degree in the context of a partially ordered set. Gomolinska (2008) obtained two rough inclusion functions (RIFs for short), different from the standard one and from each other, and associated with every RIF a mapping that is in some sense complementary to it; these complementary mappings (co-RIFs) were used to define certain metrics. While the distance functions may be used directly to measure the degree of dissimilarity of sets of objects, their complementary mappings are useful in measuring the degree of mutual similarity of sets.

In our work, a decision table represents a collection of real-world data. Using lower and upper approximations from rough set theory, together with a precision parameter that describes the characterization of the objects, we obtain variable precision models from the given decision information. The variable precision tool was first introduced by Ziarko (1993) and has since been used by many others. However, to the best of our knowledge, this is the first time that the analysis of equivalence classes uses the infimum and supremum of the inclusion degrees of equivalence classes in a given set with values smaller (or greater) than 0.5, instead of the minimum and maximum, respectively; indeed, the family of such inclusion degrees may be empty. In what follows, we first provide preliminary background on rough set theory and the variable precision model (VP-model) under decision tables, together with the definitions of positive region, boundary region and relative reduct of a decision table in the VP-model, in Sect. 2. In Sect. 3, we establish the relationship between the approximation degree of dependency and the belief function of evidence theory. In Sect. 4, we derive intrinsic properties of positive regions and upper approximations of decision classes using the threshold of the decision table, and present several examples. Section 5 provides a useful tool for understanding the condition classes in positive regions, and Sect. 6 shows a clear relationship between consistency and positive regions of a given decision table. Several concluding remarks are presented in Sect. 7.

2 Preliminaries

Let U be a finite and nonempty set, known as the universe of discourse. We use the symbol “\(\subseteq \)” (“\(\subset \)”) to denote set inclusion (strict set inclusion, respectively). The cardinality of a set \(S \subseteq U\), denoted | S |, is the number of elements in S. The power set of U is the collection \(2^U=\{S \ | \ S \subseteq U \}\).

The inclusion degree of a nonempty set \(X \subseteq U\) in a set \(Y \subseteq U\) is defined as

$$\begin{aligned} I(X,Y)=\frac{|X \cap Y|}{|X|}. \end{aligned}$$
(1)

Let us define \(Pr: 2^U \longrightarrow [0,1]\) as follows:

$$\begin{aligned} Pr(X)={|X| \over |U|}, \ \ \ ~\forall \ \ X \subseteq U. \end{aligned}$$
(2)

The greatest lower bound (inf or infimum) and least upper bound (sup or supremum) of a subset S of the unit interval [0, 1] will be denoted by inf S and sup S, respectively. From the definitions of inf and sup, we have

$$\begin{aligned} \mathrm{inf} ~\emptyset = 1 \quad \mathrm{and} \quad \mathrm{sup} ~\emptyset = 0. \end{aligned}$$
(3)

Remark If \(\mathrm{inf}~S \in S\), then we also denote it by \(\mathrm{min}~S\) and call it the minimum of S, and if \(\mathrm{sup}~S \in S\), then we also denote it by \(\mathrm{max}~S\) and call it the maximum of S. However, the empty set has no elements and hence has neither a minimum nor a maximum. Since we will deal with the family of inclusion degrees of equivalence classes in a given set with values smaller (or greater) than 0.5, which may be empty, we use the infimum and supremum to cover all possibilities.
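
To make these definitions concrete, the following Python sketch (our own illustration, not part of the formal development) implements (1)–(3); the names inclusion_degree, pr, inf_ and sup_ are ours, and sets are modelled as Python frozensets over a finite universe.

```python
from typing import Iterable

def inclusion_degree(X: frozenset, Y: frozenset) -> float:
    """I(X, Y) = |X ∩ Y| / |X|, Eq. (1); X is assumed nonempty."""
    return len(X & Y) / len(X)

def pr(X: frozenset, U: frozenset) -> float:
    """Pr(X) = |X| / |U|, Eq. (2)."""
    return len(X) / len(U)

def inf_(values: Iterable[float]) -> float:
    """Infimum of a finite subset of [0, 1]; the infimum of the empty family is 1, Eq. (3)."""
    values = list(values)
    return min(values) if values else 1.0

def sup_(values: Iterable[float]) -> float:
    """Supremum of a finite subset of [0, 1]; the supremum of the empty family is 0, Eq. (3)."""
    values = list(values)
    return max(values) if values else 0.0
```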

2.1 VP-models under decision tables

The knowledge representation in the rough set model is often structured in a decision table which is a 4-tuple \((U, Q=C \cup D, V, f)\), where U is a nonempty finite universe, C is a nonempty finite set of condition attributes, D is a nonempty finite set of decision attributes, \(C \cap D=\emptyset \), \(V=\bigcup \nolimits _{q \in Q} V_q\) and \(V_q\) is a domain of the attribute q, and

$$\begin{aligned} f : U \times Q \longrightarrow V \end{aligned}$$

is an information function such that \(f(x,q) \in V_q\) for every \(x \in U\) and \(q \in Q\).

Every nonempty subset P of condition attributes C, or decision attributes D, generates an equivalence relation on U, denoted by \({\hat{P}}\) and defined as follows (Pawlak 1988).

$$\begin{aligned} {\hat{P}}=\{ (x,y) \in U \times U \mid \forall \ q \in P, f(x,q)=f(y,q) \}. \end{aligned}$$
(4)

Let \(P^*=\{P_1,P_2,\ldots ,P_{|P^*|} \}\) denote the partition of U induced by the equivalence relation \({\hat{P}}\). In particular, each member of \(D^*\), the partition induced by \({\hat{D}}\), will be called a decision class. The decision table \((U, C \cup D, V, f)\) is consistent if \({\hat{C}} \subseteq {\hat{D}}\); otherwise, the decision table is inconsistent (Pawlak 1987).

For any \(X \subseteq U\), we can define the P-lower approximation \({\underline{P}}(X)\) and P-upper approximation \({\overline{P}}(X)\) of X, in the classical rough set model as follows (An et al. 1996):

$$\begin{aligned} {\underline{P}}(X) = \cup \{P_i \in {P^*} \mid P_i \subseteq X \}=\cup \{P_i \in {P^*} \mid I(P_i,X)=1 \}, \end{aligned}$$
(5)
$$\begin{aligned} {\overline{P}}(X) = \cup \{P_i \in {P^*} \mid P_i \cap X \ne \emptyset \}=\cup \{P_i \in {P^*} \mid I(P_i,X) > 0 \} . \end{aligned}$$
(6)

Let \(\beta \) be a parameter such that \(0.5 < \beta \le 1\). For \(P_i \in {P^*}\) and \(X \subseteq U\), we define

$$\begin{aligned} P_i \subseteq ^{\beta } X \quad \text{ if } \text{ and } \text{ only } \text{ if } \; I(P_i,X) \ge \beta , \end{aligned}$$
(7)
$$\begin{aligned} P_i \cap ^{\beta } X \not =\emptyset \quad \text{ if } \text{ and } \text{ only } \text{ if } \; I(P_i,U-X) < \beta . \end{aligned}$$
(8)

Then, we can define the \(P^{\beta }\)-lower approximation \({\underline{P}}^{\beta }(X)\) and \(P^{\beta }\)-upper approximation \({\overline{P}}^{\beta }(X)\) of X, in the VP-model under the threshold \(\beta \), as follows (An et al. 1996):

$$\begin{aligned} {\underline{P}}^{\beta }(X) = \cup \{P_i \in {P^*} \mid P_i \subseteq ^{\beta } X \}=\cup \{P_i \in {P^*} \mid I(P_i,X) \ge \beta \}, \end{aligned}$$
(9)
$$\begin{aligned} {\overline{P}}^{\beta }(X) = \cup \{P_i \in {P^*} \mid P_i \cap ^{\beta } X \ne \emptyset \}=\cup \{P_i \in {P^*} \mid I(P_i,X) > 1- \beta \} . \end{aligned}$$
(10)

Evidently, we have

$$\begin{aligned} {\underline{P}}^{\beta }(\emptyset )={\overline{P}}^{\beta }(\emptyset )=\emptyset \quad \text{ and } \quad {\underline{P}}^{\beta }(U)={\overline{P}}^{\beta }(U)=U, \end{aligned}$$
(11)
$$\begin{aligned} {\underline{P}}(X)={\underline{P}}^{1}(X) \subseteq {\underline{P}}^{\beta }(X) \subseteq {\overline{P}}^{\beta }(X) \subseteq {\overline{P}}^{1}(X)= {\overline{P}}(X), \end{aligned}$$
(12)
$$\begin{aligned} {\overline{P}}^{\beta }(X)=U-{\underline{P}}^{\beta }(U-X). \end{aligned}$$
(13)

2.2 Positive region and reduct

Approximations of decision classes are obtained using the concepts of lower approximations, upper approximations, boundary regions and positive regions; in the VP-model these concepts are parameterized by the threshold \(\beta \). To obtain optimal approximations, we introduce reducts of the set of condition attributes relative to the decision classes.

Definition 1

Given a decision table \((U, C \cup D, V, f)\) and a parameter \(\beta \in (0.5,1]\), let \(D^* =\{D_1,D_2,\ldots ,D_{|D^*|} \}\).

  1.

    (Inuiguchi 2005; Zhou and Miao 2011; Ziarko 1993) The quality of classification, or \(\beta \)-approximation degree of dependency, of D w.r.t. a nonempty \(A \subseteq C\) is defined as:

    $$\begin{aligned} {\gamma }_A^{\beta }(D)={|\mathrm{POS}_A^{\beta }(D^*)| \over |U|}={\sum \limits _{j=1}^{|D^*|}} {|{{\underline{A}}^{\beta }(D_j)}|\over {|U|}} \end{aligned}$$
    (14)

    where

    $$\begin{aligned} \mathrm{POS}_A^{\beta }(D^*)=\bigcup \limits _{D_j \in D^*} {\underline{A}}^{\beta }(D_j) \end{aligned}$$
    (15)

    is called the \(\beta \)-positive region of \({D^*}\) w.r.t. A.

  2.

    A nonempty \(B \subseteq C\) is called a \(\beta \)-reduct of C w.r.t. D iff it is a minimal subset of C such that \({\gamma }_B^{\beta }(D)={\gamma }_C^{\beta }(D)\).

Here “iff” is Halmos’ convention for “if and only if”.
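
To make Definition 1 concrete, a small sketch (ours, assuming lower_beta from the listing in Sect. 2.1) computes the \(\beta \)-positive region (15) and the degree of dependency (14); C_star and D_star denote the partitions \(C^*\) and \(D^*\).

```python
def pos_beta(C_star, D_star, beta):
    """beta-positive region of D* w.r.t. the condition partition C*, Eq. (15)."""
    result = frozenset()
    for Dj in D_star:
        result |= lower_beta(C_star, Dj, beta)
    return result

def gamma_beta(C_star, D_star, U, beta):
    """beta-approximation degree of dependency of D w.r.t. A, Eq. (14)."""
    return len(pos_beta(C_star, D_star, beta)) / len(U)
```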

3 Evidence theory and VP-models in decision tables

Let \((U, C \cup D, V, f)\) be a decision table. To each nonempty \(A \subseteq C\), we associate a basic probability assignment (bpa) (Lin 1998; Shafer 1976; Skowron and Grzymala-Busse 1994) \(m_A: 2^U \longrightarrow [0,1]\) defined as follows:

$$\begin{aligned} m_A(X)= \left\{ \begin{array}{ll} Pr(X) &\quad \mathrm{if} \ X \in {A^*}\\ 0 &\quad \mathrm{elsewhere}. \end{array} \right. \end{aligned}$$

Then, by replacing “\(\subseteq \)” and “\(\cap \)” with “\(\subseteq ^{\beta }\)” and “\(\cap ^{\beta }\)”, respectively, in the definitions of the belief and plausibility function generated by the bpa \(m_A\), we obtain the generalized notions of \({\beta }\)-belief and \({\beta }\)-plausibility functions generated by the bpa \(m_A\) as follows (Syau and Lin 2015): for any \(X \subseteq U\),

$$\begin{aligned} \mathrm{Bel}_A^{\beta } (X) = {\sum \limits _{A_i \in {A^*}:{A_i} \subseteq ^{\beta } X}} m_A(A_i), \quad Pl_A^{\beta } (X) = {\sum \limits _{A_i \in {A^*}: {A_i} \cap ^{\beta } X \not =\emptyset }} m_A(A_i). \end{aligned}$$
(16)

In addition, motivated by our earlier paper (Syau and Lin 2015), we also have

$$\begin{aligned} \mathrm{Bel}_A^{\beta } (X)= & {} {\sum \limits _{A_i \in {A^*}:{A_i} \subseteq ^{\beta } X}} m_A(A_i) = {\sum \limits _{A_i \in {A^*}:{A_i} \subseteq ^{\beta } X}} Pr(A_i) \nonumber \\= & {} \ \ {\sum \limits _{A_i \in {A^*}:{A_i} \subseteq ^{\beta } X}} {|A_i|\over |U|} \ \ = \ {|\cup \{{A_i} \in {A^*} \mid {A_i} \subseteq ^{\beta } X \}| \over |U|} \nonumber \\= & {} \ \ Pr({\underline{A}}^{\beta }(X)), \end{aligned}$$
(17)
$$\begin{aligned} Pl_A^{\beta } (X)&= {\sum \limits _{A_i \in {A^*}:{A_i} \cap ^{\beta } X \not =\emptyset }} m_A(A_i)={\sum \limits _{A_i \in {A^*}:{A_i} \cap ^{\beta } X \not =\emptyset }} Pr(A_i) \nonumber \\&= {\sum \limits _{A_i \in {A^*}:{A_i} \cap ^{\beta } X \not =\emptyset }} {|A_i|\over |U|} \ \ = \ {|\cup \{{A_i} \in {A^*} \mid {A_i} \cap ^{\beta } X \not =\emptyset \}| \over |U|} \nonumber \\&= Pr({\overline{A}}^{\beta }(X)). \end{aligned}$$
(18)

The duality

$$\begin{aligned} Pl_A^{\beta } (X)=1-\mathrm{Bel}_A^{\beta }(U-X) \end{aligned}$$
(19)

follows immediately from (13), (17) and (18).

Let \(D^* =\{D_1,D_2,\ldots ,D_{|D^*|} \}\). According to (2), (14) and (17), we obtain

$$\begin{aligned} {\gamma }_A^{\beta }(D)&= {\sum \limits _{j=1}^{|D^*|}} {|{{\underline{A}}^{\beta }(D_j)}|\over {|U|}}={\sum \limits _{j=1}^{|D^*|}} Pr({\underline{A}}^{\beta }(D_j)) \nonumber \\ &= {\sum \limits _{j=1}^{|D^*|}} \mathrm{Bel}_A^{\beta } (D_j). \end{aligned}$$
(20)

This leads to the following:

Theorem 1

Given a decision table \((U, C \cup D, V, f)\) and a parameter \(\beta \in (0.5,1]\) , let \(D^* =\{D_1,D_2,\ldots ,D_{|D^*|} \}\) . For any nonempty \(A \subseteq C\) , we have

$$\begin{aligned} {\gamma }_A^{\beta }(D)={\gamma }_C^{\beta }(D) \Longleftrightarrow {\sum \limits _{j=1}^{|D^*|}} \mathrm{Bel}_A^{\beta } (D_j)={\sum \limits _{j=1}^{|D^*|}} \mathrm{Bel}_C^{\beta } (D_j). \end{aligned}$$
(21)
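
Relations (17) and (18), and hence Theorem 1, can be checked numerically. The sketch below (ours, assuming the helpers defined earlier) computes the \(\beta \)-belief and \(\beta \)-plausibility of (16) directly from the bpa; the comments record the identities they should satisfy.

```python
def bel_beta(A_star, X, U, beta):
    """beta-belief of X, Eq. (16): sum of m_A(A_i) over classes with I(A_i, X) >= beta."""
    return sum(len(Ai) for Ai in A_star
               if inclusion_degree(Ai, X) >= beta) / len(U)

def pl_beta(A_star, X, U, beta):
    """beta-plausibility of X, Eq. (16): sum of m_A(A_i) over classes with I(A_i, U - X) < beta."""
    return sum(len(Ai) for Ai in A_star
               if inclusion_degree(Ai, U - X) < beta) / len(U)

# By (17) and (18), for any partition A_star of U, any X ⊆ U and beta in (0.5, 1]:
#   bel_beta(A_star, X, U, beta) == pr(lower_beta(A_star, X, beta), U)
#   pl_beta(A_star, X, U, beta)  == pr(upper_beta(A_star, X, beta), U)
```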

4 Relative discernibility of decision classes

As an immediate consequence of (9) and (10), we have the following:

Lemma 1

Given a decision table \((U, C \cup D, V, f)\) and a parameter \(\beta \in (0.5,1]\), let \(D^* =\{D_1,D_2,\ldots ,D_{|D^*|} \}\). For every decision class \(D_j \in D^*\), we have

$$\begin{aligned} {\underline{C}}^{\beta }(D_j) \subseteq {\underline{C}}^{\beta '}(D_j) \subseteq {\overline{C}}^{\beta '}(D_j) \subseteq {\overline{C}}^{\beta }(D_j), \ \ ~\forall \ \ {\beta '} \in (0.5,{\beta }]. \end{aligned}$$
(22)

Following Ziarko (1993), a decision class \(D_j \in D^*\) is said to be \(\beta \)-discernible if

$$\begin{aligned} {\underline{C}}^{\beta }(D_j)={\overline{C}}^{\beta }(D_j). \end{aligned}$$
(23)

According to Ziarko (1993), a decision class which is not discernible for any \(\beta \in (0.5,1]\) will be called absolutely indiscernible. A decision class \(D_k \in D^*\) is absolutely indiscernible iff its absolute boundary

$$\begin{aligned} M(D_k)= \cup \{C_i \in {C^*}: I(C_i,D_k)=0.5 \} \ne \emptyset . \end{aligned}$$
(24)

A decision class which is not absolutely indiscernible will be referred to as weakly discernible. More precisely, a decision class \(D_j \in D^*\) is weakly discernible iff \({\underline{C}}^{\beta }(D_j)={\overline{C}}^{\beta }(D_j)\) for some \({\beta } \in (0.5,1]\). The greatest value of \(\beta \) which makes \(D_j\) discernible is referred to as its discernibility threshold. Ziarko (1993) also gives a proposition that provides a procedure for computing discernibility thresholds. Combining this proposition with inclusion degrees and the remark in Sect. 2, we obtain the following.

Lemma 2

Given a decision table \((U, C \cup D, V, f)\), let \(C^* =\{C_1,C_2,\ldots ,C_{|C^*|} \}\) and \(D^* =\{D_1,D_2,\ldots ,D_{|D^*|} \}\). Suppose a decision class \(D_j \in D^*\) is weakly discernible with discernibility threshold \({\zeta _j}\). Then

$$\begin{aligned} \zeta _j=\mathrm{min} ~ \{ \eta _j,\lambda _j \}, \quad \mathrm{where} \end{aligned}$$
$$ \begin{aligned} \eta _j=\mathrm{inf} ~ \{ I(C_i,D_j) \mid C_i \in {C^*} \ \ \& \ \ I(C_i,D_j) > 0.5 \}, \end{aligned}$$
(25)
$$ \begin{aligned} \lambda _j=1-\mathrm{sup} ~ \{ I(C_i,D_j) \mid C_i \in {C^*} \ \ \& \ \ I(C_i,D_j) < 0.5 \}. \end{aligned}$$
(26)
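
A small sketch (ours, reusing inclusion_degree, inf_ and sup_ from Sect. 2) of the computation in Lemma 2 is given below; it returns the discernibility threshold \(\zeta _j\) of a weakly discernible class and None when the class is absolutely indiscernible. Exact rational arithmetic (fractions.Fraction) could be substituted for floats to make the comparison with 0.5 robust.

```python
def discernibility_threshold(C_star, Dj):
    """zeta_j = min(eta_j, lambda_j) as in (25)-(26); None if D_j is absolutely indiscernible."""
    degrees = [inclusion_degree(Ci, Dj) for Ci in C_star]
    if any(d == 0.5 for d in degrees):          # absolute boundary M(D_j) nonempty, Eq. (24)
        return None
    eta_j = inf_(d for d in degrees if d > 0.5)        # Eq. (25); inf of the empty family is 1
    lambda_j = 1 - sup_(d for d in degrees if d < 0.5)  # Eq. (26); sup of the empty family is 0
    return min(eta_j, lambda_j)
```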

Example 1

Let us consider an example of an inconsistent decision table \((U, C \cup D, V, f)\) as shown in Table 1, where \(C=\{a,b,c \}\) is the set of condition attributes and \(D=\{d \}\) is the set of decision attributes.

Table 1 Exemplary decision table

From this table, we have

  1.

    \(U=\{x_1,x_2,\ldots ,x_8 \}\).

  2.

    \(V_a=V_b=V_c=V_d=\{1,2 \}.\)

  3.

    \(C^*=\{C_1,C_2,C_3 \}, \ \ \ D^*=\{ D_1, D_2 \},\) where

    $$\begin{aligned}&C_1=\{x_1,x_5,x_7 \}, \ \ C_2=\{x_2,x_3,x_6,x_8 \}, \ \ C_3=\{x_4 \}, \\&D_1=\{x_1,x_2,x_4, x_6,x_7,x_8 \}, \ \ D_2=\{x_3,x_5\}. \end{aligned}$$

According to (1), we have

$$\begin{aligned}&I(C_1,D_1)= {\frac{2}{3}}, \quad I(C_1,D_2)= {\frac{1}{3}}, \nonumber \\&I(C_2,D_1)= {\frac{3}{4}}, \quad I(C_2,D_2)= {\frac{1}{4}}, \nonumber \\&I(C_3,D_1)= 1, \quad I(C_3,D_2)= 0. \end{aligned}$$
(27)

This gives the following:

  (i)

    \( \{ I(C_i,D_1) \mid C_i \in {C^*} \ \ \& \ \ I(C_i,D_1) < 0.5 \}=\emptyset \), \( {\frac{2}{3}}=\mathrm{min}~\{ I(C_i,D_1) \mid C_i \in {C^*} \ \ \& \ \ I(C_i,D_1) > 0.5 \},\)

    $$\begin{aligned} {\underline{C}}^{\beta }(D_1)={\underline{C}}^{{\frac{2}{3}}}(D_1)=U, \ \ \ ~\forall \ \ {\beta } \in \left( 0.5,{\frac{2}{3}}\right] , \end{aligned}$$
    (28)
    $$\begin{aligned} {\overline{C}}^{\beta }(D_1)=U, \quad ~\forall \ \ {\beta } \in (0.5,1]. \end{aligned}$$
    (29)

    It follows from (28) and (29) that the decision class \(D_1\) is weakly discernible and its discernibility threshold is equal to \({\frac{2}{3}}\). This also indicates why we use the supremum instead of the maximum in (26): the family \( \{ I(C_i,D_1) \mid C_i \in {C^*} \ \ \& \ \ I(C_i,D_1) < 0.5 \}\) is empty and thus has no maximum, while \(\mathrm{sup}~\emptyset = 0\) gives \(\lambda _1=1\).

  (ii)

    \( \{ I(C_i,D_2) \mid C_i \in {C^*} \ \ \& \ \ I(C_i,D_2) > 0.5 \}=\emptyset \), \( {\frac{2}{3}}=1-\mathrm{max}~\{ I(C_i,D_2) \mid C_i \in {C^*} \ \ \& \ \ I(C_i,D_2) < 0.5 \},\)

    $$\begin{aligned} {\underline{C}}^{\beta }(D_2)=\emptyset , \quad ~\forall \ \ {\beta } \in (0.5,1], \end{aligned}$$
    (30)
    $$\begin{aligned} {\overline{C}}^{\beta }(D_2)=\emptyset , \quad ~\forall \ \ {\beta } \in (0.5,{\frac{2}{3}}]. \end{aligned}$$
    (31)

It follows from (30) and (31) that the decision class \(D_2\) is weakly discernible and its discernibility threshold is equal to \({\frac{2}{3}}\). As in part (i), this indicates why we use the infimum instead of the minimum in (25): the family \( \{ I(C_i,D_2) \mid C_i \in {C^*} \ \ \& \ \ I(C_i,D_2) > 0.5 \}\) is empty and thus has no minimum, while \(\mathrm{inf}~\emptyset = 1\) gives \(\eta _2=1\).

In what follows, we consider a given decision table \((U, C \cup D, V, f)\). Let

$$ \begin{aligned}&C^* =\{C_1,C_2,\ldots ,C_{|C^*|} \}, \ \ D^* =\{D_1,D_2,\ldots ,D_{|D^*|} \}, \nonumber \\&M(D_j)= \cup \{C_i \in {C^*}: I(C_i,D_j)=0.5 \}, \nonumber \\&H(D_j)= \cup \{C_i \in {C^*}: I(C_i,D_j) \ge 0.5 \}, \nonumber \\&\eta _j=\mathrm{inf} ~ \{ I(C_i,D_j) \mid C_i \in {C^*} \ \ \& \ \ I(C_i,D_j) > 0.5 \}, \nonumber \\&\lambda _j=1-\mathrm{sup} ~ \{ I(C_i,D_j) \mid C_i \in {C^*} \ \ \& \ \ I(C_i,D_j) < 0.5 \}, \nonumber \\&\zeta _j= \mathrm{min} ~ \{ \eta _j,\lambda _j \}, \nonumber \\&\eta =\mathrm{min} ~ \{ \eta _1, \eta _2,\ldots , \eta _{|D^*|} \}, \ \ {\lambda }=\mathrm{min} ~ \{ \lambda _1, \lambda _2,\ldots , \lambda _{|D^*|} \}, \nonumber \\&\zeta = \mathrm{min} ~ \{ \eta ,\lambda \}=\mathrm{min} ~ \{ \zeta _1, \zeta _2,\ldots , \zeta _{|D^*|} \}. \end{aligned}$$
(32)

Then, by Lemma 2 and Example 1, we obtain the following:

Lemma 3

Given a decision table \((U, C \cup D, V, f)\) and the notations defined in (32), for each \(D_j \in D^* \) we have

$$\begin{aligned} {\underline{C}}^{\beta }(D_j)={\underline{C}}^{\eta _j}(D_j), \quad ~\forall \ \ {\beta } \in (0.5,{\eta _j}] \end{aligned}$$
(33)
$$\begin{aligned} {\overline{C}}^{\beta }(D_j) = {\overline{C}}^{\lambda _j}(D_j) = H(D_j), \quad ~\forall \ \ {\beta } \in (0.5,{\lambda _j}] \end{aligned}$$
(34)

If \(M(D_j)= \cup \{C_i \in {C^*}: I(C_i,D_j)=0.5 \}=\emptyset \), then the decision class \(D_j\) is weakly discernible and its discernibility threshold is equal to \({\zeta _j}\). That is,

$$\begin{aligned} {\underline{C}}^{\beta }(D_j)={\overline{C}}^{\beta }(D_j), \quad ~\forall \ \ {\beta } \in (0.5,{\zeta _j}]. \end{aligned}$$
(35)

As an improvement over Cheng et al. (2015), we define the relative discernibility of decision tables as follows.

Definition 2

A decision table \((U, C \cup D, V, f)\) is said to be weakly discernible iff all its decision classes \(D_1,D_2,\ldots ,D_{|D^*|}\) are weakly discernible, or equivalently, iff for each \(D_j \in {D^*}\),

$$\begin{aligned} M(D_j)= \cup \{C_i \in {C^*}: I(C_i,D_j)=0.5 \}=\emptyset . \end{aligned}$$
(36)

The greatest value of \(\beta \) which makes the decision table \((U, C \cup D, V, f)\) discernible will be referred to as its discernibility threshold.

Theorem 2

Given a decision table \((U, C \cup D, V, f)\) , and the notations \(\eta \), \(\lambda \) defined in (32), we have

  1.

    \(\mathrm{POS}_C^{\beta }(D^*)=\mathrm{POS}_C^{\eta }(D^*), \ \ ~\forall \ \ {\beta } \in (0.5,{\eta }]\).

  2.

    for each \({D_j \in D^*}\),

    $$\begin{aligned} {\overline{C}}^{\beta }(D_j)={\overline{C}}^{\lambda }(D_j), \quad ~\forall \ \ {\beta } \in (0.5,{\lambda }]. \end{aligned}$$

If the decision table is weakly discernible, then its discernibility threshold is equal to \(\zeta = \mathrm{min} ~ \{ \eta ,\lambda \}\). That is,

$$\begin{aligned} \mathrm{POS}_C^{\beta }(D^*)=\bigcup \limits _{D_j \in D^*} ~ {\underline{C}}^{\beta }(D_j)=\bigcup \limits _{D_j \in D^*} ~ {\overline{C}}^{\beta }(D_j), \ \ ~\forall \ \ {\beta } \in (0.5,{\zeta }], \end{aligned}$$
(37)

or equivalently, for each \({D_j \in D^*}\),

$$\begin{aligned} {\overline{C}}^{\beta }(D_j)={\underline{C}}^{\beta }(D_j), \ \ ~\forall \ \ {\beta } \in (0.5,{\zeta }]. \end{aligned}$$
(38)

To illustrate the above concepts, we present below an example of a weakly discernible decision table (Table 2).

Table 2 Exemplary decision table

Example 2

Let us consider a decision table \((U, C \cup D, V, f)\) as shown in Table 2, where \(C=\{a,b,c \}\) is the set of condition attributes and \(D=\{d \}\) is the set of decision attributes. From this table, we have

  1.

    \(U=\{x_1,x_2,\ldots ,x_{11} \}.\)

  2.

    \(V_a=V_b=V_c=\{1,2 \}, \ \ \ V_d=\{1,2,3 \}.\)

  3.

    \(C^*=\{C_1,C_2,C_3 \}, \ \ \ D^*=\{ D_1, D_2, D_3 \},\) where

    $$\begin{aligned}&C_1=\{x_1,x_5,x_7 \}, \ \ C_2=\{x_2,x_4,x_6,x_{10} \}, \ \ C_3=\{x_3,x_8,x_9,x_{11} \}, \\&D_1=\{x_1,x_4,x_6,x_{10} \}, \ \ D_2=\{x_2,x_3,x_5 \}, \ \ D_3=\{x_7,x_8,x_9, x_{11} \}. \end{aligned}$$

According to (1), we obtain

$$\begin{aligned}&I(C_1,D_1)= {\frac{1}{3}}, \ \ \ I(C_1,D_2)= {\frac{1}{3}}, \ \ \ I(C_1,D_3)= {\frac{1}{3}}, \nonumber \\&I(C_2,D_1)= {\frac{3}{4}}, \ \ \ I(C_2,D_2)= {\frac{1}{4}}, \ \ \ I(C_2,D_3)= 0, \nonumber \\&I(C_3,D_1)= 0, \ \ \ I(C_3,D_2)= {\frac{1}{4}}, \ \ \ I(C_3,D_3)= {\frac{3}{4}}. \end{aligned}$$
(39)

Then, according to (9) and (10), we obtain

$$\begin{aligned} {\underline{C}}^{\beta }(D_1)=\left\{ \begin{array}{ll} C_2,&{} \mathrm{if} \ 0.5< {\beta } \le {\frac{3}{4}} \\ \emptyset ,&{} \mathrm{if} \ {\frac{3}{4}} < {\beta } \le 1, \end{array} \right. \end{aligned}$$
(40)
$$\begin{aligned} {\underline{C}}^{\beta }(D_2)=\emptyset , \ \ ~\forall \ \ {\beta } \in (0.5,1] \end{aligned}$$
(41)
$$\begin{aligned} {\underline{C}}^{\beta }(D_3)=\left\{ \begin{array}{ll} C_3,&{} \mathrm{if} \ 0.5< {\beta } \le {\frac{3}{4}} \\ \emptyset ,&{} \mathrm{if} \ {\frac{3}{4}} < {\beta } \le 1, \end{array} \right. \end{aligned}$$
(42)
$$\begin{aligned} {\overline{C}}^{\beta }(D_1)=\left\{ \begin{array}{ll} C_2,&{} \mathrm{if} \ 0.5< {\beta } \le {\frac{2}{3}} \\ C_1 \cup C_2,&{} \mathrm{if} \ {\frac{2}{3}} < {\beta } \le 1, \end{array} \right. \end{aligned}$$
(43)
$$\begin{aligned} {\overline{C}}^{\beta }(D_2)=\left\{ \begin{array}{ll} \emptyset ,&{} \mathrm{if} \ 0.5< {\beta } \le {\frac{2}{3}} \\ C_1,&{} \mathrm{if} \ {\frac{2}{3}}< {\beta } \le {\frac{3}{4}} \\ U,&{} \mathrm{if} \ {\frac{3}{4}} < {\beta } \le 1, \end{array} \right. \end{aligned}$$
(44)

and

$$\begin{aligned} {\overline{C}}^{\beta }(D_3)=\left\{ \begin{array}{ll} C_3,&{} \mathrm{if} \ 0.5< {\beta } \le {\frac{2}{3}} \\ C_1 \cup C_3, &{} \mathrm{if} \ {\frac{2}{3}} < {\beta } \le 1. \end{array} \right. \end{aligned}$$
(45)

Equations (40)–(45) give

$$\begin{aligned} {\overline{C}}^{\beta }(D_1)={\underline{C}}^{\beta }(D_1)=C_2, \ \ ~\forall \ \ {\beta } \in \left( 0.5,{\frac{2}{3}}\right] , \end{aligned}$$
(46)
$$\begin{aligned} {\overline{C}}^{\beta }(D_2)={\underline{C}}^{\beta }(D_2)=\emptyset , \ \ ~\forall \ \ {\beta } \in \left( 0.5,{\frac{2}{3}}\right] , \end{aligned}$$
(47)
$$\begin{aligned} {\overline{C}}^{\beta }(D_3)={\underline{C}}^{\beta }(D_3)=C_3, \ \ ~\forall \ \ {\beta } \in \left( 0.5,{\frac{2}{3}}\right] , \end{aligned}$$
(48)

and

$$\begin{aligned} \mathrm{POS}_C^{\beta }(D^*)=\bigcup \limits _{D_j \in D^*} ~ {\underline{C}}^{\beta }(D_j)=\mathrm{POS}_C^{\frac{3}{4}}(D^*)=C_2 \cup C_3, \ \ ~\forall \ \ {\beta } \in \left( 0.5,{\frac{3}{4}}\right] . \end{aligned}$$
(49)

According to (32) and (39), we have

$$\begin{aligned}&\eta _1={\frac{3}{4}}, \quad \lambda _1=1-{\frac{1}{3}}={\frac{2}{3}}, \quad \zeta _1= \mathrm{min} ~ \{ \eta _1,\lambda _1 \}={\frac{2}{3}}, \nonumber \\&\eta _2=\mathrm{inf} ~\emptyset = 1, \quad \lambda _2=1-{\frac{1}{3}}={\frac{2}{3}}, \quad \zeta _2= \mathrm{min} ~ \{ \eta _2,\lambda _2 \}={\frac{2}{3}}, \nonumber \\&\eta _3={\frac{3}{4}}, \quad \lambda _3=1-{\frac{1}{3}}={\frac{2}{3}}, \quad \zeta _3= \mathrm{min} ~ \{ \eta _3,\lambda _3 \}={\frac{2}{3}}, \nonumber \\&\eta =\mathrm{min} ~ \{ \eta _1, \eta _2, \eta _3 \}={\frac{3}{4}}, \quad {\lambda }=\mathrm{min} ~ \{ \lambda _1, \lambda _2, \lambda _3 \}={\frac{2}{3}}, \nonumber \\&\zeta = \mathrm{min} ~ \{ \eta ,\lambda \}=\mathrm{min} ~ \{ \zeta _1, \zeta _2,\zeta _3 \}={\frac{2}{3}}. \end{aligned}$$
(50)

This example validates the results of Lemma 3 and Theorem 2.
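
The numbers in Example 2 can be reproduced with the sketches given earlier (an illustration under the same assumptions; the decision table is encoded directly through its partitions):

```python
U  = frozenset(f"x{i}" for i in range(1, 12))
C1, C2, C3 = (frozenset({"x1", "x5", "x7"}),
              frozenset({"x2", "x4", "x6", "x10"}),
              frozenset({"x3", "x8", "x9", "x11"}))
D1, D2, D3 = (frozenset({"x1", "x4", "x6", "x10"}),
              frozenset({"x2", "x3", "x5"}),
              frozenset({"x7", "x8", "x9", "x11"}))
C_star, D_star = [C1, C2, C3], [D1, D2, D3]

print(pos_beta(C_star, D_star, 0.75) == C2 | C3)                # True, cf. Eq. (49)
print([discernibility_threshold(C_star, Dj) for Dj in D_star])  # three values of 2/3 ≈ 0.667, cf. Eq. (50)
```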

5 Inclusion degrees in computing \(\beta \)-positive region

Let \({\beta } \in (0.5,1]\). Since \({\beta } > 0.5\), no equivalence class can be \({\beta }\)-included in two distinct decision classes; hence, as implicitly indicated in (14),

$$\begin{aligned} {\underline{C}}^{\beta }(D_j) \cap {\underline{C}}^{\beta }(D_k)=\emptyset , \ \ ~\forall \ \ D_j, \ D_k \in D^* \ \ (j \ne k). \end{aligned}$$
(51)

Notice that for each \(C_i \in C^* \), we have

$$\begin{aligned}&I(C_i,D_1)+I(C_i,D_2)+ \cdots +I(C_i,D_{|D^*|}) \nonumber \\&\quad = \frac{|C_i \cap D_1|}{|C_i|}+\frac{|C_i \cap D_2|}{|C_i|}+ \cdots + \frac{|C_i \cap D_{|D^*|}|}{|C_i|} = 1. \end{aligned}$$
(52)

This, combined with (9) and (51), leads to the following:

Lemma 4

Let \(C_i \in C^* \).

  1.

    For \({\beta } \in (0.5,1]\), exactly one of the following two cases occurs: \( \mathrm{case \ 1}.\ \ C_i \subseteq \mathrm{POS}_C^{\beta }(D^*)\); \( \mathrm{case \ 2}. \ \ C_i \cap \mathrm{POS}_C^{\beta }(D^*)=\emptyset \).

  2.

    \(C_i \subseteq \mathrm{POS}_C^{\beta }(D^*)\) for some \({\beta } \in (0.5,1]\) iff there exists (one and only one) \(D_j \in D^*\) with \(I(C_i,D_j) > 0.5\). In this case, we have

    $$\begin{aligned} I(C_i,D_k) < 0.5, \quad ~\forall \ \ D_k \in D^*-\{D_j \}. \end{aligned}$$
    (53)
  3.

    If \(I(C_i,D_j) \le 0.5\) for all \(D_j \in D^*\), or if there exists \(D_j \in D^*\) with \(I(C_i,D_j) = 0.5\), then we must have

    $$\begin{aligned} C_i \cap \mathrm{POS}_C^{\beta }(D^*)=\emptyset , \quad ~\forall \ \ {\beta } \in (0.5,1]. \end{aligned}$$
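
Item 2 of Lemma 4 yields a simple membership test, sketched below (ours, reusing inclusion_degree): a condition class is absorbed into some \(\beta \)-positive region iff exactly one decision class covers more than half of it.

```python
def contributes_to_positive_region(Ci, D_star):
    """True iff C_i ⊆ POS_C^beta(D*) for some beta in (0.5, 1], Lemma 4, item 2."""
    # By (52) the degrees over D_star sum to 1, so at most one of them can exceed 0.5.
    return sum(1 for Dj in D_star if inclusion_degree(Ci, Dj) > 0.5) == 1
```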

6 Relative consistency of decision tables

From Example 2, the \(\beta \)-positive regions \(\mathrm{POS}_C^{\beta }(D^*)\), for all \({\beta } \in (0.5,1]\), of that weakly discernible decision table can be expressed as:

$$\begin{aligned} \mathrm{POS}_C^{\beta }(D^*)= \left\{ \begin{array}{ll} C_2 \cup C_3 &\quad \mathrm{if} \ {\beta } \in (0.5,{\frac{3}{4}}], \\ \emptyset &\quad \mathrm{if} \ {\beta } \in ({\frac{3}{4}},1]. \end{array} \right. \end{aligned}$$

We indeed have \(\mathrm{POS}_C^{\beta }(D^*) \subset U\) for all \({\beta } \in (0.5,1]\). Thus weak discernibility of a decision table \((U, C \cup D, V, f)\) does not ensure that \(\mathrm{POS}_C^{\beta }(D^*)=U\). This gives a counterexample to Proposition 3.2 of Cheng et al. (2015). The above observation leads to the following definition:

Definition 3

A decision table \((U, C \cup D, V, f)\) is said to be \(\beta \)-consistent \(({\beta } \in (0.5,1])\) if its \(\beta \)-positive region of \({D^*}\) w.r.t. C equals the universe U. That is,

$$\begin{aligned} \mathrm{POS}_C^{\beta }(D^*)=U. \end{aligned}$$
(54)

A decision table which is not consistent for any \(\beta \in (0.5,1]\) will be called absolutely inconsistent.

Using Item 3 of Lemma 4, it can easily be seen that a decision table \((U, C \cup D, V, f)\) is absolutely inconsistent iff there exists \(C_i \in C^*\) such that

$$\begin{aligned} I(C_i,D_j) \le 0.5, \ \ ~\forall \ \ D_j \in D^*, \end{aligned}$$
(55)

or equivalently,

$$\begin{aligned} C_i \cap \mathrm{POS}_C^{\beta }(D^*)=\emptyset , \ \ ~\forall \ \ {\beta } \in (0.5,1]. \end{aligned}$$
(56)

A decision table which is not absolutely inconsistent will be referred to as weakly consistent. More precisely, a decision table \((U, C \cup D, V, f)\) is weakly consistent iff for each \(C_i \in C^*\) there exists (one and only one) \(D_j \in D^*\) with \(I(C_i,D_j) > 0.5\). The greatest value of \(\beta \) which makes \((U, C \cup D, V, f)\) consistent will be referred to as its consistency threshold; hence, by Theorem 2, we obtain the following:

Theorem 3

Given a weakly consistent decision table \((U, C \cup D, V, f)\) , and the notation \(\eta \) defined in (32), we have

$$\begin{aligned} \mathrm{POS}_C^{\beta }(D^*)=\mathrm{POS}_C^{\eta }(D^*)=U, \ \ ~\forall \ \ {\beta } \in (0.5,{\eta }]. \end{aligned}$$
(57)

According to Theorem 3 and (22), we have immediate consequences as follows.

Corollary 1

Given a weakly consistent decision table \((U, C \cup D, V, f)\), and the notations \(\eta \), \(\lambda \), \(\zeta \) defined in (32), we have

$$\begin{aligned} \mathrm{POS}_C^{\beta }(D^*)=U=\bigcup \limits _{D_j \in D^*} ~ {\overline{C}}^{\beta }(D_j), \quad ~\forall \ \ {\beta } \in (0.5,{\zeta }]. \end{aligned}$$
(58)

Example 3

From Example 1, we have

$$\begin{aligned} \mathrm{POS}_C^{\beta }(D^*)=U, \quad ~\forall \ \ {\beta } \in \left( 0.5,{\frac{2}{3}}\right] . \end{aligned}$$
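
The consistency notions of this section can likewise be checked mechanically. The sketch below (ours, under the assumptions of the earlier listings) returns the consistency threshold \(\eta \) of a weakly consistent table and None for an absolutely inconsistent one; for the table of Example 1 it returns 2/3, in line with Example 3.

```python
def consistency_threshold(C_star, D_star):
    """eta from (32) if the table is weakly consistent (Theorem 3), else None (Eq. (55))."""
    eta = 1.0
    for Ci in C_star:
        best = max(inclusion_degree(Ci, Dj) for Dj in D_star)
        if best <= 0.5:
            return None       # some C_i meets no decision class in more than half of it
        eta = min(eta, best)  # only this degree of C_i exceeds 0.5, by (52)
    return eta
```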

7 Concluding remarks

We have shown a close relationship between the \(\beta \)-approximation degree of dependency and the \(\beta \)-belief function of a given decision table. We further characterized a decision table by its positive regions under certain thresholds and indicated how to find a threshold for weakly discernible decision tables. In particular, positive regions with suitable thresholds provide a clear description of a weakly consistent decision table. By computing inclusion degrees, we are able to find thresholds and thereby characterize decision tables. Our examples demonstrate the theory on a small scale. The approximation degree plays an interesting role in upper and lower approximations, and this paper also established the relationship between approximation degrees and discernibility thresholds, thereby extending and improving earlier results (Ziarko 1993). We anticipate deriving algorithms for large decision tables. It would also be interesting to connect our work with mutual entropy and Bayesian considerations (Düntsch and Gediga 2015), which may give rise to an algorithm for determining an approximately optimal set of predicting attributes. Future applications of our theory include incorporating the inclusion degree in analyzing inconsistent decision tables on a Hadoop-based distributed platform.