Measures of uncertainty based on Gaussian kernel for a fully fuzzy information system
Introduction
Uncertainty, including vagueness, incompleteness, inconsistency, fuzziness and randomness, is pervasive in the real world. In machine learning, how to evaluate uncertain information (i.e., how to find suitable measure indicators) is an important issue [1], [2]. When uncertainty measures are applied to evaluate an information system, good indicators can improve the accuracy and efficiency of clustering and classification tasks [3], [4].
Rough set theory is an effective soft computing method for characterizing uncertainty [5], [6], [7], [8]. An information system [5], as a representation of information (or knowledge), is the research object of rough set theory. Many studies of rough set theory are related to information systems, covering decision analysis, uncertainty modeling, attribute reduction, machine learning, pattern recognition and reasoning with uncertainty [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21].
In an information system, there are two major directions for measuring uncertainty: information granulation and information entropy. They give indicators for studying uncertainty from different perspectives. The uncertainty of an information system decreases when information granulation decreases or information entropy decreases.
Granulation is a typical physical indicator for measuring granule size. The concept of granulation was introduced by Zadeh [22]. Uncertainty measures for information systems can be defined as information granulation or knowledge granulation. Granulation is in effect the average degree of information or knowledge refinement in an information system. The discernibility of an information system can be characterized through information granulation, i.e., the discernibility of an information system becomes stronger as information granulation gets smaller. Many researchers have made notable contributions in this respect [23], [24], [25], [26], [27], [28], [29]. Yao et al. [30] proposed a granularity measure from the viewpoint of granulation. In set-valued information systems, Dai et al. [31] studied four types of measures from the perspectives of entropy and granularity. Moreover, Wang et al. [32] discussed granularity measures in interval and set-valued information systems. In neighborhood systems, Chen et al. [33] investigated granule structures, distances and measures. Based on neighborhood rough sets, Li et al. [34] proposed neighborhood granulation measures for information systems. Based on partition-based granular structures, Yao [35] studied granularity measures and complexity measures.
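The relationship between granule size and discernibility can be illustrated with a small sketch. It assumes one classical partition-based form of knowledge granulation from the rough set literature, the sum of squared block sizes divided by the squared size of the universe. This is not necessarily any of the measures defined later in this paper, and the function name and toy labels are hypothetical.

```python
from collections import Counter


def knowledge_granulation(labels):
    """Partition-based knowledge granulation (assumed classical form).

    `labels` assigns each object of the universe to an equivalence class.
    The value is sum(|X_i|^2) / |U|^2, which lies in [1/|U|, 1].
    """
    n = len(labels)
    block_sizes = Counter(labels).values()
    return sum(size * size for size in block_sizes) / (n * n)


# Hypothetical toy universe of six objects under two partitions.
coarse = ["a", "a", "a", "b", "b", "b"]   # two blocks of size 3
fine = ["a", "a", "b", "b", "c", "d"]     # blocks of size 2, 2, 1, 1

g_coarse = knowledge_granulation(coarse)  # (9 + 9) / 36
g_fine = knowledge_granulation(fine)      # (4 + 4 + 1 + 1) / 36
```

Under this assumed form, the finer partition yields the smaller value, which matches the statement that discernibility becomes stronger as information granulation gets smaller.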
Entropy, introduced by Shannon [36], is a practical index. It can represent the content of information or knowledge of various types, and has been widely employed in many fields, especially for measuring uncertainty. Dai et al. [37] studied uncertainty measurement in interval-valued decision systems based on an extended conditional entropy. Dai et al. [38] proposed the rough degree, a measure that can reveal uncertainty, in interval-valued information systems. In incomplete decision systems, Dai et al. [39] proposed a new definition of conditional entropy, derived some significant properties and explored its application. Liu et al. [40] discussed several parameterized measures from the viewpoints of granulation and entropy in distributed fully fuzzy information systems (FFISs). To study the uncertainty of probabilistic hesitant fuzzy information, Su et al. [41] put forward two types of entropy measures. To handle noise and uncertainty, Sun et al. [42] studied some neighborhood entropies for measuring uncertainty in neighborhood decision systems.
Many information systems in which the information values under attributes are fuzzy exist in reality [43], [44], [45], so more research is needed on fuzzy information systems. In an information system, if all information values under each attribute are fuzzy, then this system is said to be a fully fuzzy information system (FFIS) [40], [46], [47]. Zhang et al. [48] studied information structures and uncertainty measures based on equivalence relations for FFISs. Following the idea of discretization, two close values are assigned to the same class, and the equivalence relation corresponding to each attribute is defined by this class consistency. Li et al. [49] gave the process of constructing a fuzzy -similarity relation for a fuzzy condition decision information system based on the Gaussian kernel. The similarity of two objects under a single condition attribute is calculated through the Gaussian kernel, and then fuzzy -similarity relations under each condition attribute are formed. By using the Hadamard product to aggregate these relations, a fuzzy -similarity relation under all condition attributes is obtained.
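The construction described above (a Gaussian kernel per attribute, then aggregation by Hadamard product) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the construction of [49] or of this paper: the kernel form exp(-(x - y)^2 / (2 sigma^2)), the shared width parameter `sigma`, the function names and the toy table are all hypothetical.

```python
import numpy as np


def gaussian_relation(column, sigma=0.5):
    """Pairwise Gaussian-kernel similarity of objects under one attribute.

    Assumed kernel: exp(-(x - y)^2 / (2 * sigma^2)); sigma is a
    hypothetical width parameter, not a value taken from the paper.
    """
    col = np.asarray(column, dtype=float)
    diff = col[:, None] - col[None, :]          # all pairwise differences
    return np.exp(-(diff ** 2) / (2.0 * sigma ** 2))


def aggregate_relations(table, sigma=0.5):
    """Combine the per-attribute relations by the Hadamard product."""
    table = np.asarray(table, dtype=float)
    n_objects, n_attributes = table.shape
    relation = np.ones((n_objects, n_objects))
    for j in range(n_attributes):
        relation *= gaussian_relation(table[:, j], sigma)  # element-wise
    return relation


# Hypothetical table: 3 objects, 2 attributes, membership values in [0, 1].
toy = np.array([[0.2, 0.9],
                [0.3, 0.8],
                [0.9, 0.1]])
R = aggregate_relations(toy)
```

The resulting matrix is symmetric with ones on the diagonal, and objects whose membership values are close under every attribute (the first two rows here) receive a similarity near 1. With a shared `sigma`, the element-wise product of the per-attribute kernels equals a single Gaussian kernel of the Euclidean distance between the rows.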
Although Zhang et al. [48] studied information structures and uncertainty measures for FFISs, their study was based on equivalence relations and ignored the fuzziness of the system itself. This paper also investigates uncertainty measures for a FFIS but, unlike the approach of Zhang et al., the relation in a FFIS is based on the Gaussian kernel, i.e., it is a fuzzy -similarity relation. The fuzzy information structure is defined by using this relation so that fuzziness is not lost. This is one of our research motivations. On the other hand, Li et al. [49] applied the fuzzy -similarity relation based on the Gaussian kernel to construct rough sets, but their main purpose was to propose a multi-granulation decision-theoretic rough set method for fuzzy condition decision information systems. In this article, we derive the fuzzy -similarity relation induced by a FFIS based on the Gaussian kernel, with the main purpose of studying uncertainty measures for this FFIS. This is another motivation of our research. The main advantage of the proposed information structure is that it is based on fuzzy -similarity. Moreover, compared with uncertainty measures based on a general information structure, the proposed uncertainty measures also take fuzziness into account, so they can better reflect the essence of uncertainty and are more suitable for FFISs.
There are three main contributions in this article: (1) The fuzzy -similarity relation is extracted based on the Gaussian kernel for a given FFIS. Based on this relation, fuzzy information granules are constructed, and fuzzy information structures based on these granules are proposed. (2) According to the proposed information structures, uncertainty measures are presented for a given FFIS. In addition, their important properties are given, and the relationships among these measures are established. (3) Two numerical experiments and a statistical analysis (i.e., dispersion analysis and correlation analysis) of the proposed measures are conducted to determine their effectiveness.
These results will be helpful for understanding the uncertainty essence of FFISs. The proposed measures can be used to compute the importance of attributes, measure the quality of a decision rule in FFISs, construct the heuristic function in a heuristic reduction algorithm and so on.
The workflow of this article is displayed in Fig. 1.
The remainder of this article is organized into six sections. In Section 2, we review some basic knowledge about fuzzy sets, fuzzy relations and FFISs. The fuzzy -similarity relation is proposed based on the Gaussian kernel method for a FFIS in Section 3. In Section 4, some notions for fuzzy information structures are given for a FFIS. In Section 5, we propose some indicators for measuring the uncertainty of a FFIS. In Section 6, two numerical experiments are conducted to assess the performance of the proposed measure indicators. Section 7 summarizes this paper.
Preliminaries
In this paper, let be a universe (non-empty finite set), . is recorded as the family formed by all fuzzy sets on .
Fuzzy -similarity relation caused by a FFIS
In this section, the fuzzy -similarity relation is proposed according to Gaussian kernel method in a FFIS.
Fuzzy information structures in a FFIS
In this section, some notions for fuzzy information structures are given for a FFIS.
Measuring uncertainty of a FFIS
In a FFIS, the information values corresponding to each attribute are fuzzy, i.e., each attribute determines a fuzzy set. Based on the fuzzy information structures in a FFIS, we propose some indicators for measuring its uncertainty.
Numerical experiments and effectiveness analysis
To assess the performance of the proposed measure indicators in FFISs, we conduct two numerical experiments and analyze their effectiveness in this section. First, two numerical experiments are conducted to examine the monotonicity of the four measures. The experimental results verify the monotonicity of the four measures and show their feasibility. Then, the dispersion degree is considered to analyze the effectiveness of the four measures. The performances of the four measures are compared based on
Conclusion
In this paper, we obtain fuzzy -similarity relation in a FFIS by using Gaussian kernel. In view of the fuzzy -similarity relation, some notions including dependence, information distance and inclusion degree for fuzzy information structures are given for a FFIS. According to this information structure, granulation measure of a given FFIS is advanced. Moreover, entropy measure is also considered for a given FFIS. To assess the performance of the advanced measure indicators, two numerical
CRediT authorship contribution statement
Zhaowen Li: Conceptualization, Methodology. Xiaofeng Liu: Investigation, Methodology, Writing - original draft, Writing - review & editing. Jianhua Dai: Supervision, Conceptualization, Validation, Writing - review & editing. Jiaolong Chen: Software, Investigation. Hamido Fujita: Validation.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (61976089, 11971420), Natural Science Foundation of Guangxi, China (2016GXNSFAA380045, 2016GXNSFAA380282, 2016GXNSFAA380286), Key Laboratory of Optimization Control and Engineering Calculation in Department of Guangxi Education, China, Special Funds of Guangxi Distinguished Experts Construction Engineering, China, Key Laboratory of Complex System Optimization and Big Data Processing in Department of Guangxi Education,
References (61)
- et al., A measurement theory view on the granularity of partitions, Inform. Sci. (2012)
- et al., A rapid learning algorithm for vehicle classification, Inform. Sci. (2015)
- et al., Rudiments of rough sets, Inform. Sci. (2007)
- et al., Approximations and uncertainty measures in incomplete information systems, Inform. Sci. (2012)
- et al., Uncertainty measurement for incomplete interval-valued information systems based on -weak similarity, Knowl.-Based Syst. (2017)
- et al., Sequential covering rule induction algorithm for variable consistency rough set approaches, Inform. Sci. (2011)
- et al., Attribute selection based on information gain ratio in fuzzy rough set theory with application to tumor classification, Appl. Soft Comput. (2013)
- et al., Rough sets in distributed decision information systems, Knowl.-Based Syst. (2016)
- et al., The uncertainty of probabilistic rough sets in multi-granulation spaces, Internat. J. Approx. Reason. (2016)
- et al., Dynamic variable precision rough set approach for probabilistic set-valued information systems, Knowl.-Based Syst. (2017)
- Communication between fuzzy information systems using fuzzy covering-based rough sets, Internat. J. Approx. Reason.
- A novel three-way decision model with decision-theoretic rough sets using utility theory, Knowl.-Based Syst.
- Incremental approaches to updating reducts under dynamic covering granularity, Knowl.-Based Syst.
- A characterization of novel rough fuzzy sets of information systems and their application in decision making, Expert Syst. Appl.
- Fuzzy equivalence relation and its multigranulation spaces, Inform. Sci.
- Three-layer granular structures and three-way informational measures of a decision table, Inform. Sci.
- Entropy measures and granularity measures for set-valued information systems, Inform. Sci.
- Granule structures, distances and measures in neighborhood systems, Knowl.-Based Syst.
- Granularity measures and complexity measures of partition-based granular structures, Knowl.-Based Syst.
- Uncertainty measurement for interval-valued decision systems based on extended conditional entropy, Knowl.-Based Syst.
- Uncertainty measurement for interval-valued information systems, Inform. Sci.
- Feature selection using neighborhood entropy-based uncertainty measures for gene expression data classification, Inform. Sci.
- The role of fuzzy sets in decision sciences: Old techniques and new directions, Fuzzy Sets and Systems
- Granular representation and granular computing with fuzzy sets, Fuzzy Sets and Systems
- Implication-based models of monotone fuzzy rule bases, Fuzzy Sets and Systems
- Information structures and uncertainty measures in a fully fuzzy information system, Internat. J. Approx. Reason.
- A multi-granulation decision-theoretic rough set method for distributed -decision information systems: An application in medical diagnosis, Appl. Soft Comput.
- Fuzzy sets, Inf. Control
- Hybrid attribute reduction based on a novel fuzzy-rough model and information granulation, Pattern Recognit.
- Gaussian kernel based fuzzy rough sets: Model, uncertainty measures and applications, Internat. J. Approx. Reason.