Abstract:
Detecting atypical data and outliers is an important and difficult task. Data considered atypical are characterized by a set of features that determine their informativeness, so identifying the features that make data elements atypical is both relevant and necessary. An even harder task is to find atypical features in small data sets that can be regarded as transitional data, i.e., data located at the boundaries between classes in a classification problem. Such transition-zone data make it difficult to classify and to construct a discriminant separation. This paper proposes a method for determining outliers and irrelevant data based on clustering class data with a minimum spanning tree. Outliers are detected by minimizing the set of bipartite graph data of adjacent groups. This leads to spatial local delimitation and determination of the set of outlier characteristics. The proposed method detects outliers on different feature sets, both on features common to all classes and on features of individual classes.
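The general idea of clustering with a minimum spanning tree can be illustrated with a minimal sketch. It is not the method proposed in the paper: the abstract does not specify the algorithm, and the bipartite-graph minimization between adjacent groups and the per-class feature sets are not reproduced here. The function names, the cut rule (remove edges longer than `factor` times the median edge length), and the `min_size` parameter are all assumptions chosen only for illustration.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (i, j, distance) edges. Illustrative helper,
    not taken from the paper.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known distance from the tree to each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_outliers(points, factor=3.0, min_size=2):
    """Flag points that end up in very small clusters.

    Assumed rule: cut MST edges longer than factor * median edge length,
    then report points in components with fewer than min_size members.
    """
    edges = minimum_spanning_tree(points)
    if not edges:
        return []
    lengths = sorted(d for _, _, d in edges)
    threshold = factor * lengths[len(lengths) // 2]

    # Union-find over the edges that survive the cut.
    root = list(range(len(points)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for i, j, d in edges:
        if d <= threshold:
            root[find(i)] = find(j)

    sizes = {}
    for i in range(len(points)):
        r = find(i)
        sizes[r] = sizes.get(r, 0) + 1
    return [i for i in range(len(points)) if sizes[find(i)] < min_size]
```

Calling `mst_outliers` on four points at the corners of a unit square plus one distant point flags only the distant point, because the single long tree edge that connects it is cut and it remains alone in its component.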
Published in: 2021 IEEE 3rd International Conference on Advanced Trends in Information Theory (ATIT)
Date of Conference: 15-17 December 2021
Date Added to IEEE Xplore: 20 January 2022