Learning differential diagnosis of erythemato-squamous diseases using voting feature intervals

https://doi.org/10.1016/S0933-3657(98)00028-1

Abstract

A new classification algorithm, called VFI5 (for Voting Feature Intervals), is developed and applied to the problem of differential diagnosis of erythemato-squamous diseases. The domain contains records of patients with known diagnoses. Given a training set of such records, the VFI5 classifier learns how to differentiate a new case in the domain. VFI5 represents a concept in the form of feature intervals on each feature dimension separately. Classification in the VFI5 algorithm is based on real-valued voting. Each feature participates equally in the voting process, and the class that receives the maximum amount of votes is declared to be the predicted class. The performance of the VFI5 classifier is evaluated empirically in terms of classification accuracy and running time.

Introduction

Researchers working on artificial intelligence have created many algorithms that successfully learn straightforward abilities. If the context is well-defined and the bounds of the problem can be correctly encoded for the computer, then these algorithms can often pick up a pattern and learn to predict it successfully. Inductive learning is a well-known approach to automatic knowledge acquisition of such patterns and classification knowledge from examples.

Inductive learning systems have been applied in several medical domains. For example, two classification systems are used in the localization of a primary tumor, prognostics of recurrence of breast cancer, diagnosis of thyroid diseases, and rheumatology [10]. The CRLS system learns categorical decision criteria in biomedical domains [15]. The case-based BOLERO system learns both plans and goal states, with the aim of improving the performance of a rule-based system by adapting its behavior to the most recent information available about a patient [13]. DIAGAID is a program that uses a connectionist approach to determine the diagnostic value of clinical data [7].

Classification learning algorithms consist of two components: training and prediction (classification). The training phase uses an induction algorithm to form a model of the domain from training examples that encode previous experience. The classification phase then uses this model to predict the class to which a new instance (case) belongs.

The main requirement for such a system is prediction accuracy. Furthermore, a classification learning algorithm is expected to have short training and prediction times. Such a system should be robust to noisy training instances. Also, in some real-world domains, both training and test instances may have missing values. Features (attributes) that are used to encode instances may have different levels of relevancy to the domain, so a classification learning system should be able to learn and/or incorporate information about the weights of the features. Another requirement might be the comprehensibility of the learned knowledge by human experts. The advantage of this trait is twofold. First, human experts can check and verify the learned classification knowledge before it is put to use in real-world domains. Second, some previously unknown facts and patterns may be brought to the attention of human experts, leading to interesting discoveries in the field.

Previously developed machine learning algorithms usually possess some of these characteristics and fail to satisfy the others. For example, some algorithms (e.g. nearest neighbor and instance-based learning algorithms [1, 4]) develop a model of the domain quickly, but may take quite a long time to make a prediction using this model. On the other hand, some algorithms (e.g. neural networks) can make a fast prediction, but the knowledge they learn is difficult for humans to understand and verify.

The success of a classification learning algorithm, in terms of the criteria mentioned above, is directly related to the scheme used for representing the classification knowledge learned. In this paper, we present a knowledge representation technique called voting feature intervals (VFI). Along with the learning and classification algorithms, the whole system is called VFI5. The VFI representation is based on feature projections, which have been used in CFP [8] and k-NNFP [2]. VFI5, which is a non-incremental and supervised learning algorithm, is applied to differential diagnosis of erythemato-squamous diseases. Here, we show that the VFI5 algorithm, using the VFI representation, results in highly accurate predictions, has short training and classification times, is robust to noisy training instances and missing feature values, can use feature weights, and produces a human-readable model of the classification knowledge.

The rationale behind the VFI knowledge representation is that human experts maintain knowledge in this form, especially in medical domains. The input to the VFI5 training algorithm is a set of training instances that are descriptions of patients with known diagnoses. Learning from these training examples, VFI5 constructs a representation of the classification knowledge inherent in the examples. This knowledge is represented as the projections of the training dataset by feature intervals on each feature dimension separately. Subsequently, for each feature dimension, projection points with similar characteristics are grouped into intervals. Therefore, an interval represents a set of feature values that yield the same classifications.
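The grouping of projection points into intervals can be illustrated with a minimal sketch for a single feature. The excerpt does not give the exact construction rule, so the code below makes an explicit assumption: the cut points are the smallest and largest value observed for each class, every cut point forms its own point interval, and the open gaps between consecutive cut points form range intervals. The function names `build_intervals` and `find_interval` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def find_interval(intervals, value):
    """Return the index of the interval that contains value, or None."""
    for idx, (low, high) in enumerate(intervals):
        if low == high and value == low:   # point interval [low, low]
            return idx
        if low < value < high:             # open range interval (low, high)
            return idx
    return None


def build_intervals(values, labels):
    """Project one feature's training values and group them into intervals.

    Assumption (not stated in the excerpt): cut points are the per-class
    minimum and maximum values of this feature.
    """
    by_class = {}
    for value, label in zip(values, labels):
        if value is None:                  # missing values are not projected
            continue
        by_class.setdefault(label, []).append(value)

    cuts = sorted({p for vs in by_class.values() for p in (min(vs), max(vs))})

    intervals = []
    for i, point in enumerate(cuts):
        intervals.append((point, point))             # point interval
        if i + 1 < len(cuts):
            intervals.append((point, cuts[i + 1]))   # range interval

    # Count how many training values of each class fall into each interval.
    counts = [Counter() for _ in intervals]
    for value, label in zip(values, labels):
        if value is None:
            continue
        counts[find_interval(intervals, value)][label] += 1
    return intervals, counts
```

For the values 1, 2, 3 of a class A and 5, 6, 7 of a class B, this sketch produces point intervals at 1, 3, 5 and 7 and range intervals between them. The interval (3, 5) contains no training value, so under this assumption it would carry no evidence for either class.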

When diagnosing a new patient, each feature participates in the voting process and the diagnosis that receives the maximum amount of votes is predicted to be the diagnosis of that patient. As each feature participates in learning and classification independently, VFI enables an easy and natural way of handling missing feature values by simply ignoring them, i.e. features whose values are unknown do not participate in the voting.
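The voting step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each feature is assumed to have a lookup table (here called `vote_tables`, a hypothetical name) that maps a feature value to real-valued votes per class, and the class labels and vote values in the usage note are invented for the example.

```python
def predict(vote_tables, instance):
    """Sum per-feature votes and return (predicted_class, totals).

    vote_tables[f] maps a value of feature f to a dict {class: vote}.
    A value of None marks a missing feature value; such a feature
    simply does not vote.
    """
    totals = {}
    for f, value in enumerate(instance):
        if value is None:                      # unknown value: skip feature
            continue
        for cls, vote in vote_tables[f].get(value, {}).items():
            totals[cls] = totals.get(cls, 0.0) + vote
    if not totals:                             # no feature could vote
        return None, totals
    return max(totals, key=totals.get), totals


# Hypothetical vote tables for two features; each feature's votes sum to 1.
example_tables = [
    {0: {"C1": 0.5, "C2": 0.5}, 2: {"C3": 1.0}},
    {3: {"C1": 0.75, "C2": 0.25}},
]
```

With these tables, `predict(example_tables, (0, 3))` sums the votes of both features and selects C1, while `predict(example_tables, (None, 3))` uses only the second feature because the first value is unknown.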

The next section describes the VFI5 algorithm in detail. In Section 3, the problem of differential diagnosis of erythemato-squamous diseases is explained. Application of the VFI5 algorithm to this domain is discussed in Section 4. Section 6 describes the weights learned for the features of this domain using a genetic algorithm. Finally, the last section concludes with some remarks and plans for future work.

The VFI5 algorithm

The VFI5 classification algorithm is an improved version of the earlier VFI1 algorithm [6]. Here, the VFI5 algorithm is described in detail and explained through the use of an example.

Differential diagnosis of erythemato-squamous diseases

The differential diagnosis of erythemato-squamous diseases is a difficult problem in dermatology. These diseases all share the clinical features of erythema and scaling, with very few differences. The diseases in this group are psoriasis, seborrheic dermatitis, lichen planus, pityriasis rosea, chronic dermatitis and pityriasis rubra pilaris.

These diseases are frequently seen in outpatient dermatology departments. At first sight, all of the diseases look very much alike, with erythema and scaling.

Experiments

Currently, the dataset for the domain contains 366 instances. Firstly, we used all of these instances to obtain a description of the domain. The description consists of the feature intervals constructed for each feature. The intervals obtained for features f6, f14, f15, f21 and f34 are shown in Fig. 6.

It is clear from Fig. 6 that the nonzero values of feature f6 (polygonal papules) indicate class C3 (lichen planus). On the other hand, the high values for f14 would suggest class C1 or

Comprehensibility of VFI5

The explanation ability of a classification process is as important as its classification accuracy. We have shown the empirical evaluation of the VFI5 classifier in Section 4 on the Dermatology dataset. However, a high prediction accuracy is not enough for a classification system; the knowledge it constructs should also be comprehensible by humans. For this purpose, we have tried to visualize the concept description learned by the VFI5 classifier. Since each feature votes for each class

Learning feature weights using a genetic algorithm

In a real-world domain, such as the one used in this paper, the features used in the descriptions of instances may have different levels of relevancy. Therefore, many feature selection and feature weight learning algorithms have been developed by machine learning researchers [3, 5, 12].
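One simple way feature weights could enter the voting is sketched below. The excerpt does not say how VFI5 combines weights with votes, so scaling each feature's vote by its weight is an assumption, and the function name `weighted_totals` and all numbers are hypothetical.

```python
def weighted_totals(feature_votes, weights):
    """Combine per-feature votes, scaling each feature by its weight.

    feature_votes[f] is a dict {class: vote} for feature f, or None when
    the feature value is unknown. weights[f] is the relevance of feature f.
    Assumption: a weight multiplies the feature's vote.
    """
    totals = {}
    for votes, weight in zip(feature_votes, weights):
        if votes is None:                  # unknown value: feature abstains
            continue
        for cls, vote in votes.items():
            totals[cls] = totals.get(cls, 0.0) + weight * vote
    return totals


# Hypothetical votes of three features for one test case.
example_votes = [{"C1": 1.0}, {"C2": 1.0}, {"C2": 1.0}]
```

With equal weights `[1, 1, 1]` the two features voting for C2 outweigh the single feature voting for C1. With weights `[3, 1, 1]` the first feature dominates and C1 receives the larger total, which shows how learned weights can change a prediction.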

We previously developed a genetic algorithm for learning the feature weights to be used with the Nearest Neighbor classification algorithm. We applied the same genetic algorithm to determine the weights of the

Conclusions

In this paper, a new classification algorithm called VFI5 has been developed and applied to differential diagnosis of erythemato-squamous diseases. Since each feature is processed separately, the missing feature values that may appear both in the training and test instances are simply ignored in VFI5. In other classification algorithms, such as decision tree inductive learning algorithms, the missing values require extra care [14]. This problem has been overcome by simply omitting the feature

Acknowledgements

This project is supported by TUBITAK (Scientific and Technical Research Council of Turkey) under Grant EEEAG-153. The authors thank Narin Emeksiz for preparing the user interface for the VFI5 program.

1 Present address: Microsoft Corporation, Redmond, WA 98052, USA. Tel.: +1 425 9366181; fax: +1 425 9367329.