Abstract
Clustering, a data mining task, automatically groups similar documents into the same cluster. Clusters let us organize similar text documents in one place, and they let us assign previously unseen documents to one of the known clusters on the basis of a similarity measure. Automatic clustering is usually based on the words a document contains. In this work, we use two approaches to clustering based on Neutrosophic logic. Fuzzy logic takes into account only two values, the degree of truth and the degree of falsity, whereas Neutrosophic logic adds a third factor, indeterminacy. Indeterminacy applies when it is not certain which cluster a particular document belongs to. The first approach adds the indeterminacy factor of Neutrosophic logic to the Fuzzy C-Means clustering method and modifies the formulas that calculate the cluster centers and the truth membership of documents in clusters. The second approach has three phases. First, generate the dataset from the relative frequency of words in each document. Second, choose seed documents for the clusters using the Euclidean distance between documents. Third, calculate the truth (T), indeterminacy (I), and falsity (F) values of every document with respect to every cluster, and then assign each document to a cluster on the basis of these values.
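The three phases of the second approach can be illustrated with a minimal Python sketch. The abstract does not give the chapter's formulas, so everything below is an assumption made for illustration: the function names are hypothetical, seeds are picked greedily as the mutually farthest documents, T is an inverse-squared-distance membership normalized over clusters, F is taken as 1 - T, and I is a single per-document value (the ratio of the second-best to the best T) rather than one value per cluster. It is not the authors' method.

```python
import math
from collections import Counter


def relative_frequencies(docs):
    """Phase 1: one vector per document, each entry = word count / document length."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        words = d.lower().split()
        counts = Counter(words)
        total = len(words) or 1  # avoid division by zero for empty documents
        vectors.append([counts[w] / total for w in vocab])
    return vocab, vectors


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_seeds(vectors, k):
    """Phase 2 (assumed rule): start with the two farthest documents, then
    repeatedly add the document whose nearest seed is farthest away.
    Requires 2 <= len(vectors) and k <= len(vectors)."""
    n = len(vectors)
    pairs = ((a, b) for a in range(n) for b in range(a + 1, n))
    i, j = max(pairs, key=lambda p: euclidean(vectors[p[0]], vectors[p[1]]))
    seeds = [i, j][:k]
    while len(seeds) < k:
        rest = [d for d in range(n) if d not in seeds]
        seeds.append(max(
            rest,
            key=lambda d: min(euclidean(vectors[d], vectors[s]) for s in seeds),
        ))
    return seeds


def tif_values(vec, seed_vecs, eps=1e-12):
    """Phase 3 (assumed formulas): T per cluster, one I per document, F per cluster."""
    inv = [1.0 / (euclidean(vec, s) ** 2 + eps) for s in seed_vecs]
    total = sum(inv)
    T = [v / total for v in inv]
    F = [1.0 - t for t in T]  # simplification: falsity as the complement of truth
    ranked = sorted(T, reverse=True)
    # Indeterminacy is high when the two best clusters are almost equally likely.
    I = ranked[1] / ranked[0] if len(ranked) > 1 else 0.0
    return T, I, F


def cluster(docs, k):
    """Run all three phases and assign each document to the cluster with the largest T."""
    _, vectors = relative_frequencies(docs)
    seeds = pick_seeds(vectors, k)
    seed_vecs = [vectors[s] for s in seeds]
    labels = []
    for v in vectors:
        T, _, _ = tif_values(v, seed_vecs)
        labels.append(max(range(k), key=lambda c: T[c]))
    return seeds, labels
```

For example, `cluster(["apple banana apple", "banana apple banana", "car engine wheel", "engine car car"], 2)` places the two fruit documents in one cluster and the two car documents in the other. A document with I close to 1 sits almost midway between its two nearest seeds, which is the situation the abstract describes as indeterminate.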
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Akhtar, N., Qureshi, M.N., Ahamad, M.V. (2017). An Improved Clustering Method for Text Documents Using Neutrosophic Logic. In: Ali, R., Beg, M. (eds) Applications of Soft Computing for the Web. Springer, Singapore. https://doi.org/10.1007/978-981-10-7098-3_10
DOI: https://doi.org/10.1007/978-981-10-7098-3_10
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7097-6
Online ISBN: 978-981-10-7098-3
eBook Packages: Computer Science, Computer Science (R0)