DOI: 10.1145/3195106.3195110
research-article

Non-negative Matrix Factorization for Overlapping Clustering of Customer Inquiry and Review Data

Published: 26 February 2018

ABSTRACT

Clustering informal user-generated text is difficult, and in many such datasets each data point carries multiple labels. This paper therefore focuses on Non-negative Matrix Factorization (NMF) algorithms for Overlapping Clustering of customer inquiry and review data, a topic that has seldom been discussed in previous literature. We extend the use of Semi-NMF and Convex-NMF to Overlapping Clustering and develop a procedure for applying Semi-NMF and Convex-NMF to Overlapping Clustering of text data. The procedure is tested on customer review and inquiry datasets. Comparison with a baseline model shows that Semi-NMF and Convex-NMF have an advantage over the baseline: they achieve similarly strong clustering performance without parameter tuning. Moreover, we compare different methods of picking labels to generate Overlapping Clustering results from Soft Clustering algorithms, and conclude that thresholding by the mean is simpler and relatively more reliable than the maximum-n method.
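To make the two ingredients named in the abstract concrete, the sketch below pairs a basic Semi-NMF factorization with the two label-picking rules. It is an illustration under stated assumptions, not the paper's procedure: the Semi-NMF updates follow the multiplicative rule published by Ding, Li, and Jordan; the function names (`semi_nmf`, `labels_by_mean`, `labels_by_max_n`), the random initialization, and the iteration count are hypothetical; and "thresholding by mean" is assumed here to mean keeping every cluster whose membership weight exceeds that data point's own mean weight, which the abstract does not spell out.

```python
import numpy as np

def pos(a):
    """Positive part of a matrix: (|A| + A) / 2."""
    return (np.abs(a) + a) / 2

def neg(a):
    """Negative part of a matrix: (|A| - A) / 2."""
    return (np.abs(a) - a) / 2

def semi_nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate X (features x samples) as F @ G.T with G >= 0.

    F may contain negative entries, which suits mixed-sign inputs
    such as word-embedding features. Initialization is random here
    (an assumption made for brevity).
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[1]
    G = rng.random((n_samples, k)) + 0.2
    for _ in range(n_iter):
        # Closed-form least-squares update for the unconstrained factor.
        F = X @ G @ np.linalg.pinv(G.T @ G)
        XtF = X.T @ F
        FtF = F.T @ F
        # Multiplicative update keeps G non-negative.
        num = pos(XtF) + G @ neg(FtF)
        den = neg(XtF) + G @ pos(FtF)
        G *= np.sqrt((num + eps) / (den + eps))
    return F, G

def labels_by_mean(G):
    """Assumed rule: keep clusters whose weight exceeds the row mean."""
    return G > G.mean(axis=1, keepdims=True)

def labels_by_max_n(G, n):
    """Keep the n clusters with the largest weight in each row."""
    top = np.argsort(-G, axis=1)[:, :n]
    mask = np.zeros(G.shape, dtype=bool)
    np.put_along_axis(mask, top, True, axis=1)
    return mask
```

Under these assumptions, each row of `G` acts as a soft membership vector for one document. `labels_by_mean(G)` lets the number of labels vary per document with no extra parameter, whereas `labels_by_max_n(G, n)` forces exactly `n` labels on every document and requires choosing `n`.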


Published in

ICMLC '18: Proceedings of the 2018 10th International Conference on Machine Learning and Computing
February 2018
411 pages
ISBN: 9781450363532
DOI: 10.1145/3195106

        Copyright © 2018 ACM


        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article
        • Research
        • Refereed limited