Random Set to Interpret Topic Models in Terms of Ontology Concepts

Bashar, Md Abul; Li, Yuefeng

doi:10.1007/978-3-319-63004-5_19

Conference paper
First Online: 09 July 2017

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10400)

Abstract

Topic modelling is a popular technique in text mining. However, discovered topic models are difficult to interpret because they are often incoherent and lack background context. Many applications require an accurate interpretation of topic models so that both users and machines can use them effectively. Taking advantage of random sets and a domain ontology, this research interprets topic models in terms of ontology concepts. The interpretation is evaluated by comparing it with different baseline models on two standard datasets. The results show that the interpretation performs significantly better than the baseline models.

Notes

1. http://svmlight.joachims.org

Acknowledgment

This research was partially supported by Grant DP140103157 from the Australian Research Council (ARC Discovery Project).

Author information

Authors and Affiliations

Electrical Engineering and Computer Science School, Queensland University of Technology (QUT), Brisbane, 4001, Australia
Md Abul Bashar & Yuefeng Li

Corresponding author

Correspondence to Md Abul Bashar.

Editor information

Editors and Affiliations

La Trobe University, Melbourne, Australia
Wei Peng
La Trobe Business School, La Trobe University, Bundoora, Victoria, Australia
Damminda Alahakoon
RMIT University, Melbourne, Australia
Xiaodong Li

About this paper

Cite this paper

Bashar, M.A., Li, Y. (2017). Random Set to Interpret Topic Models in Terms of Ontology Concepts. In: Peng, W., Alahakoon, D., Li, X. (eds) AI 2017: Advances in Artificial Intelligence. AI 2017. Lecture Notes in Computer Science, vol 10400. Springer, Cham. https://doi.org/10.1007/978-3-319-63004-5_19

DOI: https://doi.org/10.1007/978-3-319-63004-5_19
Published: 09 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63003-8
Online ISBN: 978-3-319-63004-5
eBook Packages: Computer Science, Computer Science (R0)
