IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Special Section on Recent Advances in Machine Learning for Spoken Language Processing
Short Text Classification Based on Distributional Representations of Words
Chenglong MA, Qingwei ZHAO, Jielin PAN, Yonghong YAN
JOURNAL FREE ACCESS

2016 Volume E99.D Issue 10 Pages 2562-2565

Abstract

Short texts usually suffer from data sparseness because they provide too little term co-occurrence information. In this paper, we show how word embeddings can mitigate this problem in short text classification. We assume that a short text document is a specific sample from one distribution in a Gaussian-Bayesian framework. Furthermore, a fast clustering algorithm is used to expand and enrich the context of a short text in the embedding space. The approach is compared with classical bag-of-words approaches and neural-network-based methods. Experimental results validate the effectiveness of the proposed method.
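
As a rough illustration of the general idea of treating a short text as a sample from a class-specific Gaussian in embedding space, the sketch below fits one diagonal-covariance Gaussian per class over word vectors and classifies a document by its summed log-likelihood plus the class log-prior. This is not the authors' model: the abstract does not give the exact formulation, so the diagonal covariance, the variance floor `reg`, the toy two-dimensional embeddings, and the function names `fit_class_gaussians` and `classify` are all assumptions made for illustration. The clustering-based context expansion is not covered.

```python
import numpy as np

def fit_class_gaussians(docs, labels, emb, reg=1e-3):
    """Fit one diagonal Gaussian per class over the word vectors of its documents.

    Assumption: diagonal covariance with a small floor `reg` for stability.
    """
    params = {}
    for c in sorted(set(labels)):
        vecs = np.array([emb[w]
                         for d, y in zip(docs, labels) if y == c
                         for w in d if w in emb])
        mean = vecs.mean(axis=0)
        var = vecs.var(axis=0) + reg
        log_prior = np.log(labels.count(c) / len(labels))
        params[c] = (mean, var, log_prior)
    return params

def classify(doc, params, emb):
    """Return the class maximizing log prior + summed word-vector log-likelihood."""
    vecs = np.array([emb[w] for w in doc if w in emb])
    if vecs.size == 0:
        # No known words: fall back to the class with the highest prior.
        return max(params, key=lambda c: params[c][2])
    scores = {}
    for c, (mean, var, log_prior) in params.items():
        log_lik = -0.5 * np.sum(np.log(2 * np.pi * var) + (vecs - mean) ** 2 / var)
        scores[c] = log_prior + log_lik
    return max(scores, key=scores.get)

# Hypothetical 2-D embeddings; real word embeddings would have far more dimensions.
emb = {
    "goal": [1.0, 0.1], "match": [0.9, 0.2], "team": [0.8, 0.0],
    "stock": [0.0, 1.0], "market": [0.1, 0.9], "bank": [0.2, 0.8],
}
docs = [["goal", "match"], ["team", "goal"], ["stock", "market"], ["bank", "stock"]]
labels = ["sports", "sports", "finance", "finance"]

params = fit_class_gaussians(docs, labels, emb)
print(classify(["team", "match"], params, emb))
print(classify(["market", "bank"], params, emb))
```

Because words with similar meanings lie close together in embedding space, a test document can score well under a class even when it shares few or no exact terms with the training texts, which is the property that makes embeddings attractive for sparse short texts.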

© 2016 The Institute of Electronics, Information and Communication Engineers