From Group to Individual Labels Using Deep Features

Authors:
Dimitrios Kotzias
University of California Irvine, Irvine, CA, USA

Misha Denil
University of Oxford, Oxford, United Kingdom

Nando de Freitas
University of Oxford, Oxford, United Kingdom

Padhraic Smyth
University of California Irvine, Irvine, CA, USA

KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2015, Pages 597–606
https://doi.org/10.1145/2783258.2783380

Published: 10 August 2015

ABSTRACT

In many classification problems labels are relatively scarce. One context in which this occurs is where we have labels for groups of instances but not for the instances themselves, as in multi-instance learning. Past work on this problem has typically focused on learning classifiers to make predictions at the group level. In this paper we focus on the problem of learning classifiers to make predictions at the instance level. To achieve this we propose a new objective function that encourages smoothness of inferred instance-level labels based on instance-level similarity, while at the same time respecting group-level label constraints. We apply this approach to the problem of predicting labels for sentences given labels for reviews, using a convolutional neural network to infer sentence similarity. The approach is evaluated using three large review data sets from IMDB, Yelp, and Amazon, and we demonstrate the proposed approach is both accurate and scalable compared to various alternatives.
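
The abstract describes the objective only in words: a smoothness term over instance-level predictions, weighted by instance similarity, plus a term that ties each group's predictions to its known label. A minimal sketch of one objective of that shape follows. The concrete choices here are assumptions made for illustration and are not stated in the abstract: a logistic instance classifier, a Gaussian similarity on raw feature vectors (the paper instead derives sentence similarity from a convolutional neural network), averaging as the group aggregation, squared error as the group penalty, and the names `group_instance_cost`, `theta`, and `lam`.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping real scores to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def group_instance_cost(theta, X, groups, group_labels, lam=1.0):
    """Illustrative cost: similarity-weighted smoothness plus group-label fit.

    theta        : (d,) weights of an assumed logistic instance classifier
    X            : (n, d) instance features (stand-in for learned embeddings)
    groups       : (n,) integer group id of each instance, in 0..K-1
    group_labels : (K,) known group-level labels in [0, 1]
    lam          : assumed trade-off weight between the two terms
    """
    y = sigmoid(X @ theta)  # inferred instance-level labels

    # Assumed Gaussian similarity between every pair of instances.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists)

    # Smoothness: similar instances should receive similar labels.
    smooth = np.sum(W * (y[:, None] - y[None, :]) ** 2) / len(y) ** 2

    # Group constraint: the mean instance prediction in each group
    # should match that group's label (averaging is an assumption).
    group_pred = np.array(
        [y[groups == g].mean() for g in range(len(group_labels))]
    )
    group_term = np.mean((group_pred - group_labels) ** 2)

    return smooth + lam * group_term


# Toy data: two groups of two instances each, one positive and one negative.
X = np.array([[2.0, 0.0], [2.1, 0.0], [-2.0, 0.0], [-2.1, 0.0]])
groups = np.array([0, 0, 1, 1])
group_labels = np.array([1.0, 0.0])

print(group_instance_cost(np.zeros(2), X, groups, group_labels))
print(group_instance_cost(np.array([3.0, 0.0]), X, groups, group_labels))
```

In this toy setup the all-zero `theta` predicts 0.5 for every instance, so the smoothness term vanishes and only the group term contributes; a `theta` that separates the two clusters lowers the cost. Minimizing such a cost with a gradient-based optimizer would yield instance-level predictions without any instance-level labels.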

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources
  2. Machine learning

Published in
KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2015
2378 pages
ISBN:9781450336642
DOI:10.1145/2783258
General Chairs:
Longbing Cao, University of Technology, Sydney
Chengqi Zhang, University of Technology, Sydney
Program Chairs:
Thorsten Joachims, Cornell University
Geoff Webb, Monash University
Dragos D. Margineantu, Boeing Research
Graham Williams, Australian Taxation Office
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
deep learning
multi-instance learning
sentiment analysis
unsupervised learning
Qualifiers
- research-article
Acceptance Rates
KDD '15 Paper Acceptance Rate: 160 of 819 submissions, 20%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%