research-article

Open Access

Smart Multitask Bregman Clustering and Multitask Kernel Clustering

Authors:
Xianchao Zhang, Dalian University of Technology, Dalian, China
Xiaotong Zhang, Dalian University of Technology, Dalian, China
Han Liu, Dalian University of Technology, Dalian, China

ACM Transactions on Knowledge Discovery from Data, Volume 10, Issue 1, Article No. 8, pp. 1–29. https://doi.org/10.1145/2747879

Published: 22 July 2015

Abstract

Traditional clustering algorithms deal with a single clustering task on a single dataset. However, many real-world tasks are related, which motivates multitask clustering. Several multitask clustering algorithms have been proposed recently, and among them multitask Bregman clustering (MBC) is widely applicable. MBC alternately updates clusters and learns relationships between the clusters of different tasks, so that the two phases boost each other. However, the boosting does not always improve clustering performance; it may also have negative effects. Another limitation of MBC is that it cannot handle nonlinearly separable data. In this article, we show that in MBC, using cluster relationships to boost the cluster-updating phase may cause negative effects; that is, cluster centroids may be skewed under some conditions. We propose a smart multitask Bregman clustering (S-MBC) algorithm that identifies the negative effects of the boosting and avoids them when they occur. We then propose a multitask kernel clustering (MKC) framework for nonlinearly separable data, which applies a framework similar to MBC in the kernel space. We also propose a specific optimization method, quite different from that of MBC, to implement the MKC framework. Since MKC, like MBC, can cause negative effects, we further extend MKC to a smart multitask kernel clustering (S-MKC) framework in the same way that S-MBC extends MBC. We conduct experiments on 10 real-world multitask clustering datasets to evaluate the performance of S-MBC and S-MKC.
The results on clustering accuracy show that: (1) compared with the original MBC algorithm, S-MBC and S-MKC perform much better; (2) compared with the convex discriminative multitask relationship clustering (DMTRC) algorithms DMTRC-L and DMTRC-R, which also avoid negative transfer, S-MBC and S-MKC perform worse in the (ideal) case in which different tasks have the same number of clusters and the empirical label marginal distribution in each task is even, but perform better or comparably in other (more general) cases. Moreover, S-MBC and S-MKC can work on datasets in which different tasks have different numbers of clusters, which violates the assumptions of DMTRC-L and DMTRC-R. The results on efficiency show that S-MBC and S-MKC consume more computational time than MBC but less than DMTRC-L and DMTRC-R. Overall, S-MBC and S-MKC are competitive with state-of-the-art multitask clustering algorithms when accuracy, efficiency, and applicability are considered together.
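
As background for the abstract, the following is a minimal sketch of standard single-task Bregman hard clustering, the building block that multitask Bregman clustering extends. It is not the authors' MBC, S-MBC, or S-MKC algorithm: it contains no cross-task relationship learning and no negative-effect detection. All function and parameter names (`bregman_hard_clustering`, `init_centroids`, and so on) are illustrative assumptions, and the two divergences shown are only common examples. The sketch relies on the known property that, for any Bregman divergence, the arithmetic mean of a cluster's points minimizes the total divergence from the points to the centroid.

```python
import math


def squared_euclidean(x, mu):
    """Bregman divergence generated by phi(x) = ||x||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, mu))


def generalized_kl(x, mu):
    """Generalized I-divergence, generated by phi(x) = sum(x * log x).

    Assumes every coordinate of x and mu is strictly positive.
    """
    return sum(a * math.log(a / b) - a + b for a, b in zip(x, mu))


def mean(points):
    """Coordinate-wise arithmetic mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def bregman_hard_clustering(points, init_centroids, divergence, max_iters=100):
    """Alternate nearest-centroid assignment and mean updates.

    points         -- list of equal-length numeric vectors
    init_centroids -- list of k starting centroids (hypothetical parameter)
    divergence     -- function d(x, mu) returning a Bregman divergence
    """
    centroids = [list(c) for c in init_centroids]
    k = len(centroids)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to the centroid with least divergence.
        new_labels = [
            min(range(k), key=lambda j: divergence(x, centroids[j]))
            for x in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: the mean is the optimal centroid for any Bregman divergence.
        for j in range(k):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = mean(members)
    return labels, centroids


if __name__ == "__main__":
    data = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.2],
            [10.0, 10.0], [10.2, 9.8], [9.9, 10.1]]
    labels, centroids = bregman_hard_clustering(
        data, [data[0], data[3]], squared_euclidean
    )
    print(labels, centroids)
```

Passing the divergence as a function keeps the loop identical across divergences; with `squared_euclidean` the procedure reduces to ordinary k-means. Initial centroids are supplied explicitly here so that the result does not depend on random initialization.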

Index Terms

Computing methodologies > Machine learning > Learning paradigms > Unsupervised learning > Cluster analysis

Published in
ACM Transactions on Knowledge Discovery from Data Volume 10, Issue 1
July 2015
321 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/2808688
Editor:
Philip S. Yu
University of Illinois at Chicago, USA
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 22 July 2015
- Accepted: 1 March 2015
- Revised: 1 December 2014
- Received: 1 July 2014
Author Tags
Bregman divergence
Mercer kernel
Multitask clustering
negative transfer
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 22
- Total Downloads: 747
- Downloads (Last 12 months): 46
- Downloads (Last 6 weeks): 6