DOI: 10.1145/1401890.1401951

research-article

Spectral domain-transfer learning

Published: 24 August 2008

ABSTRACT

Traditional spectral classification has proven effective at handling both labeled and unlabeled data when all of the data come from the same domain. In many real-world applications, however, we wish to use labeled data from one domain (the in-domain) to classify unlabeled data from a different domain (the out-of-domain). This situation often arises when labeled data are difficult to obtain in one domain but plentiful in a related yet different domain. In general, this is a transfer learning problem: we wish to classify the unlabeled data using labeled data drawn from a different domain. In this paper, we formulate this domain-transfer learning problem under a novel spectral classification framework, introducing an objective function that seeks consistency between the in-domain supervision and the out-of-domain intrinsic structure. By optimizing this cost function, label information from the in-domain data is effectively transferred to help classify the unlabeled out-of-domain data. Extensive experiments show that our algorithm achieves significant improvements in classification performance over many state-of-the-art algorithms.
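The abstract does not spell out the paper's objective function, but the general recipe it describes — build a graph over the in-domain and out-of-domain points together, then trade off a spectral smoothness term against agreement with the in-domain labels — can be illustrated with a standard graph-regularization sketch. Everything below (the Gaussian affinity, the quadratic label-fit term, the function name `spectral_transfer_labels`, and the parameters `sigma` and `lam`) is an assumption chosen for illustration, not the paper's actual formulation.

```python
import numpy as np

def spectral_transfer_labels(X_in, y_in, X_out, sigma=1.0, lam=10.0):
    """Sketch: propagate binary labels from a labeled in-domain set
    to an unlabeled out-of-domain set via a shared graph Laplacian.

    Minimizes  f^T L f + lam * sum_{i labeled} (f_i - y_i)^2,
    a generic graph-regularization cost, not the paper's exact one.
    Labels y_in are assumed to be in {-1, +1}.
    """
    X = np.vstack([X_in, X_out])
    n, n_in = len(X), len(X_in)

    # Gaussian affinity over the combined (in- and out-of-domain) data.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    Dinv = np.diag(1.0 / np.sqrt(d))
    L = np.eye(n) - Dinv @ W @ Dinv

    # Quadratic fit term acts only on the labeled (in-domain) block.
    C = np.zeros(n)
    C[:n_in] = 1.0
    y = np.zeros(n)
    y[:n_in] = y_in

    # Closed-form minimizer of f^T L f + lam (f - y)^T diag(C) (f - y).
    f = np.linalg.solve(L + lam * np.diag(C), lam * C * y)
    return np.sign(f[n_in:])  # predicted labels for out-of-domain points
```

Solving the linear system (L + λ diag(C)) f = λ C y minimizes the cost above in closed form: the labeled in-domain points anchor the sign of f, while the Laplacian smoothness term spreads those labels along the intrinsic structure of the out-of-domain data — the consistency the abstract describes, in its simplest form.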

References

  1. S. Ben-David, J. Blitzer, K. Crammer, and F. Pereira. Analysis of representations for domain adaptation. In NIPS, 2007.
  2. S. Ben-David and R. Schuller. Exploiting task relatedness for multiple task learning. In COLT, 2003.
  3. S. Bickel, M. Brückner, and T. Scheffer. Discriminative learning for differing training and test distributions. In ICML, 2007.
  4. S. Bickel and T. Scheffer. Dirichlet-enhanced spam filtering based on biased samples. In NIPS, 2007.
  5. J. Blitzer, K. Crammer, A. Kulesza, F. Pereira, and J. Wortman. Learning bounds for domain adaptation. In NIPS, 2008.
  6. J. Blitzer, R. McDonald, and F. Pereira. Domain adaptation with structural correspondence learning. In EMNLP, 2006.
  7. R. Caruana. Multitask learning. Machine Learning, 28(1):41--75, 1997.
  8. C.-K. Cheng and Y.-C. A. Wei. An improved two-way partitioning algorithm with stable performance [VLSI]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 10(12):1502--1511, 1991.
  9. F. R. K. Chung. Spectral Graph Theory. American Mathematical Society, 1997.
  10. W. Dai, G.-R. Xue, Q. Yang, and Y. Yu. Co-clustering based classification for out-of-domain documents. In SIGKDD, 2007.
  11. W. Dai, G.-R. Xue, Q. Yang, and Y. Yu. Transferring naive Bayes classifiers for text classification. In AAAI, 2007.
  12. W. Dai, Q. Yang, G.-R. Xue, and Y. Yu. Boosting for transfer learning. In ICML, 2007.
  13. H. Daumé III. Frustratingly easy domain adaptation. In ACL, 2007.
  14. H. Daumé III and D. Marcu. Domain adaptation for statistical classifiers. JAIR, 26:101--126, 2006.
  15. C. Ding, X. He, H. Zha, M. Gu, and H. Simon. Spectral min-max cut for graph partitioning and data clustering. In ICDM, 2001.
  16. G. H. Golub and C. F. Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, 1996.
  17. J. J. Heckman. Sample selection bias as a specification error. Econometrica, 47(1):153--162, 1979.
  18. J. Huang, A. J. Smola, A. Gretton, K. Borgwardt, and B. Schölkopf. Correcting sample selection bias by unlabeled data. In NIPS, 2007.
  19. X. Ji and W. Xu. Document clustering with prior knowledge. In SIGIR, 2006.
  20. T. Joachims. Transductive inference for text classification using support vector machines. In ICML, 1999.
  21. T. Joachims. Transductive learning via spectral graph partitioning. In ICML, 2003.
  22. S. D. Kamvar, D. Klein, and C. D. Manning. Spectral learning. In IJCAI, 2003.
  23. X. Liao, Y. Xue, and L. Carin. Logistic regression with an auxiliary data source. In ICML, 2005.
  24. M. Meila and J. Shi. A random walks view of spectral segmentation. In Proceedings of the 8th International Workshop on Artificial Intelligence and Statistics, 2001.
  25. A. Y. Ng, M. I. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. In NIPS, 2001.
  26. M. Porter. An algorithm for suffix stripping. Program, 14(3):130--137, 1980.
  27. J. Schmidhuber. On learning how to learn learning strategies. Technical Report FKI-198-94, Fakultät für Informatik, 1994.
  28. J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888--905, 2000.
  29. H. Shimodaira. Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 90(2):227--244, 2000.
  30. S. Thrun and T. Mitchell. Learning one more thing. In IJCAI, 1995.
  31. K. Wagstaff and C. Cardie. Clustering with instance-level constraints. In ICML, 2000.
  32. P. Wu and T. G. Dietterich. Improving SVM accuracy by training on auxiliary data sources. In ICML, 2004.
  33. D. Xing, W. Dai, G.-R. Xue, and Y. Yu. Bridged refinement for transfer learning. In PKDD, 2007.
  34. Y. Yang and J. O. Pedersen. A comparative study on feature selection in text categorization. In ICML, 1997.
  35. B. Zadrozny. Learning and evaluating classifiers under sample selection bias. In ICML, 2004.

Published in

KDD '08: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2008, 1116 pages
ISBN: 9781605581934
DOI: 10.1145/1401890
General Chair: Ying Li
Program Chairs: Bing Liu, Sunita Sarawagi

      Copyright © 2008 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

KDD '08 paper acceptance rate: 118 of 593 submissions (20%). Overall acceptance rate: 1,133 of 8,635 submissions (13%).
