Parsimonious unsupervised and semi-supervised domain adaptation with good similarity functions

  • Regular paper
  • Published in: Knowledge and Information Systems

Abstract

In this paper, we address the problem of domain adaptation for binary classification. This problem arises when the distributions generating the source learning data and the target test data differ. From a theoretical standpoint, a classifier has better generalization guarantees when the marginal input distributions of the two domains are close. Classical approaches mainly try to build new projection spaces or to reweight the source data so as to bring the two distributions closer. We study an original direction based on a recent framework introduced by Balcan et al., which enables one to learn linear classifiers in an explicit projection space defined by a similarity function that need be neither symmetric nor positive semi-definite. We propose a well-founded general method for learning a low-error classifier on target data, built on an iterative procedure compatible with Balcan et al.'s framework. A reweighting scheme of the similarity function is then introduced in order to bring the distributions closer in a new projection space. The hyperparameters and the quality of the reweighting are controlled by a reverse validation procedure. Our approach relies on a linear programming formulation and achieves good adaptation performance with very sparse models. We first consider the challenging unsupervised case, where no target label is available, which is useful when manual annotation is impossible. We then propose a generalization to the semi-supervised case, allowing a few target labels to be taken into account when available. Finally, we evaluate our method on a synthetic problem and on a real image annotation task.
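
For intuition, the sketch below shows how a sparse linear classifier can be learned in the explicit similarity space of Balcan et al. through an L1-regularized linear program. It is a minimal illustration, not the algorithm developed in the paper: the Gaussian similarity, the choice of landmarks, the function names, and the regularization parameter lam are all assumptions made for the example.

```python
# Minimal sketch (illustrative assumptions, not the paper's exact algorithm):
# learn a sparse linear classifier in the explicit "similarity space"
# phi(x) = [K(x, l_1), ..., K(x, l_d)] of Balcan et al., via an
# L1-regularized linear program solved with scipy.
import numpy as np
from scipy.optimize import linprog

def similarity(A, B, gamma=1.0):
    # Any (possibly non-symmetric, non-PSD) similarity can be plugged in here;
    # a Gaussian similarity is used purely for illustration.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def learn_sparse_similarity_classifier(X, y, landmarks, lam=0.1):
    """X: (n, p) source examples, y in {-1, +1}, landmarks: (d, p) points
    defining the explicit projection phi(x) = [K(x, l_1), ..., K(x, l_d)]."""
    Phi = similarity(X, landmarks)              # (n, d) projected data
    n, d = Phi.shape
    # LP variables: [alpha_plus (d), alpha_minus (d), slack (n)], all >= 0,
    # with alpha = alpha_plus - alpha_minus so the L1 norm is linear.
    c = np.concatenate([lam * np.ones(2 * d), np.ones(n)])
    YPhi = y[:, None] * Phi
    # Margin constraints y_i <alpha, phi(x_i)> >= 1 - slack_i, rewritten as <=.
    A_ub = np.hstack([-YPhi, YPhi, -np.eye(n)])
    b_ub = -np.ones(n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (2 * d + n), method="highs")
    alpha = res.x[:d] - res.x[d:2 * d]
    return alpha                                # sparse weights over landmarks

def predict(Xtest, landmarks, alpha):
    return np.sign(similarity(Xtest, landmarks) @ alpha)
```

In this formulation, the L1 term on the landmark weights is what yields very sparse models; the domain-adaptation ingredients described above (the iterative reweighting of the similarity function and the reverse validation procedure) would operate on top of such a base learner.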

References

  1. Abbasnejad M, Ramachandram D, Mandava R (2012) A survey of the state of the art in learning the kernels. Knowl Inf Syst 31(2): 193–221. doi:10.1007/s10115-011-0404-6

  2. Ando R, Zhang T (2005) A framework for learning predictive structures from multiple tasks and unlabeled data. J Mach Learn Res 6: 1817–1853

  3. Ayache S, Quénot G (2008) Video corpus annotation using active learning. In: Proceedings of the 30th European conference on information retrieval research (ECIR), vol 4956 of LNCS. Springer, pp 187–198

  4. Ayache S, Quénot G, Gensel J (2007) Image and video indexing using networks of operators. J Image Video Process 1: 1–113

  5. Bahadori MT, Liu Y, Zhang D (2011) Learning with minimum supervision: a general framework for transductive transfer learning. In: Proceedings of the 11th IEEE international conference on data mining (ICDM), pp 61–70

  6. Balcan M, Blum A, Srebro N (2008a) Improved guarantees for learning via similarity functions. In: Proceedings of the annual conference on computational learning theory (COLT), pp 287–298

  7. Balcan M, Blum A, Srebro N (2008b) A theory of learning with similarity functions. Mach Learn J 72(1–2): 89–112

  8. Bellet A, Habrard A, Sebban M (2011) Learning good edit similarities with generalization guarantees. In: Proceedings of European conference on machine learning and principles of data mining and knowledge discovery (ECML/PKDD), vol 6911 of LNCS, pp 188–203

  9. Ben-David S, Blitzer J, Crammer K, Kulesza A, Pereira F, Vaughan J (2010) A theory of learning from different domains. Mach Learn J 79(1–2): 151–175

  10. Ben-David S, Blitzer J, Crammer K, Pereira F (2007) Analysis of representations for domain adaptation. In: Proceedings of advances in neural information processing systems (NIPS), pp 137–144

  11. Ben-David S, Lu T, Luu T, Pal D (2010) Impossibility theorems for domain adaptation. JMLR W&CP 9: 129–136

  12. Bergamo A, Torresani L (2010) Exploiting weakly-labeled web images to improve object classification: a domain adaptation approach. In: Proceedings of advances in neural information processing systems (NIPS)

  13. Blitzer J, Foster D, Kakade S (2011) Domain adaptation with coupled subspaces. In: Proceedings of AISTATS

  14. Bruzzone L, Marconcini M (2010) Domain adaptation problems: a DASVM classification technique and a circular validation strategy. IEEE Trans Pattern Anal Mach Intell 32(5): 770–787

  15. Cao B, Ni X, Sun J-T, Wang G, Yang Q (2011) Distance metric learning under covariate shift. In: Proceedings of international joint conference on artificial intelligence (IJCAI), pp 1204–1210

  16. Chang C-C, Lin C-J (2001) LIBSVM: a library for support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm

  17. Chattopadhyay R, Ye J, Panchanathan S, Fan W, Davidson I (2011) Multi-source domain adaptation and its application to early detection of fatigue. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining (KDD). ACM, pp 717–725

  18. Chen M, Weinberger K, Blitzer J (2011) Co-training for domain adaptation. In: Proceedings of advances in neural information processing systems (NIPS)

  19. Cortes C, Mohri M (2011) Domain adaptation in regression. In: Proceedings of international conference on algorithmic learning theory (ALT), vol 6925 of LNCS, pp 308–323

  20. Daumé H III (2007) Frustratingly easy domain adaptation. In: Proceedings of the association for computational linguistics (ACL)

  21. Daumé H III, Kumar A, Saha A (2010) Co-regularization based semi-supervised domain adaptation. In: Proceedings of advances in neural information processing systems (NIPS)

  22. Duan L, Tsang I, Xu D, Chua T (2009) Domain adaptation from multiple sources via auxiliary classifiers. In: Proceedings of international conference on machine learning (ICML), p 37

  23. Everingham M, Van Gool L, Williams CKI, Winn J, Zisserman A (2007) The PASCAL visual object classes challenge 2007 (VOC2007) results. http://www.pascal-network.org/challenges/VOC/voc2007/workshop/

  24. Fei H, Huan J (2011) Structured feature selection and task relationship inference for multi-task learning. In: Proceedings of the 11th IEEE international conference on data mining (ICDM). IEEE, pp 171–180

  25. Freund R (1991) Polynomial-time algorithms for linear programming based only on primal scaling and projected gradients of a potential function. Math Program 51: 203–222

  26. Geng B, Tao D, Xu C (2011) DAML: Domain adaptation metric learning. IEEE Trans Image Process (TIP) 20(10): 2980–2989

  27. Guerra P, Veloso A, Meira W Jr, Almeida V (2011) From bias to opinion: a transfer-learning approach to real-time sentiment analysis. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining (KDD). ACM, pp 150–158

  28. Huang J, Smola A, Gretton A, Borgwardt K, Schölkopf B (2006) Correcting sample selection bias by unlabeled data. In: Proceedings of advances in neural information processing systems (NIPS), pp 601–608

  29. Jiang J (2008) A literature survey on domain adaptation of statistical classifiers. Technical report, Computer Science Department at University of Illinois at Urbana-Champaign. http://sifaka.cs.uiuc.edu/jiang4/domain_adaptation/da_survey.pdf

  30. Jiang J, Zhai C (2007) Instance weighting for domain adaptation in NLP. In: Proceedings of the association for computational linguistics (ACL)

  31. Joachims T (1999) Transductive inference for text classification using support vector machines. In: Proceedings of international conference on machine learning (ICML), pp 200–209

  32. Junejo K, Karim A (2012) Robust personalizable spam filtering via local and global discrimination modeling. Knowl Inf Syst 1–36. doi:10.1007/s10115-012-0477-x

  33. Kulis B, Saenko K, Darrell T (2011) What you saw is not what you get: domain adaptation using asymmetric kernel transforms. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR 2011), pp 1785–1792

  34. MacQueen J (1967) Some methods of classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley symposium on mathematical statistics and probability, pp 281–297

  35. Mansour Y, Mohri M, Rostamizadeh A (2008) Domain adaptation with multiple sources. In: Proceedings of advances in neural information processing systems (NIPS), pp 1041–1048

  36. Mansour Y, Mohri M, Rostamizadeh A (2009) Domain adaptation: learning bounds and algorithms. In: Proceedings of annual conference on learning theory (COLT), pp 19–30

  37. Pan S, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10): 1345–1359

  38. Quiñonero-Candela J, Sugiyama M, Schwaighofer A, Lawrence N (2009) Dataset shift in machine learning. MIT Press, Cambridge

  39. Schweikert G, Widmer C, Schölkopf B, Rätsch G (2008) An empirical analysis of domain adaptation algorithms for genomic sequence analysis. In: Proceedings of advances in neural information processing systems (NIPS), pp 1433–1440

  40. Seah C, Tsang I, Ong Y, Lee K (2010) Predictive distribution matching SVM for multi-domain learning. In: Proceedings of European conference on machine learning and principles of data mining and knowledge discovery (ECML/PKDD), vol 6321 of LNCS. Springer, pp 231–247

  41. Smeaton A, Over P, Kraaij W (2009) High-level feature detection from video in TRECVid: a 5-year retrospective of achievements. In: Multimedia content analysis, theory and applications. Springer, pp 151–174

  42. Sugiyama M, Nakajima S, Kashima H, von Bünau P, Kawanabe M (2007) Direct importance estimation with model selection and its application to covariate shift adaptation. In: Proceedings of advances in neural information processing systems (NIPS)

  43. Vapnik V (1998) Statistical learning theory. Springer, Berlin

  44. Wang B, Tang J, Fan W, Chen S, Tan C, Yang Z (2012) Query-dependent cross-domain ranking in heterogeneous network. Knowl Inf Syst 1–37. doi:10.1007/s10115-011-0472-7

  45. Xu H, Mannor S (2010) Robustness and generalization. In: Proceedings of annual conference on computational learning theory (COLT), pp 503–515

  46. Xu H, Mannor S (2012) Robustness and generalization. Mach Learn J 86(3): 391–423

  47. Xu Z, Kersting K (2011) Multi-task learning with task relations. In: Proceedings of the 11th IEEE international conference on data mining (ICDM). IEEE, pp 884–893

  48. Xue G-R, Dai W, Yang Q, Yu Y (2008) Topic-bridged PLSA for cross-domain text classification. In: Proceedings of international ACM SIGIR conference on research and development in information retrieval, pp 627–634

  49. Ye Y (1991) An O(n³L) potential reduction algorithm for linear programming. Math Program 50: 239–258

  50. Zhang Y, Yeung D-Y (2010) Transfer metric learning by learning task relationships. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining (KDD). ACM, pp 1199–1208

  51. Zhong E, Fan W, Yang Q, Verscheure O, Ren J (2010) Cross validation framework to choose amongst models and datasets for transfer learning. In: Proceedings of European conference on machine learning and principles of data mining and knowledge discovery (ECML/PKDD), vol 6323 of LNCS. Springer, pp 547–562

Author information

Correspondence to Emilie Morvant.

About this article

Cite this article

Morvant, E., Habrard, A. & Ayache, S. Parsimonious unsupervised and semi-supervised domain adaptation with good similarity functions. Knowl Inf Syst 33, 309–349 (2012). https://doi.org/10.1007/s10115-012-0516-7

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10115-012-0516-7

Keywords

Navigation