
Negative transfer detection in transductive transfer learning

Original Article · International Journal of Machine Learning and Cybernetics

Abstract

Transfer learning is widely used in machine learning when training data is limited. However, class noise accumulated over learning iterations can lead to negative transfer, which degrades performance as more training data is used. In this paper, we propose a novel method to identify noisy samples for noise reduction. More importantly, the method can detect the point where negative transfer happens, so that transfer learning can terminate near its peak performance. The method uses the sum of Rademacher distributed variables to estimate the class noise rate of the transferred data; transferred samples with a high probability of being mislabeled are removed to reduce noise accumulation. This noise reduction process can be repeated several times during transfer learning until the point where negative transfer occurs is found. Because our method detects this point, it can not only delay the onset of negative transfer but also stop the transfer learning algorithm at the right place for the best performance gain. Evaluation on a cross-lingual/cross-domain opinion analysis data set shows that our algorithm achieves state-of-the-art results. Furthermore, our system shows a monotonic upward trend in performance as more training data is used, avoiding the performance degradation that afflicts most transfer learning methods once the training data reaches a certain size.
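As a rough illustration of the idea described above (a minimal sketch, not the authors' implementation; all function names and the confidence-based removal heuristic here are assumptions for illustration), the pipeline has three pieces: estimate the class noise rate of transferred data from a Rademacher-style agreement sum, remove the samples most likely to be mislabeled, and stop iterating once performance first declines:

```python
def estimate_noise_rate(labels, predictions):
    """Estimate the class noise rate of transferred data from the
    agreement between transferred labels and the current model's
    predictions, viewed as a Rademacher-style sum: each sample
    contributes +1 (agree) or -1 (disagree)."""
    s = sum(1 if y == p else -1 for y, p in zip(labels, predictions))
    n = len(labels)
    return (n - s) / (2 * n)  # equals the fraction of disagreements


def filter_noisy(samples, labels, confidences, noise_rate):
    """Remove the samples most likely to be mislabeled: drop the
    lowest-confidence fraction equal to the estimated noise rate."""
    k = int(round(noise_rate * len(samples)))
    if k == 0:
        return list(samples), list(labels)
    order = sorted(range(len(samples)), key=lambda i: confidences[i])
    keep = set(order[k:])  # indices that survive the cut
    return ([s for i, s in enumerate(samples) if i in keep],
            [y for i, y in enumerate(labels) if i in keep])


def stop_before_negative_transfer(scores):
    """Given per-iteration evaluation scores, return the index of the
    iteration to stop at: the last one before performance first drops,
    i.e. the detected onset of negative transfer."""
    for i in range(1, len(scores)):
        if scores[i] < scores[i - 1]:
            return i - 1
    return len(scores) - 1
```

In a full transductive transfer loop, each iteration would transfer a new batch of pseudo-labeled target-domain data, apply the noise filter, retrain, and record a score; `stop_before_negative_transfer` then selects the iteration at which to terminate.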


Notes

  1. http://tcci.ccf.org.cn/conference/2013/dldoc/evdata03.zip

  2. http://ictclas.nlpir.org/

  3. https://translate.google.com

  4. http://svmlight.joachims.org/

  5. http://tcci.ccf.org.cn/conference/2013/dldoc/evres03.pdf


Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grants 61370165, U1636103 and 61632011), the National 863 Program of China (2015AA015405), Shenzhen Foundational Research Funding (JCYJ20150625142543470) and the Guangdong Provincial Engineering Technology Research Center for Data Science (2016KF09).

Author information


Correspondence to Ruifeng Xu.


About this article


Cite this article

Gui, L., Xu, R., Lu, Q. et al. Negative transfer detection in transductive transfer learning. Int. J. Mach. Learn. & Cyber. 9, 185–197 (2018). https://doi.org/10.1007/s13042-016-0634-8
