ABSTRACT
A heterogeneous information network (HIN) is used to model objects of different types and their relationships. Objects are often associated with properties such as labels. In many applications, such as curated knowledge bases for which object labels are manually given, only a small fraction of the objects are labeled. Studies have shown that transductive classification is an effective way to classify and to deduce labels of objects, and a number of transductive classifiers have been put forward to classify objects in an HIN. We study the performance of a few representative transductive classification algorithms on HINs. We identify two fundamental properties, namely, cohesiveness and connectedness, of an HIN that greatly influence the effectiveness of transductive classifiers. We define metrics that measure the two properties. Through experiments, we show that the two properties serve as very effective indicators that predict the accuracy of transductive classifiers. Based on cohesiveness and connectedness we derive (1) a black-box tester that evaluates whether transductive classifiers should be applied for a given classification task and (2) an active learning algorithm that identifies the objects in an HIN whose labels should be sought in order to improve classification accuracy.
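As a rough illustration of what transductive classification means here, the sketch below propagates a few known labels over a graph until every object has a predicted label. It is not one of the classifiers studied in the paper: it is a minimal version of the generic "local and global consistency" style of label propagation, and it assumes the network has been flattened into a single homogeneous, undirected graph, so object and link types are ignored. The function name `propagate_labels`, the toy graph, and the parameter values are all hypothetical.

```python
import math


def propagate_labels(edges, n_nodes, seeds, n_classes, alpha=0.8, n_iter=200):
    """Spread seed labels over an undirected graph (illustrative sketch only).

    edges: list of (u, v) pairs; seeds: dict mapping node -> known class.
    Returns one predicted class per node.
    """
    # Symmetric adjacency matrix of the (assumed homogeneous) graph.
    W = [[0.0] * n_nodes for _ in range(n_nodes)]
    for u, v in edges:
        W[u][v] = W[v][u] = 1.0
    deg = [sum(row) for row in W]

    # Symmetrically normalized adjacency: S = D^(-1/2) W D^(-1/2).
    S = [
        [W[i][j] / math.sqrt(deg[i] * deg[j]) if W[i][j] else 0.0
         for j in range(n_nodes)]
        for i in range(n_nodes)
    ]

    # Y holds the initial label evidence: a one-hot row per labeled node.
    Y = [[0.0] * n_classes for _ in range(n_nodes)]
    for node, label in seeds.items():
        Y[node][label] = 1.0

    # Iterate F <- alpha * S * F + (1 - alpha) * Y.
    F = [row[:] for row in Y]
    for _ in range(n_iter):
        F = [
            [alpha * sum(S[i][j] * F[j][c] for j in range(n_nodes))
             + (1 - alpha) * Y[i][c]
             for c in range(n_classes)]
            for i in range(n_nodes)
        ]

    # Each node takes the class with the highest propagated score.
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n_nodes)]


# Toy graph: two triangles joined by one bridge edge (2-3).
# Only node 0 (class 0) and node 5 (class 1) are labeled.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
predicted = propagate_labels(toy_edges, n_nodes=6, seeds={0: 0, 5: 1}, n_classes=2)
print(predicted)
```

In this toy graph each triangle is tightly knit and contains one labeled object, so each unlabeled node ends up with the label of the seed in its own triangle. How well such propagation works in general depends on how the labeled objects sit within the network structure, which is the kind of question the cohesiveness and connectedness properties above are meant to capture.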
Index Terms
- On Transductive Classification in Heterogeneous Information Networks