An ant colony-based semi-supervised approach for learning classification rules

Swarm Intelligence

Abstract

Semi-supervised learning methods build models from a few labeled instances and a large number of unlabeled instances. They are a good option in scenarios where unlabeled data are abundant and labeling instances is expensive, as is the case in most Web applications. This paper proposes a semi-supervised self-training algorithm called Ant-Labeler. Self-training algorithms use a supervised learning algorithm to iteratively learn a model from the labeled instances and then use this model to classify the unlabeled instances. The instances that receive labels with high confidence are moved from the unlabeled set to the labeled set, and this process is repeated until a stopping criterion is met, such as all unlabeled instances having been labeled. Ant-Labeler uses an ACO algorithm as the supervised learning method in the self-training procedure to generate interpretable rule-based models, which are used as an ensemble to ensure accurate predictions. The pheromone matrix is reused across different executions of the ACO algorithm to avoid rebuilding the models from scratch every time the labeled set is updated. Results showed that the proposed algorithm obtains better predictive accuracy than three state-of-the-art algorithms in roughly half of the datasets on which it was tested, and that the smaller the number of labeled instances, the better Ant-Labeler performs.
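The generic self-training loop described in the abstract can be illustrated with a minimal sketch. This is not Ant-Labeler itself: the supervised learner here is a hypothetical nearest-centroid classifier standing in for the ACO rule learner, the confidence score (a softmax over negative distances) and the `threshold` and `max_rounds` parameters are assumptions made for illustration, and pheromone reuse and the rule ensemble are not modeled.

```python
from math import dist, exp


def fit_centroids(X, y):
    """Stand-in supervised learner (assumption): one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_with_confidence(centroids, x):
    """Return (label, confidence); confidence is a softmax over negative distances.

    Suitable for small toy data only: very large distances would underflow.
    """
    weights = {c: exp(-dist(x, m)) for c, m in centroids.items()}
    total = sum(weights.values())
    label = max(weights, key=weights.get)
    return label, weights[label] / total


def self_train(X_lab, y_lab, X_unl, threshold=0.8, max_rounds=20):
    """Generic self-training: move confidently labeled instances to the labeled set."""
    X_lab, y_lab, X_unl = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(max_rounds):
        if not X_unl:
            break  # stopping criterion: every instance has been labeled
        model = fit_centroids(X_lab, y_lab)
        preds = [predict_with_confidence(model, x) for x in X_unl]
        confident = [i for i, (_, conf) in enumerate(preds) if conf >= threshold]
        if not confident:
            break  # no prediction is confident enough to be trusted
        for i in confident:
            X_lab.append(X_unl[i])
            y_lab.append(preds[i][0])
        moved = set(confident)
        X_unl = [x for i, x in enumerate(X_unl) if i not in moved]
    # Refit once more so the returned model reflects the final labeled set.
    return fit_centroids(X_lab, y_lab), X_lab, y_lab, X_unl


if __name__ == "__main__":
    labeled = [(0.0, 0.0), (10.0, 10.0)]
    labels = ["a", "b"]
    unlabeled = [(0.5, 0.5), (9.5, 9.0), (1.0, 0.0), (5.0, 5.0)]
    model, X_l, y_l, X_u = self_train(labeled, labels, unlabeled)
    print(y_l, X_u)
```

In this toy run the three points near a class centroid are absorbed into the labeled set, while the point halfway between the two classes never reaches the confidence threshold and stays unlabeled. Note that the sketch retrains the model from scratch in every round, which is the cost the abstract says Ant-Labeler avoids by reusing the pheromone matrix.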

Notes

  1. Refer to (Otero et al. 2013) for more details on the cAnt-Miner\(_{\mathrm {PB}}\) algorithm.

  2. No results for APSSC with 70 and 100 % of labeled data are reported, as the KEEL implementation was not able to generate results for these data configurations.

  3. The running times were observed on a Xeon 2.4 GHz machine with 3.5 GB of RAM.

References

  • Alcalá-Fdez, J., Sánchez, L., García, S., del Jesus, M., Ventura, S., Garrell, J., et al. (2009). KEEL: A software tool to assess evolutionary algorithms for data mining problems. Soft Computing, 13(3), 307–318.

  • Angus, D. (2009). Niching for ant colony optimisation. In A. Lewis, S. Mostaghim, & M. Randall (Eds.), Biologically-inspired optimisation methods, studies in computational intelligence (Vol. 210, pp. 165–188). Heidelberg: Springer.

  • Arcanjo, F. L., Pappa, G. L., Bicalho, P. V., Meira, W. Jr., & da Silva, A. S. (2011). Semi-supervised genetic programming for classification. In Proceedings of the 13th annual conference on genetic and evolutionary computation (GECCO 2011) (pp. 1259–1266). ACM.

  • Bache, K., & Lichman, M. (2013). UCI machine learning repository. http://archive.ics.uci.edu/ml

  • Blum, A., & Mitchell, T. (1998). Combining labeled and unlabeled data with co-training. In Proceedings of the 11th annual conference on computational learning theory (COLT ’98) (pp. 92–100). ACM.

  • Chapelle, O., Schölkopf, B., & Zien, A. (Eds.). (2010). Semi-supervised learning. Cambridge: MIT Press.

  • Davidson, I., & Ravi, S. (2005). Clustering with constraints: Feasibility issues and the k-means algorithm. In Proceedings of the 2005 SIAM international conference on data mining (SDM05) (pp. 201–211). SIAM.

  • Freitas, A. A. (2002). Data mining and knowledge discovery with evolutionary algorithms. New York: Springer.

  • Ginestet, C. (2009). Semisupervised learning for computational linguistics. Journal of the Royal Statistical Society: Series A (Statistics in Society), 172(3), 694–694.

  • Halder, A., Ghosh, S., & Ghosh, A. (2010). Ant based semi-supervised classification. In Swarm intelligence: 7th international conference (ANTS 2010) (vol. 6234, pp. 376–383). Springer, LNCS.

  • Halder, A., Ghosh, S., & Ghosh, A. (2013). Aggregation pheromone metaphor for semi-supervised classification. Pattern Recognition, 46(8), 2239–2248.

  • Hsu, C., & Lin, C. (2002). A comparison of methods for multiclass support vector machines. IEEE Transactions on Neural Networks, 13(2), 415–425.

  • Joachims, T. (1999). Transductive inference for text classification using support vector machines. In Proceedings of the 16th international conference on machine learning (ICML ’99) (pp. 200–209). Morgan Kaufmann.

  • Kasabov, N., & Pang, S. (2003). Transductive support vector machines and applications in bioinformatics for promoter recognition. In Proceedings of the 2003 international conference on neural networks and signal processing (ICNNSP) (pp. 1–6). IEEE.

  • Koutra, D., Ke, T. Y., Kang, U., Chau, D. H., Pao, H. K. K., & Faloutsos, C. (2011). Unifying guilt-by-association approaches: Theorems and fast algorithms. In Machine learning and knowledge discovery in databases: European conference (ECML PKDD 2011) (vol. 6912, pp. 245–260). Springer, LNCS.

  • Li, M., & Zhou, Z. H. (2007). Improve computer-aided diagnosis with machine learning techniques using undiagnosed samples. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 37(6), 1088–1098.

  • Li, Y. F., & Zhou, Z. H. (2011). Towards making unlabeled data never hurt. In Proceedings of the 28th international conference on machine learning (ICML ’11) (pp. 1081–1088). ACM.

  • Martens, D., Baesens, B., & Fawcett, T. (2011). Editorial survey: Swarm intelligence for data mining. Machine Learning, 82(1), 1–42.

  • Olmo, J. L., Luna, J. M., Romero, J. R., & Ventura, S. (2010). An automatic programming ACO-based algorithm for classification rule mining. In Trends in practical applications of agents and multiagent systems: 8th international conference on practical applications of agents and multiagent systems, advances in intelligent and soft computing (vol. 71, pp. 649–656). Springer.

  • Otero, F., Freitas, A., & Johnson, C. (2008). \(c\)Ant-Miner: An ant colony classification algorithm to cope with continuous attributes. In Ant colony optimization and swarm intelligence: 6th international conference (ANTS 2008) (vol. 5217, pp. 48–59). Springer, LNCS.

  • Otero, F., Freitas, A., & Johnson, C. (2013). A new sequential covering strategy for inducing classification rules with ant colony algorithms. IEEE Transactions on Evolutionary Computation, 17(1), 64–76.

  • Rokach, L. (2010). Ensemble-based classifiers. Artificial Intelligence Review, 33(1), 1–39.

  • Tong, S., & Chang, E. (2001). Support vector machine active learning for image retrieval. In Proceedings of the 9th ACM international conference on multimedia (MULTIMEDIA ’01) (pp. 107–118). ACM.

  • Triguero, I., García, S., & Herrera, F. (2015). Self-labeled techniques for semi-supervised learning: Taxonomy, software and empirical study. Knowledge and Information Systems, 42(2), 245–284.

  • Wang, J., Zhao, Y., Wu, X., & Hua, X. S. (2008). Transductive multi-label learning for video concept detection. In Proceedings of the 1st ACM international conference on multimedia information retrieval (MIR ’08) (pp. 298–304). ACM.

  • Wang, J., Jebara, T., & Chang, S. F. (2013). Semi-supervised learning using greedy max-cut. Journal of Machine Learning Research, 14(1), 771–800.

  • Xu, X., Lu, L., He, P., Ma, Y., Chen, Q., & Chen, L. (2013). Semi-supervised classification with multiple ants maximal spanning tree. In Proceedings of IEEE/WIC/ACM international joint conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) (pp. 315–320). IEEE.

  • Zhao, B., Wang, F., & Zhang, C. (2008). CutS3VM: A fast semi-supervised SVM algorithm. In Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining (KDD) (pp. 830–838). ACM.

  • Zhou, Z. H., & Li, M. (2005). Tri-Training: Exploiting unlabeled data using three classifiers. IEEE Transactions on Knowledge and Data Engineering, 17(11), 1529–1541.

  • Zhu, X., & Goldberg, A. B. (2009). Introduction to semi-supervised learning. San Rafael: Morgan & Claypool.

Acknowledgments

The authors would like to thank the anonymous reviewers and the associate editor for their valuable comments and suggestions. This work was partially supported by the following Brazilian Research Support Agencies: CNPq, FAPEMIG, and CAPES.

Author information

Corresponding author

Correspondence to Samuel E. L. Oliveira.

Additional information

Julio Albinati and Samuel E. L. Oliveira have contributed equally to this work.

About this article

Cite this article

Albinati, J., Oliveira, S.E.L., Otero, F.E.B. et al. An ant colony-based semi-supervised approach for learning classification rules. Swarm Intell 9, 315–341 (2015). https://doi.org/10.1007/s11721-015-0116-8

  • DOI: https://doi.org/10.1007/s11721-015-0116-8
