Abstract
Microbes exist everywhere. The current generation of genomic technologies has allowed researchers to determine the collective DNA sequence of all microorganisms co-existing in a community. In this paper, we present some of the challenges in analyzing data obtained from community genomics experiments (commonly referred to as metagenomics), advocate the need for machine learning techniques, and highlight our contributions to the development of supervised and unsupervised techniques for solving this complex, real-world problem.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rangwala, H., Charuvaka, A., Rasheed, Z. (2014). Machine Learning Approaches for Metagenomics. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2014. Lecture Notes in Computer Science, vol 8726. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44845-8_47
DOI: https://doi.org/10.1007/978-3-662-44845-8_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44844-1
Online ISBN: 978-3-662-44845-8
eBook Packages: Computer Science, Computer Science (R0)