Abstract
In this paper, we propose a novel framework for integrating multiple views of data using a hypergraph-based partitioning approach. Representing each view as a separate graph, we partition it into clusters using a standard clustering technique. Each clustering solution is represented as a separate hypergraph in which each hyperedge encodes a cluster. Concatenating all these hyperedges yields the adjacency matrix (H) of a hypergraph with n vertices, where n is the number of objects in each data view, and E hyperedges, where E is the total number of clusters across the representative clustering solutions. A similarity matrix (S) is computed from the adjacency matrix (H) by taking an entry-wise average, so that each entry of S denotes the fraction of clusterings in which two objects are members of the same cluster. Clustering the matrix S produces meta-clusters that inherit the characteristics of all views of the original data. We perform a simulation study on three data views to validate the framework: for each view, we tune the data parameters and investigate whether the results conform to the changes. We also apply the proposed framework to a real-life dataset to identify meta-clusters, and we analyze the resulting meta-clusters to validate the proposed method.
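The construction of H and S described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`incidence_matrix`, `similarity_matrix`, `meta_clusters`), the toy label vectors, and the 0.5 threshold are all hypothetical. In particular, the final step swaps in simple connected components over a thresholded S as a stand-in, since the abstract does not say which algorithm is used to cluster S.

```python
def incidence_matrix(labelings):
    """Build H (n rows, E columns): one 0/1 column per cluster (hyperedge)."""
    n = len(labelings[0])
    columns = []
    for labels in labelings:
        if len(labels) != n:
            raise ValueError("every view must label the same n objects")
        for cluster in sorted(set(labels)):
            columns.append([1 if lab == cluster else 0 for lab in labels])
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]


def similarity_matrix(H, n_clusterings):
    """S[i][j] = fraction of clusterings that place objects i and j together."""
    n = len(H)
    return [
        [sum(a * b for a, b in zip(H[i], H[j])) / n_clusterings for j in range(n)]
        for i in range(n)
    ]


def meta_clusters(S, threshold=0.5):
    """Stand-in consensus step (an assumption, not the paper's method):
    connected components of the graph with an edge where S[i][j] > threshold."""
    n = len(S)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and S[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


# Hypothetical cluster labels for six objects in three views.
views = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [0, 0, 0, 0, 1, 1],
]
H = incidence_matrix(views)            # 6 objects x 7 hyperedges
S = similarity_matrix(H, len(views))   # 6 x 6 co-membership fractions
consensus = meta_clusters(S)
```

Here the product of rows i and j of H counts the hyperedges shared by the two objects, and dividing by the number of clusterings gives the fraction described in the abstract. Any graph or similarity-based clustering method could replace the thresholding step.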
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Khan, S.A., Ray, S. (2017). Integrating Multi-view Data: A Hypergraph Based Approach. In: Mandal, J., Dutta, P., Mukhopadhyay, S. (eds) Computational Intelligence, Communications, and Business Analytics. CICBA 2017. Communications in Computer and Information Science, vol 776. Springer, Singapore. https://doi.org/10.1007/978-981-10-6430-2_27
DOI: https://doi.org/10.1007/978-981-10-6430-2_27
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6429-6
Online ISBN: 978-981-10-6430-2
eBook Packages: Computer Science, Computer Science (R0)