Integrating Multi-view Data: A Hypergraph Based Approach

Khan, Saif Ayan; Ray, Sumanta

doi:10.1007/978-981-10-6430-2_27

Part of the book series: Communications in Computer and Information Science (CCIS, volume 776)

Included in the following conference series:

International Conference on Computational Intelligence, Communications, and Business Analytics

Abstract

In this paper, we propose a novel framework for integrating multiple views of data using a hypergraph-based partitioning approach. Representing each view as a separate graph, we partition it into clusters using a standard clustering technique. Each clustering solution is represented as a separate hypergraph in which each hyperedge encodes a cluster. Concatenating all these hyperedges yields the adjacency matrix (H) of a hypergraph with n vertices, where n is the number of objects in each data view, and E hyperedges, where E is the total number of clusters across the representative clustering solutions. A similarity matrix (S) is computed from the adjacency matrix (H) by entry-wise averaging. Each entry of S denotes the fraction of clusterings in which two objects are members of the same group/cluster. Clustering the S matrix yields meta-clusters, which inherit the characteristics of all views of the original data. We perform a simulation study on three data views to validate the framework. For each view, we tune the parameter of the data and investigate whether the results conform to the changes. We also apply the proposed framework to a real-life dataset to identify meta-clusters, and we analyze the resulting meta-clusters to validate the proposed method.
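The construction of H and S described in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation. It assumes that "entry-wise averaging" means S = H Hᵀ / r, where r is the number of clusterings (the co-association matrix used in cluster ensemble methods such as the cluster-based similarity partitioning of Strehl and Ghosh). The function names, the toy label vectors, and the final threshold-based grouping step are hypothetical stand-ins chosen for illustration; the abstract does not say which algorithm is used to cluster S.

```python
from typing import List, Sequence


def hypergraph_matrix(labelings: Sequence[Sequence[int]]) -> List[List[int]]:
    """Build H (n objects x E hyperedges): one binary column per cluster of each view."""
    n = len(labelings[0])
    columns = []
    for labels in labelings:
        if len(labels) != n:
            raise ValueError("every view must label the same n objects")
        for cluster_id in sorted(set(labels)):
            columns.append([1 if lab == cluster_id else 0 for lab in labels])
    # Transpose the list of columns into row-major form (one row per object).
    return [list(row) for row in zip(*columns)]


def similarity_matrix(H: List[List[int]], n_clusterings: int) -> List[List[float]]:
    """Assumed form S = H H^T / r: fraction of clusterings that co-cluster objects i and j."""
    n = len(H)
    return [
        [sum(a * b for a, b in zip(H[i], H[j])) / n_clusterings for j in range(n)]
        for i in range(n)
    ]


def threshold_components(S: List[List[float]], threshold: float = 0.5) -> List[int]:
    """Stand-in meta-clustering (an assumption, not the paper's method):
    connected components of the graph that links i and j when S[i][j] > threshold."""
    n = len(S)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and S[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy data (hypothetical): cluster labels for the same 5 objects in 3 views.
views = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 0],
]

H = hypergraph_matrix(views)            # 5 vertices, 6 hyperedges
S = similarity_matrix(H, len(views))    # entries lie in [0, 1]
meta = threshold_components(S)          # meta-cluster label per object
print(meta)
```

In this toy example, objects 0 and 1 share a cluster in all three views, so S[0][1] is 1, while objects 0 and 2 agree in only one view, so S[0][2] is 1/3. Any similarity-based clustering method could replace the thresholding step.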

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Aliah University, Kolkata, India
Saif Ayan Khan & Sumanta Ray

Corresponding author

Correspondence to Saif Ayan Khan.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, University of Kalyani, Kalyani, West Bengal, India
J. K. Mandal
Department of Computer and System Sciences, Visva Bharati University, Bolpur Santiniketan, West Bengal, India
Paramartha Dutta
Department of Information Technology, Calcutta Business School, Kolkata, India
Somnath Mukhopadhyay

About this paper

Cite this paper

Khan, S.A., Ray, S. (2017). Integrating Multi-view Data: A Hypergraph Based Approach. In: Mandal, J., Dutta, P., Mukhopadhyay, S. (eds) Computational Intelligence, Communications, and Business Analytics. CICBA 2017. Communications in Computer and Information Science, vol 776. Springer, Singapore. https://doi.org/10.1007/978-981-10-6430-2_27

DOI: https://doi.org/10.1007/978-981-10-6430-2_27
Published: 26 September 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6429-6
Online ISBN: 978-981-10-6430-2
eBook Packages: Computer Science, Computer Science (R0)
