Integrating Multi-view Data: A Hypergraph Based Approach

  • Conference paper
Computational Intelligence, Communications, and Business Analytics (CICBA 2017)

Abstract

In this paper, we propose a novel framework for integrating multiple views of data using a hypergraph-based partitioning approach. Representing each view as a separate graph, we partition it into clusters using a standard clustering technique. Each clustering solution is represented as a separate hypergraph in which each hyperedge encodes a cluster. Concatenating all these hyperedges yields the adjacency matrix (H) of a hypergraph with n vertices, where n is the number of objects in each data view, and E hyperedges, where E is the total number of clusters across the representative clustering solutions. A similarity matrix (S) is compiled from H by taking an entry-wise average, so that each entry of S denotes the fraction of clusterings in which two objects are members of the same group/cluster. Clustering the S matrix results in meta-clusters, which inherit the characteristics of all views of the original data. We performed a simulation study on three data views to validate the framework: for each view, we tuned the data parameter and investigated whether the results conform to the changes. We also applied the proposed framework to a real-life dataset to identify meta-clusters, and analyzed the resulting meta-clusters to validate the proposed method.
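
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the "entry-wise average" means S = H·Hᵀ / r, where r is the number of clusterings; this reading is consistent with the statement that each entry of S is the fraction of clusterings in which two objects share a cluster. The function names, the toy labels, and the final step (connected components of the thresholded S) are hypothetical stand-ins chosen for brevity; the abstract does not say which algorithm is used to cluster S.

```python
def incidence_matrix(clusterings):
    """Build H (n objects x E hyperedges) from one label list per view."""
    n = len(clusterings[0])
    columns = []
    for labels in clusterings:
        if len(labels) != n:
            raise ValueError("every view must label the same n objects")
        # One hyperedge (column) per cluster found in this view.
        for cluster_id in sorted(set(labels)):
            columns.append([1 if lab == cluster_id else 0 for lab in labels])
    # Transpose the list of hyperedge columns into n rows.
    return [[col[i] for col in columns] for i in range(n)]


def co_association(H, num_clusterings):
    """S[i][j] = fraction of clusterings placing objects i and j together."""
    n = len(H)
    return [[sum(a * b for a, b in zip(H[i], H[j])) / num_clusterings
             for j in range(n)] for i in range(n)]


def meta_clusters(S, threshold=0.5):
    """Stand-in final step (an assumption, not the paper's method):
    connected components of the graph with an edge where S > threshold."""
    n = len(S)
    label = [-1] * n
    current = 0
    for start in range(n):
        if label[start] != -1:
            continue
        label[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if label[j] == -1 and S[i][j] > threshold:
                    label[j] = current
                    stack.append(j)
        current += 1
    return label


# Toy example: three views, each clustering the same six objects.
views = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [1, 1, 1, 0, 0, 0],
]
H = incidence_matrix(views)        # 6 objects x 7 hyperedges (2 + 3 + 2)
S = co_association(H, len(views))
labels = meta_clusters(S)
print(labels)
```

In this toy input, objects 0 and 1 are grouped together in all three views, so their similarity is 1, while objects 2 and 3 are grouped together only in the second view, so their similarity is 1/3 and the threshold keeps them in different meta-clusters.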

Author information

Correspondence to Saif Ayan Khan.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Khan, S.A., Ray, S. (2017). Integrating Multi-view Data: A Hypergraph Based Approach. In: Mandal, J., Dutta, P., Mukhopadhyay, S. (eds) Computational Intelligence, Communications, and Business Analytics. CICBA 2017. Communications in Computer and Information Science, vol 776. Springer, Singapore. https://doi.org/10.1007/978-981-10-6430-2_27

  • DOI: https://doi.org/10.1007/978-981-10-6430-2_27

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6429-6

  • Online ISBN: 978-981-10-6430-2

  • eBook Packages: Computer Science, Computer Science (R0)
