
Extreme Learning Machine to Graph Convolutional Networks

  • Conference paper
  • Intelligent Systems (BRACIS 2022)

Abstract

Graph Convolutional Network (GCN) is a powerful model for data arranged as a graph, a structured non-Euclidean domain. It is known that GCNs reach high accuracy even when operating with just two layers. Another well-known result is that Extreme Learning Machine (ELM) is an efficient analytic technique to train two-layer Multi-Layer Perceptrons (MLPs). In this work, we extend ELM theory to the context of GCNs, giving rise to ELM-GCN, a novel learning mechanism to train GCNs that turns out to be faster than baseline techniques while maintaining prediction capability. We also show a theoretical upper bound on the number of hidden units required to guarantee GCN performance. To the best of our knowledge, our approach is the first to provide such theoretical guarantees while proposing a non-iterative learning algorithm to train graph convolutional networks.
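As a reading aid (not part of the paper), the two-layer construction described in the abstract and detailed in Appendix 1 below can be sketched in a few lines of NumPy. The function names, the tanh activation, and the assumption that Eq. 2 of the paper is the standard GCN renormalization (Ã = A + I with symmetric degree normalization, as in Kipf & Welling [13]) are ours; Algorithms 2 and 3 in the paper give the authoritative procedure.

```python
import numpy as np

def normalized_adjacency(A):
    """Propagation matrix assumed to match Eq. 2 of the paper, i.e. the standard
    GCN renormalization [13]: A~ = A + I, D~ = diag(A~ 1), A_hat = D~^{-1/2} A~ D~^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def elm_gcn_train(A, X, T, H, rng=None):
    """Non-iterative, ELM-style training of a two-layer GCN (sketch of Algorithm 2):
    Theta1 is drawn at random and kept fixed, Theta2 is solved for in closed form."""
    rng = np.random.default_rng() if rng is None else rng
    A_hat = normalized_adjacency(A)
    Theta1 = rng.standard_normal((X.shape[1], H))   # random hidden weights, never updated
    Y_h = np.tanh(A_hat @ X @ Theta1)               # hidden graph-convolution layer
    Theta2 = np.linalg.pinv(A_hat @ Y_h) @ T        # closed-form output weights
    return Theta1, Theta2

def elm_gcn_predict(A, X, Theta1, Theta2):
    A_hat = normalized_adjacency(A)
    return A_hat @ np.tanh(A_hat @ X @ Theta1) @ Theta2
```

RELM-GCN would replace the final pseudo-inverse step with the regularized solve discussed in Appendix 1.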


References

  1. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. J. Mach. Learn. Res. 18 (2018)


  2. Baykara, M., Abdulrahman, A.: Seizure detection based on adaptive feature extraction by applying extreme learning machines. Traitement du Signal 38(2), 331–340 (2021)


  3. Chen, J., Ma, T., Xiao, C.: FastGCN: fast learning with graph convolutional networks via importance sampling. arXiv preprint arXiv:1801.10247 (2018)

  4. Chiang, W.L., Liu, X., Si, S., Li, Y., Bengio, S., Hsieh, C.J.: Cluster-GCN: an efficient algorithm for training deep and large graph convolutional networks. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 257–266 (2019)


  5. Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)


  6. Deng, W., Zheng, Q., Chen, L.: Regularized extreme learning machine. In: 2009 IEEE Symposium on Computational Intelligence and Data Mining, pp. 389–395. IEEE (2009)


  7. Hamilton, W.L., Ying, R., Leskovec, J.: Inductive representation learning on large graphs. arXiv preprint arXiv:1706.02216 (2017)

  8. He, B., Xu, D., Nian, R., van Heeswijk, M., Yu, Q., Miche, Y., Lendasse, A.: Fast face recognition via sparse coding and extreme learning machine. Cogn. Comput. 6(2), 264–277 (2014)


  9. Huang, G.B., Wang, D.H., Lan, Y.: Extreme learning machines: a survey. Int. J. Mach. Learn. Cybern. 2(2), 107–122 (2011)


  10. Huang, G.B., Zhu, Q.Y., Siew, C.K.: Extreme learning machine: theory and applications. Neurocomputing 70(1–3), 489–501 (2006)


  11. Inaba, F.K., Teatini Salles, E.O., Perron, S., Caporossi, G.: DGR-ELM - Distributed Generalized Regularized ELM for classification. Neurocomputing 275, 1522–1530 (2018)


  12. Jin, G., Wang, Q., Zhu, C., Feng, Y., Huang, J., Zhou, J.: Addressing crime situation forecasting task with temporal graph convolutional neural network approach. In: 2020 12th International Conference on Measuring Technology and Mechatronics Automation (ICMTMA), pp. 474–478. IEEE (2020)


  13. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)

  14. Kipf, T.N., Welling, M.: Variational graph auto-encoders. arXiv preprint arXiv:1611.07308 (2016)

  15. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)


  16. Li, Q., Han, Z., Wu, X.M.: Deeper insights into graph convolutional networks for semi-supervised learning. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)


  17. Lv, Q., Niu, X., Dou, Y., Xu, J., Lei, Y.: Classification of hyperspectral remote sensing image using hierarchical local-receptive-field-based extreme learning machine. IEEE Geosci. Remote Sens. Lett. 13(3), 434–438 (2016)


  18. Martínez-Martínez, J.M., Escandell-Montero, P., Soria-Olivas, E., Martín-Guerrero, J.D., Magdalena-Benedito, R., Gómez-Sanchis, J.: Regularized extreme learning machine for regression problems. Neurocomputing 74(17), 3716–3721 (2011). https://doi.org/10.1016/j.neucom.2011.06.013

  19. Seo, Y., Defferrard, M., Vandergheynst, P., Bresson, X.: Structured sequence modeling with graph convolutional recurrent networks. In: Cheng, L., Leung, A.C.S., Ozawa, S. (eds.) ICONIP 2018. LNCS, vol. 11301, pp. 362–373. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-04167-0_33


  20. Shchur, O., Mumme, M., Bojchevski, A., Günnemann, S.: Pitfalls of graph neural network evaluation. arXiv preprint arXiv:1811.05868 (2018)

  21. da Silva, B.L.S., Inaba, F.K., Salles, E.O.T., Ciarelli, P.M.: Outlier robust extreme machine learning for multi-target regression. Expert Syst. Appl. 140, 112877 (2020)


  22. Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., Bengio, Y.: Graph attention networks. arXiv preprint arXiv:1710.10903 (2017)

  23. Wu, F., Souza, A., Zhang, T., Fifty, C., Yu, T., Weinberger, K.: Simplifying graph convolutional networks. In: International Conference on Machine Learning, pp. 6861–6871. PMLR (2019)


  24. Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., Philip, S.Y.: A comprehensive survey on graph neural networks. IEEE Trans. Neural Networks Learn. Syst. (2020)


  25. Yan, S., Xiong, Y., Lin, D.: Spatial temporal graph convolutional networks for skeleton-based action recognition. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)


  26. Yang, Z., Cohen, W., Salakhutdinov, R.: Revisiting semi-supervised learning with graph embeddings. In: International Conference on Machine Learning, pp. 40–48. PMLR (2016)


  27. Ying, R., He, R., Chen, K., Eksombatchai, P., Hamilton, W.L., Leskovec, J.: Graph convolutional neural networks for web-scale recommender systems. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 974–983 (2018)


  28. You, Y., Chen, T., Wang, Z., Shen, Y.: L2-GCN: layer-wise and learned efficient training of graph convolutional networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2127–2135 (2020)


  29. Zeng, H., Prasanna, V.: GraphACT: accelerating GCN training on CPU-FPGA heterogeneous platforms. In: Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 255–265 (2020)


  30. Zhang, K., Luo, M.: Outlier-robust extreme learning machine for regression problems. Neurocomputing 151, 1519–1527 (2015)


  31. Zhang, M., Chen, Y.: Link prediction based on graph neural networks. arXiv preprint arXiv:1802.09691 (2018)

  32. Zhang, Z., Cai, Y., Gong, W., Liu, X., Cai, Z.: Graph convolutional extreme learning machine. In: 2020 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2020)


  33. Zhao, L., et al.: T-GCN: a temporal graph convolutional network for traffic prediction. IEEE Trans. Intell. Transp. Syst. (2019)



Acknowledgments

This work was supported by grant #2018/24516-0, São Paulo Research Foundation (FAPESP). The opinions, hypotheses, and conclusions or recommendations expressed in this material are the responsibility of the author(s) and do not necessarily reflect the views of FAPESP.

Author information


Corresponding author

Correspondence to Thales Gonçalves.


Appendices

Appendix 1. Proofs

In this appendix we present the proofs of Theorems 1 and 2, which support the extension of Extreme Learning Machine theory to Graph Convolutional Networks. We also show that RELM-GCN matches ELM-GCN when \(\gamma \rightarrow \infty \).

Theorem 1

Proof

We interpret the rows of the convolved graph signal matrix \(\hat{X}\) as the input samples of the classic ELM theorem (Theorem 2.1 from [10]). Since, by hypothesis, the rows of \(\hat{X}\) are distinct, \(\sigma \) is an infinitely differentiable activation function, and \(\varTheta ^{(1)}\) is randomly sampled from a continuous probability distribution, the hypotheses of the classic ELM theorem are satisfied; thus \(\sigma (\hat{X} \varTheta ^{(1)}) = \sigma (\hat{A} X \varTheta ^{(1)}) =: Y_h\) is invertible with probability one.

Since the edge weights of G are non-negative, all elements of the matrix \(\tilde{A}\) (Eq. 2) are also non-negative, and its diagonal elements are positive. Thus, the diagonal elements of \(\tilde{D}\) (Eq. 2) are also positive, and hence \(\tilde{D}\) and \(\tilde{D}^{-1/2}\) are invertible. Furthermore, from the hypothesis that \(\tilde{A}\) is invertible, Eq. 2 shows that \(\hat{A}\) must also be invertible. Therefore, with probability one, \(\hat{A} Y_h\) is invertible and, by defining \(\varTheta ^{(2)} := (\hat{A} Y_h)^{-1}T\), we have \(\Vert \hat{A} Y_h \varTheta ^{(2)} - T \Vert = 0\), concluding the proof of the theorem.
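As a quick numerical sanity check of Theorem 1 (ours, not the paper's), one can draw a small random graph and verify that with \(H = N\) hidden units the closed-form \(\varTheta ^{(2)}\) drives the training error to numerical zero, provided the sampled graph happens to satisfy the invertibility hypothesis on \(\tilde{A}\):

```python
import numpy as np

rng = np.random.default_rng(0)
N, F, C = 8, 5, 3                               # nodes, input features, classes

A = np.triu(rng.integers(0, 2, (N, N)), 1).astype(float)
A = A + A.T                                     # small random undirected graph
A_tilde = A + np.eye(N)                         # positive diagonal, as used in the proof
d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
A_hat = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

X = rng.standard_normal((N, F))                 # rows distinct with probability one
T = np.eye(C)[rng.integers(0, C, N)]            # one-hot targets

Theta1 = rng.standard_normal((F, N))            # H = N hidden units
Y_h = np.tanh(A_hat @ X @ Theta1)
Theta2 = np.linalg.pinv(A_hat @ Y_h) @ T        # equals (A_hat Y_h)^{-1} T when invertible
print(np.linalg.norm(A_hat @ Y_h @ Theta2 - T)) # ~1e-13: zero training error
```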

Theorem 2

Proof

Following exactly the same argument as the classic ELM theorem (Theorem 2.2 from [10]), the validity of the theorem comes from the fact that, otherwise, one could choose \(H = N\), which makes \(\Vert \hat{A} Y_h \varTheta ^{(2)} - T \Vert = 0 < \epsilon \) according to Theorem 1.

Now we show that ELM-GCN is a special case of RELM-GCN when \(\gamma \rightarrow \infty \).

Proof

When \(\gamma \rightarrow \infty \) we have \(\frac{1}{\gamma }I \rightarrow 0\). Thus the analytical assignment to \(\varTheta ^{(2)}\) by RELM-GCN (last instruction of Algorithm 3) becomes:

$$\begin{aligned} \varTheta ^{(2)}&= \bigg ( \frac{1}{\gamma }I + (\hat{A}Y_h)^T (\hat{A}Y_h) \bigg )^\dagger (\hat{A}Y_h)^T\,T = \bigg ( (\hat{A}Y_h)^T (\hat{A}Y_h) \bigg )^\dagger (\hat{A}Y_h)^T\,T\\&= (\hat{A}Y_h)^\dagger \Big ( (\hat{A}Y_h)^T \Big )^\dagger (\hat{A}Y_h)^T\,T = (\hat{A}Y_h)^\dagger \Big ( (\hat{A}Y_h)^\dagger \Big )^T (\hat{A}Y_h)^T\,T\\&= (\hat{A}Y_h)^\dagger \Big ( (\hat{A}Y_h) (\hat{A}Y_h)^\dagger \Big )^T \,T = (\hat{A}Y_h)^\dagger (\hat{A}Y_h) (\hat{A}Y_h)^\dagger \,T = (\hat{A}Y_h)^\dagger \,T\\ \end{aligned}$$

which is the assignment to \(\varTheta ^{(2)}\) given by ELM-GCN (last instruction of Algorithm 2).
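The same limit can be checked numerically. In the small, self-contained example below (ours, for illustration only), M plays the role of \(\hat{A} Y_h\), and the regularized solution approaches the pseudo-inverse solution as \(\gamma \) grows:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((50, 20))               # stands in for A_hat @ Y_h
T = rng.standard_normal((50, 4))                # targets

theta_elm = np.linalg.pinv(M) @ T               # ELM-GCN assignment (Algorithm 2)
for gamma in (1e0, 1e3, 1e6, 1e9):
    # RELM-GCN assignment (Algorithm 3): (I/gamma + M^T M)^+ M^T T
    theta_relm = np.linalg.pinv(np.eye(20) / gamma + M.T @ M) @ M.T @ T
    print(f"gamma = {gamma:.0e}: ||theta_relm - theta_elm|| = "
          f"{np.linalg.norm(theta_relm - theta_elm):.3e}")
```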

Appendix 2. Additional Experiment Details

In the following, we further validate the results obtained in the experiments involving real data. Specifically, we analyse the different runs that produced Fig. 3 using the Wilcoxon signed-rank test [5], a non-parametric statistical test for paired samples. Precisely, the test compares the accuracy and training time produced by either ELM-GCN or RELM-GCN against those resulting from the other two algorithms. First, we consider the null hypothesis that one of our approaches and a competing algorithm generate results according to the same distribution. If this null hypothesis is rejected, we proceed to the next step, which considers another null hypothesis: that ELM-GCN or its regularized version performs worse (i.e., lower accuracy or higher training time) than the other learning technique. In all tests we use a significance level of 99.9%.
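For concreteness, the two-step procedure can be reproduced with SciPy as sketched below; the paired samples here are synthetic stand-ins for the runs behind Fig. 3, and only the testing logic mirrors the text:

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(2)
alpha = 0.001                                    # 99.9% significance level, as in the text

# synthetic paired training times (seconds) over repeated runs: ours vs. a competitor
ours = rng.normal(1.0, 0.1, 30)
competitor = rng.normal(3.0, 0.3, 30)

# Step 1: two-sided test of the null "both samples come from the same distribution"
_, p_two_sided = wilcoxon(ours, competitor, alternative="two-sided")
print(f"step 1 (same distribution?): p = {p_two_sided:.2e}")

if p_two_sided < alpha:
    # Step 2: one-sided test; the null is "ours is at least as slow as the competitor",
    # so rejecting it supports the claim that ours trains faster
    _, p_one_sided = wilcoxon(ours, competitor, alternative="less")
    print(f"step 2 (ours slower?): p = {p_one_sided:.2e}")
```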

Regarding ELM-GCN, the Wilcoxon test rejected the hypothesis that this technique produces output at least as accurate as BP or fastGCN. However, in terms of training time, the null hypothesis that ELM-GCN comes from the same distribution as its competitors is rejected. Moreover, the Wilcoxon test also rejects the hypothesis that ELM-GCN has a higher training time than BP or fastGCN.

Comparing RELM-GCN with BP, the Wilcoxon test could not reject the null hypothesis that both techniques produce the same accuracy. However, when RELM-GCN is compared against fastGCN, we get an interesting outcome: the null hypothesis cannot be rejected when both learning algorithms are compared in the inductive paradigm, but it is rejected in the transductive scenario. Moreover, the hypothesis that RELM-GCN is less accurate than fastGCN in that paradigm is also rejected. Indeed, a careful analysis of Fig. 3a shows that RELM-GCN consistently outperforms fastGCN on the first three datasets while performing comparably on Reddit.

Furthermore, the Wilcoxon test rejected the null hypothesis that RELM-GCN is as fast as the competing techniques, regardless of the algorithm and paradigm chosen for comparison. The second step of the test showed that we should also reject the hypothesis that RELM-GCN is slower than the other algorithms in any learning paradigm. The conclusions provided by the Wilcoxon test are consistent with the training times shown in Fig. 3, where RELM-GCN outperforms the competing algorithms on most datasets and is comparable only to fastGCN on Pubmed.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Gonçalves, T., Nonato, L.G. (2022). Extreme Learning Machine to Graph Convolutional Networks. In: Xavier-Junior, J.C., Rios, R.A. (eds.) Intelligent Systems. BRACIS 2022. Lecture Notes in Computer Science, vol. 13654. Springer, Cham. https://doi.org/10.1007/978-3-031-21689-3_42


  • DOI: https://doi.org/10.1007/978-3-031-21689-3_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21688-6

  • Online ISBN: 978-3-031-21689-3

  • eBook Packages: Computer Science, Computer Science (R0)
