Abstract
Graph Convolutional Networks (GCNs) are powerful models for data arranged as a graph, a structured non-Euclidean domain. It is known that GCNs reach high accuracy even when operating with just two layers. Another well-known result shows that the Extreme Learning Machine (ELM) is an efficient analytic learning technique to train two-layer Multi-Layer Perceptrons (MLPs). In this work, we extend ELM theory to the context of GCNs, giving rise to ELM-GCN, a novel learning mechanism for training GCNs that is faster than baseline techniques while maintaining prediction capability. We also show a theoretical upper bound on the number of hidden units required to guarantee GCN performance. To the best of our knowledge, our approach is the first to provide such theoretical guarantees while proposing a non-iterative learning algorithm to train graph convolutional networks.
References
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. J. Mach. Learn. Res. 18 (2018)
Baykara, M., Abdulrahman, A.: Seizure detection based on adaptive feature extraction by applying extreme learning machines. Traitement du Signal 38(2), 331–340 (2021)
Chen, J., Ma, T., Xiao, C.: FastGCN: fast learning with graph convolutional networks via importance sampling. arXiv preprint arXiv:1801.10247 (2018)
Chiang, W.L., Liu, X., Si, S., Li, Y., Bengio, S., Hsieh, C.J.: Cluster-GCN: an efficient algorithm for training deep and large graph convolutional networks. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 257–266 (2019)
Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
Deng, W., Zheng, Q., Chen, L.: Regularized extreme learning machine. In: 2009 IEEE Symposium on Computational Intelligence and Data Mining, pp. 389–395. IEEE (2009)
Hamilton, W.L., Ying, R., Leskovec, J.: Inductive representation learning on large graphs. arXiv preprint arXiv:1706.02216 (2017)
He, B., Xu, D., Nian, R., van Heeswijk, M., Yu, Q., Miche, Y., Lendasse, A.: Fast face recognition via sparse coding and extreme learning machine. Cogn. Comput. 6(2), 264–277 (2014)
Huang, G.B., Wang, D.H., Lan, Y.: Extreme learning machines: a survey. Int. J. Mach. Learn. Cybern. 2(2), 107–122 (2011)
Huang, G.B., Zhu, Q.Y., Siew, C.K.: Extreme learning machine: theory and applications. Neurocomputing 70(1–3), 489–501 (2006)
Inaba, F.K., Teatini Salles, E.O., Perron, S., Caporossi, G.: DGR-ELM - Distributed Generalized Regularized ELM for classification. Neurocomputing 275, 1522–1530 (2018)
Jin, G., Wang, Q., Zhu, C., Feng, Y., Huang, J., Zhou, J.: Addressing crime situation forecasting task with temporal graph convolutional neural network approach. In: 2020 12th International Conference on Measuring Technology and Mechatronics Automation (ICMTMA), pp. 474–478. IEEE (2020)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)
Kipf, T.N., Welling, M.: Variational graph auto-encoders. arXiv preprint arXiv:1611.07308 (2016)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Li, Q., Han, Z., Wu, X.M.: Deeper insights into graph convolutional networks for semi-supervised learning. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)
Lv, Q., Niu, X., Dou, Y., Xu, J., Lei, Y.: Classification of hyperspectral remote sensing image using hierarchical local-receptive-field-based extreme learning machine. IEEE Geosci. Remote Sens. Lett. 13(3), 434–438 (2016)
Martínez-Martínez, J.M., Escandell-Montero, P., Soria-Olivas, E., Martín-Guerrero, J.D., Magdalena-Benedito, R., Gómez-Sanchis, J.: Regularized extreme learning machine for regression problems. Neurocomputing 74(17), 3716–3721 (2011). https://doi.org/10.1016/j.neucom.2011.06.013. https://linkinghub.elsevier.com/retrieve/pii/S092523121100378X
Seo, Y., Defferrard, M., Vandergheynst, P., Bresson, X.: Structured sequence modeling with graph convolutional recurrent networks. In: Cheng, L., Leung, A.C.S., Ozawa, S. (eds.) ICONIP 2018. LNCS, vol. 11301, pp. 362–373. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-04167-0_33
Shchur, O., Mumme, M., Bojchevski, A., Günnemann, S.: Pitfalls of graph neural network evaluation. arXiv preprint arXiv:1811.05868 (2018)
da Silva, B.L.S., Inaba, F.K., Salles, E.O.T., Ciarelli, P.M.: Outlier robust extreme learning machine for multi-target regression. Expert Syst. Appl. 140, 112877 (2020)
Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., Bengio, Y.: Graph attention networks. arXiv preprint arXiv:1710.10903 (2017)
Wu, F., Souza, A., Zhang, T., Fifty, C., Yu, T., Weinberger, K.: Simplifying graph convolutional networks. In: International Conference on Machine Learning, pp. 6861–6871. PMLR (2019)
Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., Philip, S.Y.: A comprehensive survey on graph neural networks. IEEE Trans. Neural Networks Learn. Syst. (2020)
Yan, S., Xiong, Y., Lin, D.: Spatial temporal graph convolutional networks for skeleton-based action recognition. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)
Yang, Z., Cohen, W., Salakhudinov, R.: Revisiting semi-supervised learning with graph embeddings. In: International Conference on Machine Learning, pp. 40–48. PMLR (2016)
Ying, R., He, R., Chen, K., Eksombatchai, P., Hamilton, W.L., Leskovec, J.: Graph convolutional neural networks for web-scale recommender systems. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 974–983 (2018)
You, Y., Chen, T., Wang, Z., Shen, Y.: L2-GCN: layer-wise and learned efficient training of graph convolutional networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2127–2135 (2020)
Zeng, H., Prasanna, V.: GraphACT: accelerating GCN training on CPU-FPGA heterogeneous platforms. In: Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 255–265 (2020)
Zhang, K., Luo, M.: Outlier-robust extreme learning machine for regression problems. Neurocomputing 151, 1519–1527 (2015)
Zhang, M., Chen, Y.: Link prediction based on graph neural networks. arXiv preprint arXiv:1802.09691 (2018)
Zhang, Z., Cai, Y., Gong, W., Liu, X., Cai, Z.: Graph convolutional extreme learning machine. In: 2020 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2020)
Zhao, L., et al.: T-GCN: a temporal graph convolutional network for traffic prediction. IEEE Trans. Intell. Transp. Syst. (2019)
Acknowledgments
This work was supported by grant #2018/24516-0, São Paulo Research Foundation (FAPESP). The opinions, hypotheses, conclusions, and recommendations expressed in this material are the responsibility of the author(s) and do not necessarily reflect FAPESP's view.
Appendices
Appendix 1. Proofs
In this appendix we present the proofs of Theorems 1 and 2, which extend Extreme Learning Machine theory to Graph Convolutional Networks. We also show that RELM-GCN matches ELM-GCN when \(\gamma \rightarrow \infty \).
Theorem 1
Proof
We interpret the rows of the convolved graph signal matrix \(\hat{X}\) as the input samples of the classic ELM theorem (Theorem 2.1 from [10]). Since the rows of \(\hat{X}\) are distinct by hypothesis, \(\sigma \) is an infinitely differentiable activation function, and \(\varTheta ^{(1)}\) is randomly sampled from a continuous probability distribution, the hypotheses of the classic ELM theorem are satisfied; thus \(\sigma (\hat{X} \varTheta ^{(1)}) = \sigma (\hat{A} X \varTheta ^{(1)}) =: Y_h\) is invertible with probability one.
Since the edge weights of G are non-negative, all elements of the matrix \(\tilde{A}\) (Eq. 2) are also non-negative. Moreover, the diagonal elements of \(\tilde{A}\) are positive; thus the diagonal elements of \(\tilde{D}\) (Eq. 2) are also positive, and hence \(\tilde{D}\) and \(\tilde{D}^{-1/2}\) are invertible matrices. Furthermore, from the hypothesis that \(\tilde{A}\) is invertible, Eq. 2 shows that \(\hat{A}\) must also be invertible. Therefore, with probability one, \(\hat{A} Y_h\) is invertible, and by defining \(\varTheta ^{(2)} := (\hat{A} Y_h)^{-1}T\) we obtain \(\Vert \hat{A} Y_h \varTheta ^{(2)} - T \Vert = 0\), concluding the proof of the theorem.
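The construction in the proof can be checked numerically on a toy graph. The sketch below is illustrative only: it assumes \(\tanh \) as the activation \(\sigma \), Gaussian-sampled weights, and a hypothetical 4-node graph whose \(\tilde{A}\) happens to be invertible.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy undirected graph on 4 nodes (illustrative, not from the paper).
A = np.array([[0., 1., 0., 1.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 0., 1., 0.]])
A_tilde = A + np.eye(4)                    # add self-loops
d_tilde = A_tilde.sum(axis=1)              # diagonal of D~, strictly positive
# Symmetric normalization: A_hat[i, j] = A~[i, j] / sqrt(d_i * d_j)
A_hat = A_tilde / np.sqrt(np.outer(d_tilde, d_tilde))

# A~ is invertible for this toy graph, so A_hat is a product of
# invertible matrices and hence invertible as well.
assert np.linalg.matrix_rank(A_hat) == 4

# ELM-GCN with H = N hidden units: random Theta1, analytic Theta2.
N, d, H, C = 4, 3, 4, 2
X = rng.standard_normal((N, d))            # node features
T = rng.standard_normal((N, C))            # targets
Theta1 = rng.standard_normal((d, H))       # sampled once, never trained
Y_h = np.tanh(A_hat @ X @ Theta1)          # sigma(A_hat X Theta1)
Theta2 = np.linalg.solve(A_hat @ Y_h, T)   # Theta2 = (A_hat Y_h)^{-1} T
err = np.linalg.norm(A_hat @ Y_h @ Theta2 - T)
print(err)                                 # essentially zero (machine precision)
```

With probability one over the random draw of \(\varTheta ^{(1)}\), the fitting error is zero, as Theorem 1 predicts for \(H = N\).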
Theorem 2
Proof
Following exactly the same argument as the classic ELM theorem (Theorem 2.2 from [10]), the validity of the theorem comes from the fact that, if it did not hold, one could choose \(H = N\), which makes \(\Vert \hat{A} Y_h \varTheta ^{(2)} - T \Vert = 0 < \epsilon \) according to Theorem 1.
Now we show that ELM-GCN is a special case of RELM-GCN when \(\gamma \rightarrow \infty \).
Proof
When \(\gamma \rightarrow \infty \) we have \(\frac{1}{\gamma }I \rightarrow 0\). Thus the analytical assignment to \(\varTheta ^{(2)}\) by RELM-GCN (last instruction of Algorithm 3) becomes
$$\varTheta ^{(2)} = \left( (\hat{A} Y_h)^\top \hat{A} Y_h + \frac{1}{\gamma }I \right) ^{-1} (\hat{A} Y_h)^\top T \;\longrightarrow \; \left( (\hat{A} Y_h)^\top \hat{A} Y_h \right) ^{-1} (\hat{A} Y_h)^\top T = (\hat{A} Y_h)^{\dagger } T,$$
which is the assignment to \(\varTheta ^{(2)}\) given by ELM-GCN (last instruction of Algorithm 2).
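This limit can be checked numerically. The sketch below assumes the standard regularized least-squares form used by regularized ELM [12, 24] for RELM-GCN's assignment (an assumption on our part, since Algorithm 3 is not reproduced here), with a random full-column-rank matrix standing in for \(\hat{A} Y_h\).

```python
import numpy as np

rng = np.random.default_rng(1)
Y = rng.standard_normal((20, 5))   # stands in for A_hat @ Y_h (tall, full rank)
T = rng.standard_normal((20, 3))   # targets

def relm_solution(Y, T, gamma):
    """Regularized least-squares: (Y^T Y + I/gamma)^{-1} Y^T T."""
    H = Y.shape[1]
    return np.linalg.solve(Y.T @ Y + np.eye(H) / gamma, Y.T @ T)

elm = np.linalg.pinv(Y) @ T              # ELM-GCN: Moore-Penrose pseudoinverse
relm = relm_solution(Y, T, gamma=1e12)   # RELM-GCN with very large gamma
gap = np.linalg.norm(relm - elm)
print(gap)                               # shrinks toward 0 as gamma grows
```

For a tall full-column-rank \(Y\), the pseudoinverse solution equals \((Y^\top Y)^{-1} Y^\top T\), so the two assignments coincide in the limit.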
Appendix 2. Additional Experiment Details
In the following, we further validate the results obtained in the experiments involving real data. Specifically, we analyse the runs that produced Fig. 3 using the Wilcoxon signed-rank test [5], a non-parametric statistical test for paired samples. Precisely, the test compares the accuracy and training time produced by ELM-GCN or RELM-GCN against those resulting from the other two algorithms. First, we consider the null hypothesis that one of our approaches and a competing algorithm generate results according to the same distribution. If this null hypothesis is rejected, we proceed to a second step, which considers the null hypothesis that ELM-GCN or its regularized version performs worse (i.e. lower accuracy or higher training time) than the other learning technique. In all tests we use a significance level of 0.1% (i.e., 99.9% confidence).
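The two-step procedure above can be sketched with `scipy.stats.wilcoxon`. The paired accuracies below are synthetic placeholders, not the paper's data; the structure of the two tests is what matters.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(2)
# Hypothetical paired accuracies over 30 repeated runs (synthetic data).
acc_relm = rng.normal(0.80, 0.01, 30)
# Simulated baseline that is consistently slightly less accurate.
acc_base = acc_relm - np.abs(rng.normal(0.02, 0.005, 30))

# Step 1: two-sided test of the null "paired results share a distribution".
stat, p_two = wilcoxon(acc_relm, acc_base)
# Step 2: one-sided test of the null "RELM-GCN is not more accurate".
stat, p_one = wilcoxon(acc_relm, acc_base, alternative='greater')

alpha = 0.001   # 0.1% significance level, as in the paper
print(p_two < alpha, p_one < alpha)
```

Rejecting in step 1 establishes that the paired results differ; rejecting in step 2 establishes the direction of the difference.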
Regarding ELM-GCN, the Wilcoxon test rejected the hypothesis that this technique produces output at least as accurate as that of BP or fastGCN. However, in terms of training time, the null hypothesis that ELM-GCN's results come from the same distribution as its competitors' is rejected; moreover, the Wilcoxon test also rejects the hypothesis that ELM-GCN has a higher training time than BP or fastGCN.
Comparing RELM-GCN with BP, the Wilcoxon test could not reject the null hypothesis that both techniques produce the same accuracy. However, when RELM-GCN is compared against fastGCN, we get an interesting outcome: the null hypothesis cannot be rejected when both learning algorithms are compared in the inductive paradigm, but it is rejected in the transductive scenario. Moreover, the hypothesis that RELM-GCN is less accurate than fastGCN in that paradigm is also rejected. Indeed, a careful analysis of Fig. 3a shows that RELM-GCN consistently outperforms fastGCN on the first three datasets while performing comparably on Reddit.
Furthermore, the Wilcoxon test rejected the null hypothesis that RELM-GCN is as fast as the competing techniques, regardless of the algorithm and paradigm chosen for comparison. Moreover, the second step of the test showed that we should also reject the hypothesis that RELM-GCN is slower than the other algorithms in either learning paradigm. The conclusions provided by the Wilcoxon test are consistent with the training times shown in Fig. 3, since RELM-GCN outperforms the competing algorithms on most datasets, being comparable only with fastGCN on Pubmed.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Gonçalves, T., Nonato, L.G. (2022). Extreme Learning Machine to Graph Convolutional Networks. In: Xavier-Junior, J.C., Rios, R.A. (eds.) Intelligent Systems. BRACIS 2022. Lecture Notes in Computer Science, vol. 13654. Springer, Cham. https://doi.org/10.1007/978-3-031-21689-3_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21688-6
Online ISBN: 978-3-031-21689-3