Dispersion Entropy for Graph Signals

We present a novel method, called Dispersion Entropy for Graph Signals, DE G , as a powerful tool for analysing the irregularity of signals defined on graphs. DE G generalizes the classical dispersion entropy concept for univariate time series, enabling its application in diverse domains such as image processing, time series analysis, and network analysis. Furthermore, DE G establishes a theoretical framework that provides insights into the irregularities observed in graph centrality measures and in the spectra of operators acting on graphs. We demonstrate the effectiveness of DE G in detecting changes in the dynamics of signals defined on both synthetic and real-world graphs, by defining a mix process on random geometric graphs or those exhibiting small-world properties. Our results indicate that DE G effectively captures the irregularity of graph signals across various network configurations, successfully differentiating between distinct levels of randomness and connectivity. Consequently, DE G provides a comprehensive framework for entropy analysis of various data types, enabling new applications of dispersion entropy not previously feasible, and uncovering nonlinear relationships between graph signals and their graph topology.


Introduction
The field of data analysis has significantly benefited from the evolution of entropy-based measures [1], vital tools for assessing irregularities and nonlinear behaviours in data [2,3]. These measures have a broad spectrum of applications, spanning fields such as finance [4,5], biology [6,7], industrial processes [8,9] and even conflicts and international events [10], to name a few. Among these entropy-based measures, Dispersion Entropy (DE) has carved a unique space for itself due to its distinct properties and effectiveness [11].
Serving as a robust algorithm, DE is specifically designed to capture intricate dynamics within one-dimensional time series data. Its standout characteristic is the dual consideration of both the order and amplitude of data points, offering a comprehensive perspective on the system's dynamics [11,12].
DE boasts several distinct advantages. It is capable of identifying changes in noise bandwidth as well as concurrent shifts in frequency and amplitude, making it a highly adaptable tool for signal analysis [11]. Its superior performance has been demonstrated in applications on real-valued signals, where it outperforms other entropy measures in distinguishing different groups within datasets [13]. Furthermore, it exhibits high computational efficiency, requiring significantly less computation time than several other entropic measures [14]. This property makes it especially suited for processing large or complex datasets.
Owing to its versatility and efficiency, DE has been employed in various domains, including EEG analysis [13], diagnosis and monitoring of rotary machines [9], and fault detection in rolling bearings [15]. With the increasing complexity and volume of data, the utility of DE continues to grow, making it a tool of choice in diverse applications.
The ongoing expansion of complex data recorded in a distributed way over networks, including transportation systems [16] and social and web networks [17], has led to growing interest in broadening the scope of entropy metrics beyond univariate time series to more general domains. There have been recent advancements in extending entropy measures to analyse images (2D data) [18] and irregular domains such as graphs [19]. Despite the improvements made to DE for 1D and 2D data [12,20], a gap remains in its application to data defined on graphs. Bridging this gap would allow us to analyse the nonlinear behaviour of real-world systems with graph-based structures, where the conventional DE was not previously applicable, and would provide a powerful framework for data analysis across a broad range of applications in the field of Graph Signal Processing (GSP) [16].
Smoothness is a fundamental property extensively studied in GSP [16,21,22], typically through the combinatorial Laplacian's quadratic form. Intuitively, a graph signal is considered smooth if connected vertices exhibit similar values [21]. Nonetheless, this definition may not fully capture the complex dynamics of graph signals due to its relationship with the spectrum [23]. To address this limitation, here we propose DE_G, a novel method to assess irregularity in graph signals, which effectively captures the irregularity of graph signals and provides critical insights into the underlying graph structure and data.
To evaluate our method's performance, we employ synthetic and real-world graphs, including random geometric graphs (used to model wireless sensor networks [24]) and small-world networks (observed widely in biological systems [25], social networks [26], and complex systems [27]). In our analysis, we generalize the MIX process, a stochastic process combining a sinusoidal signal with random dynamics controlled by the parameter p ∈ [0, 1], to the setting of graph signals. This process has been employed to assess the performance of various entropy metrics in time series [14,28] and images [29]. Moreover, we analyse centrality measures, which assign ranking values to the graph's vertices based on their position or importance within the graph. Centrality measures play a crucial role in social network analysis for evaluating the importance of vertices in communication [30,31].

Contribution
In this paper, we propose a method for defining Dispersion Entropy for Graph Signals, denoted DE_G. Our approach generalizes the classical univariate definition of DE by incorporating topological information through the adjacency matrix. We demonstrate the effectiveness of DE_G on synthetic and real-world datasets, and characterize the relationship between graph topology and signal dynamics. Our results indicate that DE_G is a promising technique for analysing graph data, holding potential for numerous applications in fields such as biomedicine and social sciences.

Background and notation
This section provides the foundational background and establishes the notation used throughout the remainder of this study. We also present a brief overview of the classical Dispersion Entropy method for univariate time series.
The degree matrix D is a diagonal matrix in which each diagonal entry d_ii is the degree of the vertex v_i. The combinatorial Laplacian matrix of the graph, L, is defined by L = D − A, and the normalized Laplacian, 𝓛, is defined as 𝓛 = D^{−1/2} L D^{−1/2}. These matrices serve as key tools in the spectral analysis of graphs [32,33].
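These constructions can be written directly in NumPy (a minimal sketch; the 4-vertex adjacency matrix below is an arbitrary illustration, not the graph of Fig. 1):

```python
import numpy as np

# Adjacency matrix of a small undirected graph (arbitrary example).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

deg = A.sum(axis=1)                    # vertex degrees d_ii
D = np.diag(deg)                       # degree matrix
L = D - A                              # combinatorial Laplacian
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
L_norm = D_inv_sqrt @ L @ D_inv_sqrt   # normalized Laplacian

# The combinatorial Laplacian has zero row sums by construction.
print(np.allclose(L.sum(axis=1), 0))   # True
```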
In the context of this study, a graph signal is a real mapping X : V → ℝ. For computational purposes, it is convenient to represent the graph signal X as an N-dimensional column vector x ∈ ℝ^{N×1}, where the indexing corresponds to the vertices.

Example
Consider the graph G shown in the left panel of Fig. 1, composed of N = 8 vertices V and the edges E indicated by the connecting lines.
The graph signal is represented as a vector x ∈ ℝ^8 (its explicit values are shown in Fig. 1). The signal values are visualized using red lines for positive values and blue lines for negative values.
In the middle panel of Fig. 1, we present the adjacency matrix A of the graph G. This symmetric matrix, of dimensions N × N (in this case, 8 × 8), has entries a_ij = a_ji set to 1 if an edge exists between v_i and v_j, and 0 otherwise.

The classical dispersion entropy for time series
Dispersion Entropy (DE) is an important tool for analysing time series data [13]. The wide array of applications and the robustness of the DE method make it an essential technique for time series data analysis. More importantly, it has proven to be an effective tool for handling nonlinear and non-stationary data, which are common in real-world applications. We provide a concise step-wise explanation of the DE method here, with a detailed mathematical derivation available in Appendix A:
1. The samples of the signal are discretized into different classes according to the signal values, resulting in a classified signal.
2. The classified signal is scanned for patterns, each composed of a series of samples given an embedding dimension and delay time. Each pattern corresponds to a unique dispersion pattern.
3. The occurrence frequency of each potential dispersion pattern is calculated, indicating the dominance of certain patterns in the time series.
4. The DE is computed from these frequencies using the concept of Shannon entropy, hence offering a measure of the complexity or irregularity of the time series.
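The four steps above can be sketched in Python (a minimal illustration of classical DE, not the authors' reference implementation; the class mapping uses the commonly adopted NCDF convention [11]):

```python
import numpy as np
from math import erf, sqrt, log
from collections import Counter

def dispersion_entropy(x, m=2, c=3, delay=1):
    """Classical dispersion entropy of a 1D series, normalized to [0, 1]."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    # Step 1: discretize samples into c classes via the normal CDF.
    ncdf = np.array([0.5 * (1 + erf((v - mu) / (sigma * sqrt(2)))) for v in x])
    classes = np.clip(np.round(c * ncdf + 0.5), 1, c).astype(int)
    # Step 2: extract embedding vectors (dispersion patterns).
    n_vec = len(x) - (m - 1) * delay
    patterns = [tuple(classes[i:i + m * delay:delay]) for i in range(n_vec)]
    # Step 3: relative frequency of each observed pattern.
    freq = np.array(list(Counter(patterns).values())) / n_vec
    # Step 4: normalized Shannon entropy over the c**m possible patterns.
    return -np.sum(freq * np.log(freq)) / log(c ** m)

rng = np.random.default_rng(0)
print(dispersion_entropy(np.sin(np.arange(200) / 5)))   # lower: regular signal
print(dispersion_entropy(rng.normal(size=200)))         # higher: irregular signal
```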

Dispersion Entropy for Graph Signals (DE_G)
The Dispersion Entropy for Graph Signals (DE G ) extends the classical method to graph signals, adapting the entropy calculation to the underlying graph structure, providing unique insights into the complexity and interconnectedness of graph-based data.

The algorithm
Let X be a graph signal defined on G, let 2 ≤ m ∈ ℕ be the embedding dimension, L ∈ ℕ the delay time and c ∈ ℕ the number of classes. DE_G is defined as follows:
1. The embedding matrix Y ∈ ℝ^{N×m} has columns y_{kL} for k = 0, 1, …, m − 1, given by
y_{kL} = D_{kL}^{−1} A^{kL} x, (1)
where A^{kL} is the kL-th power of the adjacency matrix and D_{kL} is the diagonal matrix whose i-th entry is the i-th row sum of A^{kL} (the number of kL-walks starting at v_i).
2. A map function F : ℝ → {1, …, c} assigns each entry of Y to one of c classes, yielding the classified matrix F(Y).
3. Each row of F(Y) is a dispersion pattern π, an m-tuple of class labels.
4. The relative frequency of each potential dispersion pattern π is computed as
p(π) = |{i : the i-th row of F(Y) equals π}| / N, (2)
where |⋅| represents cardinality.
5. DE_G is the normalized Shannon entropy of these frequencies:
DE_G(X) = −(1/ln(c^m)) Σ_π p(π) ln p(π). (3)

Properties
The DE G algorithm offers several unique features and properties.
The embedding matrix Y ∈ ℝ^{N×m} is a critical component encapsulating the intricate topological relationships between the graph and the signal. Typically, an embedding dimension is chosen within the range 3 ≤ m ≤ 7, with the delay time often set to L = 1, as suggested in [2].
Eq. (1) plays a pivotal role in the algorithm: it intertwines the graph's topology with the signal's values, and its simple, efficient implementation belies its geometric implications.
Each column vector y_{kL} is calculated by averaging the signal values of the neighbouring vertices. The first column of the matrix Y corresponds to the original graph signal, i.e., y_0 = x. For L = 1, Eq. (1) simplifies to y_1 = D^{−1} A x, whose i-th entry is (y_1)_i = (1/deg(v_i)) Σ_{j=1}^{N} a_ij x_j, where deg(v_i) denotes the number of edges connected to the vertex v_i. This demonstrates that the entry of the second column associated with the vertex v_i is the average of the graph signal's values over the vertices connected to v_i. Consequently, the second column is related to the normalized Laplacian; more specifically, y_1 = x − Δx, with Δ = I − D^{−1}A.
In a more general context, a similar interpretation applies to y_{kL} = D_{kL}^{−1} A^{kL} x. The power of the adjacency matrix A^{kL} counts the number of kL-walks between two vertices; that is, the element (A^{kL})_{ij} equals the number of walks of length kL from v_i to v_j. Hence, we have
(y_{kL})_i = Σ_j (A^{kL})_{ij} x_j / Σ_j (A^{kL})_{ij}.
Here, the numerator is the signal weighted by the number of walks between the vertices, and the denominator serves as a normalization factor. This factor ensures that computations are properly balanced and scaled, which is crucial for accurate graph-signal analysis.

Map functions. To address the limitations of assigning the signal x to only a limited number of classes, various map functions F : ℝ → {1, …, c} have been proposed [11]. The Normal Cumulative Distribution Function (NCDF) is commonly utilized [34]. The map r : (0, 1) → {1, …, c} is defined as r(t) = round(ct + 0.5), where rounding increases or decreases a number to the nearest integer. The map NCDF : ℝ → (0, 1) is defined as
NCDF(y) = (1/(σ√(2π))) ∫_{−∞}^{y} e^{−(t−μ)²/(2σ²)} dt, (4)
where μ and σ represent the mean and standard deviation of x, respectively. Thus, F = r ∘ NCDF : ℝ → {1, …, c} is the map function used in our implementation of the DE_G algorithm.

Dispersion patterns. The number of possible dispersion patterns that can be assigned to each embedding vector is c^m. Moreover, the number of embedding vectors constructed in the DE_G algorithm is N, one for each vertex. In contrast, classical DE has a number of embedding vectors that depends on the parameters m and L, specifically N − (m − 1)L.
The normalized Shannon entropy provides a measure of irregularity that can be used to compare signals defined on different graphs. The value of this normalized entropy ranges from 0 (regular behaviour) to 1 (irregular behaviour). It is worth clarifying that the usual Shannon entropy, given by −Σ_π p(π) ln p(π), takes values between 0 and ln(c^m), where c^m represents the number of potential dispersion patterns. Therefore, by normalizing the Shannon entropy with ln(c^m), we ensure that the entropy values are scaled to fall in the interval [0, 1].
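Putting the pieces together, a compact NumPy sketch of DE_G reads as follows (an illustrative dense-matrix implementation based on our reading of Eqs. (1)-(3); it assumes every vertex has at least one kL-walk, and takes μ and σ over the embedding-matrix entries, one of several reasonable conventions):

```python
import numpy as np
from math import erf, sqrt, log
from collections import Counter

def de_graph(A, x, m=3, c=3, delay=1):
    """Dispersion entropy of a graph signal x on a graph with adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    N = len(x)
    # Step 1: embedding matrix Y (Eq. (1)); column k averages x over kL-walks.
    Y = np.empty((N, m))
    for k in range(m):
        Ak = np.linalg.matrix_power(A, k * delay)   # (Ak)_ij = number of kL-walks
        walks = Ak.sum(axis=1)                      # normalization D_kL
        Y[:, k] = (Ak @ x) / walks
    # Step 2: NCDF map of the entries of Y into c classes.
    mu, sigma = Y.mean(), Y.std()
    ncdf = 0.5 * (1 + np.vectorize(erf)((Y - mu) / (sigma * sqrt(2))))
    classes = np.clip(np.round(c * ncdf + 0.5), 1, c).astype(int)
    # Steps 3-4: one dispersion pattern per vertex; relative frequencies (Eq. (2)).
    freq = np.array(list(Counter(map(tuple, classes)).values())) / N
    # Step 5: normalized Shannon entropy over the c**m possible patterns (Eq. (3)).
    return -np.sum(freq * np.log(freq)) / log(c ** m)

# Cycle graph on 8 vertices with an alternating signal: only two patterns
# occur, (3, 1) and (1, 3), each with frequency 1/2.
A = np.roll(np.eye(8), 1, axis=1) + np.roll(np.eye(8), -1, axis=1)
x = np.array([1.0, -1.0] * 4)
print(de_graph(A, x, m=2, c=3))   # → ≈ 0.3155 (= ln 2 / ln 9)
```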
Table 1 summarizes the main parameters of the DE G algorithm, providing a clear overview of their role in the computation process and the typical values used.

Example
We exemplify the use of DE_G with the graph signal x and graph G introduced in Section 2.2 (also illustrated in Fig. 1). We set the class number c = 3, the embedding dimension m = 2, and the delay time L = 1, and undertake the following sequence of steps: Step 1: We first compute the embedding matrix Y ∈ ℝ^{8×2}, as defined in Eq. (1). The resulting matrix is shown in Fig. 2(a).
Step 2: We apply the map function F to each entry of the embedding matrix, using the Normal Cumulative Distribution Function (NCDF) as formulated in Eq. (4), with μ = 0.11 and σ = 0.44. The map function F transforms the entries into integers ranging from 1 to c = 3, resulting in the matrix illustrated in Fig. 2(b).
Step 3: Subsequently, we map each row of F(Y) to a distinct dispersion pattern. Given the parameters m and c, there are c^m = 3² = 9 possible dispersion patterns. These patterns are presented in Fig. 2(c).
Step 4: We compute the relative frequencies of each dispersion pattern using Eq. (2), and display the results in Fig. 2(d).
Step 5: In the final step, we compute DE_G using Eq. (3), which yields DE_G = 0.4434. The DE_G value encapsulates the regularity of the signal propagated across the graph. As is apparent, the graph is primarily dichotomous: two groups of vertices, with positive and negative values, joined by a single intergroup edge.

Changes on the underlying graph that increase the irregularity of the signal. Significantly, the removal of this intergroup edge (between v_4 and v_5) leads to a more regular signal, as reflected by a decreased entropy value (see Fig. 3a). Conversely, introducing more edges between the two groups intensifies the signal's irregularity and correspondingly elevates the entropy values, as illustrated in Fig. 3c-e. This pattern shows that introducing edges between vertices from differing classes (and thus different values) engenders more irregular signals.
Even though the signal itself remains unchanged, the underlying topology varies, and this shift is accurately captured by the DE_G algorithm.
Changes on the underlying graph that preserve the irregularity of the signal. On the other hand, if we augment the graph with additional edges joining vertices within the same class, the signal irregularity remains similar, as detected by the DE_G algorithm, which produces similar or identical entropy values. In Fig. 4a-e, we display several underlying graphs with identical graph signals, all yielding the same entropy values, further exemplifying this point.

Same graph but different signals.
It is observed that the DE_G algorithm is capable of discerning the dynamics of diverse graph signals embedded within the same graph structure. Let us consider the graph signal defined in Fig. 1 as our first example. The entropy computed by DE_G for this configuration is 0.4434. When we alter the signal on vertex v_3 from x_3 = −0.6 to x_3 = 0.6 (as shown in Fig. 5a), the entropy rises due to increased irregularity in the signal. The irregularity is even more pronounced if we modify the value on vertex v_2 from x_2 = −1.3 to x_2 = 1.3 (refer to Fig. 5c), because the magnitude of the change is substantially larger. This effect is further magnified if we raise the vertex with the lowest signal value, v_1, to a higher value (changing x_1 = −1.5 to x_1 = 1.5, as depicted in Fig. 5d). A sharp increase in entropy is also observed when we change the signal values on two vertices, as demonstrated in Fig. 5e. This underscores the sensitivity of the entropy measure to variations in graph signals. Lastly, Fig. 5b shows the resulting entropy value when we alter a single signal value belonging to the red group (positive values); in this case, x_6 is modified from 0.5 to −0.5. The ability of DE_G to distinguish these modifications and appropriately quantify the increase in entropy highlights its effectiveness in tracking changes in the dynamics of graph signals, even within the same underlying graph structure.

Fig. 4. All graphs have the same set of vertices, so we can consider the same graph signal on each of them; the underlying graph is different, yet the dynamics are similar in all cases (two separated groups, red and blue, joined by one and only one edge).

Fig. 5.
To assess the dynamics of similar graph signals with respect to the underlying graph topology, we maintained the same underlying graph as depicted in Fig. 1 for all scenarios.We also considered the same signal as that in Fig. 1, which has an entropy DE G = 0.4434, with the only difference being that we altered the sign of some vertices.This exercise allows us to examine the behaviour of similar signals on the same graph.The original graph signal is illustrated in grey.

Dispersion entropy for directed graphs
The DE_G algorithm provides a tool for analysing signals on undirected graphs, and can be extended to directed graphs with minor modifications (see Appendix B). Additionally, the algorithm can be applied to any graph signal, but for time series it produces the same values as the classical DE [11]; this is established in Proposition 1, whose proof is given in Appendix C.

MIX Process and DE_G
In this section, our objective is to demonstrate the detection of irregularities, not only in simple graphs as illustrated in Section 3.3, but also in more complex structures such as Random Geometric Graphs, and when dealing with more complex signals as is the case of the MIX process.

Random geometric graph
The structure of a random geometric graph (RGG) is significantly influenced by the proximity parameter r ≥ 0. Two vertices v_i, v_j ∈ V are connected by an edge if and only if the Euclidean distance between their coordinates is less than or equal to r, i.e., ‖p_i − p_j‖ ≤ r.
For larger values of r, more edges are formed, as the condition is more likely to be satisfied, leading to a denser and more connected graph. Conversely, smaller values of r impose a more stringent condition for edge creation, resulting in sparser and more disconnected graphs. Hence, r acts as a tunable parameter controlling the sparsity and connectivity of the resulting RGG (see [36]).
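A small NumPy sketch of RGG construction (illustrative; vertices sampled uniformly in the unit square):

```python
import numpy as np

def random_geometric_graph(N, r, dim=2, seed=0):
    """Adjacency matrix of an RGG: vertices uniform in [0,1]^dim, edge iff distance <= r."""
    rng = np.random.default_rng(seed)
    P = rng.random((N, dim))                       # vertex coordinates p_i
    dist = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
    A = ((dist <= r) & (dist > 0)).astype(float)   # no self-loops
    return A, P

A_sparse, _ = random_geometric_graph(200, 0.05)
A_dense, _ = random_geometric_graph(200, 0.30)
print(A_sparse.sum() / 2, "edges vs", A_dense.sum() / 2)  # denser graph for larger r
```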

MIX process
We introduce a graph-based stochastic process MIX_p defined on RGGs to assess the performance of DE_G in capturing complex signal dynamics.
First, we define MIX_p^R : ℝ^n → ℝ, given by
MIX_p(t) = (1 − Z) S(t) + Z W, (5)
where Z : ℝ^n → {0, 1} is a random variable taking the value 1 with probability p and the value 0 with probability 1 − p, W is uniformly distributed white noise with amplitude a, and S(t) = Σ_{i=1}^{n} sin(2πF t_i). Observe that the function S is determined by the function sin(2πFt), with period 1/F and frequency F. Hence, MIX_p^R is uniquely determined by the dimension n, the amplitude a of the noise and the frequency F.¹ The selection of parameters n = 1, F = 1/12 Hz, a = √3 and equidistant sampling was proposed by [7], and F = 5 Hz in [35]. The choice was made such that MIX processes in time series could not be differentiated based on their sample means and standard deviations. The same parameter selection, except that n = 2, i.e. for images, is studied in [29].
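Under this reading of Eq. (5), the process can be sketched as follows (illustrative; the uniform noise on [−a, a] and the sum-of-sines form of S are our assumptions based on the description above):

```python
import numpy as np

def mix_signal(T, p=0.2, a=np.sqrt(3), F=1/12, seed=0):
    """MIX_p evaluated at the points T (shape: n_points x n).

    With probability p a sample is replaced by uniform noise on [-a, a];
    otherwise it takes the value S(t) = sum_i sin(2*pi*F*t_i).
    """
    T = np.atleast_2d(np.asarray(T, dtype=float))
    rng = np.random.default_rng(seed)
    S = np.sin(2 * np.pi * F * T).sum(axis=1)   # deterministic sinusoidal component
    Z = rng.random(len(T)) < p                  # Bernoulli(p) noise mask
    W = rng.uniform(-a, a, size=len(T))         # uniform white noise
    return np.where(Z, W, S)

# 1-D case (n = 1): the classical MIX process on 300 equidistant samples.
t = np.arange(300).reshape(-1, 1)
print(mix_signal(t, p=0.0).std(), mix_signal(t, p=0.4).std())
```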

MIX process on RGGs
Let G be an n-dimensional RGG with N vertices V = {v_1, v_2, v_3, …, v_N}, where each vertex v_i has coordinates p_i = (p_i^1, …, p_i^n) ∈ [0, 1]^n. The graph signal MIX_p is defined as the restriction of MIX_p^R (Eq. (5)) to the graph domain, i.e.,
MIX_p(v_i) = MIX_p^R(p_i). (6)
Similarly to the general process MIX_p^R (Section 4.2), MIX_p is defined for the graph G and determined by the probability p, the noise amplitude a, the dimension n and the frequency F.² The construction of an n-dimensional RGG requires selecting two parameters, N and r. The graph signal generated by the MIX_p process incorporates random noise (occurring with probability p) with different amplitudes (determined by a) into some values of the sinusoidal signal (determined by F). Finally, for the algorithm DE_G, and according to Table 1, we employ a fixed embedding dimension m = 3, the number of classes set at c = 3, delay time L = 1, and the NCDF as the nonlinear map (similar results are obtained for other nonlinear mappings and values of m, c and L).

¹ In this paper, we use the term ''frequency'' to describe the parameter F within the MIX process. However, it is important to note that F is essentially the frequency of the sine function that constitutes the main component of the MIX process. Although the MIX process possesses periodic properties with period 1/F, it would be technically incorrect to label F as the frequency of the MIX process itself. This convention can be traced back to the case n = 1, wherein F serves as the frequency of both the sine function and the MIX process. Moreover, we have adopted seconds as the units of the domain, thus defining F in Hertz, in alignment with the frequency of the sine function rather than the MIX process. Despite these detailed clarifications, such a convention does not impact the interpretation of our results. The primary objective remains to employ the MIX process as a benchmark to validate the effectiveness of DE_G in diverse applications.
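The restriction in Eq. (6) amounts to evaluating MIX_p^R at the vertex coordinates, e.g. (same assumptions about the noise and sinusoidal component as above; coordinates drawn as in an RGG):

```python
import numpy as np

rng = np.random.default_rng(1)
N, p, a, F = 500, 0.1, np.sqrt(3), 1.0
P = rng.random((N, 2))                             # RGG vertex coordinates in [0,1]^2

S = np.sin(2 * np.pi * F * P).sum(axis=1)          # sinusoidal component S(v_i)
Z = rng.random(N) < p                              # noisy vertices, Bernoulli(p)
mix = np.where(Z, rng.uniform(-a, a, size=N), S)   # graph signal MIX_p(v_i)
print(mix.shape)                                   # one value per vertex: (500,)
```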
Our algorithm, DE_G, detects changes in the frequency of the signal (increasing F), the presence of white noise (increasing p), and the graph connectivity (increasing r) through increasing entropy values, while remaining robust to the amplitude a of the noise. Fig. 6 illustrates the effectiveness of DE_G in detecting the dynamics of the MIX_p process.
Analysis of the dimension n. For clarity and simplicity in our discussion, we represent the RGG in a 2-dimensional space, i.e., n = 2. Hence, each v_i corresponds to p_i = (p_i^1, p_i^2), and S(v_i) = sin(2πF p_i^1) + sin(2πF p_i^2) for all 1 ≤ i ≤ N. It is crucial to note that our discussion's mathematical principles and results do not depend on this choice of the dimension n, which was made for better visualization; the analysis is valid for any higher dimension but would require more complex figures.
Impact of the frequency F and probability p on the construction of the graph signal MIX_p. We analyse the impact of different parameter values on the irregularity of the graph signal MIX_p by fixing the underlying RGG, with constant N = 1500 and r = 0.06.
Increasing the frequency F of the MIX_p process results in a more irregular graph signal. The frequencies F = 1 Hz and F = 2 Hz of the sine function in Eq. (6) are depicted in Fig. 6(a)-(b). This increase in frequency produces more variation in the graph signal values between neighbouring vertices. Our algorithm DE_G detects these dynamics by increasing the entropy values. Similarly, an increase in the randomness parameter p results in a more random signal. The parameters p = 0 and p = 0.2 in Eq. (6) are depicted in Fig. 6(a), (c). The DE_G algorithm detects the change in randomness by increasing the entropy values.
More generally, we compute the entropy values for a range of frequencies from 3/4 Hz to 8 Hz, as well as for different levels of noise.²

² As with the description in the previous section, the term ''frequency'' F primarily represents the frequency of the sine functions that define the MIX_p process, not the frequency of the graph signal MIX_p. It is worth mentioning that common terms such as frequency, sampling and filtering, usually associated with the Fourier Transform, lack a universally agreed-upon definition in the Graph Signal Processing domain.

Robustness to the amplitude a. The performance of the DE_G algorithm is robust to variations in the amplitude a of the white noise, as defined in Eq. (6). This robustness was observed across a variety of experiments (0 ≤ a ≤ 5) and persisted despite changes to the characteristics of the graph signal dynamics. It can be attributed to the mapping function in step 2 of the DE_G algorithm: although the introduction of noise changes the values of the embedding matrix, a significant proportion of these altered values are mapped to the same class number. Consequently, the overall distributions of dispersion patterns and, by extension, the entropy values remain largely similar. This robustness to changes in noise amplitude is a key feature of the DE_G algorithm's performance, and indicates that the graph version, DE_G, maintains this property of the original DE for univariate time series [11] and images [12].

The spectrum of the Laplacian and DE_G
Let x be a graph signal; the smoothness of x is given by the quadratic form x^T L x [16]. We examine the relationship between DE_G and the spectrum of L acting on RGGs (similar results are obtained for other random graphs).
Let G be an RGG with N = 1500 vertices. The eigenvalues of L and their corresponding eigenvectors are denoted by Λ = {λ_1 ≤ λ_2 ≤ ⋯ ≤ λ_N} and {u_i}_{i=1}^{N}, respectively. The smoothness of each eigenvector is evaluated and normalized based on the classical definition, i.e., λ_N^{−1} u_i^T L u_i. Each eigenvector u_i is then considered as a graph signal and DE_G is computed for m = 2, 3, 4 and c = 2; the results are depicted in Fig. 8. The smoothness is an increasing function of the eigenvalue, i.e., smaller eigenvalues correspond to smoother eigenvectors (also known as graph Fourier modes [37]). Such information is limited, especially when eigenfunctions associated with equal eigenvalues (and hence equal smoothness) exhibit different levels of irregularity. By applying the DE_G algorithm, we can better understand and analyse the dynamics of these eigenfunctions.
The dispersion entropy computed for different values of m and c enables us to capture abrupt changes in entropy values as the dynamics of the eigenfunctions change. Fig. 9 depicts six eigenvectors {u_i}_{i=527}^{532} corresponding to the eigenvalues {λ_i}_{i=527}^{532}. The smoothness of u_i coincides with the value λ_i; the eigenvalue λ_528 = 15 has multiplicity four, and its eigenfunctions {u_i}_{i=528}^{531} exhibit regular behaviour, while u_527 and u_532 are more irregular. Hence, classical definitions are not able to fully capture the difference in dynamics within the graph signals; in contrast, the DE_G algorithm is capable of detecting them. In particular, the entropy value of the eigenfunctions is close to 0 if the signal exhibits more regular dynamics and close to 1 for the most irregular eigenfunctions. Thus, DE_G detects eigenvalues with high multiplicity, which is useful for the construction of isospectral graphs [38].
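The smoothness values used in this comparison follow directly from the eigendecomposition of L (a small-scale sketch on a cycle graph, not the N = 1500 RGG of the text; cycle Laplacians also exhibit the eigenvalue multiplicities discussed above):

```python
import numpy as np

# Combinatorial Laplacian of a cycle graph on N vertices.
N = 20
A = np.roll(np.eye(N), 1, axis=1) + np.roll(np.eye(N), -1, axis=1)
L = np.diag(A.sum(axis=1)) - A

lam, U = np.linalg.eigh(L)                  # lam sorted ascending, U orthonormal
# Smoothness of each eigenvector u_i, normalized by the largest eigenvalue:
smooth = np.array([u @ L @ u for u in U.T]) / lam[-1]

# For unit eigenvectors, u_i^T L u_i = lambda_i: smoothness grows with the index i,
# and repeated eigenvalues yield eigenvectors with identical smoothness.
print(np.allclose(smooth, lam / lam[-1]))   # True
```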

Small-world networks and DE_G
We evaluate the performance of DE_G in detecting the dynamics of signals defined on small-world networks, generated by the Watts-Strogatz model [26], while changing the mean degree k and the rewiring probability p. Let G be a small-world network with N = 1500 vertices, on which we define various graph signals: a random signal, a recurrence relation (logistic map [2]), a stochastic process (Wiener process [39]), and a periodic signal (sine).
Fixing k, changing p. By fixing k = 1, we analyse the effect of the parameter p (ranging from 0 to 1) on the construction of the network and the entropy values. We compute DE_G for each graph signal over 20 realizations, and the mean and standard deviation are depicted in Fig. 10(a). For p = 0, the underlying graph is a cycle of N vertices. A path graph is a geometric perturbation of a cycle [40], and due to Proposition 1, we can consider the values at p = 0 as the classical DE. The classical DE is able to detect the dynamics of various signals, but its computation does not involve the topological structure, so it applies only to the path graph. In contrast, DE_G takes into account not only the signal information but also the graph structure. In this setting, the entropy of the random signal is almost constant, because it is not affected by the topology. The Wiener process and sine signals exhibit lower entropy values for p = 0 (i.e., the cycle), as their dynamics stem from either periodicity (sine) or a stochastic process (Wiener). However, as p increases, the underlying graph becomes more random, and hence the entropy values also increase. In any case, DE_G is still able to distinguish the random signal from the periodic signal and the Wiener process (for all p < 0.8). Two logistic map signals are generated, one with oscillatory behaviour (r = 3.3) and one with chaotic behaviour (r = 3.7); these characteristics are well detected by DE_G for all values of p.
Fixing p, changing k. By fixing p = 0.05, we vary the mean degree k of the underlying graph from 1 to 6, thereby increasing the connectivity. In Fig. 10(b), we present the entropy values for each graph signal. The entropy values for the sine and Wiener signals remain almost constant, independent of k, due to their periodic and stochastic dynamics. However, the logistic map exhibits a higher degree of variability in its entropy values as k increases. This is because the logistic map is defined by a recurrence formula, where each value depends only on the previous sample; as k increases, the underlying graph has more connections between neighbourhoods, which may disrupt the recurrence relation, generating more irregular signals and resulting in higher entropy values. Conversely, the random signal shows a reduction in entropy values as k increases, as the creation of more connections leads to a more robust average value due to the law of large numbers.
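These experiments can be sketched as follows (a minimal Watts-Strogatz generator and the signal families; the chaotic logistic-map parameter r = 3.7 follows the text, while signal lengths and seeds are illustrative):

```python
import numpy as np

def watts_strogatz(N, k, p, seed=0):
    """Adjacency matrix of a ring lattice (k neighbours per side), each edge rewired with prob p."""
    rng = np.random.default_rng(seed)
    A = np.zeros((N, N))
    for i in range(N):
        for j in range(1, k + 1):
            t = (i + j) % N
            if rng.random() < p:                  # rewire edge (i, t) to a random target
                choices = [v for v in range(N) if v != i and A[i, v] == 0]
                t = rng.choice(choices)
            A[i, t] = A[t, i] = 1
    return A

def logistic_map(r, N, x0=0.4):
    """Recurrence x_{n+1} = r * x_n * (1 - x_n)."""
    x = np.empty(N)
    x[0] = x0
    for i in range(1, N):
        x[i] = r * x[i - 1] * (1 - x[i - 1])
    return x

N = 200
rng = np.random.default_rng(0)
signals = {
    "random": rng.normal(size=N),
    "wiener": np.cumsum(rng.normal(size=N)),        # Wiener process (random walk)
    "sine": np.sin(2 * np.pi * np.arange(N) / 25),  # periodic signal
    "logistic_chaotic": logistic_map(3.7, N),       # chaotic recurrence
}
A = watts_strogatz(N, k=1, p=0.05)
print(A.sum(axis=1).mean())                         # mean degree close to 2k
```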

Graph Centrality Measures and DE_G
Each centrality measure can be considered as a graph signal, allowing the application of the DE G algorithm to assess the irregularity of centrality measures on real and synthetic graphs (refer to Table 2).
We used six centrality measures as graph signals, namely [30,31]: eigenvector centrality, betweenness, closeness, harmonic centrality, degree and PageRank. The DE_G algorithm leverages the graph topology to effectively detect the irregularities generated by each centrality measure, as demonstrated in Fig. 11. In particular, eigenvector centrality produces smooth signals [21] in most graphs, and this is reflected in low entropy values. Well-connected vertices tend to appear on the shortest paths between other vertices; when the graph has only a few such vertices, the entropy of the betweenness measure is lower. In cases where the graph has a more irregular distribution of vertices with this characteristic (e.g., in the sphere, due to its symmetry), the entropy values are higher. A similar effect occurs when considering the average length of the shortest path between a vertex and all other vertices, as captured by the closeness measure. Finally, the degree and PageRank measures produce more irregular graph signals, because each signal value depends only on local properties (the degree, or the number and importance of the vertices connected to it) rather than global properties (such as average path lengths between vertices in the previous measures).
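For instance, two of these centrality measures can be computed directly from the adjacency matrix and then treated as graph signals (an illustrative sketch on a hypothetical star graph; eigenvector centrality via shifted power iteration):

```python
import numpy as np

def eigenvector_centrality(A, iters=200):
    """Leading eigenvector of A, via power iteration on A + I (the shift
    guarantees convergence even on bipartite graphs, without changing the eigenvector)."""
    v = np.ones(len(A))
    M = A + np.eye(len(A))
    for _ in range(iters):
        v = M @ v
        v /= np.linalg.norm(v)
    return v

# Star graph: vertex 0 (the hub) connected to all others.
N = 10
A = np.zeros((N, N))
A[0, 1:] = A[1:, 0] = 1

degree_signal = A.sum(axis=1)            # degree centrality as a graph signal
eig_signal = eigenvector_centrality(A)   # a smooth, low-entropy signal on most graphs
print(degree_signal)                     # hub has degree 9, leaves degree 1
```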

Comparing the performance of DE_G and other entropy metrics
In Section 3.4, we establish an equivalence between DE and DE_G for directed paths. In this context, we extend our discussion to a comparison of the performance and computational time of DE_G with other entropy metrics. The Permutation Entropy for Graph Signals, denoted PE_G [19], marked the first entropy metric specifically designed for graph-based data analysis. Both methods rely on the adjacency matrix, but PE_G primarily focuses on the order of amplitude values (local properties), which might result in the loss of valuable information regarding the amplitudes (global properties). DE_G addresses these limitations by providing a more comprehensive way to characterize the dynamics of graph signals. We conducted the same previous analysis with PE_G (see Appendix D) and found that DE_G consistently outperforms PE_G in all cases, highlighting the potential of our novel method for effectively analysing graph signal irregularities.

Computational cost
The computational cost of DE G was compared to that of PE G using a 2-dimensional grid graph of size n × n with a MIX process defined on it. The entropy values were computed for graphs containing n² vertices, with the number of vertices varying from 10² to 150². The results are displayed in Fig. 12. The computational times for the two metrics PE G and DE G (on the same underlying graphs) were almost identical, with a minor increase for DE G due to the mapping computation in Step 3 of the DE G algorithm (Section 3.1).
The MIX process signal, defined on a grid, was treated as an image in order to apply the Dispersion Entropy for Images (DE 2 ) presented in [12]. The computational time for this setting was also evaluated (see Fig. 12). DE G was more computationally demanding than DE 2 , which was anticipated, since DE G makes no prior assumptions about the structure of the signal's domain.
Notably, the computational time of DE G increases not only with the image size (for an image of size n × n, we must compute powers of an adjacency matrix of size n² × n²) but also with the graph's connectivity. When a path graph was considered instead of the grid, the computational time was almost unchanged, because the edge count grows linearly with the number of vertices (n²). However, for a complete graph on n² vertices, the edge count grows quadratically with the number of vertices; consequently, the number of non-zero entries in the adjacency matrix also grows quadratically, leading to increased computational costs (see Fig. 12).
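These edge-count scalings follow from closed-form formulas; a quick sketch (assuming a path, an n × n grid, and a complete graph, each on n² vertices) makes the linear-versus-quadratic growth concrete:

```python
# Sketch: number of edges (non-zero adjacency entries, up to symmetry)
# for three graph families on v = n^2 vertices. Path and grid grow
# linearly in the vertex count; the complete graph grows quadratically.

def edge_counts(n):
    v = n * n                    # number of vertices
    path = v - 1                 # path graph on v vertices
    grid = 2 * n * (n - 1)       # n x n grid: horizontal + vertical edges
    complete = v * (v - 1) // 2  # complete graph on v vertices
    return path, grid, complete

for n in (10, 100):
    print(n, edge_counts(n))
```

Going from n = 10 to n = 100 multiplies the path and grid edge counts by roughly 100 (matching the 100-fold increase in vertices), but multiplies the complete graph's edge count by roughly 10,000.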
The DE 2 algorithm does not take the underlying graph topology into account and therefore yields identical results for the path, grid, and complete graphs. However, this algorithm is unsuitable for irregular domains. Thus, while DE 2 is faster for regular images or time series, our algorithm is applicable to any domain and yields similar results for images. We hypothesize that efficient implementations of DE G could be developed for special types of graphs by exploiting the graph's symmetries or a periodic adjacency matrix. However, these optimizations are beyond the scope of this study, which reports the cost of the general algorithm for images under various graph settings.
Simulations were conducted on a PC with an Intel(R) Core(TM) i7 using MATLAB R2023b.

Conclusions
We have introduced Dispersion Entropy for Graph Signals (DE G ), a method that enhances the analysis of irregularities in graph signals. Our approach generalizes classical dispersion entropy, enabling its application to a wide array of domains, including real-world graphs and directed, undirected, and weighted graphs, and unveiling novel relationships between graph signals and graph-theoretic concepts (e.g., eigenvalues and centrality measures). Notably, DE G extends the dispersion entropy concept beyond univariate time series to multivariate time series and images. By overcoming the limitations of the classical smoothness definition, DE G offers a more comprehensive approach to analysing graph signals and holds significant potential for further research and practical applications, as it effectively captures the complex dynamics of signals across diverse topology configurations.

Declaration of competing interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:

Random graphs and PE G
The PE G algorithm is not able to detect increases in signal irregularity (due to frequency increments) and cannot differentiate between distinct levels of irregularity in the MIX signal based on its parameter. To achieve accurate characterizations, it is necessary to increase the embedding dimension beyond 2, and even then the behaviour is not monotonic, whereas DE G performs well with smaller embedding dimensions.

Fig. 1 .
Fig. 1. Representation of a simple undirected graph, its adjacency matrix, and associated graph signal. Left: a simple undirected graph with vertices linked by edges; the graph signal is shown by vertical lines, each representing the value of the function at the corresponding vertex. Centre: the adjacency matrix A, an N × N symmetric matrix showcasing the connectivity between vertices. Right: the vectorized form of the graph signal, translating the function f : V → R into an N-dimensional vector.

Proposition 1 (
Equivalence of DE and DE G for Time Series). Let X = {x_i}_{i=1}^{N} be a time series and let P_N be the directed path on N vertices. Then, for all m, c and d, the following equality holds: DE(X, m, c, d) = DE_{P_N}(X, m, c, d).

Fig. 6 .
Fig. 6. Examples of RGGs with N = 1,500 vertices and radius values r = 0.06 and r = 0.10. The graph signals are generated by the MIX process with different parameter values.

Fig. 7 .
Fig. 7. Entropy values (a) for a fixed graph, increasing the noise and for several frequencies, and (b) when the underlying graph is more connected.

Fig. 8 .
Fig. 8. Entropy values of DE G and smoothness based on the Laplacian for the eigenvalues as graph signals.

Fig. 10 .
Fig. 10. Entropy values for different signals defined on a small-world network generated by the Watts-Strogatz model.

Fig. 12 .
Fig. 12. Computational time of DE G, PE G, and DE 2 with varying numbers of nodes for grid, path, and complete graphs.

Fig. D. 13 .
Fig. D.13. Entropy values using PE G (a) for a fixed graph, increasing the noise and for several frequencies, and (b) when the underlying graph is more connected.

Fig. D. 14 .
Fig. D.14. Entropy values of PE G and smoothness based on the Laplacian for the eigenvalues as graph signals.

Fig. D. 15 .
Fig. D.15. Entropy values of PE G for different signals defined on a small-world network generated by the Watts-Strogatz model.

Table 1
Summary of parameters used in DE G algorithm.
Fig. 2. Illustration of the steps involved in the DE G algorithm.
Fig. 3. All graphs have the same set of vertices, so we can consider the same graph signal on each of them, but the underlying graph is different. The first group consists of vertices with positive values (v1, v2, v3, v4), all of which are interconnected; the second group contains vertices with negative values (v5, v6, v7, v8), which also exhibit interconnections. The only cross-group connection exists between v4 and v5, thereby introducing irregularity into the signal.

Table 2
Overview of the graph structures used, including the number of vertices and edges.
John Stewart Fabila-Carrasco reports financial support was provided by the Leverhulme Trust via a Research Project Grant (RPG-2020-158). Javier Escudero reports financial support was provided by the Leverhulme Trust via a Research Project Grant (RPG-2020-158). John Stewart Fabila-Carrasco reports a relationship with The University of Edinburgh that includes: employment. Javier Escudero reports a relationship with The University of Edinburgh that includes: employment. Chao Tan reports a relationship with Tianjin University that includes: employment. For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising from this submission. Elsevier and Jisc have established an agreement to support authors in the United Kingdom who wish to publish open access. The University of Edinburgh is covered by this agreement.