Hyperharmonic analysis for the study of high-order information-theoretic signals

Network representations often cannot fully account for the structural richness of complex systems spanning multiple levels of organisation. Recently proposed high-order information-theoretic signals are well-suited to capture synergistic phenomena that transcend pairwise interactions; however, the exponential-growth of their cardinality severely hinders their applicability. In this work, we combine methods from harmonic analysis and combinatorial topology to construct efficient representations of high-order information-theoretic signals. The core of our method is the diagonalisation of a discrete version of the Laplace–de Rham operator, that geometrically encodes structural properties of the system. We capitalise on these ideas by developing a complete workflow for the construction of hyperharmonic representations of high-order signals, which is applicable to a wide range of scenarios.


Introduction
The principle of representing interdependencies as networks has revolutionised complexity science by introducing a systematic approach to gain insight into the inner structure of a wide range of complex systems (Newman, 2018).These networks provided a lingua franca to describe the properties of interdependencies found in chemical, biological, social, and technological systems (Vasiliauskaite and Rosas, 2020), and have enabled great advances in a widening range of areas including computational neuroscience (Rubinov and Sporns, 2010), human evolution (Donges et al., 2011), financial analysis (Bonanno et al., 2004), and epidemic spreading (Pastor-Satorras and Vespignani, 2001), just to name a few.However, by their very nature these methods focus on the analysis of pairwise interactions, and hence are prone to miss important high-order synergistic phenomena that are a hallmark of many complex systems.
This critical limitation of traditional network analyses has been acknowledged by a number of recent research efforts that focus on developing techniques to study highorder interactions (Petri et al., 2014;Iacopini et al., 2019;Battiston et al., 2020).These developments have provided novel techniques capable of, for example, detecting nonlocal structures (Petri et al., 2013), highlighting the role of inhomogeneities in functional connections (Petri et al., 2014), and characterising discontinuous transitions (Iacopini et al., 2019).However, it is important to notice that many of these approaches are based on hypergraphs and other high-order structures build solely from pairwise statistics, and hence their scope remains limited.
An attractive set of tools to further extend the reach of high-order analyses can be found in a parallel body of work, which originated in efforts to develop tools to capture high-order statistical phenomena related to the brain (Tononi et al., 1994;Schneidman et al., 2003a;Latham and Nirenberg, 2005;Ganmor et al., 2011).Particularly interesting are multivariate extensions of Shannon's mutual information, including the Interaction Information (McGill, 1954), Total Correlation (Watanabe, 1960), and Dual Total Correlation (Han, 1978), which can be used to gain insights about the high-order structure exhibited by groups of three or more interdependent variables (see e.g.Timme et al. (2014); Baudot et al. (2019)).In this work we focus on the recently proposed O-information (Rosas et al., 2019), which is a principled tool to identify synergydominated systems, and has have found to be relevant for analysing various complex systems -including the study of neural spiking data (Stramaglia et al., 2020) and aging in fMRI data (Gatica et al., 2020).
When applied to large systems, metrics such as the O-information are naturally represented as high-order signals whose domain is the set of all hyper-edges on a regular hypergraph.Because the cardinality of these hypegraphs grows super-exponentially with the system size, a key open problem is to find efficient ways to represent the content of these signals.While simple approaches such as computing the average of the signal across dimensions can be effective (Gatica et al., 2020), an important challenge is to find principled ways to generate low-dimensional representations of these signals while preserving their intrinsic relational structure across the hypergraph.A popular technique to study similar issues in the case of traditional (weighted) networks is to transform the signals into a basis of eigenvectors of the graph Laplacian (Belkin and Niyogi, 2003), which has been used with great success in the context of graph signal processing (see e.g.Shuman et al. (2013); Sandryhaila and Moura (2014); Atasoy et al. (2016Atasoy et al. ( , 2017)); Expert et al. (2017)).Related methods based on harmonic analysis and combinatorial topology, which we refer to as hyperharmonic analysis, have been recently developed to study high-order signals (see e.g.Barbarossa and Sardellitti (2020)); however, they have not yet -to the best of our knowledge -been used to analyse high-order information-theoretic signals.Other aspects of the relationship between highorder information-theoretic quantities and topology has been explored in Baudot and Bennequin (2015); Baudot (2019).
This article establishes a bridge between the domains of high-order informationtheory and hyperharmonic analysis, and introduces a complete workflow for the study of high-order information-theoretic signals using the hyperharmonic modes of a structural simplicial complex.Our choice of hyperharmonic modes is based on a discrete version of the Laplace-de Rham operator that geometrically encodes the strength of low-order interactions.Our workflow makes no assumptions about the structure of the data, and hence can be applied to a broad range of scenarios.As a proof of concept, we illustrate our approach analysing the musical scores of the latter symphonies written by F.J. Hadyn, were our results demonstrate the far superior dimensionality-reduction capabilities of our method compared to other representations.
The rest of the article is structured as follows.First, Section 2 introduces key notions from multivariate information theory, reviewing the state-of-the-art of highorder metrics.Then, Section 3 presents fundamental notions from combinatorial topology, required to define the Fourier transform of higher-dimensional signal.Section 4 introduces our proposed workflow, and Section 5 illustrates the method on Hadyn's symphonies.Finally, Section 6 summarizes our conclusions and discusses future work.

High order information-theoretic measures
Let us consider a scientist studying a complex system, whose state is described by the vector X N = (X 0 , . . ., X N ).Let us assume that the scientist has enough data to allow for the construction of a reliable statistical description of its joint probability, which is denoted by p(X 0 , . . ., X N ).The goal of this study is to leverage the statistics encoded in p in order to understand the structure of interdependencies that characterize X N .This endeavor can lead either to build statistical markers to classify different systems -or different states of the same system, or to build parallels between seemingly heterogeneous systems based on the similarity of their internal structure.
Through this section, random variables are denoted by capital letters (e.g.X, Y ) and their realisations by lower case letters (e.g.x, y).Random vectors and their realisations are denoted by capital and lower case boldface letters, respectively.

Networks and hypergraphs based on pairwise statistics
A popular way to analyze the interactions within X N is to represent them as networks, where each variable X 0 , . . ., X N is represented as a node, and edges between nodes represent the strength of their interaction.A simple way to build such a network is to calculate the correlation matrix R X N := [r i,j ] with components given by (1) and use it as an adjacency matrix -either binarising its components via a threshold, or considering weighted edges.This construction, however, only captures linear relationships between the variables.More encompassing analyses often consider nonlinear measures of dependency such as Shannon's mutual information and focus on the matrix I X N := I(X i ; X j ) of mutual information terms, which are computed as (2)

High-order statistical effects
It is important to realise that the joint probability distribution p(X N ) may contain substantial information that is not assessed by any of the pairwise marginals p(X i , X j ).
An elegant way of gaining insights into this is provided by information geometry, as presented in Amari (2001) (for related discussions, see also Rosas et al. (2016)).Consider the k-marginals that are obtained by marginalising p(X N ) over N − k variables.One can see that the k-marginals provide a more detailed description of the system than the (k − 1)-marginals by noting that the latter can be directly computed from the former by marginalising the corresponding variables.In contrast, the process of marginalising involves irreversible information loss, as there are many k − 1 marginals that are consistent with a given a set of k-marginals.
Perhaps the simplest example of high-order statistical dependency is given by the exclusive-or (xor) logic gate.To see this, consider two independent fair coins X 0 and X 1 , and let A quick calculation shows that the mutual information matrix I X 2 of (X 0 , X 1 , X 2 ) has all its off-diagonal elements equal to zero, making it indistinguishable from an alternative situation where X 2 is just another independent fair coin.Therefore, put simply, any "xor-like" effects is completely ignored by constructions based only on pairwise statistics -either networks or hypergraphs.While commonly neglected, high-order statistics has been recently proven to be instrumental in a number of systems.For example, synergies have been shown to play a key role in distributed interdependent systems, including the most complex types of elementary cellular automata (Rosas et al., 2018).It has also been suggested that highorder interactions could drive thermalisation processes within closed systems (Lindgren and Olbrich, 2017).Additionally, high-order interactions have been argued to play a key role in neural information processing (Wibral et al., 2017) and high-order brain functions (Tononi et al., 1994), being at the core of popular metrics employed in consciousness science (Mediano et al., 2019).Furthermore, it has been recently shown that high-order statistics also play a crucial role in enabling emergent phenomena (Rosas et al., 2020a).

O-information
Shannon's mutual information is limited to capture the dependencies of two groups of variables, but cannot directly assess triple or higher interactions.The most popular non-negative multivariate extensions of the mutual information are the Total Correlation (TC) (Watanabe, 1960) and the Dual Total Correlation (DTC) (Han, 1978), which are defined as is the conditional Shannon entropy, and X N −i is the vector of all variables except X i (i.e., X N −i = (X 0 , . . ., X i−1 , X i+1 , . . ., X N )).Importantly, both TC and DTC are zero if and only if all variables X 0 , . . ., X N are jointly statistically independent -i.e. if p(X N ) = N i=0 p(X i ).Unfortunately, both TC and DTC provide metrics for high-order interdependency which are difficult to analyse together.A recent approach, introduced in Rosas et al. ( 2019), proposes to employ a linear transform over these two metrics to obtain the O-information and the S-information, which have more intuitive interpretations. (4) Similarly, their S-information is defined as The O-information can be seen as a revision of the measure of neural complexity proposed by Tononi, Sporns and Edelman in Tononi et al. (1994), which provides a mathematical construction that is closer to their original desiderata (Rosas et al., 2019).In effect, the O-information is a signed metric that satisfies the following key properties: (1) It is zero for systems with only pairwise interdependencies.
(2) It is additive over non-interactive subsystems.
Hence, Ω(X N ) < 0 implies a predominance of statistical synergy within the system X N .Conversely, Ω(X N ) > 0 implies that the system X N is redundancy-dominated.
On the other hand, the S-information is an over-encompassing account of interdependencies taking place at all orders, being sometimes described as a "very mutual information" (James et al., 2011).In fact, a quick calculation shows that the S-information can be decomposed as Σ(X N ) = N −1 i=0 I(X i ; X N −i ), which can be seen as a chain rule where the interdependencies involving each variable are sequentially addressed.
In summary, while the TC and DTC provide alternative representations to the same construct, Ω and Σ provide a complementary account of the system: the latter addressing the overall strength of interdependencies, and the former qualitatively characterising their dominant nature.

Other metrics of high-order effects
Another popular metric of high-order interdependencies in the Interaction Information, first introduced in (McGill, 1954) for systems with three variables. 1Building on an application of the inclusion-exclusion principle to entropies, the Interaction Information of X n is a signed metric given by where the sum is performed over all subsets γ ⊆ {0, . . ., N }, with |γ| the cardinality of γ, and X γ the vector of all variables with indices in γ.While this measure has a direct interpretation as redundancy minus synergy for N = 2, it no longer reflects this balance for larger system sizes (Williams and Beer, 2010, Section V).However, the Interaction Information can still be interpreted for arbitrary N under a topological formulation of information, as described in Baudot and Bennequin (2015); Baudot et al. (2019).

Hyperharmonic analysis
This section, subdivided into three parts, introduces the basic concepts from combinatorial topology that we used to decompose high-order signals into hyperharmonic modes.First, Section 3.1 describes the objects over which signals will be considered; these are higher-dimensional versions of weighted graphs known as weighted simplicial complexes.Then, Section 3.2 introduces the algebraic structure used to model high-order signals on weighted simplicial complexes; it consists of a family of inner product spaces, one for each dimension, and a pair of canonical linear maps between adjacent spaces.Finally, Section 3.3 introduces the discrete analogue of the Laplace-de Rham operator, and defines the Fourier basis using a maximal set of linearly independent eigenvectors of this operator.Throughout the presentation, we provide references to more general treatments and original sources when possible.

Simplicial complexes
A hypergraph S = (V, E) is determined by a set of vertices V = {0, . . ., N } with N ∈ N and a set of hyper-edges E ⊆ P(V ), where P(V ) is the power set of V .Furthermore, a hypergraph S is said to be a simplicial complex if it satisfies two conditions: (1) all singletons {k} with k ∈ V are included in E, and (2) if σ ∈ E and ρ is a subset of σ, then ρ ∈ E.
Please note that the passage to simplicial complexes does not restrict the theory of hypergraphs significantly, since to every hypergraph one can assign a canonical simplicial complex by downward closure.Explicitly, this is the smallest simplicial complex that contains the hypergraph.For an introduction to graphs and hypergraphs, we refer to Berge and Minieka (1973); for a comprehensive introduction to hypergraphs in the context of complex system analysis see Johnson (2013).
If S = (V, E) is a simplicial complex, the elements of E are referred to as simplices; and their dimension is defined as one less than their cardinality (i.e. the number of elements they connect).This shift can be understood noting that the natural dimension of a point, corresponding to a singleton, is 0. The subset of E consisting of simplices of dimension n is denoted by S n .The elements of S n are typically called "n-simplices", with the 0-simplices and 1-simplices being informally called vertices and edges, respectively.If v 0 < • • • < v n are the vertices of a simplex, we denote this simplex by [v 0 , . . ., v n ].
Example 1 The simplicial complex ∆ N with vertices V = {0, . . ., N } and containing all possible simplices is referred to as the standard N -simplex.In our applications, the N + 1 vertices of the ∆ N will be associated with a set of random variables X 0 , . . ., X N .

Higher-dimensional signals
An n-dimensional signal on a simplicial complex S is a function α : S n → R, that is to say, an assignment of a real number to each n-simplex.We refer to n-signals with n ≥ 2 as high-order signals.
Example 2 Consider a collection of random variables X 0 , . . ., X N .For every n ∈ {2, . . ., N }, the high-order n-signals Ω and Σ on ∆ N are defined by For a simplicial complex S and n ∈ N, we denote by C n (S) the vector space generated by all the n-simplices of S, i.e., Please note that there is a natural bijection between the set of n-signals on S and C n (S) established by α This bijection is used implicitly when referring to the elements of C n (S) as n-signals. 2et us now consider the linear map ∂ n : C n (S) → C n−1 (S) defined on basis elements by with v i denoting the absence of v i from the simplex.One can visualise ∂ n geometrically in terms of the boundary of a basis element.Up to a sign, the basis elements appearing as summands on ∂ n [v 0 , . . ., v n ] are all simplices of dimension n − 1 contained in [v 0 , . . ., v n ].Furthermore, the sign is determined in terms of the orientations of the simplices, as illustrated in Figure 1.The maps ∂ n play a central role in algebraic topology holding a significant amount of topological information.In particular, the difference between the dimension of the kernel of ∂ n and the dimension of the image of ∂ n+1 is known as the n-Betti number of the simplicial complex, a powerful topological invariant generalising the Euler characteristic.For a systematic treatment of these ideas, please consult Hatcher (2002).
In this work we are not only interested in the topology of S, but also on the "geometric" structure encoded by weights assigned to its simplices.A weighted simplicial complex is a pair (S, w) where S is a simplicial complex and w = {w n : S n → R >0 } is a set of non-negative weight functions.In the following, a weighted standard simplex (see Example 1) is referred to as a structural simplex.
Example 3 We can build a structural simplex using a collection of random variables X 0 , . . ., X N by following Example 1, and defining the weights as follows.First, for each Subsequently, for an n-simplex [v 0 , . . ., v n ] with n > 1, its weight is defined as the mean value of the mutual information of all pairs of random variables in {X v 0 , . . ., X vn }.
Given a weighted simplicial complex (S, w), one can encode the structural information provided by w n (where the subscript denotes the dimension) as an inner product on the vector space of n-signals on S. Explicitly, for any pair of n-simplices we have Importantly, this inner product allow us to introduce the adjoint operator of ∂ n , which is denoted by δ n .That is to say, the operator δ n : C n (S) → C n+1 (S) is defined by the identity ∂ n+1 α, β w = α, δ n β w which holds for any α ∈ C n+1 (S) and β ∈ C n (S).Note that ∂ n does not depend on the weights w, but δ n does.

High-order Laplace operator and Fourier basis
The classical sine and cosine functions form the basis of the spectral representation of time-domain signals.In effect, classical Fourier analysis guarantees that a large class of functions are expressible in terms of linear combination of these functions.The coefficients of this linear combination are the Fourier coefficients, which correspond to a geometric projection of the original function over this new basis.Interestingly, numerous signals of practical relevance are more compactly represented in the spectral domain, and hence Fourier analysis is often employed as a principled way to perform dimensionality-reduction.For a more extensive exposition of these ideas, we refer the reader to Bracewell (1986).Importantly, sine and cosine functions are also eigenfunctions of the Laplace operator on the circle -a one-dimensional manifold.
Put differently, a spectral representation is equivalent to a diagonalisation of the Laplace operator.Mathematicians have generalized the Laplace operator to higher-dimensional manifolds via the Laplace-de Rham operators.In this more general context, functions on a smooth manifold lie at the bottom of a sequence of higher-dimensional objects on which the corresponding Laplace-de Rham operator acts.The eigenvectors of these operators play a central role in modern geometry -most notably through Hodge theory (Hodge and Atiyah, 1989).For an exposition of these ideas we refer the reader to Morita et al. (2001).
A discrete analogue of the higher-dimensional objects on which the Laplace-de Rham operators act is provided by high-order signals defined over a simplicial complex.In this correspondence, the Laplace-de Rham operators are represented by the discrete Laplace operators first introduced in Eckmann (1944).See also Horak and Jost (2013) where a weighted version of these is presented, and Parzanchevski and Rosenthal (2017) where the connection to random walks is explored.We now introduce the particular version of the discrete Laplace-de Rham operators that we use.
Definition 2 Let (S, w) be a weighted simplicial complex.The n-Laplace operator Notice that we are abusing notation by omitting w from the notation referencing the Laplace operators since, although ∂ n does not depend on w n , δ n and therefore ∆ n do.For the interested reader, we remark that this Definition 2 has a strong resemblance to the form the Laplace-de Rham operator adopts in Rimmanian geometry.In effect, by denoting by d the exterior derivative of differential forms and d * its adjoint, the Laplace-de Rham operator in this context can be expressed as dd * + d * d.Furthermore, the geometry defined by the Riemannian metric is reflected in the Laplace-de Rham operator through the operator d * only.Consult Morita et al. (2001) for further details.
The well-known graph Laplacian, a central concept in spectral graph theory (Chung et al., 1997), is equivalent to the 0-Laplace operator defined above -when a weighted graph is regarded as a weighted simplicial complex.Importantly, the n-Laplace operator is self-adjoint for any weighted simplicial complex (S, w), i.e.
∆ n (α), for all α, β ∈ C n (S).This implies that ∆ n (α) is diagonalisable; that is to say, there exists a basis of C n (S) consisting of eigenvectors of ∆ n .3An n-Fourier basis for (S, w) is a maximal set of linearly independent orthonormal (with respect to •, • w ) eigenvectors of ∆ n .Please note that we speak of the Fourier basis, with the understanding that there is a sign choice for each of its elements.
Finally, the hyperharmonic representation of a high-order signal defined over a weighted simplicial complex is given by its change of bases from the canonical to the Fourier one.In the case of graphs, the use of this transformation as a dimensionalityreduction method was pioneered by Belkin and Niyogi (2003), and has served as an early application of the field of graph signal processing -see for example Shuman et al. (2013); Sandryhaila and Moura (2014) or Ortega et al. (2018) for an overview of this field.In recent years, some key ideas from graph signal processing have been adapted to hypergraphs, see for example Barbarossa and Sardellitti (2020) and Schaub and Segarra (2018); however, the applicability of Fourier analysis as a compression tool of high-order signals is, to a large extend, still unexplored territory.

Proposed workflow
This section describes our proposed workflow, which capitalises the theoretical constructs elaborated in Sections 2 and 3.The overall pipeline is illustrated in Figure 2, being composed of six steps that are described in the following.

Steps of the analysis
The proposed pipeline consists in six steps.
(1) From data to a distribution: The starting point of the workflow is multivariate data, which typically takes the shape of a sequence of vectors of dimension N + 1 (e.g.successive samples of time series of the form s(t) = (s 0 (t), . . ., s N (t)) at different values of t ∈ R).Using these data, the first step is to construct a joint probability distribution for a (N +1)-dimensional vector, denoted as p(X 0 , . . ., X N ).
(2) The structural simplex: To build the structural simplex, the first step is to use p to calculate the mutual information I(X i ; X j ) for each pair of variables of the system.These values are then used to define the weights on a pairwise network, where each edge correspond to one of the variables X 0 , . . ., X N .Subsequently, a weighted n-simplex is build for each n = 2, . . ., N by taking the average value of all the edges that connect two elements within the simplex.
(3) High-order signals: For each subset of cardinality n + 1 of {X 0 , . . ., X N }, one computes the O-information and S-information and arrange them as high-order signals following Example 2. These signals are stored as N +1 n+1 -dimensional vectors o n and s n , representing them in the canonical basis of simplices.Note that there is a standard order on the elements of this basis, which is given by the lexicographic principle (i.e.[v 0 , . . ., v n ] < [v 0 , . . ., v n ] if v j < v j and v i = v i for all i < j).
(4) The Laplace operator: Using the weights of the structural simplex constructed in Step (2), one then builds the corresponding n-Laplace operator as in Definition 2. Concretely, we use the canonical basis to represent the d = N +1 n+1 weights of order n+1 within a d×d diagonal matrix W n , and represents the linear map ∂ n as a matrix B n of the same dimensions (for an algorithmic description of the construction of B n , see Appendix B).Then, the discrete n-Laplace operator is represented in the canonical basis as the matrix with L up n and L down n given by where B n is the transpose of B n .To recapitulate, Eq. ( 10) is the matrix representation, in the canonical basis, of Eq. ( 9) with respect to the inner product given by Eq. ( 8).
(5) The Fourier basis: The n-Fourier basis of the structural simplex is constructed by choosing a maximal linearly independent set of eigenvectors of the n-Laplace operator that are orthonormal with respect to the inner product •, • w defined by the weights of the structural simplex (see Eq. ( 8)).Concretely, one needs to find a matrix with D n a diagonal matrix, which also satisfies where I n is the identity matrix of dimension N +1 n+1 .(6) Hyperharmonic representation: As a final step, one calculates the Fourier transform of the high-order signals calculated in Step (3), denoted by o n and s n , as follows:

Variations
This workflow has been designed with modularity and flexibility in mind, in order to facilitate its adaptation to the needs of diverse applications.This section highlights possible variations over the workflow that can better accommodate the specific needs of different scenarios.First, note that our choice of Ω and Σ in Step (3) as high-order information signals is based on their interpretability, and on their promising value for the analysis of complex systems.Nevertheless, other high-order measures (e.g. the ones described in 2.4) can also be analysed following the same pipeline.
Additionally, Step (2) suggests to build the structural simplex by propagating the underlying mutual information from edges to higher-dimensional simplices via averages.
However, one could use other constructions: e.g.use the maximum or minimum mutual information instead of the average.Additionally, one could replace the mutual information with other non-negative metrics of similarity, including the total variation distance, the Wasserstein distance, or the absolute value of the Pearson correlation.Furthermore, one could also use non-negative high-order metrics (such as the TC or DTC, see Section 2.3) for building the structural simplex directly.
Finally, it is important to note that Step (1) is included mainly for pedagogical purposes, but is often omitted in practice.In effect, most modern techniques to estimate information-theoretic quantities from data avoid to build an explicit joint distribution, as this introduces additional biases.For discrete data, we recommend considering Bayesian estimators such as the ones discussed in Archer et al. (2013), and the software package DIT (James et al., 2018).For continuous data, our preferred choice are Kraskof estimators (Kraskov et al., 2004) that can be implemented via JIDT (Lizier, 2014).However, these estimators require substantial amounts of data, and a more flexible alternative are provided by estimators based on Gaussian Copulas (Ince et al., 2017).

Proof of concept: Haydn symphonies
As an illustration of the proposed workflow, this section presents an analysis of the degree of dimensionality-reduction that can be attained by performing hyperharmonic analysis over the O-information and S-information signals calculated over a small dataset.For this purpose, we use data from the music scores of Franz Joseph Haydn (1732-1809), one of the most iconic figures of the Classic Period.We focus on Haydn's latter "London symphonies", which are typically divided into two groups: Symphonies Nos.93-98, which were composed during the first visit of Haydn to London, and Symphonies 99-104, which were composed either in Vienna or London during Haydn's second visit (Clark, 2005).

Method description
Our analysis is based on electronic scores that are publicly available at http://kern.ccarh.org,from where we extracted the scores of Symphonies No. 93-94 and No. 99-104. 4All symphonies use the same basic instrumentation: flutes, oboes, bassoons, horns, trumpets, timpani, violins, viola, violoncello, and double bass; only some of these symphonies employ clarinets, which were therefore not included in the analysis.To avoid duplicates, our analyses consider only one instrument of each kind, which left an arrangement of nine parts (violoncello and double bass were assumed to be equivalent).
The scores of the four movements of the selected symphonies were pre-processed in Python 3.8.5 using the Music21 package (http://web.mit.edu/music21).Each movement was transformed into nine coupled time series taking 13 possible valuesone for each note plus one for the silence, using a small rhythmic duration as time unit.With these data, the joint distribution of the values for the nine-note chords was estimated using their empirical frequency.Note that regularisation methods (such as Laplace smoothing) were not employed, as many configurations (e.g.highly dissonant chords) are never explored in the Classic repertoire.
Our subsequent analyses were restricted to high-order signals depending on the joint probability of no more than 6 instruments.Given the structure of the dataset (8 symphonies with 4 movements each), we computed the structural simplex using a distribution calculated using all the data.In contrast, individual high-order signals were calculated for each movement of each symphony by using only the corresponding data.
To measure compressibility, we used the following metric defined for any basis.Consider a given n-dimensional high-order signal, whose coefficients on the given basis are α = {α i }.Without loss of generality we assume that α 2 i ≥ α 2 j if i ≤ j.Then, we define the functions with EV α (k) being the (normalised) explained variance by the k-th strongest component, and CEV α (k) the cumulative explained variance by the k strongest components.This definition is motivated by Parseval's theorem, which guarantees that the sum of the square of the Fourier coefficients is equal to the variance of the signal; hence, the CEV(k) of the Fourier transform of a signal corresponds to the percentage of the variance that is accounted by the k strongest components.

Results
The hyperharmonic representation of the considered high-order signals (Ω and Σ) was found to be substantially more concentrated than their corresponding canonical representations.Figure 3 illustrates the curves of cumulative explained variance for various dimensions, and Table 1 presents the number of components required to fulfil various reconstruction levels.In particular, it was found that a small number of components suffices to account for most of the variance observed in the hyperharmonic representations of Ω and Σ.Moreover, this good performance is found to be stable across dimensions.
To verify the unique properties of the Laplace operators, an additional control was run were the same signals were transformed according to randomly-generated bases.Interestingly, these representations do not exhibit the degree of dimensionality-reduction shown by the hyperharmonic representations (see Appendix A).
Finally, our results also show that the canonical representation of the S-information is much less concentrated than the canonical representation of the O-information.This dissimilarity suggests that the O-information is capturing a more specific signature of the different modalities of interdependency that exist across the orchestra.Moreover, the fact that the dissimilarity is not seen in their hyperharmonic representation further  12)) of high-order signals in their canonical and hyperharmonic representation.The dimensionality-reduction capabilities of hyperharmonic analysis is substantial, and stays consistent across dimensions.suggests that both signals may actually carry an equivalent amount of information, which happens to be differently localised over the canonical basis.

Conclusion
Complex systems are characterised by having multiple levels of organisation, which makes network analyses focused on pairwise interactions unable to give a full account of their properties.Phenomena beyond pairwise interactions can be effectively captured by high-order information-theoretic metrics; however, their applicability and interpretability is limited due to the fast growth of their cardinality as a function of the system's size.Here we propose to represent these high-order metrics using the hyperharmonic modes of a geometrical representation of structural properties of the system.This provides a principled approach to constructing low-dimensional representations of high-order signals, which can retain most of the informational content.
Our proposed approach is widely applicable, and promises to enable a range of future explorations.It is our hope that this incursion into the intersection of multivariate information theory and combinatorial topology will motivate further developments on this fertile area of research.12)) of high-order signals in the hyperharmonic representation and the average CEV of these signals over randomly generated bases.The dimensionality-reduction capabilities of hyperharmonic analysis is substantial, and stays consistent across dimensions.
matrices that correspond to N = 3:

Figure 1 .
Figure 1.The linear map ∂ n interpreted geometrically in terms of the boundary of a simplex and the induced orientations.

Figure 2 .
Figure 2. Workflow: (1) Estimate a joint probability from the input data (2) Build a structural simplex using averages of mutual information values.(3) Compute the O-information (and S-information) and store it as an N +1 n+1 -dimensional vector for every dimension n (4) Compute the n-Laplace matrix using the boundary and weight matrices.(5) Diagonalise the Laplace matrices for each dimension.(6) Write the signal in the Fourier basis and return it as output.

Figure 3 .
Figure 3.Comparison of the cumulative explained variance (as defined by Eq. (12)) of high-order signals in their canonical and hyperharmonic representation.The dimensionality-reduction capabilities of hyperharmonic analysis is substantial, and stays consistent across dimensions.

Figure A1 .
Figure A1.Comparison of the cumulative explained variance (as defined by Eq. (12)) of high-order signals in the hyperharmonic representation and the average CEV of these signals over randomly generated bases.The dimensionality-reduction capabilities of hyperharmonic analysis is substantial, and stays consistent across dimensions.

Table 1 .
Number of components needed to recover a given percentage of the cumulative explained variance for the considered signals, either in the Fourier basis or in the canonical basis.