Kernel-based joint independence tests for multivariate stationary and non-stationary time series

Multivariate time-series data that capture the temporal evolution of interconnected systems are ubiquitous in diverse areas. Understanding the complex relationships and potential dependencies among co-observed variables is crucial for the accurate statistical modelling and analysis of such systems. Here, we introduce kernel-based statistical tests of joint independence in multivariate time series by extending the d-variable Hilbert–Schmidt independence criterion to encompass both stationary and non-stationary processes, thus allowing broader real-world applications. By leveraging resampling techniques tailored for both single- and multiple-realization time series, we show how the method robustly uncovers significant higher-order dependencies in synthetic examples, including frequency mixing data and logic gates, as well as real-world climate, neuroscience and socio-economic data. Our method adds to the mathematical toolbox for the analysis of multivariate time series and can aid in uncovering high-order interactions in data.


Introduction
Time series that record temporal changes in sets of system variables are ubiquitous across many scientific disciplines [1], from physics and engineering [2] to biomedicine [3,4], climate science [5,6], economics [7,8] or online human behaviour [9,10]. Many real-world systems are thus described as multivariate time series of (possibly) interlinked processes tracking the temporal evolution (deterministic or random) of groups of observables of interest. The relationships between the measured variables are often complex, in many cases displaying inter-dependencies among each other. For example, the spreading of Covid-19 in Indonesia was dependent on weather conditions [11]; the Sustainable Development Goals have extensive interlinkages [12]; there are strong interconnections between foreign exchange and cryptocurrencies [13]; and the brain displays multiple spatial and temporal scales of functional connectivity [14]. Driven by technological advances (e.g., imaging techniques in the brain sciences [15], or the increased connectivity of personal devices via the Internet of Things [16]), there is a rapid expansion in the collection and storage of multivariate time-series data sets, which underlines the need for mathematical tools to analyse the interdependencies within complex high-dimensional time-series data.
Characterising the relationships between variables in a multivariate data set often underpins the subsequent application of statistical and machine learning methods. In particular, before further analyses can be performed, it is often crucial to determine whether the variables of interest are jointly independent [17]. Joint independence of a set of d variables means that no subset of the d variables is dependent. We need not look further than ANOVA and t-tests to find classic statistical methods that assume joint independence of input variables, where the violation of this assumption can lead to incorrect conclusions [18]. Causal discovery methods, such as structural equation modelling, also require joint independence of noise variables [19]. Furthermore, joint independence has applications in uncovering higher-order networks, an emergent area highlighted in recent studies [20-24].
Kernel-based methods offer a promising framework for testing statistical independence. Notably, the d-variable Hilbert-Schmidt independence criterion (dHSIC) [19] can be used as a statistic to test the joint independence of d random variables. Developed as an extension of the pairwise HSIC [25], a statistical test that measures the dependence between two variables [25-27], dHSIC measures the dependence between d variables [19]. In words, dHSIC can be simply defined as the "squared distance" between the joint distribution and the product of univariate marginals when they are embedded in a reproducing kernel Hilbert space (RKHS). Crucially, kernel methods do not make assumptions about the underlying distributions or the type of dependencies (i.e., they are non-parametric). Yet, in its original form, dHSIC assumes the data to be iid (i.e., independent and identically distributed). This is an unreasonable assumption in the case of time series data, and it has precluded its application to temporal data.
To the best of our knowledge, dHSIC has not yet been extended to time series data. The pairwise HSIC has been extended to deal with stationary random processes under two different test resampling strategies: shifting within time series [26], and the Wild Bootstrap method [27]. However, the assumption of stationarity, by which the statistical properties (e.g., mean, variance, autocorrelation) of the time series are assumed not to change over time, is severely restrictive in many real-world scenarios, as non-stationary processes are prevalent in many areas, e.g., stock prices under regime changes or weather data affected by seasonality or long-term trends. Hence there is a need for independence tests that apply to both stationary and non-stationary processes. Recently, pairwise HSIC has been extended to non-stationary random processes by using random permutations over independent realisations of each time series, when available [28].
In this paper, we show how dHSIC can be applied to reject joint independence in the case of both stationary and non-stationary multivariate random processes. Following recent work [28], we adapt dHSIC so that it can be applied to stationary and non-stationary time-series data when multiple realisations are present. Additionally, we develop a new bootstrap method inspired by Ref. [26], which uses 'shifting' to deal with stationary time-series data when only one realisation is available. Using these methodological advances, we then introduce statistical tests that rely on these two different resampling methods to generate appropriate null distributions: one for single-realisation time series, which is only applicable to stationary random processes, and another for multiple-realisation time series, which is applicable to both stationary and non-stationary random processes. We show numerically that the proposed statistical tests based on dHSIC robustly and efficiently identify the lack of joint independence in synthetic examples with known ground truths. We further show how recursive testing from pairwise to d-order joint independence can reveal emergent higher-order dependencies in real-world socio-economic time series that cannot be explained by lower-order factorisations.

Kernel-based tests for joint independence
Definition (Joint independence of a set of variables). The d variables X^j, j = 1, ..., d, with joint distribution P_{X^1,...,X^d} are jointly independent if and only if the joint distribution is fully factorisable into the product of its univariate marginals, i.e., P_{X^1,...,X^d} = ∏_{j=1}^d P_{X^j}, where the P_{X^j} denote the marginals.
Remark (Joint independence of subsets). If d variables are jointly independent then any subset of those d variables is also jointly independent, e.g., P_{X^1,X^2,X^3} = P_{X^1} P_{X^2} P_{X^3} implies P_{X^1,X^2} = P_{X^1} P_{X^2}, which follows from marginalisation with respect to X^3 on both sides of the equality. Hence, by the contrapositive, lack of joint independence of a subset of variables implies lack of joint independence of the full set of variables.
A series of papers in the last two decades has shown how kernel methods can be used to test for independence of random variables (for details see Refs. [19,25]). The key idea is to embed probability distributions in reproducing kernel Hilbert spaces (RKHS) [29] via characteristic kernels, thus mapping distributions uniquely to points in a vector space. For a summary of the key definitions and foundational results see Refs. [30,31].
Definition (RKHS and mean embedding for probability distributions [32,33]). Let H_k be a RKHS of functions f : X → R endowed with inner product ⟨·,·⟩ and reproducing kernel k : X × X → R. Let P be a distribution defined on a measurable space X; then the mean embedding of P in H_k is the element µ_P ∈ H_k given by µ_P := ∫ k(x, ·) P(dx), which satisfies the property ⟨f, µ_P⟩ = E_{X∼P}[f(X)] for all f ∈ H_k. If the kernel is characteristic, the RKHS mapping P ↦ µ_P is injective and this representation captures uniquely the information about each distribution. Based on such a mapping, statistics have been constructed to test for homogeneity (using the maximum mean discrepancy, MMD [33]) or independence (using the Hilbert-Schmidt independence criterion, HSIC [25]) between two random variables.
Remark. An example of a characteristic kernel is the Gaussian kernel k_σ(x, y) = exp(−∥x − y∥² / σ²), where x, y ∈ R^p. The Gaussian kernel will be used throughout our applications below, but our results apply to any other characteristic kernel.
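As an illustrative sketch (not part of the original text), the Gaussian kernel matrix above can be computed in a few lines; the median heuristic used here to pick the bandwidth σ when none is given is a common default, not one prescribed by the paper:

```python
import numpy as np

def gaussian_kernel_matrix(x, sigma=None):
    """Gaussian kernel matrix K_ab = exp(-|x_a - x_b|^2 / sigma^2).

    If sigma is None we fall back on the median heuristic (median of the
    pairwise distances) -- a common convention, assumed here rather than
    taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]                       # treat scalar samples as 1-d points
    sq = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    if sigma is None:
        dists = np.sqrt(sq[np.triu_indices_from(sq, k=1)])
        sigma = np.median(dists) if dists.size else 1.0
    return np.exp(-sq / sigma ** 2)
```

The resulting matrix is symmetric with unit diagonal, as expected from the definition of k_σ.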
Recently, an extension of HSIC to d variables, denoted dHSIC, was introduced and used as a statistic to test the null hypothesis of joint independence, H_0 : P_{X^1,...,X^d} = P_{X^1} · · · P_{X^d} [19].

Definition (dHSIC [19]). Let us consider d random variables X^j, j = 1, ..., d, with joint distribution P_{X^1,...,X^d}. For each X^j, let H_{k_j} denote a separable RKHS with characteristic kernel k_j. The d-variable Hilbert-Schmidt Independence Criterion (dHSIC), which measures the similarity between the joint distribution and the product of the marginals, is defined as:

dHSIC(X^1, ..., X^d) := ∥ µ_{P_{X^1,...,X^d}} − µ_{P_{X^1} ⊗ ··· ⊗ P_{X^d}} ∥²,   (1)

where the norm is taken in the tensor-product RKHS H_{k_1} ⊗ ··· ⊗ H_{k_d} and µ denotes the mean embedding defined above.

Remark. Given the definition (1), dHSIC is zero if and only if the variables are jointly independent, i.e., when the joint distribution is equal to the product of the marginals. This is the basis for using dHSIC to define the null hypothesis for statistical tests of joint independence.
Remark (Emergent high-order dependencies). As noted above, the rejection of joint independence for any subset of a set of d variables implies also the rejection of joint independence for the full set of d variables. Therefore, many observed rejections of joint independence at higher orders follow from rejections of joint independence at lower orders (i.e., within subsets of variables). To identify more meaningful high-order interactions, in some cases we will also consider 'first-time rejections' of d-way joint independence, i.e., when the joint independence of a set of d variables is rejected but the joint independence of each and all of its subsets of size d′ < d cannot be rejected. We denote these as emergent high-order dependencies.
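The 'first-time rejection' logic above can be sketched as a recursive sweep over variable subsets. This is our illustrative pseudocode-style helper, not the paper's implementation; `reject` is a placeholder for any joint-independence test (e.g., a dHSIC-based test returning True when H_0 is rejected):

```python
from itertools import combinations

def emergent_rejections(variables, reject):
    """Return the subsets whose joint independence is rejected 'for the
    first time': the subset is rejected, but no smaller rejected subset
    is contained in it.  `reject(subset) -> bool` is a placeholder for
    a joint-independence test such as a dHSIC test."""
    rejected = set()
    emergent = []
    for d in range(2, len(variables) + 1):
        for subset in combinations(variables, d):
            if not reject(subset):
                continue
            rejected.add(subset)
            # emergent only if no rejected proper subset lies inside it
            if not any(set(r) < set(subset) for r in rejected if len(r) < d):
                emergent.append(subset)
    return emergent
```

For instance, if only the pair (1, 2) carries a dependence, every superset is also rejected but only (1, 2) is reported as emergent.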

Time series as finite samples of stochastic processes
Our interest here is in the joint independence of time series, which we will view as finite samples of stochastic processes.
Notation (Stochastic processes and sample paths). We will consider a set of d stochastic processes {X^j(t; ω) : t ∈ T}, j = 1, ..., d, where t ranges over the index set T, corresponding to time, and ω over the sample space Ω. Below, we will also use the shorthand {X^j_t} to denote each stochastic process.
For each stochastic process we may observe n independent realisations (or paths), which are samples from Ω indexed by ω_i: {X^j(t; ω_i) : t ∈ T}, i = 1, ..., n. Furthermore, each path is finite and sampled at times t = 1, ..., T_j.
Remark (Time series as data samples). For each variable X^j, the data samples (time series) consist of n paths (X^j(1; ω_i), ..., X^j(T_j; ω_i)), i = 1, ..., n, which we arrange as T_j-dimensional vectors x^j_i = (x^j_{i,1}, ..., x^j_{i,T_j}), i = 1, ..., n, i.e., the components of the vector are given by x^j_{i,t_k} := X^j(t_k; ω_i).
Definition (Independence of stochastic processes). Two stochastic processes {X^j_t} and {X^{j′}_t} with the same index set T are independent if for every choice of sampling times t_1, ..., t_f ∈ T, the random vectors (X^j(t_1), ..., X^j(t_f)) and (X^{j′}(t_1), ..., X^{j′}(t_f)) are independent. Independence is usually denoted {X^j_t} ⊥⊥ {X^{j′}_t}; below we will abuse notation and use the shorthand X^j ⊥⊥ X^{j′}. From this definition, it immediately follows that the realisations are independent.
Remark (Independence of realisations). Although the samples within a path (X^j(1; ω_i), ..., X^j(T_j; ω_i)) = (x^j_{i,1}, ..., x^j_{i,T_j}) are not necessarily independent across time, each variable is independent across realisations at any time t, i.e., x^j_{i,t} ⊥⊥ x^j_{i′,t} for all t and all i ≠ i′. In other words, the n time series are assumed to be iid samples, {x^j_i}_{i=1}^n iid∼ P_{X^j}, where P_{X^j} is a finite-dimensional distribution of the stochastic process {X^j_t}.

Definition (Stationarity). A stochastic process is said to be stationary if all its finite-dimensional distributions are invariant under translations of time.
Aim of the paper: Here we use kernels to embed finite-dimensional distributions of the d stochastic processes {X^j_t} and design tests for joint independence of time series thereof. Recent work has used HSIC to test for independence of pairs of stationary [27,34] and non-stationary [28] time series. Here we extend this work to d > 2 time series using tests based on dHSIC. We consider two scenarios:
• if we only observe a single time series (n = 1) of each of the d variables, then we can only consider stationary processes;
• if we have access to several time series (n > 1) of each of the d variables, then we can also study non-stationary processes.

dHSIC for joint independence of stationary time series
We first consider the scenario where we only have one time series (n = 1) for each of the d variables X^j, which are all assumed to be stationary. Our data set is then {x^j}_{j=1}^d, and it consists of d time-series vectors x^j = (x^j_1, ..., x^j_T), which we view as single realisations of the stationary stochastic processes {X^j_t}, all sampled at times t = 1, ..., T. As will become clear below, the limited information provided by the single realisation, together with the use of permutation-based statistical tests, means that the assumption of stationarity is necessary [26].
Let K^j ∈ R^{T×T} be kernel matrices with entries K^j_{ab} = k_j(x^j_a, x^j_b), where a, b ∈ {1, ..., T} and k_j : R × R → R is a characteristic kernel (e.g., Gaussian); hence the matrix K^j captures the autocorrelational structure of variable X^j. In this case, dHSIC (1) can be estimated as the following expansion in terms of kernel matrices [19,35]:

dHSIC_st(x^1, ..., x^d) = (1/T²) Σ_{a,b=1}^T ∏_{j=1}^d K^j_{ab} + ∏_{j=1}^d (1/T²) Σ_{a,b=1}^T K^j_{ab} − (2/T) Σ_{a=1}^T ∏_{j=1}^d (1/T) Σ_{b=1}^T K^j_{ab}.   (2)

The null hypothesis is H_0 : P_{X^1,...,X^d} = P_{X^1} · · · P_{X^d}, and we test (2) for statistical significance. To do so, we bootstrap the distribution under H_0 using random shifting to generate S samples [26]. For each of the samples s = 1, ..., S, we fix one time series (x^1, without loss of generality) and generate random shifting points τ^j_s, j = 2, ..., d, for each of the other d − 1 time series, where h < τ^j_s < T and h is chosen as the first lag at which the autocorrelation of ∏_{j=1}^d x^j falls below 0.2 [26]. Each time series is then shifted by τ^j_s, so that x^j_{s,t} = x^j_{(t+τ^j_s) mod T}. This shifting procedure, which is illustrated in Fig. 1, breaks the dependence across time series while retaining the local temporal dependence within each time series. In this way, we produce S randomly shifted data sets (x^1_s, ..., x^d_s), s = 1, ..., S, and the estimated dHSIC is computed for each shifting: dHSIC_st(x^1_s, ..., x^d_s). The p-value is computed by Monte Carlo approximation [19]. Given a significance level α, the null hypothesis H_0 is rejected if the p-value is smaller than α. We note that although an alternative to shifting called the Wild Bootstrap has been proposed [27,36], it has been reported to produce large false positive rates [37]. We therefore use shifting (and not the Wild Bootstrap) in this manuscript.
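The estimator and the shifting bootstrap above can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes a Gaussian kernel with fixed bandwidth, uses a user-supplied decorrelation lag h instead of the autocorrelation-based choice described in the text, and all function names are ours:

```python
import numpy as np

def gauss_kernel(x, sigma=1.0):
    """Gaussian kernel matrix for a scalar time series x of length T."""
    sq = (x[:, None] - x[None, :]) ** 2
    return np.exp(-sq / sigma ** 2)

def dhsic(kernels):
    """V-statistic estimate of dHSIC from kernel matrices K^1, ..., K^d,
    following the three-term expansion in [19]."""
    T = kernels[0].shape[0]
    joint = np.ones((T, T))
    for K in kernels:
        joint = joint * K                                  # elementwise product over j
    term1 = joint.sum() / T ** 2                           # joint term
    term2 = np.prod([K.sum() / T ** 2 for K in kernels])   # product-of-marginals term
    term3 = 2.0 / T * np.prod([K.mean(axis=1) for K in kernels], axis=0).sum()
    return term1 + term2 - term3                           # cross term subtracted

def shift_test(series, S=200, h=5, sigma=1.0, seed=0):
    """Shifting bootstrap: keep x^1 fixed, circularly shift every other
    series by a random offset tau with h < tau < T, recompute dHSIC,
    and return the statistic with a Monte Carlo p-value."""
    rng = np.random.default_rng(seed)
    T = len(series[0])
    stat = dhsic([gauss_kernel(x, sigma) for x in series])
    null = np.empty(S)
    for s in range(S):
        shifted = [series[0]] + [np.roll(x, int(rng.integers(h + 1, T)))
                                 for x in series[1:]]
        null[s] = dhsic([gauss_kernel(x, sigma) for x in shifted])
    return stat, (1 + np.sum(null >= stat)) / (1 + S)
```

Applied to two identical iid-noise series, the aligned statistic dominates every shifted resample and joint independence is rejected, as one would expect.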

Validation on synthetic stationary multivariate systems with a single realisation
To validate our approach, we apply the dHSIC test for joint independence to data sets consisting of d = 3 time series of length T with n = 1 realisations (i.e., one time series per variable). We use three stationary models with a known dependence structure (ground truth), whose strength can be varied via a dependence coefficient λ. For each test, we use S = 1000 randomly shifted samples and take α = 0.05 as the significance level. We then generate 200 such data sets for every model and combination of parameters (T, λ), and compute either the test power (i.e., the probability that the test correctly rejects the null hypothesis when there is dependence) or the Type I error (i.e., the probability that the test mistakenly rejects the true null hypothesis when there is independence) over the 200 data sets.

Figure 1: Shifting strategy for random sampling of single-realisation stationary time series. The shifting method for stationary time series when only one realisation of each variable is available. For each null sample s, the first time series x^1 is kept fixed and a random shifting point τ^j_s is chosen for each of the other time series x^j, j = 2, ..., d, so that the sections before and after τ^j_s (darker and lighter shades of colour) are switched. This process generates S randomly shifted samples that are used to bootstrap the null distribution.
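The power and Type I error computations described above reduce to a simple Monte Carlo loop; in this sketch, `make_dataset` and `test` are hypothetical placeholders for a data generator and a p-value-returning independence test:

```python
import numpy as np

def estimate_rejection_rate(make_dataset, test, n_datasets=200, alpha=0.05, seed=0):
    """Fraction of synthetic data sets for which `test` rejects at level
    alpha.  Under dependence this estimates the test power; under
    independence it estimates the Type I error.  Both arguments are
    placeholders: `make_dataset(rng)` generates one data set and
    `test(data)` returns a p-value."""
    rng = np.random.default_rng(seed)
    rejections = sum(test(make_dataset(rng)) < alpha for _ in range(n_datasets))
    return rejections / n_datasets
```

Under a true null, p-values are (approximately) uniform, so the estimated rejection rate should stay near the significance level α.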
Model 1.1: 3-way dependence ensuing from pairwise dependences. The first stationary example [38] has a 3-way dependence that follows from the presence of two simultaneous 2-way dependences, where ϵ_t, η_t, ζ_t and θ_t are generated as iid samples from a normal distribution N(0, 1), and the dependence coefficient λ regulates the magnitude of the dependence between variables, i.e., for λ = 0 we have joint independence of (X, Y, Z) and the dependence grows as λ is increased. Figure 2A shows the result of our test for d = 3 variables applied to time series of length T ∈ {100, 300, 600, 900, 1200} and increasing values of the dependence coefficient 0 ≤ λ ≤ 1 generated from model (4). As either λ or T increases, it becomes easier to reject the null hypothesis of joint independence. Full test power is already reached at λ = 0.5 across all lengths of time series. Our test also rejects pairwise independence between the (X, Z) and (Y, Z) pairs, and fails to reject independence between (X, Y), as expected from the ground truth.
Model 1.2: Pure 3-way dependence. Our second stationary example, also from Ref. [38], includes a 3-way dependence without any underlying pairwise dependence, where ϵ_t, η_t, ζ_t and θ_t are iid samples from N(0, 1), and the coefficient λ regulates the 3-way dependence. Figure 2B shows that the test rejects the null hypothesis as either λ or T increases, although the test power is lower relative to (3), as there are no 2-way dependences present in this case, i.e., this is an emergent 3-way dependency.
Model 1.3: Joint independence. As a final validation, we use a jointly independent example [38], where ϵ_t, η_t, ζ_t are iid samples from N(0, 1). Figure 2C shows that in this case we do not reject the null hypothesis of joint independence across a range of values of the autocorrelation parameter a. Note that the Type I error of the test remains controlled around the significance level α = 0.05 for all values of T and a.

Synthetic frequency mixing data
As a further illustration linked more closely to real-world applications, we have generated a data set based on frequency mixing of temporal signals. Frequency mixing is a well-known phenomenon in electrical engineering, widely used for heterodyning, i.e., shifting signals from one frequency range to another. Applying a non-linear function (e.g., a quadratic function or a rectifier) to the sum of two signals with distinct frequencies generates new signals with emergent frequencies at the sum and difference of the input frequencies (Figure 3A-C). It has previously been shown that the instantaneous phases of the emergent components display a unique 3-way dependence, without any pairwise dependences [39-41]. Importantly, given sufficiently long time series, the instantaneous phase can be considered a stationary signal [39]. Hence we can apply our test to this system.
Here, we generated a data set using the sum of two sinusoidal functions with frequencies f_1 = 7 Hz and f_2 = 18 Hz as input, to which we applied a quadratic function plus weighted Gaussian noise ϵ. This produces a signal F that contains components at the input ('root') frequencies (f_1 = 7 Hz and f_2 = 18 Hz), second harmonics (2f_1 = 14 Hz and 2f_2 = 36 Hz), and emergent frequencies (f_∆ = f_2 − f_1 = 11 Hz and f_Σ = f_2 + f_1 = 25 Hz); see Figure 3A and Ref. [39] for further details. We then computed a wavelet transform and extracted the instantaneous phases at frequencies f_1, f_2, f_Σ and f_∆, which we denote ϕ_1, ϕ_2, ϕ_Σ and ϕ_∆. These phases can be considered stationary time series. The ground truth is that there should be no pairwise dependencies between any of these phases, but there are higher-order interactions involving 3-way and 4-way dependencies [39]. We applied dHSIC with shifting to all possible groupings of d phases (for d = 2, 3, 4) from the set {ϕ_1, ϕ_2, ϕ_Σ, ϕ_∆}. The phases consisted of time series of length T = 1000, and we used S = 1000 shiftings for our bootstrap. We found that the null hypothesis of independence could not be rejected for any of the six phase pairs (d = 2), whereas joint independence was rejected for all four phase triplets (d = 3) and for the phase quadruplet (d = 4). The rejection of all the 3-way and 4-way joint independence hypotheses, without rejection of any of the pairwise independence hypotheses, thus recovers the expected ground-truth structure (Figure 3B).
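The frequency-mixing mechanism itself is easy to reproduce. In this sketch we use the root frequencies from the text (7 Hz and 18 Hz); the particular quadratic form F = s + s², the sampling rate and the duration are our illustrative choices, and the Gaussian noise is omitted so that the spectrum is deterministic:

```python
import numpy as np

fs, dur = 1000, 10                        # sampling rate (Hz) and duration (s): assumed values
t = np.arange(0, dur, 1 / fs)
f1, f2 = 7, 18                            # root frequencies from the text
s = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)
F = s + s ** 2                            # one possible quadratic nonlinearity (noise omitted)

spec = np.abs(np.fft.rfft(F))
freqs = np.fft.rfftfreq(len(F), 1 / fs)

# Energy appears at the roots (7, 18 Hz), the second harmonics (14, 36 Hz)
# and the emergent difference/sum frequencies (11, 25 Hz).
peaks = {f: spec[np.argmin(np.abs(freqs - f))] for f in (7, 11, 14, 18, 25, 36)}
```

Since sin(a)sin(b) = [cos(a−b) − cos(a+b)]/2, squaring the sum is exactly what creates the components at f_2 − f_1 and f_2 + f_1.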

Application to climate data
As an application to real-world data, we used the PM2.5 air quality data set, which contains four variables: hourly measurements of Particulate Matter 2.5 (PM2.5) recorded by the US Embassy in Beijing between 2010 and 2014, and three concurrent meteorological variables (dew point, temperature, air pressure) measured at Beijing Capital International Airport [42]. Non-stationary trends and yearly seasonal effects were removed by taking differences of period 1 and period 52 in the averaged weekly data. Stationarity of the de-trended series was verified by an augmented Dickey-Fuller test [43]. As expected, we find that the null hypotheses (joint independence) are rejected for all groups of d = 2, 3, 4 variables, implying that PM2.5, dew point, temperature and air pressure are all dependent on each other.
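The detrending step described above amounts to two differencing passes over the weekly averages; a minimal sketch (the stationarity check with an augmented Dickey-Fuller test, e.g. `statsmodels`' `adfuller`, is left out here):

```python
import numpy as np

def deseasonalise(weekly, period=52):
    """First difference (removes slow trends) followed by a seasonal
    difference of the given period (removes yearly effects), mirroring
    the preprocessing described for the PM2.5 data."""
    d1 = np.diff(weekly, n=1)            # period-1 difference
    return d1[period:] - d1[:-period]    # period-52 (seasonal) difference

# A linear trend plus an exactly period-52 seasonal pattern is removed
# entirely by the two differencing passes:
weeks = np.arange(5 * 52)
x = 0.3 * weeks + np.sin(2 * np.pi * weeks / 52)
```

Both passes shorten the series (by 1 and by 52 samples respectively), which is why they are applied to the weekly averages rather than the raw hourly data.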

dHSIC for joint independence of non-stationary time series with multiple realisations
When we have multiple independent observations of the d variables, these can be viewed as iid samples of a multivariate probability distribution. By doing so, the requirements of stationarity and of same point-in-time measurements across all variables can be loosened. Consider the case when we have access to n > 1 observations of the set of variables (X^1, ..., X^d), where each observation i = 1, ..., n consists of d time series, which we write as vectors x^j_i = (x^j_{i,1}, ..., x^j_{i,T_j}) of length T_j. Each of the n observations thus consists of a set {x^j_i}_{j=1}^d, which can be viewed as an independent (iid) realisation of a finite-dimensional multivariate distribution P_{X^1,...,X^d}. To simplify our notation, we compile the n observations of each X^j as rows of an n × T_j matrix X^j, so that X^j[i,:] = x^j_i. Let κ_j : R^{T_j} × R^{T_j} → R be a characteristic kernel (e.g., Gaussian) that captures the similarity between a pair of time series of variable X^j. We then define the set of kernel matrices K^j ∈ R^{n×n} with entries K^j_{αβ} = κ_j(x^j_α, x^j_β), where α, β ∈ {1, ..., n}. Therefore the matrix K^j captures the similarity structure between the time series of variable X^j across the n observations. This setup thus allows us not to require stationarity of our variables, since the n observations capture the temporal behaviour of the d variables concurrently. In this case, dHSIC for the set of observations (X^1, ..., X^d) can be estimated as [19]:

dHSIC(X^1, ..., X^d) = (1/n²) Σ_{α,β=1}^n ∏_{j=1}^d K^j_{αβ} + ∏_{j=1}^d (1/n²) Σ_{α,β=1}^n K^j_{αβ} − (2/n) Σ_{α=1}^n ∏_{j=1}^d (1/n) Σ_{β=1}^n K^j_{αβ}.   (6)

Similarly to Section 3, the null hypothesis is H_0 : P_{X^1,...,X^d} = P_{X^1} · · · P_{X^d} and we test (6) for statistical significance. Due to the availability of multiple realisations, however, we use a different resampling method (a standard permutation test) to bootstrap the distribution of (6) under H_0 (Fig. 4).
For each of the samples p = 1, ..., P, we fix one variable (X^1 without loss of generality), and we randomly permute the rest of the variables across realisations to create permuted samples X^j_p, j = 2, ..., d, with rows X^j_p[i,:] = X^j[π^j_p(i),:], where π^j_p denotes a random permutation of the realisation indices. In this way, we produce P permuted data sets (X^1_p, ..., X^d_p), p = 1, ..., P, with X^1_p = X^1. The estimated dHSIC (6) is then computed for each permutation p. Given a significance level α, the null hypothesis H_0 is rejected if the p-value, computed by Monte Carlo approximation [19], is smaller than α.
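One convenient property, used in this illustrative sketch (ours, not the authors' code), is that permuting realisations of X^j is equivalent to applying the same permutation to the rows and columns of K^j, so the kernel matrices need only be computed once:

```python
import numpy as np

def dhsic(kernels):
    """V-statistic dHSIC from n x n kernel matrices (one per variable),
    following the three-term expansion in [19]."""
    n = kernels[0].shape[0]
    joint = np.ones((n, n))
    for K in kernels:
        joint = joint * K
    return (joint.sum() / n ** 2
            + np.prod([K.sum() / n ** 2 for K in kernels])
            - 2.0 / n * np.prod([K.mean(axis=1) for K in kernels], axis=0).sum())

def permutation_test(kernels, P=500, seed=0):
    """Bootstrap the null by permuting realisations of variables 2..d.
    Each permutation acts on the rows and columns of the corresponding
    kernel matrix; the first variable is left unchanged."""
    rng = np.random.default_rng(seed)
    n = kernels[0].shape[0]
    stat = dhsic(kernels)
    null = np.empty(P)
    for p in range(P):
        perms = [np.arange(n)] + [rng.permutation(n) for _ in kernels[1:]]
        null[p] = dhsic([K[np.ix_(pi, pi)] for K, pi in zip(kernels, perms)])
    return stat, (1 + np.sum(null >= stat)) / (1 + P)
```

With two variables whose realisations are identical, the aligned statistic exceeds every permuted resample and the null is rejected.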

Figure 4: Permutation strategy for multiple realisations of each variable (n > 1). A permutation strategy similar to the one developed for iid data [19] can be applied when multiple realisations of either stationary or non-stationary time series data are available. Each null sample p = 1, ..., P is generated by randomly permuting the time series for variables j = 2, ..., d across realisations i = 1, ..., n, as indicated by the dotted lines, whilst the first variable remains unchanged. Null distributions are generated from independent samples of this process.

Validation on simple non-stationary multivariate systems
The dHSIC test is applied to data sets consisting of n observations of non-stationary time series of length T of three variables (X, Y, Z), with ground-truth dependences that can be made stronger by increasing a dependence coefficient λ. For every model and combination of parameters (n, T, λ), we generate 200 data sets and compute the test power, i.e., the probability that the test correctly rejects the null hypothesis over our 200 data sets.

Model 2.1: Pairwise dependences with non-stationarity. The first model has the same dependence structure as (3), i.e., two simultaneous pairwise dependences and an ensuing 3-way dependence, but in this case with non-stationary trends, where ϵ_t, η_t, ζ_t are iid samples from a normal distribution N(0, 1); λ regulates the strength of the dependence (λ = 0 means joint independence); and g_1(t), g_2(t), g_3(t) are non-stationary trends:
• Linear trend (Fig. 5A);
• Complex non-linear trend (Fig. 5B): g_1(t) = sin²(t) log(1+t), g_2(t) = cos²(t) log(1+t), g_3(t) = sin(t) cos(t) log(1+t).
Figure 5A-B shows that the dHSIC test is able to reject the null hypothesis of joint independence for (7) even for short time series and small values of the dependence coefficient λ. The test power increases rapidly as the length of the time series T or the number of realisations n is increased. As expected, the null hypothesis cannot be rejected for T = 1, since the temporal dependence is no longer observable.

Model 2.2: Emergent 3-way dependence with non-stationarity. The second model has the same dependence structure as (5) (i.e., an emergent 3-way dependence without 2-way dependences) but with non-stationary trends, where, again, ϵ_t, η_t, ζ_t are iid samples from N(0, 1), and λ regulates the strength of the dependence. We set a = 0.8, the point at which the data becomes non-stationary according to an augmented Dickey-Fuller test. Figure 5C shows good performance of the test, which is able to reject joint independence for low values of λ, with test power increasing as the length of the time series T and the number of realisations n are increased.

Synthetic XOR dependence
The Exclusive OR (XOR) gate (denoted ⊕) is a logical device with two Boolean (0-1) inputs and one Boolean output, which returns a 1 when the number of '1' inputs is odd. Here we consider a system with three Boolean variables X, Y, W driven by noise, which are combined via XOR gates to generate another Boolean variable Z, where ϵ_t, η_t, ζ_t, θ_t are iid samples from U[0, 1), the uniform distribution between 0 and 1, and X_0, Y_0, W_0 are initialised as random Boolean variables. The dependence in this system is of high order: it only appears when considering the four variables together, with no 3-way or 2-way dependences. We find that our test does not reject joint independence for d = 2, 3 variables, but does reject joint independence in the 4-variable case.
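The paper's noise-driven update rule is not reproduced here, but a static illustrative variant with Z = X ⊕ Y ⊕ W over fair random bits exhibits the same parity structure: every proper subset of the four variables is jointly independent, and only the full quadruple is dependent:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
X, Y, W = (rng.integers(0, 2, n) for _ in range(3))   # three noise-driven bits
Z = X ^ Y ^ W                                         # XOR (parity) output

# Any three of the four variables are jointly independent: e.g. the
# joint distribution of (X, Y, Z) is uniform over {0,1}^3, because the
# independent bit W acts as a one-time pad on Z ...
idx = 4 * X + 2 * Y + Z
triple_freq = np.bincount(idx, minlength=8) / n

# ... yet the four variables together are fully determined by parity.
parity_violations = np.sum((X ^ Y ^ W) != Z)
```

This is exactly why recursive testing is needed: all 2-way and 3-way tests look null, and only the 4-way test reveals the dependence.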

Application to MRI and Alzheimer's data
As a first application to data with multiple realisations, we apply our test to the MRI and Alzheimer's longitudinal data set [44], which comprises demographic and Magnetic Resonance Imaging (MRI) data collected from subjects over several visits. Here, we consider n = 56 subjects, each with at least three visits (T = 3), and we assume that the subjects constitute iid realisations, a reasonable assumption since this is a well-designed population study with representative samples. We then perform dHSIC tests to find dependencies between four key variables: Age, Normalised Whole Brain Volume (nWBV), Estimated Total Intracranial Volume (eTIV) and Clinical Dementia Rating (CDR). The first three variables are clinical risk factors, whereas CDR is a standardised measure of disease progression.
Our findings are displayed as hypergraphs in Figure 6, where nodes represent variables and hyperedges represent rejections of joint independence from the 2-way, 3-way and 4-way dHSIC tests. In this case, we find only two pairwise dependencies (Age-nWBV and nWBV-CDR), whilst eTIV is seemingly disconnected from the rest of the variables. Note that the possible emergent 3-way interaction (Age-eTIV-CDR) is not present, although eTIV shows the expected 3-way and 4-way dependences with CDR, nWBV and Age. This example highlights how our method can be used to reveal higher-order dependencies beyond pairwise interactions. To understand the complex high-order interactions of the incomplete factorisations, methods based on Streitberg and Lancaster interactions could be explored in future work [45].

Application to socioeconomic data
Figure 6: High-order dependencies between four variables in the MRI and Alzheimer's data containing multiple realisations of time series data. The hyperedges represent rejections of the respective joint independence tests. We find 2 (out of 6) pairwise dependences, 3 (out of 4) 3-way dependences, as well as the 4-way dependence between all variables. There are no emergent dependencies in this example.

As a final illustration in a different domain area, we test for joint independence between the United Nations' Sustainable Development Goals (SDGs) [46]. This data set consists of time series of a large number of socioeconomic indicators comprising the 17 SDGs (our variables X^j, j = 1, ..., 17), measured yearly between 2000 and 2019 (T = 20) for all 186 countries in the world (see Ref. [12] for details on the data set). We take the countries to be iid realisations, as in Ref. [12], although this assumption is less warranted here than for the dementia data set in Section 4.1.3, due to moderate correlations between countries arising from socio-economic and political relationships. As an illustration of the differences in data dependences across country groupings, we consider two classic splits: (i) a split based on income level (n = 74 countries with low and lower-middle income and n = 105 countries with high and upper-middle income); (ii) a split based on broad geography and socio-economic development (n = 49 countries in the Global North and n = 137 countries in the Global South). This data set highlights the difficulties of examining high-order dependences as the number of variables grows, e.g., d = 17 in this case.
The results of applying this recursive scheme to the SDG data set are shown in Figure 7. The comparison between lower- and higher-income countries (Figs. 7A-C) shows that higher-income countries have strong pairwise dependences (124 rejections of 2-way independence out of a total of 136 pairs) and only one emergent 3-way interaction (Fig. 7C), whereas lower-income countries have more emergent higher-order dependences (eight 3-way and one 5-way) (Fig. 7B). These results suggest that the interdependences between SDGs are more complex for lower-income countries, whereas most of the high-order dependences in higher-income countries are explained away by the pairwise dependences between indicators. Given that many analyses of SDG interlinkages consider only pairwise relationships, this implies the need to consider high-order interactions to capture relationships in lower-income countries, where policy actions targeting pairwise interlinkages could be less effective. The comparison between the Global North and Global South (Figs. 7D-E) shows that the Global South has exclusively 2-way dependences, whereas the Global North has twelve emergent 3-way interactions and one emergent 4-way interaction (Fig. 7E). Interestingly, two SDGs, climate action and life below water, consistently appear in emergent high-order dependences in the lower-income, higher-income and Global North groupings, suggesting their potential for further studies. In addition, the hypergraphs of emergent high-order interactions for different country groupings can be studied using network science techniques, including the computation of centrality measures to rank the importance of SDGs within the system of interdependent SDG objectives, and the use of community detection algorithms to extract clusters of highly interdependent SDGs [12].

Discussion
In this paper, we present dHSIC tests for joint independence in both stationary and non-stationary time series data. For single realisations of stationary time series, we employ a random shifting method as a resampling technique. In the case of multiple realisations of either stationary or non-stationary time series, we consider each realisation as an independent sample from a multivariate probability distribution, enabling us to utilise random permutation as a resampling strategy. To validate our approach, we conducted experiments on diverse synthetic examples, successfully recovering ground truth relationships, including in the presence of a variety of non-stationary behaviours. As illustrated by applications to climate, Sustainable Development Goal and the MRI and Alzheimer's data, the testing framework could be applicable to diverse scientific areas in which stationary or non-stationary time series are the norm.

There are some computational considerations that need to be taken into account for different applications. In our numerical experiments, we have evaluated the impact of several parameters, including the length of the time series T and the number of observations n, on the computational efficiency and statistical power of our test. In general, the test statistic can be computed in O(dT^2) or O(dn^2) operations, where d is the number of variables and T^2 or n^2 are the sizes of the kernel matrices [19]. Hence the computational cost increases with the number of variables and/or the number of realisations and the length of the time series. The computational cost also grows linearly with the number of resamplings (S or P) used to approximate the null distribution, but our findings show that the test is robust even for low numbers of resamplings. Figure 8A shows that the test power does not improve substantially beyond 100 resamplings (permutations), a result that has been previously discussed for iid data [47]. Therefore, achieving a balance between test power and computational efficiency is crucial, particularly when dealing with large multivariate data sets.
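To make the O(dn^2) cost concrete, the biased (V-statistic) dHSIC estimator of Ref. [19] can be written directly in terms of d Gram matrices. The sketch below is illustrative rather than the paper's implementation: it assumes one-dimensional variables and Gaussian kernels with a fixed bandwidth (in practice a median heuristic is often used to set the bandwidth).

```python
import numpy as np

def gaussian_gram(x, bandwidth=1.0):
    """Gaussian-kernel Gram matrix for a 1-D sample x of length n."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2 * bandwidth ** 2))

def dhsic(samples, bandwidth=1.0):
    """Empirical dHSIC for d variables, each given as an (n,) array.

    Once the d Gram matrices are formed, the three terms below cost
    O(d n^2), matching the complexity quoted in the text.
    """
    Ks = np.stack([gaussian_gram(x, bandwidth) for x in samples])  # (d, n, n)
    # (1/n^2) sum_{i,j} prod_l K_l[i,j]
    term1 = np.mean(np.prod(Ks, axis=0))
    # prod_l (1/n^2) sum_{i,j} K_l[i,j]
    term2 = np.prod([K.mean() for K in Ks])
    # (2/n^{d+1}) sum_i prod_l sum_j K_l[i,j]
    rowmeans = np.stack([K.mean(axis=1) for K in Ks])              # (d, n)
    term3 = 2 * np.mean(np.prod(rowmeans, axis=0))
    return term1 + term2 - term3
```

Here n indexes the samples entering the Gram matrices: the T time points of a single realisation, or the n realisations in the multiple-realisation setting. The estimator is a squared RKHS distance, so it is non-negative and larger for jointly dependent variables.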

It is worth noting that for stationary data with multiple independent realisations, both resampling schemes (shifting and permutation) can be employed to sample the null distribution. If the number of realisations (n) is much larger than the length of the time series (T), the permutation strategy provides more efficient randomisation as long as the realisations are diverse. Conversely, when n is smaller than T, time shifting better exploits the observed temporal dynamics. As an illustration of this point for Model 1.1 (3) with multiple realisations, Figure 8B shows that if we have T = 20 time points available, then the permutation-based approach has the same performance as the shifting approach when the number of realisations reaches n = 6. However, if T = 100 time points are available, both performances become similar when the number of realisations is n = 20. These resampling alternatives must also be evaluated in conjunction with the study of different kernels that can more effectively capture the temporal structure within and across time series (e.g., signature kernels). We leave the investigation of these areas as an avenue for future research.
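A minimal sketch of the random-shifting scheme for a single stationary realisation is given below. The statistic is abstracted as `stat_fn` so that any joint independence statistic (such as dHSIC) can be plugged in; bounding the shift offsets away from 0 and T is our illustrative choice, made so that null samples are not near-copies of the observed data.

```python
import numpy as np

def shift_test_pvalue(X, stat_fn, n_shifts=100, seed=0):
    """Random-shift resampling p-value for joint independence.

    X: (d, T) array holding one stationary realisation per variable.
    stat_fn: maps a list of d length-T arrays to a scalar statistic.

    Each null sample circularly shifts variables 2..d by independent random
    offsets, preserving each marginal's temporal structure while breaking
    the cross-dependence between variables.
    """
    rng = np.random.default_rng(seed)
    d, T = X.shape
    observed = stat_fn([X[j] for j in range(d)])
    low, high = T // 10, T - T // 10   # keep offsets away from 0 and T (our choice)
    null_stats = np.empty(n_shifts)
    for s in range(n_shifts):
        shifted = [X[0]] + [np.roll(X[j], rng.integers(low, high))
                            for j in range(1, d)]
        null_stats[s] = stat_fn(shifted)
    # Monte Carlo p-value with the standard +1 correction
    return (1 + np.sum(null_stats >= observed)) / (n_shifts + 1)
```

For illustration the test below uses absolute correlation as a stand-in statistic; in the paper's setting `stat_fn` would be the dHSIC estimator.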
The interest in higher-order networks, such as hypergraphs or simplicial complexes, has been steadily growing [24], with applications across scientific fields [22,48-51]. Higher-order networks can be natural formalisations of relational data linking d entities [52,53]. However, there is a scarcity of research and a lack of consensus on how to construct higher-order networks from observed iid or time series data [54], and the joint independence methods proposed here could serve to complement approaches based on information measures [20]. By iteratively testing from pairwise independence up to d-order joint independence, our approach can uncover emergent dependencies not explained by lower-order relationships. This framework presents a direction for the development of higher-order networks, bridging the gap between observed data and the construction of meaningful higher-order network representations.
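The iterative scheme can be sketched abstractly as follows. Here `reject` stands for any joint independence test decision at a chosen level (e.g. a dHSIC resampling test), and a k-set of variables is kept as an emergent hyperedge only if none of its rejected proper subsets explains it; the function and argument names are ours, not the paper's.

```python
from itertools import combinations

def emergent_dependences(d, reject):
    """Test subsets of {0, ..., d-1} from pairs up to the full d-set.

    `reject` maps a tuple of variable indices to True if joint independence
    of those variables is rejected. A rejected subset is *emergent* if none
    of its proper subsets (of size >= 2) was already rejected.
    """
    rejected = []   # all rejected subsets found so far
    emergent = []   # rejected subsets not explained by lower orders
    for k in range(2, d + 1):
        for subset in combinations(range(d), k):
            if not reject(subset):
                continue
            if not any(set(prev) < set(subset) for prev in rejected):
                emergent.append(subset)
            rejected.append(subset)
    return emergent
```

The returned hyperedges can then be assembled into a hypergraph such as those shown in Figures 6 and 7.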

Figure 2 :
Figure 2: Stationary systems with a single realisation. Left, a visualisation of the ground truth dependences, where edges represent rejection of pairwise independence and 3-way hyperedges represent the rejection of 3-way joint independence. Middle, an example of the 3-variable time series for which dHSIC was computed. Right, test power (for A, B) and Type I error (for C) computed by applying the dHSIC test to 200 data sets generated for each model with different dependence coefficient λ (autocorrelation coefficient a for C) and length of time series T. The lines represent the average over the 200 data sets and the shaded areas correspond to confidence intervals. The systems are taken from Ref. [38]: (A) 3-way dependence ensuing from pairwise dependences (3); (B) emergent 3-way dependence with no underlying pairwise dependences (4); (C) joint independence (5).

Figure 3 :
Figure 3: Frequency mixing. (A) Two independent input signals with root frequencies f1 = 7 Hz and f2 = 18 Hz are mixed via a quadratic function with noise to generate the signal F. This signal has components at the root frequencies, harmonics (twice the root frequencies), and emergents (sum and difference of the root frequencies), as shown by the output waveforms (right). Time series of the instantaneous phases ϕ1, ϕ2, ϕ∆, ϕΣ are extracted from the output components at f1, f2, f∆, fΣ, and dHSIC is applied to them. (B) In this case, the dHSIC test does not reject pairwise independence between any pair of variables (i.e., there are no pairwise dependencies), but rejects the joint independence between any three of the four variables (i.e., four 3-way emergent dependencies are present), as shown by the triangles, and, consequently, also rejects the joint independence between the four variables (square).
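The mixing in panel (A) can be reproduced with a short numerical sketch. The sampling rate, duration, noise level and the exact mixing function (a linear plus quadratic term, so that roots, harmonics and emergents all appear) are illustrative assumptions rather than the parameters used in the figure.

```python
import numpy as np

fs = 500.0                       # sampling rate in Hz (illustrative choice)
t = np.arange(0, 10, 1 / fs)     # 10 s of signal, 5000 samples
f1, f2 = 7.0, 18.0               # root frequencies from the example
x1 = np.sin(2 * np.pi * f1 * t)
x2 = np.sin(2 * np.pi * f2 * t)
rng = np.random.default_rng(0)

# Linear-plus-quadratic mixing: the quadratic term creates the harmonics
# (2*f1 = 14 Hz, 2*f2 = 36 Hz) and the emergent sum/difference components
# (f1 + f2 = 25 Hz, f2 - f1 = 11 Hz), on top of the roots at 7 and 18 Hz.
F = (x1 + x2) + 0.5 * (x1 + x2) ** 2 + 0.1 * rng.normal(size=t.size)

spectrum = np.abs(np.fft.rfft(F))
freqs = np.fft.rfftfreq(t.size, 1 / fs)   # 0.1 Hz frequency resolution
```

In the paper's pipeline, the instantaneous-phase time series ϕ would then be extracted from band-filtered components around each of these frequencies (e.g. via a Hilbert transform) before applying dHSIC; that step is omitted from this sketch.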

Figure 4 :
Figure 4: Random permutation sampling of multivariate time series with multiple realisations. A permutation strategy similar to the one developed for iid data [19] can be applied when multiple realisations of either stationary or non-stationary time series data are available. Each null sample p = 1, . . ., P is generated by randomly permuting the time series for variables j = 2, . . ., d across realisations i = 1, . . ., n, as indicated by the dotted lines, whilst the first variable remains unchanged. Null distributions are generated from independent samples of this process.
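The procedure in Figure 4 can be sketched as follows; the array layout and function name are ours.

```python
import numpy as np

def permutation_null_samples(X, n_perms=100, seed=0):
    """Generate null samples for multiple-realisation data.

    X: array of shape (d, n, T): d variables, n independent realisations,
    T time points each. For every null sample, the realisations of
    variables 2..d are independently permuted along the realisation axis
    (the first variable is left unchanged), preserving each variable's
    temporal structure while breaking dependence across variables.
    """
    rng = np.random.default_rng(seed)
    d, n, T = X.shape
    for _ in range(n_perms):
        null = X.copy()
        for j in range(1, d):
            null[j] = X[j, rng.permutation(n)]
        yield null
```

Applying the test statistic to each yielded array produces the null distribution against which the observed statistic is compared.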

Figure 5 :
Figure 5: Three-variable non-stationary systems with multiple realisations. Left, a visualisation of the ground truth dependences: edges represent pairwise dependence, 3-edges represent a 3-way dependence. Middle left, an example of a realisation of a 3-variable time series. Middle right, test power computed by applying the dHSIC test to 200 data sets generated from a model at varying values of the dependence coefficient λ and the length of time series T, with a fixed number of realisations n = 100. Right, test power computed by applying the dHSIC test to 200 data sets generated from a model at varying values of the dependence coefficient λ and the number of realisations n, with a fixed length of time series T = 20. The lines represent the average over the 200 data sets and the shaded areas correspond to confidence intervals. The systems are: (A)-(B) 3-way dependence ensuing from 2-way dependences (7): (A) linear trend, (B) a complex dependence term; (C) 3-way dependence with no underlying pairwise dependences and a non-stationary trend (8).

Figure 7 :
Figure 7: Emergent high-order dependences between Sustainable Development Goals (SDGs). (A-C) Comparison of SDG dependences in low- and high-income countries. (A) There is a higher number of emergent higher-order dependences in low-income countries. The d > 2 dependences are mapped onto d-order hypergraphs for (B) low-income and (C) high-income countries. (D-E) Comparison of SDG dependences in Global North and Global South countries. (D) Emergent high-order dependences are found in the Global North, whereas the Global South displays only 2-way dependences. (E) The d > 2 dependences for the Global North are mapped onto d-order hypergraphs.

Figure 8 :
Figure 8: Robustness and efficiency of shifting and permutation resampling strategies. (A) Test power for the same model in its stationary version (Model 1.1 (3), using shifting resampling, left) and non-stationary version (Model 2.1 (7) with linear trend, using permutation resampling, right). Relatively few null samples (S, P > 100) are enough to attain high test power for both schemes. (B) For stationary time series with multiple realisations, both shifting and permutation can be employed. Shifting is preferred if the number of time observations (T) is large relative to the number of realisations (n); conversely, permutation is preferred if n is large relative to T. For Model 1.1 (3) with T = 20, the permutation scheme with n = 6 already reaches comparable performance to shifting, whereas for T = 100 we need n = 20 for permutation resampling to reach comparable performance to shifting. We use S = P = 1000 for all tests in (B).
