Deﬁning the Wavelet Bispectrum

Bispectral analysis is an eﬀective signal processing tool for analysing interactions between oscillations, and has been adapted to the continuous wavelet transform for time-evolving analysis of open systems. However, one unaddressed question for the wavelet bispectrum is quantiﬁcation of the bispectral content of an area of scale-scale space. This makes the capacity for quantitative rather than merely qualitative interpretation of wavelet bispectrum computations very limited. In this paper, we overcome this limitation by providing suitable normalisations of the wavelet bispectrum formula that enable it to be treated as a density to be integrated. These are roughly analogous to the normalisation for second-order wavelet spectral densities. We prove that our deﬁnition of the wavelet bispectrum matches the traditional bispectrum of sums of sinusoids, in the limit as the frequency resolution tends to inﬁnity. We illustrate the improved quantitative power of our deﬁnition with numerical and experimental data.


Introduction
We begin by outlining the background to wavelet bispectral analysis and its applications, the limitations of current definitions, and how we overcome these limitations in terms of suitably defined notions of wavelet bispectral density.

Background to wavelet bispectral analysis
Power-spectral analysis of time-series data may be used to investigate oscillatory influences in a signal, and likewise cross-spectral analysis to investigate common oscillatory influences between two simultaneous signals. Power spectra and cross-spectra are referred to as second-order spectra. In a similar manner, bispectral (i.e. third-order spectral) analysis of time-series data may be used to investigate influences -either in one signal or common among two or three simultaneous signalscoming from a nonlinearly combined effect of two oscillators at different 1 frequencies. In particular, this can include influence from a pair of interacting oscillators (illustrated in Fig. 1), although it can also include a nonlinear response to a superposition of two non-interacting oscillatory influences. Just as at the heart of cross-spectral analysis is the coherence of phases associated to a frequency f , so by exact analogy, at the heart of bispectral analysis is the coherence between the sum of phases associated to two frequencies f 1 , f 2 (each of which can be positive or negative) and the phase associated to the sum of the frequencies f 1 + f 2 .
Bispectra (and more general higher-order spectra) of stationary stochastic processes were introduced in 1953 by John Tukey [70], and first applied in the study of nonlinear superposition of ocean waves [69,24], subsequent to which many diverse applications have been found [65]. Bispectra have also been defined for deterministic signals [46,2]. An in-depth exposition of bispectra and more general higher-order spectra in both the stationary stochastic setting and the deterministic setting can be found in [47] (where other uses of bispectra than investigating interaction of physical processes are also discussed). and/or and/or (a) (b) (c) (d) Figure 1: Illustration of what can be investigated by (a,b) second-order, and (c,d) third-order spectral analysis. (a) An oscillatory process contributing to a signal, as detected by power-spectral analysis. (b) An oscillatory process contributing simultaneously to two signals, as detected by cross-spectral analysis. (c) An interacting pair of oscillatory processes contributing to a signal, as detected by autobispectral analysis of the signal. (d) An interacting pair of oscillatory processes contributing simultaneously to two signals, as detected by cross-bispectral analysis of the two signals; cross-bispectral analysis can also be applied to three simultaneous signals.
Traditional bispectra, as with traditional second-order spectra, are defined as functions of frequency variables but no time variable. However, thermodynamically open systems -such as biological systems -have time-variable characteristics, such as temporal intermittency of interaction between oscillatory components [72,73], and temporal variation in the basic frequencies of the oscillatory components themselves [29]. Extensions of traditional bispectra of stationary stochastic processes to the setting of periodic [16,61] and almost-periodic [10] stochastic processes are also not capable of dealing with "free-form" temporal variations, as is necessary for investigating typical open systems such as biological systems. Now in the context of second-order spectral analysis, for a time-resolved investigation of common oscillatory influences between two simultaneous signals, one can: • define a time-evolving cross-spectrum derived from a time-frequency representation of the signals (that is, a time-evolving representation of the frequency content of the signals); • then use this time-evolving cross-spectrum to analyse coherence of phases within a sliding time-window, where the temporal spread of the window is on the one hand, sufficiently short that changes in behaviour can be resolved and located in time reasonably well, but on the other hand, sufficiently long that coherence of phases over the window can be meaningfully considered.
This approach to time-evolving second-order cross-spectral analysis was extended to bispectral analysis in the pioneering papers [72,73]. The time-frequency representation used there for defining time-evolving bispectra was the continuous wavelet transform, and the time-evolving bispectra defined in this way are called wavelet bispectra. The advantage of the wavelet transform over other time-frequency representations is that it rescales its time-localisation in accordance with each frequency under investigation, thereby enabling the simultaneous resolving of oscillatory contributions on a broad range of timescales. Wavelet bispectra, as with wavelet cross-spectra, are defined for deterministic signals rather than stochastic processes, but such objects can still be applied to stochastic processes -both stationary and non-stationary -by sample-pathwise application [9]; in this case, a statistical descriptor [40] may then be obtainable by taking expectations (and factoring out the signal or window duration as appropriate). There do exist various time-frequency representations other than the continuous wavelet transform [4], an important example being the windowed Fourier transform. Various time-evolving bispectra other than the wavelet bispectrum of [72,73] have also been introduced for both deterministic and stochastic processes (in both discrete and continuous time) [46,18,64,15,51,63,1,28,56,58,54,5,53,68].

Overview of application of wavelet bispectral analysis
Since its introduction, wavelet bispectral analysis has found effective application in diverse areas, including: the identification of relations between different oscillations in turbulent flows [17], especially of plasmas [72,73,12,34,60], and in laminar flows at the boundary of transition to turbulence [23]; investigation of water wave interactions beyond linear superposition [11,75,6]; analysis of electrostatic fluctuations in the ionosphere [41,13]; interactions between brain waves measured in EEG signals during sleep [42], anaesthesia [7], and burst suppression while sedated [55] (where a slightly modified version of the wavelet transform is used), as well as characterisation of pain from EEG signals [21]; prediction of epileptic seizures from real-time analysis of electrocorticography signals [71]; characterisation of wheezes from their audio recording [66], and characterisation of abnormal cardiac sounds [35]; investigation of cardiorespiratory interaction from respiration signals and ECG and/or blood flow signals [27,33]; analysis of oscillations in photovoltaic current of surface state electrons on liquid helium [8]; coordination of renal autoregulation mechanisms via analysis of kidney blood flow signals [57]; and analysis of vibrations to detect mechanical faults [74,43,44]. Surrogate testing for wavelet bispectral analysis results is discussed in [38,57,20,33].

Current definition and its limitation
In the papers [72,73] where wavelet bispectral analysis was introduced, given two signals x(t) and y(t) (which could be the same or different), the wavelet bispectrum over a time-interval I = [T, T + ∆T ] was defined by replacing Fourier transform terms in the traditional definition of bispectral density of deterministic finite-energy signals with time-dependent continuous wavelet transform terms, and integrating over the time-interval. That is, where s 1 and s 2 are the input timescales, and W x (s, t) and W y (s, t) are the continuous wavelet transforms of the signals x(·) and y(·) respectively (which depend on the choice of mother wavelet for the wavelet transform). One can easily extend this definition to cover three signals x(t), y(t) and z(t) by The wavelet bispectrum defined in this manner makes it possible for the concept of a third-order coherence index, as had been defined for stationary stochastic processes [47,Sec. 2.3.9], to be translated to the setting of deterministic signals with no assumption of stationary statistics: specifically, as a third-order analogue of wavelet coherence, wavelet bicoherence was defined in [72,73] by c I xxy (s 1 , s 2 ) := |B I xxy (s 1 , s 2 )| 2 Eq.
(1) has continued to persist as the standard definition of the wavelet bispectrum [22]. However, one basic question that cannot be addressed by wavelet bispectral analysis based on this definition of the bispectrum is how to evaluate quantitatively, rather than just describe qualitatively, the bispectral content within a given region in scale-scale space. Even for just one isolated bispectral peak in the scale-scale space, the Heisenberg-Gabor uncertainty principle [31,Sec. 2.2] implies that this peak will not be a Dirac mass but rather a blur; and this peak will only be yet more blurry if the frequencies of the oscillations involved are time-varying. Naturally, the inability to quantify the bispectral content of a region in scale-scale space also implies the inability to describe how bispectral content is distributed across different regions in scale-scale space.
The reason why this basic question has received almost no attention is likely because wavelet bispectral analysis has tended to focus specifically on bicoherence analysis using Eq. (3), for which Eq. (1) is merely a preliminary definition necessary in the construction of the wavelet bicoherence c I xxy . However, only analysing the wavelet bicoherence c I xxy directly makes it difficult to tell whether high bicoherence-values correspond to interacting oscillatory influences or just the absence of oscillatory components and associated harmonics, and in the former case, also makes it difficult to locate the approximate frequencies of the oscillatory influences. The actual bispectral values prior to the factoring out of norms in Eq. (3) may contain important information.

Illustration of the limitation
To illustrate the problem, in Fig. 2 we plot over scale-scale space the integrand of Eq.
(2) at a fixed time τ , i.e. we plot the "instantaneous wavelet bispectrum" using the conventional s − 1 2 normalisation in the wavelet terms as used in [72,73], for signals of the form For our choice of signals and parameters, the dependence on the fixed time τ is negligible, so the results are essentially the same as if integrating over a time-interval I as in Eq. (2). We use a "lognormal" mother wavelet as described in Sec. 5. For the frequency resolutions used here, this is very similar to the more conventional Morlet wavelet; but all our later numerical illustrations will use lognormal mother wavelets, and so we also use it here. (The reason for our choice of lognormal wavelets in this paper are that they are analytic -which is required for the theory developed in this paper -and that the lognormal wavelet transform has good time-frequency resolution properties [25], including rapidly decaying wavelet power in the high-frequency tails.) In Fig. 2 (both (a) and (b)), we see clearly a bispectral contribution associated to the frequency pair (ν 1 , ν 2 ) and a bispectral contribution associated to the frequency pair (ν 1 , ν 3 ), both appearing as "blurry" peaks as described above; but it is not at all immediately clear how to compare the weight of the two bispectral contributions, let alone how to quantify them absolutely. Certainly one would not immediately expect from the figure that the two bispectral contributions should be considered equal in magnitude. But from a Fourier analysis perspective, they are indeed equal: in both cases the three contributing sinusoidal components all have amplitude equal to 1.

How we solve the problem
Traditional time-independent spectral analysis defines spectral densities that can be integrated over regions of the space of input frequencies. Likewise, time-dependent second-order spectral analysis, via the windowed Fourier transform or the continuous wavelet transform, defines spectral densities that can be integrated over a region of time-frequency space or time-scale space.
Accordingly, to solve the problem described above, we seek to define a notion of wavelet bispectral density 2 by suitable normalisation of the formula (1)/(2), which can then be integrated over a region of time-frequency-frequency space or time-scale-scale space to give the bispectral content of that region. A definition of wavelet bispectral density is suitable insofar as it simultaneously fulfils the following two requirements: for the signals x, y, z in (5) with ν 1 = 2.8 Hz, ν 2 = 1.3 Hz, ν 3 = 12 Hz and θ = π 3 , using conventional p = 1 2 definition of the wavelet transform, with a lognormal mother wavelet (Sec. 5) with frequency resolution parameter (a) σ = 1 and (b) σ = 4. With these lognormal mother wavelets, we consider scale s and frequency f to be related by f = 1 s . The signals are simulated over a time-interval [0 s, 60 s] and the time-instant τ is taken at τ = 30 s.
• For strictly quasiperiodic signals, the wavelet bispectral content of each contribution to the bispectrum should match well that given by Fourier bispectral analysis; and the "blurriness" of the peaks in the wavelet bispectral density should be in accordance with the frequency resolution of the underlying wavelet transform.
• For more general signals, the bispectral content of a given region of frequency-frequency space over a given time-interval I should be very little affected by the behaviour of the signals outside of the time-interval I.
In seeking to obtain such a definition of wavelet bispectral density, we specifically consider wavelets that are analytic with non-negative-valued Fourier transform; this is a reasonable assumption for time-frequency analysis, and non-analytic wavelets can be made analytic by convolution with the inverse Fourier transform of a suitable cut-off function. We assume a simple inverse relationship f = κ s between scale and frequency. This is very natural for time-frequency analysis using wavelets with unimodal Fourier transform, and is also necessary for the application of wavelet bispectral analysis to investigate nonlinear oscillatory interactions [73]. We obtain: • a "global" definition of wavelet bispectral density with respect to logarithmic frequency axes, which enables one to define the time-evolving distribution of bispectral content across frequency-frequency space (and also to compare this across signals); • a "local" definition of wavelet bispectral density with respect to logarithmic frequency axes, which is (very slightly) better suited to the investigation of bispectral content around a prespecified location in frequency-frequency space than the "global" definition is.
In the "global" definition, the normalisation factor in front of the product of wavelet terms in (2) is itself dependent on the ratio of the two input frequencies or scales. (If p = 1 is used for the s −p normalisation in the individual wavelet terms themselves, then it is only the ratio, and not the two individual input frequencies, that is needed in the extra normalisation factor.) An important question is whether this non-constancy of the normalisation factor in our "global" definition causes the distribution of bispectral content over frequency-frequency space to be misrepresented. Accordingly, Theorem 8 justifies the soundness of our definition in a limit as the frequency resolution of the mother wavelet tends to ∞, while the numerics in Sec. 5.1 indicate that our definition gives suitable results even when the frequency resolution is not high at all. The wavelet bispectral densities described above are with respect to logarithmic frequency, but can easily be converted into densities with respect to linear frequency. Nonetheless, the derivation of these densities is based on consideration of logarithmic frequency. A formula derived analogously from consideration of linear frequency can also be obtained, but this is only able to give a "local" definition of wavelet bispectral density, not a "global" one. This is presented in Appendix C.

Relation to existing work
The significance of the question of how to be able to compare wavelet bispectral content across different regions of scale-scale space has been highlighted in [29], when comparing wavelet bispectral analysis with traditional Fourier-based bispectral analysis. No clear mathematical answer to this question is obtained, but working with a modified Morlet wavelet transform, the paper does suggest a wavelet normalisation that essentially corresponds to taking p = 1 in the s −p normalisation for wavelet transforms, and appears to hint implicitly at the fact that doing so normalises the modal magnitude of a fixed-frequency bispectral contribution in proportion with its Fourier bispectral content; we describe this fact in Sec. 3.3.
Normalisation of wavelet bispectra with reference to traditional bispectra has also been addressed in [39], from quite a different point of view from our question of how to compute the bispectral content of a region of time-frequency-frequency space. Namely, for application to stochastic processes of constant expectation, a normalised wavelet bispectrum formula (given explicitly for discretely sampled finite-duration time-series recordings) is presented, that is, roughly speaking, derived so that if applied to a stationary stochastic process of flat bispectral density, the infinite-timeaverage expected wavelet bispectrum at each pair of frequencies matches the traditional bispectral density.

Significance of our result
By defining wavelet bispectral density, we enable a quantitative evaluation of the time-localised distribution of bispectral content over frequency-frequency space, in a manner that can still take advantage of the multiscale resolution capacity of the continuous wavelet transform. This has the potential to lead to more refined indicators of the various physical and biological phenomena in open systems that wavelet bispectral analysis is used to investigate.

Structure of the paper
Sec. 2 covers preliminaries on the (continuous) wavelet transform, second-order Fourier and wavelet spectra, and third-order Fourier spectra. After starting with notational conventions, we introduce the wavelet transform, first according to the p = 1 convention which we consider most simple and natural, and then with general p. We then discuss the basic scaling property of the wavelet transform, particularly as compared with the windowed Fourier transform. We then present definitions and basic results concerning second-order Fourier spectra of finite-energy deterministic signals, followed by analogous definitions and results for wavelet second-order spectra of deterministic signals. Finally, we present definitions and basic results concerning Fourier bispectra of finite-energy deterministic signals.
Sec. 3 covers our definitions and results pertaining to wavelet bispectral analysis. We start by describing how bispectral density would be defined for the windowed Fourier transform, and the technical difficulty in trying to do the same for the wavelet transform. In view of the impossibility of obtaining a "perfect" definition of wavelet bispectra, we explain our criteria for a suitable definition, after which we introduce our main definition of wavelet bispectral density and the wavelet bispectrum. Additional theoretical justification (Theorem 8) for the suitability of this definition is provided. We then discuss frequency-localised adaptation of our definition of the wavelet bispectrum.
In Sec. 4, we prove Theorem 8. In Sec. 5, we provide numerical illustration of our wavelet bispectrum formula. First, we apply it to signals with pure sinusoidal components, to illustrate the match with Fourier analysis. Then, we apply it to a coupled phase-oscillator pair, to illustrate the usage of our new formula for investigating interactions within time-dependent oscillatory dynamics.
In Sec. 6, we illustrate an application of our wavelet bispectrum formula to recorded time-series of cell membrane potentials.
In Appendix A, we describe the relationship between the lognormal wavelet parameter σ and corresponding time and frequency resolution properties.
In Appendix B, we discuss definitions of coherence, bicoherence and "bicorrelation" (introduced in [33] under the name "real wavelet biphase"). In particular, we introduce a new definition of "wavelet-phase bicoherence". We also briefly describe how wavelet bispectral analysis can be used to suggest unidirectionality of coupling [30].
In Appendix C, we present an alternative frequency-localised wavelet bispectrum formula to that given in Sec. 3.3.

Availability of codes
Codes for the wavelet bispectral analysis methods developed and used in this paper will be made available on the Lancaster Publications and Research system Pure.

Preliminaries
In this section, we discuss the wavelet transform and its basic properties, introduce second-order spectra for both Fourier and wavelet transforms, and introduce third-order Fourier spectra. This then lays the foundation for introducing third-order wavelet spectra in Sec. 3.

Some conventions and notations
Fourier transform. In this paper, the Fourier transform is formally defined according to the convention It will also be very useful to definex (f ) :=x 1 f .
One-point compactification of the real line. In accordance with the usual notation, we writeR for the one-point compactification of R, namely R ∪ {∞} where ∞ is regarded as the same as −∞. (Theˆhere is unrelated to Fourier transforms.) Dimensional analysis. Generally speaking, variables denoted t or τ will correspond to physical quantities of the time dimension, variables denoted f , ν or ξ will correspond to physical quantities of the frequency dimension, variables denoted ζ will correspond to a logarithmic representation of quantities of the frequency dimension, and variables denoted r will correspond to dimensionless values. The variable λ (parametrising "projective frequency-frequency space") and wavelet-related parameters κ and σ are also dimensionless.
Spaces of functions. Given a space of complex-valued functions f : U → C whose notation follows the style "F (U )" (e.g. Lebesgue spaces L p (U ), Sobolev spaces W k,p (U ), etc.), for any X ⊂ C we use the style of notation "F (U, X)" to mean the set of functions f : U → X that belong to F (U ) when the codomain is expanded from X to C. (For example, L 2 (R) is the set of square-integrable functions f : Delta-ring of finite integration. Given a Borel set X ⊂ R d and a measurable function f : X → C, we define One can easily show that B f is a δ-ring (i.e. is closed under pairwise union, countable intersection and relative complement) on which Summary of main other notations: ψ denotes a (mother) wavelet function; W ψ,κ,x is the corresponding wavelet transform (with p = 1) of the signal x(t), where κ is the constant of inverse proportion between scale s and frequency f ; W ψ,κ,x is the wavelet transform for general p in the s −p normalisation; second-order Fourier spectral densities of finite-energy signals x and y are denoted P xy (with x = y being "energy spectral density" and x = y being "cross-energy spectral density"); p ψ,κ,xy likewise denotes second-order wavelet spectral density (as a function of both frequency and time) with respect to a linear frequency axis, and p ψ,κ,xy with respect to a logarithmic frequency axis; p ψ,κ,xy is the second-order wavelet spectrum obtained by integrating the second-order wavelet spectral density; B xyz denotes the Fourier bispectral density for a triplet of finite-energy signals (x, y, z); b ψ,κ,xyz denotes globally defined wavelet bispectral density with respect to logarithmic frequency axes, andb ψ,κ,xyz is the transformed version for linear frequency axes; b ψ,κ,xyz is the globally defined wavelet bispectrum obtained by integrating the wavelet bispectral density; "local" wavelet bispectral density and the "local" wavelet bispectrum have an extra subscript λ for the frequency ratio λ = f 1 f 1 +f 2 about which the localisation takes place.

Definition of the wavelet transform
In our definition of the wavelet transform, we will specifically work with absolutely integrable Hermitian analytic wavelet functions with non-negative-valued Fourier transform; these assumptions are reasonable for time-frequency analysis. Any non-analytic wavelet (meaning that the Fourier transform contains negative-frequency content) can be made analytic by convolution with the inverse Fourier transform of a suitable cut-off function: provided the frequency resolution of the original non-analytic wavelet is not too poor, this analyticisation can be achieved with very little modification to the wavelet itself. Whereas the wavelet transform traditionally has scale s as its first argument, throughout this paper the wavelet transform's first argument is a frequency variable f that is reciprocally proportional to scale s (with a constant of proportion denoted κ), and all densities considered are with respect to f or log f rather than with respect to s.
A (mother ) wavelet function will mean a Hermitian function ψ ∈ L 1 (R) \ {0} withψ ≥ 0, ψ (−∞,0] = 0, and ∞ 0ψ (r) r dr < ∞. (This implies in particular that R ψ(r) dr = 0.) Given a wavelet function ψ and a value κ > 0, we define the associated (continuous) wavelet transform of any x ∈ L ∞ (R, R) by for all f > 0 and t ∈ R. In practice, one often chooses the wavelet function ψ such thatψ is unimodal, and then takes κ to be whereψ is maximised [25,73]. We can also take the wavelet transform of an unbounded function x ∈ L q (R, R) for any q ∈ [1, ∞), in which case (6) is well-defined for almost all (f, t) ∈ (0, ∞) × R. When x ∈ L 1 ∪ L 2 , the wavelet transform can be computed from the Fourier transform of its temporal sections, for almost all ξ ∈ R, for each f > 0.
Although not typically necessary, one can choose to extend the definition of W ψ,κ,x (f, t) to include negative f , by for all f = 0 and t ∈ R. In this case, (7) holds for negative as well as positive f . The wavelet transform provides a means of time-localised frequency analysis: in the case that x(t) = A cos(2πνt + φ) with ν > 0, we have for all f > 0 and t ∈ R. If we include negative f , this becomes From (8) we see that, unlike the Fourier transform, the representation of a sine wave given by the wavelet transform is, at each time t, not a sharp peak but rather a blurry peak in accordance with the shape ofψ. This is an inherent property of time-localised frequency analysis: very fine timelocalisation necessitates poor frequency resolution, and very high frequency resolution necessitates poor time-localisation. This is due to the Heisenberg-Gabor uncertainty principle mentioned in the Introduction. We refer to the modulus and the argument of W ψ,κ,x (f, t) respectively as the wavelet amplitude and the wavelet phase associated to the frequency f at time t. Remark 1. It is always possible to rescale ψ such that κ becomes 1: namely, defining ψ 1 (r) = 1 κ ψ r k , we have W ψ,κ,x = W ψ 1 ,1,x . Thus, from a mathematical point of view, we could simply take κ = 1 without any loss of generality. However, in practice, the conventional formulations of some wavelets (such as Morlet wavelets) do not have their Fourier transformψ maximised at or near 1. Because of this fact, we retain the presence of κ in our definitions.

Definition with general p
An alternative, more commonly found, definition of the wavelet transform is for all f = 0 and t ∈ R; and more generally, for any p ≥ 0, one can define for all f = 0 and t ∈ R. So W ψ,κ,x corresponds to p = 1.
One justification for defining the wavelet transform according to (10) rather than (6) is that the is constant over varying f , being simply equal to the L 2 norm of ψ itself. However, for the context of higher-order spectra the meaningfulness of such a justification is doubtful. We will see that when working with logarithmic frequency axes in accordance with the logarithmic frequency resolution of the wavelet transform (Sec. 2.2), the formulae (25), (26), (45) and (49) for both second-and third-order densities are simplest when we take p = 1. (By contrast, in the "linear-frequency derived" wavelet spectral density formulae, while the secondorder densities (23) and (24) are simplest with p = 1 2 as in (10), the third-order densities (C.8) are instead simplest with p = 1 3 .) Two important further practical advantages of working with p = 1 are: • Assuming that κ is the mode ofψ, the maximal amplitude for the wavelet transform of a sinusoidal input will occur precisely at the frequency of the input, if and only if p = 1.
• Computation of the wavelet transform via (7) is simplest with p = 1: an additional pre-factor |f | κ p−1 is needed for any other value of p.

Finite-time signals
The theoretical definition of the wavelet transform works with infinite-time signals. For a finitetime signal x : (a, b) → R, a typical procedure is as follows: First define an extension x ext : R → R of x; typical examples include the zero-padding extension (x ext (t) = 0 for t / ∈ (a, b)) and the periodic padding extension (x ext is (b − a)-periodic), as well as more sophisticated forms of "predictive" padding [25]. Then define the wavelet transform of x to be the restriction of W ψ,κ,xext to some subset C ⊂ (0, ∞) × (a, b) of (f, t)-space, where C is chosen such that the behaviour of x ext outside the time-interval (a, b) has little influence on the values of W ψ,κ,xext (f, t) for (f, t) ∈ C. This set C is called the cone of influence. See [25] for further details (as well as discussion of other practical aspects of wavelet analysis of digital signals).
Throughout the rest of this paper, we present theory for infinite-time signals; application to finite-time signals can be achieved by the above procedure.

Comparison with the windowed Fourier transform
Given an even function w ∈ L 1 (R, R) with R w(t) dt > 0 (called a window function), we define the associated windowed Fourier transform of a function x ∈ L ∞ (R, R) by In other words, Note thatŵ is an even real-valued function. Typically, one uses a window function w such that w is maximised at 0 and such that at any other local maximum of |ŵ| the value of |ŵ| is very small compared toŵ(0). Thus the windowed Fourier transform provides a means of time-localised frequency analysis, over frequency ranges that do not come too close to 0: for →0 uniformly over f >0 as ν→∞ (12) for all f, t ∈ R. We see from (12) that the smallest frequencies for which the windowed Fourier transform is able to give meaningful results depends on how quickly the tails ofŵ decay. At the heart of the difference between the windowed Fourier transform and the wavelet transform is the following: In the windowed Fourier transform, the "envelope" w(τ − t) of the function against which the input signal is integrated is independent of the frequency f under investigation. By contrast, in the wavelet transform the function τ → f κ ψ (τ −t)f κ against which the input signal is integrated is a linearly rescaled version of ψ where the temporal rescaling is inversely proportional to the frequency f under investigation. Note that, from the point of view of dimensional analysis, the inputs of a window function w are of the time dimension, while the inputs of a wavelet function ψ are dimensionless. This rescaling of the wavelet in accordance with the frequency under investigation is the key property that makes the wavelet transform able to investigate different oscillatory components across multiple simultaneous timescales. A basic consequence of this rescaling is that if we define x c (t) = x(ct) for any constant c > 0, then for all f = 0 and t ∈ R. (For more general p, the right-hand side has an extra pre-factor c p−1 .) Eq. (13) can be interpreted as saying that the wavelet transform has logarithmic frequency resolution: For signals of the form the resolvability of the two sinusoidal components by the wavelet transform depends only on the ratio between the two frequencies ν 1 and ν 2 . Indeed, if one plots |W ψ,κ,x | over time-frequency space with a logarithmic frequency axis, multiplying the frequencies ν 1 and ν 2 by the same number c would simply shift the entire diagram up the frequency axis by an amount proportional to log c. This is in contrast with the linear frequency resolution of the windowed Fourier transform, where the resolvability of the two sinusoidal components would depend essentially on the difference in frequency |ν 2 − ν 1 | rather than the ratio ν 2 ν 1 . It is precisely the wavelet transform's "logarithmic" rather than "linear" treatment of frequencies that leads to the main non-triviality in defining the wavelet bispectrum, because the definition of bispectra involves a linear sum of the two input frequencies. (See the start of Sec. 3 for details.)

Second-order Fourier spectra of deterministic signals
Throughout this paper, the abbreviation "ESD" stands for "energy spectral density", and "cross-ESD" is to be interpreted "cross-energy spectral density".
The second-order Fourier spectral densities of finite-energy signals are defined as the integrands in the "frequency-domain integral formulae" for the L 2 -norm and L 2 -inner product. Namely, for and in particular where We are careful to refer to these functions as "spectral densities", with a "spectrum" itself referring to the measure (for the energy spectrum) or complex-valued measure (for the crossenergy spectrum) obtained by integrating the associated spectral density. Note the equalities for all f ∈ R, from which we have the following two consequences (which will also hold for secondorder wavelet spectra): • for second-order spectral analysis it is sufficient to consider only positive frequencies; • for a pair of functions x and y there are essentially three distinct second-order spectra, namely the energy spectra of x and y and the cross-energy spectrum of x with y.
The above spectra are defined for real-valued signals. We also recall that for general complexvalued functions x, y ∈ L 2 (R), and due to (17), for all

Second-order wavelet spectra
Now let us consider a wavelet analogue of the second-order Fourier spectra. For any wavelet function ψ, define The value C ψ is often known as the admissibility constant. It can be regarded as a "powernormalisation constant" for the wavelet transform: given a signal for any t ∈ R. More generally, if we have signals for any t ∈ R. (We mention now that although (22a) and (22b) describe the same object C ψ , the objects described by their "third-order analogues" -namely D ψ (λ) in Eq. (44), andD ψ (λ) in (A)-(C) in Appendix C -are not the same as each other.) Given functions x, y : R → R with wavelet transforms W ψ,κ,x and W ψ,κ,y , we define the linearfrequency wavelet ESDp ψ,κ,xx : and the linear-frequency wavelet cross-ESDp ψ,κ,xy : and we define the logarithmic-frequency wavelet ESD p ψ,κ,xx : and the logarithmic-frequency wavelet cross-ESD p ψ,κ,xy : (0, ∞) × R → C of x with y by At each time t, these spectral densities as a function of f satisfy the same relations (16) satisfied by the Fourier second-order spectral densities. Regarding the cross-energy spectral densities, note where Whereas the Fourier spectral densities in Sec. 2.3 are densities over the frequency axis, these wavelet spectral densities are densities over time-frequency space, with a logarithmic frequency axis in the case of p ψ,κ,xy and a linear frequency axis in case ofp ψ,κ,xy . The wavelet energy spectrum of x is the measure p ψ,κ,xx on (0, ∞) × R given by This measure can also be extended to include the negative frequency axis (i.e. to be a measure on the whole of R 2 ) using (29). Similarly, for any A ⊂ (0, ∞) × R with A ∈ Bp ψ,κ,xy we define Again we can extend p ψ,κ,xy to all sets A ∈ B(R 2 ) with A \ ({0} × R) ∈ Bp ψ,κ,xy using (31). We refer to p ψ,κ,xy as the wavelet cross-energy spectrum of x with y. If x, y ∈ L ∞ (R, R) then for any compact K ⊂ (0, ∞) × R the restriction of p ψ,κ,xy to B(K) defines a complex-valued measure on K.
In analogy to the energy-preservation and cross-energy-preservation properties of the Fourier transform (14)-(15), we have the following well-known fact [31,Sec. 3.2]. (We include a proof, particularly for the sake of the discussion at the start of Sec. 3.) and in particular, Proof. We first prove (33): (7) and (19) One then obtains (32) by repeating the calculation with (17) in place of (19), with the validity of the steps justified by the fact that (f, t) → are square-integrable as proved above.
Of the differences between the Fourier cross-energy spectrum and the wavelet cross-energy spectrum, one that we emphasise is that due to Eqs. (27) and (30)/(31), the wavelet cross-energy spectrum p ψ,κ,xy is able to take into account the level of coherence of phases over time, by which we mean how constant the phase difference φ x (f, t) − φ y (f, t) associated to each frequency f remains over time. This is particularly useful when analysing signals recorded from open systems, where oscillatory components have time-dependent characteristics and there will be various time-dependent background effects picked up in the signals being measured. Detailed discussion of coherence of wavelet phases and its quantification is given in Appendix B.1.
For any functions x, y, z : R → R with Fourier transformsx,ŷ,ẑ : R → C (defined at least up to almost-everywhere equality), define Note the symmetries The definition of B xyz (f 1 , f 2 ) makes reference to three frequencies, namely the two input frequencies f 1 and f 2 and their sum f 1 + f 2 . As shown in Fig. 3, we define the following 6 regions of R 2 according to the signs of these three frequencies: It will also be useful to define the subset Γ 1− := {0 < f 2 < f 1 } of Γ 1 . We refer to the twodimensional space of inputs (f 1 , f 2 ) of the function B xyz as "frequency-frequency space". We also define projective frequency-frequency space as the set of all straight lines through the origin (0, 0). For our purposes, projective frequency-frequency space is naturally parametrised by a value λ ∈R corresponding to f 1 f 1 +f 2 . With this parametrisation, we have that Now by iterated application of the convolution theorem, one obtains that for all "sufficiently controlled" functions x, y, z : R → R (e.g. if these belong to the Sobolev space W 2,1 (R)), the following "third-order versions" of (14) and (15) hold: and in particular, The functions B uvw with u, v, w ∈ {x, y, z} represent third-order spectra or bispectra. However, the functions themselves are densities: we refer to each function B uvw with u, v, w ∈ {x, y, z} as a Fourier BD, and we refer to B xxx as the Fourier autoBD of x; a function B uvw for which it is not the case that u = v = w is referred to as a Fourier cross-BD. Although these functions B uvw are often referred to simply as bispectra, we regard "bispectra" as referring to the complex-valued measures on frequency-frequency space whose densities are given by the functions B uvw . Note that it is possible to restrict all bispectral analysis to positive frequencies only: 3 With the symmetries of the bispectrum, the complete list of possible cross-bispectra given a pair of signals or a triplet of signals is as follows: • For a pair of signals x and y, there are essentially 4 different bispectra (B xxx , B xxy , B xyy , B yyy ), or 6 if we restrict to positive frequencies (B xxx , B xxy , B xyx , B xyy , B yyx , B yyy ).
• For a triplet of signals x, y and z, there are essentially 10 different bispectra (B xxx , B xxy , Finally, note that Eq. (35) can be extended to complex-valued signals as (38) which is the third-order analogue of (18).
In this section, we will see how a "perfect" definition of the wavelet bispectrum is not achievable, but how we can nonetheless provide a "suitable" definition that fulfils the needs of time-evolving bispectral analysis.

The difficulty in defining wavelet bispectral density
We seek to obtain a "third-order analogue" of the second-order wavelet spectrum p ψ,κ,xy . As described in Sec. 2.2, the difficulty in achieving this lies in the contrast between linear sum of frequencies inherent to bispectral analysis and the logarithmic treatment of frequencies inherent to wavelet analysis. We will now explain this in detail.
We first consider how to define the windowed-Fourier-transform bispectrum [46]. Let w be a window function as defined in Sec. 2.2, which for simplicity we will assume to belong to the Schwartz space S(R, R), and assume additionally that (Window functions w used in practice typically fulfil the stronger condition that either w ≥ 0 everywhere orŵ ≥ 0 everywhere.) We use the same notation for the windowed Fourier transform as in Sec. 2.2. We present two ways to derive the definition of windowed-Fourier-transform bispectral density (also called a bispectrogram): • In the first instance, we consider finite-energy signals (which, for simplicity's sake, we will assume to be Schwartz functions).
• In the second instance, we consider infinite-time sinusoidal signals.
Consider signals x, y, z ∈ S(R, R). In analogy to (7), we have for all ξ, f ∈ R. Consequently, following analogous steps to the proof of Proposition 2 (and noting thatẑ and F w,z (·, t) are Hermitian), we have by (39) and (38) Therefore we define the windowed-Fourier-transform bispectral density as for f 1 , f 2 , t ∈ R. Now instead consider sinusoidal input signals with ν 1 , ν 2 > 0. Using (12) one can show that at each t ∈ R, where c(t) is a finite sum of zero-mean circular motions in C. And by expressing x, y, z in terms of complex exponentials, one can easily check that the average value of the function t → x(t)y(t)z(t) is indeed 1 4 A 1 A 2 A 3 cos(θ). Thus, consideration of sinusoidal input signals also justifies the definition (40) for windowed-Fourier-transform bispectral density.
The windowed-Fourier-transform bispectrum for input functions x, y, z ∈ L ∞ (R, R) is then defined by integrating the density, i.e. it is the map on all sets A ∈ B(R 3 ) over which the integrand is absolutely integrable.
Having defined the windowed-Fourier-transform bispectral density, through two approaches that produce the same result, let us now consider how the analogous calculations would look for attempting to define wavelet bispectral density. More specifically, at this point we will only consider the calculation for finite-energy signals; the case of sinusoidal signals is treated in Sec. 3.1.
For simplicity, take κ = 1 and assume that ψ ∈ S(R). For signals x, y, z ∈ S(R, R), we have If the term f 1 +f 2 ξ 1 +ξ 2 were instead f 1 ξ 1 + f 2 ξ 2 , then as in the proof of Proposition 2 the change of variables r i = f i ξ i (i = 1, 2) would cause the above expression to simplify to Thus it is the discrepancy between f 1 +f 2 ξ 1 +ξ 2 and f 1 ξ 1 + f 2 ξ 2 that prevents the calculation from going through.
How to define a "reasonable" notion of wavelet bispectra Although we cannot obtain a "perfect" third-order analogue of the second-order wavelet spectra, nonetheless we can still obtain somewhat "reasonable" notions of a wavelet bispectrum.
A valid candidate for a definition of the "wavelet bispectrum" is one expressible through integration of a formula for "wavelet bispectral density", whose value at (f 1 , f 2 , t) depends on the input signals only via values of their wavelet transforms at time t. This implies in particular that the wavelet bispectrum can serve as a time-localised measure of the bispectral content of signals with time-dependent oscillatory characteristics.
Of course, the "reasonableness" of such a candidate definition of the wavelet bispectrum is a rather vague question, but a natural way of assessing it is by the following (also somewhat vague) question: For signals x, y and z that include prominent sinusoidal contributions A 1 cos(2πν 1 t + φ 1 ), • is the bispectral contribution within positive frequency-frequency space made by this triplet of sinusoidal components over a time-interval of duration ∆T approximately equal to 1 • is this bispectral contribution concentrated, in due proportion with the frequency resolution of the underlying wavelet ψ, around (ν 1 , ν 2 )?
A valid candidate for the definition of the wavelet bispectrum is considered to be a "reasonable" definition in so far as it provides an affirmative answer to the above question. We go on to give a definition of the wavelet bispectrum whose "reasonableness" is formalised in Theorem 8, in a limit as frequency resolution tends to ∞. Our numerics in Sec. 5.1 will indicate that the frequency resolution needed for "reasonable" results is not high at all.

Definition of the wavelet bispectrum (global version)
Given a wavelet function ψ as defined in Sec. 2.1, for each λ ∈ [0, 1], define Note that D ψ (λ) is strictly positive (by fixing r 1 = r 2 ∈ψ −1 ((0, ∞)) in the integrand) and bounded above by ∞ 0ψ (r) r dr 2 maxψ(·). It serves as a kind of third-order analogue of C ψ by fulfilling the following third-order analogue of (22b): Given a wavelet function ψ, a value κ > 0 and signals with ν > 0, we have for any t ∈ R. This fact is essentially immediate from (8).
Remark 3. If we chose any p-value other than 1 for the wavelet transform in (44), then we would need to change D ψ (λ) to an expression taking the form (This can be derived explicitly but also follows immediately from dimensional analysis.) At the start of Sec. 3 we showed the difficulty in defining the wavelet bispectrum, using finiteenergy input signals. If instead we consider sinusoidal signals as above, then this impossibility of a perfect third-order analogue of the second-order wavelet spectra is now represented by the fact that the factor D ψ is not constant but depends on the ratio of the frequencies as represented by λ. This is in contrast with the windowed-Fourier-transform bispectrum, where the factor K w is constant. Nonetheless, using D ψ (·) as defined above, we will define wavelet bispectral density by "pointwise" application of the normalisation D −1 ψ : Definition 4. Given a wavelet function ψ, a value κ > 0 and functions x, y, z ∈ L ∞ (R, R), define the logarithmic-frequency wavelet BD b ψ,κ,xyz : The normalisation factor D ψ f 1 f 1 +f 2 can be expressed as We refer to b ψ,κ,xxx as the logarithmic-frequency wavelet autoBD of x. As in Sec. 2.5, a BD b ψ,κ,uvw for which it is not the case that u = v = w is referred to as a cross-BD.

For any
We refer to b ψ,κ,xyz as the wavelet bispectrum. (The terminology regarding wavelet autobispectra and wavelet cross-bispectra is analogous to the above terminologies for autoBD and cross-BD.) For any compact K ⊂ (0, ∞) 2 × R the restriction of b ψ,κ,xyz to B(K) defines a complex-valued measure on K.
Remark 5. Since we constructed our definition of wavelet bispectra on the basis of logarithmicfrequency consideration as reflected in (44), we have not incorporated negative frequencies in our definition. This is not problematic or restrictive, since in general, all bispectral analysis can be restricted to positive frequencies (see Sec. 2.5).
Remark 6. The integral in (46), as well as the integral in (42) and the integral in (43), can be computed by integrating only below the diagonal and then doubling the result. Note also that D ψ (λ) = D ψ (1 − λ), or equivalently that the expression D ψ is symmetric in f 1 and f 2 . It follows in particular that the wavelet autobispectral density b ψ,κ,xxx orb ψ,κ,xxx is symmetric in its frequency inputs.  Now using the same wavelet phase notation as was used in Eq. (27), we have that We refer to the argument φ as the biphase associated to the frequency pair (f 1 , f 2 ) at time t. Of the differences between Fourier bispectra and wavelet bispectra, one that we emphasise is that due to Eqs. (48) and (47), the wavelet bispectrum b ψ,κ,xyz is able to take into account how constant the biphase associated to each frequency pair remains over time. Just as described in Sec. 2.4 for second-order wavelet cross-spectra, this is particularly useful when analysing signals recorded from open systems. Constancy of biphase is referred to as bicoherence. Detailed discussion of bicoherence and its quantification is given in Appendix B. 3.2. Theoretical justification for the wavelet bispectral density formula (45) In this section, we consider the wavelet bispectrum of a triplet of signals (any two of which or all three of which could be the same), each consisting of a finite sum of sinusoids. We work with a one-parameter family of wavelets whose parameter σ represents the "frequency resolution" (or inversely represents "frequency uncertainty"), and consider the limit as σ → ∞. So, in principle, the setup we consider here is , and (taking κ = 1 without loss of generality) we consider the limiting behaviour of b ψ,1,xyz as σ → ∞. However, one slight issue is that for a given function g, even if the inverse Fourier transform of the function exists and fulfils all the conditions of being a wavelet function (as defined in Sec. 2.1), this might not guarantee that the inverse Fourier transform ofψ is an L 1 function for all σ, and therefore might not guarantee that the wavelet transforms W ψ,1,x , W ψ,1,y and W ψ,1,z as given by (6) are well-defined for all large σ. Therefore, in the result below we consider the "wavelet transform" defined by (8) rather than (6).
The proof is given in Sec. 4. Remark 9. In Theorem 8, the limit as frequency uncertainty tends to zero is only taken within a one-parameter family (g σ ) σ≥1 . The natural next question is whether some condition on a general sequence of wavelet functions (ψ n ) n∈N can be found that guarantees the same conclusions as in Theorem 8. We conjecture that under some suitable quantitative definitions of frequency uncertainty and time-frequency uncertainty for wavelet functions, the conclusions of Theorem 8 will hold for any sequence (ψ n ) n∈N of wavelet functions for which the frequency uncertainty tends to zero while the time-frequency uncertainty remains bounded.

Frequency-localised bispectra
Suppose we have a region of frequency-frequency space S ⊂ (0, ∞) × (0, ∞) contained within a strip of the form } for some fixed λ ∈ (0, 1). One may wish to consider the local logarithmic-frequency wavelet BD b ψ,κ,xyz;λ : The point here is that, in contrast to the "global" definition (45), we do not vary the normalisation factor D ψ (λ) −1 in accordance with the input frequency pair (f 1 , f 2 ). We then define for any A ∈ Bb ψ,κ,xyz contained in S ×R. Note that the localised bispectrum b ψ,κ,xyz;λ is meaningless if the frequency resolution of the wavelet is so low that S does not capture virtually all the wavelet bispectral content arising from bispectral contributions of interest within S.
Recall that our notion of the wavelet bispectrum was constructed from logarithmic-frequency consideration as in (44). Frequency-localised bispectra constructed from analogous linear-frequency consideration will be presented in Appendix C.
If one is not concerned with "absolute" values for bispectral results but only comparisons between bispectral results, then the normalising factor D ψ (λ) −1 can be removed from (49). Nonetheless, due to Remark 3, the wavelet transform terms must still be with p = 1. The unnormalised function (f 1 , f 2 , t) → W ψ,κ,x (f 1 , t)W ψ,κ,y (f 2 , t)W ψ,κ,z (f 1 + f 2 , t) has an additional possible advantage for locating the frequencies of oscillatory contributions: Ifψ is unimodal with its mode at κ, then for input signals with A 1 , A 2 , A 3 , ν 1 , ν 2 > 0, at any time t the mode of will occur precisely at (ν 1 , ν 2 ). (This of course does not hold for any p = 1.) Moreover, if we define the normalised version then at any time t the value b ψ,κ,xyz (ν 1 , ν 2 , t) -whose magnitude is the peak value of |b ψ,κ,xyz | -is precisely equal to A 1 A 2 A 3 e iθ .
Let A be the maximum of the amplitudes A ij , and let γ = max r>0 g(r).

By (58), this implies that
Combining this with (59) gives that for all t ∈ R, as required.

Numerical examples
The purposes of this section are: (a) to illustrate that our definition of the wavelet bispectrum in Sec. 3.1 gives appropriate results for fixed-frequency oscillatory components (in accordance with the "reasonableness" criteria in the discussion immediately preceding Sec. 3.1); (b) to illustrate the usage of our wavelet bispectrum formula for investigating interactions within time-dependent oscillatory dynamics, with the example of a coupled phase-oscillator model.
The first purpose is primarily addressed by Sec. 5.1, and the second is addressed by Sec. 5.2. For our numerics, we will use a family of wavelets ψ σ , indexed by a parameter σ > 0, given bŷ for r > 0, where f σ is the probability density function of a normal distribution of mean zero and variance σ 2 ; and we take κ = 1, which is whereψ is maximised for each σ. These wavelets are known as lognormal or log Gabor wavelets [32,14,45,25]. How the parameter σ relates quantitatively to the frequency resolution and time localisation of the lognormal wavelet is presented in Appendix A. But the key point is that larger σ corresponds to higher frequency resolution and poorer time localisation. If the frequencies of oscillatory components in the signals being analysed are not very far from each other, then a high value of σ needs to be used, otherwise the representations of the oscillatory components in the wavelet transform will overlap with and interfere with each other. If the frequency of an oscillatory component of interest is quickly varying over time, then one must not use too high a value for σ, otherwise temporal variations will be smeared and will interfere with each other in the wavelet representation. In any case, in application to real data, the value chosen for σ should generally not be less than about 0.7, otherwise the wavelet transform is too highly time-localised to be able to "see oscillations". (This is analogous to the fact that the human ear cannot hear pressure vibrations of frequency less than about 20 Hz due to the ear's time-frequency analyser being too time-localised to pick up such low-frequency oscillations.) For σ larger than about 0.7, the lognormal wavelet may be approximated reasonably well by the Gabor function This is illustrated in Fig. 5. The admissibility constant (20) of the lognormal wavelet function is given by In (a,c,e) the modulus |ψ σ (r)| is shown in colour, against which f σ (r) is shown in black. In (b,d,f) the argument of ψ σ (r) is shown in colour, against which the wrapped angle 2πr is shown in black. We see that |ψ σ (r)| is approximately equal to f σ (r), and we see that where f σ (r) is not very small (compared to its maximum value 1 √ 2πσ ), the argument of ψ σ (r) is approximately equal to 2πr modulo 2π.

Sinusoidal oscillations
In this section, we define the "instantaneous wavelet bispectral content" of a set A ∈ B((0, ∞) 2 ) at a time τ by b inst ψ,κ,xyz (A, τ ) := provided the integrand is absolutely integrable. All the signals in this section are simulated over a time interval [0 s, 60 s], and instantaneous wavelet bispectral results are considered at τ = 30 s. We have seen that whereas the second-order normalisation C ψ derived in (21) or (22b) is a constant, the analogous third-order calculation represented by (44) yields a quantity D ψ (λ) that is dependent on the distance (in terms of logarithmic axes) from the diagonal of frequency-frequency space. Since the quantity D ψ (λ) is derived precisely in terms of integration of bispectral results for sinusoids, all "imperfect" results in the application of our wavelet bispectrum definition to sinusoids will be specifically due to the variability of D ψ (λ) with respect to λ. Theorem 8 justifies that the imperfection tends to 0 in the limit as frequency resolution tends to ∞. The goal of this section is essentially to illustrate that for the lognormal wavelet, any choice of σ ≥ 0.5 will only lead to small imperfection. (As stated at the start of Sec. 5, using σ much less than 0.7 is physically meaningless.) This fact is already shown to some degree by Fig. 4: it is seen that for σ ≥ 0.5, the variability of D ψσ (λ) with respect to λ is fairly small. In the limit as σ → ∞, Lemma 10 gives that for which the range is between ( √ 8π) −1 ≈ 0.1125 and ( √ 6π) −1 ≈ 0.1299. We start by considering exactly the same signals (5) as considered in the Introduction with the same parameters, namely ν 1 = 2.8 Hz, ν 2 = 1.3 Hz, ν 3 = 12 Hz and θ = π 3 (as in the caption of Fig. 2). The logarithmic-frequency wavelet bispectral density b ψσ,1,xyz (f 1 , f 2 , τ ) is shown in Fig. 6, using σ = 1 and σ = 4 just as in Fig. 2. We see a blurry peak around (ν 1 , ν 2 ) and around (ν 1 , ν 3 ) for both σ = 1 and σ = 4. For σ = 1, we mark a region S 1 containing the visible blur around (ν 1 , ν 2 ) and a region S 2 containing the visible blur around (ν 1 , ν 3 ), both indicated in Fig. 6. A numerical computation yields that Likewise, for σ = 4, we mark a region S 3 containing the visible blur around (ν 1 , ν 2 ) and a region S 4 containing the visible blur around (ν 1 , ν 3 ), again both indicated in Fig. 6. A numerical computation yields that b inst ψ 4 ,1,xyz (S 3 , τ ) ≈ 0.1250 and b inst ψ 4 ,1,xyz (S 4 , τ ) ≈ 0.1250e 0.3333πi . So we see that all four of these regions have instantaneous bispectral content equal in magnitude to approximately 1 8 . In particular, for both frequency resolutions we are able to show that the two  Figure 7: More wavelet bispectra for fixed-frequency oscillatory components. In (a), |b ψσ,1,xxy (f 1 , f 2 , τ )| is shown for x, y as in (61) with ν = 2.8 Hz, with σ = 0.5. In (d,g) is shown, respectively with σ = 1 and σ = 2, |b ψσ,1,xyz (f 1 , f 2 , τ )| for x, y, z as in (5) with ν 1 = ν 2 = 2.8 Hz, ν 3 = 12 Hz and θ = 0. In (b,e,h) is shown, respectively with σ equal to 0.5, 1 and 2, |b ψσ,1,xyz (f 1 , f 2 , τ )| for x, y, z as in (5) with ν 1 = 2.8 Hz, ν 2 = 1.3 Hz, ν 3 = 12 Hz and θ = 0. In (c,f,i) is shown, respectively with σ equal to 0.5, 1 and 2, |b ψσ,1,xyz (f 1 , f 2 , τ )| for x, y, z as in (5) with ν 1 = 2.8 Hz, ν 2 = 0.9 Hz, ν 3 = 12 Hz and θ = 0. All nine plots use the same frequency axes. The instantaneous wavelet bispectral content of the circled regions in (a-c) are given in the text.
bispectral contributions are of essentially the same magnitude as each other, which could not be seen from the current standard definition of wavelet bispectra as described in the Introduction. Even if we consider lower frequency resolution, our wavelet bispectrum definition gives good results. Fig. 7(a) shows b ψσ,1,xxy (f 1 , f 2 , τ ) with σ = 0.5 for the pair of signals with ν = 2.8 Hz. The instantaneous wavelet bispectral content of the white-circled region in Fig. 7(a) is approximately 0.1255, again remarkably close to the ideal value of 1 8 . Likewise Fig. 7(bc) shows b ψσ,1,xxy (f 1 , f 2 , τ ) with σ = 0.5 for signals x, y, z of the form (5), with θ = 0. The instantaneous wavelet bispectral content of the white-circled region and the black-circled region in Fig. 7(b) are approximately 0.1260 + 0.0002i and 0.1254 − 0.0002i respectively. The instantaneous wavelet bispectral content of the white-circled region and the black-circled region in Fig. 7(c) are approximately 0.1253 + 0.0001i and 0.1250 − 0.0001i respectively. All of these values are very close to 1 8 , as desired. We also see in the remaining plots of Fig. 7 how the size of the blurs in frequency-frequency space decrease with increasing σ.

Coupled phase oscillators
In this section, given a time interval I ⊂ R we define the "time-marginalised bispectral density"  1 3 s. (This corresponds to one time period of the frequency modulation that will be introduced later.) One form of interaction between oscillators that bispectral analysis is likely to be able to detect is dynamical coupling in the form of added coupling terms in the differential equations of motion [73,28]. Here, we consider Kuramoto-type symmetric coupling between two phase oscillators. The simplest scenario would be to introduce the coupling between two linear phase oscillators, as in the usual Kuramoto model; however, as we shall explain later, bispectral analysis is probably unable to detect such coupling between two linear phase oscillators. Therefore, instead, we introduce the Kuramoto coupling between two highly nonlinear phase oscillators. We consider two phase oscillators θ 1 (t) and θ 2 (t) and define the signal x(t) = cos(θ 1 (t)) + cos(θ 2 (t)) on which autobispectral analysis will be carried out. In the first instance, we suppose the phase oscillators are uncoupled and have fixed basic frequency; specifically, we take them to follow the differential equation θ 1 (t) = 2πν 1 (1 + 0.6 cos(θ 1 )) θ 2 (t) = 2πν 2 (1 + 0.6 cos(θ 2 )) (64) with ν 1 = 0.9 Hz and ν 2 = 1.4 Hz. The oscillator θ 1 is strictly periodic with frequencyν 1 := 0.8ν 1 = 0.72 Hz, and the oscillator θ 2 is strictly periodic with frequencyν 2 := 0.8ν 2 = 1.12 Hz. For this case, Fig. 9(a) shows the wavelet transform of x and Fig. 9(b) shows b I 0 ψσ,1,xxx (f 1 , f 2 ), both with σ = 3. The six most prominent bispectral contributions that we see in Fig. 9(b) are: the diagonal contribution around (ν 1 ,ν 1 ), and the off-diagonal contribution around (2ν 1 ,ν 1 ) and its mirror-reflection around (ν 1 , 2ν 1 ); and likewise, the diagonal contribution around (ν 2 ,ν 2 ), and the off-diagonal contribution around (2ν 2 ,ν 2 ) and its mirror-reflection around (ν 2 , 2ν 2 ). These bispectral contributions are simply due to the nonlinearity of the two individual oscillators; they do not signify any kind of interaction between the oscillators.
However, when we introduce coupling, new peaks appear that indicate an interaction between the oscillators. Specifically, consider now the coupled system θ 1 (t) = 2πν 1 (1 + 0.6 cos(θ 1 )) + K sin(θ 2 − θ 1 ) with ν 1 and ν 2 as before, and K = 0.2 rad/s. This coupling is not very strong, but it has a significant effect. Once again, Fig. 10(a) shows the wavelet transform of x and Fig. 10(b) shows b I 0 ψσ,1,xxx (f 1 , f 2 ), both with σ = 3. We see in Fig. 10(b) prominent bispectral contributions around roughly the same points as in Fig. 9(b), plus some new peaks, the most prominent being a bispectral contribution roughly around (ν 2 ,ν 1 ) as well as its mirror-reflection roughly around (ν 1 ,ν 2 ). So we have illustrated how the bispectrum can detect the introduction of an interaction between the oscillators. But all of this so far could have been illustrated by traditional non-time-evolving Fourier bispectral analysis. So we now introduce slow frequency modulation. For simplicity, the modulation is sinusoidal in shape, but any shape of (not too fast) frequency modulation can be treated by wavelet bispectral analysis. Consider the system θ 1 (t) = 2πν 1 (t)(1 + 0.6 cos(θ 1 )) + K sin(θ 2 − θ 1 ) θ 2 (t) = 2πν 2 (1 + 0.6 cos(θ 2 )) + K sin(θ 1 − θ 2 ) with ν 2 and K as before, and where f m = 0.03 Hz. So now θ 1 has slowly varying inherent frequency, approximately of the sinusoidal shape 0.8ν 1 (t). The time interval I 0 corresponds to exactly one time period of ν 1 (t). Fig. 11(a) shows the wavelet transform of x over I 0 and Fig. 11(b) shows b I 0 ψσ,1,xxx (f 1 , f 2 ), both with σ = 3. Some bispectral contribution such as that around (ν 1 ,ν 1 ) are now bimodal rather than unimodal. This is because the sinusoidal temporal variation of ν 1 (t) spends more time near its extreme values 0.81 Hz and 0.99 Hz than it does around each value in between.
With the above example, we now illustrate how our new definition of the wavelet bispectrum is able to quantify and compare bispectral contributions over a given time-interval, across different parts of frequency-frequency space; as described in the Introduction, this was not possible under the previous state-of-the-art of wavelet bispectral analysis.
In Fig. 11(b) are marked regions R purple , R grey and R black respectively around the blurry peaks containing (ν 1 ,ν 1 ), (2ν 2 ,ν 2 ) and (ν 2 ,ν 1 ). Once again, the first two regions contain contributions due to the nonlinearity of the individual oscillators, while the third represents the interaction that has been introduced between the oscillators. The bispectral content associated to each of the three indicated bispectral contributions are: Moreover, one can actually track over time the motion of the peaks and associated bispectral content. Define the time intervals I 1 , I 2 , I 3 ⊂ I 0 to be the 5-second intervals centred on 7.25f −1 m , 7.5f −1 m and 7.75f −1 m respectively; these are marked in Fig. 12. In Fig. 13 are shown b I 1 ψσ,1,xxx (f 1 , f 2 ), b I 2 ψσ,1,xxx (f 1 , f 2 ) and b I 3 ψσ,1,xxx (f 1 , f 2 ); and on these plots are marked respectively the regions R black , R black and R (3) black which help to trace over time the bispectral contribution to the green-marked region in Fig. 11(b). Again, our new definition of the wavelet bispectrum enables quantification of the time-evolving bispectral contributions: Here, we have sampled three time-subintervals. A more continuous-time tracking of bispectral content can be achieved by following the frequency variation of oscillatory components of interest directly from the wavelet transforms themselves via "ridge-extraction" methods, as described in Remark 12 (in Appendix B). Bicoherence analysis (Appendix B.2) can then also be carried out.

Why the bispectrum does not detect the coupling for linear phase oscillators
Still working with x(t) = cos(θ 1 (t)) + cos(θ 2 (t)), we consider the Kuramoto model We see in Fig. 14 that the Fourier components of x(t) have frequencies that all lie within a set of the form {|ξ 1 + nξ 2 | : n ∈ Z}. If ξ 1 and ξ 2 are rationally independent, then no two frequencies in this set can have a sum that also lies in this set. Thus bispectral analysis would not be able to detect the coupling. Instead, at least some kind of trispectral (i.e. fourth-order spectral) analysis would be needed: perhaps the simplest and most direct approach, as proposed in [28], would be to investigate the presence of frequency triplets with equal spacing between the two consecutive pairs of frequencies, using a time-evolving version of the "modified" Fourier bispectrum (cf. also [62])

Application to cell membrane potential data
Cell membrane potential oscillations and coupling of cellular oscillators are of considerable importance in understanding cellular biology [19,3,48,49]. In this section, we illustrate application of our new definition of the wavelet bispectrum to a cell membrane potential time-series, recorded with the free-running patch clamp technique; full details of the experimental setup are given in [50]. For a much more extensive analysis of cell membrane potential time-series, see [52].
The membrane potential time-series x raw (t) that we consider is a digital signal recorded over a time-interval I = [0 s, 600 s], shown in Fig. 15(b). We apply the MATLAB R detrend function, which subtracts a best-fit linear trend, to obtain a linearly detrended signal x(t) defined over the same time-interval I. Fig. 15(a) shows the wavelet transform of x(t), using a lognormal wavelet ψ σ (as used throughout Sec. 5) with frequency resolution σ = 3. Along similar lines to Sec. 5.2, before seeking to track time-evolving bispectral values, we first give a plot of the bispectral content of x taken "over the whole time-interval", in order to ascertain where peaks in the bispectrum lie. Since the signal is of finite duration and finite sampling frequency, there is only a subset of time-frequency space over which the wavelet transform can be calculated (as seen in Fig. 15(a)). Therefore, for each pair of frequencies (f 1 , f 2 ), we define I(f 1 , f 2 ) ⊂ I to be the interval of times t at which the product of wavelet terms W ψ 3 ,1,x (f 1 , t)W ψ 3 ,1,x (f 2 , t)W ψ 3 ,1,x (f 1 + f 2 , t) can be computed. We write T (f 1 , f 2 ) for the duration of the time-interval I(f 1 , f 2 ). In Fig. 16(a), we plot the magnitude of the "time-averaged bispectral density" where the notation b I ψ,κ,xyz denotes "time-marginalised bispectral density" as in Eq. (63). To help give an indication of whereabouts in frequency-frequency space the bispectral values shown in Fig. 16(a) are significantly high, we carry out a 95%-significance surrogate test using WIAAFT surrogates [38]: 19 digital signals x n (t), n = 1, . . . , 19, of the same length as the digital signal x(t), are sampled independently of each other from the WIAAFT surrogate distribution generated by x(t), and the quantity b ave,xnxnxn (f 1 , f 2 ) is computed for these 19 signals exactly as it was computed for x(t). At each pair of frequencies (f 1 , f 2 ), the 95% significance threshold is given by Fig. 16(b) shows this critical threshold. Fig. 16(c) shows the region of (f 1 , f 2 )-values for which , with the colour-coding representing the difference We note that all three plots in Fig. 16 are symmetric in the diagonal, because these plots are for autobispectra.
One of the significant bispectral peaks that we see in Fig Fig. 17 show the time-marginalised bispectral densities over these five time-intervals. We specify a region R around the abovementioned frequency pair; this region is marked by a black circle in Fig. 16(a,c) and all the plots in Fig. 17. The time-evolution of the bispectral content of R can be traced as follows: The sequence of above values reflects the general trend seen in Fig. 17 that the contributions to the wavelet autobispectrum of x are decreasing in intensity over time. Indeed, by the third minute of the experiment, the main bispectral contribution in R has essentially disappeared altogether. This general trend seen in Fig. 17 may be due to the weakening activity of cellular oscillatory processes as the cell gradually dies during the patch clamp experiment [52].
In this section, we have illustrated how our new definition of the wavelet bispectrum b ψ,κ,xyz can be used to study the time-evolving bispectral content of experimental time-series data.

Appendix A. Resolution properties of lognormal wavelets
We describe how, for lognormal wavelets, the parameter σ relates quantitatively to the frequency resolution and time localisation of the wavelet transform; for reference, we compare this with the frequency resolution and time localisation properties of the Gaussian-windowed Fourier transform. A numerical study of the performance of lognormal wavelets is also included in [25], which uses an alternative approach to quantifying resolution properties from what we present below.
For reference we start by considering windowed Fourier transforms with Gaussian windows. First let us define our approach to quantification of time localisation and frequency resolution for windowed Fourier transforms. Given an even function w ∈ L 1 (R, [0, ∞)) ∩ L 2 (R, [0, ∞)) \ {0}, probably the most standard way to quantify the time localisation and frequency resolution of the associated windowed Fourier transform is, inversely, by the Heisenberg uncertainty properties of w.
• the Heisenberg frequency uncertainty of w by . Now for any τ > 0 let f τ be the probability density function of the normal distribution of mean zero and variance τ 2 . The squared Heisenberg uncertainties of the Gaussian function f τ are given by The product time (f τ ) freq (f τ ), called the Heisenberg area or the Heisenberg time-frequency uncertainty of f τ , is equal to 1 4π (independently of τ ). This is the lowest possible value for the Heisenberg time-frequency uncertainty of a window function, and is uniquely obtained by Gaussian windows [31,Sec. 2.2].
We now describe how the concept of Heisenberg uncertainties can be translated to the setting of wavelets and the wavelet transform. Given a wavelet function ψ with ψ 2 := ∞ −∞ |ψ(r)| 2 dr < ∞ and with admissibility constant C ψ := ∞ −∞ψ (e r ) 2 dr as in (20), together with a value κ > 0 (typically taken to be whereψ is maximised), we define • the Heisenberg time uncertainty of (ψ, κ) by • the Heisenberg linear frequency uncertainty of (ψ, κ) by • the Heisenberg logarithmic frequency uncertainty of (ψ, κ) by The logarithmic-frequency approach in the definition of freq (ψ, κ) is very natural for wavelets, due to the inherently logarithmic nature of frequency resolution for the wavelet transform as described in Sec. 2.2. Let ψ σ be the lognormal wavelet as defined in Sec. 5. We will take κ = 1, which is whereψ σ is maximised. It is not hard to show that ψ σ belongs to the Schwartz space S(R) for all σ. One can compute (via Plancherel's theorem) that This is very slightly more than the classical Heisenberg time uncertainty of a Gaussian window with variance σ 2 , namely 1 2 σ 2 . The Heisenberg logarithmic frequency uncertainty of (ψ σ , 1) is (essentially by definition) the same as the classical Heisenberg frequency uncertainty of a Gaussian window with variance σ 2 ; that is, Finally, one can compute that linfreq (ψ σ , 1) 2 = e 1 2π 2 σ 2 − 2e 3 16π 2 σ 2 + 1.
For σ not too small, this is very close to 1 8π 2 σ 2 .
Given a family (S f ) f >0 of bounded integrable functions S f : R 2 → [0, ∞) satisfying S f (ζ, τ ) = S f (ζ, −τ ) for all ζ, τ ∈ R, define the wavelet coherence between signals x and y by for all f > 0 and t ∈ R where this is well-defined (which includes in particular that the two terms in the denominator are non-zero). The family (S f ) f >0 serves as a smoothing kernel. The default dependence of S f on f should be for some S : R 2 → [0, ∞); a simple approach would be to take S to be a bivariate Gaussian probability density function with mean (0, 0) and to use p = 1 in the wavelet transform. However, if one already has information regarding how the quickly time-varying the frequencies of the different oscillatory components in the signals are, then it may make sense to alter the dependence of S f on f accordingly. If one wishes simply to follow the coherence along an individual time-frequency curve as described in Remak 11, then it may make sense not to have any dependence of S f on f . If well-defined, the wavelet coherence c (1) ψ,κ,xy (f, t) takes values between 0 and 1. Larger values may indicate the presence of a common oscillatory influence upon the signals x and y around time t, with the frequency at time t being approximately equal to either f or (in the case of a nonlinear oscillatory influence) f n for some integer n. One can then look at the wavelet energy spectra of x and y to gauge the strength and the linearity or nonlinearity of this oscillatory influence.
Note that the "autocoherence" c (1) ψ,κ,xx (f, t), wherever it is well-defined, is equal to 1. One can also choose not to smooth in time-frequency space but just to smooth in time (in which case the value of p makes no difference), and track over time the presence of frequency-intervals of high values. Additionally, one can fix a time-interval [T 0 , T 1 ] over which to look at the overall coherence at each frequency, by integrating against the rectangular window function 1 instead of smoothing.

Wavelet-phase coherence
An alternative approach to coherence is to reverse the order of the "smoothing" and the "angleextracting" operations, so that wavelet amplitudes |W [p] ψ,κ,x (f, τ )| play no role whatsoever: one simply smooths the wavelet phase difference between the signals. The key difference between this approach and the previous approach is that in the previous approach, the measure of coherence between phases is weighted over time in proportion with the instantaneous wavelet amplitudes. (So while the previous approach is invariant under time-independent rescaling of wavelet amplitudes, it would not be invariant under time-dependent rescaling of wavelet amplitudes.) The "phases-only" approach was introduced in [36], using temporal smoothing with rectangular windows. More generally one can smooth in time-frequency space: Given a family (S f ) f >0 of integrable functions S f : R 2 → [0, ∞) satisfying S f (ζ, τ ) = S f (ζ, −τ ) for all ζ, τ ∈ R, define the wavelet-phase coherence between x and y by c (2) ψ,κ,xy (f, t) = R 2 S f (ζ, τ − t) W ψ,κ,x (e ζ , τ )W ψ,κ,y (e ζ , τ ) d(ζ, τ ) where v := v/|v|, for all f > 0 and t ∈ R where this is well-defined. Note that changing the value of p makes no difference to the results. The default dependence of S f on f should be as in (B.2), in which case But again, if one wishes simply to follow the coherence along an individual time-frequency curve as described in Remak 11, then it may make sense not to have any dependence of S f on f . As with wavelet coherence, the wavelet-phase coherence c (2) ψ,κ,xy (f, t) takes a value between 0 and 1, where larger values may indicate to the presence of a common oscillatory influence just as described above for wavelet coherence. Wherever the "autocoherence" c (2) ψ,κ,xx (f, t) is well-defined, it is equal to 1. Once again, one can choose to smooth only in time and track the presence of frequency-intervals of high values, and one can also fix a time-interval [T 0 , T 1 ] over which to look at the overall coherence at each frequency.

Appendix B.2. Bicoherence
Given either a single signal or two or three simultaneous signals, the wavelet bispectra can be used to search in a time-localised manner for influences coming from a pair of interacting oscillatory processes. For instance, one can integrate a sliding time-frequency-frequency window function with respect to the wavelet bispectrum and look for where the absolute value is high; that is, one can plot a function taking the form (or one can just smooth the wavelet BD with respect to time rather than time-frequency-frequency), and look for high values. However, high values can easily be due to high wavelet amplitudes happening to occur by fluke at a triplet of frequencies satisfying the frequency-sum rule. If the frequencies are time-varying, one way to help overcome this issue is to use a quantity that is invariant under rescaling of wavelet amplitudes and gives an indication of how constant the biphase remains around a given point in time-frequency-frequency space. Constancy of biphase is generally referred to as bicoherence. We will describe two possible such quantities measuring bicoherence, namely the third-order analogues of the measures of coherence described in Appendix B.1.
Remark 12. One can extend Remark 11 to the setting of bicoherence: We will give definitions of wavelet coherence c (1) ψ,κ,xyz (f 1 , f 2 , t) and wavelet-phase coherence c (2) ψ,κ,xyz (f 1 , f 2 , t) defined over (f 1 , f 2 , t) space, but it may be useful to follow these specifically over a "time-frequency-frequency curve" ((f 1,t , f 2,t )) t∈ [a,b] . In particular, by looking at the wavelet transform of x, y or z, one may be able to trace the temporal variations in frequency of some oscillatory component, e.g. by "ridgeextraction" methods as detailed in [25,26]. Accordingly, one can: Then one can plot t → c ψ,κ,xyz (i = 1, 2) is the time-frequencyfrequency-smoothed version of bicoherence as in (B.4) or (B.8). The same also applies to the "bicorrelations" ρ (i) ψ,κ,xyz (i = 1, 2) introduced further below.
Remark 13. A slightly simpler approach to looking for interacting oscillatory contributions via time-evolving bispectral analysis, as described in [28], is as follows: Plot over frequency-frequency space the modulus of the time-averaged wavelet BD over a time-interval [T 0 , T 1 ] of interest, identify points (ν 1 , ν 2 ) around which there is high bispectral content, and then plot the time-evolution of the biphase associated to (ν 1 , ν 2 ) over the time-interval [T 0 , T 1 ]. Subintervals during which this biphase remains roughly constant may correspond to interaction between oscillatory processes. One can also plot the "biamplitude" |b ψ,κ,xyz (ν 1 , ν 2 , t)| over t ∈ [T 0 , T 1 ] to look at the time-localised intensity of the bispectral contribution during the time-subintervals on which the biphase remains roughly constant. The limitation of this approach in its "raw form" is that it cannot follow variations of frequency in oscillatory components. Therefore a possible modification as described in [29] is to extract frequencies of oscillatory components as a function of time (e.g. as described in Remark 12) so as to give a time-dependent point (ν 1,t , ν 2,t ) in frequency-frequency space, and then -provided these frequencies are sufficiently slowly varying -plot the biphase at (ν 1,t , ν 2,t , t) against time t and once again look for time-subintervals during which this remains roughly constant.
In the below, we fix three signals x(t), y(t) and z(t), any two or which or all three of which could be the same.

Wavelet bicoherence
The definition of wavelet bicoherence was introduced in [72,73], where temporal evolution of bicoherence was tracked by partitioning the signal duration into small time-intervals on each of which the bicoherence at each point in frequency-frequency space was computed according to Eq. (3). The definition can be adapted to the more general framework of time-frequency-frequency smoothing in analogy to the time-frequency smoothing described in Appendix B.1, as follows: Given a family (S f 1 ,f 2 ) f 1 ,f 2 >0 of bounded integrable functions S f 1 ,f 2 : R 3 → [0, ∞) satisfying S f 1 ,f 2 (ζ 1 , ζ 2 , τ ) = S f 1 ,f 2 (ζ 1 , ζ 2 , −τ ) for all ζ 1 , ζ 2 , τ ∈ R, define the wavelet bicoherence of x and y with z by c ψ,κ,y (e ζ 2 , τ )W for some S : R 3 → [0, ∞); a simple approach would be to take S to be a trivariate Gaussian probability density function with mean (0, 0, 0) and to use p = 1 in the wavelet transform. However, if one already has information regarding how the quickly time-varying the frequencies of the different oscillatory components in the signals are, then it makes sense to alter the dependence of S f 1 ,f 2 on (f 1 , f 2 ) accordingly. If one wishes simply to follow the bicoherence along an individual timefrequency-frequency curve as described in Remak 12, then it may make sense not to have any dependence of S f 1 ,f 2 on (f 1 , f 2 ). If well-defined, the wavelet bicoherence c (1) ψ,κ,xyz (f, t) takes values between 0 and 1. Larger values may indicate the presence of a coupled-oscillator influence upon the signals x, y and z around time t, with the frequencies at time t being approximately equal to f 1 and f 2 , or f 1 and f 1 + f 2 , or f 2 and f 1 + f 2 , or possibly more generally any pair of frequencies whose integer span includes f 1 and f 2 . One can then look at the wavelet bispectrum b ψ,κ,xyz to gauge the strength and nonlinearity of this coupled-oscillator influence.
Note that unlike for second-order spectra, the wavelet autobicoherence c (1) ψ,κ,xxx (f 1 , f 2 , t) is just as free to take any value in [0, 1] as wavelet cross-bicoherences are.
One can also choose not to smooth in time-frequency-frequency space but just to smooth in time (in which case the value of p makes no difference), and track over time the presence of frequencyfrequency regions of high values. Following the same temporal rescaling as in (B.5), this would be as follows: given a bounded integrable even function S : R → [0, ∞), one defines the associated wavelet bicoherence by ) wherever this is well-defined. Likewise, one can look at the overall wavelet bicoherence across a time-interval [T 0 , T 1 ], namely wherever this is well-defined.

Wavelet-phase bicoherence
We introduce here an alternative approach to bicoherence which works directly with the wavelet phases, which we accordingly call "wavelet-phase bicoherence". (It may also be regarded as the phase-lag-invariant analogue of the "real wavelet biphase" introduced in [33].) The key difference between this approach and the previous approach is that in the previous approach, the measure of bicoherence is weighted over time in proportion with the instantaneous wavelet amplitudes. (So while the previous approach is invariant under time-independent rescaling of wavelet amplitudes, it would not be invariant under time-dependent rescaling of wavelet amplitudes.) Given a family ( for all ζ 1 , ζ 2 , τ ∈ R, define the wavelet-phase bicoherence of x and y with z by c where v := v/|v|, provided this is well-defined. Note that changing the value of p would make no difference whatsoever to the results. Again, a natural default dependence of S f 1 ,f 2 on (f 1 , f 2 ) would be as in (B.5), in which case R 3 S f 1 ,f 2 (ζ 1 , ζ 2 , τ ) d(ζ 1 , ζ 2 , τ ) = 1 min(f 1 , f 2 ) R 3 S(ζ 1 , ζ 2 , τ ) d(ζ 1 , ζ 2 , τ ).
But again, if one wishes simply to follow the bicoherence along an individual time-frequencyfrequency curve as described in Remak 12, then it may make sense not to have any dependence of S f 1 ,f 2 on (f 1 , f 2 ). Again, this takes a value between 0 and 1, where larger values may indicate to the presence of a coupled-oscillator influence just as described above for wavelet bicoherence. One can look at the wavelet bispectrum b ψ,κ,xyz to gauge the strength and nonlinearity of this influence. Once again, the wavelet-phase autobicoherence c (2) ψ,κ,xxx (f, t) is just as free to take any value in [0, 1] as wavelet-phase cross-bicoherences are.
A remark about unidirectional coupling Sometimes unidirectionality of coupling of oscillators can be suggested by analysis of bicoherences and bispectral densities. Slightly generalising the description in [30]: Suppose we have signals x, y and z, where y and z could be the same as each other or different, recorded from a system that contains an oscillatory process consisting of two phase-coupled oscillators. Suppose that • analysis of properties of the bispectrum b ψ,κ,xyz around (f 1 , f 2 , t) detects the presence of this oscillatory process; • analysis of properties of the bispectrum b ψ,κ,xyx around (f 1 , f 2 , t) does not detect the presence of this oscillatory process; • analysis of properties of the cross-energy spectrum p ψ,κ,xy around (f 2 , t) does not detect the presence of this oscillatory process.
This may indicate that only one of the two oscillators is detected within the signal x, and that this oscillator is unidirectionally driving the other oscillator.

Quadratic phase coupling and bicorrelation
In some contexts where bicoherence is observed, i.e. where the biphase remains roughly constant over a time-interval of interest, it may also be useful to ascertain whether or not the constant value at which the biphase roughly lies is zero, since this may help indicate in the context the type of interaction between the oscillations [57]. Bispectral content around a point in frequency-frequency space where the biphase remains constant at 0 is often known as quadratic phase coupling (although the terminology is not entirely uniform 6 ).
All of these bicorrelation values are between −1 and 1: values close to 1 are indicative of quadratic phase coupling.
Remark 15. For λ ∈ (0, 1), the formula (42) for D ψ (λ) can be expressed in a similar form to (C.2), namely in which case the integrand converges pointwise to 0 as λ → 0 or 1, just as it does for (C.2); however, the conditions of the dominated convergence theorem are not fulfilled as they were for the expression (42) (as described in Remark 7).