Temporal Multiresolution Graph Learning

Estimating time-varying graphs, i.e., a set of graphs in which each graph represents the relationships among nodes in a certain time slot, from observed data is a crucial problem in signal processing, machine learning, and data mining. While most existing methods estimate graphs at only a single temporal resolution, actual graphs often exhibit different relationships at different temporal resolutions. In this study, we propose an approach to time-varying graph learning that leverages a multiresolution property. The proposed method assumes that time-varying graphs can be decomposed into a linear combination of graphs localized at different temporal resolutions. We formulate a convex optimization problem for temporal multiresolution graph learning. In experiments, the proposed method achieves promising objective performance on synthetic data and recovers reasonable temporal multiresolution graphs from real data.


I. INTRODUCTION
Many applications of signal processing, machine learning, and data mining require the handling of sensor data, where the sensors are often distributed nonuniformly in a physical space. Analyzing such data by considering their underlying spatial structure, i.e., a network, can significantly improve the quality of data analysis. Graphs are useful tools to mathematically represent such networks.
In many cases, graphs are not given a priori. Therefore, graph learning [11]–[14], i.e., techniques and algorithms for estimating a graph from observed data and/or feature values, is required in various graph signal processing (GSP) applications, particularly for sensor measurements.
Various graph learning methods have been proposed thus far, which are summarized in two overview papers [13], [15]. Most of these methods belong to the class of static graph learning (SGL), which represents the task of learning a single graph from a set of available data. SGL assumes that all the data follow the same signal generation model. However, time-varying relationships exist in many applications. In other words, the underlying networks vary over time, and the observed data follow a time-varying generation model. An example of a time-varying generation model is brain functional connectivity estimated from EEG or fMRI data, where signals can be regarded as being generated by evolving brain networks. Time-varying models also arise naturally in the seasonal behavior of sensor measurements.
To handle these dynamic behaviors, time-varying graph learning (TVGL) methods have been proposed [12], [16]–[18]. Technically, a time-varying graph consists of multiple graphs, each of which corresponds to the relationships among vertices in a certain time slot. Typically, TVGL divides multivariate time-series data into consecutive (overlapping or nonoverlapping) data segments, and learns multiple graphs from these segments.
TVGL exhibits a trade-off in the temporal resolution. For example, a large time window allows the capture of a global structure but results in the loss of local temporal behavior. In contrast, a small time window enables the capture of fast-changing behaviors but may result in noise sensitivity. To tackle this problem, most existing TVGL methods [12], [16], [17] impose constraints on the temporal variations of graphs between neighboring time slots. However, they have two main limitations. First, they are not suitable when the data do not fit the prior assumption of temporal variations. Second, even if the temporal variation information is known, it is often difficult to determine the hyperparameter(s), such as those in the constraints on the temporal variation and the temporal window size.
In this study, we propose a TVGL method that does not use a specific prior for the network evolution. Instead, we assume that the time-varying graphs have a temporal multiresolution (TMR) structure: they can be represented by a combination of graphs at different temporal resolutions, from local (i.e., short time period) to global (i.e., static) ones. This is desirable because multivariate time-series data tend to have a multiresolution property. An example of such TMR data is temperature data observed at multiple sensor locations. The measurements at each location exhibit interdependence relationships that change hourly, daily, monthly, and even yearly, where the relationships correspond to structures localized at different temporal resolutions. Our proposed method automatically reveals which edge sets are localized at which temporal resolution. In our problem setting, temporal resolution levels correspond to time window sizes in the time-varying graphs, but they do not necessarily need to be set as a hyperparameter.
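To make the decomposition concrete, the following sketch (function and variable names are ours, not from the paper) combines hypothetical edge-weight vectors at two temporal resolution levels into per-time weights, so that each time instant is the sum of the segments covering it:

```python
import numpy as np

def reconstruct_weights(tmr_graphs, T):
    """Combine TMR edge-weight vectors into per-time weight vectors.

    tmr_graphs[l] is a list of 2**l weight vectors (one per segment
    at resolution level l). Illustrative names, not the paper's API.
    """
    L = len(tmr_graphs) - 1
    w = []
    for t in range(T):
        # sum the segment that covers time t at every resolution level
        w_t = sum(tmr_graphs[l][(2**l * t) // T] for l in range(L + 1))
        w.append(w_t)
    return np.array(w)

# toy example: E = 3 edges, T = 8, levels l = 0 (global) and l = 1
tmr = [
    [np.array([1.0, 0.0, 0.0])],                             # l = 0
    [np.array([0.0, 2.0, 0.0]), np.array([0.0, 0.0, 3.0])],  # l = 1
]
W = reconstruct_weights(tmr, T=8)
# the first edge is "static"; the other two are active in one half each
```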
The proposed TVGL is formulated as a convex optimization problem derived from the generation model. We also present an iterative algorithm for solving the optimization problem efficiently, which guarantees the convergence of the solution.
In experiments with synthetic datasets, the proposed method demonstrates superior performance to that of conventional single-resolution TVGL methods. Experiments on a real climate dataset demonstrate that the proposed method can learn reasonable time-varying graphs that capture seasonal and geographical characteristics, providing an experimental proof of concept.
Our preliminary work was presented in [19]. The present study significantly extends our previously proposed approach to simultaneously learn graphs localized at multiple temporal resolutions.
The remainder of this paper is organized as follows. Related works on the proposed methods are summarized in Section II. Notations used in this paper and preliminaries are defined in Section III. We present a generic TVGL framework from multivariate time-series signals in Section IV. Section V presents the proposed TMR graph learning method. Experimental results with synthetic and real data are presented in Section VI. Finally, we present our conclusions in Section VII.

II. RELATED WORKS
Many methods for graph learning have been proposed thus far. Most of them are summarized in the two overview papers [13], [15]. Without being exhaustive, we review SGL and TVGL methods related to our approach.
The basic strategy of SGL is the design of optimization problems based on some desired criteria for the learned graphs. For example, [20]–[22] assume that signals are smooth on a graph. This characteristic is often represented by the Laplacian quadratic form. Instead of smoothness, [11], [23], [24] assume that signals are generated from a Laplacian-constrained Gaussian Markov random field (LGMRF), and maximize its regularized likelihood. Some studies, such as [11], [13], [15], suggest a relation between the signal smoothness and the LGMRF likelihood; the smoothness-based approach in [21] solves a relaxed version of the LGMRF likelihood criterion.
As TVGL methods, [12], [16], [17] learn time-varying graphs under the assumption of signal smoothness and impose constraints on the network evolution. These TVGL methods are designed to learn time-varying graphs with a user-specified single temporal resolution (i.e., window size).
In a different line of GSP research, some graph learning methods assume that observations are generated by applying some filters, e.g., graph variation and heat diffusion operators, to a latent signal [25]–[27]. Their extensions to TVGL, proposed in [28], [29], focus on estimating graphs and the corresponding filters simultaneously under assumptions on the stationarity of graph signals or on signal generation models based on graph filters. Such a simultaneous estimation is outside the scope of this study; the method proposed here is based on a different signal generation model. Note that all of the previous works mentioned above are single temporal resolution TVGL approaches.
In contrast to the existing approaches, we estimate graphs having multiple temporal resolutions to capture various temporal relationships. To the best of our knowledge, this is the first attempt in which TVGL has been used to extract a TMR behavior.
Some studies focus on learning multiple graphs (not necessarily time-varying) from observations. While they yield multiple graphs, the learned graphs may not represent time-varying relationships [30], [31].
From a machine learning perspective, TVGL relates to time-varying inverse covariance estimation [32]–[35]. The main difference between TVGL and inverse covariance estimation is whether the optimization problem contains a constraint on the graph Laplacian. For example, a well-known inverse covariance estimation method, the graphical Lasso [36], yields a covariance matrix that corresponds to a graph with negative edges and self-loops. This would be inappropriate if we need to learn time-varying graphs with nonnegative edge weights and without self-loops, which is a typical assumption in GSP. In contrast, our approach constrains the solution space so that the learned graphs have nonnegative edges and no self-loops.

III. PRELIMINARIES

A. NOTATION
Lowercase normal, lowercase bold, and uppercase bold letters denote scalars, vectors, and matrices, respectively. Both X ij and [X] i,j represent the (i, j) entry of the matrix X. Furthermore, the jth column vector of X is denoted by [X] j . 1 is an all-one vector. X•Y represents the Hadamard product of X and Y. The Moore-Penrose pseudo inverse of X is denoted by X † .

B. BASIC DEFINITIONS FOR GRAPHS
A weighted graph G = (V, E, W) is a graph with a vertex set V and an edge set E, where the numbers of vertices and edges are denoted by N = |V| and E = |E|, respectively. W ∈ R^{N×N} denotes a weighted adjacency matrix whose (i, j) element represents the edge weight between the ith and jth vertices. In this study, we assume that G is undirected with nonnegative edge weights and has no self-loops, i.e., W is symmetric with nonnegative elements and all of its diagonal elements are zero. The graph Laplacian is given by L = D − W, where D is the diagonal degree matrix defined by D_ii = Σ_j W_ij.

C. GRAPH LAPLACIAN OPERATOR
Let w ∈ R_+^{N(N−1)/2} be a vector composed of edge weights. It corresponds to the vectorized version of the strict lower-triangular part of W (excluding the diagonal elements). The graph Laplacian operator L : R_+^{N(N−1)/2} → R^{N×N} maps w to the corresponding graph Laplacian, i.e., [L(w)]_{ij} = −w_k for i > j, where k is the index of the pair (i, j) in w, [L(w)]_{ji} = [L(w)]_{ij}, and [L(w)]_{ii} = −Σ_{j≠i} [L(w)]_{ij}. Its adjoint operator is denoted by L*.
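A minimal sketch of this operator, assuming the standard strict-lower-triangular vectorization (the helper name is ours):

```python
import numpy as np

def laplacian_operator(w, N):
    """Map an edge-weight vector (vectorization of the strict lower
    triangle of W) to the graph Laplacian L(w) = D - W."""
    W = np.zeros((N, N))
    rows, cols = np.tril_indices(N, k=-1)
    W[rows, cols] = w
    W = W + W.T                                  # symmetric adjacency
    return np.diag(W.sum(axis=1)) - W

w = np.array([2.0, 0.0, 1.0])                    # N = 3 -> N(N-1)/2 = 3 weights
L = laplacian_operator(w, N=3)
# L @ 1 = 0: the all-one vector lies in the Laplacian's null space
```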

D. PRIMAL-DUAL SPLITTING ALGORITHM
Let Γ(R^n) be the set of all proper lower semicontinuous convex functions on R^n. A primal-dual splitting (PDS) algorithm [37] can solve optimization problems of the following form for some M ∈ N:

min_x g(x) + Σ_{i=1}^{M} h_i(A_i x),   (3)

where g, h_i ∈ Γ(R^n) are functions whose proximal operators can be computed efficiently, and A_i is a linear operator. The proximal operator prox_{γf} : R^n → R^n of f with a parameter γ > 0 is defined by

prox_{γf}(x) = argmin_y f(y) + (1/(2γ)) ‖y − x‖_2^2.

If f is separable, i.e., f(x) = Σ_i f_i(x_i) for x = [x_1^T, …, x_M^T]^T, its proximal operator reduces to the proximal operator of each component:

[prox_{γf}(x)]_i = prox_{γf_i}(x_i).

Let γ_1, γ_2 > 0 be parameters satisfying the convergence condition (given for our problem in Section V-B). The PDS algorithm is given by the following iteration:

x^{(n+1)} = prox_{γ_1 g}(x^{(n)} − γ_1 Σ_{i=1}^{M} A_i^* v_i^{(n)}),
v_i^{(n+1)} = prox_{γ_2 h_i^*}(v_i^{(n)} + γ_2 A_i (2 x^{(n+1)} − x^{(n)})),

where h_i^* denotes the convex conjugate of h_i, whose proximal operator can be computed via the Moreau decomposition prox_{γ h^*}(v) = v − γ prox_{h/γ}(v/γ).
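As a concrete example of proximal operators used by such algorithms, the following sketch implements soft-thresholding, the prox of the ℓ1-norm, and its nonnegative variant (function names are ours):

```python
import numpy as np

def prox_l1(x, gamma):
    """Proximal operator of gamma * ||x||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - gamma, 0.0)

def prox_l1_nonneg(x, gamma):
    """Prox of gamma * ||x||_1 plus a nonnegativity constraint."""
    return np.maximum(x - gamma, 0.0)

x = np.array([-1.5, 0.2, 3.0])
prox_l1(x, 1.0)          # -> [-0.5, 0.0, 2.0]
prox_l1_nonneg(x, 1.0)   # -> [ 0.0, 0.0, 2.0]
```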

IV. LEARNING GRAPHS WITH LGMRF
Here, we present an overview of a generic formulation of SGL and (single-resolution) TVGL methods based on LGMRF in the literature because our formulation also leverages it. The formulation is also useful to distinguish the proposed method from existing methods.
Several existing approaches can be written using this generic formulation, although they have been proposed independently. We derive some representative graph learning methods from the generic formulation and reveal the relationship between them.

A. GENERAL FORMULATION
Suppose that multivariate time-series data {x_t}_{t=0}^{T−1} are given, where x_t ∈ R^N and T is the duration of the time series. We also assume that x_t is generated by an LGMRF as follows:

p(x_t | w_t) ∝ gdet(L(w_t))^{1/2} exp(−(1/2) x_t^T L(w_t) x_t),   (7)

where L_t = L(w_t) is the graph Laplacian at time t that corresponds to the underlying graph of x_t, and gdet represents the generalized determinant, i.e., the product of the nonzero eigenvalues [38]. The edge weight vector w_t should be sparse and nonnegative. Assume that the prior distribution of w_t is the following E-variate exponential distribution [11]:

p(w_t) ∝ exp(−α ‖w_t‖_1),  w_t ≥ 0.   (9)

The maximum a posteriori (MAP) estimation of w_t given x_t then leads to the following optimization problem:

min_{{w_t ≥ 0}} Σ_{t=0}^{T−1} [ −log gdet(L(w_t)) + x_t^T L(w_t) x_t + α ‖w_t‖_1 ],   (10)

where α controls the sparsity of the graph.
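A numeric sketch of the per-sample criterion in (10), computing the generalized determinant as the product of the nonzero eigenvalues (helper names are ours; constant factors are dropped):

```python
import numpy as np

def gdet(L):
    """Generalized determinant: product of the nonzero eigenvalues."""
    lam = np.linalg.eigvalsh(L)
    return np.prod(lam[lam > 1e-10])

def lgmrf_objective(L, x, w, alpha):
    """Per-sample MAP criterion, up to constant factors:
    -log gdet(L) + x^T L x + alpha * ||w||_1."""
    return -np.log(gdet(L)) + x @ L @ x + alpha * np.sum(np.abs(w))

L_path = np.array([[1.0, -1.0], [-1.0, 1.0]])  # path graph, eigenvalues 0 and 2
obj = lgmrf_objective(L_path, np.array([1.0, 0.0]), np.array([1.0]), 0.1)
```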
The optimization problem in (10) is the general form for the SGL and TVGL problems based on LGMRF. We also utilize (10) for the multiresolution TVGL proposed in this study.
While the problem itself is generic, directly solving it is impractical because it would learn one graph from a single data sample, leading to overfitting. Therefore, we often need to divide the data into multiple segments. By assuming that the data within the same segment share one common graph, (10) reduces to well-known SGL and TVGL problems. In the following subsections, we derive representative SGL and TVGL settings from (10).

B. STATIC GRAPH LEARNING
Suppose that w_t is constant for all time instances t, i.e., the graph is static over time: w_t = w for t = 0, …, T − 1. Then, (10) reduces to the following problem:

min_{w ≥ 0} −T log gdet(L(w)) + Σ_{t=0}^{T−1} x_t^T L(w) x_t + αT ‖w‖_1.   (11)

The quadratic term in (11) represents the smoothness of the signals on the graph in the form of the Laplacian quadratic form x_t^T L(w) x_t. It can be rewritten using the sample covariance matrix S = (1/T) Σ_t x_t x_t^T as

Σ_{t=0}^{T−1} x_t^T L(w) x_t = tr(X^T L(w) X) = T tr(S L(w)),   (12)

where X = [x_0, …, x_{T−1}]. As a result, (11) can be rewritten using (12) as

min_{w ≥ 0} −log gdet(L(w)) + tr(S L(w)) + α ‖w‖_1.   (13)

This is equivalent to the graphical Lasso problem with graph Laplacian constraints [11].
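The identity in (12) can be checked numerically; the following sketch builds a random Laplacian and verifies that the summed quadratic forms equal T tr(SL):

```python
import numpy as np

# Numerically verify sum_t x_t^T L x_t = T * tr(S L) for a random graph.
rng = np.random.default_rng(0)
T, N = 50, 4
X = rng.normal(size=(T, N))                      # rows are samples x_t

A = np.abs(rng.normal(size=(N, N)))
W = (A + A.T) / 2
np.fill_diagonal(W, 0.0)                         # symmetric, no self-loops
L = np.diag(W.sum(axis=1)) - W                   # graph Laplacian

S = X.T @ X / T                                  # sample covariance
quad_sum = sum(x @ L @ x for x in X)
assert np.isclose(quad_sum, T * np.trace(S @ L))
```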

C. TIME-VARYING GRAPH LEARNING
As previously mentioned, learning one graph from one sample from (10) causes overfitting. Instead, we consider a TVGL problem by dividing the time-series data with nonoverlapping time windows in the same manner as [12], [16], [17].
Let X^{(k)} = [x_{kr}, …, x_{(k+1)r−1}] be the kth data chunk, where r is the time window size and k is the index of the time window. We denote by w^{(k)} the edge weight vector corresponding to the underlying graph of X^{(k)}.
Under the assumption that the graph within the same time window is fixed, i.e., w_t = w^{(k)} for t = kr, …, (k + 1)r − 1, TVGL is formulated as follows:

min_{{w^{(k)} ≥ 0}} Σ_k [ −r log gdet(L(w^{(k)})) + Σ_{t=kr}^{(k+1)r−1} x_t^T L(w^{(k)}) x_t + α ‖w^{(k)}‖_1 ] + β Σ_{k≥1} ψ(w^{(k)} − w^{(k−1)}),   (14)

where ψ(·) is an additional regularizer that characterizes the temporal evolution based on prior knowledge of time-varying graphs, and β is its parameter. Note that the problem is identical to the SGL in (11) if ψ(·) = 0. As possible regularizers, ψ(·) = ‖·‖_2^2 reflects a time-varying graph whose edge weights change smoothly over time, whereas ψ(·) = ‖·‖_1 leads to a graph in which only a small number of edges change at any given time. A problem similar to (14) is proposed in [18].
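The two regularizers mentioned above can be sketched as follows (function names are ours):

```python
import numpy as np

def psi_smooth(dw):
    """psi = ||.||_2^2: penalizes large changes, so edge weights drift smoothly."""
    return np.sum(dw**2)

def psi_sparse_change(dw):
    """psi = ||.||_1: favors only a few edges changing between windows."""
    return np.sum(np.abs(dw))

dw = np.array([0.0, 0.5, 0.0])   # change between consecutive windows
psi_smooth(dw)                   # -> 0.25
psi_sparse_change(dw)            # -> 0.5
```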
This approach is effective as long as we have appropriate prior knowledge of the temporal evolution, i.e., ψ(·), and the accurate window size r. However, an inappropriate choice of ψ(·) or r leads to inappropriate graphs. To tackle this problem, we propose TMR TVGL in the next section.

V. MULTIRESOLUTION TIME-VARYING GRAPH LEARNING
In this section, we present the formulation of the TMR TVGL and an algorithm for solving it.

A. FORMULATION
Here, we introduce a TVGL method that learns {w_t}_{t=0}^{T−1} based on a multiresolution assumption. For simplicity, we suppose that T is divisible by 2^L; however, the method is applicable to general values of T.
Suppose that w_t can be represented as a combination of graphs localized at different temporal resolutions w_{l,m}, as illustrated in Fig. 1. We refer to w_{l,m} as the TMR graph at temporal resolution level l and segment index m. The multiresolution representation of {w_t}_{t=0}^{T−1} is then given by the sum of the TMR graphs covering time t:

w_t = Σ_{l=0}^{L} w_{l, q_l(t)},   where q_l(t) = ⌊2^l t / T⌋,   (15)

and L is the maximum temporal resolution level.
This TMR representation has two advantages. First, it reduces the number of parameters to learn: TMR TVGL requires E(2^{L+1} − 1) parameters, whereas the number of parameters in a single-resolution TVGL is ET, and E(2^{L+1} − 1) ≤ ET holds whenever L ≤ log_2 T − 1. This is beneficial when only a limited amount of data is available. Second, the TMR representation enables the capture of edges localized at an arbitrary temporal resolution without specifying the temporal window size. Now, we consider the detailed formulation of the proposed TVGL. The goal is to learn {w_{l,m}} from {x_t}_{t=0}^{T−1}. Substituting (15) into (10) leads to the following problem:

min_{w_{0,0}, …, w_{L,2^L−1} ≥ 0} Σ_{t=0}^{T−1} [ −log gdet(L(w_t)) + x_t^T L(w_t) x_t ] + α Σ_{l=0}^{L} Σ_{m=0}^{2^l−1} ‖w_{l,m}‖_1,  subject to (15).   (16)

Letting F = [w_{0,0}, …, w_{L,2^L−1}] ∈ R_+^{E×(2^{L+1}−1)} be the matrix collecting all TMR edge weight vectors and R = T/2^L be the length of a finest-level segment, (16) can be rewritten as

min_{F ≥ 0} α ‖FM‖_1 + Σ_{k=0}^{2^L−1} [ −R log gdet(L([FM]_k)) + Σ_{t=kR}^{(k+1)R−1} x_t^T L([FM]_k) x_t ],   (18)

where M ∈ R^{(2^{L+1}−1)×2^L} is a binary aggregation matrix whose (i, j) element is one if the jth finest-level segment is contained in the TMR segment indexed by i, and zero otherwise, in which the ith row corresponds to level l = ⌊log_2(i + 1)⌋ and segment index m = mod(i + 1, 2^l). Note that [FM]_k = w_{kR} = ⋯ = w_{kR+R−1}.
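A sketch of one plausible construction of the aggregation matrix M under the dyadic segmentation described above (the function name is ours):

```python
import numpy as np

def aggregation_matrix(L):
    """Binary matrix M with M[i, j] = 1 iff the jth finest-level segment
    lies inside the ith TMR segment. One plausible construction; the row
    indexing follows l = floor(log2(i+1)), m = mod(i+1, 2^l)."""
    n_rows, n_cols = 2**(L + 1) - 1, 2**L
    M = np.zeros((n_rows, n_cols))
    for i in range(n_rows):
        l = int(np.floor(np.log2(i + 1)))   # resolution level of row i
        m = (i + 1) % (2**l)                # segment index at level l
        width = n_cols // 2**l              # finest segments per TMR segment
        M[i, m * width:(m + 1) * width] = 1.0
    return M

M = aggregation_matrix(1)   # levels l = 0, 1 -> 3 rows, 2 columns
# each finest segment is covered exactly once per level (column sums = L + 1)
```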
In (18), we need sparse w_{l,m} to capture the temporally localized structure. However, imposing sparsity directly on FM does not necessarily result in a sparse F. Therefore, we replace the first term in (18) with a sparsity constraint on F as follows:

min_{F ≥ 0} α ‖F‖_1 + Σ_{k=0}^{2^L−1} [ −R log gdet(L([FM]_k)) + Σ_{t=kR}^{(k+1)R−1} x_t^T L([FM]_k) x_t ].   (19)

This is the proposed TVGL formulation for learning TMR graphs. In the following subsection, we describe an algorithm for solving (19).

B. ALGORITHM
The optimization problem in (19) is convex and can be solved using the PDS algorithm. Here, we reformulate (19) into a PDS-applicable form. Let Z_k ∈ R^{N×N} be the pairwise distance matrix computed from the kth data segment, and let z_k ∈ R^E be its vector-form representation. The third term of (19) can then be rewritten as

Σ_{k} Σ_{t=kR}^{(k+1)R−1} x_t^T L([FM]_k) x_t = tr(M^T F^T Z_all),   (21)

where Z_all = [z_0, …, z_{2^L−1}]. Here, we denote F̃ = F^T and L̃_i(X) = L([X^T]_i) for notational simplicity. By using the indicator function, (19) can be reduced to the following optimization problem:

min_{F̃} α ‖F̃‖_1 + ι(F̃) + tr(M^T F̃ Z_all) − Σ_{k=0}^{2^L−1} R log gdet(L̃_k(M^T F̃)),   (22)

where ι is defined by ι(X) = 0 if X ≥ 0 elementwise and ι(X) = +∞ otherwise. Owing to the nonnegativity constraint on F̃, the first three terms in (22) can be merged as

g(F̃) = ι(F̃) + tr(B^T F̃),   (24)

where B = αH + M Z_all^T and H = 11^T ∈ R^{(2^{L+1}−1)×E}. By introducing the linear operator L̃ : R^{2^L×E} → R^{2^L N×N}, which stacks the graph Laplacians of the rows of its input, and a dual variable V := [V_0^T, …, V_{2^L−1}^T]^T = L̃(M^T F̃), we can convert (24) into the form in (3). The proximal operator of g corresponds to that of a weighted ℓ1-norm with a nonnegativity constraint, and is given by

[prox_{γg}(F̃)]_{ij} = max(0, F̃_{ij} − γ B_{ij}),   (27)

where B = αH + M Z_all^T. The proximal operator of h can be computed as follows. In general, the logarithm of a generalized determinant is a nonconvex function. Under the assumption that the learned graph is connected (which is often the case), it can be replaced with a convex function as follows [11, Proposition 1]:

log gdet(L) = log det(L + (1/N) 11^T).

Then, the proximal operator is given by

prox_{γh}(A) = U diag(φ_γ(λ_0), …, φ_γ(λ_{N−1})) U^T,   (29)

where φ_γ(λ_i) = (λ_i + √(λ_i^2 + 4γ))/2, and U and λ_i are the eigenvector matrix and eigenvalues of A + (1/N) 11^T, respectively. The eigenvalues are ordered as λ_0 ≤ λ_1 ≤ ⋯ ≤ λ_{N−1}.^1
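The eigenvalue-based proximal operator in (29) can be sketched as follows, assuming the shift by (1/N)11^T described above (the function name is ours):

```python
import numpy as np

def prox_neg_logdet(A, gamma, N):
    """Prox of the convexified log-determinant term: following the text,
    the eigenvalues of A + (1/N) 1 1^T are mapped through
    phi(lam) = (lam + sqrt(lam**2 + 4*gamma)) / 2. A sketch."""
    B = A + np.ones((N, N)) / N          # shift that handles the null space
    lam, U = np.linalg.eigh(B)           # lam in ascending order
    phi = (lam + np.sqrt(lam**2 + 4.0 * gamma)) / 2.0
    return U @ np.diag(phi) @ U.T

P = prox_neg_logdet(np.eye(2), gamma=1.0, N=2)
# the output is symmetric with strictly positive eigenvalues phi(lam_i)
```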
Finally, we present the algorithm for multiresolution TVGL in Algorithm 1. The condition for convergence is given by

γ_1 γ_2 ‖L̃(M^T ·)‖^2 ≤ 1.   (31)

Based on the submultiplicativity of the operator norm, an upper bound on this norm can be computed from ‖L* L‖ = 2N and ‖MM^T‖ = 2^{L+1} − 1. Consequently, the convergence condition in (31) can be rewritten as

γ_1 γ_2 · 2N(2^{L+1} − 1) ≤ 1.
The computational complexity of our algorithm is O(2 L N 3 ) per iteration.

VI. EXPERIMENTAL RESULTS
In this section, we present experimental results on synthetic and real datasets. The existing and proposed methods are abbreviated as follows:
• SGL based on the smoothness criterion (SGL-S) [21].
• SGL with LGMRF (SGL-LG) [11].
• TVGL based on smoothness with the temporal variation constraint (TVGL-S) [12], [16], [17].
• TVGL with LGMRF incorporating the temporal variation constraint (TVGL-LG) [18].
• The proposed temporal multiresolution TVGL (TVGL-MR).

^1 Even if the original graph has disconnected components, we can avoid the problem in the calculation of the proximal operator by adding a small regularizing parameter c to the input [30]. The proximal operator of this approximation can also be computed in the same manner as (29).

Algorithm 1 Temporal multiresolution graph learning
The stopping criterion of the iterations for each method is set to ‖w^{(n+1)} − w^{(n)}‖ / ‖w^{(n)}‖ < 1.0 × 10^{−3}.

A. EXPERIMENTS ON TEMPORAL MULTIRESOLUTION GRAPHS
To demonstrate the concept of TVGL for TMR graphs, we first present the results by constructing a simple TMR graph dataset.

1) Dataset
The dataset is constructed in two sequential steps: 1) construction of time-varying graphs and 2) generation of data samples based on the time-varying graphs. First, we construct TMR graphs with four levels (l = 0, …, 3), as shown in Fig. 2. The number of vertices is set to N = 81, and the edge weights between vertices are random values drawn from the uniform distribution on the interval [0.1, 3]. The lowest resolution graph, i.e., the graph reflecting the global structure, is W_{0,0}, shown in Fig. 2(a); it has a grid-like structure in which the edges run only vertically, except for horizontal edges at the center of the grid. As shown in Figs. 2(b)-(o), the graphs at levels 1 to 3 have horizontal edges, diagonal edges from the upper right to the lower left, and diagonal edges from the upper left to the lower right, respectively. By combining the W_{l,m}'s, we obtain the prototype graphs W^{(0)}, …, W^{(7)}, shown in Fig. 3.
From the prototype graphs, we then construct the time-varying graphs {W_0, …, W_{T−1}}. We set T = 640 in this experiment. As the number of multiresolution graphs at the highest resolution is eight, each prototype graph is duplicated 80 times and the results are concatenated, i.e., W_t := W^{(⌊t/80⌋)} for t = 0, …, T − 1.
Second, multivariate time-series signals X are generated from the following GMRF:

x_t ∼ N(0, L_t^† + σ^2 I),

where L_t is the graph Laplacian associated with W_t, L_t^† is its pseudoinverse, and σ^2 is the variance of the additive white Gaussian noise. We set σ = 0.5.
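Under one plausible reading of this generation model (covariance L_t^† + σ²I; the function name is ours), sampling can be sketched as:

```python
import numpy as np

def sample_gmrf(L, sigma, rng):
    """Draw x ~ N(0, pinv(L) + sigma^2 I): a signal smooth on the graph
    plus white Gaussian noise. One plausible reading of the model; the
    exact covariance used in the paper may differ."""
    N = L.shape[0]
    cov = np.linalg.pinv(L) + sigma**2 * np.eye(N)
    return rng.multivariate_normal(np.zeros(N), cov)

rng = np.random.default_rng(0)
L_t = np.array([[1.0, -1.0], [-1.0, 1.0]])       # path graph on two nodes
x = sample_gmrf(L_t, sigma=0.5, rng=rng)
```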

2) Experimental Condition
We evaluate the performance in terms of the relative error and F-measure, each averaged over all time slots. The relative error is given by

Relative Error = ‖Ŵ − W*‖_F / ‖W*‖_F,

where Ŵ is the estimated weighted adjacency matrix and W* is the ground truth. It reflects the accuracy of the edge weights of the estimated graph. The F-measure is given by

F-measure = 2 tp / (2 tp + fn + fp),

where the true positive count (tp) is the number of edges included in both Ŵ and W*, the false negative count (fn) is the number of edges not included in Ŵ but included in W*, and the false positive count (fp) is the number of edges included in Ŵ but not in W*. The F-measure, i.e., the harmonic mean of the precision and recall, represents the accuracy of the estimated graph topology. It takes values between 0 and 1; the higher the F-measure, the better the estimation of the graph topology.

FIGURE. Visualization of the ground-truth graphs.
FIGURE 3. Time-varying graphs obtained from the multiresolution graphs in Fig. 2.
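The two metrics can be sketched as follows (function names are ours; edges are detected by thresholding the absolute weights):

```python
import numpy as np

def relative_error(W_hat, W_star):
    """Frobenius-norm relative error of the estimated adjacency matrix."""
    return np.linalg.norm(W_hat - W_star, 'fro') / np.linalg.norm(W_star, 'fro')

def f_measure(W_hat, W_star, tol=1e-8):
    """F-measure = 2*tp / (2*tp + fn + fp), edges detected by thresholding."""
    est = np.abs(W_hat) > tol
    true = np.abs(W_star) > tol
    tp = np.sum(est & true)
    fp = np.sum(est & ~true)
    fn = np.sum(~est & true)
    return 2 * tp / (2 * tp + fn + fp)

W_true = np.array([[0.0, 1.0], [1.0, 0.0]])
f_measure(W_true, W_true)       # -> 1.0 (perfect topology)
relative_error(W_true, W_true)  # -> 0.0 (perfect weights)
```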
In this experiment, we construct training and test data, and evaluate the graph learning performance on the test data using the hyperparameters that minimize the relative error on the training data. We search for the optimal hyperparameters using Bayesian optimization [39]. Additionally, the ℓ1-norm is used for the temporal variation regularization of the existing TVGL approaches.
We evaluate the performance with different window sizes to study the robustness of each method to the choice of the window size K. The existing methods use K = 20, 40, or 80, and the proposed method uses the maximum temporal resolution level L = 5. The proposed method can reconstruct time-varying graphs corresponding to K ∈ {20, 40, 80} from a set of TMR graphs. Note that the existing methods need to fix K before running their algorithms, whereas the proposed TVGL method simultaneously estimates time-varying graphs at the different window sizes.

3) Results
Table 1 summarizes the average performance of the learned graphs. As shown in the table, TVGL-MR outperforms the other methods in nearly all cases in terms of both the F-measure and relative error. This indicates that the TVGL performance can be improved by TVGL-MR when the time-varying graphs can be assumed to have multiresolution characteristics. Fig. 4 visualizes the time-varying graphs learned by TVGL-S, TVGL-LG, and TVGL-MR. As shown in Fig. 4, the alternative TVGL methods fail to capture temporal multiresolution structures, particularly those at the high-resolution levels. In contrast, the proposed method captures edges localized at various temporal resolutions. Furthermore, Fig. 5 shows the TMR graphs learned by TVGL-MR. The figure demonstrates that the proposed method successfully learns the TMR graphs.

B. EXPERIMENTS ON SINGLE RESOLUTION GRAPHS
The previous experiment demonstrates the effectiveness of the proposed method for TMR graphs. Although the proposed method is not specifically designed for learning single-resolution time-varying graphs, here we compare its TVGL performance with that of the other methods on several single-resolution time-varying graphs.

1) Datasets
The datasets are constructed following the same steps as described in Section VI-A. In this experiment, we construct two types of time-varying graphs as follows:
Edge-Markovian Evolving Graph (EMEG): EMEG is a stochastic time-dependency evolving graph [40] in which each edge follows a Markovian process: an edge absent at time t appears at time t + 1 with probability q_1, and an edge present at time t disappears at time t + 1 with probability q_2, where q_1 and q_2 are called the birth rate and death rate, respectively. We generate an Erdős-Rényi graph with N = 36 and p = 0.1 as the initial graph G_0. The edge weights of the initial graph are drawn from the uniform distribution on the interval [0.1, 3], and the weights of newborn edges are drawn from the same distribution. We set q_1 = 0.001 and q_2 = 0.01.
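One step of such an edge-Markovian evolution can be sketched as follows (the function name and the weight-resampling details are our reading of the description above):

```python
import numpy as np

def emeg_step(W, q1, q2, rng, low=0.1, high=3.0):
    """One edge-Markovian step (a sketch): an absent edge is born with
    probability q1 (weight drawn uniformly from [low, high]); an existing
    edge dies with probability q2. Surviving weights are kept."""
    N = W.shape[0]
    W_next = W.copy()
    for i in range(N):
        for j in range(i):
            if W[i, j] == 0.0 and rng.random() < q1:
                w = rng.uniform(low, high)
                W_next[i, j] = W_next[j, i] = w
            elif W[i, j] > 0.0 and rng.random() < q2:
                W_next[i, j] = W_next[j, i] = 0.0
    return W_next

rng = np.random.default_rng(1)
W1 = emeg_step(np.zeros((5, 5)), q1=1.0, q2=0.0, rng=rng)  # all edges born
```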
Switching Behavior Graph (SBG): SBG is a time-varying graph that exhibits transitions between connectivity states. It often appears in brain connectivity dynamics [41], [42]. We construct an SBG using the following procedure. We generate six static graphs used as the connectivity states. Each graph is initialized to an Erdős-Rényi graph with N = 36, an edge connection probability p = 0.05, and edge weights drawn from the uniform distribution on the interval [0.1, 3]. The initial state is selected randomly from the six connectivity states; at each time, the state remains with a 98% probability and transitions to another connectivity state with a 2% probability.
Generating Graph Signals: Given the graph Laplacians L^{(0)}, …, L^{(127)} of the constructed time-varying graphs, we generate multivariate time-series signals x_0, …, x_5119 from the following GMRF:

x_t ∼ N(0, (L^{(⌊t/40⌋)})^† + σ^2 I),

where σ^2 is the variance of the white Gaussian noise. We set σ = 0.5 in this experiment.
Regularization Functions for Alternative TVGL Methods: The alternative TVGL methods, i.e., TVGL-S and TVGL-LG, require choosing the regularization function based on prior knowledge of the temporal graph evolution. For EMEG and SBG, we adopt the ℓ1- and ℓ2,1-norms as the regularization functions, respectively.

2) Results
Table 2 summarizes the performances of the SGL/TVGL methods on the different datasets. The TVGL methods outperform the static methods on all datasets. This implies that regularization of the temporal graph evolution, or the TMR assumption, improves the graph learning performance. Among the TVGL methods, TVGL-MR ranks first or second in this experiment. This suggests the effectiveness and robustness of the proposed method even for single-resolution time-varying graphs. It is also worth noting that TVGL-MR exhibits performance comparable to that of the time-varying methods without prior knowledge of the graph evolution over time, i.e., the regularization function. Typically, existing TVGL approaches require both prior knowledge and hyperparameter(s). In contrast, the only assumption in the proposed method is that the time-varying graphs are characterized by the multiresolution property, which is a natural assumption in signal processing. This implies the flexibility of the proposed method.
Figs. 6 and 7 show the visualization of the temporal variation in the ground-truth graphs and the learned graphs with a window size of 40, respectively. The vertical and horizontal axes of these figures represent the edge and time slot indices of the time-varying graph, and the color represents the intensity of the edge weights. For simple visualization, the first 100 edge indices are visualized.
As can be seen in Fig. 7, SGL-S and SGL-LG lose the temporal relations, whereas TVGL-S, TVGL-LG, and TVGL-MR can capture the original structures more precisely than static methods. Time-varying graphs by TVGL-S, TVGL-LG, and TVGL-MR are similar, but the proposed method tends to yield larger edge weights.

C. LEARNING TEMPORAL MULTIRESOLUTION GRAPHS FROM REAL TEMPERATURE DATA
Finally, we apply TVGL-MR to the real temperature data in Hokkaido, the northernmost island in Japan. The goal of this experiment is to explore the common (time-invariant) and seasonal relationships among geographical regions using the proposed method.
We use the average temperature data 2 measured at 172 recording locations in Hokkaido from March 2014 to February 2015. We perform TVGL-MR with L = 3 (i.e., eight graphs at the highest resolution level). Fig. 8 shows the lowest resolution graph W_{0,0} obtained by TVGL-MR and the graph obtained by SGL-LG from the data of all time slots. Note that both of them can be regarded as static graphs. Focusing on the graph learned by TVGL-MR, the following characteristics are observed: • Vertices close to each other are basically connected, and edges between closer nodes tend to have larger weights. However, if the recording locations are separated by a mountain (brownish area), the corresponding nodes may not be connected even if they are geographically close. • Vertices with similar geographic features are often connected, i.e., vertices along the coast are connected to each other, and a similar characteristic is observed for inland vertices.
The above-mentioned characteristics seem reasonable because relationships based on the distance between nodes or on geographic features are static. In contrast, the graph learned by SGL-LG is denser than that learned by TVGL-MR and includes many edges connecting distant nodes. Such edges may be derived from the seasonal behavior, which is described later. As SGL-LG learns a static graph from all time slots without separating structures localized at different temporal resolutions, the learned graph may include both common and seasonal edges. Fig. 9 shows W_{2,0}, …, W_{2,3} learned by TVGL-MR, which correspond to season-specific graphs. In contrast to the static graph, these seasonal graphs have few edges connecting nodes close to each other. This suggests that the distance-based relationships have a weak effect on the seasonal behavior. Furthermore, the summer- and winter-specific graphs have more edges than the spring- and autumn-specific graphs. This seems intuitive because the seasonal effects in summer and winter are expected to be stronger than those in mild seasons, such as spring and autumn.
Furthermore, the edges connecting distant coastal nodes in the summer- and winter-specific graphs (which are also observed for SGL-LG in Fig. 8(b)) can be attributed to the effects of seasonal sea currents. Fig. 10 shows the sea surface temperature (SST) 3 on August 7, 2014, and January 8, 2015. As can be seen in Figs. 9(b), 9(d), and 10, the vertices connected along the coasts in the summer- and winter-specific graphs reflect the SST behaviors of the two seasons.

VII. CONCLUSION
We proposed a temporal multiresolution graph learning method for multivariate time-series data. The proposed method is designed based on a signal generation model in accordance with an LGMRF, and enables the capture of time-varying structures having a multiresolution property within a single framework. The TVGL is formulated as a convex optimization problem and can be solved efficiently using a primal-dual splitting algorithm. Experiments on synthetic and real datasets demonstrated that the proposed method outperforms existing static and time-varying graph learning methods.

3 The daily SST data were provided by the Japan Meteorological Agency, from their website at https://www.jma.go.jp/jma/index.html.