From Time Series to Networks in R with the ts2net Package

Network science has established itself as a prominent tool for modeling time series and complex systems. This modeling process consists of transforming a single time series or a set of time series into a network. Nodes may represent complete time series, segments, or single values, while links define associations or similarities between the represented parts. R is one of the main programming languages used in data science, statistics, and machine learning, with many packages available. However, no single package provides the necessary methods to transform time series into networks. This paper presents ts2net, an R package for modeling one or multiple time series as networks. The package provides time series distance functions that can be computed in parallel and on supercomputers to process larger data sets, as well as methods to transform distance matrices into networks. It also provides methods to transform a single time series into a network, such as recurrence networks, visibility graphs, and transition networks. Together with other packages, ts2net permits using network science and graph mining tools to extract information from time series.

Complex networks are one of the most prominent tools for modeling complex systems [1]. The network topology permits the analysis not just of the small parts (nodes) of the system but also of their relationships (links) and the complex phenomena that can emerge from these links. Furthermore, this model allows network science and graph mining tools to extract information from complex systems [2]. Complex systems are commonly represented by a set of time series with interdependencies or similarities. This set can be modeled as a network where nodes represent time series and links connect pairs of associated ones. Associations are usually measured by time series distance functions, carefully chosen by the user to capture the desired similarities. Thus, the network construction process consists of two steps. First, the distance between all pairs of time series is calculated and stored in a distance matrix. Then, this distance matrix is transformed into an adjacency matrix using strategies such as k-nearest neighbors, ε-neighborhood, or complete weighted graph.
R is one of the most used programming languages in statistics, data science, machine learning, complex networks, and complex systems. Different packages provide distance functions and network analysis, but no single package provides the necessary tools to model time series as networks. This abstract presents ts2net, a package to construct networks from time series in R. The package provides tools for measuring linear, non-linear, and event-based associations between time series. Distance calculations run in parallel using multi-core programming to speed up the modeling process. ts2net also provides methods to run the calculations on supercomputers and computer clusters using multiple jobs. The package is open source and released under the MIT license. The source code is available on GitHub at github.com/lnferreira/ts2net, and the package can be installed from the Comprehensive R Archive Network (CRAN) by running install.packages("ts2net") in R.
The first step in the network construction process consists of calculating a distance matrix D that stores the distances between all pairs of time series. The ts_dist function calculates the distances for all pairs and returns the distance matrix D. It runs either serially or in parallel on multiple cores. The ts_dist_part function calculates the distances for only part of the time series set, which is particularly useful when the work is split into parallel jobs on supercomputers or computer clusters (HPC). The ts_dist_part_file function works similarly to ts_dist_part, but it reads the time series from serialized objects (RDS files) in a directory. As a result, ts_dist_part_file requires much less memory and should be preferred when memory consumption is a concern, e.g., for huge data sets or data sets with very long time series.
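The idea of splitting the pairwise distance calculation into independent parts can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names (dist_abs_cor, all_pairs, dist_part, assemble_dist) and the choice of 1 - |r| as the distance are hypothetical and are not part of ts2net.

```r
# Hypothetical helpers for illustration only; none of these are ts2net functions.

# Example distance between two equal-length series: 1 - |Pearson correlation|.
dist_abs_cor <- function(x, y) 1 - abs(cor(x, y))

# All index pairs (i < j) for n series, one pair per row.
all_pairs <- function(n) t(combn(n, 2))

# Distances for a subset of pairs. Each call is independent of the others,
# so every subset could be submitted as a separate job on a cluster.
dist_part <- function(ts_list, pairs, dist_func) {
  d <- apply(pairs, 1, function(p) dist_func(ts_list[[p[1]]], ts_list[[p[2]]]))
  cbind(i = pairs[, 1], j = pairs[, 2], dist = d)
}

# Combine the partial results into a symmetric n x n distance matrix.
assemble_dist <- function(parts, n) {
  D <- matrix(0, n, n)
  for (part in parts) {
    D[part[, c("i", "j"), drop = FALSE]] <- part[, "dist"]
    D[part[, c("j", "i"), drop = FALSE]] <- part[, "dist"]
  }
  D
}

# Toy data: six random walks of length 100.
set.seed(1)
ts_list <- replicate(6, cumsum(rnorm(100)), simplify = FALSE)
n <- length(ts_list)

pairs <- all_pairs(n)  # 15 pairs in total
# Split the pair indices into 3 chunks, mimicking 3 independent jobs.
chunks <- split(seq_len(nrow(pairs)), rep(1:3, length.out = nrow(pairs)))

# Serial here; on Unix-like systems lapply could be swapped for
# parallel::mclapply to use several cores.
parts <- lapply(chunks, function(idx) {
  dist_part(ts_list, pairs[idx, , drop = FALSE], dist_abs_cor)
})

D <- assemble_dist(parts, n)
```

Because each chunk depends only on its own pairs, the chunks can be computed in any order and on different machines, and only the small partial results need to be gathered at the end.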
A distance function is required to construct the distance matrix D. The ts2net package provides the most used time series comparison functions: Pearson correlation (tsdist_cor), cross-correlation (tsdist_ccf), dynamic time warping, DTW (tsdist_dtw), normalized mutual information (tsdist_nmi), variation of information (tsdist_voi), maximal information coefficient (tsdist_mic), events synchronization (tsdist_es), and van Rossum distance (tsdist_vr). Some distances provide statistical tests that can be considered during the network construction process to avoid spurious links. Other distance functions can be easily implemented or adapted to be used by the package.
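To illustrate how a statistical test can guard against spurious links, the following base R sketch defines a correlation-based distance that assigns the maximum distance to pairs whose correlation is not significant. The function name dist_cor_sig, the alpha parameter, and the 1 - |r| mapping are assumptions made for this example and do not reproduce the signatures of the package's own functions. Note also that the standard Pearson test assumes independent observations, so it may be too permissive for strongly autocorrelated series.

```r
# Hypothetical distance function for illustration; not part of ts2net.
# Returns 1 - |r| when the Pearson correlation is significant at level alpha,
# and the maximum distance 1 otherwise.
dist_cor_sig <- function(x, y, alpha = 0.05) {
  test <- cor.test(x, y, method = "pearson")
  if (test$p.value >= alpha) {
    return(1)
  }
  1 - abs(unname(test$estimate))
}

set.seed(42)
x <- sin(seq(0, 4 * pi, length.out = 200)) + rnorm(200, sd = 0.1)
y <- x + rnorm(200, sd = 0.1)  # noisy copy of x, strongly related
z <- rnorm(200)                # independent noise

d_xy <- dist_cor_sig(x, y)  # small distance: a link is plausible
d_xz <- dist_cor_sig(x, z)  # large distance: a link is unlikely
```

A function of this shape, taking two series and returning a single number, is the kind of building block that a pairwise distance routine can apply to every pair of series.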
The second step in the network construction process is the transformation of D into an adjacency matrix A. The ts2net package provides the following network construction methods: k-NN (net_knn), ε-NN (net_enn), and complete weighted networks (net_weighted) [2]. Fig. 1 illustrates the whole network construction process using a set of temperature time series from 27 cities in the US.
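The k-NN and ε-NN constructions can be sketched in base R as follows. This is a minimal illustration under stated assumptions: the helper names adj_knn and adj_enn are hypothetical, the networks are taken to be undirected and unweighted, and the code is not the package's implementation of net_knn or net_enn.

```r
# Hypothetical helpers for illustration; not the ts2net implementations.

# k-NN network: link each node to its k nearest neighbors, then symmetrize
# so that a link exists if either endpoint selected the other.
adj_knn <- function(D, k) {
  n <- nrow(D)
  A <- matrix(0L, n, n)
  for (i in seq_len(n)) {
    d <- D[i, ]
    d[i] <- Inf                  # a node is never its own neighbor
    nn <- order(d)[seq_len(k)]   # indices of the k smallest distances
    A[i, nn] <- 1L
  }
  pmax(A, t(A))
}

# eps-NN network: link every pair whose distance is at most eps.
adj_enn <- function(D, eps) {
  A <- (D <= eps) * 1L
  diag(A) <- 0L                  # no self-loops
  A
}

# Toy symmetric distance matrix for four time series.
D <- matrix(c(0, 1, 4, 9,
              1, 0, 2, 8,
              4, 2, 0, 3,
              9, 8, 3, 0), nrow = 4, byrow = TRUE)

A_knn <- adj_knn(D, k = 1)    # links: 1-2, 2-3, 3-4
A_enn <- adj_enn(D, eps = 2)  # links: 1-2, 2-3; node 4 is isolated
```

The toy example shows the main design difference: k-NN guarantees every node at least k links, whereas ε-NN can leave distant nodes isolated but never links pairs that are farther apart than the threshold.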
In summary, the ts2net package makes complex system modeling with networks much simpler. Together with other R packages, ts2net permits the analysis and information extraction using network science and graph mining tools.