Who talks to whom: an evaluation of a call log visualization

Adding temporal information to social network visualizations is still a challenging task despite previous research efforts. Visualizing call logs on an event-based level can show various attributes of a connection. The dimension time is of great interest to analysts as it offers insights into trends and patterns such as changing relationships between different actors or economic opportunities for businesses. Yet current approaches suffer from limitations that can be improved with the visualization design presented in this work. Our presented visualization was developed considering aesthetic criteria and characteristics of adjacency matrices and node-link diagrams. A heuristic evaluation according to these criteria was conducted. In a formative evaluation process, an artificial dataset was specifically created to examine dynamic social networks. A qualitative user study with observation and think-aloud protocols was conducted and analyzed with regard to the user’s strategies, limitations of the visualization and potential additional features. The visualization appears to be suitable for all of the evaluated network tasks; however, path-related tasks were more challenging than other tasks.

traces can be used to depict the social network among the individuals. According to Ghoniem et al. (2004), an intuitive approach to visualize relations is to use links between the actors to show who is connected to whom (e.g., a node-link diagram where nodes represent the actors). Researchers have also come up with other ways to visualize social networks and provide additional information within the visualizations. Ahn et al. (2011) discuss the aspect of dynamics in social networks, as communities are rarely static. Due to a variety of reasons such as cultural, environmental, economic or political trends or individuals changing their roles, social status or interests, relations change over time. Vehlow et al. (2015) note that this inherently influences the structure of the network.
Therefore, it is useful to add time as a dimension to the representation of the social network. Considering the evolution of communities can help to understand the past with its influencing circumstances and to use this knowledge for predicting the future. Analyzing social networks this way can, according to Henry and Fekete (2007), enable specialists to discover various trends, patterns, economic opportunities or even terrorist networks.
There has been a considerable amount of research in visualizing static networks, which can be used as a simplification of evolving networks by choosing a single point in time or by aggregating longer spans of time (Beck et al. 2014). In comparison, there has been little research on visualization approaches for dynamic social networks, especially empirical evaluations addressing the question whether users are able to make sense of such visualizations and derive insights from them. These visualizations still suffer from several limitations that impair their readability such as edge crossings, clustering, visual clutter or the high cognitive load required to read them.
Representing dynamic network data presents specific challenges. Simple social networks can be easily presented as node-link diagrams, but it is difficult to introduce time as a feature in such diagrams. A possible solution for this problem could be using more than one link between nodes to show the temporal development of relationships between two nodes. The problem here is that this does not scale very well. Only a few points in time can be shown. Another solution could be to use small multiples. This leads to a high cognitive load because users have to keep in mind all the small multiples to be able to identify patterns. Therefore, specific visualizations have to be developed to show temporal changes in social networks. This topic is highly relevant. There are many application areas investigating temporal developments of social networks, especially social networks like Facebook, WhatsApp, LinkedIn or Twitter. We chose another important application area as an example: telephone connection data. This area is especially interesting for companies wanting to analyze the communication patterns of their employees to reduce communication costs and make communication more efficient.
The aim of this work is to propose a novel visualization that deals with current limitations of social networks, especially the investigation of single connections over time. As our visualization approach focuses on certain characteristics it serves specific use cases and comes with its own challenges and limitations. These have been investigated in a qualitative user study leading to insights into the strategies participants used to solve network tasks as well as design implications for this type of visualization. For this investigation an artificial dataset of phone calls was used. The visualization was created with Tableau, which is a graphical software for creating interactive visualizations.
We conducted a user study with six participants performing a set of 17 tasks and a follow-up interview. We analyzed data from observations, think-aloud protocols and participants' ratings of confidence and difficulty for each task. The analysis emphasized three aspects: the suitability for graph visualization tasks, the usability of the visualization and possible design improvements.
We also conducted a heuristic evaluation indicating that the proposed visualization is appropriate for some types of tasks, but not for others. In the context of the heuristic evaluation, we also compared the proposed system to other solutions to identify specific strengths and weaknesses of our visualization.
We describe related work, our novel visualization approach and the qualitative user study. The results of this user study showed that the visualization appears to be suitable for all network tasks; however, tasks that involve finding paths were more challenging. Implications for design were derived.
Our contributions to the field are the following:
- Development of a prototype of a visualization for the representation of temporal developments in social networks. Our use case is phone calls.
- User study with six participants. This study indicates that the proposed visualization is appropriate for most of the investigated tasks.
- Heuristic evaluation. This study indicates that the proposed visualization has some advantages over more traditional forms of visualization (e.g., reduction of visual clutter, preservation of mental map, low cognitive load).
- Set of design implications based on the evaluation.
This paper is based on Riegler et al. (2019) which is a shorter version of our work. The previous paper described the user study and the design recommendations derived from this study rather briefly. In this paper, we will also discuss a heuristic evaluation that identifies possible advantages and disadvantages compared to other possible solutions for the problem of the representation of developments in social networks. In addition, we also describe the visualization in more detail and provide a more extensive overview of related work.
2 Related work

Visualization techniques
The visualization of dynamic networks has become an active research discipline with increasing relevance for various communities. Different concepts of layout, visualization and interaction techniques can be found in the literature as analyzed by von Landesberger et al. (2011) and surveyed by Shaobo and Lingda (2017). Several ways to map the time dimension to the visualization of dynamic graphs have been explored. The two most common are to either map time to time, which results in an animation or to map time to space which creates a timeline within the visualization (Beck et al. 2017). Most commonly, a series of diagrams gets animated. To follow and understand changes over time, mental representations need to be preserved, which can become critical in animation as limitations of memory and cognitive overload can become a problem (Archambault et al. 2014;Beck et al. 2014). The cognitive load theory describes that humans can improve thinking and reasoning by using external structures to reduce the need to keep information in memory (Sweller et al. 2011). In that sense, showing diagrams next to each other as small multiples is the better alternative as each slice in time can be observed at once.
Evaluations of graph representations shed light on the suitability for specific tasks and scalability for appropriate dataset sizes. The evaluation of Ghoniem et al. (2004) compares a node-link diagram to a matrix-based representation, which results in both being able to show advantages depending on graph size and density. Simple time to space mappings, such as time to color, saturation, glyphs, etc., are rarely applied independently and rather combined with other techniques (Beck et al. 2014). A challenge in representing dynamic graphs is to provide overview and detail. Burch et al. (2015) present edge-stacked timelines using color to show individual weighted connections of a graph. They use two linked representations side by side, one showing the graphs of each point in time in a minimized form on a narrow timeline. Each graph can then be selected to be shown in more detail for further investigation on the timeline in the second representation. Only through interaction one can follow edge changes on the overview timeline and obtain details about which nodes are connected in the second visualization.
To avoid timeslicing on continuous datasets, Simonetto et al. (2018) developed a dynamic graph drawing algorithm to be able to show dynamic graphs in a space-time cube. They argue that despite being intuitive it is often unclear how many timeslices to select to get a reliable picture of the graph structure while not slowing down the visualization by too many timeslices.
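The effect of the slice width can be made concrete with a small sketch. The following Python fragment is an illustration under our own assumptions and not the algorithm of Simonetto et al. (2018): it takes timestamped edge events (the tuple layout and the function name `timeslice` are hypothetical) and groups them into slices of a chosen width. A large width approaches a single aggregated static graph, while a small width yields many sparse slices.

```python
from collections import defaultdict

def timeslice(events, width):
    """Group (time, source, target) events into slices of the given width.

    Returns a dict mapping slice index -> set of undirected edges.
    """
    slices = defaultdict(set)
    for time, source, target in events:
        index = int(time // width)
        # frozenset makes the edge undirected: (a, b) equals (b, a)
        slices[index].add(frozenset((source, target)))
    return dict(slices)

# Hypothetical events: (time in hours, actor, actor)
events = [(1.0, "Anna", "Emily"), (5.5, "Emily", "Anna"),
          (26.0, "Emily", "Olivia"), (30.0, "Daniel", "Emily")]

daily = timeslice(events, width=24)       # one slice per day
aggregated = timeslice(events, width=48)  # everything in one static graph
```

With the daily width, the two calls between Anna and Emily collapse into one edge of the first slice, so the information that they talked twice is lost unless edge weights are kept as well.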

Evaluations in dynamic graph visualization
A considerable number of evaluation studies can be found in the field of information visualization. Beck et al. (2017) categorized visualization papers on the basis of the evaluation approach that was used. They describe seven types of evaluations: case study, user study, algorithmic, expert, survey, theoretical and none (for papers not mentioning an evaluation). They further state that deciding whether a visualization technique is helpful or not in the end always requires involving users. User studies, for example, can be used to compare visualization approaches or test visualization parameters under controlled conditions. Often, specific visualizations have been compared in that way to demonstrate the superiority of one visualization over the other. The user studies of Farrugia and Quigley (2011), e.g., report response times with static representations being significantly faster for most tasks compared to the animation of two dynamic graph series.
Response times alone cannot provide a conclusive picture of the qualities of a visualization. Qualitative methods can elicit user preferences and thoughts and, thus, provide insights about how humans make sense of visualized data. Lam et al. (2012) introduce seven evaluation scenarios divided into process and visualization categories. The first category allows analysts to understand the process and the roles played by visualizations, while the second category focuses on analyzing the underlying visualization itself, e.g., by testing its usability and making design decisions.
The role of aesthetic criteria in perceived usability has been the subject of several evaluation studies. Beck et al. (2009) provide a framework for a standardized qualitative assessment to identify strengths and weaknesses of visualizations. They evaluated three dynamic graph visualizations in this way, while pointing out that this kind of assessment is subjective to some extent. Burch (2017) sets aesthetic criteria as targets for the design of the Dynamic Graph Wall, a dynamic graph visualization using multiple visual metaphors. The author considers node-link diagrams, adjacency matrices and adjacency lists to automatically render the appropriate use case based visualization in a separate view as graph properties change over time.
Research shows that the interaction design is a critical factor for the success of scalable graph visualizations (Pienta et al. 2015). One approach to analyze the dynamics in a graph is to show differences between specific points in time. NetEvViz allows users to select two time points on a timeline to show the differences in a node-link representation (Khurana et al. 2011). Another approach is to use multiple parallel links between nodes (Beck et al. 2014). Node-link diagrams with multivariate edges were presented as multiple threads by Ko et al. (2014), who used colored parallel lines. However, colors can be difficult to distinguish and the approach does not scale well.
NetVisia, an approach to show evolution in social networks (Gove et al. 2011), combines heat maps and matrices to support the exploration of temporal changes in networks. It was evaluated with regard to potential usability problems as well as new data insights. Think-aloud protocols of the users were analyzed to see which kind of insights could be gained through the visualization. Results show that novice users could gain insights within a few minutes despite observed usability problems.

Dynamic network domains
Dynamic networks are relevant for various fields. Social networks are only one specific application example. Other applications for dynamic network visualization can be found in the domain of security data (Zhao et al. 2014) or ranking data (Lei et al. 2016). Stoiber et al. (2019) present netflower, a visual analytics tool for open exploration by data journalists. Netflower uses a Sankey diagram, a flow diagram in which the width of the arrows is proportional to the flow rate, as the main visualization to represent network change over time. The tool extends the main visualization by bar chart sparklines on both sides to provide temporal overviews of data attributes.
Visual analytics tools can also provide advantages in crime investigations where multiple heterogeneous data sources need to be combined as well. Seidler et al. (2016a) introduced a combined node-link and matrix visualization system to visualize dynamics of criminal behavior. They evaluated the technique with domain experts using think-aloud protocols with observation. They showed that using both, a node-link and a matrix visualization, strengthens the identification of harmful developments. Moreover, Gove et al. (2011) designed a novel social network system which includes heat maps and matrices to support users exploring temporal changes in networks. Hlawatsch et al. (2014) introduced a visualization for dynamic, weighted graphs based on adjacency lists which is timeline based and juxtaposed. They represent nodes and corresponding links separately on two horizontal axes and use colors to further distinguish the nodes. A quantitative user study confirms the suitability for dynamic weighted graphs, especially for the analysis of link distribution in graphs with asymmetric characteristics. Based on their results, aesthetic criteria, such as compactness and visual clutter are further discussed.
To communicate results and insights gained from data exploration another aspect of visualization comes into play: being able to tell a story. Benjamin et al. (2016) showed that graph comics are a good storytelling medium as complex temporal changes in networks can be conveyed with minimal textual annotations and no training of the audience. The structure of a network can provide the analyst with information about the different roles of the actors. As Liu et al. (2017) analyze connected nodes, they distinguish the nodes of an intermediary from those of a professor. An intermediary may know many people who are not interconnected, as they often do not know each other, whereas professors know many students who may have many links that tie them to each other (as they probably attend the same classes and are in the same social circles). As soon as these roles or professions change, the social network changes as well, which might lead to new trends and patterns.

Inspiration for our visualization design
There are two approaches that were a significant inspiration for the design of the proposed visualization of this work. In the approach of Burch et al. (2015), stacked-edges show individual connections of a graph. The main difference to our design is that we do not use color to show weighted graphs. Another inspiring approach was developed by Kriglstein et al. (2016) who examined visualizations for group meetings. In their representations, letters denote locations, numbers denote hourly intervals and colors denote different individuals. Especially, their augmented matrix was an inspiration for this work. The idea to connect persons in a matrix structure via links inspired the approach of this work in that the person's row can be scanned to see who the person connects with. In their empirical evaluation, Kriglstein et al. could show that matrix representations were superior to other forms of visualizations.

Methodology
The visualization was created using Tableau having the aesthetic criteria from Beck et al. (2009) in mind, see Sect. 3.1. Tableau is a commercial visual analysis desktop application. We used the free student version of Tableau Desktop, version 9.3.1. For the illustrations in this work, an artificial dataset of phone calls (see Sect. 3.2) was created. These components will be described in more detail in the following subsections.

Aesthetic criteria
An aesthetically pleasing design can significantly influence the usability, readability and ultimately the usage and success of a visualization or of any product for that matter. The aesthetic criteria discussed in the literature often also include usability considerations. Edge crossing is one of the most often mentioned of these criteria. Avoidance of edge crossing does not only result in a more pleasing appearance, but also in an increased readability. Beck et al. (2009) addressed aesthetic dimensions of dynamic graph visualizations in an effort to help designers come up with applicable new designs in practice as well as to compare and evaluate them. They point out that modified design criteria are necessary for dynamic graphs compared to static graphs. In their work, they attempted to translate vague aesthetic aspects into specific criteria that are directly applicable to arbitrary dynamic graph visualizations. They consider a graph readable or aesthetic when the user is both able to access detail information and to uncover general regularities and anomalies of the graph structure. Detail information may include the point in time and weights of edges, the specific nodes that are connected by an edge and paths. General regularities and anomalies could be clusters of vertices, outliers, trends, symmetries or patterns (Beck et al. 2009). They organized these aesthetic criteria into the categories "general criteria," "dynamic criteria" and "scalability criteria." As these criteria (Beck et al. 2009) were specifically tailored to dynamic graph visualizations and could therefore directly be applied, we chose to consider these for designing and evaluating the visualization.

General aesthetic criteria
The first category deals with general aesthetic criteria.
GAC1: Reduce visual clutter. Too many elements, or elements that are disorganized, may overwhelm users so that they lose overview and orientation, which makes it hard for them to read the visualization. This is called visual clutter.
GAC2: Reduce spatial aliases. Spatial aliases occur when visual elements are mistaken for one another, especially because of their placement (e.g., similar elements that are placed too close to each other).
GAC3: Spatial matching of multiple representatives. Multiple visual representatives of the same object that are spread apart should be matched to one another in order to avoid confusion or visual clutter.

GAC4: Maximize compactness. Space and time should be used efficiently for displaying the graph information (e.g., node-link diagrams need a lot of space; therefore, large and dense node-link diagrams often do not maximize compactness).

Dynamic aesthetic criteria
The extra dimension of time adds not only extra information but also extra aesthetic criteria as the user should be able to follow the temporal development of the graph information.
DAC1: Preserving the mental map. The mental map should be kept as stable as possible (Archambault et al. 2014). This can help users to navigate in a graph, compare it to other graphs, for example, to other points in time, and to generally keep an overview in their mind.
DAC2: Reducing the cognitive load. The cognitive load (Sweller et al. 2011) is the amount of information the user has to keep in his or her working memory while reading the visualization. In animations, for example, the user only sees one image at a time and has to remember what happened before, but also while reading, e.g., juxtaposed graphs the user needs to remember some information to compare them in order to observe changes.
DAC3: Minimal temporal aliases. Temporal aliases are similar to spatial aliases in that they occur when visual elements are mistaken one for the other because of their placement in time/on a time axis.

Aesthetic scalability criteria
Scalability in the context of visualizations means that while the dataset may be growing it should still remain readable. The dataset can grow in three dimensions which are vertices, edges and number of points in time, i.e., subsequent graphs. The following three criteria are equivalent in that the readability of the graph is preserved for an increasing number of vertices/edges/graphs.
- SC1: Scalability in number of vertices
- SC2: Scalability in number of edges
- SC3: Scalability in number of graphs

Aesthetic criteria in practice
It needs to be noted that in practice some of these criteria are indirectly in conflict with one another. For example, if one tried to maximize compactness by simply shrinking parts of the visualization, this would most likely result either in increased visual clutter or in an increased cognitive load, because certain details would need to be left out or found elsewhere to avoid visual clutter. Therefore, although it would be ideal to fulfill all of them, the criteria that are most important for the particular application should be prioritized.

Dataset
For creating the visualization, we used an artificial social network with 98 phone calls between 26 individuals during a period of 6 consecutive days. The dataset contains private calls only, i.e., undirected connections between two persons. Each call lasts for a period of time, which is mapped as the intensity per connection, i.e., the call duration. The network connections are undirected, i.e., the dataset does not indicate who starts the conversation, but rather focuses on the duration of a call.
The main purpose of visualizing social networks is to be able to make sense of the social network that is represented within the data. For this reason, the dataset was created with special attention to its content in order to support storytelling, i.e., the data suggest the interpretation of circles of friends or families so that stories can unfold during the analysis.
The names of the 26 individuals each start with a letter from the alphabet to avoid confusing names and simplify reading the visualization. It is a common practice in social network analysis to prepare the dataset in a first step to assign distinct actor names, for example, aliases in crime analysis or a composition of surname and institution for scientific co-authorship network analysis. The dataset structure containing two actors per call, a numeric value representing different days and an intensity value is shown in Table 1. Each column represents a phone call and has an intensity assigned to reflect the length of the call in minutes.
In the user study, a different dataset was used. The underlying structure is the same, but the original dataset used for creating the visualization was cut down to 20 actors and 56 phone calls within 7 days. The number of persons and, thus, the number of calls was reduced in order to keep the testing sessions within an appropriate time frame. The network was still complex enough for realistic tasks to meet the goals of the study. The calls are evenly distributed over the days, with 7-9 calls per day. The average duration of the phone calls is 17.13 minutes, with a minimum of 1 minute and a maximum of 44 minutes. Both datasets are provided in the "Appendix".
It is usually necessary to use an artificial dataset to cover all possible tasks for a user study. Real datasets are often messy and do not allow systematic testing.
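The record structure described above (two actors, a day and an intensity per call) and the summary figures reported for the dataset can be sketched as follows. This is a minimal illustration only: the class name `Call`, the field names and the three sample calls are hypothetical and are not taken from the study datasets.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean

@dataclass(frozen=True)
class Call:
    actor_a: str      # the two actors are interchangeable (undirected call)
    actor_b: str
    day: int          # numeric day index
    intensity: int    # call duration in minutes

# Hypothetical sample records, not the study data
calls = [
    Call("Emily", "Anna", 1, 12),
    Call("Emily", "Olivia", 1, 30),
    Call("Daniel", "Emily", 2, 3),
]

def summarize(calls):
    """Return calls per day plus mean, minimum and maximum duration."""
    per_day = Counter(call.day for call in calls)
    durations = [call.intensity for call in calls]
    return {
        "calls_per_day": dict(per_day),
        "mean_minutes": mean(durations),
        "min_minutes": min(durations),
        "max_minutes": max(durations),
    }

summary = summarize(calls)
```

A summary of this kind could be used to check that an artificial dataset has the intended distribution of calls per day before it is shown to participants.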

Motivation
The newly developed visualization for dynamic social networks uses a timeline-based approach, featuring characteristics of node-link diagrams and matrix-based representations in an attempt to leverage their respective advantages, i.e., showing individual links between actors and enabling an analysis in a structured way. The visual representation of the phone call network used in the user study depicts each conversation on a daily basis, i.e., it shows precisely who talked to whom on which day and for how long (see Fig. 1).

Visualization structure and visual variables
In this representation, the different points in time (in our case: days) are shown next to each other, i.e., they are juxtaposed. The phone calls are shown on the x-axis and consecutively numbered for identification (ID 1-98, see Fig. 2). The x-axis is labeled on the top with this ID as well as the day of the call, and on the bottom with intensities representing the duration of the call. Calls are assigned intensities, which are given below each link. The calls are sorted by intensity on a daily basis, i.e., the most intense (longest) calls are shown first in the segment for one specific day.
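The ordering rule for the x-axis can be expressed compactly. The sketch below is an assumption about how such an ordering could be computed and does not reproduce the Tableau configuration used for the prototype; the tuple layout and the function name `assign_ids` are hypothetical. Calls are ordered by day, sorted by descending intensity within each day and then numbered consecutively.

```python
def assign_ids(calls):
    """Order calls by day, longest call first within a day, and number them.

    `calls` is a list of (actor_a, actor_b, day, intensity) tuples.
    Returns a list of (call_id, call) pairs with IDs starting at 1.
    """
    ordered = sorted(calls, key=lambda call: (call[2], -call[3]))
    return list(enumerate(ordered, start=1))

# Hypothetical sample calls
calls = [
    ("Emily", "Anna", 1, 12),
    ("Daniel", "Emily", 2, 3),
    ("Emily", "Olivia", 1, 30),
    ("Anna", "Olivia", 2, 8),
]

for call_id, (a, b, day, minutes) in assign_ids(calls):
    print(f"ID {call_id}: day {day}, {a} - {b}, {minutes} min")
```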
The y-axis lists all actor names in alphabetical order. Vertical blue lines symbolize a connection between two persons, who are indicated by the connected dots on the horizontal lines leading to their names on the y-axis. Additionally, the names of the pair are shown along the vertical connection line to facilitate identifying the connected individuals. The length of the blue lines has no meaning; a line only connects two persons. To get an overview of the connections per person, the horizontal auxiliary lines, which are kept in a light gray tone, can be followed. When a user wants to know how many phone calls, e.g., Emily had on day 1, they can follow the horizontal gray line and notice that there are three blue dots that represent three calls (to Anna, Olivia and Daniel). The visualization enables users, for example, to see whether a person has frequent phone calls with another (see, e.g., the phone calls between Emily and Anna) and whether the intensity (duration) of these phone calls is similar or not. Users can refer to a specific connection via the presented IDs, which can be helpful when several analysts talk about their findings. The colors are taken from the default color palette of the visualization software described in the following section. The visualization is in its early stages and interactivity is limited. Users can interact via zooming, scrolling and displaying additional information when hovering over connections. Hovering over a link shows the label of the actors, i.e., the names of the participating parties. The analysis might be more effective through the use of more advanced interaction opportunities and features such as filtering. Being able to show communications per person might, for example, be of interest. At the current stage, two views of this filter were explored and designed but not fully implemented. So far, the visualization also does not use color coding, the thickness of lines or position encoding.
It is, of course, possible to use such features to show additional variables. We kept the visualization simple for testing purposes to be able to validate the core functionality. In future work, we intend to develop a more complex visualization using all or most of the features discussed above.
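Reading a person's row, as in the Emily example above, corresponds to a simple lookup. The following sketch illustrates this lookup together with a per-person filter of the kind mentioned as a possible feature. It is not part of the prototype; the function names and the sample data are hypothetical.

```python
def actor_rows(calls):
    """Map each actor name to a row index in alphabetical order (y-axis)."""
    names = sorted({name for a, b, _, _ in calls for name in (a, b)})
    return {name: row for row, name in enumerate(names)}

def calls_of(calls, person, day=None):
    """Return (partner, day, minutes) for every call involving `person`.

    If `day` is given, only calls on that day are returned.
    """
    result = []
    for a, b, call_day, minutes in calls:
        if person not in (a, b):
            continue
        if day is not None and call_day != day:
            continue
        partner = b if a == person else a
        result.append((partner, call_day, minutes))
    return result

# Hypothetical sample calls: (actor_a, actor_b, day, minutes)
calls = [
    ("Emily", "Anna", 1, 12),
    ("Olivia", "Emily", 1, 30),
    ("Daniel", "Emily", 1, 3),
    ("Anna", "Olivia", 2, 8),
]

rows = actor_rows(calls)
emily_day1 = calls_of(calls, "Emily", day=1)
```

Here `emily_day1` contains three entries, which mirrors counting three dots along Emily's row in the segment for day 1.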

Technical implementation
The interactive visualization was created with Tableau, using an artificially constructed phone call dataset consisting of 26 individuals and six points in time, see Sect. 3.2. Figure 2 shows the stimuli used in the user study. Tableau includes a graphical system that allows users to explore and analyze their own datasets in a simple, quick and uncomplicated way, as described by Wesley et al. (2011). It is based on Polaris (Stolte et al. 2002) and has become a powerful addition to the technology stack of many organizations. Tableau enables users to create interactive visualizations using data from either an online data source or an offline copy of the data. The visualizations can either be saved locally offline or can be collaboratively shared on their server (Wesley et al. 2011).

Methodology
We evaluated the proposed visualization from two different points of view. On the one hand, we conducted a heuristic evaluation that tries to identify advantages and disadvantages through comparison with a theoretical model. In the context of the heuristic evaluation, we were also able to compare the proposed visualization to alternative solutions. On the other hand, we conducted a user study to assess the usability and utility of the system. The specific strength of user studies is their ability to identify usability problems that do not come up in a heuristic evaluation. Usability experts who conduct such evaluations are sometimes unable to anticipate problems encountered by inexperienced users. This makes usability studies especially valuable. However, it is difficult to compare different visualizations in the context of usability studies because tasks will always favor one or the other visualization. Therefore, heuristic evaluation and user study complement each other because they have different strengths and weaknesses. We found that it is more useful to compare different visualizations using a heuristic evaluation and to analyze the concrete interactions of users with the visualization in user studies. Beck et al. (2009) influenced the design of the visualization proposed in this paper, and we also used their criteria for evaluation purposes. We evaluated whether the aesthetic criteria by Beck et al. (2009) were truly fulfilled in a heuristic evaluation of the visualization we developed. We further wanted to clarify whether individual criteria might contradict each other. Beck et al. (2009) used their criteria to compare three different visualizations qualitatively and to assess their respective advantages and disadvantages. We compared our visualization to two generic solutions for presenting social networks: node-link diagrams and matrices, see Figs. 3 and 4 respectively.
The research question we wanted to investigate in the heuristic evaluation was: RQ (Heuristic Evaluation) Do the aesthetic criteria that are a kind of quality measure apply to the proposed visualization? How does the visualization compare to other similar visualizations concerning the aesthetic criteria?
The heuristic evaluation was conducted by two usability experts who were also knowledgeable in the area of visualizations. One of the experts had 25 years of expertise, while the other expert had only a few years of expertise. They used the criteria from Beck et al. (2009) to evaluate the visualizations systematically. Both evaluations were collated to provide a comprehensive overview of their analyses. In this process, it was possible to integrate the different points of view of the two experts. On the whole, the views did not differ considerably. The analysis distinguishes general from dynamic aesthetic criteria, as well as aesthetic scalability criteria. The heuristic evaluation allowed us to compare the proposed visualization with alternative kinds of visualization.

User study
We conducted a small user study with a qualitative methods approach to test the usability of the visualization with regard to social network tasks and to receive first feedback for improvements. The work of Lam et al. (2012) was taken as a basis for forming our research questions as well as for designing the user study. Their evaluation scenarios allow analysts to focus on the visualization itself, for example, testing its usability and making design decisions, as well as on the process and role of the visualization. As the primary goal of the study is to see how the visualization is accepted by users, the tasks were based on the two scenarios for evaluating user performance and user experience by Lam et al. (2012). Our research questions were as follows:
RQ1 Suitability for graph visualization tasks: How well does the visualization's performance meet the tasks for which it was designed?
RQ2 Usability: Which usability issues can be identified?
RQ3 Design Improvements: What design improvements can be made to increase the visualization's usability?

Tasks
The tasks for the user study were created by taking generic task types from Ghoniem et al. (2004) and the task taxonomies of Lee et al. (2006) and Kerracher et al. (2015) as starting points. The tasks should represent actions that a user would typically carry out with the visualization. A total of 17 tasks were developed to investigate the research questions. To ensure a balance of tasks from different categories, they were grouped into the seven generic network task types (TT) by Ghoniem et al. (2004). However, these task types were created for graphs without a time dimension. For this reason, the tasks were further adapted to include temporal aspects following the approach of Kerracher et al. (2015). They developed a task taxonomy for temporal graphs by combining each of the non-temporal components (graph elements and graph structure) with the temporal components (time points and time intervals), thereby creating four classes. This taxonomy was considered in our task set by incorporating time points (e.g., ''on Day 2,'' ''find a day,'' ''when,'' covering both direct and inverse look-up) as well as time intervals (e.g., ''over the time period'' or comparing relationships) in the task questions. All four task classes by Kerracher et al. (2015) are represented in the task set. Note that some tasks do not directly include wording for time aspects; however, tasks such as finding an edge, a neighbor or a path implicitly include the entire time period, as the target could be found anywhere along the timeline-based visualization, and they still give readers a sense of temporal overview.
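The difference between direct and inverse look-up on time points can be illustrated with a minimal sketch. The record layout, the names and the values below are hypothetical and are not taken from the study dataset; calls are treated as undirected.

```python
from collections import namedtuple

# Hypothetical record layout; the schema of the study dataset is not given here.
Call = namedtuple("Call", ["call_id", "day", "a", "b", "intensity"])

CALLS = [
    Call(1, 1, "Kate", "John", 3),
    Call(2, 1, "Kate", "Viktor", 1),
    Call(3, 2, "John", "Viktor", 2),
    Call(4, 3, "Kate", "John", 5),
]

def calls_on_day(calls, day):
    """Direct look-up: the time point is given, the graph elements are sought."""
    return [c for c in calls if c.day == day]

def days_with_call(calls, a, b):
    """Inverse look-up: a pair of actors is given, the time points are sought."""
    pair = {a, b}
    return sorted({c.day for c in calls if {c.a, c.b} == pair})

print(calls_on_day(CALLS, 2))
print(days_with_call(CALLS, "Kate", "John"))
```

A task phrased with ''on Day 2'' corresponds to the first function, whereas a task phrased with ''when'' or ''find a day'' corresponds to the second.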

TT1 Counting the nodes as nodeCount:
Task 1 How many persons can you identify in the network?
TT2 Counting the edges as edgeCount:
Task 2 How many phone calls can you identify in the network?
TT3 Finding the most connected node as mostConnected:
Task 15 Find a day where people had more than two phone calls?
Task 16 Who has the most contact people? How many contact people does she/he have?
Task 17 Who has the most phone calls over the time?
TT4 Finding a given node as findNode and TT5 Finding a link between two nodes as findLink: The two task types were combined, as findLink implies findNode. In general, the two given nodes must be found first, and then a connection between them must be found. Tasks for temporal graphs by Kerracher et al. (2015) were also incorporated into the following task definitions. These include direct look-up, inverse look-up, direct comparison, inverse comparison and relation seeking.

Procedure
The study followed a qualitative methods approach using think-aloud protocols combined with observations and field notes. Each participant was asked to speak their thoughts out loud and was observed while performing a set of predefined tasks. The field notes include the observed user's interaction techniques, problems and quotes or comments related to the research questions. Each session was audio and screen recorded. Additionally, a follow-up interview was conducted. Each session took approximately 45-60 min. The combination of think-aloud, observation and interview provided insights into the participants' opinion of the visualization, their strategies for each task, as well as frustrations and considerations encountered while using the visualization. The results of the evaluation were analyzed using inductive thematic analysis. The data were studied in detail to generate meaningful codes. The generated codes were visualized in a table. As a result of the inductive method of thematic analysis, the codes were grouped into four key themes: users' strategies, challenges, preferences, and additional features. Two pilot studies were conducted to ensure that the tasks were defined clearly and were solvable within a reasonable time.
Each session started with a short introduction to the aim of this work. Afterward, the visualization was explained to the participant in person. It was emphasized that the visualization itself, and not the behavior of the participant, was the focus of the evaluation. Therefore, participants were allowed to ask any questions during the test and to go on to the next task if a task could not be solved. Zooming and scrolling the visualization were also allowed. Using a 4-point Likert scale, respondents rated (a) the difficulty level (ranging from ''very easy'' to ''very difficult'') and (b) the confidence level (ranging from ''very confident'' to ''very unconfident'') of each task. These questions were posed after each task was finished.
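Ratings on such a 4-point scale are ordinal, so a frequency distribution and a median are a safer summary than a mean. The following sketch shows one way to tabulate them; the ratings are invented, and the two middle labels are an assumption, since only the end points of the scale are named above.

```python
from collections import Counter
from statistics import median_low

# Assumption: the two middle labels are hypothetical; only the scale
# end points ("very easy", "very difficult") are stated in the text.
DIFFICULTY_LABELS = {
    1: "very easy",
    2: "rather easy",
    3: "rather difficult",
    4: "very difficult",
}

def summarize_ratings(ratings, labels=DIFFICULTY_LABELS):
    """Return the count per scale level and the label of the (low) median."""
    counts = Counter(ratings)
    distribution = {labels[level]: counts.get(level, 0) for level in sorted(labels)}
    return distribution, labels[median_low(ratings)]

# Invented ratings of six participants for a single task.
task_ratings = [1, 1, 1, 2, 2, 3]
distribution, typical = summarize_ratings(task_ratings)
print(distribution)
print(typical)
```

`median_low` always returns a value that is actually on the scale, which avoids reporting a rating such as 1.5 that no participant could have given.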
At the end of each session, a follow-up interview collected the subjective opinion regarding efficiency, usefulness and user satisfaction with the visualization. The following questions were asked:
- What was easy for you? What was difficult for you?
- What did you like about the visualization?
- What did you not like about the visualization?
- What would help you to solve the tasks?

Participants and setting
For this qualitative user study, six participants (three males, three females) with a mean age of 23 years were recruited. Four of the six participants were engineering students; the remaining two were students of landscape architecture. As it was a qualitative study, it was possible to mix computer science students and students coming from a different discipline. Three of them stated that they had experience working with information visualizations. We aimed to qualitatively investigate whether differences could be noted depending on the participants' previous knowledge; however, we did not find notable insights in this small population. As this is a small exploratory study, the validity of the results is limited and should be investigated further in a larger-scale user study.
The visualization was shown on a 15'' Retina Display. In order to gain a better understanding of specific interaction techniques with the visualization, all participants were asked to use the mousepad when solving the tasks. We conducted audio and screen recording using QuickTime Player. The solutions for the tasks were collected using Google Forms, which was presented on a tablet computer during the session. The participants could submit their answers directly.

Results
First, we describe the results of the heuristic evaluation and, second, the results of the user study. The data of the user study were gathered from the usability evaluation and the follow-up interviews. As a qualitative approach, thematic analysis was chosen to make sense of the data. For this, 434 quotes were extracted from the evaluation. Afterward, these quotes were categorized into four themes: users' strategies while solving the tasks, challenges, additional features and general user preferences. Each of these four themes was analyzed individually.

General aesthetic criteria
The criterion GAC1 (reduce visual clutter) is fulfilled by the visualization. In the visualization, there is no occlusion of nodes or links. The visualization only grows horizontally and vertically if data are added. This is an advantage compared to node-link diagrams which are fairly cluttered when they reach a certain size.
GAC2 (reduce spatial aliases) is fulfilled to a certain extent. While the nodes are positioned close to each other, it is still fairly easy to identify the nodes, especially due to the grid included in the visualization. It should be pointed out, however, that in this visualization the reduction of spatial aliases and the maximization of compactness (GAC4) contradict each other. The more compact the visualization is, the less readable it will be. The visualization shares this disadvantage with matrices.
GAC3 (spatial matching of multiple representatives) is fulfilled. Every person is represented by several nodes in telephone calls. These nodes are ordered on a straight line and therefore easily distinguishable. In this sense, the visualization fulfills this criterion. Nevertheless, like in matrices, it is difficult to identify paths. In this case, this would be networks of people related through phone calls.

GAC4 (maximize compactness)
The visualization is fairly compact compared to node-link diagrams, but it still contains some white space. To get a more compact representation, it would be necessary to move the nodes and edges closer together. This would contradict criterion GAC2, and it would not be possible anymore to distinguish between nodes positioned near to each other. Nevertheless, it is much more compact than node-link diagrams. It is also more compact than matrices, especially when the temporal dimension is shown as small multiples. In addition, in this visualization only existing relationships are shown. ''Empty'' relationships that are shown in a matrix (as empty cells) are not shown in this visualization. Its compactness is apparently one of the most important advantages of this visualization.
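The point about empty cells can be made concrete with a small count. The sketch below rests on assumptions that are not stated in the text: the matrix alternative is taken to be one upper-triangle adjacency matrix per day (small multiples), the proposed layout is taken to need one column per call, and the call list is invented.

```python
def small_multiple_cells(n_people, n_days):
    """Cells in one upper-triangle adjacency matrix per day (undirected pairs)."""
    return n_days * n_people * (n_people - 1) // 2

def occupied_cells(calls):
    """Distinct (day, pair) combinations that actually contain a call."""
    return len({(day, frozenset((a, b))) for day, a, b in calls})

# Invented call log: (day, person, person). Kate and John talk twice on day 1.
CALLS = [
    (1, "Kate", "John"),
    (1, "Kate", "John"),
    (1, "Kate", "Viktor"),
    (2, "John", "Viktor"),
]

total = small_multiple_cells(n_people=3, n_days=2)
used = occupied_cells(CALLS)
print(f"matrix cells: {total}, non-empty: {used}, call columns: {len(CALLS)}")

# With 20 people and a hypothetical 5 days, the matrices reserve 950 cells
# regardless of how many calls were actually made.
print(small_multiple_cells(20, 5))
```

In this toy example, the two calls between Kate and John on day 1 fall into the same matrix cell, whereas an event-based layout keeps them as two separate columns.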

Dynamic aesthetic criteria
The visualization is especially apt to show temporal patterns clearly. Showing temporal developments in social networks is generally a challenging task. In this context, the visualization presented in this paper offers some advantages.

DAC1 (preserving the mental map)
The visualization shows temporal patterns in a stable manner. Nodes and links do not change their position while showing temporal development, in contrast to some visualizations showing the temporal development of node-link diagrams where the layout is changed due to the addition of nodes and links.

DAC2 (reducing the cognitive load)
The cognitive load is rather low, especially compared to animated visualizations. The various graphs are juxtaposed; therefore, the user only needs to remember a small amount of information at any given point in time. He or she can always look up the information easily. So far, in this prototype, no filtering options are offered, but with these filtering options, cognitive load could be reduced considerably. It should be mentioned, however, that the cognitive load might increase with an increasing size of the visualization. In addition, it is difficult to identify paths in the data. In this case, the cognitive load is also high.

DAC3 (minimizing temporal aliases)
This problem especially occurs in animations. It is minimized in the visualization presented in this paper because nodes are positioned on a time axis and, for example, persons making phone calls can easily be identified, even across different time periods.

Aesthetic scalability criteria
In general, the visualization does not scale very well when many persons and many time steps are shown. Nevertheless, it should be pointed out that the visualization can show more nodes and links than conventional node-link diagrams and still be easy to interpret.
In general, it can be said that according to the criteria described above, the visualization presented in this paper is superior to node-link diagrams showing the same amount of information, especially because it avoids clutter. It is more compact than either node-link diagrams or matrices because it conveys more information in less space. In particular, it conveys information about the time and the duration of phone calls easily (on the top and the bottom of the visualization). It is also possible to show persons who communicate twice in one unit of time. All this would be somewhat cumbersome in either node-link diagrams or matrices. The visualization is, therefore, especially appropriate for representing temporal patterns in phone calls. On the other hand, it shares some of the disadvantages of matrices. Especially in the case of a large number of members of the social network and a very compact representation, lines might be confused. All the dynamic criteria (preserving the mental map, reducing cognitive load, minimizing temporal aliases) are fulfilled. One problem of the visualization is scalability. It does not scale very well for larger numbers of persons and many points in time. Scrolling would be necessary, and this would especially contradict DAC2 (reducing the cognitive load) because users who have to scroll must keep more information in mind than users who perceive everything at a glance. In general, it can be said that the criteria developed by Beck et al. (2009) help to identify the advantages and disadvantages of dynamic graph visualizations.

Users' task-solving strategies
Using the observations, screen captures and categorized think-aloud quotes, we attempted to derive the strategies participants used to solve the tasks. The strategies are presented for each task type (TT1-TT7). The error rates of each task and the levels of difficulty and confidence estimated by the participants are shown in Fig. 5.

TT1: nodeCount
In this task, the number of individuals in the network was asked. All participants answered correctly and found the task very easy. Also, five of six participants were very confident with their solution.
Two different patterns could be identified among the participants. Half of the participants counted the names on the y-axis. These participants mentioned that they would find it helpful if the value could be read from the visualization directly, and that they would expect this information on the right side of the y-axis. The other half of the participants retrieved the number of individuals in the network from the y-axis on the right side. However, two participants noticed the axis information but were uncertain about what the values meant.

TT2: edgeCount
The task for edgeCount was to find how many phone calls are in the network. All participants answered correctly and perceived the task as easy. Five of six participants were very confident with their solution.
As expected, almost all participants solved this task by reading the information that was displayed on the top axis. Only one participant counted the edges in the visualization. The reason for this was that the participant was unsure about the communication ID and the day, as both pieces of information were displayed on the top axis.

TT3: mostConnected
There were three tasks for this task group: finding a day where people had more than two phone calls, finding the node with the most contact partners and finally finding the node with the most phone calls throughout the whole visualization. The latter two were answered correctly by all participants. The first task could not be answered correctly by two participants and one participant rated the task as difficult.
In this group of tasks all participants used similar techniques: By going through each person's horizontal line, either the number of nodes was precisely counted or a rough estimation was made. The latter was achieved by scanning through the lines quickly. Two of six participants were able to tell at a glance which individuals in the visualization have fewer connections than others and could therefore exclude these quickly.
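The three tasks of this group ask for different quantities, which a short sketch can separate: the number of calls per day, the number of distinct contact partners per person and the number of calls per person. The call list is invented, calls are treated as undirected, and ''more than two phone calls'' is read here as the total number of calls on a day, which is an assumption about the task wording.

```python
from collections import Counter, defaultdict

# Invented call log: (day, person, person).
CALLS = [
    (1, "Kate", "John"),
    (1, "Kate", "Viktor"),
    (1, "John", "Viktor"),
    (2, "Kate", "John"),
    (3, "Anna", "Kate"),
]

def calls_per_person(calls):
    """Total number of calls each person takes part in (cf. Task 17)."""
    counts = Counter()
    for _day, a, b in calls:
        counts[a] += 1
        counts[b] += 1
    return counts

def contacts_per_person(calls):
    """Number of distinct contact partners per person (cf. Task 16)."""
    contacts = defaultdict(set)
    for _day, a, b in calls:
        contacts[a].add(b)
        contacts[b].add(a)
    return {person: len(partners) for person, partners in contacts.items()}

def busy_days(calls, threshold=2):
    """Days with more than `threshold` calls in total (cf. Task 15)."""
    per_day = Counter(day for day, _a, _b in calls)
    return sorted(day for day, n in per_day.items() if n > threshold)

print(calls_per_person(CALLS).most_common(1))
print(contacts_per_person(CALLS))
print(busy_days(CALLS))
```

Counting nodes along a person's horizontal line, as the participants did, corresponds to `calls_per_person`; the number of contact partners additionally requires ignoring repeated calls with the same partner.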

TT4: findNode
This task type occurs in almost every task. From the analysis of tasks 3-7, various strategies to find a node could be identified. Finding specific nodes, i.e., names, was rated as very easy. Tasks including temporal aspects were rated as more difficult; the combination of finding an unspecific node with a temporal criterion (task 7) was rated as less easy, and two participants rated it as rather or very difficult. The error rate, however, was 0%. All participants were very or rather confident.
The first technique, applied by all participants, was to find the node on the left side of the y-axis. This was particularly the case with inverse look-up tasks. Subjects went through the list from top to bottom until the name sought was found. Five of six participants did not realize that the names are sorted alphabetically in descending order. Therefore, search effort was required. As a second technique, instead of scrolling to the left side, participants looked for a specific node in the visualization area. This pattern occurred when participants had zoomed into the visualization and thus only a part of it was visible. Moreover, participants used this technique when a certain point in time was provided as an additional parameter.

TT5: findLink
FindLink was also an essential part of tasks 3-7. The most intuitive way of finding a connection between two nodes was to focus on one row and the corresponding dots. People read the names next to the colored dots as they represent the information of a link. In contrast, a less frequently used technique to find a link was to first look for the names on the y-axis and check if they have connecting lines. Three tasks belong to this category. Two tasks were solved with a 0% error rate. The difficulty level, however, was rated as more difficult in, e.g., task 10 by half of the participants. Two participants failed to answer task 9 correctly, one participant reported lower confidence.
Participants applied three techniques in order to find common friends of two given nodes. In the first approach, participants took one node and looked at all of its relations. They followed the auxiliary line and kept the relations in mind. Afterward, they focused on the second node and checked whether connections exist between the second node and the nodes adjacent to the first node. The second technique is quite similar. Participants only looked at the connections from one node. They took the first edge they encountered and checked whether the second node also has a direct connection with the node at the other end of that edge. If this was not the case, participants repeated the action until they found a node that is adjacent to both given nodes. The last technique was only used once: one participant took both nodes and followed the lines in order to see if a common friend appeared on both lines.
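The first technique, collecting all neighbors of one node and then checking them against the second node, amounts to intersecting two neighbor sets. The following sketch illustrates this on an invented, undirected call list; the function names are hypothetical.

```python
# Invented call log as undirected pairs; the time dimension is ignored here.
CALLS = [
    ("Kate", "John"),
    ("Kate", "Viktor"),
    ("John", "Viktor"),
    ("Anna", "Kate"),
    ("Anna", "Viktor"),
]

def neighbors(calls, person):
    """All persons who share at least one call with `person`."""
    result = set()
    for a, b in calls:
        if a == person:
            result.add(b)
        elif b == person:
            result.add(a)
    return result

def common_contacts(calls, p, q):
    """Persons adjacent to both p and q, excluding p and q themselves."""
    return (neighbors(calls, p) & neighbors(calls, q)) - {p, q}

print(common_contacts(CALLS, "John", "Anna"))
```

The second technique described above checks the neighbors of the first node one at a time and stops at the first hit, which needs less memory but finds only one common friend.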

TT7: findPath
Finding a path in the evaluated visualization can be achieved by applying the technique for finding a neighbor repeatedly. We used four path-related tasks in this study (tasks 11-14), one of which was to validate a given path. Participants struggled most with this type of task, and the highest error rate (67%) could be observed. One participant was not confident in any of the tasks, and overall, all participants found the tasks to be more difficult.
We could observe the following pattern for finding paths. The participants started with one node and looked for its relations. When a given node had only one connected actor, they kept looking for the relations of that actor's node. When a node had more than one unique connection, the first relation found was considered. After that, it was compared with the second given node in order to find out whether there is a link. When two nodes were given, participants started with one node and looked for its relations. Then, they worked with the neighbor nodes and continued looking for other relations until a path was found.
Another common pattern that was identified is that people tend to start with the node with fewer connections. In that case, participants used the find ''mostConnected'' technique, compare Sect. 6.2.4.
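Applying the neighbor look-up repeatedly until the target is reached is, in algorithmic terms, a graph search. The sketch below uses breadth-first search as one possible formalization; it is not the procedure the participants followed step by step. It assumes an undirected, static view of an invented call list and ignores the order of the calls in time.

```python
from collections import defaultdict, deque

# Invented call log as undirected pairs; call order in time is ignored.
CALLS = [
    ("Kate", "John"),
    ("John", "Viktor"),
    ("Viktor", "Anna"),
    ("Paul", "Mia"),
]

def build_adjacency(calls):
    """Map each person to the set of persons they share a call with."""
    adjacency = defaultdict(set)
    for a, b in calls:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency

def find_path(calls, start, goal):
    """Breadth-first search; returns a shortest path as a list, or None."""
    adjacency = build_adjacency(calls)
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in sorted(adjacency[node]):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None

def is_valid_path(calls, path):
    """Check that every consecutive pair in `path` shares a call."""
    adjacency = build_adjacency(calls)
    return all(b in adjacency[a] for a, b in zip(path, path[1:]))

print(find_path(CALLS, "Kate", "Anna"))
print(find_path(CALLS, "Kate", "Mia"))
print(is_valid_path(CALLS, ["Kate", "John", "Anna"]))
```

The `parent` dictionary plays the role of the information that participants had to keep in mind while following lines, which may explain why these tasks were perceived as demanding.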

Challenges
Four main challenges were identified in the observations and in the participants' statements during the qualitative evaluation and follow-up interviews:
Remembering Things The study indicates that, for finding paths, comparing more than three nodes was difficult. Participants complained that ''it's exhausting'' and ''it's becoming difficult.'' Four of six participants mentioned that they often felt confused by persons who have several different relations. A main issue for them was that a lot of information had to be remembered and compared during the tasks. One participant also suggested: ''... for finding a path another visualization would probably be better.''
Finding Maximum of Intensity of Given Couple In some tasks, the intensity was relevant. The meaning of the intensity information did not seem to be a problem. When two individuals in the network had a higher intensity, participants interpreted this as a strong friendship or close relationship. However, finding the connection with the maximum intensity between two nodes over time led to confusion at several points. In two of six interviews, participants mentioned that they were unsure whether the intensity values need to be compared or whether the position of a conversation within a day is sufficient.
Layout This sub-theme summarizes all feedback gathered regarding the layout of graph data. A unanimous opinion emerged from the evaluation. Essential information such as the number of calls or the names of individuals can be clearly seen in the visualization. Nevertheless, the communication ID and the day were often confused, likely because both pieces of information were located in close proximity.
Another problem was the text orientation of the names next to the dots. Two participants found it hard to read the names because of their vertical orientation. According to the observations during two of the evaluation sessions, the participants had to turn their head to be able to read the names correctly. Regarding the names on the left side of the y-axis, there were no complaints, but one participant mentioned that it would be easier to find the name in the list when the names are sorted alphabetically in ascending order to scan the list from the top to bottom.
Navigation and Orientation Participants used the horizontal lines and the vertical lines as well to navigate through the visualization. On the one hand, they were seen as useful as one participant stated ''Lines make sense now when having so many names.'' On the other hand, one participant mentioned that following the horizontal line from one end to the other is not always possible. Especially, when users want to zoom in and out, it can happen that they slide into the wrong line. People found it challenging to do comparison tasks. When they had to jump from a point to another or jump between the lines, they lost the thread easily.

Features
In the follow-up interview, the participants were asked which additional features would have helped to solve the social network tasks. Their answers are grouped into three sub-themes.
Filtering Participants would like to filter by days, certain relations, individuals or intensities so that they can reduce the number of phone calls in the representation to only those of interest. They want to be in full control of which information is hidden and which additional information is shown. Figures 6 and 7 illustrate two filtered views of Kate's communications. Figure 6 uses the standard visualization characteristics and removes the calls in which Kate does not take part, which means that results are ordered by day and intensity. The idea of this approach is to support the mental map constructed from the overview by changing as little as possible in the new view. Figure 7, on the other hand, groups the calls by partners instead of days. In this way, relationships and common history between pairs can be investigated easily.
The first view (Fig. 6) makes it easy to observe specific patterns that are related to time, for example, that Kate always communicates with John and Viktor on the same days, and never with just one of them. The same information is present in the second view (Fig. 7) as well, but it might not be detected as easily. On the other hand, the second view can help to make sense of the different relationships the person has and how intense these relationships are. It is immediately recognizable, for example, if the person communicates every day with one person but only every other day with another person, due to significantly different column widths per person.
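The two filtered views differ only in how the same subset of calls is ordered or grouped, which the following sketch illustrates. The call list is invented, the function names are hypothetical, and ascending order of intensity within a day is an assumption, as the sort direction is not stated.

```python
from collections import defaultdict

# Invented call log: (call_id, day, person, person, intensity).
CALLS = [
    (1, 1, "Kate", "John", 3),
    (2, 1, "Viktor", "Kate", 1),
    (3, 1, "John", "Viktor", 2),
    (4, 2, "Anna", "John", 4),
    (5, 3, "Kate", "John", 5),
    (6, 3, "Kate", "Viktor", 2),
]

def calls_of(calls, person):
    """Keep only the calls in which `person` takes part."""
    return [c for c in calls if person in (c[2], c[3])]

def view_by_day(calls, person):
    """Filtered view in the style of Fig. 6: ordered by day, then intensity."""
    return sorted(calls_of(calls, person), key=lambda c: (c[1], c[4]))

def view_by_partner(calls, person):
    """Filtered view in the style of Fig. 7: grouped by partner, by day within."""
    groups = defaultdict(list)
    for c in sorted(calls_of(calls, person), key=lambda c: c[1]):
        partner = c[3] if c[2] == person else c[2]
        groups[partner].append(c)
    return dict(groups)

print([c[0] for c in view_by_day(CALLS, "Kate")])
for partner, calls in view_by_partner(CALLS, "Kate").items():
    print(partner, [(c[1], c[4]) for c in calls])
```

In the by-day ordering, the calls with John and Viktor on the same day end up next to each other, whereas the by-partner grouping makes the number of calls per partner directly comparable.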
Highlighting A functionality that allows users to mark specific nodes, edges or days in order to focus on the relevant data of the highlighted component was mentioned by one participant.
Paper and Pen Two participants mentioned that it would have been helpful to have a pen and paper to write notes on relations or paths they found. This feature does not address the visualization directly; however, it indicates that the mental load to remember information is quite high.

User preferences and comments
Four of six participants reported that it was easy to find whether two nodes are connected. Two participants stated that ''it was exhausting'' and ''it took so long'' to find the most connected node; however, they rated those types of tasks fairly easy. In contrast, people found tasks that asked for indirect relations with more than three nodes particularly hard because a lot of information had to be remembered and compared.
There were also comments on visual components like color, line, value and space. The opinions regarding the design of the nodes in the visualization varied among the participants. One participant found the size of the nodes too big. With 20 names and their corresponding lines, the visualization became ambiguous for that participant. Another participant stated that the ratio of node size and line thickness is well chosen. Regarding the color of the nodes, the participant suggested that the colors of the node and connection line should not be the same, as this seems monotonous. The participant stated that ''you can lose track easily.'' Lastly, one participant would prefer the vertical day separator lines to be ''a little bit darker'' because ''it can be misunderstood as the connection between two persons.'' Two participants positively noted the spacing between the lines and emphasized that the horizontal lines for each person helped a lot while solving the tasks. Overall, participants found that the visualization fulfills its purpose, as all tasks could be solved.

Discussion and design implications
We conducted two different kinds of evaluations with the proposed visualization: a heuristic evaluation and a user study. The heuristic evaluation indicates that the proposed visualization has some advantages. Visual clutter is reduced, but the visualization is still fairly compact. The mental map is preserved and cognitive load is fairly low, especially compared to animations. It should be mentioned, however, that the visualization does not scale very well. In general, it has some definite advantages compared to node-link diagrams and matrices.
The usability study yields the following results: The proposed visualization appears to be suitable for most of the described tasks. Especially, nodeCount, edgeCount, mostConnected, findNode and findLink seem to be solvable with ease while path-related tasks such as findNeighbor and findPath become more challenging. This reflects the similarity to matrix-based representations. The compactness of a matrix gives it certain advantages such as its resistance to higher density of the visualized dataset. On the other hand, the fact that the proposed visualization is less compact opens up possibilities for labeling which may facilitate reading detail information for users. The proposed visualization does not become crowded with increased density such as node-link representations do but instead grows in its horizontal size. Although only six subjects participated in the study, trends and possible usability issues were discovered. Based on these findings, some design implications were drawn.

Additional view
The visualization appeared to be less suitable for path-related tasks. Therefore, an additional visual representation, for example, a node-link representation in which paths can be found easily, could support the current visualization. The day separator could be made more prominent, for example with a darker color or a thicker line, as it was often overlooked.

Use color
The current version of the visualization uses three colors to differentiate the components. Adding more colors could provide a better differentiation between the different elements.

Orient labels horizontally
The vertically oriented text was hard to read for the participants and resulted in a poor reading performance. In order to avoid this issue, the text should be oriented horizontally which however would necessitate more spacing between the connections.

Limitations of the visualization
The visualization is suited for showing dynamic social networks with the dataset used in the evaluation. This dataset, however, shows only records of phone calls between two actors. Group calls, which might occur in real life, are therefore excluded. The limited interaction possibilities might also have had an effect on the results of the user study. In the current state, two views of these filter possibilities were explored and designed, but their benefits and perceived usability remain to be evaluated. Extended interactivity of the visualization should be implemented in the next phase. Features such as filtering and highlighting were mentioned by three participants. The proposed filtering techniques, therefore, seem to be relevant and should be developed further.
Fig. 7 Example of a filtered view with an additional grouping on the x-axis: The communications of Kate are now grouped by name of the communication partner (top label). The second row in the top label shows the day and the third shows the ID of the dataset. On the bottom label, again, the intensity is given.

Conclusion and future work
In this work, the early stage of a new visualization for social networks with an added time dimension was presented. The visualization was created using the visual analysis tool Tableau and an artificial dataset, with existing aesthetic criteria and task sets in mind. The proposed visualization has characteristics of both node-link diagrams and adjacency matrices.
A qualitative user study was conducted with the aim of evaluating the suitability for graph visualization tasks and the usability, and of identifying design improvements. Seven task types with a total of 17 tasks were considered. The evaluation was carried out with six participants, each in a 45-60 min session. In addition, a heuristic evaluation was conducted, which indicates that the proposed visualization does not contain clutter, is fairly compact and helps to reduce cognitive load. On the other hand, it does not scale well. The evaluation in this work showed that the proposed visualization performs well for direct look-ups and comparisons that do not involve finding paths. Though the visualization received overall positive feedback on its visual design, some adjustments to the labeling and coloring can be made to improve the usability for certain tasks.
Further empirical studies are subject to future work in order to obtain statistically significant results and to determine the applicability in different domains, contexts and application scenarios. In future work, the visualization can also be extended to represent group calls; the visualization approach easily allows multiple people to be connected by adding dots on the respective connection lines. The visualization can further be enhanced by interactivity such as filtering, sorting or highlighting. Further evaluations are necessary to explore how the visualization scales to larger datasets and how interactive views affect the performance and usability.
Funding Open access funding provided by TU Wien (TUW).

Appendix
See Tables 2 and 3.