Visualization of Online-Game Players Based on Their Action Behaviors

We propose a visualization approach for analyzing players’ action behaviors. The proposed approach consists of two visualization techniques: classical multidimensional scaling (CMDS) and KeyGraph. CMDS is for discovering clusters of players who behave similarly. KeyGraph is for interpreting action behaviors of players in a cluster of interest. In order to reduce the dimension of matrices used in computation of the CMDS input, we exploit a time-series reduction technique recently proposed by us. Our visualization approach is evaluated using the log of an online game, in which three player types from Bartle’s taxonomy are found, that is, achievers, explorers, and socializers.


INTRODUCTION
The market size of online games continues to grow rapidly [1]. At the same time, competition among them is becoming intense. The quality of player service plays an important role in winning this competition. It is therefore essential for online-game developers and publishers to know their players' behaviors so that they can develop game contents that fulfill player demands. Visualization techniques have recently been applied to discover in-game player behaviors.
Most work in the literature focuses on visualization of player trails or time series of visited locations for examining the distance over time among the members of a social group [2], discovering playing strategies in a combat game [3], and analyzing movement patterns [4]. Other work focuses on extracting pathways [5] and on locating clusters of similar players based on their movement patterns [6].
This research, however, focuses on visualizing player behaviors based on their actions. According to Bartle's taxonomy [7], online-game players can typically be classified, based on their action behaviors, into achievers, explorers, killers, and socializers. Player-type information should, therefore, be exploited to provide game contents that players favor, for example, a wider variety of collectable items for achievers, longer missions for explorers, more hunting opportunities for killers, and a higher frequency of social events for socializers. It has been recently reported in [8] that Bartle's taxonomy is also applicable to social data in a web-based application.
In this paper, we propose an approach for visualizing players' action behaviors using classical multidimensional scaling (CMDS) [9] and KeyGraph [10], both described in Section 2. First, CMDS is used for locating clusters of similarly behaving players. KeyGraph is then used for interpreting playing behaviors of players in a cluster of interest. The input to CMDS is derived from time-series matrices of players' action sequences, which are needlessly long due to noise and redundancy, leading to high computational cost. We, therefore, compute the CMDS input based on reduced time-series matrices obtained by our recently proposed time-series reduction technique in [11]. To make this paper self-contained, this technique is described in Section 3. Evaluation of our visualization approach is given in Section 4, where achievers, explorers, and socializers are found in the play log from the online game used in the evaluation.

PLAYER VISUALIZATION
In this section, we describe CMDS, KeyGraph, log format, and visualization metrics. As with most other tools for information visualization [12], subjective interpretation is required for KeyGraph. The described visualization metrics are used for facilitating this task in Section 4.2.

Multidimensional scaling
CMDS is a prevailing technique for mapping pair-wise relationships to coordinates and has been applied to several areas such as statistics, psychology, sociology, political sciences, and marketing [9]. Recently, this technique has been successfully applied to clustering of online-game players based on their movement patterns [6]. CMDS takes as its input a matrix D, indicating dissimilarities between player pairs, and outputs a coordinate matrix whose configuration minimizes a loss function in preserving all interpoint distances. Two time series of interest are considered similar if they have similar rise and fall patterns, although they might have different scales on the time axis. A good measurement for deriving the distance or dissimilarities between such series is the dynamic time warping (DTW) distance [13].
In our research, the (i, j)th element in D is the DTW distance between the reduced time-series matrices of action sequences of players i and j. In addition, we use the function cmdscale in the Statistics Toolbox of Matlab for performing CMDS and select only the first two dimensions of the constructed coordinates for plotting players.
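As a rough illustration of what CMDS computes, the following NumPy sketch double-centers the squared dissimilarities and keeps the leading eigenvectors. It is not the cmdscale function used in this work; the function name classical_mds and the toy points are assumptions introduced here for illustration only.

```python
import numpy as np

def classical_mds(D, dims=2):
    """Classical MDS sketch: embed an m x m dissimilarity matrix D in `dims` dimensions."""
    D = np.asarray(D, dtype=float)
    m = D.shape[0]
    # Double centering of the squared dissimilarities.
    J = np.eye(m) - np.ones((m, m)) / m
    B = -0.5 * J @ (D ** 2) @ J
    # B is symmetric, so eigh applies; eigenvalues are returned in ascending order.
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:dims]
    # Assumption: negative eigenvalues (non-Euclidean input) are clipped to zero.
    lam = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(lam)

# Toy check with four hypothetical points on a plane.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
coords = classical_mds(D, dims=2)
```

For a DTW-based matrix D, which need not be Euclidean, some eigenvalues can be negative; clipping them to zero is one common choice made in this sketch, not a description of how any particular toolbox handles them.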

Action coding
Here, we describe how action sequences are numerically coded into time-series matrices for computation of DTW distances. Let O denote the set of action symbols of interest and |O| its cardinality. Action sequence $s = s(1), s(2), \ldots, s(L)$ of length L is coded into the $|O| \times L$ time-series matrix $X = [X(1), X(2), \ldots, X(L)]$, where X(i) is a column vector with the element indexing the action symbol of s(i) being 1 and other elements 0.
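The coding described above amounts to one-hot coding of each action symbol. A minimal sketch, assuming NumPy and a hypothetical helper name code_actions (neither appears in the original work), could look as follows; the sequence A, C, B is an arbitrary illustration.

```python
import numpy as np

def code_actions(sequence, symbols):
    """Code an action sequence into an |O| x L matrix of 0/1 column vectors."""
    index = {symbol: row for row, symbol in enumerate(symbols)}
    X = np.zeros((len(symbols), len(sequence)), dtype=int)
    for t, action in enumerate(sequence):
        # Column t has a single 1 in the row that indexes the action symbol.
        X[index[action], t] = 1
    return X

# Hypothetical sequence over O = {A, B, C}.
X = code_actions(["A", "C", "B"], ["A", "B", "C"])
```

Each column of X contains exactly one nonzero entry, so the Euclidean distance between two columns is either 0 (same action) or the square root of 2 (different actions).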
Consider, for example, the set of action symbols O = {A, B, C}, and thus |O| = 3, where symbols A, B, and C are represented by column vectors $[1\ 0\ 0]^t$, $[0\ 1\ 0]^t$, and $[0\ 0\ 1]^t$. In an action sequence such as

Dynamic time warping
The DTW distance between time-series matrices X and Y, dtw(X, Y), having lengths $L_X$ and $L_Y$, is defined as follows:

$$\mathrm{dtw}(X, Y) = f(L_X, L_Y),$$

where

$$f(i, j) = d(i, j) + \min\{f(i-1, j),\ f(i, j-1),\ f(i-1, j-1)\}, \quad f(0, 0) = 0, \quad f(i, 0) = f(0, j) = \infty \ (i, j \geq 1),$$

and d(i, j) is the Euclidean distance between X(i) and Y(j). Consider, for example, the set of symbols O = {A, B, C, D, E, F} and two action sequences x = C, B, E, F and y = A, B, C, E, D, F, B. The DTW distance between corresponding time-series matrices X and Y (cf. Figure 1), dtw(X, Y), is 5.6, derived as shown in Figure 2.

KeyGraph
Three major components of KeyGraph are as follows.
(1) Foundations: subgraphs of highly associated and frequent terms representing basic concepts in the data.
(2) Roofs: terms highly associated with foundations.
(3) Columns: associations between foundations and roofs used for extracting keywords or main concepts in the data.
Associations between terms are defined as the co-occurrence among them in the same sentences, and keywords are the terms in either foundations or roofs that are connected to strong columns. Under KeyGraph representation, solid lines and their touching black nodes depict foundations, dotted lines depict columns, red nodes depict roofs excluding those in the foundations, and double circles depict keywords. We use a publicly available tool called Polaris [17] for generating KeyGraphs (http://www.chokkan.org/software/dist/polaris-0.19alpha.zip). Figure 3 shows an example of KeyGraph when it is applied to the text data taken from the abstract of this paper, where common preprocessing for text data such as removal of conjunctions, determiners, and prepositions is performed. From this figure, it can be seen that there is one foundation consisting of four terms, that is, "KeyGraph," "action," "behaviors," and "players." The first three terms are also keywords. Another keyword is "proposed." Three roof terms are "consists," "two," and "techniques." These terms represent well the messages in the abstract.
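To make the notion of association concrete, the sketch below counts, for every pair of terms, the number of sentences in which both appear. This is only a simplified first ingredient and not the KeyGraph algorithm or the Polaris tool; the function name cooccurrence and the toy sentences are assumptions made here for illustration.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(sentences):
    """Count, for each unordered term pair, the sentences containing both terms."""
    counts = Counter()
    for sentence in sentences:
        # Use the set of terms so a pair is counted once per sentence.
        for a, b in combinations(sorted(set(sentence)), 2):
            counts[(a, b)] += 1
    return counts

# Hypothetical "sentences" made of action symbols.
sentences = [["m", "n", "c"], ["m", "n"], ["c", "r"]]
counts = cooccurrence(sentences)
```

Pairs with high counts among frequent terms are the kind of strongly associated terms from which foundations would be built; the selection of roofs, columns, and keywords is not covered by this sketch.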

Log format and visualization metrics
Player action sequences in our work are sequences of action symbols extracted from the game log of an online game discussed in Section 4. Each log entry has the following format: time stamp, player ID, event, start position, stop position, where an event consists of an action and its object, if any. This format is based on those adopted in an MMOG simulator in [18] and 3D virtual worlds in [2]. In commercial online games, this kind of game log is stored in the monitoring database [19].
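A minimal sketch of turning such log entries into per-player action sequences is given below. The record type LogRecord, its field names, the position tuples, and the sample entries are all hypothetical; the actual storage format of the log is not specified beyond the five fields listed above.

```python
from collections import defaultdict
from typing import NamedTuple, Tuple

class LogRecord(NamedTuple):
    # Hypothetical in-memory form of one log entry.
    time_stamp: float
    player_id: str
    event: str                      # action symbol, possibly with its object
    start_position: Tuple[float, float]
    stop_position: Tuple[float, float]

def action_sequences(records, ignored=()):
    """Group events by player in time order, skipping symbols listed in `ignored`."""
    per_player = defaultdict(list)
    for rec in sorted(records, key=lambda r: r.time_stamp):
        if rec.event not in ignored:
            per_player[rec.player_id].append(rec.event)
    return dict(per_player)

# Hypothetical entries for two players.
records = [
    LogRecord(2.0, "p1", "c", (0, 0), (0, 0)),
    LogRecord(1.0, "p1", "w", (0, 0), (1, 0)),
    LogRecord(1.5, "p2", "m", (3, 3), (3, 3)),
    LogRecord(3.0, "p1", "m", (1, 0), (1, 0)),
]
sequences = action_sequences(records, ignored=("w",))
```

The `ignored` argument is a convenience assumed here so that a ubiquitous symbol can be dropped before further processing.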
According to a recent work in [20], there are three categories of play motivations in online games as follows.
(i) Achievement consisting of three subcomponents, that is, advancement, mechanics, and competition.
(ii) Social consisting of three subcomponents, that is, socializing, relationship, and teamwork.
(iii) Immersion consisting of four subcomponents, that is, discovery, role-playing, customization, and escapism.
The achievement, social, and immersion categories correspond to Bartle's achievers, socializers, and explorers, respectively, although the above ten motivations overlap among player types.
In our work, we focus in particular on (i) advancement, described in [20] as the desire to gain power, progress rapidly, and accumulate in-game symbols of wealth or status, (ii) socializing, described in [20] as having an interest in helping and chatting with other players, and (iii) discovery, described in [20] as finding and knowing things that most other players do not know about.
This is because we anticipate that they should be identifiable using our action sequences and KeyGraphs. Below, we verify this anticipation with simplified data sets and their KeyGraphs, which serve as our visualization metrics for facilitating interpretation of KeyGraph results in Section 4.2. Let us consider a set of action symbols {c, w, m, n, r}, standing for chat, walk, interaction with a mission master, interaction with a nearby object (item, NPC, or monster), and interaction with a remote object, respectively. The symbol w is a fundamental action and thus should be a frequent symbol in all action sequences. It is therefore removed from our consideration. For achievers motivated by advancement, interactions with mission masters should be frequently seen in their action sequences. Figure 4 shows the resulting KeyGraphs for the three player types, where each KeyGraph was generated from the data sets of the corresponding player type. Note that m, c, and r are the keyword nodes in the KeyGraphs of achiever, socializer, and explorer, respectively, and henceforth these findings are used as visualization metrics.

TIME-SERIES REDUCTION
Our technique for obtaining compact sequences representing major player behaviors is based on the Haar wavelet transform [21]. The Haar wavelet transform technique has a wide range of time-series applications including classification of DNA sequences [22], which motivated us to apply the technique to action sequences. Below, we first give an outline of the Haar wavelet transform and then describe the time-series reduction technique.
In the wavelet transform concept, decomposition involves obtaining wavelet coefficients from a sequence of interest. Reconstruction involves recovering the original sequence from the obtained coefficients. Henceforth, it is assumed that the length L of a sequence is adjusted so that L is a power of 2 and $q = \log_2(L)$. The ith Haar wavelet coefficient at resolution order k, $d_{(k,i)}$, is derived as

$$d_{(k,i)} = \frac{x_{(k+1,2i-1)} - x_{(k+1,2i)}}{2},$$

where $x_{(k,i)} = (x_{(k+1,2i-1)} + x_{(k+1,2i)})/2$ is the ith average at order k between two corresponding adjacent values in order k + 1. With this representation, $k_{\max} = q$, and the original sequence is represented by $x = x_{(q,1)}, x_{(q,2)}, \ldots, x_{(q,L)}$. An example of Haar wavelet decomposition of the sequence 6, 8, 2, 7, 6, 5, 4, 3 is shown in Table 1. Reconstruction of a given sequence from its Haar wavelet coefficients and averages is done as follows:

$$x_{(k+1,2i-1)} = x_{(k,i)} + d_{(k,i)}, \qquad x_{(k+1,2i)} = x_{(k,i)} - d_{(k,i)}.$$

Now, we describe our procedure for reducing the length of the time-series matrix of an action sequence of interest. For explanation, we use action sequence x = A, B, C, E, D, F, B, A as an example, where O = {A, B, C, D, E, F} and thus |O| = 6.
(i) Derive time-series matrix X for action sequence x of interest having length L. Figure 5 shows resulting time-series matrix X in our example.
(ii) Decompose each row in X to obtain Haar wavelet coefficients. Figure 6 shows resulting averages and coefficients for X in our example.
(iii) Reconstruct each row in X with selected Haar wavelet coefficients as follows.
Reconstruction of each row in X starts from the coefficient at the lowest resolution order, that is, $d_{(1,1)}$, and proceeds to those at the next higher order, and so forth. At a given resolution order, when the number of remaining coefficients to be selected is less than the number of coefficients in that order, they are selected based on their total energy value in decreasing order, where the total energy of $d_{(k,i)}$, $E_{(k,i)}$, is defined as

$$E_{(k,i)} = \sum_{n=1}^{|O|} d_{(n,k,i)}^{2},$$

where $d_{(n,k,i)}$ is $d_{(k,i)}$ decomposed at row n of X. All other unselected coefficients are then reset to zero. Figure 7 shows resulting X after selection of four coefficients in our example. Following the recipe in [21], the number of Haar wavelet coefficients used in our performance evaluation is heuristically set to $\min(L - 1, \log_2 L \times 4)$.
(iv) Reconstruct X with the above coefficients (cf. Figure 8 for our example).
(v) Reduce the size of X by sampling down a group of repetitive and consecutive elements at each reconstructed row to one element. Figure 9 shows the reduced X in our example.
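Steps (ii) to (v) can be sketched in NumPy as follows. This is an illustrative reading of the procedure, not the original implementation: the function names are hypothetical, coefficients are stored coarsest order first, and step (v) is interpreted here as collapsing runs of identical consecutive columns into one column, which is an assumption.

```python
import numpy as np

def haar_decompose(row):
    """Return (overall average, list of coefficient arrays, coarsest order first)."""
    x = np.asarray(row, dtype=float)
    levels = []
    while len(x) > 1:
        avg = (x[0::2] + x[1::2]) / 2.0   # averages of adjacent pairs
        dif = (x[0::2] - x[1::2]) / 2.0   # Haar coefficients of adjacent pairs
        levels.append(dif)
        x = avg
    return x[0], levels[::-1]

def haar_reconstruct(average, levels):
    """Invert haar_decompose: each average a and coefficient d give a + d and a - d."""
    x = np.array([average], dtype=float)
    for dif in levels:
        nxt = np.empty(2 * len(x))
        nxt[0::2] = x + dif
        nxt[1::2] = x - dif
        x = nxt
    return x

def reduce_matrix(X, n_coeffs):
    """Keep n_coeffs coefficient positions (shared by all rows), then collapse columns."""
    decomposed = [haar_decompose(r) for r in X]
    kept = [[np.zeros_like(lv) for lv in levels] for _, levels in decomposed]
    budget = n_coeffs
    for k in range(len(decomposed[0][1])):
        if budget <= 0:
            break
        stacked = np.array([levels[k] for _, levels in decomposed])
        energy = (stacked ** 2).sum(axis=0)          # total energy over rows
        chosen = np.argsort(-energy, kind="stable")[:budget]
        for n in range(len(decomposed)):
            kept[n][k][chosen] = stacked[n][chosen]
        budget -= len(chosen)
    recon = np.array([haar_reconstruct(a, lv)
                      for (a, _), lv in zip(decomposed, kept)])
    # Assumed reading of step (v): drop a column if it repeats the previous one.
    keep_col = np.ones(recon.shape[1], dtype=bool)
    keep_col[1:] = np.any(~np.isclose(recon[:, 1:], recon[:, :-1]), axis=0)
    return recon[:, keep_col]

# Example sequence x = A, B, C, E, D, F, B, A over O = {A, ..., F}.
symbols, x = "ABCDEF", "ABCEDFBA"
X = np.array([[1.0 if s == a else 0.0 for a in x] for s in symbols])
reduced = reduce_matrix(X, n_coeffs=4)
```

Keeping all L - 1 coefficients reproduces X unchanged, while keeping none leaves every row constant and collapses the matrix to a single column; intermediate budgets trade detail for length.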
Note that the DTW distance between the reduced time-series matrices X of players i and j is assigned to the (i, j)th element of the CMDS input matrix D discussed in Section 2.1.
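The construction of D from pairwise DTW distances can be sketched as below, using the standard DTW recursion with Euclidean distance between columns. The helper names one_hot, dtw, and dissimilarity_matrix are hypothetical, and the matrices are coded directly from the example sequences of Section 2 rather than from reduced matrices.

```python
import numpy as np

def one_hot(seq, symbols):
    """Code a sequence of symbols into an |O| x L matrix of 0/1 columns."""
    X = np.zeros((len(symbols), len(seq)))
    for t, s in enumerate(seq):
        X[symbols.index(s), t] = 1.0
    return X

def dtw(X, Y):
    """DTW distance between two matrices whose columns are time steps."""
    lx, ly = X.shape[1], Y.shape[1]
    f = np.full((lx + 1, ly + 1), np.inf)
    f[0, 0] = 0.0
    for i in range(1, lx + 1):
        for j in range(1, ly + 1):
            d = np.linalg.norm(X[:, i - 1] - Y[:, j - 1])   # Euclidean d(i, j)
            f[i, j] = d + min(f[i - 1, j], f[i, j - 1], f[i - 1, j - 1])
    return f[lx, ly]

def dissimilarity_matrix(matrices):
    """Symmetric matrix whose (i, j)th element is the DTW distance of players i and j."""
    m = len(matrices)
    D = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            D[i, j] = D[j, i] = dtw(matrices[i], matrices[j])
    return D

symbols = "ABCDEF"
X = one_hot("CBEF", symbols)
Y = one_hot("ABCEDFB", symbols)
D = dissimilarity_matrix([X, Y])
```

For the two example sequences, the optimal warping path contains four mismatched pairs, so the distance equals four times the square root of 2 (about 5.66), in line with the value of 5.6 quoted for this example.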

Settings
We obtained the player log from the online game The ICE [23], under development at our laboratory. A screen shot of The ICE and the game map in use are shown in Figures 10 and 11, respectively. The main game objects were nonplayer characters (NPCs), statically positioned at different locations, with whom player characters (PCs) must interact (chat, help, trade) to receive and complete missions; the item shop, from which PCs bought items; and monsters, randomly positioned throughout the game world for PCs to attack with snowballs. Major missions in the game are as follows.
(i) Item delivery where the PC must deliver an item from the mission issuing NPC to a specified NPC.
(ii) Item trade where the PC must trade with NPCs to increase the amount of money initially provided by the mission issuing NPC.
(iii) Monster extermination where the PC must help the mission issuing NPC by exterminating monsters.
Actions available in The ICE are summarized in Tables 2 and 3. All NPCs are involved in missions, except NPCs 1, 13, 14, and 16. In the resulting KeyGraphs given in Section 4.2, the symbols for these four nonmission NPCs and monsters are preceded by "n" for those residing in Town 1, that is, nH, and "r" for those in Town 2 or the eastern border of the map, that is, rT, rU, rW, rA, and rD. This is done in order to utilize the visualization metrics in Section 2.3.
A group of 20 players, on average 20 years of age, participated in this evaluation. These players were third-year and fourth-year computer science undergraduate students who were familiar with online games but had no experience in playing The ICE. After a brief introduction to the game, they were asked to play it freely, starting from Town 1. In addition to these 20 players, labeled p1-p20, three game masters, JOJO, Justice, and KURO, also participated in the event. In the rest of our evaluation, the symbol w was removed from the log because it was frequently present in all players' action sequences and thus bore no information.

Results and discussions
Table 4 shows the mean and variance of the lengths of the time-series matrices of action sequences before and after the time-series reduction technique is applied.
Figure 12 plots all players on the two-dimensional space obtained by CMDS. Most players form a cluster on the right half of the figure. The rest, that is, p1, p5, p8, p9, and p17, can be considered outliers. To remove the effect of these outliers, we excluded them from the log and obtained a new result in Figure 13. From Figure 13, most players can be divided into three clusters: cluster 1 of Justice, JOJO, KURO, p10, p15, p20; cluster 2 of p2, p3, p4, p6; cluster 3 of p7, p16, p18, p19. Each cluster has different player behaviors as discussed below through interpretation of KeyGraph visualization results.

Cluster 1: explorers
Figure 14 shows the KeyGraph of cluster 1, whose salient features are summarized as follows.
(i) They moved away from town 1 and fought monsters 2 and 3.
(ii) They also went to the end of the map and fought the boss monster.
(iii) They were not active in pursuing missions.
The above summary is based on our interpretation of this KeyGraph as follows. First, it can be seen that the foundation of this KeyGraph is mainly composed of warp and attack (monsters 2 and 3) nodes. Next, the symbol rD is a keyword indicating that these players went far away to the end of the map and fought the boss monster there. In addition, because there is only one NPC symbol J in the keywords, these players were not active in receiving missions from NPCs and in pursuing them.
Consequently, it can be stated that the players in cluster 1 like to explore the world map and that these players have no interest in pursuing missions and only fight monsters when they find them. This type of player fits Bartle's explorer.

Cluster 2: achievers
Figure 15 shows the KeyGraph of cluster 2, whose salient features are summarized as follows.
(i) They mainly moved within town 1.
(iii) They received a lot of missions.
The above summary is based on our interpretation of this KeyGraph as follows. First, it can be seen that besides nodes related to fighting (B, C, E, u, p), nodes of NPCs residing in town 1 (I, J, K) are included in the foundation of this KeyGraph. This indicates that these players were mainly in town 1. In addition, the keywords include symbols L and R, which denote NPCs who are involved in several missions. As a result, it can be stated that the players in cluster 2 are aggressive in pursuing missions, especially those completable within or not far away from town 1. This type of player fits Bartle's achiever.

Cluster 3: socializers
Figure 16 shows the KeyGraph of cluster 3, whose salient features are summarized as follows.
(i) They chatted a lot.
(iii) They also fought the boss monster.
The above summary is based on our interpretation of this KeyGraph as follows. First, the foundation of this KeyGraph includes the symbol c, not seen in the foundation or keywords of the previous two clusters. This indicates that these players chatted a lot with each other. Next, the symbol rD is a keyword showing that the players also fought the boss monster. We have confirmed by directly investigating the log that a group of three players (p5, p7, p8) and another group of four players (p16, p17, p18, p19) frequently chatted among their group members and that each group went together to the end of the map to fight the boss monster.
From the above interpretation, it can be stated that the players in this cluster like to communicate with others via chats. This type of player fits Bartle's socializer.

Computational complexity
We give here the computational complexity of the techniques used in our approach.
(i) CMDS: for m players, the time complexity of the original CMDS is $O(m^3)$. To cope with very large m, a recently proposed approximation [24] taking O(m) time can be used.
(ii) DTW: the time complexity of DTW for computing the distance between two time series of lengths $l_x$ and $l_y$ is $O(l_x l_y)$. We coped with this issue using the time-series reduction technique in Section 3. This technique can also be used together with an approximation technique in [25] that introduces lower bounding based on warping constraints.
(iii) KeyGraph: the time complexity of KeyGraph is $O(n \log n)$, where n is the number of action symbol types.
(iv) Wavelet: for a time series of length l, the Haar wavelet transform takes O(l) time.

CONCLUSIONS AND FUTURE WORK
Understanding player behaviors is an important issue in improving the service quality of online games. We have proposed a visualization approach that first locates clusters of players who have similar action behaviors using CMDS and then interprets such behaviors of a cluster of interest using KeyGraph. To increase the efficiency in computation of the CMDS input, we have described the use of the time-series reduction technique proposed recently by us in [11]. Evaluation of the proposed approach has been done using the log from The ICE, where three clusters have been found to fit three of Bartle's four player types, that is, achievers, explorers, and socializers. Our future work is to apply the proposed approach to logs from commercial online games and to examine if Bartle's player types can be found. It might also be interesting to investigate log formats whose information can be used for automatically identifying other types of Nick Yee's play motivations.

Figure 1: Time-series matrices X and Y.

Figure 2: Derivation of dynamic time warping distance between X and Y.

Figure 3: KeyGraph applied to the abstract of this paper.

Figure 4: (a) KeyGraphs for achievers, (b) socializers, and (c) explorers generated from the given sample data sets, where m (interaction to a mission master), c (chat), and r (interaction to a remote object) are the keywords in (a), (b), and (c), respectively.

Figure 10: A screen shot of The ICE.

Figure 12: MDS result for all data.

Figure 13: MDS result for data after exclusion of the outliers.

Figure 15: KeyGraph of cluster 2, where L and R (talk to mission NPCs) are among the keywords.

Table 1: Example of Haar wavelet transform.

Table 2: Action list of The ICE.

Table 3: List of additional symbols related to actions Attack and Talk.
Symbol  Description
A       Attack to monster 1
B       Attack to monster 2
C       Attack to monster 3
D       Attack to boss monster

Table 4: Mean and variance of the data lengths before and after applying the time-series reduction technique.