Economic games are the experimental instrument primarily used by economists and psychologists to study how individuals make decisions while interacting with others. Careful design of economic games allows researchers to infer people’s motives and decision processes from observing their choices. Recently, however, researchers have moved from studying choices alone to a more process-focused perspective (Camerer & Johnson, 2004; Fehr & Schmidt, 1999; Schulte-Mecklenbeck, Kühberger, & Ranyard, 2011; Wang, Spezio, & Camerer, 2010). For example, they have begun using various tools to track attention. One example is the use of MouseLab to monitor people’s information acquisition when making decisions (Willemsen & Johnson, 2011). Although MouseLab has given researchers a window into the decision-making process, some have argued that proceeding through information within this framework, in which information is hidden behind boxes, incurs costs that might influence decision processes and, ultimately, choices (Glöckner & Herbold, 2011; Lohse & Johnson, 1996). Eyetrackers come without the problem of extensive search costs, as information is available literally at a glance.

There is, however, a downside to eyetracking, one that is particularly important when studying interactive games: Given the hefty initial price tag of tens of thousands of US dollars per unit, few research institutions have previously been able to afford more than one eyetracker, if any at all. Relying on a single eyetracker is particularly problematic in the context of interactive economic games, in which researchers are interested in an online account of how participants react to the behavior of others. A single eyetracker allows researchers to track the attention of one player, but the simultaneous dynamics of attention during group interactions remain hidden.

The recent introduction of inexpensive eyetrackers is changing this landscape considerably. Examples include an eyetracker marketed by The Eye Tribe (www.theeyetribe.com) for $99 and one by Tobii (www.tobii.com) for $200. Other companies, including SMI (www.smivision.com), are also beginning to offer trackers in a much lower price range (typically around $500, as of 2016). With this substantial reduction in prices, laboratories are now able to acquire several units, enabling the simultaneous eyetracking of multiple participants. To date, however, experiments involving multiparticipant interaction have not been combined with multiple eyetrackers. To our knowledge, no available software package allows researchers to construct an experiment to simultaneously collect behavioral and eyetracking data from (many) interacting participants. We created such a software package.

Synchronizing by matching time stamps

A simple yet tedious approach to conducting simultaneous eyetracking studies is to run a behavioral experiment using existing software (e.g., zTree) while recording eyetracking data individually on multiple computers at the same time (see, e.g., Fiedler, Glöckner, Nicklisch, & Dickert, 2013). Matching the time stamps from eyetracking and behavioral data, without accessing the eyetracker controls, may provide the researcher with the desired datasets for analysis. However, we find this approach inadequate for three reasons.

  1. Gaze-contingent experiments, in which the procedure is conditioned on specific gaze patterns, are not possible without access to the tracker data in real time. In a multiparticipant setup, the experimenter may want to make one participant’s display contingent on that participant’s own gaze location, on another participant’s gaze location, or on some combination of the two. For example, the experimenter may want participants to proceed to make decisions only when all interacting participants have fixated on all relevant information (a minimal sketch of such a rule appears after this list). This cannot be accomplished by matching time stamps, but only by processing tracker data in real time.

  2. Without direct access to the eyetracker controls, calibration and other tasks must be conducted independently of the behavioral task and be supervised by the experimenter. In experiments with many participants, this is costly, and participant–experimenter interaction introduces noise. With direct access to the tracker controls, the experimenter can decide a priori how to perform the calibration, and can run it in a self-paced manner. For example, the experimenter can decide under which conditions the calibration will be considered successful, the number of calibration trials, and the procedure to follow if calibration fails (e.g., to continue the experiment with or without tracking, to quit the experiment, or to replace the participant with one whose calibration was successful). In such a setup, the eyetracker can be calibrated without the experimenter’s supervision and simultaneously for several participants.

  3. Matching time stamps generates an unnecessarily large amount of data, because recording cannot be turned on and off at desired, synchronized points of the experiment. Using a centralized software package to bundle the communication of each participating client allows control over when each eyetracker starts and stops recording during periods of interest. Onset and offset signals may also be contingent on specific behaviors in the experimental task, or on specific fixation patterns. This allows for more selective and efficient data collection.
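To make the first point concrete, the following is a minimal sketch of a gaze-contingent progression rule that releases the next screen only once every participant has fixated all relevant AOIs. The data structure, the AOI names, and the readiness check are illustrative assumptions, not part of pyeTribe’s actual API.

```python
# Minimal sketch of a gaze-contingent progression rule; all names are
# illustrative and do not reflect pyeTribe's actual interface.
from dataclasses import dataclass

@dataclass
class Fixation:
    participant: str  # client identifier
    aoi: str          # name of the area of interest that was fixated

REQUIRED_AOIS = {"own_payoff", "other_payoff"}

def all_participants_ready(fixations, participants):
    """Return True once every participant has fixated every required AOI."""
    seen = {p: set() for p in participants}
    for fix in fixations:
        if fix.participant in seen and fix.aoi in REQUIRED_AOIS:
            seen[fix.participant].add(fix.aoi)
    return all(seen[p] >= REQUIRED_AOIS for p in participants)

# A central server would run this check on every incoming fixation and
# broadcast a "proceed" message as soon as it returns True.
```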

Features required in the software package

In developing software to coordinate experiments with multiple participants in eyetracking studies, we first identified the following features that would be necessary in the package:

  1. User input: Record mouse clicks, keystrokes, or other input that participants use to express preferences or valuations.

  2. Eyetracker data: Record time-series data generated by the eyetracker, including x, y coordinates of gaze locations on the screen and pupil dilation.

  3. Network data: Coordinate user input, eyetracker data, and other information transmitted by instances of the software running on all participants’ computers.

  4. Scalability: Scale easily to n participants (where n > 2).

Items 1–3 relate to data-handling issues. Some existing software packages handle these three types of data independently. For example, experimental software packages commonly used in psychology or neuroscience (E-Prime, www.pstnet.com/eprime.cfm; OpenSesame, www.osdoc.cogsci.nl) handle mouse and keyboard inputs with the appropriate drivers; several other applications are able to process eyetracker data (E-Prime; Presentation, www.neurobs.com; Psychophysics Toolbox in MATLAB, www.psychtoolbox.org) in various setups, and even network data (zTree; Fischbacher, 2007) in economic games. Although the above-mentioned software packages can handle some of the listed features, currently no software package is available that can natively process the flow of all these data sources in real time.

Item 4, scalability, becomes important when it is necessary to adapt to changing experimental situations (e.g., different numbers of participants per experimental session). One way to achieve a scalable setup is to centralize certain processes on a server and to let clients communicate with this central node in a so-called star network (Roberts & Wessler, 1970). In a star network with n connections, each client connects to one network location, rather than to the addresses of every other client. Hence, the software on the server has the flexibility to deal with sessions with different numbers of participants. For example, if the laboratory has 20 computers prepared for the experiment but only 18 participants show up, the experiment can still be conducted, because the server will automatically account for the incoming network connections.
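To illustrate the star topology, here is a minimal server-side accept loop, assuming plain TCP sockets; the host, port, and expected number of participants are arbitrary placeholders, and pyeTribe’s actual networking code may differ.

```python
# Sketch of a star network: every client connects to one server socket, so a
# session works with however many participants actually show up.
import socket

def wait_for_clients(expected, host="0.0.0.0", port=5000):
    """Accept one TCP connection per participant and return the open sockets."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, port))
    server.listen(expected)
    clients = []
    while len(clients) < expected:
        conn, addr = server.accept()
        clients.append(conn)
        print(f"Participant {len(clients)}/{expected} connected from {addr}")
    return clients
```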

An additional advantage of this design is that the software running on the server can take a more active role than simply relaying information. If an experiment requires a calculation to be made using input from multiple users (e.g., the sum of participants’ contributions to a common pool), the server can perform such a calculation and immediately send the result back to all clients.
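Continuing the sketch above, the server’s active role could look roughly like this; the one-contribution-per-line text protocol is again an assumption for illustration, not pyeTribe’s actual wire format.

```python
def collect_and_broadcast_sum(clients):
    """Read one integer contribution from each client, then send the total back."""
    contributions = []
    for conn in clients:
        line = conn.recv(64).decode().strip()   # e.g., "12\n" sent by a client
        contributions.append(int(line))
    total = sum(contributions)
    for conn in clients:
        conn.sendall(f"{total}\n".encode())     # every player sees the pooled sum
    return total
```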

Application example: Iterated public goods game

One paradigmatic example of economic games is the public goods game (PGG). Although various theories of social behavior accurately predict the aggregate data obtained in a PGG, their psychological plausibility has been questioned. Recent research has attempted to uncover the mechanisms underlying decisions in social dilemmas—that is, the way information is processed to make decisions in the social context (Fiedler et al., 2013). In what follows, we briefly describe an experiment in which we used the Eye Tribe eyetracker in combination with our software package to run an iterated PGG. We had two goals. First, we sought to evaluate the quality and integrity of data stored by our software package. Second, we examined the capability of our software package to handle interactive decision-making experiments.

Method

Information about the participants, materials, and procedure can be found in Appendix A.

Apparatus

Eye movements were recorded using an Eye Tribe tracker (see Fig. 1), which records binocular gaze data with a sampling rate of 60 Hz and an accuracy of 0.5°. The software provides dispersion-based fixation data for further analysis.

Fig. 1 Eye Tribe tracker with USB plug in front of a computer monitor

The participants were seated in front of screens with a resolution of 1,920 × 1,080 pixels on computers with Windows 7. Nonoverlapping areas of interest (AOIs) around the numeric information on screen were defined, with a size of 400 × 200 pixels (see Fig. 2).
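As an illustration of how such AOIs can be handled, the rectangles could be represented as named regions and each gaze sample assigned to the region that contains it. The coordinates and names below are placeholders, not the values used in the experiment.

```python
# Hypothetical AOI layout: (left, top, width, height) in screen pixels.
AOIS = {
    "you_points":      (760, 200, 400, 200),
    "person_b_points": (760, 420, 400, 200),
}

def aoi_for_gaze(x, y, aois=AOIS):
    """Return the name of the AOI containing the gaze point, or None."""
    for name, (left, top, width, height) in aois.items():
        if left <= x < left + width and top <= y < top + height:
            return name
    return None
```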

Fig. 2 Sequence of experimental screens presented to the participants. After a set of written/graphical instructions and a comprehension test, the eyetracker is calibrated with a 9-point calibration. After successful calibration, participants see the first-round contribution screen with all possible contributions [0–20]. The following response screen shows the participant’s and the other player’s (Person B) contribution, as well as summary statistics (sum, average) for the current game. The maximum possible, nonoverlapping square areas around the numbers (contribution screen) and alphanumeric characters (response screen) were used as the AOIs

Each participant was calibrated using the 9-point calibration procedure provided in the Eye Tribe development kit. To this end, participants were asked to look at points appearing sequentially in different locations on a dark computer screen. If sufficient quality was reached (i.e., an accuracy of <0.5° of visual angle), the experiment was started; if the error exceeded 0.5°, recalibration was triggered automatically.
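The self-paced calibration logic can be summarized by the following sketch; run_calibration and get_error_deg are hypothetical placeholders standing in for whatever the tracker SDK exposes.

```python
def calibrate_participant(run_calibration, get_error_deg,
                          max_attempts=3, threshold=0.5):
    """Repeat the 9-point calibration until the error (in degrees of visual
    angle) falls below the threshold or the attempt limit is reached.

    run_calibration and get_error_deg are callables wrapping the tracker SDK;
    both names are placeholders.
    """
    for _ in range(max_attempts):
        run_calibration()             # show the 9 points and collect samples
        if get_error_deg() < threshold:
            return True               # accurate enough; start the experiment
    return False                      # handle according to the experimenter's rule
```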

Results

Data quality

Every participant in our experiment successfully completed the calibration procedure in fewer than three calibration attempts, our cutoff criterion for exclusion from the study. To evaluate the quality of the data collected, we ran two simple tests: (a) the correspondence between the last number fixated by a participant on the contribution screen and that participant’s actual contribution, and (b) AOI versus non-AOI fixations on the feedback screen.

Participants were asked to pick a number from 0 to 20 on the contribution screen using their mouse. These numbers were presented in matrix format across the whole screen (see Fig. 2, sixth screen). Consistent with previous results showing that mouse and eye movements are highly correlated (Chen, Anderson, & Sohn, 2001), we found a high correspondence between the last fixation on an AOI and the number that the participant selected and clicked on for the large contributions (10–20; see Fig. 3). However, this pattern was weaker for small contributions (0–9). The difference in correspondence between high and low contributions is partly explained by the fact that contributions tended to be relatively high in general: The median contribution in Round 1 was 12, with only four cases (out of 120) below 10 (see the analysis below of the contribution patterns for more details). For the quadrant with the highest contributions in Fig. 3 (20, 20), we observed that the last fixation in an AOI corresponded with the actual choice (i.e., click) on the same AOI for 24.7 % of all clicks in Game 1.
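For reference, this correspondence measure can be computed along the following lines; the per-round data representation is an assumption made for illustration.

```python
def correspondence_rate(trials):
    """trials: list of (fixated_aois_in_order, clicked_aoi) tuples, one per round.

    Returns the proportion of rounds in which the last fixated AOI matches
    the AOI that was clicked.
    """
    hits = sum(1 for fixations, clicked in trials
               if fixations and fixations[-1] == clicked)
    return hits / len(trials)
```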

Fig. 3 Correspondence between the last fixation of an AOI and the contribution chosen via mouse click in this round

A second indication of the data quality is the relation of fixations within AOIs versus outside AOIs (but see Orquin, Ashby, & Clarke, 2016, on flexibility in the definition of AOIs and the consequences for data analysis). Our feedback screen consisted of eight AOIs with numeric information and six AOIs with written text (see Fig. 2, Screen 7). Across all participants, 28 % of fixations were in the numeric areas and 69 % were in the written text. About 3 % of all fixations fell outside a defined AOI, most likely reflecting participants’ orientation on the stimulus screen or the incorrect classification of fixations to non-AOI areas.

Process analysis

On average, participants fixated on 17 AOIs (SD = 9 AOIs), which took them 92.9 s (SD = 42.3 s). To evaluate the attention to different aspects of the PGGs played, we calculated the dwell time (the sum of all fixation durations during a dwell, i.e., one visit to an AOI from entry to exit, measured in milliseconds; Holmqvist et al., 2011, p. 190) separately for each participant, each game (1–3), and each round (see Fig. 4).
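A dwell in this sense can be derived from the fixation stream by merging consecutive fixations on the same AOI; the input format in this sketch is an assumption.

```python
def dwell_times(fixations):
    """fixations: list of (aoi_name, duration_ms) tuples in temporal order.

    Returns one (aoi_name, dwell_duration_ms) entry per visit to an AOI,
    where a visit lasts from entry to exit.
    """
    dwells = []
    for aoi, duration in fixations:
        if dwells and dwells[-1][0] == aoi:
            dwells[-1][1] += duration       # the same visit continues
        else:
            dwells.append([aoi, duration])  # a new visit begins
    return [(aoi, dur) for aoi, dur in dwells]
```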

Fig. 4 Median dwell times for each game (1–3), round (1–10), and AOI (You, Person B, Sum, and Average)

The players’ attention was focused primarily on two AOIs: their own points (the two AOIs in the upper row of Fig. 2, indicated by “You”) and the points of the other player (the two AOIs in the second row of Fig. 2, indicated by “Person B”). Whereas the attention given to “You” (a player’s own points) decreased across games, to 0 in Game 3, the attention dedicated to “Person B” remained relatively constant across games. In contrast, the information available on both the sum score and the average attracted less attention from players; the median dwell time converged to 0 early on and remained unchanged across Games 2 and 3. We qualified these descriptive results with a mixed-effects model approach, using the lmer function of the lme4 package (Bates, Maechler, Bolker, & Walker, 2014) in R. Significance tests were conducted with the lmerTest package. Participants were modeled as random intercepts; AOIs and rounds were added as random slopes to account for the repeated-measures nature of the data (Barr, Levy, Scheepers, & Tily, 2013). We found a significant main effect of AOI on dwell time, F(5, 12510) = 5.86, p = .001, with the following average dwell times: Person B, M = 192.6 ms (SD = 120.9 ms); You, M = 196.7 ms (SD = 128.5 ms); Sum, M = 175.2 ms (SD = 114.4 ms); Average, M = 181.1 ms (SD = 115.2 ms). Longer dwell times were found for information about the other player (Person B) and the player’s own information, whereas the summary statistics (sum and average) received shorter dwell times. Furthermore, round, F(1, 6959) = 10.1, p = .001, and game, F(1, 4.5) = 9.9, p = .03, both showed main effects [Game 1: M = 199.2 ms (SD = 125.7 ms); Game 2: M = 181.3 ms (SD = 118.1 ms); Game 3: M = 179.4 ms (SD = 115.2 ms)], indicating shorter dwell times at the end than at the beginning of the experiment. None of the interactions reached significance.

Models of contributions

We modeled individual contributions (see the analysis of the contribution patterns in Appendix B) trial by trial using conditional cooperation strategies (for details, see Fischbacher & Gächter, 2010). Specifically, we used a perfect conditional cooperator with naïve beliefs and a perfect conditional cooperator with actual beliefs (the other strategies proposed by Fischbacher and Gächter do not apply in our experimental design). We also proposed a simple matching model, which assumes that the contribution C of an individual i at trial t is a weighted average of the previous contribution of i (C_{i,t−1}) and the previous contribution of the other player j (C_{j,t−1}). Formally, the contribution in each trial is

$$ C_{i,t} = w_i\, C_{i,t-1} + \left(1 - w_i\right) C_{j,t-1}, $$
(1)

where w_i is the weight given to the player’s own previous contribution, and 1 − w_i is the weight given to the other player’s previous contribution. The parameter w can be interpreted as capturing the relative attention devoted to a player’s own versus the other player’s payoffs.
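As a sketch, the trial-by-trial prediction of Eq. 1 reduces to a single weighted average:

```python
def matching_prediction(own_prev, other_prev, w):
    """Predicted contribution C_{i,t} from C_{i,t-1}, C_{j,t-1}, and weight w (Eq. 1)."""
    return w * own_prev + (1 - w) * other_prev
```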

We first considered a matching model with w set at .5 (i.e., a model that simply averaged the previous contributions of the two players). We then compared the performance of the matching model with that of the two conditional cooperation strategies by calculating the mean squared deviation (MSD) between each model’s predictions and individual behavior. We found that our matching model predicted individual behavior more accurately than did the two conditional cooperation strategies, with MSDs of 20.7, 17.3, and 15.8 for perfect conditional cooperation with naïve beliefs, perfect conditional cooperation with actual beliefs, and the matching model, respectively. We then estimated an individual w parameter for each participant by minimizing the MSD between the predictions of the model and that individual’s behavior. The result of the fitting procedure was a w_i for each individual. If w captures attention, we should expect to observe a correlation between the fitted w values and the relative attention given by players to each other’s information, as measured by eyetracking.
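The fitting step can be sketched as a simple grid search over w that minimizes the MSD; the original analysis may have used a different optimizer, so this is purely illustrative.

```python
def fit_w(own, other):
    """Estimate w for one participant by minimizing the MSD between Eq. 1's
    predictions and the participant's observed contributions.

    own, other: equal-length contribution sequences of players i and j.
    """
    def msd(w):
        preds = [w * own[t - 1] + (1 - w) * other[t - 1]
                 for t in range(1, len(own))]
        sq_errors = [(p - c) ** 2 for p, c in zip(preds, own[1:])]
        return sum(sq_errors) / len(sq_errors)

    grid = [g / 100 for g in range(101)]   # candidate w values in [0, 1]
    return min(grid, key=msd)
```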

We therefore calculated a measure of relative attention as a ratio of the sums of fixations distributed between player i’s own information and player j’s information, as displayed in the payoff screens for each round. The relative attention ra across the experiment for player i was

$$ ra_i = \frac{\sum \mathit{fix}_i}{\sum \mathit{fix}_i + \sum \mathit{fix}_j} $$
(2)
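Assuming the per-player fixation counts have already been tallied, Eq. 2 amounts to:

```python
def relative_attention(fix_own, fix_other):
    """Eq. 2: share of fixations on the player's own information."""
    return fix_own / (fix_own + fix_other)
```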

The correlation between w and ra across individuals was .31, p = .055. As Fig. 5 shows, w and ra were positively related.

Fig. 5 Positive relations between relative attention (ra) and the weight given to the player’s own previous contribution (w), derived from fitting the matching model to each individual

The data collected with pyeTribe allowed us to examine whether and how attention between players in a PGG co-evolves. The left panel of Fig. 6 shows the evolution of relative attention (ra, as defined in Eq. 2) for four selected pairs of players. The patterns of relative attention are diverse. For the top two pairs in the panel, attention seems to converge across rounds. For the bottom two pairs, in contrast, attention seems more erratic. Naturally, players do not observe each other’s attention, but only their contributions. Therefore, a more thorough examination of how attention evolves would require an analysis of the contributions, which are the only information transmitted between players. The right panel of Fig. 6 attempts to shed some light on how relative attention responds to differences in the contributions. As the right panel shows, the larger the difference between a participant’s own contribution and that of the other player, the more attention the player pays to the other participant.

Fig. 6 (Left) Evolution of relative attention (Eq. 2) within each of four selected pairs of players. (Right) Relative attention as a function of the difference between a player’s own contribution and that of the other player. The negatively sloped regression line indicates that the more a participant contributes relative to the other participant, the more attention the participant pays to the information concerning the other player

Discussion

New research tools are opening up a wealth of opportunities for researchers to study new paradigms and ask new questions. Eyetrackers are, of course, not new, but the critical reduction in price is allowing for new experimental setups to study, among other things, attention in interactive decision-making tasks. In this article, we presented a software package designed to help researchers exploit the advantages of multiple eyetrackers while conducting interactive decision-making experiments.

Our software package has four central features. First, a large number of participants can be recorded while they interactively play economic games. Second, eyetracking calibration can be performed without the experimenter’s supervision, allowing for simultaneous calibration, saving significant resources. Third, gaze-contingent rules—for instance, proceeding to the next screen only when all interacting participants have fixated on a particular AOI—allow for complex experimental scenarios to be designed. Fourth, there is a clear potential to integrate our software package into more complex experimental systems, like BoXS (Seithe, Morina, & Glöckner, 2015) or zTree (Fischbacher, 2007).

We used a standard economic game (PGG) to demonstrate the potential of our software package and to examine the quality of the data obtained. Given the low cost of the hardware, the quality of the eyetracking data collected was outstanding, as shown by the high correspondence between fixations at the moment of clicking the mouse and the number selected, and by the high proportion of fixations within AOIs. A modeling analysis of the individual contribution processes provided further evidence for the high quality of the data. We proposed a simple model that computes a weighted average of a player’s own previous contribution and that of the other player. Specifically, we estimated the weight that each participant assigned to his or her own previous contributions and examined whether those weights corresponded with the attention patterns. The relative number of fixations on a player’s own versus the other player’s information was moderately correlated with the individually estimated parameters. This modeling analysis illustrates how combining behavioral and process data may improve cognitive models of behavior. Fiedler et al. (2013) demonstrated the value of adding process data to the analysis of economic games. We extended their work by facilitating the synchronous observation of multiple players in a PGG (our approach can, of course, be extended to other economic games or experimental conditions involving interactions between multiple players). Importantly, our approach makes it possible to record and examine information search before a choice is made.

Our software package (as well as the data and analysis code) is open source and can be cloned from the following git repository: https://github.com/michaelschulte/ThePyeTribe.