Operation of Public Transportation Ticket Vending Machine in Kraków, Poland: An Eye Tracking Study

Whereas the majority of evaluations of self-service kiosks are based on interviews or observations and as such are burdened with personal bias, eye tracking was seen as a method for an objective analysis. To demonstrate the feasibility and usability of such an assessment technique, the task of purchasing a public transportation ticket from a modern ticket vending machine in Kraków, Poland was evaluated. The test participants operated the machine relatively easily, with the time taken to purchase a ticket averaging 54 s for foreigners not familiar with the equipment and 29 s for local inhabitants. Even though the number of gazes recorded for the foreigners group was 2.4 times higher than for the local test participants, the fixation times were almost equal. Faulty or delayed operation of the payment terminal was a meaningful equipment issue encountered by eight test participants. The study demonstrated that the operation of the analysed ticket vending machine should not cause much trouble to anyone. The use of an eye tracker, which was employed for such an assessment for the first time, permitted the identification of possible operational ambiguities that could hinder the user experience without the bias associated with other assessment techniques. The method used was found to be efficient and the results provided valuable information.


Background
Self-service sales kiosks, such as ticket-vending machines (TVM), are ubiquitous and frequently used. They supplement both the traditional and emerging vending channels. They must be easy to operate, or people would prefer using alternatives. Various research studies related to their design and user-friendliness have been reported. However, the prior reports utilised evaluation methods such as observations and interviews, which are prone to personal bias [1]. Assessment with a fully impartial and quantifiable methodology has not been done yet. Eye tracking, which has been extensively utilised in various transport-related evaluations since the 1970s [2], was seen as such an objective methodology. Surprisingly, this technique was not reported as having been used for such testing so far.
Hence, research was initiated for a dual purpose: firstly, evaluation of the suitability of eye tracking methodology for assessment of a self-service kiosk, and secondly, testing of the kiosk's functionality. The results of the study, in which young volunteers wearing eye-tracking spectacles were asked to purchase a ticket from a modern TVM, are presented herein. Not only was the time spent on each step of a ticket purchase measured, but also the number of gazes and visual fixations and the length of such fixations; errors associated with incorrect selections and with machine faults were also quantified. To the best of our knowledge, this is the first evaluation of a TVM using a mobile eye tracker device and the first assessment of a self-service kiosk in Poland.
Source: Compilation of publicly available data done by the authors; status as of January 2020. (1) Suburban trains, without full tariff integration. (2) Regular daytime service. (3) Additional 8.5 km of tram routes and associated 10 stops are under construction, scheduled for completion by 2023; currently, 2 km of track and four associated stops are not being serviced. (4) Including stops served only by suburban routes, but within city zone tariff. (5) The authors define a rolling stock unit as several cars that are not permanently joined; the second car may be motorised or trailing. (6) To replace old trams, 110 new articulated trams were ordered for delivery by 2024.

Ticket Vending
As in almost all European cities, public transportation tickets in Kraków must be purchased before entering a transport vehicle and validated at the beginning of the journey. Verification is done through random checks, so theft of the right to passage does occur occasionally; however, the financial penalty for riding without a valid ticket is set at an amount that renders such despicably anti-social behaviour unprofitable. The tickets for Kraków public transportation can be purchased through eight attended service facilities, at a plethora of kiosks and shops throughout the city, via mobile telephone applications (either standalone or integrated within some mobile banking applications), at 164 stationary TVM, and at TVM located in almost all of the vehicles (in this case, the ticket purchase must be done immediately upon entering the vehicle; recordings from security closed-circuit cameras would serve as verification in case of dispute during a ticket control). Single-ride tickets and transfer tickets permitting unlimited connections within a fixed time (between 20 min and 7 days) can be purchased; long-term tickets, for up to 12 months, are also available for specific routes or for the network. Due to meaningful support of public transportation through taxes, tickets are quite inexpensive, with full prices starting at 3.40 PLN (0.80 EUR) for a 20-min ride, while a daily city zone pass costs 15.00 PLN (3.53 EUR), and a weekly ticket for the entire city and suburban network costs only 68.00 PLN (16.00 EUR). Numerous social groups, amongst them students from European Union countries, are entitled to a 50% discount. People with disabilities, including blindness and severe movement impairment, are entitled to the use of public transportation without payment based on their disabled person identity cards; their guides have the same privilege while taking care of them.
(Ticket prices as of January 2020; values in euro are given at the average exchange rate published by the National Bank of Poland.) For foreign tourists, the use of stationary TVM to purchase tickets appears to be the most convenient because of the possibility to select several display languages, the access to all types of tickets, including the periodic ones designed for tourists (1, 2, 3, and 7-day tickets for the entire network), and non-stop operation. It was reported that simple written information is easier for foreigners to comprehend than verbal communication, which could cause misunderstanding and misinterpretation due to different accents, pronunciation dissimilarities, and the risk of using a local understanding of names and procedures [6]. Thus, simple written instructions or guidance through pictograms and the possibility of returning to a previous screen for confirmation of choices appear beneficial and preferred by tourists; they may be considered amongst the key advantages of the TVM.

Prior Related Studies
A sizeable amount of research was done on various aspects associated with self-service kiosks in different surroundings, mostly from a sales and marketing perspective [7]. Although the first simple TVM was installed on the Central London Railway in 1904 [8], research on such equipment began only in the 1980s, when TVM started becoming more complex thanks to technological development. The majority of TVM research can be divided into two main directions: the evaluation of usability and the design issues. Various considerations that should be included during TVM assessment were postulated over two decades ago and still appear to be fully valid [9]. Important considerations related to the acceptance and reliability of technology in public transportation were recently summarised [10].
One of the first important scientific research works in the field was done in 1985 in the Netherlands: several hundred passengers were observed, recorded, and interviewed regarding the usage of TVM located at main train stations in Amsterdam, Rotterdam, and Utrecht. A general impression of TVM and their usage by travellers was presented; all problems associated with equipment design noted by the users were reported and changes in TVM design, aiming at making the text and images easier to understand, were proposed [11]. As a continuation of the work, the researchers discussed and assessed a decision system design: the one-button-to-press (one click for a particular ticket) versus more-buttons-to-press (the user selects the appropriate ticket through sequential clicks). Empirical and theoretical analyses clearly showed a preference for the second solution [12].
Later, researchers reported the outcome of a study done in the London Underground: based on observations and interviews, they identified the main issues associated with the operation of TVM with three different operational modes and the reasons these problems occurred [13]. Surprisingly, no difference was reported in the number of mistakes made by first-time and experienced users. It was additionally reported that a TVM user would prefer to learn by correcting their own errors rather than reading usage instructions, which led to the conclusion that the equipment should be designed to allow operation based on a clear sequence of simple and short tasks. Subsequent analysis considered three different designs of TVM [14]. Although the main focus of that work was the analysis of errors and their comparison through empirical and analytic methods, the study also indicated a growing tendency amongst passengers to use a TVM instead of buying tickets in ticket booths. In addition, the previously identified problems with the selection and order of operational steps were confirmed. A follow-up article was concerned with the assessment of two analytical methods employed to identify TVM usefulness [15]. The above-mentioned studies were done on a push-button TVM. With technological advances and the introduction of touch-sensitive screens, it was found that on-screen selection was better for complex systems; nonetheless, an absence of human cognition studies was noted [16].
Behaviour observation, contextual interview, and online survey were the methods employed during the evaluation of the usability of TVM installed in trams in Graz, Austria; their comparative analysis permitted establishing the advantages and disadvantages of specific methodologies. The same dataset was used to evaluate the usage of TVM by elderly citizens: only approximately 10% of them used this method of ticket purchases, but the reasons for such low preference were not pinpointed [17]. Much more detailed research related to the elderly and analysis of the issues they encountered was done on three simulated designs of TVM: it was reported that the usage of instructions, especially in the form of a video, could improve the ease and frequency of TVM use by seniors [18]. The discrepancy with prior research suggesting a self-learning preference [13] was not noted.
Separately, a research project "INNOMAT" realised in Austria focused on the issue of how a new generation of TVM should be designed to fulfil the needs of different user groups, especially elderly and disabled people. Taking into account such a global goal, the project activities covered almost all existing TVM aspects. A user-centred design (UCD) approach, in which the entire design process was based on a number of repeated tests and evaluation loops together with the involvement of end-users, was utilised [19]. Analysis of related literature reports for the employment of UCD revealed the barriers and requirements for users, while subsequent observations and interviews, together with sales data, helped in the estimation of TVM usability from users' experience and declared problems [20]. To collect the data for estimation of the TVM usability, besides the already mentioned techniques, the following tools were additionally implemented: video analysis, competitor analysis, expert interviews, and meetings with stakeholders [21]. The scope of the project "INNOMAT" also included the influence of users' computer efficacy and age on TVM usage [22]. An unrelated study based on UCD implementation was done in the United States [23]. The UCD was utilised for a recent assessment of TVM in Mexico; error assignment to different interaction steps was provided [24]. Limited research on TVM, concentrating on their function within the surrounding environment, was recently reported as well [25]. All of these studies relied on observations and interviews and were limited to people with a Western cultural background.
Amongst work done in Asia, two independent, fully qualitative studies concerned TVM operated by the Taiwan High Speed Rail Company. The first evaluated the issues encountered during the usage of TVM based on specially adapted guidelines for the design of public service kiosks [26]. The second aimed to analyse the interaction problems of TVM users; the issues were explained, and improvement ideas were suggested [27]. More recently, four groups of TVM designs, from China and from Japan, were evaluated from the perspective of user experiences with a human-computer interface [28]. Later, a newly installed TVM system in Lahore, Pakistan was investigated [29]; technical problems and human factors were found to be the major causes of the low level of equipment usage (only 5% of all passengers); the interface, which was adapted from Turkey, was suggested as a possible cause for low user-friendliness. Hence, for the first time, a cultural difference was perceived as a TVM usage and experience factor. Later, evaluation of TVM based on user experience was done in Jakarta, Indonesia [30]. Observations supported by video recordings were used to calculate user performance metrics and special questionnaire tools were implemented. The analysed TVM showed generally poor usability performance for both experienced and inexperienced users; hence, a new TVM interface design was proposed. As a continuation of the study, a redesign, based on the UCD approach, of the whole TVM used by the local commuter railway operator in Jakarta was described [31]. Another research study done in Indonesia, on a different TVM, confirmed the user-unfriendliness of the design for both novice and experienced users; improvement to the design was proposed [32].
None of these considerable efforts utilised the eye tracking technique to measure the time spent on a particular step and identify particular operational features that required excessive visual focus or demanded it through inadequate or excessively complicated design. Only in loosely related research was an eye tracker recently utilised, to assess the efficiency of a bank self-service machine before and after modification of the interface [33]; however, unlike in this paper, no clear information related to gaze and fixation distribution was given.

Ticket Vending Machine
The analysis was done on a stationary TVM, ticomat 9010 (Trapeze Switzerland GmbH; Neuhausen am Rheinfall, Switzerland), which was one of 59 of this model installed in Kraków. It is situated at a tram stop located in an underground tunnel directly under Kraków Główny railway station. The location of the TVM within the platform is shown in Figure 1. The machine has a 30.5 × 22.5 cm colour touch-sensitive screen, which in the calm mode cycles between displaying public interest advertisements and the ticket selection screen. The departures of the next three trams are displayed above the screen. Immediately below the screen, there are five touch buttons including "Start" (with a "home" pictogram) and "Languages" (with a "globe" pictogram). The display can be switched to six languages (Polish, English, French, Spanish, German, and Italian). Payment can be made with cash (banknotes and/or coins, with change given) or with credit or debit cards. A slot for charging long-term tickets is also included. The bottom of the touch-sensitive screen is located 120 cm above ground and the tickets are picked up at the height of 80 cm, so both are reachable for people of all heights and even users of wheelchairs. Light boxes appear after each step around the areas where the next interaction is required; after completion of the particular operation, they change colour to yellow, and the next box is lit. The entire TVM is shown in Figure 2, a ticket-selection screen is shown in Figure 3, the light boxes around payment options are shown in Figure 4, and the ticket collection area is shown in Figure 5.

Eye Tracking
Since the beginning of the 20th century, eye tracking technology has been continuously applied in various research fields. While the technology advanced, the basic concept remained mostly unchanged: different metrics related to the pupil movement of a person, e.g., gazes, fixations, fixation duration, and saccades, are collected. The methodology is so well known that reiterating its description would be pointless [34].
In the 1970s, eye tracking was first used for studies related to transport [2]. The technique is used widely to study drivers' behaviour and their reaction to various stimuli, most of which are perceived visually [35]. Amongst recent related applications, observations of horizontal road markings by drivers approaching intersections were reported [36][37][38], eye tracking studies were done on bicyclists [39], and pedestrian wayfinding and object memorisation studies were held [40]. Recently, researchers performed an in-depth analysis of directional signs observation at Kraków Główny railway station, suggesting that the inappropriate location and overabundance of such signs can cause as much confusion as poor design of the station itself [41][42][43].

Equipment and Experiment Design
The eye tracker device used was Tobii Pro Glasses 2 (Tobii AB; Danderyd, Sweden). It is based on video recording of combined pupil and corneal reflection: while the eyeballs are illuminated with near-infrared light and their movements are recorded, a forward-facing video is also recorded, at 1920 × 1080 pixels, with a frequency of 25 Hz. The eye tracker is worn as typical spectacles; it weighs only 45 g, so it is not obtrusive, and the test participants quickly become used to wearing it.
The test participants were tested one-by-one, so the entire task would be a novelty for all of them. Firstly, the eye tracker was individually calibrated with the aid of a Tobii Pro Glasses Controller. Then, it was worn for a few minutes while finding a way at Kraków Główny railway station from a platform to the underground tram stop, which was an unrelated task and as such shall be reported separately. Upon completion of this task, which ended next to the analysed TVM, the test participants were asked to purchase a one-ride public transportation discount ticket. A spoken instruction (to foreigners in English and to control group participants in Polish) was given: "Now, please purchase for yourself a one-ride ticket from this ticket vending machine". The test participants were allowed to make the purchase, and their assigned task ended when the printed ticket was picked up. Throughout the entire test, each of the participants was permitted to move freely but was discreetly observed by research assistants who did not help in making any choices. No additional questions were asked.

Test Participants
For the study, volunteer participants were selected amongst students who attended Politechnika Krakowska (4 participants, the control group) and students who visited Kraków for the first time from Italy for a short-term exchange (15 participants, the foreigners group). Nobody from the foreigners group claimed any knowledge of the Polish language, while they all stated being fluent in English. All of the test participants affirmed that they frequently used various TVM. All of them declared having corrected or uncorrected 6/6 vision. During the test, ethical guidelines set by the participating universities were followed and the participants signed appropriate consent; no sensitive personal data were collected. Basic information about the test participants is provided in Table 2 (standard deviations are given in parentheses). The sex of the test participants is given, even though no differences in outcome were measured between females and males; as such, it is not discussed. The test participants were not compensated in any way, so they paid for the ticket from their own pocket; however, the purchased ticket was used by them during their visit to Kraków. This small size of test groups is deemed sufficient based on previous reports related to the eye tracking technique [44,45]. Furthermore, it must be emphasised that the main goal of this experiment was more related to the utilisation of eye tracking for this new application than to a broad testing of various people purchasing a ticket.

Data Analysis
The collected recordings were processed with Tobii Pro Lab software. For the analysis presented herein, the operation of the TVM was divided into four distinct phases: (1) screen setting, (2) ticket selection, (3) payment, and (4) ticket collection. For each of these phases, the following were measured: time, error time, number of gazes, number of fixations, and total fixation time. Errors were considered to have occurred when an incorrect choice was made, demanding return to the previous screen. Whenever errors were a result of inadequate operation of the TVM, equipment faults were recorded. The first step, the screen setting, had to be treated separately because of the language change option, which would be unnecessary for the control group; including it would skew the overall results.
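As an illustration only, the per-phase measurements listed above could be organised as in the following minimal Python sketch. It is not the Tobii Pro Lab workflow used in the study; the names `PhaseRecord`, `mean_fixation_length`, and `summarise` are hypothetical, and the records are assumed to have been exported beforehand.

```python
from dataclasses import dataclass
from statistics import mean

# The four phases named in the text; the first is analysed separately.
PHASES = ("screen setting", "ticket selection", "payment", "ticket collection")


@dataclass
class PhaseRecord:
    """One participant's measurements for one phase (hypothetical layout)."""
    participant: str
    group: str                     # "foreigners" or "control"
    phase: str                     # one of PHASES
    time_s: float
    error_time_s: float
    gazes: int
    fixations: int
    total_fixation_time_s: float


def mean_fixation_length(record):
    """Average length of a single fixation within one phase, in seconds."""
    if record.fixations == 0:
        return 0.0
    return record.total_fixation_time_s / record.fixations


def summarise(records, group, phases=PHASES[1:]):
    """Mean per-participant totals of time and gazes for one group.

    By default the screen setting phase is left out, mirroring its
    separate treatment in the text.
    """
    totals = {}
    for r in records:
        if r.group == group and r.phase in phases:
            t = totals.setdefault(r.participant, {"time_s": 0.0, "gazes": 0})
            t["time_s"] += r.time_s
            t["gazes"] += r.gazes
    if not totals:
        return None
    return {
        "time_s": mean(t["time_s"] for t in totals.values()),
        "gazes": mean(t["gazes"] for t in totals.values()),
    }
```

Keeping one record per participant and phase makes it straightforward to report the screen setting step on its own while aggregating the remaining three steps together.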

Screen Setting
The outcome from the screen setting step is summarised in Table 3 (standard deviations are given in parentheses). The lower number of test participants for this step was due to the aforementioned cycling of the screen display in calm mode between the advertisement and the ticket selection.
Human errors at this step included mistakes made by three of the foreigners (one of them erred twice). Two of the foreign test participants who encountered the advertisement screen appeared disoriented until they realised that to start the operation and ticket selection, touching of the screen or the 'Start' button was required. The screen setting stage included changing the display language: three of the test participants changed the language more than once, without an apparent need. This was not treated as erroneous because it did not affect the continuity of actions; instead, it could be considered as preliminary orientation in the machine functionality.
Amongst the recorded equipment faults at this stage, the language selection button did not respond upon first touch, which caused delays of up to 5 s. One of the foreign test participants who could not change the language proceeded in Polish, without difficulties.

Ticket Selection and Purchase
The results for the next three steps are furnished in Table 4. The entire task took the foreign test participants, who were not familiar with the TVM, on average only 1.9 times longer (mean 54 s versus 29 s) than it took the control group members. Nonetheless, for some of the foreign test participants, the task did not take longer than for those from the control group. It should be observed that the number of gazes recorded for the foreigners group (average 2103) was 2.4 times higher than in the case of the control group (average 870), which is disproportionate to the time for the task. The differences in the number of gazes are visualised in Figure 6. While the number of fixations followed the pattern of gazes, the average fixation lengths were almost equal for both groups (0.40 s versus 0.36 s for the foreigners and the control group, respectively). Amongst the steps, the highest number of gazes and fixations was recorded during payment; however, the longest duration of fixations was recorded during the ticket selection step. One of the foreign test participants used the display in the Polish language from the beginning, which was not considered an error despite somewhat prolonging the time required to accomplish the task (completion within 59 s). Another foreigner (for whom the language change button did not work) used Polish without any noted difficulties, and the task did not demand additional time (completion within 49 s); apparently, the pictograms provided sufficient guidance. The ticket selection mistakes were quickly corrected; only one foreign test participant, who selected an incorrect ticket type, became confused because there was no possibility of returning from the payment screen to ticket selection (such an operation demands re-starting the purchase sequence).
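The ratios quoted above follow directly from the reported group means. The short Python check below is a sketch for the reader, not part of the original analysis; the dictionary `REPORTED` and the function `group_ratio` are hypothetical names, and only the means stated in the text are used.

```python
# Group means as reported in the text (time in seconds, gaze counts,
# average fixation length in seconds).
REPORTED = {
    "foreigners": {"time_s": 54, "gazes": 2103, "mean_fixation_s": 0.40},
    "control": {"time_s": 29, "gazes": 870, "mean_fixation_s": 0.36},
}


def group_ratio(metric):
    """Ratio of the foreigners group mean to the control group mean."""
    return REPORTED["foreigners"][metric] / REPORTED["control"][metric]


time_ratio = group_ratio("time_s")
gaze_ratio = group_ratio("gazes")
fixation_ratio = group_ratio("mean_fixation_s")

print(
    f"time: {time_ratio:.1f}x, gazes: {gaze_ratio:.1f}x, "
    f"fixation length: {fixation_ratio:.1f}x"
)
```

The gaze ratio exceeds the time ratio while the fixation-length ratio stays close to one, which is the disproportion the paragraph points out.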
The payment process appeared to be the most problematic step, during which six errors occurred for four test participants: two people failed to select a payment method before proceeding to pay, one inserted the card inadequately, and one tried to insert the payment card in the slot for charging long-term tickets. More troubling than the human errors were equipment malfunctions associated with the payment terminal operation. Sluggish operation of the contactless payment terminal was recorded for seven (six foreigners and one from the control group) out of 13 test participants (10 foreigners and three from the control group) who decided to use this type of payment. The control group participant who encountered it immediately realised that a card reader had to be used, but the foreigners appeared puzzled by such a highly unexpected fault and repeated their attempts. In one case, several attempts were required before a banknote was accepted. These and other equipment faults added considerable time to the entire task. The average number of gazes that occurred during equipment malfunctions was 234 (range 156-467); these additional gazes would account for approximately 25% of all gazes recorded at this step.
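To put the terminal fault counts above in proportion, the shares can be computed per group. This is a minimal Python sketch using only the counts stated in the paragraph; `CONTACTLESS_USERS`, `SLUGGISH_TERMINAL`, and `affected_share` are hypothetical names introduced for illustration.

```python
# Counts taken from the text: participants who chose contactless payment,
# and those among them for whom sluggish terminal operation was recorded.
CONTACTLESS_USERS = {"foreigners": 10, "control": 3}
SLUGGISH_TERMINAL = {"foreigners": 6, "control": 1}


def affected_share(group=None):
    """Percentage of contactless users affected, overall or for one group."""
    if group is None:
        affected = sum(SLUGGISH_TERMINAL.values())
        users = sum(CONTACTLESS_USERS.values())
    else:
        affected = SLUGGISH_TERMINAL[group]
        users = CONTACTLESS_USERS[group]
    return 100.0 * affected / users


for name in (None, "foreigners", "control"):
    label = name or "all contactless users"
    print(f"{label}: {affected_share(name):.1f}% affected")
```

With only three contactless users in the control group, the per-group percentages are indicative at best.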

Discussion
To the best of our knowledge, this is the first reported experiment in which eye tracking was used for evaluation of TVM. Unlike in previous research, which was based on interviews, users' impressions, and observations, it was possible to objectively identify the points of errors. Interviews with users or observations, even of video recordings, are subject to personal interpretation and bias, as was proven many years ago [1]; this weakness was eliminated through employment of the eye tracking technique. The software used for analysis clearly identified the points of visual attention; hence, the time and number of gazes and fixations at each step of the self-service kiosk operation permitted impartial assessment. This novel use of eye tracking expands the knowledge base related to this technique and once again proves its broad usefulness. As a result of that approach, no questionnaire was administered; a comparative analysis, despite its possible usefulness, was not intended in this case.
A previous publication indicated that purchasing a ticket was not perceived by foreigners as a stress factor when using public transport in the destination cities [46]; however, foreigners count as passengers with a high demand for information that may influence the process of ticket purchase [47]. The larger number of gazes that was recorded for the foreigners group could have been an indication of stress caused by interaction with unfamiliar equipment, but it also could be the result of a visual search for confirmation of the choice that was made and for the next step. Since there was no difference in the length of fixations as compared to the control group, it is hypothesised that the foreigners were thoroughly reading the displayed information, and the high number of gazes was not a result of stress. Additional analysis and research, possibly with supplemental techniques, would be necessary to prove such a hypothesis. The human errors in using the analysed TVM were mostly associated with the initial screen setting and with operation of the payment terminal. This suggests a really straightforward design of the equipment interface, but it also indicates the absence of obvious prompts. Hence, this TVM met the requirements for legibility and clarity, which are necessary for positive tourist experience in a new environment [48].
Much more troublesome were the equipment faults, which were associated first with the screen setting and later caused meaningful difficulties with making payment. Any financial transaction is a stressful task in an unfamiliar environment due to occasionally occurring overcharging, lack of detailed knowledge, and the risk of various scams [49,50]. Therefore, enhancement of the equipment, particularly streamlining of the payment terminal operation, is suggested. A specific proposal of any improvements was not within the scope of this work and would demand additional analyses.
Due to the dual purpose of this research, there could be weaknesses associated with the eye tracker used and the specific analytical technique, the selection of the analysed equipment, and the choice of test participants. Eye trackers generally furnish reliable data, even if portions might occasionally be missing or skewed due to improper positioning [45]. Such considerations are important but must be treated as an exception; in the course of this work, data losses were not recorded. The selection of this particular TVM and the absence of a comparative study were purposeful; the work was never meant to be a side-by-side evaluation of equipment. An interesting follow-up experiment would be the inclusion of people from different cultural backgrounds to check whether the tested TVM operational design was easily comprehended by people habituated to different operational protocols. Assessment by people not fluent in any of the languages that are available for such TVM would be advantageous to establish whether the pictograms are sufficient and well-designed. Cross-cultural studies done on the recognition of various road signs recently showed serious inadequacies of some symbols [51,52]. The evaluation of other social groups and different designs of TVM would be necessary for a comprehensive analysis, which was beyond the scope of this work. An additional issue here is the selection of a machine versus a human-attended kiosk for future purchases by people who previously experienced equipment failures; in the analysed case, the failures related to the payment terminal.
Amongst the possible critiques related to the choice of test participants, one could list performing the assessment on only young people. It is a valid point because all of the prior research consistently confirmed that elderly people required special attention and were less likely to use the technology that was novel to them. Such research could give an answer to the question of whether a human-computer interface as simple as in this case would be adequate. The reluctance of the elderly to utilise computer technologies was reported to be, at least partially, a myth [53]. Nonetheless, in this case, one must also consider that seniors were reported to prefer purchasing prepaid organised tours and thus are somewhat less likely to use public transport in a foreign city than youth, who prefer to travel independently [54].
It was realised too late that one should also add the task of locating the self-service kiosk itself. The analysed TVM is located in the middle of the tram stop platform, but a more reasonable location might have been its entrance. As shown in Figure 1, the location of the TVM is not well marked. The KKM logo (Krakowska Karta Miejska, Kraków City Card, a long-term ticket) that is above the TVM cannot be readily recognised by people unfamiliar with it; in addition, the logo is not uniformly used to identify public transport vehicles and infrastructure in Kraków. The tram stop itself is a part of a major transportation hub that was consistently described as 'resembling a maze' and where inadequate and inappropriate signage caused very significant confusion amongst unrelated groups of local and foreign test participants who were also young travellers [41,42]. A psychological perspective, which is beyond the scope of this study, would be needed to comprehend how such a mixture of poorly designed space combined with rather well-designed equipment is affecting the perception of the municipality by casual tourists.

Conclusions
This assessment of users' experience with TVM, which was the first to use an eye tracker, provided impartial and measurable information related to the time spent at each operation step. This new employment of the old technique can be easily and successfully used for finding errors and uncertainties associated not only with such self-service facilities but also with other equipment utilising a human-computer interface. The absence of personal bias furnished reliable information that can supplement or replace data obtained from questionnaires or observations. It is proposed that any new design of a TVM (or its interface) be tested in a similar manner to pinpoint any equipment or layout faults and correct them before the equipment is installed.
The analysed modern TVM appears to be meeting the requirements for legibility and clarity: no major human errors were encountered. Easy step-by-step visual instructions on screens designed without unnecessary clutter must be praised; the diode lights that indicate the area of the next step (ticket selection, payment, ticket collection) appeared to be a good guidance feature, even if this was not possible to quantify within this experiment. Although the task of ticket purchase took the foreigners almost two times longer than was measured for the control group of local TVM users, this was expected because of the need to read and understand the information. The increased number of gazes that was recorded for people not familiar with the analysed self-service kiosk is deemed to be caused more by their search for visual information than by stress associated with unknown equipment. Importantly, none of the foreign test participants required external help, and all were able to easily correct their own errors. Nonetheless, equipment malfunction during payment (both sluggish terminal response and a failure to accept payment) must be noted as a source of unnecessary stress; it might become a reason to utilise a human-attended facility.

Data Availability Statement:
The data presented in this study are available on request from the authors. They are not publicly available due to confidentiality and privacy.

Conflicts of Interest:
The authors declare no conflict of interest.