Experimental materials comparing individual performance implications of two decision aids: Taxonomy and tags

Design techniques have been classified to support their selection in design processes, and two decision aids have been created on this basis. We designed an experiment to compare both decision aids (taxonomy and tags) and to evaluate the influence of individuals' decision styles when using a decision aid. The experimental materials included the experimental process, a training part, an experimental task, and a survey questionnaire. In this method article, we describe the details of the experimental settings and use the collected data to validate the experiment. Advantages of this method include the following:
• The procedure of the experiment ensured an easy-to-understand training part that did not bias participants when performing the experimental task and answering the survey questionnaire at the end.
• The experimental process can be applied to experiments that evaluate task performance with user interfaces and require a training part before the experimental task.
• The experimental task scenario and the design techniques included in the experiment can be reused in experiments with design-relevant task scenarios.


Method details
The designed experiment compared task performance with two different decision aids (taxonomy and tags) while accounting for individuals' decision styles and cognitive effort. The experiment was based on an experimental design for evaluating cognitive fit theory [1]. A prior version of the method, which adjusted the experimental method to include decision aids, involved designing an online experiment [2]. This method article describes a laboratory experiment that controlled for participants' environmental setting and ensured that participants took part in the experiment without interruptions. The study followed a between-subject design in which the three groups differed in their decision aids: next to the taxonomy and tags, a plain list of design techniques was presented to participants in the control group. Each participant was randomly assigned to one of the groups and could participate only once. The experiment was conducted in a well-equipped computer-based laboratory with individual soundproofed cabins, ensuring that each participant experienced no interruptions from other participants or experimenters during participation. A control room allowed the experimenters to observe each participant's screen during the experiment, to verify that the system ran well, and to see whether the participant had any questions about it. In the following sections, we present our experimental materials in further depth by describing the participants, procedure, manipulation, measurements, and method validation.
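The random assignment of participants to the three between-subject groups can be sketched as follows. This is a minimal Python illustration; the function name, seeding, and balanced-block approach are assumptions rather than the exact procedure used in the experiment.

```python
import random

CONDITIONS = ["taxonomy", "tags", "list"]  # the three between-subject groups

def assign_participants(participant_ids, seed=42):
    """Randomly assign each participant to exactly one condition.

    Shuffling a balanced block of condition labels keeps the three
    groups close to equal in size.
    """
    participant_ids = list(participant_ids)
    rng = random.Random(seed)
    # Repeat the condition labels so every participant receives one label.
    labels = (CONDITIONS * (len(participant_ids) // len(CONDITIONS) + 1))[:len(participant_ids)]
    rng.shuffle(labels)
    return dict(zip(participant_ids, labels))

assignment = assign_participants(range(1, 196))  # 195 invited participants
```

With 195 participants, the balanced block yields exactly 65 labels per condition before shuffling, so group sizes stay equal by construction.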

Participants
For the experiment, 195 participants were invited through a recruitment system based on ORSEE [3]. Twelve sessions were conducted in total (four sessions per day). Each session was randomly assigned to use the taxonomy, the tags, or the list of design techniques. Each session was scheduled to last 60 min, with 60-min breaks between sessions so that participants could finish the experiment at their own pace and the researchers could wrap up the last session and prepare the next one. Each session included at most 20 participants, all of whom were asked to arrive at the laboratory's reception 15 min before the experiment started, following the facility's guidelines. The first two authors of this paper organized the 12 sessions together. Each session started with a joint welcome of all participants and a 15-min introduction to the experiment. The introduction revealed nothing about the experiment's content but explained the experimental process, objectives, and reimbursement. Participants received a basic payment of 8 euro for completing the experiment and could receive an additional 1 to 3 euro depending on their performance. If a participant did not finish the experimental task, he or she was paid 5 euro as a show-up fee. Subsequently, two experimenters guided the participants to their individual cubicles. The main experiment included three stages, as described in the following section; Table 1 shows the whole process of the experiment. In the training part (stage 1), we introduced the definitions, the tools used in the task, and the questionnaire in which participants would evaluate themselves and answer personal questions. The three task scenarios (stage 2) were the same for all participants; the groups differed only in how they could search for and select design techniques, that is, each group had a different decision aid to support the search and selection process. At the end (stage 3), participants answered a questionnaire.

Procedure
Participants first attended the introduction to the experiment, where we explained data privacy and the three stages of the experiment. Participants were informed that the data would be analyzed in aggregate and that no one had access to the exact individual data collected in the experiment. Then, the researchers introduced the three stages to the participants, who needed around 45 min to finish the entire experiment.
In stage 1, the participants could review the meaning of design techniques, the taxonomy, and the tags to understand the application of design techniques in digital service design processes and to become familiar with the experiment tool that they would use in the following tasks. After introducing the definition of design techniques, we asked some questions on prior knowledge of design techniques. We asked these questions directly after the description of design techniques, instead of after the tasks, to make sure that completing the selection tasks would not influence the self-evaluation of prior knowledge.
We gave no hints about the tasks to ensure that the training part would not bias participants. Instead of explaining the definitions of taxonomy and tags, we used an example of classifying animals in the training. The participants viewed a short video describing how animals can be classified into structured categories (i.e., a taxonomy) or unstructured categories (i.e., tags). The short videos describing the tool for the experimental task applied the created taxonomy and tags of animals to filter animals. The videos introduced participants to the tasks, explaining how they could use the categories to filter information, where they could find the detailed description of each entry, and where they had to type in their selections. For the control group, there was no description of the classification; the introduction of the experiment tool used an example of selecting animals from a list. Fig. 1 illustrates how animal classifications were used to describe the creation of a taxonomy and how to use the created taxonomy. Fig. 2 shows how tagging animals was used to describe the creation of tags and how to use them. The following control questions ensured that participants understood the short videos: "Do you think you understand how a taxonomy/tags work when searching for information?" and "Do you think you understand how to use a taxonomy/tags/list to select information for a specific purpose?"
Stage 2 was the experiment part. Each participant was asked to use the provided tool to complete the task. Stage 2 comprised three task scenarios, and participants were asked to select five design techniques for each scenario. The three scenarios were not presented to participants at once but separately: the next task scenario was presented only after participants had provided answers to the previous one and had answered a control question on whether they understood the scenario ("Do you think you understand the task description?").
Participants then viewed a nature picture for one minute as a washout phase. We designed the task process this way to make sure participants concentrated on only one scenario at a time and were not influenced by the other scenarios. As participants were design novices with limited knowledge of design techniques, we gave those assigned to the two treatment groups a sheet of paper explaining the meaning of the categories included in the taxonomy or tags.
After finishing the task, participants went to stage 3, where they were asked to answer a questionnaire on the measurements and demographic data. First, participants answered questions on the cognitive effort spent on the tasks. Then, they answered questions on their decision style (i.e., intuitive or rational). Finally, they provided their demographic information. We designed the questionnaire to show the questions on cognitive effort first because these questions required participants to recall their task completion process. When participants finished the experiment tasks and the questionnaire, they were informed that they could leave the cabin and go to the receptionist.

Stage 1 Training
• A short video on how to use a list to find animals
• A control question on whether the participant understood the explanation of the tool

Stage 2 Three task scenarios
• Overview of the whole task (no detailed information on each task scenario)
• A mobile app development context including three design scenarios to select design techniques
• Selecting techniques for the 1st task scenario
• A control question on whether the participant understood the explanation of the selection task
• One-minute washout phase
• Selecting techniques for the 2nd task scenario
• A control question on whether the participant understood the explanation of the selection task
• One-minute washout phase
• Selecting techniques for the 3rd task scenario
• A control question on whether the participant understood the explanation of the selection task
• The tool for using a taxonomy of design techniques for the selection task
• The tool for using tags of design techniques for the selection task
• The tool for using a list of design techniques for the selection task

At the reception, we checked the answers' correctness and paid the participants based on their accuracy. We had predefined five correct selections for each task scenario: if a participant's answers included four out of these five correct selections, we gave an extra 1 euro for the good performance. Each participant could receive at most 3 euro extra. The basic payment was 8 euro; if a participant did not finish the whole experiment, he or she received 5 euro. The payment for completed experiments thus varied between 8 and 11 euro. Participants were not allowed to go back to the experiment room to prevent them from interrupting other participants who had not finished yet.
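The payment rule described above can be expressed as a short function. This is an illustrative sketch; the input layout (lists of chosen and predefined correct techniques per scenario) is an assumption, not the actual payout script.

```python
def compute_payment(scenario_answers, correct_answers, finished=True):
    """Compute a participant's payout in euro under the described scheme.

    - 5 euro show-up fee if the experiment was not finished,
    - otherwise 8 euro base payment plus 1 euro bonus per task scenario
      in which at least 4 of the 5 predefined correct techniques were chosen.
    """
    if not finished:
        return 5
    payment = 8
    for chosen, correct in zip(scenario_answers, correct_answers):
        if len(set(chosen) & set(correct)) >= 4:
            payment += 1  # at most 3 euro extra across the three scenarios
    return payment

correct = [["t1", "t2", "t3", "t4", "t5"]] * 3   # hypothetical answer key
good = [["t1", "t2", "t3", "t4", "x"]] * 3       # 4 of 5 correct per scenario
```

Under these hypothetical inputs, a participant with four correct selections in all three scenarios would receive the maximum of 11 euro.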

Manipulation
The three task scenarios in stage 2 were synthesized by reviewing 31 scenarios of using design techniques in previous studies (Table 2). We predefined the correct answers for each task scenario.
The developed task description is as follows: Please imagine you are working in a team on a project. Your project develops a mobile app enabling users to book cinema tickets. Please select suitable design techniques based on three design stages:

Stage 1: Generate ideas
• You are at an early stage of your project.
• You want to involve a group of people (more than two) to discuss the project and generate some ideas during the discussion.
• You do not have access to real users.

Stage 2: Create prototypes
• You now want to create some initial prototypes based on the ideas created in stage 1.
• After creating several prototypes, you want to compare them and decide on one or two prototypes for further refinement.
• You again do not have access to real users when creating and evaluating prototypes.

Stage 3: Evaluate the app
• You now have developed a first running version of the app.
• You want to evaluate the app with real users before delivering it on a large scale to the market.
• The evaluation seeks to collect real users' data and feedback within a couple of days.

Measurements
The measurements used in the experiment included prior design technique knowledge, intuitive decision style, rational decision style, and cognitive effort, all measured on a 7-point Likert scale (1 = strongly disagree; 7 = strongly agree). Participants were asked to rate their agreement with the items included in each measurement. The items for prior design technique knowledge appeared directly after the explanation of the design techniques in stage 1; the questions measuring the other three variables were asked in stage 3. Table 3 presents the measurements and the items.

Method validation
In total, 195 subjects participated in the experiment. Two of them did not understand some parts of the experiment and were filtered out based on a "No" answer to the control questions; three of them did not finish the whole experiment. The remaining 190 participants provided valid data. Thus, 97.4% of the data points from the experiment were valid, which also indicates that the experiment was designed in an easy-to-understand way.
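The filtering step can be sketched as follows. The record field names `control_answers` and `finished` are illustrative assumptions about how the responses might be stored, not the study's actual data format.

```python
def filter_valid(participants):
    """Keep only participants who answered every control question with "Yes"
    and completed the whole experiment."""
    return [
        p for p in participants
        if all(answer == "Yes" for answer in p["control_answers"]) and p["finished"]
    ]

# Hypothetical records mirroring the reported counts: 190 valid,
# 2 who answered "No" to a control question, 3 who did not finish.
participants = (
    [{"control_answers": ["Yes", "Yes"], "finished": True}] * 190
    + [{"control_answers": ["No", "Yes"], "finished": True}] * 2
    + [{"control_answers": ["Yes", "Yes"], "finished": False}] * 3
)
valid = filter_valid(participants)
```

Applied to these counts, the filter retains 190 of 195 records, i.e., the reported 97.4% validity rate.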
The Cronbach's alpha of prior design technique knowledge, rational decision style, intuitive decision style, and cognitive effort were 0.91, 0.86, 0.88, and 0.89, respectively. Furthermore, the items loaded highly on their assigned factors and lowly on other factors. The validation checks showed adequate reliability of the measurement scales.
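For reference, Cronbach's alpha can be computed from raw item scores using the standard formula: alpha = k/(k-1) * (1 - sum of item variances / variance of the total score), where k is the number of items. The sketch below is a plain-Python illustration of that formula, not the exact analysis script used in the study.

```python
def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores: list of k lists, each holding one item's ratings
    across all participants (e.g., 7-point Likert responses).
    """
    def variance(xs):
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)  # sample variance

    k = len(item_scores)
    totals = [sum(ratings) for ratings in zip(*item_scores)]  # per-participant total
    item_var_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))
```

As a sanity check, perfectly parallel items (identical response patterns) yield an alpha of exactly 1.0.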
The comparison of the three groups of participants (64 for the taxonomy, 62 for the tags, 64 for the list) indicated statistically significant differences according to the Kruskal-Wallis test and Mann-Whitney U tests (Fig. 3).
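An analysis of this kind can be run with SciPy's implementations of the Kruskal-Wallis and Mann-Whitney U tests. The function below is a sketch under the assumption that each group's performance scores are available as a list of numbers; the function and variable names are illustrative.

```python
from scipy import stats

def compare_groups(taxonomy, tags, listing):
    """Omnibus Kruskal-Wallis test across the three groups, followed by
    pairwise Mann-Whitney U tests on the group performance scores."""
    _, p_omnibus = stats.kruskal(taxonomy, tags, listing)
    pairwise = {
        ("taxonomy", "tags"): stats.mannwhitneyu(taxonomy, tags).pvalue,
        ("taxonomy", "list"): stats.mannwhitneyu(taxonomy, listing).pvalue,
        ("tags", "list"): stats.mannwhitneyu(tags, listing).pvalue,
    }
    return p_omnibus, pairwise

# Hypothetical scores with a clear shift between groups, for illustration only.
tax_scores = [3, 4, 5] * 20
tag_scores = [2, 3, 4] * 20
list_scores = [1, 2, 3] * 20
p_omnibus, pairwise = compare_groups(tax_scores, tag_scores, list_scores)
```

The nonparametric tests are appropriate here because Likert-based performance measures are ordinal and need not be normally distributed.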

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.