The TrollLabs open hackathon dataset: Generative AI and large language models for prototyping in engineering design

The TrollLabs Open dataset offers a comprehensive comparison of design practices and outcomes between human participants and generative AI during a hackathon event. The dataset was curated by running a prototyping hackathon designed to assess the abilities and performance of generative AI, specifically ChatGPT, in the early stages of engineering design. The assessment compared ChatGPT's performance to that of experienced engineering students in a hackathon setting, where participants competed by making a prototype that fires a NERF dart as far as possible. In this setup, all ideas, concepts, strategies, and actions undertaken by the AI-controlled team were autonomously generated by ChatGPT, without human intervention or guidance, but implemented by two participants. Five self-directed baseline teams competed against the AI team. The dataset comprises 116 prototype entries and 433 edges (connections) that enable comparative analysis of design practices and performance between the team instructed solely by generative AI and baseline teams of experienced engineering design students. Prototypes and their attribute data were captured using Pro2booth, an online prototype capture platform running on participants' phones and computers. The dataset includes a transcript of the conversation between ChatGPT and the team responsible for implementing its recommendations, featuring 97 exchanges of prompts and responses. It also contains the initial prompt used to instruct the AI, the objective and rules of the hackathon, and the objective performance of teams, showing the ChatGPT team finishing 2nd among six teams. To the authors' knowledge, the TrollLabs Open dataset is the first and only open resource that directly compares the performance of generative AI with human teams in an engineering design context.
Thus, it is intended to be a valuable resource for design researchers, engineering and design students, educators, and industry professionals seeking strategies for implementing generative AI tools in their design processes. By offering a comprehensive data collection, the dataset enables external researchers to conduct in-depth analyses that could highlight the practical implications of integrating generative AI in design practices, provide an overview of its limitations, and present recommendations for improved integration in the design process.

© 2024 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license ( http://creativecommons.org/licenses/by/4.0/ )

Value of the Data
• The dataset provides valuable data that could be analysed to generate insights into how generative AI, specifically ChatGPT, can be utilized in the early stages of engineering design. This is particularly important for understanding AI's potential in ideation, conceptualization, and initial problem-solving phases, to exploit its benefits and mitigate its limitations.
• By offering a comprehensive collection of data on all parts of the design practices conducted by participants, the dataset enables external researchers to conduct in-depth analyses that could highlight practical implications of integrating generative AI in design practices, possibly provide an overview of its limitations, and present recommendations for improved integration in the design process.
• The dataset includes the final outputs as well as the entire history and rationale behind design decisions, which can be used to understand and compare both AI- and human-driven design decisions.
• To the authors' knowledge, the TrollLabs Open dataset is the first open resource that directly compares the performance of ChatGPT with human teams in an engineering design context.

Background
Artificial intelligence (AI) and natural language processing (NLP) have gained increased attention due to the availability of free and easy-to-use tools such as ChatGPT and similar prompt-based generative AI systems. AI has transcended its traditional role as a computational aid and will likely influence how design engineers conceptualize, create, refine, and design prototypes. Although generative AI can assist in the conceptual stages of design [1], it may be inferior to tools tailored explicitly for purposes that require exacting standards and intricate technicalities [2,3]. ChatGPT, for instance, excels at text generation but has been shown to exhibit drawbacks such as forgotten information, partial responses, and a lack of output diversity for design tasks [4]. It is, therefore, uncertain how well ChatGPT performs when tasked with generating concepts and building instructions for prototypes and physical solutions that often require domain-specific knowledge at a great degree of accuracy and detail [5].

Data Description
Table 1 presents an overview of the dataset [12], listing each file with a description of its contents and origin.

TrollLabs Open Hackathon
The TrollLabs Open hackathon was run to elucidate the capabilities and performance of generative AI, specifically ChatGPT, in the early stages of engineering design through comparative analysis in a hackathon setting, where its outputs and solutions were evaluated against those developed by experienced engineering students [6]. ChatGPT actively participated in the hackathon by collaborating with two first-year PhD students, who executed the AI's design suggestions. The 48-hour challenge involved developing a prototype capable of launching a foam NERF dart the farthest distance under specific constraints in a university makerspace. It required the creation of a free-standing prototype without external power or air sources. Performance was measured based on the distance a NERF dart was launched, and teams were limited to one counting test on the final day of the challenge. These results are provided in the Test_results.csv file. Participants, including ChatGPT's team, were given identical conditions and resources. The complete list of rules and requirements is provided in the rules.txt file. The design task and rules were given to the participants at the start of the challenge.
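Reading Test_results.csv and ranking teams by launch distance can be sketched as below. The column names (`team`, `distance_m`) and the inline sample rows are assumptions for illustration; the actual file may use different headers and units.

```python
import csv
import io

# Hypothetical excerpt standing in for Test_results.csv; the real
# column names and values may differ.
sample = """team,distance_m
1,7.2
2,9.5
3,4.1
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# Rank teams by the distance of their single counting test, farthest first.
ranking = sorted(rows, key=lambda r: float(r["distance_m"]), reverse=True)
best_team = ranking[0]["team"]
```

To run against the real dataset, replace the inline string with `open("Test_results.csv")` and adjust the header names as needed.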
ChatGPT was given a specific prompt outlining its role in the hackathon as the sole decision-maker for its team. The prompt is provided in the prompt.txt file. The human participants acted as executors of ChatGPT's instructions, with a mandate to seek clarification and provide feedback but not to influence the decision-making process. This setup aimed to evaluate ChatGPT's autonomous problem-solving and decision-making capabilities within the confines of the engineering design challenge. ChatGPT-4.0 (Oct. 19th, 2023) was chosen as the best-fitting language model because of its analytical skills and refined understanding [7], as well as being the most recognized LLM available to the AI-controlled team during the study. The full chat is provided in the Chat.txt file.
Ten voluntary participants were recruited for the study alongside the ChatGPT team. These individuals were selectively invited to participate due to their active involvement in writing their master's theses at TrollLabs, the research group in which this study was carried out. This ensured their relevant expertise and interest in the field of early-phase engineering design, as well as their competitiveness in the hackathon. The master's students were unaware that ChatGPT was instructing one team, to mitigate biased behaviours. Complete individual demographics, detailing backgrounds and relevant experience, are provided in the Demographics.csv file. Demographics were captured using an online form.

Data Capture Method -Pro2booth
Pro2booth is an online prototype capture system designed to capture design activities by logging prototypes and their attributes [8]. Pro2booth has previously been used to capture large prototype datasets in similar events [9,10] and to provide an accurate representation of the users' design process [11]. This merited the reuse of Pro2booth for this study. The hackathon rules encouraged using the platform by rewarding points for uploading prototypes. All attribute fields had to be filled in for acceptance into the database, to ensure prototypes were captured correctly. Participants interacted with Pro2booth through a website where they could upload prototypes and attributes in a form consisting of free-text entries and drop-down menus. Users, prototypes, and projects make up the nodes and edges of the dataset, all of which are linked through a graph database. Pro2booth captures the following attributes for each prototype: an ID for reference, a name, a team number, description (free text), domain (drop-down menu), media (picture, video, CAD files), who created it, influences, rationale (free text), purpose of creation (drop-down menu), insights (free text), date of capture, method/machine used (drop-down menu), and time to create (drop-down menu). Prior entries were editable until the completion of the hackathon.
Upon conclusion of the hackathon, the data from Pro2booth was consolidated into a .json file. This file was then parsed to generate CSV files, which were categorized to extract prototypes, users, and projects. This process resulted in the Prototypes.csv file in the dataset, which encapsulates detailed information about each prototype, including its various attributes and the team responsible for its creation. Media content was required alongside prototype uploads; all media files are referenced in Prototypes.csv and saved in the Media.zip file. Additionally, a Teams.csv file lists each team number and identifies their role in the hackathon. The roles are categorized as either "participant" for human teams or "ChatGPT" for the team directed by the generative AI, clearly differentiating the teams based on their operational dynamics.
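The consolidation step described above (parsing the graph-database export into per-category CSV files) can be sketched as follows. The JSON shape used here is a simplified assumption for illustration; the actual Pro2booth export schema is not specified in this article and may differ.

```python
import csv
import io
import json

# Hypothetical, simplified export: nodes (users, prototypes, projects)
# and edges, as produced by a graph database. The real schema may differ.
export = json.loads("""
{
  "nodes": [
    {"type": "prototype", "id": "p1", "name": "Catapult v1", "team": 3},
    {"type": "user", "id": "u1", "name": "Alice"},
    {"type": "prototype", "id": "p2", "name": "Slingshot", "team": 5}
  ],
  "edges": [
    {"from": "u1", "to": "p1", "relation": "created"}
  ]
}
""")

# Categorize nodes: keep only prototype entries for Prototypes.csv.
prototypes = [n for n in export["nodes"] if n["type"] == "prototype"]

# Write the selected attribute columns; extra keys (e.g. "type") are ignored.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "name", "team"], extrasaction="ignore")
writer.writeheader()
writer.writerows(prototypes)
csv_text = buf.getvalue()
```

The same filter-and-write pattern, applied with `type == "user"` and `type == "project"`, would yield the other per-category CSV files.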

Limitations
The data might be influenced by the human involvement necessary for executing ChatGPT's ideas, with the potential for either excessive or insufficient human input affecting the practicality and integrity of the outcomes. The dataset also faces limitations due to the subjective nature of interpreting ChatGPT's instructions, which were sometimes vague. This could lead to variability in how instructions were executed, potentially affecting the consistency and replicability of the data.

Table 1
File descriptions.
Teams.csv: Contains the role of each team; regular participants are denoted by "Participant" and the team controlled by ChatGPT is denoted by "ChatGPT."
Test_results.csv: Contains results from the final performance test of designs after the hackathon.
Chat.txt: Copy of the chat between participants and ChatGPT. Prompts have the header "participant" and answers the header "ChatGPT."
Demographics.csv: Demographics of participants, including background and relevant experience.
Media.zip: Contains media files uploaded alongside prototypes in Pro2booth, with references to corresponding prototypes in the Prototypes.csv file.
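Because Teams.csv maps team numbers to roles, prototypes can be labelled as AI- or human-driven by joining it with the prototype records. The sketch below uses inline excerpts with assumed column names (`team`, `role`, `id`, `name`); the actual headers in Teams.csv and Prototypes.csv may differ.

```python
import csv
import io

# Hypothetical excerpts standing in for Teams.csv and Prototypes.csv.
teams_csv = """team,role
1,Participant
2,ChatGPT
"""
prototypes_csv = """id,name,team
p1,Launcher v1,2
p2,Rubber sling,1
"""

# Map team number -> role, then attach the role to each prototype row.
role_by_team = {r["team"]: r["role"] for r in csv.DictReader(io.StringIO(teams_csv))}
labelled = [
    {**p, "role": role_by_team[p["team"]]}
    for p in csv.DictReader(io.StringIO(prototypes_csv))
]

# Prototypes created under generative-AI direction.
ai_prototypes = [p["name"] for p in labelled if p["role"] == "ChatGPT"]
```

Replacing the inline strings with the dataset's actual files enables the comparative analyses described above, e.g. contrasting prototype counts or attributes between the ChatGPT team and the baseline teams.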