Exploring Dynamic Difficulty Adjustment Methods for Video Games

Maintaining player engagement is pivotal for video game success, yet achieving the optimal difficulty level that adapts to diverse player skills remains a significant challenge. Initial difficulty settings in games often fail to accommodate the evolving abilities of players, necessitating adaptive difficulty mechanisms to keep the gaming experience engaging. This study introduces a custom first-person-shooter (FPS) game to explore Dynamic Difficulty Adjustment (DDA) techniques, leveraging both performance metrics and emotional responses gathered from physiological sensors. Through a within-subjects experiment involving casual and experienced gamers, we scrutinized the effects of various DDA methods on player performance and self-reported game perceptions. Contrary to expectations, our research did not identify a singular, most effective DDA strategy. Instead, findings suggest a complex landscape where no one approach (performance-based, emotion-based, or hybrid) demonstrably surpasses static difficulty settings in enhancing player engagement or game experience. Noteworthy is the data’s alignment with Flow Theory, suggesting potential for the Emotion DDA technique to foster engagement by matching challenges to player skill levels. However, the overall modest impact of DDA on performance metrics and emotional responses highlights the intricate challenge of designing adaptive difficulty that resonates with both the mechanical and emotional facets of gameplay. Our investigation contributes to the broader dialogue on adaptive game design, emphasizing the need for further research to refine DDA approaches. By advancing our understanding and methodologies, especially in emotion recognition, we aim to develop more sophisticated DDA strategies. These strategies aspire to dynamically align game challenges with individual player states, making games more accessible, engaging, and enjoyable for a wider audience.


Introduction
Player engagement is pivotal for the success of a video game. If a game lacks intuitive gameplay and engaging mechanics, players might be tempted to switch to a different activity. This is particularly true for casual and non-gamers, who possess a diverse range of other recreational options. If these players are not immediately engaged and enjoying the game, they may set it aside and never give it another chance. The game's difficulty significantly influences engagement; if it is too challenging, players may give up, whereas if it is too easy, they might lose interest. However, designing difficulty tiers that accommodate the diverse skill set of potential players is challenging. Factors like player skill, gaming experience, and familiarity with a game's genre are important in determining the optimal difficulty. A potential solution is to adjust the game's difficulty based on the player's performance and emotional state, gauged by using physiological sensors. Dynamically modifying game difficulty in light of players' affective states and in-game performance can enhance playtime and satisfaction. Such adaptability is important for enticing newcomers to video gaming and maintaining interest among those who do not typically engage with games.
Game designers traditionally offer fixed difficulty options, such as easy, medium, and hard, from which players can choose. However, two primary issues emerge with this approach. Firstly, these fixed settings might not cater to the full spectrum of player capabilities. Even the easiest difficulty level may prove to be too challenging for a non-gamer. For a player with no experience of the controller or input system, any fixed difficulty level may prove too challenging. Likewise, it is difficult to provide an appropriate challenge for experienced gamers. Even the highest available difficulty may not offer them enough of a challenge to stay engaged. Secondly, these static difficulties do not accommodate fluctuating moods. Typically, games permit difficulty selection at the beginning of the game or at specific checkpoints. Yet, a player's mood can vary frequently, influencing their gameplay preferences at any given moment. The player may get frustrated by continually failing and may just want to move past a specific point. Or, after a long day of work, they may just want to relax and not challenge themselves as much as they usually would. Addressing players' affective states when determining difficulty could notably bolster engagement.
Dynamic Difficulty Adjustment (DDA), which can consider the players' skill, emotional state, and game context, promises a more captivating experience than traditional games with fixed difficulties. This approach could be particularly beneficial for players at the skill extremes, potentially fostering continuous engagement and heightened immersion, an important element in game enjoyment [1]. While DDA is not new, it traditionally emphasizes performance metrics [2]. However, recent studies have begun exploring affective-based DDA as well [3][4][5].
In this study, we introduce and evaluate five distinct difficulty modification approaches. We crafted a custom game featuring seven difficulty tiers and assessed it using these five methods: fixed, user-selected, performance-based, emotion-based, and a hybrid of performance and emotion. Performance metrics were derived from in-game parameters, while emotional data were sourced from physiological sensors. We conducted a comprehensive user study with 31 participants, comparing the methods based on performance and self-reported measures, such as stress and engagement.

Related Work
DDA serves as an important mechanism in video game design, addressing the challenges of maintaining player engagement across diverse skill levels and emotional states. Unlike static difficulties that cannot adapt to individual player needs, DDA can align game challenges with players' abilities and mood in real-time, ensuring a continuous state of engagement and immersion. These approaches, leveraging real-time adjustments based on performance metrics or physiological responses, represent a significant advancement over traditional methods. The work of Bicalho et al. [6] identified player classification as a key challenge in DDA. By facilitating a more personalized experience, DDA not only broadens the appeal of video games but also offers an avenue to transform the short-lived interest of non-gamers and casual players into lasting engagement.
Many approaches to performance-based or emotion-based DDA derive inspiration from Csíkszentmihályi's Flow Theory [7]. This theory suggests that a player's enjoyment and experience peak when tasked with an appropriately challenging endeavor. Describing a state of intense concentration or complete absorption in the ongoing activity, the phenomenon is often dubbed being "in the zone". A flow state is achieved when there is an equilibrium between the player's skill level and the game's difficulty, as illustrated in Figure 1. By adopting real-time difficulty adjustments as opposed to static settings, games can potentially heighten immersion. Building on the subject of player experience, Bowman and Tamborini [8] delved into the ramifications of stress and boredom on task demand in gaming contexts. Additionally, Jenett et al. [1] explored the interplay between immersion in games and corresponding affective measures. Lemmons [9] found that matching player skill to difficulty led to higher levels of flow as well as better performance, higher arousal, and more enjoyment. Flow state is achieved by presenting an appropriate challenge for the player's skill level. If the game is too difficult, the player is more stressed and has higher anxiety. If the game is too easy, the player is less engaged and becomes bored.
Many studies have explored DDA and its benefits [2]. Specifically, performance-based DDA emerges as a prevalent alternative to conventional static game difficulties. This approach employs in-game performance metrics, such as accuracy, damage dealt, and total score, to adjust difficulty dynamically. Research on performance-based DDA spans genres, from first-person shooters [10][11][12] to Massively-Multiplayer Online (MMO) games [13] and Multiplayer Online Battle Arena (MOBA) titles [14]. Romero-Mendez et al. [15] utilized deep learning to classify players' skill levels and applied performance-based DDA, significantly enhancing gameplay engagement and immersion in an arcade shooter. In the context of enhancing player engagement in VR exercise games, Huber et al. [16] proposed using procedurally generated environments alongside DDA to dynamically tailor game levels to players' physical capabilities. Francillette et al. [17] developed and tested a model for performance-based DDA within a platformer environment, assessing difficulty through static danger zones and enemy movement patterns.
Conversely, affective-based DDA focuses on adjusting game difficulty in response to a player's emotional state, utilizing real-time physiological feedback from sensors. This method has been explored using various tools, including Electroencephalography (EEG) devices for detecting anxiety or stress [18,19] and the affective metric of focus in two-dimensional (2D) platform games [20]. Stein et al. [21] implemented affective-based DDA in a multiplayer third-person shooter, targeting excitement as the key emotional driver. Other sensing approaches have also been used to estimate the user's affective state, including functional near-infrared spectroscopy (fNIRS) to assess task demand [22] and heart rate variability (HRV) for stress detection, linking higher heart rates with increased stress levels [3][4][5][23]. The galvanic skin response (GSR) is another popular measure for understanding player affect in DDA research [24][25][26]. By gauging emotional states like focus, stress, excitement, and boredom, affective-based DDA aims to create a more immersive and emotionally resonant gaming experience.
The effectiveness of DDA, particularly when grounded in Flow Theory, has been debated. Zhixing Guo et al. [27] critiqued the reliance on Flow Theory, suggesting it leads to mixed outcomes in enhancing player experiences. Similarly, Koskinen's study indicated that situational interest increased only with significant downward difficulty adaptations [28]. These critiques underscore the complexity of applying DDA in gaming.
Recent studies have begun exploring hybrid approaches that combine performance and affective data to refine DDA [24,29]. This integration aims to address the limitations of relying solely on one type of data, marking a promising direction for future research and application in the field. Our study aims to extend this evolving discourse by comparing the efficacy of performance-based and emotion-based DDA with static difficulty techniques while also introducing a novel hybrid approach designed to optimize player enjoyment and engagement.

Game Environment: The Cattle Catcher Game
To test our Dynamic Difficulty Adjustment methods, we developed a custom game with several difficulty levels. The game, a first-person shooter titled "Cattle Catchers from Outer Space" (see Figure 2 for screenshots of the game), belongs to the endless survival genre. In this game, players aim to survive for as long as possible while defending against alien invaders. Apart from survival, players can also strive for high scores by accomplishing secondary objectives, which include escorting cows to safety, collecting crystals from defeated enemies, and protecting cows from unidentified flying objects (UFOs) aiming to abduct them. We have incorporated seven difficulty levels, ranging from extremely easy (1) to extremely hard (7), altering in-game variables associated with the enemies and the player character.

Design of Difficulty Levels
Before detailing the specific adjustments to player and enemy attributes across the various difficulty levels, it is important to outline the methodology behind these determinations. The values for each difficulty level were established through a series of pilot studies involving players of varying skill levels. These studies aimed to identify settings that offer a balanced and engaging experience for a wide range of players. By analyzing gameplay data and gathering feedback, we adjusted the in-game variables to ensure each difficulty level was challenging yet playable, accommodating the diverse abilities and preferences of our player base.
For the player character, three attributes vary according to difficulty: reload time, aim-down-sight (ADS) speed, and aim assist. The attributes and their settings across difficulty levels (from extremely easy to extremely hard) are provided in Table 1.

• Reload Time: Refers to the duration (in seconds) required for the player character to reload their weapon. This duration increases with rising difficulty levels.
• ADS Speed: This value acts as a multiplier to the base speed for aiming down the sights, influencing how quickly the player character can enter aiming mode for the more precise targeting of enemies. A higher multiplier means quicker aiming. As the difficulty level increases, the multiplier decreases, making aiming slower and adding to the challenge.
• Aim Assist: This attribute determines the effective hitbox size, in meters, around each projectile fired at enemies, essentially acting as the size of the detection sphere for each shot. As the game's difficulty level increases, this detection sphere (the aim assist) becomes smaller. This reduction requires players to be more precise with their aiming, as the margin for error decreases with higher difficulty settings.

Aliens, when spawned, attempt to defeat the player. They periodically move to a location on the game map near the player and fire projectiles at the player if they have a clear line of sight and are within firing range. These enemies always target the player's center mass. However, to simplify gameplay at lower difficulties, we introduced a degree of randomness in the projectiles' trajectory. The parameters altered for these enemies and their settings across difficulty levels are detailed in Table 2.

• Health: This represents the amount of damage the enemy can sustain before defeat. It is specific to body shots since headshots guarantee instant kills.
• Fire Rate: Represents the amount of time (in seconds) that passes before the enemy can fire a new projectile.
• Aiming Error: Defines the inaccuracy in enemy targeting as a sphere of variable size around the player (in meters), within which enemies randomly select points to aim and fire. This mechanism introduces intentional inaccuracy in enemy attacks. The larger the sphere, the greater the inaccuracy.
• Move Delay: The amount of time the enemy remains stationary before repositioning closer to the player.
• Max Spawn: Designates the maximum number of enemies allowed on the game map simultaneously.
• Spawn Rate: Specifies the amount of time that passes in seconds before deploying more enemies onto the map.
• Spawn Amount: Indicates the number of enemies that can spawn during this timestep.

UFOs, another enemy type in the game, spawn to abduct cows. Their modifiable attributes and settings across difficulty levels are presented in Table 3.

• Health: The amount of damage the enemy can sustain before being defeated.
• Abduction Speed: Acts as a multiplier to the base abduction speed of UFOs using their gravity ray. A lower value signifies a longer duration required to abduct cows.
• Max Spawn: Dictates the maximum UFOs present on the game map concurrently.
• Spawn Rate: The duration (in seconds) the game waits before spawning additional UFOs.

Having outlined the game environment and the design of the difficulty levels, it is important to understand the dynamic methods by which these difficulty levels can be adjusted. In the following section, we detail the various Dynamic Difficulty Adjustment methods that were explored and tested in our research.

Difficulty Adjustment Methods
We implemented five difficulty adjustment methods. Two of these methods served as our baseline and employed static, unchanging difficulty levels. The other three methods leveraged either performance data, affective data (garnered from EEG and heart rate sensors), or a combination of both. To ensure more stable adjustments, both performance and affective data utilized a moving average over the last 40 s. This approach aimed to avoid abrupt difficulty fluctuations within brief periods. Each method is described in the subsequent subsections.
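The 40 s smoothing described above can be sketched as a time-windowed mean. This is a minimal illustration and not the study's implementation: the class name `WindowedAverage`, its methods, and the use of an unweighted mean are assumptions made here for clarity.

```python
from collections import deque


class WindowedAverage:
    """Mean of the samples recorded during the last `window_s` seconds.

    Hypothetical helper: the paper only states that a moving average over
    the last 40 s is used, not how it is computed.
    """

    def __init__(self, window_s: float = 40.0):
        self.window_s = window_s
        self._samples = deque()  # (timestamp_s, value) pairs, oldest first

    def add(self, timestamp_s: float, value: float) -> None:
        """Record a sample and discard any that are older than the window."""
        self._samples.append((timestamp_s, value))
        while self._samples and timestamp_s - self._samples[0][0] > self.window_s:
            self._samples.popleft()

    def mean(self) -> float:
        """Return the unweighted mean of the samples still in the window."""
        if not self._samples:
            raise ValueError("no samples in window")
        return sum(value for _, value in self._samples) / len(self._samples)
```

Because old samples leave the window gradually, a single extreme reading shifts the average only slightly, which is the buffering effect the text describes.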

Fixed Difficulty
This method employs a static, medium difficulty level (4) throughout the game, serving as a control for comparative analysis against other adjustment techniques.

User Selected
Players personally select their preferred difficulty level, which can range from extremely easy (1) to extremely hard (7). While this choice often reflects a player's gaming experience, it might also be influenced by their current emotional state, such as stress or mental fatigue.

Performance-Based
Performance metrics are derived from a weighted average of diverse in-game parameters. These parameters, and their significance in gauging player skill, are listed below. The exact weights can be referenced in Table 4.

• Accuracy: This represents the player's current accuracy, measured as a percentage. A higher accuracy indicates a more skilled player.
• Headshot Ratio: This is the ratio of headshots to body shots against the alien enemy type, measured as a percentage. A higher headshot ratio suggests a more skilled player.
• Player Health: This signifies the player's current health level. This value ranges from 0 to 100.
• Last Hit: This is the time in seconds that has passed since the player last took health damage. A higher value suggests a more skilled player.
• Cows Corralled: This represents the secondary objective of successfully escorting cows to the safe zone. A higher value suggests a more skilled player.
• UFOs Destroyed: This signifies the secondary objective of destroying UFOs before they can abduct cows. A higher value suggests a more skilled player.
• Crystals Gathered: This counts the number of crystals collected, which are items dropped upon the death of the alien enemy type. A higher value suggests a more skilled player.
• Survival Time: This is the time in seconds that the player has managed to survive in the current playthrough. Lasting longer suggests a more skilled player.

Each performance parameter is assigned a weight reflecting its significance in assessing player proficiency and expertise, with these weights derived from pilot studies and player feedback. Game performance metrics, taking into account both player skill level and the game's current state, guide the recommended difficulty level in Table 5. A weighted moving average emphasizes recent performance, focusing on the last 40 s of playtime.
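The weighted average over the eight parameters can be sketched as follows. The weights below are placeholders invented for illustration (the study's values are in its Table 4), and the sketch assumes each parameter has already been normalized to the range 0 to 1 before weighting, which the text does not specify. The function name `performance_score` is likewise hypothetical.

```python
from typing import Mapping

# Placeholder weights for illustration only; the study's actual weights
# are given in its Table 4 and are not reproduced here.
WEIGHTS = {
    "accuracy": 0.20,
    "headshot_ratio": 0.15,
    "player_health": 0.15,
    "last_hit": 0.10,
    "cows_corralled": 0.10,
    "ufos_destroyed": 0.10,
    "crystals_gathered": 0.10,
    "survival_time": 0.10,
}


def performance_score(normalized: Mapping[str, float], scale_max: float = 7.0) -> float:
    """Weighted average of parameters in [0, 1], scaled to [0, scale_max].

    Assumes the caller has normalized each raw parameter to [0, 1];
    out-of-range inputs are clamped.
    """
    missing = WEIGHTS.keys() - normalized.keys()
    if missing:
        raise KeyError(f"missing parameters: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(
        WEIGHTS[name] * min(max(normalized[name], 0.0), 1.0) for name in WEIGHTS
    )
    return scale_max * weighted_sum / total_weight
```

Dividing by the total weight keeps the result on the same scale even if the weights do not sum to exactly one.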

Emotion-Based
Drawing from Csíkszentmihályi's Flow Theory, our primary objective is to keep players within an optimal mental state that balances stress and engagement, thereby preventing either boredom or excessive mental strain. Unlike performance-based DDA, this method does not gauge the player's skill. Instead, it harnesses physiological sensors to gauge players' stress and engagement levels directly, adjusting the game difficulty accordingly.
Our system aggregates stress and engagement metrics, each ranging from 0 to 100, into a single metric. To ensure this combined metric is compatible with the 0 to 7 scale used in our performance-based Dynamic Difficulty Adjustment (DDA) system, we normalized it by dividing the sum by 28.57. This divisor was calculated by dividing the maximum possible combined value of 200 (the sum of the maximums for both metrics) by the maximum of the target range, 7, to seamlessly integrate with other DDA calculations. The normalized value then influences the moving average formula, which determines the overall game difficulty based on the last 40 s of gameplay, providing a buffer against abrupt difficulty changes. This normalization process and its application in adjusting game difficulty are further detailed in Table 5 and Figure 3.
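The normalization arithmetic can be checked with a short sketch. The function name `emotion_score` and the input validation are assumptions added for illustration; the divisor follows directly from the text (200 / 7 ≈ 28.57).

```python
MAX_COMBINED = 200.0  # maximum of stress (100) plus engagement (100)
SCALE_MAX = 7.0       # upper end of the target 0 to 7 scale
DIVISOR = MAX_COMBINED / SCALE_MAX  # approximately 28.57


def emotion_score(stress: float, engagement: float) -> float:
    """Map the sum of stress and engagement (each 0 to 100) onto 0 to 7."""
    if not (0.0 <= stress <= 100.0 and 0.0 <= engagement <= 100.0):
        raise ValueError("stress and engagement must each lie in [0, 100]")
    return (stress + engagement) / DIVISOR
```

For instance, mid-range readings of 50 for both metrics give a normalized value of about 3.5, the midpoint of the scale.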

Combined Approach
This Dynamic Difficulty Adjustment (DDA) framework employs a hybrid strategy, integrating both performance-based and emotion-based metrics to tailor the game's challenge in real-time. To achieve this, we computed a weighted average, where the emotion-based metric receives a weighting of 0.6 and the performance-based metric a weighting of 0.4. This weighting scheme was carefully chosen based on insights gained from pilot studies, which emphasized the significance of the player's current emotional state in enhancing game engagement and overall experience.
In our pilot studies, we observed that players' enjoyment and immersion were more deeply impacted by adjustments that considered their emotional responses, such as stress and engagement levels, compared to purely performance-based adjustments. Therefore, by assigning a higher weight to the emotion-based metric, we aimed to ensure that the game dynamically adapts in a manner that prioritizes player sentiment, fostering a more personalized and emotionally resonant gaming experience. This approach allows for a nuanced balance, ensuring that while performance factors remain important, they do not overshadow the importance of maintaining an engaging and emotionally satisfying challenge level for the player.
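As a minimal sketch of the 0.6 / 0.4 weighting, the function below blends two scores that are assumed to share the same 0 to 7 scale. The name `combined_score` is hypothetical, and how the blended value is mapped to a discrete difficulty level is left out because the text does not specify it here.

```python
def combined_score(
    emotion: float,
    performance: float,
    w_emotion: float = 0.6,
    w_performance: float = 0.4,
) -> float:
    """Weighted average of the emotion-based and performance-based scores.

    Both inputs are assumed to be on the same scale (0 to 7 in the text).
    """
    if abs(w_emotion + w_performance - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    return w_emotion * emotion + w_performance * performance
```

With these weights, a player whose emotion score is at the top of the scale and whose performance score is at the bottom still lands above the midpoint (about 4.2), which illustrates how strongly the hybrid favors the affective signal.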

Goals
This study aims to explore the intricacies of player preferences regarding game difficulty settings, specifically examining the impact of DDA compared to static difficulty settings. Understanding player preferences can significantly enhance user experience and engagement. From this, we propose the following research questions:
RQ1: Do participants prefer DDA over static difficulties in video games?
RQ2: Among DDA techniques, which do participants prefer: emotion-based, performance-based, or a combination of techniques?
RQ3: Are higher difficulty levels within DDA techniques associated with increased stress and engagement among players?
RQ4: What differences are observed in performance and physiological metrics when applying various DDA techniques?

Experimental Design and Procedure
Upon entering the laboratory, participants were escorted to the experimental area. They then completed several forms: a consent form, a COVID-19 precautionary questionnaire, and a pre-experiment questionnaire concerning gaming experience and immersive tendencies. Following this, participants were fitted with physiological sensors. We ensured that the devices were correctly positioned and provided reliable readings. A training session was then held to acquaint participants with the game's controls, environment, and objectives. After training, they entered a baseline data collection phase, where they were instructed to relax with their eyes open for thirty seconds, followed by an equivalent period with their eyes closed. This allowed us to capture essential baseline parameters like resting heart rate and baseline emotional states.
Participants were outfitted with an EEG device while playing the shooter game and wore a wristwatch that measured HRV. These devices monitored the players' affective states, allowing the game's difficulty to dynamically adjust in response. Additionally, we recorded in-game performance metrics to modify the game in real-time and for post-experiment statistical analysis.
Participants then underwent testing for the five different difficulty adjustment methods, with the sequence determined using a Latin square. Each method was tested either until a three-minute timer expired or the game character died by losing all health due to enemy projectiles. Post-technique questionnaires (detailed in Table 6) were completed after each method to complement the EEG data and in-game metrics with participants' feedback. Following the testing of all five methods, participants were offered a break. A second trial, repeating all procedures, was conducted post-break.
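One way to derive condition orders from a Latin square is sketched below. The text does not say which Latin square was used, so the cyclic construction and the assignment of participants to rows by index are assumptions; the function names are hypothetical.

```python
CONDITIONS = ["Fixed", "User Selected", "Performance", "Emotion", "Combined"]


def latin_square(items):
    """Cyclic Latin square: row r is `items` rotated left by r positions.

    Every item appears exactly once in each row and each column.
    """
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def order_for(participant_index, items=CONDITIONS):
    """Return the condition order for a participant (rows reused cyclically)."""
    square = latin_square(items)
    return square[participant_index % len(items)]


if __name__ == "__main__":
    for p in range(5):
        print(p, order_for(p))
```

A cyclic square balances the position of each condition but not which condition precedes which; a study concerned with carry-over effects might prefer a balanced design instead.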

Table 6. Post-technique questionnaire (items Q9 and Q10).
Q9: To what extent did you enjoy playing the game, rather than something you were just doing?
Q10: Was there a noticeable difference in difficulty from the previous trial/trials?

Participants
In our study, we recruited a total of 31 participants from the university population, comprising 27 males and 4 females, with ages ranging from 19 to 33 years (mean: 22.46). To categorize participants based on their gaming experience, we administered a pre-experiment questionnaire containing inquiries such as 'Do you currently play games?', 'How often do you play games?', 'How long have you been playing games?', and a self-reported assessment of skill level in video games.
Based on their responses to these questions, participants were divided into two distinct categories: casual and experienced players. Casual players were identified as those who reported either limited gaming activity or no gaming experience at all. On the other hand, experienced players were defined as those who indicated frequent gaming activity and self-reported proficiency in video games. Of the 31 participants, 14 were classified as casual or non-gamers, while the remaining 17 were categorized as experienced gamers. This categorization facilitated subsequent analysis under the label expertise, allowing us to explore the potential differences in gaming preferences and experiences between these two groups.

Equipment and Software
The experimental setup, illustrated in Figure 4, comprised a 27-inch Dell monitor, an Emotiv Insight EEG device, a Polar Verity Sense optical heart rate sensor wristwatch, and a PC with an Intel Core i7 8700K processor, GTX 1080 graphics card, and 16 GB RAM, running Microsoft Windows 10. The game was developed using Unity 3D v2018.4.14f1.
The Emotiv Insight, a 5-channel low-power EEG device, provided processed emotion values such as Stress, Engagement, Interest, Excitement, Focus, and Relaxation every 10 s. Definitions for these emotions can be found on the Emotiv website [30]. This study primarily utilized the Stress and Engagement metrics for the Emotion and Combined techniques.
The Polar Verity Sense wristwatch measured heart rate variability, aiding in the determination of participant stress levels. Prior research has linked heart rate variability with stress, finding that elevated stress and anxiety levels correspond to decreased intervals between heartbeats [3-5]. For simplicity, we recorded participants' mean heart rate at each timestep for each technique and compared it to the resting heart rate measured during the baseline data collection. The Polar Verity Sense provided heart rate variability data at a rate of 4 Hz.
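The comparison against the resting heart rate amounts to a simple difference, sketched here. The function name `hr_difference` and the assumption that samples are already expressed in beats per minute are illustrative choices, not details taken from the study.

```python
from statistics import mean


def hr_difference(samples_bpm, resting_bpm):
    """Mean heart rate over a timestep minus the baseline resting heart rate.

    A positive result means the heart rate was elevated relative to baseline.
    """
    if not samples_bpm:
        raise ValueError("at least one heart rate sample is required")
    return mean(samples_bpm) - resting_bpm
```

For example, samples of 70, 80, and 90 bpm against a resting rate of 72 bpm give a difference of 8 bpm.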
Statistical analysis, using IBM SPSS 26, was conducted using a variety of techniques tailored to the nature of the collected data. For the game and physiological data, we performed a one-way multivariate analysis of variance (MANOVA) to examine the overall differences between techniques, independent samples T-tests to compare means between casual and experienced players, and pairwise comparisons for dependent samples T-tests to assess changes within groups across different conditions. Additionally, we used Holm's sequential Bonferroni adjustment to correct for type I errors [31]. For the post-questionnaire data, which utilized a 7-point Likert scale, we employed the Friedman Test to analyze differences across multiple related groups, Mann-Whitney U Test to compare two independent groups on a non-parametric variable, and Wilcoxon Signed-Rank Test to compare paired samples on a non-parametric variable.
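For readers unfamiliar with Holm's sequential Bonferroni adjustment, the step-down procedure can be sketched in a few lines. The analysis itself was run in SPSS; this standalone function, including its name and the default significance level of 0.05, is only an illustration of the procedure.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm's step-down procedure.

    Returns a list of booleans aligned with `p_values`; True marks a
    hypothesis that is rejected after correction.
    """
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared against
        # alpha / (m - k + 1), which equals alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all remaining (larger) p-values are retained as well
    return reject
```

With p-values 0.01, 0.04, 0.03, and 0.005, the two smallest pass their thresholds (0.0125 and about 0.0167), the third (0.03 against 0.025) does not, and the procedure stops there.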
Additionally, visualizations of the data were created in Python with the matplotlib and pandas libraries for enhanced interpretation and presentation.

Game Data and Physiological Sensors
In this study, we evaluated the effectiveness of various DDA techniques. Our approach involved collecting an array of data, which encompassed both sensor outputs and in-game metrics, as well as feedback from participants through post-experiment questionnaires to determine which DDA techniques were most successful.
In the context of the varying levels of difficulty, it was found that the Emotion technique posed the highest challenge, with an average difficulty rating of 5.93 as shown in Figure 5. On the other end of the spectrum, Performance presented the least degree of difficulty, averaging at 3.92. The Combined technique exhibited a middle-ground difficulty level, with its average score hovering around 4.71. When users were given the liberty to choose their level of difficulty, the self-selected average turned out to be 4.29. The difficulty scale in this study was designed to range from 1 to 7.
When analyzing the mean difficulty levels across DDA techniques while accounting for expertise, a significant difference between casual and experienced participants is observable for the Performance technique. Specifically, the average difficulties under this technique for casual and experienced participants were 3.43 and 4.33, respectively, as depicted in Figure 6. A similar trend emerged in the User Selected technique, where casual players opted for lower difficulties on average. Here, casual players generally preferred a moderate difficulty setting, averaging 4.00, in contrast to the more experienced players who leaned towards a slightly tougher challenge, with an average difficulty of 4.53. This pattern is also evident in the Combined and Emotion techniques, where more seasoned players faced marginally higher difficulties on average.
In Figure 7, we present the mean values for an array of game data alongside the physiological sensor measurements obtained during our user study. Contrary to initial expectations, the variability observed in most physiological sensor readings (the EEG and heart rate data) was much lower than anticipated. For example, the standard deviation of the Engagement levels was small across all DDA techniques, ranging from 0.058 (Performance) to 0.072 (Combined). This trend of limited variance is consistent across all emotions captured by the EEG. In terms of HRV, the Emotion technique was unique in eliciting a notable increase in heart rate. This was determined by calculating the heart rate (HR) difference, comparing the average heart rate at each timestep for each technique to the baseline resting heart rate established during the initial data collection phase.
In evaluating performance metrics, the impact of the Emotion technique on gameplay becomes evident through its consistent under-performance in several key areas: it resulted in fewer UFOs killed, fewer cows saved, shorter survival times, and lower total scores, alongside the highest average damage taken. In contrast, the Fixed technique demonstrated stronger overall results, with higher total scores, more UFOs killed, longer survival times, greater damage done, less damage taken, and higher accuracy.
To provide a more comprehensive understanding of these findings, we conducted a detailed statistical analysis of the recorded performance metrics and physiological data. Our analysis includes a MANOVA for evaluating the impact on game metrics (Table 7), and T-tests to explore differences between groups and techniques, detailed in Table 8 for independent samples and Tables 9 and 10 for dependent samples. Through this analysis, we aim to illustrate the intricate ways DDA shapes player experience and game dynamics.
Our one-way multivariate analysis of variance investigated the impacts of the different difficulty adjustment techniques (emotion-based, performance-based, combined, fixed difficulty, and user selected) on game performance metrics and physiological sensor data. The findings reveal significant differences in how these adjustment techniques influence key performance metrics, such as difficulty, accuracy, and total score, showcasing the distinct impact each method has on the gaming experience (for detailed statistics, see Table 7).
When analyzing the data by separating players based on expertise (casual vs. experienced), the overall impact of expertise was less pronounced, with only difficulty and damage taken showing significant differences. This suggests that while the type of difficulty adjustment technique plays an important role across all players, the division of expertise level does not uniformly affect most game performance metrics. However, physiological measures such as focus, interest, relaxation, and stress revealed significant differences when players were categorized by expertise, highlighting how expertise influences players' emotional and cognitive reactions to gameplay.
To further explore the differences between casual and experienced players, we conducted independent samples T-tests to assess the effectiveness of the five DDA techniques in modifying key game performance metrics and physiological responses, considering player expertise. Because multiple conditions were tested simultaneously, we used Holm's sequential Bonferroni adjustment to control for type I errors [31]. Significant differences between casual and experienced players were observed in the following areas, as detailed in Table 8:
• Accuracy: Significant differences in accuracy were found between casual and experienced players for the Combined and Performance DDA techniques. Specifically, for the Combined technique, casual players had a mean accuracy of 0.589, while experienced players had a mean of 0.663. Similarly, under the Performance technique, the mean accuracy was 0.575 for casual players and 0.650 for experienced players.
• Damage Done: For the Emotion DDA technique, significant differences were observed between casual and experienced players. Casual players had a mean damage done of 75.03, whereas experienced players had a higher mean of 117.85, indicating that experienced players tend to inflict more damage within this technique.
• Damage Taken: Only the Fixed technique showed significant differences when comparing casual and experienced players. Casual players had a mean damage taken of 77.25, while experienced players had a significantly lower mean of 51.35, indicating that experienced players avoid taking damage more effectively than casual players under this static difficulty.
• Survival Time: For the Emotion DDA technique, significant differences were observed between casual and experienced players. Casual players had a mean survival time of 115 s, while experienced players had a mean of 150 s, indicating that experienced players tended to survive longer under this technique.
• Total Score: Significant variation was found for the Performance technique when comparing casual and experienced players. Casual players had a mean total score of 9180, while experienced players had a significantly higher mean of 12,397, indicating that experienced players generally achieve higher scores under this condition.
• Physiological Sensor Data: Minimal significant differences were found when accounting for expertise, except for the Focus metric with the User Selected technique, where a significant difference was noted. Casual players had a mean focus level of 0.282, while experienced players had a higher mean of 0.334, indicating that experienced players tend to maintain better focus under this condition.
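The Holm sequential Bonferroni adjustment mentioned above can be sketched in a few lines of Python. This is a generic illustration of the step-down procedure, not the analysis code used in the study; the function name `holm_reject`, the significance level of 0.05, and the sample p-values are assumptions made for the example.

```python
def holm_reject(p_values, alpha=0.05):
    """Apply Holm's step-down procedure.

    Returns a list of booleans, in the original order of p_values,
    indicating which null hypotheses are rejected.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is compared to alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # stop at the first failure; all larger p-values are retained
    return reject


# Hypothetical p-values from four comparisons, for illustration only.
decisions = holm_reject([0.01, 0.04, 0.03, 0.005])
```

Because the threshold relaxes at each step, the procedure is never less powerful than a plain Bonferroni correction while still controlling the family-wise type I error rate.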

Post-Questionnaire Results
Understanding and integrating user feedback directly is important when designing systems to ensure the readings collected from the physiological sensors match the expected outcomes. After testing each technique, participants were requested to complete a questionnaire. The purpose of this questionnaire was to collect self-reported data about the participants' emotional states and performance. This additional information was intended to complement the data gathered through physiological sensors, thereby providing a more comprehensive understanding of the participants' emotional states during the trial. Table 6 presents the questionnaire in full and the specific questions asked. Responses were collected on a 7-point Likert scale.
Analysis of participant responses from Figure 8 offers insights into their experiences across various Dynamic Difficulty Adjustment (DDA) techniques. Here is a brief overview based on the specific questions asked:
To further explore player feedback from post-questionnaires, we conducted three distinct statistical analyses: the Friedman Test for identifying differences across multiple groups, the Mann-Whitney U Test for comparisons between two independent groups, and the Wilcoxon Signed-Rank Test for assessing differences between paired samples. The outcomes of these tests are detailed in , highlighting how the different approaches to adjusting game difficulty shape aspects such as self-reported player engagement, fun, enjoyment, and excitement. The Friedman Test was employed to evaluate ordinal responses from the post-questionnaire, assessing differences in participant reactions across various testing conditions (see Table 11). This analysis revealed statistically significant differences in participant responses for several key areas:
• Stress (Q1): There were significant variations in stress levels reported by participants across the different DDA techniques, with a Friedman Test statistic of $\chi^2(4) = 23.135$. This suggests that some techniques are perceived as more stressful than others.
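For readers unfamiliar with the Friedman Test, the sketch below shows how the statistic and its degrees of freedom can be computed from a participants-by-conditions table of ordinal ratings. With five conditions, the degrees of freedom are 4, which matches the form of the statistic reported above. The function names and sample ratings are hypothetical, the tie correction is the standard textbook one, and this is not the analysis code used in the study.

```python
def average_ranks(row):
    """Rank the values in a row from 1..k, giving tied values their average rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with row[order[i]].
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def friedman_statistic(ratings):
    """Return (chi-square statistic, degrees of freedom) for a Friedman Test.

    ratings: one row per participant, one column per condition.
    Assumes at least one row contains values that are not all tied.
    """
    n, k = len(ratings), len(ratings[0])
    rank_sums = [0.0] * k
    tie_term = 0.0
    for row in ratings:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
        for value in set(row):
            t = row.count(value)  # size of each tie group within the row
            tie_term += t**3 - t
    chi2 = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    correction = 1.0 - tie_term / (n * k * (k * k - 1))
    return chi2 / correction, k - 1


# Hypothetical 7-point Likert ratings: two participants, five conditions.
chi2, df = friedman_statistic([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])
```

The resulting statistic is compared against a chi-square distribution with `df` degrees of freedom to obtain a p-value.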
preference for any approach to difficulty adjustments. The absence of significant findings for RQ1 suggests several possibilities. First, it may indicate that player preferences for DDA versus static difficulties are more intricate than initially hypothesized, possibly influenced by factors not captured in our study. Alternatively, the methodologies employed (both physiological measurement and self-reporting) may not have been sufficiently sensitive to detect subtle preferences or effects. This outcome emphasizes the complexity of measuring player engagement and satisfaction, highlighting the need for further research with refined methodologies or alternative metrics of player experience. Similarly, our investigation into RQ2, which asks whether participants have a preference among the three DDA techniques, did not yield significant results from either the physiological sensors or the post-questionnaire analyses. This finding was unexpected, as we hypothesized there would be clear distinctions in player preferences based on the adaptive nature of each DDA approach. The lack of significant findings for RQ2 may reflect that participants do not have a strong preference for one DDA method over another, or that the differences between these techniques are not as perceptible to players as we anticipated. The latter is supported by Q10 of the post-questionnaire, which asks participants whether they feel there is a noticeable difference between the current technique and those previously tested, and for which no significance was found.
Investigating the efficacy of Flow Theory [7] in video games, RQ3 seeks to understand if higher difficulty levels, appropriately matched to player skill, lead to higher levels of stress and engagement. Our analysis, grounded in self-reported stress (Q1) and effort (Q6) from post-questionnaires, revealed significant correlations, particularly with the Emotion DDA technique, which presented the highest challenge. This technique's elevated difficulty, both perceived (reflected in Q5 responses) and measured in actual gameplay metrics, aligns with Flow Theory's principles, suggesting that an optimal challenge level can indeed foster engagement, despite increased stress.
The investigation into our final research question RQ4 revealed discernible differences in player performance metrics attributable to the deployment of diverse DDA techniques. Notably, metrics such as shot accuracy, damage dealt, survival time, and total scores exhibited significant statistical variances, underscoring the distinct effects of each DDA approach. Furthermore, other metrics, including damage taken, cows saved, and UFOs killed, while showing variance to a lesser extent, nonetheless contributed valuable insights into the complex effects of DDA on gameplay.
A pronounced disparity was observed in the difficulty level generated by different DDA techniques, which notably could have influenced many of the game performance metric findings. The Emotion technique, in particular, was characterized by a higher mean difficulty compared to other methods. This may have had a cascading effect that influenced the other game metrics. Players navigating the increased difficulty of the Emotion technique experienced larger amounts of damage taken, aligning with our observations of the technique's elevated challenge. Conversely, the Fixed technique, marked by its lower difficulty, led to reduced damage taken by players, highlighting the inverse relationship between difficulty level and player survivability.
Moreover, the analysis of secondary objectives, such as UFOs killed, further demonstrated the influence of DDA techniques on gameplay. The high difficulty associated with the Emotion technique resulted in fewer UFOs being eliminated, whereas the lower difficulty level of the Fixed technique facilitated greater success in this objective, reinforcing the hypothesized link between difficulty and objective completion. The survival time metrics further confirmed the significant role DDA techniques had in shaping the game experience: the Emotion technique's higher difficulty correlated with shorter player survival times, illustrating how hard it is to provide sufficient challenge while sustaining player engagement and longevity.
Our findings contribute to the ongoing discussion on adaptive game design, suggesting complex player interactions with DDA that merit further exploration. The study's limitations, especially in capturing the breadth of player experiences, set the stage for future research to more accurately tailor DDA techniques for improved player engagement.

Limitations
In our user study, we faced challenges with the devices used to gather physiological data. The EEG headset, used for measuring brain activity, often failed to connect reliably with participants with thick hair, leading to inconsistent signals. To preserve data integrity, we excluded the affected participants' data from our analysis. Additionally, the emotional variance captured by the device was lower than expected, which might be attributed to either the device's sensitivity or the game's inability to provoke strong emotional responses.
Our use of an optical heart rate monitor to gauge stress levels also fell short of expectations. Despite heart rate variability being a proven indicator of stress in prior studies, we did not observe significant differences during our tests. As a result, heart rate data were relegated to post-study analysis rather than being a primary measure of stress and engagement.
Furthermore, the study's participant diversity posed another limitation, especially regarding gender balance. With a participant pool of 27 males and only 4 females, there is a potential bias towards male gaming experiences and responses. This gender disparity limits the generalizability of our findings, as it may not fully account for gender-specific differences in stress response, engagement, and gaming behaviors. Future research should strive for more equitable gender representation to enhance the universality of the findings.

Conclusions
In this study, we explored the efficacy of Dynamic Difficulty Adjustment across a spectrum of player experiences in video games, contrasting their impact on both casual and experienced gamer groups. Despite the anticipation that DDA techniques would distinctly influence player preferences and emotional responses, our findings reveal a more complex interaction. No single DDA technique, including performance-based, emotion-based, or a hybrid approach, demonstrated clear superiority in enhancing the gaming experience or engaging players more deeply than static difficulty settings.
Our investigation into whether participants exhibited a preference for DDA over static difficulties or among the DDA techniques themselves did not yield significant preferences. This outcome suggests that player engagement with DDA is influenced by a broader range of factors than those captured within the scope of our study. Interestingly, the data did point to a potential alignment with Flow Theory, where the challenge provided by the Emotion DDA technique correlated with increased stress and effort, hinting at a sophisticated balance that might foster deeper engagement under certain conditions.
The modest effect of DDA on player performance metrics contrasted with its limited impact on emotional and physiological responses, illuminating the challenges in crafting adaptive difficulty settings that engage players both mechanically and emotionally. This distinction emphasizes the importance of further investigation into how DDA can be refined to suit a variety of player experiences, aiming to enhance gameplay in a way that complements both the emotional depth and physiological aspects of gaming.
Looking forward, improving our approach to emotion recognition becomes a key priority. The challenges faced with existing technologies, like the EEG device, highlight the need for more responsive and precise instruments to measure player emotions in real time. Progress in this domain is essential for developing DDA strategies that are genuinely adaptive, potentially reshaping game design to provide experiences finely tuned to the player's emotional and cognitive states.
Our research adds to the evolving discussion on Dynamic Difficulty Adjustment, highlighting the complex interplay between game difficulty, player engagement, and emotional response. As we further examine and improve DDA techniques, our goal is to discover strategies that significantly improve player experiences, thereby making games more inclusive and enjoyable for all.
In the future, we would like to further investigate ways of enhancing the Emotion-based DDA technique. This involves increasing the precision with which we assess players' emotional states and refining our methods for adjusting difficulty based on these measurements. Additionally, there is potential for incorporating machine learning algorithms to fine-tune game difficulties, specifically tailored to accommodate a wider range of player abilities, from complete novices to highly experienced gamers. Moreover, applying DDA techniques to a broader range of game genres beyond the first-person shooter studied here could provide insights into the versatility and effectiveness of these strategies across different gaming contexts.

Figure 1. Flow state is achieved by presenting an appropriate challenge for the player's skill level. If the game is too difficult, the player is more stressed and has higher anxiety. If the game is too easy, the player is less engaged and becomes bored.

Figure 2. Screenshots of the Cattle Catchers game. (a) A UFO abducting a cow in the game. (b) Player shooting the enemies and dodging the enemy projectiles.

Figure 3. Chart of emotion-based difficulty adjustment. Stress and engagement scores are merged to form a unified emotion metric, which is categorized into seven distinct levels, corresponding to suggested difficulty levels (1-7) for the upcoming timestep. Each colored square represents one of the seven difficulty levels suggested for the next timestep.

Figure 6. Mean difficulty per technique and expertise.

Figure 7. Mean values of game data and physiological sensor readings across different techniques. Techniques are color-coded as follows: Combined (blue), Fixed (yellow), Emotion (green), Performance (red), and User Selected (purple).

Table 1. Difficulty adjustments for player character. Note: the reload time is measured in seconds, and the ADS speed and aim assist are measured in meters.

Table 2. Difficulty adjustments for alien enemies. Note: the fire rate, move delay, and spawn rate are measured in seconds; the aiming error is measured in meters.

Table 3. Difficulty adjustments for UFO enemies. Note: the spawn rate is in seconds.

Table 4. Performance-based metrics and weight values. Note: last hit and survival time are in seconds.

Table 5. Suggested difficulty for the next timestep for both performance and emotion dynamic difficulty techniques.

Table 6. Post-technique questionnaire. Participants answered questions regarding their current affective states during the playthrough. Answers are in the form of a 7-point Likert scale.

Table 7. MANOVA revealing effects of technique type and gaming expertise on participant performance.