Beyond discrete-choice options

For example, function learning [13,125], where people learn the functional relationship between continuous cues and target variables so that the target can be predicted from the cues, or temporal learning [126,127], where people learn to decide when to act and to correctly time their responses, are also interesting areas in terms of continuous-option spaces and continuous-option space models, though discussing these in detail is beyond the scope of the current article.

Highlights
People constantly need to make decisions with a continuum of potential alternatives during their daily life, making continuous-option spaces a fundamental aspect of understanding human decision-making, learning, and other related cognitive processes.
Recently, theoreticians have begun formalizing how people make decisions that have continuous-option spaces and resulting theories are beginning to have a significant impact on our initial understanding of how people make such decisions.
Deeper insights into decisions with a continuous-option space can both improve our theoretical understanding and models of cognitive processes, and lead to novel applications of the principles of cognitive psychology to various domains, such as clinical science, sports, and driving.

Although continuous-option spaces intuitively seem to provide more information about the latent decision-making process than discrete decisions (see Box 2 for a discussion of these benefits), discrete decision paradigms have been dominant in most areas of psychological research. Specifically, while continuous-option spaces have been a core paradigm in a few areas of cognitive science research, such as visual working memory [6–9], numerical cognition [10–12], and function learning [13–16], their usage has been limited in other areas of cognitive science research, particularly in areas that have historically focused on the time at which people make responses (i.e., response time). There are several practical reasons for the historically narrow focus of decision-making research, such as the ease of defining, controlling, and analyzing experiments where decisions are discrete, and the difficulty in generalizing discrete models to continuous-option spaces. However, the dominance of discrete decisions in cognitive science has resulted in many of the dominant formal theories of choice processes, such as sequential sampling models [17–20] and signal detection theory [21,22], being overly focused on explaining decisions about two, or at most a small number of, discrete options (though for some initial continuous extensions see [1,2,23,24]), limiting our ability to gain an in-depth understanding of the processes that underlie continuous-option space decisions. Therefore, we see developing models (specifically, within the theoretically dominant sequential sampling framework) for a range of different types of decisions with continuous-option spaces as a fundamental puzzle for both improving our understanding of human decision-making and providing model-based inference for continuous tasks in applied settings.
The current article aims to review the previous work on modeling continuous-option space decisions, discuss the benefits that further theoretical work on continuous-option spaces might provide, and outline general future directions for continuous-option space research. Specifically, our key focus is on the extension of decision-making theories based on a sequential sampling framework to continuous-option spaces, which we argue has currently produced the most theoretical progress in explaining the joint continuous choice and response time distributions. We discuss the various models that have been proposed in the literature, some of the broad similarities and differences between these models, as well as the theoretical and practical challenges of these models. We also present an overview of clinical applications that could benefit greatly from continuous-option decision tasks and theories, both theoretically and practically.

Figure caption (fragment): ... continuous-option space in real life. From the top left, the decision-maker faces a pricing task (setting a price for a car as a buyer), a time allocation task (allocating a specific amount of time for a video game), an investment task (deciding how much to invest in each stock and making a portfolio), a money-saving task (deciding which portion of salary should be saved each month), and a golf task (deciding how hard to hit the ball as the final shot).

Box 1. What aspects of decision-making can be 'continuous'?
Decisions can be continuous on many different dimensions. For instance, even decisions with a discrete number of options take time to make and execute, and how people's preference evolves over the course of a decision (or even after a decision) can either be tracked in a continuous manner ('continuous time measurement'), reduced to a single measurement ('single time measurement'), or fall somewhere in-between on this spectrum. We summarize this 'time' dimension, ranging from discrete to continuous, on the x-axis of Figure I. By contrast, the continuous-option dimension of decisions corresponds to the y-axis of Figure I, where decisions can either be dichotomous (e.g., key presses [53]), continuous (e.g., selecting an option among a continuum by clicking [2]), or fall somewhere in-between on this spectrum (e.g., visually fixating [32] on the desired option, which can produce a very large, though discrete, spectrum of responding). Importantly, these two dimensions can also interact; for instance, fine-grained movement paradigms, such as mouse, eye, finger, or arm tracking [2,48,93,94], are able to provide continuous information along both of these dimensions. However, it should be noted that there is some debate over how well mouse, finger, arm, and eye tracking map onto the evolution of preference in the decision-making process, with some suggesting that mouse and finger movements occur more slowly than evidence accumulates [2,18]. In these cases, we are not just capturing the response at the end of the trial (e.g., clicking or eye gaze on an option, or pressing a key corresponding to an option), but can also capture some activity during the trial. This kind of information can be useful in distinguishing between theories of how decision-makers accumulate evidence over time [95,96], make (and revise) their decisions (e.g., double responding [94]), and stop actions [97,98].
Both continuous responding and continuous measurement are undoubtedly interesting avenues for future decision-making research, helping to advance the field in different ways. However, in order to provide a more comprehensive discussion of our primary focus (continuous-option decisions), here we focus on the y-axis of Figure I, in situations where decisions are most often singular with respect to time. Furthermore, it should be noted that our focus is on continuous-option spaces rather than continuous spaces more generally, meaning that we will not cover areas such as motor control research using continuous action spaces without an explicit decision [99].
Figure I (caption fragment): ... [32], mouse-clicking [2], finger movement on touch screen [2], key pressing [53], repeated key pressing [94], eye tracking [48,55], mouse tracking [93,100]).

Box 2. What are the benefits of continuous-option spaces?
Importantly, beyond being appropriate for many real-world decisions, one of the key potential benefits of continuous response paradigms, both in theoretical and applied settings, is the additional information and constraint that they can provide. Specifically, while the amount of information about relative preference that can be obtained from discrete choices always contains an innate limitation based on the number of options available (note that we only consider the response choice here, and not additional variables that can provide more information, such as response times), continuous-option spaces provide a theoretically unlimited (though practically limited by the precision of the decision-maker) amount of information about relative preference. For instance, when taking an information-theory perspective on decision-making [82], discrete choices only provide n − 1 bits of information, where n is the number of available options (e.g., a choice between four options provides three bits of information; choosing A among four alternatives A, B, C, and D suggests that the decision-maker prefers A over B, A over C, and A over D), whereas a continuous-option space provides infinite information.
Moreover, the additional information provided by continuous-option spaces has already proven to be more than a mere theoretical concept. Previous research using continuous-option decisions has found patterns in how responses are distributed across the option space in different tasks. For instance, tasks that involve specifying a price for a gamble (as a buyer/seller) often show a strongly right-skewed response distribution [82,101], suggesting that either the seller or buyer under-evaluates gambles with higher expected values. For the number line task, which is frequently used in numerical cognition, an M-shaped pattern of errors as a function of the stimulus is observed in many studies [11], showing that people respond more accurately near anchor points on the number line (i.e., the ends at 0 and 100, and the halfway point at 50). Also, in visual working memory and perceptual tasks, response distributions are sometimes multimodal or heavy-tailed [1,2,28,34,102], which is the result of different types of errors (e.g., random guessing or misperception of the stimuli). Crucially, the additional information contained in continuous-option decisions can potentially provide a level of constraint on models beyond what is possible with discrete choice decisions, both in terms of theoretical tests, where candidate models need to also successfully predict the shape of the choice distribution, and in more applied contexts, where model parameters can potentially be estimated more efficiently (i.e., based on fewer decisions).

Glossary
Brownian motion: a stochastic process that assumes Gaussian noise and is utilized in sequential sampling models to explain the evidence accumulation process. In the decision-making literature, the terms 'Brownian motion' and 'Wiener process' are often used interchangeably.
Continuous-option space: a decision space where all possible values within a continuum can be selected (e.g., direction of motion). In practice, many of the nominally 'continuous' spaces we discuss are very fine-grained discrete approximations of continuous spaces (e.g., when judgments are recorded using floating-point numbers). These tend not to be meaningfully different from continuous-option spaces for our purposes and for simplicity we refer to them as continuous.
Decision-making: the process of selecting an option among multiple alternatives. Decisions can vary in their content matter, such as being value-based (e.g., chocolate or vanilla ice cream based on subjective evaluation), cognitive (e.g., remembering whether or not a target stimulus was previously presented), or perceptual (e.g., assessing the predominant direction in which a cloud of dots is moving). Decisions can also vary in their context, such as the relevant information being directly given and/or described to participants, or participants having to learn the relevant information through experience.
Discrete decisions: the process of selecting an option from a finite set of alternatives that is not a fine-grained approximation of a continuous range (e.g., is motion to the left or right?).
Drift rate: the average rate of evidence accumulation in a sequential sampling model.
Identifiability: parameter identifiability shows whether a cognitive model's parameters can be accurately estimated by fitting the model to data. Identifiability is a property of both the model and the design used to collect the data (e.g., number of test trials, conditions, participants, etc.). Identifiability is assessed through determining how precisely parameter values can be 'recovered' by fitting the model to simulated data generated from the model with known parameters in a particular design.
Measurement: in the decision-making literature, the goal of measurement is most often to record observable information that we believe can provide insight into the underlying decision-making process, such as behavior and/or neural activities that occur before, during, and/or after a specific decision is made.
Reinforcement learning: a process where people learn a suitable decision strategy based on the outcome feedback (i.e., rewards and punishments) that they observe and/or experience.
Sequential sampling models: also known as 'evidence accumulation models'. A family of decision-making models that explain response time and choice data by accumulating evidence over time until the evidence total reaches a predefined threshold (e.g., the diffusion decision model).
Starting point: the initial point of the accumulation process relative to thresholds, which reflects the decision-maker's bias towards an option.
Threshold: a component of the sequential sampling models, also known as the 'criterion', 'decision boundary', or 'boundary separation', which determines when the accumulation process terminates.
Within-trial variability: unsystematic moment-to-moment fluctuations in evidence during the course of a decision, for Brownian noise quantified by the 'diffusion coefficient' (i.e., the standard deviation of Gaussian noise).

Trends in Cognitive Sciences

Continuous sequential sampling theories
What theoretical models currently exist?
Recent proposals modeling continuous-option decisions [1,2,23,25] in domains such as psychophysics and visual working memory [6–8] are extensions of existing sequential sampling frameworks used for decisions with a dichotomous or discrete option space. In the discrete case, absolute accumulators representing each of the different options race (based on the drift rate) from the starting point towards the thresholds to trigger a decision or, in the case of the dichotomous diffusion decision model (DDM) [26,27], a single relative accumulator moves between two boundaries that reflect the two options (Figure 2A,B). Here, we discuss several theoretical extensions, which have been designed to improve our understanding of the cognitive constructs underlying continuous-option tasks (see Figure 2C for an example task): the circular diffusion model (CDM) [1,28,29], the spatially continuous diffusion model (SCDM) [2,25,30], the radial basis leaky competing accumulator (RBLCA) model [31], and the multiple anchored accumulation theory (MAAT) [23]. Generally, these models extend the accumulation process to a continuous-option (typically, 2D) space, and when the accumulator hits one of the available options in the continuum, the decision is triggered; however, each of these models differs in the specifics of how the decision and/or response process operates.
The CDM [1] extends the 1D DDM for dichotomous decisions to a 2D Brownian motion process fluctuating within a disk (and in the general case, n-dimensional Brownian motion within a hypersphere [29]). Reaching the boundary of the disk results in a decision for the continuous response associated with that point on the disk's boundary (Figure 2D). In principle, the process can start from any point within the disk, and the initial point represents the bias towards some responses (e.g., the Euclidean distance between the starting point and options can be a measure of the bias). While the CDM provides an intuitive extension of the DDM to tasks with a circular response range (e.g., a color wheel), the model appears to be less applicable to tasks with a non-circular continuum of responses (e.g., a number line), as the model would suggest that responses at the opposite ends of the response spectrum are actually next to one another (though for a semi-circle extension that addresses this issue, see [1]).
The SCDM is another extension of the DDM. It assumes that a stimulus representation produces a distribution of evidence on a line or a plane [2]. The evidence distribution is then accumulated over time, in combination with Gaussian process noise (or Gaussian random field noise in the 2D case), until a decision criterion is reached. Importantly, the Gaussian process noise creates the property that at any point in space the noise has a Gaussian distribution and the noise at nearby points is correlated. Thus, at each time step, the decision-maker samples continuous evidence from the entire decision space, and accumulates the continuous evidence until the evidence for an option within the option space reaches the decision threshold (Figure 2E). From a neuroscientific point of view, this model can also be considered an extension of population code models, allowing them to account for continuous choices and response time [32]. Moreover, this model can deal with responses in 2D planes, consistent with population code models (i.e., response areas in the motor cortex and movement-related oculomotor areas have maps representing 2D space; see [2,25] for further discussion).
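As a rough illustration of the circular diffusion idea described for the CDM, the following minimal sketch simulates a 2D Brownian motion with drift inside a disk and reads out the angle at which the boundary is first crossed. It is not the implementation used in the cited work: the function name, parameter names (`drift`, `threshold`, `sigma`, `dt`, `max_time`), parameter values, and the simple Euler discretization are all assumptions made for illustration, and the discretization slightly overshoots the boundary.

```python
import numpy as np

def simulate_cdm_trial(drift, threshold=1.0, sigma=1.0, dt=0.001,
                       max_time=10.0, rng=None):
    """Simulate one trial of a 2D diffusion inside a disk (Euler scheme).

    Returns (response_angle_in_radians, decision_time).
    All names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    drift = np.asarray(drift, dtype=float)
    pos = np.zeros(2)   # unbiased start at the centre of the disk
    t = 0.0
    while t < max_time:
        # Brownian increment: Gaussian noise scaled by sqrt(dt)
        noise = rng.normal(0.0, sigma * np.sqrt(dt), size=2)
        pos = pos + drift * dt + noise
        t += dt
        if np.hypot(pos[0], pos[1]) >= threshold:
            # The response is the angle of the boundary crossing point
            return np.arctan2(pos[1], pos[0]), t
    return np.nan, np.nan  # no crossing within max_time

rng = np.random.default_rng(1)
# Drift vector pointing towards 45 degrees with magnitude 2
drift = 2.0 * np.array([np.cos(np.pi / 4), np.sin(np.pi / 4)])
trials = np.array([simulate_cdm_trial(drift, rng=rng) for _ in range(200)])
angles, rts = trials[:, 0], trials[:, 1]
# Circular mean of the simulated responses
mean_dir = np.arctan2(np.nanmean(np.sin(angles)), np.nanmean(np.cos(angles)))
```

In this sketch the direction of the drift vector plays the role of the stimulus identity and its length the role of evidence quality, so the simulated responses scatter around 45° and a larger `threshold` trades longer decision times for a tighter response distribution.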
In contrast to the CDM and SCDM, which assume a continuous representation of the option space, the RBLCA proposes a fine-grained discrete representation of the continuous-option space [31]. Specifically, the RBLCA approximates the continuous-option space by breaking it into discrete points and assigns an accumulator to each of these discrete points. During a noisy (Brownian motion) accumulation process, the accumulators can inhibit or facilitate each other based on their spatial locations (i.e., nearby accumulators facilitate each other). Since the RBLCA approximates the continuous-option space with fine-grained discrete points, it is able to accommodate a wide range of response continuums, such as lines, semicircles, circles, and even standard dichotomous tasks, creating a direct link between dichotomous and continuous responding.
Similar to the CDM and SCDM, MAAT [23] assumes a continuous representation of the option space. However, MAAT also assumes that people represent continuous-option spaces with multiple anchors corresponding to the dimensionality of the underlying representation of the continuous quantity. Specifically, MAAT uses n accumulation processes with a scalar threshold being defined as a function of all of their total absolute evidence (e.g., the processes stop when the sum of their squares becomes greater than the scalar threshold). This framework was implemented with deterministic evidence accumulation, as in the linear ballistic accumulator model [33], but could also be adapted to include within-trial variability. Importantly, MAAT not only theoretically creates a direct link between dichotomous and continuous responding, but has also successfully been able to simultaneously explain both discrete and continuous decisions in a unified framework [23], providing clear evidence for a common underlying process between discrete and continuous choice tasks.
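The sum-of-squares stopping rule mentioned above can be made concrete with a small sketch. It assumes the simplest deterministic case, in which each accumulator grows linearly as x_i(t) = v_i t from zero, so that the stopping time has a closed form; the function name, the two-accumulator example, and the angular readout of the final state are hypothetical choices for illustration rather than the published specification of MAAT.

```python
import numpy as np

def ballistic_sum_of_squares(drifts, threshold):
    """Deterministic accumulators x_i(t) = v_i * t, all starting at zero.

    Accumulation stops at the first time t where sum_i x_i(t)^2 >= threshold,
    which gives t = sqrt(threshold) / ||v||.
    """
    drifts = np.asarray(drifts, dtype=float)
    decision_time = np.sqrt(threshold) / np.linalg.norm(drifts)
    state = drifts * decision_time   # accumulator values at stopping
    return decision_time, state

# Two accumulators (an assumed 2D representation) and a threshold of 4
t_stop, state = ballistic_sum_of_squares([0.6, 0.8], threshold=4.0)
# Hypothetical readout: the response is the direction of the final state
response_angle = np.degrees(np.arctan2(state[1], state[0]))
```

With these assumed values the drift vector has unit length, so the process stops at about t = 2 with a final state of roughly (1.2, 1.6). Adding across-trial variability in the drifts, as in ballistic accumulator models, would turn this single point into a joint distribution over responses and decision times.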
What are the similarities and differences between these models?
Although these continuous-option space models are all extensions of the discrete sequential sampling framework, they have several key differences in the processes that they propose underlie continuous responding tasks, which can result in different predictions about empirical data. For example, one difference between these models is whether they can naturally predict multimodal response distributions, where responses cluster in multiple, separate areas of the continuum [2]. As the RBLCA model consists of multiple discrete accumulators that each correspond to a discrete point in the continuum, it will naturally generate multimodal response distributions with peaks at each of the discrete points. Similarly, the SCDM can also produce multimodal response distributions based on its integration of the evidence distribution across the entire option space. However, since the CDM utilizes a single accumulator, predicting multimodal response distributions is only possible with an additional mechanism (e.g., adding across-trial variability in drift rate or increasing the dimensionality of the option space [34]), making multimodal response distributions a more challenging empirical trend to explain for the CDM.
Another important theoretical difference between the models is how different response options interact and inhibit one another. In the SCDM, the overall amount of evidence accumulated at any point in time during the accumulation process is constant (i.e., evidence at each time step is normalized), meaning that evidence for one option is evidence against the other options (i.e., feed-forward inhibition). The spatial locations of options are crucial, as the evidence and noise of nearby points are correlated, meaning that nearby points facilitate each other. However, while the CDM and RBLCA both include inhibition, they provide different proposals for how the inhibition is implemented, which do not map onto the idea that evidence for one option is evidence against the other options. For instance, while the CDM implements what could be thought of as a circular representation of feed-forward inhibition, where evidence for options at one end of the circle is evidence against options at the other end of the circle (e.g., evidence for the option located at 45° is evidence against the option located at 225°), the inhibition does not apply to options that are spatially close (e.g., 45° and 50°). Therefore, the spatial location of the options on the circle in the CDM will directly determine the inhibition between them. In contrast to the SCDM and CDM, the RBLCA contains a mutual (also referred to as lateral) inhibition mechanism, meaning that evidence for one option at a specific point in time has no direct influence on the evidence for the other options (i.e., no version of feed-forward inhibition). However, the mutual inhibition mechanism means that the accumulated absolute evidence for a specific discretized option impacts the current accumulation of other options, facilitating accumulators for spatially close options and inhibiting accumulators for spatially distant options. Importantly, these different inhibition mechanisms could potentially have an impact on the patterns and shapes of response distributions that the different models can predict; however, further work comparing the models is required to determine which exact trends in the response distributions can distinguish their predictions (see [35] for a comparison between different types of inhibition mechanisms in discrete choice tasks).
Interestingly, continuous sequential sampling models can also explain the decision process in discrete choice tasks. Generally, there is a distinction between the stopping rule and the decision rule in sequential sampling models, where the stopping rule refers to the process that causes the decision-maker to stop accumulating information, whereas the decision rule refers to the process that determines which decision is made [36]. These rules are often determined by the same process; for example, in the DDM, crossing the threshold for a specific alternative both results in a termination of the decision process (i.e., the stopping rule) and a response in favor of that alternative (i.e., the decision rule). However, these rules can also be separate, particularly in the case of continuous-option space models, where we theorize that discrete choice tasks may have a continuous underlying representation of the option space. In these cases, the decision rule also requires mapping the underlying continuous decision choice to one of the possible discrete response options, separating it from the stopping rule. For example, both the geometric similarity representation [37] (a framework developed to provide some underlying commonality for existing models of continuous-option spaces) and MAAT have been shown to be equally applicable to both discrete decisions and continuous-option decisions using the same underlying representation with a different decision rule [38], while also providing additional information about the latent space of the accumulation process, such as similarity relations between options [39]. However, it is important to note that other continuous space models can, in principle, use the same strategy (i.e., partitioning the option space) to produce discrete as well as continuous responses.
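The partitioning strategy described above, where a continuous outcome is mapped onto one of a few discrete responses by a separate decision rule, can be sketched as follows. The nearest-option rule on a circle, the function name, and the example angles are assumptions chosen for illustration; the cited models may partition the option space differently.

```python
import numpy as np

def discretize_response(angle_deg, option_centers_deg):
    """Map a continuous angular response to the index of the nearest option.

    Uses circular distance, so 350 degrees is treated as close to 0 degrees.
    """
    centers = np.asarray(option_centers_deg, dtype=float)
    # Wrapped absolute difference in degrees, always within [0, 180]
    diff = np.abs((angle_deg - centers + 180.0) % 360.0 - 180.0)
    return int(np.argmin(diff))

# Two discrete options on the circle, e.g., 'right' (0) and 'left' (180)
options = [0.0, 180.0]
choices = [discretize_response(a, options) for a in (10.0, 95.0, 350.0, 200.0)]
```

Here the stopping rule (whatever process produced the continuous angle) is untouched, and only the decision rule changes when the task moves from a continuous report to a two-alternative choice.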
What theoretical and practical challenges do these models face?
One important distinction to consider between different models of continuous-option decisions is what purpose they serve: to provide precise theories of how continuous-option space decisions are made, to act as useful tools for researchers who wish to answer applied questions in continuous response tasks, or perhaps both [40]. Specifically, all of the continuous-option space models discussed earlier intend to provide a precise theory of how people make decisions in these tasks, in order to increase our understanding of the decision-making process. Unfortunately, this desire for theoretical precision also comes with a major drawback: each of these modeling frameworks is mathematically complex, meaning that working with them can be practically difficult for researchers with limited expertise in advanced computational modeling techniques (see Box 3 for a discussion of these practical difficulties). However, when the goal of the model is only to serve as a measurement tool for the latent cognitive constructs in the model (e.g., estimating how good people are at a specific continuous-option decision task), then the functional form of the models can be greatly simplified, which can also improve the identifiability of their parameter estimates [41]. Importantly, several models of continuous-option decision-making have been developed that can also serve as useful measurement tools, as they pose much less of a computational burden than other models. Specifically, these models either provide a simplification of the circular diffusion framework that allows the model to be analytically tractable at the expense of some theoretical precision [42], or generalize the accumulator framework from discrete decisions into a framework that is capable of accounting for continuous decisions [23].

Box 3. What are the practical challenges to using continuous-option space models?
More complex modeling frameworks for continuous-option decision-making can be mathematically intractable, making working with them practically tricky. For instance, while the CDM has an analytical formulation for the likelihood function when the option space is a complete circle, changing the option space to a semi-circle, which is suitable for modeling decisions within an interval, requires simulation-based approximations. Moreover, the proposed analytical likelihood function for the CDM is unstable in some parts of the parameter space, resulting in artificial spikes in the approximated probability density function, particularly when attempting to model short response times (though see [103]). Other continuous-option space models provide even greater challenges, with the SCDM [2,30] requiring approximation methods as it lacks a tractable likelihood function. Importantly, these computational difficulties can impede theoretical and applied progress for continuous-option decisions, contributing to researchers continuing to prefer to use experiments with a limited number of options.
Recent methodological developments provide clear starting points for developing more powerful model-fitting frameworks. For example, the issue of intractable likelihood functions can be solved in some cases through the use of partial differential equations [104,105], or more generally can be eased through probability density approximation [106,107] combined with efficient simulation methods [108]. In terms of parameter estimation, each of these previous methods of obtaining the likelihood can be combined with Markov chain Monte Carlo (MCMC) approaches common in cognitive modeling [109,110], or recently developed deep learning approaches [111,112], to provide Bayesian parameter estimation. These methods can then be combined with methods of marginal likelihood estimation [113] to produce efficient, fully Bayesian inference in complex models without likelihoods [114,115], meaning that a general framework could be created that is applicable to any model of continuous-option decisions that can be simulated. Ideally, these methods and models could be implemented in a user-friendly Matlab, R, and/or Python package for fitting continuous-option decision-making models [116], similar to those that have been developed for two-choice decisions [112,117–122].
Interestingly, a recently proposed deep neural network method for parameter estimation of continuous-option models was able to estimate most parameters with reasonable fidelity based on only 20 trials [82]. Considering the low number of trials, the approach performed remarkably well at estimating the threshold and bias strength parameters. However, it should be noted that there is greater variability in estimating non-decision time (i.e., the time taken for encoding and motor execution), which might be resolved by increasing the number of trials or potentially using a different loss function for training [123].
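The probability density approximation mentioned in Box 3 can be illustrated with a deliberately simple sketch: simulate many responses from a model at candidate parameter values, smooth them with a kernel density estimate, and evaluate the observed data under that estimate. Everything below is an assumption for illustration: `simulate_responses` is a hypothetical stand-in (a Gaussian generator) where a real application would simulate a continuous-option model, only the response dimension is modeled rather than the joint response and response time distribution, and the bandwidth uses a common normal-reference rule of thumb.

```python
import numpy as np

def simulate_responses(mu, sd, n, rng):
    """Hypothetical stand-in simulator (Gaussian responses).

    A real application would simulate a continuous-option model here.
    """
    return rng.normal(mu, sd, size=n)

def pda_log_likelihood(data, simulated):
    """Approximate log-likelihood of `data` from simulated samples.

    Builds a Gaussian kernel density estimate of the simulated samples
    and sums the log density at each observed data point.
    """
    data = np.asarray(data, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    n = simulated.size
    # Normal-reference rule-of-thumb bandwidth
    h = 1.06 * simulated.std(ddof=1) * n ** (-1 / 5)
    z = (data[:, None] - simulated[None, :]) / h
    dens = np.exp(-0.5 * z ** 2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))
    # Guard against log(0) for data far from every simulated sample
    return float(np.sum(np.log(np.maximum(dens, 1e-300))))

rng = np.random.default_rng(0)
data = rng.normal(0.5, 1.0, size=200)   # pretend these are observed responses
ll_good = pda_log_likelihood(data, simulate_responses(0.5, 1.0, 5000, rng))
ll_bad = pda_log_likelihood(data, simulate_responses(3.0, 1.0, 5000, rng))
```

Because the approximate log-likelihood is higher for parameter values that generate data resembling the observations, it can in principle be passed to an optimizer or an MCMC sampler in place of an analytic likelihood, at the cost of simulation noise and extra computation.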
Furthermore, we believe that there are many interesting theoretical questions that continuous-option space paradigms and models could help to answer in different areas of decision-making research. One example is optimality research, which focuses on the idea that people solve problems or achieve their objectives in an approximately optimal manner. This approach has historically been adopted by various disciplines, including neuroscience, psychology, economics, and computer science [43–46]. Although there are several branches of optimality research that vary from one another in contexts and timescales (though for common theoretical frameworks see [47–49]), most branches have tended to focus on decisions in a discrete situation [50–53], rather than decisions in continuous-option spaces (though see [9,16,32,54]). One reason for the focus of optimality research on discrete decisions is that many optimal policies are much more clearly defined under discrete situations. For instance, in the case of reward rate optimality, where the goal is often to obtain as many correct responses as possible in the shortest amount of time possible (i.e., maximizing the ratio of correct responses to time), defining what it means to make a 'correct' response is straightforward when there are a small number of discrete response options. However, in continuous-option spaces, the situation is less a case of whether or not the choice was correct and more a case of how correct the choice was, based on the distance from the chosen response to the perfect response, the similarity between the chosen and perfect response, or some other metric (i.e., the reward function). Importantly, this graded sense of what it means for a response to be 'correct' creates much greater complexity in defining what it means to be optimal, as specific metrics used for calculating the level of correctness can produce different optimal strategies and participants themselves may have a different internal formula for calculating how correct they believe that they were.
Fortunately, we believe that it is possible to extend existing theories used within discrete optimality research to continuous optimality research. In the case of reward rate optimality, researchers typically define the decision as being a sequential sampling process, with the optimal strategy being the level of caution (i.e., the decision threshold) that maximizes the person's achieved reward rate, given the person's task ability (i.e., drift rate) and other factors (e.g., [50,53]). In the case of continuous-option spaces, the discrete speed-accuracy trade-off becomes a continuous speed-variance trade-off, where participants must balance the competing goals of speed and maintaining not only high accuracy, but also low variability (i.e., high precision) in their responses. After defining a continuous reward function, which determines what reward a participant receives based on the correctness of their response, the expected reward for a specific decision threshold can be defined as the integral of the predicted response distribution for that threshold multiplied by the reward function, meaning that the optimal strategy can be defined as the threshold that maximizes the ratio of this integral to time.
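To make the reward-rate definition above concrete, the sketch below computes the expected reward as a numerical integral of a predicted response distribution multiplied by a reward function, divides it by the mean time per decision, and searches a grid of thresholds for the maximum. The model inside is a toy and entirely assumed: response errors are taken to be Gaussian with a standard deviation that shrinks as the threshold grows, mean decision time is taken to grow linearly with the threshold, and the reward function is a Gaussian bump around the correct response. None of these forms, names, or values comes from the models reviewed here.

```python
import numpy as np

def reward_rate(threshold, drift=1.0, noise=1.0, t0=0.3, reward_width=0.5):
    """Toy reward rate for a continuous response task (assumed forms).

    Expected reward = integral of response density times reward function;
    reward rate = expected reward / mean time per decision.
    """
    sd = noise / (drift * threshold)       # assumed: precision improves with caution
    mean_time = threshold / drift + t0     # assumed: time grows with caution
    x = np.linspace(-np.pi, np.pi, 2001)   # response error relative to the target
    dx = x[1] - x[0]
    density = np.exp(-0.5 * (x / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    reward = np.exp(-0.5 * (x / reward_width) ** 2)   # graded 'correctness'
    expected_reward = np.sum(density * reward) * dx   # Riemann-sum integral
    return expected_reward / mean_time

thresholds = np.linspace(0.2, 5.0, 49)
rates = np.array([reward_rate(a) for a in thresholds])
best_threshold = thresholds[np.argmax(rates)]
```

Under these assumptions the reward rate peaks at an intermediate threshold: very low thresholds give fast but imprecise responses, and very high thresholds give precise responses that take too long. Changing `reward_width` shifts the optimum, which mirrors the point that different reward functions can imply different optimal strategies.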
Finally, we believe that future research should also focus on extending existing continuous-option space models to other areas and sources of data. Discrete choice sequential sampling models have a rich history of integrating with theories from other research areas and providing joint accounts of multiple sources of data, such as attention-based diffusion models that can also explain data such as eye movements [48,55,56], neuro-cognitive models that attempt to directly explain both behavioral data and neural activity [57][58][59][60][61][62][63], or reinforcement learning sequential sampling models that explain the interaction between learning and decision processes ([49,64]; see Box 4 for a discussion). We believe that similar developments with continuous-option space models would be valuable, particularly to understand whether continuous-option space sequential sampling models can account for these data in the same way as their discrete counterparts.

Clinical applications for decisions in continuous-option spaces
Clinical psychopathology research has long taken advantage of advances in cognitive science to better understand individual cognitive factors that contribute to the development and maintenance of clinical conditions [65,66]. Much of this has involved performance on binary choice tasks (see [66,67] for review). For example, on a wide range of dichotomous decision tasks, information processing biases and reduced rate of evidence accumulation have been documented for people with attention deficit hyperactivity disorder (ADHD) [66,68], autism [69], dyslexia [70], anxiety [71,72], schizophrenia [67,73,74], and other conditions. Such findings take the important first step of identifying cognitive differences that might inform taxonomic models of clinical conditions [75]. However, as in other areas, binary choice tasks do not comprehensively characterize human behavior because performance in such tasks can only be measured by binary accuracy. For instance, misperceptions of random motion direction may be more common in those with developmental conditions such as autism and dyslexia [76], which can only fully be explored with continuous-option tasks, where recent investigations have suggested that direction is represented as a bimodal probability distribution [77] (for a discussion on continuous motion processing, see [78]). Moreover, autistic people report difficulties in everyday decision-making tasks [79], but often perform well when making decisions in binary choice paradigms (e.g., deciding whether a cloud of dots is moving up or down [80]). It is possible that performance in continuous-option spaces may better approximate the real-world decision-making difficulties that autistic people face. Thus, as with other areas of decision-based research, modeling more complex and ecologically valid contexts in clinical science is now needed. For example, both fine and gross motor coordination difficulties, as well as atypical social skills and peer relationships, are commonly observed in neurodevelopmental conditions [81]. Paradigms and models of cognition that can be used to explore dynamic motor adjustments, the give and take of social interactions, as well as the integration of action and decision, are critical to explaining both individual and developmental differences in the execution of such complex behaviors.

Box 4. How can learning impact decision policy?
The interplay between learning and decision making is dynamic, where decisions prompt actions and changes in the environment. This, in turn, leads to feedback, which updates our understanding of the environment. Subsequently, our future decisions are influenced by this updated knowledge. Recent empirical studies have highlighted the importance of this interaction, demonstrating that individuals can enhance their speed and accuracy in reinforcement learning tasks, and proposed a unified computational framework called reinforcement learning sequential sampling modeling (RLSSM) [49,64,124]. RLSSM assumes that the decision-maker updates the decision policy (i.e., updates the parameters of the decision process like drift rate) followed by updating the current knowledge/belief about the decision options. The continuous-armed bandit task (Figure ID; the continuous version of the experiment in Figure IA), in which the decision-maker should learn the true association within a continuum of options, is an example of learning a continuous action. In this experiment, the angular distance between the correct and actual responses can modulate the reward obtained. For instance, the cosine of the angular distance between the correct and actual responses can be utilized as the reward (i.e., feedback). Also, the key observed pattern should be similar to that of the discrete task (Figure IE), which can be predicted by drift modulation in the continuous model (Figure IF).
It should be noted that the interaction between reinforcement learning and decision processes is only one of many potential interactions between learning and decision-making that are relevant in continuous environments. For example, function learning [13,125], where people learn the functional relationship between continuous cues and target variables so that one variable can be predicted to be the target from cues, or temporal learning [126,127], where people learn to decide when to act and to correctly time their responses, are also interesting areas in terms of continuous-option spaces and continuous-option space models, though discussing these in detail is beyond the scope of the current article.
Another advantage of experimental paradigms based on continuous-option spaces is that they can potentially reduce the number of trials required for reliable measurements of cognitive constructs [82] (also see Boxes 2 and 3). A reduction in the number of trials needed in a task is particularly important in clinical and applied settings, as participants may be individuals who have difficulty completing long experiments (e.g., children), and clinical/applied experimental protocols can often be even lengthier than standard ones. For instance, the standard paradigm for examining the multisensory integration ability of children with dyslexia is the temporal order judgment task, where participants are presented with an auditory and a visual stimulus and need to determine which stimulus was presented first (i.e., auditory or visual) [83][84][85]. In order to obtain a temporal window of integration, the experiment usually contains different stimulus onset asynchrony values, with each value requiring a multitude of trials [86,87]. Consequently, this paradigm usually takes more than an hour, which can be too long for children with dyslexia. While other clinical paradigms have been created in an attempt to reduce the amount of time taken for these binary choices (e.g., the temporal bisection task [88]), we believe that considering continuous-option spaces (e.g., identifying the time difference between modalities) could provide a sizable time reduction in clinical settings like this, making such paradigms more suitable for those with clinical conditions that make long experiments difficult to complete.

Concluding remarks
While some progress has been made in understanding continuous-option spaces, there are still many unresolved research questions to be investigated (see Outstanding questions). For example, most existing neuro-imaging studies on perceptual, cognitive, and/or value-based decision making, which can potentially help us connect cognitive theories to neural activity, are based on paradigms with discrete options [89,90] (though see [91]). While the brain-behavioral links from research using discrete choices may generalize to continuous-option spaces, we believe that empirical assessments are crucial to creating and refining neuro-cognitive decision-making models.
Although both modeling and non-modeling approaches can shed light on the cognitive and neural mechanisms of decisions in continuous-option spaces, and both can be used in a complementary fashion, we believe that modeling approaches are crucial for developing precise theories and computational frameworks that link neural activity and behavior in continuous-option spaces [92]. Thus, we believe that developing powerful computational packages for fitting the proposed models is an essential step toward a better understanding of decisions in continuous-option spaces. After this step, the computational models can be utilized in theoretical and applied studies and provide deeper insight into decision-making processes and their relations to other processes.
In the current article, we have attempted to shed light on the importance of continuous-option spaces and of developing theories that account for both the distributions of responses and response times in these tasks. Given the intuitive importance of understanding continuous-option decisions, and the potential practical benefits of using a continuous-option space in experimental settings, we believe that moving beyond discrete-choice options remains a crucial puzzle for many areas within cognitive science and that understanding continuous-option decisions will be a fundamental aspect of understanding cognition as a whole.

Outstanding Questions
How can we define optimal behavior for continuous-option decisions? While there are various definitions for optimality in discrete decisions (e.g., reward rate), no consensus has been reached regarding the optimality of continuous-option decisions. Consequently, one of the crucial questions, and arguably the first step for investigation, is how to define the optimality/optimal policy for decisions with continuous-option spaces.
How does visual attention impact information processing in a continuous-option space, and how can we explain it within a theoretical framework? Despite the obvious role of visual attention in the choice process, to the best of our knowledge no research has investigated the role of visual attention in continuous-option decisions.
How can we explain inhibitory control in complex, continuous inhibitory tasks with continuous theories? Although there are many situations where we are required to inhibit our responses in some continuous way, there is currently no theoretical framework to explain human behavior under these conditions.
How can theoretical accounts of reinforcement learning and decision-making in continuous-option spaces be integrated into a unified theoretical framework, capturing both within- and across-trial dynamics? While the literature on discrete choice has made attempts to unify these frameworks, does this unification generalize to continuous choice?
How do cognitive abilities relate to learning in continuous-option spaces? Previous studies in discrete choice tasks illustrated how human cognitive abilities, like working memory, can explain certain learning mechanisms. However, can these cognitive abilities also explain learning in continuous-option spaces?
How can we develop more ecologically valid paradigms using continuous-option spaces for applied studies? Ecological validity of the experimental designs provides a key challenge for some dichotomous decision tasks, which can be particularly problematic in some applied studies.

Figure 1. Examples of decisions with continuous-option space.
Figure I. Various measurement techniques. An overview of measuring techniques that are currently being used in the different fields of cognitive science (i.e., eye fixation [32], mouse-clicking [2], finger movement on touch screen [2], key pressing [53], repeated key pressing [94], eye tracking [48,55], mouse tracking [93,100]).

Figure 2. Discrete versus continuous experiments and models. (A) The orientation discrimination task in a binary choice design, where the decision-maker determines whether the orientation of the Gabor patch is upper left or upper right. (B) A schematic view of the DDM, which can explain the evidence accumulation process in a binary choice task. In this model, a single accumulator integrates evidence over time until the relative evidence crosses one of the thresholds (i.e., the upper threshold is for the right response and the lower is for the left response). (C) The orientation discrimination task in a continuous choice design; the Gabor patch varies over time until the decision-maker responds. In this design, the decision-maker can choose any point on the circle. (D) A schematic view of the circular diffusion model (note that the accumulation domain of the multiple anchored accumulation model is also a circle). In this model, a single accumulator integrates evidence based on the input signal; each point on the circular boundary corresponds to one point on the option space and when the process hits a point on the circular boundary, the corresponding response is triggered. (E) A schematic view of the spatially continuous diffusion model. In this model, the input signal distribution is integrated over time until the first point reaches enough relative evidence. Abbreviations: CDM, circular diffusion model; DDM, diffusion decision model; SCDM, spatially continuous diffusion model.
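The circular accumulation process described in panel D can be sketched as a simple simulation. This is only an illustrative Euler-type approximation under assumed parameter values, not the fitting procedure of any cited model; the function name `simulate_cdm_trial` and its arguments are hypothetical.

```python
import numpy as np


def simulate_cdm_trial(drift_angle, drift_length, threshold,
                       sigma=1.0, dt=0.002, max_time=10.0, rng=None):
    """Simulate one trial of a two-dimensional diffusion inside a circle.

    Returns (response_angle, decision_time); both are NaN if the boundary
    is not reached before max_time.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Drift vector points toward the option favored by the stimulus.
    drift = drift_length * np.array([np.cos(drift_angle), np.sin(drift_angle)])
    position = np.zeros(2)  # accumulation starts at the center of the circle
    elapsed = 0.0
    while elapsed < max_time:
        noise = sigma * np.sqrt(dt) * rng.standard_normal(2)
        position = position + drift * dt + noise
        elapsed += dt
        if np.hypot(position[0], position[1]) >= threshold:
            # The hitting point on the circle determines the response.
            return float(np.arctan2(position[1], position[0])), elapsed
    return float("nan"), float("nan")


rng = np.random.default_rng(1)
trials = [simulate_cdm_trial(0.5, 2.0, 1.5, rng=rng) for _ in range(100)]
angles = np.array([angle for angle, _ in trials])
times = np.array([time for _, time in trials])
# Circular mean of the responses and mean decision time.
mean_angle = float(np.angle(np.mean(np.exp(1j * angles))))
print(f"circular mean response: {mean_angle:.2f} rad, mean time: {times.mean():.2f} s")
```

Each simulated trial yields both a response on the circle and a decision time, which is the joint output (response distribution plus response time) that continuous-option models aim to explain.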
Figure IA shows an example of a two-armed bandit task in which the decision-maker should learn which color is truly associated with the presented item. The key behavioral pattern in this experiment is the descending trend of response time and the ascending trend of accuracy as trials pass (Figure IB). One of the advantages of RLSSM over other reinforcement learning models is that it can predict these behavioral patterns by modulating the drift rate (Figure IC). The continuous-armed bandit task (Figure ID) is the continuous version of the experiment in Figure IA.

Figure I. Discrete and continuous RLSSM. An overview of discrete and continuous reinforcement learning tasks and how to model them using RLSSM. (A) A two-armed bandit task in which the decision-maker should learn the true association between the colors (blue or red) and the presented item. (B) The key behavioral pattern in the two-armed bandit task shows that response time decreases and accuracy increases with learning. (C) The DDM as the choice rule in which the drift rate is updated every trial by the difference of reward expectation for each option. (D) A continuous-armed bandit task in which the decision-maker should learn the true association between the colors and the presented item. Here, all colors on the color wheel are in the option space and the decision-maker should select them within a continuum. (E) The key behavioral pattern in the continuous-armed bandit task shows that learning decreases response time and standard deviation of error. Note that the error is the angular distance between the actual and true response in the continuous tasks. (F) The CDM is the choice rule in which the drift angle and length are updated every trial by receiving feedback and updating the reward distribution of the entire option space. Abbreviations: CDM, circular diffusion model; DDM, diffusion decision model; RLSSM, reinforcement learning sequential sampling modeling; RT, response time; SD, standard deviation.
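The drift update described in panel C can be sketched for a two-armed bandit as follows. This is a minimal illustration, assuming a delta-rule update of reward expectations and a drift rate proportional to their difference; the learning rate, scaling factor, reward probabilities, and function names are hypothetical. Choice probabilities and mean decision times use the standard closed-form expressions for an unbiased two-boundary diffusion with unit noise rather than a full simulation of the accumulation process.

```python
import numpy as np


def p_upper(drift, threshold):
    """Probability of hitting the upper boundary (unbiased start, unit noise)."""
    return 1.0 / (1.0 + np.exp(-threshold * drift))


def mean_decision_time(drift, threshold):
    """Mean decision time of the same diffusion process."""
    if abs(drift) < 1e-8:
        return threshold ** 2 / 4.0  # limit as drift approaches zero
    return threshold / (2.0 * drift) * np.tanh(threshold * drift / 2.0)


def simulate_rlssm(n_trials=300, p_reward=(0.8, 0.2), learning_rate=0.1,
                   drift_scale=3.0, threshold=2.0, seed=0):
    """Two-armed bandit where learned reward expectations set the drift rate."""
    rng = np.random.default_rng(seed)
    q = np.zeros(2)  # reward expectation for each option
    drifts = np.empty(n_trials)
    times = np.empty(n_trials)
    for t in range(n_trials):
        # Drift rate is the scaled difference in reward expectation.
        drift = drift_scale * (q[0] - q[1])
        drifts[t] = drift
        times[t] = mean_decision_time(drift, threshold)
        # Upper boundary corresponds to option 0, lower boundary to option 1.
        choice = 0 if rng.random() < p_upper(drift, threshold) else 1
        reward = float(rng.random() < p_reward[choice])
        # Delta rule: move the chosen option's expectation toward the outcome.
        q[choice] += learning_rate * (reward - q[choice])
    return drifts, times


drifts, times = simulate_rlssm()
print(f"mean drift, first 20 trials: {drifts[:20].mean():.2f}")
print(f"mean drift, last 50 trials:  {drifts[-50:].mean():.2f}")
print(f"expected decision time, first vs last trial: {times[0]:.2f} vs {times[-1]:.2f}")
```

As the reward expectations separate, the drift grows, so expected decision times shorten and the better option is chosen more often, which mirrors the qualitative pattern in panel B. A continuous analogue along the lines of panel F would replace the two expectations with a reward distribution over the whole option space.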