Neural correlates of valence-dependent belief and value updating during uncertainty reduction: An fNIRS study

Selective use of new information is crucial for adaptive decision-making. Combining a gamble bidding task with assessing cortical responses using functional near-infrared spectroscopy (fNIRS), we investigated potential effects of information valence on behavioral and neural processes of belief and value updating during uncertainty reduction in young adults. By modeling changes in the participants' expressed subjective values using a Bayesian model, we dissociated processes of (i) updating beliefs about statistical properties of the gamble, (ii) updating values of a gamble based on new information about its winning probabilities, as well as (iii) expectancy violation. The results showed that participants used new information to update their beliefs and values about the gambles in a quasi-optimal manner, as reflected in the selective updating only in situations with reducible uncertainty. Furthermore, their updating was valence-dependent: information indicating an increase in winning probability was underweighted, whereas information about a decrease in winning probability was updated in good agreement with predictions of Bayesian decision theory. Results of model-based and moderation analyses showed that this valence-dependent asymmetry was associated with a distinct contribution of expectancy violation, besides belief updating, to value updating after experiencing new positive information regarding winning probabilities. In line with the behavioral results, we replicated previous findings showing involvement of frontoparietal brain regions in the different components of updating. Furthermore, this study provided novel results suggesting a valence-dependent recruitment of brain regions. Individuals with stronger oxyhemoglobin responses during value updating were more in line with predictions of the Bayesian model while integrating new information that indicates an increase in winning probability.
Taken together, this study provides first results showing expectancy violation as a contributing factor to sub-optimal valence-dependent updating during uncertainty reduction and suggests limitations of normative Bayesian decision theory.


Introduction
The ability to encode, process, select and integrate new information from the environment to reduce uncertainty is vital for adaptive behavior, such as making good decisions in complex and changing situations. Of specific relevance here, Bayesian and the related predictive inference theories (e.g., Friston et al., 2012) as well as model-based reinforcement learning (e.g., Doll et al., 2012) postulate that people are sensitive to the statistical properties (e.g., uncertainty) of the environments or tasks they are confronted with and the outcomes of their actions, such as rewards or losses. Internal models, commonly known as beliefs or expectations about the states of the environment or task at hand, are formed through experiences and learning to guide choice behavior (Friston et al., 2021; Ma, 2019; Rushworth and Behrens, 2008; Yon and Frith, 2021). Upon observing new information, prior beliefs need to be flexibly updated, which can then be utilized to anticipate outcomes of future choices and actions (Behrens et al., 2007; Itti and Baldi, 2009; Ma and Jazayeri, 2014; Nassar et al., 2010; Payzan-LeNestour and Bossaerts, 2011). Other than applications in the domain of decision making, similar theoretical propositions have also been applied to understand perception and motor behavior in dynamic and changing situations (e.g., Friston, 2005, 2009; see Da Costa et al., 2020 for a review).
Decisions usually involve some degree of uncertainty. Choice contexts with uncertainty can be subdivided into being risky or ambiguous (Ellsberg, 1961; Huettel et al., 2006; Tymula et al., 2013). Risky circumstances pertain to situations of uncertain outcomes but with known outcome probabilities, whereas outcome probabilities remain unknown in ambiguous situations. A growing body of research indicates that people are sensitive to the nature of uncertainty, particularly whether it is reducible or irreducible, when using new information to guide their decisions. New experiences are used differently during updating processes depending on whether they convey information that does not further reduce uncertainty, such as under risky situations, or information that helps reduce uncertainty, such as in ambiguous circumstances (e.g., Kobayashi and Hsu, 2017; O'Reilly et al., 2013; Schulreich and Schwabe, 2021).
These two types of uncertain situations have different implications for the adaptive use of new information. Consider, for instance, a gamble with two dice, A and B, where winning requires rolling a specific face of the dice (e.g., "3"). You know that dice A is fair and has the same probability of landing on any of the six faces, based on your past experiences of rolling this dice. The 1/6 probability of landing on the face "3" makes rolling dice A risky, but the degree of uncertainty is known. In contrast, suppose you have no prior knowledge regarding whether dice B is fair or biased. In this case, rolling dice B is an ambiguous situation associated with an unknown degree of uncertainty. If you observe that the face "6" keeps landing on top across several consecutive rolls of dice B, you might suspect that this dice is biased towards "6" and that the probability of landing on the winning face "3" could be much lower than 1/6. However, obtaining the same sequence of results when rolling dice A would not make you think the same way, even though the observed events in both cases are the same and may surprise you to some extent. Rationally, you would only update your beliefs about dice B given the new information. Updating your beliefs (or expectations) could subsequently affect how you value potential outcomes of rolling dice B in the next round of the game, but not those of dice A.
As illustrated in the example above, updating processes for decision making involve multiple aspects. Beliefs based on prior experiences about the statistical properties of an uncertain context (e.g., dice A or dice B) may affect the subjective values of potential action outcomes in that context (e.g., rolling dice B). First, rational updating requires assessing whether the uncertainty in a given context can or cannot be reduced by further information. Second, new information should be differentially utilized to update beliefs (e.g., the dice being fair or biased) and subjective values of a given action in that context (e.g., of rolling one of the dice). Third, it is important to distinguish between updating processes and mere violations of expectancy that naturally arise from probabilistic events, which may not provide systematic information about the environment but also occur in situations with irreducible uncertainty.
Indeed, results from several previous studies suggest dissociable processes of belief and value updating as well as expectancy violation at the neural level (Kobayashi and Hsu, 2017; O'Reilly et al., 2013; Payzan-LeNestour and Bossaerts, 2011). Despite considerable differences in task designs across studies, belief updating seems to involve particularly frontoparietal regions (Gläscher et al., 2010; Kobayashi and Hsu, 2017; Nour et al., 2018; Schulreich and Schwabe, 2021; Tomov et al., 2018), whereas the bilateral insula (Kobayashi and Hsu, 2017), striatal regions as well as the frontal gyrus and anterior cingulate have been found to be associated with expectancy violation (see D'Astolfo and Rief, 2017 for a meta-analysis based on reinforcement learning paradigms). Furthermore, the subjective value of specific actions or choices in each task context (e.g., rolling one of the two dice) would be updated if there are changes in the beliefs about task uncertainty (e.g., the fairness of the dice). Brain activity in the ventral and medial prefrontal cortex as well as in the inferior parietal cortex has been shown to underlie such value-updating processes (Kobayashi and Hsu, 2017).
Bayesian decision theory provides a normative benchmark for how new information can be optimally integrated into prior beliefs about the decision contexts. Previous studies using incentivized games (e.g., gambles) have shown that young adults are sensitive to the reducibility of uncertainty and use new information to update beliefs and values in a quasi-normative manner (Kobayashi and Hsu, 2017; Schulreich and Schwabe, 2021). Normative Bayesian perspectives assume equal weighting of positive and negative outcomes; however, other than considering statistical properties of the task context, human decision-making is also known to be prone to bias, influenced by perceptual, cognitive, and affective processes, and may rely on heuristics (Kahneman and Tversky, 1979; Kahneman, 2003; Gigerenzer and Gaissmaier, 2011). Of relevance for the current study are empirical findings suggesting that people may deviate from Bayesian normativism and process information in a valence-dependent manner by weighting positive and negative information differently (see Guitart-Masip et al., 2014; Sharot and Garrett, 2016 for reviews). In tasks regarding self-relevant beliefs, positive information tends to be weighted more than negative information when it concerns feedback from others about one's own personal characteristics (Möbius et al., 2022) or the likelihood of personally experiencing particular life events (Garrett and Sharot, 2017; Kuzmanovic et al., 2015; Marks and Baines, 2017; Moutsiana et al., 2015; Sharot et al., 2011, 2012). Nevertheless, accumulated findings over the past years indicate that the presence and direction of valence-dependent updating were inconsistent across studies, depending on the domain and nature of the tasks (Bromberg-Martin and Sharot, 2020; Coutts, 2019). For example, the self-related optimism bias (i.e., overweighting positive information) vanished under perceived threat (Garrett et al., 2018). In the monetary domain, Kuhnen (2015) found that people weighted negative information more (i.e., showing a pessimism bias) in the loss condition when performing an active investment task. Moreover, reinforcement-learning models have included valence-dependent learning rates for positive or negative prediction errors to estimate subjective values in a risk-sensitive manner (e.g., Niv et al., 2002). Taken together, existing findings indicate that the nature of valence-dependent processing depends very much on specific contexts and features of the tasks.
Thus, it is of interest to also explore the nature of valence-dependent behavior in situations entailing reducible and irreducible uncertainties to gain a more comprehensive understanding of adaptive updating. Kobayashi and Hsu (2017) investigated young adults' updating behavior using a gamble bidding task that included both ambiguous and risky situations. In an initial behavioral experiment, the participants could actively indicate their bids (i.e., subjective values) for a given gamble after experiencing different scenarios, which either indicated a change of outcome probabilities or not. Their results suggested that young adults take the reducibility of uncertainty into account when ascribing values to different scenarios of the gambles. In line with the Bayesian normative prediction, participants' value updating was valence-independent and quantitatively comparable across scenarios with either positive information (i.e., indicating more favorable winning probability) or negative information (i.e., indicating less favorable winning probability). In a further fMRI experiment of the same study, another group of participants only passively observed the different gamble scenarios without actually giving the values they would bid after receiving new information about the gambles. The implicit processing of the new information was found to be associated with brain activity in the frontoparietal network for belief and value updating in a dissociable manner. However, because of the passive viewing nature of their fMRI task, potential valence-dependent effects could not be examined (Kobayashi and Hsu, 2017). In a later study, Schulreich and Schwabe (2021) used the active variant of the gamble bidding task and found that value updating was conservative (i.e., less than predicted by the Bayesian normative model) after receiving either positive or negative new information about the gambles. However, salivary levels of the stress hormone cortisol correlated positively with value updating following positive information indicating higher winning probabilities, but not with value updating following negative information. This indicates that different mechanisms might underlie updating processes after experiencing positive or negative new information and suggests that valence-dependent effects may be observed in situations with reducible uncertainty.
X.-R. Peng et al.
To further investigate potential valence-dependent effects of updating behavior during decision making with uncertainty and the related brain mechanisms, the current study used an active variant of the gamble bidding task in combination with functional near-infrared spectroscopy (fNIRS). The first goal was to replicate key results of frontoparietal involvement in adaptive updating from the study by Kobayashi and Hsu (2017) and to extend these findings to situations when participants can also actively indicate their adjusted subjective values after being presented with new information in separate scenarios of the gamble. In this regard, we also adopted a quantitative Bayesian model (Kobayashi and Hsu, 2017) to account for updating behaviors in risky and ambiguous situations. We expected that frontoparietal cortical activity associated with belief and value updating could also be observed and partially dissociated in the active bidding paradigm using a different imaging modality. The second goal was to explore valence-dependent effects on belief and value updating and their brain correlates. Crucially, only by using observed value updates made by the participants instead of model-derived updating values, which are normative and symmetric with respect to valence, would it be possible to detect potential valence-dependent asymmetries in neural responses. The exact direction of the valence-dependent effect is difficult to anticipate a priori. However, considering that value updating in the positive domain has been associated with a stronger bias (Palminteri and Lebreton, 2022) or malleability (e.g., regarding stress hormones, Schulreich and Schwabe, 2021) compared to the negative domain, sub-optimal updating behavior that deviates from Bayesian predictions could be more likely after experiencing positive information about the gambles. Moreover, given that decision-making ability depends on integrating new information into existing beliefs to adjust choices and varies among individuals (Nassar et al., 2010), we also assessed whether individual differences in logical reasoning ability, which is an important aspect of fluid intelligence, would be associated with variations in updating behaviors. Although this specific analysis was not pre-registered, we hypothesized a positive relationship between logical reasoning ability as assessed by the well-established Raven's Matrices test (Raven et al., 1998) and Bayesian rationality in updating, as logical reasoning ability had been found to be correlated with decision-making functions implicating frontal brain regions (Eppinger et al., 2015).

Participants
A total of 47 adults (age ranged from 18 to 30 years) participated in this study. All participants were right-handed, had normal or corrected-to-normal vision, and had no history of psychiatric illness, neurological diseases, or medical scalp conditions. The participants gave informed consent before participation and were compensated with 10 Euros per hour plus a possible bonus that could be gained at the end of the gamble bidding task. The local ethics committee of the TU Dresden approved the study (SR-EK-6012021). Two participants had to be excluded due to the low quality of the assessed fNIRS data (see Methods for details), resulting in a final sample of 45 participants (27 females, 18 males; mean age ± SD: 22.04 ± 2.8 years).

Study procedure
The study consisted of one experimental session that took about 3 h. The participants first filled out a demographic questionnaire and the Edinburgh Handedness Inventory (EHI; Oldfield, 1971). Subsequently, the experimenters measured the participant's head size to select a suitable fNIRS cap and set up the fNIRS montage while the participants read through the instructions of the gamble bidding task. Before performing the task, a short quiz about key aspects of the task was given to ensure that the participants understood the task instructions. Next, participants performed two practice rounds (one with the experimenter, the other by themselves) to familiarize themselves with the gamble bidding task before starting the actual experiment. The fNIRS data recording of the gamble bidding task began 30 s before the main experiment and terminated with task completion (see Section 2.4 for details about fNIRS setup and procedure). The main program of the gamble bidding task lasted about 38 min on average. After the gamble bidding task, cognitive covariates of verbal knowledge and basic information processing speed that are commonly used in the research of adult development (cf. Li et al., 2004) were respectively assessed with the Spot-the-Word Test (Baddeley et al., 1993) and the Identical-Pictures Test (Lindenberger and Baltes, 1997) to better characterize our sample. Although these abilities may not be directly associated with performance variations in the gamble bidding task, including these measures would allow comparisons of basic characteristics of our sample with samples of other ages in future studies and meta-analyses. Afterwards, the participants were instructed in and performed a separate 3-state Markov decision task (Eppinger et al., 2015) that lasted 45 min. This study focuses only on the gamble bidding task. Lastly, besides other factors, it is known that fluid intelligence (the capacity for logical reasoning, problem-solving, and adapting to dynamic situations) also plays a role in decision making (Bruine de Bruin et al., 2020). Thus, after the Markov decision task and removal of the fNIRS cap, we administered the Raven's Progressive Matrices test (Raven et al., 1998) to assess whether individual differences in reasoning ability may be related to variations in normatively optimal updating behavior. Individual differences in the Raven's test had been previously found to be associated with decision processes implicating frontal brain functions (Eppinger et al., 2015), which are also relevant for performing the gamble bidding task.

Task design
We adapted and programmed (in MATLAB R2020b using PsychToolbox 3.0.16) a variant of the gamble bidding task used in two previous studies (Kobayashi and Hsu, 2017; Schulreich and Schwabe, 2021), which was based on Ellsberg's urn problem (Ellsberg, 1961). The participants interacted with the task via a standard PC keyboard. The task consisted of multiple gambles, each with an urn containing 2 to 4 balls in three colors (red, blue, and yellow). The participants had to bid a price (value) for each gamble after being shown initial information about the general composition (not the exact content) of balls in the urn. In each gamble there were two kinds of balls: (1) the risky balls had a single color (e.g., blue balls in Fig. 1), for which the exact number of balls in this color was known; (2) the ambiguous balls were initially shown as bicolored (e.g., half red and half yellow in Fig. 1), for which only the total number, but not the exact color-to-ball assignments, was known. Thus, the distribution of balls of these two colors in that given gamble was ambiguous. To eliminate potential color bias, the allocation of specific colors (i.e., red, blue, and yellow) to risky/ambiguous categories was randomized across participants and no color bias was observed (for a check of potential bias, see Supplementary Materials, Text S1). The types of uncertainty in a given gamble were manipulated by a predetermined winning color displayed above the urn. If the winning color is that of the risky ball(s), the winning probability of the given gamble can be calculated directly based on the initial information about urn composition; however, if the winning color is one of the two ambiguous colors, the winning probability will be initially unknown. As a concrete example (see Fig. 1A, left panel), let us consider an urn with four balls, of which two are risky (e.g., blue) and the other two ambiguous (e.g., initially shown in mixed red/yellow color). If the winning color is blue (the risky color in this case), the probability of winning is 50%; whereas, if the winning color is yellow (one of the ambiguous colors), the exact probability of winning is not known, as the possible urn contents could be (1) two blue and two red balls (0% winning probability), (2) two blue, one red and one yellow ball (25% winning probability), or (3) two blue and two yellow balls (50% winning probability).
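The example urn above can be enumerated in a few lines. The following is a minimal Python sketch for illustration only (the task itself was programmed in MATLAB); the helper names `possible_urns` and `win_probability` and the (blue, red, yellow) ordering are hypothetical choices, not part of the original task code.

```python
from fractions import Fraction

BLUE, RED, YELLOW = 0, 1, 2  # index convention assumed for this sketch


def possible_urns(n_risky, n_ambiguous):
    """Enumerate (risky, ambiguous_a, ambiguous_b) ball counts consistent with the display."""
    return [(n_risky, n_ambiguous - k, k) for k in range(n_ambiguous + 1)]


def win_probability(urn, winning_index):
    """Probability that a single random draw from `urn` shows the winning color."""
    return Fraction(urn[winning_index], sum(urn))


# Urn from the example: two risky (blue) and two ambiguous (red/yellow) balls.
urns = possible_urns(n_risky=2, n_ambiguous=2)
yellow_wins = [win_probability(u, YELLOW) for u in urns]
blue_wins = [win_probability(u, BLUE) for u in urns]

for urn, p in zip(urns, yellow_wins):
    print(urn, float(p))  # winning probability if yellow is the winning color
```

If blue is the winning color, every candidate composition gives the same 50% winning probability (a risky gamble); if yellow is the winning color, the probability depends on which composition is true (an ambiguous gamble).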
Fig. 1. Experimental paradigm and task procedure. (A) The urn composition and draw scenarios of an example ambiguous gamble (adapted from Kobayashi and Hsu, 2017). Left panel: The urn of this gamble contains four balls, two of which are risky (blue) and the other two ambiguous (red/yellow). The exact number of red and yellow balls is unknown to participants. Hence, there are three different possible urn compositions as shown. Middle panel: A draw of a red ball (ambiguous color) results in belief updating and eliminates one possible urn composition (i.e., two blue and two yellow balls). Expectancy violation is defined as the difference between 1 and the probability of drawing a red ball. Right panel: A draw of a blue ball (risky color) does not result in belief updating but is linked to expectancy violation (i.e., counter probability of drawing a yellow ball). (B) The trial sequence for an example gamble, adapted from Schulreich and Schwabe (2021, p. 3).
Following the initial presentation of urn composition, further information about the urn content of a given gamble was shown in three independent scenarios. The initial presentation provided prior information about potential probabilities of ambiguous colored ball(s) in the urn on which participants could base their initial judgements, and later update their beliefs and values for bidding with new information gained in the three scenarios of the gamble. Comparing the participants' bidding behavior before and after each of the three scenarios allowed us to evaluate participants' sensitivity to the nature of uncertainty in the gamble and to assess their behaviors of value updating. Specifically, before (predraw) and after (postdraw) each scenario, the participants were asked to bid a value to indicate a price in Euro at which they would be willing to sell (WTS) the gamble. In other words, the WTS value is the minimum amount of money which a person is willing to accept to forgo the right to play the lottery (hence, also sometimes referred to as "willingness to accept"), which is a well-established indicator in economics and decision research (e.g., Kobayashi and Hsu, 2017; Novemsky and Kahneman, 2005). They were instructed to enter their WTS values as numbers between 0 and 10 using the number pad of the keyboard. Each of the three scenarios was shown as a randomly drawn ball (once in each color) that could potentially provide more information about the urn content. Further knowledge of an urn's composition could only be gained if an ambiguous ball was drawn. If a risky ball was drawn, no new information could be gained because the number of the risky ball(s) was already fixed at the initial presentation of urn composition. If the participants are sensitive to the nature of uncertainty, belief and value updating would only take place in ambiguous but not in risky gambles. Notably, participants were instructed to treat the three scenarios of a gamble as independent (i.e., each of the scenarios starts with the same initial composition; an observed draw in a given scenario does not change the initial composition for the next scenario). No effect of scenario order was observed, indicating independent processing of the scenarios (see Text S2). To enhance task engagement and ensure that the WTS values the participants ascribed to each of the gambles reflect their subjective prices, a resolution draw took place at the very end of the experiment after the participants had finished bidding for all gambles. The resolution draw determined the amount of a possible bonus the participants could gain besides the reimbursement for their participation. During the resolution draw, one of the gambles (including all the predraw and postdraw scenarios) shown during the experiment was randomly selected. Following the Becker-DeGroot-Marschak bidding procedure (BDM; Becker et al., 1964), a price (in the range of 0€ to 10€) randomly generated by the computer was compared to the WTS value the participant entered for that gamble during the experiment. If the participant's WTS value was below the computer-generated price, they sold the gamble to the computer and received the computer-generated price as a bonus; otherwise, they played the gamble with a chance to win either 10€ or nothing. Our task instructions provided a clear explanation of this procedure through a detailed illustration.
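The resolution rule of the BDM procedure described above can be sketched as follows. This is an illustrative Python sketch, not the original MATLAB implementation: the function name `bdm_resolution`, the uniform distribution of the computer-generated price, and the use of a single winning probability `p_win` for the selected urn are assumptions made for the example.

```python
import random


def bdm_resolution(wts, p_win, rng, prize=10.0):
    """Return the bonus for one selected gamble under the BDM rule.

    wts   : participant's willingness-to-sell value (0 to `prize`)
    p_win : probability that the resolution draw shows the winning color (assumed known here)
    rng   : random.Random instance, passed in so results are reproducible
    """
    price = rng.uniform(0.0, prize)  # assumption: price drawn uniformly from [0, prize]
    if wts < price:
        return price  # gamble is sold to the computer at the generated price
    # Otherwise the gamble is played: win the full prize or nothing.
    return prize if rng.random() < p_win else 0.0


rng = random.Random(0)
bonuses = [bdm_resolution(wts=2.5, p_win=0.25, rng=rng) for _ in range(5)]
print(bonuses)
```

Because the selling price is determined by the computer rather than by the stated WTS value, misreporting cannot raise the price received; it can only change whether the gamble is sold or played. This is why the procedure is commonly used to elicit subjective prices.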
The winning colors and the corresponding colors of the observed draws were manipulated such that the scenarios provided either (i) new and relevant information (drawing ambiguous balls in ambiguous gambles), (ii) new but irrelevant information (drawing ambiguous balls in risky gambles) or (iii) no new information at all (i.e., drawing risky balls in either type of gamble). By design, this allowed for the experimental separation of belief updating and expectancy violation, such that updating behavior can be separately analyzed for cases in which updating was normatively predicted and for cases in which it was not. If the participants behave rationally, belief updating could be expected after observing draws of ambiguous balls and this would be reflected in changes in the WTS values of ambiguous gambles (i.e., value updating). No belief (and value) updating would be expected after observing draws of risky balls. In the example shown in Fig. 1A (middle panel), observing a draw of a red (ambiguous) ball reduces uncertainty because this information indicates that at least one of the ambiguous balls is red. Therefore, the chance of drawing a red ball in the resolution draw increases (ΔP_red > 0), while the probability of drawing a yellow ball and thus the chance of winning the full 10€ decreases (ΔP_yellow < 0). Consequently, this makes the gamble less attractive for the participants, since the winning color is yellow in this example, and should result in negative value updating. In the case of gambles with a risky winning color (i.e., if the winning color were blue), drawing an ambiguous ball provides new but irrelevant information given that the probabilities of risky gambles are fully specified beforehand (ΔP_blue = 0). Thus, no relevant information could be gained from any draw when the risky color is the winning color. Accordingly, zero updating is expected for these cases.
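The direction of these probability changes can be checked with a small Bayesian calculation. The Python sketch below is illustrative only and rests on two labeled assumptions that are not stated in the text above: a uniform prior over the three possible urn compositions, and an observed draw that is returned to the urn (so the resolution draw is made from the full urn). The function names are hypothetical.

```python
from fractions import Fraction

BLUE, RED, YELLOW = 0, 1, 2  # index convention assumed for this sketch


def predictive(belief, color):
    """P(next draw shows `color`), averaged over the urn compositions in `belief`."""
    return sum(w * Fraction(urn[color], sum(urn)) for urn, w in belief)


def update(belief, drawn):
    """Bayes' rule: reweight each composition by the likelihood of the observed draw."""
    evidence = predictive(belief, drawn)
    return [(urn, w * Fraction(urn[drawn], sum(urn)) / evidence) for urn, w in belief]


# Example urn: two blue balls plus two ambiguous (red/yellow) balls.
urns = [(2, 2, 0), (2, 1, 1), (2, 0, 2)]
prior = [(urn, Fraction(1, 3)) for urn in urns]  # assumption: uniform prior

after_red = update(prior, RED)    # draw of an ambiguous ball
after_blue = update(prior, BLUE)  # draw of a risky ball

delta_red = {c: predictive(after_red, c) - predictive(prior, c) for c in (BLUE, RED, YELLOW)}
delta_blue = {c: predictive(after_blue, c) - predictive(prior, c) for c in (BLUE, RED, YELLOW)}
print(delta_red)
print(delta_blue)
```

Under these assumptions, a red draw rules out the composition with two yellow balls, raises the probability of red and lowers that of yellow by the same amount, and leaves the probability of blue unchanged; a blue draw leaves all beliefs as they were.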
Furthermore, given that the Bayesian model assumes that the participants may form expectations about the probabilities of drawing balls of different colors based on initial information about the urn composition and the probabilistic nature of the draw's color at each trial, any draw (including draws of colors of risky balls) in any gamble (including risky gambles) involves some degree of "surprise" or expectancy violation. Since drawing a ball (of any color) from the urn in the gamble is an event that would certainly occur, the sum of the probabilities of all the outcomes (drawing balls of all three colors) equals 1. Expectancy violation in this context is quantified by 1 - P(draw color), where P(draw color) is the prior probability of drawing a ball of a particular color. Given that P(draw color) is < 1, 1 - P(draw color) is > 0 for any draw (Kobayashi and Hsu, 2017).
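As a worked illustration of this quantity, the Python sketch below computes 1 - P(draw color) for the example urn with two blue and two ambiguous (red/yellow) balls. The uniform prior over the three possible compositions is an assumption of the sketch, and the function name `expectancy_violation` is hypothetical.

```python
from fractions import Fraction

BLUE, RED, YELLOW = 0, 1, 2  # index convention assumed for this sketch


def expectancy_violation(belief, drawn):
    """1 - prior probability of the observed draw color, averaged over compositions."""
    p_draw = sum(w * Fraction(urn[drawn], sum(urn)) for urn, w in belief)
    return 1 - p_draw


# Assumption: each possible composition is equally likely before any draw.
prior = [((2, 2, 0), Fraction(1, 3)),
         ((2, 1, 1), Fraction(1, 3)),
         ((2, 0, 2), Fraction(1, 3))]

violations = {name: expectancy_violation(prior, color)
              for name, color in (("blue", BLUE), ("red", RED), ("yellow", YELLOW))}
print(violations)
```

In this example the prior draw probabilities sum to 1, and every draw, including a draw of the risky blue ball, carries a strictly positive expectancy violation even though only draws of ambiguous balls change beliefs about the urn.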
Altogether, each participant was presented with six different urn content compositions in the task, which varied with respect to the total number of balls in the urn and the number of risky and ambiguous balls (see details of urn content compositions in Table S1). Besides the gambles (i.e., three winning colors × six urn content compositions) used in two previous studies (Kobayashi and Hsu, 2017; Schulreich and Schwabe, 2021), we added one more repetition for each urn content composition in a balanced manner across gamble types and winning colors to increase the signal-to-noise ratio for the fNIRS analysis. Specifically, among the six additional gambles, the winning colors were evenly divided, such that two gambles had the color of the risky ball as the winning color and two gambles had the color of each of the two ambiguous balls as the winning color. In total, our variant of the task included 24 gambles (8 risky gambles and 16 ambiguous gambles) that were evenly associated with three possible winning colors. Crucially, this design balanced the magnitude and direction of the positive and negative information for the ambiguous gambles (see Table S2 for details) to ensure that valence-dependent effects, if observed, would not be confounded by differences in gamble types. The order in which the gambles were presented was randomized across participants. Each of the 24 gambles was further associated with three scenarios, each with one differently colored ball drawn from the urn. For instance, in an ambiguous gamble, there would be one scenario with an ambiguous ball that matches the winning color, one scenario with a mismatched ambiguous ball, and one scenario with a risky ball, so that no systematic bias would be introduced by the task design (see Table S2 for details). The order in which the different colors were drawn in the scenarios was also randomized. Altogether, we could collect 24 predraw WTSs and 72 (24 gambles × 3 scenarios) postdraw WTSs from each participant.

Task procedure
As shown in Fig. 1B, the urn content composition was displayed for s at the beginning of each gamble. Subsequently, participants were asked to enter their WTS_pre value (predraw bid price) within a maximum duration of 10 s, followed by the instruction text of "Waiting for a new scenario …" that was shown on a black screen with a mean interval of 11.5 s and a jitter between 10 and 16 s. Then the first scenario was presented for 4 s, followed by the instruction text of "Please wait a moment to enter your value…" (mean interval of 5.5 s, jittered between 4 and 10 s). Next, participants entered their WTS_post value (postdraw bid price) of the first scenario within 10 s, followed by an inter-scenario interval (mean interval of 11.5 s, jittered between 10 and 16 s). The second and third scenarios of the given gamble were then presented using the same procedure. All inter-stimulus intervals were jittered in 0.5-s steps according to a long-tailed exponential distribution (λ = 3; Hagberg et al., 2001).
If a participant exceeded the 10-s time limit and thus failed to register their WTS_pre value, the entire gamble was considered missing, because no reference value was available to compute value updating in this case. If a WTS_post value was missing, only this value was considered missing; the WTS_post values of the other two scenarios could still be included for computing value updating. In total, only 0.5% of the trials (16 of 3240 trials from all included participants) were time-outs and discarded from the analyses.
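The missing-data rule above can be expressed compactly. The Python sketch below is illustrative: it assumes that value updating is computed as the difference between the postdraw and predraw WTS values and that time-outs are stored as `None`; both the representation and the function name `value_updates` are hypothetical rather than taken from the original analysis code.

```python
def value_updates(wts_pre, wts_post):
    """Per-scenario value updating (post minus pre); None marks a missing value.

    wts_pre  : predraw bid for the gamble, or None if the participant timed out
    wts_post : list of postdraw bids (one per scenario), with None for time-outs
    """
    if wts_pre is None:
        # No reference value: the whole gamble is treated as missing.
        return [None] * len(wts_post)
    # A missing postdraw bid only removes that scenario.
    return [None if post is None else post - wts_pre for post in wts_post]


print(value_updates(4.0, [5.0, None, 3.0]))   # one scenario missing
print(value_updates(None, [5.0, 6.0, 7.0]))   # entire gamble missing

# Trial count reported in the text: 45 participants x 72 postdraw bids each.
n_trials = 45 * 72
timeout_percent = round(100 * 16 / n_trials, 1)
```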

fNIRS data acquisition and optode montage
We used two NIRSport (NIRx Medical Technologies, LLC, USA) continuous-wave fNIRS devices in the tandem mode of the NIRStar acquisition software (version 15.3) for data collection. Each NIRSport system has 8 sources with wavelengths of 760 and 850 nm and 8 detectors sampled at 3.472 Hz using the default standard illumination pattern. Two sizes of standard NIRS caps (56 and 58 cm; https://nirx.net/nirscap) were available for participants. Prior to the experiment, we measured each participant's head circumference to determine the appropriate cap size. Before proceeding with the experiment, a built-in calibration procedure of the NIRSport system was used to check signal quality for each channel, with readjustments of optodes when necessary (i.e., in cases of initially insufficient signal quality due to the placement of optodes). Data recording commenced only when the majority of channels (i.e., >35 of 40 channels) exhibited excellent signal quality, the remaining channels were deemed acceptable, and there were no critical issues or missing channels. The two cap sizes fitted the participants well; none of them had to be excluded due to calibration problems using this procedure before fNIRS recording, and none reported discomfort during or after the experiment. All caps were prepared according to our frontoparietal montage (see next section, Fig. 2A), containing 16 sources and 16 detectors, which resulted in 40 active channels. A reliable channel distance of 3 cm was obtained by inserting stabilizing links (NIRx, Germany). To improve contact between optodes and scalp, spring-loaded grommets with pressure level 2 secured the optodes in the parietal area where the hair is thicker. Velcro strips were used to minimize any strain on the cable during the recordings. A black over-cap provided by NIRx (produced by EasyCap) was pulled over the NIRS cap to eliminate external light sources. The cap was then placed and verified based on the international 10-20 location of Cz (Klem et al., 1999). Cotton swabs were used to move the hair aside when mounting the optodes to ensure proper contact with the scalp.
The fNIRS Optodes' Location Decider (fOLD; Zimeo-Morais et al., 2018) toolbox was used to guide optode positioning in accordance with the international 10-10 system to cover anatomical regions of interest (ROIs). With the primary aim of measuring neural activity correlated with belief updating, our targeted regions were decided based on a previous fMRI study, which used a similar task variant (Kobayashi and Hsu, 2017). Specifically, this previous study found that belief updating correlated with activity in lateral frontoparietal areas. Thus, the Automated Anatomical Labeling atlas (AAL2; Rolls et al., 2015) was specified to generate a probe arrangement providing coverage of the bilateral superior frontal gyrus (Frontal_Sup), middle frontal gyrus (Frontal_Mid), superior parietal gyrus (Parietal_Sup) and inferior parietal gyrus (Parietal_Inf). Furthermore, value updating was associated with activity in the cingulate and the medial prefrontal cortex (MPFC). Notably, some of our superior frontal channels (i.e., channels 3, 5 and 9) could also detect hemodynamic changes from the MPFC (see Table S3 for coverage probabilities). However, given the limitations of the light emitters, activity in deeper cortical areas, such as cingulate and insula, previously found to be uniquely associated with expectancy violation after adjusting for belief and value updating (Kobayashi and Hsu, 2017), cannot be reliably measured with fNIRS (Cui et al., 2011). Two 8 × 8 fNIRS systems (each resulting in 20 channels) separately covered the target regions in the left and right hemispheres (see Table S3 for the anatomical specificity of each channel to the ROIs). The sensitivity profile of our montage (Fig. 2B) was generated by modeling the light transport in tissues using the Monte-Carlo transport software (tMCimg) embedded in the AtlasViewer (Aasted et al., 2015; Boas et al., 2002).

Data analysis

Behavioral analysis
Theoretically, value updating is the consequence of belief updating, which refers to internal processes that cannot be directly measured with choice data (see related descriptions in Section 2.3.1). Thus, in the context of the gamble bidding task, the value updating behavior of the participants served as the primary dependent variable and a proxy for participants' belief updating (Kobayashi and Hsu, 2017; Schulreich and Schwabe, 2021). Specifically, value updating was quantified as the trial-wise difference between postdraw and predraw WTSs from the 72 scenarios (i.e., ΔWTS = WTS_post − WTS_pre) for each participant. Although individual differences in WTS_pre at the start of a trial may affect the degree of possible value adjustments at postdraw, the ΔWTS as computed here takes such individual differences into account. Based on the combination of the winning color and the color of the observed draw, ΔWTSs were classified into three categories: normatively positive (norm-pos), normatively negative (norm-neg) and normatively zero (norm-zero) trials. In a normatively positive trial, the color of the observed draw was ambiguous and matched the winning color (i.e., ambiguity was reduced through the draw), thus indicating a better winning probability for the gamble. In contrast, in a normatively negative trial, the colors did not match, revealing a lower winning probability. In normatively zero trials, no new information about the winning probability could be gained. Of note, the norm-zero trials include three subcategories: risky color draws in ambiguous gambles, and ambiguous or risky color draws in risky gambles (see Table S2).
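The trial classification described above can be illustrated with a minimal Python sketch. The color labels ("r" for the risky color, "a1" and "a2" for the two ambiguous colors) and the function names are hypothetical and chosen for illustration only; the sketch assumes that a gamble is risky when its winning color is the risky color.

```python
def classify_trial(winning_color: str, draw_color: str) -> str:
    """Assign a trial to norm-pos, norm-neg, or norm-zero.

    Labels are hypothetical: "r" = risky color, "a1"/"a2" = ambiguous colors.
    """
    valid = {"r", "a1", "a2"}
    if winning_color not in valid or draw_color not in valid:
        raise ValueError("colors must be one of 'r', 'a1', 'a2'")
    # Risky gambles (risky winning color) and risky color draws carry
    # no new information about the winning probability.
    if winning_color == "r" or draw_color == "r":
        return "norm-zero"
    # Ambiguous gamble with an ambiguous color draw: match vs. mismatch.
    return "norm-pos" if draw_color == winning_color else "norm-neg"


def delta_wts(wts_pre: float, wts_post: float) -> float:
    """Value updating for one scenario: postdraw minus predraw bid."""
    return wts_post - wts_pre
```

For example, `classify_trial("a1", "a2")` yields `"norm-neg"`, whereas any trial with a risky winning color or a risky color draw falls into `"norm-zero"`.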
To examine whether the participants' value updating behavior was sensitive to the nature of uncertainty from a normative perspective, one-sample Wilcoxon signed rank tests were first performed (using function wilcox_test in the rstatix package; Kassambara, 2021). These tests were not pre-registered but were included to test the ΔWTSs of each category against zero deviation from the Bayesian predictions. Next, the pre-registered analyses using Friedman's ANOVA and post-hoc pairwise Wilcoxon signed rank tests (using functions friedman_test, friedman_effsize, and wilcox_test in the rstatix package; Kassambara, 2021) were executed to compare updated values between trial categories (norm-pos vs. norm-zero vs. norm-neg). For these comparisons, ΔWTSs were sign-flipped in the category where negative value updating was expected (norm-neg), such that larger values indicate stronger value updating, irrespective of valence. For the post-hoc analysis, the p-values were corrected using the Bonferroni-Holm method to control the family-wise error rate.

Bayesian quantitative model.
As pre-registered, we adopted a Bayesian model (Kobayashi and Hsu, 2017) to predict updating behavior quantitatively. This model incorporates two stages: (1) belief formation and (2) valuation. The stage of belief formation models the probability distribution of a future draw's color. Before the scenarios of a gamble were presented, the participants were aware of the total number of risky balls (n_r) and ambiguous balls (n_a) as presented to them in the initial urn composition. Yet, the distribution of the two ambiguous colors within n_a, i.e., the number of balls of ambiguous color 1 (n_a1) and ambiguous color 2 (n_a2), was not known. The probability of drawing a risky ball is straightforwardly specified as

$$P_r = \frac{n_r}{n_r + n_a}.$$

However, to estimate the probability of a future draw of ambiguous balls, all possible urn content compositions need to be considered and weighted. Assuming a binomial distribution with equal probability for the two ambiguous colors, the probability of the number of balls in one ambiguous color a1 is obtained as

$$P(n_{a1}) = \left(\frac{1}{2}\right)^{n_a} \binom{n_a}{n_{a1}}.$$

Accordingly, the probability of a future draw in ambiguous color a1 is estimated as

$$P_{a1} = \sum_{n_{a1}=0}^{n_a} P(n_{a1}) \, \frac{n_{a1}}{n_r + n_a}.$$

Since observed draws of ambiguous balls provide further information about the urn content, beliefs should be updated under the Bayesian rule because ambiguity is reduced. In case of a draw in color a1, the postdraw probability of n_a1 follows

$$P(n_{a1} \mid a1) = \left(\frac{1}{2}\right)^{n_a - 1} \binom{n_a - 1}{n_{a1} - 1},$$

again assuming the binomial probability distribution. When the observed draw was in color a2, the postdraw probability of n_a1 is obtained as

$$P(n_{a1} \mid a2) = \left(\frac{1}{2}\right)^{n_a - 1} \binom{n_a - 1}{n_{a1}}.$$

In the case of drawing a risky ball, the postdraw probability of n_a1 is equal to its predraw probability, which is defined as

$$P(n_{a1} \mid r) = P(n_{a1}) = \left(\frac{1}{2}\right)^{n_a} \binom{n_a}{n_{a1}}.$$

Because no new information about the urn content is provided in this case, the degree of ambiguity remains constant.
In the valuation stage, an expected value (EV) is generated by multiplying the given probability of the winning color P_w with the monetary reward of 10 € (that would be gained when the resolution draw matched the winning color), i.e., EV = 10 € × P_w. Notably, we were also able to assess whether participants non-normatively updated values when uncertainty was not reduced (i.e., in risky gambles and risky color draws), since even in these cases the observations may have violated prior expectations. Expectancy violation is quantified as 1 − P_drawcolor, which is greater than zero for any draw of any gamble (see definition above in 2.3.1).
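The two model stages can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: it assumes a symmetric binomial prior (probability 1/2 per ambiguous color), assumes that the urn composition is unchanged by the observed draw (i.e., the ball is returned), and uses hypothetical function names and color labels ("r", "a1", "a2").

```python
from math import comb


def prior_n_a1(n_a: int) -> list[float]:
    """Predraw distribution over the number of a1 balls (0..n_a)."""
    return [comb(n_a, k) * 0.5 ** n_a for k in range(n_a + 1)]


def posterior_n_a1(n_a: int, draw: str) -> list[float]:
    """Postdraw distribution over the number of a1 balls after one observed draw."""
    if draw == "r":  # a risky draw carries no information about n_a1
        return prior_n_a1(n_a)
    half = 0.5 ** (n_a - 1)
    if draw == "a1":  # at least one a1 ball exists
        return [0.0] + [comb(n_a - 1, k - 1) * half for k in range(1, n_a + 1)]
    if draw == "a2":  # at least one a2 ball exists
        return [comb(n_a - 1, k) * half for k in range(n_a)] + [0.0]
    raise ValueError("draw must be 'r', 'a1', or 'a2'")


def p_color(dist: list[float], n_r: int, n_a: int, color: str) -> float:
    """Probability that a future draw shows `color`, given beliefs `dist`."""
    total = n_r + n_a
    if color == "r":
        return n_r / total
    mean_a1 = sum(k * p for k, p in enumerate(dist))
    return (mean_a1 if color == "a1" else n_a - mean_a1) / total


def model_quantities(n_r: int, n_a: int, winning: str, draw: str,
                     reward: float = 10.0) -> dict[str, float]:
    """Predraw/postdraw winning probability, predicted value update, and
    expectancy violation (1 - predraw probability of the drawn color)."""
    prior = prior_n_a1(n_a)
    post = posterior_n_a1(n_a, draw)
    p_pre = p_color(prior, n_r, n_a, winning)
    p_post = p_color(post, n_r, n_a, winning)
    return {
        "p_win_pre": p_pre,
        "p_win_post": p_post,
        "predicted_value_update": reward * (p_post - p_pre),
        "expectancy_violation": 1.0 - p_color(prior, n_r, n_a, draw),
    }
```

Under these assumptions, a hypothetical urn with two risky and four ambiguous balls gives a predraw winning probability of 1/3 for an ambiguous winning color; a matching draw raises it by 1/12, a mismatching draw lowers it by 1/12, and a risky draw leaves it unchanged.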
To test whether the participants behaved normatively in value updating, we fitted their WTSs to the Bayesian model using linear mixed model analysis. A full model (using lme function in nlme package; Pinheiro et al., 2017) was specified with Bayesian model predictions as urn-wise predictors and participants as the random effect for WTSs in trials of the two non-zero categories (i.e., norm-pos and norm-neg) separately. Results of the full model were compared with a null model that did not incorporate the Bayesian prediction term (i.e., only a constant). Since the null model is nested in the full model, log-likelihood ratio tests were employed to compare model fits. Since the null model predicts no differences in value updating behavior across urn compositions, a better fit of the full model would indicate that the participants' value updating varies as a function of urn composition, which would be consistent with the Bayesian model. This approach differs from our pre-registered analysis of correlating expected and observed value updating behavior. We opted for regression over the pre-registered correlational analyses because integrating Bayesian predictions in a mixed model allows for individual deviations while still capturing the underlying relationship, as opposed to averaging values into only one value per category per participant across all urn compositions. To further explore whether and how value updating may quantitatively deviate from the Bayesian prediction for each non-zero category, we performed another set of one-sample Wilcoxon signed rank tests in which the dependent variable was the deviation of ΔWTSs from Bayesian model predictions, i.e., deviation (DEV) = value updating_observed − value updating_predicted. The category-wise DEV values were averaged across all urn compositions for each participant and tested against zero. Positive DEV indicates that the observed updating is larger than the predicted updating and thus may reflect an overweighting of novel information, while negative DEV suggests the opposite.
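As a small illustration of the DEV index described above, the following Python sketch averages the trial-wise differences between observed and predicted value updating for one participant and one trial category. The function name and inputs are hypothetical.

```python
from statistics import mean


def mean_deviation(observed: list[float], predicted: list[float]) -> float:
    """Average DEV (observed minus predicted value updating) across trials."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(o - p for o, p in zip(observed, predicted))
```

A negative result, such as the one obtained for observed updates of 1.0 and 0.5 against predictions of 1.5 each, would correspond to an underweighting of new information relative to the model.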
From the Bayesian normative perspective, value updating should only be driven by belief updating, i.e., the difference between predraw and postdraw probability of ambiguous color draws in the ambiguous gambles only. However, in a non-normative manner, expectancy violation could also influence value updating, since in any draw of any gamble the probability of expectancy violation (1 − P_drawcolor) is >0. To better understand the mechanisms underlying value updating, in line with similar analyses conducted in a previous study (Schulreich and Schwabe, 2021), we also included non-preregistered analyses that tested multiple models in which belief updating and expectancy violation (as defined by the Bayesian model) were separately or jointly included as urn-wise predictors with participants as random effects. The models were separately fitted to the observed value updates (i.e., ΔWTSs) from different trial categories, resulting in four models: (1) belief updating as sole predictor, (2) expectancy violation as sole predictor, (3) belief updating and expectancy violation as predictors, and (4) belief updating, expectancy violation and belief updating × expectancy violation as predictors. Specifically, all four models were fitted to data from trial categories in which belief updating could be normatively expected, including (i) ambiguous gambles with matching (norm-pos) or (ii) mismatching (norm-neg) ambiguous color draws, and (iii) risky gambles with ambiguous color draws (a subcategory of the norm-zero category). As for the remaining trials from the other two norm-zero subcategories that involved risky color draws (i.e., ambiguous or risky gambles with risky color draws), only model (2) was fitted to these data. All models were fitted using maximum likelihood (ML) estimation and evaluated by the Bayesian Information Criterion (BIC; Schwarz, 1978).
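For reference, the BIC is conventionally defined as follows, where the symbols k (number of estimated parameters), n (number of observations) and \hat{L} (maximized likelihood) are introduced here for illustration and are not taken from the original text:

```latex
\mathrm{BIC} = k \ln(n) - 2 \ln(\hat{L})
```

Lower BIC values indicate a better trade-off between goodness of fit and model complexity, so adding a predictor improves the BIC only if the gain in log-likelihood outweighs the penalty term.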
Lastly, Spearman's correlations (stats package; R Core Team, 2020) between a measure of fluid intelligence (i.e., logical reasoning ability assessed by Raven's Progressive Matrices) and performance in the gamble bidding task (quantified as the absolute deviation |DEV| from Bayesian model predictions) were computed to explore the potential relation between individual differences in reasoning ability and value updating. FDR corrections were applied to correct for multiple comparisons across these correlations.
MATLAB R2020b (MathWorks Inc, Natick, MA, USA) was used for data pre-processing; statistical analyses were computed using R and RStudio (version 4.2.0). For all analyses, the two-tailed significance level was set at p ≤ 0.05. The data were visually checked for potential anomalies using boxplots, histograms, density and Q-Q plots (via ggplot2 package; Wickham, 2016; and ggpubr package; Kassambara, 2020). The normality assumption was tested using Shapiro-Wilk tests (shapiro.test function in stats package; R Core Team, 2020) and by interpreting skewness and kurtosis (pastecs package; Grosjean and Ibanez, 2018). We used non-parametric variants of statistical tests for our pre-registered analyses in cases of non-normality of the data, as commonly suggested (Kvam et al., 2022). Effect sizes were calculated as rank-biserial correlation (r_rb) for the Wilcoxon tests and Kendall's W for Friedman's ANOVA. Effects were interpreted according to the definition by Cohen (1988), in which an r_rb or Kendall's W between 0.1 and 0.3 is considered a small, between 0.3 and 0.5 a medium, and greater than 0.5 a large effect.

fNIRS data analysis
Data quality check.
The fNIRS data were first loaded into the HOMER3 toolbox (Huppert et al., 2009) to check data quality. The function hmrR_PruneChannels was used to check the light intensity range (dRange) and the signal-to-noise ratio (SNR) of the raw data for each channel and wavelength. An SNR threshold of 6.67 [~15% coefficient of variation (CV); SNR = 1/CV × 100] and a dRange of 0.1 to 10 were used as criteria for identifying bad channels. However, even channels with good SNR and signal levels might not capture data reflecting physiology. We thus further visually checked the power spectral density (PSD) for each channel of each participant to see whether the signal also contained the heartbeat frequency component, typically around 1 Hz (Tong et al., 2011). The presence of the heartbeat frequency is a common indicator of good signal quality in fNIRS data (Hocke et al., 2018).
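The relation between the SNR threshold and the coefficient of variation can be made concrete with a short Python sketch. This is not the HOMER3 implementation: the function names are hypothetical, and the sketch assumes that the intensity-range criterion is applied to the mean raw intensity of a channel.

```python
from statistics import mean, stdev


def channel_snr(intensity: list[float]) -> float:
    """SNR = 1 / CV x 100, with CV expressed in percent (equals mean / SD)."""
    m, sd = mean(intensity), stdev(intensity)
    if sd == 0:
        return float("inf")  # a perfectly constant signal has no variation
    cv_percent = sd / m * 100
    return 1 / cv_percent * 100


def is_bad_channel(intensity: list[float], snr_threshold: float = 6.67,
                   d_range: tuple[float, float] = (0.1, 10.0)) -> bool:
    """Flag a channel whose mean intensity or SNR violates the criteria."""
    m = mean(intensity)
    in_range = d_range[0] <= m <= d_range[1]
    return not (in_range and channel_snr(intensity) >= snr_threshold)
```

With these definitions, a CV of 15% corresponds to an SNR of about 6.67, so channels with more than roughly 15% variation around their mean intensity would be flagged.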
Although they had passed the calibration procedure before the fNIRS recording started (see 2.4), two participants were identified with poor-quality data based on the offline SNR check and visual PSD inspection (with more than 33% of bad channels, i.e., 13 channels) and excluded from further analyses (see Table S4 for an overview of the number of participants with good signal quality by channel). Of note, the careful offline checks served only to ensure data quality; no channels of any of the remaining 45 participants were pruned based on them. Instead, we adopted statistical models that downweigh noisy channels (see details in next section). Including these channels may increase type-II error but will not increase the false-positive rate (Huppert, 2016; Meidenbauer et al., 2021).

fNIRS pre-processing pipeline, parametric modulation, and category-based activation.
The checked fNIRS data were preprocessed and analyzed using the NIRS Brain AnalyzIR Toolbox (Santosa et al., 2018). We adopted an analytical approach that minimized data manipulation (Santosa et al., 2018, p. 29). The raw fNIRS light intensity data were first converted to optical density and then converted to oxygenated (HbO) and deoxygenated (HbR) hemoglobin concentrations by the modified Beer-Lambert law (Strangman et al., 2003) with a partial pathlength factor of 0.1. Noise reduction and correction were conducted with regression models. Specifically, we used an autoregressive iteratively reweighted least-squares model (AR-IRLS) for the first (individual) level analysis. The AR-IRLS model corrects serially correlated errors using an auto-regressive filter (prewhitening) and employs robust weighted regression to iteratively downweigh noisy channels (Barker et al., 2013). This procedure outperforms other methods in correcting physiological noise and correlated errors in fNIRS measurements (Huppert, 2016). Concentrations of HbO have been consistently shown to have a better SNR and to be more sensitive than HbR in reflecting task-induced, event-related cortical responses (Cheng et al., 2015; Hoge et al., 2005; Huppert et al., 2006; Jiang et al., 2015). Thus, we focus here only on results regarding HbO concentrations (results regarding HbR are available in full in Tables S5 and S6).
The pre-processed HbO data were first subjected to two model-based analyses to examine different aspects of the updating process and potential valence-dependent effects at the individual (first) level. As preregistered, for the first model (Model-1) we created a regressor with the onsets of the 72 observed draws presented in the gamble scenarios, with each scenario being one trial. This regressor was then parametrically modulated by belief updating, expectancy violation, and value updating. Value updating was directly quantified as the participants' ΔWTS at each trial. Because belief updating and expectancy violation refer to internal processes that cannot be directly measured with the behavioral data, their trial-wise values were defined by the Bayesian model (see Section 2.5.1.1 for details). Given the way the winning colors and the colors of the observed draws were manipulated across urn compositions of different gamble types, by design, belief updating and expectancy violation can independently occur in the experiment at various levels across ambiguous gambles with ambiguous color draws (r = 0 between model-predicted values of belief updating and expectancy violation). As in the previous fMRI study of the gamble bidding task, this manipulation allowed analyses of brain correlates of belief updating and value updating independent of expectancy violation. However, when all trials (i.e., also including trials of risky color draws where zero belief updating was predicted) were considered together, a statistical correlation between belief updating and expectancy violation resulted (r = 0.62, cf. r = 0.7 found in Kobayashi and Hsu, 2017). Given this correlation, we calculated the variance inflation factor (VIF; Stine, 1995) to check for collinearity of the regressors (using check_collinearity in performance package; Lüdecke et al., 2021), which can signal unstable and difficult-to-interpret coefficients. A VIF of 1 indicates complete orthogonality between the regressors and 10 is a commonly used threshold for high collinearity (O'Brien, 2007). When examining our data, the maximum VIF value was 1.64, which indicated that the correlation among the regressors would not substantially influence the results of the models. Therefore, we mean-centered and normalized (range: −1 to 1) all parametric modulators (i.e., belief updating, expectancy violation, value updating), instead of statistically orthogonalizing them. Furthermore, with the active variant of the gamble bidding task used here, we could use the empirically observed ΔWTS values as the parametric modulator of value updating, instead of using model-predicted values (cf. Kobayashi and Hsu, 2017). Importantly, this also enabled us to assess potential valence-dependent effects on updating processes. For this, we set up the second model (Model-2) with three separate regressors: the onsets of the observed draws for norm-pos, norm-neg and norm-zero trials. For both Model-1 and Model-2, we also included identical regressors of no interest to control for variance associated with (1) urn presentation, as well as (2) motor responses associated with predraw or postdraw bidding. All regressors were modeled with their respective duration (i.e., presentation duration or response times for the bidding phases). The canonical hemodynamic response function (also known as 'double gamma function') with default parameters (peak time 4 s and undershoot time 16 s; Santosa et al., 2018) was selected for convolution to form the main regressors in the design matrix.
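The preparation of a parametric modulator can be sketched in Python as follows. The exact normalization procedure is not specified in the text; this sketch assumes one plausible variant, in which the mean-centered values are divided by their largest absolute value so that they fall within −1 to 1. The function name is hypothetical.

```python
from statistics import mean


def center_and_scale(values: list[float]) -> list[float]:
    """Mean-center a modulator, then scale it into the range [-1, 1]."""
    m = mean(values)
    centered = [v - m for v in values]
    max_abs = max(abs(c) for c in centered)
    if max_abs == 0:  # constant modulator: nothing left to modulate
        return [0.0] * len(values)
    return [c / max_abs for c in centered]
```

Dividing by the largest absolute value, rather than rescaling minimum and maximum separately, keeps the modulator's mean at zero, which is the property that mean-centering is meant to provide.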
At the second (group) level, the first-level data modeled by Model-1 and Model-2 were analyzed separately with linear mixed-effects models to calculate the group mean for each condition, with participants included as random effects. Student's t-tests were performed to calculate and compare the channel-wise regression coefficients for each condition. To correct for multiple comparisons, we adopted a false-discovery rate (FDR) correction with q < 0.05 (Benjamini and Hochberg, 1995). All tests from analyses conducted at the group level (i.e., including two data types, 40 channels and all conditions of interest) were subjected to this FDR correction, making our correction rather conservative. The results from group-level analyses were visualized using the nirs2img function (https://www.alivelearn.net/?p=2230) to convert the t statistic values of significant channels and the corresponding MNI coordinates into *.img files. The converted images were then rendered over the 3D brain model using Surf Ice (https://www.nitrc.org/projects/surfice/).
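The Benjamini-Hochberg step-up procedure behind this FDR correction can be sketched in Python as follows. This is a generic illustration with a hypothetical function name, not the toolbox routine used in the study.

```python
def benjamini_hochberg(p_values: list[float], q: float = 0.05) -> list[bool]:
    """Return, per test, whether it is declared significant at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose p-value lies below its step-up threshold.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank
    # Every test ranked at or below that rank is rejected.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected
```

Because the thresholds scale with the total number of tests, pooling all channels, data types and conditions into one family, as described above, makes each individual threshold stricter.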

Relationship between updating-related cortical response and Bayesian rationality.
We examined potential associations between updating-related cortical responses and the behavioral performance index of optimal updating (i.e., the deviation between predicted value updating and ΔWTSs). This analysis was pre-registered, but the specific methodological details could not be specified at preregistration. We defined four regions of interest (ROIs), i.e., the right frontal gyrus (12 channels), the left frontal gyrus (11 channels), the right parietal gyrus (8 channels) and the left parietal gyrus (9 channels). Before conducting the correlational analyses, we checked the z-scores of beta values of all participants in each ROI separately. No outliers (defined by the preregistered criterion of 3 standard deviations above or below the group mean) were identified. Thus, no participants were excluded from the analyses.

Value updating behavior is Bayesian quasi-optimal
The participants' subjective values before and after observed draws differed across urn compositions and gamble types (see Figs. S1 and S2 with accompanying texts in Supplementary Materials for the distributions and ranges of WTSs at predraw and postdraw stages for ambiguous and risky gambles, respectively). Note, however, that the measure of value updating we computed (the ΔWTSs) takes individual differences in subjective values into account. Regardless of such individual differences, the mean effects showed that participants' value updating behavior differed across urn compositions of the different gamble types, consistent with the Bayesian model (log-likelihood ratio tests, null model vs. full model, ps < 0.0001; Fig. 3A). More importantly, their updating behavior was sensitive to the relevance of new information, i.e., the reducibility of uncertainty. Specifically, the participants' value updating was significantly larger than 0 in normatively positive (match) trials [median (Mdn) = 0.94, p < 0.001, r_rb = 0.54] and less than 0 in normatively negative (mismatch) trials (Mdn = −1.56, p < 0.001, r_rb = 0.84). Although value updating in the normatively zero trials statistically differed from 0 (Mdn = −0.028, p = 0.02, r_rb = 0.34), which indicates a deviation from the Bayesian predictions, value updating did not differ from 0 in any of the three normatively zero subcategories when these were analyzed separately (all ps > 0.1; see Fig. S3 for details).

Valence-dependent value updating and deviations from Bayesian optimality
Although the above results indicate that in general the participants behaved quasi-optimally, the data also reveal deviations below the predicted values for the norm-pos trials (i.e., when the colors of observed draws matched the winning colors in scenarios of ambiguous gambles; Fig. 3A). To assess whether the degree of value updating differed between categories of gamble scenarios, we flipped the sign of ΔWTSs of the norm-neg trials. The pre-registered Friedman's ANOVA revealed a significant main effect of category, χ²(2) = 52.84, p < 0.001, Kendall's W = 0.59. Post-hoc pairwise Wilcoxon signed rank tests showed significant differences between all three trial categories. ΔWTSs in the norm-neg trials (sign-flipped; Mdn = 1.56) were significantly higher than in the norm-pos (Mdn = 0.94, p = 0.001, Bonferroni-Holm corrected) and in the norm-zero trials (Mdn = −0.03, p < 0.001, Bonferroni-Holm corrected). ΔWTSs in the norm-pos trials were higher than in the norm-zero trials (p < 0.001). We further tested whether ΔWTSs in these categories conformed to Bayesian predictions. Unlike in the norm-neg (Mdn = 0.04, p = 0.09, r_rb = 0.25) and all norm-zero (all ps > 0.05, see also Fig. S3) categories, which did not differ from model predictions, ΔWTSs in the norm-pos trials were significantly less pronounced than the model predicted (Mdn = −0.65, p < 0.001, r_rb = 0.73; Fig. 3B). Together, these results indicate valence-dependent updating behavior in ambiguous gambles, with a suboptimal underweighting of new information in norm-pos trials.

Mechanisms underlying valence-dependent updating
According to the Bayesian model, normative value updating should only be driven by belief updating, which takes the reducibility of uncertainty into account (i.e., only after ambiguous color draws in ambiguous gambles). Nonetheless, value updating could also be influenced, to some extent, by expectancy violation, because in every observed draw (including trials with risky color draws) expectations can be violated merely due to the probabilistic nature of each color draw. We fitted four models with different regressors to the value-updating data from different trial categories to determine potentially distinct underlying mechanisms. The best-fitting model for each trial category is reported in Table 1 (see also Table S7 for an overview of all model fits).
In line with the findings reported by Schulreich and Schwabe (2021), for norm-neg trials, model-derived belief updating values emerged as a significant predictor for the observed value-updating data (i.e., ΔWTSs; see Table 1). Adding values of expectancy violation as defined by the Bayesian model and the interaction of belief updating × expectancy violation as additional predictors resulted in worse model fits (see Table S7). In contrast, for the norm-pos trials the best-fitting model included both model-derived values of belief updating and expectancy violation (likelihood-ratio test of improved fit, L-ratio = 23.12, p < 0.001). This indicates that value updating in the norm-pos category was not only driven by belief updating, as normatively predicted, but was also partly driven by expectancy violation (Table 1). Adding a belief updating × expectancy violation interaction term as a predictor resulted in a worse model fit, whilst expectancy violation as the only predictor yielded the worst model fit for both normative trial categories (see Table S7 also for descriptions and fits for data in the norm-zero category).
In light of these results, we conducted a moderation analysis (Baron and Kenny, 1986) to assess whether the contributions of model-derived belief updating and expectancy violation values in predicting ΔWTSs would depend on the valence of updating (henceforth updating valence, which was defined in the model as a dummy variable coded 0 for the norm-neg and 1 for the norm-pos category). A mixed-effects regression model with random intercepts for the participants and random slopes as a function of urn composition was specified (Preacher et al., 2015). Belief updating and expectancy violation were centered and normalized before being included as predictors. For each participant, urn-wise ΔWTSs from norm-pos and norm-neg (sign-flipped) trials were combined in one column as the dependent variable. Two interaction terms were created and added to the regression model by multiplying each predictor with the dummy factor of updating valence. To better understand the moderating effects, we estimated the regression slopes of the belief-updating predictor for each updating valence (using function emtrends in emmeans package; Lenth et al.). As shown in Fig. 3C, updating valence had a significant moderating influence on the relationship between belief updating and ΔWTSs (β = −0.36, t(491) = −3.27, p = 0.001). Furthermore, a steeper slope of the regression line was observed for the norm-neg trial category, suggesting that the influence of belief updating on value updating is stronger in norm-neg than in norm-pos trials. Although updating valence did not significantly moderate the relationship between expectancy violation and value updating in general (β = 0.21, t(491) = 1.87, p = 0.06; the lower panel of Fig. 3C), expectancy violation significantly influenced value updating in norm-pos trials but not in norm-neg trials.
Lastly, for trials in the norm-zero category, value updating after risky color draws in ambiguous gambles was significantly predicted by expectancy violation (β = 2.69, t(224) = 3.27, p = 0.001), whereas value updating after risky color draws in risky gambles was better described by the null model (with a constant estimate). After ambiguous color draws in risky gambles, the Bayesian model assumed that beliefs (but not values) were updated. Including belief updating as a predictor resulted in a poorer model fit. While the model with expectancy violation showed the best model fit, the coefficients of the predictor and intercept were not significant (Table S7).
Taken together, these results indicate that the mechanisms associated with value-updating behavior varied depending on updating valence. Interestingly, in addition to being affected by belief updating as expected by the Bayesian model, value updating in the norm-pos trials was also affected by expectancy violation. Expectancy violation also accounted for value updating after risky color draws in ambiguous gambles, which suggests suboptimal value updating in this trial category. In risky gambles, in line with predictions of the Bayesian model, no value updating was observed irrespective of the color of the balls drawn, and expectancy violation did not significantly influence the process.

Individual differences in value updating behavior
To explore individual differences in the suboptimal behavior in norm-pos trials, we separately fitted linear regression models with expectancy violation and belief updating as predictors for each participant's trial-wise norm-pos ΔWTSs. As described in the Methods section, model-derived values of expectancy violation and belief updating were separable and independent across ambiguous gambles (r = 0) given our task design; thus, the standardized regression coefficients allow us to quantify their contributions (Johnson, 2000). Both predictors and ΔWTSs were standardized before model specification. The relative influence of expectancy violation on value updating for each participant was then quantified as the squared standardized regression coefficient of expectancy violation divided by the sum of the squared standardized coefficients of both predictors, i.e., β²_expVio / (β²_expVio + β²_beliefupd). The range of this index is [0, 1], with values closer to 1 indicating a greater relative influence of expectancy violation. Next, we performed correlational analyses to explore relations between individual differences in the influence of expectancy violation and the updating behavior. Since both variables were not normally distributed, the non-parametric Spearman rank correlation was used (de Winter et al., 2016). The results indicated that individuals who were more influenced by expectancy violation in their updating process during gambles in the norm-pos category also tended more towards underweighting positive new information relative to the Bayesian predictions (r_s = −0.27, p = 0.07; Fig. 3D) and deviated significantly more (in absolute values) from the predictions (r_s = 0.36, p = 0.02; Fig. 3E).
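The relative-influence index can be written as a short Python function. This is a minimal sketch with hypothetical names; the inputs are the standardized regression coefficients from one participant's model.

```python
def relative_influence(beta_exp_vio: float, beta_belief_upd: float) -> float:
    """Share of expectancy violation: b_ev^2 / (b_ev^2 + b_bu^2), in [0, 1]."""
    denom = beta_exp_vio ** 2 + beta_belief_upd ** 2
    if denom == 0:
        raise ValueError("index is undefined when both coefficients are zero")
    return beta_exp_vio ** 2 / denom
```

For illustration, standardized coefficients of 0.3 for expectancy violation and 0.4 for belief updating give an index of 0.36, meaning that belief updating dominates. Squaring the coefficients makes the index insensitive to their signs.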
In terms of the relationship between value updating performance and logical reasoning ability, which is an important facet of fluid intelligence, we observed a significant negative correlation between Raven's scores and the |DEV| of norm-pos trials, $r_s$ = −0.35, p = 0.02 (q = 0.03 with FDR correction), after controlling for Raven's processing time (Fig. 3F). We controlled for individual differences in the time taken for the Raven's test because it was correlated with the test score, $r_s$ = 0.28, p = 0.04, implying better performance with increased time spent on the test. Furthermore, the Raven's scores were also significantly negatively correlated with the |DEV| of norm-zero trials, $r_s$ = −0.38, p = 0.01 (q = 0.025 with FDR correction), and with the average |DEV| across all trial types, $r_s$ = −0.41, p = 0.005 (q = 0.025 with FDR correction). However, there was no significant correlation with $|\mathrm{DEV}_{\text{norm-neg}}|$, $r_s$ = −0.29, p = 0.06. These results indicate that value updating is in general more in line with Bayesian predictions in individuals with higher reasoning ability.
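For readers unfamiliar with FDR-adjusted q-values, the following sketch shows the Benjamini-Hochberg adjustment. It is an assumption that this particular procedure and this family of four tests were used; the text only states "FDR correction". The function name `bh_qvalues` is hypothetical, and the inputs are the rounded p-values reported above, so the output need not reproduce the reported q-values exactly.

```python
import numpy as np

def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                        # indices of p sorted ascending
    scaled = p[order] * m / np.arange(1, m + 1)  # p_(i) * m / i
    # Enforce monotonicity: each q is the minimum over all larger ranks.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(scaled, 0.0, 1.0)
    return q

# Rounded p-values as reported: norm-pos, norm-zero, average, norm-neg.
q = bh_qvalues([0.02, 0.01, 0.005, 0.06])
```

The adjustment ranks the p-values, scales each by the number of tests divided by its rank, and then forces the result to be non-decreasing in rank, so a smaller p-value never receives a larger q-value than a bigger one.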

Neural correlates of updating processes
We first performed channel-wise parametric analyses of the fNIRS HbO signal in the NIRS Brain AnalyzIR Toolbox (Santosa et al., 2018) to assess cortical correlates of belief updating, expectancy violation, and value updating. We observed that five channels, primarily in the right middle frontal gyrus (MFG), right superior parietal gyrus (SPG), and right inferior parietal gyrus (IPG), showed activity that was positively related to belief updating, whereas the activity of one left SPG channel exhibited a negative correlation with belief updating (q < 0.05; Fig. 4A, see also Table S5 for an overview of statistics for individual channels). The activity of six channels, primarily in the left MFG and bilateral SPG, was positively associated with expectancy violation (q < 0.05; Fig. 4B, see also values in Table S5). As for value updating, activity of six frontal channels was positively related to it, whereas activity of two parietal channels and one left MFG channel was negatively related to it (q < 0.05; Fig. 4C, see also Table S5; note that the values of updating in norm-neg trials were not sign-flipped here, so that the negative T statistics reflect the negative values in these trials). Together, these results based on regressions with parametric modulations of belief updating, expectancy violation, and value updating yield activity in brain regions similar to those in the fMRI results reported by Kobayashi and Hsu (2017).

Valence-dependent effect of value updating in frontoparietal regions
Bayesian decision theory expects the same underlying updating process for normatively positive or normatively negative trials. However, behaviorally we observed asymmetric updating behavior depending on valence, indicating non-normative underweighting of new information in norm-pos trials. Thus, in the second model, we compared cortical activity between these two categories of gamble scenarios. Results from the analyses revealed that HbO responses in 12 channels (nine frontal channels and three right parietal channels, q < 0.05; Fig. 5, see also Table S6) were larger in the norm-pos trials than in the norm-neg trials, indicating valence-dependent recruitment of these regions.

Brain-behavior correlations of valence-dependent value updating
Results from previous sections show valence-dependent effects at both the behavioral and brain levels. Specifically, worse performance (greater deviations from Bayesian predictions) but greater frontoparietal activity was found in the norm-pos trials compared to norm-neg trials. To better understand individual differences in valence-dependent cortical activity and value-updating performance in norm-pos trials, for each participant we extracted the contrast values (i.e., $\beta_{\text{norm-pos}} - \beta_{\text{norm-neg}}$) from four ROIs (left/right frontal/parietal, see Methods section for details) and correlated them with individual differences in value updating performance in norm-pos trials and with the valence-dependent asymmetry of the updating performance. Updating performance in norm-pos trials was measured by the absolute difference between ΔWTSs and the Bayesian model prediction ($|\mathrm{DEV}_{\text{norm-pos}}|$). The closer $|\mathrm{DEV}_{\text{norm-pos}}|$ is to 0, the more normative the performance (i.e., less underweighting of positive new information). The asymmetry in updating performance was defined as the difference between the absolute model deviations of the norm-pos and norm-neg trials ($|\mathrm{DEV}_{\text{norm-pos}}| - |\mathrm{DEV}_{\text{norm-neg}}|$). Positive values in this case indicate relatively less normative performance in norm-pos trials compared to norm-neg trials.
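The two behavioral measures defined above can be sketched as follows. This is an illustration under stated assumptions: |DEV| is taken here as the mean absolute deviation across a participant's trials (one plausible reading of the definition), the function names `abs_dev` and `asymmetry` are hypothetical, and the numbers are invented for the example.

```python
import numpy as np

def abs_dev(delta_wts, bayes_pred):
    """Mean absolute deviation between observed value updates and
    Bayesian predictions across trials (assumed aggregation)."""
    observed = np.asarray(delta_wts, dtype=float)
    predicted = np.asarray(bayes_pred, dtype=float)
    return float(np.mean(np.abs(observed - predicted)))

def asymmetry(dev_norm_pos, dev_norm_neg):
    """|DEV_norm-pos| - |DEV_norm-neg|; positive values mean less
    normative updating in norm-pos than in norm-neg trials."""
    return dev_norm_pos - dev_norm_neg

# Invented example: upward updates fall short of the prediction,
# downward updates are close to it.
dev_pos = abs_dev([2.0, 3.0], [5.0, 5.0])      # mean(|-3|, |-2|)
dev_neg = abs_dev([-4.0, -5.0], [-5.0, -5.0])  # mean(|1|, |0|)
asym = asymmetry(dev_pos, dev_neg)
```

In this toy case the norm-pos deviation exceeds the norm-neg deviation, so the asymmetry index is positive, matching the pattern of underweighted positive information.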
As shown in Fig. 6A, individuals with higher norm-pos vs. norm-neg activity in the left frontal and the left and right parietal cortex showed value updating behavior that deviated less from the model in the norm-pos trials ($r_s$ values > −0.30, p values < 0.05), while a marginal effect was found in the right frontal cortex ($r_s$ = −0.29, p = 0.051). After FDR correction (q < 0.05) for multiple testing, only the left-parietal correlation remained significant. Consistently, as shown in Fig. 6B, participants with higher norm-pos vs. norm-neg activity also showed more normative performance in norm-pos trials relative to their own performance in the norm-neg trials (all $r_s$ values > −0.31, p values < 0.04). Except for the right-frontal region, effects in the remaining three regions survived the FDR correction. These correlations remained largely unchanged after controlling for individuals' Raven's scores and the time taken to complete the Raven's test (for details see Text S3).

Discussion
Extending previous studies on mechanisms underlying updating processes during decision making with reducible and irreducible uncertainty, the present study investigated potential valence-dependent effects on value updating and the associated cortical mechanisms in an adapted variant of a gamble bidding paradigm (cf. Kobayashi and Hsu, 2017). In combination with model-based analyses using Bayesian decision theory and measuring cortical activities using fNIRS while simultaneously assessing behavioral value updating, the active variant of the gamble bidding task used here not only allowed the dissociation of subprocesses associated with belief updating, value updating, and expectancy violation, but also the examination of valence-dependent mechanisms. The observed valence effects here provide the first clear findings of valence-dependent updating behavior in the context of decision making under reducible uncertainty and the associated brain mechanisms. These findings have implications for discussions about limitations of normative Bayesian decision theory and can be interpreted in line with processes of statistical inference and learning when couched in the more general principles of Bayesian inference underlying perception and cognition. Below, we discuss these results in detail in light of previous findings and theories.
The behavioral results from the current study corroborate previous empirical findings showing that young adults are sensitive to the nature of uncertainty (reducible vs. irreducible) and behave in a quasi-optimal manner when updating values after experiencing scenarios providing new information about the gambles (Fig. 3A; cf. Kobayashi and Hsu, 2017; Schulreich and Schwabe, 2021). Together, these findings lend support for Bayesian principles about the roles of uncertainty in perceptual or cognitive information integration (Fiser et al., 2010; Knill and Pouget, 2004; Ma and Jazayeri, 2014; Pouget et al., 2013). Specifically, the active inference framework postulates that in dynamic, probabilistic environments, individuals make inferences about statistical contingencies in the environment and gradually learn to adapt their internal models of the world (or beliefs) in order to minimize the discrepancies between expectations based on prior knowledge and action outcomes (Da Costa et al., 2020; Friston, 2005, 2009). Considering our gamble bidding task through the lens of this framework, participants may have formed prior beliefs about the potential winning probability of a given gamble, based on the initial presentation of urn composition. This initial belief served as a starting point for making inferences about the expected value of that gamble. As the participants gained further information reflecting a positive or negative change in the winning probability in each different scenario of the gamble from trial to trial, the inference about the expected value of the gamble was updated as reflected in the ΔWTSs. Such updates across independent scenarios of a gamble allowed the participants to gradually learn more about the statistical properties of the gamble and update their beliefs about the winning probability. In line with Bayesian normative behavior, belief and value updating only occurred in scenarios when new information reduced uncertainty, as in the case of ambiguous gambles. However, updating behavior was valence dependent and thus deviated from Bayesian predictions. This valence dependency calls for separate, valence-specific parameters in Bayesian decision theory to account for updating behavior with new information about increased or decreased winning probabilities. Similarly, generic theories in the active-inference (see Da Costa et al., 2020 for review) or reinforcement-learning frameworks (see Doll et al., 2012 for review) do not usually model learning differently depending on outcome valence; however, these models can be extended by valence-specific parameters to account for valence-dependent inference and learning (cf. Niv et al., 2002; Palminteri and Lebreton, 2022).
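To make the normative benchmark concrete, the sketch below performs a Bayesian update over the unknown composition of a simplified urn. It is not the model used in this study: the urn size, the uniform prior, a single draw with replacement, and the function names `update_urn_belief` and `expected_win_prob` are all assumptions chosen for illustration.

```python
import numpy as np

def update_urn_belief(k_values, prior, n_total, drew_winning_colour):
    """Posterior over k, the unknown number of winning-colour balls,
    after observing one draw (with replacement, an assumption)."""
    k = np.asarray(k_values, dtype=float)
    p_win_given_k = k / n_total
    likelihood = p_win_given_k if drew_winning_colour else 1.0 - p_win_given_k
    posterior = np.asarray(prior, dtype=float) * likelihood
    return posterior / posterior.sum()

def expected_win_prob(k_values, belief, n_total):
    """Expected winning probability under the current belief about k."""
    return float(np.dot(k_values, belief) / n_total)

# Hypothetical fully ambiguous urn: 10 balls, k anywhere from 0 to 10.
n_total = 10
k_values = np.arange(0, n_total + 1)
prior = np.full(k_values.size, 1.0 / k_values.size)

p_prior = expected_win_prob(k_values, prior, n_total)
p_after_win = expected_win_prob(
    k_values, update_urn_belief(k_values, prior, n_total, True), n_total)
p_after_loss = expected_win_prob(
    k_values, update_urn_belief(k_values, prior, n_total, False), n_total)
```

Under these assumptions the expected winning probability moves by the same amount upwards after a winning-colour draw as it moves downwards after a losing-colour draw. This symmetry is what the normative account predicts, and it is the baseline against which the valence-dependent underweighting of positive information is measured.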
At the neural level, we observed HbO activities assessed with fNIRS in different regions of the lateral frontoparietal cortex that were distinctly associated with belief updating, expectancy violation, and value updating (Fig. 4). The observed frontoparietal involvements suggest that both processes of inference and learning (cf. Friston, 2005) were involved in performing the gamble bidding task. These findings in part replicate observations made in earlier fMRI studies, but with an adapted active-bidding paradigm and a different imaging modality (i.e., fNIRS). While the replication of earlier fMRI findings in the different imaging modality of fNIRS is certainly of some interest, the main benefit of the study design lies in the possibility to examine brain mechanisms associated with valence-dependent updating behavior by using an active bidding paradigm, which could not be investigated previously in the passive variant of the task.
Regarding the replication of previous findings, the HbO signals associated with belief updating were found in the right lateral frontoparietal cortex, similar to the fMRI results reported by Kobayashi and Hsu (2017), which confirms the role of the frontoparietal cortex in integrating new information with existing beliefs. Despite some variations in the exact localizations, several studies across different domains and contexts have confirmed the link between the frontoparietal cortex and belief updating (e.g., Gläscher et al., 2010; Visalli et al., 2019; Waskom et al., 2017). Moreover, a recent study found that applying anodal transcranial direct current stimulation (tDCS) over the right dorsolateral prefrontal cortex made young adults behave more rationally during Bayesian updating (Schulreich and Schwabe, 2021). Regarding expectancy violation, we observed associated HbO signals in the left MFG and the bilateral posterior parietal cortex (PPC), which is consistent with findings from previous studies that decomposed belief updating and surprise (Nour et al., 2018; O'Reilly et al., 2013; Schwartenbeck et al., 2016; Visalli et al., 2019). However, this finding differed from Kobayashi and Hsu's (2017) fMRI study, which suggested a unique association between the activity of the anterior insula and the degree of expectancy violation. This discrepancy could in part be due to the use of different neuroimaging modalities as well as differences in task procedures. Although both studies reported the neural correlates of updating processes during the phase of observing different gamble scenarios, the procedural details of the tasks differed markedly between the two studies. In particular, in Kobayashi and Hsu's (2017) fMRI study, the participants were merely required to passively observe different scenarios without actual subsequent bidding involved. Our active bidding procedure, which eventually was associated with a potential bonus at the end of the task, may engage participants more during task performance. Some evidence suggests that active/passive decision involvement may influence how people process decision information (Kuhnen, 2015) and the underlying neural responses (Rao et al., 2008). For instance, Kuhnen (2015) found that individuals update their beliefs more pessimistically in the loss domain when actively investing compared to passive involvement (i.e., only evaluating how good the stock is based on the provided information).
Furthermore, going beyond results observed in the two previous studies (Kobayashi and Hsu, 2017; Schulreich and Schwabe, 2021), by allowing active bidding and simultaneously assessing cortical responses, we were able to detect and examine valence-dependent updating behavior (Fig. 3B) and cortical responses (Fig. 5). Specifically, although the participants generally weighed new information in a quasi-optimal manner after observing scenarios of gambles indicating a decrease of winning probability (normatively negative scenarios), they deviated from the Bayesian prediction and underweighted the new information after observing scenarios indicating an increase in winning probability (normatively positive scenarios). Results from model-based analyses suggest that the Bayesian-suboptimal underweighting of new positive information may, in part, be associated with the influence of expectancy violation on value updating in scenarios indicating higher winning probabilities than initially thought. When predicting value updating behavior, the best-fitting model for norm-pos trials included both belief updating and expectancy violation as predictors. In contrast, only belief updating was required for fitting data from the norm-neg trials (Table 1). Results from correlational and moderation analyses lend further support for this finding: value updating is only associated with belief updating and not with expectancy violation in norm-neg trials; however, it is associated with both belief updating and expectancy violation in norm-pos trials (Fig. 3C). In terms of brain correlates, individuals with greater involvement of frontoparietal activity during the norm-pos trials showed less suboptimal underweighting of positive information during value updating (Fig. 6A) and a lesser asymmetry in valence-dependent value updating (Fig. 6B).
Although normative Bayesian perspectives commonly assume equal weighting of positive and negative outcomes, human decision-making processes are known to be affected by other factors underlying cognition or motivation during uncertain situations (Maddox and Markman, 2010). Thus, human decision-making behavior may not be entirely captured by normative predictions in many situations. In our study, for instance, we observed a valence-dependent value updating pattern. This deviation from Bayesian normativism need not be considered a bias of human reasoning but could also reflect bounded rationality within the constraints of the decision contexts and individual attributes (Kahneman, 2003; Gigerenzer and Selten, 2002). Previous studies found that young adults showed quasi-optimal updating behavior and sensitivity to the reducibility of uncertainty (cf. Kobayashi and Hsu, 2017; Schulreich et al., 2020). Congruent with this, our results also indicate that participants only updated subjective values in ambiguous gambles upon receiving information indicating changes in the winning probabilities. However, our findings introduce a new perspective: the way participants carried out the updating deviated from Bayesian predictions as it depended on the valence of the new information, a factor not accounted for by Bayesian theory. Following a draw that indicated a decrease in winning probability, values were rationally adjusted downwards; however, following a draw that indicated an increase in winning probability, values were not adjusted upwards to the extent the Bayesian model predicted, reflecting conservative updating behavior in this case. In contrast to recent findings by Schulreich and Schwabe (2021), which showed conservative updating following both normatively positive and negative scenarios, our results are rather in line with the typical behavior of loss aversion observed when people make decisions under uncertain and risky situations. Prospect Theory (Kahneman and Tversky, 1979) proposes a different weighting of prospective gains and losses during choice. In mixed gambles, for instance, people tend to show greater sensitivity to losses than gains (Schulreich et al., 2020; Tversky and Kahneman, 1992). This framework also postulates that evaluations of a given outcome are based on changes from a reference point (or a status quo) rather than on the final state. In the current study, the participants' initial subjective value after the urn presentation prior to observed draws represents a plausible reference point. An ambiguous-color draw that matches the winning color (i.e., the norm-pos scenarios) represents a gain compared to this reference point (i.e., prior to the draw), as it informs about an increase of the likelihood of winning in that gamble. In contrast, an ambiguous-color draw that mismatches the winning color (i.e., the norm-neg scenarios) represents a loss compared to the reference point, as it indicates a decrease in winning probability. Interestingly, the shape of the observed function of deviations from Bayesian predictions (as illustrated in Fig. 3A) might be explained by the shape of a prospect-theoretic value function (Kahneman and Tversky, 1979). Specifically, a steeper slope for potential losses (i.e., stronger updating for negative information) would translate to a deviation function closer to the Bayesian prediction in our study. In a similar vein, the S-shape of the deviation function could be explained by the concavity and convexity of the prospect-theoretic value function for gains and losses, respectively. Our findings complement and extend the theory by suggesting that positive and negative events are also processed differently in how they are incorporated into prior beliefs. In addition, results from the regression analyses (Table 1) as well as moderation models (Fig. 3C) showed that, besides belief updating, expectancy violation also contributed to value updating after observing new positive information, but not after negative information.
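The prospect-theoretic value function referred to above can be written down in its standard parametric form. The sketch below uses the power-function form with the median parameter estimates reported by Tversky and Kahneman (1992); these parameters were not fitted in the present study, and the function name `pt_value` is hypothetical.

```python
def pt_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Prospect-theory value of an outcome x relative to the reference point.

    Gains:  v(x) = x ** alpha             (concave for alpha < 1)
    Losses: v(x) = -lam * (-x) ** beta    (convex, steeper when lam > 1)
    Default parameters are the Tversky and Kahneman (1992) median
    estimates, used here only for illustration.
    """
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** beta

gain = pt_value(1.0)   # value of a unit gain
loss = pt_value(-1.0)  # value of a unit loss
```

With these defaults a unit loss weighs more than twice as much as a unit gain, which is the steeper loss slope invoked in the interpretation above. The exponents below 1 produce concavity for gains and convexity for losses, the two features that could give rise to an S-shaped deviation function.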
In terms of neural correlates, concurrent with the behavioral valence-dependent effects on value updating, we found valence-dependent recruitment of brain activity. Although several frontoparietal regions were involved in belief and value updating in general, category-based contrasts directly comparing activity during norm-pos and norm-neg scenarios showed that greater frontoparietal activity was recruited when integrating new positive than negative information. What might underlie the greater demand on cortical resources when integrating new positive information, even though the associated value updating was reduced (and Bayesian-suboptimal) during scenarios indicating increases in winning probabilities?
One explanation might be the observed influence of expectancy violation (i.e., surprise). Probabilistic events are also associated with certain degrees of expectancy violation that do not necessarily reflect systematic changes in statistical contingencies of the decision contexts. In multi-step reinforcement-based decision tasks (e.g., Daw et al., 2011), expectancy violation may trigger processes associated with the so-called model-free (or habitual) learning instead of model-based processing. In the paradigm investigated here, the experimental manipulation ensured a relative independence of belief updating and expectancy violation (or only negligible collinearity). This allowed the exploration of distinct and shared brain correlates of these two processes. Other than dissociable brain activities observed in previous studies, which were in part replicated here (see discussions above), belief updating and expectancy violation were both also associated with HbO signals in the frontoparietal region. Although we could not observe activity in deeper brain regions using fNIRS, activity in the insula has been found to be uniquely associated with expectancy violation in fMRI data (Kobayashi and Hsu, 2017). Inputs from the insula to the striatum are known to affect reward-dependent behavior or memory (Haggerty et al., 2022; Parkes et al., 2015). Higher activity in the anterior insula and striatum has been shown to be associated with the tendency towards a default bias (staying with a default option or status quo instead of switching options) during a gambling task (Yu et al., 2010). Taken together, these findings suggest that other than the frontoparietal model-based process of belief updating, model-free reward expectancy processes may also be engaged, particularly during the normatively positive scenarios. The underweighting of positive information (conservative updating) would be in line with a conjecture that the positive information about an increase in the winning probability of a given gamble is rewarding in itself, which may either motivate individuals to stay with default beliefs or make them less sensitive to the increase in winning probability.
Another possible interpretation could be considered in terms of processing demands of cognitive information theory. Specifically, it has been argued that encoding surprising events (cf. expectancy violation) requires cognitive effort (see Zénon et al., 2019, for an overview). Our regression analysis revealed that expectancy violation is an additional influence besides belief updating in predicting value updating behavior in norm-pos but not in norm-neg scenarios. The suboptimal performance after observing draws indicating an increase in winning probability could be associated with an attentional distraction, since surprising events associated with rewards may attract attentional resources away from other aspects of cognitive processing (cf. Anderson, 2016; Noonan et al., 2018). Optimal task performance thus requires recruiting additional cognitive control that ignores prepotent but irrelevant signals in favor of attention to decision-relevant information. Recent models of cognitive control proposed that the frontoparietal cortex is engaged in the trial-by-trial adjustment of task-relevant information, which underlies top-down control (Cocchi et al., 2013; Crittenden et al., 2016; Marek and Dosenbach, 2018), and in performance monitoring. Relatedly, results from previous EEG studies also indicate that frontoparietally distributed event-related potentials (e.g., P300) were associated with processes of probabilistic expectations (Kluger et al., 2019) or uncertainty resolution in a novelty processing task (Harper et al., 2016). Furthermore, different components of the P3 have been associated with perceptual inference and learning stages as postulated by a predictive-processing account (Barceló, 2021). In line with these earlier studies, our results indicated that individuals who showed higher activities in the frontoparietal regions deviated less from Bayesian predictions in their value updating behavior during the norm-pos scenarios.
Consistent with our prediction, participants with higher logical reasoning ability as measured by the Raven's test deviated less from the Bayesian prediction during value updating in norm-pos and norm-zero scenarios and on average across all scenario types (Fig. 3F), whereas the correlation in norm-neg scenarios did not reach significance. This result indicates that individuals with higher reasoning abilities may be more likely to utilize rational strategies, such as the use of Bayes' rule, which make their updating behavior more in line with the Bayesian prediction. Conversely, individuals with lower reasoning abilities may have had a less concrete model/belief about the task and may have been more inclined to rely on situational factors, such as expectancy violation, that can lead to suboptimal decision-making. These results accord with a previous study, which found that individuals with a higher level of reasoning ability acquired a complex state transition structure underlying sequential decision making better than individuals with a lower level of reasoning ability (Eppinger et al., 2015). Taken together, under uncertain and dynamic environments, individual differences in reasoning ability may contribute to differences in how individuals utilize new information to update their decisions and choices.
While our findings provide new insights into the valence-dependent neurocognitive processing of environmental signals, some limitations of this study must be noted. First, fNIRS brain imaging is restricted to a set of regions of interest and to the outer surface of cortical tissue due to the limited penetration depth of the light. Therefore, we were not able to investigate additional subcortical brain regions that might play an essential role in information processing. For instance, the cingulo-opercular (CON) network, including the anterior cingulate and bilateral insula, has been suggested to process unexpected events, interact closely with the frontoparietal regions, and engage in cognitive control (Dosenbach et al., 2008; Marek and Dosenbach, 2018; Visalli et al., 2019). The evidence from affective neuroscience studies suggests that the valence-dependent bias may be due to asymmetric emotional and neural responses to prospective gains and losses, which have consistently been identified in the striatum and the amygdala (see Schulreich et al., 2020; or Sokol-Hessner and Rutledge, 2019 for a review). Hence, how the valence-dependent effects dynamically modulate the cortical-subcortical interactions during adaptive uncertainty reduction is an interesting question for future research. Furthermore, fNIRS cannot address the temporal dynamics of the updating processes because of its relatively poor temporal resolution (in our dual system setup, 3.47 Hz). Future studies may consider adopting a multimodal imaging approach (e.g., fMRI-EEG), which combines measures of high spatial and millisecond temporal resolution to better describe the neural dynamics of different updating scenarios. Finally, as our sample was relatively young (average age = 22.04 years), these results may not generalize to an older cohort. Given that aging is accompanied by a wide range of neurobiological and cognitive declines (see Grady, 2012; Li and Rieckmann, 2014 for reviews), which may also affect decision-making quality in old age (e.g., Chowdhury et al., 2013; Eppinger et al., 2015; Samanez-Larkin et al., 2012), future research with a broader age range would be beneficial in investigating neurocognitive processes underlying potential age-related differences in the sensitivity to the reducibility of uncertainty during decision making.

Conclusions
Using an active variant of a gamble bidding task in combination with fNIRS, we found valence-dependent effects at both the behavioral and neural levels in young adults. Specifically, although young adults' integration of negative information was aligned with Bayes' rule, they systematically underweighted positive information. Furthermore, we found greater frontoparietal activity in response to positive than negative information, and this valence-dependent modulation of brain activity was associated with better performance (i.e., less underweighting of positive information). The suboptimal updating behavior after observing positive information may be associated with the impact of expectancy violation on motivation and attention that interfere with belief updating. Together, our study supports the view that frontoparietal regions play a crucial role in adaptive information processing and sheds new light on the valence-dependent asymmetric updating of beliefs and values under uncertainty.

Data and code availability
The anonymized fNIRS and behavioral data that support the findings of this study are not openly available due to the conditions of the ethical approval we obtained for this study but are available to researchers upon request. Access to data by qualified investigators must comply with the European Union General Data Protection Regulation (GDPR) and all relevant guidelines. The completion of a data transfer agreement signed by an institutional official will be required. For data access, please contact the corresponding authors of the study, who will assist with the data access process and contact the relevant institutional office.
The fNIRS Analysis code (for Brain AnalyzIR Toolbox in MATLAB), behavioral analysis code, results, and the gamble bidding task Psychtoolbox program are all publicly available at: https://osf.io/qnrea/.

Declaration of Competing Interest
The authors declare that they have no conflict of interest.

Fig. 2. Montage setup of fNIRS measurements. (A) fNIRS montage in the international 10-10 coordinate space. Two 8 × 8 fNIRS systems were set up in tandem mode, resulting in 40 channels. Cz is highlighted in green. (B) Sensitivity profile in log10 (mm⁻¹) of the montage. Sources are indicated in red, detectors are indicated in blue, and yellow lines indicate channels. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
X.-R. Peng et al.

Fig. 3. Results of updating behavior. (A) Bayesian predictions and observed data (ΔWTSs) of value updating in ambiguous gambles (data points represent the six urn compositions). Error bars indicate the standard error of the mean (SE). (B) Category-wise deviations between observed and predicted value updating. The boxplots show the median and quartiles of the data; individual data points are plotted (♯ value updating was not different from 0 in all three normatively zero subcategories). (C) Upper panel: belief updating is positively associated with value updating in both norm-pos (red) and norm-neg (blue) trial categories, yet has a greater effect on the latter; lower-right panel: expectancy violation is positively associated with value updating only in norm-pos but not in norm-neg trial categories. (D) Individuals with a higher contribution of expectancy violation to value updating (see text in the Results section for details) tend to update their value less than the Bayesian model's prediction and (E) perform worse in normatively positive trials. (F) Individuals with higher Raven's scores exhibited better overall Bayesian rationality (quantified as the mean absolute deviation from the Bayesian model in the norm-pos categories). Of note, raw Raven's scores are illustrated for descriptive purposes, but the statistical test was based on Raven's scores after controlling for the time taken to complete the Raven's test. Similar relations were found for averages across all trial categories (see text in Results section for details). The shaded areas in (D), (E) and (F) represent a confidence interval of 95%. ***p < 0.001, **p < 0.01, ns p > 0.05. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 4. Activation in frontoparietal regions correlated with updating processes during observed draws. (A) Belief updating, (B) expectancy violation and (C) value updating. Participants' ΔWTSs were used as parametric modulators of value updating. ΔWTSs of normatively negative scenarios were not sign-flipped here, so the negative neural correlates reflect negative value updating (see text in the Results section for details). Only significant channels (FDR q < 0.05) are shown.

Fig. 5. Trial category-related neural activity during observed draws. (A) Updating category contrast map for HbO. Only significant channels (q < 0.05) are shown. (B) Mean β-value from the HbO signal over all channels in normatively positive (red), normatively negative (blue) and normatively zero (gray) updating categories during observed draws. Student's t-test statistics were extracted from the contrast analysis in the NIRS Brain AnalyzIR Toolbox (Santosa et al., 2018). The boxplots show the median and quartiles of the data; individual data points are plotted as dots. Asterisks indicate FDR q-values, ***q < 0.001, **q < 0.01, *q < 0.05. (C) Group-level average HbO concentration changes related to normatively positive (red), normatively negative (blue) and normatively zero (gray) updating categories, averaged across 4 ROIs, with the standard error of the mean as the shaded area. The gray area indicates the time window of the observed draw (duration = 4 s). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 6. Correlations between updating-related cortical responses during observed draws and gamble bidding task performance. (A) Individuals with higher norm-pos vs. norm-neg activity in four frontoparietal regions showed more Bayesian-optimal value updating (as indicated by less deviation) in the norm-pos trials. (B) Individuals with higher norm-pos vs. norm-neg activity also showed relatively better performance in norm-pos trials as compared to their own performance in the norm-neg trials. |DEV| denotes the absolute difference between ΔWTSs and the Bayesian model's prediction (both uncorrected p-values and FDR-corrected q-values are shown here).
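
The two quantities named in this legend, |DEV| and FDR-corrected q-values, can be illustrated with a minimal sketch. It is not the authors' analysis code. The function names and the toy numbers are hypothetical, and the Benjamini-Hochberg procedure is an assumption, since the legend does not state which FDR method was used.

```python
from statistics import mean


def abs_deviation(observed_dwts, predicted_dwts):
    """Per-trial |DEV|: absolute difference between observed and predicted value updates."""
    if len(observed_dwts) != len(predicted_dwts):
        raise ValueError("observed and predicted must have the same length")
    return [abs(o - p) for o, p in zip(observed_dwts, predicted_dwts)]


def mean_abs_deviation(observed_dwts, predicted_dwts):
    """Mean |DEV| across trials; smaller values mean closer agreement with the model."""
    return mean(abs_deviation(observed_dwts, predicted_dwts))


def bh_qvalues(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR method, see lead-in)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical toy data: observed vs. model-predicted updates for three trials.
observed = [2.0, -1.0, 0.5]
predicted = [3.0, -1.5, 0.5]
dev = abs_deviation(observed, predicted)
mad = mean_abs_deviation(observed, predicted)

# Hypothetical uncorrected p-values, e.g. one per region of interest.
q_values = bh_qvalues([0.01, 0.04, 0.03, 0.20])
```

In this sketch a lower mean |DEV| corresponds to updating that is closer to the Bayesian prediction, which is the sense of "less deviation" in panel (A).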

Table 1
Best-fitting models of the influences of belief updating and expectancy violation on value updating in different trial categories (negative updates not sign-flipped*). The coefficient for the norm-neg category is negative because Bayesian belief updating was included as absolute values and ΔWTSs were not sign-flipped here. Note. Bold font indicates statistically significant results (p < 0.05).