Habits as Adaptations: An Experimental Study

Psychologists emphasize two aspects of habit formation: (i) habits arise when the history of a decision process correlates with optimal continuation actions, and (ii) habits alleviate cognition costs. We ask whether serial correlation of optimal actions alone induces habits or if, instead, habits form as optimal adaptations. We compare lab treatments that differ in the information provided to subjects, holding fixed the serial correlation of optimal actions. We find that past actions affect behavior only in the treatment in which this habit is useful. The result suggests that caution is warranted when modeling habits via a fixed utility over action sequences.


Introduction
Habits play an important role in economic discourse. Economists employ them to explain diverse phenomena ranging from inertia of consumption to brand loyalty. The standard modeling approach represents habits via a fixed time-nonseparable utility function, thus leaving the issues of when and why habits form, and their responses to counterfactual environments, unanswered. Psychologists offer a view on both the purpose of and the mechanism underlying habit formation. They define habits as automated responses triggered by cues, where cues are elements of the decision history that empirically correlate with optimal continuation choices. In this view, the purpose of habits is to alleviate cognition costs. We ask whether habits originate by mechanically following a cue that empirically correlates with well-performing choices or whether, instead, habits are second-best adaptations. Our main experimental result supports the latter hypothesis. We view the finding as good news for the predictability of habits. An understanding of habit formation rooted in optimization can inform analysts which cues, out of several available cues, a decision-maker will leverage. Modeling habits as optimal adaptations also permits counterfactual predictions of habit strength under various policies.
To discriminate between the mechanical and optimization origins of habits, we compare treatments from a lab experiment in which subjects face a sequence of tasks generated by a given stochastic process. The compared treatments differ only in the information feedback. If habits form mechanically whenever past cues and optimal continuation choices correlate, then the variation in feedback should not impact habit formation and the selection of cues. Our data, however, show that subjects form distinct habits across these treatments; moreover, the cues selected are naturally rationalized as adaptations to the information provided.
Our basic experimental task is to recognize a binary state variable presented visually on a computer screen. Correctly identifying the state requires moderate cognitive effort. Subjects face two periods of this state-recognition task, across which the state evolves according to a known stochastic process with positive serial correlation. In the treatment without feedback, we reveal both realized states to the subjects only at the conclusion of the two-period sequence. We find that subjects form a habit: the first-period outcome predicts the second-period choice (controlling for the second-period state) in this treatment. The cue that subjects leverage is their first-period action; the first-period state does not predict the second-period action. In other words, the behavioral pattern exhibits action inertia. The habit alleviates the subject's cognitive burden since, due to the serial correlation of the states, the first-period action contains useful information about the second-period optimal choice, and the subjects utilize this information.
In the other treatment, with information feedback, we employ the same state-generating process, but we reveal the first-period state in between the two periods. Subjects again form a habit in this treatment: payoff-irrelevant elements of the history predict the continuation choice (controlling for the second-period state). However, importantly, the cue changes relative to the previous treatment.
The first-period action is no longer predictive; all of the predictive power is associated with the first-period state, which contains superior information about the optimal continuation action relative to the first-period action. This finding is inconsistent with the view that habits arise as a mechanistic consequence of the serial correlation of optimal actions. Rather, the result suggests that our subjects choose cues optimally, according to their informational content. As a further check, we ran additional treatments (with and without information feedback) in which the states were serially independent. As expected, subjects do not form habits in these treatments; the second-period choice is independent of all first-period variables.
If habits are, as our results suggest, optimal adaptations, then their strength should vary predictably with the decision environment, in particular, with the incentive stakes and the serial correlation of states. When stakes are decreased or correlation increased, the trade-off between reliance on the cues and the acquisition of new information shifts in favor of the cues. Thus, we predict that habits become stronger (that is, cues become more predictive of continuation behavior) when stakes are lower and correlation is greater. We test this hypothesis experimentally. We first confirm that when states are uncorrelated, a change of stakes has no impact: habits are not formed.
For the correlated treatments, changes in stakes and correlation have no impact on the cue selection, but they do affect the strength of habits. We obtain strong statistical evidence in favor of the predicted comparative statics when the selected cue is the past action. When the cue is the past state, the evidence continues to support the prediction, although it is less conclusive.
We supplement the experiments with a model that derives habit formation from primitive assumptions on the information-processing friction. In the model, a decision-maker chooses information structures (i.e., a strategy for how to acquire information about the states) and trades off the precision of her information against an acquisition cost. The model allows us to formalize the above intuitive predictions about habit formation, cue selection, and the comparative statics of habit strength. In Section 6, we discuss how such a model can provide counterfactual predictions for habit formation disciplined by optimization arguments in a general setting.
Popular macroeconomic models explain the empirically observed inertia of consumption by imposing a time-nonseparable utility function $u(c_t - c_{t-1})$, where $c_{t-1}$ is an aggregate of the consumption history; see, e.g., Pollak (1970) and Abel (1990). When $u$ is concave, high past consumption triggers high current consumption; i.e., $c_{t-1}$ becomes the cue for a consumption habit. Since the assumed utility representation is exogenous, the modeling choice of $c_{t-1}$ is not obvious, and specifications in the literature include aggregates of past population-wide consumption, past individual consumption, and past individual consumption of specific categories of goods; see Schmitt-Grohé & Uribe (2007) for a review. Laibson (2001) proposes a model of habit formation rooted in psychology that, like ours, focuses on the endogenous selection among several available habit cues, although, unlike in our case, the cue selection there is not rooted in the optimization of cognition costs. Camerer et al. (2018) study a model of habit formation inspired by neuroeconomics and advocate for the optimization-based origin of habits. Angeletos & Huo (2018) prove observational equivalence between a model featuring higher-order uncertainty and a model of a representative agent with consumption habits.
Our model of habit formation belongs to the rational-inattention literature originating in Sims (2003). It is a special case of the discrete dynamic rational-inattention model by Steiner et al. (2017), which in turn extends a related static model by Matějka & McKay (2015). Rational-inattention models have been used to derive inertia of behavior in a macroeconomic context; see Maćkowiak & Wiederholt (2009) for a theoretical contribution and Khaw et al. (2017) for an experimental exploration.

Habits and cues
We study habit formation in the simplest possible setting. A decision-maker (DM) chooses a binary action $a_t \in \{0, 1\}$ in each of two periods to maximize $\sum_{t=1}^{2} u(a_t, \theta_t)$. The binary state $\theta_t \in \{0, 1\}$ evolves according to a stochastic process known to the DM. The first-period state attains value 1 with prior probability $p_1$, and $\theta_2 = \theta_1$ with probability $\gamma \geq \frac{1}{2}$ for each value of $\theta_1$. The two states are independent when $\gamma = \frac{1}{2}$, and they are positively correlated if $\gamma > \frac{1}{2}$. The DM's task in each period is to match the action to the state: $u(a, \theta) = s$ if $a = \theta$ and zero otherwise, where $s > 0$ is the stake.
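To make the state process concrete, the following minimal Python sketch draws the two states and evaluates the matching payoff. It is only an illustration under the assumptions stated in the text; the function names and the fixed seed are hypothetical and are not part of the experimental software.

```python
import random


def draw_states(p1, gamma, rng):
    """Draw (theta1, theta2): theta1 = 1 w.p. p1, and theta2 = theta1 w.p. gamma."""
    theta1 = 1 if rng.random() < p1 else 0
    theta2 = theta1 if rng.random() < gamma else 1 - theta1
    return theta1, theta2


def payoff(a, theta, s):
    """u(a, theta) = s if the action matches the state, and zero otherwise."""
    return s if a == theta else 0.0


def persistence_frequency(p1, gamma, n, seed=0):
    """Empirical frequency of theta1 == theta2 over n simulated sequences."""
    rng = random.Random(seed)  # hypothetical seed, fixed for reproducibility
    same = 0
    for _ in range(n):
        theta1, theta2 = draw_states(p1, gamma, rng)
        same += theta1 == theta2
    return same / n
```

With $\gamma = \frac{1}{2}$ the simulated frequency of $\theta_1 = \theta_2$ should hover around one half (independent states), and it approaches $\gamma$ in general.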
An analyst collects data on the states and actions across many repetitions of the two-period sequence. In its idealized form, the analyst observes the joint probability distribution $\pi(\theta_1, a_1, \theta_2, a_2)$ over the quadruples of states and chosen actions. Our data extend the state-dependent stochastic-choice data introduced by Caplin & Dean (2015a) in a static setting to the dynamic context considered here.
We say that the DM forms a habit if there exists a triple $(\theta_1, \theta_2, a_1) \in \{0, 1\}^3$ such that $\pi(a_2 \mid \theta_2, \theta_1, a_1) \neq \pi(a_2 \mid \theta_2)$. Otherwise, if $a_2$ is independent of $(\theta_1, a_1)$ conditional on $\theta_2$, we say that the DM does not form a habit. Thus, the DM forms a habit if the history of the process, which is irrelevant to the continuation payoff, predicts continuation behavior. Our definition of habits is behavioral in nature and distinct from the commonly used non-separable utility approach.
Our analyst knows that the DM's utility is, in fact, time-separable; she attributes any correlation between the history and continuation behavior to imperfections in the decision process, and refers to the predictive power of the history as a habit. Our definition is related to the concept from Camerer et al. (2018) who define habits as a lack of adaptation to evolving incentives. When, as in our definition, the history predicts the continuation behavior controlling for the current state, then the DM has not fully adapted to the evolving state.
Additionally, we define cues that drive the habitual behavior. Is the habitual behavior in the second period, if it arises, triggered by the past state $\theta_1$, or by the past action $a_1$? Let $z$ be one of the two random variables in the set $\{\theta_1, a_1\}$ and $w$ be the complementary variable from this set.
We say that $z$ is the cue for the habit if (i) $\pi(a_2 = 1 \mid \theta_2, z = 1) > \pi(a_2 = 1 \mid \theta_2, z = 0)$, and if (ii) $\pi(a_2 = 1 \mid \theta_2, z, w) = \pi(a_2 = 1 \mid \theta_2, z)$. Thus, for instance, the past action $a_1$ is the cue for the habit if the probability that the DM chooses the high action in the second period increases with $a_1$ given $\theta_2$, and $\theta_1$ has no additional predictive power. The latter condition prevents a spurious identification of cues. Since $\theta_1$ and $a_1$ may be correlated (and indeed are correlated in our data), it may happen that they both correlate with continuation behavior, but all the predictive information is contained in only one of them.
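The two definitions can be checked mechanically on an idealized joint distribution. The Python sketch below is an illustration only: it stores $\pi$ as a dictionary keyed by $(\theta_1, a_1, \theta_2, a_2)$, reads conditions (i) and (ii) as holding for every value of $\theta_2$ and of the conditioning variables (an assumption), and uses a hypothetical example distribution whose numbers are made up for the demonstration.

```python
from itertools import product


def cond_prob_a2(pi, theta2, **given):
    """Pr(a2 = 1 | theta2, given); `given` may fix theta1 and/or a1."""
    num = den = 0.0
    for (t1, a1, t2, a2), pr in pi.items():
        if t2 != theta2:
            continue
        if "theta1" in given and t1 != given["theta1"]:
            continue
        if "a1" in given and a1 != given["a1"]:
            continue
        den += pr
        if a2 == 1:
            num += pr
    return num / den


def forms_habit(pi, tol=1e-9):
    """True if the history (theta1, a1) predicts a2 after conditioning on theta2."""
    for t1, a1, t2 in product((0, 1), repeat=3):
        gap = cond_prob_a2(pi, t2, theta1=t1, a1=a1) - cond_prob_a2(pi, t2)
        if abs(gap) > tol:
            return True
    return False


def is_cue(pi, z, tol=1e-9):
    """Check conditions (i) and (ii) for z in {'theta1', 'a1'}."""
    w = "a1" if z == "theta1" else "theta1"
    for t2 in (0, 1):
        # (i) the high cue value raises Pr(a2 = 1 | theta2)
        if cond_prob_a2(pi, t2, **{z: 1}) <= cond_prob_a2(pi, t2, **{z: 0}) + tol:
            return False
        # (ii) the complementary variable w adds no predictive power
        for zv, wv in product((0, 1), repeat=2):
            full = cond_prob_a2(pi, t2, **{z: zv, w: wv})
            if abs(full - cond_prob_a2(pi, t2, **{z: zv})) > tol:
                return False
    return True


def example_pi():
    """Hypothetical distribution in which a2 depends on (theta2, a1) only."""
    p_a2 = {(1, 1): 0.9, (1, 0): 0.7, (0, 1): 0.3, (0, 0): 0.1}  # Pr(a2=1 | theta2, a1)
    pi = {}
    for t1, a1, t2, a2 in product((0, 1), repeat=4):
        pr = 0.5                              # Pr(theta1)
        pr *= 0.8 if a1 == t1 else 0.2        # noisy first-period action
        pr *= 0.75 if t2 == t1 else 0.25      # persistent state
        q = p_a2[(t2, a1)]
        pr *= q if a2 == 1 else 1 - q
        pi[(t1, a1, t2, a2)] = pr
    return pi
```

In this example both $\theta_1$ and $a_1$ correlate with $a_2$, yet only $a_1$ passes condition (ii), which is exactly the spurious-identification issue the definition guards against.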
Habits exhibit a continuous range of strength since the correlation between cues and continuation behavior varies with the DM's environment. We capture this as follows. Suppose that the DM has developed a habit with cue $z \in \{\theta_1, a_1\}$. We define the habit strength $\varphi_z(\theta)$ at state realization $\theta_2 = \theta$ as a measure of how strongly the probability that the DM chooses the correct $a_2$ varies with the cue value in the state $\theta_2 = \theta$.
Based on the hypothesis that habits are useful adaptations, we predict that the DM does not form a habit when the states are independent. When states are persistent and $\theta_1$ is not revealed, we predict that the DM forms a habit with the cue $a_1$. In this case, $a_1$ contains useful information about the optimal $a_2$, and the DM may reduce her cognitive costs by partially relying on this information. When $\theta_1$ is revealed (and states are persistent), we predict that the habit cue is $\theta_1$, since $\theta_1$ contains information about $\theta_2$ superior to the informational content of $a_1$.
To test the comparative statics predictions arising from the DM's optimization problem, we study two specifications of stakes and state correlations. Stakes are low and state persistence is high in the strong-habit treatments, whereas stakes are high and persistence is low in the weak-habit treatments. As the labels suggest, we predict the habit strength to be high in the strong-habit treatments and low in the weak-habit treatments. In the former, the habit cue is highly predictive of the optimal continuation action and, additionally, the incentive to resolve uncertainty is low. In the latter case, the cue is less informative and the incentive to acquire information is high.
Altogether, the experimental treatments vary along three dimensions: (i) we consider independent or positively serially correlated states, (ii) we reveal or do not reveal θ 1 before the second period, and (iii) we vary the stakes and the state correlation. The resulting eight treatments, summarized in Table 1, allow us to test all the above hypotheses.

A rational-inattention model of habit formation
The model we present here deviates from the typical modeling of habits in that it retains the standard assumption of time-separable utility and explains habits as optimal second-best adaptations to an information-processing friction. The model is a special case of Steiner et al. (2017).
The DM solves the two-period binary decision problem with an evolving state from the beginning of Section 2. We now formalize how the DM acquires information. She conducts a costly statistical experiment that produces a signal $x_t$ in periods $t = 1, 2$. Additionally, in between periods 1 and 2, she receives an exogenous signal $y$. In each period $t$, she chooses an action according to a (pure) action strategy $\sigma_t$ that maps the observed signals up to period $t$ to $a_t$; that is, $a_1 = \sigma_1(x_1)$ and $a_2 = \sigma_2(x_1, x_2, y)$. The DM controls the experiments that generate $x_t$ and can condition the employed experiment on all the information available in the given period. Let $X$, $|X| \geq 2$, be a fixed signal space. The DM can choose any experiment $f_1(x_1 \mid \theta_1)$ and any system of experiments $f_2(x_2 \mid \theta_2, x_1, y)$ that govern the conditional probability distribution of the signals $x_t \in X$ for each combination of the values of the random variables specified in the condition. We consider two distinct processes that generate the exogenous signal $y$. In one case, $y$ perfectly reveals the first state, $y = \theta_1$, and we say that the DM receives feedback. In the other case, $y = y_0$, where $y_0$ is an arbitrary constant, and we say that the DM does not receive feedback.
The DM chooses the experiments and the action strategies to maximize her expected payoff net of the information cost:
$$\max_{f_1, f_2, \sigma_1, \sigma_2} \; \mathrm{E}\left[\sum_{t=1}^{2} u(a_t, \theta_t)\right] - \lambda \left[ I(\theta_1; x_1) + I(\theta_2; x_2 \mid x_1, y) \right]. \qquad (1)$$
The cost of the first signal is proportional to the mutual information $I(\theta_1; x_1)$ that measures the informativeness of $x_1$ about $\theta_1$. The cost of the second signal is proportional to the conditional mutual information $I(\theta_2; x_2 \mid x_1, y)$ that measures the informativeness of $x_2$ about $\theta_2$ relative to the information contained in $x_1$ and $y$. We provide formal definitions in Appendix A. The parameter $\lambda > 0$ scales information costs.
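The information-cost term can be illustrated for a single period. The Python sketch below computes the mutual information between a binary state and a signal and the resulting net payoff when the DM simply follows the signal. It covers only the first-period term of the objective, assumes natural logarithms and a binary signal space labeled {0, 1}, and uses hypothetical function names.

```python
import math


def mutual_information(p, f):
    """I(theta; x) in nats, for prior p = Pr(theta = 1) and f[theta][x] = Pr(x | theta)."""
    prior = {0: 1.0 - p, 1: p}
    signals = list(f[0].keys())
    marginal = {x: sum(prior[t] * f[t][x] for t in (0, 1)) for x in signals}
    total = 0.0
    for t in (0, 1):
        for x in signals:
            joint = prior[t] * f[t][x]
            if joint > 0.0:  # convention 0 log 0 = 0
                total += joint * math.log(joint / (prior[t] * marginal[x]))
    return total


def net_payoff_one_period(p, f, s, lam):
    """Expected stake from the action a = x, minus lam times I(theta; x)."""
    prior = {0: 1.0 - p, 1: p}
    hit_probability = sum(prior[t] * f[t][t] for t in (0, 1))
    return s * hit_probability - lam * mutual_information(p, f)
```

An uninformative experiment costs nothing and yields the prior hit rate, while a perfectly revealing experiment under a uniform prior costs $\lambda \log 2$; the DM's trade-off lies between these two extremes.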
Let $\pi(\theta_1, a_1, \theta_2, a_2)$ be the joint distribution of the states and actions generated by the optimal experiments $f_1^*$ and $f_2^*$ and action strategies $\sigma_1^*$, $\sigma_2^*$. We impose a regularity condition that all quadruples $(\theta_1, a_1, \theta_2, a_2)$ are attained with positive probabilities.
Lemma 1. The optimal joint distribution $\pi$ of states and actions is unique.
The next two propositions confirm the hypotheses from the end of the previous section.
Proposition 1.
1. If states are independent, then the DM does not form a habit.
2. If states are positively correlated and the DM does not receive feedback, then she forms a habit with the cue $a_1$.
3. If states are positively correlated and the DM receives feedback, then she forms a habit with the cue $\theta_1$.
The proofs of the last two results are in Appendix A.
Proposition 2. The habit strength decreases with stakes and increases with the state persistence.
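The comparative statics can be illustrated numerically in the feedback setting. The Python sketch below rests on several assumptions that go beyond the text: it uses the known closed form for the Shannon-entropy cost with a symmetric matching payoff (posterior odds equal to $e^{s/\lambda}$), it takes the second-period prior to be $\gamma$ or $1 - \gamma$ after $\theta_1$ is revealed, and it measures the cue effect as a simple difference of conditional choice probabilities, which need not coincide with the paper's habit-strength measure. All function names and parameter values are hypothetical.

```python
import math


def q_bar(s, lam):
    """High posterior belief under the entropy cost: odds q/(1-q) = exp(s/lam)."""
    return 1.0 / (1.0 + math.exp(-s / lam))


def prob_a2_high(theta2, theta1, s, lam, gamma):
    """Pr(a2 = 1 | theta2, theta1) when theta1 is revealed before period 2."""
    q_hi = q_bar(s, lam)
    q_lo = 1.0 - q_hi
    p2 = gamma if theta1 == 1 else 1.0 - gamma  # prior Pr(theta2 = 1) after feedback
    if not q_lo < p2 < q_hi:
        raise ValueError("prior outside the posterior support: no information is acquired")
    w = (p2 - q_lo) / (q_hi - q_lo)  # Pr(a2 = 1), from the martingale property of beliefs
    if theta2 == 1:
        return w * q_hi / p2
    return w * (1.0 - q_hi) / (1.0 - p2)


def cue_effect(theta2, s, lam, gamma):
    """Illustrative strength measure: change in Pr(a2 = 1 | theta2) as theta1 goes 0 -> 1."""
    return prob_a2_high(theta2, 1, s, lam, gamma) - prob_a2_high(theta2, 0, s, lam, gamma)
```

Under these assumptions the cue effect rises with $\gamma$ and falls with $s$, in line with the direction stated in Proposition 2.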

Experimental design and data
Our experimental design follows Caplin & Dean (2015b). Subjects were presented with images of a 10 × 10 matrix of red and blue dots on a computer screen. In each matrix, either 51 red and 49 blue dots (state θ = 1) or 51 blue and 49 red dots (θ = 0) are displayed. The positions of the colored dots are random conditional on the state; see the screenshot in Appendix A.3. Subjects are incentivized to determine the majority color and do not face any explicit information cost; any perception errors stem from frictions in the cognitive process. When a subject is ready, she enters her choice by clicking one of two radio buttons marked "Red" and "Blue". [4] To ensure a reasonable duration of the experiment, each image disappeared after 45 seconds. The experiments were implemented using z-Tree (Fischbacher 2007). We refer to the above one-period decision problem as the counting task.
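For readers who wish to reproduce the stimulus, a counting-task image can be generated as follows. This is a minimal Python sketch and not the z-Tree implementation used in the experiment; the letter encoding of colors and the function names are assumptions made for illustration.

```python
import random


def make_image(theta, rng):
    """Return a 10 x 10 grid of 'R'/'B' dots; theta = 1 gives 51 red and 49 blue."""
    n_red = 51 if theta == 1 else 49
    dots = ["R"] * n_red + ["B"] * (100 - n_red)
    rng.shuffle(dots)  # positions are random conditional on the state
    return [dots[row * 10:(row + 1) * 10] for row in range(10)]


def majority_state(image):
    """Recover the state from an image: 1 if red is the majority color, else 0."""
    reds = sum(row.count("R") for row in image)
    return 1 if reds > 50 else 0
```

A call such as `make_image(1, random.Random(0))` yields one majority-red screen; counting the dots exactly is what the subject is asked to approximate within 45 seconds.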
We recruited 76 subjects from the University of California, Santa Barbara over 4 sessions during May 2018. In each session, subjects faced 4 treatments. Each treatment consisted of 12 iterations and each iteration consisted of the two-period decision problem described above, with one counting task per period. Thus, each subject faced 96 = 4 × 12 × 2 counting tasks in total. At the conclusion of the session, the software randomly chose a single counting task for each subject, and the subject's payment was based only on the outcome of that task.
An "iteration" is our basic unit of observation. In each iteration, both state realizations were equally likely in the first period. The four treatments per session are defined by the combinations of: (i) the state persistence, where I denotes independent and C denotes correlated states, [5] and (ii) whether $\theta_1$ was revealed in between the two periods, where F denotes the provision of information feedback and N denotes no provision. In addition, in two of the sessions we used parameters that we hypothesized to induce strong habits; treatments in these two sessions are denoted by S, and the treatments in the other two sessions are labeled by W. We thus have 8 treatments; see Table 2. [6]
[4] Since we are interested in serial correlations that arise in the absence of real switching costs, we set the positions of the blue and red radio buttons to randomly vary across tasks. Thus, provision of the same answer in consecutive periods does not arise from a mental or physiological advantage.
[5] Within a treatment, each subject faced the same sequence of images.
[6] Additionally, we ran a preliminary session prior to the 4 regular sessions.

Results
We present basic summary statistics in Table 3. The aggregate accuracy of choices is high and homogeneous across both treatments and periods. Accuracy is heterogeneous at the individual level; the number of correctly answered tasks per subject varied from 46 to 96 out of 96 tasks.
(Mild action persistence in the treatment IF W, in which the frequency of $a_1 = a_2$ is 0.60, is caused by the realized frequency of $\theta_1 = \theta_2$ being 0.67 and by the subjects' attentiveness to the state realizations.) We proceed to test for the presence of habits and to identify the cues. To examine how $\theta_1$ and $a_1$ predict $a_2$, we run separate logit regressions for all 8 treatments of the form:
$$a_{2,i}^n = \begin{cases} 1 & \text{if } \beta_0 + \beta_{\theta_2} \theta_2^n + \beta_{\theta_1} \theta_1^n + \beta_{a_1} a_{1,i}^n + \beta_{se}\, \mathit{session} + \beta_{sc}\, \mathit{score}_i^n + \beta_{sc\theta_2}\, \mathit{score}_i^n \theta_2^n + \varepsilon_i^n > 0, \\ 0 & \text{otherwise,} \end{cases} \qquad (2)$$
with robust standard errors clustered at the subject level, where $a_{t,i}^n$ is the action taken by subject $i$ in iteration $n = 1, \ldots, 12$ and period $t = 1, 2$; $\theta_t^n$ is the realized state in iteration $n$ and period $t$; and session is a dummy variable indicating session (each of the 8 treatments occurs in exactly two sessions). [7] Finally, $\mathit{score}_i^n$ is a subject-specific proxy for counting ability. It is the total number of correct answers by subject $i$ in all treatments (excluding the two choices from iteration $n$ of the considered treatment to avoid endogeneity). The interaction term $\mathit{score}_i^n \theta_2^n$ captures the idiosyncratic sensitivity of the subject to the variation in $\theta_2$. Table 4 reports the estimated average marginal effects, their standard errors and p-values of the explanatory variables of interest. We draw the following conclusions from these results.
2. When the states are independent, the subjects do not form habits: neither $a_1$ nor $\theta_1$ predicts $a_2$ in treatments IF W, IN W, IF S, and IN S.
3. When the states are persistent and feedback is not provided, the subjects form a habit with cue $a_1$: $a_1$ predicts and $\theta_1$ does not predict $a_2$ in treatments CN W and CN S.
4. When the states are persistent and the feedback is provided, the subjects form a habit with cue $\theta_1$: $\theta_1$ predicts and $a_1$ does not predict $a_2$ in treatments CF W and CF S.
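Two ingredients of the regression specification described above lend themselves to a short illustration: the leave-one-iteration-out score proxy and the choice probability implied by a latent index with a logistic error. The Python sketch below is not the estimation code behind Table 4; the record layout, the coefficient keys, and the function names are hypothetical, and estimation itself (with clustered standard errors) would require a statistics package.

```python
import math


def loo_score(records, subject, treatment, iteration):
    """Correct answers by `subject` in all tasks except the two of the given iteration."""
    return sum(
        1
        for r in records
        if r["subject"] == subject
        and not (r["treatment"] == treatment and r["iteration"] == iteration)
        and r["action"] == r["state"]
    )


def choice_probability(beta, row):
    """Pr(a2 = 1) implied by the latent index of the logit specification."""
    index = (
        beta["const"]
        + beta["theta2"] * row["theta2"]
        + beta["theta1"] * row["theta1"]
        + beta["a1"] * row["a1"]
        + beta["session"] * row["session"]
        + beta["score"] * row["score"]
        + beta["score_theta2"] * row["score"] * row["theta2"]
    )
    return 1.0 / (1.0 + math.exp(-index))  # logistic CDF evaluated at the index
```

Excluding the current iteration from the score keeps the ability proxy from mechanically depending on the very choice being explained.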
To analyze the comparative statics of habit strength, we focus on the four treatments with persistent states in which habits occur, and we compare the habit strength across the weak and strong treatments. Namely, for the treatments without feedback, we pool the data from CN W and CN S and create a dummy variable $\delta \in \{0, 1\}$ to indicate treatment S. We run the same logit specification as in (2) with the inclusion of the additional variables $\delta$, $\delta\theta_2^n$, $\delta a_{1,i}^n$, $\delta\,\mathit{score}_i^n$, and $\delta\,\mathit{score}_i^n \theta_2^n$. [8] Since empirically the habit cue is $a_1$, we estimate the difference $\Delta_{CN}$ between the average marginal effect of $a_1$ conditional on $\delta = 1$ (S) and its average marginal effect conditional on $\delta = 0$ (W), where the effects are averaged over $X$, which stands for all explanatory variables other than $a_1$ and $\delta$. We obtain a point estimate $\hat{\Delta}_{CN} = .31$ with standard error .12, which is highly significant (p-value .009).
[7] In the treatment CN S, the state realizations satisfied $\theta_1^n = \theta_2^n$ for all $n$, and thus we dropped $\theta_1$.
Analogously, for the treatments with feedback, we pool the data from treatments CF W and CF S and create a dummy variable $\delta \in \{0, 1\}$ to indicate the strong treatment. We run the regression model (2) with the inclusion of $\delta$, $\delta\theta_2^n$, $\delta\theta_1^n$, $\delta a_{1,i}^n$, $\delta\,\mathit{score}_i^n$, and $\delta\,\mathit{score}_i^n \theta_2^n$. Since the habit cue is $\theta_1$, we estimate the difference $\Delta_{CF}$ between the average marginal effect of $\theta_1$ conditional on $\delta = 1$ (S) and its average marginal effect conditional on $\delta = 0$ (W). Here, we obtain the point estimate $\hat{\Delta}_{CF} = .23$ with standard error .12, which is marginally significant (p-value .06).
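The difference in average marginal effects can be sketched in Python as follows. The construction below is one common reading and an assumption on our part: the marginal effect of a binary cue is taken as the discrete change in the logit choice probability, and "conditional on $\delta$" is implemented by averaging within the corresponding subsample. The fitted index is passed in as a function, and all names and numbers are hypothetical.

```python
import math


def logistic(v):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-v))


def ame_of_binary_cue(rows, index_fn, cue):
    """Average over rows of Pr(a2 = 1 | cue = 1, X) - Pr(a2 = 1 | cue = 0, X)."""
    effects = []
    for row in rows:
        high = dict(row, **{cue: 1})  # same covariates X, cue switched on
        low = dict(row, **{cue: 0})   # same covariates X, cue switched off
        effects.append(logistic(index_fn(high)) - logistic(index_fn(low)))
    return sum(effects) / len(effects)


def strength_difference(rows, index_fn, cue):
    """AME of the cue in the strong subsample (delta = 1) minus that in the weak one."""
    strong = [r for r in rows if r["delta"] == 1]
    weak = [r for r in rows if r["delta"] == 0]
    return ame_of_binary_cue(strong, index_fn, cue) - ame_of_binary_cue(weak, index_fn, cue)
```

A positive value indicates that the cue moves the second-period choice more in the strong treatment than in the weak one; inference on the estimate would additionally require standard errors, which this sketch does not compute.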
Result 2. The levels of persistence and incentives have no impact on the selection of the cues.
Additionally, we find conclusive (inconclusive) statistical evidence that the habit formed in the correlated treatments without (with) feedback is stronger in the treatment with high persistence and low incentives than in the treatment with low persistence and high incentives.

Discussion
Relative to models of exogenous habit formation, optimization-based models may predict how cues are selected, and how the strength of the habits changes with the decision-making environment.
The static rational inattention model of Matějka & McKay (2015) and its dynamic extension by Steiner et al. (2017) provide a simple optimization-based representation of habits that can be fruitfully combined with structural-estimation techniques. The optimal stochastic-choice rules in these models are the static logit of McFadden (1973) and its dynamic counterpart by Rust (1987), respectively, modified by a system of endogenous "habits". That is, relative to the standard logit choice rules, the rationally inattentive DM experiences an information-processing penalty whenever she makes an ex ante surprising choice at any given decision node. Each such penalty represents the information cost of the surprising information needed to rationalize the surprising choice. These penalties increase the relative attractiveness of the modal (habitual) actions. Since the rational inattention formulation predicts how the system of endogenous "habits" changes with policy interventions, these models provide counterfactual predictions on habit formation.
[8] We have excluded the interaction term $\delta\theta_1^n$, since the state realizations satisfied $\theta_1^n = \theta_2^n$ for all $n$ in CN S.
We found that people utilize information contained in the history of the decision process when they make continuation choices. We conclude the paper with a brief discussion of a complementary question. Do decision-makers in early stages of their decision processes internalize the continuation value of the information they acquire? In particular, when states are persistent and θ 1 is not revealed in between the periods, then information acquired in the first period of our lab task has a positive continuation value deriving from its use in the second period. If decision-makers internalize this continuation value, then their choices in the first period should be more accurate in the treatments CN relative to other treatments. We do not observe significant differences in the accuracy of the first-period choices across treatments, which we interpret as suggestive evidence of myopia in the information acquisition choices, and we leave this topic for future research.

A Proofs
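The entropy of a r.v. $W$ that attains values $w$ with probabilities $q(w)$ is
$$H(W) = -\sum_{w} q(w) \log q(w),$$
where $0 \log 0 = 0$ by convention. The mutual information $I(X; Y)$ of two r.v.'s $X$ and $Y$ is
$$I(X; Y) = H(X) - H(X \mid Y),$$
and the conditional mutual information $I(X; Y \mid Z)$ is
$$I(X; Y \mid Z) = H(X \mid Z) - H(X \mid Y, Z),$$
where $H(X \mid Y)$ denotes the expected entropy of $X$ conditional on $Y$.
We first review the posterior approach to static rational-inattention (RI) problems from Caplin & Dean (2013). The DM chooses $a \in \{0, 1\}$ and receives $u(a, \theta) = s \times 1_{a = \theta}$. Thus, an optimizing DM who assigns probability $q$ to $\theta = 1$ receives expected value $v(q) = s \max\{q, 1 - q\}$. The DM assigns prior probability $p \in (0, 1)$ to $\theta = 1$. She chooses a statistical experiment that generates signal values $x \in X$, $|X| \geq 2$, with probabilities $f(x \mid \theta)$. We let $q(x) = \Pr(\theta = 1 \mid x)$ denote the posterior. The DM chooses $f$ to maximize $\mathrm{E}_x\left[v(q(x)) - \lambda \tilde{H}(p) + \lambda \tilde{H}(q(x))\right]$, where $\tilde{H} : [0, 1] \to \mathbb{R}$ is a concave function that represents information cost.
The optimal experiment $f$ attains two signal values. Thus, the posterior $\tilde{q} = q(x)$ is a r.v. attaining two values $\underline{q}, \overline{q} \in [0, 1]$ that solves
$$\max_{\tilde{q}} \; \mathrm{E}\left[v(\tilde{q}) + \lambda \tilde{H}(\tilde{q})\right] \quad \text{s.t.: } \mathrm{E}\,\tilde{q} = p, \qquad (3)$$
where the optimization is over all binary r.v.s $\tilde{q}$ attaining values in $[0, 1]$. We refer to (3) as the static RI problem with generalized entropy $\tilde{H}$. It has a unique solution, and if $\underline{q}, \overline{q} \neq p$ then the support $\{\underline{q}, \overline{q}\}$ of the posterior $\tilde{q}$ does not depend on the prior $p$, and $\overline{q} > 1/2$, $\underline{q} < 1/2$. Additionally, due to the symmetry $u(a, \theta) = u(1 - a, 1 - \theta)$, the optimal posterior values are symmetric: $\overline{q} = 1 - \underline{q}$.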
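The static RI problem can be solved numerically for the special case in which the generalized entropy is the Shannon entropy. The Python sketch below performs a brute-force grid search over binary posteriors with mean $p$ and compares the result with the known closed form for this case, in which the posterior odds equal $e^{s/\lambda}$. The grid size, the function names, and the parameter values are assumptions chosen for illustration.

```python
import math


def entropy(q):
    """Binary Shannon entropy in nats, with the convention 0 log 0 = 0."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -(q * math.log(q) + (1.0 - q) * math.log(1.0 - q))


def net_value(q, s, lam):
    """v(q) + lam * H(q): payoff at posterior q plus the entropy term."""
    return s * max(q, 1.0 - q) + lam * entropy(q)


def solve_static_ri(p, s, lam, n=400):
    """Grid search for the posterior support (q_lo, q_hi) subject to E[q] = p."""
    grid = [i / n for i in range(n + 1)]
    best = (net_value(p, s, lam), p, p)  # benchmark: acquire no information
    for q_lo in grid:
        if q_lo >= p:
            break
        for q_hi in grid:
            if q_hi <= p:
                continue
            w = (p - q_lo) / (q_hi - q_lo)  # Pr(posterior = q_hi), so that E[q] = p
            value = w * net_value(q_hi, s, lam) + (1.0 - w) * net_value(q_lo, s, lam)
            if value > best[0]:
                best = (value, q_lo, q_hi)
    return best[1], best[2]


def closed_form_q_hi(s, lam):
    """Closed-form high posterior for the Shannon-entropy cost."""
    return 1.0 / (1.0 + math.exp(-s / lam))
```

Running the search with priors 0.5 and 0.6 returns approximately the same support, which illustrates the statement that the optimal posterior values do not depend on the prior as long as the prior lies strictly between them.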
Next, we establish a structure of the solution of Problem (1) and prove its uniqueness.
Proof of Lemma 1. Steiner et al. (2017) prove that the support of the optimal experiments $f_1^*(x_1 \mid \theta_1)$ and $f_2^*(x_2 \mid \theta_2, x_1, y)$ is at most as large as the action set. Since, by our regularity condition, $a_t$ attains both values with positive probabilities for $t = 1, 2$, the posterior assigned to $\theta_1$ at the end of period 1, $\tilde{q}_1 = \Pr(\theta_1 = 1 \mid x_1)$, is a r.v. that attains two values $\underline{q}_1$ and $\overline{q}_1$. By our regularity condition, both values of $a_1$ are attained with positive probabilities, and thus $\underline{q}_1 \leq 1/2 \leq \overline{q}_1$.
The expectations are with respect to $\tilde{q}_1$, $\tilde{q}_2$, and $y$.
We observe that $\tilde{q}_1$ and $\tilde{q}_2(\tilde{q}_1, y)$ in the setting with feedback, and $\tilde{q}_2(\tilde{q}_1)$ in the setting without feedback all solve the static RI problem (3) with $\tilde{H} = H$. Therefore, they all have the same unique support $\{\underline{q}_H, \overline{q}_H\}$ given by the solution of the static RI problem with the entropy cost.
We now analyze $\tilde{q}_1$ in the setting without feedback. Since the support of the second-period posteriors is independent of the first-period posterior, $\tilde{q}_1$ solves the static RI problem (3) with a generalized entropy function $\tilde{H}$, where $\tilde{H}$ is concave. Therefore, $\tilde{q}_1$ attains values $\underline{q}_{\tilde{H}}, \overline{q}_{\tilde{H}}$ given by the unique solution of the static RI problem with the generalized entropy $\tilde{H}$. (Both the support and the distribution of $\tilde{q}_2(\tilde{q}_1, y)$ are allowed to depend on $(\tilde{q}_1, y)$ since the DM may adjust the experiment in $t = 2$ to her information set.)
The joint distribution $\pi(\theta_1, \theta_2, a_1, a_2)$ is unique since it is uniquely determined by the unique posterior values $\underline{q}_H, \overline{q}_H, \underline{q}_{\tilde{H}}, \overline{q}_{\tilde{H}}$ attained by the random first- and second-period posteriors.
Proof of Proposition 1. Statement 1.: The support of the random second-period posterior is independent of the first-period posterior and of y. Since θ 1 and θ 2 are independent, the prior at the beginning of period 2 is independent of θ 1 and a 1 . Thus, the random second-period posterior (and hence a 2 ) is independent of θ 1 and a 1 , conditionally on θ 2 , as needed.
The right-hand sides do not depend on $\theta_1$, as needed. It suffices to prove that for each $\theta_2 \in \{0, 1\}$, the first expression exceeds the latter. We consider the case $\theta_2 = 1$; the computation for $\theta_2 = 0$ is analogous.
We first analyze comparative statics with respect to $\gamma$. Posterior $q_2$ is independent of $\gamma$.
We prove that $\frac{d^2 q_2}{d q_1^2}$ is positive.
We notice from (4) that $\varphi_{a_1}$ decreases with $s$ if $\frac{p_2 + q_2 - 1}{q_2 - p_2}$ decreases with $s$ (since $\frac{1 - p_2}{p_2}$ decreases with $q_1$ and hence with $s$). Thus, $\varphi_{a_1}$ decreases with $s$ if $0 > \frac{d}{ds}\left(\frac{p_2 + q_2 - 1}{q_2 - p_2}\right)$, where we have used $\frac{d p_2}{d q_1} = 2\gamma - 1$ for the third equality, and $\frac{2 p_2 - 1}{2\gamma - 1} = 2 q_1 - 1$ to establish the fourth equality. Therefore, it suffices to prove that We observe that $q_2 = 1/2$ when $q_1 = 1/2$. Thus, by the mean value theorem, there exists $1/2 < \hat{q}_1 < q_1$ such that, where the inequality follows from the fact that $\frac{d q_2}{d q_1}$ increases with $q_1$.
Feedback setting: Again, $q_2$ solves (6) and thus $q_2$ increases with $s$ and is independent of $\gamma$.

A.2 Preliminary Session
We ran a preliminary session prior to the regular sessions. Sixteen participating subjects obtained a $15 show-up fee and an additional $5 for a correct answer to the counting task (randomly selected at the end of the experiment). The parameters were: γ = .5 in treatments with independent states (I) and γ = .75 in treatments with correlated states (C). As in the regular sessions, θ 1 was revealed in between periods in the treatments F with feedback and it was not revealed in treatments N without feedback. The treatment order was IF , CF , IN , CN .
The basic data description in Table 5 and the estimated average marginal treatment effects in Table 6 are consistent with the results from the regular sessions. However, in this session, the subjects were free to leave immediately once they finished all their counting tasks in the last treatment (CN ), which affected their information processing costs in an uncontrolled manner, and thus we omit the pilot data from the main analysis.

A.3 Experimental instructions
Welcome to the experiment! Please take a record sheet at the front if you don't have one already. Please do not use the computers during the instructions. When it is time to use the computer, please follow the instructions precisely. (Repeat if necessary.) Please raise your hand if you need a pencil. Please put away and silence all your personal belongings, especially your phone. We need your full attention during the experiment.
Raise your hand at any point if you cannot see or hear well.
The experiment you will be participating in today is an experiment in decision making. At the end of the experiment, you will be paid for your participation in cash. The amount you earn depends on your decisions and on chance. You will be using the computer for the experiment, and all decisions will be made through the computer. DO NOT socialize or talk during the experiment.
All instructions and descriptions that you will be given in the experiment are accurate and true. In accordance with the policy of this lab, at no point will we attempt to deceive you in any way.
If you have any questions, raise your hand and your question will be answered out loud so everyone can hear.
After you have completed all the tasks, please wait while everyone else finishes his or her tasks. Once everyone has completed the experiment, I will ask you to fill in the questionnaire. After the questionnaire you will collect your earnings and leave.
You will be presented with a series of choices to make. There will be four SETS of choices in today's experiment. Each set contains twelve ITERATIONS, and each iteration has two PERIODS. In each period, you will be shown a picture of 100 dots. Each dot will be either RED or BLUE. We have displayed an example of such a screen on your computer monitor. (show an example screen) This is an example of the screens you will see during the experiment.
In every period, the picture will contain either 51 red dots and 49 blue dots, or instead, 51 blue dots and 49 red dots. We will call these two cases MAJORITY RED and MAJORITY BLUE, respectively. In each case, the dots are randomly allocated to the positions in the matrix. In each period the computer will choose randomly between MAJORITY RED and MAJORITY BLUE. You will be told in advance how likely each case is to happen.
In each period, you will be asked to determine if the image is MAJORITY RED or MAJORITY BLUE. While you may take as much time as you need to make your choice, the image will disappear after 45 seconds.
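As an illustration outside the script read to subjects, the stimulus described above (100 dots, split 51/49, randomly placed) could be generated roughly as follows. This is a minimal sketch, not the experiment's actual software. The function names are hypothetical, and the 10 x 10 layout is an assumption, since the instructions only state that 100 dots are arranged in a matrix.

```python
import random

def make_dot_image(majority, rng, rows=10, cols=10):
    """Return a rows x cols grid of 'R'/'B' dots with a bare majority.

    Assumption: a 10 x 10 layout. With 100 dots the split is 51/49.
    """
    n = rows * cols
    n_major = n // 2 + 1  # 51 when n == 100
    minority = "B" if majority == "R" else "R"
    dots = [majority] * n_major + [minority] * (n - n_major)
    rng.shuffle(dots)  # random allocation of dots to positions
    return [dots[r * cols:(r + 1) * cols] for r in range(rows)]

def majority_color(grid):
    """Count the dots, which is the task the subject faces."""
    flat = [dot for row in grid for dot in row]
    return "R" if flat.count("R") > flat.count("B") else "B"
```

Because the two colors differ by only two dots, identifying the majority requires careful counting, which is what makes the task cognitively costly.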
I am now going to describe the details of the experiment.
The experiment is divided into four SETS. In each set, you will be presented with twelve iterations, and each iteration consists of two periods, each with its own image. The rules for the 12 iterations within each set are identical, but the rules are different in different sets.
In PERIOD 1 of each iteration, the image is always generated so that there is an equal chance of MAJORITY RED and MAJORITY BLUE, meaning that there is a 50% chance of MAJORITY RED and a 50% chance of MAJORITY BLUE.
In period 2 of each iteration, the image will be generated in a way that differs across sets. In some sets, the majority color for period 2 is chosen in a way that is completely separate from the period 1 image, and is randomly generated so that there is an equal chance of MAJORITY RED and MAJORITY BLUE, just like the period 1 image. But in other sets, the period 2 image depends on the majority color of the period 1 image. In these sets, the computer generates the period 2 image so that there is a 75% chance that the majority color matches the period 1 majority color, and a 25% chance that the majority color is different from the period 1 majority color.
It is important to remember that while the periods within each iteration may be related to each other, the periods across iterations are never related.
After making your choices, you will always be told what the majority color was, but the timing of this differs from set to set. In some sets, the majority colors will be revealed after every period. In other sets, the majority colors for an iteration will not be revealed until you complete both periods. Before each set, you will be told about the timing of the feedback you will receive.
The amount of money you will receive at the end of the experiment depends on your choices. After we have completed all four sets, you will have made 96 choices (4 sets times 12 iterations times 2 periods). The computer software will randomly select one of these 96 periods. Your payment will be determined by your choice in that single period. If your choice in the randomly chosen period matches the majority color, you will earn an additional $5 on top of the $15 show-up fee.
Otherwise, you will receive no additional payment, but you will still receive the show-up fee.
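As an illustration outside the script read to subjects, the payment rule can be written out as a short Python sketch. It is not the lab's software, and the function name `payment` and the list-based data layout are hypothetical. The sketch assumes the paid period is drawn uniformly from all periods, consistent with the description of a random selection among the 96 choices.

```python
import random

SHOW_UP_FEE = 15  # dollars, paid regardless of choices
BONUS = 5         # dollars, paid if the selected choice is correct

def payment(choices, majorities, rng):
    """Pay the show-up fee plus a bonus for one randomly selected period.

    choices and majorities are equal-length lists of color labels,
    one entry per period (96 in the experiment).
    """
    if len(choices) != len(majorities):
        raise ValueError("one choice is required for every period")
    k = rng.randrange(len(choices))  # uniform draw of the paid period
    bonus = BONUS if choices[k] == majorities[k] else 0
    return SHOW_UP_FEE + bonus
```

Under this rule every period is equally likely to be paid, so a subject's expected earnings are $15 plus $5 times the share of periods answered correctly.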
After you complete the last set, please wait until we start the questionnaire part. After you finish the questionnaire, please fill in your record sheet on the desk. I will pay you one by one to protect everyone's privacy.
To summarize, remember that we have four sets in the experiment today. Each set consists of 12 iterations, and each iteration consists of two periods. The sets will vary in how likely it is that the majority colors are the same for both periods within an iteration, and in the timing that the majority colors are revealed. Please raise your hand if you have any questions.
(1) FI/FC/NI/NC
Feedback/IID:
In the next set of twelve iterations, the majority color for period 2 is randomly generated so that there is an equal chance of MAJORITY RED and MAJORITY BLUE, and it does not depend on the majority color in the first period.
The majority colors will be revealed after every period, so that you will be told the majority color from period 1 before you see the image for period 2. Please raise your hand if you have any questions.

Feedback/Corr.:
In the next set of twelve iterations, the majority color for period 2 is randomly generated so that there is a 75% chance that the majority color matches the majority color from period 1, and a 25% chance that the majority color is different from period 1.
The majority colors will be revealed after every period, so that you will be told the majority color from period 1 before you see the image for period 2. Please raise your hand if you have any questions.

No Feedback/IID:
In the next set of twelve iterations, the majority color for period 2 is randomly generated so that there is an equal chance of MAJORITY RED and MAJORITY BLUE, and it does not depend on the majority color in the first period.
The majority colors for both periods of an iteration will be revealed only at the end of each iteration, so that you will see the period 2 image before being told the majority color from period 1. Please raise your hand if you have any questions.

No Feedback/Corr.:
In the next set of twelve iterations, the majority color for period 2 is randomly generated so that there is a 75% chance that the majority color matches the period 1 majority color, and a 25% chance that the majority color is different from the period 1 majority color.
The majority colors for both periods of an iteration will be revealed only at the end of each iteration, so that you will see the period 2 image before being told the majority color from period 1. Please raise your hand if you have any questions.

(2) NI/NC/FI/FC
No Feedback/IID:
In the next set of twelve iterations, the majority color for period 2 is randomly generated so that there is an equal chance of MAJORITY RED and MAJORITY BLUE, and it does not depend on the majority color in the first period.
The majority colors for both periods of an iteration will be revealed only at the end of each iteration, so that you will see the period 2 image before being told the majority color from period 1. Please raise your hand if you have any questions.

No Feedback/Corr.:
In the next set of twelve iterations, the majority color for period 2 is randomly generated so that there is a 75% chance that the majority color matches the period 1 majority color, and a 25% chance that the majority color is different from the period 1 majority color.
The majority colors for both periods of an iteration will be revealed only at the end of each iteration, so that you will see the period 2 image before being told the majority color from period 1. Please raise your hand if you have any questions.

Feedback/IID:
In the next set of twelve iterations, the majority color for period 2 is randomly generated so that there is an equal chance of MAJORITY RED and MAJORITY BLUE, and it does not depend on the majority color in the first period.
The majority colors will be revealed after every period, so that you will be told the majority color from period 1 before you see the image for period 2. Please raise your hand if you have any questions.

Feedback/Corr.:
In the next set of twelve iterations, the majority color for period 2 is randomly generated so that there is a 75% chance that the majority color matches the majority color from period 1, and a 25% chance that the majority color is different from period 1.