Focus theory of choice and its application to resolving the St. Petersburg, Allais, and Ellsberg paradoxes and other anomalies

We present a decision theory which models and axiomatizes a decision-making procedure. This procedure involves two steps: in the first step, for each action, some specific event which can bring about a relatively high payoff with a relatively high probability or a relatively low payoff with a relatively high probability is selected as the positive or negative focus, respectively; in the second step, based on the foci of all actions, a decision maker chooses a most-preferred action. Our model handles decision making with risk or under ambiguity or under ignorance within a unified framework. Our model resolves several anomalies, including the St. Petersburg, Allais, and Ellsberg paradoxes, and violations of stochastic dominance.


Introduction
The expected utility (EU) theory axiomatized by von Neumann and Morgenstern (1944) and the subjective expected utility (SEU) theory axiomatized by Savage (1954) are the foundations of decision making under risk and uncertainty. However, the hypothetical experimental findings reported by Allais (1953) and Ellsberg (1961) show that people systematically violate the axioms proposed by von Neumann and Morgenstern for the EU and by Savage for the SEU.
In this paper, we propose and axiomatize the focus theory of choice (FTC), which is a procedurally rational choice model that can account for several puzzling phenomena, including the St. Petersburg, Allais, and Ellsberg paradoxes, and violations of stochastic dominance.
E-mail address: guo@ynu.ac.jp
In most normative and descriptive theories, when evaluating a lottery, a decision maker (DM) is assumed to hold a holistic viewpoint; that is, the lottery is evaluated by an aggregated multiplicative model, such as the SEU. In contrast, Simon (1979, p. 507) argued, "It is not that people do not go through the calculations that would be required to reach the SEU decision; neoclassic thought has never claimed that they did. What has been shown is that they do not even behave as if they had carried out those calculations." Recently, accumulated evidence from research using eye-tracking and scanpath methods shows that a risky decision is unlikely to be based on a weighting and summing process (e.g., Glöckner & Herbold, 2011; Stewart, Hermens, & Matthews, 2016; Zhou et al., 2016). In the study by Zhou et al. (2016), an individual is asked to conduct two tasks: the proportion task and the probability task. The scanpaths of the typical trials showed that the information-processing sequence in the proportion task appears to be more consistent with a weighting and summing process, whereas in the probability task the scanpath did not diagnostically show a pattern similar to that of the proportion task. Hence, they claim, "Our findings suggest that participants are unlikely to employ a weighting and summing process to make a decision in the probability task" (Zhou et al., 2016, p. 174). Gigerenzer and Gaissmaier (2011, p. 451) argue, "Ignoring part of the information can lead to more accurate judgement than weighting and adding all information".
Some studies have shown that individuals evaluate a lottery by treating each outcome separately. Wedell and Böckenholt (1994, p. 499) conclude from their experiments that "justification for single-play choices tended to focus on a single attribute of the gamble", where a single attribute involves the amount that can be won or lost, the chance of doing so, or other factors. Brandstätter, Gigerenzer, and Hertwig (2006) suggest that individuals make choices by using four attributes in the following order: minimum payoff, probability of minimum payoff, maximum payoff, and probability of maximum payoff. In addition, several studies (e.g., Tversky & Kahneman, 1981) show that individuals evaluate a lottery based on some specific event associated with this lottery; that is, they consider a payoff and its probability. Based on the above studies, FTC argues that a DM is boundedly rational and suffers from bounded attention, so that instead of taking into account all events of a lottery simultaneously, the DM considers the event which is personally most salient due to its payoff and probability. We call this event-based thinking. This is also the fundamental argument of the one-shot decision theory proposed by Guo (2011).
Another key feature of FTC is the postulate that a DM is endowed with two distinct evaluation systems: a positive evaluation system (PES) and a negative evaluation system (NES). In the PES, for each lottery, an event which brings about a relatively high payoff with a relatively high probability has a relatively high salience. Such an event generates the individual's overall impression of this lottery, and so we call this the positive focus of this lottery. Then, based on the positive foci of all lotteries, the best lottery is chosen. On the other hand, in the NES an event which leads to a relatively low payoff with a relatively high probability has a relatively high salience. This event will generate the overall impression of the lottery, and so we call this the negative focus of this lottery. Similar to the above, the individual will choose the best lottery based on the negative foci of all lotteries. For a DM, one system is apparent and the other is latent. Which system is apparent is strongly related to the DM's personality traits. For example, the PES is usually apparent for an optimistic DM, while the NES is often apparent for a pessimistic DM. It can also be strongly influenced by the framing ( Kahneman & Tversky, 1984 ): the NES becomes apparent when the problem is negatively framed or the problem is critical or serious for the DM. It is possible that both systems are simultaneously activated, which would result in hesitation or even an inability to make a decision.
A growing body of evidence has shown that salience (attention-grabbing information) plays a critical role in decision making (Brandstätter & Körner, 2014; Busse, Lacetera, Pope, Silva-Risso, & Sydnor, 2013; Lacetera, Pope, & Sydnor, 2012). However, there has been surprisingly little work on salience in the context of choice under risk or uncertainty. Bordalo, Gennaioli, and Shleifer (2012) propose the salience theory of choice under risk (STC). We share the common psychological basis of attention with that research. However, we treat lottery choice in a radically different way. First, in STC "salience depends on payoffs, and not on the probabilities of different states" (Bordalo et al., 2012, p. 1254). In contrast, in FTC the focus (salient event) is independently determined in each lottery while considering not only the payoff but also the probability of each event. Generally, the most salient state of a lottery in STC is not identical with the focus (salient event) of this lottery in FTC. Second, STC uses a weighted utility when evaluating a lottery in which decision weights are distorted in favor of salient payoffs, whereas FTC uses only its focus (salient event) to evaluate a lottery.
Procedural rationality was first articulated by Simon in contrast to the substantive rationality of normative economics: "Behavior is procedurally rational when it is the outcome of appropriate deliberation. Its procedural rationality depends on the process that generated it" (Simon, 1976, p. 67). A well-known theory for formulating the decision-making procedure is the similarity-based theory (Rubinstein, 1988). Other models can be found in Rubinstein (1998). FTC belongs to the class of procedural rationality methods, because it delineates the decision-making procedure according to how DMs choose the focus of a lottery and how they choose the best lottery based on foci.
We cite several examples to provide an overview of how FTC works prior to introducing its theoretical framework. The first example is decision under ignorance. Due to ignorance, we treat all events as having the same probability. In the PES, the event which makes an action generate the highest payoff is the positive focus of this action because it is the most attractive (salient) event for this action; the DM then chooses the action whose positive focus produces the highest payoff among all positive foci. This procedure is exactly the same as decision making under ignorance with the maximax criterion. In contrast, in the NES, the event which makes an action yield the lowest payoff is the negative focus of this action because it is the most concerning (salient) event for this action; the DM then chooses the action whose negative focus has the highest payoff. This procedure is the same as decision making under ignorance with the maximin criterion. Interestingly, the Hurwicz criterion corresponds to the case in which the PES and NES are activated simultaneously; the optimism coefficient reflects the percentage of the time that the PES works.
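The correspondence between the two evaluation systems and the classical criteria under ignorance can be sketched in a few lines. This is only an illustration: the payoff table and the names `PAYOFFS`, `maximax`, `maximin`, and `hurwicz` are hypothetical and do not come from the paper.

```python
from typing import Dict, List

# Hypothetical payoff table (illustrative numbers, not from the paper):
# each action maps to its payoffs over events whose probabilities are unknown.
PAYOFFS: Dict[str, List[float]] = {
    "a1": [10.0, 40.0, 70.0],
    "a2": [30.0, 35.0, 50.0],
    "a3": [0.0, 20.0, 100.0],
}


def maximax(table: Dict[str, List[float]]) -> str:
    """PES under ignorance: each action's focus is its best payoff; pick the best focus."""
    return max(table, key=lambda a: max(table[a]))


def maximin(table: Dict[str, List[float]]) -> str:
    """NES under ignorance: each action's focus is its worst payoff; pick the best of those."""
    return max(table, key=lambda a: min(table[a]))


def hurwicz(table: Dict[str, List[float]], alpha: float) -> str:
    """Weighted blend of best and worst payoff; alpha in [0, 1] is the optimism coefficient."""
    def score(a: str) -> float:
        return alpha * max(table[a]) + (1.0 - alpha) * min(table[a])
    return max(table, key=score)


if __name__ == "__main__":
    print(maximax(PAYOFFS))       # action with the highest best-case payoff
    print(maximin(PAYOFFS))       # action with the highest worst-case payoff
    print(hurwicz(PAYOFFS, 0.5))  # an intermediate attitude
```

With `alpha = 1` the blended rule reduces to maximax and with `alpha = 0` to maximin, which mirrors the reading of the optimism coefficient as the share of time the PES is active.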
The second example is the well-known Asian disease problem (Tversky & Kahneman, 1981), given as follows: Imagine that the U.S. is preparing for the outbreak of an unusual Asian disease, which is estimated to kill 600 people. Two programs are proposed to fight against this disease. The predicted results of the two programs are as follows:
Problem I: If Program L 1 is adopted, 200 people will be saved. If Program L 2 is adopted, there is a one-third chance that 600 people will be saved, and a two-thirds chance that no people will be saved. Which of the two programs would you favor?
Problem II: If Program L 3 is adopted, 400 people will die. If Program L 4 is adopted, there is a one-third chance that nobody will die, and a two-thirds chance that 600 people will die. Which of the two programs would you favor?
Although Problems I and II are stochastically equivalent, the experimental results ( Tversky & Kahneman, 1981 ) show that a substantial majority of respondents prefer L 1 to L 2 and prefer L 4 to L 3 . Kahneman and Tversky advocate that the framing effect leads to risk aversion in Problem I and risk seeking in Problem II.
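The claim that the two problems are stochastically equivalent can be checked by restating Problem II in terms of lives saved out of the 600 at risk. The sketch below uses exact fractions; the variable and function names are illustrative only.

```python
from fractions import Fraction
from typing import Dict, List, Tuple

TOTAL_AT_RISK = 600
Lottery = List[Tuple[int, Fraction]]  # (lives saved, probability)

# Problem I is stated in lives saved.
L1: Lottery = [(200, Fraction(1))]
L2: Lottery = [(600, Fraction(1, 3)), (0, Fraction(2, 3))]
# Problem II is stated in deaths; convert to lives saved = 600 - deaths.
L3: Lottery = [(TOTAL_AT_RISK - 400, Fraction(1))]
L4: Lottery = [(TOTAL_AT_RISK - 0, Fraction(1, 3)),
               (TOTAL_AT_RISK - 600, Fraction(2, 3))]


def distribution(lottery: Lottery) -> Dict[int, Fraction]:
    """Collapse a lottery into a mapping outcome -> total probability."""
    dist: Dict[int, Fraction] = {}
    for saved, prob in lottery:
        dist[saved] = dist.get(saved, Fraction(0)) + prob
    return dist


def expected_saved(lottery: Lottery) -> Fraction:
    """Expected number of lives saved."""
    return sum((saved * prob for saved, prob in lottery), Fraction(0))


if __name__ == "__main__":
    print(distribution(L1) == distribution(L3))  # same distribution, different framing
    print(distribution(L2) == distribution(L4))
    print([expected_saved(x) for x in (L1, L2, L3, L4)])
```

The programs differ only in wording, so any preference reversal between the two problems has to come from the framing rather than from the outcome distributions.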
Let us analyze the above two problems with FTC. Since Problem I is positively described, the PES is activated when thinking about this problem. The positive focus of Program L 1 is the event that 200 people will be saved because it is the unique event for L 1 , while the positive focus of Program L 2 is the event that 600 people will be saved with a one-third probability because it is more attractive (salient) than the other event for L 2 . Then the DM compares these two foci and thinks that the event that 200 people will be saved is better than the event that 600 people will be saved with a one-third probability because the DM emphasizes the certain effect. As a result, the DM decides to choose L 1 .
Since Problem II is negatively described, the NES becomes apparent when considering this problem. The negative focus of Program L 3 is the event that 400 people will die because it is the unique event for L 3 , while the negative focus of Program L 4 is the event that 600 people will die with a two-thirds probability because it is more concerning (salient) than the other event for L 4 . Then the DM compares these two foci and thinks that the certain death of 400 people is less acceptable than the death of 600 people with a two-thirds probability. As a result, the DM decides to choose L 4 . Such explanations are exactly the same as the ones given by Tversky and Kahneman (1981).
The third example is an example of violations of stochastic dominance ( Tversky & Kahneman 1986 , p. 264) introduced as follows: There are two lotteries, described by the percentages of marbles of different colors in each box and the amount of money you win or lose depending on the color of a randomly drawn marble. Which lottery do you prefer? Clearly, Lottery II stochastically dominates Lottery I. However, the experiment conducted by Tversky and Kahneman (1986) shows that a majority of subjects (58%) choose stochastically dominated Lottery I.
Let us analyze this problem by the PES. There are two actions: I and II. The positive focus of I is winning $45 with a probability of 6% and the positive focus of II is winning $45 with a probability of 7% because these two events are the most attractive (salient) for I and II. The DM compares these two foci of two actions and feels that they are almost as good as each other. Then the DM further considers the second most attractive (salient) events of I and II, i.e. winning $30 with a probability of 1% and winning $0 with a probability of 90%. Since a 1% chance of winning $30 is better than a 90% chance of winning $0, the DM chooses Lottery I.
The remainder of this paper is organized as follows. In Section 2 , we discuss how the positive focus of an action is selected and how the optimal action is determined based on the positive foci. In Section 3 , we show how PES resolves the St. Petersburg, Allais, and Ellsberg paradoxes. In Section 4 , the concluding remarks are given. In Appendix A , we summarize the mathematical symbols used in PES. In other appendices, we define and characterize the NES, exhibit how framing affects the choice between the PES and the NES in the context of the Asian disease problem, resolve the example of violations of stochastic dominance given in this section, and list the mathematical symbols used in NES. In addition, it should be emphasized that instead of conducting new experiments, we utilize well-documented and reliable experimental data.

Positive foci of an action
Consider an action $a_i \in A = \{a_1, \ldots, a_n\}$ associated with a set of mutually exclusive events $S^i$. Like $S^i$, we use the superscript $i$ in the mathematical symbols to stand for the action $a_i$ throughout the paper. The payoff function of $a_i$ is $v^i : S^i \to \mathbb{R}$. That is, an event $s \in S^i$ corresponds to a payoff $v^i(s)$ when taking an action $a_i$. $s \in S^i$ is an action-specific event and may consist of one or more states. The objective probability of $s \in S^i$, or its exogenously given subjective probability, is $p^i(s)$. Hence, an event $s \in S^i$ can be characterized by $(v^i(s), p^i(s))$. An action $a_i$ with $n(i)$ events is represented as a lottery [...] $(200, .5)\}$, respectively. Such information can be obtained by directly asking the DM. It should be noted that splitting or coalescing events will generate a different decision problem for FTC. We have the following definition to characterize the events.
Like $R_+$, we use the superscript or subscript $+$ in mathematical symbols to stand for the positive evaluation system (PES) throughout the paper. Clearly, $R_+$ satisfies transitivity. That is, if $(s_1, s_2) \in R_+$ and $(s_2, s_3) \in R_+$ then $(s_1, s_3) \in R_+$. Given Definition 1, we have the set of the undominated events which is defined below.
Definition 2. Given $U \subseteq S^i$, the set of $R_+$-maximal elements of $U$, denoted as $F^i(U, R_+)$, is as follows: [...] (1)
$F^i(U, R_+)$ stands for the set of the undominated events $s \in U \subseteq S^i$ that have relatively high probabilities and make $a_i$ generate relatively high payoffs. [...] It should be noted that (1) does not mean that changing the dominated events has no influence on the positive frontier of an action. The reason is as follows: changing the dominated events may cause a reconstruction of events so that a new event can become an undominated one. To facilitate the understanding of the above-mentioned symbols and concepts, let us consider the following numerical example.

Example 1. We set
In the PES, for $a_i \in A$, the primitive of our analysis is a binary preference relation $\succsim_+$ over $S^i$, and we denote the symmetric and asymmetric parts of $\succsim_+$ as $\sim_+$ and $\succ_+$, respectively. $s_1 \succ_+ s_2$ means that the event $s_1$ is more attractive than the event $s_2$. We have the following axioms for $\succsim_+$.
Axiom 1-Decidability: For each a i , a DM can choose the most attractive event from S i .
Axiom 1 postulates that a DM is able to select the most attractive event from among all events of each action. Meanwhile, it implies that the most attractive event is not necessarily derived from a pairwise comparison where completeness and transitivity are needed.
Axiom 2 assumes that an event with a higher probability and a not smaller payoff, or with a not smaller probability and a higher payoff, is felt by the DM to be more attractive (that is, the event has higher salience). This axiom is intuitively appealing for describing the procedure of selecting events because it employs the dominance relationship, which is recognized as the most widely acceptable principle and a compelling reason for choice (Montgomery, 1983; Payne, Bettman, & Johnson, 1992). This axiom represents an optimistic attitude in evaluating events.
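The dominance relation described for Axiom 2 amounts to a Pareto comparison on (payoff, probability) pairs, and the undominated events form the positive frontier. A minimal sketch follows, assuming that reading of the dominance relation; the event data and the function names `dominates` and `positive_frontier` are hypothetical.

```python
from typing import List, Tuple

Event = Tuple[float, float]  # (payoff, probability)


def dominates(s1: Event, s2: Event) -> bool:
    """s1 dominates s2: no worse in payoff and probability, strictly better in one."""
    v1, p1 = s1
    v2, p2 = s2
    return v1 >= v2 and p1 >= p2 and (v1 > v2 or p1 > p2)


def positive_frontier(events: List[Event]) -> List[Event]:
    """Events of one action that no other event of that action dominates."""
    return [s for s in events if not any(dominates(t, s) for t in events)]


# Hypothetical action with four events (illustrative numbers only).
EVENTS: List[Event] = [(100.0, 0.1), (50.0, 0.3), (40.0, 0.2), (10.0, 0.4)]

if __name__ == "__main__":
    # (40, 0.2) is dominated by (50, 0.3); the other three events remain.
    print(positive_frontier(EVENTS))
```

The filter only removes events that are worse on both attributes; the tradeoff between payoff and probability among the remaining events is a separate step.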
It follows from Axioms 1 and 2 that the most attractive event of an action $a_i$ over all $s \in U \subseteq S^i$, denoted as $c^i_+(U)$ [...] is used to represent the set of positive foci of $a_i$ over $U \subseteq S^i$ in the case that multiple positive foci of $a_i$ exist.
In what follows, let us consider how to identify $c^i_+(U)$. First, let us introduce two basic concepts.
[...] is exogenously given to represent the relative position of the likelihood of an event $s \in U \subseteq S^i$. We call it the relative likelihood degree of $s$. Definition 3 considers a general form of the relative likelihood function; each DM has one specific relative likelihood function. We can give a simple relative likelihood function as follows: [...] (2)
Instead of directly using the probability of an event, we employ the relative likelihood degree of an event in FTC mainly for the following three reasons. First, the relative likelihood is regarded as a heuristic substitute for probability in the sense of heuristic attribute substitution (Kahneman & Frederick, 2002): the relative likelihood comes more readily to mind than the target attribute, i.e., probability. Second, probability might fit the long-run perspective of a DM, while the relative likelihood captures the feature of a single-instant event (Gigerenzer, 1994; Wedell & Böckenholt, 1990). Third, it is in accordance with the argument of Bordalo et al. (2012): in the specific context of choice under risk, the relative magnitude is itself a critical determinant of salience.

Definition 4. Denote the set of all payoffs resulting from an action [...]
We abuse notation by writing $u^i$ [...] is exogenously given to represent the relative position of the payoff generated by $s \in U \subseteq S^i$ and $a_i \in A$ among all payoffs from the upside standpoint in the sense that $u^i$ [...] Definition 4 considers a general form of the satisfaction function.
Each DM has one specific satisfaction function. If $v^i(s)$ is a bounded function, as an example, we can set the following linear function [...] where $T_1$ is a predetermined positive constant satisfying [...]
A wealth of evidence shows that although both absolute well-being and relative position seem to matter to people, relative standing is nevertheless significantly important. For example, a study of satisfaction among 257 faculty, students, and staff at the Harvard School of Public Health conducted by Solnick and Hemenway (1998, p. 381) shows, "Half of respondents said they would prefer a world in which they have 50% less real income, so long as they have high relative income." In addition, Frank (1985) finds, "Some whose close associates all earn $50,000 a year is likely to feel actively dissatisfied with his material standard of living if his own salary is only $40,000. Yet that same person would likely be content if his closest associates earned not $50,000 but $30,000 a year" (cited by Solnick & Hemenway, 1998, p. 374). In line with the above studies, we use the satisfaction function to represent the relative position of a payoff in which the reference point is the maximum payoff gained by an action.
Next, let us examine whether $F^i(U, R_+)$ will change if $p^i(s)$ and $v^i(s)$ become $\pi^i_U(s)$ and $u^i_U(s)$, respectively. We have the following lemma.
Lemma 1. (Invariance). Suppose $\theta(\cdot)$ and $\psi(\cdot)$ are strictly increasing functions. Given $U \subseteq S^i$, the positive frontier of $a_i$ with $\theta(p^i(s))$ and $\psi(v^i(s))$ is the same as the one with $p^i(s)$ and $v^i(s)$.
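The invariance property can be illustrated numerically: strictly increasing transformations preserve every pairwise comparison of payoffs and of probabilities, so the set of undominated events does not change. The sketch below assumes that dominance is the Pareto comparison on (payoff, probability) pairs; the event data and the particular transformations are arbitrary illustrative choices.

```python
import math
from typing import Callable, List, Tuple

Event = Tuple[float, float]  # (payoff, probability)


def dominates(s1: Event, s2: Event) -> bool:
    """Pareto dominance on (payoff, probability)."""
    v1, p1 = s1
    v2, p2 = s2
    return v1 >= v2 and p1 >= p2 and (v1 > v2 or p1 > p2)


def frontier_indices(events: List[Event]) -> List[int]:
    """Indices of the undominated events."""
    return [i for i, s in enumerate(events)
            if not any(dominates(t, s) for t in events)]


def transform(events: List[Event],
              psi: Callable[[float], float],
              theta: Callable[[float], float]) -> List[Event]:
    """Apply psi to payoffs and theta to probabilities."""
    return [(psi(v), theta(p)) for v, p in events]


# Hypothetical events (illustrative numbers only).
EVENTS: List[Event] = [(100.0, 0.1), (50.0, 0.3), (40.0, 0.2), (10.0, 0.4)]

if __name__ == "__main__":
    original = frontier_indices(EVENTS)
    # log1p and sqrt are strictly increasing on the non-negative values used here.
    rescaled = frontier_indices(transform(EVENTS, math.log1p, math.sqrt))
    print(original == rescaled)
```

This is why the frontier can be computed either from raw probabilities and payoffs or from any monotone rescaling of them.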
It follows from Lemma 1 that for $U \subseteq S^i$ the positive frontier of $a_i$ with $p^i(s)$ and $v^i(s)$ is the same as the one with $\pi^i_U(s)$ and $u^i_U(s)$. We have the following theorem for characterizing the positive focus $c^i_+(U)$.

Theorem 1. (Representation theorem of positive foci). [...] holds.

Comments. It follows from Theorem 1 that [...] (7), we know that increasing $\phi^i(U)$ will lead to a positive focus of $a_i$ with a relatively high satisfaction level (payoff) and a relatively low likelihood (probability). Choosing a positive focus of $a_i$ over $U \subseteq S^i$ with a relatively high value $\phi^i(U)$ means that the DM is willing to pursue a high payoff by sacrificing the probability of that payoff, so $\phi^i(U)$ is used to characterize the degree of emphasizing possible payoff for choosing a positive focus of $a_i$. A higher value of $\phi^i(U)$ corresponds to a higher degree of emphasizing possible payoff when choosing a positive focus of $a_i$. We name $\min(\pi^i_U(s), (1/\phi^i(U)) \cdot u^i_U(s))$, where $s \in F^i(U, R_+)$, the attractiveness level of the undominated event of $a_i$. Thus, Theorem 1 states that the positive focus of an action is the undominated event with the highest attractiveness level. It should be noted that $\min(\pi^i_U(s), (1/\phi^i(U)) \cdot u^i_U(s))$ is only for seeking the most attractive event. That is, we cannot claim that the event having the second largest value of $\min(\pi^i_U(s), (1/\phi^i(U)) \cdot u^i_U(s))$ is more attractive than the one having the third largest value.

From the above introduction, we know that the positive focus of an action is selected by two consecutive steps: in the first step, the events of this action are screened on the basis of the Pareto criterion, and the undominated events form the positive frontier of this action; in the second step, the tradeoff between payoff and probability is made, which is described in Theorem 1. This idea is closely related to sequentially rationalizable choice (Manzini & Mariotti, 2007), in which the inferior alternatives are removed first and a fairness criterion is then used for choosing the best one from among the remaining alternatives.
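The two-step selection of a positive focus can be sketched as follows. This is an illustration under stated assumptions rather than the paper's exact formulation: the relative likelihood is taken to be the probability divided by the largest probability in the event set, and the satisfaction level to be the payoff divided by the largest payoff (positive payoffs only). Both functional forms, the event data, and all function names are hypothetical stand-ins.

```python
from typing import List, Tuple

Event = Tuple[float, float]  # (payoff, probability)


def dominates(s1: Event, s2: Event) -> bool:
    """Pareto dominance on (payoff, probability)."""
    v1, p1 = s1
    v2, p2 = s2
    return v1 >= v2 and p1 >= p2 and (v1 > v2 or p1 > p2)


def positive_frontier(events: List[Event]) -> List[Event]:
    """Step 1: keep only the undominated events."""
    return [s for s in events if not any(dominates(t, s) for t in events)]


def relative_likelihood(p: float, events: List[Event]) -> float:
    """ASSUMED form: probability relative to the most likely event."""
    return p / max(q for _, q in events)


def satisfaction(v: float, events: List[Event]) -> float:
    """ASSUMED form: payoff relative to the largest (positive) payoff."""
    return v / max(w for w, _ in events)


def positive_focus(events: List[Event], phi: float) -> Event:
    """Step 2: undominated event maximizing min(likelihood, satisfaction / phi)."""
    def attractiveness(s: Event) -> float:
        v, p = s
        return min(relative_likelihood(p, events), satisfaction(v, events) / phi)
    return max(positive_frontier(events), key=attractiveness)


# Hypothetical action with four events (illustrative numbers only).
EVENTS: List[Event] = [(100.0, 0.1), (50.0, 0.3), (40.0, 0.2), (10.0, 0.4)]

if __name__ == "__main__":
    for phi in (0.1, 1.0, 10.0):
        print(phi, positive_focus(EVENTS, phi))
```

In this toy example a larger `phi` moves the focus toward the event with the higher payoff and lower probability, which matches the interpretation of the parameter as the degree of emphasis on the possible payoff.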

Optimal action based on positive foci
We consider the relationships between two actions' positive foci. They are summarized as the following definitions.
Definition 5. [...], then it is said that $s_1$ dominates $s_2$ between $a_i$ and $a_j$ in the PES, denoted as $(s_1, s_2) \in Q_+$. From Definition 5, we understand that $(s_1, s_2) \in Q_+$ means that an event $s_1$ of $a_i$ can bring a payoff at least as large as that of an event $s_2$ of $a_j$, and the occurrence probability of $s_1$ is at least that of $s_2$.
Definition 6. For $s_1 \in S^i$ and $s_2 \in S^j$, set $\alpha$ [...] Here $V_o$ is a predetermined large positive constant representing the tolerance level for the difference of two payoffs. If $v^i(s_1)$ and $v^j(s_2)$ are positive, usually we can take [...] Definition 6 is closely related to Rubinstein's $\varepsilon$-indifference similarity (Rubinstein, 1988). $(s_1, s_2) \in {=_\delta}$ means that $s_1$ and $s_2$ are equally preferred at the level $\delta$.
Definition 7. Given $H \subseteq \bigcup_{j=1,\ldots,n} S^j$, the set of $Q_+$-maximal elements of $H$, denoted as $F(H, Q_+)$, is as follows: [...] $F(H, Q_+)$ is the set of the events undominated by the events of the other actions. We set [...]; then $F(C_+, Q_+)$ stands for the set of the undominated foci (attractive events) with the relatively high probabilities and the relatively high payoffs.
[...] $D_+(G)$ stands for the set of actions whose positive foci belong to $F(G, Q_+)$.
We have the following axioms for characterizing the optimal action in the PES.
Axiom 3-Decidability: A DM can choose the most preferred action a i * from A .
We relax the assumptions of completeness and transitivity in standard economic theory and replace them by decidability. It means that a DM can determine his most-preferred action but there is no need to judge between any pair of actions. This assumption is intuitively appealing because in the real world the observable and observed action is usually the optimal action itself.
Axiom 4-Focus lexicographical dominance: [...] We call $s^{(l)}$ the $l$th order positive focus of $a_{i^*}$. Axiom 4 links the selection of an action to the choosing of a positive focus. In other words, which action is chosen as the most preferred depends on which action's positive focus is the most attractive to the DM. Axiom 4(1) means that a positive focus of the most preferred action $a_{i^*}$ belongs to $F(C_+, Q_+)$ and it is the most attractive to the DM. Axiom 4(2) means that although the positive foci of $a_{i^*}$ do not belong to $F(C_+, Q_+)$, one positive focus of $a_{i^*}$ is identical with a positive focus of another action $a_j$ belonging to $F(C_+, Q_+)$ at the level $\delta^{(0)}$, the $l$th order positive focus of $a_{i^*}$ is identical with the one of $a_j$ at the level $\delta^{(l)}$ for $l = 1, \ldots, m - 1$, and an $m$th order positive focus of $a_{i^*}$ is more attractive than the one of $a_j$. For the sake of simplicity, we have confined ourselves to the cases in which the positive foci of two actions are equally attractive at some level; the theory can be easily extended to multiple actions. In order to know which positive focus is the most attractive among $F(C_+, Q_+)$, let us consider the following definitions.
[...] We can give a simple readjusted likelihood function as follows: [...] We abuse notation by writing $u_{(H)}(s, a_i)$ in place of $u_{(H)}(v^i(s))$, where $s \in S^i \cap F(H, Q_+)$ and $a_i \in D_+(H)$. We can give a simple readjusted satisfaction function as follows: [...] where $T_2$ is a predetermined positive constant satisfying [...] We have the following theorem for characterizing the optimal action in the PES.
Theorem 2. (Representation theorem for an optimal action in the PES).
(1) If $a_{i^*}$ satisfies Axiom 4(1), then there exists a function [...] for $c^{i^*}_+(S^{i^*}) \in F(C_+, Q_+)$, where $\kappa > 0$, and $a_g$ is the action satisfying [...]
(2) If $a_{i^*}$ satisfies Axiom 4(2), $\exists\, t^{(0)} = c^j$ [...] where $\kappa^{(m)} > 0$, $B_+$ is given in Axiom 4 and $a_q$ is the action $a_{i^*}$ or [...] holds.

Proof. Setting [...], it is easy to prove (15) by following the same procedure as used for proving Theorem 1. Similarly, we can prove Theorem 2(2).
It follows from Theorem 2(1) that $\kappa$ is endogenously derived from the observed optimal action and its positive foci. Conversely, setting a value of $\kappa$, we can obtain $a_{i^*}$ and $c^{i^*}_+(S^{i^*}) \in F(C_+, Q_+)$. From $F(C_+, Q_+)$ and (17), we know that increasing $\kappa$ will lead to an optimal action whose positive focus has a relatively high readjusted satisfaction level (payoff) and a relatively low readjusted likelihood degree (probability). Hence, setting $\kappa$ to a high value means that the DM is willing to pursue a high payoff by sacrificing its probability (the so-called possible effect), whereas a low value for $\kappa$ means that the DM chooses the high probability while sacrificing the payoff (the so-called certain effect). In other words, increasing $\kappa$ means that the DM tends to pursue the possible effect (emphasizing payoff); decreasing $\kappa$ means that the DM tends to pursue the certain effect (emphasizing probability). Following the same logic as used for Theorem 2(1), we can explain the cases for Theorem 2(2).
We name $\min(\pi^+_{C_+}(s), (1/\kappa) \cdot u_{(C_+)}(s, a_i))$, where $s \in F(C_+, Q_+)$ and $a_i \in D_+(C_+)$, the attractiveness level of $a_i$. Thus, Theorem 2(1) states that the optimal action in the PES is the one with the highest attractiveness level. Likewise, we can explain the cases for Theorem 2(2), where $\min(\pi^+_{B_+}(s), (1/\kappa^{(m)}) \cdot u_{(B_+)}(s, a_q))$ is the $m$th order attractiveness level of $a_q$. It should be noted that $\min(\pi^+_{C_+}(s), (1/\kappa) \cdot u_{(C_+)}(s, a_i))$ is only for seeking the most attractive action. In other words, it does not make sense to claim that the action having the second largest value of $\min(\pi^+_{C_+}(s), (1/\kappa) \cdot u_{(C_+)}(s, a_i))$ is more attractive than the one having the third largest value. If a DM prefers multiple actions in the PES, then these preferred actions are called equally optimal actions in the PES. If a common $\kappa$ does not exist in the case of multiple optimal actions, we say that the DM has inconsistent attitudes toward pursuing the possible effect.
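The role of $\kappa$ can be illustrated with the two positive foci discussed for Problem I of the Asian disease example: 200 people saved for certain, and 600 people saved with probability one-third. The sketch below is a simplification under loudly stated assumptions: the readjusted likelihood is taken to be the probability divided by the largest probability among the foci, and the readjusted satisfaction to be the payoff divided by the largest payoff among the foci. These forms and the function names are hypothetical stand-ins for the paper's definitions, and the dominance filter is skipped because neither focus dominates the other.

```python
from fractions import Fraction
from typing import Dict, Tuple

Focus = Tuple[Fraction, Fraction]  # (payoff, probability)

# Positive foci of the two programs in Problem I (lives saved, probability).
FOCI: Dict[str, Focus] = {
    "L1": (Fraction(200), Fraction(1)),
    "L2": (Fraction(600), Fraction(1, 3)),
}


def attractiveness(focus: Focus, foci: Dict[str, Focus], kappa: Fraction) -> Fraction:
    """min(readjusted likelihood, readjusted satisfaction / kappa), ASSUMED forms."""
    v, p = focus
    max_p = max(q for _, q in foci.values())  # assumed likelihood normalizer
    max_v = max(w for w, _ in foci.values())  # assumed payoff normalizer
    return min(p / max_p, (v / max_v) / kappa)


def optimal_action(foci: Dict[str, Focus], kappa: Fraction) -> str:
    """Action whose positive focus has the highest attractiveness level."""
    return max(foci, key=lambda a: attractiveness(foci[a], foci, kappa))


if __name__ == "__main__":
    print(optimal_action(FOCI, Fraction(1, 2)))  # low kappa: emphasis on probability
    print(optimal_action(FOCI, Fraction(2)))     # high kappa: emphasis on payoff
```

Under these assumed forms a low $\kappa$ selects the certain program and a high $\kappa$ selects the risky one, in line with the reading of $\kappa$ as the balance between the certain effect and the possible effect.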
In the PES, κ represents the weight which a DM is willing to put on the satisfaction level over the relative likelihood degree.
The parameter κ does not take a unique value when determining one positive focus. This is not a shortcoming of FTC, because what matters is which focus is ultimately chosen.
The maximin criterion was criticized by Harsanyi (1975) because it requires the DM to evaluate every available action only in terms of the worst case. Although FTC utilizes the maximin operator, it also incorporates the relative likelihood, and so it simply eliminates the possibility of obtaining extreme results.
Furthermore, changing the value of the parameter κ corresponds to different behaviors of a DM.
It is correct that Theorems 1 and 2 can be reformulated directly in terms of probabilities $p^i(s)$ and payoffs $v^i(s)$. However, $\phi^i(U)$ in Theorem 1 and $\kappa$ in Theorem 2 have a clear meaning when the satisfaction function and the relative likelihood function are used. For example, on the one hand, setting $\phi^i(U) = 1$ means that the DM wants to seek a focus while thinking that payoff and probability are equally important; on the other hand, a value $\phi^i(U) = 1$ derived from the chosen focus shows that the DM puts the same weight on payoff and probability. Another reason why the relative likelihood function and the satisfaction function are used is that they are strongly supported by psychological evidence.
Let us go back to Example 1. Since [...] (8) and (10), we have [...] (11) and (14) and setting $\kappa = 1$, we calculate the attractiveness levels of $a_1$ and $a_3$ as [...] $0.77$ and $0.88$, respectively. It follows from (15) that $a_{i^*} = a_3$ and $c^{i^*}_+(S^{i^*}) = s^3_3$ because $0.88 > 0.77$; that is, the optimal action is $a_3$ and its positive focus is $s^3_3$.
We postulate that in general a positively framed problem activates the PES, while a negatively framed problem usually makes the NES apparent. In the NES, for each action a specific event which generates a relatively low payoff with a relatively high probability is the negative focus (the most salient event) of this action. Then, based on the negative foci of all actions, the action whose negative focus is the most acceptable is chosen as the most preferred. The theoretical framework of the NES is given in Appendix B. In the next section, we resolve the St. Petersburg, Allais, and Ellsberg paradoxes with the PES. Throughout the paper, we use $L_1 > L_2$ ($L_1 = L_2$) to denote that the action $L_1$ is preferred to the action $L_2$ ($L_1$ and $L_2$ are equally preferred).

The St. Petersburg paradox
The St. Petersburg paradox was proposed by Nicolas Bernoulli in 1713 as follows. A fair coin is tossed at each stage. Once a tail appears, the game ends and the player obtains $2^m$ dollars, where $m$ equals the number of tosses. How much is a player willing to pay for this game?
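The source of the paradox is that the game ends at toss $m$ with probability $2^{-m}$ and then pays $2^m$ dollars, so every possible stopping time contributes exactly one dollar to the expected value. A short check with exact arithmetic follows; the function name is illustrative.

```python
from fractions import Fraction


def truncated_expected_value(max_tosses: int) -> Fraction:
    """Expected payoff when only games ending within max_tosses tosses are counted."""
    return sum(
        (Fraction(2 ** m) * Fraction(1, 2 ** m) for m in range(1, max_tosses + 1)),
        Fraction(0),
    )


if __name__ == "__main__":
    for n in (1, 10, 100):
        # Each term equals 1, so the partial sum grows linearly without bound.
        print(n, truncated_expected_value(n))
```

Because the partial sums equal the number of tosses allowed, the expected value of the untruncated game is infinite, although few people would pay more than a modest amount to play.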
It follows from (22) and (21) that $y$ can take a value in $(2, 2^6)$ depending on the value of $\kappa$; the bigger the value of $\kappa$, the larger the value of $y$. This means that the more one is willing to pursue the possible effect, the more one is willing to pay.
Next, let us discuss the parameter $\phi(S)$. Increasing $\phi(S)$ in (6) from 1 to, for example, 10 will change the positive focus of PLAY from HHHHHT to HHHHHHHT. Likewise, we know that $y$ can take a value in $(2, 2^8)$. It means that the player who focuses on the event with the higher payoff is willing to pay more for PLAY. It can be easily understood that increasing the payoffs, for example, from $2^k$ to $3^k$, can make a DM more aggressive, that is, increase $\phi$ and $\kappa$. Using the same logic as shown above, we can argue that increasing the payoffs of the game will make the player willing to pay more.
Although Daniel Bernoulli resolved the St. Petersburg paradox by introducing a utility function, the paradox reoccurs to challenge cumulative prospect theory (CPT). Blavatskyy (2005) proves that CPT cannot avoid the St. Petersburg paradox unless the power coefficient of the utility function is lower than the power coefficient of the probability weighting function. In addition, Rieger and Wang (2006) point out that in CPT a prospect with a finite expected value may have an infinite subjective value. In contrast, FTC can resolve all types of St. Petersburg problems because, instead of taking a weighted sum, it focuses on only a single event.

The Allais paradox
The Allais paradox (Allais, 1953) is described as follows. A subject is asked to choose between the following two gambles: Gamble $L_1$: 100% chance of receiving $100 million; Gamble $L_2$: 10% chance of receiving $500 million, 89% chance of receiving $100 million, 1% chance of receiving nothing.
Then the same subject is asked to choose between the following two gambles: Gamble $L_3$: 11% chance of receiving $100 million, 89% chance of receiving nothing; Gamble $L_4$: 10% chance of receiving $500 million, 90% chance of receiving nothing.
Empirical studies show that most subjects exhibit $L_1 > L_2$ but $L_4 > L_3$, which violates the independence axiom.
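The violation can be checked numerically. The sketch below, which assumes nothing beyond the four gambles stated above, computes expected-utility differences for a few utility functions; the three utility functions and all identifiers are arbitrary illustrative choices and are not taken from the paper.

```python
import math

# Gambles as lists of (payoff in millions of dollars, probability).
L1 = [(100, 1.00)]
L2 = [(500, 0.10), (100, 0.89), (0, 0.01)]
L3 = [(100, 0.11), (0, 0.89)]
L4 = [(500, 0.10), (0, 0.90)]

# Arbitrary increasing utility functions, used only for illustration.
UTILITIES = {"linear": lambda x: x, "sqrt": math.sqrt, "log1p": math.log1p}


def expected_utility(gamble, u):
    """Probability-weighted sum of utilities over the outcomes of a gamble."""
    return sum(p * u(x) for x, p in gamble)


def eu_differences(u):
    """Return (EU(L1) - EU(L2), EU(L3) - EU(L4)) for the utility function u."""
    d12 = expected_utility(L1, u) - expected_utility(L2, u)
    d34 = expected_utility(L3, u) - expected_utility(L4, u)
    return d12, d34


for name, u in UTILITIES.items():
    d12, d34 = eu_differences(u)
    # Both differences reduce to 0.11*u(100) - 0.10*u(500) - 0.01*u(0).
    print(f"{name}: EU(L1)-EU(L2) = {d12:.4f}, EU(L3)-EU(L4) = {d34:.4f}")
```

Since the two differences coincide for every utility function, an expected-utility maximizer who prefers $L_1$ to $L_2$ must also prefer $L_3$ to $L_4$, which is the opposite of the observed pattern.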
Since decreasing $\kappa$ ($\kappa < 2$) reflects the attitude of pursuing the certain effect, i.e., emphasizing probability, we argue that the DMs who prefer $L_1$ to $L_2$ emphasize the certain effect.
Considering $\kappa$ in the first and second problems, we know that any $\kappa \in$ ( . 22 , 2 ) leads to $L_1 > L_2$ but $L_4 > L_3$. Allais (1953) expects that people faced with these choices might opt for $L_1$ in the first problem, lured by the certainty of becoming a millionaire, and select $L_4$ in the second problem, in which the odds of winning seem very similar but the prizes are very different. Clearly, the explanation given by FTC is similar to that of Allais (1953).

The Ellsberg paradox
We examine the Ellsberg paradox (Ellsberg, 1961) with the PES, where the numbers of balls are reduced to 10 percent of the original ones for simplicity.
Subjects confront two urns containing well-mixed red and black balls. Urn I contains exactly 5 red and 5 black balls. Urn II contains 10 balls, each red or black, in an entirely unknown ratio. You have to choose one urn and draw a ball at random from it. Please decide which you prefer, Urn I or Urn II, in the following two games.
Game A: If you draw a red ball, you will receive $100 and if you draw a black one, you will receive nothing.
Game B: If you draw a black ball, you will receive $100 and if you draw a red one, you will receive nothing.
Empirical evidence shows that most subjects choose Urn I in both Game A and Game B, which violates Savage's axioms.
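Why this pattern conflicts with subjective expected utility can be illustrated with a short Python check. The sketch assumes a DM who assigns a single subjective probability to drawing red from Urn II and who values $100 more than nothing; the function name and the grid of candidate probabilities are illustrative assumptions.

```python
from fractions import Fraction


def seu_prefers_urn_one_in_both(p_red_urn_two):
    """True if an SEU maximizer with u(100) > u(0) strictly prefers Urn I
    in both games, given a subjective probability of red in Urn II."""
    half = Fraction(1, 2)  # probability of either colour in Urn I
    prefers_in_game_a = half > p_red_urn_two      # Game A pays on red
    prefers_in_game_b = half > 1 - p_red_urn_two  # Game B pays on black
    return prefers_in_game_a and prefers_in_game_b


# Search a grid of candidate subjective probabilities for Urn II.
candidates = [Fraction(k, 100) for k in range(101)]
print(any(seu_prefers_urn_one_in_both(p) for p in candidates))
```

No candidate satisfies both conditions, because they would require the probabilities of red and black in Urn II to be both below one half.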
Let us analyze Game A. The set of actions is {choosing Urn I, choosing Urn II}. For choosing Urn I, the set of events is {Black, Red}, and clearly its positive focus is (100, 0.5). Since we do not know the ratio of red balls to black balls in Urn II, choosing Urn II in Game A is equivalent to a two-stage procedure as follows. In Step I, choose one type of urn from 11 types of uniformly distributed urns. In FTC, each action-specific event has two necessary components, payoff and probability, which are exogenously given. For the action of choosing Urn II, we are unable to assign probabilities to Red and Black directly, so we use the two-stage procedure to determine them. This idea is not new. To accommodate ambiguity aversion in choice behavior, second-order probabilities have been used in the literature (e.g., Segal, 1987, 1990), where horse lotteries and roulette lotteries are taken into account. The model based on FTC shares the same roulette lotteries as the models based on second-order probabilities. However, in the models based on second-order probabilities, subjective second-order probabilities are endogenously derived from choice behavior, whereas in the model based on FTC, second-order probabilities are exogenously set as a uniform distribution.
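The two-stage view of Urn II can be sketched in Python as follows. The sketch assumes that the 11 urn types are the compositions with 0 to 10 red balls, each with probability 1/11, and that the second stage is a random draw of one ball from the selected urn; both points are assumptions made for illustration, as are all identifiers.

```python
from fractions import Fraction

TOTAL_BALLS = 10

# Stage I (assumed): 11 urn types with 0..10 red balls, each equally likely.
urn_types = [(red, Fraction(1, TOTAL_BALLS + 1))
             for red in range(TOTAL_BALLS + 1)]


def compound_events(urn_types):
    """Joint probability of each (urn type, colour) pair in the two stages."""
    events = {}
    for red, p_urn in urn_types:
        events[(red, "Red")] = p_urn * Fraction(red, TOTAL_BALLS)
        events[(red, "Black")] = p_urn * Fraction(TOTAL_BALLS - red, TOTAL_BALLS)
    return events


events = compound_events(urn_types)
# Overall probability of drawing a red ball from Urn II.
p_red = sum(p for (_, colour), p in events.items() if colour == "Red")
print(p_red)
```

The sketch only enumerates the events and their exogenously given probabilities; how a focus is then selected among them is governed by the model's own equations, which are not reproduced here.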
The Ellsberg problem has been regarded as the typical problem which highlights the difference between risk and ambiguity. However, from the above analysis, we know that FTC provides a unified framework to handle decision making under risk and ambiguity.

Concluding remarks
According to the definition given by Simon (1976), there are two kinds of theories for modelling rationality: substantively rational theories and procedurally rational theories. Except for a few models, such as Rubinstein (1988), the existing rational theories are substantively rational ones. The basic idea of these theories is to replace or relax some of the axioms of the expected utility theory or the subjective expected utility theory. However, empirical studies show that the new theories generate new paradoxes. This paper provides a fundamental theory with a few intuitively appealing axioms for modelling procedural rationality and demonstrates that the focus theory of choice accounts for several empirical phenomena and handles decision making with risk or under ambiguity or under ignorance within a unified framework.
The core argument of FTC is that the most salient event corresponds to the most-preferred action. The process of seeking the most salient event involves two steps: first, the salient event (focus) of each action is chosen; then the most salient event is selected from among the foci of all actions. Interestingly, we have found several pieces of psychological evidence in the paper by Stewart et al. (2016), for example, very little systematic variation in eye movements over the time course of a choice or across different choices; more eye movements when choice options were similar; and people choosing the gamble they look at more often. Considering the two-step decision process and the roles of foci in FTC, it is easily understood that this evidence is consistent with the basic assumptions of FTC.
Prospect theory (Kahneman & Tversky, 1979) uses a concave function to evaluate gains and a convex function to evaluate losses, and these reflect risk aversion and risk seeking, respectively, within the framework of a weighted average. We use the PES and NES to correspond to gain and loss, respectively; that is, in the PES, the reference point is the highest payoff, while in the NES, it is the largest loss, assuming that the others are normalized by them. Shafir, Simonson, and Tversky (1993) classify decision models into two classes: formal (value-based) models and reason-based models. A formal model "typically associates a numerical value with each alternative, and characterizes choice as the maximization of value," while a reason-based model "identifies various reasons and arguments that are purported to enter into and influence decision, and explains choice in terms of the balance of reasons for and against the various alternatives" (Shafir et al., 1993, p. 12). There have been some studies of reason-based choices with multiple attributes (e.g., de Clippel & Eliaz, 2012; Shafir et al., 1993). However, there has been little consideration of how this applies to lottery choices. In FTC, we postulate that the revealed focus of the optimal action is the reason for the choice. Although a DM is sometimes unaware of the precise factor that determines his optimal action (Nisbett & Wilson, 1977), the focus of the chosen action can nevertheless play three roles: we can gain insight into a DM's behavior, we can clarify the reason behind our own decisions, and we can assist a DM to determine the "right" decision. Since the focus of the optimal action captures significant aspects of a DM's deliberation, we believe that FTC can be applied to complex, real-world decisions of the type that might be difficult for some existing models.
FTC provides a rigorous formal underpinning for modeling procedural rationality in management-related disciplines. Although it is well known that behavioral factors are very important in operational research (Becker, 2016; Brocklesby, 2016; Franco & Hämäläinen, 2016; Villa & Castañeda, 2018; White, 2016), it is still difficult to incorporate the personality traits of players into mathematical models due to the lack of appropriate theories. FTC provides a theoretical base for building behavioral models in operational research; as a special case of FTC, one-shot decision theory (Guo, 2011) has been applied to auction problems (Wang & Guo, 2017), newsvendor problems for innovative products (Guo & Ma, 2014), multistage decision making (Guo & Li, 2014; Li & Guo, 2015), duopoly markets of innovative products (Guo, 2010a; Guo, Yan, & Wang, 2010) and private real estate investment (Guo, 2010b). Using FTC, we can build models for supply chain management while considering players' behavioral features in designing customized contracts. Further, FTC provides a possible theoretical base for analyzing the newsvendor anomalies (e.g., Schweitzer & Cachon, 2000; Gavirneni & Isen, 2010), the bullwhip effect (e.g., Lee et al., 1997; Croson & Donohue, 2006; Chen et al., 2017), and information-sharing strategies from a behavioral perspective.
The FTC-based decision problem is mathematically a bilevel programming problem in which the upper level program is for determining the optimal alternative while the lower level program is for seeking the foci of alternatives. These bilevel problems are fundamental, interesting and challenging ( Zhu & Guo, 2017 ). In addition, FTC provides a completely new theoretical base for dealing with the uncertainty in stochastic optimization problems. It can help researchers to build scenario-based decision models ( Zhu & Guo, 2016 ).
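The bilevel structure can be sketched in a few lines of Python. The scoring rule used here (the minimum of a normalized likelihood and a normalized payoff, maximized over events) and both normalizations are assumptions chosen purely for illustration; they are not the paper's equations and they omit the parameters $\phi$ and $\kappa$. The action data and all identifiers are hypothetical.

```python
def positive_focus(events, v_max):
    """Lower level: pick the event of one action with the largest score.

    events is a list of (payoff, probability) pairs. The score of an event
    is min(relative likelihood, satisfaction), an ASSUMED simplified rule.
    """
    p_max = max(p for _, p in events)
    scored = [(min(p / p_max, v / v_max), (v, p)) for v, p in events]
    return max(scored)  # (score, focus event)


def choose_action(actions):
    """Upper level: pick the action whose focus has the largest score."""
    # Assumed normalization: payoffs are nonnegative and scaled by the
    # largest payoff over all actions.
    v_max = max(v for events in actions.values() for v, _ in events)
    foci = {name: positive_focus(events, v_max)
            for name, events in actions.items()}
    best = max(foci, key=lambda name: foci[name][0])
    return best, foci


# Hypothetical actions, each a list of (payoff, probability) pairs.
actions = {
    "a1": [(10, 0.7), (40, 0.3)],
    "a2": [(25, 0.5), (30, 0.5)],
}
best, foci = choose_action(actions)
print(best, foci[best])
```

The nesting mirrors the two levels described above: `positive_focus` solves one lower-level problem per alternative, and `choose_action` solves the upper-level problem over the resulting foci.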
This research has several limitations. Although FTC has succeeded in accounting for several well-observed anomalies, psychological experiments to verify the proposed axioms have not yet been conducted; this is left for our future research. Since the relative likelihood and the satisfaction are exogenously given, specifying them demands additional cognitive effort. However, this burden could be remarkably reduced by using linear normalization functions, such as (2) and (5).

Acknowledgments
This work was supported by JSPS KAKENHI Grant Number 15K03599.

Supplementary materials
Supplementary material associated with this article can be found, in the online version, at doi: 10.1016/j.ejor.2019.01.019.
$(v_i(s), p_i(s))$: the event $s \in S_i$ which makes $a_i$ generate the payoff $v_i(s)$ with the probability $p_i(s)$.
$\pi^U_i(s)$: the relative likelihood of an event $s \in U \subseteq S_i$.
$(s_1, s_2) \in {=_\delta}$: $s_1 \in S_i$ is identical with $s_2 \in S_j$ at the level $\delta$.
$(s_1, s_2) \in R^+$: $s_1 \in S_i$ positively dominates $s_2 \in S_i$.
$F_i(U, R^+)$: the set of the undominated events $s \in U \subseteq S_i$ with regard to $R^+$.
s 1 + s 2 : the event $s_1 \in S_i$ is more attractive than the event $s_2 \in S_i$.
$D^+(H)$: the set of actions $a_i \in A$ whose positive foci belong to $F(H, Q^+)$, $H \subseteq \cup_i S_i$.
$a_{i^*}$: the most preferred action amongst $A$.
$\pi^+_{C^+}(s)$: the readjusted likelihood degree of $s \in F(C^+, Q^+)$.
u ( C + ) (·) : the readjusted satisfaction function for $s \in F(C^+, Q^+)$.
$c^+_{i^*}(S_{i^*})$: the positive focus of $a_{i^*}$ over $S_{i^*}$.