The duration problem with multiple exchanges

We treat a version of the multiple-choice secretary problem called the multiple-choice duration problem, in which the objective is to maximize the time of possession of relatively best objects. It is shown that, for the $m$-choice duration problem, there exists a sequence $(s_1, s_2, \ldots, s_m)$ of critical numbers such that, whenever there remain k choices yet to be made, the optimal strategy immediately selects a relatively best object if it appears at or after time $s_k$ ($1\leq k\leq m$). We also exhibit an equivalence between the duration problem and the classical best-choice secretary problem. A simple recursive formula is given for calculating the critical numbers when the number of objects tends to infinity. Extensions are made to models involving an acquisition or replacement cost.

1 Introduction and summary

Ferguson et al. (1992) were the first to consider a sequential observation and selection problem referred to as the duration problem, a variation of the classical secretary problem as treated by Gilbert and Mosteller (1966) and others (see Ferguson (1989) and Samuels (1991) for a history and review of the secretary problem). The basic form of the duration problem in the no-information setting can be described as follows: a set of n rankable objects appears one at a time in random order with all n! permutations equally likely. As each object appears, we decide either to select or reject it based on the relative ranks of the objects observed so far. The payoff is the length of time we are in possession of a relatively best object, that is, an object better than all that have appeared to date. Thus we will select only a relatively best object, receiving unit payoff as we do so and an additional unit for each new observation, as long as the selected object remains relatively best. Though Ferguson et al. (1992) considered the various models extensively, they confined themselves to the study of the one-choice problem. We consider here, as a natural generalization of their study, the no-information multiple-choice duration problem and its modifications. Preliminary results are included in Tamaki et al. (1998). The multiple-choice duration problem is reformulated as a multiple optimal stopping problem, which has been treated by many authors. The double-stopping problem was posed by Haggstrom (1967) and, for discrete-time Markov processes, has been considered by Ehjdukyavichyus (1979), Nikolaev (1979, 1998) and Stadje (1985).
For the m-choice duration problem, we choose at most m objects sequentially, and receive unit payoff at each time point as long as the last chosen object remains a candidate, that is, a relatively best object. Only candidates can be chosen, the objective being to maximize the expected total payoff. More formally, this problem can be described as follows: let $T_n(i)$ denote the arrival time of the first candidate after time i if there is one, and n + 1 if there is none, so $T_n(i) : \Omega \to \{i+1, \ldots, n+1\}$. Then $D_n(i) \equiv T_n(i) - i$ is the duration of the candidate selected at time i, and the objective is to find a stopping vector $(\tau_1^*, \tau_2^*, \ldots, \tau_m^*)$ such that
$$E\Big[\sum_{k=1}^{m} D_n(\tau_k^*)\Big] = \sup_{(\tau_1, \ldots, \tau_m) \in C_m} E\Big[\sum_{k=1}^{m} D_n(\tau_k)\Big].$$
Here $\tau_i$ ($1 \le i \le m$) denotes the stopping time related to the i-th choice and $C_m$ is the set of all possible vectors $(\tau_1, \tau_2, \ldots, \tau_m)$.
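The payoff structure is easy to probe by simulation in the one-choice case with a known number n of objects. The sketch below (Python; the cutoff $n e^{-2}$ anticipates the asymptotically optimal threshold discussed in Section 2.4, and the limiting proportional payoff $2e^{-2} \approx 0.2707$ is the value quoted in the tables of Section 3.2; the parameters n = 400 and 20000 trials are illustrative) estimates the expected proportional duration of a threshold rule.

```python
import math
import random

def one_choice_duration(n, s, rng):
    """One play: objects arrive in random order; select the first candidate
    (relatively best object so far) at time >= s; the payoff is the number
    of periods until a strictly better object arrives (or time runs out)."""
    ranks = list(range(n))
    rng.shuffle(ranks)              # ranks[i-1] = absolute rank of object i (0 = best)
    best, chosen_at = n, None
    for i, r in enumerate(ranks, start=1):
        if r < best:                # object i is a candidate
            if chosen_at is not None:
                return i - chosen_at        # candidature of our pick ends now
            if i >= s:
                chosen_at = i
            best = r
    # no better object appeared after selection: paid until time n + 1
    return (n + 1 - chosen_at) if chosen_at is not None else 0

rng = random.Random(2024)
n = 400
s = round(n * math.exp(-2))         # asymptotically optimal cutoff
trials = 20000
est = sum(one_choice_duration(n, s, rng) for _ in range(trials)) / (trials * n)
```

The estimate `est` should lie near 0.27, the limiting value $2e^{-2}$ of the expected proportional duration.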
A generalization of this problem is considered in Section 2.3, in which we allow the number M of objects presented to be a random variable. We show for the m-choice duration problem that, subject to a condition on the distribution of M, there exists a nonincreasing sequence $(s_1, s_2, \ldots, s_m)$ of critical positive integers such that, whenever there remain k choices to be made, the optimal strategy immediately selects a candidate if it appears at or after time $s_k$ ($1 \le k \le m$). That is, the optimal strategy has a threshold form. In Section 2.3 we show that this condition is satisfied in the two particular cases when M ∼ M_s(n) (the degenerate distribution with P{M = n} = 1) and M ∼ M_u(n) (the discrete uniform distribution with P{M = i} = 1/n for i = 1, \ldots, n).
In Section 2.4 we investigate the asymptotics for n → ∞ in the case when M has the degenerate distribution P{M = n} = 1. The ratio $s_k/n$ converges to a limit $s_k^* \in (0, 1)$, and a simple recursive formula (1) for calculating $s_k^*$ in terms of $s_1^*, s_2^*, \ldots, s_{k-1}^*$ is given there. Throughout the paper the empty sum is taken as zero, so formula (1) is valid for all k ≥ 1. We show also that, as n → ∞, the expected proportional payoff, that is, the expected maximum payoff per unit time, admits a simple closed form.

The classical best-choice secretary problem (BCSP) is concerned with maximizing the probability of choosing the best object. Samuels (1991) and Ferguson et al. (1992) pointed out that, for the one-choice problem, the duration problem with a known number n of objects is equivalent to the BCSP with an unknown number of objects having a uniform distribution on {1, 2, \ldots, n}. The latter problem was first studied by Presman and Sonin (1972). See also Petruccelli (1983) and Lehtinen (1993) for the problem with an unknown number of objects; Gilbert and Mosteller (1966), Sakaguchi (1978) and Preater (1994) for the multiple-choice problem; and Tamaki (1979b) for the formulation of the multiple-choice problem with an unknown number of objects and the solution of the two-choice problem with a uniform prior on the actual number of objects.
We show in Section 2.3 that this equivalence still holds for the multiple-choice problem.
Recently Gnedin (2005) established an equivalence between various best-choice problems and the related duration problems in greater generality (see also Gnedin (2004)). Ferguson et al. (1992) considered another type of problem, the full-information duration problem, where the observations are the actual values of the objects, assumed to be independent and identically distributed (iid) from a known distribution, and hence decisions are based on the actual values of the objects. They showed that the above equivalence between the best-choice problem and the duration problem holds also for the full-information problem. Porosiński (1987, 2002) considered the full-information best-choice problem with a random number of objects, and Mazalov and Tamaki (2006) and Samuels (2004) the limiting maximum proportional payoff for the full-information one-choice duration problem.
In Sections 3 and 4, the multiple-choice duration problem with M ∼ M s (n) is generalized by introducing costs. In Section 3 a constant acquisition cost is incurred each time an object is chosen, while in Section 4 a constant replacement cost is incurred with the selection of any candidate other than the first. The objective in Sections 3 and 4 is to maximize the expected net payoff. It can be shown that, under an appropriate cost condition, the optimal strategies have similar structure to that for the problem involving no cost. In Sections 3.2 and 4.2 we investigate the respective associated asymptotics.
The multiple-choice duration problem with replacement and acquisition costs may be considered as a marriage and divorce problem, interpreting the replacement cost as alimony.
Recently, Rapoport (1997, 2000) investigated the behaviour of decision makers under circumstances similar to those of the best-choice problem model. Discussion of this problem is also the subject of papers by Bearden (2006) and Szajowski (2006). It seems important to construct a model with physical parameters that can be fitted to empirical data; the considerations of Sections 3 and 4 suggest one way forward.

2 Multiple exchange and hold of the relatively best item
In this section we address the optimal choice problem with an unknown (bounded) number of objects. At most n objects appear in turn before us. We have only an a priori distribution $p_i = P\{M = i\}$ on the actual number of objects, where $\sum_{i=1}^{n} p_i = 1$. We set $\pi_i = \sum_{j=i}^{n} p_j$. Without loss of generality, we may assume that $p_n > 0$, so $\pi_i > 0$ for $1 \le i \le n$. We are allowed to make at most m choices and wish to maximize the expected duration of holding a relatively best object.
We assume that all that can be observed are the relative ranks of the objects as they are presented. Thus if $X_i$ denotes the relative rank of the i-th object amongst those observed so far (a candidate if $X_i = 1$), the sequentially-observed random variables are $X_1, X_2, \ldots, X_n$. It is well known that, under the assumption that the objects are in random order with all n! permutations equally likely, we have (a) the $X_i$ are independent random variables and (b) $P\{X_i = j\} = 1/i$ for $j = 1, \ldots, i$.

We formulate the m-choice duration problem as a Markovian decision-process model. First we condition on $M = \ell \ge i$. Since decisions about selection or rejection occur only when a candidate appears, we describe the state of the decision process as $(i, k)$, $1 \le i \le \ell$, $1 \le k \le m$, if the i-th object is a candidate and there remain k more choices to be made. For the above process to be a Markov chain, we must further introduce an additional absorbing state $(\ell + 1, k, e)$ for the situation where the last object is presented at time $\ell$ and is not a candidate, with k choices left ($1 \le k \le m$). When it leaves $(i, k)$, the process moves to a state $(j, k - 1)$ if the i-th object is selected; otherwise it moves to a state $(j, k)$ or $(\ell + 1, k, e)$. By (a) and (b), the conditional distribution of j is given by
$$p(i, j) = \frac{i}{j(j-1)}, \qquad i < j \le \ell.$$
We now remove the conditioning on M. The probability of a next candidate appearing at time j ($i < j \le n$) is then given by
$$\frac{\pi_j}{\pi_i}\,\frac{i}{j(j-1)}, \qquad i < j \le n. \qquad (3)$$
The corresponding probabilities of transition from $(i, k)$ to a state $(j, k)$, $(j, k-1)$ or $(j, k, e)$ follow. In accordance with our convention about empty sums, we shall interpret $\pi_{n+1}$ as zero.
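The transition law $i/(j(j-1))$ can be verified by exhaustive enumeration for small n. The sketch below (Python; the choice n = 6 is an arbitrary small example) computes the exact probability that the first candidate after time i arrives at time j, uniformly over all n! arrival orders.

```python
from fractions import Fraction
from itertools import permutations

def next_candidate_prob(n, i, j):
    """Exact P(first candidate after time i arrives at time j), uniform
    over all n! arrival orders.  A candidate at time t is an object
    better (here: with a smaller label) than all of its predecessors."""
    hits = total = 0
    for perm in permutations(range(n)):
        total += 1
        best = min(perm[:i])            # best among the first i objects
        t = None
        for k in range(i, n):           # scan times i+1, ..., n
            if perm[k] < best:          # perm[k] beats everything before it
                t = k + 1               # first candidate after time i
                break
        if t == j:
            hits += 1
    return Fraction(hits, total)
```

For n = 6 and i = 2 this returns 1/3 for j = 3 and 1/6 for j = 4, in agreement with $i/(j(j-1))$.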
Let f : {1, 2, \ldots, n + 1} → ℜ be the payoff function. Define the expectation operator with respect to the probability distribution (3) and the expectation operator with respect to the probability distribution (4).
If the i-th object observed is a candidate, the period for which it remains a candidate has mean given by (7), where the sequence $L_j$ is given in (8). When $\pi_j = 1$ for $j = 1, 2, \ldots, n$, the sequence $L_j$ given in (8) is denoted by $H_j$ (see Section 5 for further details and Section 2.4 for the properties given by formulae (31)). The contribution to the expected occupancy time from a further candidate, if any, is given by (9).

When n ≤ m, the optimal strategy is easily seen to be to select the candidates successively as they appear, achieving total possession time n for relatively best objects. Thus we assume n > m without loss of generality. Before proceeding to investigate the optimal strategy, we introduce some notation. Suppose we start in state (i, k). We denote by $U_i^{(k)}$ and $V_i^{(k)}$ the expected total possession time when we select or reject, respectively, the i-th object and then proceed in an optimal manner. We also denote by $W_i^{(k)}$ the expected total possession time under an optimal strategy starting from state (i, k) ($1 \le i \le n$, $1 \le k \le m$). The Bellman principle of optimality yields, for $1 \le i \le n$ and $1 \le k \le m$, equations (10)-(12). Equations (10)-(12), together with the boundary conditions $W_i^{(0)} = 0$ for $1 \le i \le n+1$ and $W_{n+1}^{(k)} = 0$ for $1 \le k \le m$, can be solved recursively to yield the optimal strategy and the optimal value $W_1^{(m)}$.
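The backward recursion is straightforward to implement in the degenerate case $\pi_i = 1$. The sketch below (Python) uses two facts consistent with (7)-(9) for this case: a candidate selected at time i has expected duration $i\sum_{j=i}^{n} 1/j$, and the next candidate after time i appears at j with probability $i/(j(j-1))$. The resulting thresholds and scaled value can be compared with the limits of Section 2.4 (the parameters n = 500, m = 2 are illustrative).

```python
def solve_duration_dp(n, m):
    """Backward induction for the m-choice duration problem with M = n.
    Returns (W_1^{(m)}, [s_1, ..., s_m]), where s_k is the smallest time i
    at which selecting a candidate is optimal with k choices left."""
    # H[i] = sum_{j=i}^{n} 1/j; a candidate at time i has mean duration i*H[i]
    H = [0.0] * (n + 2)
    for i in range(n, 0, -1):
        H[i] = H[i + 1] + 1.0 / i
    W = [[0.0] * (n + 2) for _ in range(m + 1)]   # W[k][i], W[0][.] = 0
    # A[k] = sum_{j > i} W[k][j] / (j*(j-1)), maintained as i decreases
    A = [0.0] * (m + 1)
    s = [None] * (m + 1)
    for i in range(n, 0, -1):
        for k in range(1, m + 1):
            U = i * H[i] + i * A[k - 1]   # select the candidate at time i
            V = i * A[k]                  # reject and continue optimally
            W[k][i] = max(U, V)
            if U >= V:
                s[k] = i                  # smallest such i survives the loop
        if i > 1:
            for k in range(1, m + 1):
                A[k] += W[k][i] / (i * (i - 1))
    return W[m][1], s[1:]

value, thresholds = solve_duration_dp(500, 2)
```

For n = 500 and m = 2 the thresholds fall near 0.135n and 0.080n, and the scaled value near 0.47, consistent with the asymptotics of Section 2.4.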

2.1 The main auxiliary theorem
In the sequel, in the construction of optimal solutions, we analyse the properties of sequences of differences between certain payoffs and expected payoffs. Important properties of such sequences are gathered in the following theorem.
Theorem 2.1 Suppose that $s_1$ and N are integers with $1 \le s_1 \le N$. Suppose further that the sequences $G_i^{(k)}$ satisfy conditions (C1)-(C3) for k = 1 and are defined recursively for $1 < k \le m$; then properties (P1)$_k$-(P5)$_k$ hold.

Proof. We shall employ induction on k. From (C1)-(C3) we have (P1)$_1$, (P3)$_1$ and (P4)$_1$. By (P1)$_1$ and (P4)$_1$ the summand in (15) is nonnegative, so that (15) yields (P2)$_1$. From this and (P4)$_1$ we deduce (P5)$_1$. Thus we have a basis for the induction.

2.2 The basic theorem
We complement relations (10)-(12) with $V_i^{(k)}$, the expected total possession time if a candidate at time i is rejected and the next candidate (if any) accepted, with optimal choices following such acceptance.
Theorem 2.2 Let $G_i^{(1)}$ be given by (17) below with k = 1. Suppose that, for N = n and some integer $s_1$, conditions (C1)-(C3) of Theorem 2.1 are satisfied. Then, for the m-choice duration problem with k ($1 \le k \le m$) choices still to be made, the optimal strategy immediately selects the first candidate, if any, to appear at or after the time $s_k$ of Theorem 2.1.
Proof. Since conditions (C1)-(C3) of Theorem 2.1 are satisfied, the conclusions of Theorem 2.1 hold. We proceed inductively, establishing the following: (Q1)$_k$, the optimal strategy when there are k choices still to be made is to select the first candidate to appear at or after time $s_k$. For a basis, consider the one-choice duration problem. By definition (Q2)$_1$ is given. We should select a candidate observed at time i in preference to the next candidate (if any) if (18) holds. By (P4)$_1$ of Theorem 2.1, this condition cannot be satisfied if $i < s_1$. Also, by (P1)$_1$ and (P4)$_1$, the first candidate at or after time $s_1$ satisfies (18), and choice of this candidate is strictly preferable to choice of any subsequent candidate. Thus (Q1)$_1$ holds and k = 1 provides a basis for induction.
For the inductive step, suppose (Q1)$_\ell$ and (Q2)$_\ell$ to be true for $\ell = 1, \ldots, k-1$ for some k with $2 \le k \le m$. Then (19) and (20) hold, and by the Bellman principle of optimality and (9) we have (21).
By (19), subtraction of (21) from (11) yields (20). Thus the inductive assumption applies, and the recursive definition of the functions $G_i^{(k)}$ establishes (Q2)$_k$. The argument leading from (Q2)$_k$ to (Q1)$_k$ follows that leading from (Q2)$_1$ to (Q1)$_1$, and the inductive step is complete.

2.3 Applications of the basic theorem
We can apply Theorem 2.2 whenever we can verify conditions (C1)-(C3). To this end, we note that (22) holds by (7) and (9) for $1 \le i \le n$, whence we derive (23) for $1 \le i < n$.
Proposition 2.3 For $1 \le i \le n$ put $\psi_i = (i + 1)\pi_i$. Then a sufficient condition for (C1)-(C3) to hold is that there should exist an integer $i_0$ with $1 \le i_0 \le n$ such that (24) holds.

Proof. Since $L_n < \phi_n$ and $G_n^{(1)} = \pi_n/n > 0$, the conclusion follows readily from (23).

As corollaries we consider the two special choices M ∼ M_s(n) (when M = n with probability one) and M ∼ M_u(n) (when M is uniformly distributed on {1, 2, \ldots, n}).
Corollary 2.1 For 1 ≤ m < n, the optimal strategy in the m-choice duration problem with M ∼ M s (n) is given by the conclusion of Theorem 2.2.
Proof. Here $\pi_i = 1$ for $1 \le i \le n$, so (24) holds with $i_0 = n$. The stated result follows from Theorem 2.2 and Proposition 2.3.

Corollary 2.2 For $1 \le m < n$ and M ∼ M_u(n), the optimal strategy in the m-choice duration problem is given by the conclusion of Theorem 2.2.
The $V_i^{(m-1)}$ ($m \ge 2$) may be calculated recursively from (25) and (26), with the natural interpretation for the initial term. In the three subsequent sections on asymptotics it is convenient to scale mean durations by dividing by n, so as to work in terms of the average possession times per unit time; we shall then write $U_i^{(k)}/n$, etc. This leaves optimal strategies unaffected. As a prelude to this, we consider one further application.
We compare two differently formulated multiple optimal stopping problems. The first is the m-choice problem of Corollary 2.1 and the second the m-choice best-secretary problem with an unknown number of objects having distribution M_u(n). We show that these problems have the same solution in the sense that the optimal strategies and expected payoffs are the same. In the latter problem we win if the last chosen object is best overall, the objective being to maximize the winning probability. Tamaki (1979b) formulated this problem as a Markovian decision-process model and solved explicitly the two-choice problem with a uniform prior on M. The state of the process is described as (i, k) ($1 \le i \le n$, $1 \le k \le m$) if the i-th object is a candidate and there remain k choices to be made. We denote by $u_i^{(k)}$ (respectively $v_i^{(k)}$) the winning probability when we select (reject) the i-th object and then continue optimally from state (i, k). The principle of optimality yields (27)-(29). When M ∼ M_u(n), we have $\pi_i = (n - i + 1)/n$ for $i = 1, \ldots, n$, and with a suitable scaling (27)-(29) are transformed respectively into (30). These are (10)-(12) for the scaled version of the process of Corollary 2.1, with the correct normalized value for $U_i^{(k)}$. Because of the common multiplicative factors in (30), the comparisons determining the optimal action coincide in the two problems, so optimal choices are the same in the two processes. Since the optimal values agree under the scaling, the two also share their optimal payoff value. Thus we have established the following result.
Theorem 2.4 The optimal-choice strategy and expected payoff are the same for the m-choice versions of (i) the best-choice secretary problem with an unknown number of objects distributed uniformly on {1, 2, \ldots, n}; (ii) the duration problem with possession times of relatively best objects scaled by division by n.
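This equivalence can be probed numerically for m = 1: with the cutoff $n e^{-2}$ (the asymptotically optimal threshold of Section 2.4), the winning probability in the best-choice problem with N uniform on {1, \ldots, n} should approach $2e^{-2} \approx 0.2707$, the scaled value of the duration problem. A Monte Carlo sketch in Python (the parameters n = 300 and 40000 trials are illustrative):

```python
import math
import random

def bcsp_uniform_win_prob(n, s, trials, seed=1):
    """Best-choice problem with N ~ Uniform{1,...,n}: accept the first
    candidate arriving at time >= s; win iff it is the best of all N."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        N = rng.randint(1, n)
        ranks = list(range(N))
        rng.shuffle(ranks)               # rank 0 is the overall best
        best = N
        for i, r in enumerate(ranks, start=1):
            if r < best:                 # object i is a candidate
                best = r
                if i >= s:
                    wins += (r == 0)     # accepted; win iff overall best
                    break
    return wins / trials

win = bcsp_uniform_win_prob(300, round(300 * math.exp(-2)), 40000)
```

The estimate `win` should lie near 0.27, matching the duration-problem value of the same threshold rule.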

2.4 Asymptotics for the basic problem with the degenerate distribution of objects
It is of interest to investigate the asymptotic behaviour of $s_k/n$ ($1 \le k \le m$) and $q_m/n$ as n → ∞. To do this, we observe that the sums in the formulae of Section 2 are Riemann sums. With M ∼ M_s(n), (22) becomes, for $1 \le i \le n$, an expression in $H_\ell = \sum_{i=\ell}^{n} 1/i$ ($\ell \ge 1$), with $H_0 = 0$ (see also (8)).
For m = 1 with i/n → x as n → ∞, the Riemann sum given by $G_i^{(1)}$ converges to the integral
$$G^{(1)}(x) = \int_x^1 \frac{x}{t}\,dt - \int_x^1 \frac{x}{t^2}(-t\ln t)\,dt = -x\ln x - \frac{x}{2}(\ln x)^2.$$
From (14), $s_1^* := \lim_{n\to\infty} s_1/n$ is obtained as the unique root $x \in (0, 1)$ of $G^{(1)}(x) = 0$, namely $s_1^* = e^{-2}$.
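A quick numerical check (a sketch in Python): assuming that the threshold rule with cutoff s has limiting proportional payoff $\phi(s) = \frac{1}{2}s(\ln s)^2$, which is consistent with the value $2e^{-2} \approx 0.2707$ quoted in Section 3.2, its maximizer recovers $s_1^* = e^{-2}$, and $-x\ln x - \phi(x)$ vanishes there.

```python
import math

def phi(s):
    """Limiting proportional payoff of the one-choice threshold rule
    that accepts the first candidate after time s (0 < s < 1)."""
    return 0.5 * s * math.log(s) ** 2

# grid search for the maximizer; it should sit at e^{-2} = 0.1353...
grid = [i / 200000 for i in range(1, 200000)]
s_star = max(grid, key=phi)
q_star = phi(s_star)
```

Here `s_star` approximates $e^{-2}$ and `q_star` approximates $2e^{-2}$.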
More generally, (13) leads to functions $G^{(k)}(x)$ (0 < x < 1) defined recursively, with $G_i^{(k)}$ a Riemann approximation to $G^{(k)}(x)$ if i/n → x as n → ∞.
Correspondingly, $s_k^* := \lim_{n\to\infty} s_k/n$ exists and may be obtained, for $k \ge 2$, as the unique root $x \in (0, s_{k-1}^*)$ of $G^{(k)}(x) = 0$. From (34) and (35), $s_k^*$ is a root of an equation which, by (32), may be rewritten in terms of $s_1^*, \ldots, s_{k-1}^*$. To derive the tractable form (1), we need some lemmata.

Lemma 2.3 For k a positive integer, the quantities $A_{k,i}$ satisfy the recursion (38), with $A_{k,k} = 0$ ($k \ge 0$).

Proof. From (34)
The second and third terms of the last line may be rewritten accordingly, whence the desired result follows.
For simplicity, set $A_k := A_{k,0}$. Repeated use of (38) gives the following recursion for $A_k$.
Lemma 2.4 For $k \ge 1$, $A_k$ satisfies the recursion (39). Then, from (36), we have the following lemma.
Lemma 2.5 For $k \ge 1$, $N_k$ satisfies the recursion given below.

Proof. A straightforward calculation from (37), combined with (39), completes the proof.
We have the following lemma concerning the expected payoff.
Lemma 2.6 Let $q_m^* = \lim_{n\to\infty} q_m/n$ for $m \ge 1$. Then
$$q_m^* = -\sum_{k=1}^{m} s_k^* \ln s_k^*. \qquad (42)$$

Proof. For m = 1, we have from (25) that $q_1^* = -s_1^*\ln s_1^*$; by (33), this may also be written $q_1^* = \frac{1}{2}s_1^*(\ln s_1^*)^2 = 2e^{-2}$, which is (43). For $m \ge 2$, we have a corresponding expression from (25) and (26) if the functions $V^{(m-1)}(x)$ ($0 < x < 1$, $m \ge 2$) are defined recursively as the limits of the scaled $V_i^{(m-1)}$. On the other hand, from (11), (12) and (21), on letting i/n → x as n → ∞ in the case k = m, we derive (44). Then $G^{(m)}(s_m^*) = 0$ implies (45) or, equivalently from (32), a relation in $s_m^*$. Application of (45) to (44) provides $q_m^* = -s_m^*\ln s_m^* + q_{m-1}^*$, which upon repetition and use of (43) provides (42).
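A Monte Carlo check for m = 2 (a sketch under the limiting model: given a candidate at time t, the next candidate arrives at T with P(T > u) = t/u, so T = t/V for V ~ U(0,1], with no further candidate when T > 1; the thresholds $s_1^* = e^{-2}$ and $s_2^* = \exp\{-(1+\sqrt{7/3})\}$ are taken from Remark 1 below, and the recursion $q_m^* = -s_m^*\ln s_m^* + q_{m-1}^*$ predicts a value near 0.4724):

```python
import math
import random

def two_choice_payoff(s2, s1, trials, seed=7):
    """Mean total duration of the double-threshold rule: take the first
    candidate at or after s2, then the first later candidate at or
    after s1 (s2 <= s1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        t = s2 / (1.0 - rng.random())        # first candidate after s2
        if t > 1.0:
            continue                         # no candidate at all: payoff 0
        u = t / (1.0 - rng.random())         # candidature of first pick ends
        total += min(u, 1.0) - t
        while u <= 1.0 and u < s1:           # skip candidates arriving below s1
            u = u / (1.0 - rng.random())
        if u <= 1.0:                         # second pick at time u
            total += min(u / (1.0 - rng.random()), 1.0) - u
    return total / trials

s1 = math.exp(-2)
s2 = math.exp(-(1 + math.sqrt(7 / 3)))
est = two_choice_payoff(s2, s1, 200000)
```

The estimate `est` should lie near $-s_1^*\ln s_1^* - s_2^*\ln s_2^* \approx 0.4724$.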

Remark 1
The above approach is rather intuitive. To make the argument more rigorous, we can approximate the difference equations by differential equations. This method was suggested by Dynkin and Yushkevich (1969) and has since been applied successfully by Szajowski (1982), Suchwałko and Szajowski, and Yasuda (1983). Mucci (1973a,b) developed the idea for a wider class of optimal stopping problems.
By subtraction we obtain a difference equation. In the development of the differential-equation approach cited above, it can be shown that, asymptotically, $f^{(k)}$ satisfies a differential equation with boundary condition $f^{(k)}(1) = 0$, and the sequence of critical numbers is nonincreasing. For example, routine algebra yields for k = 1 and 2 that $s_1^* = \exp(-2)$ and $s_2^* = \exp\{-(1 + \sqrt{7/3})\}$. For $k \ge 3$, we can proceed in a similar way.

3 The multiple-choice duration problem with acquisition costs
In this section, the multiple-choice duration problem is generalized by imposing a constant acquisition cost c = c(n) > 0 each time an object is chosen. The objective of this problem is to maximize the expected net payoff, that is, total possession time less the total acquisition cost incurred.

3.1 The degenerate distribution of the number of objects
For simplicity we restrict attention to the case P {M = n} = 1, so that π i = 1 for 1 ≤ i ≤ n. To avoid triviality we assume n > 1.
Consider first the one-choice problem. The expected net payoff resulting from choosing a candidate presented at time i is $U_i^{(1)}$, which by (7) is given by (46). The difference $U_{i+1}^{(1)} - U_i^{(1)}$ is strictly decreasing in i and is negative for i = n − 1.
Put $K(n) := \min\{i : U_{i+1}^{(1)} - U_i^{(1)} \le 0\}$. Then $U_i^{(1)}$ is unimodal in i, with maximum at K(n). If $U_{K(n)}^{(1)} \le 0$, it is optimal never to choose a candidate, so without loss of generality we may assume $U_{K(n)}^{(1)} > 0$; this is the cost condition (48). Further, there exist integers a = a(n, c), b = b(n, c) such that $U_i^{(1)} \ge 0$ if and only if $a \le i \le b$, and to maximize expected net payoff we never choose a candidate presented at a time i for which $a \le i \le b$ fails. Clearly this holds also in the m-choice problem.
We define $V_i^{(1)}$ as the expected net payoff when we reject a candidate appearing at time $i \le b$ but select the next candidate (if any) appearing no later than time b; we then have (49). We now turn attention to the m-choice problem. For $i \le b$ we employ the notation $U_i^{(k)}$, $V_i^{(k)}$ and $W_i^{(k)}$ analogously to Section 2, now referring to expected net payoff rather than expected total time of possession of candidates, and with choice of second and subsequent candidates occurring no later than time b. If m > b − a it is clearly optimal simply to choose every candidate appearing in I = [a, b], so we suppose m ≤ b − a. The following theorem summarizes the optimal strategy for the m-choice problem with acquisition cost.
Theorem 3.1 For the m-choice duration problem with acquisition cost c subject to (48), there exists a sequence $(s_1(c), s_2(c), \ldots, s_m(c))$ of integral critical numbers such that, whenever there remain k choices to be made, the optimal strategy selects the first candidate to appear at or after time $s_k(c)$ but no later than b. Moreover, $s_k(c)$ is nonincreasing in k and determined by Theorem 2.1 with N = b. Finally, $s_m(c) \ge a$.
Proof. There is nonnegative expected payoff from a candidate selected at time i with a ≤ i ≤ b, but not for one selected after time b, so it suffices to establish the result for candidates arriving at times i ≤ b.
From (46) and (49), we can verify that (50) is equivalent to a condition whose right-hand side is strictly decreasing in i for i < K(n) and nonpositive for $K(n) \le i < b$. It follows that (C1)-(C3) of Theorem 2.1 are satisfied provided that $G_b^{(1)} > 0$; to see that this requirement is met, observe that $U_b^{(1)} \ge 0$. Thus the conditions of Theorem 2.1 are met. Establishing the theorem now follows closely the rest of the proof of Theorem 2.2, operating on the interval [1, b] instead of [1, n]. Since a candidate arriving before time a is never accepted, we have finally $s_m(c) \ge a$.
The case c = 0 corresponds to the duration problem treated in Corollary 2.1; thus a(n, 0) = 1 and b(n, 0) = n. In the next subsection we shall need to compare quantities occurring in that context and the present one. Accordingly, where necessary for clarity, we shall write $G_i^{(k)}(c)$, etc.

Proof. We have for $i \le b$ an identity from which (51) follows.

3.2 Asymptotics for the duration problem with acquisition costs
Observe first that, from (47), $\lim_{n\to\infty} K(n)/n = e^{-1}$, so the cost condition (48) reduces, as n → ∞, to $c = \lim_{n\to\infty} c(n)/n \le e^{-1}$ (52).
After division by n, we may let i/n → x as n → ∞ in (46) to show that $U^{(1)}(x) = -x\ln x - c$. Let $\beta = \lim_{n\to\infty} b(n, c)/n$. Then, under the cost condition (52), β is the unique root $x \in [e^{-1}, 1)$ of $U^{(1)}(x) = 0$ and satisfies $-\beta\ln\beta = c$.
Lemma 3.2 Under (52), $s_k^*$ satisfies the recursion (53). Let $\alpha = 1 + \ln\beta$. Then from Lemma 3.2 we can calculate the $s_k^*$ successively. For $m \ge 1$, let $q_m^*$ be the scaled expected net payoff for the m-choice duration problem as n tends to infinity. Then we have the following result.
Proof. Similar to that of Lemma 2.6.

Table 1 The asymptotic critical number $s_m^*$ for some values of m and c.

Table 2 The asymptotic expected net payoff for some values of m and c.

Table 1 presents numerical values of β and $s_m^*$ for some values of m and c. Let β′ be the unique root $x \in (0, e^{-1}]$ of $-x\ln x = c$. It is intuitively clear that, as m → ∞, $s_m^*$ converges to β′ (= $s_\infty^*$), because there is no benefit in choosing a candidate prior to β′. Table 2 presents numerical values of $q_m^*$ for some values of m and c. It is interesting to compare, for example, $q_1^* = 0.0335$ for c = 0.3 with $q_1^* = 0.2707$ for c = 0: we can still gain positive expected payoff even when the acquisition cost is larger than the mean maximum payoff attainable when the acquisition cost is zero. This is not a contradiction; the stopping region shrinks as c grows (see Table 1), and positive mean payoff is assured by restricting our choice to a really good object. Table 2 suggests also that, as m → ∞, $q_m^*$ converges to a value $q_\infty^*$. This is given in the following lemma.
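The roots β and β′ are easy to compute numerically. A sketch (Python; c = 0.3 matches the example just discussed, and the bisection brackets follow from the unimodality of $-x\ln x$, which peaks at $x = e^{-1}$ with value $e^{-1}$):

```python
import math

def bisect(f, lo, hi, tol=1e-12):
    """Bisection for a root of f on [lo, hi], assuming a sign change."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == (flo > 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c = 0.3                                   # acquisition cost, c <= 1/e
g = lambda x: -x * math.log(x) - c        # the scaled net payoff U^{(1)}(x)
beta = bisect(g, math.exp(-1), 1.0 - 1e-9)      # root in [1/e, 1)
beta_prime = bisect(g, 1e-9, math.exp(-1))      # root in (0, 1/e]
```

For c = 0.3 this gives β ≈ 0.613 and β′ ≈ 0.168, so the stopping region [β′, β] indeed contains only fairly late (and hence fairly good) candidates.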
Lemma 3.4 The limit $q_\infty^*$ exists and is given by (54).

Proof. As the arrival times of the n objects, we consider the time epochs 1/n, 2/n, \ldots, n/n instead of 1, 2, \ldots, n. When n → ∞, the transition probability p(i/n, j/n) = i/(j(j−1)) then converges to the transition density p(x, y) = x/y^2 as i/n → x, j/n → y (see (2)), and the candidates appear according to a non-homogeneous Poisson process with intensity function λ(x) = 1/x, by (a) and (b) of Section 2. That is, if we let N(a, b) denote the number of candidates that appear in the time interval (a, b), then N(a, b) is a Poisson random variable with parameter ln(b/a) (see Theorem 1 of Gilbert and Mosteller (1966)).
Let T(x) denote the time of the first candidate after time x if there is one, and 1 if there is not. From the above, T(x) has density $f_{T(x)}(t) = p(x, t) = x/t^2$ on the time interval (x, 1) and probability mass x at 1. As the number of choices m → ∞, the optimal strategy chooses all the candidates that appear in the time interval (β′, β). Thus the total proportional duration D is expressed in terms of T(β′) and T(β). It is readily verified that T(β) and T(β′) are independent. Hence, by conditioning on T(β′), the expected net payoff $q_\infty^*$ may be evaluated, which yields (54).
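The Poisson approximation for record counts is easy to check by simulation (a sketch; n = 200, the interval (0.2n, 0.8n] and 20000 trials are arbitrary illustrative choices, and the target mean is ln(0.8/0.2) = ln 4 ≈ 1.386):

```python
import math
import random

def mean_record_count(n, lo, hi, trials, seed=3):
    """Empirical mean number of candidates (records) of a random
    permutation whose arrival index lies in (lo*n, hi*n]."""
    rng = random.Random(seed)
    a, b = int(lo * n), int(hi * n)
    total = 0
    for _ in range(trials):
        ranks = list(range(n))
        rng.shuffle(ranks)
        best = n
        for i, r in enumerate(ranks, start=1):
            if r < best:                 # object i is a record
                best = r
                if a < i <= b:
                    total += 1
    return total / trials

mean = mean_record_count(200, 0.2, 0.8, 20000)
```

The empirical mean should be close to ln(0.8/0.2), in line with N(a, b) being approximately Poisson with parameter ln(b/a).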

4 Duration problem with replacement costs
In this section a constant cost d = d(n) > 0 is incurred each time there is replacement, whether or not the new candidate is the one to end the candidature of the previously chosen candidate. For simplicity we consider only the case where M ∼ M s (n) and ignore acquisition costs. The objective is to maximize the expected net payoff, that is, the total time of possession of a relatively best object less any replacement costs incurred. The multiple-choice duration problem with a replacement cost may be considered as a marriage and divorce problem, interpreting the replacement cost as alimony.

4.1 The degenerate distribution of the number of objects
We treat the m-choice duration problem with replacement cost d > 0. In the m-choice problem we are allowed to replace objects up to m − 1 times, m ≥ 2. We define the state of the process as in Section 3, with $W_i^{(k)}$ etc. defined accordingly. Consider a candidate other than the first arriving at time i. As in Section 3, we may argue that such a candidate is never chosen unless (55) holds. Once the first choice is made, the problem reduces to the (m − 1)-choice problem with an acquisition cost d. Thus the main concern is to determine when to make the first choice. The optimal strategy can be summarized as follows.
Theorem 4.1 For the m-choice duration problem with replacement cost d subject to condition (55), there exists a sequence $(s_1(d), s_2(d), \ldots, s_{m-1}(d), t_m(d))$ of integral critical numbers such that the optimal strategy first selects the first candidate (if any) to appear at or after time $t_m(d)$. Thereafter it replaces the previously chosen object with the first new candidate (if any) that appears at or after time $s_k(d)$, but no later than b(n, d), when k more replacements are available ($1 \le k \le m-1$), where $b(n, d) = \max\{i : U_i^{(1)} \ge 0\}$.

Proof. The part of the result relating to choices when fewer than m replacements remain is immediate from Theorem 3.1, so it remains to address the first choice of a candidate.
The first-choice threshold is determined as before. For $m \ge 2$, let $r_m^*$ be the scaled expected net payoff for the m-choice duration problem as n tends to infinity. Then we have the following result, whose proof is omitted.

Table 3 The asymptotic critical number $t_m^*$ for some values of m and d.