Two-player Knock ’em Down

We analyze the two-player game of Knock ’em Down, asymptotically as the number of tokens to be knocked down becomes large. Optimal play requires mixed strategies with deviations of order $\sqrt{n}$ from the naïve law-of-large-numbers allocation. Upon rescaling by $\sqrt{n}$ and sending $n \to \infty$, we show that optimal play’s random deviations always have bounded support and have marginal distributions that are absolutely continuous with respect to Lebesgue measure.


Knock 'em Down
In the game of Knock 'em Down, a player is given $n$ tokens which (s)he arranges into $k$ piles, or bins. After that, a $k$-sided die is thrown; if the outcome is side $i$ (which occurs with probability $p_i > 0$), then a token is knocked down from the $i$th pile. In the event that there are no tokens in the $i$th bin, then no tokens get knocked down. The die is thrown repeatedly until all the tokens have been knocked down. [In the original version of this game as described in [Hun98, BF99], $n = 12$ and two fair six-sided dice are thrown, with the bin chosen being given by (1 less, let us say, than) the sum of the two numbers showing. In that case, $k = 11$ and $p_i \equiv (6 - |i - 6|)/36$. The reader may wish to note that we have altered the spelling from "Knock 'm Down" to "Knock 'em Down".]

We consider two versions of Knock 'em Down. In solitaire Knock 'em Down, which we analyze in a separate article, there is one player, and his goal is to minimize the expected number of iterations until all the tokens have been knocked down. Solitaire Knock 'em Down is also equivalent to a two-player zero-sum game, where the payoff to the winner is the expected number of extra die throws that the other player requires to knock down all his tokens, and the goal is to maximize the expected payoff. In competitive Knock 'em Down, which we analyze in this article, it is enough merely to win, and the amount by which a player wins is irrelevant. There are two players, who each arrange their tokens into bins without seeing what the other player is doing, and then the same die is used to knock down tokens from each of the players' bins. The winner is the player whose tokens get all knocked down first (the outcome may be a tie, which we resolve with a coin flip). With competitive Knock 'em Down, as we shall see, there is an interesting Nash equilibrium.
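As a small check of the bin probabilities quoted for the original two-dice version, the following sketch enumerates the 36 outcomes and compares them with the closed form $(6 - |i - 6|)/36$. The function name `two_dice_bin_probabilities` is illustrative only and does not come from the paper.

```python
from collections import Counter
from fractions import Fraction


def two_dice_bin_probabilities():
    """Return {i: p_i} for bins i = 1..11, where bin i means 'dice sum equals i + 1'."""
    # Enumerate all 36 equally likely ordered outcomes of two fair six-sided dice.
    counts = Counter(a + b - 1 for a in range(1, 7) for b in range(1, 7))
    return {i: Fraction(c, 36) for i, c in sorted(counts.items())}


# Closed form stated in the text: p_i = (6 - |i - 6|) / 36 for i = 1..11.
closed_form = {i: Fraction(6 - abs(i - 6), 36) for i in range(1, 12)}
probs = two_dice_bin_probabilities()
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison with the closed form does not depend on floating-point rounding.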
Competitive Knock 'em Down is quite easy to analyze when $k = 2$. The result is Theorem 1 in [BF00]; the authors show that the best strategy is to use the allocation $(m, n - m)$, where $m$ is a median of the $\mathrm{Binomial}(n, p_1)$ distribution. It is instructive to consider competitive Knock 'em Down in the next simplest case, in which a fair three-sided die is used. The first player may guess that the best strategy is to place $n/3$ tokens into each of the three bins (assume for convenience that $n$ is divisible by 3). But if the first player uses that strategy, then the second player can "undercut" by placing $(n/3) - 1$ tokens into each of the first two bins and $(n/3) + 2$ tokens into the third bin. Then, for large $n$, the probability that the second player's third bin empties out last is only slightly larger than $1/3$, while the probability that the first player's first or second bin finishes last is only slightly smaller than $2/3$, so that the second player wins about two-thirds of the time. It turns out that an optimal strategy is to allocate approximately $n/3$ tokens to each bin, but with certain random perturbations to the bin allocations. (See Figure 1.) In game-theoretic terminology, optimal play employs a mixed (non-pure) strategy.
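The undercutting argument for the fair three-sided die can be explored numerically. The following is a minimal Monte Carlo sketch of the discrete game as described above (same die for both players, ties broken by a fair coin). The function names `play` and `undercut_win_rate` are illustrative, not taken from the paper. For moderate $n$ the estimated win rate of the undercutting player should sit clearly above $1/2$, and the text's argument says it approaches $2/3$ as $n$ grows.

```python
import itertools
import random


def play(xi, eta, p, rng):
    """Play one game with a shared die; return 1 if allocation xi wins, 2 if eta wins."""
    a, b = list(xi), list(eta)
    left_a, left_b = sum(a), sum(b)
    sides = range(len(p))
    cum = list(itertools.accumulate(p))  # cumulative weights of the die
    while left_a > 0 and left_b > 0:
        i = rng.choices(sides, cum_weights=cum)[0]  # one throw, used by both players
        if a[i] > 0:
            a[i] -= 1
            left_a -= 1
        if b[i] > 0:
            b[i] -= 1
            left_b -= 1
    if left_a == 0 and left_b == 0:
        return rng.choice((1, 2))  # simultaneous finish: fair coin flip
    return 1 if left_a == 0 else 2


def undercut_win_rate(n, trials, seed=0):
    """Estimate how often (n/3 - 1, n/3 - 1, n/3 + 2) beats (n/3, n/3, n/3)."""
    rng = random.Random(seed)
    p = (1 / 3, 1 / 3, 1 / 3)
    third = n // 3
    naive = (third, third, third)
    undercut = (third - 1, third - 1, third + 2)
    wins = sum(play(naive, undercut, p, rng) == 2 for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print(undercut_win_rate(n=300, trials=1500, seed=1))
```

The same `play` function can be used to pit any two integer allocations against each other, which is convenient for experimenting with randomized perturbations of the naive allocation.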
To simplify the analyses of both games for general $k$ and $p := (p_1, \dots, p_k)$, we suppose that the games are run in continuous time, with the die being thrown at instants governed by a Poisson point process with rate 1. Throwing the die at random times rather than at deterministic times has absolutely no impact on the outcome of a game of competitive Knock 'em Down; likewise, in the case of solitaire Knock 'em Down the expected (total) clearance time (i.e., time to knock down all tokens) remains unchanged. Suppose in either game that a player places $\xi_i$ tokens into bin $i$, and let $T_i$ denote the time that it takes for bin $i$ to be cleared. Since the game is run in continuous time, the $T_i$'s are mutually independent, and the variable $p_i T_i$ is the sum of $\xi_i$ independent exponential random variables with unit mean, i.e., $p_i T_i$ is a Gamma random variable with shape parameter $\xi_i$. The clearance time is $T = \max_i T_i$.

Figure 1: An optimal strategy for (an approximation to) the continuous limiting version of two-player Knock 'em Down described in Section 1.3 when $k = 3$ and $p = (1/3, 1/3, 1/3)$. The strategy chooses a hexagon with probability proportional to its darkness, and then allocates the chips according to the hexagon's coordinates. Hexagons near the lower-left, lower-right, or top of the triangle allocate more chips to the first, second, or third piles respectively. The top corner has coordinates $(-0.28, -0.28, +0.56)/\sqrt{3}$.
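The Gamma description of the bin clearance times lends itself to direct sampling. The sketch below draws the clearance time $T = \max_i T_i$ using the standard-library call `random.gammavariate(alpha, beta)`, whose mean is `alpha * beta`. The function name `clearance_time` is illustrative and not from the paper.

```python
import random


def clearance_time(xi, p, rng):
    """Sample T = max_i T_i, where p_i * T_i is Gamma with shape xi_i and unit scale."""
    times = []
    for tokens, prob in zip(xi, p):
        if tokens == 0:
            times.append(0.0)  # an empty bin is cleared immediately
        else:
            # Scale 1/prob gives T_i mean tokens/prob and variance tokens/prob**2.
            times.append(rng.gammavariate(tokens, 1.0 / prob))
    return max(times)


def mean_clearance_time(xi, p, trials, seed=0):
    """Monte Carlo estimate of the expected clearance time for allocation xi."""
    rng = random.Random(seed)
    return sum(clearance_time(xi, p, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print(mean_clearance_time((30, 30, 30), (1 / 3, 1 / 3, 1 / 3), trials=4000, seed=3))
```

Because the bins are independent in continuous time, each $T_i$ is sampled separately; no die sequence needs to be simulated.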
If $\xi_i \equiv n q_i$, where the $q_i$'s are nonnegative and sum to unity, then the clearance time $T$ satisfies
$$T = n \max_i (q_i/p_i) + O_p(\sqrt{n}).$$
(We have used here the usual notation $X_n = O_p(y_n)$ to mean that $\inf_n \Pr[|X_n| \le c\,y_n]$ tends to 1 as $c \to \infty$; equivalently, one says that the family of distributions $\mathcal{L}(X_n/y_n)$ is tight.) Since $\max_i (q_i/p_i) \ge 1$, this suggests that the optimal allocation is $\xi_i \approx n p_i$, and indeed for the solitaire game this was first shown in [BFH01]. We show in our companion paper on solitaire Knock 'em Down that the optimal choice is $\xi_i = n p_i + O(\sqrt{n})$. For competitive Knock 'em Down, our Theorem 1.2 below shows similarly that an optimal mixed strategy will never choose $\xi_i$ differing from $n p_i$ by more than order $\sqrt{n}$.
Define the overplay of an allocation $\xi$ relative to $np$ to be
$$\max_i \frac{\xi_i - n p_i}{p_i}.$$
In the following lemma, $\{\xi \text{ beats } \eta\}$ is the event that the corresponding clearance times satisfy the strict inequality $T_\xi < T_\eta$, or else $T_\xi = T_\eta$ and $\xi$ wins the coin toss. The notation extends naturally to $\{\alpha \text{ beats } \beta\}$ for mixed strategies $\alpha$, $\beta$; see the start of Section 2 for careful analogous definitions in the continuous-game setting that we introduce in Section 1.3.

Lemma 1.1. If the overplay of allocation $\eta$ exceeds the overplay of allocation $\xi$ by at least $w\sqrt{kn}/\min_i p_i$, then $\Pr[\eta \text{ beats } \xi] < 1/w^2$.
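To make the quantities in Lemma 1.1 concrete, here is a minimal sketch that computes an overplay and the gap $w\sqrt{kn}/\min_i p_i$. It assumes the overplay of $\xi$ is $\max_i (\xi_i - n p_i)/p_i$, which is inferred from how the term is used in the proofs, since the displayed definition is not legible in the source. The names `overplay` and `lemma_gap` are hypothetical.

```python
import math
from fractions import Fraction


def overplay(xi, p):
    """Assumed definition: max_i (xi_i - n * p_i) / p_i, with n = sum(xi)."""
    n = sum(xi)
    return max((x - n * q) / q for x, q in zip(xi, p))


def lemma_gap(w, n, p):
    """The overplay gap w * sqrt(k * n) / min_i p_i appearing in Lemma 1.1."""
    return w * math.sqrt(len(p) * n) / float(min(p))


# Example with a fair three-sided die and n = 12 tokens (exact arithmetic).
fair = (Fraction(1, 3), Fraction(1, 3), Fraction(1, 3))
balanced = overplay((4, 4, 4), fair)   # the law-of-large-numbers allocation
lopsided = overplay((3, 3, 6), fair)   # two extra tokens in the third bin
```

Under this assumed definition the balanced allocation has overplay 0, and moving two tokens into one bin of a fair three-sided die raises the overplay by $2/p_3 = 6$.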
Proof. Let $j$ be the bin maximizing $\eta_j/p_j$. In order for allocation $\eta$ to win, there must be some bin $\ell \ne j$ for which $\eta$'s bin $j$ clears out before $\xi$'s bin $\ell$. The clearance time $T_0$ of $\eta$'s bin $j$ has mean $\eta_j/p_j$ and variance $\eta_j/p_j^2 \le n/p_j^2$, while the clearance time $T_\ell$ of $\xi$'s bin $\ell \ne j$ has mean $\xi_\ell/p_\ell \le (\eta_j/p_j) - w\sqrt{kn}/\min_i p_i$ and variance $\le \xi_\ell/p_\ell^2$. Since $T_0$ and $T_\ell$ are independent, Chebyshev's inequality bounds $\Pr[T_0 < T_\ell]$; and since the support of $T_\ell - T_0$ is $\mathbb{R}$, the first inequality in that bound is strict. Upon summing over $\ell \ne j$, we find that the probability that there is some bin $\ell \ne j$ for which $T_0 < T_\ell$ is less than $1/w^2$.
Proof. Let α be any optimal strategy, and let a(x) be the probability that strategy α picks an allocation with an overplay of x or more. We show successively, in each case using Lemma 1.1, that Suppose first that a((w √ kn + 1)/ min i p i ) ≥ p 1 . Consider pure (i.e., deterministic) strategy β which always plays n p (rounded to an integer vector summing to n). Because of the rounding, β may overplay n p slightly, but never more than 1/ min i p i . By Lemma 1.1, strategy β beats α with probability > p 1 × p = 1/2, contradicting the optimality of α.
With positive probability (> 1−p 1 ), strategy α overplays by no more than (w √ kn + 1)/ min i p i , so we can define a strategy β which plays according to strategy α conditioned to overplay by no more than (w √ kn + 1)/ min i p i . When β plays against α, by Lemma 1.1, β wins with probability > (1 − p 1 ) × 1 2 + p 2 × p = 1/2, which is again a contradiction to the optimality of α.
Suppose finally that $a((3w\sqrt{kn} + 1)/\min_i p_i) = \delta > 0$. Let $\beta$ be the strategy which plays according to $\alpha$ conditioned to overplay by no more than $(3w\sqrt{kn} + 1)/\min_i p_i$. When $\beta$ plays against $\alpha$, by Lemma 1.1, strategy $\beta$ wins with probability greater than $1/2$, which is again a contradiction.

A continuous game
From Theorem 1.2 we see that any optimal strategy for competitive Knock 'em Down plays allocations deviating by $O_p(\sqrt{n})$ from the naïve law-of-large-numbers allocation $np$. For given $n$ and allocation $\xi$, it is thus natural to define numbers
$$x_i := \frac{\xi_i - p_i n}{\sqrt{n}}.$$
Then
$$\mathcal{L}(T_i) \sim \mathrm{Normal}\bigl(n + x_i\sqrt{n}/p_i,\; n/p_i\bigr).$$
(By $\sim$ we mean that the total variation distance between the two distributions tends to 0 as $n \to \infty$ with $p_i$ fixed and $x_i$ bounded.) The player chooses the $x_i$'s so that $\sum_{i=1}^{k} x_i = 0$ and $p_i n + x_i\sqrt{n}$ is an integer, and the player's tokens are all exhausted at time approximately
$$n + \sqrt{n}\,\max_i \frac{x_i + \sqrt{p_i}\,Z_i}{p_i},$$
where the $Z_i$'s are independent standard normal random variables.
Thus the large-$n$ asymptotics of either version of Knock 'em Down is effectively a continuous game (called solitaire/competitive continuous Knock 'em Down), where the first player chooses real numbers $x_i$ satisfying $\sum_i x_i = 0$, and his clearance time is
$$T_x := \max_i \frac{x_i + \sqrt{p_i}\,Z_i}{p_i}.$$
The second player similarly chooses numbers $y_i$ summing to 0, with clearance time $T_y$ defined using the same $Z_i$'s. The sequel provides various rigorous connections between $n$-token Knock 'em Down and our continuous game.
We present here two indications that even qualitative analysis of the continuous game is not entirely trivial. First, the naïve pure strategy x = 0 has no optimal response. Indeed, the responding player can undercut by playing (−ε, −ε, . . . , +(k − 1)ε), and letting ε ↓ 0 provides strategies which give the respondent asymptotically the optimal probability (k −1)/k of winning, but no strategy achieves this probability. Second, it is not immediately clear that our two-player continuous game has a value (in the game-theoretic sense). There are standard tools for proving that a continuous game has a value, such as results of Ky Fan [Fan53], but our payoff function is rather severely discontinuous at certain points and the tools require a semicontinuous payoff function. Continuous games without values do exist [SW57], but it turns out that our game does indeed have a value. One can prove this by a suitable comparison of our game with a "ties go to player 1" modification of the game having upper semicontinuous payoff function, but we will give a somewhat more direct proof whose basic idea is simply to pass to the limit from n-token optimal strategies.
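The claim that undercutting the pure strategy $x = 0$ wins with probability approaching $(k-1)/k$ can be checked by simulation. The sketch below assumes the continuous clearance time has the form $T_x = \max_i (x_i + \sqrt{p_i} Z_i)/p_i$, which is what the Gamma-to-normal approximation of Section 1.3 gives; the displayed formula itself is not legible in the source, so treat this as a reconstruction. The function names are illustrative.

```python
import math
import random


def clearance(x, z, p):
    """Assumed continuous clearance time: max_i (x_i + sqrt(p_i) * z_i) / p_i."""
    return max((xi + math.sqrt(pi) * zi) / pi for xi, zi, pi in zip(x, z, p))


def win_probability(y, x, p, trials, seed=0):
    """Estimate Pr[y beats x] when both players face the same normals Z_i."""
    rng = random.Random(seed)
    wins = 0.0
    for _ in range(trials):
        z = [rng.gauss(0.0, 1.0) for _ in p]
        ty, tx = clearance(y, z, p), clearance(x, z, p)
        if ty < tx:
            wins += 1.0
        elif ty == tx:
            wins += 0.5  # a tie would be settled by a fair coin
    return wins / trials


if __name__ == "__main__":
    p = (1 / 3, 1 / 3, 1 / 3)
    for eps in (0.1, 0.03, 0.01):
        y = (-eps, -eps, 2 * eps)  # undercut of the naive strategy (0, 0, 0)
        print(eps, win_probability(y, (0.0, 0.0, 0.0), p, trials=20000, seed=2))
```

As $\varepsilon$ shrinks, the estimate should creep up towards $2/3$ for $k = 3$ without reaching it, in line with the statement that no single response attains the supremum.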

Guide to later sections
With solitaire continuous Knock 'em Down, the optimal strategy is deterministic, and we are able to characterize it for general p i 's [FW06]. With competitive continuous Knock 'em Down, optimal play is random (i.e., mixed) with a rather complicated distribution (see Figure 1). Even in the simplest nontrivial instance, where k = 3 and p = (1/3, 1/3, 1/3), we are unable to calculate an optimal strategy. However, for general k ≥ 3 and p, we are able to derive some basic results about optimal play. In Section 2 we show, for example, that any optimal strategy for the continuous game has absolutely continuous marginals. In Section 3 we prove that the continuous game has an optimal strategy. In Section 4 we show that a good strategy for the continuous game can be converted to good strategies for the n-token games by rounding; in particular, an optimal continuous strategy can be converted to asymptotically optimal n-token strategies. Finally, in Section 5 we list some open problems arising from our work.

Properties of optimal play
The main result of this section (Theorem 2.5) asserts that the marginals of any optimal strategy for the continuous game described in Section 1.3 are absolutely continuous. This will follow from an estimate which holds (see Lemma 2.4) equally well for the discrete game.
We begin by establishing some terminology for the continuous game; analogous terminology will be employed for the discrete game. Recall that
$$T_x = \max_i \frac{x_i + \sqrt{p_i}\,Z_i}{p_i}$$
is the clearance time for a player using allocation $x = (x_1, \dots, x_k)$ with $x_1 + \cdots + x_k = 0$. Here $Z_1, \dots, Z_k$ are independent standard normal random variables. Recall that if player 1 (say) uses allocation $x$ and player 2 uses allocation $y$, then player 1 wins if and only if $T_x < T_y$; for short, we simply say "$x$ beats $y$", in which case player 2 pays 1 unit (utile) to player 1. It will be convenient to view the contest between allocations $x$ and $y$ as follows. Let $m_i := \max(x_i, y_i)$ for $1 \le i \le k$. For convenience we will refer to $m = (m_1, \dots, m_k)$ as an allocation, even though we are now working on the $\sqrt{n}$-scale of deviations and the sum $m_1 + \cdots + m_k$ may )

To compare various strategies, we will couple the $I_m$'s for various values of $m$, taking the viewpoint that the same random sequence of die tosses will be made for a given game regardless of the allocations $x$ and $y$ that the two players use (in the continuous game, that the same vector of $Z_i$'s will be used). Then increasing $m_i$ while leaving the other $m_j$'s ($j \ne i$) fixed may change $I$ from a value different from $i$ to $i$, but otherwise $I$ will not change.

Proof. Let $F_t(M)$ denote the probability that the game with allocations given by $M$ has clearance time $\le t$. If $Y_{\ell,t}$ denotes the number of times that bin $\ell$ has been selected through time $t$, then the $Y_\ell$'s are independent Poisson processes with respective intensity parameters $p_\ell$. We have The probability that the clearance time falls in $(t, t + dt)$ with bin $j$ the last bin cleared is where $e_j$ is the $j$th unit vector and $\Delta_j$ denotes the difference operator Letting $M' = M - e_j$ we therefore have where here we are viewing $I = I_M$ as a function of $M = pn + m\sqrt{n}$.
In this notation, the desired bound we will establish is
$$\Delta_h \Delta_i \Pr[I_M = j] = O(1/n).$$
Since we keep $p_j$ fixed as $n \to \infty$, we see from (2.2) that we may treat $h$, $i$, and $j$ symmetrically. Thus there are three cases to consider: (i) $h$, $i$, and $j$ all distinct, (ii) $h = i \ne j$, (iii) $h = i = j$. Let us consider first the case (i) of distinct $h$, $i$, $j$:

If both
Let $\lambda := p_i t$ and $M := M_i + 1$. For fixed $M$, the expression is easily seen to be maximized when $\lambda$ is one of the values $\lambda = M \mp \sqrt{M}$, and so As in case (i), we conclude that $\Delta_i \Delta_i \Pr[I_M = j] = O(1/n)$.
In case (iii) $h = i = j$, the desired bound follows from the equality

Define the $\delta$-undercut of $x$ to be the mixed strategy that increments $x$ by $(k - 1)\delta$ in a uniformly random coordinate, and decrements $x$ by $\delta$ in the remaining coordinates. (In the discrete setting, $\delta$ will be a multiple of $1/\sqrt{n}$.) The preceding lemma implies

Corollary 2.3. Uniformly in allocations $x$ and $y$ such that

Proof. We prove this in the discrete setting; the result for the continuous setting follows by taking limits. We assume without loss of generality that the bins are numbered so that $x_i > y_i$ for $1 \le i \le \ell$ and $x_i < y_i$ for $\ell + 1 \le i \le k$. Recall that $m = \max(x, y)$. Let $x'$ denote the random $\delta$-undercut of $x$, and $m' = \max(x', y)$. We have and because $x$ and $y$ differ by more than $(k - 1)\delta$ in each coordinate,

The same proof works for both the discrete and continuous versions of the game.
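The $\delta$-undercut defined above is easy to sample. The following sketch draws one realization and illustrates that the coordinate sum is preserved, since $(k-1)\delta - (k-1)\delta = 0$. The function name `delta_undercut` is illustrative.

```python
import random


def delta_undercut(x, delta, rng):
    """Sample the delta-undercut of x: add (k-1)*delta to one uniformly random
    coordinate and subtract delta from each of the other k-1 coordinates."""
    k = len(x)
    bumped = rng.randrange(k)  # the coordinate that receives the increment
    return [
        xi + (k - 1) * delta if i == bumped else xi - delta
        for i, xi in enumerate(x)
    ]


rng = random.Random(0)
sample = delta_undercut([0.0, 0.0, 0.0], 0.1, rng)
```

In the discrete setting one would take `delta` to be a multiple of $1/\sqrt{n}$, as the text notes, so that the undercut is again a legal allocation.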
Proof. We construct a strategy β which attempts to beat strategy α by undercutting it, to wit: β picks an allocation x from α, but, rather than playing x, strategy β instead plays the δ-undercut of x. By analyzing how β fares against α we will be able to bound Pr[|x i − y i | ≤ δ].
When β and α are pitted against each other, we will take the viewpoint that x and y are independently drawn from α and a fair coin is flipped; if the coin lands heads, then α plays x and β plays the δ-undercut of y, while if the coin lands tails, then α plays y and β plays the δ-undercut of x. Recall that I is the last bin to be cleared when the allocations are x and y. Without the undercutting, bin I is owned by β with probability 1/2. (We say that a player "owns" the last bin to be cleared if he was the player who placed more chips in that bin, or lost the coin toss in the event of tie.) Let I ′ be the bin last to be cleared with the undercutting.
Letting $E$ be the event that $|x_i - y_i| \le (k - 1)\delta$ for some coordinate $i$ and $E^c$ its complement, we may express , we condition on $I$. For example, conditionally given the event $\delta < |x_I - y_I| < (k - 1)\delta$, the player using $\beta$ owns $I$ with probability $1/2$ if the "overcut bin" [the bin with allocation incremented by $(k - 1)\delta$] chosen is not bin $I$ and with probability 1 if it is. Thus, if $\delta < |x_I - y_I| < (k - 1)\delta$, then
$$\Pr[\beta \text{ owns } I \mid x, y, I] = \frac{1}{2}\Bigl(1 - \frac{1}{k}\Bigr) + \frac{1}{k} = \frac{1}{2} + \frac{1}{2k}.$$
The other entries in the following formula are computed similarly: Thus, conditional on $x$ and $y$ but not $I$, Substituting this into (2.3), and then rearranging,

Recall from Theorem 1.2 that we have $O(\sqrt{n})$ bounds on the overplay or underplay of the optimal strategy $\alpha$. It follows that there is a positive constant $q$ (depending on $p$) such that $\Pr[I = i \mid x, y] \ge q$ for any coordinate $i$ and plays $x$ and $y$ that $\alpha$ might make. Recalling also that $E$ is the event that $|x_i - y_i| \le (k - 1)\delta$ for some coordinate $i$, we see that Letting $r := k - 1$, we have then that, for some $c$ and all $\delta \ge 1/\sqrt{n}$, When $r = 1$ (i.e., $k = 2$) this inequality is uninformative, but otherwise we may iterate it to show, for $j \ge 1$ and $\delta \ge 1/\sqrt{n}$, We take $j = \lceil \log(1/\delta)/\log r \rceil$ so that $(1 + c\delta) \times \cdots \times (1 + c r^{j-1}\delta) = O(1)$ and both terms on the right-hand side are $O(\delta)$. Thus

Note that the support of any measure with nonzero absolutely continuous part has positive Lebesgue measure, and that any subset of the line with positive Lebesgue measure has positive 1-dimensional Hausdorff measure. Thus we have

Corollary 2.6. For $k \ge 3$ bins, the support of any optimal strategy for continuous Knock 'em Down has Hausdorff dimension at least 1.
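Two elementary steps in the preceding argument can be spelled out. The first is the ownership probability in the intermediate case $\delta < |x_I - y_I| < (k-1)\delta$. The second is one standard way (using $1 + u \le e^u$) to see that the product stays bounded when $r = k - 1 \ge 2$ and $0 < \delta < 1$. This is a sketch of a possible verification and is not claimed to be the authors' own computation.

```latex
% Ownership probability: the overcut bin is I with probability 1/k.
\Pr[\beta \text{ owns } I \mid x, y, I]
  = \underbrace{\Bigl(1 - \tfrac{1}{k}\Bigr)\cdot\tfrac{1}{2}}_{\text{overcut bin} \ne I}
  + \underbrace{\tfrac{1}{k}\cdot 1}_{\text{overcut bin} = I}
  = \tfrac{1}{2} + \tfrac{1}{2k}.

% Boundedness of the product for j = \lceil \log(1/\delta)/\log r \rceil:
% since j - 1 < \log(1/\delta)/\log r, we have r^{j-1} < 1/\delta, hence r^{j} < r/\delta.
\prod_{i=0}^{j-1} \bigl(1 + c\,r^{i}\delta\bigr)
  \le \exp\Bigl(c\,\delta \sum_{i=0}^{j-1} r^{i}\Bigr)
  = \exp\Bigl(c\,\delta\,\frac{r^{j} - 1}{r - 1}\Bigr)
  \le \exp\Bigl(\frac{c\,r}{r - 1}\Bigr)
  \le e^{2c}.
```

The last inequality uses $r/(r-1) \le 2$ for $r \ge 2$, so the bound does not depend on $\delta$, which is what the $O(1)$ claim requires.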

Recalling again that Pr[|x
It is natural to guess that the true Hausdorff dimension is k − 1, and indeed that optimal strategies are absolutely continuous with respect to (k − 1)-dimensional Lebesgue measure.

Existence of an optimal continuous strategy
While every game in which each player has a finite number of options will have a value (which of course is 0 for n-token competitive Knock 'em Down), there are continuous games without a value [SW57]. Thus Theorem 3.1 below has nontrivial content.
Recall the definition of $K$ at (2.1). For the $n$-token game, we regard a mixed strategy $\alpha_n$ as a probability measure on $k$-tuples $x = (x_1, \dots, x_k)$ with vanishing sum, as described in Section 1, and we define the payoff function $K_n$ on this $x$-scale. We say that a sequence $(\alpha_n)$ of strategies for the $n$-token competitive Knock 'em Down games is asymptotically optimal if
$$\min_{y \in A_n} K_n(\alpha_n, y) \to 0 \quad \text{as } n \to \infty.$$
Here the min is taken over the finite (but growing, as $n \to \infty$) set $A_n$ of (normalized) actions (allocations) available for $n$-token Knock 'em Down.
Theorem 3.1. The continuous game has value 0, and there is at least one optimal strategy. Indeed, any subsequential weak limit of any asymptotically optimal sequence $(\alpha_n)$ of strategies for $n$-token competitive Knock 'em Down is an optimal strategy for the continuous game.
Key ingredients to the proof of this theorem are Theorem 1.2 and Lemma 2.4. The converse to this theorem is proved in Corollary 4.3.
Proof of Theorem 3.1. By Theorem 1.2 and Lemma 1.1, any sequence $(\alpha_n)$ of asymptotically optimal strategies (probability measures) is tight. Therefore there is a subsequential weak limit $\alpha$ which is a probability measure (see e.g. [Str93, Theorem 3.1.9]). It is easy to check that $\alpha$ is concentrated on $k$-tuples with vanishing sum, so that $\alpha$ is a mixed strategy for the continuous game, and (using Lemma 2.4) that $\alpha$ has atomless marginals. We claim that $\alpha$ is optimal, and then we see immediately that the continuous game has value 0.

The proof that $\alpha$ is optimal is rather routine. To avoid double subscripts, we henceforth innocuously assume that the full sequence $(\alpha_n)$ converges weakly to $\alpha$. Fix any pure strategy $y$ for the continuous game. Let $y_n$ denote a rounding of $y$ to a pure strategy for $K_n$; the details of the rounding procedure are irrelevant for our purposes here, as long as $y_n \to y$. Using the facts that $\alpha$ has atomless marginals, $\alpha_n \xrightarrow{w} \alpha$, $K$ is continuous away from pairs $(x, y)$ for which $x_i = y_i$ for some $i$, and (for any $\varepsilon > 0$) $|K_n(x, y_n) - K(x, y)| \to 0$ uniformly for $x$ satisfying $|x_i - y_i| > \varepsilon$ for $i = 1, \dots, k$, it follows (we omit the details) that $K_n(\alpha_n, y_n) \to K(\alpha, y)$. But by the optimality of $\alpha_n$, $K_n(\alpha_n, y_n) \ge 0$ for every $n$, so $K(\alpha, y) \ge 0$ and $\alpha$ is optimal.
Remark 3.2. (a) If it happens to be true that there exists a unique optimal strategy $\alpha_0$ for $K$, then Theorem 3.1 implies that
$$\alpha_n \xrightarrow{w} \alpha_0 \quad \text{as } n \to \infty. \tag{3.1}$$
(b) If $p_1 = \cdots = p_k$, it is possible to choose the strategies $\alpha_n$ to be symmetric. If so, and if it happens that there exists a unique symmetric optimal strategy $\alpha_0$ for $K$, then (3.1) holds.

Asymptotically optimal play of Knock 'em Down
The main result of this section (see Corollary 4.3) is that when an optimal strategy $\alpha_0$ for the continuous game $K$ is "rounded" to produce allocations for $n$-token Knock 'em Down, the result is an asymptotically optimal strategy. The drawback here is that we do not know how to construct such an $\alpha_0$, but we might at least hope to find a not unreasonably suboptimal $\tilde{\alpha}$ for $K$, such as the one discussed in Remark 4.2. We are thus motivated to show more generally (see Theorem 4.1) that "rounding" of any strategy $\tilde{\alpha}$ with atomless marginals and bounded support gives a strategy for Knock 'em Down whose worst-case payoff (i.e., payoff against an opponent playing the best possible response) is asymptotically at least as large as the worst-case payoff in game $K$ from use of $\tilde{\alpha}$.

and let $y_n \in A_n$ be a strategy achieving this minimum. Since the sequence $(\alpha_n)$ converges weakly, it is tight, so by Lemma 1.1, $y_n$ remains bounded. Let $n_\ell \uparrow \infty$ be any sequence for which $\lim_{\ell \to \infty} \kappa_{n_\ell} = \liminf_{n \to \infty} \kappa_n$. By compactness, we know that there is a subsequence $\tilde{n}_\ell \uparrow \infty$ and a continuous-game allocation $y$ such that $y_{\tilde{n}_\ell} \to y$.
The supremum of $|K_n(x, y) - K(x, y)|$ over any bounded set of $(x, y)$ tends to 0 as $n \to \infty$; in particular, ties do not cause a problem here. Since the sequence $(\alpha_n)$ is tight and the sequence $y_n$ is bounded, we have that $\kappa_n$ differs by $o(1)$ from $K(\alpha_n, y_n)$.
Remark 4.2. Even very simple-minded mixed strategies can offer substantial improvement over pure strategies. To illustrate this, we consider $k = 3$ and $p = (1/3, 1/3, 1/3)$. Numerical experimentation suggested that the uniform distribution $\tilde{\alpha}$ over the simplex of all triples $x = (x_1, x_2, x_3)$ summing to zero and satisfying $x_i \ge -1/6$ for all $i$ might be a good strategy for the continuous game. Indeed, numerical explorations indicate that the best response to $\tilde{\alpha}$ is the pure strategy $(0, 0, 0)$ and thence that RHS(4.1) . [a huge improvement on the worst-payoff value $-1/6 \doteq -0.1666667$ resulting from use of the naïve pure strategy $(0, 0, 0)$ in place of $\tilde{\alpha}$]. The strategy $\tilde{\alpha}$ was "rounded" to a strategy $\alpha_{180}$ for 180-chip Knock 'em Down by taking $\alpha_{180}$ to be uniform over allocations placing at least 58 chips in each bin.

Corollary 4.3. If $\alpha_n \xrightarrow{w} \alpha_0$, where $\alpha_0$ is optimal for the continuous game $K$, then $\alpha_n$ is asymptotically optimal for the $n$-token Knock 'em Down game $K_n$, in the sense that $\kappa_n$ defined at (4.2) vanishes in the limit.
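The rounding used in Remark 4.2 can be made concrete. With $n = 180$ and $p_i = 1/3$, the scaling $\xi_i = p_i n + x_i \sqrt{n}$ turns the constraint $x_i \ge -1/6$ into $\xi_i \ge 60 - \sqrt{180}/6 \approx 57.76$, that is, at least 58 chips per bin. The sketch below enumerates the support of such a uniform strategy and samples from it. The function names are illustrative, and the enumeration is written for three bins only.

```python
import math
import random


def allocations_with_minimum(n, minimum):
    """All 3-bin allocations of n chips with at least `minimum` chips in each bin."""
    spare = n - 3 * minimum  # chips left over after the guaranteed minimum
    return [
        (minimum + a, minimum + b, minimum + spare - a - b)
        for a in range(spare + 1)
        for b in range(spare + 1 - a)
    ]


def sample_rounded_strategy(n, minimum, rng):
    """Draw one allocation uniformly from the enumerated support."""
    return rng.choice(allocations_with_minimum(n, minimum))


# Smallest integer chip count compatible with x_i >= -1/6 when n = 180, p_i = 1/3.
min_chips = math.ceil(180 / 3 - math.sqrt(180) / 6)
support = allocations_with_minimum(180, min_chips)
```

Enumerating the support explicitly keeps the sampler trivially uniform; for larger $n$ or more bins one would presumably sample the spare chips directly instead of listing every allocation.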

Open problems
We have proved the existence of an optimal strategy for two-player continuous Knock 'em Down, but we do not have an explicit description of optimal play even when k = 3 and p 1 = p 2 = p 3 = 1/3. We know that the marginal distributions of optimal play are absolutely continuous with respect to Lebesgue measure. Consequently the set of pure strategies supporting optimal play will have dimension at least 1; perhaps the dimension is k − 1.