Bayesian group testing with dilution effects

Summary A Bayesian framework for group testing under dilution effects has been developed, using lattice-based models. This work has particular relevance given the pressing public health need to enhance testing capacity for coronavirus disease 2019 and future pandemics, and the need for wide-scale and repeated testing for surveillance under constantly varying conditions. The proposed Bayesian approach allows for dilution effects in group testing and for general test response distributions beyond just binary outcomes. It is shown that, even under strong dilution effects, an intuitive group testing selection rule that relies on the model order structure, referred to as the Bayesian halving algorithm, has attractive optimal convergence properties. Analogous look-ahead rules that can reduce the number of stages in classification by selecting several pooled tests at a time are proposed and evaluated as well. Group testing is demonstrated to provide great savings over individual testing in the number of tests needed, even for moderately high prevalence levels. However, there is a trade-off, with a higher number of testing stages and increased variability. A web-based calculator is introduced to assist in weighing these factors and to guide decisions on when and how to pool under various conditions. High-performance distributed computing methods have also been implemented for considering larger pool sizes, when savings from group testing can be even more dramatic.


C. Tatsuoka and others
Note that state j ≽ k if the collection of negative subjects associated with state j contains all the negative subjects associated with state k. The collection of these states thus forms what will be referred to as a powerset lattice.
Let the collection of classification states S be a finite lattice, with an unknown true state. In brief, recall that a lattice is a partially ordered set such that any two elements have both a unique least upper bound (join) and a unique greatest lower bound (meet). For two elements i, j ∈ S, their join and meet are respectively denoted as i ∨ j and i ∧ j. The up-set of an element i is ↑i = {j ∈ S : i ⪯ j}, and the down-set of i is ↓i = {j ∈ S : j ⪯ i}. More generally, for a set I ⊆ S, ↑I = {j ∈ S : there exists i ∈ I such that i ⪯ j}, and ↓I = {j ∈ S : there exists i ∈ I such that j ⪯ i}. A top element 1 is an element such that for any i ∈ S, i ⪯ 1. Similarly, a bottom element 0 is an element such that for any i ∈ S, 0 ⪯ i. Both a top and a bottom element exist in a finite lattice. For i ∈ S, define C_i to be the set of covers of i, where y ∈ S is a cover of an element i if i < y and there does not exist z ∈ S such that i < z < y. Also, let D_i denote the set of anti-covers of i, in other words the set of states y ∈ S such that y < i and there does not exist z ∈ S with y < z < i. In Figure 1a of the main text, note that the anti-covers for state AB are the states A and B, while state AB is the only cover for both states A and B. In general, for a powerset lattice, the covers of a true state s are the states associated with one more negative subject, while its anti-covers are the states associated with one less negative subject. Let E be a finite collection of pooled tests. Denoting X as the random variable being observed for an experiment e ∈ E, let f(x|e, j) be the class conditional probability density for j ∈ S.
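These lattice operations are straightforward to make concrete. The following Python sketch (illustrative only; the frozenset encoding and function names are mine, not the paper's) represents each state by its set of negative subjects, so that the partial order is set inclusion and covers/anti-covers add or remove one negative subject:

```python
from itertools import combinations

subjects = {"A", "B"}  # N = 2, as in Figure 1a of the main text

# A state is the frozenset of subjects deemed NEGATIVE: the bottom element 0
# (all positive) is frozenset(), the top element 1 (all negative) is the full set.
states = [frozenset(c) for r in range(len(subjects) + 1)
          for c in combinations(sorted(subjects), r)]
assert len(states) == 2 ** len(subjects)  # powerset lattice

def leq(j, k):
    """j is below k in the lattice iff k's negative subjects contain j's."""
    return j <= k  # frozenset inclusion

def covers(i):
    """C_i: states with exactly one more negative subject than i."""
    return [i | {x} for x in subjects - i]

def anticovers(i):
    """D_i: states with exactly one fewer negative subject than i."""
    return [i - {x} for x in i]

top = frozenset(subjects)
# Anti-covers of AB are A and B; AB is the only cover of both A and B.
print([sorted(d) for d in anticovers(top)])
```

For larger N the same encoding scales directly, with subjects indexed by integers and states enumerated as subsets.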
Note that there is a one-to-one correspondence between elements in E and S \ {0}, so that for instance e = j for some j ∈ S implies that the set of subjects out of N that are represented as negative by j are the corresponding samples being pooled. Please note that here we do not use the notation ẽ for experiments, as in the main text. In establishing asymptotic results, assume each e ∈ E can be replicated an unlimited number of times. Also, let K(f, g) represent the Kullback-Leibler information for distinguishing distributions f and g when f is true. Throughout, it will be assumed that for all e ∈ E, there exist j_1, j_2 ∈ S such that K(f(·|e, j_1), f(·|e, j_2)) > 0. In this case, e is said to distinguish states j_1 and j_2. Also, assume that for any j_1, j_2 ∈ S and e ∈ E, K(f(·|e, j_1), f(·|e, j_2)) < ∞. Denote s ∈ S as the unknown true state. For j ∈ S, let π_0(j) denote the prior probability that j = s. Assume π_0(j) > 0 for all j ∈ S. At the first stage a test e_1 ∈ E is selected, and a random variable X_1 with density f(·|e_1, j) under state j is observed. Inductively, at stage n for n > 1, conditionally on having chosen tests e_1, e_2, ..., e_{n−1} and observed X_1, ..., X_{n−1}, a test e_n is selected and X_n is observed. The posterior probability that j = s then becomes

π_n(j) = π_0(j) ∏_{m=1}^{n} f(X_m|e_m, j) / Σ_{k∈S} π_0(k) ∏_{m=1}^{n} f(X_m|e_m, k).

The posterior probability distribution on S at stage n will be denoted by π_n.
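The sequential posterior updating described above can be sketched directly. In the fragment below (illustrative; the Bernoulli response model with made-up sensitivity/specificity values is an assumption for demonstration, not the paper's fitted model), the posterior over all 2^N states is renormalized after each pooled test outcome:

```python
from itertools import combinations

N = 3
states = [frozenset(c) for r in range(N + 1) for c in combinations(range(N), r)]
prior = {j: 1.0 / len(states) for j in states}  # uniform prior, for illustration

def lik(x, pool, j, sens=0.95, spec=0.99):
    """f(x | e, j): hypothetical Bernoulli response model (made-up error rates).
    x = 1 means the pooled test reads positive; r is the number of positive
    samples in the pool under state j (a state is its set of negative subjects)."""
    r = len(pool - j)
    p_pos = (1 - spec) if r == 0 else sens  # no dilution in this toy model
    return p_pos if x == 1 else 1 - p_pos

def update(post, x, pool):
    """One step of the posterior recursion defining pi_n."""
    new = {j: post[j] * lik(x, pool, j) for j in post}
    z = sum(new.values())
    return {j: p / z for j, p in new.items()}

# Pool subjects {0, 1} and observe a negative outcome (x = 0): states in which
# both pooled subjects are negative gain posterior mass.
post = update(prior, 0, frozenset({0, 1}))
assert post[frozenset({0, 1, 2})] > prior[frozenset({0, 1, 2})]
```

Dilution effects enter by letting the positive-read probability depend on both r and the pool size, as formalized in Section A.2.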
As seen in Tatsuoka and Ferguson (2003), it is necessary and sufficient to distinguish s from all other states in order for π n (s) → 1 almost surely.Letting n e be the number of times e is administered up to stage n, the limiting proportion that e is administered is denoted as p e , with p e = lim inf n e /n.It is desirable to seek rules that sequentially select tests, e 1 , e 2 , . .., in the appropriate limiting proportions so that the posterior probability of the true state s, π n (s), converges almost surely to 1 at the fastest possible, or optimal rate.An optimal strategy is comprised of a collection of limiting proportions associated with each e ∈ E such that administration of the tests in their respective limiting proportions leads to convergence at the optimal rate.From Theorem A.2 in Tatsuoka and Ferguson (2003), it is argued that for k ∈ S, k = s, The right-hand side denotes the rate at which π n (k) → 0 as n → ∞.This rate depends on

A.2 Dilution Effects
Consider the following statistical formulation. Again, suppose the collection of tests E corresponds to all possible non-empty subsets of subjects, and hence to S \ {0}. In other words, every possible combination of the N subjects can be pooled and tested. For e ∈ E and j ∈ S, assume that

f(·|e, j) = f_{r,|e|},    (A1)

where |e| denotes the number of samples in pooled test e, and where r is the number of positive samples in pooled test e given j is the true state.
Note that when e ⪯ j, this implies r = 0. This formulation allows response distributions to vary depending on how many positive samples are in a pool, and according to pool size. (iii) Moreover, suppose the functions in (A2) are non-increasing in r for fixed r′ and |e|, with K(r, r − r′, |e|) > 0 for each r′ ≥ 1.

A.3 Optimal designs and optimal rates of convergence

The following examples illustrate issues that arise in determining optimal strategies under dilution effects, which are established in Theorem A.1. An emphasis is on how the lattice structure determines the difficulty in classification. Given conditions (A1)-(A3), from a lattice-theoretic point of view, it is necessary and sufficient to distinguish a true state from its covers and anti-covers (the states that directly surround it) in order to distinguish the true state from all the others.
Example A.1 Suppose that S is the lattice in Figure 1a, and that s = 0 (all subjects are positive), as in part (i) of Theorem A.1. The covers of s are the atoms of the lattice, in other words the states that respectively represent only one of the samples being negative: C_s = {A, B}.
Consider now possible strategies for distinguishing s from its covers. One approach would be to individually test each subject, for instance testing the sample from A alone. Alternatively, the samples from A and B can be pooled: if state A or B is true, there is one positive sample in the pool, while given s is true, both samples are positive. Hence, more than one cover at a time can be distinguished from s through pooled experiments. Because of dilution effects (i) and (iii), K(1, 0, 1) ≥ K(2, 1, 2), so individual testing may lead to more efficient discrimination between s and its covers, while pooling A and B has the advantage of distinguishing s from A and B simultaneously. Hence, in the presence of dilution effects, there is a trade-off between these testing approaches. Theorem A.1 resolves such trade-offs in terms of optimizing the rate of convergence. Note also that by dilution effect (ii), K(2, 0, 2) ≥ K(2, 1, 2), so state AB is distinguished from s at least as efficiently as states A and B when samples from A and B are pooled. Under individual testing, state AB is distinguished whenever states A or B are distinguished. Hence, with either approach, π_n(AB) will converge to zero at a rate at least as fast as either π_n(A) or π_n(B).
Example A.2 In part (ii) of Theorem A.1, it is supposed that the true state is s = 1 (all subjects are negative). In Figure 1a, state AB = 1. In distinguishing s from its anti-covers, D_s = {A, B}, again more than one strategy must be considered. If the sample from A is tested individually, states A and AB have distribution f_{0,1}, since A would be negative if either is the true state, while states B and 0 would have distribution f_{1,1}, as A is positive if either of those states is true. Hence, the anti-cover B is distinguished. Similarly, the anti-cover A is distinguished when B is tested alone. Due to the increasing pool size, K(0, 1, 1) ≥ K(0, 1, 2), as reflected by dilution effect (i). Again, because of the presence of a dilution effect, there is a trade-off to consider in terms of distinguishing both anti-covers at once, but perhaps less efficiently than distinguishing them one at a time. As one would expect, it will be seen that as the dilution effect gets stronger, it becomes more attractive to test smaller pools.
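The Kullback-Leibler comparisons in Examples A.1 and A.2 can be checked numerically. The sketch below is illustrative only: the dilution curve q_{r,|e|} is a made-up parametric family chosen to satisfy dilution effects (i)-(iii), not a model from the paper.

```python
import math

def kl_bern(p, q):
    """Kullback-Leibler information K(Bernoulli(p), Bernoulli(q))."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

SPEC = 0.99  # p_{0,|e|}: specificity, constant over pool size in this example

def sens(r, m, d=0.3):
    """Hypothetical dilution curve: sensitivity decays as positives are diluted."""
    return 0.99 * (r / m) ** d

def K(r1, r2, m):
    """K(r1, r2, |e|) for Bernoulli responses with pool size m."""
    def p_pos(r):  # probability the pooled test reads positive
        return (1 - SPEC) if r == 0 else sens(r, m)
    return kl_bern(p_pos(r1), p_pos(r2))

# Example A.1: individual testing vs. pooling A and B when s = 0
assert K(1, 0, 1) >= K(2, 1, 2)  # dilution effects (i) and (iii)
assert K(2, 0, 2) >= K(2, 1, 2)  # dilution effect (ii)
# Example A.2: testing one negative alone vs. pooling both when s = 1
assert K(0, 1, 1) >= K(0, 1, 2)  # dilution effect (i)
```

Under this hypothetical curve the inequalities hold with substantial slack, illustrating how dilution makes individual testing relatively more informative per cover or anti-cover, at the cost of distinguishing fewer states per test.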
When 0 < s < 1, more complex situations can arise. This is because in distinguishing covers, anti-covers can be distinguished as well when positive samples are pooled together with negative ones. However, when it is attractive to do so, there do not exist general closed-form solutions of optimal strategies for such cases, as many contingencies can arise. It will instead be assumed that condition (A4) holds for r positives, 1 ≤ r ≤ N − |s|, and k negatives. As will be argued in Theorem A.1, this condition ensures that for an optimal strategy, it can be assumed that the optimal value of k is k* = 0, and hence that asymptotically, positives and negatives are not tested together. When r > 0 and k = 0, a corresponding pooled test distinguishes covers but not anti-covers. Hence, given (A1)-(A4), when S is a powerset lattice and 0 < s < 1, it will be shown that determining optimal rates of convergence involves identifying an optimal value r*, the number of positives to be pooled to distinguish covers, and j*, the optimal size of pools comprised only of negatives, to distinguish the anti-covers. In contrast, when there is no dilution effect, as in Tatsuoka and Ferguson (2003), the optimal strategy is to distinguish covers by individually testing positives (r* = 1, k* = 0), and to distinguish anti-covers by pooling negatives all at once (j* = |s|).
In Theorem A.1, it is assumed that the true state is known. Clearly, in practice, it is unknown, and in fact determining its identity is the primary objective of classification. Still, the following results have direct practical relevance. Afterwards, an optimal rule for selecting pooled tests will be established. In order to be optimal, a rule must first be convergent in the sense that the true state is eventually identified almost surely. In addition, it must then also eventually adopt an optimal strategy for the selection of tests, corresponding to whatever state is true. Thus, Theorem A.1 characterizes the pooling sequences that must eventually be adopted almost surely under the various scenarios that can arise.
For 0 ≤ |s| ≤ N, let R*_c = max_{1 ≤ r ≤ N−|s|} (r/N) K(r, r − 1, r), with 1 ≤ r* ≤ N − |s| attaining the value of R*_c, and let R*_d = max_{1 ≤ j ≤ |s|} (j/N) K(0, 1, j), with 1 ≤ j* ≤ |s| attaining the value of R*_d.
Theorem A.1 Suppose that S is a powerset lattice of N subjects, and let s ∈ S denote the true state.Suppose that tests in E satisfy (A1), dilution effects (i)-(iii), and (A4).
(i) If s = 0, the optimal rate of convergence is R*_c, with |s| = 0. This rate is attained if each possible subset of r* subjects is pooled in equal limiting proportion 1/(N choose r*).
(ii) If s = 1, the optimal rate of convergence is R*_d, with |s| = N. This rate is attained if each possible subset of j* subjects is pooled in equal limiting proportion 1/(N choose j*).

Manuscript Submitted to Biostatistics
(iii) If 0 < s < 1, the optimal rate is attained if each subset of r* positive subjects is pooled and tested in equal limiting proportion, and all tests e ⪯ s with |e| = j* (pools of size j* consisting only of negatives) are administered in equal limiting proportion, with the two sets of proportions balanced so that the posterior probabilities of the covers and of the anti-covers converge to zero at the same rate. The respective given optimal strategies are not necessarily unique. When j* or r* are not unique, the optimal rate can be attained by any mixture of optimal allocations associated with each of the values. If E is restricted by bounds on pool size, Theorem A.1 can straightforwardly be extended by optimizing j* and r* with respect to the correspondingly constrained values.
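For concreteness, the optimizer j* in part (ii) can be computed by maximizing the allocation rate (j/N) K(0, 1, j) over j; this rate expression is the one stated later in the appendix for the allocation a(j, N). The sketch below uses a made-up Bernoulli sensitivity curve q_{1,j} = 0.99 · j^(−d) (an assumption for illustration, not the paper's model):

```python
import math

def kl_bern(p, q):
    """Kullback-Leibler information K(Bernoulli(p), Bernoulli(q))."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def j_star(N, d, spec=0.99):
    """Maximize the allocation rate (j/N) * K(0, 1, j) over 1 <= j <= N,
    under the hypothetical sensitivity curve q_{1,j} = 0.99 * j**(-d)."""
    def K01(j):
        return kl_bern(1 - spec, 0.99 * j ** (-d))
    return max(range(1, N + 1), key=lambda j: (j / N) * K01(j))

# Mild dilution (d = 0.25): pooling all N negatives remains optimal.
# Strong dilution (d = 0.9): the optimal pool of negatives shrinks.
print(j_star(15, d=0.25), j_star(15, d=0.9))
```

This mirrors the qualitative message of the examples below: unless dilution is severe, j* stays at the maximal pool size.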
Theorem A.1 gives demarcations for when it is optimal to pool subjects under dilution effects. This result is in some sense analogous to that in Ungar (1960), which states that when there is no testing error with binary outcomes, it is preferable to individually test when the proportion of positive subjects is greater than (1/2)(3 − √5); otherwise, group testing is preferred. In part (i), with s = 0 (all subjects are positive), the demarcation for when it is optimal to individually test subjects is that condition (A5) holds for all 1 < r ≤ N − |s|. This condition follows by comparing rates of convergence for the covers of s = 0 when r subjects are tested at a time. It is a regulation on dilution effect (iii). In part (ii), when s = 1, it is optimal to pool all the subjects if for 1 ≤ j ≤ N − 1,

N K(0, 1, N) ≥ j K(0, 1, j).    (A6)

More generally, it is optimal to do some form of pooling as long as 1 < j* ≤ N. Condition (A6) regulates the decrease in efficiency of detecting a single positive subject versus no positive subjects as pool size increases, and hence relates to dilution effect (i). If this decrease does not occur too quickly, it is optimal to pool all subjects, which are all negative when s = 1. These demarcations will be illustrated below, as well as when pooling is preferred for 0 < s < 1.
Given the presence of dilution effects, it is of particular interest to demarcate how strong the effects must be to alter the optimal strategies relative to the no-dilution-effect case. If (A5) and (A6) hold, the respective optimal strategies for s = 0 and s = 1 correspond to the no-dilution-effect case. The following examples indicate that the same optimal strategy as with no dilution effects remains optimal even in the presence of significant dilution effects.
Example A.3 Suppose s = 1, let N be the number of objects to be classified, and suppose sensitivity decreases with pool size but specificity stays constant. Let f_{r,|e|} be Bernoulli density functions, with p_{0,|e|} = 0.99 for all 1 ≤ |e| ≤ N, and q_{1,1} = 0.99. Following (A6), note that if N = 15, in terms of the rate of convergence, group testing all fifteen subjects is preferred if q_{1,15} > 0.294 and (A6) holds for the other values of j < 15. Further, (A6) would still be satisfied at j = 30 if q_{1,30} > 0.174. For j = 100, (A6) would still hold if q_{1,100} > 0.073. Hence, under these conditions, (A6) should hold in most practical applications.
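These demarcation values can be reproduced numerically: with p_{0,|e|} = q_{1,1} = 0.99, the boundary value of q_{1,N} solves K(0, 1, N) = (1/N) K(0, 1, 1), comparing pooling all N subjects against testing one subject at a time. The bisection solver below is my own sketch of that computation, not code from the paper:

```python
import math

def kl_bern(p, q):
    """Kullback-Leibler information K(Bernoulli(p), Bernoulli(q))."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def demarcation(N, spec=0.99, sens1=0.99):
    """Smallest q_{1,N} for which pooling all N subjects is preferred:
    solve kl_bern(1 - spec, q) = (1/N) * kl_bern(1 - spec, sens1) by bisection
    (kl_bern(1 - spec, q) is increasing in q on this range)."""
    target = kl_bern(1 - spec, sens1) / N
    lo, hi = (1 - spec) + 1e-9, sens1
    for _ in range(100):
        mid = (lo + hi) / 2
        if kl_bern(1 - spec, mid) < target:
            lo = mid
        else:
            hi = mid
    return hi

for N in (15, 30, 100):
    print(N, demarcation(N))
```

The computed boundaries are close to the 0.294, 0.174, and 0.073 values quoted in the example.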
Suppose f_{0,|e|} is the density of the standard normal distribution for all e, and f_{1,|e|} is the density of a normal distribution with mean µ_{1,|e|} and variance 1. Assume that (A6) holds for j < N. Letting µ_{1,1} = 3.0, it can be seen that when N = 15, it is still more attractive to group test all objects if µ_{1,15} > 0.775; when N = 30, if µ_{1,30} > 0.548; and when N = 100, if µ_{1,100} > 0.300.
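In the normal case the thresholds have a closed form under the same comparison, since K(N(µ1, 1), N(µ2, 1)) = (µ1 − µ2)²/2: pooling all N is preferred when µ_{1,N}²/2 ≥ (1/N) µ_{1,1}²/2, i.e. when µ_{1,N} > µ_{1,1}/√N (this algebraic rearrangement is mine; it reproduces the quoted values):

```python
import math

mu11 = 3.0  # separation mu_{1,1} for one positive sample, pool size 1

# Threshold from mu_{1,N}^2 / 2 >= (1/N) * mu11^2 / 2, i.e. mu11 / sqrt(N)
for N in (15, 30, 100):
    print(N, round(mu11 / math.sqrt(N), 3))  # -> 0.775, 0.548, 0.3
```

The three printed values match the 0.775, 0.548, and 0.300 demarcations stated in the example.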
Example A.4 Consider when s = 0. The demarcation in (A5) can similarly be illustrated numerically. For instance, let sensitivity q_{r,r} = 0.99 for r ≥ 1, and specificity p_{0,|e|} = 0.99 for all 0 < |e| ≤ N. For this case, the demarcation is the same as in Example A.3. Thus, assuming (A5) holds for all j < N, (A5) is also satisfied for N = 15, 30, and 100 when respectively q_{14,15} < 0.294, q_{29,30} < 0.174, and q_{99,100} < 0.073. This example illustrates that dilution effect (iii) can be quite strong, and yet the no-dilution-effect strategy of eventually individually testing positive subjects is not affected.
Given (A1), dilution effects (i)-(iii), and 0 < s < 1, a simple condition that is sufficient for individually testing positives to be optimal (r* = 1) is (A7), required to hold for 2 ≤ r ≤ N − |s|. This condition is essentially the same as (A5), except that (A7) applies to a smaller range of r. Hence, if (A5) is already established, (A7) follows as well. The numerical demarcation in Example A.4 is thus applicable to (A7), and indicates that (A7) can hold generally.
If condition (A8) holds along with (A7), testing positives individually is optimal. Note that (A8) is a special case of (A4) with r = 1. Finally, given (A7) and (A8), it follows that distinguishing the anti-covers is conducted by pooled tests of size j* consisting only of negatives, provided (A9) holds. Note that (A6) implies (A9); indeed, this can be seen by substituting N with |s| in (A6), so that the numerical demarcations of Example A.3 apply. In sum, Examples A.3 and A.4 illustrate that dilution effects, unless severe, do not generally alter the optimal strategy of eventually pooling all negative subjects, while individually testing each of the positive subjects. One practical ramification these results suggest is that, given low prevalence of positive samples, it is attractive to initially pool as many samples as possible, in spite of dilution effects, since the top element is most likely the true state a priori. In the next section, it will be seen that when (A5), (A6), and (A8) hold, a simple and intuitive rule for dynamically selecting pools is optimal in the sense that optimal strategies will eventually be selected, corresponding to the unknown true state, with probability one.

A.4 Optimal Pooling Selection Under Dilution Effects
Now let us consider rules for pool selection that, given the observed outcomes of previously administered pools and prior information, select the composition of the next-stage pool to test. It is desirable to have a rule that adopts optimal strategies and attains optimal rates almost surely, no matter which state is true. It will be seen in this section that an intuitive and simple rule, a halving algorithm, attains optimal rates of convergence when the optimal strategy coincides with the no-dilution-effect case, and hence even under strong dilution effects. For instance, if (A5), (A6), and (A8) hold and s = 1, then it is desirable for a pool selection rule to eventually pool all objects. If 0 < s < 1, then we would want the rule to lead to sequences of pools that eventually consist of all the negative subjects, or that individually test the positive subjects.
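As a rough illustration of such a selection rule, the sketch below implements one plausible reading of a halving-type criterion: choose the pool e whose up-set posterior mass π_n(↑e) is closest to 1/2. The paper's precise definition of the Bayesian halving algorithm is given in the main text; treating it as this up-set-mass criterion is an assumption made here for illustration.

```python
from itertools import combinations

N = 3
states = [frozenset(c) for r in range(N + 1) for c in combinations(range(N), r)]

def upset_mass(post, pool):
    """pi_n of the up-set of e: mass of states under which every pooled
    subject is negative (states are sets of negative subjects)."""
    return sum(p for j, p in post.items() if pool <= j)

def halving_pool(post):
    """Select the pool whose up-set posterior mass is closest to 1/2."""
    pools = [frozenset(c) for r in range(1, N + 1)
             for c in combinations(range(N), r)]
    return min(pools, key=lambda e: abs(upset_mass(post, e) - 0.5))

post = {j: 1.0 / len(states) for j in states}  # uniform posterior, for illustration
chosen = halving_pool(post)
```

Under a uniform posterior, a pool of size k has up-set mass 2^(−k), so a singleton pool splits the mass exactly in half here; as the posterior concentrates near the top element, larger pools become the mass-splitting choice.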

In particular, the halving algorithm in (1) will attain the optimal rate almost surely.

A.5 Bayesian Estimation of Prevalence Through Group Testing
Based on the Bayesian group testing framework described above, prevalence can be estimated through mixtures of posterior distributions. This approach reflects the uncertainty arising from test response error, and can employ pooled test results. Suppose that it is of interest to estimate the proportion of positives in a target population, denoted by θ, based on group testing data, with the number of positives among the N subjects treated as binomially distributed given θ. Given the uncertainty in the positivity status of subjects, a natural estimate can rely on the information provided by π_n. For instance, given a conjugate Beta prior distribution for θ with density f(·; a, b), the posterior distribution for θ given the observed group testing outcomes is a mixture of posterior Beta densities with respect to π_n, where the Beta densities are updated depending on which state is true:

f(θ | x_1, ..., x_n) = Σ_{j∈S} π_n(j) f(θ; a + N − |j|, b + |j|).

For the powerset lattice, |j| is the number of negatives out of the N objects given state j is true.
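The mixture posterior is easy to work with in practice; for instance, its mean follows by mixing the Beta means. The sketch below is illustrative: the stage-n state posterior π_n is a made-up example rather than output of the testing procedure.

```python
N = 2
a, b = 1.0, 1.0  # conjugate Beta(a, b) prior on the proportion of positives theta

# Hypothetical pi_n over the four states of the N = 2 powerset lattice,
# keyed by each state's set of NEGATIVE subjects; values are made up.
pi_n = {frozenset(): 0.01, frozenset({"A"}): 0.04,
        frozenset({"B"}): 0.05, frozenset({"A", "B"}): 0.90}

def posterior_mean_theta(pi_n, N, a, b):
    """Mean of the Beta-mixture posterior for theta: each state j contributes
    the mean of Beta(a + N - |j|, b + |j|), weighted by pi_n(j)."""
    mean = 0.0
    for j, w in pi_n.items():
        a_j = a + (N - len(j))  # positives under state j
        b_j = b + len(j)        # negatives under state j
        mean += w * a_j / (a_j + b_j)
    return mean

print(posterior_mean_theta(pi_n, N, a, b))  # -> 0.2775
```

As π_n concentrates on a single state, the mixture collapses to the single Beta posterior corresponding to that state's diagnoses, consistent with the convergence remark below.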
Note that this mixture converges to the posterior distribution for θ that would be obtained if the correct diagnoses of all N subjects were known exactly, since π_n(s) converges to 1 almost surely for the true state s.
As a simple illustration, suppose as in Example A.1 that N = 2, and that we are interested in estimating the proportion of positives θ among the corresponding population.

Proof of Theorem A.1. Consider first the case s = 1; the case s = 0 follows similarly. For z ∈ ↓D_s \ D_s, note that since S is a powerset lattice, there exist x, y ∈ D_s such that z ⪯ x ∧ y. Hence, z is distinguished from s whenever x and y are distinguished. Further, by (A2), the rate of convergence over terms in ↓D_s is thus slowest for terms in D_s. We thus need only consider the terms in D_s in establishing the optimal rates. Moreover, it can be established that if all experiments in {e : |e| = j, e ⪯ s}, 1 ≤ j ≤ N, are each administered in limiting proportion 1/(N choose j), the rate of convergence is (j/N) K(0, 1, j). Denote such an allocation as a(j, N).
It will now be shown that the maximin rate is obtained by administering r positives with k = 0 negatives rather than with k > 0. Hence, for states in C_s ∪ D_s, it is optimal to consider only strategies with k = 0.
If j is incomparable to s, then there must be a subject k_1 that is negative if s is true but positive if j is true. Let d_1 ∈ D_s be the anti-cover associated with k_1, in the sense that the subjects associated as negative for d_1 correspond to all the negatives for s except k_1. When d_1 is so distinguished, so is j. Further, if k = 0 when distinguishing covers, π_n(j) converges at least as fast as π_n(d_1). Hence, only states in C_s ∪ D_s determine the optimal rate, and so it is optimal to consider only strategies with k = 0. Following the above, only pools with r* positives and k = 0 negatives will be administered to distinguish covers, along with pools comprised only of j* negatives to distinguish anti-covers. Respective limiting proportions that ensure that posterior probability terms for states in C_s ∪ D_s converge at the same rate, as stated in (iii), are optimal.
Proof of Theorem A.2. Define p_j(n) = n_j/n, where n_j represents the number of times j ∈ E is administered through stage n, and p_{js}(n) = n_{js}/n, where n_{js} represents the number of times j ∈ S \ {s} is distinguished from s through stage n. Also, let G_{cs} be the minimal elements in K_{cs}. For a powerset lattice, G_{cs} corresponds to the atom associated with the subject that is negative for c but positive for s.
First, note that the halving algorithm is convergent in the sense that π_n(s) → 1 almost surely.

Suppose f_{r,|e|} is a Bernoulli density, and let p_{r,|e|} be the probability that the outcome indicates that no positive samples are present, given that r positive samples are present and the pool size is |e|. Note then that for r = 0 (no positives), p_{0,|e|} represents the specificity of a test of pool size |e|. Also, for r ≥ 1, q_{r,|e|} = 1 − p_{r,|e|} is the sensitivity of the test when r positive samples are present in a pool of size |e|. In this section, let K(r_1, r_2, |e|) = K(f_{r_1,|e|}, f_{r_2,|e|}). The following conditions, based on Kullback-Leibler information, are given to reflect the presence of dilution effects: (i) Suppose that both K(r, r − r′, |e|) and K(r, r + r′, |e|) are non-increasing in |e| for fixed r and r′, with K(r, r − 1, |e|) > 0 and K(1, 0, |e|) > 0. (ii) Suppose also that the Kullback-Leibler information values are non-decreasing in r′ for fixed r ≥ 0 and |e|, with respectively 0 ≤ r′ ≤ r and 0 ≤ r′ ≤ |e| − r.

(A6) regulates the decrease in efficiency of detecting a single positive subject versus no positive subjects as pool size increases. Example A.3 illustrates that j* = |s| except under strong dilution effects. In practice, checking conditions (A5) and (A6), and (A8) with |s| = N, is sufficient for determining, for all states, whether or not the corresponding strategies are altered.
If the true state is in ↑A = {A, AB}, since A is negative, the distribution of the response is f_{0,1}. If the true state is in (↑A)^c = {0, B}, then subject A would be positive, and hence the corresponding distribution would be f_{1,1}. Hence, this test distinguishes s = 0 from A and AB. Similarly, states B and AB can be distinguished by individually testing B. In sum, when testing subjects individually, the corresponding atoms are distinguished one test at a time. However, when response distributions depend on the number of positives in a pool, the atoms can also be distinguished from the state s = 0 by pooling subjects. Note that if subjects A and B are pooled, then either state A or B being true would result in the test having distribution f_{1,2}, while for s = 0 it would be f_{2,2}.
Figure A.1: Standard deviations of the number of tests, for pool sizes ranging from 6 to 16, over 3 prior settings and k = 1 to 4.