Extracted Text Content in Record:

First 5000 Characters:Group testing saves time and resources by testing each preassigned group instead of each individual, and onestage group testing emerged as essential for costeffectively controlling the current COVID19 pandemic. Yet, the practical challenge of adjusting pooling designs based on infection rate has not been systematically addressed. In particular, there are both theoretical interests and practical motivation to analyze onestage group testing at finite, practical problem sizes, rather than asymptotic ones, under noisy, rather than perfect tests, and when the number of positives is randomly distributed, rather than fixed. Here, we study noisy group testing under the probabilistic framework by modeling the infection vector as a random vector with Bernoulli entries. Our main contributions include a practical onestage group testing protocol guided by maximizing pool entropy and a maximumlikelihood recovery algorithm under the probabilistic framework. Our findings highlight the implications of introducing randomness to the infection vectors we find that the combinatorial structure of the pooling designs plays a less important role than the parameters such as pool size and redundancy.
Group testing is a procedure to find positives in a cohort by applying tests for the presence of any positives to cohort subsets (groups), instead of testing each individual separately. When the fraction of positives among the samples is low, implementing group testing saves time and resources. Group testing has applications in genetics [21] , drug screening [13] communications [25] and epidemiology [23] .
Onestage group testing [6] addresses the scenario of groups being set in advance, independently of test results a common practical requirement, e.g. due to testing speed constraints. Existing algorithms on infection vector recovery for noisy group testing under the combinatorial prior (the number of positives among samples is fixed) include LP relaxation [16] , belief propagation [20] , Markov Chain Monte Carlo (MCMC) [9] , Noisy Combinatorial Orthogonal Matching Pursuit (NCOMP) [4] , separate decoding [18] and Definite positives (DD) [19] . However, group testing under a fully probabilistic framework has not been extensively studied. In particular, to the best of our knowledge, there has been a lack of work on nonasymptotic results on noisy group testing in the realistic scenario where number of positives in not known in advance, as in [2, 3] , but rather is a random variable.
As part of the global efforts to control the COVID19 pandemic, there has been an emerging body of work on implementing onestage group testing for COVID19 [10, 11, [22] [23] [24] . However, testing scenarios vary significantly due to fluctuating infection rates across time and geography often caused by emerging variants, as well as different testing objectives, such as screening of health care workers versus large scale monitoring of the community. Group testing protocols should thus be adjusted according to the infection rate among the tested samples, measurement error rates, and the recovery error tolerance level [5] .
As a result, systematically studying onestage noisy group testing with a random number of positives is of both theoretical interests and practical importance. In this paper, we study noisy group testing under the probabilistic framework in order to address the practical challenges posted by group testing implementation for COVID19.
The rest of the paper is organized as follows. We begin with an introduction of the noisy group testing under the probabilistic framework in Section 2, including a group testing protocol and a novel recovery algorithm under the probabilistic framework. The performance of the recovery algorithm and the pooling designs is in Section 3. Finally, we conclude with a discussion on future work in Section 4.
We assume a tested cohort of n individuals, with some infection rate f among the population the cohort is sampled for. Note, that the actual number of true positives is not known in advance, as combinatorial priors unrealistically assume [2, 3] . Also, this is the ground truth rate of infected individuals, as opposed to the observed positivity rate. Specifically:
We aim to design a protocol to test n samples with t < n tests. We arrange the pool assignments into a pooling matrix M ∈ {0, 1} t×n such that M ij = 1 if and only if individual j is included in pool i. Notice that each row i sum of M correspond to a pool size s i , and each column j sum corresponds to the number of pools that a sample participates, which we define as redundancy r j . Following [1] and [8] , we focus on the following four classes of pooling designs that are of practical interests (see Fig. 1 Under the noiseless setting, a test result is negative if and only if all samples in the pool are negative. In practice, test results might suffer from measurement errors. Suppose the test we use for each pool has a false negative rate of f 0 a 