Smart Pooled sample Testing for COVID-19: A Possible Solution for Sparsity of Test Kits

There is an exponential growth of COVID-19. The adaptation of preventive measures to limit the spread of infection among the people is the best solution to this health issue. The identification of infected cases and their isolation from healthy people is one of the most important preventive measures. In this regard, screening of the samples from a large number of people is needed which requires a lot of reagent kits for the detection of SARS-CoV-2. The use of smart pooled sample testing with the help of algorithms may be a quite useful strategy in the current prevailing scenario of the COVID-19 pandemic. With the help of this strategy, the optimum number of samples to be pooled for a single test may be determined based on the total positivity rate of the particular community.


Introduction:
Over the past few months, COVID-19 has attacked more than two million people around the globe and it has caused more than one hundred and fifty thousand deaths.
The rapid spread of COVID-19 occurred due to rapid transmission from infected to healthy persons. The causative agent of Coronavirus disease (COVID-19) has been named as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Person to person transmission of SARS-CoV-2 as this pathogen is present in the respiratory droplets of the infected person and respiratory droplets containing viruses may spread the virus to healthy persons [1][2]. Some persons infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) may have a mild disease or may be asymptomatic [3][4]. The apparently healthy people with COVID-19 can be a source of the spread of SARS-CoV-2 to the community [5]. The detection of such cases in the community has got vital importance in the prevention of viral spread. The mass level screening of the community for the identification of infected cases is a very important step to control the damage to the health of the community.
For the detection of COVID-19, the PCR test is performed on the specimens acquired from the nasal swab of asymptomatic person and bronchial secretion of severely ill patients. This test is not routinely performed in all laboratories and it required special instruments and specific expertise along with specific kits for the detection of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in patients of COVID-19.
There is less availability of kits and diagnostic facilities for the vast majority of the people around the world. The smart utilization of resources may be of vital importance in controlling the pandemic. In this regard, the possibility of pool sample testing may be explored which may be helpful in reducing the consumption of less . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) available kits of this pathogen and it may also reduce the workload over the overburdened laboratory staff.
Data on COVID-19 testing from various countries revealed that in the mass testing for the screening purpose, the majority of the test is negative. The pooling of the negative sample will reduce kit consumption without affecting the outcome. The possibility of mixing the positive specimen with the negative may result and it would affect the true results. The one possibility is that the positive result may become negative due to the dilution with the negative specimen. The PCR can detect the small number of pathogens and the specimen from positive patient usually contains a large number of viral particles, so the possibility of the positive specimen is to become negative by mixing with the negative specimen is very remote. The studies revealed that there is no significant effect of the pooling of negative specimens with the positive specimen in terms of the detection of the positive test by PCR methods [6]. Another study showed that the positive specimen may be detected even when it is diluted with up to 64 specimens [7].
The other limitation is that the mixing of one positive with a negative specimen will reveal positive results, and for the identification of the true positive patient, the tests will have to be repeated on all the specimens and it may consume more kits. To address these issues, the application of algorithms after gathering the data of previous test results from that country may be quite helpful in reducing the possible consumption of test kits.
The aim and objective of the present article are to find out the number of samples to be pooled in a particular community for the reduction of utilization of kits.
. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 27, 2020. . https://doi.org/10.1101/2020.04.21.20044594 doi: medRxiv preprint

Materials & Methods:
In this study, we work on a research question that if we group people and perform the test for COVID-19 on a group, can we achieve any optimization in the utilization of recourse (in our case are test kits) required to perform tests.
To answer that question, we develop a hypothesis that if we made a group with the number of people that is some power of 2, and apply divide and conquer strategy, a significant optimization can be achieved. Following are the steps to perform divide and conquer strategy.

Steps:
1. Get the group size n 2. Perform a test on the whole group. 3. If the result of the whole group is negative, then infer that everyone in the group has a negative result. 4. If the result of the group is positive, it means that one or more than one member of the group has a positive result. 5. Divide the group into two equal subgroups. As n should be of the power of 2, hence dividing the group into two equal subgroups is possible. 6. Go to step 2 for both groups in parallel.

Let take an example, suppose we made a group of 8 (2 3 ) Figure 1: visual representation of divide and conquer strategy
As the tests for COVID-19 are very critical and we need 100% surety that no one infected by it, can skip because of basic assumptions of divide and conquer strategy.
Hence the standard calculation of divide and conquers can't be used. Therefore, we derived the formulae to give us approximate maximum tests needed to perform to be 100% sure everyone is tested in best, average and worse case scenarios.
. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 27, 2020. . https://doi.org/10.1101/2020.04.21.20044594 doi: medRxiv preprint Let n is the total number of people in one group.

Best case scenario:
The best-case scenario is the scenario in which no one in the group have COVID-19.
Which means that the whole group's result will come negative. In this scenario, we will need no further tests.
In the best-case scenario, total tests need to be done are ࣮ ൌ 1

Worst Case Scenario:
In this scenario, 40 % or more people of the group are infected by SARS-CoV-2, and the samples of these people are uniformly distributed in the group. In the worst-case scenario, total tests need to be done are All the other possible scenarios, which do not fall in best or worse case scenarios, fall into average case scenarios.
In the average-case scenario, total tests need to be done are . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 27, 2020.

Exceptions:
For a group of 2, the following steps should be taken.
• Check the whole group.
• If the combined group result is positive then perform the test on only one sample.
• If his/her result is negative, then there is no need to perform the test on the other sample. Where if his/her result is positive, then it is mandatory to perform the test on the other sample as we are not sure if the other is positive or not.

Results:
As the scenarios are dependent upon the percentage of people who are infected by SARS-CoV-2. Hence the best way to reach a logical conclusion was to find the ratio between the number of tests conducted by any country, and how many people were reported positive. In this study, we are going to refer this term as Test to Positive Ration (TPR). For a better understanding, analyzing and to find the most optimal group set, we developed a simulator. The job of the simulator is to generate a random population with given TPR as input, and then using different group sizes perform the divide and conquer strategy, and give results.
. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

All the analysis is done using the concept that 1 test kit can do a test of one patient. The Simulator takes the following inputs:
Population size We used 10,000 TPR we used the TPR of the countries given in . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 27, 2020.   The simulation results show that if the TPR is directly linked with the number of tests needed to test a population of 10,000. It TPR is low, required tests will be low and if TPR is high then required tests will be high.
To get a better understanding of the TPR, the number of tests needed and which group size will be optimal for which TPR. We perform the number of tests on 100 different populations of size 10,000 using different TPRs.
Graph 2: Showing how many tests in percentage, one can save to tests a population of 10,000 . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 27, 2020.  is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 27, 2020. . https://doi.org/10.1101/2020.04.21.20044594 doi: medRxiv preprint from the upper respiratory tract (swabs from nasopharynx, oropharynx or aspirates from nasopharynx) or the lower respiratory tract (sputum, endotracheal aspirate, and bronchoalveolar lavage). The highest positive rate is seen in bronchoalveolar lavage specimens [8].
With the spread of COVID -19, There will be an increase in the requirement of kits for the detection of SARS-CoV-2. As the availability of test kits for SARS-CoV-2 are limited, the possibility of pool sample testing may be adapted to overcome the shortage of kit reagents. In the pool testing, the multiple specimens from different suspected cases are combined to perform them in a single test process, and this method is helpful in the improvement of a screening program for the community screening [9][10]. The important thing is pool sample testing is the number of tests to be included in the sample pool. If this number is not optimum, then there may be more utilization of test reagents and it would enhance the wastage of resources and efforts. The determination of an optimum number of specimens for the pooled sample testing is determined by the application of algorithms as discussed in the results. This application determines the optimum number of tests to be pooled for a single test and it will depend upon the total positive rate of COVID-19 cases in that particular community.
If the TPR is below 30, a significant optimization can be achieved resulting in performing fewer tests for every 10,0000 population. Results show that if the TPR is below or close to 10, higher group size is more beneficial. Similarly, a group size of 2 would be appropriate if TPR is 20 to 35, but in case TPR is more than 35 then the use of this grouping method would not be feasible, rather as it will add extra burden and cost.

Conclusion:
The application of algorithms to determine the appropriate number of specimens to be pooled for a single test would be a very cost-effective solution for the screening of community for the diagnosis of COVID-19.
. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 27, 2020. .