SARS-CoV-2 Distribution in Residential Housing Suggests Contact Deposition and Correlates with Rothia sp.

ABSTRACT Monitoring severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) on surfaces is emerging as an important tool for identifying past exposure to individuals shedding viral RNA. Our past work demonstrated that SARS-CoV-2 reverse transcription-quantitative PCR (RT-qPCR) signals from surfaces can identify when infected individuals have touched surfaces and when they have been present in hospital rooms or schools. However, the sensitivity and specificity of surface sampling as a method for detecting the presence of a SARS-CoV-2 positive individual, as well as guidance about where to sample, have not been established. To address these questions and to test whether our past observations linking SARS-CoV-2 abundance to Rothia sp. in hospitals also hold in a residential setting, we performed a detailed spatial sampling of three isolation housing units, assessing each sample for SARS-CoV-2 abundance by RT-qPCR and linking the results both to 16S rRNA gene amplicon sequences (to assess the bacterial community at each location) and to the Cq value of the contemporaneous clinical test. Our results showed that the highest SARS-CoV-2 load in this setting is on touched surfaces, such as light switches and faucets, but a detectable signal was present on many untouched surfaces (e.g., floors) that may be more relevant in settings such as schools, where mask-wearing is enforced. As in past studies, the bacterial community predicts which samples are positive for SARS-CoV-2, with Rothia sp. showing a positive association.

IMPORTANCE Surface sampling for detecting SARS-CoV-2, the virus that causes coronavirus disease 2019 (COVID-19), is increasingly being used to locate infected individuals. We tested which indoor surfaces had high versus low viral loads by collecting 381 samples from three residential units where infected individuals resided, and interpreted the results in terms of whether SARS-CoV-2 was likely transmitted directly (e.g., touching a light switch) or indirectly (e.g., by droplets or aerosols settling). We found the highest loads where the subject touched the surface directly, although enough virus was detected on indirectly contacted surfaces to make such locations useful for sampling (e.g., in schools, where students did not touch the light switches and also wore masks such that they had no opportunity to touch their face and then the object). We also documented links between the bacteria present in a sample and the SARS-CoV-2 virus, consistent with earlier studies.


Preparing Revision Guidelines
To submit your modified manuscript, log onto the eJP submission site at https://msystems.msubmit.net/cgi-bin/main.plex. Go to Author Tasks and click the appropriate manuscript title to begin the revision process. The information that you entered when you first submitted the paper will be displayed. Please update the information as necessary. Here are a few examples of required updates that authors must address:
• Point-by-point responses to the issues raised by the reviewers in a file named "Response to Reviewers," NOT IN YOUR COVER LETTER.
• Upload a compare copy of the manuscript (without figures) as a "Marked-Up Manuscript" file.
• Each figure must be uploaded as a separate file, and any multipanel figures must be assembled into one file.
• Manuscript: A .DOC version of the revised manuscript
• Figures: Editable, high-resolution, individual figure files are required at revision; TIFF or EPS files are preferred
ASM policy requires that data be available to the public upon online posting of the article, so please verify all links to sequence records, if present, and make sure that each number retrieves the full record of the data. If a new accession number is not linked or a link is broken, provide production staff with the correct URL for the record. If the accession numbers for new data are not publicly accessible before the expected online posting of the article, publication of your article may be delayed; please contact the ASM production staff immediately with the expected release date.
For complete guidelines on revision requirements, please see the journal Submission and Review Process requirements at https://journals.asm.org/journal/mSystems/submission-review-process. Submission of a paper that does not conform to mSystems guidelines will delay acceptance of your manuscript.

Summary/Overview:
This paper by Cantú et al. advances our understanding of the best surfaces for surveillance swabbing for SARS-CoV-2 in the built environment and is very timely. The data on surfaces was compelling and would improve surveillance swabbing. However, I am very skeptical of the presented results using total raw 16S read counts from each sample. Additionally, I have some concerns on the normalization methods used for the Differential Abundance and Random Forest Classification. I expanded on these points in the major revisions section. My other comments are all minor. Overall, I enjoyed reading this paper and found it very interesting.

Major Revisions:
-Line 128-129: Why were total read counts used as a proxy for biomass? Due to the compositional nature of sequencing, total 16S reads are not a good proxy for total biomass; and I highly recommend the authors either remove these results or consider replacing it (for instance, with 16S qPCR data). I recognize that adding qPCR data would add significant time to the turnaround of this manuscript; and do think removing Fig S3A and B would be a reasonable alternative.
-SI Line 443: Was any normalization on the unrarefied feature table used for the differential abundance analyses? I do not trust comparisons between samples that have not been normalized in some fashion.
-SI Line 453: What data was used to build the machine learning model? Were rarefied read counts or another normalization used? I would appreciate additional details to understand what data went into this model and to fully assess its validity.
Minor Revisions:
-Line 106: I think this should refer to "Table S1" not "Table 1".
-Line 107-108 (and 132-134): There was a large difference in the number of positive samples in Apartments A and C compared to B. However, this wasn't explored in the manuscript. It would strengthen the manuscript to discuss here (or later in the discussion), if there were any specific reasons that might explain this.
-Line 116: I really like the maps to visualize the sampling around each apartment. However, I think it would improve the results to add another visualization to illustrate the sentence here. Additionally, it would be useful to contextualize this with other papers on surfaces in the indoor environment in the discussion.
-Line 123-124: The use of "features" at the start of line 124 is unclear to me (suggests that it was rarefied based on taxa or something else). Should this be "rarefied to 4000 sequences" or something similar instead?
-Line 143: not sure about this method
-Line 167-169: It would be useful to cite if Corynebacterium is a common skin bacterium.
-Line 167: Corynebacterium should be italicized here.
- Figure S2: It would be more clear if in the caption the authors defined what the "+" and "-" signs stand for (I assumed that it is for SARS-CoV-2 positive or not).

We believe the manuscript has been significantly strengthened due to these changes. Below is a point-by-point response:

Summary/Overview:
This paper by Cantú et al. advances our understanding of the best surfaces for surveillance swabbing for SARS-CoV-2 in the built environment and is very timely. The data on surfaces was compelling and would improve surveillance swabbing. However, I am very skeptical of the presented results using total raw 16S read counts from each sample. Additionally, I have some concerns on the normalization methods used for the Differential Abundance and Random Forest Classification. I expanded on these points in the major revisions section. My other comments are all minor. Overall, I enjoyed reading this paper and found it very interesting.
We thank the reviewer for their time and for carefully reviewing our manuscript. We have taken their constructive comments and edited our manuscript appropriately.
Major Revisions:
-Line 128-129: Why were total read counts used as a proxy for biomass? Due to the compositional nature of sequencing, total 16S reads are not a good proxy for total biomass; and I highly recommend the authors either remove these results or consider replacing it (for instance, with 16S qPCR data). I recognize that adding qPCR data would add significant time to the turnaround of this manuscript; and do think removing Fig S3A and B would be a reasonable alternative.
We decided to remove the total read count observations (Sup. Fig. S3A-B) as recommended, and instead focused our alpha diversity analysis on differences in Faith's phylogenetic diversity between different sample groupings (Lines 136-138; revised Sup. Fig. S3).
-SI Line 443: Was any normalization on the unrarefied feature table used for the differential abundance analyses? I do not trust comparisons between samples that have not been normalized in some fashion.
The multinomial regression method applied is appropriate for unrarefied compositional data; it employs a centered log-ratio transformation of the feature space. We included explicit discussion of the centered log-ratio transformation, and relevant references, both in the main text (Lines 149-154) and the supplementary information (SI Lines 482-487).
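As an illustration of the centered log-ratio transformation mentioned in this response, the following is a minimal sketch only, not the multinomial regression method used in the manuscript. The function name `clr`, the `pseudocount` parameter, and the example counts are assumptions introduced here; the response does not state how zero counts are handled.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's feature counts.

    Each value is log(count + pseudocount) minus the mean of those logs,
    i.e. the log of each count relative to the sample's geometric mean.
    The pseudocount is an assumption made here so that zeros can be logged.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical counts for three features in the same sample,
# sequenced at two different depths (the second is 10x deeper).
shallow = [10, 100, 1000]
deep = [100, 1000, 10000]

# With no zeros present, pseudocount=0 shows that sequencing depth cancels out.
clr_shallow = clr(shallow, pseudocount=0)
clr_deep = clr(deep, pseudocount=0)
```

Because every feature is expressed relative to the sample's own geometric mean, the two hypothetical samples yield the same transformed values despite the tenfold difference in depth, which is why a log-ratio approach can be applied to an unrarefied table.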
-SI Line 453: What data was used to build the machine learning model? Were rarefied read counts or another normalization used? I would appreciate additional details to understand what data went into this model and to fully assess its validity.
Random Forest machine learning models were trained on rarefied feature tables (the same feature tables described in Lines 127-134 and used for the microbiome diversity analyses). We have made this explicit in the main text (Line 147) and supplementary information (SI Lines 495-502).
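For readers unfamiliar with rarefaction, the idea of subsampling every sample to a common depth can be sketched as follows. This is an illustrative sketch under stated assumptions, not the pipeline used in the manuscript: the function `rarefy`, the fixed `seed`, and the example counts are hypothetical, and the depth of 4000 simply mirrors the figure quoted in the reviewer's comment on Lines 123-124.

```python
import random

def rarefy(counts, depth, seed=0):
    """Subsample one sample's feature counts to `depth` reads without replacement.

    `counts` maps feature name -> read count. Returns a new dict with the same
    keys, or None when the sample has fewer than `depth` reads (such samples
    are typically dropped from a rarefied table).
    """
    total = sum(counts.values())
    if total < depth:
        return None
    rng = random.Random(seed)
    # Expand the counts into a pool with one entry per read, then draw from it.
    pool = [feature for feature, n in counts.items() for _ in range(n)]
    rarefied = {feature: 0 for feature in counts}
    for feature in rng.sample(pool, depth):
        rarefied[feature] += 1
    return rarefied

# Hypothetical sample with 7000 reads across three features.
sample = {"Rothia": 3000, "Corynebacterium": 2500, "other": 1500}
rarefied = rarefy(sample, depth=4000)
```

After this step every retained sample carries the same number of reads, so a classifier trained on the table sees differences in composition rather than differences in sequencing depth.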
Minor Revisions:
-Line 106: I think this should refer to "Table S1" not "Table 1".
We have changed this, and appreciate the correction.
-Line 107-108 (and 132-134): There was a large difference in the number of positive samples in Apartments A and C compared to B. However, this wasn't explored in the manuscript. It would strengthen the manuscript to discuss here (or later in the discussion), if there were any specific reasons that might explain this.
Unfortunately, we were not able to identify a verifiable explanation for the lower rate of detection for Apartment B, as this was outside of the scope of our experimental design. However, we expanded the discussion concerning the results of Apartment B, highlighting that the detection events in this apartment closely mirrored those seen in the other two apartments (Apartments A and C), and in the literature (Lines 173-176).
-Line 116: I really like the maps to visualize the sampling around each apartment. However, I think it would improve the results to add another visualization to illustrate the sentence here. Additionally, it would be useful to contextualize this with other papers on surfaces in the indoor environment in the discussion.
We included an additional supplementary table (Sup. Table S2) to summarize the observations drawn from the 3D maps related to high-touch vs low-touch surfaces. We also described trends surrounding rates of positivity across these different types of surfaces (high-touch, low-touch) and floors in the main text (Lines 111-115), and contextualized this with references to similar results in the literature (Lines 173-176). We thank the reviewer for the suggestion.
-Line 123-124: The use of "features" at the start of line 124 is unclear to me (suggests that it was rarefied based on taxa or something else). Should this be "rarefied to 4000 sequences" or something similar instead?
We have included a parenthetical description of the "features", with relevant citation (Line 132).
-Line 143: not sure about this method
We have expanded on the description of this method, which is appropriate for compositional data and has been shown to outperform other popular differential abundance methods in microbiome analyses, in the main text (Lines 149-154) and supplementary information (SI Lines 483-484).
-Line 167-169: It would be useful to cite if Corynebacterium is a common skin bacterium.
We have included relevant references that list Corynebacterium as a common human skin microbe (Lines 185-187).
-Line 167: Corynebacterium should be italicized here.
We have corrected this.
- Figure S2: It would be more clear if in the caption the authors defined what the "+" and "-" signs stand for (I assumed that it is for SARS-CoV-2 positive or not).
This is a great suggestion, and we have clarified that "+" = SARS-CoV-2 positive, "-" = SARS-CoV-2 negative.
- Figure S3: For clarity, I would be consistent in the use of Kruskal-Wallis and Mann-Whitney U tests (the figure caption says Mann-Whitney U, while the Alpha Diversity section of the SI uses Kruskal-Wallis).
We appreciate the correction, and have clarified in the Supplementary Information (SI Lines 467-468). Faith's phylogenetic diversity comparisons across different sample groupings were done with Mann-Whitney U tests.
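To make the two-group comparison concrete, here is a minimal sketch of the Mann-Whitney U statistic itself. The function `mann_whitney_u` and the diversity values are hypothetical and are not data from the manuscript; the sketch computes only the statistic, and a p-value would normally come from a statistics library such as SciPy's `scipy.stats.mannwhitneyu`.

```python
def mann_whitney_u(x, y):
    """U statistic for group x versus group y.

    Counts the (x_i, y_j) pairs in which x_i exceeds y_j; a tied pair
    contributes 0.5. U ranges from 0 to len(x) * len(y).
    """
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Hypothetical Faith's phylogenetic diversity values for two sample groupings.
positive_pd = [12.1, 9.8, 14.3, 11.0]
negative_pd = [8.2, 9.8, 7.5]

u_positive = mann_whitney_u(positive_pd, negative_pd)
u_negative = mann_whitney_u(negative_pd, positive_pd)
```

The two statistics always sum to the number of cross-group pairs, so a value of `u_positive` close to that maximum indicates that the first grouping tends to have higher diversity. The test compares exactly two groups, whereas Kruskal-Wallis generalizes the same rank-based idea to three or more.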
-SI Line 471: I found the Phylogenetic Tree visualization section unclear. Does the phylogenetic tree only show the top 32 important features? My understanding from the figure caption was that it included more than these, but just highlighted the important features in the inner and outer ring. It would be useful to expand this section to more clearly describe what was plotted and how the phylogenetic tree was created.