Bacterial microbiome and host inflammatory gene expression in foreskin tissue

The penile epithelial microbiome remains underexplored. We sequenced human RNA and a segment of the bacterial 16S rRNA gene from the foreskin tissue of 144 adolescents from South Africa and Uganda collected during penile circumcision after receipt of 1–2 doses of placebo, emtricitabine + tenofovir disoproxil fumarate, or emtricitabine + tenofovir alafenamide to investigate the microbiome of foreskin tissue and its potential changes with antiretroviral use. We identified a large number of anaerobic species, including Corynebacterium acnes, which was detected more frequently in participants from South Africa than Uganda. Bacterial populations did not differ by treatment received, and no differentially abundant taxa were identified between placebo versus active drug recipients. The relative abundance of specific bacterial taxa was negatively correlated with expression of genes downstream of the innate immune response to bacteria and regulation of inflammation. Our results show no difference in the tissue microbiome of the foreskin with short-course antiretroviral use but that bacterial taxa were largely inversely correlated with inflammatory gene expression, consistent with commensal colonization.


Introduction
The penile foreskin and its microbiome are important determinants of genital health, but the scientific literature is scant on the * Corresponding author.Center for Global Infectious Disease Research, Seattle Children's Research Institute, 307 Westlake Ave N, Suite 500, Seattle, WA 98109, USA.
typical bacterial constituents and their relationship to tissue inflammation.Previous work using swabs of the coronal sulcus or urethra has shown a predominance of anaerobic species in the microbiota [1][2][3][4][5][6][7][8][9][10][11] and reported associations between species such as Prevotella and increased mucosal inflammation and risk of Human Immunodeficiency Virus (HIV) acquisition [4,12].Following circumcision, the surface microbiota shifts to be dominated by more aerobic species as found on other skin surfaces, which are associated with decreased inflammation and reduced risk of HIV acquisition [6,13,14].
In addition to condom use [15,16], pre-exposure prophylaxis (PrEP, taking antiretroviral medications to prevent HIV infection) [17] is also effective at reducing HIV incidence in men, however the contribution of the penile microbiome to its mechanism has not been fully explored.The most commonly prescribed antiviral regimen for PrEP globally is an oral combination of two nucleoside reverse transcriptase inhibitors: emtricitabine (FTC), a cytidine analog, with one of two forms of tenofovir (TFV), an adenosine analog: tenofovir disoproxil fumarate (TDF) or tenofovir alafenamide (TAF) [18].Limited data also show a complex relationship of these ARVs with genital bacteria.When applied topically to the vagina, L. crispatus was shown to endocytose TFV then either actively metabolize or release it back into the environment [19].Similarly, Gardnerella vaginalis and other anaerobes have been shown to metabolize TFV [20] or block its entry into cells by secretion of adenine [19].Antiretrovirals may also theoretically alter bacteriophage populations which can dramatically reshape the bacterial component of the microbiome which they infect [21].At the rectal mucosa, small studies have investigated the effects of oral FTC with TDF on the bacterial microbiome and innate inflammatory pathways in men who have sex with men (MSM) and transwomen [22][23][24][25] but with varying results.
Within the CHAPS clinical trial, young men from Uganda and South Africa were randomized to 1 to 2 doses of placebo, FTC with TDF, or FTC with TAF, prior to medical penile circumcision.Foreskin tissue was collected and subjected to both 16S rRNA sequencing and RNASeq to characterize the bacterial microbiome and inflammatory gene expression.Additional studies in this cohort have shown differences in mitochondrial gene expression [26] but no differences in inflammatory gene expression [26] or lymphoid/myeloid cell density in the foreskin [27] between treatment arms.We hypothesized that there would be a relationship between the microbiota and inflammatory gene expression, but that short courses of PrEP, as utilized in a dose-finding trial, would not result in significant disruptions of either.

Study participant details
The Combined HIV Adolescent PrEP and Prevention Study (CHAPS) was a randomized controlled trial that enrolled 144 men living without HIV aged 13-24 years between 2019 and 2021 from the Chris Hani Baragwanath Academic Hospital in Soweto, South Africa (n = 72) and the Entebbe Regional Referral Hospital in Entebbe, Uganda (n = 72).Inclusion criteria were male sex at birth, hemoglobin >9 g/dL, weight >35 kg, two successive negative rapid HIV antibody tests, and clinical eligibility for surgical circumcision [28].Exclusion criteria were conditions precluding circumcision or receipt of the study medications.Participants were randomized to placebo versus FTC with either TDF or TAF for 1-2 days prior to surgical penile circumcision to investigate ARV dosing for on-demand PrEP.

Study procedures and specimen collection
All participants underwent a physical exam at study entry and completed survey instruments including sexual history at the randomization visit.At the circumcision visit, they provided midstream urine for Chlamydia trachomatis (CT) and Neisseria gonorrhea (GC) testing via nucleic acid amplification testing (NAAT) prior to surgery.If an asymptomatic sexually transmitted infection was diagnosed, antibiotic treatment was prescribed at the post-operative visit.
Penile circumcision was performed using the dorsal slit method and the removed prepuce was placed immediately in cold Dulbecco's Modified Eagle Medium and shipped on ice within 1 h (median 30 min) to the local laboratories in Uganda (Medical Research Council/Uganda Virus Research Institute) and South Africa (Perinatal HIV Research Unit in Johannesburg) for processing.Smaller, 5-7 mm 2 -sized sections were stored dry at − 80 • C until the samples were transported on dry ice to the Seattle Children's Research Institute, U.S.A for microbiome studies and to the Karolinska Institutet, Sweden for transcriptome analyses.

16S rRNA analysis
At the time of analysis, vials were thawed and approximately 25 mg of tissue was dissected and processed by a customized Qiagen PowerSoil Pro protocol [29] for extraction of DNA using a QIAcube instrument.A negative extraction control consisting of solution CD1 without specimen was also included.The specimens from each collection site were extracted on single plates.The resulting total DNA was diluted 1:4 to reduce PCR inhibition.
The 16S rRNA gene V3-V4 region was amplified using 357F/806R universal primers for 20 cycles of PCR as previously described [30] for each specimen along with a negative PCR control reaction consisting of reaction mix without DNA template for each replicate and evenly and staggered genomic DNA from mock bacterial libraries (BEI Resources) as positive sequencing controls.The amplified products were purified using Agencourt AMPure XP beads (Beckman Coulter) and submitted to an additional 10 rounds of PCR with indexing primers (Illumina).The resulting libraries were pooled by volume with specimens at 100× the positive controls.The resulting library comprising all participants was purified using a MinElute PCR purification column (Qiagen), followed by the QiaQuick gel extraction kit (Qiagen).The cleaned library was quantitated using qPCR (NEBNext Library Quant Kit for Illumina), then pooled with B.S. Maust et al.PhiX, denatured, and loaded onto a MiSeq instrument (Illumina) with a v3 2 × 300 flow cell following the manufacturer's protocol.
Sequences were de-multiplexed using Illumina's BaseSpace workflow.Primers and adapters were removed by cutadapt 2.7 [31].Sequences were further trimmed for quality, then filtered and merged using dada2 1.22.0[32] to generate amplicon sequence variants (ASVs).Taxa were annotated using the Silva 138.1 database [33] with additional genital-associated species [34] using a 100 % nucleotide identity threshold.The phyloseq 1.40.0 [35] and vegan 2.6-2 [36] R packages were used to create ASV tables and calculate diversity measures.ASV sequences were aligned using ssu-align 0.1.1 [37], and a maximum likelihood phylogeny was generated using PhyML 3.3.20220408[38] with a GTR substitution model.Contaminating sequences were identified by their presence in negative controls for the extraction and PCR amplification or mock community using decontam 1.16.0 [39] and microfiltR [40].After decontamination, specimens with fewer than 25-fold as many reads than extraction and PCR controls were excluded.For differential abundance analysis, decontaminated ASVs were filtered with prevalence≥10 % and relative abundance threshold of 1 × 10 − 4 before combining counts for all ASVs classified as the same species.ALDEx2 1.28.1 [41], ANCOM-BC 1.6.2[42], and DESeq2 1.36.0 [43] (using the poscounts factors estimation) were used for differential abundance testing to overcome the documented limited power and accuracy of these tools when used individually on 16S data sets which contain a high proportion of zero counts [44,45].

RNAseq of foreskin tissue
Details of the RNASeq protocol have been published previously [26].Briefly, foreskin samples were disrupted and homogenized using a Tissuelyzer (Qiagen) and total RNA isolated using the RNeasy Kit (Qiagen) according to manufacturer's instructions.RNA was subjected to quality control with Agilent Bioanalyzer (Agilent).To construct libraries suitable for Illumina sequencing, the Illumina stranded mRNA prep ligation sample preparation protocol was used with starting concentration of 200 ng total RNA.The protocol includes mRNA isolation, cDNA synthesis, ligation of adapters and amplification of indexed libraries.The yield and quality of the amplified libraries were analyzed using Qubit by (Thermo Fisher) and the Agilent Tapestation (Agilent).The indexed cDNA libraries were normalized and combined, and the pools were sequenced on the Illumina Novaseq 6000 S4 flowcell to generate 150 bp paired-end reads.
The Gene Ontology (GO) term "inflammatory response" (GO:0006954) selected 860 putative inflammatory genes which were filtered to only those with at least two copies detected in at least 90 % of specimens.RNA read counts were normalized then transformed by centered log ratio (CLR).

Integration of microbiota and host gene expression data
The 16S ASVs were filtered and combined as described for the differential abundance analysis and also CLR-transformed.As the previous analysis showed minimal differences between in expression between the control arms from the two clinical sites [26], we again analyzed specimens from the entire trial as a single data set.We calculated the correlation between the gene counts and bacterial relative abundances, then filtered for r > 0.4 and Benjamani-Hochberg-adjusted p-value <0.05.The resulting genes were manually inspected for their most relevant GO annotation and grouped according to their immunological function and pro-or anti-inflammatory nature (Supplemental Table S1).
Normalized gene counts were used to perform random forest feature selection as implemented in the Boruta R package 7.0.0[49].Only the importance measures of statistically significant (p < 0.01) features were reported.

Quantification and statistical analyses
All statistical analyses were performed in R version 4. Alpha diversity comparisons were evaluated using the Wilcoxon rank sum test.Beta diversity was compared using Permutational Multivariate of Variance (PERMANOVA) using the adonis2 function of the vegan R package.The relationship between treatment arm and CST was assessed using multinomial logistic regression.RNAseq and 16S taxa correlations were calculated using Pearson coefficient.A significance threshold of α = 0.05 was used for the differential abundance hypothesis testing.

Participant characteristics
The median age of the participants was 19 years (range: 13-24).All participants were assigned male sex at birth.No participants reported STI symptoms, and no physical exams revealed urethral discharge or other genital abnormality.None of the participants had clinical balanoposthitis or evidence of macroscopic inflammation.No GC infections were diagnosed, but NAAT for seven participants was positive for CT: five from Uganda and two from South Africa.

Microbiome 16S sequencing
After filtering and contamination removal, 137 specimens from the 144 enrolled participants had sufficient bacterial DNA reads to B.S. Maust et al. proceed with analysis.The identified bacterial taxa include a variety of skin-associated Gram-positive and genital-associated anaerobic species in addition to Gram-negative enterics (Fig. 1).Corynebacterium was the most prevalent and abundant genus, appearing in 132 (97 %) of specimens at median relative abundance of 34 % (range: 0.14 %-98 %).Anaerobic species were highly abundant, including bacteria that are commonly found in bacterial vaginosis (an inflammatory dysbiosis of the vagina) including Prevotella, Anaerococcus, Finegoldia and Porphyromonas.There were no ASVs in the Chlamydiaceae family which includes C. trachomatis.
Unsupervised partition around medioids clustering separated the unweighted Unifrac distances into two community structure types (CST) with distinct community structure (PERMANOVA R 2 = 0.25 with p < 0.001) and alpha diversity (p = 2.39 × 10 − 19 ) (Fig. 2).Two clusters maximized the silhouette score with acceptable within sum of squares and gap statistics.CST1 was highly diverse with a median Shannon index of 3.05.CST2 was dominated by Corynebacterium tuberculostearicum and Finegoldia magna and, with median relative abundances of 21 % and 9.6 %, respectively.CST2 also had a significantly lower median Shannon index of 1.68 (Wilcoxon unpaired exact, p = 2.39 × 10 − 19 ).Though F. magna and C. tuberculostearicum were also abundant in CST1 (median relative abundances of 4.6 % and 2.5 %), they demonstrated similar relative abundance with Anaerococcus, Campylobacter, Fenollaria, Finegoldia, Ezakiella, Mobiluncus, and Peptinophilis species without a clear dominant taxon.

Microbiome differences by study site
The 137 participants with 16S data included 69 (50.3 %) individuals from South Africa and 68 (49.7 %) from Uganda.We compared microbiota between the two study sites and found no differences in within-participant alpha diversity (Shannon entropy) or between-participant beta diversity (unweighted Unifrac distance) (Fig. 3A and B).Differential abundance testing identified Cutibacterium acnes as significantly higher in participants from South Africa compared to those from Uganda by ALDEx2 with a CLR difference of 3.3 between sites (Wilcoxon rank test with Benjamani-Hochberg correction p = 0.0184).ANCOM-BC identified the same ASV with a significant q-value (0.0079), but the log 2 fold change of 1.87 did not meet the recommended effect size threshold.DESeq2 did not identify any taxa as significantly differentially abundant.(Fig. 3, C-E).

Microbiome differences by parent study treatment arm
We performed similar analyses for the bacterial populations in participants who received active drug versus placebo.The participants with 16S data were equally distributed among treatment arms with FTC-TAF and FTC-TDF (n = 59 and 62, respectively, χ 2 p = 0.437).All 16 participants who received placebo had sufficient sequences for analysis.We found no differences in alpha or beta diversity (Fig. 4A and B) between placebo and FTC-TAF or FTC-TDF regimens.Combining both treatment groups and comparing to placebo, no species were significantly differentially abundant by any of the three tools (Fig. 4, C-E).An un-annotated Dialister species was identified with statistically significantly higher abundance in participants who received active drug by ANCOM-BC but did not  meet the effect size threshold at only 1.9-fold more abundant.The treatment arm was not a significant predictor of CST (p = 0.32 for placebo, p = 0.99 for FTC/TAF, p = 0.88 for FTC/TDF).

Inflammatory gene expression in foreskin tissue and bacterial taxa
Expression of 40 inflammatory genes showed significant correlation with CLR-transformed relative abundance of 31 bacterial species (Table 1).Six genes had unknown inflammatory function and were therefore excluded.The remaining 34 genes were primarily pro-inflammatory with negative correlation with bacterial species abundance (Fig 5) without difference by usual environmental niche.IL-15 was the gene with the most correlations to microbial taxa, with significant negative correlations to seven bacterial taxa not   typically associated with invasive infection: Brevibacterium luteolum, Corynebacterium urealyticum, Dietzia timorensis, and unannotated species in the Cutibacterium, Corynebacterium, Dietzia, and Nosocomiicoccus genera.Most of the bacterial taxa significantly correlated with other genes were gram-positive organisms which frequently colonize the skin.The CLR-transformed relative abundance of Corynebacterium massilliense, in particular, was associated with significantly lower expression of genes involved in regulation of inflammatory responses and neutrophil chemotaxis and activation.Anaerobes also found in the oral cavity such as Parvimonas and Porphyromonas were also correlated with primarily lower expression of inflammatory genes.However, the oral anaerobe Rothia amarae was associated with higher expression of the regulatory factor GHSR and an unclassified Rothia species was associated with higher expression of the pro-inflammatory gene REG3G.Species in three genera canonically associated with Bacterial Vaginosis (BV), Atopobium, Prevotella, and Sneathia, were correlated with lower expression of inflammatory genes, but one unannotated Prevotella ASV was negatively correlated with ZFP36, an immune regulatory gene.
We performed a similar analysis, grouping bacterial ASVs at the genus level (Supplemental Table S2).Eight genes showed correlation with nine bacterial genera.As expected, all the correlated genes and bacterial genera were also identified in the species-level analysis.
We conducted a separate query of associations between inflammatory genes and the CSTs described in

Discussion
To our knowledge, this study is the first to analyze the tissue-level microbiome of the foreskin.Consistent with previous reports using penile swabs [1][2][3][4][5][6][7][8][9][10][11][12][13][14]50], the identified bacteria are predominated by taxa commonly colonizing the skin (chiefly Corynebacteria spp) [51,52] in addition to anaerobic bacteria such as Prevotella, Dialister, Murdochiella, Peptoniphilus, and Negativicoccus.These anaerobic species have been associated with increased inflammation and HIV acquisition in uncircumcised men [4] and with bacterial vaginosis in women [53].In the foreskin, we describe two major CSTs, one significantly more diverse than the other.We did not identify bacterial species that were differentially abundant between participants receiving placebo compared to emtricitabine with either of the two forms of tenofovir.While vaginal-associated species such as Lactobacillus and Gardnerella take up tenofovir from their  environment [19,20], further studies of more prolonged ARV use may better elucidate whether there is an interaction between ARVs and the bacterial or viral communities of the penis.
The overall composition of the foreskin bacterial community did not appear to differ by study site; however, we did find Cutibacterium acnes to be significantly more abundant in South African than Ugandan participants.C. acnes is typically resident in the deep dermis in association with sebaceous glands and hair follicles [54], which our study sampled by digesting full-thickness specimens rather than resuspending skin swabs.While its contribution to its namesake acne vulgaris is debated, it is otherwise non-pathogenic in immunocompetent hosts without artificial material [54].It is frequently identified in surgical cultures, with unclear significance to sterility, likely due to its resistance to surgical sterilization techniques and transection of deep dermal structures during surgery [55].The differential abundance we observed may have been caused by different surgical preparation techniques at the two sites or by an actual difference in bacterial populations.As there were no surgical complications observed during the study, the difference does not appear to have clinical significance.
Our inflammatory gene analysis primarily identified an inverse relationship between expression of genes associated with response to bacteria and skin commensals such as Corynebacterium and Cutibacterium, consistent with non-inflammatory commensal colonization.IL-15, the most commonly correlated gene, is a pleotropic cytokine secreted by a narrow range of cell types found in foreskin tissue including epithelial cells, fibroblasts, Langerhans cells, and macrophages [56].It has broad immunostimulatory function, promoting NK cell differentiation and survival [57], inflammatory cytokine production by macrophages [58] and dendritic cells [59], neutrophil activation, survival, and phagocytosis [60], germinal center B cell proliferation [61], CD8 + T cell survival [62], and it is required for development of skin-resident memory CD8 + T cells [63].Its lower expression correlating with higher relative abundances of commensal bacteria is consistent with dampened inflammatory signaling corresponding to prolonged exposure to commensal

bacteria.
When examining gene expression between the participant groups with high versus low diversity bacterial communities, we did not confirm previous findings in penile swabs [4,6,12] and the vagina [64] that highly diverse, predominantly anaerobic communities are associated with increased inflammation.While we frequently identified anaerobic taxa, individual species found previously to be inflammatory were present across fewer specimens than in studies using surface swabs, possibly reducing our power to confirm prior associations.Another explanation could be methodological: rather than measuring secreted cytokines, we processed the entire tissue specimen for bulk RNAseq, which likely included many cells not directly interacting with bacteria or immune cells.Alternatively, the pre-procedure sterilization or the surgery itself may have preferentially removed inflammatory species.The sterilization or collection could also have altered host-bacterial interactions, though the rapid timeframe makes this explanation seem less likely.Lastly, the anaerobic strains in our cohort may employ different metabolic or virulence strategies that render them less inflammatory than those in other cohorts, however thus far no studies have investigated strain level differences in penile bacteria.
In the vagina, high diversity communities are inflammatory [64] and associated with increased risk of adverse outcomes including HIV acquisition [65,66] and preterm birth [67].The foreskin microbiome had two obvious CSTs, one highly diverse CST1 and a less diverse CST2.We hypothesized that the more diverse CST1 would be associated with higher levels of genes related to inflammation, as in the vagina.However, when comparing gene expression between CST1 versus CST2 specimens, four genes were differentially expressed not including canonical innate antibacterial cytokine responses such as TNF-ɑ and IL-6.While named for its key activity in T-Cell Receptor (TCR) signaling, NFAT family members play a broad role in cell differentiation in the immune system and beyond [68], including in B cells [69], Toll-like receptor (TLR) signaling in monocytes [70], and proliferation in perivascular tissue [71] and keratinocytes [72].Higher expression of NFATC3 in CST1 specimens may represent increased proliferation or signaling from these cells.SELENOS is up-regulated by cytokines such as IL-1β and TNF-α via NF-κB and in turns acts to suppress cytokine secretion in macrophages [73].Its higher expression in CST1 samples is consistent with increased suppression of chronic inflammation.STAP1 has the best evidence for signaling downstream of the B-Cell Receptor (BCR) [74], and lower expression could represent a response to sustained signaling (although this was not supported by the expected changes in other genes).NLRP6 is both an inducer and a component of the inflammasome, binds directly to the lipopolysaccharide of gram negative or lipoteichoic acid of gram positive bacterial cell membranes, and plays both a pro-and anti-inflammatory role in tissues such as liver, kidney, and intestine [75].Given these contrasting roles, its lower expression is difficult to interpret with the available data.None of the identified genes plays a direct role in regulating bacterial sensing or inflammatory response.
In summary, we report the first tissue-level examination of the bacterial microbiome of the foreskin, with no effect of brief antiretroviral drug exposure.Correlating the abundance of bacterial taxa with RNAseq data revealed largely negative correlations with genes involved in the inflammatory response, consistent with maintenance of immune tolerance of these commensal organisms.

Fig. 1 .
Fig. 1.Top 25 genera.The top 25 most abundant bacterial genera across participants by study treatment received (anaerobic taxa in earth tones, aerobes in purples, uncultured or unidentified species in grays).
B.S.Maust et al.
Fig 2A distinguished by high and low diversity bacterial populations.In the random forest classification, four genes showed statistically significant associations.Nuclear Factor of Activated T cells 3 (NFATC3) and Selenoprotein S (SELENOS) showed higher expression in the highly diverse CST1 relative to CST2, while Signal Transducing Adapter Family Member 1 (STAP1) and Nod-like Receptor Pyrin domain-containing 6 (NLRP6) showed lower expression (Fig 6).
B.S.Maust et al.

Table 1
Bacterial taxa and human gene correlations.