A randomised clinical study to determine the effect of a toothpaste containing enzymes and proteins on plaque oral microbiome ecology

The numerous species that make up the oral microbiome are now understood to play a key role in establishment and maintenance of oral health. The ability to taxonomically identify community members at the species level is important to elucidating its diversity and association to health and disease. We report the overall ecological effects of using a toothpaste containing enzymes and proteins compared to a control toothpaste on the plaque microbiome. The results reported here demonstrate that a toothpaste containing enzymes and proteins can augment natural salivary defences to promote an overall community shift resulting in an increase in bacteria associated with gum health and a concomitant decrease in those associated with periodontal disease. Statistical analysis shows significant increases in 12 taxa associated with gum health including Neisseria spp. and a significant decrease in 10 taxa associated with periodontal disease including Treponema spp. The results demonstrate that a toothpaste containing enzymes and proteins can significantly shift the ecology of the oral microbiome (at species level) resulting in a community with a stronger association to health.

The human body's resident microbiota is not only essential for life but also plays a critical role in both the protection from, and development of, various diseased states 1 . As described by Kilian et al., humans have co-evolved with microorganisms and have a symbiotic or mutualistic relationship with their resident microbiome which, for the most part, remains homeostatic 2 . However, the bacteria, viruses and fungi that occupy our body sites, be it the scalp 3 , the face 4 , the gut 5 or the oral cavity 6 have also been linked through causation or correlation, to a number of disease states and cosmetic conditions including dandruff 7 , acne 8 , inflammatory bowel disease 9 and periodontitis 10 .
When in equilibrium in the oral cavity, the microbiota forms an ecosystem that is important in the promotion and maintenance of health 11 . The mouth harbours one of the most diverse microbiomes in the body which includes viruses, fungi, protozoa and archaea as well as bacteria 12 . Over 700 different microbial species have been identified to date 13,14 . While host predisposition to oral disease has been shown to be important [15][16][17] , in general, the presence of such a highly diverse microbial community prevents outgrowth of any single species which otherwise might lead to a bacterial load exceeding the pathological threshold 18 . However, the benefits of the oral microbiota go beyond protection from colonisation by exogenous microbes and include immunological priming and down-regulation of excessive pro-inflammatory responses 19 . The symbiotic relationship can break down, for example, through poor oral health regimes, resulting in dysbiosis and plaque-related diseases 20 .
Saliva plays an important role in preventing dysbiosis and maintaining health in the oral cavity 21,22 . Salivary components, particularly antimicrobial factors such as enzymes and proteins, exert significant selective pressures on the microbiota, helping to provide protection against pathogenic organisms 23,24 and helping to shape and control the resident community 25 . Saliva is also important in the formation of the pellicle, the thin acellular organic film that forms on oral surfaces following exposure to saliva 26 . Salivary enzymes and proteins have been shown to be incorporated into the salivary pellicle 27 , to be immobilised in an active form at the surface 28 and to directly influence the pattern of initial microbial colonisation 29 .
One of the main defence mechanisms of saliva is the lactoperoxidase system (LPO system) 28 . This system is activated in part by hydrogen peroxide which oxidises thiocyanate to hypothiocyanite, a process that has been discussed extensively in the literature [30][31][32] . Hydrogen peroxide itself possesses antimicrobial activity 33 and has been shown to play an important role in oral health. Hydrogen peroxide in the oral cavity is produced by both the human host 34 and members of the oral microbiome. Levels of hydrogen peroxide produced by some streptococcal species in the laboratory have been described as being sufficient to inhibit the growth of many plaque bacteria 35 . Hydrogen peroxide also participates in a number of enzymatic reactions in addition to the LPO system, including the production of oxygen by the enzyme catalase 23 . In addition, hypothiocyanite plays a key role in oral health as a natural antimicrobial and has been extensively researched 34,[36][37][38][39][40] . Hypothiocyanite produced by the LPO system has antibacterial effects on both cariogenic bacteria 32 and black pigmented anaerobic bacteria associated with periodontal disease such as Porphyromonas gingivalis 41 .
In addition to the LPO system, other salivary components including lysozyme and lactoferrin are critical to the mouth's natural defences against bacteria. Lysozyme is an antibacterial protein found in a variety of mucosal fluids 18 . Quantitatively, it is the most important salivary component with antibacterial properties, due to its ability to break glycosidic linkages in peptidoglycans 21,42 . The effect is most pronounced against Gram-positive bacteria due to the thick peptidoglycan layer in the cell wall, whereas the peptidoglycan is protected by an outer membrane in Gram-negative bacteria 43 . Lactoferrin has been shown to permeabilise the outer membrane of Gram-negative bacteria making them susceptible to penetration by lysozyme 44 . This action of lactoferrin is in addition to its main mode of action as an iron-binding protein which reduces the concentration of iron available as a co-factor for bacterial enzymes and in turn retards bacterial growth. As well as bacteriostatic properties, lactoferrin is known to have bactericidal properties in its own right resulting from direct interaction between the protein and bacteria 45 . Lactoferrin and lysozyme have been shown to work synergistically in combination 46 and additionally both lactoferrin and lysozyme have been reported to have elevated antimicrobial activity when combined with the LPO system 39,47 . Furthermore lactoferrin has been linked to anti-inflammatory activity against periodontitis 48 .
To boost the role of natural salivary defences in controlling the oral microbial community, oral hygiene products including toothpastes have been developed that contain enzymes and proteins. Zendium ™ contains a three enzyme system (amyloglucosidase, glucose oxidase and lactoperoxidase), designed to promote the generation of hydrogen peroxide and hypothiocyanite, as well as three further protein components (lysozyme, lactoferrin and immunoglobulin IgG), designed to provide additional antimicrobial benefits. Both lysozyme and hydrogen peroxide have been shown to be elevated in saliva after brushing with Zendium ™ compared to a control toothpaste without the enzymes and proteins 49 . This is of significance as it has been reported in the literature that normal physiological levels of hydrogen peroxide may be too low to activate the LPO system 31 and enhancement of the LPO system in vivo may be effective in the regulation of acid producing bacteria 50 .
Studies reporting the effect of toothpastes on the ecology of the oral microbiome have, for the most part, been limited to the use of traditional culture-based methods 51,52 . This has limited our understanding, as a large proportion of the resident microbiota cannot be grown in the laboratory 53 . Despite the rapidly emerging use of molecular techniques, microbial ecology studies reporting changes in the oral microbiome after toothpaste use are currently sparse in the scientific literature 54,55 . With the latest developments in DNA sequencing technology, it is possible to measure community level changes in the oral microbiome, as highlighted by the wealth of recent studies investigating the differences between healthy and diseased states 16,56,57 . These studies have been facilitated by the availability of bespoke, highly curated databases that allow the assessment of human associated microbiomes to species level, e.g. the Human Oral Microbiome Database 58,59 .
Given the complexity of the microbial community it is essential to make an assessment at the species level to explore the contribution of individual species to the overall community function. The objective of this work was to understand the effect of toothpaste use on the ecology of the oral microbiome at the species level, comparing a fluoride toothpaste containing enzymes and proteins with a fluoride toothpaste without enzymes and proteins. Any changes observed provide insights into the benefits of using a toothpaste with enzymes and proteins to boost natural salivary defences, shift oral ecology and provide potential health benefits.

Results
Sequence processing and taxonomic classification. Two hundred and twenty samples were processed and analysed via Illumina sequencing, initially resulting in approximately 37.9 million raw sequence paired reads which, following quality processing, produced 26.9 million overlapping contigs. 14.7 million contigs were successfully classified to genus/species level following use of The Forsyth Institute pipeline resulting in 17 phyla, 183 genera and 1220 species level taxa. Taxa with counts of fewer than 100 reads were aggregated; leaving 414 species level taxa taken forward for statistical analysis. Eight paired samples were removed at this stage due to either generation of no sequence data (4 samples) or fewer than 20,000 reads (4 samples). The remaining 204 samples were processed through the statistical analysis pipeline.
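The two count-based filters described above can be sketched as follows. This is a minimal illustration only: the pooling of low-count taxa into a single "Other" bin, the use of totals across all samples for the 100-read threshold, and the function and variable names are assumptions, and the removal of the paired time point for an excluded sample is not modelled.

```python
from collections import defaultdict

MIN_TAXON_READS = 100      # threshold stated in the text
MIN_SAMPLE_READS = 20_000  # threshold stated in the text


def filter_counts(counts):
    """Filter a {sample_id: {taxon: reads}} table.

    Taxa whose total across all samples is below MIN_TAXON_READS are pooled
    into one "Other" bin (assumed pooling rule). Samples with fewer than
    MIN_SAMPLE_READS reads in total are dropped.
    """
    taxon_totals = defaultdict(int)
    for profile in counts.values():
        for taxon, reads in profile.items():
            taxon_totals[taxon] += reads

    filtered = {}
    for sample, profile in counts.items():
        if sum(profile.values()) < MIN_SAMPLE_READS:
            continue  # sample is too shallow to analyse
        pooled = defaultdict(int)
        for taxon, reads in profile.items():
            key = taxon if taxon_totals[taxon] >= MIN_TAXON_READS else "Other"
            pooled[key] += reads
        filtered[sample] = dict(pooled)
    return filtered
```

Applied to the real table, a step of this kind would reduce the 1220 species level taxa to the 414 taken forward; the exact aggregation rule used in the study is not stated in the text.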
Community changes - Genus level. Analysis was carried out at genus level to determine the genera affected by use of the toothpastes over 14-weeks. Beta diversity was used to examine the differences between sample groups and visualised using ordination plots. The ordination plot of the random forest analysis (Fig. 1) shows the bacterial communities for both toothpastes at the baseline and 14-week time points. ANOVA was used to compare the two toothpaste groups. No significant difference was observed between the bacterial communities at baseline (p = 0.36). The data were assessed for community changes over the 14-week study period and this highlighted a significant shift in the community profile for the test toothpaste users (p = 0.01) but no such shift for control toothpaste users (p = 0.97). A significant difference between the bacterial communities was observed between both toothpaste groups at 14-weeks (p = 0.011).
Scientific Reports | 7:43344 | DOI: 10.1038/srep43344
Community changes -Species level. The outcome of the analysis at the species level was consistent with the genus level results. ANOVA and associated ordination plots of the random forest analysis (Fig. 2) showed no significant difference in communities at baseline (p = 0.23) while significant community shifts were observed for the test toothpaste users over 14-weeks (p = 0.025). No differences were observed for control toothpaste users (p = 1.0). A statistically significant difference was observed between the test and control toothpastes at the 14-week time point (p = 0.003).
Whilst representing data in two dimensions is informative, visualising these data in three dimensions provided a better fit to the spatial distribution. The three dimensional model differentiated the sample groups providing an easy to interpret exploratory visualisation (Fig. 3). MicrobiVis was used to visualise changes in relative abundance of selected species (Fig. 4). Visualisation carried out in this way allowed assessment of the distribution of taxa between individual samples.

Analysis of Species Changes.
To understand in more detail the in-group and between-group differences observed through ordination and ANOVA measures, the mean relative abundance of individual species was compared between toothpastes over time. The 414 species were examined for changes in relative abundance from baseline to 14-weeks, for each toothpaste, using the Dirichlet Multinomial algorithm 60 . The positive false discovery rate was tightly controlled to within 5% using the q-value. The q-value is an adjusted p-value method 61 . This allowed the identification of statistically significant taxa changes with q < 0.05. Significant changes in the abundance of 54 taxa were observed, 37 for the test toothpaste users and 17 for control toothpaste users (Figs 5 and 6). Two taxa were identified as having changed in abundance for both toothpastes; one taxon consisted of members of the mitis group streptococci and the other a Fretibacterium sp. For each toothpaste, significant taxa were assessed in terms of increase or decrease in mean relative abundance expressed as a percentage of the total community. For the test toothpaste group, 18 taxa increased in relative abundance and 19 decreased in relative abundance. For the control toothpaste group, 7 taxa increased in relative abundance and 10 decreased in relative abundance.
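To illustrate how an adjusted p-value controls the false discovery rate across many taxa, the sketch below implements the Benjamini-Hochberg adjustment. This is a simpler stand-in, not the q-value procedure cited in the text: it corresponds to a q-value computed under the conservative assumption that every null hypothesis is true, whereas the cited method also estimates that proportion from the data. The function name is hypothetical.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_taxa(pvalue_by_taxon, threshold=0.05):
    """Return the taxa whose adjusted p-value is below the threshold."""
    taxa = list(pvalue_by_taxon)
    adjusted = bh_adjust([pvalue_by_taxon[t] for t in taxa])
    return [t for t, q in zip(taxa, adjusted) if q < threshold]
```

With 414 taxa tested per toothpaste, an unadjusted threshold of p < 0.05 would be expected to flag around 20 taxa by chance alone, which is why an adjustment of this kind is needed.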
The species with the largest increase in relative abundance in the test toothpaste group after 14-weeks of use was Neisseria flava with a 2.9% change. Three other Neisseria species also increased in relative abundance. No significant changes were observed in Neisseria species for the control toothpaste where Fusobacterium nucleatum ss polymorphum showed the largest increase at 0.8%. The species with the largest decrease in relative abundance in the test toothpaste group after 14-weeks was Rothia dentocariosa with a 3.2% change. Four taxa from the genus Treponema were also found to decrease in relative abundance for the test toothpaste. In the control group, the largest decrease in relative abundance was also attributed to the Rothia genus but in this instance, Rothia aeria was identified, with a decrease of 0.9%.

Discussion
In this paper we report for the first time an in vivo study using molecular metataxonomics 62 that demonstrates species level changes in the ecology of the oral microbiome after toothpaste use. Specifically, we demonstrate that brushing with a toothpaste containing enzymes and proteins, Zendium ™ , promotes a shift in the ecology of the oral microbial community over time, compared to a toothpaste without enzymes and proteins in healthy subjects. This finding was consistent at both genus and species level and was informed by both state-of-the-art informatics processing and robust statistical analysis, without the need for rarefaction.
Importantly, the results confirm that a comparison of communities at the genus level, whilst informative, is insufficient to robustly discriminate the role of specific community members in driving the ecological shift. Within a bacterial genus, closely related taxa can provide different functionality within a community and as such could be drivers for health or disease. Therefore, reliable identification of the species provides an understanding of the potential for differential community impact. For example, in the genus Porphyromonas, P. gingivalis is associated with periodontal disease 63,64 , whilst P. catoniae is associated with health 16 . A review of the literature facilitated the association of the identified taxa with gum health and/or disease (Figs 5 and 6). It was shown that there was an increase in the relative abundance of organisms associated with gum health for the test toothpaste and a concomitant decrease in those organisms associated with periodontal disease.
Figure 5. Summary of the significant species changes associated with gum health and/or disease for the Test toothpaste after 14-week test period. *Rothia dentocariosa has been reported in the literature to be associated with both health and periodontal disease.
For the test toothpaste group 12 taxa associated with health increased in relative abundance. 10 taxa associated with periodontal disease decreased in relative abundance. Of the remaining taxa, 11 currently have no known association with gum health or disease (Fig. 5). One taxon, Rothia dentocariosa, has been associated with both health and disease 65,66 .
For the control toothpaste group 1 taxon associated with health increased in relative abundance. 4 taxa associated with periodontal disease decreased in relative abundance. Of the remaining taxa, 10 currently have no known association with health or disease (Fig. 6). The single taxon associated with health that increased in the control group (mitis group streptococci) also increased in the test toothpaste group and the increase seen in the test toothpaste group was almost double (0.097%) the change in abundance observed in the control toothpaste group (0.057%).
Species increased in relative abundance after test toothpaste use. Analysis at species level is critical to understand if changes within the microbiome are associated with health or disease. Justification of this approach is provided by analysis of significant changes in the relative abundance of the genus Prevotella. Some members of the genus Prevotella, such as Prevotella intermedia are associated with disease 67 . In this study we identified that Prevotella melaninogenica and a number of associated phylotypes increased in relative abundance. Importantly P. melaninogenica is an organism present more commonly in healthy plaque 68 . This observation further emphasises the necessity of classifying to species level to gain accurate information on the biological significance of the ecological changes occurring within communities.
Changes observed at species level are consistent with the mode of action of the enzymes and proteins within the Zendium toothpaste. For example, Neisseria species, commensal organisms representing some of the only aerobes in the mouth and generally associated with health 69 , increased in relative abundance. Neisseria species exhibit enhanced growth in an aerated mixed culture 70 and, as catalase positive species 71 , they are able to protect themselves from the antimicrobial action of hydrogen peroxide, subsequently raising local oxygen levels. We propose that this will give Neisseria a competitive advantage over anaerobic catalase negative organisms. The second most significant increase in relative abundance is attributed to Kingella denitrificans. Whilst little information exists in the literature, its taxonomic classification is in the same grouping as Neisseria, suggesting that it occupies a similar ecological niche and would be likely to increase under the same conditions that promote the growth of Neisseria spp. K. denitrificans has previously been shown to be increased in relative abundance in stable periodontal sites as opposed to active periodontal sites 72 , although all subjects in this study were healthy and free of periodontitis.
Other organisms shown to increase in relative abundance in the test toothpaste group after 14-weeks include Granulicatella elegans. G. elegans has been shown to be statistically more abundant in healthy sites from periodontally healthy individuals compared to periodontitis patients 73 . Indeed, Lourenco et al. proposed that the absence of G. elegans was associated with a higher risk of generalised, aggressive periodontitis in relation to chronic periodontitis. Lactobacillus gasseri, which has also been shown to increase in the test toothpaste group, has been screened recently for its potential use as a probiotic, having been shown in the laboratory to have antibacterial activity against P. gingivalis (ATCC 33277) 74 .
Species decreased in relative abundance after test toothpaste use. Changes observed are consistent with the mode of action of the test toothpaste. As part of the enzyme cascade that leads to hypothiocyanite formation, hydrogen peroxide is also produced, and increased levels of hydrogen peroxide have been observed in vivo 49 , which in turn could lead to an increase in oxygen concentration through, for example, catalase activity. Such changes in the local environment would be expected to inhibit the growth of anaerobic species, and a number of obligately anaerobic species were reduced (e.g. representatives of the genera Treponema, Bacteroidales, Eubacterium, Prevotella, Fusobacterium and Fretibacterium: Fig. 5) in those using the test toothpaste. Treponema species have been shown to be more abundant in periodontal disease 16,65 and are known to be highly sensitive to oxygen 75 .
The largest decrease in relative abundance for the test toothpaste group was attributed to Rothia dentocariosa. There is no conclusive proof of this organism's role in health or disease, as it has been shown to be associated with periodontal diseases 66,76,77 as well as positively associated with gum health 16,65,78 . This organism is a catalase positive aerobe 79 , occupying a similar niche to Neisseria spp., so it is unlikely that the decrease in Rothia abundance can be attributed to an increase in hydrogen peroxide levels or increased oxygen concentration. However, R. dentocariosa is a Gram-positive bacterium and as such would be expected to be more susceptible to lysozyme, the levels of which are almost doubled after brushing with the test toothpaste 49 . This provides a rationale for the observed decrease in this organism's relative abundance in the test toothpaste group. Whilst the role of R. dentocariosa remains undefined, other organisms known to be associated with periodontal disease also showed a decrease in the test toothpaste group. These included Bacteroidales [G-2] sp._oral_taxon_274, P. intermedia and
The data presented here confirm that a toothpaste formulated to augment the natural defences of saliva leads to a positive shift to a microbiome more associated with health. Specifically, they show that a toothpaste containing enzymes and proteins, Zendium ™ , can significantly increase the relative abundance of health-associated organisms in plaque whilst driving a concomitant decrease in a number of disease-associated organisms compared with a toothpaste without enzymes and proteins over time. In this healthy study population, the magnitude of the statistically significant changes in the abundance of some of the individual taxa appears relatively modest; however, the cumulative ecological effect of many small but beneficial shifts in health and disease associated species within the microbiome is hypothesised to have biological relevance.
There is emerging evidence in the literature that differences in health status can be associated with modest compositional differences in microbial communities 80 . We propose that the regular use of a toothpaste containing enzymes and proteins can actively maintain a balanced microbiome, with a composition that is consistent with oral health, thereby helping to reduce the risk of dysbiotic changes associated with disease. This study was performed with subjects in good oral health over 14-weeks and it is possible that more extensive taxonomical and/or percentage abundance shifts in the microbiome could be achieved when the product is used for longer periods or in subjects with a greater risk of disease or poorer oral health.
This research provides new insights into both the action and microbial re-profiling abilities of this toothpaste. We demonstrate that an ecological approach using DNA sequencing, bespoke bacterial databases and rigorous statistical analysis provides a highly robust approach to assess the impact of oral care products on the ecology of the oral microbiome. Furthermore, brushing with Zendium ™ exerts a significant positive shift in the plaque microbiome, which we propose provides evidence for the biological relevance of increased salivary defence factors in promoting oral health.
Further work in the area of metatranscriptomics analysis will provide functional profiling of microbial communities and will help determine the community metabolic output 72,81,82 . This, together with information on the spatial structure of the plaque biofilm community using techniques such as fluorescent in situ hybridisation 83 , will provide more detail as to the overall effect of the beneficial changes reported here.

Materials and Methods
Ethics statement. Written informed consent was obtained from all enrolled individuals. The study protocol was reviewed and approved by the Unilever R&D Port Sunlight independent ethics committee. The methods were carried out in accordance with the approved guidelines.
Participants. Subjects in good health aged 18 or over were recruited onto the study. The mean age of subjects was 42, with 33 male and 78 female participants completing the study. Demographic information for this study is given in Table S13 in the Supplementary Information. Key inclusion criteria included: minimum age 18 years, minimum number of teeth 20, no antibiotic therapy or professional cleaning within one month of the start of the study. Key exclusion criteria included: pregnancy, nursing mothers, diabetics, denture wearers, smoking within the last 6 months, medical conditions and/or regular use of any medication which might affect the outcome of the study and obvious signs of untreated caries/significant periodontal disease.
Study Design. This study was a double-blind, randomised, parallel group study conducted by an independent third party. A study flow diagram is shown in Fig. 7. Subjects were enrolled on to the study according to the inclusion/exclusion criteria. After recruitment, subjects were given a fluoride toothpaste to use for four weeks prior to the commencement of the study. Following the initial 4-week at home use, baseline supra-gingival plaque from the upper jaw was collected for the assessment of microbiome composition. Subjects were randomly allocated to one of two toothpastes, a fluoride toothpaste (1450 ppm) containing enzymes and proteins (Zendium ™ ) or a control fluoride toothpaste (1450 ppm) without enzymes and proteins. Subjects were instructed to use the toothpaste at home, brushing twice a day for 14-weeks. Supra-gingival plaque from the upper jaw was collected at week 14 for the assessment of microbiome composition. Each plaque sample was placed in 1 ml TE buffer in a low-DNA-binding Eppendorf tube. The samples were stored at −25 °C until analysis.
DNA extraction. Samples were defrosted, vortexed for 30 seconds in the original TE buffer, sonicated for 20 seconds and vortexed for a further 30 seconds.
500 μl of each sample was transferred into an individual well of a 96-well Lysis plate containing pre-aliquoted matrix beads B (116981001, MPBio, California, USA). 3 μl aliquots of Ready-Lyse lysozyme (Epicenter, Wisconsin, USA) were added and the plate incubated using an Eppendorf Thermomixer at 37 °C with shaking at 300 rpm for 18 hours. Following incubation, a bead-beating step was performed using a Tissue Lyser (Qiagen, Germany) for 3 minutes at 20 Hz. An off-board lysis was performed by incubating the samples at 68 °C for 15 minutes in the presence of Proteinase K, Carrier RNA, ATL and ACL buffer in a Qiagen S-plate following manufacturer guidelines. Post-incubation, the plate was loaded on to the QIAsymphony and the samples processed using the QIAsymphony Virus/Bact Midi Kit (931055, Qiagen).
DNA quantification was performed using a Quant-IT high sensitivity DNA Assay kit (Invitrogen, California, USA) following the manufacturer's guidelines. Each sample was normalised to 1 ng/μl using molecular grade water on a QIAgility robot (Qiagen) prior to PCR amplification.
16S rRNA gene amplicon library preparation. Oligonucleotide primers targeting the V4-V6 hypervariable region of the 16S rRNA gene were evaluated in silico using PrimerProspector 84 . PCRs were conducted using the primers listed below (underlined section of the primer denotes the PCR attachment sequence). Primers were modified from the standard 533f and 1061r to include recognition sequences allowing a secondary nested PCR process facilitating the addition of standard Illumina adapters and sample specific indexes.
533f 5′ ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNGTGCCAGCMGCCGCGGTRA3′ and 1061r 5′ GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTCRRCACGAGCTGACGAC3′. PCRs consisted of 0.25 μl (10 μM) of each primer, 7 μl of HotStar Taq Plus Mastermix (Qiagen), 5 μl of normalized template DNA and 4.5 μl molecular grade water (Qiagen). Samples were amplified in triplicate using the following parameters: 95
PCR products were purified using Axygen SPRI Beads (Axygen, California, USA). A second round PCR incorporated Illumina adapters containing indexes (i5 and i7) for sample identification utilising eight forward primers and twelve reverse primers each of which contained a separate barcode allowing up to 96 different combinations. General sequences of the primers are illustrated below with the variable 8 bp barcode underlined. N501 f 5′ AATGATACGGCGACCACCGAGATCTACACTAGATCGCACACTCTTTCCCTACACGACGCTC3′ N701 r 5′ CAAGCAGAAGACGGCATACGAGATTCGCCTTAGTGACTGGAGTTCAGACGTGTGCTC3′. Second round PCRs consisted of 0.5 μl (10 μM) of each primer, 10 μl of 2 x Kapa Mastermix (Kapa Biosystems, Massachusetts) and 9 μl of purified sample from the first PCR reaction. Samples were amplified using the following parameters: 98 °C for 2 minutes, then 15 cycles of; 20 seconds at 95 °C, 15 seconds at 65 °C, 30 seconds at 70 °C with a final extension of 5 minutes at 72 °C. Samples were purified using Axygen SPRI Beads before being quantified using Qubit fluorimeter (Invitrogen, California, USA) and assessed using the Fragment Analyzer (Advanced Analytical Technologies, Iowa, USA). Resulting amplicon libraries were taken forward and pooled in equimolar amounts using the Qubit and Fragment Analyzer data and size selected on the Pippin prep (Sage Science, Massachusetts, USA) using a size range of 300-600 bps.
The quantity and quality of each pool was assessed by Bioanalyzer (Agilent Technologies, California, USA) and subsequently by qPCR using the Illumina Library Quantification Kit (Kapa) on a Light Cycler LC480II according to manufacturer's instructions (Roche, Switzerland). Each pool of libraries was sequenced on one flowcell of an Illumina MiSeq with 2 × 300 bp paired-end sequencing using v3 chemistry (Illumina, California, USA).
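The dual-index scheme described above, in which eight forward and twelve reverse barcoded primers give up to 96 sample-specific combinations, can be illustrated with a short sketch. Only N501 and N701 are named in the text; the remaining primer labels and the function name are hypothetical placeholders.

```python
from itertools import product

# Only N501 and N701 appear in the text; the other labels are placeholders.
FORWARD = [f"N5{i:02d}" for i in range(1, 9)]    # eight forward (i5) primers
REVERSE = [f"N7{i:02d}" for i in range(1, 13)]   # twelve reverse (i7) primers


def index_pairs(forward=FORWARD, reverse=REVERSE):
    """Return every (forward, reverse) barcode pairing."""
    return list(product(forward, reverse))


def assign_indexes(sample_ids):
    """Give each sample a unique barcode pair, failing if there are too few."""
    pairs = index_pairs()
    if len(sample_ids) > len(pairs):
        raise ValueError("more samples than available index combinations")
    return dict(zip(sample_ids, pairs))
```

Because each pool can carry at most 96 uniquely indexed libraries under this scheme, a study of 220 samples would need to be split across several pools.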
Informatics processing. All raw reads were processed simultaneously. Fastq files were trimmed for the presence of Illumina adapters using Cutadapt 85 version 1.2.1. 16S rRNA gene amplification primers were removed from each fragment using Trim Galore v 0.4 86 to account for the presence of degenerate bases that might impact downstream taxonomic assignment. Reads were further trimmed using Sickle v1.33 87 with a minimum window quality score of 28. Reads shorter than 100 bp after trimming were removed. If only one of a read-pair passed this filter, its read-pair was removed from the dataset. Surviving read pairs were merged using Pandaseq v2.9 88 generating contigs with a minimum overlap of at least 20 bp. Merged reads greater than 450 bp were taken forward for taxonomic classification.
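The length-based filters in this workflow can be expressed compactly. The sketch below is an illustration only and is not taken from the tools named above: it covers the rule that a read pair is kept only if both trimmed reads reach 100 bp, and the rule that merged contigs must exceed 450 bp. Function names are hypothetical.

```python
MIN_READ_LEN = 100    # reads shorter than this after trimming are removed
MIN_CONTIG_LEN = 450  # merged reads must be longer than this


def keep_pair(read1, read2, min_len=MIN_READ_LEN):
    """A pair survives only if both mates pass the length filter."""
    return len(read1) >= min_len and len(read2) >= min_len


def filter_pairs(pairs):
    """Drop every pair in which either trimmed read is too short."""
    return [(r1, r2) for r1, r2 in pairs if keep_pair(r1, r2)]


def filter_contigs(contigs, min_len=MIN_CONTIG_LEN):
    """Keep merged contigs strictly longer than the minimum length."""
    return [c for c in contigs if len(c) > min_len]
```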
Species level taxonomic assessment. Species level classification was carried out at The Forsyth Institute using an in-house pipeline. Briefly, unique assembled reads were compared to three reference databases using ncbi-blast-2.2.30+ in a sequential manner. For a successful result to be taken, sequences had to match the reference at 99% similarity across 98% of the amplicon. Reads not assigned to species level against these three databases were discarded. The databases used were the Human Oral Microbiome Database (HOMD), HOMD extended (HOMDEXT) and GreenGenes Gold, each of which had been curated to facilitate accurate species level assignment 59 .
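The acceptance rule described above can be sketched as a simple check applied to the best hit from each database in turn. This is not the in-house pipeline itself: the inclusive thresholds, the definition of coverage as aligned length divided by amplicon length, the order in which the databases are searched, and all names below are assumptions made for illustration.

```python
# Thresholds from the text; treating them as inclusive is an assumption.
MIN_IDENTITY = 99.0   # percent identity to the reference
MIN_COVERAGE = 0.98   # fraction of the amplicon covered by the alignment

# Assumed search order; the text says only that the search was sequential.
DB_ORDER = ("HOMD", "HOMDEXT", "GreenGenes Gold")


def accept_hit(percent_identity, aligned_length, amplicon_length):
    """Return True if a hit meets both the identity and coverage thresholds."""
    coverage = aligned_length / amplicon_length
    return percent_identity >= MIN_IDENTITY and coverage >= MIN_COVERAGE


def classify(hits_by_db, amplicon_length):
    """Return the species from the first database with an acceptable hit.

    hits_by_db maps a database name to its best hit as a tuple of
    (species, percent_identity, aligned_length). Reads with no acceptable
    hit in any database return None and would be discarded.
    """
    for db in DB_ORDER:
        hit = hits_by_db.get(db)
        if hit is None:
            continue
        species, identity, aligned = hit
        if accept_hit(identity, aligned, amplicon_length):
            return species
    return None
```

The strictness of these thresholds is one plausible reason why only 14.7 million of the 26.9 million contigs were classified.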
Statistical Analysis. Statistical analyses were performed on the table of counts produced from the bioinformatics pipeline. Counts were analysed at genus and species level, corrected for heteroscedasticity and unequal library sizes. It is common practice to employ a rarefaction process to correct for these variances in microbiome datasets 54 . However, this has been shown to be statistically inadmissible 89 . Our approach was to deal with the normalisation dependent on the context being examined. A variance stabilising transformation (VST) 89 was applied prior to ordination using a random forest dissimilarity measure [90][91][92] . Alternatively, model-based approaches were employed for comparative testing of relative abundance between groups which explicitly model the uneven sample read numbers and sparseness. Analysis of variance (ANOVA) was conducted in parallel to determine the statistical significance using a between sample distance measure (Canberra Distance) based on taxonomic profile 93,94 . Analyses were performed using the R packages Vegan 95 , Party 96 and HMP 60 .
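For reference, the Canberra distance between two taxonomic profiles can be computed as below. This is a plain-Python sketch of the textbook definition rather than the call used in the study; implementations differ in detail, and some additionally scale the sum by the number of taxa that are non-zero in at least one sample. The function name is hypothetical.

```python
def canberra(x, y):
    """Canberra distance between two equal-length abundance profiles.

    Each taxon contributes |x_i - y_i| / (|x_i| + |y_i|). Taxa absent from
    both samples would give 0/0 and are skipped.
    """
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of taxa")
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom == 0:
            continue  # taxon absent from both samples
        total += abs(a - b) / denom
    return total
```

Because each term is a ratio, a change from 1 to 3 reads contributes as much as a change from 1000 to 3000 reads, so the measure gives low-abundance taxa the same weight as dominant ones.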
Visualisation. Visual exploration of trends in the samples at species level was accomplished using a novel visualisation tool MicrobiVis 97 . The bespoke MicrobiVis tool provides functionality for interactive visual analysis of microbial communities and enables interactive selection of subsets of species for in-depth detailed investigation. The selected subset of species is displayed using a Parallel Coordinates plot 98 , where each vertical axis represents one species. The individual samples are displayed as polylines that intersect the axes at positions corresponding to their microbial count for that particular species, with minimum count at the bottom of the axes and maximum count at the top. The samples at baseline were coloured blue and the samples at 14-weeks were coloured green. This approach provides an overview of the microbial profiles of the samples within the subset of species, while also facilitating the identification of outlier samples that do not follow the general profile trends. Additionally, the median profiles of the two sample groups are represented by thicker lines to provide overview of the general profile patterns of the groups. Whilst the distribution differences for individual species may be visualized using, for instance, box plots, the approach taken here has the ability to represent profiles across a group/set of species, and through this reveal sample clusters and correlations between species in more detail.
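The geometry of the parallel coordinates display described above can be sketched in a few lines. This is not MicrobiVis code: the min-max scaling of each axis follows the description in the text, while placing constant-count species at mid-height, computing the median profile on the scaled positions, and all names are assumptions.

```python
from statistics import median


def axis_positions(counts):
    """Scale one species' counts to 0 (axis bottom) .. 1 (axis top)."""
    lo, hi = min(counts), max(counts)
    if hi == lo:
        return [0.5] * len(counts)  # assumed placement for a constant axis
    return [(c - lo) / (hi - lo) for c in counts]


def polylines(table):
    """Convert {species: [count per sample]} into one polyline per sample.

    Each polyline lists the sample's position on every species axis,
    in the order the species appear in the table.
    """
    scaled = {species: axis_positions(c) for species, c in table.items()}
    n_samples = len(next(iter(table.values())))
    return [[scaled[s][i] for s in table] for i in range(n_samples)]


def median_profile(lines):
    """Median position on each axis across a group of polylines."""
    return [median(column) for column in zip(*lines)]
```

Drawing one median profile for the baseline group and one for the 14-week group would then give the two thicker summary lines described in the text.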