Exploring the Effects of Probiotic Treatment on Urinary and Serum Metabolic Profiles in Healthy Individuals

Probiotics are live microorganisms that confer health benefits when administered in adequate amounts. They are used to promote gut health and alleviate various disorders, and interest in their potential effects on human physiology has recently been increasing. In the present study, the effects of probiotic treatment on the metabolic profiles of human urine and serum were investigated using a nuclear magnetic resonance (NMR)-based metabonomic approach. Twenty-one healthy volunteers were enrolled in the study and received probiotics at one of two dosages for 8 weeks. Urine and serum samples were collected from the volunteers before and during probiotic supplementation. The results showed that probiotics had a significant impact on the urinary and serum metabolic profiles without altering their phenotypes. The effects appeared as variations of metabolite levels, some of which also depended on the probiotic posology. Overall, the results suggest that probiotic administration may affect both the urine and the serum metabolome, although more research is needed to understand the mechanisms and clinical implications of these effects. NMR-based metabonomic analysis of biofluids is a powerful tool for monitoring the dynamic interaction between host and gut microflora, as well as for assessing the individual response to probiotic treatment.


Urine metabolites - Dosage A and B
As stated in the main text of the article, to study the urine metabolite trends and their relationship with the treatment, a mixed-effects linear regression framework is used for each metabolite S1. Using a simplified notation, the full model for the log-quantification of a generic urine metabolite has the following components:
• log(Q) is the dependent variable of the model, i.e. the log-quantification of a generic metabolite;
• Subject + Subject · Sample is the random part of the model: each Subject has a random intercept and slope, so the trend across consecutive samples, defined by the variable Sample (numerical, from 1 to 40, one for each consecutive measurement of the subject), can differ between subjects;
• $\beta_0$ is the fixed intercept;
• $\beta_1$ to $\beta_7$ are the coefficients for the Sample number, treatment Dosage (categorical, A or B), Phase (categorical, I for samples collected before the probiotic supplementation, II for samples collected during the probiotic supplementation), and their interactions;
• $\beta_8$ to $\beta_{10}$ are the coefficients for Age, Gender, and BMI, which were included in the models as they are considered possible confounding variables;
• the reference level for this model is a male subject, belonging to dosage group A, before the treatment intake starts (i.e., Phase I).
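For reference, the components listed above can be assembled into one explicit equation. This is a reconstruction, not the authors' typeset formula. The assignment of indices to terms ($\beta_1$ Sample, $\beta_2$ Phase, $\beta_3$ Dosage, $\beta_4$ Sample × Phase, $\beta_5$ Phase × Dosage, $\beta_6$ Sample × Dosage, $\beta_7$ three-way interaction) is inferred from the null hypotheses stated later in this section. The indicator notation, the random-effect symbols $b_{0i}$, $b_{1i}$ and the error term $\varepsilon_{ij}$ are assumptions introduced here, with $i$ indexing subjects and $j$ indexing samples.

```latex
\begin{aligned}
\log(Q_{ij}) ={}& \beta_0
  + \beta_1\,\mathrm{Sample}_{ij}
  + \beta_2\,\mathrm{Phase}^{II}_{ij}
  + \beta_3\,\mathrm{Dosage}^{B}_{i} \\
&+ \beta_4\,\mathrm{Sample}_{ij}\,\mathrm{Phase}^{II}_{ij}
  + \beta_5\,\mathrm{Phase}^{II}_{ij}\,\mathrm{Dosage}^{B}_{i}
  + \beta_6\,\mathrm{Sample}_{ij}\,\mathrm{Dosage}^{B}_{i} \\
&+ \beta_7\,\mathrm{Sample}_{ij}\,\mathrm{Phase}^{II}_{ij}\,\mathrm{Dosage}^{B}_{i}
  + \beta_8\,\mathrm{Age}_{i}
  + \beta_9\,\mathrm{Gender}^{F}_{i}
  + \beta_{10}\,\mathrm{BMI}_{i} \\
&+ b_{0i} + b_{1i}\,\mathrm{Sample}_{ij} + \varepsilon_{ij}
\end{aligned}
```

Here $\mathrm{Phase}^{II}$, $\mathrm{Dosage}^{B}$ and $\mathrm{Gender}^{F}$ are 0/1 indicators, so every indicator is zero at the reference level (male, dosage group A, Phase I).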

Estimate the models
To obtain a reduced model, more parsimonious than the full model but with a comparable ability to describe the data variability, a stepwise model selection procedure is used. In detail, a log-likelihood ratio test is used to determine which variables are not significant for a given metabolite (p-value threshold = 0.1). In addition, two different models are estimated in each step of the selection: the first without a specified intra-group correlation structure between observations, and the second with a first-order autoregressive correlation structure. The latter model hypothesises a relationship between consecutive observations (e.g., sample 1 and sample 2, sample 2 and sample 3, etc.). The Akaike Information Criterion (AIC) is used to choose between the two models.
The R package nlme (v3.1-162) [1] is used to estimate the models.
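The two decisions taken at each selection step can be sketched as follows. This is a standard-library Python illustration, not the authors' R code: the function names and all log-likelihood values are hypothetical, and the test shown covers only the case of dropping a single coefficient (1 degree of freedom).

```python
import math


def lrt_pvalue_1df(loglik_full, loglik_reduced):
    """P-value of a likelihood ratio test that drops one coefficient (1 df)."""
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def aic(loglik, n_params):
    """Akaike Information Criterion: lower is better."""
    return 2.0 * n_params - 2.0 * loglik


def pick_correlation_structure(loglik_plain, k_plain, loglik_ar1, k_ar1):
    """Return the label of the candidate model with the lower AIC."""
    if aic(loglik_ar1, k_ar1) < aic(loglik_plain, k_plain):
        return "AR(1)"
    return "unspecified"


# Hypothetical log-likelihoods for one metabolite.
p_value = lrt_pvalue_1df(loglik_full=-100.0, loglik_reduced=-101.0)
drop_term = p_value >= 0.1  # threshold used in the text
chosen = pick_correlation_structure(-100.0, 8, -97.0, 9)
print(round(p_value, 4), drop_term, chosen)
```

With these made-up numbers the tested term is not significant at the 0.1 threshold and would be dropped, and the autoregressive model is preferred because its AIC is lower.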

Visualise the models
In Figure S1a every column represents a metabolite and each row represents a mixed-effects regression model coefficient (related to a covariate). For a given metabolite, some cells may be empty, while others are colored. Empty cells indicate that the corresponding coefficients are not statistically significant in explaining the variability of that metabolite. Colored cells, on the contrary, contain the estimated values of the coefficients retained in the model for that metabolite. A red filled cell indicates a positive relationship between the coefficient variable and the metabolite, while a blue filled cell has the opposite meaning. A red square around the cell means that the p-value associated with the coefficient is below the threshold (p-value < 0.1). A cell may be filled without a red square around it. This happens when interaction terms are significant for a metabolite: the main effects are kept in the model whenever their interactions are significant, even if the p-values of the main effects themselves are not.
Regarding the interpretability of each model:
• the main effects indicate an average difference between the level in the coefficient row (e.g., Phase = II) and its reference level (e.g., Phase = I);
• the Sample covariate indicates the estimated increase/decrease of the corresponding metabolite values when the Sample changes by a single unit (i.e., the average difference between consecutive samples);
• the interaction terms measure metabolite changes in the presence of one or more conditions compared to the baseline levels (Phase = I, Dosage = "A").
Since all the subjects are treated in Phase II, the Phase covariate is directly linked to the probiotic supplementation effect. A positive or negative value for the coefficient of Phase = II corresponds to an increase or decrease of the average log-quantification level for the corresponding metabolite compared to Phase I. However, it cannot be interpreted on its own when interactions are significant:
• when the coefficient for the interaction between Phase and Dosage is significant, an average difference of log-quantification values is present for the corresponding metabolite and it is not the same across phases and dosage groups;
• when the coefficient for the interaction between Sample and Phase is significant, an increasing or decreasing trend is present for the corresponding metabolite and it is not the same in Phase I and Phase II;
• when the coefficient for the interaction between Sample and Dosage is significant, an increasing or decreasing trend is present for the corresponding metabolite and it is not the same in the two dosage groups;
• when the coefficient for the interaction between Sample, Phase, and Dosage is significant, the previous trends could be different between phases and also between dosage groups.
The determination index R² measures the amount of variability explained by each model (computed using the MuMIn package v1.47.5 [2]).

Extract the results
The estimated models can be used to describe metabolic variations across conditions. Some "contrasts" of interest are evaluated, using the multcomp R package v1.4-23 [3], in order to answer biologically relevant questions:
• Is the average level of the metabolite the same in Phase I and Phase II, net of other variables?
• Is the average level of the metabolite stable between consecutive samples or is there an increasing/decreasing trend?
Estimated average differences between Phase II and Phase I
To measure the average difference between Phase II and Phase I for a given metabolite, the estimated log-quantification levels in the middle of each phase are compared. According to the experimental design, the 10th sample is in the middle of Phase I, whereas the 30th sample is not chronologically in the middle of Phase II, because the first sample of Phase II is taken after 28 days of probiotic supplementation. However, as 20 samples are collected for each phase, the 30th sample is, numerically speaking, in the middle of the samples collected in Phase II. From a statistical perspective, the regression estimates at these samples lie in the middle of the two phases and are therefore a meaningful way to summarize them. This choice is robust to within-phase trends, which occur when a metabolite increases and/or decreases during one or both phases.
The following null hypothesis is tested: The average level of the metabolite is the same in Phase I and Phase II, net of other variables.
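It can be translated using the regression model formulation:
• In dosage group A, the estimated level at sample 30 (Phase II) equals the estimated level at sample 10 (Phase I), which corresponds to H0: $20\beta_1 + \beta_2 + 30\beta_4 = 0$
• In dosage group B, the same comparison corresponds to H0: $20\beta_1 + \beta_2 + 30\beta_4 + \beta_5 + 20\beta_6 + 30\beta_7 = 0$, where $\beta_5$, $\beta_6$ and $\beta_7$ are the coefficients of the Phase × Dosage, Sample × Dosage and Sample × Phase × Dosage interactions.
The statistically significant differences are reported in Table 4 of the main text. The same results are also summarized in Figure S4.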
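The weights in these hypotheses come from subtracting the fixed-effect design row at sample 10 in Phase I from the one at sample 30 in Phase II. A minimal Python sketch of that arithmetic follows. The function names are hypothetical, and the coefficient order ($\beta_2$ Phase, $\beta_3$ Dosage, $\beta_5$ Phase × Dosage, $\beta_6$ Sample × Dosage, $\beta_7$ three-way) is an assumption inferred from the hypotheses in this section. Age, Gender and BMI are left out because they cancel in the difference.

```python
def design_row(sample, phase_ii, dosage_b):
    """Fixed-effect design row for beta_0..beta_7 (assumed coefficient order).

    phase_ii and dosage_b are 0/1 indicators; the reference level is
    Phase I, dosage group A.
    """
    return [
        1,                             # beta_0: intercept
        sample,                        # beta_1: Sample
        phase_ii,                      # beta_2: Phase II
        dosage_b,                      # beta_3: Dosage B
        sample * phase_ii,             # beta_4: Sample x Phase
        phase_ii * dosage_b,           # beta_5: Phase x Dosage
        sample * dosage_b,             # beta_6: Sample x Dosage
        sample * phase_ii * dosage_b,  # beta_7: Sample x Phase x Dosage
    ]


def phase_contrast(dosage_b):
    """Weights of (sample 30, Phase II) minus (sample 10, Phase I)."""
    mid_phase_ii = design_row(30, 1, dosage_b)
    mid_phase_i = design_row(10, 0, dosage_b)
    return [a - b for a, b in zip(mid_phase_ii, mid_phase_i)]


contrast_a = phase_contrast(dosage_b=0)
contrast_b = phase_contrast(dosage_b=1)
print("group A:", contrast_a)
print("group B:", contrast_b)
```

The group A weights reproduce the hypothesis $20\beta_1 + \beta_2 + 30\beta_4 = 0$. In group B the intercept and the Dosage main effect cancel, while the three Dosage interactions contribute additional terms.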

Average differences between consecutive samples
Comparing samples in different phases makes it possible to describe differences between phases. However, the stability of a metabolite within the 20 samples of each phase can also be investigated. This can be done by comparing consecutive samples.
The following null hypothesis is tested: The average level of the metabolite is the same between consecutive samples, net of other variables.
It can be translated using the regression model formulation:
• In dosage group A during Phase I, H0: $\beta_1 = 0$
• In dosage group B during Phase I, H0: $\beta_1 + \beta_6 = 0$
• In dosage group A during Phase II, H0: $\beta_1 + \beta_4 = 0$
• In dosage group B during Phase II, H0: $\beta_1 + \beta_4 + \beta_6 + \beta_7 = 0$
The statistically significant differences are reported in Table 3 of the main text. The same results are also summarized in Figure S3. Differences above zero indicate an increase of the average log-quantification values of the corresponding metabolites between consecutive samples (a positive trend). Estimates below the zero line, on the contrary, indicate a reduction of the corresponding metabolites' log-quantification values (a negative trend). The higher the absolute value of the estimate, the bigger the difference. The narrower the 90% confidence interval bars, the smaller the variability of the estimated difference. Colors and line types represent the dosage groups and the phases, respectively. A metabolite can therefore show a positive trend in one phase (or one dosage group) and a different behavior in the other.
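How an estimated trend and its 90% confidence interval follow from a slope contrast can be sketched as below. This is a standard-library Python illustration with hypothetical coefficient values and a hypothetical diagonal covariance matrix. It uses an unadjusted normal-approximation interval, so the intervals produced by the multcomp package may differ. The contrast weights assume that $\beta_4$, $\beta_6$ and $\beta_7$ are the Sample × Phase, Sample × Dosage and three-way interaction coefficients.

```python
import math
from statistics import NormalDist

# Slope contrasts expressed over the coefficients (beta_1, beta_4, beta_6, beta_7).
SLOPE_CONTRASTS = {
    ("A", "I"): [1, 0, 0, 0],
    ("B", "I"): [1, 0, 1, 0],
    ("A", "II"): [1, 1, 0, 0],
    ("B", "II"): [1, 1, 1, 1],
}


def contrast_ci(weights, beta, cov, level=0.90):
    """Estimate of weights'beta with a normal-approximation confidence interval."""
    n = len(weights)
    estimate = sum(w * b for w, b in zip(weights, beta))
    variance = sum(
        weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(variance)
    return estimate, estimate - half_width, estimate + half_width


# Hypothetical estimates for one metabolite (not taken from the study).
beta = [0.010, -0.020, 0.005, 0.000]
cov = [[1.6e-5 if i == j else 0.0 for j in range(4)] for i in range(4)]

for (dosage, phase), weights in SLOPE_CONTRASTS.items():
    est, low, high = contrast_ci(weights, beta, cov)
    if low > 0:
        trend = "positive trend"
    elif high < 0:
        trend = "negative trend"
    else:
        trend = "no significant trend"
    print(f"Dosage {dosage}, Phase {phase}: {est:+.4f} [{low:+.4f}, {high:+.4f}] {trend}")
```

An interval that excludes zero corresponds to a rejected null hypothesis at the 0.1 level, which is the reading applied to the bars in Figure S3.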

Urine metabolites - Unique Dosage
Assuming that the effects of the two dosage groups are equal over time leads to a simplified version of the mixed-effects linear regression model. The Dosage variable and its interactions with Sample and Phase are no longer included in the models. Using a simplified notation, the full model for the log-quantification of a generic urine metabolite has the following components:
• log(Q) is the dependent variable of the model, i.e. the log-quantification of a generic metabolite;
• Subject + Subject · Sample is the random part of the model: each Subject has a random intercept and slope, so the trend across consecutive samples, defined by the variable Sample (numerical, from 1 to 40, one for each consecutive measurement of the subject), can differ between subjects;
• $\beta_0$ is the fixed intercept;
• $\beta_1$ to $\beta_3$ are the coefficients for the Sample number, Phase (categorical, I for samples collected before the probiotic supplementation, II for samples collected during the probiotic supplementation), and their interaction;
• $\beta_4$ to $\beta_6$ are the coefficients for Age, Gender, and BMI, which were included in the models as they are considered possible confounding variables;
• the reference level for this model is a male subject, before the treatment intake starts (i.e., Phase I).
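Written out explicitly, the unique-dosage model described by these components takes the form below. This is a reconstruction from the coefficient list, not the authors' typeset equation. The indicator notation, the random-effect symbols $b_{0i}$, $b_{1i}$ and the error term $\varepsilon_{ij}$ are assumptions, with $i$ indexing subjects and $j$ indexing samples.

```latex
\begin{aligned}
\log(Q_{ij}) ={}& \beta_0
  + \beta_1\,\mathrm{Sample}_{ij}
  + \beta_2\,\mathrm{Phase}^{II}_{ij}
  + \beta_3\,\mathrm{Sample}_{ij}\,\mathrm{Phase}^{II}_{ij} \\
&+ \beta_4\,\mathrm{Age}_{i}
  + \beta_5\,\mathrm{Gender}^{F}_{i}
  + \beta_6\,\mathrm{BMI}_{i}
  + b_{0i} + b_{1i}\,\mathrm{Sample}_{ij} + \varepsilon_{ij}
\end{aligned}
```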

Estimate the models
As in Section 1.1.1 (model estimation for dosage groups A and B), a stepwise model selection procedure is used to obtain a reduced model, more parsimonious than the full model but with a comparable ability to describe the data variability.

Visualise the models
In Figure S1b the mixed-effects regression model results are presented for each urine metabolite.
Regarding the interpretability of each model:
• the main effects indicate an average difference between the level in the coefficient row (e.g., Phase = II) and its reference level (e.g., Phase = I);
• the Sample covariate indicates the estimated increase/decrease of the corresponding metabolite values when the Sample changes by a single unit (i.e., the average difference between consecutive samples);
• the interaction term measures metabolite changes in the presence of one or more conditions compared to the baseline levels.

Extract the results
The estimated models can be used to describe metabolic variations across conditions. Some "contrasts" of interest are evaluated in order to answer biologically relevant questions:
• Is the estimated average level of the metabolite the same in Phase I and Phase II, net of other variables?
• Is the average level of the metabolite stable between consecutive samples or is there an increasing/decreasing trend?
Estimated average differences between Phase II and Phase I
As in Section 1.1.3.1, to measure the average difference between Phase II and Phase I for a given metabolite, the estimated log-quantification levels in the middle of each phase are compared.
The following null hypothesis is tested: The average level of the metabolite is the same in Phase I and Phase II, net of other variables.
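It can be translated using the regression model formulation as H0: $20\beta_1 + \beta_2 + 30\beta_3 = 0$, i.e. the estimated level at sample 30 (Phase II) equals the estimated level at sample 10 (Phase I).
The statistically significant differences are reported in Table S2. The same results are also summarized in Figure S5.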
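The contrast for the unique-dosage model can be derived in two steps, comparing the fixed-effect predictions at the numerical midpoints of the two phases (sample 10 and sample 30). This is a worked sketch under the coefficient order given above ($\beta_1$ Sample, $\beta_2$ Phase, $\beta_3$ Sample × Phase). The symbol $c$ is introduced here as shorthand for the Age, Gender and BMI terms, which are identical in both predictions.

```latex
\begin{aligned}
E[\log(Q) \mid \mathrm{Sample}=30,\ \mathrm{Phase}=II] &= \beta_0 + 30\beta_1 + \beta_2 + 30\beta_3 + c \\
E[\log(Q) \mid \mathrm{Sample}=10,\ \mathrm{Phase}=I]  &= \beta_0 + 10\beta_1 + c \\
\text{Phase II} - \text{Phase I} &= 20\beta_1 + \beta_2 + 30\beta_3
\end{aligned}
```

The intercept and the covariate terms cancel, so the tested quantity depends only on the Sample, Phase and Sample × Phase coefficients.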

Average differences between consecutive samples
Similarly to the analysis performed in 1.1.3.2, the stability of a metabolite within the 20 samples of each phase is investigated by comparing consecutive samples.
The following null hypothesis is tested: The average level of the metabolite is the same between consecutive samples, net of other variables.
It can be translated using the regression model formulation:
• During Phase I, H0: $\beta_1 = 0$
• During Phase II, H0: $\beta_1 + \beta_3 = 0$
The statistically significant differences are reported in Table S3. The same results are also summarized in Figure S6.

Serum metabolites
To study the serum metabolite average levels and their relationship with the treatment, a mixed-effects linear regression framework is used for each metabolite S1. While 20 urine samples were collected for each phase, only 2 serum samples are available: the first is collected at the beginning of Phase I and the other at the beginning of Phase II. As stated in the main text, the model formulation for the log-quantification of a generic serum metabolite is much simpler than the one for urine metabolites and has the following components:
• log(Q) is the dependent variable of the model, i.e. the log-quantification of a generic metabolite;
• Subject is the random part of the model: each Subject has a random intercept;
• $\beta_0$ is the fixed intercept;
• $\beta_1$ to $\beta_4$ are the coefficients for the Dosage group (categorical, A or B), Phase (categorical, I for the sample collected before the probiotic supplementation, II for the sample collected during the probiotic supplementation), and their interactions;
• $\beta_5$ to $\beta_7$ are the coefficients for Age, Gender, and BMI, which were included in the models as they are considered possible confounding variables;
• the reference level for this model is a male subject, belonging to dosage group A, before the treatment intake starts (i.e., Phase I).

Estimate the models
As for the urine metabolites' models, to obtain reduced models, more parsimonious than the full models, but with a comparable ability to describe the data variability, a stepwise model selection procedure is used.

Visualise the models
In Figure S2 the mixed-effects regression model results are presented for each serum metabolite.
Regarding the interpretability of each model:
• the main effects indicate an average difference between the level in the coefficient row (e.g., Phase = II) and its reference level (e.g., Phase = I);
• the interaction term between Phase and Dosage is never significant. Hence, differences between phases are present but they do not differ between dosage groups.
Among the other covariates included in the models, Age, Gender, and BMI are all able to explain metabolite levels (e.g., when the variable Gender is significant, we observe a reduction of the metabolite levels in females).

Extract the results
The estimated models can be used to describe metabolic variations across conditions. Some "contrasts" of interest are evaluated in order to answer a biologically relevant question:
• Is the average level of the metabolite the same in Phase I and Phase II, net of other variables?

Estimated average differences between Phase I and Phase II
The sample collected in Phase I is compared with the sample collected in Phase II in order to capture differences between phases. Chronologically speaking, the Phase I sample was collected at the beginning of Phase I, whereas the Phase II sample was collected after 28 days of probiotic supplementation.
The following null hypothesis is tested: The average level of the metabolite is the same in both phases, net of other variables.
It can be translated using the regression model formulation. The statistically significant differences are reported in Table 5 of the main text. The same results are also summarized in Figure S7.
No serum metabolite changed differently by dosage group between phases (no significant Dosage-Phase interactions). For this reason the unique dosage analysis is not performed for serum metabolites.

Figure S1: Graphical representations of the mixed-effects models. Models' coefficients by rows, metabolites by columns. Each cell contains the estimated coefficient value, colored by sign (positive or negative). Cells of significant coefficients (p-value < 0.1) are red-framed. The R-squared index is reported for each model (the closer to 1, the better the model fit). a. Urine metabolites with Dosage A and B. b. Urine metabolites with unique dosage.

Figure S3: Average differences for urine metabolites between consecutive samples, distinguished by Phase (line type) and Dosage group (color). Estimates and their 90% confidence intervals are colored by dosage group and the line type differs between phases.

Figure S4: Average differences for urine metabolites between Phase II and Phase I. Estimates and their 90% confidence intervals are colored by dosage group.
Figure S5: Average differences for urine metabolites between Phase II and Phase I (unique dosage).

Figure S6: Average differences for urine metabolites between consecutive samples (unique dosage), distinguished by Phase (line type). Estimates and their 90% confidence intervals are reported. Axis label: estimated log(Quantification) differences between consecutive samples.

Table S2: Estimated average differences for urine metabolites between Phase II and Phase I (unique dosage). 90% confidence intervals are reported.

Table S3: Estimated average differences for urine metabolites between consecutive samples (unique dosage). 90% confidence intervals are reported.
Figure S2: Graphical representations of the mixed-effects models for serum metabolites. Models' coefficients by rows, serum metabolites by columns. Each cell contains the estimated coefficient value, colored by sign (positive or negative). Cells of significant coefficients (p-value < 0.1) are red-framed. The R-squared index is reported for each model (the closer to 1, the better the model fit).